1. 14 Mar, 2018 11 commits
    • drivers/infiniband/core/verbs.c: fix build with gcc-4.4.4 · 6ee68773
      Andrew Morton authored
      gcc-4.4.4 has issues with initialization of anonymous unions.
      
      drivers/infiniband/core/verbs.c: In function '__ib_drain_sq':
      drivers/infiniband/core/verbs.c:2204: error: unknown field 'wr_cqe' specified in initializer
      drivers/infiniband/core/verbs.c:2204: warning: initialization makes integer from pointer without a cast
      
      Work around this.
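
      A minimal sketch of one common workaround for this class of error, using
      hypothetical stand-in types (demo_cqe, demo_send_wr) rather than the real
      kernel structures: instead of naming the anonymous union member directly in
      the designated initializer, the union is initialized positionally inside its
      own pair of braces. Whether this matches the exact change in this commit is
      an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct demo_cqe { int id; };

struct demo_send_wr {
    struct demo_send_wr *next;
    union {                        /* anonymous union, as in the failing code */
        unsigned long long wr_id;
        struct demo_cqe *wr_cqe;
    };
    int opcode;
};

/* Build a work request using the brace-wrapped form of the initializer. */
static struct demo_send_wr make_wr(struct demo_cqe *cqe)
{
    struct demo_send_wr wr = {
        .next = NULL,
        /* A bare ".wr_cqe = cqe" here is what old compilers reject.
         * Wrapping it in braces initializes the union positionally
         * (it is the member that follows .next). */
        { .wr_cqe = cqe, },
        .opcode = 1,
    };
    return wr;
}
```

      Calling make_wr(&cqe) yields a request whose wr_cqe points at cqe; newer
      compilers accept both spellings, so the braced form costs nothing.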
      
      Fixes: a1ae7d03 ("RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access")
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Steve Wise <swise@opengridcomputing.com>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • rdma_rxe: make rxe work over 802.1q VLAN devices · 43c9fc50
      Martin Wilck authored
      This patch fixes RDMA/rxe over 802.1q VLAN devices.
      
      Without it, I observed the following behavior:
      
      a) adding a VLAN device to RXE via rxe_net_add() creates a non-functional
         RDMA device. This is caused by the logic in enum_all_gids_of_dev_cb() /
         is_eth_port_of_netdev(), which only considers networks connected to
         "upper devices" of the configured network device, resulting in an empty
         set of gids for a VLAN interface that is an "upper device" itself.
         Later attempts to connect via this RDMA device fail in cma_acquire_dev()
         because no gids can be resolved.
      
      b) adding the master device of the VLAN device instead seems to work
         initially: target addresses via VLAN devices are resolved successfully.
         But the connection times out because no 802.1q VLAN headers are
         inserted in the ethernet packets, which are therefore never received.
         This happens because the RXE layer sends the packets via the master
         device rather than the VLAN device.
      
      The problem could be solved by changing either a) or b). My thinking was
      that the logic in a) was created deliberately, thus I decided to work on
      b). It turns out that the information about the VLAN interface for the gid
      at hand is available in the AV information. My patch converts the RXE code
      to use this netdev instead of rxe->ndev. With this change, RXE over VLAN
      works on my test system.
      Signed-off-by: Martin Wilck <mwilck@suse.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/i40iw: include linux/irq.h · baa00fcd
      Arnd Bergmann authored
      We get a build failure on ARM unless the header is included explicitly:
      
      drivers/infiniband/hw/i40iw/i40iw_verbs.c: In function 'i40iw_get_vector_affinity':
      drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: implicit declaration of function 'irq_get_affinity_mask'; did you mean 'irq_create_affinity_masks'? [-Werror=implicit-function-declaration]
        return irq_get_affinity_mask(msix_vec->irq);
               ^~~~~~~~~~~~~~~~~~~~~
               irq_create_affinity_masks
      drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: returning 'int' from a function with return type 'const struct cpumask *' makes pointer from integer without a cast [-Werror=int-conversion]
        return irq_get_affinity_mask(msix_vec->irq);
      
      Fixes: 7e952b19 ("i40iw: Implement get_vector_affinity API")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Maintain a single emergency page · c44ef998
      Ilya Lesokhin authored
      The mlx5 driver needs to be able to issue invalidation to ODP MRs
      even if it cannot allocate memory. To this end it preallocates
      emergency pages to use when the situation arises.
      
      This flow should be rare enough that we don't need to worry about
      contention, and therefore a single emergency page is good enough.
      Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Only synchronize RCU once when removing mkeys · 65edd0e7
      Daniel Jurgens authored
      Instead of synchronizing RCU in a loop when removing mkeys in a batch, do
      it once at the end, before freeing them. The result is waiting for only
      one RCU grace period instead of many in series.
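
      The difference between the two shapes can be illustrated in userspace. This
      is only a sketch: fake_synchronize() is a hypothetical counter standing in
      for the real grace-period wait, and demo_mkey is not the driver's structure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the grace-period wait: it only counts calls
 * so the two strategies can be compared without a kernel. */
static int grace_periods_waited;
static void fake_synchronize(void) { grace_periods_waited++; }

/* Hypothetical entry type; the real mkey structure is different. */
struct demo_mkey { int unpublished; };

/* Old shape: unpublish, wait, and free one entry at a time. */
static void remove_each_sync(struct demo_mkey **keys, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        keys[i]->unpublished = 1;
        fake_synchronize();          /* one wait per entry */
        free(keys[i]);
    }
}

/* New shape: unpublish everything, wait once, then free everything. */
static void remove_batch_sync(struct demo_mkey **keys, size_t n)
{
    for (size_t i = 0; i < n; i++)
        keys[i]->unpublished = 1;
    fake_synchronize();              /* single wait covers the whole batch */
    for (size_t i = 0; i < n; i++)
        free(keys[i]);
}
```

      For a batch of n entries the first function waits n times and the second
      waits once; both free entries only after a wait that follows their removal.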
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Expose more priorities for bypass namespace · 72f7cc09
      Mark Bloch authored
      BYPASS namespace is used by the RDMA side to insert flow rules into
      the vport RX flow tables. Currently only 8 priorities are exposed;
      increase this to 16 to allow more flexibility. This change will also
      cause the BYPASS namespace to use 32 levels (as opposed to 16 today) of
      flow tables: 16 levels for regular rules and 16 for don't trap rules.
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/srp: Fix IPv6 address parsing · c62adb7d
      Bart Van Assche authored
      Split IPv6 addresses at the colon that separates the IPv6 address
      and the port number instead of at a colon in the middle of the IPv6
      address. Check whether the IPv6 address is surrounded with square
      brackets.
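
      The splitting rule described above can be sketched as follows. This is an
      illustration under assumptions, not the driver's code: split_host_port() is
      a hypothetical helper that splits at the last colon and requires brackets
      around any host part that itself contains a colon.

```c
#include <assert.h>
#include <string.h>

/*
 * Split "host:port" at the LAST colon. A host that starts with '[' must
 * end with ']' and has the brackets stripped; an unbracketed host that
 * still contains a colon is rejected as ambiguous.
 * Returns 0 on success, -1 on error. Modifies buf in place (also on
 * error); on success *host and *port point into buf.
 */
static int split_host_port(char *buf, char **host, char **port)
{
    char *sep = strrchr(buf, ':');
    size_t len;

    if (!sep || sep == buf || sep[1] == '\0')
        return -1;                  /* no colon, empty host, or empty port */
    *sep = '\0';
    len = strlen(buf);

    if (buf[0] == '[') {
        if (len < 3 || buf[len - 1] != ']')
            return -1;              /* unbalanced or empty brackets */
        buf[len - 1] = '\0';
        *host = buf + 1;
    } else if (strchr(buf, ':')) {
        return -1;                  /* IPv6 literal without brackets */
    } else {
        *host = buf;
    }
    *port = sep + 1;
    return 0;
}
```

      With this rule "[fe80::1]:5555" yields host "fe80::1" and port "5555",
      while "fe80::1:5555" is rejected instead of being split mid-address.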
      
      Fixes: 19f31343 ("IB/srp: Add RDMA/CM support")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/verbs: Simplify modify QP check · 19b1f540
      Leon Romanovsky authored
      All callers of ib_modify_qp_is_ok() provide an enum ib_qp_state value,
      which makes the out-of-range checks redundant. Remove them and update
      the function signature to return a boolean result.
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/pvrdma: Properly annotate QP states · fbf1795c
      Leon Romanovsky authored
      QP states provided by the core layer are already converted to enum
      ib_qp_state, so it is better to use internal variables of that type
      instead of int.
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/uverbs: Ensure validity of current QP state value · 88de869b
      Leon Romanovsky authored
      The QP state is an internal enum that is checked at the driver level
      by calling ib_modify_qp_is_ok(). Move this check closer to the user
      and leave kernel users to be checked by the compiler.
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs · 75a45982
      Leon Romanovsky authored
      mlx5 modify_qp() relies on FW to return an error if a wrong state is
      supplied. The missing check in FW causes the following crash when
      using XRC_TGT QPs.
      
      [   14.769632] BUG: unable to handle kernel NULL pointer dereference at (null)
      [   14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0
      [   14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0
      [   14.773126] Oops: 0002 [#1] SMP PTI
      [   14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119
      [   14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [   14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0
      [   14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246
      [   14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000
      [   14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000
      [   14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504
      [   14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240
      [   14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000
      [   14.785800] FS:  00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000
      [   14.787073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0
      [   14.788689] Call Trace:
      [   14.789007]  _ib_modify_qp+0x71/0x120
      [   14.789475]  modify_qp.isra.20+0x207/0x2f0
      [   14.790010]  ib_uverbs_modify_qp+0x90/0xe0
      [   14.790532]  ib_uverbs_write+0x1d2/0x3c0
      [   14.791049]  ? __handle_mm_fault+0x93c/0xe40
      [   14.791644]  __vfs_write+0x36/0x180
      [   14.792096]  ? handle_mm_fault+0xc1/0x210
      [   14.792601]  vfs_write+0xad/0x1e0
      [   14.793018]  SyS_write+0x52/0xc0
      [   14.793422]  do_syscall_64+0x75/0x180
      [   14.793888]  entry_SYSCALL_64_after_hwframe+0x21/0x86
      [   14.794527] RIP: 0033:0x7f545ad76099
      [   14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001
      [   14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099
      [   14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003
      [   14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480
      [   14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760
      [   14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000
      [   14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00
      00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00
      00 00 <c7> 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c
      [   14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8
      [   14.804838] CR2: 0000000000000000
      [   14.805288] ---[ end trace 3f1da0df5c8b7c37 ]---
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Reported-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  2. 13 Mar, 2018 8 commits
  3. 08 Mar, 2018 10 commits
  4. 07 Mar, 2018 11 commits
    • net/mlx5: Fix wrongly assigned CQ reference counter · 31135eb3
      Leon Romanovsky authored
      A kernel compiled with CONFIG_REFCOUNT_FULL produces the following
      warning. The reason is that the initial value of a refcount_t is
      supposed to be greater than 0; change it accordingly.
      
      [    3.106634] ------------[ cut here ]------------
      [    3.107756] refcount_t: increment on 0; use-after-free.
      [    3.109130] WARNING: CPU: 0 PID: 1 at lib/refcount.c:153 refcount_inc+0x27/0x30
      [    3.110085] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00028-gf683e04bdccc #137
      [    3.110085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [    3.110085] RIP: 0010:refcount_inc+0x27/0x30
      [    3.110085] RSP: 0000:ffffaa620000fba0 EFLAGS: 00010286
      [    3.110085] RAX: 0000000000000000 RBX: ffff9a6d1a1821c8 RCX: ffffffff98a50f48
      [    3.110085] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 0000000000000246
      [    3.110085] RBP: ffff9a6d1ac800a0 R08: 0000000000000289 R09: 000000000000000a
      [    3.110085] R10: fffff03bc0682840 R11: ffffffff9949856d R12: ffff9a6d1b4a4000
      [    3.110085] R13: 0000000000000000 R14: ffff9a6d1a0a6c00 R15: ffffaa620000fc5c
      [    3.110085] FS:  0000000000000000(0000) GS:ffff9a6d1fc00000(0000) knlGS:0000000000000000
      [    3.110085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    3.110085] CR2: 0000000000000000 CR3: 000000000ba0a000 CR4: 00000000000006b0
      [    3.110085] Call Trace:
      [    3.110085]  mlx5_core_create_cq+0xde/0x250
      [    3.110085]  ? __kmalloc+0x1ce/0x1e0
      [    3.110085]  mlx5e_create_cq+0x15c/0x1e0
      [    3.110085]  mlx5e_open_drop_rq+0xea/0x190
      [    3.110085]  mlx5e_attach_netdev+0x53/0x140
      [    3.110085]  mlx5e_attach+0x3d/0x60
      [    3.110085]  mlx5e_add+0x11d/0x2f0
      [    3.110085]  mlx5_add_device+0x77/0x170
      [    3.110085]  mlx5_register_interface+0x74/0xc0
      [    3.110085]  ? set_debug_rodata+0x11/0x11
      [    3.110085]  init+0x67/0x72
      [    3.110085]  ? mlx4_en_init_ptys2ethtool_map+0x346/0x346
      [    3.110085]  do_one_initcall+0x98/0x147
      [    3.110085]  ? set_debug_rodata+0x11/0x11
      [    3.110085]  kernel_init_freeable+0x164/0x1e0
      [    3.110085]  ? rest_init+0xb0/0xb0
      [    3.110085]  kernel_init+0xa/0x100
      [    3.110085]  ret_from_fork+0x35/0x40
      [    3.110085] Code: 00 00 00 00 e8 ab ff ff ff 84 c0 74 02 f3 c3 80 3d 3b c3 64 01 00 75 f5 48 c7 c7 68 0b 81 98 c6 05 2b c3 64 01 01 e8 79 d7 a3 ff <0f> ff c3 66 0f 1f 44 00 00 8b 06 83 f8 ff 74 39 31 c9 39 f8 89
      [    3.110085] ---[ end trace a0068e1c68438a74 ]---
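
      The rule behind the trace above can be sketched in userspace. This is a
      simplified, non-atomic illustration with hypothetical names (demo_refcount,
      demo_refcount_inc); it is not the kernel's refcount_t implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, non-atomic counter used only to illustrate the rule. */
struct demo_refcount { unsigned int refs; };

static void demo_refcount_set(struct demo_refcount *r, unsigned int n)
{
    r->refs = n;
}

/*
 * Take a reference. A counter at 0 means the object is treated as
 * already released, so the increment is refused and the counter is
 * left untouched. Returns true if the reference was taken.
 */
static bool demo_refcount_inc(struct demo_refcount *r)
{
    if (r->refs == 0)
        return false;
    r->refs++;
    return true;
}
```

      An object whose counter starts at 0 can never be referenced under this rule,
      which is why a freshly created object starts at 1, with that first
      reference owned by the creator.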
      
      Fixes: f105b45b ("net/mlx5: CQ hold/put API")
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPSec, Add support for ESN · cb010083
      Aviad Yehezkel authored
      Currently ESN is not supported with IPSec device offload.
      
      This patch adds ESN support to IPsec device offload.
      It implements a new xfrm device operation to synchronize the offloading
      device ESN with the xfrm received SN, and adds a new QP command to update
      the SA state at the following points:
      
                 ESN 1                    ESN 2                  ESN 3
      |-----------*-----------|-----------*-----------|-----------*
      ^           ^           ^           ^           ^           ^
      
      ^ - marks where the QP command is invoked to update the SA ESN state
          machine.
      | - marks the start of the ESN scope (0 to 2^32-1). At this point move
          the SA ESN overlap bit to zero and increment ESN.
      * - marks the middle of the ESN scope (2^31). At this point move the SA
          ESN overlap bit to one.
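
      The two transitions in the diagram can be sketched as a small state update.
      This is an illustration only: demo_esn_state and demo_esn_update() are
      hypothetical names, and driving the update from the low 32 bits of the
      received sequence number is an assumption about how the points are detected.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ESN_SCOPE_MID 0x80000000u   /* 2^31, the '*' in the diagram */

/* Hypothetical mirror of the SA state tracked for the device. */
struct demo_esn_state {
    uint32_t esn;       /* high 32 bits of the extended sequence number */
    uint8_t  overlap;   /* SA ESN overlap bit */
};

/*
 * Update the state from the low 32 bits of the latest sequence number.
 * Returns 1 when the state changed (i.e. a command would be issued),
 * 0 otherwise.
 */
static int demo_esn_update(struct demo_esn_state *s, uint32_t seq_low)
{
    if (!s->overlap && seq_low >= DEMO_ESN_SCOPE_MID) {
        s->overlap = 1;              /* crossed the middle of the scope */
        return 1;
    }
    if (s->overlap && seq_low < DEMO_ESN_SCOPE_MID) {
        s->overlap = 0;              /* wrapped to the start of the scope */
        s->esn++;
        return 1;
    }
    return 0;
}
```

      Each scope therefore produces exactly two state changes, one at 2^31 and
      one at the wrap to 0, matching the two kinds of '^' marks above.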
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Added common function for to_ipsec_sa_entry · 75ef3f55
      Aviad Yehezkel authored
      Add a new function for getting the driver-internal SA entry from the
      xfrm state. All checks are done in one function.
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Add flow-steering commands for FPGA IPSec implementation · 05564d0a
      Aviad Yehezkel authored
      In order to add a context to the FPGA, we need to get both the software
      transform context (which includes the keys, etc) and the
      source/destination IPs (which are included in the steering
      rule). Therefore, we register a new set of firmware-like commands for
      the FPGA. Each time a rule is added, the steering core infrastructure
      calls the FPGA command layer. If the rule is intended for the FPGA,
      it combines the IP information with the software transformation
      context and creates the respective hardware transform.
      Afterwards, it calls the standard steering command layer.
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Refactor accel IPSec code · d6c4f029
      Aviad Yehezkel authored
      The current code has one layer that executes FPGA commands, and
      the Ethernet part uses this code directly. Since downstream patches
      introduce support for IPSec in mlx5_ib, we need to provide some
      abstractions. This patch refactors the accel code into one layer
      that creates a software IPSec transformation and another one which
      creates the actual hardware context.
      The internal command implementation is now hidden in the FPGA
      core layer. The code also adds the ability to share FPGA hardware
      contexts. If two contexts are the same, only a reference count
      is taken.
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Added required metadata capability for ipsec · af9fe19d
      Aviad Yehezkel authored
      Currently our device requires additional metadata in the packet
      to perform ipsec crypto offload.
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Export ipsec capabilities · 1d2005e2
      Aviad Yehezkel authored
      We will need that for ipsec verbs.
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPSec, Add command V2 support · 65802f48
      Aviad Yehezkel authored
      This patch adds V2 command support.
      New FPGA devices support extended features (UDP encap, ESN, etc.). These
      features require a new hardware SADB format, so there is a new version
      of the commands to manipulate it.
      Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: IPSec, Add support for ESP trailer removal by hardware · 788a8210
      Yossi Kuperman authored
      Current hardware decrypts and authenticates incoming ESP packets.
      Subsequently, the software extracts the nexthdr field, truncates the
      trailer and adjusts csum accordingly.
      
      With this patch and a capable device, the trailer is removed
      by the hardware and the nexthdr field is conveyed via PET. This way
      we avoid both the need to access the trailer (cache miss) and to
      compute its relative checksum, which significantly improves
      performance.
      
      An experiment with netperf shows that trailer removal improves
      performance by 2Gbps in both forwarding and host-to-host configurations.
      Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPSec, Generalize sandbox QP commands · 581fddde
      Yossi Kuperman authored
      The current code assumes only SA QP commands.
      Refactor it to pave the way for new QP commands:
      1. Generic cmd response format.
      2. SA cmd checks are in dedicated functions.
      3. Aligned debug prints.
      Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
      Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Use MLX5_IPSEC_DEV macro for ipsec caps · d83a69c2
      Saeed Mahameed authored
      Fix a build break where mlx5_accel_ipsec_device_caps is not defined when
      MLX5_ACCEL is not selected. Use MLX5_IPSEC_DEV instead, which handles
      this case.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reported-by: Doug Ledford <dledford@redhat.com>