1. 15 Jan, 2018 1 commit
    • RDMA/mlx5: Fix out-of-bound access while querying AH · ae59c3f0
      Leon Romanovsky authored
      rdma_ah_find_type() accesses the port array based on an index
      controlled by userspace. The existing bounds check comes after the first
      use of the index, so userspace can trigger an out-of-bounds access, as
      shown by the KASAN report below.
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0xa8/0x3b0
      Read of size 4 at addr ffff880019ae2268 by task ibv_rc_pingpong/409
      
      CPU: 0 PID: 409 Comm: ibv_rc_pingpong Not tainted 4.15.0-rc2-00031-gb60a3faf5b83-dirty #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      Call Trace:
       dump_stack+0xe9/0x18f
       print_address_description+0xa2/0x350
       kasan_report+0x3a5/0x400
       to_rdma_ah_attr+0xa8/0x3b0
       mlx5_ib_query_qp+0xd35/0x1330
       ib_query_qp+0x8a/0xb0
       ib_uverbs_query_qp+0x237/0x7f0
       ib_uverbs_write+0x617/0xd80
       __vfs_write+0xf7/0x500
       vfs_write+0x149/0x310
       SyS_write+0xca/0x190
       entry_SYSCALL_64_fastpath+0x18/0x85
      RIP: 0033:0x7fe9c7a275a0
      RSP: 002b:00007ffee5498738 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00007fe9c7ce4b00 RCX: 00007fe9c7a275a0
      RDX: 0000000000000018 RSI: 00007ffee5498800 RDI: 0000000000000003
      RBP: 000055d0c8d3f010 R08: 00007ffee5498800 R09: 0000000000000018
      R10: 00000000000000ba R11: 0000000000000246 R12: 0000000000008000
      R13: 0000000000004fb0 R14: 000055d0c8d3f050 R15: 00007ffee5498560
      
      Allocated by task 1:
       __kmalloc+0x3f9/0x430
       alloc_mad_private+0x25/0x50
       ib_mad_post_receive_mads+0x204/0xa60
       ib_mad_init_device+0xa59/0x1020
       ib_register_device+0x83a/0xbc0
       mlx5_ib_add+0x50e/0x5c0
       mlx5_add_device+0x142/0x410
       mlx5_register_interface+0x18f/0x210
       mlx5_ib_init+0x56/0x63
       do_one_initcall+0x15b/0x270
       kernel_init_freeable+0x2d8/0x3d0
       kernel_init+0x14/0x190
       ret_from_fork+0x24/0x30
      
      Freed by task 0:
      (stack is not available)
      
      The buggy address belongs to the object at ffff880019ae2000
       which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 104 bytes to the right of
       512-byte region [ffff880019ae2000, ffff880019ae2200)
      The buggy address belongs to the page:
      page:000000005d674e18 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
      flags: 0x4000000000008100(slab|head)
      raw: 4000000000008100 0000000000000000 0000000000000000 00000001000c000c
      raw: dead000000000100 dead000000000200 ffff88001a402000 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff880019ae2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff880019ae2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc
      >ffff880019ae2200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                                ^
       ffff880019ae2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff880019ae2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      ==================================================================
      Disabling lock debugging due to kernel taint
      
      Cc: <stable@vger.kernel.org>
      Fixes: 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
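The ordering problem described in the commit above can be illustrated with a minimal userspace sketch. All names here (`struct device`, `find_type`, `query_ah_type`, `NUM_PORTS`) are hypothetical stand-ins, not the mlx5 driver's code; the sketch only assumes 1-based port numbers and shows the range check placed before the first use of the index.

```c
#include <errno.h>
#include <stddef.h>

#define NUM_PORTS 2 /* hypothetical port count for this sketch */

enum ah_type { AH_TYPE_IB, AH_TYPE_ROCE };

struct device {
	enum ah_type port_type[NUM_PORTS]; /* indexed by port_num - 1 */
};

/* Array lookup; only safe when 1 <= port_num <= NUM_PORTS. */
static enum ah_type find_type(const struct device *dev, unsigned int port_num)
{
	return dev->port_type[port_num - 1];
}

/*
 * Validate the caller-controlled port number first, and only then
 * use it as an array index.
 */
static int query_ah_type(const struct device *dev, unsigned int port_num,
			 enum ah_type *out)
{
	if (port_num == 0 || port_num > NUM_PORTS)
		return -EINVAL;

	*out = find_type(dev, port_num);
	return 0;
}
```

With the check first, an out-of-range port number is rejected without ever touching the array.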
  2. 10 Jan, 2018 2 commits
  3. 04 Jan, 2018 2 commits
  4. 02 Jan, 2018 3 commits
    • RDMA/netlink: Fix locking around __ib_get_device_by_index · f8978bd9
      Leon Romanovsky authored
      Holding locks is mandatory when calling __ib_device_get_by_index,
      otherwise there are races during the list iteration with device removal.
      
      Since the locks are static to device.c, __ib_device_get_by_index can
      never be called correctly by any user outside the file.
      
      Make the function static and provide a safe function that gets the
      correct locks and returns a kref'd pointer. Fix all callers.
      
      Fixes: e5c9469e ("RDMA/netlink: Add nldev device doit implementation")
      Fixes: c3f66f7b ("RDMA/netlink: Implement nldev port doit callback")
      Fixes: 7d02f605 ("RDMA/netlink: Add nldev port dumpit implementation")
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
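The pattern in the commit above (a file-local lookup that requires the lock, plus a public wrapper that takes the lock and returns a referenced pointer) might look roughly like the following userspace sketch. It uses a pthread mutex and a plain counter in place of the kernel's locks and kref; every name in it is an assumption made for illustration.

```c
#include <pthread.h>
#include <stddef.h>

struct device {
	unsigned int index;
	int refcount;        /* protected by list_lock in this sketch */
	struct device *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct device *device_list;

/* Caller must hold list_lock. Static, so no outside code can misuse it. */
static struct device *device_get_by_index_locked(unsigned int index)
{
	struct device *dev;

	for (dev = device_list; dev; dev = dev->next)
		if (dev->index == index)
			return dev;
	return NULL;
}

/* Add a device to the list under the lock. */
void device_register(struct device *dev)
{
	pthread_mutex_lock(&list_lock);
	dev->next = device_list;
	device_list = dev;
	pthread_mutex_unlock(&list_lock);
}

/* Safe entry point: takes the lock and returns a referenced pointer. */
struct device *device_get_by_index(unsigned int index)
{
	struct device *dev;

	pthread_mutex_lock(&list_lock);
	dev = device_get_by_index_locked(index);
	if (dev)
		dev->refcount++;
	pthread_mutex_unlock(&list_lock);
	return dev;
}

/* Drop the reference taken by device_get_by_index(). */
void device_put(struct device *dev)
{
	pthread_mutex_lock(&list_lock);
	dev->refcount--;
	pthread_mutex_unlock(&list_lock);
}
```

Because the reference is taken while the lock is still held, the device cannot disappear between the lookup and the caller's first use.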
    • IB/ipoib: Fix race condition in neigh creation · 16ba3def
      Erez Shitrit authored
      When using enhanced mode for IPoIB, two threads may execute xmit in
      parallel to two different TX queues while the target is the same.
      In this case, both of them will add the same neighbor to the path's
      neigh link list and we might see the following message:
      
        list_add double add: new=ffff88024767a348, prev=ffff88024767a348...
        WARNING: lib/list_debug.c:31__list_add_valid+0x4e/0x70
        ipoib_start_xmit+0x477/0x680 [ib_ipoib]
        dev_hard_start_xmit+0xb9/0x3e0
        sch_direct_xmit+0xf9/0x250
        __qdisc_run+0x176/0x5d0
        __dev_queue_xmit+0x1f5/0xb10
        __dev_queue_xmit+0x55/0xb10
      
      Analysis:
      Two SKBs are scheduled to be transmitted from two cores.
      In ipoib_start_xmit, both get NULL when calling ipoib_neigh_get.
      Two calls to neigh_add_path are made. One thread takes the spin-lock
      and calls ipoib_neigh_alloc which creates the neigh structure,
      then (after the __path_find) the neigh is added to the path's neigh
      link list. When the second thread enters the critical section it also
      calls ipoib_neigh_alloc but in this case it gets the already allocated
      ipoib_neigh structure, which is already linked to the path's neigh
      link list, and adds it again to the list. Besides triggering the
      list debug warning, this creates a loop in the linked list, which
      leads to an endless loop inside path_rec_completion.
      
      Solution:
      Check list_empty(&neigh->list) before adding to the list.
      Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send"
      
      Fixes: b63b70d8 ('IPoIB: Use a private hash table for path lookup in xmit path')
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Alex Vesker <valex@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
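The list_empty() guard named in the Solution can be sketched in userspace with a small circular doubly linked list modeled on the kernel's list_head. The helpers below are simplified re-implementations, and `struct neigh`, `path_add_neigh` and `list_count` are hypothetical names; the caller is assumed to hold whatever lock protects the list.

```c
#include <stdbool.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

/* A node that points at itself is not linked into any list. */
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

struct neigh { struct list_head list; };

/* Link the neigh to the path only if it is not already on a list. */
static bool path_add_neigh(struct list_head *path_neigh_list, struct neigh *n)
{
	if (!list_empty(&n->list))
		return false; /* someone else already linked it */
	list_add_tail(&n->list, path_neigh_list);
	return true;
}

/* Count entries; terminates only if the list has no loop. */
static unsigned int list_count(const struct list_head *head)
{
	const struct list_head *p;
	unsigned int n = 0;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

A second attempt to add the same neigh is refused, so the list keeps a single entry and stays loop-free.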
    • IB/mlx4: Fix mlx4_ib_alloc_mr error flow · 5a371cf8
      Leon Romanovsky authored
      ibmr.device is set only after ib_alloc_mr() completes successfully.
      Therefore, if mlx4_mr_enable() returns an error, the error flow
      unwinder calls mlx4_free_priv_pages(), which uses ibmr.device.
      
      This causes a NULL dereference oops. To fix it, the IB device should
      be set in the mr struct at an earlier stage (i.e. prior to calling
      mlx4_free_priv_pages()).
      
      Fixes: 1b2cd0fc ("IB/mlx4: Support the new memory registration API")
      Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
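The error-flow issue above comes down to initializing a field before any step whose unwinder depends on it. A rough userspace sketch follows, with hypothetical names (`struct mr`, `alloc_mr`, `free_priv_pages`) and an `enable_err` parameter that merely stands in for a failing enable step; it is not the mlx4 code.

```c
#include <errno.h>
#include <stdlib.h>

struct device { int unmapped; /* counts releases done via this device */ };

struct mr {
	struct device *device;
	void *pages;
};

/* Unwinder: dereferences mr->device, so it must already be set. */
static void free_priv_pages(struct mr *mr)
{
	mr->device->unmapped++;
	free(mr->pages);
	mr->pages = NULL;
}

static int alloc_mr(struct device *dev, struct mr *mr, int enable_err)
{
	mr->device = dev; /* set early so the error path can rely on it */

	mr->pages = malloc(64);
	if (!mr->pages)
		return -ENOMEM;

	if (enable_err) { /* stands in for a failing enable step */
		free_priv_pages(mr);
		return enable_err;
	}
	return 0;
}
```

If the assignment to `mr->device` were moved after the enable step, the unwinder would dereference a NULL pointer on the failure path.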
  5. 27 Dec, 2017 4 commits
  6. 22 Dec, 2017 1 commit
  7. 21 Dec, 2017 8 commits
  8. 13 Dec, 2017 1 commit
  9. 11 Dec, 2017 1 commit
    • iw_cxgb4: only insert drain cqes if wq is flushed · c058ecf6
      Steve Wise authored
      Only insert our special drain CQEs to support ib_drain_sq/rq() after
      the wq is flushed. Otherwise, existing but not yet polled CQEs can be
      returned out of order to the user application.  This can happen when the
      QP has exited RTS but not yet flushed the QP, which can happen during
      a normal close (vs abortive close).
      
      In addition never count the drain CQEs when determining how many CQEs
      need to be synthesized during the flush operation.  This latter issue
      should never happen if the QP is properly flushed before inserting the
      drain CQE, but I wanted to avoid corrupting the CQ state.  So we handle
      it and log a warning once.
      
      Fixes: 4fe7c296 ("iw_cxgb4: refactor sq/rq drain logic")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  10. 07 Dec, 2017 5 commits
    • iw_cxgb4: only clear the ARMED bit if a notification is needed · 335ebf6f
      Steve Wise authored
      In __flush_qp(), the CQ ARMED bit was being cleared regardless of
      whether any notification is actually needed.  This resulted in the iser
      termination logic getting stuck in ib_drain_sq() because the CQ was not
      marked ARMED and thus the drain CQE notification wasn't triggered.
      
      This new bug was exposed when this commit was merged:
      
      commit cbb40fad ("iw_cxgb4: only call the cq comp_handler when the
      cq is armed")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/netlink: Fix general protection fault · d0e312fe
      Leon Romanovsky authored
      The RDMA netlink core code checks the validity of messages by ensuring
      that type and operand are in range. This works well for almost all
      clients except NLDEV, whose cb_table has fewer entries than the number
      of operands.
      
      A request to access such an operand triggers the following kernel panic.
      
      This patch updates all places where cb_table is declared for
      consistency, although only NLDEV actually needs it.
      
      general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
      Modules linked in:
      CPU: 0 PID: 522 Comm: syz-executor6 Not tainted 4.13.0+ #4
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      task: ffff8800657799c0 task.stack: ffff8800695d000
      RIP: 0010:rdma_nl_rcv_msg+0x13a/0x4c0
      RSP: 0018:ffff8800695d7838 EFLAGS: 00010207
      RAX: dffffc0000000000 RBX: 1ffff1000d2baf0b RCX: 00000000704ff4d7
      RDX: 0000000000000000 RSI: ffffffff81ddb03c RDI: 00000003827fa6bc
      RBP: ffff8800695d7900 R08: ffffffff82ec0578 R09: 0000000000000000
      R10: ffff8800695d7900 R11: 0000000000000001 R12: 000000000000001c
      R13: ffff880069d31e00 R14: 00000000ffffffff R15: ffff880069d357c0
      FS:  00007fee6acb8700(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000201a9000 CR3: 0000000059766000 CR4: 00000000000006b0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ? rdma_nl_multicast+0x80/0x80
       rdma_nl_rcv+0x36b/0x4d0
       ? ibnl_put_attr+0xc0/0xc0
       netlink_unicast+0x4bd/0x6d0
       ? netlink_sendskb+0x50/0x50
       ? drop_futex_key_refs.isra.4+0x68/0xb0
       netlink_sendmsg+0x9ab/0xbd0
       ? nlmsg_notify+0x140/0x140
       ? wake_up_q+0xa1/0xf0
       ? drop_futex_key_refs.isra.4+0x68/0xb0
       sock_sendmsg+0x88/0xd0
       sock_write_iter+0x228/0x3c0
       ? sock_sendmsg+0xd0/0xd0
       ? do_futex+0x3e5/0xb20
       ? iov_iter_init+0xaf/0x1d0
       __vfs_write+0x46e/0x640
       ? sched_clock_cpu+0x1b/0x190
       ? __vfs_read+0x620/0x620
       ? __fget+0x23a/0x390
       ? rw_verify_area+0xca/0x290
       vfs_write+0x192/0x490
       SyS_write+0xde/0x1c0
       ? SyS_read+0x1c0/0x1c0
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       entry_SYSCALL_64_fastpath+0x18/0xad
      RIP: 0033:0x7fee6a74a219
      RSP: 002b:00007fee6acb7d58 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000638000 RCX: 00007fee6a74a219
      RDX: 0000000000000078 RSI: 0000000020141000 RDI: 0000000000000006
      RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000212 R12: ffff8800695d7f98
      R13: 0000000020141000 R14: 0000000000000006 R15: 00000000ffffffff
      Code: d6 48 b8 00 00 00 00 00 fc ff df 66 41 81 e4 ff 03 44 8d 72 ff 4a 8d 3c b5 c0 a6 7f 82 44 89 b5 4c ff ff ff 48 89 f9 48 c1 e9 03 <0f> b6 0c 01 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
      RIP: rdma_nl_rcv_msg+0x13a/0x4c0 RSP: ffff8800695d7838
      ---[ end trace ba085d123959c8ec ]---
      Kernel panic - not syncing: Fatal exception
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Fixes: b4c598a6 ("RDMA/netlink: Implement nldev device dumpit calback")
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
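One way to picture the cb_table problem above: if the table is declared with fewer slots than there are operands, an in-range operand still indexes past its end. The sketch below sizes the table by the operand count and rejects empty slots. The operand names, `struct cb_entry` and `rcv_msg` are assumptions made for illustration, not the RDMA netlink API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical operand numbering; NUM_OPS is one past the last operand. */
enum { OP_GET = 1, OP_SET, OP_NEW, OP_DEL, NUM_OPS };

typedef int (*op_handler)(void);

struct cb_entry { op_handler doit; };

static int handle_get(void) { return 42; }

/*
 * Declared with NUM_OPS entries, so every in-range operand has a slot
 * even when no handler is registered for it.
 */
static const struct cb_entry cb_table[NUM_OPS] = {
	[OP_GET] = { .doit = handle_get },
};

static int rcv_msg(unsigned int op)
{
	if (op >= NUM_OPS)
		return -EINVAL;      /* operand out of range */
	if (!cb_table[op].doit)
		return -EINVAL;      /* in range, but no handler */
	return cb_table[op].doit();
}
```

The range check alone is only sufficient when the table really has `NUM_OPS` entries, which is why the explicit array size matters.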
    • IB/mlx4: Fix RSS hash fields restrictions · 4d02ebd9
      Guy Levi authored
      The driver mistakenly rejected RSS hash field combinations that
      involve both IPv4 and IPv6 protocols, which broke legitimate user
      RSS configurations.
      
      This patch fixes the bug and allows any combination that the HW can
      support.
      
      Additionally, the patch fixes the driver to return an error in case the
      user provides an unsupported mask for RSS hash fields.
      
      Fixes: 3078f5f1 ("IB/mlx4: Add support for RSS QP")
      Signed-off-by: Guy Levi <guyle@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
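The two behaviors described above (accept any supported combination, reject unknown bits) can be sketched as a simple mask check. The flag names and values, and the choice of `-EINVAL` as the error code, are assumptions made for this sketch; they are not taken from the mlx4 driver.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical hash-field flags for this sketch. */
#define HASH_SRC_IPV4 (1u << 0)
#define HASH_DST_IPV4 (1u << 1)
#define HASH_SRC_IPV6 (1u << 2)
#define HASH_DST_IPV6 (1u << 3)

#define HASH_SUPPORTED \
	(HASH_SRC_IPV4 | HASH_DST_IPV4 | HASH_SRC_IPV6 | HASH_DST_IPV6)

/*
 * Accept any combination of supported fields, including mixed
 * IPv4 and IPv6 selections; reject masks with unknown bits.
 */
static int check_rss_hash_fields(uint64_t mask)
{
	if (mask & ~(uint64_t)HASH_SUPPORTED)
		return -EINVAL;
	return 0;
}
```

Checking only for unsupported bits, rather than for specific protocol pairings, keeps mixed IPv4 and IPv6 selections valid.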
    • IB/core: Don't enforce PKey security on SMI MADs · 0fbe8f57
      Daniel Jurgens authored
      Per the InfiniBand spec, an SMI MAD can have any PKey. Checking the pkey
      on SMI MADs is not necessary, and it seems that some older adapters
      using the mthca driver don't follow the convention of using the default
      PKey, resulting in false denials, or errors querying the PKey cache.
      
      SMI MAD security is still enforced: only agents allowed to manage the
      subnet are able to receive or send SMI MADs.
      Reported-by: Chris Blake <chrisrblake93@gmail.com>
      Cc: <stable@vger.kernel.org> # v4.12
      Fixes: 47a2b338 ("IB/core: Enforce security on management datagrams")
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Bound check alternate path port number · 4cae8ff1
      Daniel Jurgens authored
      The alternate port number is used as an array index in the IB
      security implementation; invalid values can result in a kernel panic.
      
      Cc: <stable@vger.kernel.org> # v4.12
      Fixes: d291f1a6 ("IB/core: Enforce PKey security on QPs")
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  11. 01 Dec, 2017 11 commits
  12. 30 Nov, 2017 1 commit
    • IB: INFINIBAND should depend on HAS_DMA · db0acbc4
      Geert Uytterhoeven authored
      If NO_DMA=y:
      
          ERROR: "bad_dma_ops" [net/sunrpc/xprtrdma/rpcrdma.ko] undefined!
          ERROR: "bad_dma_ops" [net/smc/smc.ko] undefined!
          ERROR: "bad_dma_ops" [net/rds/rds_rdma.ko] undefined!
          ERROR: "bad_dma_ops" [net/9p/9pnet_rdma.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/nvme/target/nvmet-rdma.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/nvme/host/nvme-rdma.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srpt/ib_srpt.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srp/ib_srp.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/isert/ib_isert.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/ipoib/ib_ipoib.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/core/ib_core.ko] undefined!
      
      Before, this was handled implicitly by the dependency on PCI.
      Add an explicit dependency on HAS_DMA to fix this.
      
      Fixes: 931bc0d9 ("IB: Move PCI dependency from root KConfig to HW's KConfigs")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>