1. 02 Jan, 2018 3 commits
    • RDMA/netlink: Fix locking around __ib_get_device_by_index · f8978bd9
      Leon Romanovsky authored
      Holding locks is mandatory when calling __ib_device_get_by_index,
      otherwise there are races during the list iteration with device removal.
      
      Since the locks are static to device.c, __ib_device_get_by_index can
      never be called correctly by any user outside the file.
      
      Make the function static and provide a safe function that gets the
      correct locks and returns a kref'd pointer. Fix all callers.
      
      Fixes: e5c9469e ("RDMA/netlink: Add nldev device doit implementation")
      Fixes: c3f66f7b ("RDMA/netlink: Implement nldev port doit callback")
      Fixes: 7d02f605 ("RDMA/netlink: Add nldev port dumpit implementation")
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
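The pattern described in the RDMA/netlink locking commit above (keep the raw lookup private to the file that owns the lock, and export only a wrapper that takes the lock and returns a referenced pointer) can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct device`, `device_get_by_index()`, `device_put()`, `device_register()` and the pthread mutex are hypothetical stand-ins for the kernel's device list, its locks, and kref.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered device; not the kernel's struct. */
struct device {
    unsigned int index;
    int refcount;            /* protected by list_lock in this sketch */
    struct device *next;
};

/* The list and its lock are both private to this file. */
static struct device *device_list;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Raw lookup: only correct while list_lock is held, so it stays static. */
static struct device *device_get_by_index_locked(unsigned int index)
{
    for (struct device *d = device_list; d != NULL; d = d->next)
        if (d->index == index)
            return d;
    return NULL;
}

/* Safe wrapper: takes the lock and returns a referenced pointer, or NULL. */
struct device *device_get_by_index(unsigned int index)
{
    pthread_mutex_lock(&list_lock);
    struct device *d = device_get_by_index_locked(index);
    if (d != NULL)
        d->refcount++;
    pthread_mutex_unlock(&list_lock);
    return d;
}

/* Drop the reference taken by device_get_by_index(). */
void device_put(struct device *d)
{
    assert(d != NULL);
    pthread_mutex_lock(&list_lock);
    d->refcount--;
    pthread_mutex_unlock(&list_lock);
}

/* Add a device to the list; the list itself holds one reference. */
void device_register(struct device *d, unsigned int index)
{
    assert(d != NULL);
    d->index = index;
    d->refcount = 1;
    pthread_mutex_lock(&list_lock);
    d->next = device_list;
    device_list = d;
    pthread_mutex_unlock(&list_lock);
}
```

Because callers outside the file can only reach `device_get_by_index()`, they cannot iterate the list without the lock, and the reference they receive keeps the object from going away until they call `device_put()`.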
    • IB/ipoib: Fix race condition in neigh creation · 16ba3def
      Erez Shitrit authored
      When using enhanced mode for IPoIB, two threads may execute xmit in
      parallel to two different TX queues while the target is the same.
      In this case, both of them will add the same neighbor to the path's
      neigh link list and we might see the following message:
      
        list_add double add: new=ffff88024767a348, prev=ffff88024767a348...
        WARNING: lib/list_debug.c:31 __list_add_valid+0x4e/0x70
        ipoib_start_xmit+0x477/0x680 [ib_ipoib]
        dev_hard_start_xmit+0xb9/0x3e0
        sch_direct_xmit+0xf9/0x250
        __qdisc_run+0x176/0x5d0
        __dev_queue_xmit+0x1f5/0xb10
        __dev_queue_xmit+0x55/0xb10
      
      Analysis:
      Two SKBs are scheduled to be transmitted from two cores.
      In ipoib_start_xmit, both get NULL when calling ipoib_neigh_get.
      Two calls to neigh_add_path are made. One thread takes the spin-lock
      and calls ipoib_neigh_alloc, which creates the neigh structure;
      then (after the __path_find) the neigh is added to the path's neigh
      link list. When the second thread enters the critical section it also
      calls ipoib_neigh_alloc, but in this case it gets the already allocated
      ipoib_neigh structure, which is already linked to the path's neigh
      link list, and adds it to the list again. Besides triggering the
      list_add warning, this creates a loop in the linked list, which leads
      to an endless loop inside path_rec_completion.
      
      Solution:
      Check list_empty(&neigh->list) before adding to the list.
      Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send"
      
      Fixes: b63b70d8 ('IPoIB: Use a private hash table for path lookup in xmit path')
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Alex Vesker <valex@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
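The fix described in the IB/ipoib commit above (check that the entry is not already linked before adding it) can be sketched with a small circular doubly linked list. This is a simplified userspace illustration: `struct list_head` here is a hand-written imitation of the kernel type, and `neigh_link_once()` and `list_count()` are hypothetical helpers. The caller is assumed to hold the lock that serializes the two transmitting threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialized, unlinked node points at itself. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Insert entry right after head. */
static void list_add_after(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

struct neigh {
    struct list_head list;   /* linkage into the path's neigh list */
};

/*
 * Link the neigh into the path's list only if it is not linked yet.
 * Returns true if this call linked it, false if it was already linked.
 */
bool neigh_link_once(struct neigh *n, struct list_head *path_neigh_list)
{
    assert(n != NULL && path_neigh_list != NULL);
    if (!list_is_empty(&n->list))
        return false;
    list_add_after(&n->list, path_neigh_list);
    return true;
}

/* Count the entries on a list, not including the head itself. */
int list_count(const struct list_head *head)
{
    int count = 0;
    for (const struct list_head *p = head->next; p != head; p = p->next)
        count++;
    return count;
}
```

With this guard, a second thread that obtains the same already-linked neigh skips the insertion, so the list keeps a single entry and no cycle is created.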
    • IB/mlx4: Fix mlx4_ib_alloc_mr error flow · 5a371cf8
      Leon Romanovsky authored
      ibmr.device is set only after ib_alloc_mr() completes successfully.
      Therefore, if mlx4_mr_enable() returns an error, the error flow
      unwinder calls mlx4_free_priv_pages(), which uses ibmr.device.
      
      Such usage causes a NULL dereference oops. To fix it, the IB device
      should be set in the mr struct at an earlier stage (e.g. prior to calling
      mlx4_free_priv_pages()).
      
      Fixes: 1b2cd0fc ("IB/mlx4: Support the new memory registration API")
      Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 27 Dec, 2017 4 commits
  3. 22 Dec, 2017 1 commit
  4. 21 Dec, 2017 8 commits
  5. 13 Dec, 2017 1 commit
  6. 11 Dec, 2017 1 commit
    • iw_cxgb4: only insert drain cqes if wq is flushed · c058ecf6
      Steve Wise authored
      Only insert our special drain CQEs to support ib_drain_sq/rq() after
      the wq is flushed. Otherwise, existing but not yet polled CQEs can be
      returned out of order to the user application.  This can happen when the
      QP has exited RTS but has not yet been flushed, which is possible during
      a normal close (vs abortive close).
      
      In addition never count the drain CQEs when determining how many CQEs
      need to be synthesized during the flush operation.  This latter issue
      should never happen if the QP is properly flushed before inserting the
      drain CQE, but I wanted to avoid corrupting the CQ state.  So we handle
      it and log a warning once.
      
      Fixes: 4fe7c296 ("iw_cxgb4: refactor sq/rq drain logic")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  7. 07 Dec, 2017 5 commits
    • iw_cxgb4: only clear the ARMED bit if a notification is needed · 335ebf6f
      Steve Wise authored
      In __flush_qp(), the CQ ARMED bit was being cleared regardless of
      whether any notification is actually needed.  This resulted in the iser
      termination logic getting stuck in ib_drain_sq() because the CQ was not
      marked ARMED and thus the drain CQE notification wasn't triggered.
      
      This new bug was exposed when this commit was merged:
      
      commit cbb40fad ("iw_cxgb4: only call the cq comp_handler when the
      cq is armed")
      
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • RDMA/netlink: Fix general protection fault · d0e312fe
      Leon Romanovsky authored
      The RDMA netlink core code checks the validity of messages by ensuring
      that type and operand are in range. This works well for almost all
      clients except NLDEV, whose cb_table has fewer entries than the number
      of operands.
      
      A request to access such an operand triggers the kernel panic shown
      below.
      
      This patch updates all places where cb_table is declared for
      consistency, but only NLDEV actually needs it.
      
      general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
      Modules linked in:
      CPU: 0 PID: 522 Comm: syz-executor6 Not tainted 4.13.0+ #4
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      task: ffff8800657799c0 task.stack: ffff8800695d000
      RIP: 0010:rdma_nl_rcv_msg+0x13a/0x4c0
      RSP: 0018:ffff8800695d7838 EFLAGS: 00010207
      RAX: dffffc0000000000 RBX: 1ffff1000d2baf0b RCX: 00000000704ff4d7
      RDX: 0000000000000000 RSI: ffffffff81ddb03c RDI: 00000003827fa6bc
      RBP: ffff8800695d7900 R08: ffffffff82ec0578 R09: 0000000000000000
      R10: ffff8800695d7900 R11: 0000000000000001 R12: 000000000000001c
      R13: ffff880069d31e00 R14: 00000000ffffffff R15: ffff880069d357c0
      FS:  00007fee6acb8700(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000201a9000 CR3: 0000000059766000 CR4: 00000000000006b0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ? rdma_nl_multicast+0x80/0x80
       rdma_nl_rcv+0x36b/0x4d0
       ? ibnl_put_attr+0xc0/0xc0
       netlink_unicast+0x4bd/0x6d0
       ? netlink_sendskb+0x50/0x50
       ? drop_futex_key_refs.isra.4+0x68/0xb0
       netlink_sendmsg+0x9ab/0xbd0
       ? nlmsg_notify+0x140/0x140
       ? wake_up_q+0xa1/0xf0
       ? drop_futex_key_refs.isra.4+0x68/0xb0
       sock_sendmsg+0x88/0xd0
       sock_write_iter+0x228/0x3c0
       ? sock_sendmsg+0xd0/0xd0
       ? do_futex+0x3e5/0xb20
       ? iov_iter_init+0xaf/0x1d0
       __vfs_write+0x46e/0x640
       ? sched_clock_cpu+0x1b/0x190
       ? __vfs_read+0x620/0x620
       ? __fget+0x23a/0x390
       ? rw_verify_area+0xca/0x290
       vfs_write+0x192/0x490
       SyS_write+0xde/0x1c0
       ? SyS_read+0x1c0/0x1c0
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       entry_SYSCALL_64_fastpath+0x18/0xad
      RIP: 0033:0x7fee6a74a219
      RSP: 002b:00007fee6acb7d58 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000638000 RCX: 00007fee6a74a219
      RDX: 0000000000000078 RSI: 0000000020141000 RDI: 0000000000000006
      RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000212 R12: ffff8800695d7f98
      R13: 0000000020141000 R14: 0000000000000006 R15: 00000000ffffffff
      Code: d6 48 b8 00 00 00 00 00 fc ff df 66 41 81 e4 ff 03 44 8d 72 ff 4a 8d 3c b5 c0 a6 7f 82 44 89 b5 4c ff ff ff 48 89 f9 48 c1 e9 03 <0f> b6 0c 01 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
      RIP: rdma_nl_rcv_msg+0x13a/0x4c0 RSP: ffff8800695d7838
      ---[ end trace ba085d123959c8ec ]---
      Kernel panic - not syncing: Fatal exception
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Fixes: b4c598a6 ("RDMA/netlink: Implement nldev device dumpit calback")
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
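The failure mode in the RDMA/netlink general protection fault commit above is a range check that passes while the callback table itself is shorter than the operand range. One way to illustrate the idea of declaring the table with an explicit size is the sketch below. It is an assumption-laden illustration, not the kernel patch: the operand names, `op_handler`, `cb_table` layout and `dispatch()` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand numbering; the real NLDEV operands differ. */
enum { OP_GET = 1, OP_SET = 2, OP_NEW = 3, OP_NUM_OPS = 4 };

typedef int (*op_handler)(int arg);

static int handle_get(int arg)
{
    return arg + 1;
}

/*
 * The table is sized to OP_NUM_OPS explicitly, so operands without a
 * handler are NULL entries instead of memory past the end of the array.
 */
static const op_handler cb_table[OP_NUM_OPS] = {
    [OP_GET] = handle_get,
};

/* Compile-time check that the populated operand fits in the table. */
static_assert(OP_GET < OP_NUM_OPS, "operand must fit in cb_table");

/* Returns the handler's result, or -1 for an invalid or unhandled operand. */
int dispatch(unsigned int op, int arg)
{
    if (op >= OP_NUM_OPS)
        return -1;           /* operand out of range */
    if (cb_table[op] == NULL)
        return -1;           /* in range, but no handler registered */
    return cb_table[op](arg);
}
```

With an implicitly sized table (`cb_table[] = { [OP_GET] = ... }`), the array would end at the last initialized entry, and an in-range operand such as `OP_NEW` would index past it.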
    • IB/mlx4: Fix RSS hash fields restrictions · 4d02ebd9
      Guy Levi authored
      The driver mistakenly disallowed RSS hash field combinations that
      involve both IPv4 and IPv6 protocols. This bug caused failures in
      users' RSS use cases.
      
      This patch fixes the bug and allows any combination that the HW can
      support.
      
      Additionally, the patch fixes the driver to return an error in case the
      user provides an unsupported mask for RSS hash fields.
      
      Fixes: 3078f5f1 ("IB/mlx4: Add support for RSS QP")
      Signed-off-by: Guy Levi <guyle@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Don't enforce PKey security on SMI MADs · 0fbe8f57
      Daniel Jurgens authored
      Per the InfiniBand spec, an SMI MAD can have any PKey. Checking the PKey
      on SMI MADs is not necessary, and it seems that some older adapters
      using the mthca driver don't follow the convention of using the default
      PKey, resulting in false denials or errors querying the PKey cache.
      
      SMI MAD security is still enforced: only agents allowed to manage the
      subnet are able to receive or send SMI MADs.
      
      Reported-by: Chris Blake <chrisrblake93@gmail.com>
      Cc: <stable@vger.kernel.org> # v4.12
      Fixes: 47a2b338 ("IB/core: Enforce security on management datagrams")
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Bound check alternate path port number · 4cae8ff1
      Daniel Jurgens authored
      The alternate port number is used as an array index in the IB
      security implementation; invalid values can result in a kernel panic.
      
      Cc: <stable@vger.kernel.org> # v4.12
      Fixes: d291f1a6 ("IB/core: Enforce PKey security on QPs")
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
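The bound check described in the alternate path port number commit above amounts to validating a caller-supplied port number before it is used as an array index. A minimal sketch follows, assuming 1-based port numbering and a per-port table indexed by `port_num - 1`. `struct dev_security`, `MAX_PORTS`, `port_num_valid()` and `track_alt_port()` are hypothetical names for illustration, not the kernel's security code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PORTS 4  /* hypothetical size of the per-port table */

struct dev_security {
    unsigned int phys_port_cnt;   /* valid ports are 1..phys_port_cnt */
    int pkey_refs[MAX_PORTS];     /* per-port state, indexed by port_num - 1 */
};

/* True if port_num names a port that exists on this device. */
bool port_num_valid(const struct dev_security *dev, unsigned int port_num)
{
    return port_num >= 1 && port_num <= dev->phys_port_cnt;
}

/* Returns 0 on success, -1 if alt_port_num is out of range. */
int track_alt_port(struct dev_security *dev, unsigned int alt_port_num)
{
    assert(dev->phys_port_cnt <= MAX_PORTS);
    if (!port_num_valid(dev, alt_port_num))
        return -1;                        /* reject before indexing */
    dev->pkey_refs[alt_port_num - 1]++;
    return 0;
}
```

Rejecting the value up front turns what would be an out-of-bounds access into an ordinary error return to the caller.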
  8. 01 Dec, 2017 11 commits
  9. 30 Nov, 2017 2 commits
    • IB: INFINIBAND should depend on HAS_DMA · db0acbc4
      Geert Uytterhoeven authored
      If NO_DMA=y:
      
          ERROR: "bad_dma_ops" [net/sunrpc/xprtrdma/rpcrdma.ko] undefined!
          ERROR: "bad_dma_ops" [net/smc/smc.ko] undefined!
          ERROR: "bad_dma_ops" [net/rds/rds_rdma.ko] undefined!
          ERROR: "bad_dma_ops" [net/9p/9pnet_rdma.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/nvme/target/nvmet-rdma.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/nvme/host/nvme-rdma.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srpt/ib_srpt.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srp/ib_srp.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/isert/ib_isert.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/ulp/ipoib/ib_ipoib.ko] undefined!
          ERROR: "bad_dma_ops" [drivers/infiniband/core/ib_core.ko] undefined!
      
      Before, this was handled implicitly by the dependency on PCI.
      Add an explicit dependency on HAS_DMA to fix this.
      
      Fixes: 931bc0d9 ("IB: Move PCI dependency from root KConfig to HW's KConfigs")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/hfi1: Initialize bth1 in 16B rc ack builder · 8935780b
      Dennis Dalessandro authored
      The bth1 variable could be used uninitialized, so give it a default
      value.
      
      Otherwise we leak stack memory to the network.
      
      Fixes: 5b6cabb0 ("IB/hfi1: Add 16B RC/UC support")
      Reviewed-by: Don Hiatt <don.hiatt@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  10. 27 Nov, 2017 1 commit
  11. 26 Nov, 2017 3 commits
    • Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · bbecb1cf
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
      
       - LPAE fixes for kernel-readonly regions
      
       - Fix for get_user_pages_fast on LPAE systems
      
       - avoid tying decompressor to a particular platform if DEBUG_LL is
         enabled
      
       - BUG if we attempt to return to userspace but the to-be-restored PSR
         value keeps us in privileged mode (defeating an issue that ftracetest
         found)
      
      * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: BUG if jumping to usermode address in kernel mode
        ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
        ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
        ARM: make decompressor debug output user selectable
        ARM: fix get_user_pages_fast
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dec0029a
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
      
       - unbreak the irq trigger type check for legacy platforms
      
       - a handful of fixes for ARM GIC v3/4 interrupt controllers
      
       - a few trivial fixes all over the place
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/matrix: Make - vs ?: Precedence explicit
        irqchip/imgpdc: Use resource_size function on resource object
        irqchip/qcom: Fix u32 comparison with value less than zero
        irqchip/exiu: Fix return value check in exiu_init()
        irqchip/gic-v3-its: Remove artificial dependency on PCI
        irqchip/gic-v4: Add forward definition of struct irq_domain_ops
        irqchip/gic-v3: pr_err() strings should end with newlines
        irqchip/s3c24xx: pr_err() strings should end with newlines
        irqchip/gic-v3: Fix ppi-partitions lookup
        irqchip/gic-v4: Clear IRQ_DISABLE_UNLAZY again if mapping fails
        genirq: Track whether the trigger type has been set
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 02fc87b1
      Linus Torvalds authored
      Pull misc x86 fixes from Ingo Molnar:
       - topology enumeration fixes
       - KASAN fix
       - two entry fixes (not yet the big series related to KASLR)
       - remove obsolete code
       - instruction decoder fix
       - better /dev/mem sanity checks, hopefully working better this time
       - pkeys fixes
       - two ACPI fixes
       - 5-level paging related fixes
       - UMIP fixes that should make application visible faults more debuggable
       - boot fix for weird virtualization environment
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        x86/decoder: Add new TEST instruction pattern
        x86/PCI: Remove unused HyperTransport interrupt support
        x86/umip: Fix insn_get_code_seg_params()'s return value
        x86/boot/KASLR: Remove unused variable
        x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
        x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
        x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
        x86/pkeys/selftests: Fix protection keys write() warning
        x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
        x86/mpx/selftests: Fix up weird arrays
        x86/pkeys: Update documentation about availability
        x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
        x86/smpboot: Fix __max_logical_packages estimate
        x86/topology: Avoid wasting 128k for package id array
        perf/x86/intel/uncore: Cache logical pkg id in uncore driver
        x86/acpi: Reduce code duplication in mp_override_legacy_irq()
        x86/acpi: Handle SCI interrupts above legacy space gracefully
        x86/boot: Fix boot failure when SMP MP-table is based at 0
        x86/mm: Limit mmap() of /dev/mem to valid physical addresses
        x86/selftests: Add test for mapping placement for 5-level paging
        ...