1. 10 Oct, 2018 1 commit
    • Valentine Fatiev's avatar
      IB/mlx5: Unmap DMA addr from HCA before IOMMU · dd9a4034
      Valentine Fatiev authored
      The function that puts back the MR in cache also removes the DMA address
      from the HCA. Therefore we need to call this function before we remove
      the DMA mapping from MMU. Otherwise the HCA may access a memory that
      is no longer DMA mapped.
      
      Call trace:
      NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0.
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ #4
      Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012
      RIP: 0010:intel_idle+0x73/0x120
      Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02
      RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046
      RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000
      RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9
      R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000
      R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d
      FS:  0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0
      Call Trace:
       cpuidle_enter_state+0x7e/0x2e0
       do_idle+0x1ed/0x290
       cpu_startup_entry+0x6f/0x80
       start_kernel+0x524/0x544
       ? set_init_arg+0x55/0x55
       secondary_startup_64+0xa4/0xb0
      DMAR: DRHD: handling fault status reg 2
      DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set
      DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set
      
      Fixes: f3f134f5 ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory")
      Signed-off-by: default avatarValentine Fatiev <valentinef@mellanox.com>
      Reviewed-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      dd9a4034
  2. 25 Sep, 2018 3 commits
    • Parav Pandit's avatar
      RDMA/core: Set right entry state before releasing reference · 5c5702e2
      Parav Pandit authored
      Currently add_modify_gid() for IB link layer has followong issue
      in cache update path.
      
      When GID update event occurs, core releases reference to the GID
      table without updating its state and/or entry pointer.
      
      CPU-0                              CPU-1
      ------                             -----
      ib_cache_update()                    IPoIB ULP
         add_modify_gid()                   [..]
            put_gid_entry()
            refcnt = 0, but
            state = valid,
            entry is valid.
            (work item is not yet executed).
                                         ipoib_create_ah()
                                           rdma_create_ah()
                                              rdma_get_gid_attr() <--
                                         	Tries to acquire gid_attr
                                              which has refcnt = 0.
                                         	This is incorrect.
      
      GID entry state and entry pointer is provides the accurate GID enty
      state. Such fields must be updated with rwlock to protect against
      readers and, such fields must be in sane state before refcount can drop
      to zero. Otherwise above race condition can happen leading to
      use-after-free situation.
      
      Following backtrace has been observed when cache update for an IB port
      is triggered while IPoIB ULP is creating an AH.
      
      Therefore, when updating GID entry, first mark a valid entry as invalid
      through state and set the barrier so that no callers can acquired
      the GID entry, followed by release reference to it.
      
      refcount_t: increment on 0; use-after-free.
      WARNING: CPU: 4 PID: 29106 at lib/refcount.c:153 refcount_inc_checked+0x30/0x50
      Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
      RIP: 0010:refcount_inc_checked+0x30/0x50
      RSP: 0018:ffff8802ad36f600 EFLAGS: 00010082
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000002 RSI: 0000000000000008 RDI: ffffffff86710100
      RBP: ffff8802d6e60a30 R08: ffffed005d67bf8b R09: ffffed005d67bf8b
      R10: 0000000000000001 R11: ffffed005d67bf8a R12: ffff88027620cee8
      R13: ffff8802d6e60988 R14: ffff8802d6e60a78 R15: 0000000000000202
      FS: 0000000000000000(0000) GS:ffff8802eb200000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f3ab35e5c88 CR3: 00000002ce84a000 CR4: 00000000000006e0
      IPv6: ADDRCONF(NETDEV_CHANGE): ib1: link becomes ready
      Call Trace:
      rdma_get_gid_attr+0x220/0x310 [ib_core]
      ? lock_acquire+0x145/0x3a0
      rdma_fill_sgid_attr+0x32c/0x470 [ib_core]
      rdma_create_ah+0x89/0x160 [ib_core]
      ? rdma_fill_sgid_attr+0x470/0x470 [ib_core]
      ? ipoib_create_ah+0x52/0x260 [ib_ipoib]
      ipoib_create_ah+0xf5/0x260 [ib_ipoib]
      ipoib_mcast_join_complete+0xbbe/0x2540 [ib_ipoib]
      
      Fixes: b150c386 ("IB/core: Introduce GID entry reference counts")
      Signed-off-by: default avatarParav Pandit <parav@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      5c5702e2
    • Yishai Hadas's avatar
      IB/mlx5: Destroy the DEVX object upon error flow · e8ef090a
      Yishai Hadas authored
      Upon DEVX object creation the object must be destroyed upon a follows
      error flow.
      
      Fixes: 7efce369 ("IB/mlx5: Add obj create and destroy functionality")
      Signed-off-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Reviewed-by: default avatarArtemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      e8ef090a
    • Mark Bloch's avatar
      IB/uverbs: Free uapi on destroy · a9360abd
      Mark Bloch authored
      Make sure we free struct uverbs_api once we clean the radix tree. It was
      allocated by uverbs_alloc_api().
      
      Fixes: 9ed3e5f4 ("IB/uverbs: Build the specs into a radix tree at runtime")
      Reported-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      a9360abd
  3. 24 Sep, 2018 1 commit
    • Selvin Xavier's avatar
      RDMA/bnxt_re: Fix system crash during RDMA resource initialization · de5c95d0
      Selvin Xavier authored
      bnxt_re_ib_reg acquires and releases the rtnl lock whenever it accesses
      the L2 driver.
      
      The following sequence can trigger a crash
      
      Acquires the rtnl_lock ->
      	Registers roce driver callback with L2 driver ->
      		release the rtnl lock
      bnxt_re acquires the rtnl_lock ->
      	Request for MSIx vectors ->
      		release the rtnl_lock
      
      Issue happens when bnxt_re proceeds with remaining part of initialization
      and L2 driver invokes bnxt_ulp_irq_stop as a part of bnxt_open_nic.
      
      The crash is in bnxt_qplib_nq_stop_irq as the NQ structures are
      not initialized yet,
      
      <snip>
      [ 3551.726647] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 3551.726656] IP: [<ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
      [ 3551.726674] PGD 0
      [ 3551.726679] Oops: 0002 1 SMP
      ...
      [ 3551.726822] Hardware name: Dell Inc. PowerEdge R720/08RW36, BIOS 2.4.3 07/09/2014
      [ 3551.726826] task: ffff97e30eec5ee0 ti: ffff97e3173bc000 task.ti: ffff97e3173bc000
      [ 3551.726829] RIP: 0010:[<ffffffffc0840ee9>] [<ffffffffc0840ee9>]
      bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
      ...
      [ 3551.726872] Call Trace:
      [ 3551.726886] [<ffffffffc082cb9e>] bnxt_re_stop_irq+0x4e/0x70 [bnxt_re]
      [ 3551.726899] [<ffffffffc07d6a53>] bnxt_ulp_irq_stop+0x43/0x70 [bnxt_en]
      [ 3551.726908] [<ffffffffc07c82f4>] bnxt_reserve_rings+0x174/0x1e0 [bnxt_en]
      [ 3551.726917] [<ffffffffc07cafd8>] __bnxt_open_nic+0x368/0x9a0 [bnxt_en]
      [ 3551.726925] [<ffffffffc07cb62b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
      [ 3551.726934] [<ffffffffc07cc62f>] bnxt_setup_mq_tc+0x11f/0x260 [bnxt_en]
      [ 3551.726943] [<ffffffffc07d5f58>] bnxt_dcbnl_ieee_setets+0xb8/0x1f0 [bnxt_en]
      [ 3551.726954] [<ffffffff890f983a>] dcbnl_ieee_set+0x9a/0x250
      [ 3551.726966] [<ffffffff88fd6d21>] ? __alloc_skb+0xa1/0x2d0
      [ 3551.726972] [<ffffffff890f72fa>] dcb_doit+0x13a/0x210
      [ 3551.726981] [<ffffffff89003ff7>] rtnetlink_rcv_msg+0xa7/0x260
      [ 3551.726989] [<ffffffff88ffdb00>] ? rtnl_unicast+0x20/0x30
      [ 3551.726996] [<ffffffff88bf9dc8>] ? __kmalloc_node_track_caller+0x58/0x290
      [ 3551.727002] [<ffffffff890f7326>] ? dcb_doit+0x166/0x210
      [ 3551.727007] [<ffffffff88fd6d0d>] ? __alloc_skb+0x8d/0x2d0
      [ 3551.727012] [<ffffffff89003f50>] ? rtnl_newlink+0x880/0x880
      ...
      [ 3551.727104] [<ffffffff8911f7d5>] system_call_fastpath+0x1c/0x21
      ...
      [ 3551.727164] RIP [<ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
      [ 3551.727175] RSP <ffff97e3173bf788>
      [ 3551.727177] CR2: 0000000000000000
      
      Avoid this inconsistent state and  system crash by acquiring
      the rtnl lock for the entire duration of device initialization.
      Re-factor the code to remove the rtnl lock from the individual function
      and acquire and release it from the caller.
      
      Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
      Fixes: 6e04b103 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
      Signed-off-by: default avatarSelvin Xavier <selvin.xavier@broadcom.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      de5c95d0
  4. 21 Sep, 2018 4 commits
    • Michael J. Ruhl's avatar
      IB/hfi1: Fix destroy_qp hang after a link down · b4a4957d
      Michael J. Ruhl authored
      rvt_destroy_qp() cannot complete until all in process packets have
      been released from the underlying hardware.  If a link down event
      occurs, an application can hang with a kernel stack similar to:
      
      cat /proc/<app PID>/stack
       quiesce_qp+0x178/0x250 [hfi1]
       rvt_reset_qp+0x23d/0x400 [rdmavt]
       rvt_destroy_qp+0x69/0x210 [rdmavt]
       ib_destroy_qp+0xba/0x1c0 [ib_core]
       nvme_rdma_destroy_queue_ib+0x46/0x80 [nvme_rdma]
       nvme_rdma_free_queue+0x3c/0xd0 [nvme_rdma]
       nvme_rdma_destroy_io_queues+0x88/0xd0 [nvme_rdma]
       nvme_rdma_error_recovery_work+0x52/0xf0 [nvme_rdma]
       process_one_work+0x17a/0x440
       worker_thread+0x126/0x3c0
       kthread+0xcf/0xe0
       ret_from_fork+0x58/0x90
       0xffffffffffffffff
      
      quiesce_qp() waits until all outstanding packets have been freed.
      This wait should be momentary.  During a link down event, the cleanup
      handling does not ensure that all packets caught by the link down are
      flushed properly.
      
      This is caused by the fact that the freeze path and the link down
      event is handled the same.  This is not correct.  The freeze path
      waits until the HFI is unfrozen and then restarts PIO.  A link down
      is not a freeze event.  The link down path cannot restart the PIO
      until link is restored.  If the PIO path is restarted before the link
      comes up, the application (QP) using the PIO path will hang (until
      link is restored).
      
      Fix by separating the linkdown path from the freeze path and use the
      link down path for link down events.
      
      Close a race condition sc_disable() by acquiring both the progress
      and release locks.
      
      Close a race condition in sc_stop() by moving the setting of the flag
      bits under the alloc lock.
      
      Cc: <stable@vger.kernel.org> # 4.9.x+
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      b4a4957d
    • Michael J. Ruhl's avatar
      IB/hfi1: Fix context recovery when PBC has an UnsupportedVL · d623500b
      Michael J. Ruhl authored
      If a packet stream uses an UnsupportedVL (virtual lane), the send
      engine will not send the packet, and it will not indicate that an
      error has occurred.  This will cause the packet stream to block.
      
      HFI has 8 virtual lanes available for packet streams.  Each lane can
      be enabled or disabled using the UnsupportedVL mask.  If a lane is
      disabled, adding a packet to the send context must be disallowed.
      
      The current mask for determining unsupported VLs defaults to 0 (allow
      all).  This is incorrect.  Only the VLs that are defined should be
      allowed.
      
      Determine which VLs are disabled (mtu == 0), and set the appropriate
      unsupported bit in the mask.  The correct mask will allow the send
      engine to error on the invalid VL, and error recovery will work
      correctly.
      
      Cc: <stable@vger.kernel.org> # 4.9.x+
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      d623500b
    • Michael J. Ruhl's avatar
      IB/hfi1: Invalid user input can result in crash · 94694d18
      Michael J. Ruhl authored
      If the number of packets in a user sdma request does not match
      the actual iovectors being sent, sdma_cleanup can be called on
      an uninitialized request structure, resulting in a crash similar
      to this:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
      PGD 8000001044f61067 PUD 1052706067 PMD 0
      Oops: 0000 [#1] SMP
      CPU: 30 PID: 69912 Comm: upsm Kdump: loaded Tainted: G           OE
      ------------   3.10.0-862.el7.x86_64 #1
      Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
      SE5C610.86B.01.01.0019.101220160604 10/12/2016
      task: ffff8b331c890000 ti: ffff8b2ed1f98000 task.ti: ffff8b2ed1f98000
      RIP: 0010:[<ffffffffc0ae8bb7>]  [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0
      [hfi1]
      RSP: 0018:ffff8b2ed1f9bab0  EFLAGS: 00010286
      RAX: 0000000000008b2b RBX: ffff8b2adf6e0000 RCX: 0000000000000000
      RDX: 00000000000000a0 RSI: ffff8b2e9eedc540 RDI: ffff8b2adf6e0000
      RBP: ffff8b2ed1f9bad8 R08: 0000000000000000 R09: ffffffffc0b04a06
      R10: ffff8b331c890190 R11: ffffe6ed00bf1840 R12: ffff8b3315480000
      R13: ffff8b33154800f0 R14: 00000000fffffff2 R15: ffff8b2e9eedc540
      FS:  00007f035ac47740(0000) GS:ffff8b331e100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 0000000c03fe6000 CR4: 00000000001607e0
      Call Trace:
       [<ffffffffc0b0570d>] user_sdma_send_pkts+0xdcd/0x1990 [hfi1]
       [<ffffffff9fe75fb0>] ? gup_pud_range+0x140/0x290
       [<ffffffffc0ad3105>] ? hfi1_mmu_rb_insert+0x155/0x1b0 [hfi1]
       [<ffffffffc0b0777b>] hfi1_user_sdma_process_request+0xc5b/0x11b0 [hfi1]
       [<ffffffffc0ac193a>] hfi1_aio_write+0xba/0x110 [hfi1]
       [<ffffffffa001a2bb>] do_sync_readv_writev+0x7b/0xd0
       [<ffffffffa001bede>] do_readv_writev+0xce/0x260
       [<ffffffffa022b089>] ? tty_ldisc_deref+0x19/0x20
       [<ffffffffa02268c0>] ? n_tty_ioctl+0xe0/0xe0
       [<ffffffffa001c105>] vfs_writev+0x35/0x60
       [<ffffffffa001c2bf>] SyS_writev+0x7f/0x110
       [<ffffffffa051f7d5>] system_call_fastpath+0x1c/0x21
      Code: 06 49 c7 47 18 00 00 00 00 0f 87 89 01 00 00 5b 41 5c 41 5d 41 5e 41 5f
      5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b 4e 10 48 89 fb <48> 8b 51 08 49 89 d4
      83 e2 0c 41 81 e4 00 e0 00 00 48 c1 ea 02
      RIP  [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
       RSP <ffff8b2ed1f9bab0>
      CR2: 0000000000000008
      
      There are two exit points from user_sdma_send_pkts().  One (free_tx)
      merely frees the slab entry and one (free_txreq) cleans the sdma_txreq
      prior to freeing the slab entry.   The free_txreq variation can only be
      called after one of the sdma_init*() variations has been called.
      
      In the panic case, the slab entry had been allocated but not inited.
      
      Fix the issue by exiting through free_tx thus avoiding sdma_clean().
      
      Cc: <stable@vger.kernel.org> # 4.9.x+
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      94694d18
    • Ira Weiny's avatar
      IB/hfi1: Fix SL array bounds check · 0dbfaa9f
      Ira Weiny authored
      The SL specified by a user needs to be a valid SL.
      
      Add a range check to the user specified SL value which protects from
      running off the end of the SL to SC table.
      
      CC: stable@vger.kernel.org
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Signed-off-by: default avatarIra Weiny <ira.weiny@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      0dbfaa9f
  5. 20 Sep, 2018 1 commit
  6. 19 Sep, 2018 1 commit
  7. 13 Sep, 2018 1 commit
    • Cong Wang's avatar
      ucma: fix a use-after-free in ucma_resolve_ip() · 5fe23f26
      Cong Wang authored
      There is a race condition between ucma_close() and ucma_resolve_ip():
      
      CPU0				CPU1
      ucma_resolve_ip():		ucma_close():
      
      ctx = ucma_get_ctx(file, cmd.id);
      
              list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
                      mutex_lock(&mut);
                      idr_remove(&ctx_idr, ctx->id);
                      mutex_unlock(&mut);
      		...
                      mutex_lock(&mut);
                      if (!ctx->closing) {
                              mutex_unlock(&mut);
                              rdma_destroy_id(ctx->cm_id);
      		...
                      ucma_free_ctx(ctx);
      
      ret = rdma_resolve_addr();
      ucma_put_ctx(ctx);
      
      Before idr_remove(), ucma_get_ctx() could still find the ctx
      and after rdma_destroy_id(), rdma_resolve_addr() may still
      access id_priv pointer. Also, ucma_put_ctx() may use ctx after
      ucma_free_ctx() too.
      
      ucma_close() should call ucma_put_ctx() too which tests the
      refcnt and waits for the last one releasing it. The similar
      pattern is already used by ucma_destroy_id().
      
      Reported-and-tested-by: syzbot+da2591e115d57a9cbb8b@syzkaller.appspotmail.com
      Reported-by: syzbot+cfe3c1e8ef634ba8964b@syzkaller.appspotmail.com
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      5fe23f26
  8. 12 Sep, 2018 1 commit
    • Steve Wise's avatar
      RDMA/uverbs: Atomically flush and mark closed the comp event queue · 67e38168
      Steve Wise authored
      Currently a uverbs completion event queue is flushed of events in
      ib_uverbs_comp_event_close() with the queue spinlock held and then
      released.  Yet setting ev_queue->is_closed is not set until later in
      uverbs_hot_unplug_completion_event_file().
      
      In between the time ib_uverbs_comp_event_close() releases the lock and
      uverbs_hot_unplug_completion_event_file() acquires the lock, a completion
      event can arrive and be inserted into the event queue by
      ib_uverbs_comp_handler().
      
      This can cause a "double add" list_add warning or crash depending on the
      kernel configuration, or a memory leak because the event is never dequeued
      since the queue is already closed down.
      
      So add setting ev_queue->is_closed = 1 to ib_uverbs_comp_event_close().
      
      Cc: stable@vger.kernel.org
      Fixes: 1e7710f3 ("IB/core: Change completion channel to use the reworked objects schema")
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      67e38168
  9. 11 Sep, 2018 1 commit
  10. 06 Sep, 2018 2 commits
  11. 05 Sep, 2018 3 commits
    • Parav Pandit's avatar
      RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one() · 08e74be1
      Parav Pandit authored
      If ib_uverbs_create_uapi() fails, dev_num should be freed from the bitmap.
      
      Fixes: 7d96c9b1 ("IB/uverbs: Have the core code create the uverbs_root_spec")
      Signed-off-by: default avatarParav Pandit <parav@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      08e74be1
    • Somnath Kotur's avatar
      bnxt_re: Fix couple of memory leaks that could lead to IOMMU call traces · f40f299b
      Somnath Kotur authored
      1. DMA-able memory allocated for Shadow QP was not being freed.
      2. bnxt_qplib_alloc_qp_hdr_buf() had a bug wherein the SQ pointer was
         erroneously pointing to the RQ. But since the corresponding
         free_qp_hdr_buf() was correct, memory being free was less than what was
         allocated.
      
      Fixes: 1ac5a404 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
      Signed-off-by: default avatarSomnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      f40f299b
    • Aaron Knister's avatar
      IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler · 816e846c
      Aaron Knister authored
      Inside of start_xmit() the call to check if the connection is up and the
      queueing of the packets for later transmission is not atomic which leaves
      a window where cm_rep_handler can run, set the connection up, dequeue
      pending packets and leave the subsequently queued packets by start_xmit()
      sitting on neigh->queue until they're dropped when the connection is torn
      down. This only applies to connected mode. These dropped packets can
      really upset TCP, for example, and cause multi-minute delays in
      transmission for open connections.
      
      Here's the code in start_xmit where we check to see if the connection is
      up:
      
             if (ipoib_cm_get(neigh)) {
                     if (ipoib_cm_up(neigh)) {
                             ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                             goto unref;
                     }
             }
      
      The race occurs if cm_rep_handler execution occurs after the above
      connection check (specifically if it gets to the point where it acquires
      priv->lock to dequeue pending skb's) but before the below code snippet in
      start_xmit where packets are queued.
      
             if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
                     push_pseudo_header(skb, phdr->hwaddr);
                     spin_lock_irqsave(&priv->lock, flags);
                     __skb_queue_tail(&neigh->queue, skb);
                     spin_unlock_irqrestore(&priv->lock, flags);
             } else {
                     ++dev->stats.tx_dropped;
                     dev_kfree_skb_any(skb);
             }
      
      The patch acquires the netif tx lock in cm_rep_handler for the section
      where it sets the connection up and dequeues and retransmits deferred
      skb's.
      
      Fixes: 839fcaba ("IPoIB: Connected mode experimental support")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAaron Knister <aaron.s.knister@nasa.gov>
      Tested-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarIra Weiny <ira.weiny@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      816e846c
  12. 04 Sep, 2018 3 commits
    • Steve Wise's avatar
      iw_cxgb4: only allow 1 flush on user qps · 308aa2b8
      Steve Wise authored
      Once the qp has been flushed, it cannot be flushed again.  The user qp
      flush logic wasn't enforcing it however.  The bug can cause
      touch-after-free crashes like:
      
      Unable to handle kernel paging request for data at address 0x000001ec
      Faulting instruction address: 0xc008000016069100
      Oops: Kernel access of bad area, sig: 11 [#1]
      ...
      NIP [c008000016069100] flush_qp+0x80/0x480 [iw_cxgb4]
      LR [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
      Call Trace:
      [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
      [c00800001606e868] c4iw_ib_modify_qp+0x118/0x200 [iw_cxgb4]
      [c0080000119eae80] ib_security_modify_qp+0xd0/0x3d0 [ib_core]
      [c0080000119c4e24] ib_modify_qp+0xc4/0x2c0 [ib_core]
      [c008000011df0284] iwcm_modify_qp_err+0x44/0x70 [iw_cm]
      [c008000011df0fec] destroy_cm_id+0xcc/0x370 [iw_cm]
      [c008000011ed4358] rdma_destroy_id+0x3c8/0x520 [rdma_cm]
      [c0080000134b0540] ucma_close+0x90/0x1b0 [rdma_ucm]
      [c000000000444da4] __fput+0xe4/0x2f0
      
      So fix flush_qp() to only flush the wq once.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      308aa2b8
    • Artemy Kovalyov's avatar
      IB/core: Release object lock if destroy failed · e4ff3d22
      Artemy Kovalyov authored
      The object lock was supposed to always be released during destroy, but
      when the destruction retry series was integrated with the destroy series
      it created a failure path that missed the unlock.
      
      Keep with convention, if destroy fails the caller must undo all locking.
      
      Fixes: 87ad80ab ("IB/uverbs: Consolidate uobject destruction")
      Signed-off-by: default avatarArtemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      e4ff3d22
    • Jann Horn's avatar
      RDMA/ucma: check fd type in ucma_migrate_id() · 0d23ba60
      Jann Horn authored
      The current code grabs the private_data of whatever file descriptor
      userspace has supplied and implicitly casts it to a `struct ucma_file *`,
      potentially causing a type confusion.
      
      This is probably fine in practice because the pointer is only used for
      comparisons, it is never actually dereferenced; and even in the
      comparisons, it is unlikely that a file from another filesystem would have
      a ->private_data pointer that happens to also be valid in this context.
      But ->private_data is not always guaranteed to be a valid pointer to an
      object owned by the file's filesystem; for example, some filesystems just
      cram numbers in there.
      
      Check the type of the supplied file descriptor to be safe, analogous to how
      other places in the kernel do it.
      
      Fixes: 88314e4d ("RDMA/cma: add support for rdma_migrate_id()")
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      0d23ba60
  13. 02 Sep, 2018 8 commits
    • Linus Torvalds's avatar
      Linux 4.19-rc2 · 57361846
      Linus Torvalds authored
      57361846
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · fd6868d8
      Linus Torvalds authored
      Pull devicetree updates from Rob Herring:
       "A couple of new helper functions in preparation for some tree wide
        clean-ups.
      
        I'm sending these new helpers now for rc2 in order to simplify the
        dependencies on subsequent cleanups across the tree in 4.20"
      
      * tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        of: Add device_type access helper functions
        of: add node name compare helper functions
        of: add helper to lookup compatible child node
      fd6868d8
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · a3ea9911
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "First batch of fixes post-merge window:
      
         - A handful of devicetree changes for i.MX2{3,8} to change over to
           new panel bindings. The platforms were moved from legacy
           framebuffers to DRM and some development board panels hadn't yet
           been converted.
      
         - OMAP fixes related to ti-sysc driver conversion fallout, fixing
           some register offsets, no_console_suspend fixes, etc.
      
         - Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.
      
         - Fixed 0755->0644 permissions on a newly added file.
      
         - Defconfig changes to make ARM Versatile more useful with QEMU
           (helps testing).
      
         - Enable defconfig options for new TI SoC platform that was merged
           this window (AM6)"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm64: defconfig: Enable TI's AM6 SoC platform
        ARM: defconfig: Update the ARM Versatile defconfig
        ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
        ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
        ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
        ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
        ARM: dts: imx23-evk: Convert to the new display bindings
        ARM: dts: imx23-evk: Move regulators outside simple-bus
        ARM: dts: imx28-evk: Convert to the new display bindings
        ARM: dts: imx28-evk: Move regulators outside simple-bus
        Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"
        arm: dts: am4372: setup rtc as system-power-controller
        ARM: dts: omap4-droid4: fix vibrations on Droid 4
        bus: ti-sysc: Fix no_console_suspend handling
        bus: ti-sysc: Fix module register ioremap for larger offsets
        ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
        ARM: OMAP2+: Fix null hwmod for ti-sysc debug
      a3ea9911
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 899ba795
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Speculation:
      
         - Make the microcode check more robust
      
         - Make the L1TF memory limit depend on the internal cache physical
           address space and not on the CPUID advertised physical address
           space, which might be significantly smaller. This avoids disabling
           L1TF on machines which utilize the full physical address space.
      
         - Fix the GDT mapping for EFI calls on 32bit PTI
      
         - Fix the MCE nospec implementation to prevent #GP
      
        Fixes and robustness:
      
         - Use the proper operand order for LSL in the VDSO
      
         - Prevent NMI uaccess race against CR3 switching
      
         - Add a lockdep check to verify that text_mutex is held in
           text_poke() functions
      
         - Repair the fallout of giving native_restore_fl() a prototype
      
         - Prevent kernel memory dumps based on usermode RIP
      
         - Wipe KASAN shadow stack before rewinding the stack to prevent false
           positives
      
         - Move the AMS GOTO enforcement to the actual build stage to allow
           user API header extraction without a compiler
      
         - Fix a section mismatch introduced by the on demand VDSO mapping
           change
      
        Miscellaneous:
      
         - Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/pti: Fix section mismatch warning/error
        x86/vdso: Fix lsl operand order
        x86/mce: Fix set_mce_nospec() to avoid #GP fault
        x86/efi: Load fixmap GDT in efi_call_phys_epilog()
        x86/nmi: Fix NMI uaccess race against CR3 switching
        x86: Allow generating user-space headers without a compiler
        x86/dumpstack: Don't dump kernel memory based on usermode RIP
        x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
        x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
        x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
        x86/irqflags: Mark native_restore_fl extern inline
        x86/build: Remove jump label quirk for GCC older than 4.5.2
        x86/Kconfig: Fix trivial typo
        x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
        x86/spectre: Add missing family 6 check to microcode check
      899ba795
    • Linus Torvalds's avatar
      Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1395d109
      Linus Torvalds authored
      Pull CPU hotplug fix from Thomas Gleixner:
       "Remove the stale skip_onerr member from the hotplug states"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Remove skip_onerr field from cpuhp_step structure
      1395d109
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 501dacbc
      Linus Torvalds authored
      Pull core fixes from Thomas Gleixner:
       "A small set of updates for core code:
      
         - Prevent tracing in functions which are called from trace patching
           via stop_machine() to prevent executing half patched function trace
           entries.
      
         - Remove old GCC workarounds
      
         - Remove pointless includes of notifier.h"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Remove workaround for unreachable warnings from old GCC
        notifier: Remove notifier header file wherever not used
        watchdog: Mark watchdog touch functions as notrace
      501dacbc
    • Randy Dunlap's avatar
      x86/pti: Fix section mismatch warning/error · ff924c5a
      Randy Dunlap authored
      Fix the section mismatch warning in arch/x86/mm/pti.c:
      
      WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
      The function pti_clone_pgtable() references
      the function __init pti_user_pagetable_walk_pte().
      This is often because pti_clone_pgtable lacks a __init
      annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
      FATAL: modpost: Section mismatches detected.
      
      Fixes: 85900ea5 ("x86/pti: Map the vsyscall page if needed")
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Link: https://lkml.kernel.org/r/43a6d6a3-d69d-5eda-da09-0b1c88215a2a@infradead.org
      
      ff924c5a
    • Olof Johansson's avatar
      Merge tag 'omap-for-v4.19/fixes-v2-signed' of... · a72b44a8
      Olof Johansson authored
      Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
      
      Fixes for omap variants against v4.19-rc1
      
      These are mostly fixes related to using ti-sysc interconnect target module
      driver for accessing right register offsets for sgx and cpsw and for
      no_console_suspend regression.
      
      There is also a droid4 emmc fix where emmc may not get detected for some
      models, and vibrator dts mismerge fix.
      
      And we have a file permission fix for am335x-osd3358-sm-red.dts that
      just got added. And we must tag RTC as system-power-controller for
      am437x for PMIC to shut down during poweroff.
      
      * tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
        ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
        ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
        arm: dts: am4372: setup rtc as system-power-controller
        ARM: dts: omap4-droid4: fix vibrations on Droid 4
        bus: ti-sysc: Fix no_console_suspend handling
        bus: ti-sysc: Fix module register ioremap for larger offsets
        ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
        ARM: OMAP2+: Fix null hwmod for ti-sysc debug
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      a72b44a8
  14. 01 Sep, 2018 4 commits
  15. 31 Aug, 2018 6 commits