1. 07 Aug, 2019 2 commits
    • RDMA/counter: Prevent QP counter binding if counters unsupported · d97de888
      Mark Zhang authored
      If rdma_counter_init() fails, counter allocation and QP binding
      should not be allowed.
      
      Fixes: 413d3347 ("RDMA/counter: Add set/clear per-port auto mode support")
      Fixes: 1bd8e0a9 ("RDMA/counter: Allow manual mode configuration support")
      Signed-off-by: Mark Zhang <markz@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Link: https://lore.kernel.org/r/20190807101819.7581-1-leon@kernel.org
      Signed-off-by: Doug Ledford <dledford@redhat.com>
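
The guard described in the RDMA/counter commit can be illustrated with a small userspace model. This is a minimal sketch, not the kernel code: `port_counter_model`, its `stats` field, and `model_bind_qp()` are hypothetical names, and the assumption is that a failed initialization leaves the statistics pointer NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's structures. */
struct port_counter_model {
    int *stats;     /* assumed to stay NULL when counter init failed */
    int bound_qps;  /* number of QPs bound to this port's counters */
};

/*
 * Bind one QP to the port counters.
 * Returns 0 on success, or -EOPNOTSUPP when counters were never
 * initialized, so that no allocation or binding takes place.
 */
static int model_bind_qp(struct port_counter_model *pc)
{
    assert(pc != NULL);         /* caller must pass a valid model */

    if (pc->stats == NULL)      /* init failed: counters unsupported */
        return -EOPNOTSUPP;

    pc->bound_qps++;
    return 0;
}
```

Checking at the entry point keeps every later step (allocation, bind) from running against a port whose counters were never set up.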
    • IB/mlx5: Fix implicit MR release flow · f591822c
      Yishai Hadas authored
      Once an implicit MR is released through
      ib_umem_notifier_release(), its leaves are marked as "dying".
      
      However, when dereg_mr()->mlx5_ib_free_implicit_mr()->mr_leaf_free() is
      called, it skips running the mr_leaf_free_action (i.e. umem_odp->work)
      when those leaves were marked as "dying".
      
      As such ib_umem_release() for the leaves won't be called and their MRs
      will be leaked as well.
      
      When an application exits or is killed without calling dereg_mr(), we
      might hit the above flow.
      
      This fatal scenario is reported by a WARN_ON() in
      mlx5_ib_dealloc_ucontext() because ibcontext->per_mm_list is not empty.
      The call trace can be seen below.
      
      Originally, the "dying" mark in ib_umem_notifier_release() was
      introduced to prevent pagefault_mr() from returning a success response
      once this happened. However, the completion mechanism available today
      makes it unnecessary in those flows. Even if a success response is
      returned, the firmware will not find the pages, and an error will be
      returned in the following call because a released mm causes
      ib_umem_odp_map_dma_pages() to permanently fail mmget_not_zero().
      
      Fix the above issue by dropping the "dying" mark from the above flows.
      The other flows that use "dying" still need it for their
      synchronization purposes.
      
         WARNING: CPU: 1 PID: 7218 at
         drivers/infiniband/hw/mlx5/main.c:2004
      		  mlx5_ib_dealloc_ucontext+0x84/0x90 [mlx5_ib]
         CPU: 1 PID: 7218 Comm: ibv_rc_pingpong Tainted: G     E
      	       5.2.0-rc6+ #13
         Call Trace:
         uverbs_destroy_ufile_hw+0xb5/0x120 [ib_uverbs]
         ib_uverbs_close+0x1f/0x80 [ib_uverbs]
         __fput+0xbe/0x250
         task_work_run+0x88/0xa0
         do_exit+0x2cb/0xc30
         ? __fput+0x14b/0x250
         do_group_exit+0x39/0xb0
         get_signal+0x191/0x920
         ? _raw_spin_unlock_bh+0xa/0x20
         ? inet_csk_accept+0x229/0x2f0
         do_signal+0x36/0x5e0
         ? put_unused_fd+0x5b/0x70
         ? __sys_accept4+0x1a6/0x1e0
         ? inet_hash+0x35/0x40
         ? release_sock+0x43/0x90
         ? _raw_spin_unlock_bh+0xa/0x20
         ? inet_listen+0x9f/0x120
         exit_to_usermode_loop+0x5c/0xc6
         do_syscall_64+0x182/0x1b0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 81713d37 ("IB/mlx5: Add implicit MR support")
      Link: https://lore.kernel.org/r/20190805083010.21777-1-leon@kernel.org
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
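
The leak described in the IB/mlx5 commit can be modelled in a few lines. This is a simplified userspace sketch under stated assumptions, not the driver code: `leaf_model`, `model_notifier_release()`, and `model_free_implicit()` are hypothetical stand-ins for the leaf state, the notifier release path, and the free path that skips leaves marked "dying".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one implicit-MR leaf; not the driver's struct. */
struct leaf_model {
    bool dying;     /* set by the (modelled) notifier release path */
    bool released;  /* set once the leaf's resources are freed */
};

/* Models the notifier release path; mark_dying selects old vs. fixed flow. */
static void model_notifier_release(struct leaf_model *leaves, size_t n,
                                   bool mark_dying)
{
    assert(leaves != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mark_dying)
            leaves[i].dying = true;
    }
}

/*
 * Models the free path: leaves marked "dying" are skipped, so they are
 * never released. Returns the number of leaked (skipped) leaves.
 */
static size_t model_free_implicit(struct leaf_model *leaves, size_t n)
{
    size_t leaked = 0;

    assert(leaves != NULL);
    for (size_t i = 0; i < n; i++) {
        if (leaves[i].dying) {
            leaked++;           /* free action skipped: leaf leaks */
            continue;
        }
        leaves[i].released = true;
    }
    return leaked;
}
```

With `mark_dying` set, every leaf is skipped by the free path and counted as leaked. With it cleared, as in the fixed flow, every leaf is released.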
  2. 05 Aug, 2019 1 commit
  3. 04 Aug, 2019 10 commits
  4. 03 Aug, 2019 27 commits