    RDMA/mlx5: Put live in the correct place for ODP MRs · aa603815
    Jason Gunthorpe authored
    live is used to signal to the pagefault thread that the MR is initialized
    and ready for use. It should be after the umem is assigned and all other
    setup is completed. This prevents races (at least) of the form:
    
        CPU0                                     CPU1
    mlx5_ib_alloc_implicit_mr()
     implicit_mr_alloc()
      live = 1
     imr->umem = umem
                                        num_pending_prefetch_inc()
                                          if (live)
                                            atomic_inc(num_pending_prefetch)
     atomic_set(num_pending_prefetch,0) // Overwrites other thread's store
    
    Further, live is being used with SRCU as the 'update' in an
    acquire/release fashion, so it cannot be read and written raw.
    
    Move all live = 1's to after MR initialization is completed and use
    smp_store_release/smp_load_acquire() for manipulating it.
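
The publish-last pattern described above can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: `struct fake_mr`, `fake_mr_publish()` and `fake_prefetch_inc()` are hypothetical names, not the mlx5 code, and `memory_order_release`/`memory_order_acquire` stand in for the kernel's smp_store_release()/smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an ODP MR; NOT the real struct mlx5_ib_mr. */
struct fake_mr {
	void *umem;                      /* assigned during initialization */
	atomic_int num_pending_prefetch; /* reset during initialization */
	atomic_int live;                 /* published last */
};

/*
 * Writer side: complete every initialization store first, then publish
 * the MR with a release store (userspace analogue of
 * smp_store_release(&mr->live, 1)).
 */
static void fake_mr_publish(struct fake_mr *mr, void *umem)
{
	assert(umem != NULL);
	mr->umem = umem;
	atomic_store_explicit(&mr->num_pending_prefetch, 0,
			      memory_order_relaxed);
	/* Nothing touches mr's setup state after this point. */
	atomic_store_explicit(&mr->live, 1, memory_order_release);
}

/*
 * Reader side: only touch the MR if the acquire load observes live != 0
 * (userspace analogue of smp_load_acquire(&mr->live)).
 * Returns 1 if the counter was incremented, 0 if the MR was not live.
 */
static int fake_prefetch_inc(struct fake_mr *mr)
{
	if (!atomic_load_explicit(&mr->live, memory_order_acquire))
		return 0;
	atomic_fetch_add_explicit(&mr->num_pending_prefetch, 1,
				  memory_order_relaxed);
	return 1;
}
```

Because the counter reset happens before the release store, a reader that sees live set can no longer have its increment overwritten by the initializer, which is the race shown in the diagram.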
    
    Add a missing live = 0 when an implicit MR child is deleted, before
    queuing work to do synchronize_srcu().
    
    The barriers in update_odp_mr() were a broken attempt to create an
    acquire/release pairing, but they were not even applied consistently and
    missed the point; delete them as well.
    
    Fixes: 6aec21f6 ("IB/mlx5: Page faults handling infrastructure")
    Link: https://lore.kernel.org/r/20191001153821.23621-6-jgg@ziepe.ca
    Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>