Commit 3994586f authored by Parav Pandit, committed by Jason Gunthorpe

RDMA/core: Acquire and release mmap_sem on page range

Currently mmap_sem is read locked while pinning the memory.  In a
multi-threaded process, holding the mmap_sem lock creates contention
with other threads that might be registering memory, creating QPs, or
simply calling mmap(), as such operations also require holding the
mmap_sem write lock.

None of these operations can make forward progress until the ongoing
memory pin operation completes.  The situation becomes even worse when
memory is being unpinned and/or the memory registration is large (in
the GB range).

Therefore, instead of holding mmap_sem for the entire region pinning,
acquire and release the lock every few pages.  For example, on x86 with
a 4K page size, mmap_sem is acquired and released for every 2 MB memory
chunk.
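
For reference, the 2 MB figure follows from pinning at most one page
worth of struct page pointers per get_user_pages_longterm() call; a
minimal user-space sketch of that arithmetic (assuming a 4K page size
and 8-byte pointers, not part of this patch):

#include <stdio.h>

int main(void)
{
	const unsigned long page_size = 4096;           /* assumed x86 PAGE_SIZE */
	const unsigned long ptr_size = sizeof(void *);  /* 8 bytes on x86-64 */
	unsigned long pages_per_chunk = page_size / ptr_size;    /* 512 pages */
	unsigned long chunk_bytes = pages_per_chunk * page_size; /* 2 MB */

	printf("pages per chunk: %lu\n", pages_per_chunk);
	printf("chunk size: %lu MB\n", chunk_bytes >> 20);
	return 0;
}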

This allows other competing threads, which may need to hold mmap_sem
for only a short duration, to make progress.

When memory registration latency is measured using [1] for memory sizes
ranging from 4K to 48GB, a degradation of no more than 0.5% to 1% is
noticed.  In many runs no difference is seen beyond run-to-run variance.

In other targeted tests by users with large memory regions, the desired
improvement is seen due to reduced contention on mmap_sem.

[1] https://github.com/paravmellanox/rtool

$ rdma_resource_lat -c 1 -s 48G -a -u L -i 500 -A

This registers pinned memory of sizes from 4K to 48GB, with 500
iterations for each memory size.

$ rdma_resource_lat -c 1 -s 12G -a -u L -i 500 -t 4

Four competing threads pin memory, 12GB each, with 500 iterations.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
parent b54900fc
drivers/infiniband/core/umem.c
@@ -181,8 +181,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	sg_list_start = umem->sg_head.sgl;
-	down_read(&mm->mmap_sem);
 	while (npages) {
+		down_read(&mm->mmap_sem);
 		ret = get_user_pages_longterm(cur_base,
 				     min_t(unsigned long, npages,
 					   PAGE_SIZE / sizeof (struct page *)),
@@ -196,17 +196,20 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		cur_base += ret * PAGE_SIZE;
 		npages -= ret;
+		/* Continue to hold the mmap_sem as vma_list access
+		 * needs to be protected.
+		 */
 		for_each_sg(sg_list_start, sg, ret, i) {
 			if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
 				umem->hugetlb = 0;
 			sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
 		}
+		up_read(&mm->mmap_sem);
 		/* preparing for next loop */
 		sg_list_start = sg;
 	}
-	up_read(&mm->mmap_sem);
 	umem->nmap = ib_dma_map_sg_attrs(context->device,
 					 umem->sg_head.sgl,
...
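
For illustration only, the contention relief described above can be
demonstrated with a small user-space analogue of the pattern in this
patch; a pthread rwlock stands in for mmap_sem and usleep() stands in
for the pinning work (a sketch under those assumptions, not kernel
code):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t sem = PTHREAD_RWLOCK_INITIALIZER; /* stand-in for mmap_sem */

static void *writer(void *arg)
{
	(void)arg;
	for (int i = 0; i < 5; i++) {
		pthread_rwlock_wrlock(&sem);   /* e.g. a thread doing mmap() */
		printf("writer made progress\n");
		pthread_rwlock_unlock(&sem);
		usleep(1000);
	}
	return NULL;
}

int main(void)
{
	pthread_t w;
	unsigned long npages = 32;

	pthread_create(&w, NULL, writer, NULL);
	while (npages) {
		unsigned long chunk = npages < 8 ? npages : 8;

		pthread_rwlock_rdlock(&sem);   /* acquire per chunk ...        */
		usleep(2000);                  /* ... "pin" one chunk of pages */
		npages -= chunk;
		pthread_rwlock_unlock(&sem);   /* ... release before the next
						* chunk so the writer can run */
	}
	pthread_join(w, NULL);
	return 0;
}

Built with gcc -pthread, the writer's messages typically interleave with
the chunked "pinning" loop; if the read lock were held across the whole
loop instead, the writer would block until all chunks were done.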