• Vlastimil Babka's avatar
    mm: munlock: manual pte walk in fast path instead of follow_page_mask() · 7a8010cd
    Vlastimil Babka authored
    Currently munlock_vma_pages_range() calls follow_page_mask() to obtain
    each individual struct page.  This entails repeated full page table
    translations and page table lock taken for each page separately.
    
    This patch avoids the costly follow_page_mask() where possible, by
    iterating over ptes within single pmd under single page table lock.  The
    first pte is obtained by get_locked_pte() for non-THP page acquired by the
    initial follow_page_mask().  The rest of the on-stack pagevec for munlock
    is filled up using pte_walk as long as pte_present() and vm_normal_page()
    are sufficient to obtain the struct page.
    
    After this patch, a 14% speedup was measured for munlocking a 56GB large
    memory area with THP disabled.
    Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Cc: Jörn Engel <joern@logfs.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Michel Lespinasse <walken@google.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@suse.cz>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    7a8010cd
mlock.c 21.2 KB