• Hugh Dickins's avatar
    [PATCH] mm: unlink vma before pagetables · 8f4f8c16
    Hugh Dickins authored
    In most places the descent from pgd to pud to pmd to pte holds mmap_sem
    (exclusively or not), which ensures that free_pgtables cannot be freeing page
    tables from any level at the same time.  But truncation and reverse mapping
    descend without mmap_sem.
    
    No problem: just make sure that a vma is unlinked from its prio_tree (or
    nonlinear list) and from its anon_vma list, after zapping the vma, but before
    freeing its page tables.  Then neither vmtruncate nor rmap can reach that vma
    whose page tables are now volatile (nor do they need to reach it, since all
    its page entries have been zapped by this stage).
    
    The i_mmap_lock and anon_vma->lock already serialize this correctly; but the
    locking hierarchy is such that we cannot take them while holding
    page_table_lock.  Well, we're trying to push that down anyway.  So in this
    patch, move anon_vma_unlink and unlink_file_vma into free_pgtables, at the
    same time as moving page_table_lock around calls to unmap_vmas.
    
    tlb_gather_mmu and tlb_finish_mmu then fall outside the page_table_lock, but
    we made them preempt_disable and preempt_enable earlier; and a long source
    audit of all the architectures has shown no problem with removing
    page_table_lock from them.  free_pgtables doesn't need page_table_lock for
    itself, nor for what it calls; tlb->mm->nr_ptes is usually protected by
    page_table_lock, but partly by non-exclusive mmap_sem - here it's decremented
    with exclusive mmap_sem, or mm_users 0.  update_hiwater_rss and
    vm_unacct_memory don't need page_table_lock either.
    Signed-off-by: default avatarHugh Dickins <hugh@veritas.com>
    Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    8f4f8c16
memory.c 59.5 KB