• Mike Kravetz's avatar
    hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization · b43a9990
    Mike Kravetz authored
    While looking at BUGs associated with invalid huge page map counts, it was
    discovered and observed that a huge pte pointer could become 'invalid' and
    point to another task's page table.  Consider the following:
    
    A task takes a page fault on a shared hugetlbfs file and calls
    huge_pte_alloc to get a ptep.  Suppose the returned ptep points to a
    shared pmd.
    
    Now, another task truncates the hugetlbfs file.  As part of truncation, it
    unmaps everyone who has the file mapped.  If the range being truncated is
    covered by a shared pmd, huge_pmd_unshare will be called.  For all but the
    last user of the shared pmd, huge_pmd_unshare will clear the pud pointing
    to the pmd.  If the task in the middle of the page fault is not the last
    user, the ptep returned by huge_pte_alloc now points to another task's
    page table or worse.  This leads to bad things such as incorrect page
    map/reference counts or invalid memory references.
    
    To fix, expand the use of i_mmap_rwsem as follows:
    
    - i_mmap_rwsem is held in read mode whenever huge_pmd_share is called.
      huge_pmd_share is only called via huge_pte_alloc, so callers of
      huge_pte_alloc take i_mmap_rwsem before calling.  In addition, callers
      of huge_pte_alloc continue to hold the semaphore until finished with the
      ptep.
    
    - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is
      called.
    
    [mike.kravetz@oracle.com: add explicit check for mapping != null]
    Link: http://lkml.kernel.org/r/20181218223557.5202-2-mike.kravetz@oracle.com
    Fixes: 39dde65c ("shared page table for hugetlb page")
    Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
    Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
    Cc: Colin Ian King <colin.king@canonical.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    b43a9990
rmap.c 53.9 KB