    mm: support GUP-triggered unsharing of anonymous pages · c89357e2
    David Hildenbrand authored
    Whenever GUP currently ends up taking a R/O pin on an anonymous page that
    might be shared -- mapped R/O and !PageAnonExclusive() -- any write fault
    on the page table entry will end up replacing the mapped anonymous page
    due to COW, resulting in the GUP pin no longer being consistent with the
    page actually mapped into the page table.
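
To make the inconsistency concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: `toy_page`, `toy_pte`, and the `toy_*` helpers are hypothetical names invented for illustration, and `anon_exclusive` merely stands in for PageAnonExclusive().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these types are the kernel's. */
struct toy_page {
    int  data;
    bool anon_exclusive;   /* stands in for PageAnonExclusive() */
};

struct toy_pte {
    struct toy_page *page; /* page currently mapped by this entry */
    bool writable;
};

/* R/O pin: remember which page the entry maps right now. */
static struct toy_page *toy_pin_readonly(const struct toy_pte *pte)
{
    return pte->page;
}

/*
 * Write fault on a R/O entry: reuse the page if it is exclusive,
 * otherwise copy it into 'fresh' and remap the entry (COW).
 */
static void toy_write_fault(struct toy_pte *pte, struct toy_page *fresh)
{
    if (!pte->page->anon_exclusive) {
        fresh->data = pte->page->data;
        fresh->anon_exclusive = true;
        pte->page = fresh;
    }
    pte->writable = true;
}

/* True if a R/O pin still refers to the mapped page after a write fault. */
static bool toy_pin_survives_write_fault(bool exclusive_at_pin_time)
{
    struct toy_page orig = { .data = 42, .anon_exclusive = exclusive_at_pin_time };
    struct toy_page fresh = { 0 };
    struct toy_pte pte = { .page = &orig, .writable = false };

    struct toy_page *pinned = toy_pin_readonly(&pte);
    toy_write_fault(&pte, &fresh);
    return pinned == pte.page;
}
```

In this model, a pin taken while the page is possibly shared ends up pointing at the old page once the write fault has replaced it, whereas a pin on an exclusive page stays consistent with the mapping.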
    
    The possible ways to deal with this situation are:
     (1) Ignore and pin -- what we do right now.
     (2) Fail to pin -- which would be rather surprising to callers and
         could break user space.
     (3) Trigger unsharing and pin the now exclusive page -- reliable R/O
         pins.
    
    We want to implement (3) because it provides the clearest semantics and
    allows unpin_user_pages() and friends to check for possible BUGs: if a
    page is no longer exclusive when we try to unpin it, something clearly
    went very wrong, and the result might be memory corruption that is hard
    to debug.  So we had better have a nice way to spot such issues.
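
The kind of unpin-time check this enables might look roughly like the following. This is a hedged userspace sketch under assumed names (`toy_page`, `toy_unpin`); it does not reproduce what unpin_user_pages() actually does, and the real check may differ in form and placement.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page state; field names are illustrative only. */
struct toy_page {
    bool anon;            /* anonymous page? */
    bool anon_exclusive;  /* stands in for PageAnonExclusive() */
    int  pincount;
};

/*
 * Drop one pin. Returns false if the sanity check trips: with reliable
 * R/O pins, a pinned anonymous page is expected to still be exclusive
 * at unpin time.
 */
static bool toy_unpin(struct toy_page *page)
{
    bool sane = !page->anon || page->anon_exclusive;

    if (page->pincount > 0)
        page->pincount--;
    return sane;
}
```

A caller could treat a `false` return as the point at which to warn loudly, rather than discovering the resulting corruption much later.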
    
    To implement (3), we need a way for GUP to trigger unsharing:
    FAULT_FLAG_UNSHARE.  FAULT_FLAG_UNSHARE is only applicable to R/O mapped
    anonymous pages and resembles COW logic during a write fault.  However, in
    contrast to a write fault, GUP-triggered unsharing will, for example,
    still maintain the write protection.
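
The contrast with a write fault can be sketched in a toy userspace model. Everything here (`toy_fault`, `toy_handle_fault`, and so on) is a hypothetical stand-in, not the kernel's fault path; the only point is that both fault kinds replace a possibly shared page with an exclusive copy, while only the write fault makes the entry writable.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative fault kinds; not the kernel's FAULT_FLAG_* bits. */
enum toy_fault { TOY_FAULT_WRITE, TOY_FAULT_UNSHARE };

struct toy_page { int data; bool anon_exclusive; };
struct toy_pte  { struct toy_page *page; bool writable; };

/* Handle a fault on a R/O-mapped anonymous page in the toy model. */
static void toy_handle_fault(struct toy_pte *pte, struct toy_page *fresh,
                             enum toy_fault kind)
{
    /* Both kinds break sharing by copying into a fresh exclusive page. */
    if (!pte->page->anon_exclusive) {
        fresh->data = pte->page->data;
        fresh->anon_exclusive = true;
        pte->page = fresh;
    }
    /* Only a write fault lifts the write protection. */
    if (kind == TOY_FAULT_WRITE)
        pte->writable = true;
}

/* Run one fault against a shared page; report whether the entry ends writable. */
static bool toy_writable_after_fault(enum toy_fault kind)
{
    struct toy_page shared = { .data = 7, .anon_exclusive = false };
    struct toy_page fresh = { 0 };
    struct toy_pte pte = { .page = &shared, .writable = false };

    toy_handle_fault(&pte, &fresh, kind);
    assert(pte.page->anon_exclusive);   /* exclusive in both cases */
    assert(pte.page->data == 7);        /* contents preserved */
    return pte.writable;
}
```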
    
    Let's implement FAULT_FLAG_UNSHARE by hooking into the existing write
    fault handlers for all applicable anonymous page types: ordinary pages,
    THP and hugetlb.
    
    * If FAULT_FLAG_UNSHARE finds a R/O-mapped anonymous page that has been
      marked exclusive in the meantime by someone else, there is nothing to do.
    * If FAULT_FLAG_UNSHARE finds a R/O-mapped anonymous page that's not
      marked exclusive, it will try to detect whether the process is the
      exclusive owner. If so, the page can be marked exclusive via
      page_move_anon_rmap(), similar to the reuse logic during write faults,
      and there is nothing else to do; otherwise, we either have to copy and
      map a fresh, anonymous exclusive page R/O (ordinary pages, hugetlb), or
      split the THP.
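
The two cases above reduce to a small decision table. The sketch below encodes it as a pure function in a userspace toy; the enum and function names are assumptions made for illustration, and the real handlers for ordinary pages, THP, and hugetlb are separate code paths rather than one function.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative page kinds and outcomes; names are not from the kernel. */
enum toy_page_kind { TOY_ORDINARY, TOY_THP, TOY_HUGETLB };

enum toy_unshare_action {
    TOY_NOTHING_TO_DO,     /* already marked exclusive by someone else */
    TOY_MARK_EXCLUSIVE,    /* sole owner: reuse, as page_move_anon_rmap() would allow */
    TOY_COPY_AND_MAP_RO,   /* copy into a fresh exclusive page, keep it R/O */
    TOY_SPLIT_THP          /* THP that cannot be reused is split instead */
};

/* Decide what an unshare fault does for a R/O-mapped anonymous page. */
static enum toy_unshare_action
toy_unshare_decide(enum toy_page_kind kind, bool anon_exclusive, bool sole_owner)
{
    if (anon_exclusive)
        return TOY_NOTHING_TO_DO;
    if (sole_owner)
        return TOY_MARK_EXCLUSIVE;
    if (kind == TOY_THP)
        return TOY_SPLIT_THP;
    return TOY_COPY_AND_MAP_RO;
}
```

For example, `toy_unshare_decide(TOY_HUGETLB, false, false)` yields `TOY_COPY_AND_MAP_RO`, while the same inputs with `TOY_THP` yield `TOY_SPLIT_THP`.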
    
    This commit is heavily based on patches by Andrea.
    
    Link: https://lkml.kernel.org/r/20220428083441.37290-16-david@redhat.com
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Co-developed-by: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Don Dutile <ddutile@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jason Gunthorpe <jgg@nvidia.com>
    Cc: John Hubbard <jhubbard@nvidia.com>
    Cc: Khalid Aziz <khalid.aziz@oracle.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Liang Zhang <zhangliang5@huawei.com>
    Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Mike Rapoport <rppt@linux.ibm.com>
    Cc: Nadav Amit <namit@vmware.com>
    Cc: Oded Gabbay <oded.gabbay@gmail.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Rik van Riel <riel@surriel.com>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>