    mm/mmap: separate writenotify and dirty tracking logic · 54cbbbf3
    Lorenzo Stoakes authored
    Patch series "mm/gup: disallow GUP writing to file-backed mappings by
    default", v9.
    
    Writing to file-backed mappings which require folio dirty tracking using
    GUP is a fundamentally broken operation, as kernel write access to GUP
    mappings does not adhere to the semantics expected by a file system.
    
    A GUP caller uses the direct mapping to access the folio, which does not
    cause write notify to trigger, nor does it enforce that the caller marks
    the folio dirty.
    
    The problem arises when, after an initial write to the folio, writeback
    results in the folio being cleaned and then the caller, via the GUP
    interface, writes to the folio again.
    
    Because this secondary, direct mapping to the folio is used, no write
    notify will occur, and if the caller does mark the folio dirty, it does so
    at a point the file system does not expect.
    
    For example, consider the following scenario:-
    
    1. A folio is written to via GUP which write-faults the memory, notifying
       the file system and dirtying the folio.
    2. Later, writeback is triggered, resulting in the folio being cleaned and
       the PTE being marked read-only.
    3. The GUP caller writes to the folio, as it is mapped read/write via the
       direct mapping.
    4. The GUP caller, now done with the page, unpins it and sets it dirty
       (though it does not have to).
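
    The four steps above can be modelled in userspace as a rough sketch.
    Nothing below is kernel code: struct model_folio and the model_*()
    helpers are hypothetical names invented for illustration, and the model
    reduces the folio to a dirty flag, one PTE permission bit and a counter of
    file system notifications.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one file-backed folio; not kernel code. */
struct model_folio {
	bool dirty;             /* dirty state as the file system sees it */
	bool pte_writable;      /* permission of the userspace PTE */
	int  fs_notifications;  /* times the file system was told of a write */
	int  unnotified_writes; /* writes that hit a clean folio silently */
	int  data;              /* stand-in for the folio contents */
};

/* Step 1: a write that goes through the PTE and so can write-fault. */
void model_write_fault(struct model_folio *f, int value)
{
	assert(f);
	if (!f->pte_writable) {
		f->fs_notifications++;  /* write notify reaches the file system */
		f->pte_writable = true;
	}
	f->dirty = true;
	f->data = value;
}

/* Step 2: writeback cleans the folio and write-protects the PTE. */
void model_writeback(struct model_folio *f)
{
	assert(f);
	f->dirty = false;
	f->pte_writable = false;
}

/* Step 3: the pinned caller writes via the direct mapping, bypassing the PTE. */
void model_write_direct(struct model_folio *f, int value)
{
	assert(f);
	if (!f->dirty)
		f->unnotified_writes++;  /* no fault, hence no write notify */
	f->data = value;
}

/* Step 4: unpin, optionally setting the folio dirty with no notification. */
void model_unpin(struct model_folio *f, bool mark_dirty)
{
	assert(f);
	if (mark_dirty)
		f->dirty = true;
}

/* Run steps 1-4 in order and return the resulting state. */
struct model_folio model_run_scenario(bool mark_dirty_on_unpin)
{
	struct model_folio f = {0};

	model_write_fault(&f, 1);
	model_writeback(&f);
	model_write_direct(&f, 2);
	model_unpin(&f, mark_dirty_on_unpin);
	return f;
}
```

    In this model both outcomes of step 4 are bad: if the caller marks the
    folio dirty, the file system sees a dirty folio after only one
    notification; if it does not, the folio stays clean even though its
    contents changed.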
    
    This change updates both the PUP FOLL_LONGTERM slow and fast APIs.  As
    pin_user_pages_fast_only() does not exist, we can rely on a slightly
    imperfect whitelisting in the PUP-fast case and fall back to the slow case
    should this fail.
    
    
    This patch (of 3):
    
    vma_wants_writenotify() is specifically intended for setting PTE page
    table flags, accounting for existing page table flag state and whether the
    underlying filesystem performs dirty tracking for a file-backed mapping.
    
    Everything is predicated firstly on whether the mapping is shared
    writable, as this is the only instance where dirty tracking is pertinent -
    MAP_PRIVATE mappings will always be CoW'd and unshared, and read-only
    file-backed shared mappings cannot be written to, even with FOLL_FORCE.
    
    All other checks are in line with existing logic, though now separated
    into checks explicitly for dirty tracking and those for determining how to
    set page table flags.
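
    The shape of this separation can be sketched in userspace as below.  This
    is a deliberately simplified model, not the code in mm/mmap.c: struct
    model_vma, the MODEL_VM_* flags and the model_*() functions are
    hypothetical, and the kernel's individual checks (and their ordering) are
    collapsed into two booleans purely as an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; they only mirror the idea of VM_WRITE/VM_SHARED. */
#define MODEL_VM_WRITE  0x1u
#define MODEL_VM_SHARED 0x2u

/* Hypothetical, heavily reduced stand-in for a VMA; not kernel code. */
struct model_vma {
	unsigned int flags;
	bool file_backed;       /* mapping is backed by a file */
	bool fs_tracks_dirty;   /* assumed: file system performs dirty tracking */
	bool pte_prot_writable; /* assumed: current page protection allows writes */
};

/* Everything is predicated on the mapping being both shared and writable. */
bool model_is_shared_writable(const struct model_vma *vma)
{
	const unsigned int both = MODEL_VM_WRITE | MODEL_VM_SHARED;

	assert(vma);
	return (vma->flags & both) == both;
}

/* Dirty tracking question: independent of page table flag state. */
bool model_needs_dirty_tracking(const struct model_vma *vma)
{
	if (!model_is_shared_writable(vma))
		return false;
	return vma->file_backed && vma->fs_tracks_dirty;
}

/* Page table flag question: also accounts for existing protection state. */
bool model_wants_writenotify(const struct model_vma *vma)
{
	if (!model_is_shared_writable(vma))
		return false;
	/* PTEs already read-only: there is nothing to write-protect. */
	if (!vma->pte_prot_writable)
		return false;
	return model_needs_dirty_tracking(vma);
}
```

    The point of the split, in this model, is that a caller asking "does this
    mapping need dirty tracking?" gets an answer that does not depend on how
    the PTEs currently happen to be set up.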
    
    We make this change so we can perform checks in the GUP logic to determine
    which mappings might be problematic when written to.
    
    Link: https://lkml.kernel.org/r/cover.1683235180.git.lstoakes@gmail.com
    Link: https://lkml.kernel.org/r/0f218370bd49b4e6bbfbb499f7c7b92c26ba1ceb.1683235180.git.lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: John Hubbard <jhubbard@nvidia.com>
    Reviewed-by: Mika Penttilä <mpenttil@redhat.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: Kirill A . Shutemov <kirill@shutemov.name>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>