• Michal Hocko's avatar
    mm, gup: close FOLL MAP_PRIVATE race · 1c8544a9
    Michal Hocko authored
    commit 19be0eaf upstream.
    
    faultin_page drops FOLL_WRITE after the page fault handler did the CoW
    and then we retry follow_page_mask to get our CoWed page. This is racy,
    however because the page might have been unmapped by that time and so
    we would have to do a page fault again, this time without CoW. This
    would cause the page cache corruption for FOLL_FORCE on MAP_PRIVATE
    read only mappings with obvious consequences.
    
    This is an ancient bug that was actually already fixed once by Linus
    eleven years ago in commit 4ceb5db9 ("Fix get_user_pages() race
    for write access") but that was then undone due to problems on s390
    by commit f33ea7f4 ("fix get_user_pages bug") because s390 didn't
    have proper dirty pte tracking until abf09bed ("s390/mm: implement
    software dirty bits"). This wasn't a problem at the time as pointed out
    by Hugh Dickins because madvise relied on mmap_sem for write up until
    0a27a14a ("mm: madvise avoid exclusive mmap_sem") but since then we
    can race with madvise which can unmap the fresh COWed page or with KSM
    and corrupt the content of the shared page.
    
    This patch is based on the Linus' approach to not clear FOLL_WRITE after
    the CoW page fault (aka VM_FAULT_WRITE) but instead introduces FOLL_COW
    to note this fact. The flag is then rechecked during follow_pfn_pte to
    enforce the page fault again if we do not see the CoWed page. Linus was
    suggesting to check pte_dirty again as s390 is OK now. But that would
    make backporting to some old kernels harder. So instead let's just make
    sure that vm_normal_page sees a pure anonymous page.
    
    This would guarantee we are seeing a real CoW page. Introduce
    can_follow_write_pte which checks both pte_write and falls back to
    PageAnon on forced write faults which passed CoW already. Thanks to Hugh
    to point out that a special care has to be taken for KSM pages because
    our COWed page might have been merged with a KSM one and keep its
    PageAnon flag.
    
    Fixes: 0a27a14a ("mm: madvise avoid exclusive mmap_sem")
    Reported-by: default avatarPhil "not Paul" Oester <kernel@linuxace.com>
    Disclosed-by: default avatarAndy Lutomirski <luto@kernel.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
    [bwh: Backported to 3.2:
     - Adjust filename, context, indentation
     - The 'no_page' exit path in follow_page() is different, so open-code the
       cleanup
     - Delete a now-unused label]
    Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
    Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
    1c8544a9
memory.c 111 KB