1. 19 Oct, 2021 1 commit
    • Peter Xu's avatar
      mm/userfaultfd: selftests: fix memory corruption with thp enabled · 8913970c
      Peter Xu authored
      In RHEL's gating selftests we've encountered memory corruption in the
      uffd event test even with upstream kernel:
      
              # ./userfaultfd anon 128 4
              nr_pages: 32768, nr_pages_per_cpu: 32768
              bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
              bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
              bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
              bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
              testing uffd-wp with pagemap (pgsize=4096): done
              testing uffd-wp with pagemap (pgsize=2097152): done
              testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
              ERROR: faulting process failed (errno=0, line=1117)
      
      It can be easily reproduced when global thp enabled, which is the
      default for RHEL.
      
      It's also known as a side effect of commit 0db282ba ("selftest: use
      mmap instead of posix_memalign to allocate memory", 2021-07-23), which
      is imho right itself on using mmap() to make sure the addresses will be
      untagged even on arm.
      
      The problem is, for each test we allocate buffers using two
      allocate_area() calls.  We assumed these two buffers won't affect each
      other, however they could, because mmap() could have found that the two
      buffers are near each other and having the same VMA flags, so they got
      merged into one VMA.
      
      It won't be a big problem if thp is not enabled, but when thp is
      agressively enabled it means when initializing the src buffer it could
      accidentally setup part of the dest buffer too when there's a shared THP
      that overlaps the two regions.  Then some of the dest buffer won't be
      able to be trapped by userfaultfd missing mode, then it'll cause memory
      corruption as described.
      
      To fix it, do release_pages() after initializing the src buffer.
      
      Since the previous two release_pages() calls are after
      uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
      stronger than release pages; as unmap() also tear town pgtables), drop
      them as they shouldn't really be anything useful.
      
      We can mark the Fixes tag upon 0db282ba as it's reported to only
      happen there, however the real "Fixes" IMHO should be 8ba6e864, as
      before that commit we'll always do explicit release_pages() before
      registration of uffd, and 8ba6e864 changed that logic by adding
      extra unmap/map and we didn't release the pages at the right place.
      Meanwhile I don't have a solid glue anyway on whether posix_memalign()
      could always avoid triggering this bug, hence it's safer to attach this
      fix to commit 8ba6e864.
      
      Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
      Fixes: 8ba6e864 ("userfaultfd/selftests: reinitialize test context in each test")
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Reported-by: default avatarLi Wang <liwan@redhat.com>
      Tested-by: default avatarLi Wang <liwang@redhat.com>
      Reviewed-by: default avatarAxel Rasmussen <axelrasmussen@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8913970c
  2. 18 Oct, 2021 17 commits
  3. 17 Oct, 2021 3 commits
  4. 16 Oct, 2021 19 commits