  1. 18 Apr, 2023 1 commit
  2. 06 Apr, 2023 13 commits
  3. 28 Mar, 2023 7 commits
  4. 20 Feb, 2023 1 commit
  5. 10 Feb, 2023 2 commits
    • mm: introduce __vm_flags_mod and use it in untrack_pfn · 68f48381
      Suren Baghdasaryan authored
      There are scenarios when vm_flags can be modified without exclusive
      mmap_lock, such as:
      - after VMA was isolated and mmap_lock was downgraded or dropped
      - in exit_mmap when there are no other mm users and locking is unnecessary
      Introduce __vm_flags_mod to avoid assertions when the caller takes
      responsibility for the required locking.
      Pass a hint to untrack_pfn so it can conditionally use __vm_flags_mod
      for the flags modification and avoid the assertion.
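
      A minimal sketch of the pair, assuming the vm_flags_init() wrapper
      introduced earlier in this series (simplified, not the exact patch):

        /*
         * No-assert variant: the caller guarantees exclusive access
         * (e.g. an isolated VMA, or the last mm user in exit_mmap).
         */
        static inline void __vm_flags_mod(struct vm_area_struct *vma,
                                          vm_flags_t set, vm_flags_t clear)
        {
                vm_flags_init(vma, (vma->vm_flags | set) & ~clear);
        }

        /* Locked variant: asserts mmap_lock is held for write. */
        static inline void vm_flags_mod(struct vm_area_struct *vma,
                                        vm_flags_t set, vm_flags_t clear)
        {
                mmap_assert_write_locked(vma->vm_mm);
                __vm_flags_mod(vma, set, clear);
        }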
      
      Link: https://lkml.kernel.org/r/20230126193752.297968-7-surenb@google.com
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjun Roy <arjunroy@google.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Laurent Dufour <ldufour@linux.ibm.com>
      Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Minchan Kim <minchan@google.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Peter Oskolkov <posk@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Punit Agrawal <punit.agrawal@bytedance.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Sebastian Reichel <sebastian.reichel@collabora.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: replace vma->vm_flags direct modifications with modifier calls · 1c71222e
      Suren Baghdasaryan authored
      Replace direct modifications of vma->vm_flags with calls to modifier
      functions, so that flag changes can be tracked and VMA locking
      correctness can be maintained.
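
      For illustration, a typical conversion in this series looks roughly
      like this (hypothetical call site; vm_flags_set()/vm_flags_clear()
      are the modifier helpers introduced earlier in the series):

        /* Before: direct modification, invisible to locking checks. */
        vma->vm_flags |= VM_DONTEXPAND;
        vma->vm_flags &= ~VM_MAYWRITE;

        /* After: modifier calls that can assert mmap_lock and track
         * flag changes. */
        vm_flags_set(vma, VM_DONTEXPAND);
        vm_flags_clear(vma, VM_MAYWRITE);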
      
      [akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo]
      Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
      Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjun Roy <arjunroy@google.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Laurent Dufour <ldufour@linux.ibm.com>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Minchan Kim <minchan@google.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Peter Oskolkov <posk@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Punit Agrawal <punit.agrawal@bytedance.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  6. 09 Feb, 2023 1 commit
  7. 03 Feb, 2023 7 commits
  8. 19 Jan, 2023 6 commits
    • mm: add vma_has_recency() · 8788f678
      Yu Zhao authored
      Add vma_has_recency() to indicate whether a VMA may exhibit temporal
      locality that the LRU algorithm relies on.
      
      This function returns false for VMAs marked by VM_SEQ_READ or
      VM_RAND_READ.  While the former flag indicates linear access, i.e., a
      special case of spatial locality, both flags indicate a lack of temporal
      locality, i.e., the reuse of an area within a relatively small duration.
      
      "Recency" is chosen over "locality" to avoid confusion between temporal
      and spatial localities.
      
      Before this patch, the active/inactive LRU only ignored the accessed bit
      from VMAs marked by VM_SEQ_READ.  After this patch, the active/inactive
      LRU and MGLRU share the same logic: they both ignore the accessed bit if
      vma_has_recency() returns false.
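
      A minimal sketch of the helper, assuming only the two flag checks
      described above (the actual patch may carry extra conditions):

        static inline bool vma_has_recency(struct vm_area_struct *vma)
        {
                /* Both hints declare a lack of temporal locality. */
                if (vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))
                        return false;

                return true;
        }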
      
      For the active/inactive LRU, the following fio test showed a [6, 8]%
      increase in IOPS when randomly accessing mapped files under memory
      pressure.
      
        # Reserve all RAM except 8 GiB for a ram-backed block device.
        kb=$(awk '/MemTotal/ { print $2 }' /proc/meminfo)
        kb=$((kb - 8*1024*1024))

        # Create the ramdisk and fill it so its pages are materialized.
        modprobe brd rd_nr=1 rd_size=$kb
        dd if=/dev/zero of=/dev/ram0 bs=1M

        # Put a filesystem on it and rule out swap effects.
        mkfs.ext4 /dev/ram0
        mount /dev/ram0 /mnt/
        swapoff -a

        # Random read/write through mmap; the ramdisk leaves only ~8 GiB
        # of RAM for the mapped files, creating memory pressure.
        fio --name=test --directory=/mnt/ --ioengine=mmap --numjobs=8 \
            --size=8G --rw=randrw --time_based --runtime=10m \
            --group_reporting
      
      The discussion that led to this patch is here [1].  Additional test
      results are available in that thread.
      
      [1] https://lore.kernel.org/r/Y31s%2FK8T85jh05wH@google.com/
      
      Link: https://lkml.kernel.org/r/20221230215252.2628425-1-yuzhao@google.com
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andrea Righi <andrea.righi@canonical.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michael Larabel <Michael@MichaelLarabel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: remove zap_page_range and create zap_vma_pages · e9adcfec
      Mike Kravetz authored
      zap_page_range was originally designed to unmap pages within an address
      range that could span multiple vmas.  While working on [1], it was
      discovered that all callers of zap_page_range pass a range entirely
      within a single vma.  In addition, the mmu notification call within
      zap_page_range does not correctly handle ranges that span multiple
      vmas.  When crossing a vma boundary, a new mmu_notifier_range_init/end
      call pair for the new vma should be made.
      
      Instead of fixing zap_page_range, do the following:
      - Create a new routine zap_vma_pages() that will remove all pages
        within the passed vma (a minimal sketch follows this list).  Most
        users of zap_page_range pass the entire vma and can use this new
        routine.
      - For callers of zap_page_range not passing the entire vma, instead call
        zap_page_range_single().
      - Remove zap_page_range.
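
      A minimal sketch of the new helper, assuming zap_page_range_single()
      takes (vma, address, size, details):

        /* Remove all user pages mapped by @vma. */
        static inline void zap_vma_pages(struct vm_area_struct *vma)
        {
                zap_page_range_single(vma, vma->vm_start,
                                      vma->vm_end - vma->vm_start, NULL);
        }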
      
      [1] https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.kravetz@oracle.com/
      Link: https://lkml.kernel.org/r/20230104002732.232573-1-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Suggested-by: Peter Xu <peterx@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Peter Xu <peterx@redhat.com>
      Acked-by: Heiko Carstens <hca@linux.ibm.com>	[s390]
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/memory: add vm_normal_folio() · 318e9342
      Vishal Moola (Oracle) authored
      Patch series "Convert deactivate_page() to folio_deactivate()", v4.
      
      Deactivate_page() has already been converted to use folios.  This patch
      series modifies the callers of deactivate_page() to use folios.  It also
      introduces vm_normal_folio() to assist with folio conversions, and
      converts deactivate_page() to folio_deactivate() which takes in a folio.
      
      
      This patch (of 4):
      
      Introduce a wrapper function called vm_normal_folio().  This function
      calls vm_normal_page() and returns the folio of the page found, or
      NULL if no page is found.

      This function allows callers to get a folio from a pte, which will
      eventually allow them to replace their struct page variables with
      struct folio entirely.
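
      A sketch of the wrapper as described (assuming the existing
      page_folio() conversion helper):

        struct folio *vm_normal_folio(struct vm_area_struct *vma,
                                      unsigned long addr, pte_t pte)
        {
                struct page *page = vm_normal_page(vma, addr, pte);

                /* Return the folio containing the page, or NULL. */
                if (page)
                        return page_folio(page);
                return NULL;
        }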
      
      Link: https://lkml.kernel.org/r/20221221180848.20774-1-vishal.moola@gmail.com
      Link: https://lkml.kernel.org/r/20221221180848.20774-2-vishal.moola@gmail.com
      Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: SeongJae Park <sj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/uffd: always wr-protect pte in pte|pmd_mkuffd_wp() · f1eb1bac
      Peter Xu authored
      This patch is a cleanup to always wr-protect pte/pmd in mkuffd_wp
      paths.

      The reasons I still think this patch is worthwhile are:

        (1) It is a cleanup already; the diffstat shows it.

        (2) It feels natural after thinking about it: if the pte is uffd
            protected, remove the write bit no matter what it was.

        (3) Since x86 is the only arch that supports uffd-wp, it also
            redefines pte|pmd_mkuffd_wp() so that they always remove the
            write bit (sketched below).  Any future arch that wants to
            implement uffd-wp should naturally follow this rule too.  It's
            good to make it a default, even if with vm_page_prot changes on
            VM_UFFD_WP.

        (4) It covers more than vm_page_prot.  So there is no chance of any
            potential future "accident" (like pte_mkdirty() on sparc64 or
            loongarch, even though the latter just got its pte_mkdirty fixed
            less than a month ago).  It will also be clear when reading the
            code that we don't need to worry about the state of the write
            bit before a pte_mkuffd_wp().
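
      As a sketch of what rule (3) means on x86 (simplified; pte_set_flags()
      is the existing x86 helper for setting pte bits):

        static inline pte_t pte_mkuffd_wp(pte_t pte)
        {
                /* Marking a pte uffd-wp now always drops the write bit. */
                return pte_wrprotect(pte_set_flags(pte, _PAGE_UFFD_WP));
        }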
      
      We may call pte_wrprotect() one more time in some paths (e.g.  thp
      split), but that should be a fully local bitop instruction, so the
      overhead should be negligible.
      
      This patch should logically also fix all the recently known uffd-wp
      issues on page migration (though not NUMA hint recovery, which may
      need another explicit pte_wrprotect), but that is not the plan for
      those fixes.  So no Fixes tag, and stable does not need this.
      
      Link: https://lkml.kernel.org/r/20221214201533.1774616-1-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ives van Hoorne <ives@codesandbox.io>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: fix a few rare cases of using swapin error pte marker · 7e3ce3f8
      Peter Xu authored
      This patch hardens commit 15520a3f ("mm: use pte markers for swap
      errors") in its use of pte markers for swapin errors, covering a few
      corner cases:
      
      1. Propagate swapin errors across fork(): if there are swapin errors
         in the parent mm, then after fork() the child should sigbus too
         when an error page is accessed (the copy decision is sketched
         after this list).

      2. Fix a rare race condition in pte_marker_clear() where a uffd-wp
         pte marker can be quickly switched to a swapin error.

      3. Explicitly ignore swapin error pte markers in change_protection().
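
      A rough sketch of the fork-side decision for case (1), pulled out as
      a hypothetical helper (the real change lives inline in the fork copy
      path; helper names follow the pte marker series):

        /* Should a parent's pte marker be copied into the child? */
        static bool pte_marker_needs_copy(swp_entry_t entry,
                                          struct vm_area_struct *dst_vma)
        {
                /* Swapin errors must propagate so the child sigbuses too. */
                if (is_swapin_error_entry(entry))
                        return true;

                /* uffd-wp markers persist only if dst is still registered. */
                return userfaultfd_wp(dst_vma);
        }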
      
      I am mostly not worried about (2) or (3) at all, but we should still
      have them.  Case (1) is special because it can potentially cause
      silent data corruption in the child when the parent has a swapin
      error triggered with swapoff; but since swapin errors are already
      rare, it is probably not easy to trigger either.
      
      Currently there is a priority difference between the uffd-wp bit and the
      swapin error entry, in which the swapin error always has higher priority
      (e.g.  we don't need to wr-protect a swapin error pte marker).
      
      If a third bit is introduced, we'll probably need to consider a more
      involved approach and may need to start operating on the bits.  Let's
      leave that for later.
      
      This patch was tested with case (1) explicitly: before the patch, the
      child reads corrupted data when existing swapin error pte markers are
      present; with the patch applied, the child is rightfully killed.
      
      We don't need to CC stable for this one since 15520a3f just landed as
      part of v6.2-rc1; only a "Fixes" tag is applied.
      
      Link: https://lkml.kernel.org/r/20221214200453.1772655-3-peterx@redhat.com
      Fixes: 15520a3f ("mm: use pte markers for swap errors")
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Pengfei Xu <pengfei.xu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/uffd: fix pte marker when fork() without fork event · 49d6d7fb
      Peter Xu authored
      Patch series "mm: Fixes on pte markers".
      
      Patch 1 resolves the syzkaller report from Pengfei.
      
      Patch 2 further hardens pte markers when used with the recent swapin
      error markers.  The major case is that we should persist a swapin
      error marker across fork(), so the child cannot read a corrupted page.
      
      
      This patch (of 2):
      
      At fork() time, dst_vma is not guaranteed to have VM_UFFD_WP even if
      the src vma has it and has a pte marker installed.  The warning,
      along with its comment, is therefore improper.  The right thing is to
      inherit the pte marker when needed, or to keep the dst pte empty.

      My best guess is that this happened by accident when a prior patch
      introduced src/dst vma into this helper while the uffd-wp feature was
      being developed, and something got messed up in a rebase: if dst_vma
      is replaced with src_vma, the warning and the comment both make
      sense.

      Hugetlb already does exactly the right thing here
      (copy_hugetlb_page_range()).  Fix the general path.
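
      Roughly, the fix replaces the improper warning with inherit-or-drop
      logic (a sketch, not the exact diff; variable names assumed from the
      fork copy path):

        /* Before: warns whenever dst_vma lost VM_UFFD_WP at fork(). */
        WARN_ON_ONCE(!userfaultfd_wp(dst_vma));

        /* After: inherit the marker only when dst is still uffd-wp
         * registered; otherwise leave the dst pte empty. */
        if (userfaultfd_wp(dst_vma))
                set_pte_at(dst_mm, addr, dst_pte, pte);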
      
      Reproducer:
      
      https://github.com/xupengfe/syzkaller_logs/blob/main/221208_115556_copy_page_range/repro.c
      
      Bugzilla report: https://bugzilla.kernel.org/show_bug.cgi?id=216808
      
      Link: https://lkml.kernel.org/r/20221214200453.1772655-1-peterx@redhat.com
      Link: https://lkml.kernel.org/r/20221214200453.1772655-2-peterx@redhat.com
      Fixes: c56d1b62 ("mm/shmem: handle uffd-wp during fork()")
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reported-by: Pengfei Xu <pengfei.xu@intel.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: <stable@vger.kernel.org> # 5.19+
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  9. 12 Dec, 2022 1 commit
  10. 30 Nov, 2022 1 commit