1. 19 Jun, 2023 40 commits
    • mm/folio: avoid special handling for order value 0 in folio_set_order · e3b7bf97
      Tarun Sahu authored
      folio_set_order(folio, 0) is used in the kernel in two places:
      __destroy_compound_gigantic_folio and __prep_compound_gigantic_folio.
      Currently, it is called to clear out folio->_folio_nr_pages and
      folio->_folio_order.
      
      For __destroy_compound_gigantic_folio:
      In the past, folio_set_order(folio, 0) was needed because page->mapping
      used to overlap with _folio_nr_pages and _folio_order. If these fields
      were left uncleared while freeing gigantic hugepages, they caused
      "BUG: bad page state" due to a non-zero page->mapping. Now, after
      commit a01f4390 ("hugetlb: be sure to free demoted CMA pages to
      CMA"), page->mapping is explicitly cleared for tail pages. Also,
      _folio_order and _folio_nr_pages no longer overlap with page->mapping.
      
      So, folio_set_order(folio, 0) can be removed from the gigantic folio
      freeing path (__destroy_compound_gigantic_folio).
      
      The other place where folio_set_order(folio, 0) is called is the error
      path of __prep_compound_gigantic_folio. Here, folio_set_order(folio, 0)
      can also be removed if we move folio_set_order(folio, order) after the
      for loop.
      
      The patch also moves the _folio_set_head call in
      __prep_compound_gigantic_folio() so that we avoid clearing the head
      flag in the error path.
      
      Also, as Mike pointed out:
      "It would actually be better to move the calls _folio_set_head and
      folio_set_order in __prep_compound_gigantic_folio() as suggested here. Why?
      In the current code, the ref count on the 'head page' is still 1 (or more)
      while those calls are made. So, someone could take a speculative ref on the
      page BEFORE the tail pages are set up."
      
      This way, folio_set_order(folio, 0) is no longer needed. It also
      helps remove the confusion of the folio order being set to 0 (as the
      _folio_order field is part of the first tail page).
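
      The reordering described above can be sketched in a small userspace
      model. This is only an illustration under stated assumptions, not the
      kernel code: struct model_page and model_prep_compound() are
      hypothetical names, and the reference-count check stands in for
      whatever the real loop validates on each tail page. The sketch shows
      why writing the head flag and order only after the loop leaves nothing
      to clear on the error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a page; not the kernel's struct page. */
struct model_page {
	int refcount;       /* expected to be exactly 1 before setup */
	bool is_head;       /* models the head flag */
	bool is_tail;       /* models a tail page linked to its head */
	unsigned int order; /* models the recorded folio order */
};

/*
 * Set up pages[0 .. (1 << order) - 1] as one compound unit.
 * Tail pages are prepared first; the head flag and order are written
 * only after every tail page succeeded. On failure the already prepared
 * tail pages are rolled back and false is returned.
 */
static bool model_prep_compound(struct model_page *pages, unsigned int order)
{
	size_t nr_pages = (size_t)1 << order;
	size_t i;

	assert(pages != NULL);

	for (i = 1; i < nr_pages; i++) {
		if (pages[i].refcount != 1)
			goto out_error;
		pages[i].refcount = 0; /* model freezing the reference count */
		pages[i].is_tail = true;
	}

	/* Publish head state last: no error can occur after this point. */
	pages[0].is_head = true;
	pages[0].order = order;
	return true;

out_error:
	/* Roll back tails 1 .. i-1; the head was never touched. */
	for (size_t j = 1; j < i; j++) {
		pages[j].is_tail = false;
		pages[j].refcount = 1;
	}
	return false;
}
```

      In this model the error path contains no equivalent of
      folio_set_order(folio, 0), because the order is never written before
      the loop has completed.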
      
      Testing: I have run the LTP tests, which all pass. I have also written
      an LTP test which checks for the bug caused by compound_nr and
      page->mapping overlapping.
      
      https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/hugetlb/hugemmap/hugemmap32.c
      
      On an older kernel (< 5.10-rc7) with the above bug, this test fails,
      while on a newer kernel, and also with this patch, it passes.
      
      Link: https://lkml.kernel.org/r/20230609162907.111756-1-tsahu@linux.ibm.com
      Signed-off-by: Tarun Sahu <tsahu@linux.ibm.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • vmstat: skip periodic vmstat update for isolated CPUs · be5e015d
      Marcelo Tosatti authored
      Problem: The interruption caused by vmstat_update is undesirable
      for certain applications.
      
      This affects workloads that run on isolated CPUs in nohz_full mode to
      shield them from any kernel interruption. One example is a VM running a
      time-sensitive application with a 50us maximum acceptable interruption
      (use case: soft PLC).
      
      oslat   1094.456862: sys_mlock(start: 7f7ed0000b60, len: 1000)
      oslat   1094.456971: workqueue_queue_work: ... function=vmstat_update ...
      oslat   1094.456974: sched_switch: prev_comm=oslat ... ==> next_comm=kworker/5:1 ...
      kworker 1094.456978: sched_switch: prev_comm=kworker/5:1 ==> next_comm=oslat ...
      
      The example above shows an additional 7us for the
              oslat -> kworker -> oslat
      
      switches. In the case of a virtualized CPU, and the vmstat_update
      interruption in the host (of a qemu-kvm vcpu), the latency penalty
      observed in the guest is higher than 50us, violating the acceptable
      latency threshold.
      
      The isolated vCPU can perform operations that modify per-CPU page counters,
      for example to complete I/O operations:
      
            CPU 11/KVM-9540    [001] dNh1.  2314.248584: mod_zone_page_state <-__folio_end_writeback
            CPU 11/KVM-9540    [001] dNh1.  2314.248585: <stack trace>
       => 0xffffffffc042b083
       => mod_zone_page_state
       => __folio_end_writeback
       => folio_end_writeback
       => iomap_finish_ioend
       => blk_mq_end_request_batch
       => nvme_irq
       => __handle_irq_event_percpu
       => handle_irq_event
       => handle_edge_irq
       => __common_interrupt
       => common_interrupt
       => asm_common_interrupt
       => vmx_do_interrupt_nmi_irqoff
       => vmx_handle_exit_irqoff
       => vcpu_enter_guest
       => vcpu_run
       => kvm_arch_vcpu_ioctl_run
       => kvm_vcpu_ioctl
       => __x64_sys_ioctl
       => do_syscall_64
       => entry_SYSCALL_64_after_hwframe
      
      In-kernel users of vmstat counters either require the precise value, in
      which case they use the zone_page_state_snapshot interface, or they can
      live with imprecision, since regular flushing can happen at an arbitrary
      time and the cumulative error can grow (see calculate_normal_threshold).
      
      From that point of view, regular flushing can be postponed for CPUs that
      have been isolated from kernel interference without critical
      infrastructure ever noticing.  Skip regular flushing from
      vmstat_shepherd for all isolated CPUs to avoid interference with the
      isolated workload.
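
      As a rough userspace sketch of that idea (not the kernel
      implementation): struct model_cpu, model_vmstat_shepherd() and
      model_snapshot() are hypothetical names invented for illustration. The
      shepherd loop skips isolated CPUs, so their per-CPU deltas stay
      unflushed, while a snapshot-style read still sees them by adding every
      CPU's pending delta to the global value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state, for illustration only. */
struct model_cpu {
	bool isolated;    /* CPU is shielded from kernel interference */
	int pending_diff; /* per-CPU counter delta not yet flushed */
	bool work_queued; /* a deferred flush was queued on this CPU */
};

/*
 * Queue a flush on every non-isolated CPU that has a pending delta.
 * Isolated CPUs are skipped entirely. Returns how many flushes were queued.
 */
static size_t model_vmstat_shepherd(struct model_cpu *cpus, size_t nr_cpus)
{
	size_t queued = 0;

	assert(cpus != NULL);

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpus[cpu].isolated)
			continue; /* do not interrupt the isolated workload */
		if (cpus[cpu].pending_diff != 0) {
			cpus[cpu].work_queued = true;
			queued++;
		}
	}
	return queued;
}

/*
 * Precise read: the global value plus every CPU's unflushed delta,
 * including the deltas left behind on isolated CPUs.
 */
static long model_snapshot(long global, const struct model_cpu *cpus,
			   size_t nr_cpus)
{
	long value = global;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++)
		value += cpus[cpu].pending_diff;
	return value;
}
```

      Callers that tolerate imprecision would read only the global value in
      this model; callers that need the precise value would use the snapshot
      helper.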
      
      Suggested by Michal Hocko.
      
      Link: https://lkml.kernel.org/r/ZIDoV/zxFKVmQl7W@tpad
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • xtensa: add pte_unmap() to balance pte_offset_map() · 56e0d1cb
      Hugh Dickins authored
      To keep balance in future, remember to pte_unmap() after a successful
      pte_offset_map().  And act as if get_pte_for_vaddr() really needs a map
      there, to read the pteval before "unmapping", to be sure page table is
      not removed.
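
      The pattern can be modelled in userspace as follows. This is a sketch
      under assumptions, not the xtensa code: the model_ functions and the
      model_map_depth counter are hypothetical, and a NULL table stands in
      for the case where no page table is found. The entry is read while
      still mapped, and every successful map is paired with exactly one
      unmap.

```c
#include <assert.h>
#include <stddef.h>

/* Counts mappings that were taken but not yet released. */
static int model_map_depth;

/* Map one entry of the table; returns NULL if there is no table. */
static unsigned long *model_pte_offset_map(unsigned long *table, size_t index)
{
	if (table == NULL)
		return NULL;
	model_map_depth++;
	return &table[index];
}

/* Release a mapping obtained from model_pte_offset_map(). */
static void model_pte_unmap(unsigned long *pte)
{
	assert(pte != NULL);
	model_map_depth--;
}

/* Return the entry value, or 0 when no table could be mapped. */
static unsigned long model_get_pte_for_vaddr(unsigned long *table,
					     size_t index)
{
	unsigned long *pte = model_pte_offset_map(table, index);
	unsigned long pteval;

	if (pte == NULL)
		return 0; /* nothing was mapped, so nothing to unmap */

	pteval = *pte; /* read the value while the entry is still mapped */
	model_pte_unmap(pte);
	return pteval;
}
```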
      
      Link: https://lkml.kernel.org/r/ab2581eb-daa6-894e-4aa6-97c81de3b8c@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • x86: sme_populate_pgd() use pte_offset_kernel() · 653ba810
      Hugh Dickins authored
      sme_populate_pgd() is an __init function for sme_encrypt_kernel():
      it should use pte_offset_kernel() instead of pte_offset_map(), to avoid
      the question of whether a pte_unmap() will be needed to balance.
      
      Link: https://lkml.kernel.org/r/497d7777-736e-85f2-c37-aa6bcf155e4@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • x86: allow get_locked_pte() to fail · 975ca398
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Link: https://lkml.kernel.org/r/b7fa8547-4f28-ec82-9893-1b2eb58e40b4@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • sparc: iounit and iommu use pte_offset_kernel() · 7a19c361
      Hugh Dickins authored
      iounit_alloc() and sbus_iommu_alloc() are working from pmd_off_k(),
      so should use pte_offset_kernel() instead of pte_offset_map(), to avoid
      the question of whether a pte_unmap() will be needed to balance.
      
      Link: https://lkml.kernel.org/r/99962272-12ff-975d-bf7f-7fd5d95a2df5@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • sparc: allow pte_offset_map() to fail · 4be14ec0
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Link: https://lkml.kernel.org/r/22165adb-581c-9ce1-8aa6-a3385cd39055@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • sparc/hugetlb: pte_alloc_huge() pte_offset_huge() · c65d09fd
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead; with pte_offset_huge() a better name for pte_offset_kernel().
      
      Link: https://lkml.kernel.org/r/c2aeb62f-58f9-d014-ddcd-266267bd97b@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • sh/hugetlb: pte_alloc_huge() pte_offset_huge() · b7b7ef6b
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead; with pte_offset_huge() a better name for pte_offset_kernel().
      
      Link: https://lkml.kernel.org/r/ee885978-7355-624b-cfe2-c3d75672b842@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • s390: gmap use pte_unmap_unlock() not spin_unlock() · b2f58941
      Hugh Dickins authored
      pte_alloc_map_lock() expects to be followed by pte_unmap_unlock(): to
      keep balance in future, pass ptep as well as ptl to gmap_pte_op_end(),
      and use pte_unmap_unlock() instead of direct spin_unlock() (even though
      ptep ends up unused inside the macro).
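
      A minimal userspace sketch of this pairing, assuming a simplified
      lock: struct model_table and the model_ functions are hypothetical,
      the boolean flag stands in for the page table lock, and the NULL
      return for a missing table is an assumption of the model. The caller
      releases through one paired helper that takes both the entry pointer
      and the lock, even though the pointer is unused inside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_ENTRIES 4

/* Hypothetical page table with its own lock flag. */
struct model_table {
	bool locked;
	unsigned long entries[MODEL_NR_ENTRIES];
};

/*
 * Map one entry and take the table lock. Returns NULL, with no lock
 * held, when there is no table.
 */
static unsigned long *model_map_lock(struct model_table *table, size_t index,
				     bool **lockp)
{
	if (table == NULL)
		return NULL;
	assert(index < MODEL_NR_ENTRIES);
	assert(!table->locked);
	table->locked = true;
	*lockp = &table->locked;
	return &table->entries[index];
}

/* Paired release: unmap (a no-op in this model) and drop the lock. */
static void model_unmap_unlock(unsigned long *pte, bool *lock)
{
	(void)pte; /* unused here, but passed so the pairing stays visible */
	*lock = false;
}

/* Set bits in one entry; returns false if there was no table. */
static bool model_set_bits(struct model_table *table, size_t index,
			   unsigned long bits)
{
	bool *lock;
	unsigned long *pte = model_map_lock(table, index, &lock);

	if (pte == NULL)
		return false; /* nothing mapped, nothing locked */

	*pte |= bits;
	model_unmap_unlock(pte, lock);
	return true;
}
```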
      
      Link: https://lkml.kernel.org/r/78873af-e1ec-4f9-47ac-483940ac6daa@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • s390: allow pte_offset_map_lock() to fail · 5c7f3bf0
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Add comment on mm's contract with s390 above __zap_zero_pages(),
      and fix old comment there: must be called after THP was disabled.
      
      Link: https://lkml.kernel.org/r/3ff29363-336a-9733-12a1-5c31a45c8aeb@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • riscv/hugetlb: pte_alloc_huge() pte_offset_huge() · 893f667f
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead; with pte_offset_huge() a better name for pte_offset_kernel().
      
      Link: https://lkml.kernel.org/r/291f20-5947-9f5f-ec7f-96a18df336d9@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
      Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • powerpc/hugetlb: pte_alloc_huge() · 5d991378
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead.  huge_pte_offset() is using __find_linux_pte(), which is using
      pte_offset_kernel() - don't rename that to _huge, it's more complicated.
      
      Link: https://lkml.kernel.org/r/36b4e5d-954b-8569-4fe2-bd1797362441@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • powerpc: allow pte_offset_map[_lock]() to fail · 0c31f29b
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      Balance successful pte_offset_map() with pte_unmap() where omitted.
      
      Link: https://lkml.kernel.org/r/54c8b578-ca9-a0f-bfd2-d72976f8d73a@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0c31f29b
    • Hugh Dickins's avatar
      powerpc: kvmppc_unmap_free_pmd() pte_offset_kernel() · d00ae31f
      Hugh Dickins authored
      kvmppc_unmap_free_pmd() use pte_offset_kernel(), like everywhere else
      in book3s_64_mmu_radix.c: instead of pte_offset_map(), which will come
      to need a pte_unmap() to balance it.
      
      But note that this is a more complex case than most: see those -EAGAINs
      in kvmppc_create_pte(), which is coping with kvmppc races between page
      table and huge entry, of the kind which we are expecting to address
      in pte_offset_map() - this might want to be revisited in future.
      
      Link: https://lkml.kernel.org/r/c76aa421-aec3-4cc8-cc61-4130f2e27e1@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      d00ae31f
    • Hugh Dickins's avatar
      parisc/hugetlb: pte_alloc_huge() pte_offset_huge() · def1cd43
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead; with pte_offset_huge() a better name for pte_offset_kernel().
      
      Link: https://lkml.kernel.org/r/7963aeed-f7d2-e0-f3c6-3680c5572444@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      def1cd43
    • Hugh Dickins's avatar
      parisc: unmap_uncached_pte() use pte_offset_kernel() · ffd3e90a
      Hugh Dickins authored
      unmap_uncached_pte() is working from pgd_offset_k(vaddr), so it should
      use pte_offset_kernel() instead of pte_offset_map(), to avoid the
      question of whether a pte_unmap() will be needed to balance.
      
      Link: https://lkml.kernel.org/r/358dfe21-a47f-9d3-bf21-9c454735944@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      ffd3e90a
    • Hugh Dickins's avatar
      parisc: add pte_unmap() to balance get_ptep() · 6a2561f9
      Hugh Dickins authored
      To keep balance in future, remember to pte_unmap() after a successful
      get_ptep().  And act as if flush_cache_pages() really needs a map there,
      to read the pfn before "unmapping", to be sure page table is not removed.
      
      Link: https://lkml.kernel.org/r/653369-95ef-acd2-d6ea-e95f5a997493@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      6a2561f9
    • Hugh Dickins's avatar
      mips: add pte_unmap() to balance pte_offset_map() · 17b25a38
      Hugh Dickins authored
      To keep balance in future, __update_tlb() must remember to pte_unmap() after
      pte_offset_map().  This is an odd case, since the caller has already done
      pte_offset_map_lock(), then mips forgets the address and recalculates it;
      but my two naive attempts to clean that up did more harm than good.
      
      Link: https://lkml.kernel.org/r/addfcb3-b5f4-976e-e050-a2508e589cfe@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Tested-by: Yu Zhao <yuzhao@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      17b25a38
    • Hugh Dickins's avatar
      microblaze: allow pte_offset_map() to fail · 505a23a5
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Link: https://lkml.kernel.org/r/eab66faf-c0ab-3a8f-47bf-8a7c5af638f@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      505a23a5
    • Hugh Dickins's avatar
      m68k: allow pte_offset_map[_lock]() to fail · e67b37c3
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Restructure cf_tlb_miss() with a pte_unmap() (previously omitted)
      at label out, followed by one local_irq_restore() for all.
      
      Link: https://lkml.kernel.org/r/795f6a7-bcca-cdf-ad2a-fbdaa232998c@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      e67b37c3
    • Hugh Dickins's avatar
      ia64/hugetlb: pte_alloc_huge() pte_offset_huge() · 0db639f7
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead; with pte_offset_huge() a better name for pte_offset_kernel().
      
      Link: https://lkml.kernel.org/r/1c2c7837-bfea-9640-a74-985379fcc5a@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0db639f7
    • Hugh Dickins's avatar
      arm64/hugetlb: pte_alloc_huge() pte_offset_huge() · cafcb9ca
      Hugh Dickins authored
      pte_alloc_map() expects to be followed by pte_unmap(), but hugetlb omits
      that: to keep balance in future, use the recently added pte_alloc_huge()
      instead; with pte_offset_huge() a better name for pte_offset_kernel().
      
      Link: https://lkml.kernel.org/r/5849464-7191-40c5-c55f-fba9c3802e5d@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      cafcb9ca
    • Hugh Dickins's avatar
      arm64: allow pte_offset_map() to fail · 52924726
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Link: https://lkml.kernel.org/r/35e46485-8499-4337-c51f-b8fa495a1a93@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      52924726
    • Hugh Dickins's avatar
      arm: allow pte_offset_map[_lock]() to fail · 766b59e8
      Hugh Dickins authored
      Patch series "arch: allow pte_offset_map[_lock]() to fail", v2.
      
      What is it all about?  Some mmap_lock avoidance i.e.  latency reduction. 
      Initially just for the case of collapsing shmem or file pages to THPs; but
      likely to be relied upon later in other contexts e.g.  freeing of empty
      page tables (but that's not work I'm doing).  mmap_write_lock avoidance
      when collapsing to anon THPs?  Perhaps, but again that's not work I've
      done: a quick attempt was not as easy as the shmem/file case.
      
      I would much prefer not to have to make these small but wide-ranging
      changes for such a niche case; but failed to find another way, and have
      heard that shmem MADV_COLLAPSE's usefulness is being limited by that
      mmap_write_lock it currently requires.
      
      These changes (though of course not these exact patches, and not all of
      these architectures!) have been in Google's data centre kernel for three
      years now: we do rely upon them.
      
      What are the per-arch changes about?  Generally, two things.
      
      One: the current mmap locking may not be enough to guard against that
      tricky transition between pmd entry pointing to page table, and empty pmd
      entry, and pmd entry pointing to huge page: pte_offset_map() will have to
      validate the pmd entry for itself, returning NULL if no page table is
      there.  What to do about that varies: often the nearby error handling
      indicates just to skip it; but in some cases a "goto again" looks
      appropriate (and if that risks an infinite loop, then there must have been
      an oops, or pfn 0 mistaken for page table, before).
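
      The two handling styles named above ("skip it" and "goto again") can be
      sketched in userspace C.  This is only an illustrative model: struct
      pmd_model, model_pte_offset_map(), model_pte_unmap() and both callers are
      hypothetical names invented here, not kernel interfaces, and the
      validation is far simpler than what the kernel would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three pmd states described in the text. */
enum pmd_kind { PMD_NONE, PMD_TABLE, PMD_HUGE };

struct pmd_model {
    enum pmd_kind kind;
    int *table;              /* meaningful only when kind == PMD_TABLE */
};

/* Models a pte_offset_map() that validates the pmd entry for itself and
 * returns NULL when no page table is there. */
static int *model_pte_offset_map(struct pmd_model *pmd, size_t index)
{
    if (pmd->kind != PMD_TABLE || pmd->table == NULL)
        return NULL;
    return &pmd->table[index];
}

/* Models pte_unmap(); a no-op here, kept so every success path is balanced. */
static void model_pte_unmap(int *pte)
{
    (void)pte;
}

/* "Skip it" style: report failure to the caller and touch nothing. */
static int read_entry_or_skip(struct pmd_model *pmd, size_t index, int *out)
{
    int *pte;

    assert(out != NULL);
    pte = model_pte_offset_map(pmd, index);
    if (!pte)
        return -1;           /* no page table right now: skip */
    *out = *pte;
    model_pte_unmap(pte);
    return 0;
}

/* "goto again" style: re-establish a page table, then retry the lookup. */
static int write_entry_with_retry(struct pmd_model *pmd, int *backing,
                                  size_t index, int value)
{
    int *pte;

    if (backing == NULL)
        return -1;           /* nothing to install, so retrying cannot help */
again:
    pte = model_pte_offset_map(pmd, index);
    if (!pte) {
        if (pmd->kind == PMD_HUGE)
            return -1;       /* a huge mapping is not ours to replace */
        pmd->kind = PMD_TABLE;
        pmd->table = backing;
        goto again;          /* terminates: the table is now present */
    }
    *pte = value;
    model_pte_unmap(pte);
    return 0;
}
```

      In this model the retry loop is bounded because the caller itself
      installs the table before looping; a real call site would need its own
      argument for why the loop ends.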
      
      Deeper study of each site might show that 90% of them here in arch code
      could only fail if there's corruption e.g.  a transition to THP would be
      surprising on an arch without HAVE_ARCH_TRANSPARENT_HUGEPAGE.  But given
      the likely extension to freeing empty page tables, I have not limited this
      set of changes to THP; and it has been easier, and sets a better example,
      if each site is given appropriate handling.
      
      Two: pte_offset_map() will need to do an rcu_read_lock(), with the
      corresponding rcu_read_unlock() in pte_unmap().  But most architectures
      never supported CONFIG_HIGHPTE, so some don't always call pte_unmap()
      after pte_offset_map(), or have used userspace pte_offset_map() where
      pte_offset_kernel() is more correct.  No problem in the current tree, but
      a problem once an rcu_read_unlock() will be needed to keep balance.
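
      Why an omitted pte_unmap() becomes a problem can be shown with a small
      userspace model.  The counter model_rcu_depth and all model_* functions
      below are hypothetical stand-ins; the sketch assumes only what the text
      states, namely that a successful map takes a read lock and the unmap
      drops it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter standing in for RCU read-side nesting depth. */
static int model_rcu_depth;

/* Models pte_offset_map(): success takes the read lock, failure takes none. */
static int *model_pte_offset_map(int *table, size_t index)
{
    if (table == NULL)
        return NULL;
    model_rcu_depth++;               /* stands in for rcu_read_lock() */
    return &table[index];
}

/* Models pte_unmap(): drops the read lock taken by a successful map. */
static void model_pte_unmap(int *pte)
{
    (void)pte;
    assert(model_rcu_depth > 0);
    model_rcu_depth--;               /* stands in for rcu_read_unlock() */
}

/* Models pte_offset_kernel(): plain address arithmetic, nothing to balance. */
static int *model_pte_offset_kernel(int *table, size_t index)
{
    return &table[index];
}

/* Balanced caller: every successful map is followed by an unmap. */
static int balanced_read(int *table, size_t index, int *out)
{
    int *pte = model_pte_offset_map(table, index);

    if (!pte)
        return -1;                   /* failed map: nothing to unmap */
    *out = *pte;
    model_pte_unmap(pte);
    return 0;
}

/* Unbalanced caller: harmless while unmap is a no-op, but here it leaves
 * the read-side depth raised after it returns. */
static int unbalanced_read(int *table, size_t index)
{
    int *pte = model_pte_offset_map(table, index);

    return pte ? *pte : -1;
}
```

      After balanced_read() the depth is back to zero; after unbalanced_read()
      it stays at one, which is the imbalance the per-arch patches remove
      either by adding pte_unmap() or by switching to pte_offset_kernel().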
      
      A common special case of that comes in arch/*/mm/hugetlbpage.c, if the
      architecture supports hugetlb pages down at the lowest PTE level. 
      huge_pte_alloc() uses pte_alloc_map(), but generic hugetlb code does no
      corresponding pte_unmap(); similarly for huge_pte_offset().
      
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Link: https://lkml.kernel.org/r/a4963be9-7aa6-350-66d0-2ba843e1af44@google.com
      Link: https://lkml.kernel.org/r/813429a1-204a-1844-eeae-7fd72826c28@google.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      766b59e8
    • Yosry Ahmed's avatar
      mm: zswap: support exclusive loads · b9c91c43
      Yosry Ahmed authored
      Commit 71024cb4 ("frontswap: remove frontswap_tmem_exclusive_gets")
      removed support for exclusive loads from frontswap as it was not used. 
      Bring back exclusive loads support to frontswap by adding an "exclusive"
      output parameter to frontswap_ops->load.
      
      On the zswap side, add a module parameter to enable/disable exclusive
      loads, and a config option to control the boot default value.  Refactor
      zswap entry invalidation in zswap_frontswap_invalidate_page() into
      zswap_invalidate_entry() to reuse it in zswap_frontswap_load() if
      exclusive loads are enabled.
      
      With exclusive loads, we avoid having two copies of the same page in
      memory (compressed & uncompressed) after faulting it in from zswap.  On
      the other hand, if the page is to be reclaimed again without being
      dirtied, it will be re-compressed.  Compression is not usually slow, and a
      page that was just faulted in is less likely to be reclaimed again soon.
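
      The trade-off can be illustrated with a toy store in userspace C.  This
      is a sketch under stated assumptions: struct model_pool, model_store(),
      model_load() and model_invalidate_entry() are hypothetical names, the
      "compressed" copy is just a byte copy, and only the shape of the
      "exclusive" output parameter follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SLOTS 8
#define MODEL_PAGE  16

/* One stored page; a real pool would hold compressed data. */
struct model_entry {
    bool present;
    unsigned char data[MODEL_PAGE];
};

struct model_pool {
    bool exclusive_loads;            /* models the module parameter */
    struct model_entry slots[MODEL_SLOTS];
};

static int model_store(struct model_pool *p, size_t offset,
                       const unsigned char *page)
{
    assert(p != NULL && page != NULL);
    if (offset >= MODEL_SLOTS)
        return -1;
    memcpy(p->slots[offset].data, page, MODEL_PAGE);
    p->slots[offset].present = true;
    return 0;
}

/* Shared invalidation helper, reused by the load path below. */
static void model_invalidate_entry(struct model_pool *p, size_t offset)
{
    p->slots[offset].present = false;
}

/* Copies the page out and reports through *exclusive whether the pool
 * dropped its own copy, so that only one copy remains in memory. */
static int model_load(struct model_pool *p, size_t offset,
                      unsigned char *page, bool *exclusive)
{
    assert(p != NULL && page != NULL && exclusive != NULL);
    if (offset >= MODEL_SLOTS || !p->slots[offset].present)
        return -1;
    memcpy(page, p->slots[offset].data, MODEL_PAGE);
    *exclusive = p->exclusive_loads;
    if (p->exclusive_loads)
        model_invalidate_entry(p, offset);
    return 0;
}
```

      With exclusive_loads off, a second load of the same offset still
      succeeds because the pool kept its copy; with it on, the second load
      fails and the page would have to be stored (re-compressed) again.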
      
      Link: https://lkml.kernel.org/r/20230607195143.1473802-1-yosryahmed@google.com
      Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
      Suggested-by: Yu Zhao <yuzhao@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      b9c91c43
    • Haifeng Xu's avatar
      mm/mm_init.c: remove reset_node_present_pages() · 32b6a4a1
      Haifeng Xu authored
      reset_node_present_pages() only gets called in hotadd_init_pgdat(); move
      the action that clears present pages to free_area_init_core_hotplug(), so
      the helper can be removed.
      
      Link: https://lkml.kernel.org/r/20230607025056.1348-1-haifeng.xu@shopee.com
      Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
      Suggested-by: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      32b6a4a1
    • Haifeng Xu's avatar
      mm/memory_hotplug: remove reset_node_managed_pages() in hotadd_init_pgdat() · a668968f
      Haifeng Xu authored
      Managed pages have already been set to 0 in free_area_init_core_hotplug(),
      via zone_init_internals() on each zone.  It's pointless to reset them again.
      
      Furthermore, reset_node_managed_pages() no longer needs to be exposed
      outside of mm/memblock.c.  Remove declaration in include/linux/memblock.h
      and define it as static.
      
      In addition to this, the only caller of reset_node_managed_pages() is
      reset_all_zones_managed_pages(), which is annotated with __init, so it
      should be safe to also mark reset_node_managed_pages() as __init.
      
      Link: https://lkml.kernel.org/r/20230607024548.1240-1-haifeng.xu@shopee.com
      Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
      Suggested-by: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      a668968f
    • Roberto Sassu's avatar
      shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs · 36ce9d76
      Roberto Sassu authored
      As the ramfs-based tmpfs uses ramfs_init_fs_context() for the
      init_fs_context method, which allocates fc->s_fs_info, use ramfs_kill_sb()
      to free it and avoid a memory leak.
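
      The leak pattern is a mismatch between an init method that allocates and
      a kill method that does not free.  A minimal userspace sketch follows;
      struct model_sb, model_init_fs_context(), model_kill_generic() and
      model_kill_with_info() are hypothetical names chosen for illustration,
      and the counter exists only to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical count of per-superblock info blocks not yet freed. */
static int model_live_allocs;

struct model_fs_info {
    unsigned int mode;
};

struct model_sb {
    struct model_fs_info *s_fs_info;
    int alive;
};

/* Models an init method that allocates the per-filesystem info. */
static int model_init_fs_context(struct model_sb *sb)
{
    assert(sb != NULL);
    sb->s_fs_info = malloc(sizeof(*sb->s_fs_info));
    if (sb->s_fs_info == NULL)
        return -1;
    sb->s_fs_info->mode = 0755;
    sb->alive = 1;
    model_live_allocs++;
    return 0;
}

/* Models a generic kill method that knows nothing about s_fs_info. */
static void model_kill_generic(struct model_sb *sb)
{
    assert(sb != NULL);
    sb->alive = 0;
}

/* Models a kill method paired with the init method above: it frees the
 * info first, then performs the generic teardown. */
static void model_kill_with_info(struct model_sb *sb)
{
    assert(sb != NULL);
    if (sb->s_fs_info != NULL) {
        free(sb->s_fs_info);
        sb->s_fs_info = NULL;
        model_live_allocs--;
    }
    model_kill_generic(sb);
}
```

      Tearing down with model_kill_generic() leaves model_live_allocs raised,
      while model_kill_with_info() returns it to its previous value.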
      
      Link: https://lkml.kernel.org/r/20230607161523.2876433-1-roberto.sassu@huaweicloud.com
      Fixes: c3b1b1cb ("ramfs: add support for "mode=" mount option")
      Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      36ce9d76
    • Haifeng Xu's avatar
      mm/mm_init.c: drop 'nid' parameter from check_for_memory() · 91ff4d75
      Haifeng Xu authored
      The node_id in pgdat has already been set in free_area_init_node(),
      so use it internally instead of passing a redundant parameter.
      
      Link: https://lkml.kernel.org/r/20230607032402.4679-1-haifeng.xu@shopee.com
      Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
      Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      91ff4d75
    • Yajun Deng's avatar
      mm/sparse: remove unused parameters in sparse_remove_section() · bd5f79ab
      Yajun Deng authored
      The parameters ms and map_offset are not used in
      sparse_remove_section(), so remove them.
      
      __remove_section() is only called by __remove_pages(), so remove it as
      well and move the WARN_ON_ONCE() into sparse_remove_section().
      
      Link: https://lkml.kernel.org/r/20230607023952.2247489-1-yajun.deng@linux.dev
      Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      bd5f79ab
    • ZhangPeng's avatar
      mm/hugetlb: use a folio in hugetlb_fault() · 061e62e8
      ZhangPeng authored
      We can replace seven implicit calls to compound_head() with one by using
      a folio.
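      
      A toy user-space model, not kernel code, can illustrate why resolving
      the folio once is cheaper. Every name below (toy_page, toy_folio,
      toy_page_folio() and the lookup counter) is hypothetical; the sketch
      only assumes that each page-based helper re-resolves the head page,
      while folio-based helpers do not:

```c
#include <assert.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct page/folio. */
struct toy_folio {
	unsigned long flags;
};

struct toy_page {
	struct toy_folio *folio;	/* owning folio, shared by head and tail pages */
};

static int head_lookups;	/* counts simulated compound_head() calls */

/* Simulated page-to-folio conversion: one head lookup per call. */
static struct toy_folio *toy_page_folio(struct toy_page *page)
{
	head_lookups++;
	return page->folio;
}

/* Page-based helper: hides a head lookup inside every call. */
static int toy_page_test(struct toy_page *page, unsigned int bit)
{
	return (toy_page_folio(page)->flags >> bit) & 1UL;
}

/* Folio-based helper: the caller has already resolved the head. */
static int toy_folio_test(const struct toy_folio *folio, unsigned int bit)
{
	return (folio->flags >> bit) & 1UL;
}

/* Three checks through the page API; returns the number of lookups. */
static int lookups_with_page(struct toy_page *page)
{
	head_lookups = 0;
	(void)toy_page_test(page, 0);
	(void)toy_page_test(page, 1);
	(void)toy_page_test(page, 2);
	return head_lookups;
}

/* The same three checks after converting to a folio once. */
static int lookups_with_folio(struct toy_page *page)
{
	struct toy_folio *folio;

	head_lookups = 0;
	folio = toy_page_folio(page);
	(void)toy_folio_test(folio, 0);
	(void)toy_folio_test(folio, 1);
	(void)toy_folio_test(folio, 2);
	return head_lookups;
}
```

      In this model the page-based path performs one lookup per check, while
      the folio-based path performs a single lookup up front, which is the
      kind of saving the conversion aims for.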
      
      [akpm@linux-foundation.org: update comment, per Sidhartha]
      Link: https://lkml.kernel.org/r/20230606062013.2947002-4-zhangpeng362@huawei.com
      Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
      Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Nanyong Sun <sunnanyong@huawei.com>
      Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      061e62e8
    • ZhangPeng's avatar
      mm/hugetlb: use a folio in hugetlb_wp() · 959a78b6
      ZhangPeng authored
      We can replace nine implicit calls to compound_head() with one by using
      old_folio.  The page we get back is always a head page, so we just convert
      old_page to old_folio.
      
      Link: https://lkml.kernel.org/r/20230606062013.2947002-3-zhangpeng362@huawei.com
      Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
      Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Nanyong Sun <sunnanyong@huawei.com>
      Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      959a78b6
    • ZhangPeng's avatar
      mm/hugetlb: use a folio in copy_hugetlb_page_range() · ad27ce20
      ZhangPeng authored
      Patch series "Convert several functions in hugetlb.c to use a folio", v2.
      
      This patch series converts three functions in hugetlb.c to use a folio,
      which can remove several implicit calls to compound_head().
      
      
      This patch (of 3):
      
      We can replace five implicit calls to compound_head() with one by using
      pte_folio.  The page we get back is always a head page, so we just convert
      ptepage to pte_folio.
      
      Link: https://lkml.kernel.org/r/20230606062013.2947002-1-zhangpeng362@huawei.com
      Link: https://lkml.kernel.org/r/20230606062013.2947002-2-zhangpeng362@huawei.com
      Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
      Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Nanyong Sun <sunnanyong@huawei.com>
      Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      ad27ce20
    • John Hubbard's avatar
      selftests: error out if kernel header files are not yet built · 9fc96c7c
      John Hubbard authored
      As per a discussion with Muhammad Usama Anjum [1], the following is how
      one is supposed to build selftests:
      
          make headers && make -C tools/testing/selftests/mm
      
      Change the selftest build system's lib.mk to fail out with a helpful
      message if that prerequisite "make headers" has not been done yet.
      
      [1] https://lore.kernel.org/all/bf910fa5-0c96-3707-cce4-5bcc656b6274@collabora.com/
      
      [jhubbard@nvidia.com: abort the make process the first time headers aren't detected]
        Link: https://lkml.kernel.org/r/14573e7e-f2ad-ff34-dfbd-3efdebee51ed@nvidia.com
      [anders.roxell@linaro.org: fix out-of-tree builds]
        Link: https://lkml.kernel.org/r/20230613074931.666966-1-anders.roxell@linaro.org
      Link: https://lkml.kernel.org/r/20230606071637.267103-12-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      9fc96c7c
    • John Hubbard's avatar
      Documentation: kselftest: "make headers" is a prerequisite · 01d6c48a
      John Hubbard authored
      As per a discussion with Muhammad Usama Anjum [1], the following is how
      one is supposed to build selftests:
      
          make headers && make -C tools/testing/selftests/mm
      
      However, that's not yet documented anywhere. So add it to
      Documentation/dev-tools/kselftest.rst .
      
      [1] https://lore.kernel.org/all/bf910fa5-0c96-3707-cce4-5bcc656b6274@collabora.com/
      
      Link: https://lkml.kernel.org/r/20230606071637.267103-11-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      01d6c48a
    • John Hubbard's avatar
      selftests/mm: move certain uffd*() routines from vm_util.c to uffd-common.c · 56d2afff
      John Hubbard authored
      There are only three uffd*() routines that are used outside of the uffd
      selftests. Leave these in vm_util.c, where they are available to any mm
      selftest program:
      
          uffd_register()
          uffd_unregister()
          uffd_register_with_ioctls().
      
      A few other uffd*() routines, however, are only used by the uffd-focused
      tests found in uffd-stress.c and uffd-unit-tests.c. Move those routines
      into uffd-common.c.
      
      Link: https://lkml.kernel.org/r/20230606071637.267103-10-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      56d2afff
    • John Hubbard's avatar
      selftests/mm: fix build failures due to missing MADV_COLLAPSE · 3972ea24
      John Hubbard authored
      MADV_PAGEOUT, MADV_POPULATE_READ, MADV_COLLAPSE are conditionally
      defined as necessary. However, that was being done in .c files, and a
      new build failure came up that would have been automatically avoided had
      these been in a common header file.
      
      So consolidate and move them all to vm_util.h, which fixes the build
      failure.
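      
      The "conditionally defined as necessary" pattern can be sketched as a
      header fragment like the one below. This is an illustration rather than
      the actual contents of vm_util.h: the fallback values are the ones used
      by the generic Linux UAPI header (asm-generic/mman-common.h), and
      madv_name() is a hypothetical helper added only so the fragment has
      something to exercise:

```c
#include <assert.h>
#include <sys/mman.h>

/*
 * Fallbacks for toolchains whose libc headers predate these advice
 * values. Each guard is a no-op when <sys/mman.h> already defines the
 * symbol, so no duplicate-definition warning is produced.
 */
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

#ifndef MADV_POPULATE_READ
#define MADV_POPULATE_READ 22
#endif

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Hypothetical helper: map an advice value to a printable name. */
static const char *madv_name(int advice)
{
	switch (advice) {
	case MADV_PAGEOUT:
		return "MADV_PAGEOUT";
	case MADV_POPULATE_READ:
		return "MADV_POPULATE_READ";
	case MADV_COLLAPSE:
		return "MADV_COLLAPSE";
	default:
		return "other";
	}
}
```

      Keeping the guards in one shared header means a test that starts using
      a new MADV_* value picks up the fallback without repeating the #ifndef
      block in its own .c file.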
      
      An alternative approach from Muhammad Usama Anjum was: rely on "make
      headers" being required, and include asm-generic/mman-common.h. This
      works in the sense that it builds, but it still generates warnings about
      duplicate MADV_* symbols, and the goal here is to get a fully clean (no
      warnings) build.
      
      Link: https://lkml.kernel.org/r/20230606071637.267103-9-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      3972ea24
    • John Hubbard's avatar
      selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h · 97deb66e
      John Hubbard authored
      This fixes a real bug, too, because xstate_size() was assuming that
      the stack variable xstate_size was initialized to zero. That's neither
      guaranteed nor even especially likely.
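      
      The class of bug can be sketched in isolation as follows. This is not
      the pkey-x86.h code; toy_leaf and find_leaf_size() are hypothetical
      names. The sketch assumes a lookup that assigns its result only on a
      match and treats 0 as "not found", so the initializer is what makes the
      no-match path well defined:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type: an identifier and an associated size. */
struct toy_leaf {
	int id;
	int size;
};

/*
 * Return the size recorded for 'wanted', or 0 if it is not present.
 * 'size' is written only when a match is found, so it must start at 0;
 * without the initializer the no-match path would return whatever
 * happened to be on the stack.
 */
static int find_leaf_size(const struct toy_leaf *leaves, size_t n, int wanted)
{
	int size = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (leaves[i].id == wanted)
			size = leaves[i].size;
	}
	return size;
}
```

      Compilers typically report the uninitialized variant as "may be used
      uninitialized", which is the kind of warning this commit addresses.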
      
      Link: https://lkml.kernel.org/r/20230606071637.267103-8-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      97deb66e
    • John Hubbard's avatar
      selftests/mm: fix two -Wformat-security warnings in uffd builds · 0e14e7e9
      John Hubbard authored
      The uffd tests generate two compile time warnings from clang's
      -Wformat-security setting. These trigger at the call sites for
      uffd_test_start() and uffd_test_skip().
      
      1) Fix the uffd_test_start() issue by removing the intermediate
      test_name variable (thanks to David Hildenbrand for showing how to do
      this).
      
      2) Fix the uffd_test_skip() issue by observing that there is no need for
      a macro and a variable args approach, because all callers of
      uffd_test_skip() pass in a simple char* string, without any format
      specifiers. So just change uffd_test_skip() into a regular C function.
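      
      The second fix can be illustrated with a minimal sketch. It is not the
      selftest's code: skip_message() and its output text are hypothetical.
      The point is that a plain function taking a const char * can pass the
      caller's string as an argument to a literal "%s" format, so the string
      is never interpreted as a format itself:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a skip notice into 'buf'. 'reason' is only ever an argument
 * to the literal format string, so a '%' inside it is copied verbatim.
 * Returns what snprintf() returns: the length the full message needs.
 */
static int skip_message(char *buf, size_t len, const char *reason)
{
	return snprintf(buf, len, "skipped [reason: %s]\n", reason);
}
```

      A variadic macro that forwards its first argument as the format, in the
      style of printf(msg), is the shape -Wformat-security warns about when
      msg is not a string literal.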
      
      Link: https://lkml.kernel.org/r/20230606071637.267103-7-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0e14e7e9