1. 24 Feb, 2024 6 commits
    • Aneesh Kumar K.V (IBM)'s avatar
      mm/debug_vm_pgtable: fix BUG_ON with pud advanced test · 720da1e5
      Aneesh Kumar K.V (IBM) authored
      Architectures like powerpc add debug checks to ensure we find only devmap
      PUD pte entries.  These debug checks are only done with CONFIG_DEBUG_VM. 
      This patch marks the ptes used for PUD advanced test devmap pte entries so
      that we don't hit on debug checks on architecture like ppc64 as below.
      
      WARNING: CPU: 2 PID: 1 at arch/powerpc/mm/book3s64/radix_pgtable.c:1382 radix__pud_hugepage_update+0x38/0x138
      ....
      NIP [c0000000000a7004] radix__pud_hugepage_update+0x38/0x138
      LR [c0000000000a77a8] radix__pudp_huge_get_and_clear+0x28/0x60
      Call Trace:
      [c000000004a2f950] [c000000004a2f9a0] 0xc000000004a2f9a0 (unreliable)
      [c000000004a2f980] [000d34c100000000] 0xd34c100000000
      [c000000004a2f9a0] [c00000000206ba98] pud_advanced_tests+0x118/0x334
      [c000000004a2fa40] [c00000000206db34] debug_vm_pgtable+0xcbc/0x1c48
      [c000000004a2fc10] [c00000000000fd28] do_one_initcall+0x60/0x388
      
      Also
      
       kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:202!
       ....
      
       NIP [c000000000096510] pudp_huge_get_and_clear_full+0x98/0x174
       LR [c00000000206bb34] pud_advanced_tests+0x1b4/0x334
       Call Trace:
       [c000000004a2f950] [000d34c100000000] 0xd34c100000000 (unreliable)
       [c000000004a2f9a0] [c00000000206bb34] pud_advanced_tests+0x1b4/0x334
       [c000000004a2fa40] [c00000000206db34] debug_vm_pgtable+0xcbc/0x1c48
       [c000000004a2fc10] [c00000000000fd28] do_one_initcall+0x60/0x388
      
      Link: https://lkml.kernel.org/r/20240129060022.68044-1-aneesh.kumar@kernel.org
      Fixes: 27af67f3 ("powerpc/book3s64/mm: enable transparent pud hugepage")
      Signed-off-by: default avatarAneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      720da1e5
    • Nhat Pham's avatar
      mm: cachestat: fix folio read-after-free in cache walk · 3a75cb05
      Nhat Pham authored
      In cachestat, we access the folio from the page cache's xarray to compute
      its page offset, and check for its dirty and writeback flags.  However, we
      do not hold a reference to the folio before performing these actions,
      which means the folio can concurrently be released and reused as another
      folio/page/slab.
      
      Get around this altogether by just using xarray's existing machinery for
      the folio page offsets and dirty/writeback states.
      
      This changes behavior for tmpfs files to now always report zeroes in their
      dirty and writeback counters.  This is okay as tmpfs doesn't follow
      conventional writeback cache behavior: its pages get "cleaned" during
      swapout, after which they're no longer resident etc.
      
      Link: https://lkml.kernel.org/r/20240220153409.GA216065@cmpxchg.org
      Fixes: cf264e13 ("cachestat: implement cachestat syscall")
      Reported-by: default avatarJann Horn <jannh@google.com>
      Suggested-by: default avatarMatthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarNhat Pham <nphamcs@gmail.com>
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Tested-by: default avatarJann Horn <jannh@google.com>
      Cc: <stable@vger.kernel.org>	[6.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      3a75cb05
    • Lorenzo Stoakes's avatar
      MAINTAINERS: add memory mapping entry with reviewers · 00130266
      Lorenzo Stoakes authored
      Recently there have been a number of patches which have affected various
      aspects of the memory mapping logic as implemented in mm/mmap.c where it
      would have been useful for regular contributors to have been notified.
      
      Add an entry for this part of mm in particular with regular contributors
      tagged as reviewers.
      
      Link: https://lkml.kernel.org/r/20240220064410.4639-1-lstoakes@gmail.comSigned-off-by: default avatarLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarLiam R. Howlett <Liam.Howlett@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      00130266
    • Byungchul Park's avatar
      mm/vmscan: fix a bug calling wakeup_kswapd() with a wrong zone index · 2774f256
      Byungchul Park authored
      With numa balancing on, when a numa system is running where a numa node
      doesn't have its local memory so it has no managed zones, the following
      oops has been observed.  It's because wakeup_kswapd() is called with a
      wrong zone index, -1.  Fixed it by checking the index before calling
      wakeup_kswapd().
      
      > BUG: unable to handle page fault for address: 00000000000033f3
      > #PF: supervisor read access in kernel mode
      > #PF: error_code(0x0000) - not-present page
      > PGD 0 P4D 0
      > Oops: 0000 [#1] PREEMPT SMP NOPTI
      > CPU: 2 PID: 895 Comm: masim Not tainted 6.6.0-dirty #255
      > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      >    rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      > RIP: 0010:wakeup_kswapd (./linux/mm/vmscan.c:7812)
      > Code: (omitted)
      > RSP: 0000:ffffc90004257d58 EFLAGS: 00010286
      > RAX: ffffffffffffffff RBX: ffff88883fff0480 RCX: 0000000000000003
      > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88883fff0480
      > RBP: ffffffffffffffff R08: ff0003ffffffffff R09: ffffffffffffffff
      > R10: ffff888106c95540 R11: 0000000055555554 R12: 0000000000000003
      > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88883fff0940
      > FS:  00007fc4b8124740(0000) GS:ffff888827c00000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      > CR2: 00000000000033f3 CR3: 000000026cc08004 CR4: 0000000000770ee0
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      > PKRU: 55555554
      > Call Trace:
      >  <TASK>
      > ? __die
      > ? page_fault_oops
      > ? __pte_offset_map_lock
      > ? exc_page_fault
      > ? asm_exc_page_fault
      > ? wakeup_kswapd
      > migrate_misplaced_page
      > __handle_mm_fault
      > handle_mm_fault
      > do_user_addr_fault
      > exc_page_fault
      > asm_exc_page_fault
      > RIP: 0033:0x55b897ba0808
      > Code: (omitted)
      > RSP: 002b:00007ffeefa821a0 EFLAGS: 00010287
      > RAX: 000055b89983acd0 RBX: 00007ffeefa823f8 RCX: 000055b89983acd0
      > RDX: 00007fc2f8122010 RSI: 0000000000020000 RDI: 000055b89983acd0
      > RBP: 00007ffeefa821a0 R08: 0000000000000037 R09: 0000000000000075
      > R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
      > R13: 00007ffeefa82410 R14: 000055b897ba5dd8 R15: 00007fc4b8340000
      >  </TASK>
      
      Link: https://lkml.kernel.org/r/20240216111502.79759-1-byungchul@sk.comSigned-off-by: default avatarByungchul Park <byungchul@sk.com>
      Reported-by: default avatarHyeongtak Ji <hyeongtak.ji@sk.com>
      Fixes: c574bbe9 ("NUMA balancing: optimize page placement for memory tiering system")
      Reviewed-by: default avatarOscar Salvador <osalvador@suse.de>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      2774f256
    • Marco Elver's avatar
      kasan: revert eviction of stack traces in generic mode · 711d3491
      Marco Elver authored
      This partially reverts commits cc478e0b, 63b85ac5, 08d7c94d,
      a414d428, and 773688a6 to make use of variable-sized stack depot
      records, since eviction of stack entries from stack depot forces fixed-
      sized stack records.  Care was taken to retain the code cleanups by the
      above commits.
      
      Eviction was added to generic KASAN as a response to alleviating the
      additional memory usage from fixed-sized stack records, but this still
      uses more memory than previously.
      
      With the re-introduction of variable-sized records for stack depot, we can
      just switch back to non-evictable stack records again, and return back to
      the previous performance and memory usage baseline.
      
      Before (observed after a KASAN kernel boot):
      
        pools: 597
        refcounted_allocations: 17547
        refcounted_frees: 6477
        refcounted_in_use: 11070
        freelist_size: 3497
        persistent_count: 12163
        persistent_bytes: 1717008
      
      After:
      
        pools: 319
        refcounted_allocations: 0
        refcounted_frees: 0
        refcounted_in_use: 0
        freelist_size: 0
        persistent_count: 29397
        persistent_bytes: 5183536
      
      As can be seen from the counters, with a generic KASAN config, refcounted
      allocations and evictions are no longer used.  Due to using variable-sized
      records, I observe a reduction of 278 stack depot pools (saving 4448 KiB)
      with my test setup.
      
      Link: https://lkml.kernel.org/r/20240129100708.39460-2-elver@google.com
      Fixes: cc478e0b ("kasan: avoid resetting aux_lock")
      Fixes: 63b85ac5 ("kasan: stop leaking stack trace handles")
      Fixes: 08d7c94d ("kasan: memset free track in qlink_free")
      Fixes: a414d428 ("kasan: handle concurrent kasan_record_aux_stack calls")
      Fixes: 773688a6 ("kasan: use stack_depot_put for Generic mode")
      Signed-off-by: default avatarMarco Elver <elver@google.com>
      Reviewed-by: default avatarAndrey Konovalov <andreyknvl@gmail.com>
      Tested-by: default avatarMikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      711d3491
    • Marco Elver's avatar
      stackdepot: use variable size records for non-evictable entries · 31639fd6
      Marco Elver authored
      With the introduction of stack depot evictions, each stack record is now
      fixed size, so that future reuse after an eviction can safely store
      differently sized stack traces.  In all cases that do not make use of
      evictions, this wastes lots of space.
      
      Fix it by re-introducing variable size stack records (up to the max
      allowed size) for entries that will never be evicted.  We know if an entry
      will never be evicted if the flag STACK_DEPOT_FLAG_GET is not provided,
      since a later stack_depot_put() attempt is undefined behavior.
      
      With my current kernel config that enables KASAN and also SLUB owner
      tracking, I observe (after a kernel boot) a whopping reduction of 296
      stack depot pools, which translates into 4736 KiB saved.  The savings here
      are from SLUB owner tracking only, because KASAN generic mode still uses
      refcounting.
      
      Before:
      
        pools: 893
        allocations: 29841
        frees: 6524
        in_use: 23317
        freelist_size: 3454
      
      After:
      
        pools: 597
        refcounted_allocations: 17547
        refcounted_frees: 6477
        refcounted_in_use: 11070
        freelist_size: 3497
        persistent_count: 12163
        persistent_bytes: 1717008
      
      [elver@google.com: fix -Wstringop-overflow warning]
        Link: https://lore.kernel.org/all/20240201135747.18eca98e@canb.auug.org.au/
        Link: https://lkml.kernel.org/r/20240201090434.1762340-1-elver@google.com
        Link: https://lore.kernel.org/all/CABXGCsOzpRPZGg23QqJAzKnqkZPKzvieeg=W7sgjgi3q0pBo0g@mail.gmail.com/
      Link: https://lkml.kernel.org/r/20240129100708.39460-1-elver@google.com
      Link: https://lore.kernel.org/all/CABXGCsOzpRPZGg23QqJAzKnqkZPKzvieeg=W7sgjgi3q0pBo0g@mail.gmail.com/
      Fixes: 108be8de ("lib/stackdepot: allow users to evict stack traces")
      Signed-off-by: default avatarMarco Elver <elver@google.com>
      Reviewed-by: default avatarAndrey Konovalov <andreyknvl@gmail.com>
      Tested-by: default avatarMikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      31639fd6
  2. 20 Feb, 2024 14 commits
  3. 18 Feb, 2024 6 commits
    • Linus Torvalds's avatar
      Linux 6.8-rc5 · b401b621
      Linus Torvalds authored
      b401b621
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v6.8-2' of... · 6c160f16
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Reformat nested if-conditionals in Makefiles with 4 spaces
      
       - Fix CONFIG_DEBUG_INFO_BTF builds for big endian
      
       - Fix modpost for module srcversion
      
       - Fix an escape sequence warning in gen_compile_commands.py
      
       - Fix kallsyms to ignore ARMv4 thunk symbols
      
      * tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kallsyms: ignore ARMv4 thunks along with others
        modpost: trim leading spaces when processing source files list
        gen_compile_commands: fix invalid escape sequence warning
        kbuild: Fix changing ELF file type for output of gen_btf for big endian
        docs: kconfig: Fix grammar and formatting
        kbuild: use 4-space indentation when followed by conditionals
      6c160f16
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ddac3d8b
      Linus Torvalds authored
      Pull x86 fix from Borislav Petkov:
      
       - Use a GB page for identity mapping only when memory of this size is
         requested so that mapping of reserved regions is prevented which
         would otherwise lead to system crashes on UV machines
      
      * tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
      ddac3d8b
    • Linus Torvalds's avatar
      Merge tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cb7c32d
      Linus Torvalds authored
      Pull irq fixes from Borislav Petkov:
      
       - Fix GICv4.1 affinity update
      
       - Restore a quirk for ACPI-based GICv4 systems
      
       - Handle non-coherent GICv4 redistributors properly
      
       - Prevent spurious interrupts on Broadcom devices using GIC v3
         architecture
      
       - Other minor fixes
      
      * tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
        irqchip/gic-v3-its: Restore quirk probing for ACPI-based systems
        irqchip/gic-v3-its: Handle non-coherent GICv4 redistributors
        irqchip/qcom-mpm: Fix IS_ERR() vs NULL check in qcom_mpm_init()
        irqchip/loongson-eiointc: Use correct struct type in eiointc_domain_alloc()
        irqchip/irq-brcmstb-l2: Add write memory barrier before exit
      7cb7c32d
    • Linus Torvalds's avatar
      Merge tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 626721ed
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two fixes for i801 and qcom-geni devices. Meanwhile, a fix from Arnd
        addresses a compilation error encountered during compile test on
        powerpc"
      
      * tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: i801: Fix block process call transactions
        i2c: pasemi: split driver into two separate modules
        i2c: qcom-geni: Correct I2C TRE sequence
      626721ed
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c02197fc
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "This is a bit of a big batch for rc4, but just due to holiday hangover
        and because I didn't send any fixes last week due to a late revert
        request. I think next week should be back to normal.
      
         - Fix ftrace bug on boot caused by exit text sections with
           '-fpatchable-function-entry'
      
         - Fix accuracy of stolen time on pseries since the switch to
           VIRT_CPU_ACCOUNTING_GEN
      
         - Fix a crash in the IOMMU code when doing DLPAR remove
      
         - Set pt_regs->link on scv entry to fix BPF stack unwinding
      
         - Add missing PPC_FEATURE_BOOKE on 64-bit e5500/e6500, which broke
           gdb
      
         - Fix boot on some 6xx platforms with STRICT_KERNEL_RWX enabled
      
         - Fix build failures with KASAN enabled and 32KB stack size
      
         - Some other minor fixes
      
        Thanks to Arnd Bergmann, Benjamin Gray, Christophe Leroy, David
        Engraf, Gaurav Batra, Jason Gunthorpe, Jiangfeng Xiao, Matthias
        Schiffer, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A,
        R Nageswara Sastry, Shivaprasad G Bhat, Shrikanth Hegde, Spoorthy,
        Srikar Dronamraju, and Venkat Rao Bagalkote"
      
      * tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach
        powerpc/pseries: fix accuracy of stolen time
        powerpc/ftrace: Ignore ftrace locations in exit text sections
        powerpc/cputable: Add missing PPC_FEATURE_BOOKE on PPC64 Book-E
        powerpc/kasan: Limit KASAN thread size increase to 32KB
        Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add"
        powerpc: 85xx: mark local functions static
        powerpc: udbg_memcons: mark functions static
        powerpc/kasan: Fix addr error caused by page alignment
        powerpc/6xx: set High BAT Enable flag on G2_LE cores
        selftests/powerpc/papr_vpd: Check devfd before get_system_loc_code()
        powerpc/64: Set task pt_regs->link to the LR value on scv entry
        powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add
        powerpc/pseries/papr-sysparm: use u8 arrays for payloads
      c02197fc
  4. 17 Feb, 2024 12 commits
  5. 16 Feb, 2024 2 commits