1. 24 May, 2024 15 commits
    • mm/memory-failure: fix handling of dissolved but not taken off from buddy pages · 8cf360b9
      Miaohe Lin authored
      While running memory failure tests recently, I hit the panic below:
      
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
      flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
      raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
      page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
      ------------[ cut here ]------------
      kernel BUG at include/linux/page-flags.h:1009!
      invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      RIP: 0010:__del_page_from_free_list+0x151/0x180
      RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
      RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
      RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
      RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
      R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
      R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
      FS:  00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
      Call Trace:
       <TASK>
       __rmqueue_pcplist+0x23b/0x520
       get_page_from_freelist+0x26b/0xe40
       __alloc_pages_noprof+0x113/0x1120
       __folio_alloc_noprof+0x11/0xb0
       alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
       __alloc_fresh_hugetlb_folio+0xe7/0x140
       alloc_pool_huge_folio+0x68/0x100
       set_max_huge_pages+0x13d/0x340
       hugetlb_sysctl_handler_common+0xe8/0x110
       proc_sys_call_handler+0x194/0x280
       vfs_write+0x387/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xc2/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7ff916114887
      RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
      RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
      RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
      R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
      R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
       </TASK>
      Modules linked in: mce_inject hwpoison_inject
      ---[ end trace 0000000000000000 ]---
      
      Before the panic, there was a warning about a bad page state:
      
      BUG: Bad page state in process page-types  pfn:8cee00
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
      flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
      page_type: 0xffffff7f(buddy)
      raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
      raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
      page dumped because: nonzero mapcount
      Modules linked in: mce_inject hwpoison_inject
      CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
      Call Trace:
       <TASK>
       dump_stack_lvl+0x83/0xa0
       bad_page+0x63/0xf0
       free_unref_page+0x36e/0x5c0
       unpoison_memory+0x50b/0x630
       simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
       debugfs_attr_write+0x42/0x60
       full_proxy_write+0x5b/0x80
       vfs_write+0xcd/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xc2/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f189a514887
      RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
      RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
      RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
      R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
       </TASK>
      
      The root cause appears to be the race below:
      
       memory_failure
        try_memory_failure_hugetlb
         me_huge_page
          __page_handle_poison
           dissolve_free_hugetlb_folio
           drain_all_pages -- Buddy page can be isolated e.g. for compaction.
           take_page_off_buddy -- Failed as page is not in the buddy list.
                                -- Page can be put back into buddy after compaction.
          page_ref_inc -- Leads to buddy page with refcnt = 1.
      
      unpoison_memory() can then unpoison the page and send the buddy page
      back into the buddy list a second time, which triggers the bad page
      state warning above.  bad_page() then calls page_mapcount_reset(),
      which clears PageBuddy from the buddy page and leads to the later
      VM_BUG_ON_PAGE(!PageBuddy(page)) when this page is allocated.
      
      Fix this issue by only treating __page_handle_poison() as successful when
      it returns 1.
      
      Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
      Fixes: ceaf8fbe ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again · 6d065f50
      Yuanyuan Zhong authored
      After smaps_rollup was switched to the VMA iterator, the search for the
      next entry became part of the condition expression of the do-while loop.
      The current VMA therefore needs to be handled before the continue
      statement.
      
      Otherwise some VMAs are skipped, and the memory consumption that
      userspace observes in /proc/pid/smaps_rollup is smaller than the sum of
      the corresponding fields from /proc/pid/smaps.
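
      The loop hazard described here can be reproduced in plain C.  The sketch
      below is only an illustration, not the kernel code: struct region,
      struct cursor, cursor_next() and both sum functions are hypothetical
      names, and the "relocked" flag merely stands in for the path that
      re-acquires mmap_lock.  It shows that `continue` inside a do-while jumps
      to the condition, so an iterator advanced there moves past the current
      entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; only its size matters here. */
struct region {
	unsigned long size;
	int relocked;	/* nonzero: the lock was dropped and retaken here */
};

/* Hypothetical iterator that mimics a "find next entry" call. */
struct cursor {
	const struct region *items;
	size_t count;
	size_t pos;
};

static const struct region *cursor_next(struct cursor *c)
{
	return c->pos < c->count ? &c->items[c->pos++] : NULL;
}

/*
 * Skipping variant: `continue` jumps straight to the loop condition,
 * which advances the cursor before the current region is counted.
 */
unsigned long sum_skipping(const struct region *items, size_t count)
{
	struct cursor c = { items, count, 0 };
	const struct region *r;
	unsigned long total = 0;

	assert(items != NULL || count == 0);
	r = cursor_next(&c);
	if (!r)
		return 0;
	do {
		if (r->relocked)
			continue;	/* current region is never accounted */
		total += r->size;
	} while ((r = cursor_next(&c)) != NULL);
	return total;
}

/* Complete variant: account for the current region before `continue`. */
unsigned long sum_complete(const struct region *items, size_t count)
{
	struct cursor c = { items, count, 0 };
	const struct region *r;
	unsigned long total = 0;

	assert(items != NULL || count == 0);
	r = cursor_next(&c);
	if (!r)
		return 0;
	do {
		if (r->relocked) {
			total += r->size;	/* handle the current region first */
			continue;
		}
		total += r->size;
	} while ((r = cursor_next(&c)) != NULL);
	return total;
}
```

      With three regions of sizes 10, 20 and 30 where only the second one is
      marked relocked, sum_skipping() reports 40 while sum_complete() reports
      60, which mirrors the smaps_rollup total falling short of the smaps sum.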
      
      Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com
      Fixes: c4c84f06 ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
      Signed-off-by: Yuanyuan Zhong <yzhong@purestorage.com>
      Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: fix potential hang in nilfs_detach_log_writer() · eb85dace
      Ryusuke Konishi authored
      Syzbot has reported a potential hang in nilfs_detach_log_writer() called
      during nilfs2 unmount.
      
      Analysis revealed that this is because nilfs_segctor_sync(), which
      synchronizes with the log writer thread, can be called after
      nilfs_segctor_destroy() terminates that thread, as shown in the call trace
      below:
      
      nilfs_detach_log_writer
        nilfs_segctor_destroy
          nilfs_segctor_kill_thread  --> Shut down log writer thread
          flush_work
            nilfs_iput_work_func
              nilfs_dispose_list
                iput
                  nilfs_evict_inode
                    nilfs_transaction_commit
                      nilfs_construct_segment (if inode needs sync)
                        nilfs_segctor_sync  --> Attempt to synchronize with
                                                log writer thread
                                 *** DEADLOCK ***
      
      Fix this issue by changing nilfs_segctor_sync() so that it returns
      normally without synchronizing once the log writer thread has
      terminated, and by forcing tasks that are already waiting to complete
      once the thread terminates.
      
      The skipped inode metadata flushout will then be processed together in the
      subsequent cleanup work in nilfs_segctor_destroy().
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+e3973c409251e136fdd0@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: fix unexpected freezing of nilfs_segctor_sync() · 936184ea
      Ryusuke Konishi authored
      A potential and reproducible race issue has been identified where
      nilfs_segctor_sync() would block even after the log writer thread writes a
      checkpoint, unless there is an interrupt or other trigger to resume log
      writing.
      
      This turned out to be because, depending on the execution timing of the
      log writer thread running in parallel, the log writer thread may skip
      responding to nilfs_segctor_sync(), which causes a call to schedule()
      waiting for completion within nilfs_segctor_sync() to lose the opportunity
      to wake up.
      
      The wake-up of the task waiting in nilfs_segctor_sync() may be skipped
      because updating the request generation, which is issued using a shared
      sequence counter, and adding a wait queue entry to the log writer's
      request wait queue are not done atomically.  Log writing and request
      completion notification by nilfs_segctor_wakeup() may occur between the
      two operations.  In that case, the wait queue entry is not yet visible
      to nilfs_segctor_wakeup(), and the wake-up of nilfs_segctor_sync() is
      carried over until the next request occurs.
      
      Fix this issue by performing these two operations simultaneously within
      the lock section of sc_state_lock.  Also, following the memory barrier
      guidelines for event waiting loops, move the call to set_current_state()
      in the same location into the event waiting loop to ensure that a memory
      barrier is inserted just before the event condition determination.
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com
      Fixes: 9ff05123 ("nilfs2: segment constructor")
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: fix use-after-free of timer for log writer thread · f5d4e046
      Ryusuke Konishi authored
      Patch series "nilfs2: fix log writer related issues".
      
      This bug fix series covers three nilfs2 log writer-related issues,
      including a timer use-after-free issue and potential deadlock issue on
      unmount, and a potential freeze issue in event synchronization found
      during their analysis.  Details are described in each commit log.
      
      
      This patch (of 3):
      
      A use-after-free issue has been reported regarding the timer sc_timer on
      the nilfs_sc_info structure.
      
      The problem is that even though it is used to wake up a sleeping log
      writer thread, sc_timer is not shut down until the nilfs_sc_info structure
      is about to be freed, and is used regardless of the thread's lifetime.
      
      Fix this issue by limiting the use of sc_timer only while the log writer
      thread is alive.
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
      Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
      Fixes: fdce895e ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: "Bai, Shuangpeng" <sjb7183@psu.edu>
      Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests/mm: fix build warnings on ppc64 · 1901472f
      Michael Ellerman authored
      Fix warnings like:
      
        In file included from uffd-unit-tests.c:8:
        uffd-unit-tests.c: In function `uffd_poison_handle_fault':
        uffd-common.h:45:33: warning: format `%llu' expects argument of type
        `long long unsigned int', but argument 3 has type `__u64' {aka `long
        unsigned int'} [-Wformat=]
      
      By switching to unsigned long long for u64 for ppc64 builds.
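
      For readers who want to see the warning class in isolation, here is a
      minimal userspace sketch.  It does not reproduce the selftest change
      itself; it uses the standard uint64_t in place of __u64, and the two
      helper names are hypothetical.  It shows two portable ways to print a
      64-bit value without tripping -Wformat on platforms where the type is
      `unsigned long`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Cast explicitly so the argument always matches "%llu". */
int format_u64_cast(char *buf, size_t len, uint64_t v)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%llu", (unsigned long long)v);
}

/* Or let <inttypes.h> pick the conversion that matches uint64_t. */
int format_u64_macro(char *buf, size_t len, uint64_t v)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%" PRIu64, v);
}
```

      Both helpers produce the same text for any value.  The cast variant
      keeps existing "%llu" format strings unchanged, which is the same
      direction the commit takes by making the type itself unsigned long long.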
      
      Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.au
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Shuah Khan <skhan@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • arm64: patching: fix handling of execmem addresses · b1480ed2
      Will Deacon authored
      Klara Modin reported warnings for a kernel configured with BPF_JIT but
      without MODULES:
      
      [   44.131296] Trying to vfree() bad address (000000004a17c299)
      [   44.138024] WARNING: CPU: 1 PID: 193 at mm/vmalloc.c:3189 remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.146675] CPU: 1 PID: 193 Comm: kworker/1:2 Tainted: G      D W          6.9.0-01786-g2c9e5d4a #25
      [   44.158229] Hardware name: Raspberry Pi 3 Model B (DT)
      [   44.164433] Workqueue: events bpf_prog_free_deferred
      [   44.170492] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [   44.178601] pc : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.183705] lr : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.188772] sp : ffff800082a13c70
      [   44.193112] x29: ffff800082a13c70 x28: 0000000000000000 x27: 0000000000000000
      [   44.201384] x26: 0000000000000000 x25: ffff00003a44efa0 x24: 00000000d4202000
      [   44.209658] x23: ffff800081223dd0 x22: ffff00003a198a40 x21: ffff8000814dd880
      [   44.217924] x20: 00000000d4202000 x19: ffff8000814dd880 x18: 0000000000000006
      [   44.226206] x17: 0000000000000000 x16: 0000000000000020 x15: 0000000000000002
      [   44.234460] x14: ffff8000811a6370 x13: 0000000020000000 x12: 0000000000000000
      [   44.242710] x11: ffff8000811a6370 x10: 0000000000000144 x9 : ffff8000811fe370
      [   44.250959] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000811fe370
      [   44.259206] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
      [   44.267457] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000002203240
      [   44.275703] Call trace:
      [   44.279158] remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.283858] vfree (mm/vmalloc.c:3322)
      [   44.287835] execmem_free (mm/execmem.c:70)
      [   44.292347] bpf_jit_free_exec+0x10/0x1c
      [   44.297283] bpf_prog_pack_free (kernel/bpf/core.c:1006)
      [   44.302457] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195)
      [   44.307951] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474)
      [   44.312342] bpf_prog_free_deferred (kernel/bpf/core.c:2785)
      [   44.317785] process_one_work (kernel/workqueue.c:3273)
      [   44.322684] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2))
      [   44.327292] kthread (kernel/kthread.c:388)
      [   44.331342] ret_from_fork (arch/arm64/kernel/entry.S:861)
      
      The problem is that bpf_arch_text_copy() silently fails to write to the
      read-only area because patch_map() faults and the resulting -EFAULT is
      discarded.
      
      Update patch_map() to use CONFIG_EXECMEM instead of
      CONFIG_STRICT_MODULE_RWX to check for vmalloc addresses.
      
      Link: https://lkml.kernel.org/r/20240521213813.703309-1-rppt@kernel.org
      Fixes: 2c9e5d4a ("bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of")
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Reported-by: Klara Modin <klarasmodin@gmail.com>
      Closes: https://lore.kernel.org/all/7983fbbf-0127-457c-9394-8d6e4299c685@gmail.com
      Tested-by: Klara Modin <klarasmodin@gmail.com>
      Cc: Björn Töpel <bjorn@kernel.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation · fb9293b6
      Dev Jain authored
      
      Reset nr_hugepages to zero before the start of the test.
      
      If a non-zero number of hugepages is already set before the start of the
      test, the following problems arise:
      
       - The probability of the test getting OOM-killed increases.  Proof:
         The test wants to run on 80% of available memory to prevent OOM-killing
         (see original code comments).  Let the value of mem_free at the start
         of the test, when nr_hugepages = 0, be x.  In the other case, when
         nr_hugepages > 0, let the memory consumed by hugepages be y.  In the
         former case, the test operates on 0.8 * x of memory.  In the latter,
         the test operates on 0.8 * (x - y) of memory, with y already filled,
         hence, memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 *
         x.  Q.E.D
      
       - The probability of a bogus test success increases.  Proof: Let the
         memory consumed by hugepages be greater than 25% of x, with x and y
         defined as above.  The definition of compaction_index is c_index = (x -
         y)/z where z is the memory consumed by hugepages after trying to
         increase them again.  In check_compaction(), we set the number of
         hugepages to zero, and then increase them back; the probability that
         they will be set back to consume at least y amount of memory again is
         very high (since there is not much delay between the two attempts of
         changing nr_hugepages).  Hence, z >= y > (x/4) (by the 25% assumption).
         Therefore, c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3
         hence, c_index can always be forced to be less than 3, thereby the test
         succeeding always.  Q.E.D
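
      The two proofs above can be checked with concrete numbers.  The helpers
      below are a hypothetical sketch, not code from the selftest: x, y and z
      follow the definitions in the text, the units are arbitrary, and the
      function names are made up for illustration.

```c
#include <assert.h>

/*
 * Memory in use while the test runs: y is already held by hugepages and
 * the test touches 80% of what is left (x - y).
 */
unsigned long memory_in_use(unsigned long x, unsigned long y)
{
	assert(y <= x);
	return y + (x - y) * 8 / 10;
}

/* c_index = (x - y) / z, as defined in the text above. */
double c_index(unsigned long x, unsigned long y, unsigned long z)
{
	assert(y <= x && z > 0);
	return (double)(x - y) / (double)z;
}
```

      Taking x = 1000 and y = 300 (more than 25% of x): memory_in_use() gives
      860 instead of the intended 800, and with z = y the index is 700 / 300,
      roughly 2.33, which is below the threshold of 3 regardless of how well
      compaction worked.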
      
      Link: https://lkml.kernel.org/r/20240521074358.675031-4-dev.jain@arm.com
      Fixes: bd67d5c1 ("Test compaction of mlocked memory")
      Signed-off-by: Dev Jain <dev.jain@arm.com>
      Cc: <stable@vger.kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sri Jayaramappa <sjayaram@akamai.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages · 9ad665ef
      Dev Jain authored
      Currently, the test tries to set nr_hugepages to zero, but that is not
      actually done because the file offset is not reset after read().  Fix that
      using lseek().
      
      Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com
      Fixes: bd67d5c1 ("Test compaction of mlocked memory")
      Signed-off-by: Dev Jain <dev.jain@arm.com>
      Cc: <stable@vger.kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sri Jayaramappa <sjayaram@akamai.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests/mm: compaction_test: fix bogus test success on Aarch64 · d4202e66
      Dev Jain authored
      Patch series "Fixes for compaction_test", v2.
      
      The compaction_test memory selftest introduces fragmentation in memory
      and then tries to allocate as many hugepages as possible. This series
      addresses some problems.
      
      On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since
      compaction_index becomes 0, which is less than 3, due to no division by
      zero exception being raised. We fix that by checking for division by
      zero.
      
      Secondly, correctly set the number of hugepages to zero before trying
      to set a large number of them.
      
      Now, consider a situation in which, at the start of the test, a non-zero
      number of hugepages have been already set (while running the entire
      selftests/mm suite, or manually by the admin). The test operates on 80%
      of memory to avoid OOM-killer invocation, and because some memory is
      already blocked by hugepages, it would increase the chance of OOM-killing.
      Also, since mem_free used in check_compaction() is the value before we
      set nr_hugepages to zero, the chance that the compaction_index will
      be small is very high if the preset nr_hugepages was high, leading to a
      bogus test success.
      
      
      This patch (of 3):
      
      Currently, if at runtime we are not able to allocate a huge page, the test
      will trivially pass on Aarch64 due to no exception being raised on
      division by zero while computing compaction_index.  Fix that by checking
      for nr_hugepages == 0.  More generally, avoid a division by zero by
      exiting the program beforehand.  While at it, fix a typo, and handle the
      case where the number of hugepages may overflow an integer.
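
      As a rough illustration of the guard, the sketch below refuses to
      compute the index when no huge page was allocated.  The function name
      and parameters are hypothetical and the overflow check is one possible
      approach, not necessarily the one used in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Store mem_free / (nr_hugepages * hugepage_size) in *index.
 * Returns 0 on success, or -1 when the divisor would be zero or the
 * multiplication would overflow, so the caller can fail the test
 * instead of relying on a division-by-zero exception.
 */
int compute_compaction_index(unsigned long mem_free,
			     unsigned long nr_hugepages,
			     unsigned long hugepage_size,
			     unsigned long *index)
{
	assert(index != NULL);

	if (nr_hugepages == 0 || hugepage_size == 0)
		return -1;
	if (nr_hugepages > ULONG_MAX / hugepage_size)
		return -1;

	*index = mem_free / (nr_hugepages * hugepage_size);
	return 0;
}
```

      A caller would treat the -1 return as a test failure, which behaves the
      same on architectures that trap on integer division by zero and on
      those, such as Aarch64, that do not.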
      
      Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com
      Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com
      Fixes: bd67d5c1 ("Test compaction of mlocked memory")
      Signed-off-by: Dev Jain <dev.jain@arm.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sri Jayaramappa <sjayaram@akamai.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mailmap: update email address for Satya Priya · c17d39f5
      Satya Priya Kakitapalli authored
      Update mailmap with my latest email ID; quic_c_skakit@quicinc.com
      is no longer active.
      
      Link: https://lkml.kernel.org/r/20240515-mailmap-update-v1-1-df4853f757a3@quicinc.com
      Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
      Cc: Ajit Pandey <quic_ajipan@quicinc.com>
      Cc: Bjorn Andersson <andersson@kernel.org>
      Cc: Imran Shaik <quic_imrashai@quicinc.com>
      Cc: Jagadeesh Kona <quic_jkona@quicinc.com>
      Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
      Cc: Taniya Das <quic_tdas@quicinc.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/huge_memory: don't unpoison huge_zero_folio · fe6f86f4
      Miaohe Lin authored
      While running memory failure tests recently, I hit the panic below:
      
       kernel BUG at include/linux/mm.h:1135!
       invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
       CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14
       RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
       RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
       RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
       RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
       RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
       R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
       FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
       Call Trace:
        <TASK>
        do_shrink_slab+0x14f/0x6a0
        shrink_slab+0xca/0x8c0
        shrink_node+0x2d0/0x7d0
        balance_pgdat+0x33a/0x720
        kswapd+0x1f3/0x410
        kthread+0xd5/0x100
        ret_from_fork+0x2f/0x50
        ret_from_fork_asm+0x1a/0x30
        </TASK>
       Modules linked in: mce_inject hwpoison_inject
       ---[ end trace 0000000000000000 ]---
       RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
       RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
       RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
       RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
       RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
       R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
       FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
      
      The root cause is that HWPoison flag will be set for huge_zero_folio
      without increasing the folio refcnt.  But then unpoison_memory() will
      decrease the folio refcnt unexpectedly as it appears like a successfully
      hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when
      releasing huge_zero_folio.
      
      Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue. 
      We're not prepared to unpoison huge_zero_folio yet.
      
      Link: https://lkml.kernel.org/r/20240516122608.22610-1-linmiaohe@huawei.com
      Fixes: 478d134e ("mm/huge_memory: do not overkill when splitting huge_zero_page")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Cc: Xu Yu <xuyu@linux.alibaba.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • kasan, fortify: properly rename memintrinsics · 2e577732
      Andrey Konovalov authored
      After commit 69d4c0d3 ("entry, kasan, x86: Disallow overriding mem*()
      functions") and the follow-up fixes, with CONFIG_FORTIFY_SOURCE enabled,
      even though the compiler instruments memintrinsics by generating calls to
      __asan/__hwasan_ prefixed functions, FORTIFY_SOURCE still uses
      uninstrumented memset/memmove/memcpy as the underlying functions.
      
      As a result, KASAN cannot detect bad accesses in memset/memmove/memcpy. 
      This also makes KASAN tests corrupt kernel memory and cause crashes.
      
      To fix this, use __asan_/__hwasan_memset/memmove/memcpy as the underlying
      functions whenever appropriate.  Do this only for the instrumented code
      (as indicated by __SANITIZE_ADDRESS__).
      
      Link: https://lkml.kernel.org/r/20240517130118.759301-1-andrey.konovalov@linux.dev
      Fixes: 69d4c0d3 ("entry, kasan, x86: Disallow overriding mem*() functions")
      Fixes: 51287dcb ("kasan: emit different calls for instrumentable memintrinsics")
      Fixes: 36be5cba ("kasan: treat meminstrinsic as builtins in uninstrumented files")
      Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
      Reported-by: Erhard Furtner <erhard_f@mailbox.org>
      Reported-by: Nico Pache <npache@redhat.com>
      Closes: https://lore.kernel.org/all/20240501144156.17e65021@outsider.home/
      Reviewed-by: Marco Elver <elver@google.com>
      Tested-by: Nico Pache <npache@redhat.com>
      Acked-by: Nico Pache <npache@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • lib: add version into /proc/allocinfo output · a38568a0
      Suren Baghdasaryan authored
      Add version string and a header at the beginning of /proc/allocinfo to
      allow later format changes.  Example output:
      
      > head /proc/allocinfo
      allocinfo - version: 1.0
      #     <size>  <calls> <tag info>
                 0        0 init/main.c:1314 func:do_initcalls
                 0        0 init/do_mounts.c:353 func:mount_nodev_root
                 0        0 init/do_mounts.c:187 func:mount_root_generic
                 0        0 init/do_mounts.c:158 func:do_mount_root
                 0        0 init/initramfs.c:493 func:unpack_to_rootfs
                 0        0 init/initramfs.c:492 func:unpack_to_rootfs
                 0        0 init/initramfs.c:491 func:unpack_to_rootfs
               512        1 arch/x86/events/rapl.c:681 func:init_rapl_pmus
               128        1 arch/x86/events/rapl.c:571 func:rapl_cpu_online
      
      [akpm@linux-foundation.org: remove stray newline from struct allocinfo_private]
      Link: https://lkml.kernel.org/r/20240514163128.3662251-1-surenb@google.com
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL · 8e0545c8
      Hailong.Liu authored
      Commit a421ef30 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
      includes support for __GFP_NOFAIL, but it conflicts with commit
      dd544141 ("vmalloc: back off when the current task is OOM-killed").  A
      possible scenario is as follows:
      
      process-a
      __vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL)
          __vmalloc_area_node()
              vm_area_alloc_pages()
                  --> oom-killer sends SIGKILL to process-a
              if (fatal_signal_pending(current)) break;
      --> return NULL;
      
      To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages()
      if __GFP_NOFAIL is set.
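
      The control flow of the fix can be mimicked in userspace.  Everything in
      the sketch below is hypothetical (DEMO_NOFAIL, demo_alloc_pages() and
      its parameters); the boolean fatal_pending merely stands in for
      fatal_signal_pending(current), and no real allocation takes place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NOFAIL 0x1u	/* hypothetical stand-in for __GFP_NOFAIL */

/*
 * Count how many of `wanted` pages get "allocated".  When honor_nofail
 * is false the loop always backs off on a pending fatal signal (the
 * behaviour described as problematic); when it is true, the back-off
 * is skipped for DEMO_NOFAIL requests.
 */
size_t demo_alloc_pages(size_t wanted, unsigned int flags,
			bool fatal_pending, bool honor_nofail)
{
	bool nofail = honor_nofail && (flags & DEMO_NOFAIL);
	size_t allocated = 0;

	while (allocated < wanted) {
		if (!nofail && fatal_pending)
			break;		/* caller would end up seeing NULL */
		allocated++;
	}

	assert(allocated <= wanted);
	return allocated;
}
```

      With a pending fatal signal, a DEMO_NOFAIL request for four pages yields
      zero pages when the flag is ignored and all four when it is honored,
      while a request without the flag still backs off in both modes.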
      
      This issue occurred during OPLUS KASAN TEST.  Below is part of the log:
      -> oom-killer sends signal to process
      [65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198
      
      [65731.259685] [T32454] Call trace:
      [65731.259698] [T32454]  dump_backtrace+0xf4/0x118
      [65731.259734] [T32454]  show_stack+0x18/0x24
      [65731.259756] [T32454]  dump_stack_lvl+0x60/0x7c
      [65731.259781] [T32454]  dump_stack+0x18/0x38
      [65731.259800] [T32454]  mrdump_common_die+0x250/0x39c [mrdump]
      [65731.259936] [T32454]  ipanic_die+0x20/0x34 [mrdump]
      [65731.260019] [T32454]  atomic_notifier_call_chain+0xb4/0xfc
      [65731.260047] [T32454]  notify_die+0x114/0x198
      [65731.260073] [T32454]  die+0xf4/0x5b4
      [65731.260098] [T32454]  die_kernel_fault+0x80/0x98
      [65731.260124] [T32454]  __do_kernel_fault+0x160/0x2a8
      [65731.260146] [T32454]  do_bad_area+0x68/0x148
      [65731.260174] [T32454]  do_mem_abort+0x151c/0x1b34
      [65731.260204] [T32454]  el1_abort+0x3c/0x5c
      [65731.260227] [T32454]  el1h_64_sync_handler+0x54/0x90
      [65731.260248] [T32454]  el1h_64_sync+0x68/0x6c
      
      [65731.260269] [T32454]  z_erofs_decompress_queue+0x7f0/0x2258
      --> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL);
          The kernel panics here on a NULL pointer dereference:
          erofs assumes that kvmalloc with __GFP_NOFAIL never returns NULL.
      [65731.260293] [T32454]  z_erofs_runqueue+0xf30/0x104c
      [65731.260314] [T32454]  z_erofs_readahead+0x4f0/0x968
      [65731.260339] [T32454]  read_pages+0x170/0xadc
      [65731.260364] [T32454]  page_cache_ra_unbounded+0x874/0xf30
      [65731.260388] [T32454]  page_cache_ra_order+0x24c/0x714
      [65731.260411] [T32454]  filemap_fault+0xbf0/0x1a74
      [65731.260437] [T32454]  __do_fault+0xd0/0x33c
      [65731.260462] [T32454]  handle_mm_fault+0xf74/0x3fe0
      [65731.260486] [T32454]  do_mem_abort+0x54c/0x1b34
      [65731.260509] [T32454]  el0_da+0x44/0x94
      [65731.260531] [T32454]  el0t_64_sync_handler+0x98/0xb4
      [65731.260553] [T32454]  el0t_64_sync+0x198/0x19c
      
      Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com
      Fixes: 9376130c ("mm/vmalloc: add support for __GFP_NOFAIL")
      Signed-off-by: Hailong.Liu <hailong.liu@oppo.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Suggested-by: Barry Song <21cnbao@gmail.com>
      Reported-by: Oven <liyangouwen1@oppo.com>
      Reviewed-by: Barry Song <baohua@kernel.org>
      Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Cc: Chao Yu <chao@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Gao Xiang <xiang@kernel.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      8e0545c8
  2. 23 May, 2024 2 commits
    • Linus Torvalds's avatar
      Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of... · c760b372
      Linus Torvalds authored
      Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull more non-mm updates from Andrew Morton:
      
       - A series ("kbuild: enable more warnings by default") from Arnd
         Bergmann which enables a number of additional build-time warnings. We
         fixed all the fallout which we could find, there may still be a few
         stragglers.
      
       - Samuel Holland has developed the series "Unified cross-architecture
         kernel-mode FPU API". This does a lot of consolidation of
         per-architecture kernel-mode FPU usage and enables the use of newer
         AMD GPUs on RISC-V.
      
       - Tao Su has fixed some selftests build warnings in the series
         "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
         definition".
      
       - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
      
      * tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
        nilfs2: make block erasure safe in nilfs_finish_roll_forward()
        selftests/harness: use 1024 in place of LINE_MAX
        Revert "selftests/harness: remove use of LINE_MAX"
        selftests/fpu: allow building on other architectures
        selftests/fpu: move FP code to a separate translation unit
        drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
        drm/amd/display: only use hard-float, not altivec on powerpc
        riscv: add support for kernel-mode FPU
        x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
        powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
        LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
        lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
        arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
        arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
        ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
        ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
        arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
        x86/fpu: fix asm/fpu/types.h include guard
        kbuild: enable -Wcast-function-type-strict unconditionally
        kbuild: enable -Wformat-truncation on clang
        ...
      c760b372
    • Linus Torvalds's avatar
      Merge tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 5c6f4d68
      Linus Torvalds authored
      Pull more mm updates from Andrew Morton:
       "A series from Dave Chinner which cleans up and fixes the handling of
        nested allocations within stackdepot and page-owner"
      
      * tag 'mm-stable-2024-05-22-17-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mm/page-owner: use gfp_nested_mask() instead of open coded masking
        stackdepot: use gfp_nested_mask() instead of open coded masking
        mm: lift gfp_kmemleak_mask() to gfp.h
      5c6f4d68
  3. 22 May, 2024 19 commits
    • Linus Torvalds's avatar
      mm: simplify and improve print_vma_addr() output · de7e71ef
      Linus Torvalds authored
      Use '%pD' to print out the filename, and print out the actual offset
      within the file too, rather than just what the virtual address of the
      mapping is (which doesn't tell you anything about any mapping offsets).
      
      Also, use the exact vma_lookup() instead of find_vma() - the latter
      looks up any vma _after_ the address, which is of questionable value
      (yes, maybe you fell off the beginning, but you'd be more likely to fall
      off the end).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de7e71ef
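
      The difference between the two lookups in the print_vma_addr() change
      can be illustrated with a small user-space model. This is a sketch
      under stated assumptions: struct sketch_vma, sketch_find_vma(),
      sketch_vma_lookup(), sketch_file_offset() and sketch_map are
      hypothetical names, and a sorted array stands in for the kernel's
      real VMA data structure.

```c
#include <stddef.h>

/* Hypothetical model of a mapping: addresses [start, end), file offset of start. */
struct sketch_vma {
	unsigned long start;
	unsigned long end;
	unsigned long file_off;
};

/* Example address space with a hole between 0x2000 and 0x5000. */
static const struct sketch_vma sketch_map[] = {
	{ 0x1000, 0x2000, 0x0 },
	{ 0x5000, 0x6000, 0x3000 },
};

/* Models find_vma(): first area ending above addr, even if addr lies before it. */
static const struct sketch_vma *sketch_find_vma(const struct sketch_vma *v,
						size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < v[i].end)
			return &v[i];
	return NULL;
}

/* Models vma_lookup(): only succeeds when addr is really inside the area. */
static const struct sketch_vma *sketch_vma_lookup(const struct sketch_vma *v,
						  size_t n, unsigned long addr)
{
	const struct sketch_vma *vma = sketch_find_vma(v, n, addr);

	if (vma && addr < vma->start)
		return NULL;
	return vma;
}

/* Offset within the backing file that corresponds to addr. */
static unsigned long sketch_file_offset(const struct sketch_vma *vma,
					unsigned long addr)
{
	return addr - vma->start + vma->file_off;
}
```

      With this map, address 0x3000 falls in the hole: the find_vma()-style
      search still returns the mapping at 0x5000, whereas the exact lookup
      returns NULL. For 0x5010 both agree, and the reported file offset is
      0x3010 rather than just the virtual address.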
    • Linus Torvalds's avatar
      Merge local branch 'x86-codegen' · f8a6e48c
      Linus Torvalds authored
      Merge trivial x86 code generation annoyances
      
       - Introduce helper macros for clang asm input problems
      
       - use said macros to improve trivially stupid code generation issues in
         bitops and array_index_mask_nospec
      
       - also improve codegen with 32-bit array index comparisons
      
      None of these really matter, but I look at code generation and profiles
      fairly regularly, and these misfeatures caused the generated code to
      look really odd and distract from the real issues.
      
      * branch 'x86-codegen' of local tree:
        x86: improve bitop code generation with clang
        x86: improve array_index_mask_nospec() code generation
        clang: work around asm input constraint problems
      f8a6e48c
    • Linus Torvalds's avatar
      x86: improve bitop code generation with clang · b9b60b31
      Linus Torvalds authored
      This uses the new ASM_INPUT_RM macro to avoid the bad code generation
      issue that clang has with more generic asm inputs.
      
      This ends up avoiding generating code like this:
      
       	mov    %r10,(%rsp)
       	tzcnt  (%rsp),%rcx
      
      which now becomes just
      
       	tzcnt  %r10,%rcx
      
      and in the process ends up also removing a few unnecessary stack frames
      when the only use was that pointless "asm uses memory location off stack".
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b9b60b31
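
      A rough sketch of how a constraint macro such as ASM_INPUT_RM can be
      used follows. The macro values shown here are an assumption based on
      the commit descriptions ("rm" and "g" normally, register-only forms
      under clang), and sketch_lowest_set_bit() is a hypothetical helper,
      not the kernel's bitop code. On non-x86-64 builds the sketch falls
      back to a compiler builtin.

```c
#include <stdint.h>

/*
 * Assumed definitions: clang gets constraints that cannot degrade to a
 * stack spill; other compilers keep the more generic "g" and "rm".
 */
#ifdef __clang__
# define ASM_INPUT_G  "ir"
# define ASM_INPUT_RM "r"
#else
# define ASM_INPUT_G  "g"
# define ASM_INPUT_RM "rm"
#endif

/* Index of the lowest set bit; word must be non-zero. */
static inline uint64_t sketch_lowest_set_bit(uint64_t word)
{
#if defined(__x86_64__)
	uint64_t idx;

	/* The input constraint comes from the macro instead of a literal "rm". */
	__asm__("bsfq %1,%0" : "=r" (idx) : ASM_INPUT_RM (word) : "cc");
	return idx;
#else
	return (uint64_t)__builtin_ctzll(word);
#endif
}
```

      The design point is that the asm statement itself stays the same for
      every compiler; only the constraint string changes, so clang is never
      offered the memory alternative it would otherwise pick.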
    • Linus Torvalds's avatar
      x86: improve array_index_mask_nospec() code generation · 7453b948
      Linus Torvalds authored
      Don't force the inputs to be 'unsigned long', when the comparison can
      easily be done in 32-bit if that's more appropriate.
      
      Note that while we can look at the inputs to choose an appropriate size
      for the compare instruction, the output is fixed at 'unsigned long'.
      That's not technically optimal either, since a 32-bit 'sbbl' would often
      be sufficient.
      
      But for the outgoing mask we don't know how the mask ends up being used
      (ie we have uses that have an incoming 32-bit array index, but end up
      using the mask for other things).  That said, it only costs the extra
      REX prefix to always generate the 64-bit mask.
      
      [ A 'sbbl' also always technically generates a 64-bit mask, but with the
        upper 32 bits clear: that's fine for when the incoming index that will
        be masked is already 32-bit, but not if you use the mask to mask a
        pointer afterwards, like the file table lookup does ]
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7453b948
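
      The point about the mask width can be checked with plain integer
      arithmetic. This is an illustrative sketch only: sketch_mask64() and
      sketch_mask32_zext() are hypothetical names, and portable C like this
      gives no speculation guarantees (the kernel uses a cmp/sbb asm
      sequence precisely to avoid a branch).

```c
#include <stdint.h>

/* Full-width mask: all ones when index < size, zero otherwise. */
static uint64_t sketch_mask64(uint64_t index, uint64_t size)
{
	return (uint64_t)0 - (uint64_t)(index < size);
}

/*
 * What a 32-bit 'sbbl' would leave in a 64-bit register: the low half
 * is all ones or all zeroes, the upper 32 bits are always clear.
 */
static uint64_t sketch_mask32_zext(uint32_t index, uint32_t size)
{
	uint32_t mask = (uint32_t)(0u - (uint32_t)(index < size));

	return (uint64_t)mask;
}
```

      Masking a 32-bit index with either result works, but masking a
      64-bit pointer such as 0xffff888012345678 with the zero-extended
      32-bit mask silently drops its upper half, which is why the output
      stays 'unsigned long'.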
    • Linus Torvalds's avatar
      clang: work around asm input constraint problems · dbaaabd6
      Linus Torvalds authored
      Work around clang problems with asm constraints that have multiple
      possibilities, particularly "g" and "rm".
      
      Clang seems to turn inputs like that into the most generic form, which
      is the memory input - but to make matters worse, clang won't even use a
      possible original memory location, but will spill the value to stack,
      and use the stack for the asm input.
      
      See
      
        https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442
      
      for some explanation of why clang has this strange behavior, but the end
      result is that "g" and "rm" really end up generating horrid code.
      
      Link: https://github.com/llvm/llvm-project/issues/20571
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dbaaabd6
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 5f16eb05
      Linus Torvalds authored
      Pull char/misc and other driver subsystem updates from Greg KH:
       "Here is the big set of char/misc and other driver subsystem updates
        for 6.10-rc1. Nothing major here, just lots of new drivers and updates
        for apis and new hardware types. Included in here are:
      
         - big IIO driver updates with more devices and drivers added
      
         - fpga driver updates
      
         - hyper-v driver updates
      
         - uio_pruss driver removal, no one uses it, other drivers control the
           same hardware now
      
         - binder minor updates
      
         - mhi driver updates
      
         - extcon driver updates
      
         - counter driver updates
      
         - accessibility driver updates
      
         - coresight driver updates
      
         - other hwtracing driver updates
      
         - nvmem driver updates
      
         - slimbus driver updates
      
         - spmi driver updates
      
         - other smaller misc and char driver updates
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'char-misc-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (319 commits)
        misc: ntsync: mark driver as "broken" to prevent from building
        spmi: pmic-arb: Add multi bus support
        spmi: pmic-arb: Register controller for bus instead of arbiter
        spmi: pmic-arb: Make core resources acquiring a version operation
        spmi: pmic-arb: Make the APID init a version operation
        spmi: pmic-arb: Fix some compile warnings about members not being described
        dt-bindings: spmi: Deprecate qcom,bus-id
        dt-bindings: spmi: Add X1E80100 SPMI PMIC ARB schema
        spmi: pmic-arb: Replace three IS_ERR() calls by null pointer checks in spmi_pmic_arb_probe()
        spmi: hisi-spmi-controller: Do not override device identifier
        dt-bindings: spmi: hisilicon,hisi-spmi-controller: clean up example
        dt-bindings: spmi: hisilicon,hisi-spmi-controller: fix binding references
        spmi: make spmi_bus_type const
        extcon: adc-jack: Document missing struct members
        extcon: realtek: Remove unused of_gpio.h
        extcon: usbc-cros-ec: Convert to platform remove callback returning void
        extcon: usb-gpio: Convert to platform remove callback returning void
        extcon: max77843: Convert to platform remove callback returning void
        extcon: max3355: Convert to platform remove callback returning void
        extcon: intel-mrfld: Convert to platform remove callback returning void
        ...
      5f16eb05
    • Linus Torvalds's avatar
      Merge tag 'driver-core-6.10-rc1' of... · d90be6e4
      Linus Torvalds authored
      Merge tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core updates from Greg KH:
       "Here is the small set of driver core and kernfs changes for 6.10-rc1.
      
        Nothing major here at all, just a small set of changes for some driver
        core apis, and minor fixups. Included in here are:
      
         - sysfs_bin_attr_simple_read() helper added and used
      
         - device_show_string() helper added and used
      
        All usages of these were acked by the various maintainers. Also in
        here are:
      
         - kernfs minor cleanup
      
         - removed unused functions
      
         - typo fix in documentation
      
         - pay attention to sysfs_create_link() failures in module.c finally
      
        All of these have been in linux-next for a very long time with no
        reported problems"
      
      * tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        device property: Fix a typo in the description of device_get_child_node_count()
        kernfs: mount: Remove unnecessary ‘NULL’ values from knparent
        scsi: Use device_show_string() helper for sysfs attributes
        platform/x86: Use device_show_string() helper for sysfs attributes
        perf: Use device_show_string() helper for sysfs attributes
        IB/qib: Use device_show_string() helper for sysfs attributes
        hwmon: Use device_show_string() helper for sysfs attributes
        driver core: Add device_show_string() helper for sysfs attributes
        treewide: Use sysfs_bin_attr_simple_read() helper
        sysfs: Add sysfs_bin_attr_simple_read() helper
        module: don't ignore sysfs_create_link() failures
        driver core: Remove unused platform_notify, platform_notify_remove
      d90be6e4
    • Linus Torvalds's avatar
      Merge tag 'staging-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · be81389c
      Linus Torvalds authored
      Pull staging driver updates from Greg KH:
       "Here is the big set of staging driver changes for 6.10-rc1. Not a lot
        of cleanups happening this kernel release, intern applications must be
        out of sync at the moment. But we did delete two drivers, wlan-ng and
        pi433, as they are no longer in use and the developers involved wanted
        them just gone entirely, allowing us to drop 19k lines from the tree.
      
        Other than the normal coding style cleanups here, there has been a lot
        of work on the vc04_services code, with the intent to finally get that
        out of staging hopefully soon. It's getting closer, which is nice to
        see.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'staging-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (98 commits)
        staging: pi433: Remove unused driver
        staging: vchiq_core: Add missing blank lines
        staging: vchiq_core: Drop unnecessary blank lines
        staging: vchiq_core: Add parentheses to VCHIQ_MSG_SRCPORT
        staging: vchiq_core: Use printk messages for devices
        staging: vchiq_arm: Drop unnecessary NULL check
        staging: vc04_services: Delete unnecessary NULL check
        staging: vc04_services: vchiq_arm: Fix NULL ptr dereferences
        Staging: rtl8192e: Rename variable DssCCk
        Staging: rtl8192e: Rename variable ExtHTCapInfo
        Staging: rtl8192e: Rename variable MPDUDensity
        Staging: rtl8192e: Rename variable MaxRxAMPDUFactor
        Staging: rtl8192e: Rename variable MaxAMSDUSize
        Staging: rtl8192e: Rename variable DelayBA
        Staging: rtl8192e: Rename variable RxSTBC
        Staging: rtl8192e: Rename variable TxSTBC
        Staging: rtl8192e: Rename variable GreenField
        Staging: rtl8192e: Rename variable ShortGI20Mhz
        Staging: rtl8192e: Rename variable ShortGI40Mhz
        Staging: rtl8192e: Rename variable MimoPwrSave
        ...
      be81389c
    • Linus Torvalds's avatar
      Merge tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · f6b8e86b
      Linus Torvalds authored
      Pull tty / serial updates from Greg KH:
       "Here is the big set of tty/serial driver changes for 6.10-rc1.
        Included in here are:
      
         - Usual good set of api cleanups and evolution by Jiri Slaby to make
           the serial interfaces move out of the 1990's by using kfifos
           instead of hand-rolling their own logic.
      
         - 8250_exar driver updates
      
         - max3100 driver updates
      
         - sc16is7xx driver updates
      
         - exar driver updates
      
         - sh-sci driver updates
      
         - tty ldisc api addition to help refuse bindings
      
         - other smaller serial driver updates
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'tty-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (113 commits)
        serial: Clear UPF_DEAD before calling tty_port_register_device_attr_serdev()
        serial: imx: Raise TX trigger level to 8
        serial: 8250_pnp: Simplify "line" related code
        serial: sh-sci: simplify locking when re-issuing RXDMA fails
        serial: sh-sci: let timeout timer only run when DMA is scheduled
        serial: sh-sci: describe locking requirements for invalidating RXDMA
        serial: sh-sci: protect invalidating RXDMA on shutdown
        tty: add the option to have a tty reject a new ldisc
        serial: core: Call device_set_awake_path() for console port
        dt-bindings: serial: brcm,bcm2835-aux-uart: convert to dtschema
        tty: serial: uartps: Add support for uartps controller reset
        arm64: zynqmp: Add resets property for UART nodes
        dt-bindings: serial: cdns,uart: Add optional reset property
        serial: 8250_pnp: Switch to DEFINE_SIMPLE_DEV_PM_OPS()
        serial: 8250_exar: Keep the includes sorted
        serial: 8250_exar: Make type of bit the same in exar_ee_*_bit()
        serial: 8250_exar: Use BIT() in exar_ee_read()
        serial: 8250_exar: Switch to use dev_err_probe()
        serial: 8250_exar: Return directly from switch-cases
        serial: 8250_exar: Decrease indentation level
        ...
      f6b8e86b
    • Linus Torvalds's avatar
      Merge tag 'usb-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 89601f67
      Linus Torvalds authored
      Pull USB / Thunderbolt updates from Greg KH:
       "Here is the big set of USB and Thunderbolt changes for 6.10-rc1.
        Nothing hugely earth-shattering, just constant forward progress for
        hardware support of new devices and cleanups over the drivers.
      
        Included in here are:
      
         - Thunderbolt / USB 4 driver updates
      
         - typec driver updates
      
         - dwc3 driver updates
      
         - gadget driver updates
      
         - uss720 driver id additions and fixes (people use USB->parallel port
           devices still!)
      
         - onboard-hub driver rename and additions for new hardware
      
         - xhci driver updates
      
         - other small USB driver updates and additions for quirks and api
           changes
      
        All of these have been in linux-next for a while with no reported
        problems"
      
      * tag 'usb-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
        drm/bridge: aux-hpd-bridge: correct devm_drm_dp_hpd_bridge_add() stub
        usb: fotg210: Add missing kernel doc description
        usb: dwc3: core: Fix unused variable warning in core driver
        usb: typec: tipd: rely on i2c_get_match_data()
        usb: typec: tipd: fix event checking for tps6598x
        usb: typec: tipd: fix event checking for tps25750
        dt-bindings: usb: qcom,dwc3: fix interrupt max items
        usb: fotg210: Use *-y instead of *-objs in Makefile
        usb: phy: tegra: Replace of_gpio.h by proper one
        usb: typec: ucsi: displayport: Fix potential deadlock
        usb: typec: qcom-pmic-typec: split HPD bridge alloc and registration
        usb: musc: Remove unused list 'buffers'
        usb: dwc3: Wait unconditionally after issuing EndXfer command
        usb: gadget: u_audio: Clear uac pointer when freed.
        usb: gadget: u_audio: Fix race condition use of controls after free during gadget unbind.
        dt-bindings: usb: dwc3: Add QDU1000 compatible
        usb: core: Remove the useless struct usb_devmap which is just a bitmap
        MAINTAINERS: Remove {ehci,uhci}-platform.c from ARM/VT8500 entry
        USB: usb_parse_endpoint: ignore reserved bits
        usb: xhci: compact 'trb_in_td()' arguments
        ...
      89601f67
    • Linus Torvalds's avatar
      Merge tag 'leds-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds · f3033eb7
      Linus Torvalds authored
      Pull LED updates from Lee Jones:
       "Core Frameworks:
         - Ensure seldom updated triggers have a brightness value before first
           update
      
        New Device Support:
         - Add support for Simatic IPC Device BX_59A to IPC LEDs Core
         - Add support for Qualcomm PMI8950 PWM to LPG Core
      
        New Functionality:
         - Add a bunch of new LED function identifiers
         - Add support for High Resolution Timers in LED Trigger Pattern
      
        Fix-ups:
         - Shift out Audio Trigger to the Sound subsystem
         - Convert suitable calls to devm_* managed resources
         - Device Tree binding adaptions/conversions/creation
         - Remove superfluous code/variables/attributes and simplify overall
         - Use/convert to new/better APIs/helpers/MACROs instead of
           hand-rolling implementations
      
        Bug Fixes:
         - Repair enabling Torch Mode from V4L2 on the second LED
         - Ensure PWM is disabled when suspending"
      
      * tag 'leds-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (28 commits)
        leds: mt6370: Remove unused field 'reg_cfgs' from 'struct mt6370_priv'
        leds: lp50xx: Remove unused field 'num_of_banked_leds' from 'struct lp50xx'
        leds: lp50xx: Remove unused field 'bank_modules' from 'struct lp50xx_led'
        leds: aat1290: Remove unused field 'torch_brightness' from 'struct aat1290_led'
        leds: sun50i-a100: Use match_string() helper to simplify the code
        leds: pwm: Disable PWM when going to suspend
        leds: trigger: pattern: Add support for hrtimer
        leds: mt6360: Fix the second LED can not enable torch mode by V4L2
        dt-bindings: leds: leds-qcom-lpg: Add support for PMI8950 PWM
        leds: qcom-lpg: Add support for PMI8950 PWM
        leds: apu: Remove duplicate DMI lookup data
        leds: trigger: netdev: Remove not needed call to led_set_brightness in deactivate
        dt-bindings: leds: Add LED_FUNCTION_SPEED_* for link speed on LAN/WAN
        dt-bindings: leds: Add LED_FUNCTION_MOBILE for mobile network
        leds: simatic-ipc-leds-gpio: Add support for module BX-59A
        dt-bindings: leds: qcom-lpg: Document PM6150L compatible
        dt-bindings: leds: pca963x: Convert text bindings to YAML
        leds: an30259a: Use devm_mutex_init() for mutex initialization
        leds: mlxreg: Use devm_mutex_init() for mutex initialization
        leds: nic78bx: Use devm API to cleanup module's resources
        ...
      f3033eb7
    • Linus Torvalds's avatar
      Merge tag 'backlight-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · 7eae27cd
      Linus Torvalds authored
      Pull backlight updates from Lee Jones:
       "Fix-ups:
         - FB Backlight interaction overhaul
         - Remove superfluous code and simplify overall
         - Constify various structs and struct attributes
      
        Bug Fixes:
         - Repair LED flickering
         - Fix signedness bugs"
      
      * tag 'backlight-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: (42 commits)
        backlight: sky81452-backlight: Remove unnecessary call to of_node_get()
        backlight: mp3309c: Fix LEDs flickering in PWM mode
        backlight: otm3225a: Drop driver owner assignment
        backlight: lp8788: Drop support for platform data
        backlight: lcd: Make lcd_class constant
        backlight: Make backlight_class constant
        backlight: mp3309c: Fix signedness bug in mp3309c_parse_fwnode()
        const_structs.checkpatch: add lcd_ops
        fbdev: omap: lcd_ams_delta: Constify lcd_ops
        fbdev: imx: Constify lcd_ops
        fbdev: clps711x: Constify lcd_ops
        HID: picoLCD: Constify lcd_ops
        backlight: tdo24m: Constify lcd_ops
        backlight: platform_lcd: Constify lcd_ops
        backlight: otm3225a: Constify lcd_ops
        backlight: ltv350qv: Constify lcd_ops
        backlight: lms501kf03: Constify lcd_ops
        backlight: lms283gf05: Constify lcd_ops
        backlight: l4f00242t03: Constify lcd_ops
        backlight: jornada720_lcd: Constify lcd_ops
        ...
      7eae27cd
    • Linus Torvalds's avatar
      Merge tag 'mfd-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · a85629f4
      Linus Torvalds authored
      Pull MFD updates from Lee Jones:
       "New Device Support:
         - Add support for X-Powers AXP717 PMIC to AXP22X
         - Add support for Rockchip RK816 PMIC to RK8XX
         - Add support for TI TPS65224 PMIC to TPS6594
      
        New Functionality:
         - Add Power Off functionality to Rohm BD71828
         - Allow I2C SMBus access in Renesas RSMU
      
        Fix-ups:
         - Device Tree binding adaptions/conversions/creation
         - Shift Intel support over to MSI interrupts
         - Generify adding platform data away from being ACPI specific
         - Use device core supplied attribute to register sysfs entries
         - Replace hand-rolled functionality with generic APIs
         - Utilise centrally provided helpers and macros
         - Clean-up error handling
         - Remove superfluous/duplicated/unused sections
         - Trivial; spelling, whitespace, coding-style adaptions
         - More Maple Tree conversions"
      
      * tag 'mfd-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (44 commits)
        dt-bindings: mfd: Use full path to other schemas
        mfd: rsmu: support I2C SMBus access
        dt-bindings: mfd: Convert lp873x.txt to json-schema
        dt-bindings: mfd: aspeed: Drop 'oneOf' for pinctrl node
        dt-bindings: mfd: allwinner,sun6i-a31-prcm: Use hyphens in node names
        mfd: ssbi: Remove unused field 'slave' from 'struct ssbi'
        mfd: kempld: Remove custom DMI matching code
        mfd: cs42l43: Update patching revision check
        dt-bindings: mfd: qcom: pm8xxx: Add pm8901 compatible
        mfd: timberdale: Remove redundant assignment to variable err
        dt-bindings: mfd: qcom,spmi-pmic: Add pbs to SPMI device types
        dt-bindings: mfd: syscon: Add ti,am62p-cpsw-mac-efuse compatible
        dt-bindings: mfd: qcom,tcsr: Add compatible for SDX75
        mfd: axp20x: Convert to use Maple Tree register cache
        mfd: bd71828: Remove commented code lines
        mfd: intel-m10-bmc: Change staging size to a variable
        dt-bindings: mfd: Add ROHM BD71879
        mfd: Tidy Kconfig dependency's parentheses
        mfd: ocelot-spi: Use spi_sync_transfer()
        dt-bindings: mfd: syscon: Add missing simple syscon compatibles
        ...
      a85629f4
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 0bfbc914
      Linus Torvalds authored
      Pull RISC-V updates from Palmer Dabbelt:
      
       - Add byte/half-word compare-and-exchange, emulated via LR/SC loops
      
       - Support for Rust
      
       - Support for Zihintpause in hwprobe
      
       - Add PR_RISCV_SET_ICACHE_FLUSH_CTX prctl()
      
       - Support lockless lockrefs
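
       The first item in the list above, byte/half-word compare-and-exchange
       emulated with a loop on the containing word, can be sketched in
       portable C11. This is an analogy under stated assumptions, not the
       RISC-V implementation: a C11 compare-exchange on a 32-bit word stands
       in for the LR/SC pair, and sketch_cmpxchg_byte() and sketch_demo()
       are hypothetical names. Byte lanes are numbered from the least
       significant byte so the sketch does not depend on host endianness.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compare-and-exchange one byte lane (0..3) of a 32-bit word.
 * Returns true on success. On failure, *expected receives the byte
 * that was actually observed.
 */
static bool sketch_cmpxchg_byte(_Atomic uint32_t *word, unsigned int lane,
				uint8_t *expected, uint8_t desired)
{
	unsigned int shift = lane * 8;
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t old = atomic_load(word);

	for (;;) {
		uint8_t cur = (uint8_t)((old & mask) >> shift);
		uint32_t next;

		if (cur != *expected) {
			*expected = cur;
			return false;
		}

		/* Splice the new byte into the word, leaving the other lanes alone. */
		next = (old & ~mask) | ((uint32_t)desired << shift);

		/* On failure 'old' is refreshed with the current word; retry. */
		if (atomic_compare_exchange_weak(word, &old, next))
			return true;
	}
}

/* Demo: try to replace lane 2 of 0x11223344 (0x22) with 0xaa. */
static uint32_t sketch_demo(uint8_t expected)
{
	_Atomic uint32_t word = 0x11223344u;

	sketch_cmpxchg_byte(&word, 2, &expected, 0xaa);
	return atomic_load(&word);
}
```

       The retry loop matters because a neighbouring byte in the same word
       may change between the load and the store; in that case only the
       word-wide exchange fails and the byte comparison is simply repeated
       against the fresh value.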
      
      * tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
        riscv: defconfig: Enable CONFIG_CLK_SOPHGO_CV1800
        riscv: select ARCH_HAS_FAST_MULTIPLIER
        riscv: mm: still create swiotlb buffer for kmalloc() bouncing if required
        riscv: Annotate pgtable_l{4,5}_enabled with __ro_after_init
        riscv: Remove redundant CONFIG_64BIT from pgtable_l{4,5}_enabled
        riscv: mm: Always use an ASID to flush mm contexts
        riscv: mm: Preserve global TLB entries when switching contexts
        riscv: mm: Make asid_bits a local variable
        riscv: mm: Use a fixed layout for the MM context ID
        riscv: mm: Introduce cntx2asid/cntx2version helper macros
        riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
        riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
        riscv: mm: Combine the SMP and UP TLB flush code
        riscv: Only send remote fences when some other CPU is online
        riscv: mm: Broadcast kernel TLB flushes only when needed
        riscv: Use IPIs for remote cache/TLB flushes by default
        riscv: Factor out page table TLB synchronization
        riscv: Flush the instruction cache during SMP bringup
        riscv: hwprobe: export Zihintpause ISA extension
        riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code
        ...
      0bfbc914
    • Linus Torvalds's avatar
      Merge tag 'loongarch-6.10' of... · 4f05e820
      Linus Torvalds authored
      Merge tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch updates from Huacai Chen:
      
       - Select some options in Kconfig
      
       - Give a chance to build with !CONFIG_SMP
      
       - Switch to use built-in rustc target
      
       - Add new supported device nodes to dts
      
       - Some bug fixes and other small changes
      
       - Update the default config file
      
      * tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: Update Loongson-3 default config file
        LoongArch: dts: Add new supported device nodes to Loongson-2K2000
        LoongArch: dts: Add new supported device nodes to Loongson-2K0500
        LoongArch: dts: Remove "disabled" state of clock controller node
        LoongArch: rust: Switch to use built-in rustc target
        LoongArch: Fix callchain parse error with kernel tracepoint events again
        LoongArch: Give a chance to build with !CONFIG_SMP
        LoongArch: Select THP_SWAP if HAVE_ARCH_TRANSPARENT_HUGEPAGE
        LoongArch: Select ARCH_WANT_DEFAULT_BPF_JIT
        LoongArch: Select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
        LoongArch: Select ARCH_HAS_FAST_MULTIPLIER
    • Merge tag 'microblaze-v6.10' of git://git.monstr.eu/linux-2.6-microblaze · f33fda22
      Linus Torvalds authored
      Pull microblaze updates from Michal Simek:
      
       - Cleanup code around removed early_printk
      
      * tag 'microblaze-v6.10' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: Remove early printk call from cpuinfo-static.c
        microblaze: Remove gcc flag for non existing early_printk.c file
    • Merge tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs · 0e22bedd
      Linus Torvalds authored
      Pull overlayfs updates from Miklos Szeredi:
      
       - Add tmpfile support
      
       - Clean up include
      
      * tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
        ovl: remove duplicate included header
        ovl: remove upper umask handling from ovl_create_upper()
        ovl: implement tmpfile
    • Merge tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 4f2d34b6
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi:
      
       - Add fs-verity support (Richard Fung)
      
       - Add multi-queue support to virtio-fs (Peter-Jan Gootzen)
      
       - Fix a bug in NOTIFY_RESEND handling (Hou Tao)
      
       - page -> folio cleanup (Matthew Wilcox)
      
      * tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        virtio-fs: add multi-queue support
        virtio-fs: limit number of request queues
        fuse: clear FR_SENT when re-adding requests into pending list
        fuse: set FR_PENDING atomically in fuse_resend()
        fuse: Add initial support for fs-verity
        fuse: Convert fuse_readpages_end() to use folio_end_read()
    • vfs: Delete the associated dentry when deleting a file · 681ce862
      Yafang Shao authored
      Our applications, built on Elasticsearch[0], frequently create and
      delete files.  These applications operate within containers, some with a
      memory limit exceeding 100GB.  Over prolonged periods, the accumulation
      of negative dentries within these containers can amount to tens of
      gigabytes.
      
      Upon container exit, directories are deleted.  However, due to the
      numerous associated dentries, this process can be time-consuming.  Our
      users have expressed frustration with this prolonged exit duration,
      which constitutes our first issue.
      
      Simultaneously, other processes may attempt to access the parent
      directory of the Elasticsearch directories.  Since the task responsible
      for deleting the dentries holds the inode lock, processes attempting
      directory lookup experience significant delays.  This issue, our second
      problem, is easily demonstrated:
      
        - Task 1 generates negative dentries:
        $ pwd
        ~/test
        $ mkdir es && cd es/ && ./create_and_delete_files.sh
      
        [ After generating tens of GB dentries ]
      
        $ cd ~/test && rm -rf es
      
        [ It will take a long duration to finish ]
      
        - Task 2 attempts to lookup the 'test/' directory
        $ pwd
        ~/test
        $ ls
      
        The 'ls' command in Task 2 experiences prolonged execution as Task 1
        is deleting the dentries.
      
      We've devised a solution to address both issues by deleting associated
      dentry when removing a file.  Interestingly, we've noted that a similar
      patch was proposed years ago[1], although it was rejected citing the
      absence of tangible issues caused by negative dentries.  Given our
      current challenges, we're resubmitting the proposal.  All relevant
      stakeholders from previous discussions have been included for reference.
      
      Some alternative solutions are also under discussion[2][3], such as
      shrinking child dentries outside of the parent inode lock or even
      asynchronously shrinking child dentries.  However, given the
      straightforward nature of the current solution, I believe this approach
      is still necessary.
      
      [ NOTE! This is a pretty fundamental change in how we deal with
        unlinking dentries, and it doesn't change the fact that you can have
        lots of negative dentries from just doing negative lookups.
      
        But the kernel test robot is at least initially happy with this from a
        performance angle, so I'm applying this ASAP just to get more testing
        and as a "known fix for an issue people hit in real life".
      
        Put another way: we should still look at the alternatives, and this
        patch may get reverted if somebody finds a performance regression on
        some other load.       - Linus ]
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Link: https://github.com/elastic/elasticsearch [0]
      Link: https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com [1]
      Link: https://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/ [2]
      Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/ [3]
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Wangkai <wangkai86@huawei.com>
      Cc: Colin Walters <walters@verbum.org>
      Tested-by: kernel test robot <oliver.sang@intel.com>
      Link: https://lore.kernel.org/all/202405221518.ecea2810-oliver.sang@intel.com/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
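      The commit message above runs ./create_and_delete_files.sh but does not
      include the script. The following is a minimal sketch of what such a
      script might look like, not the author's actual script. The function
      name churn_files, the file-N naming scheme and the count of 1000 are
      assumptions chosen for illustration. Each iteration creates a uniquely
      named file and unlinks it, which is the pattern the message describes
      as accumulating negative dentries.

```shell
#!/bin/sh
# Hypothetical stand-in for create_and_delete_files.sh (the real script is
# not part of the commit message).

# churn_files DIR COUNT
# Create and immediately unlink COUNT uniquely named files in DIR, then
# print how many files were processed.
churn_files() {
    dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        f="$dir/file-$i"
        : > "$f"      # create an empty file
        rm -f "$f"    # unlink it again
        i=$((i + 1))
    done
    echo "$i"
}

# Small demonstration in a throwaway directory (count chosen arbitrarily).
demo_dir=$(mktemp -d)
churned=$(churn_files "$demo_dir" 1000)
remaining=$(ls -A "$demo_dir" | wc -l | tr -d ' ')
rmdir "$demo_dir"
echo "created and deleted $churned files, $remaining left in the directory"
```

      To approach the scale described in the commit message, the count would
      have to be raised by several orders of magnitude. On kernels that
      expose it, the nr_negative field of /proc/sys/fs/dentry-state is one
      place to watch the number of negative dentries before and after a run.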
  4. 21 May, 2024 4 commits
    • Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of... · 29c73fc7
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools updates from Arnaldo Carvalho de Melo:
       "General:
      
         - Integrate the shellcheck utility with the build of perf to allow
           catching shell problems early in areas such as 'perf test', 'perf
           trace' scrape scripts, etc
      
         - Add 'uretprobe' variant in the 'perf bench uprobe' tool
      
         - Add script to run instances of 'perf script' in parallel
      
         - Allow parsing tracepoint names that start with digits, such as
           9p/9p_client_req, etc. Make sure 'perf test' tests it even on
           systems where those tracepoints aren't available
      
         - Add Kan Liang to MAINTAINERS as a perf tools reviewer
      
         - Add support for using the 'capstone' disassembler library in
           various tools, such as 'perf script' and 'perf annotate'. This is
           an alternative for the use of the 'xed' and 'objdump' disassemblers
      
        Data-type profiling improvements:
      
         - Resolve types for a->b->c by backtracking the assignments until it
           finds DWARF info for one of those members
      
         - Support for global variables, keeping a cache to speed up lookups
      
         - Handle the 'call' instruction, dealing with effects on registers
           and handling its return when tracking register data types
      
         - Handle x86's segment based addressing like %gs:0x28, to support
           things like per CPU variables, the stack canary, etc
      
         - Data-type profiling got big speedups when using capstone for
           disassembling. The objdump output parsing method is left as a
           fallback when capstone fails or isn't available. There are patches
           posted for 6.11 that use an LLVM disassembler
      
         - Support event group display in the TUI when annotating types with
           --data-type, for instance to show memory load and store events for
           the data type fields
      
         - Optimize the 'perf annotate' data structures, reducing memory usage
      
         - Add an initial 'perf test' for 'perf annotate', checking that a
           target symbol appears on the output, specifying objdump via the
           command line, etc
      
        Vendor Events:
      
         - Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand
           Ridge, Ice Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra
           Forest, Sky Lake X, Sky Lake and Snow Ridge X. Remove info metrics
           erroneously in TopdownL1
      
         - Add AMD's Zen 5 core and uncore events and metrics. Those come from
           the "Performance Monitor Counters for AMD Family 1Ah Model 00h-0Fh
           Processors" document, with events that capture information on op
           dispatch, execution and retirement, branch prediction, L1 and L2
           cache activity, TLB activity, etc
      
         - Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/
           AmpereOneX
      
        Miscellaneous:
      
         - Sync header copies with the kernel sources
      
         - Move some header copies used only for generating translation string
           tables for ioctl cmds and other syscall integer arguments to a new
           directory under tools/perf/beauty/, to separate from copies in
           tools/include/ that are used to build the tools
      
         - Introduce scrape script for several syscall 'flags'/'mask'
           arguments
      
         - Improve cpumap utilization, fixing up pairing of refcounts, using
           the right iterators (perf_cpu_map__for_each_cpu), etc
      
         - Give more details about raw event encodings in 'perf list', show
           tracepoint encoding in the detailed output
      
         - Refactor the DSOs handling code, reducing memory usage
      
         - Document the BPF event modifier and add a 'perf test' for it
      
         - Improve the event parser, better error messages and add further
           'perf test's for it
      
         - Add reference count checking to 'struct comm_str' and 'struct
           mem_info'
      
         - Make ARM64's 'perf test' entries for the Neoverse N1 more robust
      
         - Tweak the ARM64's Coresight 'perf test's
      
         - Improve ARM64's CoreSight ETM version detection and error reporting
      
         - Fix handling of symbols when using kcore
      
         - Fix PAI (Processor Activity Instrumentation) counter names for s390
           virtual machines in 'perf report'
      
         - Fix -g/--call-graph option failure in 'perf sched timehist'
      
         - Add LIBTRACEEVENT_DIR build option to allow building with
           libtraceevent installed in non-standard directories, such as when
           doing cross builds
      
         - Various 'perf test' and 'perf bench' fixes
      
         - Improve 'perf probe' error message for long C++ probe names"
      
      * tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (260 commits)
        tools lib subcmd: Show parent options in help
        perf pmu: Count sys and cpuid JSON events separately
        perf stat: Don't display metric header for non-leader uncore events
        perf annotate-data: Ensure the number of type histograms
        perf annotate: Fix segfault on sample histogram
        perf daemon: Fix file leak in daemon_session__control
        libsubcmd: Fix parse-options memory leak
        perf lock: Avoid memory leaks from strdup()
        perf sched: Rename 'switches' column header to 'count' and add usage description, options for latency
        perf tools: Ignore deleted cgroups
        perf parse: Allow tracepoint names to start with digits
        perf parse-events: Add new 'fake_tp' parameter for tests
        perf parse-events: pass parse_state to add_tracepoint
        perf symbols: Fix ownership of string in dso__load_vmlinux()
        perf symbols: Update kcore map before merging in remaining symbols
        perf maps: Re-use __maps__free_maps_by_name()
        perf symbols: Remove map from list before updating addresses
        perf tracepoint: Don't scan all tracepoints to test if one exists
        perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORT
        perf thread: Fixes to thread__new() related to initializing comm
        ...
    • Merge tag 'bitmap-for-6.10v2' of https://github.com/norov/linux · 4865a27c
      Linus Torvalds authored
      Pull bitmap updates from Yury Norov:
      
       - topology_span_sane() optimization from Kyle Meyer
      
       - fns() rework from Kuan-Wei Chiu (used in cpumask_local_spread() and
         other places)
      
       - headers cleanup from Andy
      
       - add a MAINTAINERS record for bitops API
      
      * tag 'bitmap-for-6.10v2' of https://github.com/norov/linux:
        usercopy: Don't use "proxy" headers
        bitops: Move aligned_byte_mask() to wordpart.h
        MAINTAINERS: add BITOPS API record
        bitmap: relax find_nth_bit() limitation on return value
        lib: make test_bitops compilable into the kernel image
        bitops: Optimize fns() for improved performance
        lib/test_bitops: Add benchmark test for fns()
        Compiler Attributes: Add __always_used macro
        sched/topology: Optimize topology_span_sane()
        cpumask: Add for_each_cpu_from()
    • Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · b6394d6f
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted commits that had missed the last merge window..."
      
      * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        remove call_{read,write}_iter() functions
        do_dentry_open(): kill inode argument
        kernel_file_open(): get rid of inode argument
        get_file_rcu(): no need to check for NULL separately
        fd_is_open(): move to fs/file.c
        close_on_exec(): pass files_struct instead of fdtable
    • Merge tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 3413efa8
      Linus Torvalds authored
      Pull bdev flags update from Al Viro:
       "Compactifying bdev flags.
      
        We can easily have up to 24 flags with sane atomicity, _without_
        pushing anything out of the first cacheline of struct block_device"
      
      * tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        bdev: move ->bd_make_it_fail to ->__bd_flags
        bdev: move ->bd_ro_warned to ->__bd_flags
        bdev: move ->bd_has_subit_bio to ->__bd_flags
        bdev: move ->bd_write_holder into ->__bd_flags
        bdev: move ->bd_read_only to ->__bd_flags
        bdev: infrastructure for flags
        wrapper for access to ->bd_partno
        Use bdev_is_paritition() instead of open-coding it