1. 04 Jul, 2024 5 commits
    • mm/damon/core: merge regions aggressively when max_nr_regions is unmet · 310d6c15
      SeongJae Park authored
      DAMON keeps the number of regions under max_nr_regions by skipping region
      split operations when doing so can make the number higher than the limit.
      It works well for preventing violation of the limit.  But, if somehow the
      violation happens, it cannot recover well depending on the situation.  In
      detail, if the real number of regions having different access pattern is
      higher than the limit, the mechanism cannot reduce the number below the
      limit.  In such a case, the system could suffer from high monitoring
      overhead of DAMON.
      
      The violation can actually happen.  For example, the user could reduce
      max_nr_regions while DAMON is running, to be lower than the current number
      of regions.  Fix the problem by repeating the merge operations with
      increasing aggressiveness in kdamond_merge_regions() for the case, until
      the limit is met.
      
      [sj@kernel.org: increase regions merge aggressiveness while respecting min_nr_regions]
        Link: https://lkml.kernel.org/r/20240626164753.46270-1-sj@kernel.org
      [sj@kernel.org: ensure max threshold attempt for max_nr_regions violation]
        Link: https://lkml.kernel.org/r/20240627163153.75969-1-sj@kernel.org
      Link: https://lkml.kernel.org/r/20240624175814.89611-1-sj@kernel.org
      Fixes: b9a6ac4e ("mm/damon: adaptively adjust regions")
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: <stable@vger.kernel.org>	[5.15+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
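
The retry idea described in the DAMON commit above can be sketched in user-space C. This is a minimal sketch under stated assumptions, not the kernel code: `struct region`, `merge_pass()` and `merge_until_limit()` are hypothetical names, regions are assumed contiguous and non-empty, and the doubling of the threshold is one plausible reading of "increasing aggressiveness".

```c
#include <stddef.h>

/* Hypothetical, simplified region: [start, end) plus an access counter. */
struct region {
	unsigned long start;
	unsigned long end;
	unsigned int nr_accesses;
};

/*
 * One merge pass: fold each region into its left neighbour when their
 * access counts differ by at most `threshold`.  The merged counter is the
 * size-weighted average.  Returns the new number of regions.
 */
static size_t merge_pass(struct region *r, size_t n, unsigned int threshold)
{
	size_t out = 0;

	if (n == 0)
		return 0;
	for (size_t i = 1; i < n; i++) {
		unsigned int a = r[out].nr_accesses, b = r[i].nr_accesses;
		unsigned int diff = a > b ? a - b : b - a;

		if (diff <= threshold) {
			unsigned long sz_a = r[out].end - r[out].start;
			unsigned long sz_b = r[i].end - r[i].start;

			r[out].nr_accesses =
				(unsigned int)((a * sz_a + b * sz_b) / (sz_a + sz_b));
			r[out].end = r[i].end;
		} else {
			r[++out] = r[i];
		}
	}
	return out + 1;
}

/*
 * Repeat the merge with a doubled threshold until the region count fits
 * under max_nr_regions or the threshold has reached max_threshold.
 */
size_t merge_until_limit(struct region *r, size_t n, unsigned int threshold,
			 unsigned int max_threshold, size_t max_nr_regions)
{
	do {
		n = merge_pass(r, n, threshold);
		threshold = threshold ? threshold * 2 : 1;
	} while (n > max_nr_regions && threshold / 2 < max_threshold);
	return n;
}
```

With four equally sized regions whose counters are 0, 10, 20 and 30, a starting threshold of 1 and a limit of two regions, the loop keeps doubling the threshold until a pass at threshold 16 collapses the first three regions into one.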
    • Fix userfaultfd_api to return EINVAL as expected · 1723f04c
      Audra Mitchell authored
      Currently if we request a feature that is not set in the Kernel config we
      fail silently and return all the available features.  However, the man
      page indicates we should return an EINVAL.
      
      We need to fix this issue since we can end up with a Kernel warning should
      a program request the feature UFFD_FEATURE_WP_UNPOPULATED on a kernel whose
      config does not enable this feature.
      
       [  200.812896] WARNING: CPU: 91 PID: 13634 at mm/memory.c:1660 zap_pte_range+0x43d/0x660
       [  200.820738] Modules linked in:
       [  200.869387] CPU: 91 PID: 13634 Comm: userfaultfd Kdump: loaded Not tainted 6.9.0-rc5+ #8
       [  200.877477] Hardware name: Dell Inc. PowerEdge R6525/0N7YGH, BIOS 2.7.3 03/30/2022
       [  200.885052] RIP: 0010:zap_pte_range+0x43d/0x660
      
      Link: https://lkml.kernel.org/r/20240626130513.120193-1-audra@redhat.com
      Fixes: e06f1e1d ("userfaultfd: wp: enabled write protection in userfaultfd API")
      Signed-off-by: Audra Mitchell <audra@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Rafael Aquini <raquini@redhat.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: vmalloc: check if a hash-index is in cpu_possible_mask · a34acf30
      Uladzislau Rezki (Sony) authored
      The problem is that there are systems where cpu_possible_mask has gaps
      between set CPUs, for example SPARC.  In this scenario the addr_to_vb_xa()
      hash function can return an index that, via the per_cpu() macro, accesses
      the area of a CPU that is not possible and not set up.  This results in an
      oops on SPARC.
      
      A per-cpu vmap_block_queue is also used as a hash table, incorrectly
      assuming the cpu_possible_mask has no gaps.  Fix it by adjusting the index
      to the next possible CPU.
      
      Link: https://lkml.kernel.org/r/20240626140330.89836-1-urezki@gmail.com
      Fixes: 062eacf5 ("mm: vmalloc: remove a global vmap_blocks xarray")
      Reported-by: Nick Bowler <nbowler@draconx.ca>
      Closes: https://lore.kernel.org/linux-kernel/ZntjIE6msJbF8zTa@MiWiFi-R3L-srv/T/
      Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Hailong.Liu <hailong.liu@oppo.com>
      Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
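
The index adjustment described in the vmalloc commit above can be illustrated with a small user-space sketch. It assumes a hypothetical 8-CPU system whose possible mask is a plain bitmask; `NR_CPU_IDS`, `next_possible_index()` and `hash_to_possible_cpu()` are illustrative names, and the wrap-around at the end of the mask is an assumption of this sketch rather than something the commit message states.

```c
#include <stdint.h>

#define NR_CPU_IDS 8u	/* hypothetical number of CPU ids */

/* Return non-zero when bit `cpu` is set in the possible mask. */
static int cpu_is_possible(uint32_t mask, unsigned int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Return the first possible CPU at or after `idx`, wrapping around the
 * mask.  Returns -1 when no CPU is possible at all.
 */
int next_possible_index(uint32_t possible_mask, unsigned int idx)
{
	for (unsigned int step = 0; step < NR_CPU_IDS; step++) {
		unsigned int cpu = (idx + step) % NR_CPU_IDS;

		if (cpu_is_possible(possible_mask, cpu))
			return (int)cpu;
	}
	return -1;
}

/* Hash an address to a per-CPU slot that is guaranteed to be possible. */
int hash_to_possible_cpu(uint32_t possible_mask, unsigned long addr,
			 unsigned long block_size)
{
	unsigned int idx = (unsigned int)((addr / block_size) % NR_CPU_IDS);

	return next_possible_index(possible_mask, idx);
}
```

With a mask of 0x25 (CPUs 0, 2 and 5 possible), a raw hash index of 3 lands in a gap and is moved forward to CPU 5, while an index of 6 wraps around to CPU 0.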
    • mm: prevent derefencing NULL ptr in pfn_section_valid() · 82f0b6f0
      Waiman Long authored
      Commit 5ec8e8ea ("mm/sparsemem: fix race in accessing
      memory_section->usage") changed pfn_section_valid() to add a READ_ONCE()
      call around "ms->usage" to fix a race with section_deactivate() where
      ms->usage can be cleared.  The READ_ONCE() call, by itself, is not enough
      to prevent NULL pointer dereference.  We need to check its value before
      dereferencing it.
      
      Link: https://lkml.kernel.org/r/20240626001639.1350646-1-longman@redhat.com
      Fixes: 5ec8e8ea ("mm/sparsemem: fix race in accessing memory_section->usage")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Charan Teja Kalla <quic_charante@quicinc.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
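
The point made in the pfn_section_valid() commit above, that a single untorn load still needs a NULL check before the dereference, can be sketched as follows. The structures and function names here are simplified stand-ins, not the kernel definitions, and the volatile load only approximates what READ_ONCE() does.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumed layout). */
struct section_usage {
	unsigned long subsection_map;
};

struct mem_section {
	struct section_usage *usage;
};

/* Load the pointer exactly once through a volatile access. */
static struct section_usage *load_usage_once(struct mem_section *ms)
{
	return *(struct section_usage * volatile *)&ms->usage;
}

/*
 * Return 1 when bit `idx` is set in the subsection map, 0 otherwise.
 * The single load prevents the compiler from re-reading ms->usage, but
 * the value may already be NULL, so it is checked before being used.
 * `idx` must be smaller than the bit width of unsigned long.
 */
int section_bit_valid(struct mem_section *ms, unsigned int idx)
{
	struct section_usage *usage = load_usage_once(ms);

	if (!usage)
		return 0;
	return (int)((usage->subsection_map >> idx) & 1UL);
}
```

A section whose usage pointer has been cleared simply reports "not valid" instead of faulting.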
    • mm: page_ref: remove folio_try_get_rcu() · fa2690af
      Yang Shi authored
      The below bug was reported on a non-SMP kernel:
      
      [  275.267158][ T4335] ------------[ cut here ]------------
      [  275.267949][ T4335] kernel BUG at include/linux/page_ref.h:275!
      [  275.268526][ T4335] invalid opcode: 0000 [#1] KASAN PTI
      [  275.269001][ T4335] CPU: 0 PID: 4335 Comm: trinity-c3 Not tainted 6.7.0-rc4-00061-gefa7df3e #1
      [  275.269787][ T4335] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
      [  275.270679][ T4335] RIP: 0010:try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
      [  275.272813][ T4335] RSP: 0018:ffffc90005dcf650 EFLAGS: 00010202
      [  275.273346][ T4335] RAX: 0000000000000246 RBX: ffffea00066e0000 RCX: 0000000000000000
      [  275.274032][ T4335] RDX: fffff94000cdc007 RSI: 0000000000000004 RDI: ffffea00066e0034
      [  275.274719][ T4335] RBP: ffffea00066e0000 R08: 0000000000000000 R09: fffff94000cdc006
      [  275.275404][ T4335] R10: ffffea00066e0037 R11: 0000000000000000 R12: 0000000000000136
      [  275.276106][ T4335] R13: ffffea00066e0034 R14: dffffc0000000000 R15: ffffea00066e0008
      [  275.276790][ T4335] FS:  00007fa2f9b61740(0000) GS:ffffffff89d0d000(0000) knlGS:0000000000000000
      [  275.277570][ T4335] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  275.278143][ T4335] CR2: 00007fa2f6c00000 CR3: 0000000134b04000 CR4: 00000000000406f0
      [  275.278833][ T4335] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  275.279521][ T4335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  275.280201][ T4335] Call Trace:
      [  275.280499][ T4335]  <TASK>
      [ 275.280751][ T4335] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
      [ 275.281087][ T4335] ? do_trap (arch/x86/kernel/traps.c:112 arch/x86/kernel/traps.c:153)
      [ 275.281463][ T4335] ? try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
      [ 275.281884][ T4335] ? try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
      [ 275.282300][ T4335] ? do_error_trap (arch/x86/kernel/traps.c:174)
      [ 275.282711][ T4335] ? try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
      [ 275.283129][ T4335] ? handle_invalid_op (arch/x86/kernel/traps.c:212)
      [ 275.283561][ T4335] ? try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
      [ 275.283990][ T4335] ? exc_invalid_op (arch/x86/kernel/traps.c:264)
      [ 275.284415][ T4335] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:568)
      [ 275.284859][ T4335] ? try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
      [ 275.285278][ T4335] try_grab_folio (mm/gup.c:148)
      [ 275.285684][ T4335] __get_user_pages (mm/gup.c:1297 (discriminator 1))
      [ 275.286111][ T4335] ? __pfx___get_user_pages (mm/gup.c:1188)
      [ 275.286579][ T4335] ? __pfx_validate_chain (kernel/locking/lockdep.c:3825)
      [ 275.287034][ T4335] ? mark_lock (kernel/locking/lockdep.c:4656 (discriminator 1))
      [ 275.287416][ T4335] __gup_longterm_locked (mm/gup.c:1509 mm/gup.c:2209)
      [ 275.288192][ T4335] ? __pfx___gup_longterm_locked (mm/gup.c:2204)
      [ 275.288697][ T4335] ? __pfx_lock_acquire (kernel/locking/lockdep.c:5722)
      [ 275.289135][ T4335] ? __pfx___might_resched (kernel/sched/core.c:10106)
      [ 275.289595][ T4335] pin_user_pages_remote (mm/gup.c:3350)
      [ 275.290041][ T4335] ? __pfx_pin_user_pages_remote (mm/gup.c:3350)
      [ 275.290545][ T4335] ? find_held_lock (kernel/locking/lockdep.c:5244 (discriminator 1))
      [ 275.290961][ T4335] ? mm_access (kernel/fork.c:1573)
      [ 275.291353][ T4335] process_vm_rw_single_vec+0x142/0x360
      [ 275.291900][ T4335] ? __pfx_process_vm_rw_single_vec+0x10/0x10
      [ 275.292471][ T4335] ? mm_access (kernel/fork.c:1573)
      [ 275.292859][ T4335] process_vm_rw_core+0x272/0x4e0
      [ 275.293384][ T4335] ? hlock_class (arch/x86/include/asm/bitops.h:227 arch/x86/include/asm/bitops.h:239 include/asm-generic/bitops/instrumented-non-atomic.h:142 kernel/locking/lockdep.c:228)
      [ 275.293780][ T4335] ? __pfx_process_vm_rw_core+0x10/0x10
      [ 275.294350][ T4335] process_vm_rw (mm/process_vm_access.c:284)
      [ 275.294748][ T4335] ? __pfx_process_vm_rw (mm/process_vm_access.c:259)
      [ 275.295197][ T4335] ? __task_pid_nr_ns (include/linux/rcupdate.h:306 (discriminator 1) include/linux/rcupdate.h:780 (discriminator 1) kernel/pid.c:504 (discriminator 1))
      [ 275.295634][ T4335] __x64_sys_process_vm_readv (mm/process_vm_access.c:291)
      [ 275.296139][ T4335] ? syscall_enter_from_user_mode (kernel/entry/common.c:94 kernel/entry/common.c:112)
      [ 275.296642][ T4335] do_syscall_64 (arch/x86/entry/common.c:51 (discriminator 1) arch/x86/entry/common.c:82 (discriminator 1))
      [ 275.297032][ T4335] ? __task_pid_nr_ns (include/linux/rcupdate.h:306 (discriminator 1) include/linux/rcupdate.h:780 (discriminator 1) kernel/pid.c:504 (discriminator 1))
      [ 275.297470][ T4335] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4300 kernel/locking/lockdep.c:4359)
      [ 275.297988][ T4335] ? do_syscall_64 (arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:97)
      [ 275.298389][ T4335] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4300 kernel/locking/lockdep.c:4359)
      [ 275.298906][ T4335] ? do_syscall_64 (arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:97)
      [ 275.299304][ T4335] ? do_syscall_64 (arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:97)
      [ 275.299703][ T4335] ? do_syscall_64 (arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:97)
      [ 275.300115][ T4335] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
      
      This BUG is the VM_BUG_ON(!in_atomic() && !irqs_disabled()) assertion in
      folio_ref_try_add_rcu() for non-SMP kernel.
      
      process_vm_readv() calls GUP to pin the THP.  An optimization for
      pinning THP introduced by commit 57edfcfd ("mm/gup: accelerate thp
      gup even for "pages != NULL"") calls try_grab_folio() to pin the THP,
      but try_grab_folio() is supposed to be called in atomic context on
      non-SMP kernels, for example with irqs disabled or preemption disabled,
      due to the optimization introduced by commit e286781d ("mm: speculative
      page references").
      
      Commit efa7df3e ("mm: align larger anonymous mappings on THP
      boundaries") is not actually the root cause, although the problem was
      bisected to it.  It just makes the problem more likely to be exposed.
      
      The follow-up discussion suggested the optimization for non-SMP kernels
      may be outdated and not worth it anymore [1].  So remove the
      optimization to silence the BUG.
      
      However, calling try_grab_folio() in the GUP slow path is actually
      unnecessary, so the following patch will clean this up.
      
      [1] https://lore.kernel.org/linux-mm/821cf1d6-92b9-4ac4-bacc-d8f2364ac14f@paulmck-laptop/
      
      Link: https://lkml.kernel.org/r/20240625205350.1777481-1-yang@os.amperecomputing.com
      Fixes: 57edfcfd ("mm/gup: accelerate thp gup even for "pages != NULL"")
      Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Tested-by: Oliver Sang <oliver.sang@intel.com>
      Acked-by: Peter Xu <peterx@redhat.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
      Cc: <stable@vger.kernel.org>	[6.6+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. 03 Jul, 2024 6 commits
    • nilfs2: fix incorrect inode allocation from reserved inodes · 93aef9ed
      Ryusuke Konishi authored
      If the bitmap block that manages the inode allocation status is corrupted,
      nilfs_ifile_create_inode() may allocate a new inode from the reserved
      inode area where it should not be allocated.
      
      A previous fix, commit d325dc6e ("nilfs2: fix use-after-free bug of
      struct nilfs_root"), fixed the problem that reserved inodes with inode
      numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to
      bitmap corruption.  However, the start number of non-reserved inodes is
      read from the super block and may change, in which case inode allocation
      may occur from the extended reserved inode area.
      
      If that happens, access to that inode will cause an IO error, causing the
      file system to degrade to an error state.
      
      Fix this potential issue by adding a wraparound option to the common
      metadata object allocation routine and by modifying
      nilfs_ifile_create_inode() to disable the option so that it only allocates
      inodes with inode numbers greater than or equal to the inode number read
      in "nilfs->ns_first_ino", regardless of the bitmap status of reserved
      inodes.
      
      Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: add missing check for inode numbers on directory entries · bb76c6c2
      Ryusuke Konishi authored
      Syzbot reported that mounting and unmounting a specific pattern of
      corrupted nilfs2 filesystem images causes a use-after-free of metadata
      file inodes, which triggers a kernel bug in lru_add_fn().
      
      As Jan Kara pointed out, this is because the link count of a metadata file
      gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(),
      tries to delete that inode (ifile inode in this case).
      
      The inconsistency occurs because directories containing the inode numbers
      of these metadata files that should not be visible in the namespace are
      read without checking.
      
      Fix this issue by treating the inode numbers of these internal files as
      errors in the sanity check helper when reading directory folios/pages.
      
      Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer
      analysis.
      
      Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+d79afb004be235636ee8@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8
      Reported-by: Jan Kara <jack@suse.cz>
      Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: fix inode number range checks · e2fec219
      Ryusuke Konishi authored
      Patch series "nilfs2: fix potential issues related to reserved inodes".
      
      This series fixes one use-after-free issue reported by syzbot, caused by
      nilfs2's internal inode being exposed in the namespace on a corrupted
      filesystem, and a couple of flaws that cause problems if the starting
      number of non-reserved inodes written in the on-disk super block is
      intentionally (or corruptly) changed from its default value.  
      
      
      This patch (of 3):
      
      In the current implementation of nilfs2, "nilfs->ns_first_ino", which
      gives the first non-reserved inode number, is read from the superblock,
      but its lower limit is not checked.
      
      As a result, if a number that overlaps with the inode number range of
      reserved inodes such as the root directory or metadata files is set in the
      super block parameter, the inode number test macros (NILFS_MDT_INODE and
      NILFS_VALID_INODE) will not function properly.
      
      In addition, these test macros use left bit-shift calculations with
      the inode number as the shift count via the BIT macro, but the result of a
      shift calculation that exceeds the bit width of an integer is undefined in
      the C specification, so if "ns_first_ino" is set to a large value other
      than the default value NILFS_USER_INO (=11), the macros may potentially
      malfunction depending on the environment.
      
      Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and
      by preventing bit shifts equal to or greater than the NILFS_USER_INO
      constant in the inode number test macros.
      
      Also, change the type of "ns_first_ino" from signed integer to unsigned
      integer to avoid the need for type casting in comparisons such as the
      lower bound check introduced this time.
      
      Link: https://lkml.kernel.org/r/20240623051135.4180-1-konishi.ryusuke@gmail.com
      Link: https://lkml.kernel.org/r/20240623051135.4180-2-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
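
The two checks described in the nilfs2 range-check commit above (a lower bound on the first non-reserved inode number, and a shift that is only evaluated for small inode numbers) can be sketched like this. `USER_INO` mirrors the NILFS_USER_INO default of 11 quoted in the message; `RESERVED_MDT_BITS` and both function names are hypothetical and do not reproduce the real nilfs2 bit layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define USER_INO 11u	/* default first non-reserved inode number */

/* Hypothetical bitmap of reserved inode numbers that are metadata files. */
#define RESERVED_MDT_BITS ((1u << 3) | (1u << 4) | (1u << 5) | (1u << 6))

/*
 * The range test comes first, so the shift count is always below
 * USER_INO and never reaches the bit width of the shifted integer.
 */
bool is_mdt_inode(uint64_t ino)
{
	return ino < USER_INO && (RESERVED_MDT_BITS & (1u << ino)) != 0;
}

/* Reject a first-inode value that overlaps the reserved range. */
bool first_ino_ok(uint64_t first_ino)
{
	return first_ino >= USER_INO;
}
```

Because `&&` short-circuits, an inode number such as 64 or 2^40 is rejected by the comparison and the shift is never executed.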
    • mm: avoid overflows in dirty throttling logic · 385d838d
      Jan Kara authored
      The dirty throttling logic is interspersed with assumptions that dirty
      limits in PAGE_SIZE units fit into 32 bits (so that various multiplications
      fit into 64 bits).  If limits end up being larger, we will hit overflows,
      possible divisions by 0 etc.  Fix these problems by never allowing such
      large dirty limits, as they have dubious practical value anyway.  For the
      dirty_bytes / dirty_background_bytes interfaces we can just refuse to set
      such large limits.  For dirty_ratio / dirty_background_ratio it isn't so
      simple as the dirty limit is computed from the amount of available memory
      which can change due to memory hotplug etc.  So when converting dirty
      limits from ratios to numbers of pages, we just don't allow the result to
      exceed UINT_MAX.
      
      This is a root-only triggerable problem which occurs when the operator
      sets dirty limits to >16 TB.
      
      Link: https://lkml.kernel.org/r/20240621144246.11148-2-jack@suse.cz
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reported-by: Zach O'Keefe <zokeefe@google.com>
      Reviewed-By: Zach O'Keefe <zokeefe@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
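
The two strategies in the dirty-throttling commit above, refusing oversized byte limits and clamping ratio-derived limits, can be sketched in user space. The 4 KiB page size, the function names and the plain -1 error return are assumptions of this sketch, not the kernel interface.

```c
#include <limits.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ull	/* assumed page size */

/*
 * Convert a percentage of available memory into a page count and clamp
 * the result so it always fits into 32 bits.
 */
uint64_t ratio_to_pages(uint64_t available_pages, unsigned int ratio_percent)
{
	/* Split the computation so the intermediate product cannot overflow. */
	uint64_t pages = available_pages / 100 * ratio_percent +
			 available_pages % 100 * ratio_percent / 100;

	if (pages > UINT_MAX)
		pages = UINT_MAX;
	return pages;
}

/*
 * Byte-based interface: round up to pages and refuse limits that would
 * not fit into 32 bits.  Returns 0 on success, -1 when refused.
 */
int set_dirty_bytes(uint64_t bytes, uint64_t *limit_pages)
{
	uint64_t pages = bytes / PAGE_SIZE_BYTES +
			 (bytes % PAGE_SIZE_BYTES != 0);

	if (pages > UINT_MAX)
		return -1;
	*limit_pages = pages;
	return 0;
}
```

With 4 KiB pages the refusal threshold sits at 2^32 pages, which is the 16 TB figure mentioned in the commit message.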
    • Revert "mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again" · 30139c70
      Jan Kara authored
      Patch series "mm: Avoid possible overflows in dirty throttling".
      
      Dirty throttling logic assumes dirty limits in page units fit into
      32-bits.  This patch series makes sure this is true (see patch 2/2 for
      more details).
      
      
      This patch (of 2):
      
      This reverts commit 9319b647.
      
      The commit is broken in several ways.  Firstly, the removed (u64) cast
      from the multiplication will introduce a multiplication overflow on 32-bit
      archs if wb_thresh * bg_thresh >= 1<<32 (which is actually common - the
      default settings with 4GB of RAM will trigger this).  Secondly, the
      div64_u64() is unnecessarily expensive on 32-bit archs.  We have
      div64_ul() in case we want to be safe & cheap.  Thirdly, if dirty
      thresholds are larger than 1<<32 pages, then dirty balancing is going to
      blow up in many other spectacular ways anyway so trying to fix one
      possible overflow is just moot.
      
      Link: https://lkml.kernel.org/r/20240621144017.30993-1-jack@suse.cz
      Link: https://lkml.kernel.org/r/20240621144246.11148-1-jack@suse.cz
      Fixes: 9319b647 ("mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again")
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-By: Zach O'Keefe <zokeefe@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
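
The first objection in the revert above, that dropping the (u64) cast lets the product wrap on 32-bit architectures, can be reproduced in a small sketch. Here `uint32_t` stands in for a 32-bit `unsigned long`, and the function names and the scaling formula `wb_thresh * bg_thresh / thresh` are illustrative assumptions rather than the exact kernel expression. `thresh` must be non-zero.

```c
#include <stdint.h>

/* Product is computed in 32 bits and wraps before the division. */
uint32_t scale_wrapping(uint32_t wb_thresh, uint32_t bg_thresh,
			uint32_t thresh)
{
	return wb_thresh * bg_thresh / thresh;
}

/* Widening one operand first keeps the full 64-bit product. */
uint32_t scale_widened(uint32_t wb_thresh, uint32_t bg_thresh,
		       uint32_t thresh)
{
	return (uint32_t)((uint64_t)wb_thresh * bg_thresh / thresh);
}
```

For thresholds of 100000 and 50000 pages the true product is 5 * 10^9, which exceeds 2^32, so the 32-bit version yields a far smaller value than the widened one.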
    • mm: optimize the redundant loop of mm_update_owner_next() · cf3f9a59
      Jinliang Zheng authored
      When mm_update_owner_next() is racing with swapoff (try_to_unuse()) or
      /proc or ptrace or page migration (get_task_mm()), it is impossible to
      find an appropriate task_struct in the loop whose mm_struct is the same as
      the target mm_struct.
      
      If the above race condition is combined with the stress-ng-zombie and
      stress-ng-dup tests, such a long loop can easily cause a Hard Lockup in
      write_lock_irq() for tasklist_lock.
      
      Recognize this situation in advance and exit early.
      
      Link: https://lkml.kernel.org/r/20240620122123.3877432-1-alexjlzheng@tencent.com
      Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Mateusz Guzik <mjguzik@gmail.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tycho Andersen <tandersen@netflix.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  3. 30 Jun, 2024 16 commits
    • Linux 6.10-rc6 · 22a40d14
      Linus Torvalds authored
    • Merge tag 'ata-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux · aca7c377
      Linus Torvalds authored
      Pull ata fixes from Niklas Cassel:
      
       - Add NOLPM quirk for all Crucial BX SSD1 models.
      
         Considering that we now have had bug reports for 3 different BX SSD1
         variants from Crucial with the same product name, make the quirk more
         inclusive, to catch more device models from the same generation.
      
       - Fix a trivial NULL pointer dereference in the error path for
         ata_host_release().
      
       - Create an ata_port_free(), so that we don't miss freeing ata_port
         struct members when freeing a struct ata_port.
      
       - Fix a trivial double free in the error path for ata_host_alloc().
      
       - Ensure that we remove the libata "remapped NVMe device count" sysfs
         entry on .probe() error.
      
      * tag 'ata-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
        ata: ahci: Clean up sysfs file on error
        ata: libata-core: Fix double free on error
        ata,scsi: libata-core: Do not leak memory for ata_port struct members
        ata: libata-core: Fix null pointer dereference on error
        ata: libata-core: Add ATA_HORKAGE_NOLPM for all Crucial BX SSD1 models
    • ata: ahci: Clean up sysfs file on error · eeb25a09
      Niklas Cassel authored
      .probe() (ahci_init_one()) calls sysfs_add_file_to_group(), however,
      if probe() fails after this call, we currently never call
      sysfs_remove_file_from_group().
      
      (The sysfs_remove_file_from_group() call in .remove() (ahci_remove_one())
      does not help, as .remove() is not called on .probe() error.)
      
      Thus, if probe() fails after the sysfs_add_file_to_group() call, the next
      time we insmod the module we will get:
      
      sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:04.0/remapped_nvme'
      CPU: 11 PID: 954 Comm: modprobe Not tainted 6.10.0-rc5 #43
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x5d/0x80
       sysfs_warn_dup.cold+0x17/0x23
       sysfs_add_file_mode_ns+0x11a/0x130
       sysfs_add_file_to_group+0x7e/0xc0
       ahci_init_one+0x31f/0xd40 [ahci]
      
      Fixes: 894fba7f ("ata: ahci: Add sysfs attribute to show remapped NVMe device count")
      Cc: stable@vger.kernel.org
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Link: https://lore.kernel.org/r/20240629124210.181537-10-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • ata: libata-core: Fix double free on error · ab9e0c52
      Niklas Cassel authored
      If e.g. the ata_port_alloc() call in ata_host_alloc() fails, we will jump
      to the err_out label, which will call devres_release_group().
      devres_release_group() will trigger a call to ata_host_release().
      ata_host_release() calls kfree(host), so executing the kfree(host) in
      ata_host_alloc() will lead to a double free:
      
      kernel BUG at mm/slub.c:553!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 11 PID: 599 Comm: (udev-worker) Not tainted 6.10.0-rc5 #47
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
      RIP: 0010:kfree+0x2cf/0x2f0
      Code: 5d 41 5e 41 5f 5d e9 80 d6 ff ff 4d 89 f1 41 b8 01 00 00 00 48 89 d9 48 89 da
      RSP: 0018:ffffc90000f377f0 EFLAGS: 00010246
      RAX: ffff888112b1f2c0 RBX: ffff888112b1f2c0 RCX: ffff888112b1f320
      RDX: 000000000000400b RSI: ffffffffc02c9de5 RDI: ffff888112b1f2c0
      RBP: ffffc90000f37830 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffc90000f37610 R11: 617461203a736b6e R12: ffffea00044ac780
      R13: ffff888100046400 R14: ffffffffc02c9de5 R15: 0000000000000006
      FS:  00007f2f1cabe980(0000) GS:ffff88813b380000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f2f1c3acf75 CR3: 0000000111724000 CR4: 0000000000750ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? __die_body.cold+0x19/0x27
       ? die+0x2e/0x50
       ? do_trap+0xca/0x110
       ? do_error_trap+0x6a/0x90
       ? kfree+0x2cf/0x2f0
       ? exc_invalid_op+0x50/0x70
       ? kfree+0x2cf/0x2f0
       ? asm_exc_invalid_op+0x1a/0x20
       ? ata_host_alloc+0xf5/0x120 [libata]
       ? ata_host_alloc+0xf5/0x120 [libata]
       ? kfree+0x2cf/0x2f0
       ata_host_alloc+0xf5/0x120 [libata]
       ata_host_alloc_pinfo+0x14/0xa0 [libata]
       ahci_init_one+0x6c9/0xd20 [ahci]
      
      Ensure that we will not call kfree(host) twice, by performing the kfree()
      only if the devres_open_group() call failed.
      
      Fixes: dafd6c49 ("libata: ensure host is free'd on error exit paths")
      Cc: stable@vger.kernel.org
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Link: https://lore.kernel.org/r/20240629124210.181537-9-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • ata,scsi: libata-core: Do not leak memory for ata_port struct members · f6549f53
      Niklas Cassel authored
      libsas is currently not freeing all the struct ata_port struct members,
      e.g. ncq_sense_buf for a driver supporting Command Duration Limits (CDL).
      
      Add a function, ata_port_free(), that is used to free an ata_port,
      including its struct members. It makes sense to keep the code related to
      freeing an ata_port in its own function, which will also free all the
      struct members of struct ata_port.
      
      Fixes: 18bd7718 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD")
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      Link: https://lore.kernel.org/r/20240629124210.181537-8-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
    • ata: libata-core: Fix null pointer dereference on error · 5d92c7c5
      Niklas Cassel authored
      If the ata_port_alloc() call in ata_host_alloc() fails,
      ata_host_release() will get called.
      
      However, the code in ata_host_release() tries to free ata_port struct
      members unconditionally, which can lead to the following:
      
      BUG: unable to handle page fault for address: 0000000000003990
      PGD 0 P4D 0
      Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 10 PID: 594 Comm: (udev-worker) Not tainted 6.10.0-rc5 #44
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
      RIP: 0010:ata_host_release.cold+0x2f/0x6e [libata]
      Code: e4 4d 63 f4 44 89 e2 48 c7 c6 90 ad 32 c0 48 c7 c7 d0 70 33 c0 49 83 c6 0e 41
      RSP: 0018:ffffc90000ebb968 EFLAGS: 00010246
      RAX: 0000000000000041 RBX: ffff88810fb52e78 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff88813b3218c0 RDI: ffff88813b3218c0
      RBP: ffff88810fb52e40 R08: 0000000000000000 R09: 6c65725f74736f68
      R10: ffffc90000ebb738 R11: 73692033203a746e R12: 0000000000000004
      R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000006
      FS:  00007f6cc55b9980(0000) GS:ffff88813b300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000003990 CR3: 00000001122a2000 CR4: 0000000000750ef0
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? __die_body.cold+0x19/0x27
       ? page_fault_oops+0x15a/0x2f0
       ? exc_page_fault+0x7e/0x180
       ? asm_exc_page_fault+0x26/0x30
       ? ata_host_release.cold+0x2f/0x6e [libata]
       ? ata_host_release.cold+0x2f/0x6e [libata]
       release_nodes+0x35/0xb0
       devres_release_group+0x113/0x140
       ata_host_alloc+0xed/0x120 [libata]
       ata_host_alloc_pinfo+0x14/0xa0 [libata]
       ahci_init_one+0x6c9/0xd20 [ahci]
      
      Do not access ata_port struct members unconditionally.
      
      Fixes: 633273a3 ("libata-pmp: hook PMP support and enable it")
      Cc: stable@vger.kernel.org
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      Link: https://lore.kernel.org/r/20240629124210.181537-7-cassel@kernel.org
      Signed-off-by: Niklas Cassel <cassel@kernel.org>
      5d92c7c5
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v6.10-3' of... · e0b668b0
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Remove the executable bit from installed DTB files
      
       - Escape $ in subshell execution in the debian-orig target
      
       - Fix RPM builds with CONFIG_MODULES=n
      
       - Fix xconfig with the O= option
      
       - Fix scripts_gdb with the O= option
      
      * tag 'kbuild-fixes-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: scripts/gdb: bring the "abspath" back
        kbuild: Use $(obj)/%.cc to fix host C++ module builds
        kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n
        kbuild: Fix build target deb-pkg: ln: failed to create hard link
        kbuild: doc: Update default INSTALL_MOD_DIR from extra to updates
        kbuild: Install dtb files as 0644 in Makefile.dtbinst
      e0b668b0
    • Linus Torvalds's avatar
      x86-32: fix cmpxchg8b_emu build error with clang · 76932725
      Linus Torvalds authored
      The kernel test robot reported that clang no longer compiles the 32-bit
      x86 kernel in some configurations due to commit 95ece481
      ("locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}()
      functions").
      
      The build fails with
      
        arch/x86/include/asm/cmpxchg_32.h:149:9: error: inline assembly requires more registers than available
      
      and the reason seems to be that not only does the cmpxchg8b instruction
      need four fixed registers (EDX:EAX and ECX:EBX), with the emulation
      fallback the inline asm also wants a fifth fixed register for the
      address (it uses %esi for that, but that's just a software convention
      with cmpxchg8b_emu).
      
      Avoiding using another pointer input to the asm (and just forcing it to
      use the "0(%esi)" addressing that we end up requiring for the sw
      fallback) seems to fix the issue.

      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202406230912.F6XFIyA6-lkp@intel.com/
      Fixes: 95ece481 ("locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions")
      Link: https://lore.kernel.org/all/202406230912.F6XFIyA6-lkp@intel.com/
      Suggested-by: Uros Bizjak <ubizjak@gmail.com>
      Reviewed-and-Tested-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      76932725
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 84dd4373
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are some small driver fixes for 6.10-rc6. Included in here are:
      
         - IIO driver fixes for reported issues
      
         - Counter driver fix for a reported problem.
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'char-misc-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        counter: ti-eqep: enable clock at probe
        iio: chemical: bme680: Fix sensor data read operation
        iio: chemical: bme680: Fix overflows in compensate() functions
        iio: chemical: bme680: Fix calibration data variable
        iio: chemical: bme680: Fix pressure value output
        iio: humidity: hdc3020: fix hysteresis representation
        iio: dac: fix ad9739a random config compile error
        iio: accel: fxls8962af: select IIO_BUFFER & IIO_KFIFO_BUF
        iio: adc: ad7266: Fix variable checking bug
        iio: xilinx-ams: Don't include ams_ctrl_channels in scan_mask
      84dd4373
    • Linus Torvalds's avatar
      Merge tag 'staging-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 12529aa1
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are two small staging driver fixes for 6.10-rc6, both for the
        vc04_services drivers:
      
         - build fix if CONFIG_DEBUG_FS was not set
      
         - initialization check fix that was much reported.
      
        Both of these have been in linux-next this week with no reported
        issues"
      
      * tag 'staging-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: vchiq_debugfs: Fix build if CONFIG_DEBUG_FS is not set
        staging: vc04_services: vchiq_arm: Fix initialisation check
      12529aa1
    • Linus Torvalds's avatar
      Merge tag 'tty-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 3e334486
      Linus Torvalds authored
      Pull tty / serial / console fixes from Greg KH:
       "Here are a bunch of fixes/reverts for 6.10-rc6.  Included in here are:
      
         - revert the bunch of tty/serial/console changes that landed in -rc1
           that didn't quite work properly yet.
      
           Everyone agreed to just revert them for now and will work on making
           them better for a future release instead of trying to quick fix the
           existing changes this late in the release cycle
      
         - 8250 driver port count bugfix
      
         - Other tiny serial port bugfixes for reported issues
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'tty-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "printk: Save console options for add_preferred_console_match()"
        Revert "printk: Don't try to parse DEVNAME:0.0 console options"
        Revert "printk: Flag register_console() if console is set on command line"
        Revert "serial: core: Add support for DEVNAME:0.0 style naming for kernel console"
        Revert "serial: core: Handle serial console options"
        Revert "serial: 8250: Add preferred console in serial8250_isa_init_ports()"
        Revert "Documentation: kernel-parameters: Add DEVNAME:0.0 format for serial ports"
        Revert "serial: 8250: Fix add preferred console for serial8250_isa_init_ports()"
        Revert "serial: core: Fix ifdef for serial base console functions"
        serial: bcm63xx-uart: fix tx after conversion to uart_port_tx_limited()
        serial: core: introduce uart_port_tx_limited_flags()
        Revert "serial: core: only stop transmit when HW fifo is empty"
        serial: imx: set receiver level before starting uart
        tty: mcf: MCF54418 has 10 UARTS
        serial: 8250_omap: Implementation of Errata i2310
        tty: serial: 8250: Fix port count mismatch with the device
      3e334486
    • Linus Torvalds's avatar
      Merge tag 'usb-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 2c01c3d5
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a handful of small USB driver fixes for 6.10-rc6 to resolve
        some reported issues. Included in here are:
      
         - typec driver bugfixes
      
         - usb gadget driver reverts for commits that were reported to have
           problems
      
         - resource leak bugfix
      
         - gadget driver bugfixes
      
         - dwc3 driver bugfixes
      
         - usb atm driver bugfix for when syzbot got loose on it
      
        All of these have been in linux-next this week with no reported issues"
      
      * tag 'usb-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: dwc3: core: Workaround for CSR read timeout
        Revert "usb: gadget: u_ether: Replace netif_stop_queue with netif_device_detach"
        Revert "usb: gadget: u_ether: Re-attach netif device to mirror detachment"
        usb: gadget: aspeed_udc: fix device address configuration
        usb: dwc3: core: remove lock of otg mode during gadget suspend/resume to avoid deadlock
        usb: typec: ucsi: glink: fix child node release in probe function
        usb: musb: da8xx: fix a resource leak in probe()
        usb: typec: ucsi_acpi: Add LG Gram quirk
        usb: ucsi: stm32: fix command completion handling
        usb: atm: cxacru: fix endpoint checking in cxacru_bind()
        usb: gadget: printer: fix races against disable
        usb: gadget: printer: SS+ support
      2c01c3d5
    • Linus Torvalds's avatar
      Merge tag 'smp_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3ffea9a7
      Linus Torvalds authored
      Pull smp fixes from Borislav Petkov:
      
       - Fix "nosmp" and "maxcpus=0" after the parallel CPU bringup work went
         in and broke them
      
       - Make sure CPU hotplug dynamic prepare states are actually executed
      
      * tag 'smp_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu: Fix broken cmdline "nosmp" and "maxcpus=0"
        cpu/hotplug: Fix dynstate assignment in __cpuhp_setup_state_cpuslocked()
      3ffea9a7
    • Linus Torvalds's avatar
      Merge tag 'irq_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4e412160
      Linus Torvalds authored
      Pull irq fixes from Borislav Petkov:
      
       - Make sure multi-bridge machines get all eiointc interrupt controllers
         initialized even if the number of CPUs has been limited by a cmdline
         param
      
       - Make sure interrupt lines on liointc hw are configured properly even
         when interrupt routing changes
      
       - Avoid use-after-free in the error path of the MSI init code
      
      * tag 'irq_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        PCI/MSI: Fix UAF in msi_capability_init
        irqchip/loongson-liointc: Set different ISRs for different cores
        irqchip/loongson-eiointc: Use early_cpu_to_node() instead of cpu_to_node()
      4e412160
    • Linus Torvalds's avatar
      Merge tag 'timers_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 03c8b0bd
      Linus Torvalds authored
      Pull timer fix from Borislav Petkov:
      
       - Warn when an hrtimer doesn't get a callback supplied
      
      * tag 'timers_urgent_for_v6.10_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hrtimer: Prevent queuing of hrtimer without a function callback
      03c8b0bd
    • Linus Torvalds's avatar
      Merge tag 'linux-watchdog-6.10-rc-fixes' of git://www.linux-watchdog.org/linux-watchdog · 327fceff
      Linus Torvalds authored
      Pull watchdog fixes from Wim Van Sebroeck:
      
       - lenovo_se10_wdt: add HAS_IOPORT dependency
      
       - add missing MODULE_DESCRIPTION() macros
      
      * tag 'linux-watchdog-6.10-rc-fixes' of git://www.linux-watchdog.org/linux-watchdog:
        watchdog: add missing MODULE_DESCRIPTION() macros
        watchdog: lenovo_se10_wdt: add HAS_IOPORT dependency
      327fceff
  4. 29 Jun, 2024 5 commits
  5. 28 Jun, 2024 8 commits
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · de0a9f44
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix for vector load/store instruction decoding, which could result
         in reserved vector element length encodings decoding as valid vector
         instructions.
      
       - Instruction patching now aggressively flushes the local instruction
         cache, to avoid situations where patching functions on the flush path
         results in torn instructions being fetched.
      
       - A fix to prevent the stack walker from showing up as part of traces.
      
      * tag 'riscv-for-linus-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: stacktrace: convert arch_stack_walk() to noinstr
        riscv: patch: Flush the icache right after patching to avoid illegal insns
        RISC-V: fix vector insn load/store width mask
      de0a9f44
    • Linus Torvalds's avatar
      Merge tag 'hardening-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · b75f9472
      Linus Torvalds authored
      Pull hardening fixes from Kees Cook:
      
       - Remove invalid tty __counted_by annotation (Nathan Chancellor)
      
       - Add missing MODULE_DESCRIPTION()s for KUnit string tests (Jeff
         Johnson)
      
       - Remove non-functional per-arch kstack entropy filtering
      
      * tag 'hardening-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        tty: mxser: Remove __counted_by from mxser_board.ports[]
        randomize_kstack: Remove non-functional per-arch entropy filtering
        string: kunit: add missing MODULE_DESCRIPTION() macros
      b75f9472
    • Linus Torvalds's avatar
      x86: stop playing stack games in profile_pc() · 093d9603
      Linus Torvalds authored
      The 'profile_pc()' function is used for timer-based profiling, which
      isn't really all that relevant any more to begin with, but it also ends
      up making assumptions based on the stack layout that aren't necessarily
      valid.
      
      Basically, the code tries to account the time spent in spinlocks to the
      caller rather than the spinlock, and while I support that as a concept,
      it's not worth the code complexity or the KASAN warnings when no serious
      profiling is done using timers anyway these days.
      
      And the code really does depend on stack layout that is only true in the
      simplest of cases.  We've lost the comment at some point (I think when
      the 32-bit and 64-bit code was unified), but it used to say:
      
      	Assume the lock function has either no stack frame or a copy
      	of eflags from PUSHF.
      
      which explains why it just blindly loads a word or two straight off the
      stack pointer and then takes a minimal look at the values to just check
      if they might be eflags or the return pc:
      
      	Eflags always has bits 22 and up cleared unlike kernel addresses
      
      but that basic stack layout assumption assumes that there isn't any lock
      debugging etc going on that would complicate the code and cause a stack
      frame.
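
The heuristic described above can be sketched in plain C as follows. This is a userspace illustration and not the removed kernel function: `looks_like_eflags()` and `guess_caller_pc()` are hypothetical names, and the stack is modelled as a two-word array passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* A word with bits 22 and up all clear could be a saved eflags value. */
static bool looks_like_eflags(uint64_t word)
{
    return (word >> 22) == 0;
}

/*
 * Look at the first two words at the stack pointer and return the first
 * one that does not look like eflags, i.e. that might be a return address.
 * If neither qualifies, return the fallback (the sampled pc itself).
 */
static uint64_t guess_caller_pc(const uint64_t sp[2], uint64_t fallback)
{
    if (!looks_like_eflags(sp[0]))
        return sp[0];
    if (!looks_like_eflags(sp[1]))
        return sp[1];
    return fallback;
}
```

The sketch makes the fragility visible: any extra word on the stack, for instance from a real stack frame, shifts the layout and the guess silently reads the wrong slot.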
      
      It causes KASAN unhappiness reported for years by syzkaller [1] and
      others [2].
      
      With no real practical reason for this any more, just remove the code.
      
      Just for historical interest, here's some background commits relating to
      this code from 2006:
      
        0cb91a22 ("i386: Account spinlocks to the caller during profiling for !FP kernels")
        31679f38 ("Simplify profile_pc on x86-64")
      
      and a code unification from 2009:
      
        ef451288 ("x86: time_32/64.c unify profile_pc")
      
      but the basics of this thing actually go back to before the git tree.
      
      Link: https://syzkaller.appspot.com/bug?extid=84fe685c02cd112a2ac3 [1]
      Link: https://lore.kernel.org/all/CAK55_s7Xyq=nh97=K=G1sxueOFrJDAvPOJAL4TPTCAYvmxO9_A@mail.gmail.com/ [2]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      093d9603
    • Wolfram Sang's avatar
      i2c: testunit: discard write requests while old command is running · c116deaf
      Wolfram Sang authored
      When clearing registers on new write requests was added, the protection
      for currently running commands was missed, leading to concurrent access
      to the testunit registers. Check the flag beforehand.
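
A minimal sketch of the guard described above, written as a standalone state machine rather than the real driver. All names here (`struct testunit`, `tu_handle()`, the `EV_*` events, `NUM_REGS`) are hypothetical; the point is only that the "command running" flag is checked before the registers are cleared or written, and that STOP does not clear anything.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_REGS 4

/* Hypothetical device state: command registers plus a busy flag. */
struct testunit {
    unsigned char regs[NUM_REGS];
    unsigned int reg_idx;
    bool cmd_running;
};

enum tu_event { EV_WRITE_REQUESTED, EV_WRITE_RECEIVED, EV_STOP };

/* Returns 0 if the event was accepted, -1 if it was discarded. */
static int tu_handle(struct testunit *tu, enum tu_event ev, unsigned char val)
{
    switch (ev) {
    case EV_WRITE_REQUESTED:
        if (tu->cmd_running)
            return -1;                      /* check the flag beforehand */
        memset(tu->regs, 0, sizeof(tu->regs));
        tu->reg_idx = 0;
        break;
    case EV_WRITE_RECEIVED:
        if (tu->cmd_running)
            return -1;
        if (tu->reg_idx < NUM_REGS)
            tu->regs[tu->reg_idx++] = val;
        break;
    case EV_STOP:
        /* A complete command starts running; registers are left intact. */
        if (tu->reg_idx == NUM_REGS)
            tu->cmd_running = true;
        break;
    }
    return 0;
}
```

In this sketch a new write request arriving while `cmd_running` is set is rejected without touching `regs`, so the running command keeps a consistent view of its parameters.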
      
      Fixes: b39ab96a ("i2c: testunit: add support for block process calls")
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
      c116deaf
    • Wolfram Sang's avatar
      i2c: testunit: don't erase registers after STOP · c422b6a6
      Wolfram Sang authored
      STOP falls through to WRITE_REQUESTED, but this became problematic when
      clearing the testunit registers was added to the latter. Actually, there
      is no reason to clear the testunit state after STOP. Doing it when a new
      WRITE_REQUESTED arrives is enough. So, no need to fall through at all.
      
      Fixes: b39ab96a ("i2c: testunit: add support for block process calls")
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
      c422b6a6
    • Wolfram Sang's avatar
      Merge tag 'i2c-host-fixes-6.10-rc6' of... · 4e9a1a47
      Wolfram Sang authored
      Merge tag 'i2c-host-fixes-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
      
      Fixed a build error following the major refactoring involving the
      VIA-I2C modules. Originally, the code was split to group together
      parts that would be used by different drivers. This caused build
      issues when two modules linked to the same code.
      4e9a1a47
    • Linus Torvalds's avatar
      Merge tag 'nfsd-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 6c0483db
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
      
       - Due to a late review, revert and re-fix a recent crasher fix
      
      * tag 'nfsd-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        Revert "nfsd: fix oops when reading pool_stats before server is started"
        nfsd: initialise nfsd_info.mutex early.
      6c0483db
    • Linus Torvalds's avatar
      Merge tag 'bcachefs-2024-06-28' of https://evilpiepirate.org/git/bcachefs · cd63a278
      Linus Torvalds authored
      Pull bcachefs fixes from Kent Overstreet:
       "Simple stuff:
      
         - NULL ptr/err ptr deref fixes
      
         - fix for getting wedged on shutdown after journal error
      
         - fix missing recalc_capacity() call, capacity now changes correctly
           after a device goes read only
      
           however: our capacity calculation still doesn't take into account
           when we have mixed ro/rw devices and the ro devices have data on
           them, that's going to be a more involved fix to separate accounting
           for "capacity used on ro devices" and "capacity used on rw devices"
      
         - boring syzbot stuff
      
        Slightly more involved:
      
         - discard, invalidate workers are now per device
      
           this has the effect of simplifying how we take device refs in these
           paths, and the device ref cleanup fixes a longstanding race between
           the device removal path and the discard path
      
         - fixes for how the debugfs code takes refs on btree_trans objects

           we have debugfs code that prints in use btree_trans objects.
      
           It uses closure_get() on trans->ref, which is mainly for the cycle
           detector, but the debugfs code was using it on a closure that may
           have hit 0, which is not allowed; for performance reasons we cannot
           avoid having not-in-use transactions on the global list.
      
           Introduce some new primitives to fix this and make the
           synchronization here a whole lot saner"
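
The "take a reference only if the count has not already hit 0" rule mentioned above can be illustrated with C11 atomics. This is a generic sketch of the pattern, not the implementation of closure_get_not_zero(); the name `ref_get_not_zero()` and the use of a bare `atomic_int` as the counter are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *ref only if it is currently non-zero.
 * Returns true when a reference was taken, false when the object was
 * already at 0 (not in use) and must not be touched.
 */
static bool ref_get_not_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}
```

A reader walking a global list can call such a helper on each entry and simply skip the entries for which it returns false, which matches the constraint that not-in-use objects stay on the list.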
      
      * tag 'bcachefs-2024-06-28' of https://evilpiepirate.org/git/bcachefs:
        bcachefs: Fix kmalloc bug in __snapshot_t_mut
        bcachefs: Discard, invalidate workers are now per device
        bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gc
        bcachefs: slab-use-after-free Read in bch2_sb_errors_from_cpu
        bcachefs: Add missing bch2_journal_do_writes() call
        bcachefs: Fix null ptr deref in journal_pins_to_text()
        bcachefs: Add missing recalc_capacity() call
        bcachefs: Fix btree_trans list ordering
        bcachefs: Fix race between trans_put() and btree_transactions_read()
        closures: closure_get_not_zero(), closure_return_sync()
        bcachefs: Make btree_deadlock_to_text() clearer
        bcachefs: fix seqmutex_relock()
        bcachefs: Fix freeing of error pointers
      cd63a278