1. 24 Apr, 2024 2 commits
  2. 14 Apr, 2024 1 commit
    • locking/atomic/x86: Introduce arch_try_cmpxchg64_local() · d26e46f6
      Uros Bizjak authored
      Introduce arch_try_cmpxchg64_local() for 64-bit and 32-bit targets
      to improve code using cmpxchg64_local().  On 64-bit targets, the
      generated assembly improves from:
      
          3e28:	31 c0                	xor    %eax,%eax
          3e2a:	4d 0f b1 7d 00       	cmpxchg %r15,0x0(%r13)
          3e2f:	48 85 c0             	test   %rax,%rax
          3e32:	0f 85 9f 00 00 00    	jne    3ed7 <...>
      
      to:
      
          3e28:	31 c0                	xor    %eax,%eax
          3e2a:	4d 0f b1 7d 00       	cmpxchg %r15,0x0(%r13)
          3e2f:	0f 85 9f 00 00 00    	jne    3ed4 <...>
      
      where a TEST instruction after CMPXCHG is saved.  The improvements
      for 32-bit targets are even more noticeable, because the double-word
      compare after CMPXCHG8B is eliminated.

      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Waiman Long <longman@redhat.com>
      Link: https://lore.kernel.org/r/20240414161257.49145-1-ubizjak@gmail.com
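
The difference between the two calling conventions in the commit above can be sketched in user space. This is only an illustration, not the kernel implementation: `my_cmpxchg64()`, `my_try_cmpxchg64()` and the two `claim_*()` helpers are hypothetical names, and the GCC/Clang `__atomic_compare_exchange_n()` builtin stands in for the kernel primitives. The old style returns the value found in memory, so the caller has to compare it again. The try_ style returns the success flag directly, which lets the compiler branch on the flags set by CMPXCHG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cmpxchg64-style helper: returns the value
 * that was found in memory, whether or not the exchange happened. */
static inline uint64_t my_cmpxchg64(uint64_t *ptr, uint64_t old, uint64_t new)
{
	__atomic_compare_exchange_n(ptr, &old, new, false,
				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
	return old;	/* on failure the builtin stored the current value here */
}

/* Hypothetical stand-in for a try_cmpxchg64-style helper: returns true on
 * success and updates *old with the current value on failure. */
static inline bool my_try_cmpxchg64(uint64_t *ptr, uint64_t *old, uint64_t new)
{
	return __atomic_compare_exchange_n(ptr, old, new, false,
					   __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Old style: the caller compares the returned value against the expected one. */
static bool claim_old_style(uint64_t *slot, uint64_t owner)
{
	return my_cmpxchg64(slot, 0, owner) == 0;
}

/* New style: the success flag is the return value, no extra compare needed. */
static bool claim_new_style(uint64_t *slot, uint64_t owner)
{
	uint64_t expected = 0;

	return my_try_cmpxchg64(slot, &expected, owner);
}
```

Both helpers claim an empty (zero) slot and refuse to overwrite a slot that is already taken. Only the way the caller learns about success differs.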
  3. 12 Apr, 2024 3 commits
  4. 11 Apr, 2024 1 commit
  5. 10 Apr, 2024 4 commits
    • locking/atomic/x86: Define arch_atomic_sub() family using arch_atomic_add() functions · 21689e4b
      Uros Bizjak authored
      There is no need to implement the arch_atomic_sub() family of inline
      functions; the corresponding macros can be implemented directly using
      the arch_atomic_add() inlines with a negated argument.
      
      No functional changes intended.

      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/r/20240410062957.322614-4-ubizjak@gmail.com
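
A minimal user-space sketch of the idea in the commit above: the sub family is expressed as macros over the add family with a negated argument. All `my_*` names are hypothetical stand-ins, and the GCC/Clang `__atomic` builtins replace the kernel's inline assembly. The sketch ignores the edge case of negating the most negative integer, which overflows in standard C.

```c
#include <assert.h>

/* Hypothetical stand-in for atomic_t. */
typedef struct { int counter; } my_atomic_t;

static inline void my_atomic_add(int i, my_atomic_t *v)
{
	__atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
}

/* Returns the value after the addition. */
static inline int my_atomic_add_return(int i, my_atomic_t *v)
{
	return __atomic_add_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* Returns the value before the addition. */
static inline int my_atomic_fetch_add(int i, my_atomic_t *v)
{
	return __atomic_fetch_add(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* The sub family needs no code of its own: negate and reuse add. */
#define my_atomic_sub(i, v)		my_atomic_add(-(i), v)
#define my_atomic_sub_return(i, v)	my_atomic_add_return(-(i), v)
#define my_atomic_fetch_sub(i, v)	my_atomic_fetch_add(-(i), v)
```

The parentheses around `i` in `-(i)` matter: without them an argument such as `a - b` would be negated incorrectly.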
    • locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions · 95ece481
      Uros Bizjak authored
      Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions to
      use arch_atomic64_try_cmpxchg().  This implementation avoids one extra
      trip through the CMPXCHG loop.
      
      The value preload before the cmpxchg loop does not need to be atomic.
      Use arch_atomic64_read_nonatomic(v) to load the value from the
      atomic64_t location in a non-atomic way.
      
      The generated code improves from:
      
        1917d5:	31 c9                	xor    %ecx,%ecx
        1917d7:	31 db                	xor    %ebx,%ebx
        1917d9:	89 4c 24 3c          	mov    %ecx,0x3c(%esp)
        1917dd:	8b 74 24 24          	mov    0x24(%esp),%esi
        1917e1:	89 c8                	mov    %ecx,%eax
        1917e3:	89 5c 24 34          	mov    %ebx,0x34(%esp)
        1917e7:	8b 7c 24 28          	mov    0x28(%esp),%edi
        1917eb:	21 ce                	and    %ecx,%esi
        1917ed:	89 74 24 4c          	mov    %esi,0x4c(%esp)
        1917f1:	21 df                	and    %ebx,%edi
        1917f3:	89 de                	mov    %ebx,%esi
        1917f5:	89 7c 24 50          	mov    %edi,0x50(%esp)
        1917f9:	8b 54 24 4c          	mov    0x4c(%esp),%edx
        1917fd:	8b 7c 24 2c          	mov    0x2c(%esp),%edi
        191801:	8b 4c 24 50          	mov    0x50(%esp),%ecx
        191805:	89 d3                	mov    %edx,%ebx
        191807:	89 f2                	mov    %esi,%edx
        191809:	f0 0f c7 0f          	lock cmpxchg8b (%edi)
        19180d:	89 c1                	mov    %eax,%ecx
        19180f:	8b 74 24 34          	mov    0x34(%esp),%esi
        191813:	89 d3                	mov    %edx,%ebx
        191815:	89 44 24 4c          	mov    %eax,0x4c(%esp)
        191819:	8b 44 24 3c          	mov    0x3c(%esp),%eax
        19181d:	89 df                	mov    %ebx,%edi
        19181f:	89 54 24 44          	mov    %edx,0x44(%esp)
        191823:	89 ca                	mov    %ecx,%edx
        191825:	31 de                	xor    %ebx,%esi
        191827:	31 c8                	xor    %ecx,%eax
        191829:	09 f0                	or     %esi,%eax
        19182b:	75 ac                	jne    1917d9 <...>
      
      to:
      
        1912ba:	8b 06                	mov    (%esi),%eax
        1912bc:	8b 56 04             	mov    0x4(%esi),%edx
        1912bf:	89 44 24 3c          	mov    %eax,0x3c(%esp)
        1912c3:	89 c1                	mov    %eax,%ecx
        1912c5:	23 4c 24 34          	and    0x34(%esp),%ecx
        1912c9:	89 d3                	mov    %edx,%ebx
        1912cb:	23 5c 24 38          	and    0x38(%esp),%ebx
        1912cf:	89 54 24 40          	mov    %edx,0x40(%esp)
        1912d3:	89 4c 24 2c          	mov    %ecx,0x2c(%esp)
        1912d7:	89 5c 24 30          	mov    %ebx,0x30(%esp)
        1912db:	8b 5c 24 2c          	mov    0x2c(%esp),%ebx
        1912df:	8b 4c 24 30          	mov    0x30(%esp),%ecx
        1912e3:	f0 0f c7 0e          	lock cmpxchg8b (%esi)
        1912e7:	0f 85 f3 02 00 00    	jne    1915e0 <...>

      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/r/20240410062957.322614-3-ubizjak@gmail.com
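
The loop shape described in the commit above can be sketched in user space as follows. This is an assumption-laden illustration rather than the kernel code: `my_atomic64_t`, `my_try_cmpxchg()` and `my_fetch_and()` are hypothetical names built on the GCC/Clang `__atomic_compare_exchange_n()` builtin. The "before" listing appears to start from a zero guess, so its first CMPXCHG8B mostly serves as a read. Preloading the current value lets the first attempt succeed when nobody else is writing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for atomic64_t. */
typedef struct { uint64_t counter; } my_atomic64_t;

/* Returns true on success; on failure *old is refreshed with the current value. */
static inline bool my_try_cmpxchg(my_atomic64_t *v, uint64_t *old, uint64_t new)
{
	return __atomic_compare_exchange_n(&v->counter, old, new, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* AND 'mask' into *v and return the previous value. */
static inline uint64_t my_fetch_and(uint64_t mask, my_atomic64_t *v)
{
	/* Plain preload. In concurrent standard C this is formally a data
	 * race; it is used here only to mirror the non-atomic read in the
	 * commit message. A wrong guess is corrected by the loop below. */
	uint64_t val = v->counter;

	/* Each failed attempt refreshes 'val', so 'val & mask' is
	 * recomputed from the latest value on the next iteration. */
	do { } while (!my_try_cmpxchg(v, &val, val & mask));

	return val;
}
```

The same shape works for OR and XOR by changing only the expression passed as the new value.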
    • locking/atomic/x86: Introduce arch_atomic64_read_nonatomic() to x86_32 · e73c4e34
      Uros Bizjak authored
      Introduce arch_atomic64_read_nonatomic() for 32-bit targets to load
      the value from an atomic64_t location in a non-atomic way. This
      function is intended to be used in cases where a subsequent atomic
      operation will handle the torn value, and can be used to prime the
      first iteration of unconditional try_cmpxchg() loops.

      Suggested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/r/20240410062957.322614-2-ubizjak@gmail.com
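
Why a torn initial value is harmless in an unconditional try_cmpxchg() loop can be shown with a small single-threaded sketch. The helper names `make_guess()` and `or_from_guess()` are hypothetical, and the torn read is simulated by assembling a guess from two mismatched 32-bit halves rather than produced by a real race. A wrong guess only costs one extra attempt, because the failed compare-and-exchange hands back the true current value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a 64-bit guess from two separately read halves, as a non-atomic
 * read on a 32-bit target might do (simulated here). */
static inline uint64_t make_guess(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

/* OR 'bits' into *ptr starting from 'guess'.
 * Returns the number of compare-and-exchange attempts that were needed. */
static int or_from_guess(uint64_t *ptr, uint64_t guess, uint64_t bits)
{
	int attempts = 0;

	do {
		attempts++;
		/* On failure the builtin overwrites 'guess' with the value
		 * actually found in memory, repairing a torn guess. */
	} while (!__atomic_compare_exchange_n(ptr, &guess, guess | bits, false,
					      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));

	return attempts;
}
```

With a correct guess the loop finishes in one attempt. With a torn guess the first attempt fails without modifying memory and the second one succeeds, so the stored result is the same in both cases.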
    • locking/atomic/x86: Introduce arch_atomic64_try_cmpxchg() to x86_32 · 276b8930
      Uros Bizjak authored
      Introduce arch_atomic64_try_cmpxchg() for 32-bit targets to use an
      optimized target-specific implementation instead of the generic one.
      This implementation eliminates the double-word compare after the
      CMPXCHG8B instruction and improves the generated asm code from:
      
          2273:	f0 0f c7 0f          	lock cmpxchg8b (%edi)
          2277:	8b 74 24 2c          	mov    0x2c(%esp),%esi
          227b:	89 d3                	mov    %edx,%ebx
          227d:	89 c2                	mov    %eax,%edx
          227f:	89 5c 24 10          	mov    %ebx,0x10(%esp)
          2283:	8b 7c 24 30          	mov    0x30(%esp),%edi
          2287:	89 44 24 1c          	mov    %eax,0x1c(%esp)
          228b:	31 f2                	xor    %esi,%edx
          228d:	89 d0                	mov    %edx,%eax
          228f:	89 da                	mov    %ebx,%edx
          2291:	31 fa                	xor    %edi,%edx
          2293:	09 d0                	or     %edx,%eax
          2295:	0f 85 a5 00 00 00    	jne    2340 <...>
      
      to:
      
          2270:	f0 0f c7 0f          	lock cmpxchg8b (%edi)
          2274:	0f 85 a6 00 00 00    	jne    2320 <...>

      Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lore.kernel.org/r/20240410062957.322614-1-ubizjak@gmail.com
  6. 09 Apr, 2024 4 commits
  7. 07 Apr, 2024 4 commits
  8. 06 Apr, 2024 13 commits
  9. 05 Apr, 2024 8 commits
    • Merge tag 'io_uring-6.9-20240405' of git://git.kernel.dk/linux · 4f72ed49
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Backport of some fixes that came up during development of the 6.10
         io_uring patches. This includes some kbuf cleanups and reference
         fixes.
      
       - Disable multishot read if we don't have NOWAIT support on the target
      
       - Fix for a dependency issue with workqueue flushing
      
      * tag 'io_uring-6.9-20240405' of git://git.kernel.dk/linux:
        io_uring/kbuf: hold io_buffer_list reference over mmap
        io_uring/kbuf: protect io_buffer_list teardown with a reference
        io_uring/kbuf: get rid of bl->is_ready
        io_uring/kbuf: get rid of lower BGID lists
        io_uring: use private workqueue for exit work
        io_uring: disable io-wq execution of multishot NOWAIT requests
        io_uring/rw: don't allow multishot reads without NOWAIT support
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 4de2ff26
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "The most important is the libsas fix, which is a problem for DMA to a
        kmalloc'd structure too small causing cache line interference. The
        other fixes (all in drivers) are mostly for allocation length fixes,
        error leg unwinding, suspend races and a missing retry"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Fix MCQ mode dev command timeout
        scsi: libsas: Align SMP request allocation to ARCH_DMA_MINALIGN
        scsi: sd: Unregister device if device_add_disk() failed in sd_probe()
        scsi: ufs: core: WLUN suspend dev/link state error recovery
        scsi: mylex: Fix sysfs buffer lengths
    • Merge tag 'devicetree-fixes-for-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 84985eb2
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Fix NIOS2 boot with external DTB
      
       - Add missing synchronization needed between fw_devlink and DT overlay
         removals
      
       - Fix some unit-address regex's to be hex only
      
       - Drop some 10+ year old "unstable binding" statements
      
       - Add new SoCs to QCom UFS binding
      
       - Add TPM bindings to TPM maintainers
      
      * tag 'devicetree-fixes-for-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        nios2: Only use built-in devicetree blob if configured to do so
        dt-bindings: timer: narrow regex for unit address to hex numbers
        dt-bindings: soc: fsl: narrow regex for unit address to hex numbers
        dt-bindings: remoteproc: ti,davinci: remove unstable remark
        dt-bindings: clock: ti: remove unstable remark
        dt-bindings: clock: keystone: remove unstable remark
        of: module: prevent NULL pointer dereference in vsnprintf()
        dt-bindings: ufs: qcom: document SM6125 UFS
        dt-bindings: ufs: qcom: document SC7180 UFS
        dt-bindings: ufs: qcom: document SC8180X UFS
        of: dynamic: Synchronize of_changeset_destroy() with the devlink removals
        driver core: Introduce device_link_wait_removal()
        docs: dt-bindings: add missing address/size-cells to example
        MAINTAINERS: Add TPM DT bindings to TPM maintainers
    • Merge tag 'mm-hotfixes-stable-2024-04-05-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · af709adf
      Linus Torvalds authored
      Pull misc fixes from Andrew Morton:
       "8 hotfixes, 3 are cc:stable
      
        There are a couple of fixups for this cycle's vmalloc changes and one
        for the stackdepot changes. And a fix for a very old x86 PAT issue
        which can cause a warning splat"
      
      * tag 'mm-hotfixes-stable-2024-04-05-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        stackdepot: rename pool_index to pool_index_plus_1
        x86/mm/pat: fix VM_PAT handling in COW mappings
        MAINTAINERS: change vmware.com addresses to broadcom.com
        selftests/mm: include strings.h for ffsl
        mm: vmalloc: fix lockdep warning
        mm: vmalloc: bail out early in find_vmap_area() if vmap is not init
        init: open output files from cpio unpacking with O_LARGEFILE
        mm/secretmem: fix GUP-fast succeeding on secretmem folios
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · c7830236
      Linus Torvalds authored
      Pull arm64 fix from Catalin Marinas:
       "arm64/ptrace fix to use the correct SVE layout based on the saved
        floating point state rather than the TIF_SVE flag. The latter may be
        left on during syscalls even if the SVE state is discarded"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64/ptrace: Use saved floating point state type to determine SVE layout
    • Merge tag 'riscv-for-linus-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 261b8e89
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix for an __{get,put}_kernel_nofault to avoid an uninitialized
         value causing spurious failures
      
       - compat_vdso.so.dbg is now installed to the standard install location
      
       - A fix to avoid initializing PERF_SAMPLE_BRANCH_*-related events, as
         they aren't supported and will just later fail
      
       - A fix to make AT_VECTOR_SIZE_ARCH correct now that we're providing
         AT_MINSIGSTKSZ
      
       - pgprot_nx() is now implemented, which fixes vmap W^X protection
      
       - A fix for the vector save/restore code, which at least manifests as
         corrupted vector state when a signal is taken
      
       - A fix for a race condition in instruction patching
      
       - A fix to avoid leaking the kernel-mode GP to userspace, which is a
         kernel pointer leak that can be used to defeat KASLR in various ways
      
       - A handful of smaller fixes to build warnings, an overzealous printk,
         and some missing tracing annotations
      
      * tag 'riscv-for-linus-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: process: Fix kernel gp leakage
        riscv: Disable preemption when using patch_map()
        riscv: Fix warning by declaring arch_cpu_idle() as noinstr
        riscv: use KERN_INFO in do_trap
        riscv: Fix vector state restore in rt_sigreturn()
        riscv: mm: implement pgprot_nx
        riscv: compat_vdso: align VDSOAS build log
        RISC-V: Update AT_VECTOR_SIZE_ARCH for new AT_MINSIGSTKSZ
        riscv: Mark __se_sys_* functions __used
        drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supported
        riscv: compat_vdso: install compat_vdso.so.dbg to /lib/modules/*/vdso/
        riscv: hwprobe: do not produce frtace relocation
        riscv: Fix spurious errors from __get/put_kernel_nofault
        riscv: mm: Fix prototype to avoid discarding const
    • Merge tag 's390-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 50094473
      Linus Torvalds authored
      Pull s390 fixes from Alexander Gordeev:
      
       - Fix missing NULL pointer check when determining guest/host fault
      
       - Mark all functions in asm/atomic_ops.h, asm/atomic.h and
         asm/preempt.h as __always_inline to avoid unwanted instrumentation
      
       - Fix removal of a Processor Activity Instrumentation (PAI) sampling
         event in PMU device driver
      
       - Align system call table on 8 bytes
      
      * tag 's390-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/entry: align system call table on 8 bytes
        s390/pai: fix sampling event removal for PMU device driver
        s390/preempt: mark all functions __always_inline
        s390/atomic: mark all functions __always_inline
        s390/mm: fix NULL pointer dereference
    • Merge tag 'pm-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2f9fd9e4
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Fix a recent Energy Model change that went against a recent scheduler
        change made independently (Vincent Guittot)"
      
      * tag 'pm-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM: EM: fix wrong utilization estimation in em_cpu_energy()