1. 14 May, 2024 29 commits
    • powerpc: extend execmem_params for kprobes allocations · 1b750c2f
      Mike Rapoport (IBM) authored
      powerpc overrides kprobes::alloc_insn_page() to remove writable
      permissions when STRICT_MODULE_RWX is on.
      
      Add a definition of EXECMEM_KPROBES to execmem_params to allow using the
      generic kprobes::alloc_insn_page() with the desired permissions.
      
      As powerpc uses breakpoint instructions to inject kprobes, it does not
      need to constrain kprobe allocations to the modules area and can use the
      entire vmalloc address space.
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • arm64: extend execmem_info for generated code allocations · e2effa22
      Mike Rapoport (IBM) authored
      The memory allocations for kprobes and BPF on arm64 can be placed
      anywhere in vmalloc address space and currently this is implemented with
      overrides of alloc_insn_page() and bpf_jit_alloc_exec() in arm64.
      
      Define EXECMEM_KPROBES and EXECMEM_BPF ranges in arm64::execmem_info and
      drop overrides of alloc_insn_page() and bpf_jit_alloc_exec().
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Acked-by: Will Deacon <will@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • riscv: extend execmem_params for generated code allocations · 4d7b321a
      Mike Rapoport (IBM) authored
      The memory allocations for kprobes and BPF on RISC-V are not placed in
      the modules area and these custom allocations are implemented with
      overrides of alloc_insn_page() and bpf_jit_alloc_exec().
      
      Define MODULES_VADDR and MODULES_END as VMALLOC_START and VMALLOC_END for
      32 bit and slightly reorder execmem_params initialization to support both
      32 and 64 bit variants. Define EXECMEM_KPROBES and EXECMEM_BPF ranges in
      riscv::execmem_params and drop the overrides of alloc_insn_page() and
      bpf_jit_alloc_exec().
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • mm/execmem, arch: convert remaining overrides of module_alloc to execmem · 223b5e57
      Mike Rapoport (IBM) authored
      Extend execmem parameters to accommodate more complex overrides of
      module_alloc() by architectures.
      
      This includes specification of a fallback range required by arm, arm64
      and powerpc, EXECMEM_MODULE_DATA type required by powerpc, support for
      allocation of KASAN shadow required by s390 and x86 and support for
      late initialization of execmem required by arm64.
      
      The core implementation of execmem_alloc() takes care of suppressing
      warnings when the initial allocation fails but there is a fallback range
      defined.
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Acked-by: Will Deacon <will@kernel.org>
      Acked-by: Song Liu <song@kernel.org>
      Tested-by: Liviu Dudau <liviu@dudau.co.uk>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • mm/execmem, arch: convert simple overrides of module_alloc to execmem · f6bec26c
      Mike Rapoport (IBM) authored
      Several architectures override module_alloc() only to define an address
      range for code allocations that differs from the VMALLOC address space.
      
      Provide a generic implementation in execmem that uses the parameters for
      address space ranges, required alignment and page protections provided
      by architectures.
      
      Architectures must fill in an execmem_info structure and implement
      execmem_arch_setup(), which returns a pointer to that structure. This way
      the execmem initialization won't be called from every architecture, but
      rather from a central place, namely a core_initcall() in execmem.
      
      execmem provides the execmem_alloc() API, which wraps __vmalloc_node_range()
      with the parameters defined by the architectures. If an architecture does
      not implement execmem_arch_setup(), execmem_alloc() will fall back to
      module_alloc().
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Acked-by: Song Liu <song@kernel.org>
      Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • mm: introduce execmem_alloc() and execmem_free() · 12af2b83
      Mike Rapoport (IBM) authored
      module_alloc() is used everywhere as a means to allocate memory for code.
      
      Besides being semantically wrong, this unnecessarily ties all subsystems
      that need to allocate code, such as ftrace, kprobes and BPF, to modules and
      puts the burden of code allocation on the modules code.
      
      Several architectures override module_alloc() because of various
      constraints on where executable memory can be located, and this causes
      additional obstacles for improvements of code allocation.
      
      Start splitting code allocation from modules by introducing execmem_alloc()
      and execmem_free() APIs.
      
      Initially, execmem_alloc() is a wrapper for module_alloc() and
      execmem_free() is a replacement of module_memfree() to allow updating all
      call sites to use the new APIs.
      
      Since architectures define different restrictions on placement,
      permissions, alignment and other parameters for memory that can be used by
      different subsystems that allocate executable memory, execmem_alloc() takes
      a type argument that identifies the calling subsystem and allows
      architectures to define parameters for ranges suitable for that
      subsystem.
      
      No functional changes.
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Acked-by: Song Liu <song@kernel.org>
      Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • module: make module_memory_{alloc,free} more self-contained · bc6b94d3
      Mike Rapoport (IBM) authored
      Move the logic related to the memory allocation and freeing into
      module_memory_alloc() and module_memory_free().
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
      Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Acked-by: Song Liu <song@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • sparc: simplify module_alloc() · e8dbc6a8
      Mike Rapoport (IBM) authored
      Define MODULES_VADDR and MODULES_END as VMALLOC_START and VMALLOC_END
      for 32-bit and reduce module_alloc() to
      
      	__vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END, ...)
      
      as with the new defines the allocation becomes identical for both 32
      and 64 bits.
      
      While at it, drop the unused include of <linux/jump_label.h>.
      Suggested-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • nios2: define virtual address space for modules · 38762155
      Mike Rapoport (IBM) authored
      nios2 uses kmalloc() to implement module_alloc() because CALL26/PCREL26
      cannot reach all of vmalloc address space.
      
      Define module space as 32MiB below the kernel base and switch nios2 to
      use vmalloc for module allocations.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Dinh Nguyen <dinguyen@kernel.org>
      Acked-by: Song Liu <song@kernel.org>
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • mips: module: rename MODULE_START to MODULES_VADDR · 0cdf5876
      Mike Rapoport (IBM) authored
      and MODULE_END to MODULES_END to match other architectures that define
      custom address space for modules.
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • arm64: module: remove unneeded call to kasan_alloc_module_shadow() · 00be8758
      Mike Rapoport (IBM) authored
      Since commit f6f37d93 ("arm64: select KASAN_VMALLOC for SW/HW_TAGS
      modes") KASAN_VMALLOC is always enabled when KASAN is on. This means
      that allocations in module_alloc() will be tracked by KASAN protection
      for vmalloc(), so kasan_alloc_module_shadow() will always be an
      empty inline and there is no point in calling it.
      
      Drop the meaningless call to kasan_alloc_module_shadow() from
      module_alloc().
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • kallsyms: replace deprecated strncpy with strscpy · 086437d9
      Justin Stitt authored
      strncpy() is deprecated for use on NUL-terminated destination strings
      [1] and as such we should prefer more robust and less ambiguous string
      interfaces. The goal is to remove its use completely [2].
      
      namebuf is eventually cleaned of any trailing llvm suffixes using
      strstr(). This hints that namebuf should be NUL-terminated.
      
      static void cleanup_symbol_name(char *s)
      {
      	char *res;
      	...
      	res = strstr(s, ".llvm.");
      	...
      }
      
      Due to this, use strscpy() over strncpy() as it guarantees
      NUL-termination on the destination buffer. Drop the -1 from the length
      calculation as it is no longer needed to ensure NUL-termination.
      
      Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
      Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
      Link: https://github.com/KSPP/linux/issues/90 [2]
      Cc: linux-hardening@vger.kernel.org
      Signed-off-by: Justin Stitt <justinstitt@google.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • module: allow UNUSED_KSYMS_WHITELIST to be relative against objtree. · 8d0b7288
      Yifan Hong authored
      If UNUSED_KSYMS_WHITELIST is a file generated
      before Kbuild runs, and the source tree is in
      a read-only filesystem, the developer must put
      the file somewhere and specify an absolute
      path to UNUSED_KSYMS_WHITELIST. This worked,
      but if IKCONFIG=y, an absolute path is embedded
      into .config and eventually into vmlinux, causing
      the build to be less reproducible when building
      on a different machine.
      
      This patch makes the handling of
      UNUSED_KSYMS_WHITELIST similar to that of
      MODULE_SIG_KEY.
      
      First, check if UNUSED_KSYMS_WHITELIST is an
      absolute path, just as before this patch. If so,
      use the path as is.
      
      If it is a relative path, use wildcard to check
      the existence of the file below objtree first.
      If it does not exist, fall back to the original
      behavior of adding $(srctree)/ before the value.
      
      After this patch, the developer can put the generated
      file in objtree, then use a relative path against
      objtree in .config, eradicating any absolute paths
      that may be evaluated differently on different machines.
      Signed-off-by: Yifan Hong <elsk@google.com>
      Reviewed-by: Elliot Berman <quic_eberman@quicinc.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
    • Merge tag 'x86-shstk-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a5131c3f
      Linus Torvalds authored
      Pull x86 shadow stacks from Ingo Molnar:
       "Enable shadow stacks for x32.
      
        While we normally don't do such feature-enabling for 32-bit anymore,
        this change is small, straightforward & tested on upstream glibc"
      
      * tag 'x86-shstk-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/shstk: Enable shadow stacks for x32
    • Merge tag 'x86-platform-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5f487cd8
      Linus Torvalds authored
      Pull x86 platform updates from Ingo Molnar:
      
       - Improve the DeviceTree (OF) NUMA enumeration code to address
         kernel warnings & mis-mappings on DeviceTree platforms
      
       - Migrate x86 platform drivers to the .remove_new callback API
      
       - Misc cleanups & fixes
      
      * tag 'x86-platform-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/platform/olpc-xo1-sci: Convert to platform remove callback returning void
        x86/platform/olpc-x01-pm: Convert to platform remove callback returning void
        x86/platform/iris: Convert to platform remove callback returning void
        x86/of: Change x86_dtb_parse_smp_config() to static
        x86/of: Map NUMA node to CPUs as per DeviceTree
        x86/of: Set the parse_smp_cfg for all the DeviceTree platforms by default
        x86/hyperv/vtl: Correct x86_init.mpparse.parse_smp_cfg assignment
    • Merge tag 'x86-percpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e76f69b9
      Linus Torvalds authored
      Pull x86 percpu updates from Ingo Molnar:
      
       - Expand the named address spaces optimizations down to
         GCC 9.1+.
      
       - Re-enable named address spaces with sanitizers for GCC 13.3+
      
       - Generate better this_percpu_xchg_op() code
      
       - Introduce raw_cpu_read_long() to reduce ifdeffery
      
       - Simplify the x86_this_cpu_test_bit() et al macros
      
       - Address Sparse warnings
      
       - Misc cleanups & fixes
      
      * tag 'x86-percpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/percpu: Introduce raw_cpu_read_long() to reduce ifdeffery
        x86/percpu: Rewrite x86_this_cpu_test_bit() and friends as macros
        x86/percpu: Fix x86_this_cpu_variable_test_bit() asm template
        x86/percpu: Re-enable named address spaces with sanitizers for GCC 13.3+
        x86/percpu: Use __force to cast from __percpu address space
        x86/percpu: Do not use this_cpu_read_stable_8() for 32-bit targets
        x86/percpu: Unify arch_raw_cpu_ptr() defines
        x86/percpu: Enable named address spaces for GCC 9.1+
        x86/percpu: Re-enable named address spaces with KASAN for GCC 13.3+
        x86/percpu: Move raw_percpu_xchg_op() to a better place
        x86/percpu: Convert this_percpu_xchg_op() from asm() to C code, to generate better code
    • Merge tag 'x86-mm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · eabb6297
      Linus Torvalds authored
      Pull x86 mm updates from Ingo Molnar:
      
       - Fix W^X violation check false-positives in the CPA code
         when running as a Xen PV guest
      
       - Fix W^X violation warning false-positives in show_fault_oops()
      
      * tag 'x86-mm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/pat: Fix W^X violation false-positives when running as Xen PV guest
        x86/pat: Restructure _lookup_address_cpa()
        x86/mm: Use lookup_address_in_pgd_attr() in show_fault_oops()
        x86/pat: Introduce lookup_address_in_pgd_attr()
    • Merge tag 'x86-fpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 963795f7
      Linus Torvalds authored
      Pull x86 fpu updates from Ingo Molnar:
      
       - Fix asm() constraints & modifiers in restore_fpregs_from_fpstate()
      
       - Update comments
      
       - Robustify the free_vm86() definition
      
      * tag 'x86-fpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/fpu: Update fpu_swap_kvm_fpu() uses in comments as well
        x86/vm86: Make sure the free_vm86(task) definition uses its parameter even in the !CONFIG_VM86 case
        x86/fpu: Fix AMD X86_BUG_FXSAVE_LEAK fixup
    • Merge tag 'x86-entry-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 31a568b5
      Linus Torvalds authored
      Pull x86 entry cleanup from Ingo Molnar:
      
       - Merge thunk_64.S and thunk_32.S into thunk.S
      
      * tag 'x86-entry-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry: Merge thunk_64.S and thunk_32.S into thunk.S
    • Merge tag 'x86-cpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ecd83bcb
      Linus Torvalds authored
      Pull x86 cpu updates from Ingo Molnar:
      
       - Rework the x86 CPU vendor/family/model code: introduce the 'VFM'
         value that is an 8+8+8 bit concatenation of the vendor/family/model
         value, and add macros that work on VFM values. This simplifies the
         addition of new Intel models & families, and simplifies existing
         enumeration & quirk code.
      
       - Add support for the AMD 0x80000026 leaf, to better parse topology
         information
      
       - Optimize the NUMA allocation layout of more per-CPU data structures
      
       - Improve the workaround for AMD erratum 1386
      
       - Clear TME from /proc/cpuinfo as well, when disabled by the firmware
      
       - Improve x86 self-tests
      
       - Extend the mce_record tracepoint with the ::ppin and ::microcode fields
      
       - Implement recovery for MCE errors in TDX/SEAM non-root mode
      
       - Misc cleanups and fixes
      
      * tag 'x86-cpu-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
        x86/mm: Switch to new Intel CPU model defines
        x86/tsc_msr: Switch to new Intel CPU model defines
        x86/tsc: Switch to new Intel CPU model defines
        x86/cpu: Switch to new Intel CPU model defines
        x86/resctrl: Switch to new Intel CPU model defines
        x86/microcode/intel: Switch to new Intel CPU model defines
        x86/mce: Switch to new Intel CPU model defines
        x86/cpu: Switch to new Intel CPU model defines
        x86/cpu/intel_epb: Switch to new Intel CPU model defines
        x86/aperfmperf: Switch to new Intel CPU model defines
        x86/apic: Switch to new Intel CPU model defines
        perf/x86/msr: Switch to new Intel CPU model defines
        perf/x86/intel/uncore: Switch to new Intel CPU model defines
        perf/x86/intel/pt: Switch to new Intel CPU model defines
        perf/x86/lbr: Switch to new Intel CPU model defines
        perf/x86/intel/cstate: Switch to new Intel CPU model defines
        x86/bugs: Switch to new Intel CPU model defines
        x86/bugs: Switch to new Intel CPU model defines
        x86/cpu/vfm: Update arch/x86/include/asm/intel-family.h
        x86/cpu/vfm: Add new macros to work with (vendor/family/model) values
        ...
    • Merge tag 'x86-cleanups-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c4273a66
      Linus Torvalds authored
      Pull x86 cleanups from Ingo Molnar:
      
       - Fix function prototypes to address clang function type cast
         warnings in the math-emu code
      
       - Reorder definitions in <asm/msr-index.h>
      
       - Remove unused code
      
       - Fix typos
      
       - Simplify #include sections
      
      * tag 'x86-cleanups-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/pci/ce4100: Remove unused 'struct sim_reg_op'
        x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place
        x86/math-emu: Fix function cast warnings
        x86/extable: Remove unused fixup type EX_TYPE_COPY
        x86/rtc: Remove unused intel-mid.h
        x86/32: Remove unused IA32_STACK_TOP and two externs
        x86/head: Simplify relative include path to xen-head.S
        x86/fred: Fix typo in Kconfig description
        x86/syscall/compat: Remove ia32_unistd.h
        x86/syscall/compat: Remove unused macro __SYSCALL_ia32_NR
        x86/virt/tdx: Remove duplicate include
        x86/xen: Remove duplicate #include
    • Merge tag 'x86-build-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d71ec0ed
      Linus Torvalds authored
      Pull x86 build updates from Ingo Molnar:
      
       - Use -fpic to build the kexec 'purgatory' (the self-contained
         code that runs between two kernels)
      
       - Clean up vmlinux.lds.S generation
      
       - Simplify the X86_EXTENDED_PLATFORM section of the x86 Kconfig
      
       - Misc cleanups & fixes
      
      * tag 'x86-build-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/Kconfig: Merge the two CONFIG_X86_EXTENDED_PLATFORM entries
        x86/purgatory: Switch to the position-independent small code model
        x86/boot: Replace __PHYSICAL_START with LOAD_PHYSICAL_ADDR
        x86/vmlinux.lds.S: Take __START_KERNEL out conditional definition
        x86/vmlinux.lds.S: Remove conditional definition of LOAD_OFFSET
        vmlinux.lds.h: Fix a typo in comment
    • Merge tag 'x86-bugs-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7e359145
      Linus Torvalds authored
      Pull x86 oops message cleanup from Ingo Molnar:
      
       - Use uniform "Oops: " prefix for die() messages
      
      * tag 'x86-bugs-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/dumpstack: Use uniform "Oops: " prefix for die() messages
    • Merge tag 'x86-boot-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9d8e0d52
      Linus Torvalds authored
      Pull x86 boot updates from Ingo Molnar:
      
       - Move the kernel cmdline setup earlier in the boot process (again),
         to address a split_lock_detect= boot parameter bug
      
       - Ignore relocations in .notes sections
      
       - Simplify boot stack setup
      
       - Re-introduce a bootloader quirk wrt CR4 handling
      
       - Miscellaneous cleanups & fixes
      
      * tag 'x86-boot-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot/64: Clear most of CR4 in startup_64(), except PAE, MCE and LA57
        x86/boot: Move kernel cmdline setup earlier in the boot process (again)
        x86/build: Clean up arch/x86/tools/relocs.c a bit
        x86/boot: Ignore relocations in .notes sections in walk_relocs() too
        x86: Rename __{start,end}_init_task to __{start,end}_init_stack
        x86/boot: Simplify boot stack setup
    • Merge tag 'x86-asm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d791a4da
      Linus Torvalds authored
      Pull x86 asm updates from Ingo Molnar:
      
       - Clean up & fix asm() operand modifiers & constraints
      
       - Misc cleanups
      
      * tag 'x86-asm-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/alternatives: Remove a superfluous newline in _static_cpu_has()
        x86/asm/64: Clean up memset16(), memset32(), memset64() assembly constraints in <asm/string_64.h>
        x86/asm: Use "m" operand constraint in WRUSSQ asm template
        x86/asm: Use %a instead of %P operand modifier in asm templates
        x86/asm: Use %c/%n instead of %P operand modifier in asm templates
        x86/asm: Remove %P operand modifier from altinstr asm templates
    • Merge tag 'x86-misc-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 019040fb
      Linus Torvalds authored
      Pull tip tree documentation update from Ingo Molnar:
      
       - Update the -tip maintainers merge policy document wrt
         merge window timing
      
      * tag 'x86-misc-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Documentation/maintainer-tip: Clarify merge window policy
    • Merge tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6e5a0c30
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
      
       - Add cpufreq pressure feedback for the scheduler
      
       - Rework misfit load-balancing wrt affinity restrictions
      
       - Clean up and simplify the code around ::overutilized and
         ::overload access.
      
       - Simplify sched_balance_newidle()
      
       - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
         handling that changed the output.
      
       - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()
      
       - Reorganize, clean up and unify most of the higher level
         scheduler balancing function names around the sched_balance_*()
         prefix
      
       - Simplify the balancing flag code (sched_balance_running)
      
       - Miscellaneous cleanups & fixes
      
      * tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
        sched/pelt: Remove shift of thermal clock
        sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
        thermal/cpufreq: Remove arch_update_thermal_pressure()
        sched/cpufreq: Take cpufreq feedback into account
        cpufreq: Add a cpufreq pressure feedback for the scheduler
        sched/fair: Fix update of rd->sg_overutilized
        sched/vtime: Do not include <asm/vtime.h> header
        s390/irq,nmi: Include <asm/vtime.h> header directly
        s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
        sched/vtime: Get rid of generic vtime_task_switch() implementation
        sched/vtime: Remove confusing arch_vtime_task_switch() declaration
        sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
        sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
        sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
        sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
        sched/fair: Rename root_domain::overload to ::overloaded
        sched/fair: Use helper functions to access root_domain::overload
        sched/fair: Check root_domain::overload value before update
        sched/fair: Combine EAS check with root_domain::overutilized access
        sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
        ...
    • Merge tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 17ca7fc2
      Linus Torvalds authored
      Pull perf events updates from Ingo Molnar:
      
       - Combine perf and BPF for fast evaluation of HW breakpoint
         conditions
      
       - Add LBR capture support outside of hardware events
      
       - Trigger IO signals for watermark_wakeup
      
       - Add RAPL support for Intel Arrow Lake and Lunar Lake
      
       - Optimize frequency-throttling
      
       - Miscellaneous cleanups & fixes
      
      * tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
        perf/bpf: Mark perf_event_set_bpf_handler() and perf_event_free_bpf_handler() as inline too
        selftests/perf_events: Test FASYNC with watermark wakeups
        perf/ring_buffer: Trigger IO signals for watermark_wakeup
        perf: Move perf_event_fasync() to perf_event.h
        perf/bpf: Change the !CONFIG_BPF_SYSCALL stubs to static inlines
        selftest/bpf: Test a perf BPF program that suppresses side effects
        perf/bpf: Allow a BPF program to suppress all sample side effects
        perf/bpf: Remove unneeded uses_default_overflow_handler()
        perf/bpf: Call BPF handler directly, not through overflow machinery
        perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event members
        perf/bpf: Create bpf_overflow_handler() stub for !CONFIG_BPF_SYSCALL
        perf/bpf: Reorder bpf_overflow_handler() ahead of __perf_event_overflow()
        perf/x86/rapl: Add support for Intel Lunar Lake
        perf/x86/rapl: Add support for Intel Arrow Lake
        perf/core: Reduce PMU access to adjust sample freq
        perf/core: Optimize perf_adjust_freq_unthr_context()
        perf/x86/amd: Don't reject non-sampling events with configured LBR
        perf/x86/amd: Support capturing LBR from software events
        perf/x86/amd: Avoid taking branches before disabling LBR
        perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined
        ...
    • Merge tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 48fc82c4
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
      
       - Over a dozen code generation micro-optimizations for the atomic
         and spinlock code
      
       - Add more __ro_after_init attributes
      
       - Robustify the lockevent_*() macros
      
      * tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/pvqspinlock/x86: Use _Q_LOCKED_VAL in PV_UNLOCK_ASM macro
        locking/qspinlock/x86: Micro-optimize virt_spin_lock()
        locking/atomic/x86: Merge __arch{,_try}_cmpxchg64_emu_local() with __arch{,_try}_cmpxchg64_emu()
        locking/atomic/x86: Introduce arch_try_cmpxchg64_local()
        locking/pvqspinlock/x86: Remove redundant CMP after CMPXCHG in __raw_callee_save___pv_queued_spin_unlock()
        locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.h
        locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending()
        locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()
        locking/atomic/x86: Define arch_atomic_sub() family using arch_atomic_add() functions
        locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions
        locking/atomic/x86: Introduce arch_atomic64_read_nonatomic() to x86_32
        locking/atomic/x86: Introduce arch_atomic64_try_cmpxchg() to x86_32
        locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64
        locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
        locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
        x86/tsc: Make __use_tsc __ro_after_init
        x86/kvm: Make kvm_async_pf_enabled __ro_after_init
        context_tracking: Make context_tracking_key __ro_after_init
        jump_label,module: Don't alloc static_key_mod for __ro_after_init keys
        locking/qspinlock: Always evaluate lockevent* non-event parameter once
      48fc82c4
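Several of the locking commits above convert cmpxchg()-style loops to try_cmpxchg(). The following is a minimal user-space sketch of that conversion, assuming C11 atomics as a stand-in for the kernel primitives. The helper names (cmpxchg_int, inc_below_cmpxchg, inc_below_try_cmpxchg) are hypothetical and do not appear in the kernel. The point it illustrates: a try_cmpxchg()-shaped operation reports success as a boolean and refreshes the expected value on failure, so the caller does not need a second comparison.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Old shape: a cmpxchg-style helper returns the value it found, and the
 * caller compares that value against the expected one a second time.
 * (Hypothetical helper built on C11 atomics, not the kernel's cmpxchg().)
 */
static int cmpxchg_int(atomic_int *v, int old, int new_val)
{
	assert(v != NULL);
	/* On failure, C11 writes the value it found into 'old'. */
	atomic_compare_exchange_strong(v, &old, new_val);
	return old;
}

/* Increment *v unless it has reached 'limit' (old loop shape). */
static bool inc_below_cmpxchg(atomic_int *v, int limit)
{
	int old = atomic_load(v);

	for (;;) {
		int prev;

		if (old >= limit)
			return false;
		prev = cmpxchg_int(v, old, old + 1);
		if (prev == old)	/* the extra compare */
			return true;
		old = prev;		/* retry with the value we saw */
	}
}

/*
 * New shape: the compare-exchange itself reports success and refreshes
 * 'old' on failure, so the loop needs no separate comparison.
 */
static bool inc_below_try_cmpxchg(atomic_int *v, int limit)
{
	int old = atomic_load(v);

	do {
		if (old >= limit)
			return false;
	} while (!atomic_compare_exchange_strong(v, &old, old + 1));

	return true;
}
```

Both functions behave identically. The second form simply gives the compiler the chance to branch on the result of the exchange directly, which is the kind of code generation improvement the pull message describes.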
  2. 13 May, 2024 11 commits
    • Merge tag 'tag-chrome-platform-firmware-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux · a7c840ba
      Linus Torvalds authored
      Pull chrome platform firmware updates from Tzung-Bi Shih:
      
       - Set driver owner in the core registration so that coreboot drivers
         don't need to set it individually
      
      * tag 'tag-chrome-platform-firmware-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
        firmware: google: cbmem: drop driver owner initialization
        firmware: coreboot: store owner from modules with coreboot_driver_register()
      a7c840ba
    • Merge tag 'tag-chrome-platform-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux · 59729c8a
      Linus Torvalds authored
      Pull chrome platform updates from Tzung-Bi Shih:
       "New:
         - Support Framework Laptop 13 and 16 (AMD Ryzen)
      
        Improvements:
         - Use sysfs_emit() instead of sprintf() for sysfs' show()
      
        Fixes:
         - Fix flex-array-member-not-at-end compiler warnings by using
           DEFINE_RAW_FLEX()
         - Add HAS_IOPORT dependencies
         - Fix long pending events during suspend after resume
      
        Misc cleanups:
         - Provide ID tables for avoiding fallback match
         - Replace deprecated UNIVERSAL_DEV_PM_OPS()"
      
      * tag 'tag-chrome-platform-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: (22 commits)
        platform/chrome: cros_ec: Handle events during suspend after resume completion
        platform/chrome: cros_ec_lpc: add quirks for the Framework Laptop (AMD)
        platform/chrome: cros_ec_lpc: add a "quirks" system
        platform/chrome: cros_ec_lpc: pass driver_data from DMI to the device
        platform/chrome: cros_ec_lpc: introduce a priv struct for the lpc device
        platform/chrome: add HAS_IOPORT dependencies
        platform/chrome: cros_hps_i2c: Replace deprecated UNIVERSAL_DEV_PM_OPS()
        platform/chrome: cros_kbd_led_backlight: provide ID table for avoiding fallback match
        platform/chrome: wilco_ec: core: provide ID table for avoiding fallback match
        platform/chrome: wilco_ec: event: remove redundant MODULE_ALIAS
        platform/chrome: wilco_ec: debugfs: provide ID table for avoiding fallback match
        platform/chrome: wilco_ec: telemetry: provide ID table for avoiding fallback match
        platform/chrome: cros_ec_vbc: provide ID table for avoiding fallback match
        platform/chrome: cros_ec_lightbar: provide ID table for avoiding fallback match
        platform/chrome: cros_ec_sysfs: provide ID table for avoiding fallback match
        platform/chrome: cros_ec_debugfs: provide ID table for avoiding fallback match
        platform/chrome: cros_ec_chardev: provide ID table for avoiding fallback match
        platform/chrome: cros_usbpd_notify: provide ID table for avoiding fallback match
        platform/chrome: cros_usbpd_logger: provide ID table for avoiding fallback match
        platform/chrome: cros_ec_sensorhub: provide ID table for avoiding fallback match
        ...
      59729c8a
    • Merge tag 'rust-6.10' of https://github.com/Rust-for-Linux/linux · 8f5b5f78
      Linus Torvalds authored
      Pull Rust updates from Miguel Ojeda:
       "The most notable change is the drop of the 'alloc' in-tree fork. This
        is nicely reflected in the diffstat as a ~10k lines drop. In turn,
        this makes the version upgrades way simpler and smaller in the future,
        e.g. the latest one in commit 56f64b37 ("rust: upgrade to Rust
        1.78.0").
      
        More importantly, this increases the chances that a newer compiler
        version just works, which in turn means supporting several compiler
        versions is easier now. Thus we will look into finally setting a
        minimum version in the near future.
      
        Toolchain and infrastructure:
      
         - Upgrade to Rust 1.78.0
      
           This time around, due to how the kernel and Rust schedules have
           aligned, there are two upgrades in fact. These allow us to remove
           one more unstable feature ('offset_of') from the list, among other
           improvements
      
         - Drop 'alloc' in-tree fork of the standard library crate, which
           means all the unstable features used by 'alloc' (~30 language ones,
           ~60 library ones) are not a concern anymore
      
         - Support DWARFv5 via the '-Zdwarf-version' flag
      
         - Support zlib and zstd debuginfo compression via the
           '-Zdebuginfo-compression' flag
      
        'kernel' crate:
      
         - Support allocation flags ('GFP_*'), particularly in 'Box' (via
           'BoxExt'), 'Vec' (via 'VecExt'), 'Arc' and 'UniqueArc', as well as
           in the 'init' module APIs
      
         - Remove usage of the 'allocator_api' unstable feature
      
         - Remove 'try_' prefix in allocation APIs' names
      
         - Add 'VecExt' (an extension trait) to be able to drop the 'alloc'
           fork
      
         - Add the '{make,to}_{upper,lower}case()' methods to 'CStr'/'CString'
      
         - Add the 'as_ptr' method to 'ThisModule'
      
         - Add the 'from_raw' method to 'ArcBorrow'
      
         - Add the 'into_unique_or_drop' method to 'Arc'
      
         - Display column number in the 'dbg!' macro output by applying the
           equivalent change done to the standard library one
      
         - Migrate 'Work' to '#[pin_data]' thanks to the changes in the
           'macros' crate, which allows removing an unsafe call in its 'new'
           associated function
      
         - Prevent namespacing issues when using the '[try_][pin_]init!'
           macros by changing the generated name of guard variables
      
         - Make the 'get' method in 'Opaque' const
      
         - Implement the 'Default' trait for 'LockClassKey'
      
         - Remove unneeded 'kernel::prelude' imports from doctests
      
         - Remove redundant imports
      
        'macros' crate:
      
         - Add 'decl_generics' to 'parse_generics()' to support default
           values, and use that to allow them in '#[pin_data]'
      
        Helpers:
      
         - Trivial English grammar fix
      
        Documentation:
      
         - Add section on Rust Kselftests to the 'Testing' document
      
         - Expand the 'Abstractions vs. bindings' section of the 'General
           Information' document"
      
      * tag 'rust-6.10' of https://github.com/Rust-for-Linux/linux: (31 commits)
        rust: alloc: fix dangling pointer in VecExt<T>::reserve()
        rust: upgrade to Rust 1.78.0
        rust: kernel: remove redundant imports
        rust: sync: implement `Default` for `LockClassKey`
        docs: rust: extend abstraction and binding documentation
        docs: rust: Add instructions for the Rust kselftest
        rust: remove unneeded `kernel::prelude` imports from doctests
        rust: update `dbg!()` to format column number
        rust: helpers: Fix grammar in comment
        rust: init: change the generated name of guard variables
        rust: sync: add `Arc::into_unique_or_drop`
        rust: sync: add `ArcBorrow::from_raw`
        rust: types: Make Opaque::get const
        rust: kernel: remove usage of `allocator_api` unstable feature
        rust: init: update `init` module to take allocation flags
        rust: sync: update `Arc` and `UniqueArc` to take allocation flags
        rust: alloc: update `VecExt` to take allocation flags
        rust: alloc: introduce the `BoxExt` trait
        rust: alloc: introduce allocation flags
        rust: alloc: remove our fork of the `alloc` crate
        ...
      8f5b5f78
    • Merge tag 'v6.10-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 84c7d76b
      Linus Torvalds authored
      Pull crypto updates from Herbert Xu:
       "API:
         - Remove crypto stats interface
      
        Algorithms:
         - Add faster AES-XTS on modern x86_64 CPUs
         - Forbid curves with order less than 224 bits in ecc (FIPS 186-5)
         - Add ECDSA NIST P521
      
        Drivers:
         - Expose otp zone in atmel
         - Add dh fallback for primes > 4K in qat
         - Add interface for live migration in qat
         - Use dma for aes requests in starfive
         - Add full DMA support for stm32mpx in stm32
         - Add Tegra Security Engine driver
      
        Others:
         - Introduce scope-based x509_certificate allocation"
      
      * tag 'v6.10-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (123 commits)
        crypto: atmel-sha204a - provide the otp content
        crypto: atmel-sha204a - add reading from otp zone
        crypto: atmel-i2c - rename read function
        crypto: atmel-i2c - add missing arg description
        crypto: iaa - Use kmemdup() instead of kzalloc() and memcpy()
        crypto: sahara - use 'time_left' variable with wait_for_completion_timeout()
        crypto: api - use 'time_left' variable with wait_for_completion_killable_timeout()
        crypto: caam - i.MX8ULP donot have CAAM page0 access
        crypto: caam - init-clk based on caam-page0-access
        crypto: starfive - Use fallback for unaligned dma access
        crypto: starfive - Do not free stack buffer
        crypto: starfive - Skip unneeded fallback allocation
        crypto: starfive - Skip dma setup for zeroed message
        crypto: hisilicon/sec2 - fix for register offset
        crypto: hisilicon/debugfs - mask the unnecessary info from the dump
        crypto: qat - specify firmware files for 402xx
        crypto: x86/aes-gcm - simplify GCM hash subkey derivation
        crypto: x86/aes-gcm - delete unused GCM assembly code
        crypto: x86/aes-xts - simplify loop in xts_crypt_slowpath()
        hwrng: stm32 - repair clock handling
        ...
      84c7d76b
    • Merge tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 87caef42
      Linus Torvalds authored
      Pull hardening updates from Kees Cook:
       "The bulk of the changes here are related to refactoring and expanding
        the KUnit tests for string helper and fortify behavior.
      
        Some trivial strncpy replacements in fs/ were carried in my tree. Also
        some fixes to SCSI string handling were carried in my tree since the
        helper for those was introduced here. Beyond that, just little fixes
        all around: objtool getting confused about LKDTM+KCFI, preparing for
        future refactors (constification of sysctl tables, additional
        __counted_by annotations), a Clang UBSAN+i386 crash fix, and adding
        more options in the hardening.config Kconfig fragment.
      
        Summary:
      
         - selftests: Add str*cmp tests (Ivan Orlov)
      
         - __counted_by: provide UAPI for _le/_be variants (Erick Archer)
      
         - Various strncpy deprecation refactors (Justin Stitt)
      
         - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas
           Weißschuh)
      
         - UBSAN: Work around i386 -regparm=3 bug with Clang prior to
           version 19
      
         - Provide helper to deal with non-NUL-terminated string copying
      
         - SCSI: Fix older string copying bugs (with new helper)
      
         - selftests: Consolidate string helper behavioral tests
      
         - selftests: add memcpy() fortify tests
      
         - string: Add additional __realloc_size() annotations for "dup"
           helpers
      
         - LKDTM: Fix KCFI+rodata+objtool confusion
      
         - hardening.config: Enable KCFI"
      
      * tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (29 commits)
        uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be}
        stackleak: Use a copy of the ctl_table argument
        string: Add additional __realloc_size() annotations for "dup" helpers
        kunit/fortify: Fix replaced failure path to unbreak __alloc_size
        hardening: Enable KCFI and some other options
        lkdtm: Disable CFI checking for perms functions
        kunit/fortify: Add memcpy() tests
        kunit/fortify: Do not spam logs with fortify WARNs
        kunit/fortify: Rename tests to use recommended conventions
        init: replace deprecated strncpy with strscpy_pad
        kunit/fortify: Fix mismatched kvalloc()/vfree() usage
        scsi: qla2xxx: Avoid possible run-time warning with long model_num
        scsi: mpi3mr: Avoid possible run-time warning with long manufacturer strings
        scsi: mptfusion: Avoid possible run-time warning with long manufacturer strings
        fs: ecryptfs: replace deprecated strncpy with strscpy
        hfsplus: refactor copy_name to not use strncpy
        reiserfs: replace deprecated strncpy with scnprintf
        virt: acrn: replace deprecated strncpy with strscpy
        ubsan: Avoid i386 UBSAN handler crashes with Clang
        ubsan: Remove 1-element array usage in debug reporting
        ...
      87caef42
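The strncpy() replacements listed above exist because strncpy() leaves the destination without a terminating NUL when the source fills the buffer. The following is a rough user-space sketch of the strscpy() contract as commonly documented: always NUL-terminate a non-empty destination, and return the number of characters copied or -E2BIG on truncation. The name sketch_strscpy and the `long` return type are assumptions made for this illustration; the kernel's own implementation differs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for a strscpy()-style copy.
 * Returns the number of characters copied (excluding the NUL), or
 * -E2BIG if the source had to be truncated or 'size' is zero.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t len = 0;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	/* Measure src, looking at no more than 'size' bytes. */
	while (len < size && src[len] != '\0')
		len++;

	if (len == size) {
		/* Source does not fit: copy what fits and terminate. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (long)len;
}
```

With a 4-byte buffer, copying "ab" returns 2, while copying "abcdef" stores "abc" and returns -E2BIG, so the caller can detect truncation instead of silently ending up with an unterminated string.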
    • Merge tag 'execve-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 92f74f7f
      Linus Torvalds authored
      Pull execve updates from Kees Cook:
      
       - Provide knob to change (previously fixed) coredump NOTES size
         (Allen Pais)
      
       - Add sched_prepare_exec tracepoint (Marco Elver)
      
       - Make /proc/$pid/auxv work under binfmt_elf_fdpic (Max Filippov)
      
       - Convert ARCH_HAVE_EXTRA_ELF_NOTES to proper Kconfig (Vignesh
         Balasubramanian)
      
       - Leave a gap between .bss and brk
      
      * tag 'execve-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        fs/coredump: Enable dynamic configuration of max file note size
        binfmt_elf_fdpic: fix /proc/<pid>/auxv
        binfmt_elf: Leave a gap between .bss and brk
        Replace macro "ARCH_HAVE_EXTRA_ELF_NOTES" with kconfig
        tracing: Add sched_prepare_exec tracepoint
      92f74f7f
    • Merge tag 'seccomp-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1ba58f1a
      Linus Torvalds authored
      Pull seccomp update from Kees Cook:
      
       - Prepare for sysctl table constification
      
      * tag 'seccomp-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        seccomp: Constify sysctl subhelpers
      1ba58f1a
    • Merge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux · 0c9f4ac8
      Linus Torvalds authored
      Pull block updates from Jens Axboe:
      
       - Add a partscan attribute in sysfs, fixing an issue with systemd
         relying on an internal interface that went away.
      
       - Attempt #2 at making long running discards interruptible. The
         previous attempt went into 6.9, but we ended up mostly reverting it
         as it had issues.
      
       - Remove old ida_simple API in bcache
      
       - Support for zoned write plugging, greatly improving the performance
         on zoned devices.
      
       - Remove the old throttle low interface, which has been experimental
         since 2017 and never made it beyond that and isn't being used.
      
       - Remove page->index debugging checks in brd, as they haven't caught
         anything, and removing them prepares us for removing page->index from
         struct page.
      
       - MD pull request from Song
      
       - Don't schedule block workers on isolated CPUs
      
      * tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits)
        blk-throttle: delay initialization until configuration
        blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW
        block: fix that util can be greater than 100%
        block: support to account io_ticks precisely
        block: add plug while submitting IO
        bcache: fix variable length array abuse in btree_iter
        bcache: Remove usage of the deprecated ida_simple_xx() API
        md: Revert "md: Fix overflow in is_mddev_idle"
        blk-lib: check for kill signal in ioctl BLKDISCARD
        block: add a bio_await_chain helper
        block: add a blk_alloc_discard_bio helper
        block: add a bio_chain_and_submit helper
        block: move discard checks into the ioctl handler
        block: remove the discard_granularity check in __blkdev_issue_discard
        block/ioctl: prefer different overflow check
        null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION()
        block: fix and simplify blkdevparts= cmdline parsing
        block: refine the EOF check in blkdev_iomap_begin
        block: add a partscan sysfs attribute for disks
        block: add a disk_has_partscan helper
        ...
      0c9f4ac8
    • Merge tag 'for-6.10/io_uring-20240511' of git://git.kernel.dk/linux · 9961a785
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
      
       - Greatly improve send zerocopy performance, by enabling coalescing of
         sent buffers.
      
         MSG_ZEROCOPY already does this with send(2) and sendmsg(2), but the
         io_uring side did not. In local testing, the crossover point for send
         zerocopy being faster is now around 3000 byte packets, and it
         performs better than the sync syscall variants as well.
      
         This feature relies on a shared branch with net-next, which was
         pulled into both branches.
      
       - Unification of how async preparation is done across opcodes.
      
         Previously, opcodes that required extra memory for async retry would
         allocate that as needed, using on-stack state until that was the
         case. If async retry was needed, the on-stack state was adjusted
         appropriately for a retry and then copied to the allocated memory.
      
         This led to some fragile and ugly code, particularly for read/write
         handling, and made storage retries more difficult than they needed to
         be. Allocate the memory upfront, as it's cheap from our pools, and
         use that state consistently both initially and also from the retry
         side.
      
       - Move away from using remap_pfn_range() for mapping the rings.
      
         This is really not the right interface to use and can cause lifetime
         issues or leaks. Additionally, it means the ring sq/cq arrays need to
         be physically contiguous, which can cause problems in production with
         larger rings when services are restarted, as memory can be very
         fragmented at that point.
      
         Move to using vm_insert_page(s) for the ring sq/cq arrays, and apply
         the same treatment to mapped ring provided buffers. This also helps
         unify the code we have dealing with allocating and mapping memory.
      
         Hard to see in the diffstat as we're adding a few features as well,
         but this also removes about 400 lines of code from the codebase.
      
       - Add support for bundles for send/recv.
      
         When used with provided buffers, bundles support sending or receiving
         more than one buffer at a time, improving the efficiency by only
         needing to call into the networking stack once for multiple sends or
         receives.
      
       - Tweaks for our accept operations, supporting both a DONTWAIT flag for
         skipping poll arm and retry if we can, and a POLLFIRST flag that the
         application can use to skip the initial accept attempt and rely
         purely on poll for triggering the operation. Both of these have
         identical flags on the receive side already.
      
       - Make the task_work ctx locking unconditional.
      
         We had various code paths here that would do a mix of lock/trylock
         and set the task_work state to whether or not it was locked. All of
         that goes away, we lock it unconditionally and get rid of the state
         flag indicating whether it's locked or not.
      
         The state struct still exists as an empty type, but can go away in
         the future.
      
       - Add support for specifying NOP completion values, allowing it to be
         used for error handling testing.
      
       - Use set/test bit for io-wq worker flags. Not strictly needed, but
         also doesn't hurt and helps silence a KCSAN warning.
      
       - Cleanups for io-wq locking and work assignments, closing a tiny race
         where cancelations would not be able to find the work item reliably.
      
       - Misc fixes, cleanups, and improvements
      
      * tag 'for-6.10/io_uring-20240511' of git://git.kernel.dk/linux: (97 commits)
        io_uring: support to inject result for NOP
        io_uring: fail NOP if non-zero op flags is passed in
        io_uring/net: add IORING_ACCEPT_POLL_FIRST flag
        io_uring/net: add IORING_ACCEPT_DONTWAIT flag
        io_uring/filetable: don't unnecessarily clear/reset bitmap
        io_uring/io-wq: Use set_bit() and test_bit() at worker->flags
        io_uring/msg_ring: cleanup posting to IOPOLL vs !IOPOLL ring
        io_uring: Require zeroed sqe->len on provided-buffers send
        io_uring/notif: disable LAZY_WAKE for linked notifs
        io_uring/net: fix sendzc lazy wake polling
        io_uring/msg_ring: reuse ctx->submitter_task read using READ_ONCE instead of re-reading it
        io_uring/rw: reinstate thread check for retries
        io_uring/notif: implement notification stacking
        io_uring/notif: simplify io_notif_flush()
        net: add callback for setting a ubuf_info to skb
        net: extend ubuf_info callback to ops structure
        io_uring/net: support bundles for recv
        io_uring/net: support bundles for send
        io_uring/kbuf: add helpers for getting/peeking multiple buffers
        io_uring/net: add provided buffer support for IORING_OP_SEND
        ...
      9961a785
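The "unification of async preparation" item above describes allocating per-request state upfront from a pool and using that one copy for both the first attempt and any retry, rather than starting on the stack and copying later. The following is a minimal sketch of that idea only. Every name in it (xfer_state, request, pool_get, request_prep, request_issue, request_complete) is hypothetical, and it is not derived from the io_uring source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; none are io_uring identifiers. */
struct xfer_state {
	size_t done;		/* bytes transferred so far */
	size_t remaining;	/* bytes still outstanding */
};

struct request {
	struct xfer_state *state;
};

#define POOL_SLOTS 4
static struct xfer_state pool[POOL_SLOTS];
static bool pool_busy[POOL_SLOTS];

/* Cheap allocation from a fixed pool; returns NULL when exhausted. */
static struct xfer_state *pool_get(void)
{
	for (size_t i = 0; i < POOL_SLOTS; i++) {
		if (!pool_busy[i]) {
			pool_busy[i] = true;
			return &pool[i];
		}
	}
	return NULL;
}

static void pool_put(struct xfer_state *s)
{
	size_t i = (size_t)(s - pool);

	assert(i < POOL_SLOTS && pool_busy[i]);
	pool_busy[i] = false;
}

/* Prep: allocate the state upfront, before the first attempt. */
static int request_prep(struct request *req, size_t len)
{
	req->state = pool_get();
	if (!req->state)
		return -1;
	req->state->done = 0;
	req->state->remaining = len;
	return 0;
}

/*
 * One attempt, used for the initial issue and for retries alike.
 * Moves at most 'can_move' bytes and returns what is still outstanding.
 */
static size_t request_issue(struct request *req, size_t can_move)
{
	struct xfer_state *s = req->state;
	size_t n = can_move < s->remaining ? can_move : s->remaining;

	s->done += n;
	s->remaining -= n;
	return s->remaining;
}

static void request_complete(struct request *req)
{
	pool_put(req->state);
	req->state = NULL;
}
```

Because the retry path calls the same request_issue() on the same state, there is no separate step that fixes up on-stack state and copies it into freshly allocated memory, which is the fragility the pull message points at.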
    • Merge tag 'vfs-6.10.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · f4e8d802
      Linus Torvalds authored
      Pull vfs rw iterator updates from Christian Brauner:
       "The core fs signalfd, userfaultfd, and timerfd subsystems did still
        use f_op->read() instead of f_op->read_iter(). Convert them over since
        we should aim to get rid of f_op->read() at some point.
      
        Aside from that, io_uring and others want to mark files as FMODE_NOWAIT
        so they can make use of per-IO nonblocking hints to enable more
        efficient IO. Converting those users to f_op->read_iter() allows them
        to be marked with FMODE_NOWAIT"
      
      * tag 'vfs-6.10.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        signalfd: convert to ->read_iter()
        userfaultfd: convert to ->read_iter()
        timerfd: convert to ->read_iter()
        new helper: copy_to_iter_full()
      f4e8d802
    • Merge tag 'vfs-6.10.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · ef31ea6c
      Linus Torvalds authored
      Pull netfs updates from Christian Brauner:
       "This reworks the netfslib writeback implementation so that pages read
        from the cache are written to the cache through ->writepages(),
        thereby allowing the fscache page flag to be retired.
      
        The reworking also:
      
         - builds on top of the new writeback_iter() infrastructure
      
         - makes it possible to use vectored write RPCs as discontiguous
           streams of pages can be accommodated
      
         - makes it easier to do simultaneous content crypto and stream
           division
      
         - provides support for retrying writes and re-dividing a stream
      
         - replaces the ->launder_folio() op, so that ->writepages() is used
           instead
      
         - uses mempools to allocate the netfs_io_request and
           netfs_io_subrequest structs to avoid allocation failure in the
           writeback path
      
        Some code that uses the fscache page flag is retained for
        compatibility purposes with nfs and ceph. The code is switched to
        using the synonymous private_2 label instead and marked with
        deprecation comments.
      
        The merge commit contains additional details on the new algorithm that
        I've left out of here as it would probably be excessively detailed.
      
        On top of the netfslib infrastructure this contains the work to
        convert cifs over to netfslib"
      
      * tag 'vfs-6.10.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (38 commits)
        cifs: Enable large folio support
        cifs: Remove some code that's no longer used, part 3
        cifs: Remove some code that's no longer used, part 2
        cifs: Remove some code that's no longer used, part 1
        cifs: Cut over to using netfslib
        cifs: Implement netfslib hooks
        cifs: Make add_credits_and_wake_if() clear deducted credits
        cifs: Add mempools for cifs_io_request and cifs_io_subrequest structs
        cifs: Set zero_point in the copy_file_range() and remap_file_range()
        cifs: Move cifs_loose_read_iter() and cifs_file_write_iter() to file.c
        cifs: Replace the writedata replay bool with a netfs sreq flag
        cifs: Make wait_mtu_credits take size_t args
        cifs: Use more fields from netfs_io_subrequest
        cifs: Replace cifs_writedata with a wrapper around netfs_io_subrequest
        cifs: Replace cifs_readdata with a wrapper around netfs_io_subrequest
        cifs: Use alternative invalidation to using launder_folio
        netfs, afs: Use writeback retry to deal with alternate keys
        netfs: Miscellaneous tidy ups
        netfs: Remove the old writeback code
        netfs: Cut over to using new writeback code
        ...
      ef31ea6c