1. 24 Mar, 2024 13 commits
    • Linux 6.9-rc1 · 4cece764
      Linus Torvalds authored
      4cece764
    • Merge tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · ab8de2db
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
      
       - Fix logic that is supposed to prevent placement of the kernel image
         below LOAD_PHYSICAL_ADDR
      
       - Use the firmware stack in the EFI stub when running in mixed mode
      
       - Clear BSS only once when using mixed mode
      
       - Check efi.get_variable() function pointer for NULL before trying to
         call it
      
      * tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: fix panic in kdump kernel
        x86/efistub: Don't clear BSS twice in mixed mode
        x86/efistub: Call mixed mode boot services on the firmware's stack
        efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min or higher address
      ab8de2db
    • Merge tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5e74df2f
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
      
       - Ensure that the encryption mask at boot is properly propagated on
         5-level page tables, otherwise the PGD entry is incorrectly set to
         non-encrypted, which causes system crashes during boot.
      
       - Undo the deferred 5-level page table setup as it cannot work with
         memory encryption enabled.
      
       - Prevent inconsistent XFD state on CPU hotplug, where the MSR is reset
         to the default value but the cached variable is not, so subsequent
         comparisons against the stale cache might yield the wrong result and
         as a consequence prevent the MSR from being updated.
      
       - Register the local APIC address only once in the MPPARSE enumeration
         to prevent triggering the related WARN_ONs() in the APIC and topology
         code.
      
       - Handle the case where no APIC is found gracefully by registering a
         fake APIC in the topology code. That makes all related topology
         functions work correctly and does not affect the actual APIC driver
         code at all.
      
       - Don't evaluate logical IDs during early boot as the local APIC IDs
         are not yet enumerated and the invoked function returns an error
         code. Nothing requires the logical IDs before the final CPUID
         enumeration takes place, which happens after the APIC enumeration.
      
       - Cure the fallout of the per CPU rework on UP which misplaced the
         copying of boot_cpu_data to per CPU data so that the final update to
         boot_cpu_data got lost which caused inconsistent state and boot
         crashes.
      
       - Use copy_from_kernel_nofault() in the kprobes setup as there is no
         guarantee that the address can be safely accessed.
      
       - Reorder struct members in struct saved_context to work around another
         kmemleak false positive
      
       - Remove the buggy code which tries to update the E820 kexec table for
         setup_data as that is never passed to the kexec kernel.
      
       - Update the resource control documentation to use the proper units.
      
       - Fix a Kconfig warning observed with tinyconfig
      
      * tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot/64: Move 5-level paging global variable assignments back
        x86/boot/64: Apply encryption mask to 5-level pagetable update
        x86/cpu: Add model number for another Intel Arrow Lake mobile processor
        x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
        Documentation/x86: Document that resctrl bandwidth control units are MiB
        x86/mpparse: Register APIC address only once
        x86/topology: Handle the !APIC case gracefully
        x86/topology: Don't evaluate logical IDs during early boot
        x86/cpu: Ensure that CPU info updates are propagated on UP
        kprobes/x86: Use copy_from_kernel_nofault() to read from unsafe address
        x86/pm: Work around false positive kmemleak report in msr_build_context()
        x86/kexec: Do not update E820 kexec table for setup_data
        x86/config: Fix warning for 'make ARCH=x86_64 tinyconfig'
      5e74df2f
    • Merge tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b136f68e
      Linus Torvalds authored
      Pull scheduler doc clarification from Thomas Gleixner:
       "A single update for the documentation of the base_slice_ns tunable to
        clarify that any value which is less than the tick slice has no effect
        because the scheduler tick is not guaranteed to happen within the set
        time slice"
      
      * tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/doc: Update documentation for base_slice_ns and CONFIG_HZ relation
      b136f68e
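
The relation between base_slice_ns and the tick described in this pull can be checked with a little arithmetic. This is a minimal sketch, assuming one tick lasts 1e9 / CONFIG_HZ nanoseconds; the helper names are hypothetical and the code models only the comparison, not actual scheduler behaviour.

```python
NSEC_PER_SEC = 1_000_000_000


def tick_period_ns(hz: int) -> int:
    """Length of one scheduler tick in nanoseconds for a given CONFIG_HZ."""
    return NSEC_PER_SEC // hz


def slice_is_below_tick(base_slice_ns: int, hz: int) -> bool:
    """True when the requested slice is shorter than one tick, which is the
    case the documentation update describes as having no effect."""
    return base_slice_ns < tick_period_ns(hz)


if __name__ == "__main__":
    # Compare a 3 ms slice against common CONFIG_HZ choices.
    for hz in (100, 250, 1000):
        print(hz, tick_period_ns(hz), slice_is_below_tick(3_000_000, hz))
```

With a 3 ms slice, the comparison is true for HZ=100 and HZ=250 (ticks of 10 ms and 4 ms) and false for HZ=1000 (tick of 1 ms).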
    • Merge tag 'dma-mapping-6.9-2024-03-24' of git://git.infradead.org/users/hch/dma-mapping · 864ad046
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
       "This has a set of swiotlb alignment fixes for sometimes very long
        standing bugs from Will. We've been discussing them for a while and
        they should be solid now"
      
      * tag 'dma-mapping-6.9-2024-03-24' of git://git.infradead.org/users/hch/dma-mapping:
        swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE
        iommu/dma: Force swiotlb_max_mapping_size on an untrusted device
        swiotlb: Fix alignment checks when both allocation and DMA masks are present
        swiotlb: Honour dma_alloc_coherent() alignment in swiotlb_alloc()
        swiotlb: Enforce page alignment in swiotlb_alloc()
        swiotlb: Fix double-allocation of slots due to broken alignment handling
      864ad046
    • efi: fix panic in kdump kernel · 62b71cd7
      Oleksandr Tymoshenko authored
      Check if get_next_variable() is actually a valid pointer before
      calling it. In the kdump kernel this method is set to NULL, which
      causes a panic during the kexec-ed kernel boot.
      
      Tested with QEMU and OVMF firmware.
      
      Fixes: bad267f9 ("efi: verify that variable services are supported")
      Signed-off-by: Oleksandr Tymoshenko <ovt@google.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      62b71cd7
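
The guard this commit describes, checking an optional callback before invoking it, can be modelled as below. This is only a sketch of the pattern in Python, not the kernel's C code; `EfiOps` and `variable_services_supported` are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional


class EfiOps:
    """Hypothetical stand-in for a table of EFI runtime service pointers."""

    def __init__(self, get_next_variable: Optional[Callable[[], str]] = None):
        # None models the NULL pointer seen in the kdump kernel.
        self.get_next_variable = get_next_variable


def variable_services_supported(ops: EfiOps) -> bool:
    """Return False instead of crashing when the callback is not set."""
    if ops.get_next_variable is None:
        return False
    # Only reached when the callback exists, so the call is safe.
    ops.get_next_variable()
    return True
```

Without the `is None` check, the call on an unset entry would raise, which is the Python analogue of the NULL dereference that caused the panic.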
    • x86/efistub: Don't clear BSS twice in mixed mode · df7ecce8
      Ard Biesheuvel authored
      Clearing BSS should only be done once, at the very beginning.
      efi_pe_entry() is the entrypoint from the firmware, which may not clear
      BSS and so it is done explicitly. However, efi_pe_entry() is also used
      as an entrypoint by the mixed mode startup code, in which case BSS will
      already have been cleared, and doing it again at this point will corrupt
      global variables holding the firmware's GDT/IDT and segment selectors.
      
      So make the memset() conditional on whether the EFI stub is running in
      native mode.
      
      Fixes: b3810c5a ("x86/efistub: Clear decompressor BSS in native EFI entrypoint")
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      df7ecce8
    • x86/efistub: Call mixed mode boot services on the firmware's stack · cefcd4fe
      Ard Biesheuvel authored
      Normally, the EFI stub calls into the EFI boot services using the stack
      that was live when the stub was entered. According to the UEFI spec,
      this stack needs to be at least 128k in size - this might seem large but
      all asynchronous processing and event handling in EFI runs from the same
      stack and so quite a lot of space may be used in practice.
      
      In mixed mode, the situation is a bit different: the bootloader calls
      the 32-bit EFI stub entry point, which calls the decompressor's 32-bit
      entry point, where the boot stack is set up, using a fixed allocation
      of 16k. This stack is still in use when the EFI stub is started in
      64-bit mode, and so all calls back into the EFI firmware will be using
      the decompressor's limited boot stack.
      
      Due to the placement of the boot stack right after the boot heap, any
      stack overruns have gone unnoticed. However, commit
      
        5c4feadb0011983b ("x86/decompressor: Move global symbol references to C code")
      
      moved the definition of the boot heap into C code, and now the boot
      stack is placed right at the base of BSS, where any overruns will
      corrupt the end of the .data section.
      
      While it would be possible to work around this by increasing the size of
      the boot stack, doing so would affect all x86 systems, and mixed mode
      systems are a tiny (and shrinking) fraction of the x86 installed base.
      
      So instead, record the firmware stack pointer value when entering from
      the 32-bit firmware, and switch to this stack every time an EFI boot
      service call is made.
      
      Cc: <stable@kernel.org> # v6.1+
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      cefcd4fe
    • x86/boot/64: Move 5-level paging global variable assignments back · 9843231c
      Tom Lendacky authored
      Commit 63bed966 ("x86/startup_64: Defer assignment of 5-level paging
      global variables") moved assignment of 5-level global variables to later
      in the boot in order to avoid having to use RIP relative addressing in
      order to set them. However, when running with 5-level paging and SME
      active (mem_encrypt=on), the variables are needed as part of the page
      table setup needed to encrypt the kernel (using pgd_none(), p4d_offset(),
      etc.). Since the variables haven't been set, the page table manipulation
      is done as if 4-level paging is active, causing the system to crash on
      boot.
      
      While only a subset of the assignments that were moved need to be set
      early, move all of the assignments back into check_la57_support() so that
      these assignments aren't spread between two locations. Instead of just
      reverting the fix, this uses the new RIP_REL_REF() macro when assigning
      the variables.
      
      Fixes: 63bed966 ("x86/startup_64: Defer assignment of 5-level paging global variables")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/2ca419f4d0de719926fd82353f6751f717590a86.1711122067.git.thomas.lendacky@amd.com
      9843231c
    • x86/boot/64: Apply encryption mask to 5-level pagetable update · 4d0d7e78
      Tom Lendacky authored
      When running with 5-level page tables, the kernel mapping PGD entry is
      updated to point to the P4D table. The assignment uses _PAGE_TABLE_NOENC,
      which, when SME is active (mem_encrypt=on), results in a page table
      entry without the encryption mask set, causing the system to crash on
      boot.
      
      Change the assignment to use _PAGE_TABLE instead of _PAGE_TABLE_NOENC so
      that the encryption mask is set for the PGD entry.
      
      Fixes: 533568e0 ("x86/boot/64: Use RIP_REL_REF() to access early_top_pgt[]")
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/8f20345cda7dbba2cf748b286e1bc00816fe649a.1711122067.git.thomas.lendacky@amd.com
      4d0d7e78
    • x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD · 10e4b516
      Adamos Ttofari authored
      Commit 67236547 ("x86/fpu: Update XFD state where required") and
      commit 8bf26758 ("x86/fpu: Add XFD state to fpstate") introduced a
      per CPU variable xfd_state to keep the MSR_IA32_XFD value cached, in
      order to avoid unnecessary writes to the MSR.
      
      On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which
      wipes out any stale state. But the per CPU cached xfd value is not
      reset, which brings them out of sync.
      
      As a consequence a subsequent xfd_update_state() might fail to update
      the MSR which in turn can result in XRSTOR raising a #NM in kernel
      space, which crashes the kernel.
      
      To fix this, introduce xfd_set_state() to write xfd_state together
      with MSR_IA32_XFD, and use it in all places that set MSR_IA32_XFD.
      
      Fixes: 67236547 ("x86/fpu: Update XFD state where required")
      Signed-off-by: Adamos Ttofari <attofari@amazon.de>
      Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20240322230439.456571-1-chang.seok.bae@intel.com
      
      Closes: https://lore.kernel.org/lkml/20230511152818.13839-1-attofari@amazon.de
      10e4b516
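
The stale-cache problem described in this commit can be reproduced with a toy model. The sketch below is a Python simulation under simplifying assumptions: the `Cpu` class and the `hotplug_reset_*` helpers are hypothetical, and only the names `xfd_update_state` and `xfd_set_state` are taken from the commit message.

```python
class Cpu:
    """Toy model of one CPU: an MSR plus a per-CPU cached copy of it."""

    def __init__(self, init_xfd: int):
        self.msr_xfd = init_xfd    # models MSR_IA32_XFD
        self.xfd_state = init_xfd  # models the per-CPU cache

    def wrmsr(self, value: int) -> None:
        self.msr_xfd = value


def xfd_update_state(cpu: Cpu, wanted: int) -> None:
    """Write the MSR only when the cache says the value differs."""
    if cpu.xfd_state != wanted:
        cpu.wrmsr(wanted)
        cpu.xfd_state = wanted


def xfd_set_state(cpu: Cpu, value: int) -> None:
    """Write the MSR and the cache together so they cannot diverge."""
    cpu.wrmsr(value)
    cpu.xfd_state = value


def hotplug_reset_buggy(cpu: Cpu, init_xfd: int) -> None:
    cpu.wrmsr(init_xfd)  # MSR reset, cache left stale


def hotplug_reset_fixed(cpu: Cpu, init_xfd: int) -> None:
    xfd_set_state(cpu, init_xfd)
```

After `hotplug_reset_buggy()`, a later `xfd_update_state()` with the previously cached value skips the write because the stale cache already matches, leaving the MSR at its reset value. With `hotplug_reset_fixed()` the same update reaches the MSR.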
    • Documentation/x86: Document that resctrl bandwidth control units are MiB · a8ed59a3
      Tony Luck authored
      The memory bandwidth software controller uses 2^20 units rather than
      10^6. See mbm_bw_count() which computes bandwidth using the "SZ_1M"
      Linux define for 0x00100000.
      
      Update the documentation to use MiB when describing this feature.
      It's too late to fix the mount option "mba_MBps" as that is now an
      established user interface.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/r/20240322182016.196544-1-tony.luck@intel.com
      a8ed59a3
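
The size of the gap between the two unit conventions is easy to quantify. A minimal arithmetic sketch follows; `SZ_1M` mirrors the value quoted in the commit message, while `MB` and `limit_in_bytes` are hypothetical names used only here.

```python
SZ_1M = 0x00100000  # 2**20, the unit named in the commit message
MB = 10**6          # decimal megabyte, for comparison


def limit_in_bytes(limit: int, unit: int) -> int:
    """Convert a bandwidth limit to bytes per second for a given unit."""
    return limit * unit


if __name__ == "__main__":
    mib = limit_in_bytes(1000, SZ_1M)
    mb = limit_in_bytes(1000, MB)
    print(mib, mb, mib - mb)
```

A limit of 1000 is 1,048,576,000 bytes per second in MiB units versus 1,000,000,000 in decimal MB, a difference of about 4.9%.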
  2. 23 Mar, 2024 11 commits
    • Merge tag 'timers-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 70293240
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "Two regression fixes for the timer and timer migration code:
      
         - Prevent endless timer requeuing which is caused by two CPUs racing
           out of idle. This happens when the last CPU goes idle and therefore
           has to ensure that the pending global timers are expired, and some
           other CPU comes out of idle at the same time, wins the race and
           expires the global queue. This causes the last CPU to chase ghost
           timers forever and to reprogram its clockevent device endlessly.
      
           Cure this by re-evaluating the wakeup time unconditionally.
      
         - The split into local (pinned) and global timers in the timer wheel
           caused a regression for NOHZ full as it broke the idle tracking of
           global timers. On NOHZ full this prevents a self IPI being sent,
           which in turn causes the timer not to be programmed and not to
           expire on time.
      
           Restore the idle tracking for the global timer base so that the
           self IPI condition for NOHZ full is working correctly again"
      
      * tag 'timers-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timers: Fix removed self-IPI on global timer's enqueue in nohz_full
        timers/migration: Fix endless timer requeue after idle interrupts
      70293240
    • Merge tag 'timers-core-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 00164f47
      Linus Torvalds authored
      Pull more clocksource updates from Thomas Gleixner:
       "A set of updates for clocksource and clockevent drivers:
      
         - A fix for the prescaler of the ARM global timer where the prescaler
           mask define only covered 4 bits while it is actually 8 bits wide.
           This obviously restricted the possible range of prescaler
           adjustments
      
         - A fix for the RISC-V timer which prevents a timer interrupt being
           raised while the timer is initialized
      
         - A set of device tree updates to support new system on chips in
           various drivers
      
         - Kernel-doc and other cleanups all over the place"
      
      * tag 'timers-core-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/drivers/timer-riscv: Clear timer interrupt on timer initialization
        dt-bindings: timer: Add support for cadence TTC PWM
        clocksource/drivers/arm_global_timer: Simplify prescaler register access
        clocksource/drivers/arm_global_timer: Guard against division by zero
        clocksource/drivers/arm_global_timer: Make gt_target_rate unsigned long
        dt-bindings: timer: add Ralink SoCs system tick counter
        clocksource: arm_global_timer: fix non-kernel-doc comment
        clocksource/drivers/arm_global_timer: Remove stray tab
        clocksource/drivers/arm_global_timer: Fix maximum prescaler value
        clocksource/drivers/imx-sysctr: Add i.MX95 support
        clocksource/drivers/imx-sysctr: Drop use global variables
        dt-bindings: timer: nxp,sysctr-timer: support i.MX95
        dt-bindings: timer: renesas: ostm: Document RZ/Five SoC
        dt-bindings: timer: renesas,tmu: Document input capture interrupt
        clocksource/drivers/ti-32K: Fix misuse of "/**" comment
        clocksource/drivers/stm32: Fix all kernel-doc warnings
        dt-bindings: timer: exynos4210-mct: Add google,gs101-mct compatible
        clocksource/drivers/imx: Fix -Wunused-but-set-variable warning
      00164f47
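
The effect of the too-narrow prescaler mask mentioned in this pull can be shown with plain bit arithmetic. This is a sketch under assumptions: the field is placed at bit 8 purely for illustration, and `genmask`, `set_prescaler` and `get_prescaler` are hypothetical helpers, not the driver's code.

```python
def genmask(high: int, low: int) -> int:
    """Contiguous bit mask covering bits low..high inclusive."""
    return ((1 << (high - low + 1)) - 1) << low


PRESCALER_SHIFT = 8                        # assumed field position
OLD_MASK = genmask(PRESCALER_SHIFT + 3, PRESCALER_SHIFT)  # 4 bits wide
NEW_MASK = genmask(PRESCALER_SHIFT + 7, PRESCALER_SHIFT)  # 8 bits wide


def set_prescaler(ctrl: int, value: int, mask: int) -> int:
    """Insert a prescaler value into a control word, limited by the mask."""
    return (ctrl & ~mask) | ((value << PRESCALER_SHIFT) & mask)


def get_prescaler(ctrl: int, mask: int) -> int:
    """Extract the prescaler value from a control word."""
    return (ctrl & mask) >> PRESCALER_SHIFT
```

With the 4-bit mask the largest representable prescaler is 15 and a request for 200 is silently truncated to 8; with the 8-bit mask the full range up to 255 is available.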
    • Merge tag 'irq-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1a391931
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A series of fixes for the Renesas RZG2L interrupt chip driver to
        prevent spurious and misrouted interrupts.
      
         - Ensure that posted writes are flushed in the eoi() callback
      
         - Ensure that interrupts are masked at the chip level when the
           trigger type is changed
      
         - Clear the interrupt status register when setting up edge type
           trigger modes
      
         - Ensure that the trigger type and routing information is set before
           the interrupt is enabled"
      
      * tag 'irq-urgent-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/renesas-rzg2l: Do not set TIEN and TINT source at the same time
        irqchip/renesas-rzg2l: Prevent spurious interrupts when setting trigger type
        irqchip/renesas-rzg2l: Rename rzg2l_irq_eoi()
        irqchip/renesas-rzg2l: Rename rzg2l_tint_eoi()
        irqchip/renesas-rzg2l: Flush posted write in irq_eoi()
      1a391931
    • Merge tag 'core-entry-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 976b029d
      Linus Torvalds authored
      Pull core entry fix from Thomas Gleixner:
       "A single fix for the generic entry code:
      
        The trace_sys_enter() tracepoint can modify the syscall number via
        kprobes or BPF in pt_regs, but that requires that the syscall number
        is re-evaluated from pt_regs after the tracepoint.
      
        A seccomp fix in that area removed the re-evaluation so the change
        does not take effect as the code just uses the locally cached number.
      
        Restore the original behaviour by re-evaluating the syscall number
        after the tracepoint"
      
      * tag 'core-entry-2024-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        entry: Respect changes to system call number by trace_sys_enter()
      976b029d
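
The regression described in this pull is a classic stale-local-copy bug. The Python sketch below models it under simplifying assumptions: `regs` is a plain dict standing in for pt_regs, and `rewrite_hook`, `enter_with_cached_nr` and `enter_with_reread_nr` are hypothetical names.

```python
def rewrite_hook(regs: dict) -> None:
    """Stand-in for a kprobe or BPF program that changes the syscall number."""
    regs["nr"] = 39


def enter_with_cached_nr(regs: dict, hook) -> int:
    """Regressed behaviour: the number is read once, before the tracepoint."""
    syscall = regs["nr"]
    hook(regs)
    return syscall  # the hook's change is ignored


def enter_with_reread_nr(regs: dict, hook) -> int:
    """Restored behaviour: the number is re-read after the tracepoint."""
    hook(regs)
    return regs["nr"]
```

Starting from syscall number 1, the cached variant still returns 1 after the hook has run, while the re-reading variant returns the rewritten number.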
    • Merge tag 'powerpc-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 484193fe
      Linus Torvalds authored
      Pull more powerpc updates from Michael Ellerman:
      
       - Handle errors in mark_rodata_ro() and mark_initmem_nx()
      
       - Make struct crash_mem available without CONFIG_CRASH_DUMP
      
      Thanks to Christophe Leroy and Hari Bathini.
      
      * tag 'powerpc-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/kdump: Split KEXEC_CORE and CRASH_DUMP dependency
        powerpc/kexec: split CONFIG_KEXEC_FILE and CONFIG_CRASH_DUMP
        kexec/kdump: make struct crash_mem available without CONFIG_CRASH_DUMP
        powerpc: Handle error in mark_rodata_ro() and mark_initmem_nx()
      484193fe
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 02fb638b
      Linus Torvalds authored
      Pull ARM updates from Russell King:
      
       - remove a misuse of kernel-doc comment
      
       - use "Call trace:" for backtraces like other architectures
      
       - implement copy_from_kernel_nofault_allowed() to fix a LKDTM test
      
       - add a "cut here" line for prefetch aborts
      
       - remove unnecessary Kconfig entry for FRAME_POINTER
      
       - remove iwmmxt support for PJ4/PJ4B cores
      
       - use bitfield helpers in ptrace to improve readability
      
       - check if folio is reserved before flushing
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9359/1: flush: check if the folio is reserved for no-mapping addresses
        ARM: 9354/1: ptrace: Use bitfield helpers
        ARM: 9352/1: iwmmxt: Remove support for PJ4/PJ4B cores
        ARM: 9353/1: remove unneeded entry for CONFIG_FRAME_POINTER
        ARM: 9351/1: fault: Add "cut here" line for prefetch aborts
        ARM: 9350/1: fault: Implement copy_from_kernel_nofault_allowed()
        ARM: 9349/1: unwind: Add missing "Call trace:" line
        ARM: 9334/1: mm: init: remove misuse of kernel-doc comment
      02fb638b
    • Merge tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · b7187139
      Linus Torvalds authored
      Pull more hardening updates from Kees Cook:
      
       - CONFIG_MEMCPY_SLOW_KUNIT_TEST is no longer needed (Guenter Roeck)
      
       - Fix needless UTF-8 character in arch/Kconfig (Liu Song)
      
       - Improve __counted_by warning message in LKDTM (Nathan Chancellor)
      
       - Refactor DEFINE_FLEX() for default use of __counted_by
      
       - Disable signed integer overflow sanitizer on GCC < 8
      
      * tag 'hardening-v6.9-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        lkdtm/bugs: Improve warning message for compilers without counted_by support
        overflow: Change DEFINE_FLEX to take __counted_by member
        Revert "kunit: memcpy: Split slow memcpy tests into MEMCPY_SLOW_KUNIT_TEST"
        arch/Kconfig: eliminate needless UTF-8 character in Kconfig help
        ubsan: Disable signed integer overflow sanitizer on GCC < 8
      b7187139
    • x86/mpparse: Register APIC address only once · f2208aa1
      Thomas Gleixner authored
      The APIC address is registered twice. First during the early detection and
      afterwards when actually scanning the table for APIC IDs. The APIC and
      topology core warn about the second attempt.
      
      Restrict it to the early detection call.
      
      Fixes: 81287ad6 ("x86/apic: Sanitize APIC address setup")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Link: https://lore.kernel.org/r/20240322185305.297774848@linutronix.de
      f2208aa1
    • x86/topology: Handle the !APIC case gracefully · 5e25eb25
      Thomas Gleixner authored
      If there is no local APIC enumerated and registered then the topology
      bitmaps are empty. Therefore, topology_init_possible_cpus() will die with
      a division by zero exception.
      
      Prevent this by registering a fake APIC id to populate the topology
      bitmap. This also allows all topology query interfaces to be used
      unconditionally. It does not affect the actual APIC code because either
      the local APIC address was not registered or no local APIC could be
      detected.
      
      Fixes: f1f758a8 ("x86/topology: Add a mechanism to track topology via APIC IDs")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Link: https://lore.kernel.org/r/20240322185305.242709302@linutronix.de
      5e25eb25
    • x86/topology: Don't evaluate logical IDs during early boot · 7af541ce
      Thomas Gleixner authored
      The local APICs have not yet been enumerated so the logical ID evaluation
      from the topology bitmaps does not work and would return an error code.
      
      Skip the evaluation during the early boot CPUID evaluation and only apply
      it on the final run.
      
      Fixes: 380414be ("x86/cpu/topology: Use topology logical mapping mechanism")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Link: https://lore.kernel.org/r/20240322185305.186943142@linutronix.de
      7af541ce
    • x86/cpu: Ensure that CPU info updates are propagated on UP · c90399fb
      Thomas Gleixner authored
      The boot sequence evaluates CPUID information twice:
      
        1) During early boot
      
        2) When finalizing the early setup right before
           mitigations are selected and alternatives are patched.
      
      In both cases the evaluation is stored in boot_cpu_data, but on UP the
      copying of boot_cpu_data to the per CPU info of the boot CPU happens
      between #1 and #2. So any update which happens in #2 is never propagated to
      the per CPU info instance.
      
      Consolidate the whole logic and copy boot_cpu_data right before applying
      alternatives as that's the point where boot_cpu_data is in its final
      state and not supposed to change anymore.
      
      This also removes the voodoo mb() from smp_prepare_cpus_common() which
      had absolutely no purpose.
      
      Fixes: 71eb4893 ("x86/percpu: Cure per CPU madness on UP")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Link: https://lore.kernel.org/r/20240322185305.127642785@linutronix.de
      c90399fb
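
The ordering problem in this commit, a snapshot taken before the source reaches its final state, can be modelled in a few lines. This is a Python sketch with dicts standing in for boot_cpu_data and the per-CPU info; `boot_copy_too_early`, `boot_copy_at_end` and the `"features"` key are hypothetical.

```python
def boot_copy_too_early():
    """Old ordering on UP: the copy happens between evaluations 1 and 2."""
    boot_cpu_data = {"features": "early"}    # 1) early CPUID evaluation
    per_cpu_info = dict(boot_cpu_data)       # snapshot taken too soon
    boot_cpu_data["features"] = "final"      # 2) final evaluation
    return boot_cpu_data, per_cpu_info


def boot_copy_at_end():
    """Fixed ordering: copy once, after the final evaluation."""
    boot_cpu_data = {"features": "early"}    # 1) early CPUID evaluation
    boot_cpu_data["features"] = "final"      # 2) final evaluation
    per_cpu_info = dict(boot_cpu_data)       # snapshot of the final state
    return boot_cpu_data, per_cpu_info
```

In the first variant the per-CPU copy keeps the early value and disagrees with boot_cpu_data; in the second both hold the final value.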
  3. 22 Mar, 2024 16 commits
    • lkdtm/bugs: Improve warning message for compilers without counted_by support · 231dc3f0
      Nathan Chancellor authored
      The current message for telling the user that their compiler does not
      support the counted_by attribute in the FAM_BOUNDS test does not make
      much sense either grammatically or semantically. Fix it to make it
      correct in both aspects.
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/20240321-lkdtm-improve-lack-of-counted_by-msg-v1-1-0fbf7481a29c@kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      231dc3f0
    • overflow: Change DEFINE_FLEX to take __counted_by member · d8e45f29
      Kees Cook authored
      The norm should be flexible array structures with __counted_by
      annotations, so DEFINE_FLEX() is updated to expect that. Rename
      the non-annotated version to DEFINE_RAW_FLEX(), and update the
      few existing users. Additionally add selftests for the macros.
      Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/20240306235128.it.933-kees@kernel.org
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      d8e45f29
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · bfa8f186
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "The vfs has long had a write lifetime hint mechanism that gives the
        expected longevity on storage of the data being written. f2fs was the
        original consumer of this and used the hint for flash data placement
        (mostly to avoid write amplification by placing objects with similar
        lifetimes in the same erase block).
      
        More recently the SCSI based UFS (Universal Flash Storage) drivers
        have wanted to take advantage of this as well, for the same reasons as
        f2fs, necessitating plumbing the write hints through the block layer
        and then adding it to the SCSI core.
      
        The vfs write_hints work that was already merged plumbs this as far
        as the block layer, and this completes the SCSI core enabling based
        on a recently agreed reuse of the old write command group number.
        The additions to the scsi_debug
        driver are for emulating this property so we can run tests on it in
        the absence of an actual UFS device"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: scsi_debug: Maintain write statistics per group number
        scsi: scsi_debug: Implement GET STREAM STATUS
        scsi: scsi_debug: Implement the IO Advice Hints Grouping mode page
        scsi: scsi_debug: Allocate the MODE SENSE response from the heap
        scsi: scsi_debug: Rework subpage code error handling
        scsi: scsi_debug: Rework page code error handling
        scsi: scsi_debug: Support the block limits extension VPD page
        scsi: scsi_debug: Reduce code duplication
        scsi: sd: Translate data lifetime information
        scsi: scsi_proto: Add structures and constants related to I/O groups and streams
        scsi: core: Query the Block Limits Extension VPD page
      bfa8f186
    • Merge tag 'block-6.9-20240322' of git://git.kernel.dk/linux · e3111d9c
      Linus Torvalds authored
      Pull more block updates from Jens Axboe:
      
       - NVMe pull request via Keith:
           - Make an informative message less ominous (Keith)
           - Enhanced trace decoding (Guixin)
           - TCP updates (Hannes, Li)
           - Fabrics connect deadlock fix (Chunguang)
           - Platform API migration update (Uwe)
           - A new device quirk (Jiawei)
      
       - Remove dead assignment in fd (Yufeng)
      
      * tag 'block-6.9-20240322' of git://git.kernel.dk/linux:
        nvmet-rdma: remove NVMET_RDMA_REQ_INVALIDATE_RKEY flag
        nvme: remove redundant BUILD_BUG_ON check
        floppy: remove duplicated code in redo_fd_request()
        nvme/tcp: Add wq_unbound modparam for nvme_tcp_wq
        nvme-tcp: Export the nvme_tcp_wq to sysfs
        drivers/nvme: Add quirks for device 126f:2262
        nvme: parse format command's lbafu when tracing
        nvme: add tracing of reservation commands
        nvme: parse zns command's zsa and zrasf to string
        nvme: use nvme_disk_is_ns_head helper
        nvme: fix reconnection fail due to reserved tag allocation
        nvmet: add tracing of zns commands
        nvmet: add tracing of authentication commands
        nvme-apple: Convert to platform remove callback returning void
        nvmet-tcp: do not continue for invalid icreq
        nvme: change shutdown timeout setting message
    • Merge tag 'io_uring-6.9-20240322' of git://git.kernel.dk/linux · 19dba097
      Linus Torvalds authored
      Pull more io_uring updates from Jens Axboe:
       "One patch just missed the initial pull, the rest are either fixes or
        small cleanups that make our life easier for the next kernel:
      
         - Fix a potential leak in error handling of pinned pages, and clean
           it up (Gabriel, Pavel)
      
         - Fix an issue with how read multishot returns retry (me)
      
         - Fix a problem with waitid/futex removals, if we hit the case of
           needing to remove all of them at exit time (me)
      
         - Fix for a regression introduced in this merge window, where we
           don't always have sr->done_io initialized if the ->prep_async()
           path is used (me)
      
         - Fix for SQPOLL setup error handling (me)
      
         - Fix for a poll removal request being delayed (Pavel)
      
         - Rename of a struct member which had a confusing name (Pavel)"
      
      * tag 'io_uring-6.9-20240322' of git://git.kernel.dk/linux:
        io_uring/sqpoll: early exit thread if task_context wasn't allocated
        io_uring: clear opcode specific data for an early failure
        io_uring/net: ensure async prep handlers always initialize ->done_io
        io_uring/waitid: always remove waitid entry for cancel all
        io_uring/futex: always remove futex entry for cancel all
        io_uring: fix poll_remove stalled req completion
        io_uring: Fix release of pinned pages when __io_uaddr_map fails
        io_uring/kbuf: rename is_mapped
        io_uring: simplify io_pages_free
        io_uring: clean rings on NO_MMAP alloc fail
        io_uring/rw: return IOU_ISSUE_SKIP_COMPLETE for multishot retry
        io_uring: don't save/restore iowait state
    • Merge tag 'for-6.9/dm-fixes' of... · 64f799ff
      Linus Torvalds authored
      Merge tag 'for-6.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix a memory leak in DM integrity recheck code that was added during
         the 6.9 merge. Also fix the recheck code to ensure it issues bios
         with proper alignment.
      
       - Fix DM snapshot's dm_exception_table_exit() to schedule while
         handling a large exception table during snapshot device shutdown.
      
      * tag 'for-6.9/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm-integrity: align the outgoing bio in integrity_recheck
        dm snapshot: fix lockup in dm_exception_table_exit
        dm-integrity: fix a memory leak when rechecking the data
    • Merge tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-client · ff9c18e4
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "A patch to minimize blockage when processing very large batches of
        dirty caps and two fixes to better handle EOF in the face of multiple
        clients performing reads and size-extending writes at the same time"
      
      * tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-client:
        ceph: set correct cap mask for getattr request for read
        ceph: stop copying to iter at EOF on sync reads
        ceph: remove SLAB_MEM_SPREAD flag usage
        ceph: break the check delayed cap loop every 5s
    • Merge tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 6f6efce5
      Linus Torvalds authored
      Pull xfs fixes from Chandan Babu:
      
       - Fix invalid pointer dereference by initializing xmbuf before
         tracepoint function is invoked
      
       - Use memalloc_nofs_save() when inserting into quota radix tree
      
      * tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: quota radix tree allocations need to be NOFS on insert
        xfs: fix dev_t usage in xmbuf tracepoints
    • Merge tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · c150b809
      Linus Torvalds authored
      Pull RISC-V updates from Palmer Dabbelt:
      
       - Support for various vector-accelerated crypto routines
      
       - Hibernation is now enabled for portable kernel builds
      
       - mmap_rnd_bits_max is larger on systems with larger VAs
      
       - Support for fast GUP
      
       - Support for membarrier-based instruction cache synchronization
      
       - Support for the Andes hart-level interrupt controller and PMU
      
       - Some cleanups around unaligned access speed probing and Kconfig
         settings
      
       - Support for ACPI LPI and CPPC
      
       - Various cleanups related to barriers
      
       - A handful of fixes
      
      * tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
        riscv: Fix syscall wrapper for >word-size arguments
        crypto: riscv - add vector crypto accelerated AES-CBC-CTS
        crypto: riscv - parallelize AES-CBC decryption
        riscv: Only flush the mm icache when setting an exec pte
        riscv: Use kcalloc() instead of kzalloc()
        riscv/barrier: Add missing space after ','
        riscv/barrier: Consolidate fence definitions
        riscv/barrier: Define RISCV_FULL_BARRIER
        riscv/barrier: Define __{mb,rmb,wmb}
        RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
        cpufreq: Move CPPC configs to common Kconfig and add RISC-V
        ACPI: RISC-V: Add CPPC driver
        ACPI: Enable ACPI_PROCESSOR for RISC-V
        ACPI: RISC-V: Add LPI driver
        cpuidle: RISC-V: Move few functions to arch/riscv
        riscv: Introduce set_compat_task() in asm/compat.h
        riscv: Introduce is_compat_thread() into compat.h
        riscv: add compile-time test into is_compat_task()
        riscv: Replace direct thread flag check with is_compat_task()
        riscv: Improve arch_get_mmap_end() macro
        ...
    • Merge tag 'loongarch-6.9' of... · 1e3cd03c
      Linus Torvalds authored
      Merge tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch updates from Huacai Chen:
      
       - Add objtool support for LoongArch
      
       - Add ORC stack unwinder support for LoongArch
      
       - Add kernel livepatching support for LoongArch
      
       - Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
      
       - Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
      
       - Some bug fixes and other small changes
      
      * tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch/crypto: Clean up useless assignment operations
        LoongArch: Define the __io_aw() hook as mmiowb()
        LoongArch: Remove superfluous flush_dcache_page() definition
        LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h
        LoongArch: Change __my_cpu_offset definition to avoid mis-optimization
        LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig
        LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig
        LoongArch: Add kernel livepatching support
        LoongArch: Add ORC stack unwinder support
        objtool: Check local label in read_unwind_hints()
        objtool: Check local label in add_dead_ends()
        objtool/LoongArch: Enable orc to be built
        objtool/x86: Separate arch-specific and generic parts
        objtool/LoongArch: Implement instruction decoder
        objtool/LoongArch: Enable objtool to be built
    • Merge tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev · 4f55aa85
      Linus Torvalds authored
      Pull fbdev updates from Helge Deller:
      
       - Allow console fonts up to 64x128 pixels (Samuel Thibault)
      
       - Prevent division-by-zero in fb monitor code (Roman Smirnov)
      
       - Drop Renesas ARM platforms from Mobile LCDC framebuffer driver (Geert
         Uytterhoeven)
      
       - Various code cleanups in viafb, uvesafb and mb862xxfb drivers by
         Aleksandr Burakov, Li Zhijian and Michael Ellerman
      
      * tag 'fbdev-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
        fbdev: panel-tpo-td043mtea1: Convert sprintf() to sysfs_emit()
        fbmon: prevent division by zero in fb_videomode_from_videomode()
        fbcon: Increase maximum font width x height to 64 x 128
        fbdev: viafb: fix typo in hw_bitblt_1 and hw_bitblt_2
        fbdev: mb862xxfb: Fix defined but not used error
        fbdev: uvesafb: Convert sprintf/snprintf to sysfs_emit
        fbdev: Restrict FB_SH_MOBILE_LCDC to SuperH
    • Merge tag 'spi-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 4073195a
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A small collection of fixes that came in since the merge window. Most
        of it is relatively minor driver-specific fixes; there are also fixes
        for error handling with SPI flash devices and a fix restoring delay
        control functionality for non-GPIO chip selects managed by the core"
      
      * tag 'spi-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: spi-mt65xx: Fix NULL pointer access in interrupt handler
        spi: docs: spidev: fix echo command format
        spi: spi-imx: fix off-by-one in mx51 CPU mode burst length
        spi: lm70llp: fix links in doc and comments
        spi: Fix error code checking in spi_mem_exec_op()
        spi: Restore delays for non-GPIO chip select
        spi: lpspi: Avoid potential use-after-free in probe()
    • Merge tag 'regulator-fix-v6.9-merge-window' of... · 8c826bd9
      Linus Torvalds authored
      Merge tag 'regulator-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fix from Mark Brown:
       "One fix that came in during the merge window, fixing a problem with
        bootstrapping the state of exclusive regulators which have a parent
        regulator"
      
      * tag 'regulator-fix-v6.9-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: core: Propagate the regulator state in case of exclusive get
    • Merge tag 'sound-fix2-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 6b571e26
      Linus Torvalds authored
      Pull more sound fixes from Takashi Iwai:
       "The remaining fixes for 6.9-rc1 that have been gathered in this week.
      
        Mostly ASoC this time (one long-standing fix for compress offload,
        SOF, AMD ACP, Rockchip, Cirrus and tlv320 stuff), plus another
        regression fix in ALSA core and the usual couple of HD-audio quirks"
      
      * tag 'sound-fix2-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: control: Fix unannotated kfree() cleanup
        ALSA: hda/realtek: Add quirks for some Clevo laptops
        ALSA: hda/realtek: Add quirk for HP Spectre x360 14 eu0000
        ALSA: hda/realtek: fix the hp playback volume issue for LG machines
        ASoC: soc-compress: Fix and add DPCM locking
        ASoC: SOF: amd: Skip IRAM/DRAM size modification for Steam Deck OLED
        ASoC: SOF: amd: Move signed_fw_image to struct acp_quirk_entry
        ASoC: amd: yc: Revert "add new YC platform variant (0x63) support"
        ASoC: amd: yc: Revert "Fix non-functional mic on Lenovo 21J2"
        ASoC: soc-core.c: Skip dummy codec when adding platforms
        ASoC: rockchip: i2s-tdm: Fix inaccurate sampling rates
        ASoC: dt-bindings: cirrus,cs42l43: Fix 'gpio-ranges' schema
        ASoC: amd: yc: Fix non-functional mic on ASUS M7600RE
        ASoC: tlv320adc3xxx: Don't strip remove function when driver is builtin
    • Merge tag 'i2c-for-6.9-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5ee2433f
      Linus Torvalds authored
      Pull more i2c updates from Wolfram Sang:
       "Some more I2C updates after the dependencies have been merged now.
      
        Plus a DT binding fix"
      
      * tag 'i2c-for-6.9-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        dt-bindings: i2c: qcom,i2c-cci: Fix OV7251 'data-lanes' entries
        i2c: muxes: pca954x: Allow sharing reset GPIO
        i2c: nomadik: sort includes
        i2c: nomadik: support Mobileye EyeQ5 I2C controller
        i2c: nomadik: fetch i2c-transfer-timeout-us property from devicetree
        i2c: nomadik: replace jiffies by ktime for FIFO flushing timeout
        i2c: nomadik: support short xfer timeouts using waitqueue & hrtimer
        i2c: nomadik: use bitops helpers
        i2c: nomadik: simplify IRQ masking logic
        i2c: nomadik: rename private struct pointers from dev to priv
        dt-bindings: i2c: nomadik: add mobileye,eyeq5-i2c bindings and example
    • efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min or higher address · 3cb4a482
      KONDO KAZUMA(近藤 和真) authored
      The following warning is sometimes observed while booting my servers:
        [    3.594838] DMA: preallocated 4096 KiB GFP_KERNEL pool for atomic allocations
        [    3.602918] swapper/0: page allocation failure: order:10, mode:0xcc1(GFP_KERNEL|GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0-1
        ...
        [    3.851862] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocation
      
      If the 'nokaslr' boot option is set, the warning always happens.
      
      On x86, ZONE_DMA is a small zone covering the first 16MB of the
      physical address space. When this problem happens, most of that space
      seems to be used by the decompressed kernel. As a result, there is not
      enough space in ZONE_DMA to meet the request of the DMA pool allocation.
      
      Commit 2f77465b ("x86/efistub: Avoid placing the kernel below
      LOAD_PHYSICAL_ADDR") tried to fix this problem by introducing a lower
      bound on the allocation address.
      
      But the fix is not complete.
      
      efi_random_alloc() allocates pages in the following steps:
      1. Count the total number of available slots ('total_slots')
      2. Randomly select a slot ('target_slot') to allocate
      3. Calculate the starting address ('target') that corresponds to
         target_slot
      4. Allocate pages starting at 'target'
      
      In step 1, 'alloc_min' is used to offset the starting address of each
      memory chunk. But in step 3, 'alloc_min' is not considered at all. As a
      result, 'target' can be miscalculated and end up lower than 'alloc_min'.
      
      When KASLR is disabled, 'target_slot' is always 0 and the problem
      happens every time if the EFI memory map of the system meets the
      condition.
      
      Fix this problem by calculating 'target' considering 'alloc_min'.
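      
      As a rough illustration of step 3, the sketch below contrasts a
      start-address calculation that ignores 'alloc_min' with one that first
      clamps the chunk start to 'alloc_min'. It is a standalone approximation,
      not the kernel code: the helper names (round_up_pow2, max_u64,
      target_without_alloc_min, target_with_alloc_min) and the example values
      are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical step 3 that ignores alloc_min: slot 0 starts at the aligned
 * beginning of the memory chunk, which may lie below alloc_min.
 */
static uint64_t target_without_alloc_min(uint64_t chunk_start, uint64_t align,
                                         uint64_t target_slot)
{
    return round_up_pow2(chunk_start, align) + target_slot * align;
}

/*
 * Hypothetical step 3 that takes alloc_min into account: the chunk start is
 * clamped to alloc_min before aligning, so slot 0 is never below alloc_min.
 */
static uint64_t target_with_alloc_min(uint64_t chunk_start, uint64_t alloc_min,
                                      uint64_t align, uint64_t target_slot)
{
    return round_up_pow2(max_u64(chunk_start, alloc_min), align) +
           target_slot * align;
}
```

      With a chunk starting at 1 MiB, a 2 MiB alignment and a 16 MiB
      'alloc_min', slot 0 lands at 2 MiB in the first variant and at 16 MiB in
      the second.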
      
      Cc: linux-efi@vger.kernel.org
      Cc: Tom Englund <tomenglund26@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 2f77465b ("x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR")
      Signed-off-by: Kazuma Kondo <kazuma-kondo@nec.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>