1. 01 May, 2024 1 commit
      x86/mm: Remove broken vsyscall emulation code from the page fault code · 02b670c1
      Linus Torvalds authored
      The syzbot-reported stack trace from hell in this discussion thread
      actually has three nested page faults:
      
        https://lore.kernel.org/r/000000000000d5f4fc0616e816d4@google.com
      
      ... and I think that's actually the important thing here:
      
       - the first page fault is from user space, and triggers the vsyscall
         emulation.
      
       - the second page fault is from __do_sys_gettimeofday(), and that should
         just have caused the exception that then sets the return value to
         -EFAULT
      
       - the third nested page fault is due to _raw_spin_unlock_irqrestore() ->
         preempt_schedule() -> trace_sched_switch(), which then causes a BPF
         trace program to run, which does that bpf_probe_read_compat(), which
         causes that page fault under pagefault_disable().
      
      It's quite the nasty backtrace, and there's a lot going on.
      
      The problem is literally the vsyscall emulation, which sets
      
              current->thread.sig_on_uaccess_err = 1;
      
      and that causes the fixup_exception() code to send the signal *despite* the
      exception being caught.
      
      And I think that is in fact completely bogus.  It's completely bogus
      exactly because it sends that signal even when it *shouldn't* be sent -
      like for the BPF user mode trace gathering.
      
      In other words, I think the whole "sig_on_uaccess_err" thing is entirely
      broken, because it makes any nested page-faults do all the wrong things.
      
      Now, arguably, I don't think anybody should enable vsyscall emulation any
      more, but this test case clearly does.
      
      I think we should just make the "send SIGSEGV" be something that the
      vsyscall emulation does on its own, not this broken per-thread state for
      something that isn't actually per thread.
      
      The x86 page fault code actually tried to deal with the "incorrect nesting"
      by having that:
      
                      if (in_interrupt())
                              return;
      
      which ignores the sig_on_uaccess_err case when it happens in interrupts,
      but as shown by this example, these nested page faults do not need to be
      about interrupts at all.
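
      The failure mode described above can be sketched as a small user-space
      model. This is only an illustration under stated assumptions: the struct,
      the function names and the MODEL_EFAULT constant are hypothetical and are
      not the kernel's code. It shows a per-thread flag set by an outer
      operation leaking into an unrelated nested fault that happens in task
      context, where an in_interrupt()-style check does not help.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EFAULT 14 /* stand-in for the kernel's EFAULT value */

/* Hypothetical model of the per-thread state; not the kernel's struct. */
struct thread_model {
    bool sig_on_uaccess_err; /* set by the outer vsyscall emulation */
    bool in_interrupt;       /* the only nesting case the old code excluded */
    int  spurious_signals;   /* signals sent for faults that were caught */
};

/* Old behaviour: a caught fault still raises a signal if the flag is set. */
static int caught_fault_old(struct thread_model *t)
{
    assert(t != NULL);
    if (t->sig_on_uaccess_err && !t->in_interrupt)
        t->spurious_signals++;
    return -MODEL_EFAULT;
}

/* Behaviour after the removal: a caught fault only reports the error. */
static int caught_fault_new(struct thread_model *t)
{
    assert(t != NULL);
    return -MODEL_EFAULT;
}

/*
 * Outer emulation sets the flag, then nested tracing code (task context,
 * not an interrupt) takes a fault that the exception tables catch.
 * Returns how many signals that nested fault produced.
 */
static int nested_fault_signals(bool use_old_code)
{
    struct thread_model t = { false, false, 0 };

    t.sig_on_uaccess_err = true;   /* emulation starts */
    if (use_old_code)
        caught_fault_old(&t);      /* nested tracer read faults */
    else
        caught_fault_new(&t);
    t.sig_on_uaccess_err = false;  /* emulation ends */

    return t.spurious_signals;
}
```

      In this model, nested_fault_signals(true) counts one signal for a fault
      that was already handled, while nested_fault_signals(false) counts none.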
      
      IOW, I think the only right thing is to remove that horrendously broken
      code.
      
      The attached patch looks like the ObviouslyCorrect(tm) thing to do.
      
      NOTE! This broken code goes back to this commit in 2011:
      
        4fc34901 ("x86-64: Set siginfo and context on vsyscall emulation faults")
      
      ... and back then the reason was to get all the siginfo details right.
      Honestly, I do not for a moment believe that it's worth getting the siginfo
      details right here, but part of the commit says:
      
          This fixes issues with UML when vsyscall=emulate.
      
      ... and so my patch to remove this garbage will probably break UML in this
      situation.
      
      I do not believe that anybody should be running with vsyscall=emulate in
      2024 in the first place, much less if you are doing things like UML. But
      let's see if somebody screams.
      
      Reported-and-tested-by: syzbot+83e7f982ca045ab4405c@syzkaller.appspotmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Link: https://lore.kernel.org/r/CAHk-=wh9D6f7HUkDgZHKmDCHUQmp+Co89GP+b8+z+G56BKeyNg@mail.gmail.com
  2. 30 Apr, 2024 1 commit
      x86/apic: Don't access the APIC when disabling x2APIC · 720a22fd
      Thomas Gleixner authored
      With 'iommu=off' on the kernel command line and x2APIC enabled by the BIOS
      the code which disables the x2APIC triggers an unchecked MSR access error:
      
        RDMSR from 0x802 at rIP: 0xffffffff94079992 (native_apic_msr_read+0x12/0x50)
      
      This happens because default_acpi_madt_oem_check() selects an x2APIC
      driver before the x2APIC is disabled.
      
      When the x2APIC is disabled because interrupt remapping cannot be enabled
      due to 'iommu=off' on the command line, x2apic_disable() invokes
      apic_set_fixmap() which in turn tries to read the APIC ID. This triggers
      the MSR warning because x2APIC is disabled, but the APIC driver is still
      x2APIC based.
      
      Prevent that by adding an argument to apic_set_fixmap() which makes the
      APIC ID read out conditional and set it to false from the x2APIC disable
      path. That's correct as the APIC ID has already been read out during early
      discovery.
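
      The idea of making the APIC ID read-out conditional can be sketched in
      user space as follows. This is a simplified model, not the actual patch:
      struct apic_model and the model_* functions are hypothetical names, and
      the "bad MSR read" counter merely stands in for the unchecked MSR access
      warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant state; not the kernel's data. */
struct apic_model {
    bool x2apic_hw_enabled;    /* hardware mode */
    unsigned int hw_apic_id;   /* value the hardware would report */
    unsigned int boot_apic_id; /* value cached during early discovery */
    int bad_msr_reads;         /* MSR reads issued while x2APIC is off */
};

/* The driver is still x2APIC based, so the ID read goes through an MSR. */
static unsigned int model_read_apic_id(struct apic_model *a)
{
    assert(a != NULL);
    if (!a->x2apic_hw_enabled) {
        a->bad_msr_reads++;    /* models the unchecked MSR access error */
        return 0;
    }
    return a->hw_apic_id;
}

/* Set up the mapping; only re-read the APIC ID when the caller asks. */
static void model_set_fixmap(struct apic_model *a, bool read_apic_id)
{
    assert(a != NULL);
    /* mapping set-up elided in this model */
    if (read_apic_id)
        a->boot_apic_id = model_read_apic_id(a);
}

/* Disable path: returns the number of bad MSR reads it caused. */
static int model_x2apic_disable(struct apic_model *a, bool read_apic_id)
{
    assert(a != NULL);
    a->x2apic_hw_enabled = false;
    model_set_fixmap(a, read_apic_id);
    return a->bad_msr_reads;
}
```

      Passing false from the disable path leaves the cached ID untouched and
      avoids the read entirely, which is the behaviour the message describes.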
      
      Fixes: d10a9044 ("x86/apic: Consolidate boot_cpu_physical_apicid initialization sites")
      Reported-by: Adrian Huang <ahuang12@lenovo.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Adrian Huang <ahuang12@lenovo.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/875xw5t6r7.ffs@tglx
  3. 29 Apr, 2024 2 commits
      x86/sev: Add callback to apply RMP table fixups for kexec · 400fea4b
      Ashish Kalra authored
      Handle cases where the RMP table placement in the BIOS is not 2M aligned
      and the kexec-ed kernel could try to allocate from within that chunk
      which then causes a fatal RMP fault.
      
      The kexec failure is illustrated below:
      
        SEV-SNP: RMP table physical range [0x0000007ffe800000 - 0x000000807f0fffff]
        BIOS-provided physical RAM map:
        BIOS-e820: [mem 0x0000000000000000-0x000000000008efff] usable
        BIOS-e820: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS
        ...
        BIOS-e820: [mem 0x0000004080000000-0x0000007ffe7fffff] usable
        BIOS-e820: [mem 0x0000007ffe800000-0x000000807f0fffff] reserved
        BIOS-e820: [mem 0x000000807f100000-0x000000807f1fefff] usable
      
      As seen here in the e820 memory map, the end range of the RMP table is not
      aligned to 2MB and not reserved but it is usable as RAM.
      
      Subsequently, kexec -s (KEXEC_FILE_LOAD syscall) loads its purgatory
      code and boot_param, command line and other setup data into this RAM
      region as seen in the kexec logs below, which leads to a fatal RMP fault
      during kexec boot.
      
        Loaded purgatory at 0x807f1fa000
        Loaded boot_param, command line and misc at 0x807f1f8000 bufsz=0x1350 memsz=0x2000
        Loaded 64bit kernel at 0x7ffae00000 bufsz=0xd06200 memsz=0x3894000
        Loaded initrd at 0x7ff6c89000 bufsz=0x4176014 memsz=0x4176014
        E820 memmap:
        0000000000000000-000000000008efff (1)
        000000000008f000-000000000008ffff (4)
        0000000000090000-000000000009ffff (1)
        ...
        0000004080000000-0000007ffe7fffff (1)
        0000007ffe800000-000000807f0fffff (2)
        000000807f100000-000000807f1fefff (1)
        000000807f1ff000-000000807fffffff (2)
        nr_segments = 4
        segment[0]: buf=0x00000000e626d1a2 bufsz=0x4000 mem=0x807f1fa000 memsz=0x5000
        segment[1]: buf=0x0000000029c67bd6 bufsz=0x1350 mem=0x807f1f8000 memsz=0x2000
        segment[2]: buf=0x0000000045c60183 bufsz=0xd06200 mem=0x7ffae00000 memsz=0x3894000
        segment[3]: buf=0x000000006e54f08d bufsz=0x4176014 mem=0x7ff6c89000 memsz=0x4177000
        kexec_file_load: type:0, start:0x807f1fa150 head:0x1184d0002 flags:0x0
      
      Check if RMP table start and end physical range in the e820 tables are
      not aligned to 2MB and in that case map this range to reserved in all
      the three e820 tables.
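
      The alignment check can be illustrated with the addresses from the log
      above. The following is a minimal sketch, not the patch itself: the
      helper names, the range_model struct and the SZ_2M_MODEL constant are
      hypothetical. It computes which padding ranges around the RMP table
      would have to be marked reserved so that the covered region is 2MB
      aligned at both ends.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M_MODEL 0x200000ULL /* 2MB, hypothetical name for this sketch */

struct range_model {
    uint64_t start;
    uint64_t end; /* inclusive */
};

static uint64_t align_down_2m(uint64_t addr)
{
    return addr & ~(SZ_2M_MODEL - 1);
}

static uint64_t align_up_2m(uint64_t addr)
{
    return (addr + SZ_2M_MODEL - 1) & ~(SZ_2M_MODEL - 1);
}

/*
 * Given the RMP table range (inclusive end), write the unaligned head
 * and/or tail padding ranges to out[] and return how many were written.
 */
static int rmp_padding(uint64_t rmp_start, uint64_t rmp_end,
                       struct range_model out[2])
{
    uint64_t lo, hi;
    int n = 0;

    assert(rmp_start <= rmp_end);
    lo = align_down_2m(rmp_start);
    hi = align_up_2m(rmp_end + 1);

    if (lo != rmp_start) {            /* start is not 2MB aligned */
        out[n].start = lo;
        out[n].end = rmp_start - 1;
        n++;
    }
    if (hi != rmp_end + 1) {          /* end is not 2MB aligned */
        out[n].start = rmp_end + 1;
        out[n].end = hi - 1;
        n++;
    }
    return n;
}
```

      For the range [0x7ffe800000 - 0x807f0fffff] the start is aligned and the
      end is not, so the sketch yields a single tail range
      [0x807f100000 - 0x807f1fffff], which contains the purgatory and
      boot_param load addresses shown in the kexec log.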
      
        [ bp: Massage. ]
      
      Fixes: c3b86e61 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature")
      Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/df6e995ff88565262c2c7c69964883ff8aa6fc30.1714090302.git.ashish.kalra@amd.com
      x86/e820: Add a new e820 table update helper · d6d85ac1
      Ashish Kalra authored
      Add a new API helper e820__range_update_table() with which to update an
      arbitrary e820 table. Move all current users of
      e820__range_update_kexec() to this new helper.
      
        [ bp: Massage. ]
      Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/b726af213ad55053f8a7a1e793b01bb3f1ca9dd5.1714090302.git.ashish.kalra@amd.com
  4. 28 Apr, 2024 6 commits
  5. 27 Apr, 2024 9 commits
      Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux · 2c815938
      Linus Torvalds authored
      Pull Rust fixes from Miguel Ojeda:
      
       - Soundness: make internal functions generated by the 'module!' macro
         inaccessible, do not implement 'Zeroable' for 'Infallible' and
         require 'Send' for the 'Module' trait.
      
       - Build: avoid errors with "empty" files and workaround 'rustdoc' ICE.
      
       - Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'.
      
       - Code docs: remove non-existing key from 'module!' macro example.
      
       - Docs: trivial rendering fix in arch table.
      
      * tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux:
        rust: remove `params` from `module` macro example
        kbuild: rust: force `alloc` extern to allow "empty" Rust files
        kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE
        rust: kernel: require `Send` for `Module` implementations
        rust: phy: implement `Send` for `Registration`
        rust: make mutually exclusive with CFI_CLANG
        rust: macros: fix soundness issue in `module!` macro
        rust: init: remove impl Zeroable for Infallible
        docs: rust: fix improper rendering in Arch Support page
        rust: don't select CONSTRUCTORS
      Merge tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 57865f39
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
         separation
      
       - A fix to avoid loading rv64/NOMMU kernel past the start of RAM
      
       - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
         overflow in the bitmask
      
       - The sud_test kselftest has been fixed to properly swizzle the syscall
         number into the return register, which are not the same on RISC-V
      
       - A fix for a build warning in the perf tools on rv32
      
       - A fix for the CBO selftests, to avoid non-constants leaking into the
         inline asm
      
       - A pair of fixes for T-Head PBMT errata probing, which has been
         renamed MAE by the vendor
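
      The sign-extension hazard behind the hwprobe fix can be sketched as
      follows. This is an illustration only: it assumes the flag lives in bit
      31 of a 64-bit mask, and the names are hypothetical. Because shifting a
      32-bit signed 1 left by 31 is undefined behaviour in C, the sketch uses
      INT32_MIN as a stand-in for the bit pattern such a shift typically
      produces.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_EXT_BIT 31 /* assumed bit position for this sketch */

/* A 32-bit signed flag widened to 64 bits drags the sign bit along. */
static uint64_t mask_from_signed_32(void)
{
    int32_t flag = INT32_MIN; /* stand-in for a signed (1 << 31) */

    return (uint64_t)flag;    /* bits 32..63 are set as well */
}

/* Building the flag as a 64-bit unsigned value sets exactly one bit. */
static uint64_t mask_from_unsigned_64(void)
{
    assert(MODEL_EXT_BIT < 64);
    return 1ULL << MODEL_EXT_BIT;
}
```

      The first helper returns 0xffffffff80000000, which would appear to
      advertise every extension above bit 31, while the second returns
      0x0000000080000000.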
      
      * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
        perf riscv: Fix the warning due to the incompatible type
        riscv: T-Head: Test availability bit before enabling MAE errata
        riscv: thead: Rename T-Head PBMT to MAE
        selftests: sud_test: return correct emulated syscall value on RISC-V
        riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
        riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
        riscv: Fix TASK_SIZE on 64-bit NOMMU
      Merge tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · d43df69f
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Three smb3 client fixes, all also for stable:
      
         - two small locking fixes spotted by Coverity
      
         - FILE_ALL_INFO and network_open_info packing fix"
      
      * tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        smb3: fix lock ordering potential deadlock in cifs_sync_mid_result
        smb3: missing lock when picking channel
        smb: client: Fix struct_group() usage in __packed structs
      Merge tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5d12ed4b
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Fix a race condition in the at24 eeprom handler, a NULL pointer
        exception in the I2C core for controllers only using target modes,
        drop a MAINTAINERS entry, and fix an incorrect DT binding for at24"
      
      * tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: smbus: fix NULL function pointer dereference
        MAINTAINERS: Drop entry for PCA9541 bus master selector
        eeprom: at24: fix memory corruption race condition
        dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
      profiling: Remove create_prof_cpu_mask(). · 2e5449f4
      Tetsuo Handa authored
      create_prof_cpu_mask() is no longer used after commit 1f44a225 ("s390:
      convert interrupt handling to use generic hardirq").
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Merge tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · 8a5c3ef7
      Linus Torvalds authored
      Pull soundwire fix from Vinod Koul:
      
       - Single AMD driver fix for wake interrupt handling in clockstop mode
      
      * tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
        soundwire: amd: fix for wake interrupt handling for clockstop mode
      Merge tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 6fba14a7
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - Revert pl330 issue_pending waits until WFP state due to regression
         reported in Bluetooth loading
      
       - Xilinx driver fixes for synchronization, buffer offsets, locking and
         kdoc
      
       - idxd fixes for spinlock and preventing the migration of the perf
         context to an invalid target
      
       - idma driver fix for interrupt handling when powered off
      
       - Tegra driver residual calculation fix
      
       - Owl driver register access fix
      
      * tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: idxd: Fix oops during rmmod on single-CPU platforms
        dmaengine: xilinx: xdma: Clarify kdoc in XDMA driver
        dmaengine: xilinx: xdma: Fix synchronization issue
        dmaengine: xilinx: xdma: Fix wrong offsets in the buffers addresses in dma descriptor
        dma: xilinx_dpdma: Fix locking
        dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue
        idma64: Don't try to serve interrupts when device is powered off
        dmaengine: tegra186: Fix residual calculation
        dmaengine: owl: fix register access functions
        dmaengine: Revert "dmaengine: pl330: issue_pending waits until WFP state"
      Merge tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · 63407d30
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - static checker (array size, bounds) fix for marvell driver
      
       - Rockchip rk3588 pcie fixes for bifurcation and mux
      
       - Qualcomm qmp-combo fix for VCO, register base and regulator name for
         m31 driver
      
       - charger det crash fix for ti driver
      
      * tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregistered
        phy: qcom: qmp-combo: fix VCO div offset on v5_5nm and v6
        phy: phy-rockchip-samsung-hdptx: Select CONFIG_RATIONAL
        phy: qcom: m31: match requested regulator name with dt schema
        phy: qcom: qmp-combo: Fix register base for QSERDES_DP_PHY_MODE
        phy: qcom: qmp-combo: Fix VCO div offset on v3
        phy: rockchip: naneng-combphy: Fix mux on rk3588
        phy: rockchip-snps-pcie3: fix clearing PHP_GRF_PCIESEL_CON bits
        phy: rockchip-snps-pcie3: fix bifurcation on rk3588
        phy: freescale: imx8m-pcie: fix pcie link-up instability
        phy: marvell: a3700-comphy: Fix hardcoded array size
        phy: marvell: a3700-comphy: Fix out of bounds read
      i2c: smbus: fix NULL function pointer dereference · 91811a31
      Wolfram Sang authored
      Baruch reported an OOPS when using the designware controller as target
      only. Target-only modes break the assumption of one transfer function
      always being available. Fix this by always checking the pointer in
      __i2c_transfer.
      Reported-by: Baruch Siach <baruch@tkos.co.il>
      Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il
      Fixes: 4b1acc43 ("i2c: core changes for slave support")
      [wsa: dropped the simplification in core-smbus to avoid theoretical regressions]
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Tested-by: Baruch Siach <baruch@tkos.co.il>
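
      The pattern described in this commit, checking the transfer callback
      before calling it, can be sketched in user space as below. This is a
      simplified model with hypothetical names (adapter_model, model_transfer,
      echo_xfer), and the choice of -EOPNOTSUPP as the error code is an
      assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical adapter; xfer is NULL for a target-only controller. */
struct adapter_model {
    int (*xfer)(int num_msgs);
};

/* Always check the callback, so a target-only adapter fails cleanly. */
static int model_transfer(const struct adapter_model *adap, int num_msgs)
{
    assert(adap != NULL);
    if (!adap->xfer)
        return -EOPNOTSUPP;
    return adap->xfer(num_msgs);
}

/* Trivial callback for the example: reports every message as transferred. */
static int echo_xfer(int num_msgs)
{
    return num_msgs;
}
```

      With an adapter whose xfer is NULL, model_transfer() returns an error
      instead of jumping through a NULL pointer; with echo_xfer installed it
      returns the message count.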
  6. 26 Apr, 2024 21 commits