- 16 Sep, 2024 3 commits
-
-
Xiao Wang authored
The macro CONFIG_RISCV_PMU must have been defined when riscv_pmu.c gets compiled, so this patch removes the redundant check. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240708121224.1148154-1-xiao.w.wang@intel.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Palmer Dabbelt authored
Jinjie Ruan <ruanjinjie@huawei.com> says: Add RISC-V USER_STACKTRACE support, and fix the fp alignment bug in perf_callchain_user() by the way as Björn pointed out. * b4-shazam-merge: riscv: stacktrace: Add USER_STACKTRACE support riscv: Fix fp alignment bug in perf_callchain_user() Link: https://lore.kernel.org/r/20240708032847.2998158-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
This is used in poison.h for poison pointer offset. Based on current SV39, SV48 and SV57 vm layout, 0xdead000000000000 is a proper value that is not mappable, this can avoid potentially turning an oops to an expolit. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Fixes: fbe934d6 ("RISC-V: Build Infrastructure") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240705170210.3236-1-jszhang@kernel.orgSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 15 Sep, 2024 2 commits
-
-
Jinjie Ruan authored
Currently, userstacktrace is unsupported for riscv. So use the perf_callchain_user() code as blueprint to implement the arch_stack_walk_user() which add userstacktrace support on riscv. Meanwhile, we can use arch_stack_walk_user() to simplify the implementation of perf_callchain_user(). A ftrace test case is shown as below: # cd /sys/kernel/debug/tracing # echo 1 > options/userstacktrace # echo 1 > options/sym-userobj # echo 1 > events/sched/sched_process_fork/enable # cat trace ...... bash-178 [000] ...1. 97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231 bash-178 [000] ...1. 97.970075: <user stack trace> => /lib/libc.so.6[+0xb5090] Also a simple perf test is ok as below: # perf record -e cpu-clock --call-graph fp top # perf report --call-graph ..... [[31m 66.54%[[m 0.00% top [kernel.kallsyms] [k] ret_from_exception | ---ret_from_exception | |--[[31m58.97%[[m--do_trap_ecall_u | | | |--[[31m17.34%[[m--__riscv_sys_read | | ksys_read | | | | | --[[31m16.88%[[m--vfs_read | | | | | |--[[31m10.90%[[m--seq_read Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Tested-by: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Jinjie Ruan authored
The standard RISC-V calling convention said: "The stack grows downward and the stack pointer is always kept 16-byte aligned". So perf_callchain_user() should check whether 16-byte aligned for fp. Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf Fixes: dbeb90b0 ("riscv: Add perf callchain support") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 14 Sep, 2024 2 commits
-
-
Stuart Menefy authored
The original reason for reserving the top 4GiB of the direct map (space for modules/BPF/kernel) hasn't applied since the address map was reworked for KASAN. Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Changbin Du authored
The vdso.so.dbg is a debug version of vdso and could be used for debugging purpose. For example, perf-annotate requires debugging info to show source lines. So let's keep its debugging info. Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 12 Sep, 2024 9 commits
-
-
Palmer Dabbelt authored
Nam Cao <namcao@linutronix.de> says: Hi, For XIP kernel, the writable data section is always at offset specified in XIP_OFFSET, which is hard-coded to 32MB. Unfortunately, this means the read-only section (placed before the writable section) is restricted in size. This causes build failure if the kernel gets too large. This series remove the use of XIP_OFFSET one by one, then remove this macro entirely at the end, with the goal of lifting this size restriction. Also some cleanup and documentation along the way. * b4-shazam-merge riscv: remove limit on the size of read-only section for XIP kernel riscv: drop the use of XIP_OFFSET in create_kernel_page_table() riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa() riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET riscv: replace misleading va_kernel_pa_offset on XIP kernel riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel riscv: cleanup XIP_FIXUP macro riscv: change XIP's kernel_map.size to be size of the entire kernel ... Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. This causes build failures if the kernel gets too big [1]. Remove this limit. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1] Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded value entirely, stop using XIP_OFFSET in create_kernel_page_table(). Instead use _sdata and _start to do the same thing. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/4ea3f222a7eb9f91c04b155ff2e4d3ef19158acc.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded macro XIP_OFFSET entirely, remove the use of XIP_OFFSET in kernel_mapping_va_to_pa(). The macro XIP_OFFSET is used in this case to check if the virtual address is mapped to Flash or to RAM. The same check can be done with kernel_map.xiprom_sz. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/644c13d9467525a06f5d63d157875a35b2edb4bc.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop using XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET. Instead, use __data_loc and _sdata to do the same thing. While at it, also add a description for XIP_FIXUP_FLASH_OFFSET. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/7b3319657edd1822f3457e7e7c07aaa326cc2f87.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
XIP_OFFSET is the hard-coded offset of writable data section within the kernel. By hard-coding this value, the read-only section of the kernel (which is placed before the writable data section) is restricted in size. As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop using XIP_OFFSET in XIP_FIXUP_OFFSET. Instead, use CONFIG_PHYS_RAM_BASE and _sdata to do the same thing. While at it, also add a description for XIP_FIXUP_OFFSET. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/dba0409518b14ee83b346e099b1f7f934daf7b74.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
On XIP kernel, the name "va_kernel_pa_offset" is misleading: unlike "normal" kernel, it is not the virtual-physical address offset of kernel mapping, it is the offset of kernel mapping's first virtual address to first physical address in DRAM, which is not meaningful because the kernel's first physical address is not in DRAM. For XIP kernel, there are 2 different offsets because the read-only part of the kernel resides in ROM while the rest is in RAM. The offset to ROM is in kernel_map.va_kernel_xip_pa_offset, while the offset to RAM is not stored anywhere: it is calculated on-the-fly. Remove this confusing "va_kernel_pa_offset" and add "va_kernel_xip_data_pa_offset" as its replacement. This new variable is the offset of virtual mapping of the kernel's data portion to the corresponding physical addresses. With the introduction of this new variable, also rename va_kernel_xip_pa_offset -> va_kernel_xip_text_pa_offset to make it clear that this one is about the .text section. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/84e5d005c1386d88d7b2531e0b6707ec5352ee54.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
The crash utility uses va_kernel_pa_offset to translate virtual addresses. This is incorrect in the case of XIP kernel, because va_kernel_pa_offset is not the virtual-physical address offset (yes, the name is misleading; this variable will be removed for XIP in a following commit). Stop exporting this variable for XIP kernel. The replacement is to be determined, note it as a TODO for now. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/8f8760d3f9a11af4ea0acbc247e4f49ff5d317e9.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Nam Cao authored
The XIP_FIXUP macro is used to fix addresses early during boot before MMU: generated code "thinks" the data section is in ROM while it is actually in RAM. So this macro corrects the addresses in the data section. This macro determines if the address needs to be fixed by checking if it is within the range starting from ROM address up to the size of (2 * XIP_OFFSET). This means if the kernel size is bigger than (2 * XIP_OFFSET), some addresses would not be fixed up. XIP kernel can still work if the above scenario does not happen. But this macro is obviously incorrect. Rewrite this macro to only fix up addresses within the data section. Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/95f50a4ec8204ec4fcbf2a80c9addea0e0609e3b.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 03 Sep, 2024 2 commits
-
-
Charlie Jenkins authored
Add a missing license to vmalloc.h. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-2-7d5648069640@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Charlie Jenkins authored
Add a missing license to fence.h. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-1-7d5648069640@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 15 Aug, 2024 2 commits
-
-
Jinjie Ruan authored
Currently x86, ARM and ARM64 support generic CPU vulnerabilites, but RISC-V not, such as: # cd /sys/devices/system/cpu/vulnerabilities/ x86: # cat spec_store_bypass Mitigation: Speculative Store Bypass disabled via prctl and seccomp # cat meltdown Not affected ARM64: # cat spec_store_bypass Mitigation: Speculative Store Bypass disabled via prctl and seccomp # cat meltdown Mitigation: PTI RISC-V: # cat /sys/devices/system/cpu/vulnerabilities # ... No such file or directory As SiFive RISC-V Core IP offerings are not affected by Meltdown and Spectre, it can use the default weak function as below: # cat spec_store_bypass Not affected # cat meltdown Not affected Link: https://www.sifive.cn/blog/sifive-statement-on-meltdown-and-spectreSigned-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20240703022732.2068316-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Ying Sun authored
Runs on the kernel with CONFIG_RISCV_ALTERNATIVE enabled: kexec -sl vmlinux Error: kexec_image: Unknown rela relocation: 34 kexec_image: Error loading purgatory ret=-8 and kexec_image: Unknown rela relocation: 38 kexec_image: Error loading purgatory ret=-8 The purgatory code uses the 16-bit addition and subtraction relocation type, but not handled, resulting in kexec_file_load failure. So add handle to arch_kexec_apply_relocations_add(). Tested on RISC-V64 Qemu-virt, issue fixed. Co-developed-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Ying Sun <sunying@isrc.iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cnSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 14 Aug, 2024 2 commits
-
-
Nam Cao authored
With XIP kernel, kernel_map.size is set to be only the size of data part of the kernel. This is inconsistent with "normal" kernel, who sets it to be the size of the entire kernel. More importantly, XIP kernel fails to boot if CONFIG_DEBUG_VIRTUAL is enabled, because there are checks on virtual addresses with the assumption that kernel_map.size is the size of the entire kernel (these checks are in arch/riscv/mm/physaddr.c). Change XIP's kernel_map.size to be the size of the entire kernel. Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: <stable@vger.kernel.org> # v6.1+ Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240508191917.2892064-1-namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Celeste Liu authored
Otherwise when the tracer changes syscall number to -1, the kernel fails to initialize a0 with -ENOSYS and subsequently fails to return the error code of the failed syscall to userspace. For example, it will break strace syscall tampering. Fixes: 52449c17 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1") Reported-by: "Dmitry V. Levin" <ldv@strace.io> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Cc: stable@vger.kernel.org Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com> Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 07 Aug, 2024 1 commit
-
-
Ryo Takakura authored
Add arch_trigger_cpumask_backtrace() which is a generic infrastructure for sampling other CPUs' backtrace using IPI. The feature is used when lockups are detected or in case of oops/panic if parameters are set accordingly. Below is the case of oops with the oops_all_cpu_backtrace enabled. $ sysctl kernel.oops_all_cpu_backtrace=1 triggering oops shows: [ 212.214237] NMI backtrace for cpu 1 [ 212.214390] CPU: 1 PID: 610 Comm: in:imklog Tainted: G OE 6.10.0-rc6 #1 [ 212.214570] Hardware name: riscv-virtio,qemu (DT) [ 212.214690] epc : fallback_scalar_usercopy+0x8/0xdc [ 212.214809] ra : _copy_to_user+0x20/0x40 [ 212.214913] epc : ffffffff80c3a930 ra : ffffffff8059ba7e sp : ff20000000eabb50 [ 212.215061] gp : ffffffff82066f90 tp : ff6000008e958000 t0 : 3463303866660000 [ 212.215210] t1 : 000000000000005b t2 : 3463303866666666 s0 : ff20000000eabb60 [ 212.215358] s1 : 0000000000000386 a0 : 00007ff6e81df926 a1 : ff600000824df800 [ 212.215505] a2 : 000000000000003f a3 : 7fffffffffffffc0 a4 : 0000000000000000 [ 212.215651] a5 : 000000000000003f a6 : 0000000000000000 a7 : 0000000000000000 [ 212.215857] s2 : ff600000824df800 s3 : ffffffff82066cc0 s4 : 0000000000001c1a [ 212.216074] s5 : ffffffff8206a5a8 s6 : 00007ff6e81df926 s7 : ffffffff8206a5a0 [ 212.216278] s8 : ff600000824df800 s9 : ffffffff81e25de0 s10: 000000000000003f [ 212.216471] s11: ffffffff8206a59d t3 : ff600000824df812 t4 : ff600000824df812 [ 212.216651] t5 : ff600000824df818 t6 : 0000000000040000 [ 212.216796] status: 0000000000040120 badaddr: 0000000000000000 cause: 8000000000000001 [ 212.217035] [<ffffffff80c3a930>] fallback_scalar_usercopy+0x8/0xdc [ 212.217207] [<ffffffff80095f56>] syslog_print+0x1f4/0x2b2 [ 212.217362] [<ffffffff80096e5c>] do_syslog.part.0+0x94/0x2d8 [ 212.217502] [<ffffffff800979e8>] do_syslog+0x66/0x88 [ 212.217636] [<ffffffff803a5dda>] kmsg_read+0x44/0x5c [ 212.217764] [<ffffffff80392dbe>] proc_reg_read+0x7a/0xa8 [ 212.217952] [<ffffffff802ff726>] vfs_read+0xb0/0x24e [ 212.218090] [<ffffffff803001ba>] ksys_read+0x64/0xe4 [ 212.218264] [<ffffffff8030025a>] __riscv_sys_read+0x20/0x2c [ 212.218453] [<ffffffff80c4af9a>] do_trap_ecall_u+0x60/0x1d4 [ 212.218664] [<ffffffff80c56998>] ret_from_exception+0x0/0x64 Signed-off-by: Ryo Takakura <takakura@valinux.co.jp> Link: https://lore.kernel.org/r/20240718093659.158912-1-takakura@valinux.co.jpSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 06 Aug, 2024 1 commit
-
-
Alexandre Ghiti authored
commit edf2d546 ("riscv: patch: Flush the icache right after patching to avoid illegal insns") mistakenly removed the global icache flush in patch_text_nosync() and patch_text_set_nosync() functions, so reintroduce them. Fixes: edf2d546 ("riscv: patch: Flush the icache right after patching to avoid illegal insns") Reported-by: Samuel Holland <samuel.holland@sifive.com> Closes: https://lore.kernel.org/linux-riscv/a28ddc26-d77a-470a-a33f-88144f717e86@sifive.com/Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240801191404.55181-1-alexghiti@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 05 Aug, 2024 5 commits
-
-
Palmer Dabbelt authored
Jesse Taube <jesse@rivosinc.com> says: Add functions to pi/fdt_early.c to help parse the FDT to check if the isa string has the Zkr extension. Then use the Zkr extension to seed the KASLR base address. The first two patches fix the visibility of symbols. * b4-shazam-merge: RISC-V: Use Zkr to seed KASLR base address RISC-V: pi: Add kernel/pi/pi.h RISC-V: lib: Add pi aliases for string functions RISC-V: pi: Force hidden visibility for all symbol references Link: https://lore.kernel.org/r/20240709173937.510084-1-jesse@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Jesse Taube authored
Parse the device tree for Zkr in the isa string. If Zkr is present, use it to seed the kernel base address. On an ACPI system, as of this commit, there is no easy way to check if Zkr is present. Blindly running the instruction isn't an option as; we have to be able to trust the firmware. Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Zong Li <zong.li@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240709173937.510084-5-jesse@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Jesse Taube authored
Add pi.h header for declarations of the kernel/pi prefixed functions and any other related declarations. Suggested-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240709173937.510084-4-jesse@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Jesse Taube authored
memset, strcmp, and strncmp are all used in the __pi_ section, add SYM_FUNC_ALIAS for them. When KASAN is enabled in <asm/string.h> __pi___memset is also needed. Suggested-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240709173937.510084-3-jesse@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Jesse Taube authored
Eliminate all GOT entries in the .pi section, by forcing hidden visibility for all symbol references, which informs the compiler that such references will be resolved at link time without the need for allocating GOT entries. Include linux/hidden.h in Makefile, like arm64, for the hidden visibility attribute. Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240709173937.510084-2-jesse@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
- 04 Aug, 2024 11 commits
-
-
Linus Torvalds authored
-
Tetsuo Handa authored
The kernel sleep profile is no longer working due to a recursive locking bug introduced by commit 42a20f86 ("sched: Add wrapper for get_wchan() to keep task blocked") Booting with the 'profile=sleep' kernel command line option added or executing # echo -n sleep > /sys/kernel/profiling after boot causes the system to lock up. Lockdep reports kthreadd/3 is trying to acquire lock: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70 but task is already holding lock: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370 with the call trace being lock_acquire+0xc8/0x2f0 get_wchan+0x32/0x70 __update_stats_enqueue_sleeper+0x151/0x430 enqueue_entity+0x4b0/0x520 enqueue_task_fair+0x92/0x6b0 ttwu_do_activate+0x73/0x140 try_to_wake_up+0x213/0x370 swake_up_locked+0x20/0x50 complete+0x2f/0x40 kthread+0xfb/0x180 However, since nobody noticed this regression for more than two years, let's remove 'profile=sleep' support based on the assumption that nobody needs this functionality. Fixes: 42a20f86 ("sched: Add wrapper for get_wchan() to keep task blocked") Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: - Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver. A recent change in the ACPI code which consolidated code pathes moved the invocation of init_freq_invariance_cppc() to be moved to a CPU hotplug handler. The first invocation on AMD CPUs ends up enabling a static branch which dead locks because the static branch enable tries to acquire cpu_hotplug_lock but that lock is already held write by the hotplug machinery. Use static_branch_enable_cpuslocked() instead and take the hotplug lock read for the Intel code path which is invoked from the architecture code outside of the CPU hotplug operations. - Fix the number of reserved bits in the sev_config structure bit field so that the bitfield does not exceed 64 bit. - Add missing Zen5 model numbers - Fix the alignment assumptions of pti_clone_pgtable() and clone_entry_text() on 32-bit: The code assumes PMD aligned code sections, but on 32-bit the kernel entry text is not PMD aligned. So depending on the code size and location, which is configuration and compiler dependent, entry text can cross a PMD boundary. As the start is not PMD aligned adding PMD size to the start address is larger than the end address which results in partially mapped entry code for user space. That causes endless recursion on the first entry from userspace (usually #PF). Cure this by aligning the start address in the addition so it ends up at the next PMD start address. clone_entry_text() enforces PMD mapping, but on 32-bit the tail might eventually be PTE mapped, which causes a map fail because the PMD for the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for the clone() invocation which resolves to PTE on 32-bit and PMD on 64-bit. - Zero the 8-byte case for get_user() on range check failure on 32-bit The recend consolidation of the 8-byte get_user() case broke the zeroing in the failure case again. Establish it by clearing ECX before the range check and not afterwards as that obvioulsy can't be reached when the range check fails * tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit x86/mm: Fix pti_clone_entry_text() for i386 x86/mm: Fix pti_clone_pgtable() alignment assumption x86/setup: Parse the builtin command line before merging x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range x86/sev: Fix __reserved field in sev_config x86/aperfmperf: Fix deadlock on cpu_hotplug_lock
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fixes from Thomas Gleixner: "Two fixes for the timer/clocksource code: - The recent fix to make the take over of the broadcast timer more reliable retrieves a per CPU pointer in preemptible context. This went unnoticed in testing as some compilers hoist the access into the non-preemotible section where the pointer is actually used, but obviously compilers can rightfully invoke it where the code put it. Move it into the non-preemptible section right to the actual usage side to cure it. - The clocksource watchdog is supposed to emit a warning when the retry count is greater than one and the number of retries reaches the limit. The condition is backwards and warns always when the count is greater than one. Fixup the condition to prevent spamming dmesg" * tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Fix brown-bag boolean thinko in cs_watchdog_read() tick/broadcast: Move per CPU pointer access into the atomic section
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixes from Thomas Gleixner: - When stime is larger than rtime due to accounting imprecision, then utime = rtime - stime becomes negative. As this is unsigned math, the result becomes a huge positive number. Cure it by resetting stime to rtime in that case, so utime becomes 0. - Restore consistent state when sched_cpu_deactivate() fails. When offlining a CPU fails in sched_cpu_deactivate() after the SMT present counter has been decremented, then the function aborts but fails to increment the SMT present counter and leaves it imbalanced. Consecutive operations cause it to underflow. Add the missing fixup for the error path. For SMT accounting the runqueue needs to marked online again in the error exit path to restore consistent state. * tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate() sched/core: Introduce sched_set_rq_on/offline() helper sched/smt: Fix unbalance sched_smt_present dec/inc sched/smt: Introduce sched_smt_present_inc/dec() helper sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 perf fixes from Thomas Gleixner: - Move the smp_processor_id() invocation back into the non-preemtible region, so that the result is valid to use - Add the missing package C2 residency counters for Sierra Forest CPUs to make the newly added support actually useful * tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix smp_processor_id()-in-preemptible warnings perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fixes from Thomas Gleixner: "A couple of fixes for interrupt chip drivers: - Make sure to skip the clear register space in the MBIGEN driver when calculating the node register index. Otherwise the clear register is clobbered and the wrong node registers are accessed. - Fix a signed/unsigned confusion in the loongarch CPU driver which converts an error code to a huge "valid" interrupt number. - Convert the mesion GPIO interrupt controller lock to a raw spinlock so it works on RT. - Add a missing static to a internal function in the pic32 EVIC driver" * tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mbigen: Fix mbigen node address layout irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t' irqchip/irq-pic32-evic: Add missing 'static' to internal function irqchip/loongarch-cpu: Fix return value of lpic_gsi_to_irq()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull locking fixes from Thomas Gleixner: "Two fixes for locking and jump labels: - Ensure that the atomic_cmpxchg() conditions are correct and evaluating to true on any non-zero value except 1. The missing check of the return value leads to inconsisted state of the jump label counter. - Add a missing type conversion in the paravirt spinlock code which makes loongson build again" * tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: jump_label: Fix the fix, brown paper bags galore locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node()
-
Rob Herring (Arm) authored
Commit 04f08ef2 ("arm/arm64: dts: arm: Use generic clock and regulator nodenames") renamed nodes and created 2 "clock-24000000" nodes (at different paths). The kernel can't handle these duplicate names even though they are at different paths. Fix this by renaming one of the nodes to "clock-pclk". This name is aligned with other Arm boards (those didn't have a known frequency to use in the node name). Fixes: 04f08ef2 ("arm/arm64: dts: arm: Use generic clock and regulator nodenames") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull smb client fixes from Steve French: - two reparse point fixes - minor cleanup - additional trace point (to help debug a recent problem) * tag '6.11-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal version number smb: client: fix FSCTL_GET_REPARSE_POINT against NetApp smb3: add dynamic tracepoints for shutdown ioctl cifs: Remove cifs_aio_ctx smb: client: handle lack of FSCTL_GET_REPARSE_POINT support
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-mediaLinus Torvalds authored
Pull media fixes from Mauro Carvalho Chehab: - two Kconfig fixes - one fix for the UVC driver addressing probing time detection of a UVC custom controls - one fix related to PDF generation * tag 'media/v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: v4l: Fix missing tabular column hint for Y14P format media: intel/ipu6: select AUXILIARY_BUS in Kconfig media: ipu-bridge: fix ipu6 Kconfig dependencies media: uvcvideo: Fix custom control mapping probing
-