Commits · 21d98d658f9e5967dc30c321bc258b50740c6665 · Kirill Smelkov / linux

17 Sep, 2024 7 commits

ACPI: RISCV: Make acpi_numa_get_nid() to be static · 21d98d65

Hanjun Guo authored Aug 11, 2024

acpi_numa_get_nid() is only called in acpi_numa.c for riscv,
no need to add it in head file, so make it static and remove
related functions in the asm/acpi.h.

Spotted by doing some cleanup for arm64 ACPI.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Link: https://lore.kernel.org/r/20240811031804.3347298-1-guohanjun@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

21d98d65

riscv: Randomize lower bits of stack address · 048e2906

Yunhui Cui authored Jun 25, 2024

Implement arch_align_stack() to randomize the lower bits
of the stack address.
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240625030502.68988-1-cuiyunhui@bytedance.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

048e2906

selftests: riscv: Allow mmap test to compile on 32-bit · 11c2dbd7

Charlie Jenkins authored Aug 08, 2024

Macros needed for 32-bit compilations were hidden behind 64-bit riscv
ifdefs. Fix the 32-bit compilations by moving macros to allow the
memory_layout test to run on 32-bit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 73d05262 ("selftests: riscv: Generalize mm selftests")
Link: https://lore.kernel.org/r/20240808-mmap_tests__fixes-v1-1-b1344b642a84@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

11c2dbd7

riscv: Make riscv_isa_vendor_ext_andes array static · 594ffcf4

Charlie Jenkins authored Aug 07, 2024

Since this array is only used in this file, it should be static.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407241530.ej5SVgX1-lkp@intel.com/Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240807-make_andes_static-v1-1-b64bf4c3d941@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

594ffcf4

riscv: Use LIST_HEAD() to simplify code · 3cc754c2

Jinjie Ruan authored Sep 04, 2024

list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD().
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240904013344.2026738-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

3cc754c2

riscv: defconfig: Disable RZ/Five peripheral support · e36ddf32

Geert Uytterhoeven authored Jul 30, 2024

There is not much point in keeping support for RZ/Five peripherals
enabled, as the RZ/Five platform option (ARCH_R9A07G043) is gated behind
NONPORTABLE. Hence drop all config options that enable built-in or
modular support for peripherals found on RZ/Five SoCs.

Disable USB_XHCI_RCAR explicitly, as its value defaults to the value of
ARCH_RENESAS, which is still enabled.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/89ad70c7d6e8078208fecfd41dc03f6028531729.1722353710.git.geert+renesas@glider.beSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

e36ddf32

RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup · 983f1214

Jinjie Ruan authored Jul 27, 2024

Until now, the generic weak kgdb_roundup_cpus() has been used for kgdb on
RISCV. A custom one allows to debug CPUs that are stuck with interrupts
disabled with NMI support in the future. And using an IPI is better than
the generic one since it avoids the potential situation described in the
generic kgdb_call_nmi_hook(). As Andrew pointed out, once there is NMI
support, we can easily extend this and the CPU backtrace support
to use NMIs.

After this patch, the kgdb test show that:
	# echo g > /proc/sysrq-trigger
	[2]kdb> btc
	btc: cpu status: Currently on cpu 2
	Available cpus: 0-1(-), 2, 3(-)
	Stack traceback for pid 0
	0xffffffff81c13a40        0        0  1    0   -  0xffffffff81c14510  swapper/0
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-g3120273055b6-dirty #51
	Hardware name: riscv-virtio,qemu (DT)
	Call Trace:
	[<ffffffff80006c48>] dump_backtrace+0x28/0x30
	[<ffffffff80fceb38>] show_stack+0x38/0x44
	[<ffffffff80fe6a04>] dump_stack_lvl+0x58/0x7a
	[<ffffffff80fe6a3e>] dump_stack+0x18/0x20
	[<ffffffff801143fa>] kgdb_cpu_enter+0x682/0x6b2
	[<ffffffff801144ca>] kgdb_nmicallback+0xa0/0xac
	[<ffffffff8000a392>] handle_IPI+0x9c/0x120
	[<ffffffff800a2baa>] handle_percpu_devid_irq+0xa4/0x1e4
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff800a9e5c>] ipi_mux_process+0xe8/0x110
	[<ffffffff806e1e30>] imsic_handle_irq+0xf8/0x13a
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff806dff12>] riscv_intc_aia_irq+0x2e/0x40
	[<ffffffff80fe6ab0>] handle_riscv_irq+0x54/0x86
	[<ffffffff80ff2e4a>] call_on_irq_stack+0x32/0x40

Rebased on Ryo Takakura's "RISC-V: Enable IPI CPU Backtrace" patch.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240727063438.886155-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

983f1214

16 Sep, 2024 8 commits

riscv: avoid Imbalance in RAS · 8f1534e7

Jisheng Zhang authored Jul 21, 2024

Inspired by[1], modify the code to remove the code of modifying ra to
avoid imbalance RAS (return address stack) which may lead to incorret
predictions on return.

Link: https://lore.kernel.org/linux-riscv/20240607061335.2197383-1-cyrilbur@tenstorrent.com/ [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Link: https://lore.kernel.org/r/20240720170659.1522-1-jszhang@kernel.orgSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

8f1534e7

Merge patch series "Svvptc extension to remove preventive sfence.vma" · 7e340f4f

Palmer Dabbelt authored Sep 15, 2024

Alexandre Ghiti <alexghiti@rivosinc.com> says:

In RISC-V, after a new mapping is established, a sfence.vma needs to be
emitted for different reasons:

- if the uarch caches invalid entries, we need to invalidate it otherwise
  we would trap on this invalid entry,
- if the uarch does not cache invalid entries, a reordered access could fail
  to see the new mapping and then trap (sfence.vma acts as a fence).

We can actually avoid emitting those (mostly) useless and costly sfence.vma
by handling the traps instead:

- for new kernel mappings: only vmalloc mappings need to be taken care of,
  other new mapping are rare and already emit the required sfence.vma if
  needed.
  That must be achieved very early in the exception path as explained in
  patch 3, and this also fixes our fragile way of dealing with vmalloc faults.

- for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can
  take some gratuitous page faults (which are very unlikely though).

Patch 1 and 2 introduce Svvptc extension probing.

On our uarch that does not cache invalid entries and a 6.5 kernel, the
gains are measurable:

* Kernel boot:                  6%
* ltp - mmapstress01:           8%
* lmbench - lat_pagefault:      20%
* lmbench - lat_mmap:           5%

Here are the corresponding numbers of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Thanks to Ved and Matt Evans for triggering the discussion that led to
this patchset!

* b4-shazam-merge:
  riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
  riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
  dt-bindings: riscv: Add Svvptc ISA extension description
  riscv: Add ISA extension parsing for Svvptc

Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

7e340f4f

riscv: cacheinfo: Add back init_cache_level() function · 1845d381

Steffen Persvold authored Jul 07, 2024

commit 5944ce09 (arch_topology: Build cacheinfo from primary CPU)
removed the init_cache_level() function from arch/riscv/kernel/cacheinfo.c
and relies on the init_cpu_topology() function in drivers/base/arch_topology.c
to call fetch_cache_info() which in turn calls init_of_cache_level() to
populate the cache hierarchy information. However, init_cpu_topology() is only
called from smpboot.c:smp_prepare_cpus() and thus only available when
CONFIG_SMP is defined.

To support non-SMP enabled kernels to still detect cache hierarchy, we add back
the init_cache_level() function. The init_level_allocate_ci() function handles
this gracefully on SMP-enabled kernels anyway where fetch_cache_info() is
called from init_cpu_topology() earlier in the boot phase.
Signed-off-by: Steffen Persvold <spersvold@gmail.com>
Link: https://lore.kernel.org/r/20240707003515.5058-1-spersvold@gmail.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

1845d381

riscv: Remove unused _TIF_WORK_MASK · cea9d277

Jinjie Ruan authored Jul 11, 2024

Since commit f0bddf50 ("riscv: entry: Convert to generic entry"),
_TIF_WORK_MASK is no longer used, so remove it.

Fixes: f0bddf50 ("riscv: entry: Convert to generic entry")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240711111508.1373322-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

cea9d277

Merge patch series "riscv: select ARCH_USE_SYM_ANNOTATIONS" · 9b2863e2

Palmer Dabbelt authored Sep 15, 2024

Jisheng Zhang <jszhang@kernel.org> says:

commit 76329c69 ("riscv: Use SYM_*() assembly macros instead
of deprecated ones"), most riscv has been to converted the new style
SYM_ assembler annotations. The remaining one is sifive's
errata_cip_453.S, so convert to new style SYM_ annotations as well.
After that select ARCH_USE_SYM_ANNOTATIONS.

* b4-shazam-merge:
  riscv: select ARCH_USE_SYM_ANNOTATIONS
  riscv: errata: sifive: Use SYM_*() assembly macros

Link: https://lore.kernel.org/r/20240709160536.3690-1-jszhang@kernel.orgSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

9b2863e2

drivers/perf: riscv: Remove redundant macro check · 1e206fad

Xiao Wang authored Jul 08, 2024

The macro CONFIG_RISCV_PMU must have been defined when riscv_pmu.c gets
compiled, so this patch removes the redundant check.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240708121224.1148154-1-xiao.w.wang@intel.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

1e206fad

Merge patch series "riscv: stacktrace: Add USER_STACKTRACE support" · f25170a0

Palmer Dabbelt authored Sep 14, 2024

Jinjie Ruan <ruanjinjie@huawei.com> says:

Add RISC-V USER_STACKTRACE support, and fix the fp alignment bug
in perf_callchain_user() by the way as Björn pointed out.

* b4-shazam-merge:
  riscv: stacktrace: Add USER_STACKTRACE support
  riscv: Fix fp alignment bug in perf_callchain_user()

Link: https://lore.kernel.org/r/20240708032847.2998158-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

f25170a0

riscv: define ILLEGAL_POINTER_VALUE for 64bit · 5c178472

Jisheng Zhang authored Jul 06, 2024

This is used in poison.h for poison pointer offset. Based on current
SV39, SV48 and SV57 vm layout, 0xdead000000000000 is a proper value
that is not mappable, this can avoid potentially turning an oops to
an expolit.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: fbe934d6 ("RISC-V: Build Infrastructure")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240705170210.3236-1-jszhang@kernel.orgSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

5c178472

15 Sep, 2024 8 commits

riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc · 7a21b2e3

Alexandre Ghiti authored Jul 17, 2024

The preventive sfence.vma were emitted because new mappings must be made
visible to the page table walker but Svvptc guarantees that it will
happen within a bounded timeframe, so no need to sfence.vma for the uarchs
that implement this extension, we will then take gratuitous (but very
unlikely) page faults, similarly to x86 and arm64.

This allows to drastically reduce the number of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240717060125.139416-5-alexghiti@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

7a21b2e3

riscv: Stop emitting preventive sfence.vma for new vmalloc mappings · 503638e0

Alexandre Ghiti authored Jul 17, 2024

In 6.5, we removed the vmalloc fault path because that can't work (see
[1] [2]). Then in order to make sure that new page table entries were
seen by the page table walker, we had to preventively emit a sfence.vma
on all harts [3] but this solution is very costly since it relies on IPI.

And even there, we could end up in a loop of vmalloc faults if a vmalloc
allocation is done in the IPI path (for example if it is traced, see
[4]), which could result in a kernel stack overflow.

Those preventive sfence.vma needed to be emitted because:

- if the uarch caches invalid entries, the new mapping may not be
  observed by the page table walker and an invalidation may be needed.
- if the uarch does not cache invalid entries, a reordered access
  could "miss" the new mapping and traps: in that case, we would actually
  only need to retry the access, no sfence.vma is required.

So this patch removes those preventive sfence.vma and actually handles
the possible (and unlikely) exceptions. And since the kernel stacks
mappings lie in the vmalloc area, this handling must be done very early
when the trap is taken, at the very beginning of handle_exception: this
also rules out the vmalloc allocations in the fault path.

Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1]
Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2]
Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3]
Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4]
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

503638e0

dt-bindings: riscv: Add Svvptc ISA extension description · d25599b5

Alexandre Ghiti authored Jul 17, 2024

Add description for the Svvptc ISA extension which was ratified recently.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717060125.139416-3-alexghiti@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

d25599b5

riscv: Add ISA extension parsing for Svvptc · a6efe33c

Alexandre Ghiti authored Jul 17, 2024

Add support to parse the Svvptc string in the riscv,isa string.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717060125.139416-2-alexghiti@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

a6efe33c

riscv: select ARCH_USE_SYM_ANNOTATIONS · 7c9d980e

Jisheng Zhang authored Jul 10, 2024

Now, riscv has been converted to the new style SYM_ assembler
annotations. So select ARCH_USE_SYM_ANNOTATIONS to ensure the
deprecated macros such as ENTRY(), END(), WEAK() and so on are not
available and we don't regress.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240709160536.3690-3-jszhang@kernel.orgSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

7c9d980e

riscv: errata: sifive: Use SYM_*() assembly macros · 6868d12e

Jisheng Zhang authored Jul 10, 2024

ENTRY()/END() macros are deprecated and we should make use of the
new SYM_*() macros [1] for better annotation of symbols. Replace the
deprecated ones with the new ones.

[1] https://docs.kernel.org/core-api/asm-annotations.htmlSigned-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240709160536.3690-2-jszhang@kernel.orgSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

6868d12e

riscv: stacktrace: Add USER_STACKTRACE support · 1a748331

Jinjie Ruan authored Jul 08, 2024

Currently, userstacktrace is unsupported for riscv. So use the
perf_callchain_user() code as blueprint to implement the
arch_stack_walk_user() which add userstacktrace support on riscv.
Meanwhile, we can use arch_stack_walk_user() to simplify the implementation
of perf_callchain_user().

A ftrace test case is shown as below:

	# cd /sys/kernel/debug/tracing
	# echo 1 > options/userstacktrace
	# echo 1 > options/sym-userobj
	# echo 1 > events/sched/sched_process_fork/enable
	# cat trace
	......
	            bash-178     [000] ...1.    97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231
	            bash-178     [000] ...1.    97.970075: <user stack trace>
	 => /lib/libc.so.6[+0xb5090]

Also a simple perf test is ok as below:

	# perf record -e cpu-clock --call-graph fp top
	# perf report --call-graph

	.....
	[[31m  66.54%[[m     0.00%  top      [kernel.kallsyms]            [k] ret_from_exception
            |
            ---ret_from_exception
               |
               |--[[31m58.97%[[m--do_trap_ecall_u
               |          |
               |          |--[[31m17.34%[[m--__riscv_sys_read
               |          |          ksys_read
               |          |          |
               |          |           --[[31m16.88%[[m--vfs_read
               |          |                     |
               |          |                     |--[[31m10.90%[[m--seq_read
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

1a748331

riscv: Fix fp alignment bug in perf_callchain_user() · 22ab0895

Jinjie Ruan authored Jul 08, 2024

The standard RISC-V calling convention said:
	"The stack grows downward and the stack pointer is always
	kept 16-byte aligned".

So perf_callchain_user() should check whether 16-byte aligned for fp.

Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf

Fixes: dbeb90b0 ("riscv: Add perf callchain support")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

22ab0895

14 Sep, 2024 2 commits

riscv: Remove redundant restriction on memory size · d6a19281

Stuart Menefy authored Jun 24, 2024

The original reason for reserving the top 4GiB of the direct map
(space for modules/BPF/kernel) hasn't applied since the address
map was reworked for KASAN.
Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

d6a19281

riscv: vdso: do not strip debugging info for vdso.so.dbg · 7587a360

Changbin Du authored Jun 11, 2024

The vdso.so.dbg is a debug version of vdso and could be used for debugging
purpose. For example, perf-annotate requires debugging info to show source
lines. So let's keep its debugging info.
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

7587a360

12 Sep, 2024 9 commits

Merge patch series "remove size limit on XIP kernel" · 9ea7b92b

Palmer Dabbelt authored Sep 12, 2024

Nam Cao <namcao@linutronix.de> says:

Hi,

For XIP kernel, the writable data section is always at offset specified in
XIP_OFFSET, which is hard-coded to 32MB.

Unfortunately, this means the read-only section (placed before the
writable section) is restricted in size. This causes build failure if the
kernel gets too large.

This series remove the use of XIP_OFFSET one by one, then remove this
macro entirely at the end, with the goal of lifting this size restriction.

Also some cleanup and documentation along the way.

* b4-shazam-merge
  riscv: remove limit on the size of read-only section for XIP kernel
  riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
  riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
  riscv: replace misleading va_kernel_pa_offset on XIP kernel
  riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
  riscv: cleanup XIP_FIXUP macro
  riscv: change XIP's kernel_map.size to be size of the entire kernel
  ...

Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

9ea7b92b

riscv: remove limit on the size of read-only section for XIP kernel · b635a84b

Nam Cao authored Jun 07, 2024

XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size. This causes
build failures if the kernel gets too big [1].

Remove this limit.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1]
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

b635a84b

riscv: drop the use of XIP_OFFSET in create_kernel_page_table() · a7cfb999

Nam Cao authored Jun 07, 2024

XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded value entirely, stop using
XIP_OFFSET in create_kernel_page_table(). Instead use _sdata and _start to
do the same thing.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/4ea3f222a7eb9f91c04b155ff2e4d3ef19158acc.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

a7cfb999

riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa() · 75fdf791

Nam Cao authored Jun 07, 2024

XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely,
remove the use of XIP_OFFSET in kernel_mapping_va_to_pa(). The macro
XIP_OFFSET is used in this case to check if the virtual address is mapped
to Flash or to RAM. The same check can be done with kernel_map.xiprom_sz.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/644c13d9467525a06f5d63d157875a35b2edb4bc.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

75fdf791

riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET · 23311f57

Nam Cao authored Jun 07, 2024

XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop
using XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET. Instead, use __data_loc and
_sdata to do the same thing.

While at it, also add a description for XIP_FIXUP_FLASH_OFFSET.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/7b3319657edd1822f3457e7e7c07aaa326cc2f87.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

23311f57

riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET · e4eac34f

Nam Cao authored Jun 07, 2024

XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop
using XIP_OFFSET in XIP_FIXUP_OFFSET. Instead, use CONFIG_PHYS_RAM_BASE and
_sdata to do the same thing.

While at it, also add a description for XIP_FIXUP_OFFSET.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/dba0409518b14ee83b346e099b1f7f934daf7b74.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

e4eac34f

riscv: replace misleading va_kernel_pa_offset on XIP kernel · 5cf08967

Nam Cao authored Jun 07, 2024

On XIP kernel, the name "va_kernel_pa_offset" is misleading: unlike
"normal" kernel, it is not the virtual-physical address offset of kernel
mapping, it is the offset of kernel mapping's first virtual address to
first physical address in DRAM, which is not meaningful because the
kernel's first physical address is not in DRAM.

For XIP kernel, there are 2 different offsets because the read-only part of
the kernel resides in ROM while the rest is in RAM. The offset to ROM is in
kernel_map.va_kernel_xip_pa_offset, while the offset to RAM is not stored
anywhere: it is calculated on-the-fly.

Remove this confusing "va_kernel_pa_offset" and add
"va_kernel_xip_data_pa_offset" as its replacement. This new variable is the
offset of virtual mapping of the kernel's data portion to the corresponding
physical addresses.

With the introduction of this new variable, also rename
va_kernel_xip_pa_offset -> va_kernel_xip_text_pa_offset to make it clear
that this one is about the .text section.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/84e5d005c1386d88d7b2531e0b6707ec5352ee54.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

5cf08967

riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel · f2df5b4f

Nam Cao authored Jun 07, 2024

The crash utility uses va_kernel_pa_offset to translate virtual addresses.
This is incorrect in the case of XIP kernel, because va_kernel_pa_offset is
not the virtual-physical address offset (yes, the name is misleading; this
variable will be removed for XIP in a following commit).

Stop exporting this variable for XIP kernel. The replacement is to be
determined, note it as a TODO for now.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/8f8760d3f9a11af4ea0acbc247e4f49ff5d317e9.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

f2df5b4f

riscv: cleanup XIP_FIXUP macro · aa3457f2

Nam Cao authored Jun 07, 2024

The XIP_FIXUP macro is used to fix addresses early during boot before MMU:
generated code "thinks" the data section is in ROM while it is actually in
RAM. So this macro corrects the addresses in the data section.

This macro determines if the address needs to be fixed by checking if it is
within the range starting from ROM address up to the size of (2 *
XIP_OFFSET).

This means if the kernel size is bigger than (2 * XIP_OFFSET), some
addresses would not be fixed up.

XIP kernel can still work if the above scenario does not happen. But this
macro is obviously incorrect.

Rewrite this macro to only fix up addresses within the data section.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/95f50a4ec8204ec4fcbf2a80c9addea0e0609e3b.1717789719.git.namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

aa3457f2

03 Sep, 2024 2 commits

riscv: Add license to vmalloc.h · 4ffc8a34

Charlie Jenkins authored Jul 29, 2024

Add a missing license to vmalloc.h.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-2-7d5648069640@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

4ffc8a34

riscv: Add license to fence.h · 097c72e1

Charlie Jenkins authored Jul 29, 2024

Add a missing license to fence.h.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-1-7d5648069640@rivosinc.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

097c72e1

15 Aug, 2024 2 commits

riscv: Enable generic CPU vulnerabilites support · 0e3f3649

Jinjie Ruan authored Jul 03, 2024

Currently x86, ARM and ARM64 support generic CPU vulnerabilites, but
RISC-V not, such as:

	# cd /sys/devices/system/cpu/vulnerabilities/
x86:
	# cat spec_store_bypass
		Mitigation: Speculative Store Bypass disabled via prctl and seccomp
	# cat meltdown
		Not affected

ARM64:

	# cat spec_store_bypass
		Mitigation: Speculative Store Bypass disabled via prctl and seccomp
	# cat meltdown
		Mitigation: PTI

RISC-V:

	# cat /sys/devices/system/cpu/vulnerabilities
	# ... No such file or directory

As SiFive RISC-V Core IP offerings are not affected by Meltdown and
Spectre, it can use the default weak function as below:

	# cat spec_store_bypass
		Not affected
	# cat meltdown
		Not affected

Link: https://www.sifive.cn/blog/sifive-statement-on-meltdown-and-spectreSigned-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20240703022732.2068316-1-ruanjinjie@huawei.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

0e3f3649

riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown · c6ebf2c5

Ying Sun authored Jul 11, 2024

Runs on the kernel with CONFIG_RISCV_ALTERNATIVE enabled:
  kexec -sl vmlinux

Error:
  kexec_image: Unknown rela relocation: 34
  kexec_image: Error loading purgatory ret=-8
and
  kexec_image: Unknown rela relocation: 38
  kexec_image: Error loading purgatory ret=-8

The purgatory code uses the 16-bit addition and subtraction relocation
type, but not handled, resulting in kexec_file_load failure.
So add handle to arch_kexec_apply_relocations_add().

Tested on RISC-V64 Qemu-virt, issue fixed.
Co-developed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Ying Sun <sunying@isrc.iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cnSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

c6ebf2c5

14 Aug, 2024 2 commits

riscv: change XIP's kernel_map.size to be size of the entire kernel · 57d76bc5

Nam Cao authored May 08, 2024

With XIP kernel, kernel_map.size is set to be only the size of data part of
the kernel. This is inconsistent with "normal" kernel, who sets it to be
the size of the entire kernel.

More importantly, XIP kernel fails to boot if CONFIG_DEBUG_VIRTUAL is
enabled, because there are checks on virtual addresses with the assumption
that kernel_map.size is the size of the entire kernel (these checks are in
arch/riscv/mm/physaddr.c).

Change XIP's kernel_map.size to be the size of the entire kernel.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org> # v6.1+
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240508191917.2892064-1-namcao@linutronix.deSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

57d76bc5

riscv: entry: always initialize regs->a0 to -ENOSYS · 61119394

Celeste Liu authored Jun 27, 2024

Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.

Fixes: 52449c17 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv@strace.io>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.comSigned-off-by: Palmer Dabbelt <palmer@rivosinc.com>

61119394