Commits · 4e8f6e44bce8da3b0e2df37b12839f4bc9c9cabe · Kirill Smelkov / linux

26 Apr, 2023 1 commit

arm64: Fix label placement in record_mmu_state() · 4e8f6e44

Neeraj Upadhyay authored Apr 25, 2023

Fix label so that pre_disable_mmu_workaround() is called
before clearing sctlr_el1.M.

Fixes: 2ced0f30 ("arm64: head: Switch endianness before populating the ID map")
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230425095700.22005-1-quic_neeraju@quicinc.comSigned-off-by: Will Deacon <will@kernel.org>

4e8f6e44

20 Apr, 2023 13 commits

Merge branch 'for-next/sysreg' into for-next/core · eeb3557c

Will Deacon authored Apr 20, 2023

* for-next/sysreg:
  arm64/sysreg: Convert HFGITR_EL2 to automatic generation
  arm64/idreg: Don't disable SME when disabling SVE
  arm64/sysreg: Update ID_AA64PFR1_EL1 for DDI0601 2022-12
  arm64/sysreg: Convert HFG[RW]TR_EL2 to automatic generation
  arm64/sysreg: allow *Enum blocks in SysregFields blocks

eeb3557c

Merge branch 'for-next/stacktrace' into for-next/core · 9772b7f0

Will Deacon authored Apr 20, 2023

* for-next/stacktrace:
  arm64: move PAC masks to <asm/pointer_auth.h>
  arm64: use XPACLRI to strip PAC
  arm64: avoid redundant PAC stripping in __builtin_return_address()
  arm64: stacktrace: always inline core stacktrace functions
  arm64: stacktrace: move dump functions to end of file
  arm64: stacktrace: recover return address for first entry

9772b7f0

Merge branch 'for-next/perf' into for-next/core · 9651f00e

Will Deacon authored Apr 20, 2023

* for-next/perf: (24 commits)
  KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege
  drivers/perf: hisi: add NULL check for name
  drivers/perf: hisi: Remove redundant initialized of pmu->name
  perf/arm-cmn: Fix port detection for CMN-700
  arm64: pmuv3: dynamically map PERF_COUNT_HW_BRANCH_INSTRUCTIONS
  perf/arm-cmn: Validate cycles events fully
  Revert "ARM: mach-virt: Select PMUv3 driver by default"
  drivers/perf: apple_m1: Add Apple M2 support
  dt-bindings: arm-pmu: Add PMU compatible strings for Apple M2 cores
  perf: arm_cspmu: Fix variable dereference warning
  perf/amlogic: Fix config1/config2 parsing issue
  drivers/perf: Use devm_platform_get_and_ioremap_resource()
  kbuild, drivers/perf: remove MODULE_LICENSE in non-modules
  perf: qcom: Use devm_platform_get_and_ioremap_resource()
  perf: arm: Use devm_platform_get_and_ioremap_resource()
  perf/arm-cmn: Move overlapping wp_combine field
  ARM: mach-virt: Select PMUv3 driver by default
  ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM
  ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations
  perf: pmuv3: Change GENMASK to GENMASK_ULL
  ...

9651f00e

KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege · 87727ba2

Will Deacon authored Apr 20, 2023

Although pKVM supports CPU PMU emulation for non-protected guests since
722625c6 ("KVM: arm64: Reenable pmu in Protected Mode"), this relies
on the PMU driver probing before the host has de-privileged so that the
'kvm_arm_pmu_available' static key can still be enabled by patching the
hypervisor text.

As it happens, both of these events hang off device_initcall() but the
PMU consistently won the race until 7755cec6 ("arm64: perf: Move
PMUv3 driver to drivers/perf"). Since then, the host will fail to boot
when pKVM is enabled:

  | hw perfevents: enabled with armv8_pmuv3_0 PMU driver, 7 counters available
  | kvm [1]: nVHE hyp BUG at: [<ffff8000090366e0>] __kvm_nvhe_handle_host_mem_abort+0x270/0x284!
  | kvm [1]: Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE
  | kvm [1]: Hyp Offset: 0xfffea41fbdf70000
  | Kernel panic - not syncing: HYP panic:
  | PS:a00003c9 PC:0000dbe04b0c66e0 ESR:00000000f2000800
  | FAR:fffffbfffddfcf00 HPFAR:00000000010b0bf0 PAR:0000000000000000
  | VCPU:0000000000000000
  | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc7-00083-g0bce6746d154 #1
  | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
  | Call trace:
  |  dump_backtrace+0xec/0x108
  |  show_stack+0x18/0x2c
  |  dump_stack_lvl+0x50/0x68
  |  dump_stack+0x18/0x24
  |  panic+0x13c/0x33c
  |  nvhe_hyp_panic_handler+0x10c/0x190
  |  aarch64_insn_patch_text_nosync+0x64/0xc8
  |  arch_jump_label_transform+0x4c/0x5c
  |  __jump_label_update+0x84/0xfc
  |  jump_label_update+0x100/0x134
  |  static_key_enable_cpuslocked+0x68/0xac
  |  static_key_enable+0x20/0x34
  |  kvm_host_pmu_init+0x88/0xa4
  |  armpmu_register+0xf0/0xf4
  |  arm_pmu_acpi_probe+0x2ec/0x368
  |  armv8_pmu_driver_init+0x38/0x44
  |  do_one_initcall+0xcc/0x240

Fix the race properly by deferring the de-privilege step to
device_initcall_sync(). This will also be needed in future when probing
IOMMU devices and allows us to separate the pKVM de-privilege logic from
the core hypervisor initialisation path.

Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: 7755cec6 ("arm64: perf: Move PMUv3 driver to drivers/perf")
Tested-by: Fuad Tabba <tabba@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230420123356.2708-1-will@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

87727ba2

Merge branch 'for-next/mm' into for-next/core · 1bb31cc7

Will Deacon authored Apr 20, 2023

* for-next/mm:
  arm64: mm: always map fixmap at page granularity
  arm64: mm: move fixmap code to its own file
  arm64: add FIXADDR_TOT_{START,SIZE}
  Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""
  arm: uaccess: Remove memcpy_page_flushcache()
  mm,kfence: decouple kfence from page granularity mapping judgement

1bb31cc7

Merge branch 'for-next/misc' into for-next/core · 81444b77

Will Deacon authored Apr 20, 2023

* for-next/misc:
  arm64: kexec: include reboot.h
  arm64: delete dead code in this_cpu_set_vectors()
  arm64: kernel: Fix kernel warning when nokaslr is passed to commandline
  arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
  arm64/sme: Fix some comments of ARM SME
  arm64/signal: Alloc tpidr2 sigframe after checking system_supports_tpidr2()
  arm64/signal: Use system_supports_tpidr2() to check TPIDR2
  arm64: compat: Remove defines now in asm-generic
  arm64: kexec: remove unnecessary (void*) conversions
  arm64: armv8_deprecated: remove unnecessary (void*) conversions
  firmware: arm_sdei: Fix sleep from invalid context BUG

81444b77

Merge branch 'for-next/kdump' into for-next/core · f8863bc8

Will Deacon authored Apr 20, 2023

* for-next/kdump:
  arm64: kdump: defer the crashkernel reservation for platforms with no DMA memory zones
  arm64: kdump: do not map crashkernel region specifically
  arm64: kdump : take off the protection on crashkernel memory region

f8863bc8

Merge branch 'for-next/ftrace' into for-next/core · ea88dc92

Will Deacon authored Apr 20, 2023

* for-next/ftrace:
  arm64: ftrace: Simplify get_ftrace_plt
  arm64: ftrace: Add direct call support
  ftrace: selftest: remove broken trace_direct_tramp
  ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  ftrace: Store direct called addresses in their ops
  ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  ftrace: Remove the legacy _ftrace_direct API
  ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  ftrace: Let unregister_ftrace_direct_multi() call ftrace_free_filter()

ea88dc92

Merge branch 'for-next/cpufeature' into for-next/core · 31eb87cf

Will Deacon authored Apr 20, 2023

* for-next/cpufeature:
  arm64/cpufeature: Use helper macro to specify ID register for capabilites
  arm64/cpufeature: Consistently use symbolic constants for min_field_value
  arm64/cpufeature: Pull out helper for CPUID register definitions

31eb87cf

Merge branch 'for-next/asm' into for-next/core · 0f6563a3

Will Deacon authored Apr 20, 2023

* for-next/asm:
  arm64: uaccess: remove unnecessary earlyclobber
  arm64: uaccess: permit put_{user,kernel} to use zero register
  arm64: uaccess: permit __smp_store_release() to use zero register
  arm64: atomics: lse: improve cmpxchg implementation

0f6563a3

Merge branch 'for-next/acpi' into for-next/core · 67eacd61
Will Deacon authored Apr 20, 2023
```
* for-next/acpi:
  ACPI: AGDI: Improve error reporting for problems during .remove()
```
67eacd61

arm64: kexec: include reboot.h · b7b4ce84

Simon Horman authored Apr 18, 2023

Include reboot.h in machine_kexec.c for declaration of
machine_crash_shutdown.

gcc-12 with W=1 reports:

 arch/arm64/kernel/machine_kexec.c:257:6: warning: no previous prototype for 'machine_crash_shutdown' [-Wmissing-prototypes]
   257 | void machine_crash_shutdown(struct pt_regs *regs)

No functional changes intended.
Compile tested only.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230418-arm64-kexec-include-reboot-v1-1-8453fd4fb3fb@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

b7b4ce84

arm64: delete dead code in this_cpu_set_vectors() · 460e70e2

Dan Carpenter authored Apr 19, 2023

The "slot" variable is an enum, and in this context it is an unsigned
int. So the type means it can never be negative and also we never pass
invalid data to this function. If something did pass invalid data then
this check would be insufficient protection.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/73859c9e-dea0-4764-bf01-7ae694fa2e37@kili.mountainSigned-off-by: Will Deacon <will@kernel.org>

460e70e2

17 Apr, 2023 7 commits

arm64/cpufeature: Use helper macro to specify ID register for capabilites · 863da0bd

Mark Brown authored Apr 12, 2023

When defining which value to look for in a system register field we
currently manually specify the register, field shift, width and sign and
the value to look for. This opens the potential for error with for example
the wrong field width or sign being specified, an enumeration value for
a different similarly named field or letting something be initialised to 0.

Since we now generate defines for all the ID registers we now have named
constants for all of these things generated from the system register
description, meaning that we can generate initialisation for all the fields
used in matching from a minimal specification of register, field and match
value. This is both shorter and eliminates or makes build failures several
potential errors.

No change in the generated binary.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-3-4c8f28a6f203@kernel.org
[will: Drop explicit '.sign' assignment for BTI feature]
Signed-off-by: Will Deacon <will@kernel.org>

863da0bd

drivers/perf: hisi: add NULL check for name · 257aedb7

Junhao He authored Apr 03, 2023

When allocations fails that can be NULL now.

If the name provided is NULL, then the initialization process of the PMU
type and dev will be skipped in function perf_pmu_register().
Consequently, the PMU will not be able to register into the kernel.
Moreover, in the case of unregister the PMU, the function device_del()
will need to handle NULL pointers, which potentially can cause issues.

So move this allocation above the cpuhp_state_add_instance() and directly
return if it does fail.
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230403081423.62460-3-hejunhao3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>

257aedb7

drivers/perf: hisi: Remove redundant initialized of pmu->name · 25d8c250

Junhao He authored Apr 03, 2023

"pmu->name" is initialized by perf_pmu_register() function, so remove
the redundant initialized in hisi_pmu_init().
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230403081423.62460-2-hejunhao3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>

25d8c250

arm64/cpufeature: Consistently use symbolic constants for min_field_value · 21642da2

Mark Brown authored Apr 12, 2023

A number of the cpufeatures use raw numbers for the minimum field values
specified rather than symbolic constants. In preparation for the use of
helper macros replace all these with the appropriate constants.

No change in the generated binary.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-2-4c8f28a6f203@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

21642da2

arm64/cpufeature: Pull out helper for CPUID register definitions · 876e3c8e

Mark Brown authored Apr 12, 2023

We use the same structure to match hwcaps and CPU features so we can use
the same helper to generate the fields required. Pull the portion of the
current hwcaps helper that initialises the fields out into a separate
define placed earlier in the file so we can use it for cpufeatures.

No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-1-4c8f28a6f203@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

876e3c8e

arm64/sysreg: Convert HFGITR_EL2 to automatic generation · bbd329fe

Mark Brown authored Apr 12, 2023

Automatically generate the Hypervisor Fine-Grained Instruction Trap
Register as per DDI0601 2023-03, currently we only have a definition for
the register name not any of the contents. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230306-arm64-fgt-reg-gen-v5-1-516a89cb50f6@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

bbd329fe

ACPI: AGDI: Improve error reporting for problems during .remove() · 858a5663

Uwe Kleine-König authored Oct 14, 2022

Returning an error value in a platform driver's remove callback results in
a generic error message being emitted by the driver core, but otherwise it
doesn't make a difference. The device goes away anyhow.

So instead of triggering the generic platform error message, emit a more
helpful message if a problem occurs and return 0 to suppress the generic
message.

This patch is a preparation for making platform remove callbacks return
void.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Link: https://lore.kernel.org/r/20221014160623.467195-1-u.kleine-koenig@pengutronix.deSigned-off-by: Will Deacon <will@kernel.org>

858a5663

14 Apr, 2023 3 commits

arm64: kernel: Fix kernel warning when nokaslr is passed to commandline · a2a83eb4

Pavankumar Kondeti authored Apr 12, 2023

'Unknown kernel command line parameters "nokaslr", will be passed to
user space' message is noticed in the dmesg when nokaslr is passed to
the kernel commandline on ARM64 platform. This is because nokaslr param
is handled by early cpufeature detection infrastructure and the parameter
is never consumed by a kernel param handler. Fix this warning by
providing a dummy kernel param handler for nokaslr.
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Link: https://lore.kernel.org/r/20230412043258.397455-1-quic_pkondeti@quicinc.comSigned-off-by: Will Deacon <will@kernel.org>

a2a83eb4

perf/arm-cmn: Fix port detection for CMN-700 · 2ad91e44

Robin Murphy authored Apr 12, 2023

When the "extra device ports" configuration was first added, the
additional mxp_device_port_connect_info registers were added around the
existing mxp_mesh_port_connect_info registers. What I missed about
CMN-700 is that it shuffled them around to remove this discontinuity.
As such, tweak the definitions and factor out a helper for reading these
registers so we can deal with this discrepancy easily, which does at
least allow nicely tidying up the callsites. With this we can then also
do the nice thing and skip accesses completely rather than relying on
RES0 behaviour where we know the extra registers aren't defined.

Fixes: 23760a01 ("perf/arm-cmn: Add CMN-700 support")
Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/71d129241d4d7923cde72a0e5b4c8d2f6084525f.1681295193.git.robin.murphy@arm.comSigned-off-by: Will Deacon <will@kernel.org>

2ad91e44

arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step · af6c0bd5

Sumit Garg authored Feb 02, 2023

Currently only the first attempt to single-step has any effect. After
that all further stepping remains "stuck" at the same program counter
value.

Refer to the ARM Architecture Reference Manual (ARM DDI 0487E.a) D2.12,
PSTATE.SS=1 should be set at each step before transferring the PE to the
'Active-not-pending' state. The problem here is PSTATE.SS=1 is not set
since the second single-step.

After the first single-step, the PE transferes to the 'Inactive' state,
with PSTATE.SS=0 and MDSCR.SS=1, thus PSTATE.SS won't be set to 1 due to
kernel_active_single_step()=true. Then the PE transferes to the
'Active-pending' state when ERET and returns to the debugger by step
exception.

Before this patch:
==================
Entering kdb (current=0xffff3376039f0000, pid 1) on processor 0 due to Keyboard Entry
[0]kdb>

[0]kdb>
[0]kdb> bp write_sysrq_trigger
Instruction(i) BP #0 at 0xffffa45c13d09290 (write_sysrq_trigger)
    is enabled   addr at ffffa45c13d09290, hardtype=0 installed=0

[0]kdb> go
$ echo h > /proc/sysrq-trigger

Entering kdb (current=0xffff4f7e453f8000, pid 175) on processor 1 due to Breakpoint @ 0xffffad651a309290
[1]kdb> ss

Entering kdb (current=0xffff4f7e453f8000, pid 175) on processor 1 due to SS trap @ 0xffffad651a309294
[1]kdb> ss

Entering kdb (current=0xffff4f7e453f8000, pid 175) on processor 1 due to SS trap @ 0xffffad651a309294
[1]kdb>

After this patch:
=================
Entering kdb (current=0xffff6851c39f0000, pid 1) on processor 0 due to Keyboard Entry
[0]kdb> bp write_sysrq_trigger
Instruction(i) BP #0 at 0xffffc02d2dd09290 (write_sysrq_trigger)
    is enabled   addr at ffffc02d2dd09290, hardtype=0 installed=0

[0]kdb> go
$ echo h > /proc/sysrq-trigger

Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to Breakpoint @ 0xffffc02d2dd09290
[1]kdb> ss

Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to SS trap @ 0xffffc02d2dd09294
[1]kdb> ss

Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to SS trap @ 0xffffc02d2dd09298
[1]kdb> ss

Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to SS trap @ 0xffffc02d2dd0929c
[1]kdb>

Fixes: 44679a4f ("arm64: KGDB: Add step debugging support")
Co-developed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20230202073148.657746-3-sumit.garg@linaro.orgSigned-off-by: Will Deacon <will@kernel.org>

af6c0bd5

13 Apr, 2023 3 commits

arm64: move PAC masks to <asm/pointer_auth.h> · de1702f6

Mark Rutland authored Apr 12, 2023

Now that we use XPACLRI to strip PACs within the kernel, the
ptrauth_user_pac_mask() and ptrauth_kernel_pac_mask() definitions no
longer need to live in <asm/compiler.h>.

Move them to <asm/pointer_auth.h>, and ensure that this header is
included where they are used.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230412160134.306148-4-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

de1702f6

arm64: use XPACLRI to strip PAC · ca708599

Mark Rutland authored Apr 12, 2023

Currently we strip the PAC from pointers using C code, which requires
generating bitmasks, and conditionally clearing/setting bits depending
on bit 55. We can do better by using XPACLRI directly.

When the logic was originally written to strip PACs from user pointers,
contemporary toolchains used for the kernel had assemblers which were
unaware of the PAC instructions. As stripping the PAC from userspace
pointers required unconditional clearing of a fixed set of bits (which
could be performed with a single instruction), it was simpler to
implement the masking in C than it was to make use of XPACI or XPACLRI.

When support for in-kernel pointer authentication was added, the
stripping logic was extended to cover TTBR1 pointers, requiring several
instructions to handle whether to clear/set bits dependent on bit 55 of
the pointer.

This patch simplifies the stripping of PACs by using XPACLRI directly,
as contemporary toolchains do within __builtin_return_address(). This
saves a number of instructions, especially where
__builtin_return_address() does not implicitly strip the PAC but is
heavily used (e.g. with tracepoints). As the kernel might be compiled
with an assembler without knowledge of XPACLRI, it is assembled using
the 'HINT #7' alias, which results in an identical opcode.

At the same time, I've split ptrauth_strip_insn_pac() into
ptrauth_strip_user_insn_pac() and ptrauth_strip_kernel_insn_pac()
helpers so that we can avoid unnecessary PAC stripping when pointer
authentication is not in use in userspace or kernel respectively.

The underlying xpaclri() macro uses inline assembly which clobbers x30.
The clobber causes the compiler to save/restore the original x30 value
in a frame record (protected with PACIASP and AUTIASP when in-kernel
authentication is enabled), so this does not provide a gadget to alter
the return address. Similarly this does not adversely affect unwinding
due to the presence of the frame record.

The ptrauth_user_pac_mask() and ptrauth_kernel_pac_mask() are exported
from the kernel in ptrace and core dumps, so these are retained. A
subsequent patch will move them out of <asm/compiler.h>.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230412160134.306148-3-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

ca708599

arm64: avoid redundant PAC stripping in __builtin_return_address() · 9df3f508

Mark Rutland authored Apr 12, 2023

In old versions of GCC and Clang, __builtin_return_address() did not
strip the PAC. This was not the behaviour we desired, and so we wrapped
this with code to strip the PAC in commit:

  689eae42 ("arm64: mask PAC bits of __builtin_return_address")

Since then, both GCC and Clang decided that __builtin_return_address()
*should* strip the PAC, and the existing behaviour was a bug.

GCC was fixed in 11.1.0, with those fixes backported to 10.2.0, 9.4.0,
8.5.0, but not earlier:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94891

Clang was fixed in 12.0.0, though this was not backported:

  https://reviews.llvm.org/D75044

When using a compiler whose __builtin_return_address() strips the PAC,
our wrapper to strip the PAC is redundant. Similarly, when pointer
authentication is not in use within the kernel pointers will not have a
PAC, and so there's no point stripping those pointers.

To avoid this redundant work, this patch updates the
__builtin_return_address() wrapper to only be used when in-kernel
pointer authentication is configured and the compiler's
__builtin_return_address() does not strip the PAC.

This is a cleanup/optimization, and not a fix that requires backporting.
Stripping a PAC should be an idempotent operation, and so redundantly
stripping the PAC is not harmful.

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230412160134.306148-2-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

9df3f508

12 Apr, 2023 3 commits

arm64/sme: Fix some comments of ARM SME · 97b5576b

Dongxu Sun authored Mar 17, 2023

When TIF_SME is clear, fpsimd_restore_current_state will disable
SME trap during ret_to_user, then SME access trap is impossible
in userspace, not SVE.

Besides, fix typo: alocated->allocated.
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230317124915.1263-5-sundongxu3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>

97b5576b

arm64/signal: Alloc tpidr2 sigframe after checking system_supports_tpidr2() · 19e99e7d

Dongxu Sun authored Mar 17, 2023

Move tpidr2 sigframe allocation from under the checking of
system_supports_sme() to the checking of system_supports_tpidr2().
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230317124915.1263-3-sundongxu3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>

19e99e7d

arm64/signal: Use system_supports_tpidr2() to check TPIDR2 · e9d14f3f

Dongxu Sun authored Mar 17, 2023

Since commit a9d69158("arm64/sme: Implement support
for TPIDR2"), We introduced system_supports_tpidr2() for
TPIDR2 handling. Let's use the specific check instead.

No functional changes.
Signed-off-by: Dongxu Sun <sundongxu3@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230317124915.1263-2-sundongxu3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>

e9d14f3f

11 Apr, 2023 10 commits

arm64/idreg: Don't disable SME when disabling SVE · b2ad9d4e

Mark Brown authored Mar 23, 2023

SVE and SME are separate features which can be implemented without each
other but currently if the user specifies arm64.nosve then we disable SME
as well as SVE. There is already a separate override for SME so remove the
implicit disablement from the SVE override.

One usecase for this would be testing SME only support on a system which
implements both SVE and SME.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230315-arm64-override-sve-sme-v2-1-bab7593e842b@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

b2ad9d4e

arm64: kdump: defer the crashkernel reservation for platforms with no DMA memory zones · 504cae45

Baoquan He authored Apr 07, 2023

In commit 03149563 ("arm64: Do not defer reserve_crashkernel() for
platforms with no DMA memory zones"), reserve_crashkernel() is called
much earlier in arm64_memblock_init() to avoid causing base apge
mapping on platforms with no DMA meomry zones.

With taking off protection on crashkernel memory region, no need to call
reserve_crashkernel() specially in advance. The deferred invocation of
reserve_crashkernel() in bootmem_init() can cover all cases. So revert
the whole commit now.
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230407011507.17572-4-bhe@redhat.comSigned-off-by: Will Deacon <will@kernel.org>

504cae45

arm64: kdump: do not map crashkernel region specifically · 04a2a7af

Baoquan He authored Apr 07, 2023

After taking off the protection functions on crashkernel memory region,
there's no need to map crashkernel region with page granularity during
linear mapping.

With this change, the system can make use of block or section mapping
on linear region to largely improve perforcemence during system bootup
and running.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230407011507.17572-3-bhe@redhat.comSigned-off-by: Will Deacon <will@kernel.org>

04a2a7af

arm64: kdump : take off the protection on crashkernel memory region · 0d124e96

Baoquan He authored Apr 07, 2023

Problem:
=======
On arm64, block and section mapping is supported to build page tables.
However, currently it enforces to take base page mapping for the whole
linear mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled and
crashkernel kernel parameter is set. This will cause longer time of the
linear mapping process during bootup and severe performance degradation
during running time.

Root cause:
==========
On arm64, crashkernel reservation relies on knowing the upper limit of
low memory zone because it needs to reserve memory in the zone so that
devices' DMA addressing in kdump kernel can be satisfied. However, the
upper limit of low memory on arm64 is variant. And the upper limit can
only be decided late till bootmem_init() is called [1].

And we need to map the crashkernel region with base page granularity when
doing linear mapping, because kdump needs to protect the crashkernel region
via set_memory_valid(,0) after kdump kernel loading. However, arm64 doesn't
support well on splitting the built block or section mapping due to some
cpu reststriction [2]. And unfortunately, the linear mapping is done before
bootmem_init().

To resolve the above conflict on arm64, the compromise is enforcing to
take base page mapping for the entire linear mapping if crashkernel is
set, and CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabed. Hence
performance is sacrificed.

Solution:
=========
Comparing with the base page mapping for the whole linear region, it's
better to take off the protection on crashkernel memory region for the
time being because the anticipated stamping on crashkernel memory region
could only happen in a chance in one million, while the base page mapping
for the whole linear region is mitigating arm64 systems with crashkernel
set always.

[1]
https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@arm.com/T/#u

[2]
https://lore.kernel.org/linux-arm-kernel/20190911182546.17094-1-nsaenzjulienne@suse.de/T/Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20230407011507.17572-2-bhe@redhat.comSigned-off-by: Will Deacon <will@kernel.org>

0d124e96

arm64: compat: Remove defines now in asm-generic · 73e68984

Teo Couprie Diaz authored Mar 14, 2023

Some generic COMPAT definitions have been consolidated in
asm-generic/compat.h by commit 84a0c977
("asm-generic: compat: Cleanup duplicate definitions")

Remove those that are already defined to the same value there from
arm64 asm/compat.h.
Signed-off-by: Teo Couprie Diaz <teo.coupriediaz@arm.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230314140038.252908-1-teo.coupriediaz@arm.comSigned-off-by: Will Deacon <will@kernel.org>

73e68984

arm64: mm: always map fixmap at page granularity · 414c109b

Mark Rutland authored Apr 06, 2023

Today the fixmap code largely maps elements at PAGE_SIZE granularity,
but we special-case the FDT mapping such that it can be mapped with 2M
block mappings when 4K pages are in use. The original rationale for this
was simplicity, but it has some unfortunate side-effects, and
complicates portions of the fixmap code (i.e. is not so simple after
all).

The FDT can be up to 2M in size but is only required to have 8-byte
alignment, and so it may straddle a 2M boundary. Thus when using 2M
block mappings we may map up to 4M of memory surrounding the FDT. This
is unfortunate as most of that memory will be unrelated to the FDT, and
any pages which happen to share a 2M block with the FDT will by mapped
with Normal Write-Back Cacheable attributes, which might not be what we
want elsewhere (e.g. for carve-outs using Non-Cacheable attributes).

The logic to handle mapping the FDT with 2M blocks requires some special
cases in the fixmap code, and ties it to the early page table
configuration by virtue of the SWAPPER_TABLE_SHIFT and
SWAPPER_BLOCK_SIZE constants used to determine the granularity used to
map the FDT.

This patch simplifies the FDT logic and removes the unnecessary mappings
of surrounding pages by always mapping the FDT at page granularity as
with all other fixmap mappings. To do so we statically reserve multiple
PTE tables to cover the fixmap VA range. Since the FDT can be at most
2M, for 4K pages we only need to allocate a single additional PTE table,
and for 16K and 64K pages the existing single PTE table is sufficient.

The PTE table allocation scales with the number of slots reserved in the
fixmap, and so this also makes it easier to add more fixmap entries if
we require those in future.

Our VA layout means that the fixmap will always fall within a single PMD
table (and consequently, within a single PUD/P4D/PGD entry), which we
can verify at compile time with a static_assert(). With that assert a
number of runtime warnings become impossible, and are removed.

I've boot-tested this patch with both 4K and 64K pages.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20230406152759.4164229-4-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

414c109b

arm64: mm: move fixmap code to its own file · b9754776

Mark Rutland authored Apr 06, 2023

Over time, arm64's mm/mmu.c has become increasingly large and painful to
navigate. Move the fixmap code to its own file where it can be understood in
isolation.

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20230406152759.4164229-3-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

b9754776

arm64: add FIXADDR_TOT_{START,SIZE} · 32f5b699

Mark Rutland authored Apr 06, 2023

Currently arm64's FIXADDR_{START,SIZE} definitions only cover the
runtime fixmap slots (and not the boot-time fixmap slots), but the code
for creating the fixmap assumes that these definitions cover the entire
fixmap range. This means that the ptdump boundaries are reported in a
misleading way, missing the VA region of the runtime slots. In theory
this could also cause the fixmap creation to go wrong if the boot-time
fixmap slots end up spilling into a separate PMD entry, though luckily
this is not currently the case in any configuration.

While it seems like we could extend FIXADDR_{START,SIZE} to cover the
entire fixmap area, core code relies upon these *only* covering the
runtime slots. For example, fix_to_virt() and virt_to_fix() try to
reject manipulation of the boot-time slots based upon
FIXADDR_{START,SIZE}, while __fix_to_virt() and __virt_to_fix() can
handle any fixmap slot.

This patch follows the lead of x86 in commit:

  55f49fcb ("x86/mm: Fix overlap of i386 CPU_ENTRY_AREA with FIX_BTMAP")

... and add new FIXADDR_TOT_{START,SIZE} definitions which cover the
entire fixmap area, using these for the fixmap creation and ptdump code.

As the boot-time fixmap slots are now rejected by fix_to_virt(),
the early_fixmap_init() code is changed to consistently use
__fix_to_virt(), as it already does in a few cases.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20230406152759.4164229-2-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

32f5b699

arm64: stacktrace: always inline core stacktrace functions · b5ecc19e

Mark Rutland authored Apr 11, 2023

The arm64 stacktrace code can be used in kprobe context, and so cannot
be safely probed. Some (but not all) of the unwind functions are
annotated with `NOKPROBE_SYMBOL()` to ensure this, with others markes as
`__always_inline`, relying on the top-level unwind function being marked
as `noinstr`.

This patch has stacktrace.c consistently mark the internal stacktrace
functions as `__always_inline`, removing the need for NOKPROBE_SYMBOL()
as the top-level unwind function (arch_stack_walk()) is marked as
`noinstr`. This is more consistent and is a simpler pattern to follow
for future additions to stacktrace.c.

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230411162943.203199-4-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

b5ecc19e

arm64: stacktrace: move dump functions to end of file · ead6122c

Mark Rutland authored Apr 11, 2023

For historical reasons, the backtrace dumping functions are placed in
the middle of stacktrace.c, despite using functions defined later. For
clarity, and to make subsequent refactoring easier, move the dumping
functions to the end of stacktrace.c

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230411162943.203199-3-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>

ead6122c