- 26 Apr, 2023 1 commit
-
-
Neeraj Upadhyay authored
Fix label so that pre_disable_mmu_workaround() is called before clearing sctlr_el1.M. Fixes: 2ced0f30 ("arm64: head: Switch endianness before populating the ID map") Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230425095700.22005-1-quic_neeraju@quicinc.comSigned-off-by: Will Deacon <will@kernel.org>
-
- 20 Apr, 2023 13 commits
-
-
Will Deacon authored
* for-next/sysreg: arm64/sysreg: Convert HFGITR_EL2 to automatic generation arm64/idreg: Don't disable SME when disabling SVE arm64/sysreg: Update ID_AA64PFR1_EL1 for DDI0601 2022-12 arm64/sysreg: Convert HFG[RW]TR_EL2 to automatic generation arm64/sysreg: allow *Enum blocks in SysregFields blocks
-
Will Deacon authored
* for-next/stacktrace: arm64: move PAC masks to <asm/pointer_auth.h> arm64: use XPACLRI to strip PAC arm64: avoid redundant PAC stripping in __builtin_return_address() arm64: stacktrace: always inline core stacktrace functions arm64: stacktrace: move dump functions to end of file arm64: stacktrace: recover return address for first entry
-
Will Deacon authored
* for-next/perf: (24 commits) KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege drivers/perf: hisi: add NULL check for name drivers/perf: hisi: Remove redundant initialized of pmu->name perf/arm-cmn: Fix port detection for CMN-700 arm64: pmuv3: dynamically map PERF_COUNT_HW_BRANCH_INSTRUCTIONS perf/arm-cmn: Validate cycles events fully Revert "ARM: mach-virt: Select PMUv3 driver by default" drivers/perf: apple_m1: Add Apple M2 support dt-bindings: arm-pmu: Add PMU compatible strings for Apple M2 cores perf: arm_cspmu: Fix variable dereference warning perf/amlogic: Fix config1/config2 parsing issue drivers/perf: Use devm_platform_get_and_ioremap_resource() kbuild, drivers/perf: remove MODULE_LICENSE in non-modules perf: qcom: Use devm_platform_get_and_ioremap_resource() perf: arm: Use devm_platform_get_and_ioremap_resource() perf/arm-cmn: Move overlapping wp_combine field ARM: mach-virt: Select PMUv3 driver by default ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations perf: pmuv3: Change GENMASK to GENMASK_ULL ...
-
Will Deacon authored
Although pKVM supports CPU PMU emulation for non-protected guests since 722625c6 ("KVM: arm64: Reenable pmu in Protected Mode"), this relies on the PMU driver probing before the host has de-privileged so that the 'kvm_arm_pmu_available' static key can still be enabled by patching the hypervisor text. As it happens, both of these events hang off device_initcall() but the PMU consistently won the race until 7755cec6 ("arm64: perf: Move PMUv3 driver to drivers/perf"). Since then, the host will fail to boot when pKVM is enabled: | hw perfevents: enabled with armv8_pmuv3_0 PMU driver, 7 counters available | kvm [1]: nVHE hyp BUG at: [<ffff8000090366e0>] __kvm_nvhe_handle_host_mem_abort+0x270/0x284! | kvm [1]: Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE | kvm [1]: Hyp Offset: 0xfffea41fbdf70000 | Kernel panic - not syncing: HYP panic: | PS:a00003c9 PC:0000dbe04b0c66e0 ESR:00000000f2000800 | FAR:fffffbfffddfcf00 HPFAR:00000000010b0bf0 PAR:0000000000000000 | VCPU:0000000000000000 | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc7-00083-g0bce6746d154 #1 | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 | Call trace: | dump_backtrace+0xec/0x108 | show_stack+0x18/0x2c | dump_stack_lvl+0x50/0x68 | dump_stack+0x18/0x24 | panic+0x13c/0x33c | nvhe_hyp_panic_handler+0x10c/0x190 | aarch64_insn_patch_text_nosync+0x64/0xc8 | arch_jump_label_transform+0x4c/0x5c | __jump_label_update+0x84/0xfc | jump_label_update+0x100/0x134 | static_key_enable_cpuslocked+0x68/0xac | static_key_enable+0x20/0x34 | kvm_host_pmu_init+0x88/0xa4 | armpmu_register+0xf0/0xf4 | arm_pmu_acpi_probe+0x2ec/0x368 | armv8_pmu_driver_init+0x38/0x44 | do_one_initcall+0xcc/0x240 Fix the race properly by deferring the de-privilege step to device_initcall_sync(). This will also be needed in future when probing IOMMU devices and allows us to separate the pKVM de-privilege logic from the core hypervisor initialisation path. Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Fixes: 7755cec6 ("arm64: perf: Move PMUv3 driver to drivers/perf") Tested-by: Fuad Tabba <tabba@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230420123356.2708-1-will@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
* for-next/mm: arm64: mm: always map fixmap at page granularity arm64: mm: move fixmap code to its own file arm64: add FIXADDR_TOT_{START,SIZE} Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"" arm: uaccess: Remove memcpy_page_flushcache() mm,kfence: decouple kfence from page granularity mapping judgement
-
Will Deacon authored
* for-next/misc: arm64: kexec: include reboot.h arm64: delete dead code in this_cpu_set_vectors() arm64: kernel: Fix kernel warning when nokaslr is passed to commandline arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step arm64/sme: Fix some comments of ARM SME arm64/signal: Alloc tpidr2 sigframe after checking system_supports_tpidr2() arm64/signal: Use system_supports_tpidr2() to check TPIDR2 arm64: compat: Remove defines now in asm-generic arm64: kexec: remove unnecessary (void*) conversions arm64: armv8_deprecated: remove unnecessary (void*) conversions firmware: arm_sdei: Fix sleep from invalid context BUG
-
Will Deacon authored
* for-next/kdump: arm64: kdump: defer the crashkernel reservation for platforms with no DMA memory zones arm64: kdump: do not map crashkernel region specifically arm64: kdump : take off the protection on crashkernel memory region
-
Will Deacon authored
* for-next/ftrace: arm64: ftrace: Simplify get_ftrace_plt arm64: ftrace: Add direct call support ftrace: selftest: remove broken trace_direct_tramp ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS ftrace: Store direct called addresses in their ops ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs ftrace: Remove the legacy _ftrace_direct API ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi ftrace: Let unregister_ftrace_direct_multi() call ftrace_free_filter()
-
Will Deacon authored
* for-next/cpufeature: arm64/cpufeature: Use helper macro to specify ID register for capabilites arm64/cpufeature: Consistently use symbolic constants for min_field_value arm64/cpufeature: Pull out helper for CPUID register definitions
-
Will Deacon authored
* for-next/asm: arm64: uaccess: remove unnecessary earlyclobber arm64: uaccess: permit put_{user,kernel} to use zero register arm64: uaccess: permit __smp_store_release() to use zero register arm64: atomics: lse: improve cmpxchg implementation
-
Will Deacon authored
* for-next/acpi: ACPI: AGDI: Improve error reporting for problems during .remove()
-
Simon Horman authored
Include reboot.h in machine_kexec.c for declaration of machine_crash_shutdown. gcc-12 with W=1 reports: arch/arm64/kernel/machine_kexec.c:257:6: warning: no previous prototype for 'machine_crash_shutdown' [-Wmissing-prototypes] 257 | void machine_crash_shutdown(struct pt_regs *regs) No functional changes intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230418-arm64-kexec-include-reboot-v1-1-8453fd4fb3fb@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>
-
Dan Carpenter authored
The "slot" variable is an enum, and in this context it is an unsigned int. So the type means it can never be negative and also we never pass invalid data to this function. If something did pass invalid data then this check would be insufficient protection. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/73859c9e-dea0-4764-bf01-7ae694fa2e37@kili.mountainSigned-off-by: Will Deacon <will@kernel.org>
-
- 17 Apr, 2023 7 commits
-
-
Mark Brown authored
When defining which value to look for in a system register field we currently manually specify the register, field shift, width and sign and the value to look for. This opens the potential for error with for example the wrong field width or sign being specified, an enumeration value for a different similarly named field or letting something be initialised to 0. Since we now generate defines for all the ID registers we now have named constants for all of these things generated from the system register description, meaning that we can generate initialisation for all the fields used in matching from a minimal specification of register, field and match value. This is both shorter and eliminates or makes build failures several potential errors. No change in the generated binary. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-3-4c8f28a6f203@kernel.org [will: Drop explicit '.sign' assignment for BTI feature] Signed-off-by: Will Deacon <will@kernel.org>
-
Junhao He authored
When allocations fails that can be NULL now. If the name provided is NULL, then the initialization process of the PMU type and dev will be skipped in function perf_pmu_register(). Consequently, the PMU will not be able to register into the kernel. Moreover, in the case of unregister the PMU, the function device_del() will need to handle NULL pointers, which potentially can cause issues. So move this allocation above the cpuhp_state_add_instance() and directly return if it does fail. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-3-hejunhao3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>
-
Junhao He authored
"pmu->name" is initialized by perf_pmu_register() function, so remove the redundant initialized in hisi_pmu_init(). Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-2-hejunhao3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Brown authored
A number of the cpufeatures use raw numbers for the minimum field values specified rather than symbolic constants. In preparation for the use of helper macros replace all these with the appropriate constants. No change in the generated binary. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-2-4c8f28a6f203@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Brown authored
We use the same structure to match hwcaps and CPU features so we can use the same helper to generate the fields required. Pull the portion of the current hwcaps helper that initialises the fields out into a separate define placed earlier in the file so we can use it for cpufeatures. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-1-4c8f28a6f203@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Brown authored
Automatically generate the Hypervisor Fine-Grained Instruction Trap Register as per DDI0601 2023-03, currently we only have a definition for the register name not any of the contents. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230306-arm64-fgt-reg-gen-v5-1-516a89cb50f6@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>
-
Uwe Kleine-König authored
Returning an error value in a platform driver's remove callback results in a generic error message being emitted by the driver core, but otherwise it doesn't make a difference. The device goes away anyhow. So instead of triggering the generic platform error message, emit a more helpful message if a problem occurs and return 0 to suppress the generic message. This patch is a preparation for making platform remove callbacks return void. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/20221014160623.467195-1-u.kleine-koenig@pengutronix.deSigned-off-by: Will Deacon <will@kernel.org>
-
- 14 Apr, 2023 3 commits
-
-
Pavankumar Kondeti authored
'Unknown kernel command line parameters "nokaslr", will be passed to user space' message is noticed in the dmesg when nokaslr is passed to the kernel commandline on ARM64 platform. This is because nokaslr param is handled by early cpufeature detection infrastructure and the parameter is never consumed by a kernel param handler. Fix this warning by providing a dummy kernel param handler for nokaslr. Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Link: https://lore.kernel.org/r/20230412043258.397455-1-quic_pkondeti@quicinc.comSigned-off-by: Will Deacon <will@kernel.org>
-
Robin Murphy authored
When the "extra device ports" configuration was first added, the additional mxp_device_port_connect_info registers were added around the existing mxp_mesh_port_connect_info registers. What I missed about CMN-700 is that it shuffled them around to remove this discontinuity. As such, tweak the definitions and factor out a helper for reading these registers so we can deal with this discrepancy easily, which does at least allow nicely tidying up the callsites. With this we can then also do the nice thing and skip accesses completely rather than relying on RES0 behaviour where we know the extra registers aren't defined. Fixes: 23760a01 ("perf/arm-cmn: Add CMN-700 support") Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/71d129241d4d7923cde72a0e5b4c8d2f6084525f.1681295193.git.robin.murphy@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Sumit Garg authored
Currently only the first attempt to single-step has any effect. After that all further stepping remains "stuck" at the same program counter value. Refer to the ARM Architecture Reference Manual (ARM DDI 0487E.a) D2.12, PSTATE.SS=1 should be set at each step before transferring the PE to the 'Active-not-pending' state. The problem here is PSTATE.SS=1 is not set since the second single-step. After the first single-step, the PE transferes to the 'Inactive' state, with PSTATE.SS=0 and MDSCR.SS=1, thus PSTATE.SS won't be set to 1 due to kernel_active_single_step()=true. Then the PE transferes to the 'Active-pending' state when ERET and returns to the debugger by step exception. Before this patch: ================== Entering kdb (current=0xffff3376039f0000, pid 1) on processor 0 due to Keyboard Entry [0]kdb> [0]kdb> [0]kdb> bp write_sysrq_trigger Instruction(i) BP #0 at 0xffffa45c13d09290 (write_sysrq_trigger) is enabled addr at ffffa45c13d09290, hardtype=0 installed=0 [0]kdb> go $ echo h > /proc/sysrq-trigger Entering kdb (current=0xffff4f7e453f8000, pid 175) on processor 1 due to Breakpoint @ 0xffffad651a309290 [1]kdb> ss Entering kdb (current=0xffff4f7e453f8000, pid 175) on processor 1 due to SS trap @ 0xffffad651a309294 [1]kdb> ss Entering kdb (current=0xffff4f7e453f8000, pid 175) on processor 1 due to SS trap @ 0xffffad651a309294 [1]kdb> After this patch: ================= Entering kdb (current=0xffff6851c39f0000, pid 1) on processor 0 due to Keyboard Entry [0]kdb> bp write_sysrq_trigger Instruction(i) BP #0 at 0xffffc02d2dd09290 (write_sysrq_trigger) is enabled addr at ffffc02d2dd09290, hardtype=0 installed=0 [0]kdb> go $ echo h > /proc/sysrq-trigger Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to Breakpoint @ 0xffffc02d2dd09290 [1]kdb> ss Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to SS trap @ 0xffffc02d2dd09294 [1]kdb> ss Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to SS trap @ 0xffffc02d2dd09298 [1]kdb> ss Entering kdb (current=0xffff6851c53c1840, pid 174) on processor 1 due to SS trap @ 0xffffc02d2dd0929c [1]kdb> Fixes: 44679a4f ("arm64: KGDB: Add step debugging support") Co-developed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Tested-by: Douglas Anderson <dianders@chromium.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://lore.kernel.org/r/20230202073148.657746-3-sumit.garg@linaro.orgSigned-off-by: Will Deacon <will@kernel.org>
-
- 13 Apr, 2023 3 commits
-
-
Mark Rutland authored
Now that we use XPACLRI to strip PACs within the kernel, the ptrauth_user_pac_mask() and ptrauth_kernel_pac_mask() definitions no longer need to live in <asm/compiler.h>. Move them to <asm/pointer_auth.h>, and ensure that this header is included where they are used. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kristina Martsenko <kristina.martsenko@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230412160134.306148-4-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
Currently we strip the PAC from pointers using C code, which requires generating bitmasks, and conditionally clearing/setting bits depending on bit 55. We can do better by using XPACLRI directly. When the logic was originally written to strip PACs from user pointers, contemporary toolchains used for the kernel had assemblers which were unaware of the PAC instructions. As stripping the PAC from userspace pointers required unconditional clearing of a fixed set of bits (which could be performed with a single instruction), it was simpler to implement the masking in C than it was to make use of XPACI or XPACLRI. When support for in-kernel pointer authentication was added, the stripping logic was extended to cover TTBR1 pointers, requiring several instructions to handle whether to clear/set bits dependent on bit 55 of the pointer. This patch simplifies the stripping of PACs by using XPACLRI directly, as contemporary toolchains do within __builtin_return_address(). This saves a number of instructions, especially where __builtin_return_address() does not implicitly strip the PAC but is heavily used (e.g. with tracepoints). As the kernel might be compiled with an assembler without knowledge of XPACLRI, it is assembled using the 'HINT #7' alias, which results in an identical opcode. At the same time, I've split ptrauth_strip_insn_pac() into ptrauth_strip_user_insn_pac() and ptrauth_strip_kernel_insn_pac() helpers so that we can avoid unnecessary PAC stripping when pointer authentication is not in use in userspace or kernel respectively. The underlying xpaclri() macro uses inline assembly which clobbers x30. The clobber causes the compiler to save/restore the original x30 value in a frame record (protected with PACIASP and AUTIASP when in-kernel authentication is enabled), so this does not provide a gadget to alter the return address. Similarly this does not adversely affect unwinding due to the presence of the frame record. The ptrauth_user_pac_mask() and ptrauth_kernel_pac_mask() are exported from the kernel in ptrace and core dumps, so these are retained. A subsequent patch will move them out of <asm/compiler.h>. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kristina Martsenko <kristina.martsenko@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230412160134.306148-3-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
In old versions of GCC and Clang, __builtin_return_address() did not strip the PAC. This was not the behaviour we desired, and so we wrapped this with code to strip the PAC in commit: 689eae42 ("arm64: mask PAC bits of __builtin_return_address") Since then, both GCC and Clang decided that __builtin_return_address() *should* strip the PAC, and the existing behaviour was a bug. GCC was fixed in 11.1.0, with those fixes backported to 10.2.0, 9.4.0, 8.5.0, but not earlier: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94891 Clang was fixed in 12.0.0, though this was not backported: https://reviews.llvm.org/D75044 When using a compiler whose __builtin_return_address() strips the PAC, our wrapper to strip the PAC is redundant. Similarly, when pointer authentication is not in use within the kernel pointers will not have a PAC, and so there's no point stripping those pointers. To avoid this redundant work, this patch updates the __builtin_return_address() wrapper to only be used when in-kernel pointer authentication is configured and the compiler's __builtin_return_address() does not strip the PAC. This is a cleanup/optimization, and not a fix that requires backporting. Stripping a PAC should be an idempotent operation, and so redundantly stripping the PAC is not harmful. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Kristina Martsenko <kristina.martsenko@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230412160134.306148-2-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
- 12 Apr, 2023 3 commits
-
-
Dongxu Sun authored
When TIF_SME is clear, fpsimd_restore_current_state will disable SME trap during ret_to_user, then SME access trap is impossible in userspace, not SVE. Besides, fix typo: alocated->allocated. Signed-off-by: Dongxu Sun <sundongxu3@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230317124915.1263-5-sundongxu3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>
-
Dongxu Sun authored
Move tpidr2 sigframe allocation from under the checking of system_supports_sme() to the checking of system_supports_tpidr2(). Signed-off-by: Dongxu Sun <sundongxu3@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230317124915.1263-3-sundongxu3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>
-
Dongxu Sun authored
Since commit a9d69158("arm64/sme: Implement support for TPIDR2"), We introduced system_supports_tpidr2() for TPIDR2 handling. Let's use the specific check instead. No functional changes. Signed-off-by: Dongxu Sun <sundongxu3@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230317124915.1263-2-sundongxu3@huawei.comSigned-off-by: Will Deacon <will@kernel.org>
-
- 11 Apr, 2023 10 commits
-
-
Mark Brown authored
SVE and SME are separate features which can be implemented without each other but currently if the user specifies arm64.nosve then we disable SME as well as SVE. There is already a separate override for SME so remove the implicit disablement from the SVE override. One usecase for this would be testing SME only support on a system which implements both SVE and SME. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230315-arm64-override-sve-sme-v2-1-bab7593e842b@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>
-
Baoquan He authored
In commit 03149563 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones"), reserve_crashkernel() is called much earlier in arm64_memblock_init() to avoid causing base apge mapping on platforms with no DMA meomry zones. With taking off protection on crashkernel memory region, no need to call reserve_crashkernel() specially in advance. The deferred invocation of reserve_crashkernel() in bootmem_init() can cover all cases. So revert the whole commit now. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20230407011507.17572-4-bhe@redhat.comSigned-off-by: Will Deacon <will@kernel.org>
-
Baoquan He authored
After taking off the protection functions on crashkernel memory region, there's no need to map crashkernel region with page granularity during linear mapping. With this change, the system can make use of block or section mapping on linear region to largely improve perforcemence during system bootup and running. Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20230407011507.17572-3-bhe@redhat.comSigned-off-by: Will Deacon <will@kernel.org>
-
Baoquan He authored
Problem: ======= On arm64, block and section mapping is supported to build page tables. However, currently it enforces to take base page mapping for the whole linear mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled and crashkernel kernel parameter is set. This will cause longer time of the linear mapping process during bootup and severe performance degradation during running time. Root cause: ========== On arm64, crashkernel reservation relies on knowing the upper limit of low memory zone because it needs to reserve memory in the zone so that devices' DMA addressing in kdump kernel can be satisfied. However, the upper limit of low memory on arm64 is variant. And the upper limit can only be decided late till bootmem_init() is called [1]. And we need to map the crashkernel region with base page granularity when doing linear mapping, because kdump needs to protect the crashkernel region via set_memory_valid(,0) after kdump kernel loading. However, arm64 doesn't support well on splitting the built block or section mapping due to some cpu reststriction [2]. And unfortunately, the linear mapping is done before bootmem_init(). To resolve the above conflict on arm64, the compromise is enforcing to take base page mapping for the entire linear mapping if crashkernel is set, and CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabed. Hence performance is sacrificed. Solution: ========= Comparing with the base page mapping for the whole linear region, it's better to take off the protection on crashkernel memory region for the time being because the anticipated stamping on crashkernel memory region could only happen in a chance in one million, while the base page mapping for the whole linear region is mitigating arm64 systems with crashkernel set always. [1] https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@arm.com/T/#u [2] https://lore.kernel.org/linux-arm-kernel/20190911182546.17094-1-nsaenzjulienne@suse.de/T/Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20230407011507.17572-2-bhe@redhat.comSigned-off-by: Will Deacon <will@kernel.org>
-
Teo Couprie Diaz authored
Some generic COMPAT definitions have been consolidated in asm-generic/compat.h by commit 84a0c977 ("asm-generic: compat: Cleanup duplicate definitions") Remove those that are already defined to the same value there from arm64 asm/compat.h. Signed-off-by: Teo Couprie Diaz <teo.coupriediaz@arm.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230314140038.252908-1-teo.coupriediaz@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
Today the fixmap code largely maps elements at PAGE_SIZE granularity, but we special-case the FDT mapping such that it can be mapped with 2M block mappings when 4K pages are in use. The original rationale for this was simplicity, but it has some unfortunate side-effects, and complicates portions of the fixmap code (i.e. is not so simple after all). The FDT can be up to 2M in size but is only required to have 8-byte alignment, and so it may straddle a 2M boundary. Thus when using 2M block mappings we may map up to 4M of memory surrounding the FDT. This is unfortunate as most of that memory will be unrelated to the FDT, and any pages which happen to share a 2M block with the FDT will by mapped with Normal Write-Back Cacheable attributes, which might not be what we want elsewhere (e.g. for carve-outs using Non-Cacheable attributes). The logic to handle mapping the FDT with 2M blocks requires some special cases in the fixmap code, and ties it to the early page table configuration by virtue of the SWAPPER_TABLE_SHIFT and SWAPPER_BLOCK_SIZE constants used to determine the granularity used to map the FDT. This patch simplifies the FDT logic and removes the unnecessary mappings of surrounding pages by always mapping the FDT at page granularity as with all other fixmap mappings. To do so we statically reserve multiple PTE tables to cover the fixmap VA range. Since the FDT can be at most 2M, for 4K pages we only need to allocate a single additional PTE table, and for 16K and 64K pages the existing single PTE table is sufficient. The PTE table allocation scales with the number of slots reserved in the fixmap, and so this also makes it easier to add more fixmap entries if we require those in future. Our VA layout means that the fixmap will always fall within a single PMD table (and consequently, within a single PUD/P4D/PGD entry), which we can verify at compile time with a static_assert(). With that assert a number of runtime warnings become impossible, and are removed. I've boot-tested this patch with both 4K and 64K pages. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20230406152759.4164229-4-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
Over time, arm64's mm/mmu.c has become increasingly large and painful to navigate. Move the fixmap code to its own file where it can be understood in isolation. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20230406152759.4164229-3-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
Currently arm64's FIXADDR_{START,SIZE} definitions only cover the runtime fixmap slots (and not the boot-time fixmap slots), but the code for creating the fixmap assumes that these definitions cover the entire fixmap range. This means that the ptdump boundaries are reported in a misleading way, missing the VA region of the runtime slots. In theory this could also cause the fixmap creation to go wrong if the boot-time fixmap slots end up spilling into a separate PMD entry, though luckily this is not currently the case in any configuration. While it seems like we could extend FIXADDR_{START,SIZE} to cover the entire fixmap area, core code relies upon these *only* covering the runtime slots. For example, fix_to_virt() and virt_to_fix() try to reject manipulation of the boot-time slots based upon FIXADDR_{START,SIZE}, while __fix_to_virt() and __virt_to_fix() can handle any fixmap slot. This patch follows the lead of x86 in commit: 55f49fcb ("x86/mm: Fix overlap of i386 CPU_ENTRY_AREA with FIX_BTMAP") ... and add new FIXADDR_TOT_{START,SIZE} definitions which cover the entire fixmap area, using these for the fixmap creation and ptdump code. As the boot-time fixmap slots are now rejected by fix_to_virt(), the early_fixmap_init() code is changed to consistently use __fix_to_virt(), as it already does in a few cases. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20230406152759.4164229-2-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
The arm64 stacktrace code can be used in kprobe context, and so cannot be safely probed. Some (but not all) of the unwind functions are annotated with `NOKPROBE_SYMBOL()` to ensure this, with others markes as `__always_inline`, relying on the top-level unwind function being marked as `noinstr`. This patch has stacktrace.c consistently mark the internal stacktrace functions as `__always_inline`, removing the need for NOKPROBE_SYMBOL() as the top-level unwind function (arch_stack_walk()) is marked as `noinstr`. This is more consistent and is a simpler pattern to follow for future additions to stacktrace.c. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230411162943.203199-4-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-
Mark Rutland authored
For historical reasons, the backtrace dumping functions are placed in the middle of stacktrace.c, despite using functions defined later. For clarity, and to make subsequent refactoring easier, move the dumping functions to the end of stacktrace.c There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230411162943.203199-3-mark.rutland@arm.comSigned-off-by: Will Deacon <will@kernel.org>
-