- 12 Mar, 2022 1 commit
-
Arnd Bergmann authored
The removal of the old-style irq entry broke obscure NOMMU configurations on machines that have an MMU:

    ld.lld: error: undefined symbol: generic_handle_arch_irq
    referenced by kernel/entry-armv.o:(__irq_svc) in archive arch/arm/built-in.a

A follow-up patch converting nvic to generic_handle_arch_irq() could have fixed this by removing the Kconfig conditional, but did it differently. Change the Kconfig logic so ARM machines now unconditionally enable the feature. I have also submitted a patch to remove support for the configurations that broke, but fixing the regression first is a trivial and correct change.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 54f481a2 ("ARM: remove old-style irq entry")
Fixes: 52d24087 ("irqchip: nvic: Use GENERIC_IRQ_MULTI_HANDLER")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
-
- 11 Mar, 2022 4 commits
-
Ard Biesheuvel authored
Commit b6506981 ("ARM: unwind: support unwinding across multiple stacks") updated the logic in the ARM unwinder to widen the bounds within which SP is assumed to be valid, in order to allow the unwind to traverse from the IRQ stack to the task stack. This is necessary, as otherwise, unwinds started from the IRQ stack would terminate in the IRQ exception handler, making stacktraces substantially less useful.

This turns out to be a mistake, as it breaks asynchronous unwinding across exceptions, when the exception is taken before the stack frame is consistent with the unwind info. For instance, in the following backtrace:

    ...
    generic_handle_arch_irq from call_with_stack+0x18/0x20
    call_with_stack from __irq_svc+0x80/0x98
    Exception stack(0xc7093e20 to 0xc7093e68)
    3e20: b6a94a88 c7093ea0 00000008 00000000 c7093ea0 b7e127d0 00000051 c9220000
    3e40: b6a94a88 b6a94a88 00000004 0002b000 0036b570 c7093e70 c040ca2c c0994a90
    3e60: 20070013 ffffffff
    __irq_svc from __copy_to_user_std+0x20/0x378
    ...

we need to apply the following unwind directives:

    0xc099720c <__copy_to_user_std+0x1c>: @0xc295d1d4
      Compact model index: 1
      0x9b      vsp = r11
      0xb1 0x0d pop {r0, r2, r3}
      0x84 0x81 pop {r4, r11, r14}
      0xb0      finish

which tell us to switch to the frame pointer register R11 and proceed with the unwind from that. However, having been interrupted 0x20 bytes into the function:

    c09971f0 <__copy_to_user_std>:
    c09971f0: e59f3350    ldr     r3, [pc, #848]
    c09971f4: e243c001    sub     ip, r3, #1
    c09971f8: e05cc000    subs    ip, ip, r0
    c09971fc: 228cc001    addcs   ip, ip, #1
    c0997200: 205cc002    subscs  ip, ip, r2
    c0997204: 33a00000    movcc   r0, #0
    c0997208: e320f014    csdb
    c099720c: e3a03000    mov     r3, #0
    c0997210: e92d481d    push    {r0, r2, r3, r4, fp, lr}
    c0997214: e1a0b00d    mov     fp, sp
    c0997218: e2522004    subs    r2, r2, #4

the value for R11 recovered from the previous frame (__irq_svc) will be a snapshot of its value before the exception was taken (0x0002b000), which occurred at address __copy_to_user_std+0x20 (0xc0997210), when R11 had not been assigned its value yet.

This means we can never assume that the SP values recovered from the stack or from the frame pointer are ever safe to use, given the need to do asynchronous unwinding, and the only robust approach is to revert to the previous approach, which is to derive bounds for SP based on the initial value, and never update them.

We can make an exception, though: now that the IRQ stack switch is guaranteed to occur in call_with_stack(), we can implement a special case for this function, and use a different set of bounds based on the knowledge that it will always unwind from R11 rather than SP. As call_with_stack() is a hand-rolled assembly routine, this is guaranteed to remain that way.

So let's do a partial revert of b6506981, and drop all manipulations for sp_low and sp_high based on the information collected during the unwind itself. To support call_with_stack(), set sp_low and sp_high explicitly to values derived from R11 when we unwind that function.

The only downside is that, while unwinding an overflow of the vmap'ed stack will work fine as before, we will no longer be able to produce a backtrace that unwinds the overflow stack itself across the exception that was raised due to the faulting access to the guard region. However, this only affects exceptions caused by problems in the stack overflow handling code itself, in which case the remaining backtrace is not that relevant.

Fixes: b6506981 ("ARM: unwind: support unwinding across multiple stacks")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
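As an illustration only (the actual unwinder code differs; the helper pc_is_within_call_with_stack() and the field layout are hypothetical), the idea is to keep the SP bounds fixed for the whole unwind and rebase them on R11 only for the one hand-rolled asm routine:

    /* hedged sketch; field and helper names are hypothetical */
    struct unwind_ctrl_block {
            unsigned long vrs[16];     /* virtual register set: r0..r15 */
            unsigned long sp_low;      /* fixed lower bound for a valid SP */
            unsigned long sp_high;     /* fixed upper bound for a valid SP */
    };

    #define FP 11
    #define PC 15

    static void set_sp_bounds(struct unwind_ctrl_block *ctrl)
    {
            if (pc_is_within_call_with_stack(ctrl->vrs[PC])) {
                    /*
                     * call_with_stack() is hand-rolled asm and always
                     * unwinds via R11, so rebase the valid SP window on
                     * the recovered frame pointer instead.
                     */
                    ctrl->sp_low  = ctrl->vrs[FP];
                    ctrl->sp_high = ALIGN(ctrl->vrs[FP], THREAD_SIZE);
            }
            /* otherwise: bounds stay as derived from the initial SP */
    }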
-
Ard Biesheuvel authored
After simplifying the stack switch code in the IRQ exception handler by deferring the actual stack switch to call_with_stack(), we no longer need to special case the way we dump the exception stack, since it will always be at the top of whichever stack was active when the exception was taken. So revert this special handling for the ARM unwinder.

This reverts commit 4ab68270.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
-
Ard Biesheuvel authored
The IRQ stacks series made some changes to the unwinder, to permit unwinding across different stacks. This is needed because otherwise, the call stack would terminate at the point where the stack switch between the task stack and the IRQ stack occurs, which would defeat any diagnostics that rely on timer interrupts, such as RCU stall detection.

Unfortunately, getting the unwind annotations correct turns out to be difficult, given that this now involves a frame pointer which needs to point into the right location in the task stack when unwinding from the IRQ stack. Getting this wrong for an exception handling routine results in the stack pointer being unwound from the wrong location, causing any subsequent unwind attempts to cause all kinds of issues, as reported by Naresh here [0].

So let's simplify this, by deferring the stack switch to call_with_stack(), which already has the correct unwind annotations, and removing all the complicated handling of the stack frame from the IRQ exception entrypoint itself.

[0] https://lore.kernel.org/all/CA+G9fYtpy8VgK+ag6OsA9TDrwi5YGU4hu7GM8xwpO7v6LrCD4Q@mail.gmail.com/

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
-
Russell King (Oracle) authored
When e.g. a WARN_ON() is encountered, we attempt to unwind the current thread. To do this, we set frame.pc to unwind_backtrace, which means it points at the beginning of the function. However, the rest of the state is initialised from within the function, which means the function prologue has already been run.

This can be confusing, and with a recent patch from Ard, can result in the unwinder misbehaving if we want to be strict about the PC value. If we correctly initialise the state so it is self-consistent (in other words, set frame.pc to the location where we are initialising it) then we eliminate this confusion, and avoid possible future issues.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
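One way to make the initial state self-consistent is to take the PC from a label placed right at the initialisation site, using GCC's address-of-label extension. A hedged sketch of the shape of the fix (the upstream code may differ in detail):

    void unwind_current_sketch(void)
    {
            struct stackframe frame;

            frame.fp = (unsigned long)__builtin_frame_address(0);
            frame.sp = current_stack_pointer;
            frame.lr = (unsigned long)__builtin_return_address(0);
            /*
             * Take PC from a label at this exact spot, so PC, SP, FP
             * and LR all describe the same point of execution, after
             * the prologue has already run.
             */
    here:
            frame.pc = (unsigned long)&&here;

            /* ... unwind from 'frame' ... */
    }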
-
- 07 Mar, 2022 2 commits
-
Ard Biesheuvel authored
Commit 41918ec8 ("ARM: ftrace: enable the graph tracer with the EABI unwinder") removed the dummy version of return_address() that was provided for the CONFIG_ARM_UNWIND=y case, on the assumption that the removal of the kernel_text_address() call from unwind_frame() in the preceding patch made it safe to do so.

However, this turns out not to be the case: Corentin reports warnings about suspicious RCU usage and other strange behavior that seems to originate in the stack unwinding that occurs in return_address(). Given that the function graph tracer (which is what these changes were enabling for CONFIG_ARM_UNWIND=y builds) does not appear to care about this distinction, let's revert return_address() to the old state.

Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Fixes: 41918ec8 ("ARM: ftrace: enable the graph tracer with the EABI unwinder")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
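The "old state" being restored amounts to stubbing out return_address() when the EABI unwinder is in use; a rough sketch under that assumption:

    #ifdef CONFIG_ARM_UNWIND
    /*
     * Sketch of the restored dummy: with the EABI unwinder, don't
     * attempt a full unwind from return_address(), as it may run in
     * contexts (e.g. under tracing hooks) where that is not safe.
     */
    void *return_address(unsigned int level)
    {
            return NULL;
    }
    #endif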
-
Ard Biesheuvel authored
Corentin reports that since commit 538b9265 ("ARM: unwind: track location of LR value in stack frame"), numerous spurious warnings are emitted into the kernel log:

    [    0.000000] unwind: Index not found c0f0c440
    [    0.000000] unwind: Index not found 00000000
    [    0.000000] unwind: Index not found c0f0c440
    [    0.000000] unwind: Index not found 00000000

This is due to the fact that the commit in question removes a check whether the PC value in the unwound frame is actually a kernel text address, on the assumption that such an address will not be associated with valid unwind data to begin with, which is checked right after. The reason for removing this check was that unwind_frame() will be called by the ftrace graph tracer code, which means that it can no longer be safely instrumented itself, or any code that it calls, as it could cause infinite recursion.

In order to prevent the spurious diagnostics, let's add back the call to kernel_text_address(), but this time, only call it if no unwind data could be found for the address in question. This is more efficient for the common successful case, and should avoid any unintended recursion, considering that kernel_text_address() will only be called if no unwind data was found.

Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Fixes: 538b9265 ("ARM: unwind: track location of LR value in stack frame")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
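The shape of the fix, as a hedged sketch (the real unwind_frame() differs in detail):

    int unwind_frame(struct stackframe *frame)
    {
            const struct unwind_idx *idx;

            idx = unwind_find_idx(frame->pc);
            if (!idx) {
                    /*
                     * Only consult kernel_text_address() once the index
                     * lookup has already failed: the hot, successful
                     * path never calls it, which also avoids recursing
                     * into instrumented code from the graph tracer.
                     */
                    if (frame->pc && kernel_text_address(frame->pc))
                            pr_warn("unwind: Index not found %08lx\n",
                                    frame->pc);
                    return -URC_FAILURE;
            }
            /* ... execute the unwind instructions for this frame ... */
            return URC_OK;
    }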
-
- 28 Feb, 2022 1 commit
-
Russell King (Oracle) authored
Merge tag 'arm-ftrace-for-rmk' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable

ARM: ftrace fixes and cleanups

Make all flavors of ftrace available on all builds, regardless of ISA choice, unwinder choice or compiler:
- use ADD not POP where possible
- fix a couple of Thumb2 related issues
- enable HAVE_FUNCTION_GRAPH_FP_TEST for robustness
- enable the graph tracer with the EABI unwinder
- avoid clobbering frame pointer registers to make Clang happy

Link: https://lore.kernel.org/linux-arm-kernel/20220203082204.1176734-1-ardb@kernel.org/
-
- 10 Feb, 2022 2 commits
-
Ard Biesheuvel authored
This reverts commit ecb108e3. Clang + Thumb2 with ftrace is now supported.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
The SMC calling convention uses R7 as an argument register, which conflicts with its use as a frame pointer when building in Thumb2 mode. Given that Clang with ftrace does not permit frame pointers to be disabled, let's omit this compilation unit from ftrace instrumentation.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
-
- 09 Feb, 2022 10 commits
-
Ard Biesheuvel authored
Thumb2 uses R7 rather than R11 as the frame pointer, and even though we rarely use a frame pointer to begin with when building in Thumb2 mode, there are cases where it is required by the compiler (Clang when inserting profiling hooks via -pg).

However, preserving and restoring the frame pointer is risky, as any unhandled exceptions raised in the meantime will produce a bogus backtrace, and it would be better not to touch the frame pointer at all. This is the case even when CONFIG_FRAME_POINTER is not set, as the unwind directive used by the unwinder may also use R7 or R11 as the unwind anchor, even if the frame pointer is not managed strictly according to the frame pointer ABI.

So let's tweak the cacheflush asm code not to clobber R7 or R11 at all, so that we can drop R7 from the clobber lists of the inline asm blocks that call these routines, and remove the code that preserves/restores R11.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
-
Ard Biesheuvel authored
Thumb2 code uses R7 as the frame pointer rather than R11, because the opcodes to access it are generally shorter. This means that there are cases where we cannot simply add it to the clobber list of an asm() block, but need to preserve/restore it explicitly, or the compiler may complain in some cases (e.g., Clang builds with ftrace enabled).

Since R11 is not special in that case, clobber it instead, and use it to preserve/restore the value of R7.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
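A schematic example of the pattern (not the exact code touched by this patch): stash the Thumb2 frame pointer in R11, which can safely appear in the clobber list, instead of clobbering R7 directly.

    /* sketch only; the payload using r7 is hypothetical */
    static inline void use_r7_temporarily(unsigned long val)
    {
            asm volatile(
                    "mov    r11, r7         @ preserve Thumb2 frame pointer\n\t"
                    "mov    r7, %0          @ hypothetical payload using r7\n\t"
                    "mov    r7, r11         @ restore frame pointer\n\t"
                    : : "r" (val) : "r11");
    }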
-
Ard Biesheuvel authored
Enable the function graph tracer in combination with the EABI unwinder, so that Thumb2 builds or Clang ARM builds can make use of it.

This involves using the unwinder to locate the return address of an instrumented function on the stack, so that it can be overridden and made to refer to the ftrace handling routines that need to be called at function return.

Given that for these builds, it is not guaranteed that the value of the link register is stored on the stack, fall back to the stack slot that will be used by the ftrace exit code to restore LR in the instrumented function's execution context.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
The ftrace graph tracer needs to override the return address of an instrumented function, in order to install a hook that gets invoked when the function returns again.

Currently, we only support this when building for ARM using GCC with frame pointers, as in this case, it is guaranteed that the function will reload LR from [FP, #-4] in all cases, and we can simply pass that address to the ftrace code. In order to support this for configurations that rely on the EABI unwinder, such as Thumb2 builds, make the unwinder keep track of the address from which LR was unwound, permitting ftrace to make use of this in a subsequent patch.

Drop the call to kernel_text_address(), which is problematic in terms of ftrace recursion, given that it may be instrumented itself. The call is redundant anyway, as no unwind directives will be found unless the PC points to memory that is known to contain executable code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
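A hedged sketch of the tracking, with hypothetical naming: whenever the unwind instructions pop LR (r14) off the virtual stack, remember the slot it came from.

    struct unwind_ctrl_block {
            unsigned long vrs[16];          /* virtual register set: r0..r15 */
            const unsigned long *insn;      /* pointer into the unwind table */
            unsigned long *lr_addr;         /* hypothetical: slot LR came from */
    };

    /* inside the handler that pops a register off the virtual SP */
    static void unwind_pop_register(struct unwind_ctrl_block *ctrl,
                                    unsigned long **vsp, unsigned int reg)
    {
            if (reg == 14)                  /* record where LR was restored from */
                    ctrl->lr_addr = *vsp;
            ctrl->vrs[reg] = *(*vsp)++;
    }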
-
Ard Biesheuvel authored
Fix the frame pointer handling in the function graph tracer entry and exit code so we can enable HAVE_FUNCTION_GRAPH_FP_TEST. Instead of using FP directly (which will have different values between the entry and exit pieces of the function graph tracer), use the value of SP at entry and exit, as we can derive the former value from the frame pointer.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Avoid explicit literal loads and instead, use accessor macros that generate the optimal sequence depending on the architecture revision being targeted.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Tweak the ftrace return paths to avoid redundant loads of SP, as well as unnecessary clobbering of IP.

This also fixes the inconsistency of using MOV to perform a function return, which is sub-optimal on recent micro-architectures but more importantly, does not perform an interworking return, unlike compiler generated function returns in Thumb2 builds. Let's fix this by popping PC from the stack like most ordinary code does.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Kernel images that are large in comparison to the range of a direct branch may fail to work as expected with ftrace, as patching a direct branch to one of the core ftrace routines may not be possible from the .init.text section, if it is emitted too far away from the normal .text section. This is more likely to affect Thumb2 builds, given that its range is only -/+ 16 MiB (as opposed to ARM which has -/+ 32 MiB), but may occur in either ISA.

To work around this, add a couple of trampolines to .init.text and swap these in when the ftrace patching code is operating on callers in .init.text.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
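Roughly, the patching code just needs to pick a different branch target for init-section callers; a hedged sketch with a hypothetical trampoline symbol:

    extern void ftrace_caller(void);
    extern void ftrace_caller_from_init(void);  /* hypothetical .init.text trampoline */

    static unsigned long ftrace_call_target(unsigned long caller_pc)
    {
            /*
             * Callers living in .init.text may be out of direct-branch
             * range of the core ftrace routines, so route them through
             * a trampoline that is itself emitted into .init.text.
             */
            if (is_kernel_inittext(caller_pc))
                    return (unsigned long)&ftrace_caller_from_init;

            return (unsigned long)&ftrace_caller;
    }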
-
Ard Biesheuvel authored
The compiler emitted hook used for ftrace consists of a PUSH {LR} to preserve the link register, followed by a branch-and-link (BL) to __gnu_mcount_nc. Dynamic ftrace patches away the latter to turn the combined sequence into a NOP, using a POP {LR} instruction.

This is not necessary, since the link register does not get clobbered in this case, and simply adding #4 to the stack pointer is sufficient, and avoids a memory access that may take a few cycles to resolve depending on the micro-architecture.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
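For illustration (schematic only, not the patching code itself): the two-instruction hook and its NOPed form. The saved LR stays on the stack either way, so discarding it with an ADD avoids the load that a POP implies.

    void example_traced_function(void)
    {
            /* as emitted by the compiler with -pg (__gnu_mcount_nc ABI): */
            asm volatile(
                    "push   {lr}                    \n\t"
                    "bl     __gnu_mcount_nc         \n\t");

            /* after dynamic ftrace patches the BL into a 'NOP': */
            asm volatile(
                    "push   {lr}                    \n\t"
                    "add    sp, sp, #4              \n\t"); /* was: pop {lr} */
    }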
-
Ard Biesheuvel authored
Using ADR to take the address of 'ftrace_stub' via a local label produces an address that has the Thumb bit cleared, which means the subsequent comparison is guaranteed to fail. Instead, use the badr macro, which forces the Thumb bit to be set.

Fixes: a3ba87a6 ("ARM: 6316/1: ftrace: add Thumb-2 support")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
-
- 31 Jan, 2022 2 commits
-
Russell King (Oracle) authored
Merge tag 'arm-vmap-stacks-v6' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable

ARM: support for IRQ and vmap'ed stacks [v6]

This tag covers the changes between the version of vmap'ed + IRQ stacks support pulled into rmk/devel-stable [0] (which was dropped from v5.17 due to issues discovered too late in the cycle), and my v5 proposed for the v5.18 cycle [1].

[0] git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git arm-irq-and-vmap-stacks-for-rmk
[1] https://lore.kernel.org/linux-arm-kernel/20220124174744.1054712-1-ardb@kernel.org/
-
Ard Biesheuvel authored
The get_current() and __my_cpu_offset() accessors evaluate to only a single instruction emitted inline, but due to the size of the asm string that is created for SMP+v6 configurations, the compiler assumes otherwise, and may emit the functions out of line instead. So use __always_inline to avoid this.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
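A simplified sketch of the pattern (the real accessor carries runtime-patching alternatives that inflate the asm string, of which only one instruction survives; here only the plain TPIDRURO read is shown):

    /* forced inline: the long asm string makes the inliner over-estimate size */
    static __always_inline struct task_struct *get_current(void)
    {
            struct task_struct *cur;

            /* read the task pointer from TPIDRURO */
            asm("mrc p15, 0, %0, c13, c0, 3" : "=r" (cur));
            return cur;
    }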
-
- 25 Jan, 2022 5 commits
-
Ard Biesheuvel authored
Only SMP systems use the secondary startup path by definition, so there is no need for SMP conditionals there.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The build bots complain about iop_handle_irq() not being declared, so let's make it static instead.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Rework the vmalloc_seq handling so it can be used safely under SMP, as we started using it to ensure that vmap'ed stacks are guaranteed to be mapped by the active mm before switching to a task. Here, we need to ensure that changes to the page tables are visible to other CPUs when they observe a change in the sequence count.

Since LPAE needs none of this, fold a check against it into the vmalloc_seq counter check after breaking it out into a separate static inline helper.

Given that vmap'ed stacks are now also supported on !SMP configurations, let's drop the WARN() that could potentially now fire spuriously.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
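A hedged sketch of such a helper, assuming the sequence counter lives in mm->context and is an atomic_t (names approximate):

    static inline void check_vmalloc_seq(struct mm_struct *mm)
    {
            /* LPAE shares the kernel page tables, so no sync is needed */
            if (!IS_ENABLED(CONFIG_ARM_LPAE) &&
                unlikely(atomic_read(&mm->context.vmalloc_seq) !=
                         atomic_read(&init_mm.context.vmalloc_seq)))
                    __check_vmalloc_seq(mm);  /* copy over kernel mappings */
    }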
-
Ard Biesheuvel authored
Avoid using R9 in the IRQ handler code, as the entry code uses it for tsk, and expects it to remain untouched between the IRQ entry and exit code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Use the SMP_ON_UP patching framework to elide HWCAP_TLS tests from the context switch and return to userspace code paths, as SMP systems are guaranteed to have this h/w capability.

At the same time, omit the update of __entry_task if the system is detected to be UP at runtime, as in that case, the value is never used.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
- 24 Jan, 2022 2 commits
-
Ard Biesheuvel authored
Nathan reports that the group relocations go out of range in pathological cases such as allyesconfig kernels, which have little chance of actually booting but are still used in validation. So add a Kconfig symbol for this feature, and make it depend on !COMPILE_TEST.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
When onlining a CPU, switch to swapper_pg_dir as soon as possible so that it is guaranteed that the vmap'ed stack is mapped before it is used.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
- 06 Jan, 2022 1 commit
-
Ard Biesheuvel authored
Nathan reports that the new get_current() and per-CPU offset accessors may cause problems at build time due to the use of a literal to hold the address of the respective variables. This is due to the fact that LLD before v14 does not support the PC-relative group relocations that are normally used for this, and the fallback relies on literals but does not emit the literal pools explicitly using the .ltorg directive.

    ./arch/arm/include/asm/current.h:53:6: error: out of range pc-relative fixup value
      asm(LOAD_SYM_ARMV6(%0, __current) : "=r"(cur));
          ^
    ./arch/arm/include/asm/insn.h:25:2: note: expanded from macro 'LOAD_SYM_ARMV6'
      " ldr " #reg ", =" #sym "\n\t"
      ^
    <inline asm>:1:3: note: instantiated into assembly here
      ldr r0, =__current
      ^

Since emitting a literal pool in this particular case is not possible, let's avoid the LOAD_SYM_ARMV6() entirely, and use the ordinary C assignment instead.

As it turns out, there are other such cases, and here, using .ltorg to emit the literal pool within range of the LDR instruction would be possible due to the presence of an unconditional branch right after it. Unfortunately, putting .ltorg directives in subsections appears to confuse the Clang inline assembler, resulting in similar errors even though the .ltorg is most definitely within range.

So let's fix this by emitting the literal explicitly, and not rely on the assembler to figure this out. This means we have to move the fallback out of the LOAD_SYM_ARMV6() macro and into the callers.

Link: https://github.com/ClangBuiltLinux/linux/issues/1551
Fixes: 9c46929e ("ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
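The "emit the literal explicitly" trick can be sketched as follows (hedged: simplified relative to the real macros, using __current from this series as the example symbol): place the word holding the symbol's address right after the LDR, and branch over it.

    static inline struct task_struct *get_current_fallback(void)
    {
            struct task_struct *cur;

            asm(
            "       ldr     %0, 1f          \n\t"   /* load &__current from the literal */
            "       ldr     %0, [%0]        \n\t"   /* then load the task pointer */
            "       b       2f              \n\t"   /* skip over the literal word */
            "1:     .long   __current       \n\t"   /* explicitly emitted literal */
            "2:                             \n\t"
            : "=r" (cur));
            return cur;
    }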
-
- 05 Jan, 2022 1 commit
-
Ard Biesheuvel authored
There are several reports about the new vmap'ed stacks code breaking suspend/resume on Exynos, Renesas and Tegra SMP platforms. While this is under investigation, let's disable the vmap'ed stacks feature for the time being for SMP configurations that have suspend/resume enabled.

[0] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-8-ardb@kernel.org/

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
-
- 17 Dec, 2021 1 commit
-
Russell King (Oracle) authored
Merge tag 'arm-irq-and-vmap-stacks-for-rmk' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable

ARM: support for IRQ and vmap'ed stacks

This PR covers all the work related to implementing IRQ stacks and vmap'ed stacks for all 32-bit ARM systems that are currently supported by the Linux kernel, including RiscPC and Footbridge. It has been submitted for review in three different waves:
- IRQ stacks support for v7 SMP systems [0],
- vmap'ed stacks support for v7 SMP systems [1],
- extending support for both IRQ stacks and vmap'ed stacks for all remaining configurations, including v6/v7 SMP multiplatform kernels and uniprocessor configurations including v7-M [2]

[0] https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/
[1] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-1-ardb@kernel.org/
[2] https://lore.kernel.org/linux-arm-kernel/20211206164659.1495084-1-ardb@kernel.org/
-
- 06 Dec, 2021 8 commits
-
Ard Biesheuvel authored
Enable support for IRQ stacks on !MMU, and add the code to the IRQ entry path to switch to the IRQ stack if not running from it already.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
On UP systems, only a single task can be 'current' at the same time, which means we can use a global variable to track it. This means we can also enable THREAD_INFO_IN_TASK for those systems, as in that case, thread_info is accessed via current rather than the other way around, removing the need to store thread_info at the base of the task stack. This, in turn, permits us to enable IRQ stacks and vmap'ed stacks on UP systems as well.

To partially mitigate the performance overhead of this arrangement, use an ADD/ADD/LDR sequence with the appropriate PC-relative group relocations to load the value of current when needed. This means that accessing current will still only require a single load as before, avoiding the need for a literal to carry the address of the global variable in each function. However, accessing thread_info will now require this load as well.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Defer TPIDRURO updates for user space until exit also for CPU_V6+SMP configurations, so that we can decide at runtime whether to use it to carry the current pointer, provided that we are running on a CPU that actually implements this register. This is needed for THREAD_INFO_IN_TASK support for UP systems, which requires that all SMP capable systems use the TPIDRURO based access to 'current', as the only remaining alternative will be a global variable which only works on UP.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Enable the use of the TLS register to hold the 'current' pointer also on non-SMP configurations that target v6k or later CPUs. This will permit the use of THREAD_INFO_IN_TASK as well as IRQ stacks and vmap'ed stacks for such configurations.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Permit the use of the TPIDRPRW system register for carrying the per-CPU offset in generic SMP configurations that also target non-SMP capable ARMv6 cores. This uses the SMP_ON_UP code patching framework to turn all TPIDRPRW accesses into reads/writes of entry #0 in the __per_cpu_offset array.

While at it, switch over some existing direct TPIDRPRW accesses in asm code to invocations of a new helper that is patched in the same way when necessary.

Note that CPU_V6+SMP without SMP_ON_UP results in a kernel that does not boot on v6 CPUs without SMP extensions, so add this dependency to Kconfig as well.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
We will be adding variable loads to various hot paths, so it makes sense to add a helper macro that can load variables from asm code without the use of literal pool entries. On v7 or later, we can simply use MOVW/MOVT pairs, but on earlier cores, this requires a bit of hackery to emit an instruction sequence that implements this using a sequence of ADD/LDR instructions.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Add support for the R_ARM_ALU_PC_Gn_NC and R_ARM_LDR_PC_G2 group relocations [0] so we can use them in modules. These will be used to load the current task pointer from a global variable without having to rely on a literal pool entry to carry the address of this variable, which may have a significant negative impact on cache utilization for variables that are used often and in many different places, as each occurrence will result in a literal pool entry and therefore a line in the D-cache.

[0] 'ELF for the ARM architecture'
    https://github.com/ARM-software/abi-aa/releases

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Tweak the UP stack protector handling code so that the thread info pointer is preserved in R7 until set_current is called. This is needed for a subsequent patch that implements THREAD_INFO_IN_TASK and set_current for UP as well.

This also means we will prefer the per-task protector on UP systems that implement the thread ID registers, so tweak the preprocessor conditionals to reflect this.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-