- 21 Dec, 2020 11 commits
-
-
Russell King authored
-
Russell King authored
-
Ard Biesheuvel authored
The early ATAGS/DT mapping code uses SECTION_SHIFT to mask low order bits of R2, and decides that no ATAGS/DTB were provided if the resulting value is 0x0. This means that on systems where DRAM starts at 0x0 (such as Raspberry Pi), no explicit mapping of the DT will be created if R2 points into the first 1 MB section of memory. This was not a problem before, because the decompressed kernel is loaded at the base of DRAM and mapped using sections as well, and so as long as the DT is referenced via a virtual address that uses the same translation (the linear map, in this case), things work fine. However, commit 7a1be318 ("9012/1: move device tree mapping out of linear region") changes this, and now the DT is referenced via a virtual address that is disjoint from the linear mapping of DRAM, and so we need the early code to create the DT mapping unconditionally. So let's create the early DT mapping for any value of R2 != 0x0. Reported-by: "kernelci.org bot" <bot@kernelci.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Nathan Chancellor authored
When linking a multi_v7_defconfig + CONFIG_KASAN=y kernel with LD=ld.lld, the following error occurs: $ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- LLVM=1 zImage ld.lld: error: section: .exit.data is not contiguous with other relro sections LLD defaults to '-z relro', which is unneeded for the kernel because program headers are not used nor is there any position independent code generation or linking for ARM. Add '-z norelro' to LDFLAGS_vmlinux to avoid this error. Link: https://github.com/ClangBuiltLinux/linux/issues/1189Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Geert Uytterhoeven authored
The DTB magic marker is stored as a 32-bit big-endian value, and thus depends on the CPU's endianness. Add a macro to define this value in native endianness, to reduce #ifdef clutter and (future) duplication. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Geert Uytterhoeven authored
The dbgadtb macro is passed the size of the appended DTB, not the end address. Fixes: c03e4147 ("ARM: 9010/1: uncompress: Print the location of appended DTB") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Geert Uytterhoeven authored
DTB stores all values as 32-bit big-endian integers. Add a macro to convert such values to native CPU endianness, to reduce duplication. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Anshuman Khandual authored
Mapping between IPI type index and its string is direct without requiring an additional offset. Hence the existing macro S(x, s) is now redundant and can just be dropped. This also makes the code clean and simple. Cc: Marc Zyngier <maz@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Anshuman Khandual authored
Macros used as functions can be problematic from the compiler perspective. There was a build failure report caused primarily because of non reference of an argument variable. Hence convert PUD level pgtable helper macros into functions in order to avoid such problems in the future. In the process, it fixes the argument variables sequence in set_pud() which probably remained hidden for being a macro. https://lore.kernel.org/linux-mm/202011020749.5XQ3Hfzc-lkp@intel.com/ https://lore.kernel.org/linux-mm/5fa49698.Vu2O3r+dU20UoEJ+%25lkp@intel.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Ard Biesheuvel authored
Commit aaac3733 ("ARM: kvm: replace open coded VA->PA calculations with adr_l call") removed all uses of .L__boot_cpu_mode_offset, so there is no longer a need to define it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Ard Biesheuvel authored
Commit f77ac2e3 ("ARM: 9030/1: entry: omit FP emulation for UND exceptions taken in kernel mode") failed to take into account that there is in fact a case where we relied on this code path: during boot, the VFP detection code issues a read of FPSID, which will trigger an undef exception on cores that lack VFP support. So let's reinstate this logic using an undef hook which is registered only for the duration of the initcall to vpf_init(), and which sets VFP_arch to a non-zero value - as before - if no VFP support is present. Fixes: f77ac2e3 ("ARM: 9030/1: entry: omit FP emulation for UND ...") Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
- 08 Dec, 2020 6 commits
-
-
Nicolas Pitre authored
The ARM version of __div64_32() encapsulates a call to __do_div64 with non-standard argument passing. In particular, __n is a 64-bit input argument assigned to r0-r1 and __rem is an output argument sharing half of that r0-r1 register pair. With __n being an input argument, the compiler is in its right to presume that r0-r1 would still hold the value of __n past the inline assembly statement. Normally, the compiler would have assigned non overlapping registers to __n and __rem if the value for __n is needed again. However, here we enforce our own register assignment and gcc fails to notice the conflict. In practice this doesn't cause any problem as __n is considered dead after the asm statement and *n is overwritten. However this is not always guaranteed and clang rightfully complains. Let's fix it properly by making __n into an input-output variable. This makes it clear that those registers representing __n have been modified. Then we can extract __rem as the high part of __n with plain C code. This asm constraint "abuse" was likely relied upon back when gcc didn't handle 64-bit values optimally. Turns out that gcc is now able to optimize things and produces the same code with this patch applied. Reported-by: Antony Yu <swpenim@gmail.com> Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Ard Biesheuvel authored
There are a couple of problems with the exception entry code that deals with FP exceptions (which are reported as UND exceptions) when building the kernel in Thumb2 mode: - the conditional branch to vfp_kmode_exception in vfp_support_entry() may be out of range for its target, depending on how the linker decides to arrange the sections; - when the UND exception is taken in kernel mode, the emulation handling logic is entered via the 'call_fpe' label, which means we end up using the wrong value/mask pairs to match and detect the NEON opcodes. Since UND exceptions in kernel mode are unlikely to occur on a hot path (as opposed to the user mode version which is invoked for VFP support code and lazy restore), we can use the existing undef hook machinery for any kernel mode instruction emulation that is needed, including calling the existing vfp_kmode_exception() routine for unexpected cases. So drop the call to call_fpe, and instead, install an undef hook that will get called for NEON and VFP instructions that trigger an UND exception in kernel mode. While at it, make sure that the PC correction is accurate for the execution mode where the exception was taken, by checking the PSR Thumb bit. Cc: Dmitry Osipenko <digetx@gmail.com> Cc: Kees Cook <keescook@chromium.org> Fixes: eff8728f ("vmlinux.lds.h: Add PGO and AutoFDO input sections") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Jian Cai authored
This patch replaces 6 IWMMXT instructions Clang's integrated assembler does not support in iwmmxt.S using macros, while making sure GNU assembler still emit the same instructions. This should be easier than providing full IWMMXT support in Clang. This is one of the last bits of kernel code that could be compiled but not assembled with clang. Once all of it works with IAS, we no longer need to special-case 32-bit Arm in Kbuild, or turn off CONFIG_IWMMXT when build-testing. "Intel Wireless MMX Technology - Developer Guide - August, 2002" should be referenced for the encoding schemes of these extensions. Link: https://github.com/ClangBuiltLinux/linux/issues/975Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Jian Cai <jiancai@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Ard Biesheuvel authored
KASAN uses the routines in stacktrace.c to capture the call stack each time memory gets allocated or freed. Some of these routines are also used to log CPU and memory context when exceptions are taken, and so in some cases, memory accesses may be made that are not strictly in line with the KASAN constraints, and may therefore trigger false KASAN positives. So follow the example set by other architectures, and simply disable KASAN instrumentation for these routines. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Nick Desaulniers authored
Since commit 0bddd227 ("Documentation: update for gcc 4.9 requirement") the minimum supported version of GCC is gcc-4.9. It's now safe to remove this code. Link: https://github.com/ClangBuiltLinux/linux/issues/427Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Nick Desaulniers authored
LLD does not yet support any big endian architectures. Make this config non-selectable when using LLD until LLD is fixed. Link: https://github.com/ClangBuiltLinux/linux/issues/965Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
- 12 Nov, 2020 3 commits
-
-
Geert Uytterhoeven authored
As "u64" is equivalent to "unsigned long long", there is no need to cast a "u64" parameter for printing it using the "0x%08llx" format specifier. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Geert Uytterhoeven authored
Fix a misspelling of the word "memory". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Fangrui Song authored
Commit d6d51a96 ("ARM: 9014/2: Replace string mem* functions for KASan") add .weak directives to memcpy/memmove/memset to avoid collision with KASAN interceptors. This does not work with LLVM's integrated assembler (the assembly snippet `.weak memcpy ... .globl memcpy` produces a STB_GLOBAL memcpy while GNU as produces a STB_WEAK memcpy). LLVM 12 (since https://reviews.llvm.org/D90108) will error on such an overridden symbol binding. Use the appropriate WEAK macro instead. Link: https://github.com/ClangBuiltLinux/linux/issues/1190 -- Fixes: d6d51a96 ("ARM: 9014/2: Replace string mem* functions for KASan") Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Fangrui Song <maskray@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
- 09 Nov, 2020 2 commits
-
-
Russell King authored
Merge tag 'arm-adrl-replacement-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable Tidy up open coded relative references in asm Use the newly introduced adr_l/ldr_l/str_l/mov_l assembler macros to replace open coded VA-to-PA arithmetic in various places in the code. This avoids the use of literals on v7+ CPUs, reduces the footprint of the code in most cases, and generally makes the code easier to follow. Series was posted here, and reviewed by Nicolas Pitre: https://lore.kernel.org/linux-arm-kernel/20200914095706.3985-1-ardb@kernel.org/
-
Russell King authored
Merge tag 'arm-p2v-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable Implement the necessary changes in the ARM assembler boot code to permit the relative alignment of the physical and the virtual addresses of the kernel to be as little as 2 MiB, as opposed to the minimum of 16 MiB we support today. Series was posted here, and reviewed by Nicolas Pitre and Linus Walleij: https://lore.kernel.org/linux-arm-kernel/20200921154117.757-1-ardb@kernel.org/
-
- 28 Oct, 2020 18 commits
-
-
Ard Biesheuvel authored
Replace the open coded calculations of the actual physical address of the KVM stub vector table with a single adr_l invocation. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Replace the open coded arithmetic with a simple adr_l/sub pair. This removes some open coded arithmetic involving virtual addresses, avoids literal pools on v7+, and slightly reduces the footprint of the code. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Replace the open coded PC relative offset calculations with adr_l and ldr_l invocations. This removes some open coded PC relative arithmetic, avoids literal pools on v7+, and slightly reduces the footprint of the code. Note that ALT_SMP() expects a single instruction so move the macro invocation after it. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Now that calling __do_fixup_smp_on_up() can be done without passing the physical-to-virtual offset in r3, we can replace the open coded PC relative offset calculations with a pair of adr_l invocations. This removes some open coded arithmetic involving virtual addresses, avoids literal pools on v7+, and slightly reduces the footprint of the code. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Currently, the .alt.smp.init section contains the virtual addresses of the patch sites. Since patching may occur both before and after switching into virtual mode, this requires some manual handling of the address when applying the UP alternative. Let's simplify this by using relative offsets in the table entries: this allows us to simply add each entry's address to its contents, regardless of whether we are running in virtual mode or not. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Replace the open coded PC relative offset calculations with adr_l and ldr_l invocations. This removes some open coded arithmetic involving virtual addresses, avoids literal pools on v7+, and slightly reduces the footprint of the code. Note that it also removes a stale comment about the contents of r6. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Replace the open coded PC relative offset calculations involving __turn_mmu_on and __turn_mmu_on_end with a pair of adr_l invocations. This removes some open coded arithmetic involving virtual addresses, avoids literal pools on v7+, and slightly reduces the footprint of the code. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Replace the open coded PC relative offset calculations with a pair of adr_l invocations. This removes some open coded arithmetic involving virtual addresses, avoids literal pools on v7+, and slightly reduces the footprint of the code. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The ARM 'adrl' pseudo instruction is a bit problematic, as it does not exist in Thumb mode, and it is not implemented by Clang either. Since the Thumb variant has a slightly bigger range, it is sometimes necessary to emit the 'adrl' variant in ARM mode where Thumb mode can use adr just fine. However, that still leaves the Clang issue, which does not appear to be supporting this any time soon. So let's switch to the adr_l macro, which works for both ARM and Thumb, and has unlimited range. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a physical address (PHYS_OFFSET) that is platform specific, and is discovered at boot. Since we don't want to slow down translations between physical and virtual addresses by keeping the offset in a variable in memory, we implement this by patching the code performing the translation, and putting the offset between PAGE_OFFSET and the start of physical RAM directly into the instruction opcodes. As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB of granularity, we have to round up PHYS_OFFSET to the next multiple if the start of physical RAM is not a multiple of 16 MiB. This wastes some physical RAM, since the memory that was skipped will now live below PAGE_OFFSET, making it inaccessible to the kernel. We can improve this by changing the patchable sequences and the patching logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB of granularity, and so we will never waste more than that amount by rounding up the physical start of DRAM to the next multiple of 2 MiB. (Note that 2 MiB granularity guarantees that the linear mapping can be created efficiently, whereas less than 2 MiB may result in the linear mapping needing another level of page tables) This helps Zhen Lei's scenario, where the start of DRAM is known to be occupied. It also helps EFI boot, which relies on the firmware's page allocator to allocate space for the decompressed kernel as low as possible. And if the KASLR patches ever land for 32-bit, it will give us 3 more bits of randomization of the placement of the kernel inside the linear region. For the ARM code path, it simply comes down to using two add/sub instructions instead of one for the carryless version, and patching each of them with the correct immediate depending on the rotation field. For the LPAE calculation, which has to deal with a carry, it patches the MOVW instruction with up to 12 bits of offset (but we only need 11 bits anyway) For the Thumb2 code path, patching more than 11 bits of displacement would be somewhat cumbersome, but the 11 bits we need fit nicely into the second word of the u16[2] opcode, so we simply update the immediate assignment and the left shift to create an addend of the right magnitude. Suggested-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Nicolas Pitre <nico@fluxnic.net> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
In preparation for reducing the phys-to-virt minimum relative alignment from 16 MiB to 2 MiB, switch to patchable sequences involving MOVW instructions that can more easily be manipulated to carry a 12-bit immediate. Note that the non-LPAE ARM sequence is not updated: MOVW may not be supported on non-LPAE platforms, and the sequence itself can be updated more easily to apply the 12 bits of displacement. For Thumb2, which has many more versions of opcodes, switch to a sequence that can be patched by the same patching code for both versions. Note that the Thumb2 opcodes for MOVW and MVN are unambiguous, and have no rotation bits in their immediate fields, so there is no need to use placeholder constants in the asm blocks. While at it, drop the 'volatile' qualifiers from the asm blocks: the code does not have any side effects that are invisible to the compiler, so it is free to omit these sequences if the outputs are not used. Suggested-by: Russell King <linux@armlinux.org.uk> Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Declutter the code in __fixup_pv_table() by using the new adr_l/str_l macros to take PC relative references to external symbols, and by using the value of PHYS_OFFSET passed in r8 to calculate the p2v offset. Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Free up a register in the p2v patching code by switching to relative references, which don't require keeping the phys-to-virt displacement live in a register. Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
We always pass the same value for 'type' so pull it into the __pv_stub macro itself. Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The big and little endian versions of the ARM p2v patching routine only differ in the values of the constants, so factor those out into macros so that we only have one version of the logic sequence to maintain. Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The ARM and Thumb2 versions of the p2v patching loop have some overlap at the end of the loop, so factor that out. As numeric labels are not required to be unique, and may therefore be ambiguous, use named local labels for the start and end of the loop instead. Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Move the phys2virt patching code into a separate .S file before doing some work on it. Suggested-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
When using the new adr_l/ldr_l/str_l macros to refer to external symbols from modules, the linker may emit place relative ELF relocations that need to be fixed up by the module loader. So add support for these. Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-