Commits · 0383808e4d99ac31892655ae9dc93597eb6f1412 · Kirill Smelkov / linux

16 Feb, 2024 35 commits

arm64: kasan: Reduce minimum shadow alignment and enable 5 level paging · 0383808e

Ard Biesheuvel authored Feb 14, 2024

Allow the KASAN init code to deal with 5 levels of paging, and relax the
requirement that the shadow region is aligned to the top level pgd_t
size. This is necessary for LPA2 based 52-bit virtual addressing, where
the KASAN shadow will never be aligned to the pgd_t size. Allowing this
also enables the 16k/48-bit case for KASAN, which is a nice bonus.

This involves some hackery to manipulate the root and next level page
tables without having to distinguish all the various configurations,
including 16k/48-bits (which has a two entry pgd_t level), and LPA2
configurations running with one translation level less on non-LPA2
hardware.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-80-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

0383808e

arm64: mm: Add 5 level paging support to fixmap and swapper handling · 6ed8a3a0

Ard Biesheuvel authored Feb 14, 2024

Add support for using 5 levels of paging in the fixmap, as well as in
the kernel page table handling code which uses fixmaps internally.
This also handles the case where a 5 level build runs on hardware that
only supports 4 levels of paging.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-79-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

6ed8a3a0

arm64: Enable LPA2 at boot if supported by the system · 9684ec18

Ard Biesheuvel authored Feb 14, 2024

Update the early kernel mapping code to take 52-bit virtual addressing
into account based on the LPA2 feature. This is a bit more involved than
LVA (which is supported with 64k pages only), given that some page table
descriptor bits change meaning in this case.

To keep the handling in asm to a minimum, the initial ID map is still
created with 48-bit virtual addressing, which implies that the kernel
image must be loaded into 48-bit addressable physical memory. This is
currently required by the boot protocol, even though we happen to
support placement outside of that for LVA/64k based configurations.

Enabling LPA2 involves more than setting TCR.T1SZ to a lower value,
there is also a DS bit in TCR that needs to be set, and which changes
the meaning of bits [9:8] in all page table descriptors. Since we cannot
enable DS and every live page table descriptor at the same time, let's
pivot through another temporary mapping. This avoids the need to
reintroduce manipulations of the page tables with the MMU and caches
disabled.

To permit the LPA2 feature to be overridden on the kernel command line,
which may be necessary to work around silicon errata, or to deal with
mismatched features on heterogeneous SoC designs, test for CPU feature
overrides first, and only then enable LPA2.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-78-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

9684ec18

arm64: mm: add LPA2 and 5 level paging support to G-to-nG conversion · 2b6c8f96

Ard Biesheuvel authored Feb 14, 2024

Add support for 5 level paging in the G-to-nG routine that creates its
own temporary page tables to traverse the swapper page tables. Also add
support for running the 5 level configuration with the top level folded
at runtime, to support CPUs that do not implement the LPA2 extension.

While at it, wire up the level skipping logic so it will also trigger on
4 level configurations with LPA2 enabled at build time but not active at
runtime, as we'll fall back to 3 level paging in that case.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-77-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

2b6c8f96

arm64: mm: Add definitions to support 5 levels of paging · a6bbf5d4

Ard Biesheuvel authored Feb 14, 2024

Add the required types and descriptor accessors to support 5 levels of
paging in the common code. This is one of the prerequisites for
supporting 52-bit virtual addressing with 4k pages.

Note that this does not cover the code that handles kernel mappings or
the fixmap.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-76-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a6bbf5d4

arm64: mm: Add LPA2 support to phys<->pte conversion routines · 925a0eb4

Ard Biesheuvel authored Feb 14, 2024

In preparation for enabling LPA2 support, introduce the mask values for
converting between physical addresses and their representations in a
page table descriptor.

While at it, move the pte_to_phys asm macro into its only user, so that
we can freely modify it to use its input value register as a temp
register.

For LPA2, the PTE_ADDR_MASK contains two non-adjacent sequences of zero
bits, which means it no longer fits into the immediate field of an
ordinary ALU instruction. So let's redefine it to include the bits in
between as well, and only use it when converting from physical address
to PTE representation, where the distinction does not matter. Also
update the name accordingly to emphasize this.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-75-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

925a0eb4

arm64: mm: Wire up TCR.DS bit to PTE shareability fields · db95ea78

Ard Biesheuvel authored Feb 14, 2024

When LPA2 is enabled, bits 8 and 9 of page and block descriptors become
part of the output address instead of carrying shareability attributes
for the region in question.

So avoid setting these bits if TCR.DS == 1, which means LPA2 is enabled.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-74-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

db95ea78

arm64: Add ESR decoding for exceptions involving translation level -1 · 7ac8d5b2

Ard Biesheuvel authored Feb 14, 2024

The LPA2 feature introduces new FSC values to report abort exceptions
related to translation level -1. Define these and wire them up.

Reuse the new ESR FSC classification helpers that arrived via the KVM
arm64 tree, and update the one for translation faults to check
specifically for a translation fault at level -1. (Access flag or
permission faults cannot occur at level -1 because they alway involve a
descriptor at the superior level so changing those helpers is not
needed).
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-73-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

7ac8d5b2

arm64: Avoid #define'ing PTE_MAYBE_NG to 0x0 for asm use · 60d043c1

Ard Biesheuvel authored Feb 14, 2024

The PROT_* macros resolve to expressions that are only valid in C and
not in assembler, and so they are only usable from C code. Currently, we
make an exception for the permission indirection init code in proc.S,
which doesn't care about the bits that are conditionally set, and so we
just #define PTE_MAYBE_NG to 0x0 for any assembler file that includes
these definitions.

This is dodgy because this means that PROT_NORMAL and friends is
generally available in asm code, but defined in a way that deviates from
the definition that C code will observe, which might lead to hard to
diagnose issues down the road.

So instead, #define PTE_MAYBE_NG only in the place where the PIE
constants are evaluated, and #undef it again right after. This allows us
to drop the #define from pgtable-prot.h, and avoid the risk of deviating
definitions between asm and C.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-72-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

60d043c1

arm64: mm: Add feature override support for LVA · 68aec33f

Ard Biesheuvel authored Feb 14, 2024

Add support for overriding the VARange field of the MMFR2 CPU ID
register. This permits the associated LVA feature to be overridden early
enough for the boot code that creates the kernel mapping to take it into
account.

Given that LPA2 implies LVA, disabling the latter should disable the
former as well. So override the ID_AA64MMFR0.TGran field of the current
page size as well if it advertises support for 52-bit addressing.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-71-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

68aec33f

arm64: mm: Handle LVA support as a CPU feature · 9cce9c6c

Ard Biesheuvel authored Feb 14, 2024

Currently, we detect CPU support for 52-bit virtual addressing (LVA)
extremely early, before creating the kernel page tables or enabling the
MMU. We cannot override the feature this early, and so large virtual
addressing is always enabled on CPUs that implement support for it if
the software support for it was enabled at build time. It also means we
rely on non-trivial code in asm to deal with this feature.

Given that both the ID map and the TTBR1 mapping of the kernel image are
guaranteed to be 48-bit addressable, it is not actually necessary to
enable support this early, and instead, we can model it as a CPU
feature. That way, we can rely on code patching to get the correct
TCR.T1SZ values programmed on secondary boot and resume from suspend.

On the primary boot path, we simply enable the MMU with 48-bit virtual
addressing initially, and update TCR.T1SZ if LVA is supported from C
code, right before creating the kernel mapping. Given that TTBR1 still
points to reserved_pg_dir at this point, updating TCR.T1SZ should be
safe without the need for explicit TLB maintenance.

Since this gets rid of all accesses to the vabits_actual variable from
asm code that occurred before TCR.T1SZ had been programmed, we no longer
have a need for this variable, and we can replace it with a C expression
that produces the correct value directly, based on the value of TCR.T1SZ.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-70-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

9cce9c6c

arm64: Revert "mm: provide idmap pointer to cpu_replace_ttbr1()" · e0f92f0d

Ard Biesheuvel authored Feb 14, 2024

This reverts commit 1682c45b, which is no longer needed now that
we create the permanent kernel mapping directly during early boot.

This is a RINO (revert in name only) given that some of the code has
moved around, but the changes are straight-forward.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-69-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

e0f92f0d

arm64: mm: omit redundant remap of kernel image · ba5b0333

Ard Biesheuvel authored Feb 14, 2024

Now that the early kernel mapping is created with all the right
attributes and segment boundaries, there is no longer a need to recreate
it and switch to it. This also means we no longer have to copy the kasan
shadow or some parts of the fixmap from one set of page tables to the
other.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-68-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

ba5b0333

arm64: mm: avoid fixmap for early swapper_pg_dir updates · 567a70c1

Ard Biesheuvel authored Feb 14, 2024

Early in the boot, when .rodata is still writable, we can poke
swapper_pg_dir entries directly, and there is no need to go through the
fixmap. After a future patch, we will enter the kernel with
swapper_pg_dir already active, and early swapper_pg_dir updates for
creating the fixmap page table hierarchy itself cannot go through the
fixmap for obvious reaons. So let's keep track of whether rodata is
writable, and update the descriptor directly in that case.

As the same reasoning applies to early KASAN init, make the function
noinstr as well.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-67-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

567a70c1

arm64: kernel: Create initial ID map from C code · 84b04d3e

Ard Biesheuvel authored Feb 14, 2024

The asm code that creates the initial ID map is rather intricate and
hard to follow. This is problematic because it makes adding support for
things like LPA2 or WXN more difficult than necessary. Also, it is
parameterized like the rest of the MM code to run with a configurable
number of levels, which is rather pointless, given that all AArch64 CPUs
implement support for 48-bit virtual addressing, and that many systems
exist with DRAM located outside of the 39-bit addressable range, which
is the only smaller VA size that is widely used, and we need additional
tricks to make things work in that combination.

So let's bite the bullet, and rip out all the asm macros, and fiddly
code, and replace it with a C implementation based on the newly added
routines for creating the early kernel VA mappings. And while at it,
create the initial ID map based on 48-bit virtual addressing as well,
regardless of the number of configured levels for the kernel proper.

Note that this code may execute with the MMU and caches disabled, and is
therefore not permitted to make unaligned accesses. This shouldn't
generally happen in any case for the algorithm as implemented, but to be
sure, let's pass -mstrict-align to the compiler just in case.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-66-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

84b04d3e

arm64: pgtable: Decouple PGDIR size macros from PGD/PUD/PMD levels · 34b98e55

Ard Biesheuvel authored Feb 14, 2024

The mapping from PGD/PUD/PMD to levels and shifts is very confusing,
given that, due to folding, the shifts may be equal for different
levels, if the macros are even #define'd to begin with.

In a subsequent patch, we will modify the ID mapping code to decouple
the number of levels from the kernel's view of how these types are
folded, so prepare for this by reformulating the macros without the use
of these types.

Instead, use SWAPPER_BLOCK_SHIFT as the base quantity, and derive it
from either PAGE_SHIFT or PMD_SHIFT, which -if defined at all- are
defined unambiguously for a given page size, regardless of the number of
configured levels.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-65-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

34b98e55

arm64: mm: Use 48-bit virtual addressing for the permanent ID map · e6128a8e

Ard Biesheuvel authored Feb 14, 2024

Even though we support loading kernels anywhere in 48-bit addressable
physical memory, we create the ID maps based on the number of levels
that we happened to configure for the kernel VA and user VA spaces.

The reason for this is that the PGD/PUD/PMD based classification of
translation levels, along with the associated folding when the number of
levels is less than 5, does not permit creating a page table hierarchy
of a set number of levels. This means that, for instance, on 39-bit VA
kernels we need to configure an additional level above PGD level on the
fly, and 36-bit VA kernels still only support 47-bit virtual addressing
with this trick applied.

Now that we have a separate helper to populate page table hierarchies
that does not define the levels in terms of PUDS/PMDS/etc at all, let's
reuse it to create the permanent ID map with a fixed VA size of 48 bits.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-64-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

e6128a8e

arm64: head: Move early kernel mapping routines into C code · 97a6f43b

Ard Biesheuvel authored Feb 14, 2024

The asm version of the kernel mapping code works fine for creating a
coarse grained identity map, but for mapping the kernel down to its
exact boundaries with the right attributes, it is not suitable. This is
why we create a preliminary RWX kernel mapping first, and then rebuild
it from scratch later on.

So let's reimplement this in C, in a way that will make it unnecessary
to create the kernel page tables yet another time in paging_init().
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-63-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

97a6f43b

arm64: mmu: Make __cpu_replace_ttbr1() out of line · 82ca151d

Ard Biesheuvel authored Feb 14, 2024

__cpu_replace_ttbr1() is a static inline, and so it gets instantiated
wherever it is used. This is not really necessary, as it is never called
on a hot path. It also has the unfortunate side effect that the symbol
idmap_cpu_replace_ttbr1 may never be referenced from kCFI enabled C
code, and this means the type id symbol may not exist either.  This will
result in a build error once we start referring to this symbol from asm
code as well. (Note that this problem only occurs when CnP, KAsan and
suspend/resume are all disabled in the Kconfig but that is a valid
config, if unusual).

So let's just move it out of line so all callers will share the same
implementation, which will reference idmap_cpu_replace_ttbr1
unconditionally.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-62-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

82ca151d

arm64: mm: Make kaslr_requires_kpti() a static inline · 293d865f

Ard Biesheuvel authored Feb 14, 2024

In preparation for moving the first assignment of arm64_use_ng_mappings
to an earlier stage in the boot, ensure that kaslr_requires_kpti() is
accessible without relying on the core kernel's view on whether or not
KASLR is enabled. So make it a static inline, and move the
kaslr_enabled() check out of it and into the callers, one of which will
disappear in a subsequent patch.

Once/when support for the obsolete ThunderX 1 platform is dropped, this
check reduces to a E0PD feature check on the local CPU.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-61-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

293d865f

arm64: head: move memstart_offset_seed handling to C code · aa6a52b2

Ard Biesheuvel authored Feb 14, 2024

Now that we can set BSS variables from the early code running from the
ID map, we can set memstart_offset_seed directly from the C code that
derives the value instead of passing it back and forth between C and asm
code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-60-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

aa6a52b2

arm64: head: allocate more pages for the kernel mapping · 8d47b8e5

Ard Biesheuvel authored Feb 14, 2024

In preparation for switching to an early kernel mapping routine that
maps each segment according to its precise boundaries, and with the
correct attributes, let's allocate some extra pages for page tables for
the 4k page size configuration. This is necessary because the start and
end of each segment may not be aligned to the block size, and so we'll
need an extra page table at each segment boundary.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-59-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

8d47b8e5

arm64: Add helpers to probe local CPU for PAC and BTI support · a669c6a4

Ard Biesheuvel authored Feb 14, 2024

Add some helpers that will be used by the early kernel mapping code to
check feature support on the local CPU. This permits the early kernel
mapping to be created with the right attributes, removing the need for
tearing it down and recreating it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-58-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a669c6a4

arm64: idreg-override: Create a pseudo feature for rodata=off · 9ddd9baa

Ard Biesheuvel authored Feb 14, 2024

Add rodata=off to the set of kernel command line options that is parsed
early using the CPU feature override detection code, so we can easily
refer to it when creating the kernel mapping.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-57-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

9ddd9baa

arm64: kaslr: Use feature override instead of parsing the cmdline again · af73b9a2

Ard Biesheuvel authored Feb 14, 2024

The early kaslr code open codes the detection of 'nokaslr' on the kernel
command line, and this is no longer necessary now that the feature
detection code, which also looks for the same string, executes before
this code.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-56-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

af73b9a2

arm64: cpufeature: Add helper to test for CPU feature overrides · 35876f35

Ard Biesheuvel authored Feb 14, 2024

Add some helpers to extract and apply feature overrides to the bare
idreg values. This involves inspecting the value and mask of the
specific field that we are interested in, given that an override
value/mask pair might be invalid for one field but valid for another.

Then, wire up the new helper for the hVHE test - note that we can drop
the sysreg test here, as the override will be invalid when trying to
enable hVHE on non-VHE capable hardware.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-55-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

35876f35

arm64: head: move dynamic shadow call stack patching into early C runtime · 8a6e40e1

Ard Biesheuvel authored Feb 14, 2024

Once we update the early kernel mapping code to only map the kernel once
with the right permissions, we can no longer perform code patching via
this mapping.

So move this code to an earlier stage of the boot, right after applying
the relocations.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-54-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

8a6e40e1

arm64: head: Run feature override detection before mapping the kernel · dcfe969a

Ard Biesheuvel authored Feb 14, 2024

To permit the feature overrides to be taken into account before the
KASLR init code runs and the kernel mapping is created, move the
detection code to an earlier stage in the boot.

In a subsequent patch, this will be taken advantage of by merging the
preliminary and permanent mappings of the kernel text and data into a
single one that gets created and relocated before start_kernel() is
called.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-53-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

dcfe969a

arm64: Move feature overrides into the BSS section · 30687dec

Ard Biesheuvel authored Feb 14, 2024

In order to allow the CPU feature override detection code to run even
earlier, move the feature override global variables into BSS, which is
the only part of the static kernel image that is mapped read-write in
the initial ID map.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-52-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

30687dec

arm64: head: Clear BSS and the kernel page tables in one go · aa99aad7

Ard Biesheuvel authored Feb 14, 2024

We will move the CPU feature overrides into BSS in a subsequent patch,
and this requires that BSS is zeroed before the feature override
detection code runs. So let's map BSS read-write in the ID map, and zero
it via this mapping.

Since the kernel page tables are right next to it, and also zeroed via
the ID map, let's drop the separate clear_page_tables() function, and
just zero everything in one go.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-51-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

aa99aad7

arm64: kernel: Remove early fdt remap code · 9c4cd2a7

Ard Biesheuvel authored Feb 14, 2024

The early FDT remap code is no longer used so let's drop it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-50-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

9c4cd2a7

arm64: idreg-override: Move to early mini C runtime · e223a449

Ard Biesheuvel authored Feb 14, 2024

We will want to parse the ID register overrides even earlier, so that we
can take them into account before creating the kernel mapping. So
migrate the code and make it work in the context of the early C runtime.
We will move the invocation to an earlier stage in a subsequent patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-49-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

e223a449

arm64: head: move relocation handling to C code · 734958ef

Ard Biesheuvel authored Feb 14, 2024

Now that we have a mini C runtime before the kernel mapping is up, we
can move the non-trivial relocation processing code out of head.S and
reimplement it in C.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-48-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

734958ef

arm64: kernel: Don't rely on objcopy to make code under pi/ __init · a86aa72e

Ard Biesheuvel authored Feb 14, 2024

We will add some code under pi/ that contains global variables that
should not end up in __initdata, as they will not be writable via the
initial ID map. So only rely on objcopy for making the libfdt code
__init, and use explicit annotations for the rest.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-47-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a86aa72e

arm64: kernel: Manage absolute relocations in code built under pi/ · 48157aa3

Ard Biesheuvel authored Feb 14, 2024

The mini C runtime runs before relocations are processed, and so it
cannot rely on statically initialized pointer variables.

Add a check to ensure that such code does not get introduced by
accident, by going over the relocations in each object, identifying the
ones that operate on data sections that are part of the executable
image, and raising an error if any relocations of type R_AARCH64_ABS64
exist. Note that such relocations are permitted in other places (e.g.,
debug sections) and will never occur in compiler generated code sections
when using the small code model, so only check sections that have
SHF_ALLOC set and SHF_EXECINSTR cleared.

To accommodate cases where statically initialized symbol references are
unavoidable, introduce a special case for ELF input data sections that
have ".rodata.prel64" in their names, and in these cases, instead of
rejecting any encountered ABS64 relocations, convert them into PREL64
relocations, which don't require any runtime fixups. Note that the code
in question must still be modified to deal with this, as it needs to
convert the 64-bit signed offsets into absolute addresses before use.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-46-ardb+git@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

48157aa3

09 Feb, 2024 5 commits

arm64: kaslr: Adjust randomization range dynamically · 3567fa63

Ard Biesheuvel authored Dec 13, 2023

Currently, we base the KASLR randomization range on a rough estimate of
the available space in the upper VA region: the lower 1/4th has the
module region and the upper 1/4th has the fixmap, vmemmap and PCI I/O
ranges, and so we pick a random location in the remaining space in the
middle.

Once we enable support for 5-level paging with 4k pages, this no longer
works: the vmemmap region, being dimensioned to cover a 52-bit linear
region, takes up so much space in the upper VA region (the size of which
is based on a 48-bit VA space for compatibility with non-LVA hardware)
that the region above the vmalloc region takes up more than a quarter of
the available space.

So instead of a heuristic, let's derive the randomization range from the
actual boundaries of the vmalloc region.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231213084024.2367360-16-ardb@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>

3567fa63

arm64: mm: Reclaim unused vmemmap region for vmalloc use · d432b8d5

Ard Biesheuvel authored Dec 13, 2023

The vmemmap array is statically sized based on the maximum supported
size of the virtual address space, but it is located inside the upper VA
region, which is statically sized based on the *minimum* supported size
of the VA space. This doesn't matter much when using 64k pages, which is
the only configuration that currently supports 52-bit virtual
addressing.

However, upcoming LPA2 support will change this picture somewhat, as in
that case, the vmemmap array will take up more than 25% of the upper VA
region when using 4k pages. Given that most of this space is never used
when running on a system that does not support 52-bit virtual
addressing, let's reclaim the unused vmemmap area in that case.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231213084024.2367360-15-ardb@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>

d432b8d5

arm64: vmemmap: Avoid base2 order of struct page size to dimension region · 32697ff3

Ard Biesheuvel authored Dec 13, 2023

The placement and size of the vmemmap region in the kernel virtual
address space is currently derived from the base2 order of the size of a
struct page. This makes for nicely aligned constants with lots of
leading 0xf and trailing 0x0 digits, but given that the actual struct
pages are indexed as an ordinary array, this resulting region is
severely overdimensioned when the size of a struct page is just over a
power of 2.

This doesn't matter today, but once we enable 52-bit virtual addressing
for 4k pages configurations, the vmemmap region may take up almost half
of the upper VA region with the current struct page upper bound at 64
bytes. And once we enable KMSAN or other features that push the size of
a struct page over 64 bytes, we will run out of VMALLOC space entirely.

So instead, let's derive the region size from the actual size of a
struct page, and place the entire region 1 GB from the top of the VA
space, where it still doesn't share any lower level translation table
entries with the fixmap.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231213084024.2367360-14-ardb@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>

32697ff3

arm64: ptdump: Discover start of vmemmap region at runtime · f9cca244

Ard Biesheuvel authored Dec 13, 2023

We will soon reclaim the part of the vmemmap region that covers VA space
that is not addressable by the hardware. To avoid confusion, ensure that
the 'vmemmap start' marker points at the start of the region that is
actually being used for the struct page array, rather than the start of
the region we set aside for it at build time.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231213084024.2367360-13-ardb@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>

f9cca244

arm64: ptdump: Allow all region boundaries to be defined at boot time · 34f879fb

Ard Biesheuvel authored Dec 13, 2023

Rework the way the address_markers array is populated so that we can
tolerate values that are not compile time constants generally, rather
than keeping track manually of the array indexes in question, and poking
new values into them manually. This will be needed for VMALLOC_END,
which will cease to be a compile time constant after a subsequent patch.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20231213084024.2367360-12-ardb@google.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>

34f879fb