Commit f530ee95 authored by Kirill A. Shutemov, committed by Ingo Molnar

x86/boot/compressed: Reserve more memory for page tables

The decompressor has a hard limit on the number of page tables it can
allocate. This limit is defined at compile-time and will cause boot
failure if it is reached.

The kernel is very strict and calculates the limit precisely for the
worst-case scenario based on the current configuration. However, it is
easy to forget to adjust the limit when a new use-case arises. The
worst-case scenario is rarely encountered during sanity checks.

In the case of enabling 5-level paging, a use-case was overlooked. The
limit needs to be increased by one to accommodate the additional level.
This oversight went unnoticed until Aaron attempted to run the kernel
via kexec with 5-level paging and unaccepted memory enabled.

Update worst-case calculations to include 5-level paging.

To address this issue, let's allocate some extra space for page tables.
128K should be sufficient for any use-case. The logic can be simplified
by using a single value for all kernel configurations.

[ Also add a warning, should this memory run low - by Dave Hansen. ]

Fixes: 34bbb000 ("x86/boot/compressed: Enable 5-level paging during decompression stage")
Reported-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel.com
parent 7575e5a3
@@ -59,6 +59,14 @@ static void *alloc_pgt_page(void *context)
 		return NULL;
 	}
 
+	/* Consumed more tables than expected? */
+	if (pages->pgt_buf_offset == BOOT_PGT_SIZE_WARN) {
+		debug_putstr("pgt_buf running low in " __FILE__ "\n");
+		debug_putstr("Need to raise BOOT_PGT_SIZE?\n");
+		debug_putaddr(pages->pgt_buf_offset);
+		debug_putaddr(pages->pgt_buf_size);
+	}
+
 	entry = pages->pgt_buf + pages->pgt_buf_offset;
 	pages->pgt_buf_offset += PAGE_SIZE;
 
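The check added above can be read as a bump allocator with a one-shot low-water warning. The following is a minimal user-space sketch of that idea, not the decompressor code itself: `struct pgt_pool`, `pool_alloc_page()` and the `warned` flag are hypothetical names introduced for illustration, and `fprintf()` stands in for the decompressor's `debug_putstr()`.

```c
#include <stddef.h>
#include <stdio.h>

#define PAGE_SIZE          4096UL
#define BOOT_PGT_SIZE_WARN (28 * PAGE_SIZE)   /* accounted worst case */
#define BOOT_PGT_SIZE      (32 * PAGE_SIZE)   /* worst case plus 4 spare tables */

/* Hypothetical stand-in for the decompressor's page-table buffer bookkeeping. */
struct pgt_pool {
	unsigned char *buf;    /* backing storage for page tables */
	unsigned long size;    /* total bytes available */
	unsigned long offset;  /* bytes handed out so far */
	int warned;            /* set once the warning threshold is crossed */
};

/* Hand out one page from the pool, or NULL once the pool is exhausted. */
static void *pool_alloc_page(struct pgt_pool *pool)
{
	void *entry;

	if (pool->offset >= pool->size)
		return NULL;

	/* Fires exactly once: when the first spare table is about to be used. */
	if (pool->offset == BOOT_PGT_SIZE_WARN) {
		fprintf(stderr, "pgt_buf running low, raise BOOT_PGT_SIZE?\n");
		pool->warned = 1;
	}

	entry = pool->buf + pool->offset;
	pool->offset += PAGE_SIZE;
	return entry;
}
```

Because the offset only ever grows in `PAGE_SIZE` steps, an equality test against the threshold is enough to report the condition once rather than on every later allocation.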
@@ -40,23 +40,40 @@
 #ifdef CONFIG_X86_64
 # define BOOT_STACK_SIZE	0x4000
 
+/*
+ * Used by decompressor's startup_32() to allocate page tables for identity
+ * mapping of the 4G of RAM in 4-level paging mode:
+ * - 1 level4 table;
+ * - 1 level3 table;
+ * - 4 level2 table that maps everything with 2M pages;
+ *
+ * The additional level5 table needed for 5-level paging is allocated from
+ * trampoline_32bit memory.
+ */
 # define BOOT_INIT_PGT_SIZE	(6*4096)
-# ifdef CONFIG_RANDOMIZE_BASE
+
 /*
- * Assuming all cross the 512GB boundary:
- * 1 page for level4
- * (2+2)*4 pages for kernel, param, cmd_line, and randomized kernel
- * 2 pages for first 2M (video RAM: CONFIG_X86_VERBOSE_BOOTUP).
- * Total is 19 pages.
+ * Total number of page tables kernel_add_identity_map() can allocate,
+ * including page tables consumed by startup_32().
+ *
+ * Worst-case scenario:
+ * - 5-level paging needs 1 level5 table;
+ * - KASLR needs to map kernel, boot_params, cmdline and randomized kernel,
+ *   assuming all of them cross 256T boundary:
+ *   + 4*2 level4 table;
+ *   + 4*2 level3 table;
+ *   + 4*2 level2 table;
+ * - X86_VERBOSE_BOOTUP needs to map the first 2M (video RAM):
+ *   + 1 level4 table;
+ *   + 1 level3 table;
+ *   + 1 level2 table;
+ * Total: 28 tables
+ *
+ * Add 4 spare table in case decompressor touches anything beyond what is
+ * accounted above. Warn if it happens.
  */
-#  ifdef CONFIG_X86_VERBOSE_BOOTUP
-#   define BOOT_PGT_SIZE	(19*4096)
-#  else /* !CONFIG_X86_VERBOSE_BOOTUP */
-#   define BOOT_PGT_SIZE	(17*4096)
-#  endif
-# else /* !CONFIG_RANDOMIZE_BASE */
-#  define BOOT_PGT_SIZE		BOOT_INIT_PGT_SIZE
-# endif
+# define BOOT_PGT_SIZE_WARN	(28*4096)
+# define BOOT_PGT_SIZE		(32*4096)
 
 #else /* !CONFIG_X86_64 */
 # define BOOT_STACK_SIZE	0x1000
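The worst-case accounting in the new comment can be checked mechanically. The sketch below recomputes the 28-table total and the 128K buffer size from the per-item counts listed in the comment. The enum names and `boot_pgt_size_bytes()` are hypothetical, chosen only for this illustration; they do not appear in the kernel source.

```c
#include <stddef.h>

/* Hypothetical names; the counts are taken from the comment in the diff. */
enum {
	PAGE_BYTES        = 4096,
	LEVEL5_TABLES     = 1,  /* 5-level paging */
	KASLR_REGIONS     = 4,  /* kernel, boot_params, cmdline, randomized kernel */
	TABLES_PER_REGION = 2,  /* each region assumed to cross a boundary */
	KASLR_LEVELS      = 3,  /* level4, level3, level2 */
	VERBOSE_TABLES    = 3,  /* first 2M: one table at each of 3 levels */
	SPARE_TABLES      = 4,

	WORST_CASE_TABLES = LEVEL5_TABLES
			  + KASLR_REGIONS * TABLES_PER_REGION * KASLR_LEVELS
			  + VERBOSE_TABLES,
	TOTAL_TABLES      = WORST_CASE_TABLES + SPARE_TABLES,
};

/* Compile-time checks: 1 + 4*2*3 + 3 = 28 tables, and 32 tables = 128K. */
_Static_assert(WORST_CASE_TABLES == 28, "worst case should be 28 tables");
_Static_assert(TOTAL_TABLES * PAGE_BYTES == 128 * 1024, "buffer should be 128K");

/* Size in bytes of the page-table buffer under these assumptions. */
static unsigned long boot_pgt_size_bytes(void)
{
	return (unsigned long)TOTAL_TABLES * PAGE_BYTES;
}
```

This matches the two macros in the diff: `BOOT_PGT_SIZE_WARN` covers the 28 accounted tables, and `BOOT_PGT_SIZE` adds the 4 spare ones to reach the 128K mentioned in the commit message.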