- 07 Sep, 2020 38 commits
-
-
Joerg Roedel authored
Move the assembly coded dispatch between page-faults and all other exceptions to C code to make it easier to maintain and extend. Also change the return-type of early_make_pgtable() to bool and make it static. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-36-joro@8bytes.org
-
Joerg Roedel authored
Move these two functions from kernel/idt.c to include/asm/desc.h: * init_idt_data() * idt_init_desc() These functions are needed to setup IDT entries very early and need to be called from head64.c. To be usable this early, these functions need to be compiled without instrumentation and the stack-protector feature. These features need to be kept enabled for kernel/idt.c, so head64.c must use its own versions. [ bp: Take Kees' suggested patch title and add his Rev-by. ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-35-joro@8bytes.org
-
Joerg Roedel authored
Add a separate bringup IDT for the CPU bringup code that will be used until the kernel switches to the idt_table. There are two reasons for a separate IDT: 1) When the idt_table is set up and the secondary CPUs are booted, it contains entries (e.g. IST entries) which require certain CPU state to be set up. This includes a working TSS (for IST), MSR_GS_BASE (for stack protector) or CR4.FSGSBASE (for paranoid_entry) path. By using a dedicated IDT for early boot this state need not to be set up early. 2) The idt_table is static to idt.c, so any function using/modifying must be in idt.c too. That means that all compiler driven instrumentation like tracing or KASAN is also active in this code. But during early CPU bringup the environment is not set up for this instrumentation to work correctly. To avoid all of these hassles and make early exception handling robust, use a dedicated bringup IDT. The IDT is loaded two times, first on the boot CPU while the kernel is still running on direct mapped addresses, and again later after the switch to kernel addresses has happened. The second IDT load happens on the boot and secondary CPUs. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-34-joro@8bytes.org
-
Joerg Roedel authored
Make sure there is a stack once the kernel runs from virtual addresses. At this stage any secondary CPU which boots will have lost its stack because the kernel switched to a new page-table which does not map the real-mode stack anymore. This is needed for handling early #VC exceptions caused by instructions like CPUID. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-33-joro@8bytes.org
-
Joerg Roedel authored
Make sure segments are properly set up before setting up an IDT and doing anything that might cause a #VC exception. This is later needed for early exception handling. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-32-joro@8bytes.org
-
Joerg Roedel authored
Load the GDT right after switching to virtual addresses to make sure there is a defined GDT for exception handling. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-31-joro@8bytes.org
-
Joerg Roedel authored
Handling exceptions during boot requires a working GDT. The kernel GDT can't be used on the direct mapping, so load a startup GDT and setup segments. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-30-joro@8bytes.org
-
Joerg Roedel authored
The code to setup idt_data is needed for early exception handling, but set_intr_gate() can't be used that early because it has pv-ops in its code path which don't work that early. Split out the idt_data initialization part from set_intr_gate() so that it can be used separately. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-29-joro@8bytes.org
-
Tom Lendacky authored
Handle #VC exceptions caused by CPUID instructions. These happen in early boot code when the KASLR code checks for RDTSC. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> [ jroedel@suse.de: Adapt to #VC handling framework ] Co-developed-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-28-joro@8bytes.org
-
Joerg Roedel authored
The xgetbv() function is needed in the pre-decompression boot code, but asm/fpu/internal.h can't be included there directly. Doing so opens the door to include-hell due to various include-magic in boot/compressed/misc.h. Avoid that by moving xgetbv()/xsetbv() to a separate header file and include it instead. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-27-joro@8bytes.org
-
Tom Lendacky authored
Add support for decoding and handling #VC exceptions for IOIO events. [ jroedel@suse.de: Adapted code to #VC handling framework ] Co-developed-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-26-joro@8bytes.org
-
Joerg Roedel authored
Force a page-fault on any further accesses to the GHCB page when they shouldn't happen anymore. This will catch any bugs where a #VC exception is raised even though none is expected anymore. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-25-joro@8bytes.org
-
Joerg Roedel authored
Install an exception handler for #VC exception that uses a GHCB. Also add the infrastructure for handling different exit-codes by decoding the instruction that caused the exception and error handling. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-24-joro@8bytes.org
-
Joerg Roedel authored
The functions are needed to map the GHCB for SEV-ES guests. The GHCB is used for communication with the hypervisor, so its content must not be encrypted. After the GHCB is not needed anymore it must be mapped encrypted again so that the running kernel image can safely re-use the memory. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-23-joro@8bytes.org
-
Joerg Roedel authored
The function can fail to create an identity mapping, check for that and bail out if it happens. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-22-joro@8bytes.org
-
Joerg Roedel authored
Call set_sev_encryption_mask() while still on the stage 1 #VC-handler because the stage 2 handler needs the kernel's own page tables to be set up, to which calling set_sev_encryption_mask() is a prerequisite. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-21-joro@8bytes.org
-
Joerg Roedel authored
Add the first handler for #VC exceptions. At stage 1 there is no GHCB yet because the kernel might still be running on the EFI page table. The stage 1 handler is limited to the MSR-based protocol to talk to the hypervisor and can only support CPUID exit-codes, but that is enough to get to stage 2. [ bp: Zap superfluous newlines after rd/wrmsr instruction mnemonics. ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-20-joro@8bytes.org
-
Joerg Roedel authored
Changing the function to take start and end as parameters instead of start and size simplifies the callers which don't need to calculate the size if they already have start and end. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-19-joro@8bytes.org
-
Joerg Roedel authored
With the page-fault handler in place, he identity mapping can be built on-demand. So remove the code which manually creates the mappings and unexport/remove the functions used for it. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-18-joro@8bytes.org
-
Joerg Roedel authored
When booted through startup_64(), the kernel keeps running on the EFI page table until the KASLR code sets up its own page table. Without KASLR, the pre-decompression boot code never switches off the EFI page table. Change that by unconditionally switching to a kernel-controlled page table after relocation. This makes sure the kernel can make changes to the mapping when necessary, for example map pages unencrypted in SEV and SEV-ES guests. Also, remove the debug_putstr() calls in initialize_identity_maps() because the function now runs before console_init() is called. [ bp: Massage commit message. ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-17-joro@8bytes.org
-
Joerg Roedel authored
Install a page-fault handler to add an identity mapping to addresses not yet mapped. Also do some checking whether the error code is sane. This makes non SEV-ES machines use the exception handling infrastructure in the pre-decompressions boot code too, making it less likely to break in the future. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-16-joro@8bytes.org
-
Joerg Roedel authored
The file contains only code related to identity-mapped page tables. Rename the file and compile it always in. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-15-joro@8bytes.org
-
Joerg Roedel authored
Add code needed to setup an IDT in the early pre-decompression boot-code. The IDT is loaded first in startup_64, which is after EfiExitBootServices() has been called, and later reloaded when the kernel image has been relocated to the end of the decompression area. This allows to setup different IDT handlers before and after the relocation. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-14-joro@8bytes.org
-
Joerg Roedel authored
The x86-64 ABI defines a red-zone on the stack: The 128-byte area beyond the location pointed to by %rsp is considered to be reserved and shall not be modified by signal or interrupt handlers. Therefore, functions may use this area for temporary data that is not needed across function calls. In particular, leaf functions may use this area for their entire stack frame, rather than adjusting the stack pointer in the prologue and epilogue. This area is known as the red zone. This is not compatible with exception handling, because the IRET frame written by the hardware at the stack pointer and the functions to handle the exception will overwrite the temporary variables of the interrupted function, causing undefined behavior. So disable red-zones for the pre-decompression boot code. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200907131613.12703-13-joro@8bytes.org
-
Joerg Roedel authored
Add a function to check whether an instruction has a REP prefix. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200907131613.12703-12-joro@8bytes.org
-
Joerg Roedel authored
Add a function to the instruction decoder which returns the pt_regs offset of the register specified in the reg field of the modrm byte. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200907131613.12703-11-joro@8bytes.org
-
Joerg Roedel authored
Factor out the code used to decode an instruction with the correct address and operand sizes to a helper function. No functional changes. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-10-joro@8bytes.org
-
Joerg Roedel authored
Factor out the code to fetch the instruction from user-space to a helper function. No functional changes. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-9-joro@8bytes.org
-
Joerg Roedel authored
The inat-tables.c file has some arrays in it that contain pointers to other arrays. These pointers need to be relocated when the kernel image is moved to a different location. The pre-decompression boot-code has no support for applying ELF relocations, so initialize these arrays at runtime in the pre-decompression code to make sure all pointers are correctly initialized. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200907131613.12703-8-joro@8bytes.org
-
Joerg Roedel authored
Move the definition of the x86 page-fault error code bits to a new header file asm/trap_pf.h. This makes it easier to include them into pre-decompression boot code. No functional changes. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-7-joro@8bytes.org
-
Tom Lendacky authored
Add CPU feature detection for Secure Encrypted Virtualization with Encrypted State. This feature enhances SEV by also encrypting the guest register state, making it in-accessible to the hypervisor. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-6-joro@8bytes.org
-
Borislav Petkov authored
Use the shorthand to make it more readable. No functional changes. Signed-off-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-5-joro@8bytes.org
-
Joerg Roedel authored
Building a correct GHCB for the hypervisor requires setting valid bits in the GHCB. Simplify that process by providing accessor functions to set values and to update the valid bitmap and to check the valid bitmap in KVM. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-4-joro@8bytes.org
-
Tom Lendacky authored
Extend the vmcb_safe_area with SEV-ES fields and add a new 'struct ghcb' which will be used for guest-hypervisor communication. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-3-joro@8bytes.org
-
Joerg Roedel authored
Do not allocate a vmcb_control_area and a vmcb_save_area on the stack, as these structures will become larger with future extenstions of SVM and thus the svm_set_nested_state() function will become a too large stack frame. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200907131613.12703-2-joro@8bytes.org
-
Borislav Petkov authored
Pick up work happening in parallel to avoid nasty merge conflicts later. Signed-off-by: Borislav Petkov <bp@suse.de>
-
Borislav Petkov authored
Signed-off-by: Borislav Petkov <bp@suse.de>
-
Linus Torvalds authored
-
- 06 Sep, 2020 2 commits
-
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull more io_uring fixes from Jens Axboe: "Two followup fixes. One is fixing a regression from this merge window, the other is two commits fixing cancelation of deferred requests. Both have gone through full testing, and both spawned a few new regression test additions to liburing. - Don't play games with const, properly store the output iovec and assign it as needed. - Deferred request cancelation fix (Pavel)" * tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block: io_uring: fix linked deferred ->files cancellation io_uring: fix cancel of deferred reqs with ->files io_uring: fix explicit async read/write mapping for large segments
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds authored
Pull iommu fixes from Joerg Roedel: - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL pointer dereference bug and serialize a hardware register access as required by the VT-d spec. - two patches for AMD IOMMU to force AMD GPUs into translation mode when memory encryption is active and disallow using IOMMUv2 functionality. This makes the AMDGPU driver work when memory encryption is active. - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping Table Entries. - MAINTAINERS file update for the Qualcom IOMMU driver. * tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Handle 36bit addressing for x86-32 iommu/amd: Do not use IOMMUv2 functionality when SME is active iommu/amd: Do not force direct mapping when SME is active iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE iommu/amd: Restore IRTE.RemapEn bit after programming IRTE iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set() iommu/vt-d: Serialize IOMMU GCMD register modifications MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
-