An error occurred fetching the project authors.
- 13 Jan, 2023 2 commits
-
-
Sean Christopherson authored
Inhibit APICv/AVIC if the optimized physical map is disabled so that KVM KVM provides consistent APIC behavior if xAPIC IDs are aliased due to vcpu_id being truncated and the x2APIC hotplug hack isn't enabled. If the hotplug hack is disabled, events that are emulated by KVM will follow architectural behavior (all matching vCPUs receive events, even if the "match" is due to truncation), whereas APICv and AVIC will deliver events only to the first matching vCPU, i.e. the vCPU that matches without truncation. Note, the "extra" inhibit is needed because KVM deliberately ignores mismatches due to truncation when applying the APIC_ID_MODIFIED inhibit so that large VMs (>255 vCPUs) can run with APICv/AVIC. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20230106011306.85230-24-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Move the APIC access page allocation helper function to common x86 code, the allocation routine is virtually identical between APICv (VMX) and AVIC (SVM). Keep APICv's gfn_to_page() + put_page() sequence, which verifies that a backing page can be allocated, i.e. that the system isn't under heavy memory pressure. Forcing the backing page to be populated isn't strictly necessary, but skipping the effective prefetch only delays the inevitable. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20230106011306.85230-10-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 27 Dec, 2022 1 commit
-
-
Vitaly Kuznetsov authored
Commit 9bcb9065 ("KVM: VMX: Get rid of eVMCS specific VMX controls sanitization") dropped 'vmcs_conf' sanitization for KVM-on-Hyper-V because there's no known Hyper-V version which would expose a feature unsupported in eVMCS in VMX feature MSRs. This works well for all currently existing Hyper-V version, however, future Hyper-V versions may add features which are supported by KVM and are currently missing in eVMCSv1 definition (e.g. APIC virtualization, PML,...). When this happens, existing KVMs will get broken. With the inverted 'unsupported by eVMCSv1' checks, we can resurrect vmcs_conf sanitization and make KVM future proof. Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20221104144708.435865-5-vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 23 Dec, 2022 1 commit
-
-
Sean Christopherson authored
When stuffing the allowed secondary execution controls for nested VMX in response to CPUID updates, don't set the allowed-1 bit for a feature that isn't supported by KVM, i.e. isn't allowed by the canonical vmcs_config. WARN if KVM attempts to manipulate a feature that isn't supported. All features that are currently stuffed are always advertised to L1 for nested VMX if they are supported in KVM's base configuration, and no additional features should ever be added to the CPUID-induced stuffing (updating VMX MSRs in response to CPUID updates is a long-standing KVM flaw that is slowly being fixed). Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20221213062306.667649-3-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 01 Dec, 2022 4 commits
-
-
Sean Christopherson authored
Move the check on IA32_FEATURE_CONTROL being locked, i.e. read-only from the guest, into the helper to check the overall validity of the incoming value. Opportunistically rename the helper to make it clear that it returns a bool. No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220607232353.3375324-3-seanjc@google.com
-
Sean Christopherson authored
Allow userspace to set all supported bits in MSR IA32_FEATURE_CONTROL irrespective of the guest CPUID model, e.g. via KVM_SET_MSRS. KVM's ABI is that userspace is allowed to set MSRs before CPUID, i.e. can set MSRs to values that would fault according to the guest CPUID model. Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220607232353.3375324-2-seanjc@google.com
-
Jim Mattson authored
According to Intel's document on Indirect Branch Restricted Speculation, "Enabling IBRS does not prevent software from controlling the predicted targets of indirect branches of unrelated software executed later at the same predictor mode (for example, between two different user applications, or two different virtual machines). Such isolation can be ensured through use of the Indirect Branch Predictor Barrier (IBPB) command." This applies to both basic and enhanced IBRS. Since L1 and L2 VMs share hardware predictor modes (guest-user and guest-kernel), hardware IBRS is not sufficient to virtualize IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts, hardware IBRS is actually sufficient in practice, even though it isn't sufficient architecturally.) For virtual CPUs that support IBRS, add an indirect branch prediction barrier on emulated VM-exit, to ensure that the predicted targets of indirect branches executed in L1 cannot be controlled by software that was executed in L2. Since we typically don't intercept guest writes to IA32_SPEC_CTRL, perform the IBPB at emulated VM-exit regardless of the current IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit. This is CVE-2022-2196. Fixes: 5c911bef ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02") Cc: Sean Christopherson <seanjc@google.com> Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Jim Mattson authored
At this point in time, most guests (in the default, out-of-the-box configuration) are likely to use IA32_SPEC_CTRL. Therefore, drop the compiler hint that it is unlikely for KVM to be intercepting WRMSR of IA32_SPEC_CTRL. Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221019213620.1953281-2-jmattson@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
- 18 Nov, 2022 2 commits
-
-
Vitaly Kuznetsov authored
To conform with SVM, rename VMX specific Hyper-V files from "evmcs.{ch}" to "hyperv.{ch}". While Enlightened VMCS is a lion's share of these files, some stuff (e.g. enlightened MSR bitmap, the upcoming Hyper-V L2 TLB flush, ...) goes beyond that. Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-7-vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
To make terminology between Hyper-V-on-KVM and KVM-on-Hyper-V consistent, rename 'enable_direct_tlbflush' to 'enable_l2_tlb_flush'. The change eliminates the use of confusing 'direct' and adds the missing underscore. No functional change. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-6-vkuznets@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 09 Nov, 2022 6 commits
-
-
Maxim Levitsky authored
Use kvm_smram union instad of raw arrays in the common smm code. Signed-off-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221025124741.228045-18-mlevitsk@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Vendor-specific code that deals with SMI injection and saving/restoring SMM state is not needed if CONFIG_KVM_SMM is disabled, so remove the four callbacks smi_allowed, enter_smm, leave_smm and enable_smi_window. The users in svm/nested.c and x86.c also have to be compiled out; the amount of #ifdef'ed code is small and it's not worth moving it to smm.c. enter_smm is now used only within #ifdef CONFIG_KVM_SMM, and the stub can therefore be removed. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220929172016.319443-7-pbonzini@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Some users of KVM implement the UEFI variable store through a paravirtual device that does not require the "SMM lockbox" component of edk2; allow them to compile out system management mode, which is not a full implementation especially in how it interacts with nested virtualization. Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220929172016.319443-6-pbonzini@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Create a new header and source with code related to system management mode emulation. Entry and exit will move there too; for now, opportunistically rename put_smstate to PUT_SMSTATE while moving it to smm.h, and adjust the SMM state saving code. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220929172016.319443-2-pbonzini@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Handle PERF_CAPABILITIES directly in kvm_get_msr_feature() now that the supported value is available in kvm_caps. No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20221006000314.73240-8-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Track KVM's supported PERF_CAPABILITIES in kvm_caps instead of computing the supported capabilities on the fly every time. Using kvm_caps will also allow for future cleanups as the kvm_caps values can be used directly in common x86 code. Signed-off-by:
Sean Christopherson <seanjc@google.com> Acked-by:
Like Xu <likexu@tencent.com> Message-Id: <20221006000314.73240-6-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 02 Nov, 2022 2 commits
-
-
Sean Christopherson authored
Ignore guest CPUID for host userspace writes to the DEBUGCTL MSR, KVM's ABI is that setting CPUID vs. state can be done in any order, i.e. KVM allows userspace to stuff MSRs prior to setting the guest's CPUID that makes the new MSR "legal". Keep the vmx_get_perf_capabilities() check for guest writes, even though it's technically unnecessary since the vCPU's PERF_CAPABILITIES is consulted when refreshing LBR support. A future patch will clean up vmx_get_perf_capabilities() to avoid the RDMSR on every call, at which point the paranoia will incur no meaningful overhead. Note, prior to vmx_get_perf_capabilities() checking that the host fully supports LBRs via x86_perf_get_lbr(), KVM effectively relied on intel_pmu_lbr_is_enabled() to guard against host userspace enabling LBRs on platforms without full support. Fixes: c6462363 ("KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled") Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20221006000314.73240-5-seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Fold vmx_supported_debugctl() into vcpu_supported_debugctl(), its only caller. Setting bits only to clear them a few instructions later is rather silly, and splitting the logic makes things seem more complicated than they actually are. Opportunistically drop DEBUGCTLMSR_LBR_MASK now that there's a single reference to the pair of bits. The extra layer of indirection provides no meaningful value and makes it unnecessarily tedious to understand what KVM is doing. No functional change. Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20221006000314.73240-4-seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 27 Oct, 2022 1 commit
-
-
Emanuele Giuseppe Esposito authored
Clear enable_sgx if ENCLS-exiting is not supported, i.e. if SGX cannot be virtualized. When KVM is loaded, adjust_vmx_controls checks that the bit is available before enabling the feature; however, other parts of the code check enable_sgx and not clearing the variable caused two different bugs, mostly affecting nested virtualization scenarios. First, because enable_sgx remained true, SECONDARY_EXEC_ENCLS_EXITING would be marked available in the capability MSR that are accessed by a nested hypervisor. KVM would then propagate the control from vmcs12 to vmcs02 even if it isn't supported by the processor, thus causing an unexpected VM-Fail (exit code 0x7) in L1. Second, vmx_set_cpu_caps() would not clear the SGX bits when hardware support is unavailable. This is a much less problematic bug as it only happens if SGX is soft-disabled (available in the processor but hidden in CPUID) or if SGX is supported for bare metal but not in the VMCS (will never happen when running on bare metal, but can theoertically happen when running in a VM). Last but not least, this ensures that module params in sysfs reflect KVM's actual configuration. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128 Fixes: 72add915 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC") Cc: stable@vger.kernel.org Suggested-by:
Sean Christopherson <seanjc@google.com> Suggested-by:
Bandan Das <bsd@redhat.com> Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20221025123749.2201649-1-eesposit@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 26 Sep, 2022 21 commits
-
-
Sean Christopherson authored
Set KVM_REQ_EVENT when MTF becomes pending to ensure that KVM will run through inject_pending_event() and thus vmx_check_nested_events() prior to re-entering the guest. MTF currently works by virtue of KVM's hack that calls kvm_check_nested_events() from kvm_vcpu_running(), but that hack will be removed in the near future. Until that call is removed, the patch introduces no real functional change. Fixes: 5ef8acbd ("KVM: nVMX: Emulate MTF when performing instruction emulation") Cc: stable@vger.kernel.org Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20220921003201.1441511-3-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Document the oddities of ICEBP interception (trap-like #DB is intercepted as a fault-like exception), and how using VMX's inner "skip" helper deliberately bypasses the pending MTF and single-step #DB logic. No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-24-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Morph pending exceptions to pending VM-Exits (due to interception) when the exception is queued instead of waiting until nested events are checked at VM-Entry. This fixes a longstanding bug where KVM fails to handle an exception that occurs during delivery of a previous exception, KVM (L0) and L1 both want to intercept the exception (e.g. #PF for shadow paging), and KVM determines that the exception is in the guest's domain, i.e. queues the new exception for L2. Deferring the interception check causes KVM to esclate various combinations of injected+pending exceptions to double fault (#DF) without consulting L1's interception desires, and ends up injecting a spurious #DF into L2. KVM has fudged around the issue for #PF by special casing emulated #PF injection for shadow paging, but the underlying issue is not unique to shadow paging in L0, e.g. if KVM is intercepting #PF because the guest has a smaller maxphyaddr and L1 (but not L0) is using shadow paging. Other exceptions are affected as well, e.g. if KVM is intercepting #GP for one of SVM's workaround or for the VMware backdoor emulation stuff. The other cases have gone unnoticed because the #DF is spurious if and only if L1 resolves the exception, e.g. KVM's goofs go unnoticed if L1 would have injected #DF anyways. The hack-a-fix has also led to ugly code, e.g. bailing from the emulator if #PF injection forced a nested VM-Exit and the emulator finds itself back in L1. Allowing for direct-to-VM-Exit queueing also neatly solves the async #PF in L2 mess; no need to set a magic flag and token, simply queue a #PF nested VM-Exit. Deal with event migration by flagging that a pending exception was queued by userspace and check for interception at the next KVM_RUN, e.g. so that KVM does the right thing regardless of the order in which userspace restores nested state vs. event state. When "getting" events from userspace, simply drop any pending excpetion that is destined to be intercepted if there is also an injected exception to be migrated. Ideally, KVM would migrate both events, but that would require new ABI, and practically speaking losing the event is unlikely to be noticed, let alone fatal. The injected exception is captured, RIP still points at the original faulting instruction, etc... So either the injection on the target will trigger the same intercepted exception, or the source of the intercepted exception was transient and/or non-deterministic, thus dropping it is ok-ish. Fixes: a04aead1 ("KVM: nSVM: fix running nested guests when npt=0") Fixes: feaf0c7d ("KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2") Cc: Jim Mattson <jmattson@google.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-22-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Move the definition of "struct kvm_queued_exception" out of kvm_vcpu_arch in anticipation of adding a second instance in kvm_vcpu_arch to handle exceptions that occur when vectoring an injected exception and are morphed to VM-Exit instead of leading to #DF. Opportunistically take advantage of the churn to rename "nr" to "vector". No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-15-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Rename the kvm_x86_ops hook for exception injection to better reflect reality, and to align with pretty much every other related function name in KVM. No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-14-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Deliberately truncate the exception error code when shoving it into the VMCS (VM-Entry field for vmcs01 and vmcs02, VM-Exit field for vmcs12). Intel CPUs are incapable of handling 32-bit error codes and will never generate an error code with bits 31:16, but userspace can provide an arbitrary error code via KVM_SET_VCPU_EVENTS. Failure to drop the bits on exception injection results in failed VM-Entry, as VMX disallows setting bits 31:16. Setting the bits on VM-Exit would at best confuse L1, and at worse induce a nested VM-Entry failure, e.g. if L1 decided to reinject the exception back into L2. Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-3-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in vmcs_config to avoid the need to re-read it later, e.g. from cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs(). No (real) functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-33-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Using raw host MSR values for setting up nested VMX control MSRs is incorrect as some features need to disabled, e.g. when KVM runs as a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this is done in setup_vmcs_config() and the result is stored in vmcs_config. Use it for setting up allowed-1 bits in nested VMX MSRs too. Suggested-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-32-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() for setting up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from hardware_setup(). While it seems reasonable to not expose LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs, such change would inevitably break live migration from older KVMs where the controls are exposed. Keep the status quo for now, L1 hypervisor itself is supposed to take care of the errata. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-30-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Jim Mattson authored
Intel processor code names are more familiar to many readers than their decimal model numbers. Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-29-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Clear the CR3 and INVLPG interception controls at runtime based on whether or not EPT is being _used_, as opposed to clearing the bits at setup if EPT is _supported_ in hardware, and then restoring them when EPT is not used. Not mucking with the base config will allow using the base config as the starting point for emulating the VMX capability MSRs. Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-28-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, add the CPU based VM execution controls which KVM doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL and filter them out in vmx_exec_control(). No functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-27-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, add the VMEXIT controls which KVM doesn't use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and filter them out in vmx_vmexit_ctrl(). No functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-26-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering to vmx_exec_control(). No functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-25-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
When VMX controls macros are used to set or clear a control bit, make sure that this bit was checked in setup_vmcs_config() and thus is properly reflected in vmcs_config. Opportunistically drop pointless "< 0" check for adjust_vmx_controls()'s return value. No functional change intended. Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-24-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Don't toggle VM_ENTRY_IA32E_MODE in 32-bit kernels/KVM and instead bug the VM if KVM attempts to run the guest with EFER.LMA=1. KVM doesn't support running 64-bit guests with 32-bit hosts. Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-23-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
SECONDARY_EXEC_ENCLS_EXITING is the only control which is conditionally added to the 'optional' checklist in setup_vmcs_config() but the special case can be avoided by always checking for its presence first and filtering out the result later. Note: the situation when SECONDARY_EXEC_ENCLS_EXITING is present but cpu_has_sgx() is false is possible when SGX is "soft-disabled", e.g. if software writes MCE control MSRs or there's an uncorrectable #MC. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-22-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window() but setup_vmcs_config() doesn't check their existence. Add the check and filter the controls out in vmx_exec_control(). Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis() accordingly. No functional change intended. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-21-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer() and setup_vmcs_config() doesn't check its existence. On the contrary, nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing check and filter the bit out in vmx_vmentry_ctrl(). No (real) functional change intended as all existing CPUs supporting long mode and VMX are supposed to have it. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-20-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
With the updated eVMCSv1 definition, there's no known 'problematic' controls which are exposed in VMX control MSRs but are not present in eVMCSv1: all known Hyper-V versions either don't expose the new fields by not setting bits in the VMX feature controls or support the new eVMCS revision. Get rid of VMX control MSRs filtering for KVM on Hyper-V. Note: VMX control MSRs filtering for Hyper-V on KVM (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1 definition doesn't have all the features implemented by KVM and some fields are still missing. Moreover, nested_evmcs_filter_control_msr() has to support the original eVMCSv1 version when VMM wishes so. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-17-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Enlightened VMCS v1 got updated and now includes the required fields for loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT features. For KVM on Hyper-V enablement, KVM can just observe VMX control MSRs and use the features (with or without eVMCS) when possible. Hyper-V on KVM is messier as Windows 11 guests fail to boot if the controls are advertised and a new PV feature flag, CPUID.0x4000000A.EBX BIT(0), is not set. Honor the Hyper-V CPUID feature flag to play nice with Windows guests. Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20220830133737.1539624-16-vkuznets@redhat.comSigned-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-