Commit 2fb2b787 authored by Sean Christopherson's avatar Sean Christopherson

KVM: x86/mmu: Skip emulation on page fault iff 1+ SPs were unprotected

When doing "fast unprotection" of nested TDP page tables, skip emulation
if and only if at least one gfn was unprotected, i.e. continue with
emulation if simply resuming is likely to hit the same fault and risk
putting the vCPU into an infinite loop.

Note, it's entirely possible to get a false negative, e.g. if a different
vCPU faults on the same gfn and unprotects the gfn first, but that's a
relatively rare edge case, and emulating is still functionally ok, i.e.
saving a few cycles by avoiding emulation isn't worth the risk of putting
the vCPU into an infinite loop.

Opportunistically rewrite the relevant comment to document in gory detail
exactly what scenario the "fast unprotect" logic is handling.

Fixes: 14727754 ("kvm: svm: Add support for additional SVM NPF error codes")
Cc: Yuan Yao <yuan.yao@intel.com>
Reviewed-by: default avatarYuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-5-seanjc@google.comSigned-off-by: default avatarSean Christopherson <seanjc@google.com>
parent 989a84c9
...@@ -5957,16 +5957,37 @@ static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, ...@@ -5957,16 +5957,37 @@ static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
bool direct = vcpu->arch.mmu->root_role.direct; bool direct = vcpu->arch.mmu->root_role.direct;
/* /*
* Before emulating the instruction, check if the error code * Before emulating the instruction, check to see if the access was due
* was due to a RO violation while translating the guest page. * to a read-only violation while the CPU was walking non-nested NPT
* This can occur when using nested virtualization with nested * page tables, i.e. for a direct MMU, for _guest_ page tables in L1.
* paging in both guests. If true, we simply unprotect the page * If L1 is sharing (a subset of) its page tables with L2, e.g. by
* and resume the guest. * having nCR3 share lower level page tables with hCR3, then when KVM
* (L0) write-protects the nested NPTs, i.e. npt12 entries, KVM is also
* unknowingly write-protecting L1's guest page tables, which KVM isn't
* shadowing.
*
* Because the CPU (by default) walks NPT page tables using a write
* access (to ensure the CPU can do A/D updates), page walks in L1 can
* trigger write faults for the above case even when L1 isn't modifying
* PTEs. As a result, KVM will unnecessarily emulate (or at least, try
* to emulate) an excessive number of L1 instructions; because L1's MMU
* isn't shadowed by KVM, there is no need to write-protect L1's gPTEs
* and thus no need to emulate in order to guarantee forward progress.
*
* Try to unprotect the gfn, i.e. zap any shadow pages, so that L1 can
* proceed without triggering emulation. If one or more shadow pages
* was zapped, skip emulation and resume L1 to let it natively execute
* the instruction. If no shadow pages were zapped, then the write-
* fault is due to something else entirely, i.e. KVM needs to emulate,
* as resuming the guest will put it into an infinite loop.
*
* Note, this code also applies to Intel CPUs, even though it is *very*
* unlikely that an L1 will share its page tables (IA32/PAE/paging64
* format) with L2's page tables (EPT format).
*/ */
if (direct && is_write_to_guest_page_table(error_code)) { if (direct && is_write_to_guest_page_table(error_code) &&
kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa)); kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa)))
return RET_PF_RETRY; return RET_PF_RETRY;
}
/* /*
* The gfn is write-protected, but if emulation fails we can still * The gfn is write-protected, but if emulation fails we can still
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment