    KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 uses apic-access-page · 0155b2b9
    Liran Alon authored
    According to Intel SDM sections 28.3.3.3/28.3.3.4, "Guidelines for Use
    of the INVVPID/INVEPT Instruction", the hypervisor needs to execute
    INVVPID/INVEPT X if the CPU executes a VMEntry with VPID/EPTP X and
    either the "Virtualize APIC accesses" VM-execution control was changed
    from 0 to 1, or the value of apic_access_page was changed.
    
    In the nested case, the burden falls on L1, unless L0 enables EPT in
    vmcs02 but L1 enables neither EPT nor VPID in vmcs12.  For this reason
    prepare_vmcs02() and load_vmcs12_host_state() have special code to
    request a TLB flush in case L1 does not use EPT but it uses
    "virtualize APIC accesses".
    
    This special case, however, is not necessary. On a nested vmentry the
    physical TLB will already be flushed unless all of the following apply:
    
    * L0 uses VPID
    
    * L1 uses VPID
    
    * L0 can guarantee TLB entries populated while running L1 are tagged
    differently than TLB entries populated while running L2.
    
    If the first condition is false, the processor will flush the TLB
    on vmentry to L2.  If the second or third condition is false,
    prepare_vmcs02() will request KVM_REQ_TLB_FLUSH.  However, even
    if all three conditions hold, no extra TLB flush is needed to handle
    the APIC access page:
    
    * if L1 doesn't use VPID, the second condition doesn't hold and the
    TLB will be flushed anyway.
    
    * if L1 uses VPID, it has to flush the TLB itself with INVVPID and
    section 28.3.3.3 doesn't apply to L0.
    
    * even INVEPT is not needed because, if L0 uses EPT, it uses different
    EPTP when running L2 than L1 (because guest_mode is part of mmu-role).
    In this case SDM section 28.3.3.4 doesn't apply.
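
    The case analysis above can be sketched as a small decision function.
    This is only an illustrative model of the reasoning in this message,
    not KVM code: the struct, the enum and the function names are all
    hypothetical, and the three booleans simply stand for the three
    conditions listed earlier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three conditions; not KVM's data structures. */
struct nested_entry_cfg {
	bool l0_uses_vpid;   /* condition 1: L0 uses VPID */
	bool l1_uses_vpid;   /* condition 2: L1 uses VPID */
	bool l2_has_own_tag; /* condition 3: L1 and L2 TLB entries are tagged differently */
};

/* Who ends up being responsible for the TLB flush on a nested vmentry. */
enum flush_source {
	FLUSH_BY_CPU,     /* no VPID in L0: the processor flushes on vmentry */
	FLUSH_BY_L0,      /* L0 requests KVM_REQ_TLB_FLUSH */
	FLUSH_LEFT_TO_L1, /* no L0 flush: L1 must issue INVVPID itself */
};

static enum flush_source nested_entry_flush(struct nested_entry_cfg c)
{
	if (!c.l0_uses_vpid)
		return FLUSH_BY_CPU;
	if (!c.l1_uses_vpid || !c.l2_has_own_tag)
		return FLUSH_BY_L0;
	return FLUSH_LEFT_TO_L1;
}

/* If L1 does not use VPID, some flush always happens without a special case. */
static void check_examples(void)
{
	for (int l0 = 0; l0 <= 1; l0++) {
		for (int tag = 0; tag <= 1; tag++) {
			struct nested_entry_cfg c = {
				.l0_uses_vpid = l0,
				.l1_uses_vpid = false,
				.l2_has_own_tag = tag,
			};
			assert(nested_entry_flush(c) != FLUSH_LEFT_TO_L1);
		}
	}
}
```

    In this model the only outcome without a flush by the CPU or by L0 is
    the one where L1 uses VPID, which is exactly the case where the SDM
    guideline obliges L1, not L0, to execute INVVPID.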
    
    Similarly, examining nested_vmx_vmexit()->load_vmcs12_host_state(),
    one could note that L0 won't flush TLB only in cases where SDM sections
    28.3.3.3 and 28.3.3.4 don't apply.  In particular, if L0 uses different
    VPIDs for L1 and L2 (i.e. vmx->vpid != vmx->nested.vpid02), section
    28.3.3.3 doesn't apply.
    
    Thus, remove this flush from prepare_vmcs02() and nested_vmx_vmexit().
    
    Side-note: This patch can be viewed as removing the parts of commit
    fb6c8198 ("kvm: vmx: Flush TLB when the APIC-access address changes")
    that are no longer relevant since commit
    1313cc2b ("kvm: mmu: Add guest_mode to kvm_mmu_page_role").
    I.e., the first commit assumes that if L0 uses EPT and L1 doesn't use EPT,
    then L0 will use the same EPTP for both L1 and L2, which indeed required
    L0 to execute INVEPT before entering the L2 guest. This assumption has
    not held since guest_mode was added to mmu-role.

    Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
    Signed-off-by: Liran Alon <liran.alon@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nested.c 190 KB