Commit 411c588d authored by Avi Kivity

KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0

When CR0.WP=0, we sometimes map user pages as kernel pages (to allow
the kernel to write to them).  Unfortunately this also allows the kernel
to fetch from these pages, even if CR4.SMEP is set.

Adjust for this by also setting NX on the spte in these circumstances.
Signed-off-by: Avi Kivity <avi@redhat.com>
parent a01c8f9b
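
The permission adjustment described in the commit message can be sketched in isolation. This is a simplified model, not the kernel's `set_spte()`: the helper name, the `SKETCH_*` masks, and the boolean parameters are hypothetical and chosen only for illustration. It assumes the case where the guest kernel takes a write fault on a read-only user page while cr0.wp=0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative masks using the x86 page-table bit positions. */
#define SKETCH_WRITABLE (1ULL << 1)
#define SKETCH_USER     (1ULL << 2)
#define SKETCH_NX       (1ULL << 63)

/*
 * Hypothetical helper: models how a read-only user gpte is shadowed
 * when the guest kernel writes to it with cr0.wp=0.
 */
static uint64_t sketch_fixup_spte(uint64_t spte, bool gpte_writable,
                                  bool kernel_write_fault,
                                  bool cr0_wp, bool cr4_smep)
{
    if (kernel_write_fault && !cr0_wp && !gpte_writable) {
        spte |= SKETCH_WRITABLE;   /* let the kernel write to the page */
        spte &= ~SKETCH_USER;      /* turn it into a kernel page */
        if (cr4_smep)
            spte |= SKETCH_NX;     /* stop the kernel from fetching from it */
    }
    return spte;
}
```

With SMEP enabled, a user spte passed through this helper comes back writable, non-user, and non-executable; with SMEP disabled the NX bit is left untouched.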
@@ -165,6 +165,10 @@ Shadow pages contain the following information:
     Contains the value of efer.nxe for which the page is valid.
   role.cr0_wp:
     Contains the value of cr0.wp for which the page is valid.
+  role.smep_andnot_wp:
+    Contains the value of cr4.smep && !cr0.wp for which the page is valid
+    (pages for which this is true are different from other pages; see the
+    treatment of cr0.wp=0 below).
   gfn:
     Either the guest page table containing the translations shadowed by this
     page, or the base page frame for linear translations.  See role.direct.
@@ -317,6 +321,20 @@ on fault type:
 
    (user write faults generate a #PF)
 
+In the first case there is an additional complication if CR4.SMEP is
+enabled: since we've turned the page into a kernel page, the kernel may now
+execute it.  We handle this by also setting spte.nx.  If we get a user
+fetch or read fault, we'll change spte.u=1 and spte.nx=gpte.nx back.
+
+To prevent an spte that was converted into a kernel page with cr0.wp=0
+from being written by the kernel after cr0.wp has changed to 1, we make
+the value of cr0.wp part of the page role.  This means that an spte created
+with one value of cr0.wp cannot be used when cr0.wp has a different value -
+it will simply be missed by the shadow page lookup code.  A similar issue
+exists when an spte created with cr0.wp=0 and cr4.smep=0 is used after
+changing cr4.smep to 1.  To avoid this, the value of !cr0.wp && cr4.smep
+is also made a part of the page role.
+
 Large pages
 ===========
......
@@ -205,6 +205,7 @@ union kvm_mmu_page_role {
 		unsigned invalid:1;
 		unsigned nxe:1;
 		unsigned cr0_wp:1;
+		unsigned smep_andnot_wp:1;
 	};
 };
......
@@ -1985,8 +1985,17 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		spte |= PT_WRITABLE_MASK;
 
 		if (!vcpu->arch.mmu.direct_map
-		    && !(pte_access & ACC_WRITE_MASK))
+		    && !(pte_access & ACC_WRITE_MASK)) {
 			spte &= ~PT_USER_MASK;
+			/*
+			 * If we converted a user page to a kernel page,
+			 * so that the kernel can write to it when cr0.wp=0,
+			 * then we should prevent the kernel from executing it
+			 * if SMEP is enabled.
+			 */
+			if (kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))
+				spte |= PT64_NX_MASK;
+		}
 
 		/*
 		 * Optimization: for pte sync, if spte was writable the hash
@@ -2955,6 +2964,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 {
 	int r;
+	bool smep = kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
@@ -2969,6 +2979,8 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
 	vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
+	vcpu->arch.mmu.base_role.smep_andnot_wp
+		= smep && !is_write_protection(vcpu);
 
 	return r;
 }
......
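
The effect of adding smep_andnot_wp to the page role can be illustrated with a small standalone model. This is a sketch under assumptions, not the kernel's lookup code: `sketch_page_role`, `sketch_make_role`, and `sketch_role_matches` are hypothetical names, and the union carries only the two bits relevant here. It assumes, as the documentation change states, that a shadow page is reused only when its role matches the current one.

```c
#include <stdbool.h>

/* Hypothetical two-bit stand-in for union kvm_mmu_page_role. */
union sketch_page_role {
    unsigned word;
    struct {
        unsigned cr0_wp:1;
        unsigned smep_andnot_wp:1;
    };
};

/* Build a role from the guest's control-register state. */
static union sketch_page_role sketch_make_role(bool cr0_wp, bool cr4_smep)
{
    union sketch_page_role role;

    role.word = 0;                              /* clear unused bits */
    role.cr0_wp = cr0_wp;
    role.smep_andnot_wp = cr4_smep && !cr0_wp;  /* mirrors the patch */
    return role;
}

/* A shadow page is only reused when the whole role word is identical. */
static bool sketch_role_matches(union sketch_page_role a,
                                union sketch_page_role b)
{
    return a.word == b.word;
}
```

In this model, a page created with cr0.wp=0 and cr4.smep=0 no longer matches once cr4.smep becomes 1, so it is skipped rather than reused. With cr0.wp=1 the roles stay equal whatever the SMEP setting, which is why the bit tracks `smep && !wp` rather than SMEP alone.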