Commit 361209e0 authored by Sean Christopherson, committed by Paolo Bonzini

KVM: Explicitly define the "memslot update in-progress" bit

KVM uses bit 0 of the memslots generation as an "update in-progress"
flag, which is used by x86 to prevent caching MMIO access while the
memslots are changing.  Although the intended behavior is flag-like,
e.g. MMIO sptes intentionally drop the in-progress bit so as to avoid
caching data from in-flux memslots, the implementation oftentimes treats
the bit as part of the generation number itself, e.g. a memslots update
increments the generation twice, once to set the flag and once to clear it.
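
To make the distinction concrete, here is a minimal standalone sketch (not the kernel code itself) contrasting the flag-like usage with the "increment twice" arithmetic usage of bit 0; the names and the simplified u64 generation are illustrative assumptions only:

/* Illustrative sketch only; names and layout are simplified assumptions. */
#include <stdint.h>
#include <stdbool.h>

#define UPDATE_IN_PROGRESS ((uint64_t)1 << 0)  /* bit 0 of the generation */

/* Flag-like view: consumers simply refuse to cache while the bit is set. */
static bool can_cache(uint64_t gen)
{
        return !(gen & UPDATE_IN_PROGRESS);
}

/*
 * Arithmetic view: the update path increments the generation twice, once
 * to set the flag (even -> odd) and once to clear it (odd -> even), so
 * bit 0 ends up doubling as part of the generation count itself.
 */
static uint64_t begin_update(uint64_t gen)  { return gen + 1; }
static uint64_t finish_update(uint64_t gen) { return gen + 1; }

int main(void)
{
        uint64_t gen = 0;

        gen = begin_update(gen);   /* 1: odd, caching disabled */
        /* ... memslots swapped here ... */
        gen = finish_update(gen);  /* 2: even again, caching re-enabled */

        return can_cache(gen) ? 0 : 1;
}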

Prior to commit 4bd518f1 ("KVM: use separate generations for
each address space"), incorporating the "update in-progress" bit into
the generation number largely made sense, e.g. "real" generations are
even, "bogus" generations are odd, most code doesn't need to be aware of
the bit, etc...

Now that unique memslots generation numbers are assigned to each address
space, stealthing the in-progress status into the generation number
results in a wide variety of subtle code, e.g. kvm_create_vm() jumps
over bit 0 when initializing the memslots generation without any hint as
to why.
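
For illustration, a rough standalone sketch (not the kernel's kvm_create_vm()) of the kind of initialization the paragraph alludes to, where each address space is seeded with an even starting generation precisely so that bit 0 stays free for the in-progress status; KVM_ADDRESS_SPACE_NUM == 2 is an assumption here:

/* Hypothetical sketch; constants and layout are assumptions for clarity. */
#include <stdint.h>
#include <stdio.h>

#define KVM_ADDRESS_SPACE_NUM 2  /* e.g. x86 with a second (SMM) space */

int main(void)
{
        uint64_t generation[KVM_ADDRESS_SPACE_NUM];
        int i;

        /*
         * Each address space starts on an even generation, i.e. "jumps over
         * bit 0", because bit 0 carries the in-progress status rather than
         * being part of the count.
         */
        for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
                generation[i] = (uint64_t)i * 2;
                printf("address space %d starts at generation %llu\n",
                       i, (unsigned long long)generation[i]);
        }
        return 0;
}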

Explicitly define the flag and convert as much code as possible (which
isn't much) to actually treat it like a flag.  This paves the way for
eventually using a different bit for "update in-progress" so that it can
be a flag in truth instead of an awkward extension to the generation
number.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
parent ddfd1730
@@ -183,7 +183,7 @@ static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
 {
 	u64 gen = kvm_memslots(vcpu->kvm)->generation;
 
-	if (unlikely(gen & 1))
+	if (unlikely(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS))
 		return;
 
 	/*
...
@@ -48,6 +48,27 @@
  */
 #define KVM_MEMSLOT_INVALID	(1UL << 16)
 
+/*
+ * Bit 0 of the memslot generation number is an "update in-progress flag",
+ * e.g. is temporarily set for the duration of install_new_memslots().
+ * This flag effectively creates a unique generation number that is used to
+ * mark cached memslot data, e.g. MMIO accesses, as potentially being stale,
+ * i.e. may (or may not) have come from the previous memslots generation.
+ *
+ * This is necessary because the actual memslots update is not atomic with
+ * respect to the generation number update. Updating the generation number
+ * first would allow a vCPU to cache a spte from the old memslots using the
+ * new generation number, and updating the generation number after switching
+ * to the new memslots would allow cache hits using the old generation number
+ * to reference the defunct memslots.
+ *
+ * This mechanism is used to prevent getting hits in KVM's caches while a
+ * memslot update is in-progress, and to prevent cache hits *after* updating
+ * the actual generation number against accesses that were inserted into the
+ * cache *before* the memslots were updated.
+ */
+#define KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS	BIT_ULL(0)
+
 /* Two fragments for cross MMIO pages. */
 #define KVM_MAX_MMIO_FRAGMENTS	2
...
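
The comment block added above describes a generation-tagged cache. As a rough standalone illustration of that idea (a sketch under simplified assumptions, not KVM's actual MMIO-spte code), a cached entry records the generation at fill time and a lookup only hits on an exact match with a generation whose in-progress bit is clear:

/* Minimal sketch of a generation-tagged cache; names are hypothetical. */
#include <stdint.h>
#include <stdbool.h>

#define GEN_UPDATE_IN_PROGRESS ((uint64_t)1 << 0)  /* mirrors bit 0 usage */

struct mmio_cache_entry {
        uint64_t gpa;           /* guest physical address that was cached */
        uint64_t generation;    /* memslots generation at fill time */
};

/* Tag the entry with the generation that was current when it was filled. */
static void cache_fill(struct mmio_cache_entry *e, uint64_t gpa, uint64_t gen)
{
        e->gpa = gpa;
        e->generation = gen;
}

/*
 * A hit requires an exact generation match and a generation whose
 * in-progress bit is clear: entries filled while a memslot update was in
 * flight carry an odd generation and can never match, and entries filled
 * before an update carry the old generation and stop matching once the
 * generation is bumped afterwards.
 */
static bool cache_hit(const struct mmio_cache_entry *e, uint64_t gpa,
                      uint64_t cur_gen)
{
        if (cur_gen & GEN_UPDATE_IN_PROGRESS)
                return false;
        return e->gpa == gpa && e->generation == cur_gen;
}

int main(void)
{
        struct mmio_cache_entry e;
        uint64_t gen = 2;       /* a "real" (even) generation */

        cache_fill(&e, 0x1000, gen);
        return cache_hit(&e, 0x1000, gen) ? 0 : 1;  /* hit -> exit 0 */
}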
@@ -874,30 +874,30 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
 					int as_id, struct kvm_memslots *slots)
 {
 	struct kvm_memslots *old_memslots = __kvm_memslots(kvm, as_id);
-	u64 gen;
+	u64 gen = old_memslots->generation;
 
-	/*
-	 * Set the low bit in the generation, which disables SPTE caching
-	 * until the end of synchronize_srcu_expedited.
-	 */
-	WARN_ON(old_memslots->generation & 1);
-	slots->generation = old_memslots->generation + 1;
+	WARN_ON(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS);
+	slots->generation = gen | KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS;
 
 	rcu_assign_pointer(kvm->memslots[as_id], slots);
 	synchronize_srcu_expedited(&kvm->srcu);
 
 	/*
-	 * Increment the new memslot generation a second time. This prevents
-	 * vm exits that race with memslot updates from caching a memslot
-	 * generation that will (potentially) be valid forever.
-	 *
+	 * Increment the new memslot generation a second time, dropping the
+	 * update in-progress flag and incrementing the generation based on
+	 * the number of address spaces. This provides a unique and easily
+	 * identifiable generation number while the memslots are in flux.
+	 */
+	gen = slots->generation & ~KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS;
+
+	/*
 	 * Generations must be unique even across address spaces. We do not need
 	 * a global counter for that, instead the generation space is evenly split
 	 * across address spaces. For example, with two address spaces, address
-	 * space 0 will use generations 0, 4, 8, ... while * address space 1 will
+	 * space 0 will use generations 0, 4, 8, ... while address space 1 will
 	 * use generations 2, 6, 10, 14, ...
 	 */
-	gen = slots->generation + KVM_ADDRESS_SPACE_NUM * 2 - 1;
+	gen += KVM_ADDRESS_SPACE_NUM * 2;
 
 	kvm_arch_memslots_updated(kvm, gen);
...
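
As a quick sanity check of the arithmetic in the hunk above, here is a hedged standalone model (an assumed simplification, not the kernel function) that replays the new bookkeeping: publish the generation with the in-progress flag set, then drop the flag and advance by KVM_ADDRESS_SPACE_NUM * 2, which yields 4, 8, 12, ... for address space 0 and 6, 10, 14, ... for address space 1 when starting from 0 and 2:

/* Simplified model of the generation handling shown above; assumptions only. */
#include <stdint.h>
#include <stdio.h>

#define KVM_ADDRESS_SPACE_NUM   2                       /* assumption */
#define UPDATE_IN_PROGRESS      ((uint64_t)1 << 0)

static uint64_t install(uint64_t old_gen)
{
        /* Published first, so readers see the in-progress flag. */
        uint64_t gen = old_gen | UPDATE_IN_PROGRESS;

        /* ... memslots swapped and SRCU synchronized here ... */

        gen &= ~UPDATE_IN_PROGRESS;             /* drop the flag */
        gen += KVM_ADDRESS_SPACE_NUM * 2;       /* advance by 2 per space */
        return gen;
}

int main(void)
{
        uint64_t gen[KVM_ADDRESS_SPACE_NUM] = { 0, 2 }; /* per-space start */
        int as, i;

        for (as = 0; as < KVM_ADDRESS_SPACE_NUM; as++)
                for (i = 0; i < 3; i++) {
                        gen[as] = install(gen[as]);
                        printf("address space %d -> generation %llu\n",
                               as, (unsigned long long)gen[as]);
                }
        return 0;
}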