- 12 Jan, 2011 32 commits
-
Avi Kivity authored
init_fpu() (which is indirectly called by the fpu switching code) assumes it is in process context. Rather than making init_fpu() use an atomic allocation, which can cause a task to be killed, make sure the fpu is already initialized when we enter the run loop.

KVM-Stable-Tag.
Reported-and-tested-by: Kirill A. Shutemov <kas@openvz.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
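A minimal sketch of the fix as described (the exact hook point and helpers are assumptions): make sure the task's FPU state exists at the top of the run ioctl, where the allocation may still sleep and may fail gracefully.

    /* Sketch: runs in process context before the vcpu run loop, so
     * the lazy FPU switch path never needs an atomic allocation. */
    if (!tsk_used_math(current) && init_fpu(current))
            return -ENOMEM;         /* report failure, don't kill the task */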
-
Avi Kivity authored
Instead of syncing the guest cr3 on every exit, which is expensive on vmx with ept enabled, sync it only on demand.

[sheng: fix incorrect cr3 seen by Windows XP]
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
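A hedged sketch of the mechanism (names follow the kvm register-cache convention and are assumptions): cr3 is read back from hardware only the first time it is needed after an exit.

    /* Sketch: lazily fetch cr3 from the VMCS via the register cache. */
    static inline unsigned long kvm_read_cr3(struct kvm_vcpu *vcpu)
    {
            if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
                    kvm_x86_ops->decache_cr3(vcpu); /* vmcs_readl(GUEST_CR3) */
            return vcpu->arch.cr3;
    }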
-
Avi Kivity authored
This allows us to keep cr3 in the VMCS, later on.

Signed-off-by: Avi Kivity <avi@redhat.com>
-
Andre Przywara authored
In case of a nested page fault or an intercepted #PF, newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
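Roughly, the SVM #PF intercept can forward those bytes into the page-fault path. This sketch assumes the VMCB control-area field names (insn_bytes, insn_len) and a kvm_mmu_page_fault() that accepts a pre-fetched instruction:

    /* Sketch: hand the VMCB-captured instruction bytes to the emulator. */
    static int pf_interception(struct vcpu_svm *svm)
    {
            u64 fault_address = svm->vmcb->control.exit_info_2;
            u32 error_code    = svm->vmcb->control.exit_info_1;

            return kvm_mmu_page_fault(&svm->vcpu, fault_address, error_code,
                                      svm->vmcb->control.insn_bytes,
                                      svm->vmcb->control.insn_len);
    }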
-
Andre Przywara authored
emulate_instruction had many callers, but only one used all parameters. One parameter was unused, another one is now hidden by a wrapper function (required for a future addition anyway), so most callers now use a shorter parameter list.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
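The wrapper is plausibly a one-liner over the full-signature entry point (the exact signature is an assumption based on the surrounding commits):

    /* Sketch: common-case wrapper; cr2 and instruction bytes are only
     * needed by the page-fault path. */
    static inline int emulate_instruction(struct kvm_vcpu *vcpu,
                                          int emulation_type)
    {
            return x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0);
    }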
-
Andre Przywara authored
Move the complete_insn_gp() helper function out of the VMX part into the generic x86 part to make it usable by SVM.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
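For context, the helper is small; this is its likely shape (a sketch, not the verbatim kernel code): inject #GP on failure, advance the instruction pointer on success.

    /* Sketch: common tail for CR/DR accessors that may fault. */
    int complete_insn_gp(struct kvm_vcpu *vcpu, int err)
    {
            if (err)
                    kvm_inject_gp(vcpu, 0);         /* fault: no RIP advance */
            else
                    kvm_x86_ops->skip_emulated_instruction(vcpu);
            return err;
    }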
-
Andre Przywara authored
The handling of CR8 writes in KVM is currently somewhat cumbersome. This patch makes it look like the other CR register handlers and fixes a possible issue in VMX, where the RIP would be incremented despite an injected #GP.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
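A simplified, hedged sketch of the resulting VMX exit-handler arm (register helpers assumed), showing how complete_insn_gp() keeps RIP untouched when kvm_set_cr8() faults:

    /* Sketch: the CR8-write arm of handle_cr(), simplified. */
    case 8: {
            u8 cr8 = kvm_register_read(vcpu, reg);
            int err = kvm_set_cr8(vcpu, cr8);

            complete_insn_gp(vcpu, err);    /* no RIP advance on injected #GP */
            return 1;
    }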
-
Takuya Yoshikawa authored
In KVM_CREATE_IRQCHIP, kvm_io_bus_unregister_dev() is called without taking slots_lock in the error handling path; take it there, as the registration path does.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
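The error path then looks roughly like this (a sketch; the device argument name is an assumption, and the 2.6.37-era slots_lock is a mutex):

    /* Sketch: hold slots_lock around bus unregistration in the
     * KVM_CREATE_IRQCHIP error path. */
    mutex_lock(&kvm->slots_lock);
    kvm_io_bus_unregister_dev(kvm, KVM_PIO_BUS, &vpic->dev);
    mutex_unlock(&kvm->slots_lock);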
-
Lai Jiangshan authored
Userspace may check this extension at runtime.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Avi Kivity authored
Currently, we record '1' for count regardless of the real count. Fix.

Signed-off-by: Avi Kivity <avi@redhat.com>
-
Xiao Guangrong authored
Retry #PF for softmmu only when the current vcpu has the same cr3 as at the time the #PF occurred.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
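A sketch of the guard (field names are illustrative assumptions): record cr3 when the async #PF is queued, and skip the completion-time prefault if the vcpu has since switched address spaces.

    /* Sketch: prefaulting into a different address space would insert
     * a translation for the wrong process; bail out in that case. */
    if (!vcpu->arch.mmu.direct_map &&
        work->arch.cr3 != vcpu->arch.mmu.get_cr3(vcpu))
            return;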
-
Joerg Roedel authored
This patch prevents emulation failures that result from emulating an instruction for an L2 guest from being reported to userspace. Without this patch, a malicious L2 guest would be able to kill the L1 by triggering a race condition between a vmexit and the instruction emulator. With this patch, the L2 will most likely only kill itself in this situation.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
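A hedged sketch of the idea (guard and return values assumed): report emulation failures to userspace for L1 only; an L2 failure gets a #UD injected back into the guest, so it can at worst kill itself.

    /* Sketch: contain emulation failures to the guest level that
     * caused them. */
    static int handle_emulation_failure(struct kvm_vcpu *vcpu)
    {
            ++vcpu->stat.insn_emulation_fail;

            if (!is_guest_mode(vcpu)) {             /* L1: report as before */
                    vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
                    return EMULATE_FAIL;            /* exit to userspace */
            }
            kvm_queue_exception(vcpu, UD_VECTOR);   /* L2 handles or dies */
            return EMULATE_DONE;
    }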
-
Avi Kivity authored
Currently, page fault cr2 and nesting information are carried outside the fault data structure. Instead they are placed in the vcpu struct, which results in confusion as global variables are manipulated instead of passing parameters. Fix this issue by adding address and nested fields to struct x86_exception, so this struct can carry all information associated with a fault.

Signed-off-by: Avi Kivity <avi@redhat.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
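The resulting structure is plausibly of this shape (a sketch based on the description; the fields other than address and the nested flag are assumptions):

    /* Sketch: one self-contained record for an x86 fault. */
    struct x86_exception {
            u8 vector;
            bool error_code_valid;
            u16 error_code;
            bool nested_page_fault;         /* the new nesting information */
            u64 address;                    /* cr2, or nested-fault gpa */
    };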
-
Avi Kivity authored
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Avi Kivity authored
This way, they can return #GP, not just #PF.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Avi Kivity authored
Introduce a structure that can contain an exception to be passed back to the main kvm code.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Avi Kivity authored
This allows Linux to mask cpuid bits if, for example, nx is enabled on only some cpus.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
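The mechanism is likely a small mask against the boot cpu's computed capability words (a sketch; the helper name is an assumption):

    /* Sketch: drop feature bits the kernel itself has disabled, e.g.
     * NX when it could not be enabled on every cpu. */
    static inline void cpuid_mask(u32 *word, int wordnum)
    {
            *word &= boot_cpu_data.x86_capability[wordnum];
    }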
-
Xiao Guangrong authored
If an apf is generated in the L2 guest and is completed in the L1 guest, it will prefault this apf in the L1 guest's mmu context.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Xiao Guangrong authored
If CR0.PG is changed, the page fault can't be avoided when the prefault address is accessed later. This also fixes a bug: a #PF taken with paging enabled could be retried in a paging-disabled context when the mmu is in shadow-paging mode. This idea is from Gleb Natapov.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Jan Kiszka authored
IA64 support forces us to abstract the allocation of the kvm structure. But instead of mixing this up with arch-specific initialization and doing the same on destruction, split both steps. This allows moving generic destruction calls into generic code. It also fixes error clean-up on failures of kvm_create_vm for IA64.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
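A sketch of the split (hook names are assumptions in line with the description): allocation and arch setup become two separate hooks, so generic code can own cleanup on failure.

    /* Sketch: generic creation path with the two arch hooks separated. */
    static struct kvm *kvm_create_vm(void)
    {
            int r;
            struct kvm *kvm = kvm_arch_alloc_vm();  /* allocation only */

            if (!kvm)
                    return ERR_PTR(-ENOMEM);

            r = kvm_arch_init_vm(kvm);              /* arch-specific init */
            if (r) {
                    kvm_arch_free_vm(kvm);          /* generic cleanup works */
                    return ERR_PTR(r);
            }
            /* ... remaining generic initialization ... */
            return kvm;
    }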
-
Xiao Guangrong authored
In the current code, async pf completion is checked outside of the wait context, like this:

    if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
        !vcpu->arch.apf.halted)
            r = vcpu_enter_guest(vcpu);
    else {
            ......
            kvm_vcpu_block(vcpu)
             ^- waiting until 'async_pf.done' is not empty
    }

    kvm_check_async_pf_completion(vcpu)
     ^- deletes entries from async_pf.done

So if async pf completion is checked first, the vcpu can be blocked at kvm_vcpu_block even though completed async page faults are pending. Fix this by marking the vcpu unhalted in the kvm_check_async_pf_completion() path.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Xiao Guangrong authored
Don't search later slots if the slot is empty.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
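This is the usual open-addressing invariant: insertion never probes past an empty slot, so lookup can stop there too. A sketch under assumed names from the async-pf gfn hash:

    /* Sketch: probe until the gfn or an empty slot (~0) is found. */
    static u32 kvm_async_pf_gfn_slot(struct kvm_vcpu *vcpu, gfn_t gfn)
    {
            int i;
            u32 key = kvm_async_pf_hash_fn(gfn);

            for (i = 0; i < ASYNC_PF_PER_VCPU &&
                 (vcpu->arch.apf.gfns[key] != gfn &&
                  vcpu->arch.apf.gfns[key] != ~0); i++)
                    key = kvm_async_pf_next_probe(key);

            return key;
    }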
-
Jan Kiszka authored
A micro-optimization to avoid calling wbinvd twice on the CPU that has to emulate it. As we might be preempted between smp_call_function_many and the local wbinvd, the cache might be refilled in between so that real work would be done uselessly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Takuya Yoshikawa authored
Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap by vmalloc() which will be used in the next logging, and this has been having a bad effect on VGA and live migration: vmalloc() consumes extra system time, triggers tlb flushes, etc.

This patch resolves the issue by pre-allocating one more bitmap and switching between the two bitmaps during dirty logging.

Performance improvement: I measured performance for the case of VGA updates by trace-cmd. The result was 1.5 times faster than the original. In the case of live migration, the improvement ratio depends on the workload and the guest memory size. In general, the larger the memory size, the more benefit we get.

Note: This does not change other architectures' logic, but the allocation size becomes twice as large. This will increase actual memory consumption only when the new size changes the number of pages allocated by vmalloc().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
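A sketch of the switch (field names are assumptions): the slot keeps one doubled allocation, and each GET_DIRTY_LOG flips to the half not currently in use, clearing it instead of reallocating.

    /* Sketch: dirty_bitmap_head points at a doubled allocation; flip
     * to the unused half instead of vmalloc()ing a fresh bitmap. */
    unsigned long *dirty_bitmap = memslot->dirty_bitmap_head;

    if (memslot->dirty_bitmap == dirty_bitmap)
            dirty_bitmap += n / sizeof(long);   /* use the second half */
    memset(dirty_bitmap, 0, n);                 /* clean for next round */
    /* ... swap slots so 'dirty_bitmap' becomes the live bitmap ... */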
-
Marcelo Tosatti authored
Unused.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
-
Gleb Natapov authored
If the guest indicates that it can handle async page faults in kernel mode too, send them there as well, but only if interrupts are enabled.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Gleb Natapov authored
If the guest can detect that it runs in a non-preemptible context, it can handle async PFs at any time, so let the host know that it may send async PFs even when the guest cpu is not in userspace.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
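Plausibly this opt-in is a flag bit in the enable MSR's value (the names below match the KVM paravirt headers; treat them as assumptions here):

    /* Sketch: guest-side opt-in, set when the guest kernel can tell
     * preemptible and non-preemptible contexts apart. */
    #define KVM_ASYNC_PF_ENABLED            (1 << 0)
    #define KVM_ASYNC_PF_SEND_ALWAYS        (1 << 1)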
-
Gleb Natapov authored
Send an async page fault to a PV guest if it accesses swapped-out memory. The guest will choose another task to run upon receiving the fault. Allow async page fault injection only when the guest is in user mode, since otherwise the guest may be in a non-sleepable context and unable to reschedule. The vcpu will be halted if the guest faults on the same page again or if the vcpu executes kernel code.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Gleb Natapov authored
The guest enables async PF vcpu functionality using this MSR.

Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
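For reference, a sketch of the guest-visible interface (the index sits in the KVM paravirt MSR range; treat the layout details as assumptions): the guest writes the address of a per-vcpu communication area plus an enable bit.

    /* Sketch: async-pf enable MSR. Bits 63..6 hold the address of a
     * 64-byte aligned per-vcpu area; bit 0 enables the feature. */
    #define MSR_KVM_ASYNC_PF_EN     0x4b564d02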
-
Gleb Natapov authored
Keep track of memslot changes with a generation number in the memslots structure. Provide a kvm_write_guest_cached() function that skips the gfn_to_hva() translation if the memslots have not changed since the previous invocation.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
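A hedged sketch of the cache (struct layout and helper names are assumptions): the cached translation is revalidated against the memslots generation before use.

    /* Sketch: cached gfn->hva translation keyed by slots generation. */
    struct gfn_to_hva_cache {
            u64 generation;                 /* memslots generation at init */
            gpa_t gpa;
            unsigned long hva;
            struct kvm_memory_slot *memslot;
    };

    int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
                               void *data, unsigned long len)
    {
            struct kvm_memslots *slots = kvm_memslots(kvm);

            if (slots->generation != ghc->generation)   /* slots changed? */
                    kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa);

            if (copy_to_user((void __user *)ghc->hva, data, len))
                    return -EFAULT;
            mark_page_dirty(kvm, ghc->gpa >> PAGE_SHIFT);
            return 0;
    }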
-
Gleb Natapov authored
When a page is swapped in, it is mapped into guest memory only after the guest tries to access it again and generates another fault. To save this fault we can map it immediately, since we know the guest is going to access the page. Do it only when tdp is enabled for now. The shadow paging case is more complicated: the CR[034] and EFER registers would have to be switched before doing the mapping and then switched back.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Gleb Natapov authored
If a guest accesses swapped-out memory, do not swap it in from vcpu thread context. Schedule work to do the swapping and put the vcpu into a halted state instead. Interrupts will still be delivered to the guest, and if an interrupt causes a reschedule, the guest will continue to run another task.

[avi: removed call to get_user_pages_noio(), nacked by Linus; this makes everything synchronous again]

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 02 Jan, 2011 1 commit
-
Avi Kivity authored
The only bit of EFER that affects the mmu is NX, and this is already accounted for (LME only takes effect when changing cr0).

Based on a patch by Hillf Danton.

Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 16 Dec, 2010 1 commit
-
Avi Kivity authored
Based on a patch from Thomas Meyer.

Signed-off-by: Avi Kivity <avi@redhat.com>
-
- 08 Dec, 2010 2 commits
-
Joerg Roedel authored
To support xsave properly for the guest, the SVM module needs software support for it. As long as this is not present, do not report xsave as a supported feature in cpuid. As a side effect, this patch moves the bit() helper function into the x86.h file so that it can be used in svm.c too.

KVM-Stable-Tag.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
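The moved helper is tiny; a sketch of its likely definition:

    /* Sketch: map a feature bit number to its mask within a cpuid word. */
    static inline u32 bit(int bitno)
    {
            return 1 << (bitno & 31);
    }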
-
Sheng Yang authored
CPUID's OSXSAVE bit is a mirror of the CR4.OSXSAVE bit. We need to update the CPUID after migration.

KVM-Stable-Tag.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
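A hedged sketch of the update (helper names assumed): recompute the guest's CPUID.01H:ECX.OSXSAVE bit from the current CR4 whenever CR4 may have changed behind CPUID's back, e.g. after migration restores register state.

    /* Sketch: keep CPUID.01H:ECX.OSXSAVE consistent with CR4.OSXSAVE. */
    static void update_cpuid(struct kvm_vcpu *vcpu)
    {
            struct kvm_cpuid_entry2 *best = kvm_find_cpuid_entry(vcpu, 1, 0);

            if (!best)
                    return;
            if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE))
                    best->ecx |= bit(X86_FEATURE_OSXSAVE);
            else
                    best->ecx &= ~bit(X86_FEATURE_OSXSAVE);
    }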
-
- 05 Nov, 2010 3 commits
-
Jan Kiszka authored
smp_call_function_many is specified to be called only with preemption disabled. Fulfill this requirement.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
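A minimal sketch of the fix: bracket the cross-cpu call with get_cpu()/put_cpu(), which disable and re-enable preemption.

    /* Sketch: smp_call_function_many() requires preemption disabled. */
    get_cpu();      /* disable preemption */
    smp_call_function_many(vcpu->arch.wbinvd_dirty_mask,
                           wbinvd_ipi, NULL, 1);
    put_cpu();      /* re-enable preemption */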
-
Vasiliy Kulikov authored
The structures kvm_vcpu_events, kvm_debugregs, kvm_pit_state2 and kvm_clock_data are copied to userland with some padding and reserved fields uninitialized. This leads to leaking the contents of kernel stack memory. We have to initialize them to zero.

In patch v1, Jan Kiszka suggested filling the reserved fields with zeros instead of memset'ting the whole struct. It makes sense as these fields are explicitly marked as padding. No more fields need zeroing.

KVM-Stable-Tag.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
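The pattern, sketched for one of the structures (the exact field set is an assumption; the uapi structs carry explicit pad/reserved members):

    /* Sketch: zero the padding/reserved fields before the structure
     * is copied to userland, so no stack contents leak through. */
    memset(&events->reserved, 0, sizeof(events->reserved));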
-
Michael S. Tsirkin authored
I have observed the following bug trigger:

1. userspace calls GET_DIRTY_LOG
2. kvm_mmu_slot_remove_write_access is called and makes a page ro
3. page fault happens and makes the page writeable;
   fault is logged in the bitmap appropriately
4. kvm_vm_ioctl_get_dirty_log swaps slot pointers

(a lot of time passes)

5. guest writes into the page
6. userspace calls GET_DIRTY_LOG

At point (5), the bitmap is clean and the page is writeable; thus, guest modification of memory is not logged and GET_DIRTY_LOG returns an empty bitmap. The rule is that all pages are either dirty in the current bitmap, or write-protected, which is violated here.

It seems that just moving kvm_mmu_slot_remove_write_access down to after the slot pointer swap should fix this bug.

KVM-Stable-Tag.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
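A sketch of the reordering in kvm_vm_ioctl_get_dirty_log() (surrounding code elided; lock and helper names assumed):

    /* Sketch: publish the clean bitmap first, then write-protect.
     * Any fault that makes a page writable after this point lands
     * in the newly published bitmap, preserving the invariant. */
    rcu_assign_pointer(kvm->memslots, slots);       /* slot pointer swap */
    synchronize_srcu_expedited(&kvm->srcu);

    spin_lock(&kvm->mmu_lock);
    kvm_mmu_slot_remove_write_access(kvm, log->slot);
    spin_unlock(&kvm->mmu_lock);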
-
- 24 Oct, 2010 1 commit
-
Huang Ying authored
Now that we have MCG_SER_P (and corresponding SRAO/SRAR MCE) support in the kernel and QEMU-KVM, MCG_SER_P should be added to KVM_MCE_CAP_SUPPORTED to make all this code really work.

Reported-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
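Plausibly a one-line change to the supported-capability mask (a sketch; the exact prior contents of the mask are an assumption):

    /* Sketch: advertise software error recovery alongside MCG_CTL. */
    #define KVM_MCE_CAP_SUPPORTED (MCG_CTL_P | MCG_SER_P)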
-