Commits · 296f047502f1b3ddfd63adbc192624ce80740081 · Kirill Smelkov / linux

30 Jul, 2014 1 commit

KVM: vmx: remove duplicate vmx_mpx_supported() prototype · 296f0475

Chris J Arges authored Jul 29, 2014

Remove a prototype which was added by both 93c4adc7 and 36be0b9d.
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

296f0475

25 Jul, 2014 2 commits

x86/kvm: Resolve shadow warning from min macro · b55a8144

Mark Rustad authored Jul 25, 2014

Resolve a shadow warning generated in W=2 builds by the nested
use of the min macro by instead using the min3 macro for the
minimum of 3 values.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b55a8144

kvm: Resolve missing-field-initializers warnings · 25f97ff4

Mark Rustad authored Jul 25, 2014

Resolve missing-field-initializers warnings seen in W=2 kernel
builds by having macros generate more elaborated initializers.
That is enough to silence the warnings.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

25f97ff4

24 Jul, 2014 4 commits

Replace NR_VMX_MSR with its definition · 03916db9

Paolo Bonzini authored Jul 24, 2014

Using ARRAY_SIZE directly makes it easier to read the code.  While touching
the code, replace the division by a multiplication in the recently added
BUILD_BUG_ON.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

03916db9

KVM: x86: Assertions to check no overrun in MSR lists · 0123be42

Nadav Amit authored Jul 24, 2014

Currently there is no check whether shared MSRs list overrun the allocated size
which can results in bugs. In addition there is no check that vmx->guest_msrs
has sufficient space to accommodate all the VMX msrs.  This patch adds the
assertions.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

0123be42

KVM: x86: set rflags.rf during fault injection · d6e8c854

Nadav Amit authored Jul 24, 2014

x86 does not automatically set rflags.rf during event injection. This patch
does partial job, setting rflags.rf upon fault injection. It does not handle
the setting of RF upon interrupt injection on rep-string instruction.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d6e8c854

KVM: x86: Setting rflags.rf during rep-string emulation · b9a1ecb9

Nadav Amit authored Jul 24, 2014

This patch updates RF for rep-string emulation. The flag is set upon the first
iteration, and cleared after the last (if emulated). It is intended to make
sure that if a trap (in future data/io #DB emulation) or interrupt is delivered
to the guest during the rep-string instruction, RF will be set correctly. RF
affects whether instruction breakpoint in the guest is masked.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b9a1ecb9

22 Jul, 2014 1 commit

Merge tag 'kvm-s390-20140721' of... · c756ad03

Paolo Bonzini authored Jul 22, 2014

Merge tag 'kvm-s390-20140721' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

Bugfixes
--------
- add IPTE to trace event decoder
- document and advertise KVM_CAP_S390_IRQCHIP

Cleanups
--------
- Reuse kvm_vcpu_block for s390
- Get rid of tasklet for wakup processing

c756ad03

21 Jul, 2014 19 commits

KVM: x86: DR6/7.RTM cannot be written · 6f43ed01

Nadav Amit authored Jul 15, 2014

Haswell and newer Intel CPUs have support for RTM, and in that case DR6.RTM is
not fixed to 1 and DR7.RTM is not fixed to zero. That is not the case in the
current KVM implementation. This bug is apparent only if the MOV-DR instruction
is emulated or the host also debugs the guest.

This patch is a partial fix which enables DR6.RTM and DR7.RTM to be cleared and
set respectively. It also sets DR6.RTM upon every debug exception. Obviously,
it is not a complete fix, as debugging of RTM is still unsupported.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6f43ed01

KVM: nVMX: clean up nested_release_vmcs12 and code around it · 9a2a05b9

Paolo Bonzini authored Jul 17, 2014

Make nested_release_vmcs12 idempotent.
Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9a2a05b9

KVM: nVMX: fix lifetime issues for vmcs02 · 4fa7734c

Paolo Bonzini authored Jul 17, 2014

free_nested needs the loaded_vmcs to be valid if it is a vmcs02, in
order to detach it from the shadow vmcs.  However, this is not
available anymore after commit 26a865f4 (KVM: VMX: fix use after
free of vmx->loaded_vmcs, 2014-01-03).

Revert that patch, and fix its problem by forcing a vmcs01 as the
active VMCS before freeing all the nested VMX state.
Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4fa7734c

KVM: x86: Defining missing x86 vectors · c9cdd085

Nadav Amit authored Jul 21, 2014

Defining XE, XM and VE vector numbers.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c9cdd085

KVM: x86: emulator injects #DB when RFLAGS.RF is set · 4161a569

Nadav Amit authored Jul 17, 2014

If the RFLAGS.RF is set, then no #DB should occur on instruction breakpoints.
However, the KVM emulator injects #DB regardless to RFLAGS.RF. This patch fixes
this behavior. KVM, however, still appears not to update RFLAGS.RF correctly,
regardless of this patch.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4161a569

KVM: x86: Cleanup of rflags.rf cleaning · 6c6cb69b

Nadav Amit authored Jul 21, 2014

RFLAGS.RF was cleaned in several functions (e.g., syscall) in the x86 emulator.
Now that we clear it before the execution of an instruction in the emulator, we
can remove the specific cleanup of RFLAGS.RF.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6c6cb69b

KVM: x86: Clear rflags.rf on emulated instructions · 4467c3f1

Nadav Amit authored Jul 21, 2014

When an instruction is emulated RFLAGS.RF should be cleared. KVM previously did
not do so. This patch clears RFLAGS.RF after interception is done. If a fault
occurs during the instruction, RFLAGS.RF will be set by a previous patch. This
patch does not handle the case of traps/interrupts during rep-strings. Traps
are only expected to occur on debug watchpoints, and those are anyhow not
handled by the emulator.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4467c3f1

KVM: x86: popf emulation should not change RF · 163b135e

Nadav Amit authored Jul 21, 2014

RFLAGS.RF is always zero after popf. Therefore, popf should not updated RF, as
anyhow emulating popf, just as any other instruction should clear RFLAGS.RF.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

163b135e

KVM: x86: Clearing rflags.rf upon skipped emulated instruction · bb663c7a

Nadav Amit authored Jul 21, 2014

When skipping an emulated instruction, rflags.rf should be cleared as it would
be on real x86 CPU.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bb663c7a

Merge tag 'kvm-s390-20140715' of... · ec10b727

Paolo Bonzini authored Jul 21, 2014

Merge tag 'kvm-s390-20140715' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

This series enables the "KVM_(S|G)ET_MP_STATE" ioctls on s390 to make
the cpu state settable by user space.

This is necessary to avoid races in s390 SIGP/reset handling which
happen because some SIGPs are handled in QEMU, while others are
handled in the kernel. Together with the busy conditions as return
value of SIGP races happen especially in areas like starting and
stopping of CPUs. (For example, there is a program 'cpuplugd', that
runs on several s390 distros which does automatic onlining and
offlining on cpus.)

As soon as the MPSTATE interface is used, user space takes complete
control of the cpu states. Otherwise the kernel will use the old way.

Therefore, the new kernel continues to work fine with old QEMUs.

ec10b727

KVM: s390: add ipte to trace event decoding · e59d120f

Christian Borntraeger authored Jul 16, 2014

IPTE intercept can happen, let's decode that.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>

e59d120f

KVM: s390: advertise KVM_CAP_S390_IRQCHIP · 78599d90

Cornelia Huck authored Jul 15, 2014

We should advertise all capabilities, including those that can
be enabled.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

78599d90

KVM: s390: document KVM_CAP_S390_IRQCHIP · 8a366a4b

Cornelia Huck authored Jun 27, 2014

Let's document that this is a capability that may be enabled per-vm.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

8a366a4b

KVM: document target of capability enablement · 0907c855

Cornelia Huck authored Jun 27, 2014

Capabilities can be enabled on a vcpu or (since recently) on a vm. Document
this and note for the existing capabilites whether they are per-vcpu or
per-vm.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

0907c855

KVM: s390: remove the tasklet used by the hrtimer · ea74c0ea

David Hildenbrand authored May 16, 2014

We can get rid of the tasklet used for waking up a VCPU in the hrtimer
code but wakeup the VCPU directly.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

ea74c0ea

KVM: s390: move vcpu wakeup code to a central point · 0e9c85a5

David Hildenbrand authored May 16, 2014

Let's move the vcpu wakeup code to a central point.

We should set the vcpu->preempted flag only if the target is actually sleeping
and before the real wakeup happens. Otherwise the preempted flag might be set,
when not necessary. This may result in immediate reschedules after schedule()
in some scenarios.

The wakeup code doesn't require the local_int.lock to be held.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

0e9c85a5

KVM: s390: remove _bh locking from start_stop_lock · 433b9ee4

David Hildenbrand authored May 06, 2014

The start_stop_lock is no longer acquired when in atomic context, therefore we
can convert it into an ordinary spin_lock.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

433b9ee4

KVM: s390: remove _bh locking from local_int.lock · 4ae3c081

David Hildenbrand authored May 16, 2014

local_int.lock is not used in a bottom-half handler anymore, therefore we can
turn it into an ordinary spin_lock at all occurrences.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

4ae3c081

KVM: s390: cleanup handle_wait by reusing kvm_vcpu_block · 0759d068

David Hildenbrand authored May 13, 2014

This patch cleans up the code in handle_wait by reusing the common code
function kvm_vcpu_block.

signal_pending(), kvm_cpu_has_pending_timer() and kvm_arch_vcpu_runnable() are
sufficient for checking if we need to wake-up that VCPU. kvm_vcpu_block
uses these functions, so no checks are lost.

The flag "timer_due" can be removed - kvm_cpu_has_pending_timer() tests whether
the timer is pending, thus the vcpu is correctly woken up.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

0759d068

17 Jul, 2014 1 commit

KVM: nVMX: Fix virtual interrupt delivery injection · 963fee16

Wanpeng Li authored Jul 17, 2014

This patch fix bug reported in https://bugzilla.kernel.org/show_bug.cgi?id=73331,
after the patch http://www.spinics.net/lists/kvm/msg105230.html applied, there is
some progress and the L2 can boot up, however, slowly. The original idea of this
fix vid injection patch is from "Zhang, Yang Z" <yang.z.zhang@intel.com>.

Interrupt which delivered by vid should be injected to L1 by L0 if current is in
L1, or should be injected to L2 by L0 through the old injection way if L1 doesn't
have set External-interrupt exiting bit. The current logic doen't consider these
cases. This patch fix it by vid intr to L1 if current is L1 or L2 through old
injection way if L1 doen't have External-interrupt exiting bit set.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

963fee16

11 Jul, 2014 12 commits

KVM: x86: Emulator support for #UD on CPL>0 · 68efa764

Nadav Amit authored Jun 18, 2014

Certain instructions (e.g., mwait and monitor) cause a #UD exception when they
are executed in user mode. This is in contrast to the regular privileged
instructions which cause #GP. In order not to mess with SVM interception of
mwait and monitor which assumes privilege level assertions take place before
interception, a flag has been added.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

68efa764

KVM: x86: Emulator flag for instruction that only support 16-bit addresses in real mode · 10e38fc7

Nadav Amit authored Jun 18, 2014

Certain instructions, such as monitor and xsave do not support big real mode
and cause a #GP exception if any of the accessed bytes effective address are
not within [0, 0xffff].  This patch introduces a flag to mark these
instructions, including the necassary checks.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

10e38fc7

KVM: x86: use kvm_read_guest_page for emulator accesses · 44583cba

Paolo Bonzini authored May 13, 2014

Emulator accesses are always done a page at a time, either by the emulator
itself (for fetches) or because we need to query the MMU for address
translations. Speed up these accesses by using kvm_read_guest_page
and, in the case of fetches, by inlining kvm_read_guest_virt_helper and
dropping the loop around kvm_read_guest_page.

This final tweak saves 30-100 more clock cycles (4-10%), bringing the
count (as measured by kvm-unit-tests) down to 720-1100 clock cycles on
a Sandy Bridge Xeon host, compared to 2300-3200 before the whole series
and 925-1700 after the first two low-hanging fruit changes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

44583cba

KVM: x86: ensure emulator fetches do not span multiple pages · 719d5a9b

Paolo Bonzini authored Jun 19, 2014

When the CS base is not page-aligned, the linear address of the code could
get close to the page boundary (e.g. 0x...ffe) even if the EIP value is
not. So we need to first linearize the address, and only then compute
the number of valid bytes that can be fetched.

This happens relatively often when executing real mode code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

719d5a9b

KVM: emulate: put pointers in the fetch_cache · 17052f16

Paolo Bonzini authored May 06, 2014

This simplifies the code a bit, especially the overflow checks.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

17052f16

KVM: emulate: avoid per-byte copying in instruction fetches · 9506d57d

Paolo Bonzini authored May 06, 2014

We do not need a memory copying loop anymore in insn_fetch; we
can use a byte-aligned pointer to access instruction fields directly
from the fetch_cache. This eliminates 50-150 cycles (corresponding to
a 5-10% improvement in performance) from each instruction.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9506d57d

KVM: emulate: avoid repeated calls to do_insn_fetch_bytes · 5cfc7e0f

Paolo Bonzini authored May 06, 2014

do_insn_fetch_bytes will only be called once in a given insn_fetch and
insn_fetch_arr, because in fact it will only be called at most twice
for any instruction and the first call is explicit in x86_decode_insn.
This observation lets us hoist the call out of the memory copying loop.
It does not buy performance, because most fetches are one byte long
anyway, but it prepares for the next patch.

The overflow check is tricky, but correct. Because do_insn_fetch_bytes
has already been called once, we know that fc->end is at least 15. So
it is okay to subtract the number of bytes we want to read.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5cfc7e0f

KVM: emulate: speed up do_insn_fetch · 285ca9e9

Paolo Bonzini authored May 06, 2014

Hoist the common case up from do_insn_fetch_byte to do_insn_fetch,
and prime the fetch_cache in x86_decode_insn.  This helps a bit the
compiler and the branch predictor, but above all it lays the
ground for further changes in the next few patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

285ca9e9

KVM: emulate: do not initialize memopp · 41061cdb

Bandan Das authored Apr 16, 2014

rip_relative is only set if decode_modrm runs, and if you have ModRM
you will also have a memopp.  We can then access memopp unconditionally.
Note that rip_relative cannot be hoisted up to decode_modrm, or you
break "mov $0, xyz(%rip)".

Also, move typecast on "out of range value" of mem.ea to decode_modrm.

Together, all these optimizations save about 50 cycles on each emulated
instructions (4-6%).
Signed-off-by: Bandan Das <bsd@redhat.com>
[Fix immediate operands with rip-relative addressing. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

41061cdb

KVM: emulate: rework seg_override · 573e80fe

Bandan Das authored Apr 16, 2014

x86_decode_insn already sets a default for seg_override,
so remove it from the zeroed area. Also replace set/get functions
with direct access to the field.
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

573e80fe

KVM: emulate: clean up initializations in init_decode_cache · c44b4c6a

Bandan Das authored Apr 16, 2014

A lot of initializations are unnecessary as they get set to
appropriate values before actually being used. Optimize
placement of fields in x86_emulate_ctxt
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c44b4c6a

KVM: emulate: cleanup decode_modrm · 02357bdc

Bandan Das authored Apr 16, 2014

Remove the if conditional - that will help us avoid
an "else initialize to 0" Also, rearrange operators
for slightly better code.
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

02357bdc