Commits · 95f328d3ad1a8e4e3175a18546fb35c495e31130 · nexedi / linux

04 Nov, 2013 1 commit
- Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue · 95f328d3
  Gleb Natapov authored Nov 04, 2013
```
Conflicts:
	arch/powerpc/include/asm/processor.h
```
  95f328d3
03 Nov, 2013 1 commit

KVM: x86: fix emulation of "movzbl %bpl, %eax" · daf72722

Paolo Bonzini authored Oct 31, 2013

When I was looking at RHEL5.9's failure to start with
unrestricted_guest=0/emulate_invalid_guest_state=1, I got it working with a
slightly older tree than kvm.git.  I now debugged the remaining failure,
which was introduced by commit 660696d1 (KVM: X86 emulator: fix
source operand decoding for 8bit mov[zs]x instructions, 2013-04-24)
introduced a similar mis-emulation to the one in commit 8acb4207 (KVM:
fix sil/dil/bpl/spl in the mod/rm fields, 2013-05-30).  The incorrect
decoding occurs in 8-bit movzx/movsx instructions whose 8-bit operand
is sil/dil/bpl/spl.

Needless to say, "movzbl %bpl, %eax" does occur in RHEL5.9's decompression
prolog, just a handful of instructions before finally giving control to
the decompressed vmlinux and getting out of the invalid guest state.

Because OpMem8 bypasses decode_modrm, the same handling of the REX prefix
must be applied to OpMem8.
Reported-by: Michele Baldessari <michele@redhat.com>
Cc: stable@vger.kernel.org
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

daf72722

31 Oct, 2013 8 commits

kvm_host: typo fix · 81e87e26

Michael S. Tsirkin authored Oct 30, 2013

fix up typo in comment.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

81e87e26

KVM: x86: emulate SAHF instruction · 98f73630

Paolo Bonzini authored Oct 31, 2013

Yet another instruction that we fail to emulate, this time found
in Windows 2008R2 32-bit.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

98f73630

MAINTAINERS: add tree for kvm.git · a94b40a6

Ramkumar Ramachandra authored Oct 31, 2013

Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: KVM List <kvm@vger.kernel.org>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a94b40a6

Documentation/kvm: add a 00-INDEX file · 6beda1e5

Ramkumar Ramachandra authored Oct 31, 2013

Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
[Some editing. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6beda1e5

MAINTAINERS: fix broken link to www.linux-kvm.org · e3e58478

Ramkumar Ramachandra authored Oct 31, 2013

Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e3e58478

kvm/vmx: error message typo fix · 60266204

Michael S. Tsirkin authored Oct 31, 2013

mst can't be blamed for lack of switch entries: the
issue is with msrs actually.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

60266204

KVM: x86: fix KVM_SET_XCRS loop · c67a04cb

Paolo Bonzini authored Oct 17, 2013

The loop was always using 0 as the index.  This means that
any rubbish after the first element of the array went undetected.
It seems reasonable to assume that no KVM userspace did that.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c67a04cb

KVM: x86: fix KVM_SET_XCRS for CPUs that do not support XSAVE · 46c34cb0

Paolo Bonzini authored Oct 17, 2013

The KVM_SET_XCRS ioctl must accept anything that KVM_GET_XCRS
could return.  XCR0's bit 0 is always 1 in real processors with
XSAVE, and KVM_GET_XCRS will always leave bit 0 set even if the
emulated processor does not have XSAVE.  So, KVM_SET_XCRS must
ignore that bit when checking for attempts to enable unsupported
save states.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

46c34cb0

30 Oct, 2013 8 commits

kvm: Create non-coherent DMA registeration · e0f0bbc5

Alex Williamson authored Oct 30, 2013

We currently use some ad-hoc arch variables tied to legacy KVM device
assignment to manage emulation of instructions that depend on whether
non-coherent DMA is present. Create an interface for this, adapting
legacy KVM device assignment and adding VFIO via the KVM-VFIO device.
For now we assume that non-coherent DMA is possible any time we have a
VFIO group. Eventually an interface can be developed as part of the
VFIO external user interface to query the coherency of a group.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e0f0bbc5

kvm/x86: Convert iommu_flags to iommu_noncoherent · d96eb2c6

Alex Williamson authored Oct 30, 2013

Default to operating in coherent mode.  This simplifies the logic when
we switch to a model of registering and unregistering noncoherent I/O
with KVM.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d96eb2c6

kvm: Add VFIO device · ec53500f

Alex Williamson authored Oct 30, 2013

So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made. This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction. The user creates
the device and can add and remove VFIO groups to it via file
descriptors. When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ec53500f

kvm: Emulate MOVBE · 84cffe49

Borislav Petkov authored Oct 29, 2013

This basically came from the need to be able to boot 32-bit Atom SMP
guests on an AMD host, i.e. a host which doesn't support MOVBE. As a
matter of fact, qemu has since recently received MOVBE support but we
cannot share that with kvm emulation and thus we have to do this in the
host. We're waay faster in kvm anyway. :-)

So, we piggyback on the #UD path and emulate the MOVBE functionality.
With it, an 8-core SMP guest boots in under 6 seconds.

Also, requesting MOVBE emulation needs to happen explicitly to work,
i.e. qemu -cpu n270,+movbe...

Just FYI, a fairly straight-forward boot of a MOVBE-enabled 3.9-rc6+
kernel in kvm executes MOVBE ~60K times.
Signed-off-by: Andre Przywara <andre@andrep.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

84cffe49

kvm, emulator: Add initial three-byte insns support · 0bc5eedb

Borislav Petkov authored Oct 29, 2013

Add initial support for handling three-byte instructions in the
emulator.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

0bc5eedb

kvm, emulator: Rename VendorSpecific flag · b51e974f

Borislav Petkov authored Sep 22, 2013

Call it EmulateOnUD which is exactly what we're trying to do with
vendor-specific instructions.

Rename ->only_vendor_specific_insn to something shorter, while at it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b51e974f

kvm, emulator: Use opcode length · 1ce19dc1

Borislav Petkov authored Sep 22, 2013

Add a field to the current emulation context which contains the
instruction opcode length. This will streamline handling of opcodes of
different length.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1ce19dc1

kvm: Add KVM_GET_EMULATED_CPUID · 9c15bb1d

Borislav Petkov authored Sep 22, 2013

Add a kvm ioctl which states which system functionality kvm emulates.
The format used is that of CPUID and we return the corresponding CPUID
bits set for which we do emulate functionality.

Make sure ->padding is being passed on clean from userspace so that we
can use it for something in the future, after the ioctl gets cast in
stone.

s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9c15bb1d

28 Oct, 2013 5 commits

Merge tag 'kvm-arm-for-3.13-2' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-queue · 5bb3398d

Paolo Bonzini authored Oct 28, 2013

Updates for KVM/ARM, take 2 including:
 - Transparent Huge Pages and hugetlbfs support for KVM/ARM
 - Yield CPU when guest executes WFE to speed up CPU overcommit

5bb3398d

KVM: Mapping IOMMU pages after updating memslot · e0230e13

Yang Zhang authored Oct 24, 2013

In kvm_iommu_map_pages(), we need to know the page size via call
kvm_host_page_size(). And it will check whether the target slot
is valid before return the right page size.
Currently, we will map the iommu pages when creating a new slot.
But we call kvm_iommu_map_pages() during preparing the new slot.
At that time, the new slot is not visible by domain(still in preparing).
So we cannot get the right page size from kvm_host_page_size() and
this will break the IOMMU super page logic.
The solution is to map the iommu pages after we insert the new slot
into domain.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Tested-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e0230e13

nVMX: Report CPU_BASED_VIRTUAL_NMI_PENDING as supported · a294c9bb

Jan Kiszka authored Oct 23, 2013

If the host supports it, we can and should expose it to the guest as
well, just like we already do with PIN_BASED_VIRTUAL_NMIS.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a294c9bb

nVMX: Fix pick-up of uninjected NMIs · cd2633c5

Jan Kiszka authored Oct 23, 2013

__vmx_complete_interrupts stored uninjected NMIs in arch.nmi_injected,
not arch.nmi_pending. So we actually need to check the former field in
vmcs12_save_pending_event. This fixes the eventinj unit test when run
in nested KVM.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

cd2633c5

KVM: nVMX: Report 2MB EPT pages as supported · d3134dbf

Jan Kiszka authored Oct 23, 2013

As long as the hardware provides us 2MB EPT pages, we can also expose
them to the guest because our shadow EPT code already supports this
feature.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d3134dbf

18 Oct, 2013 2 commits

KVM: ARM: Transparent huge page (THP) support · 9b5fdb97

Christoffer Dall authored Oct 02, 2013

Support transparent huge pages in KVM/ARM and KVM/ARM64.  The
transparent_hugepage_adjust is not very pretty, but this is also how
it's solved on x86 and seems to be simply an artifact on how THPs
behave.  This should eventually be shared across architectures if
possible, but that can always be changed down the road.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

9b5fdb97

KVM: ARM: Support hugetlbfs backed huge pages · ad361f09

Christoffer Dall authored Nov 01, 2012

Support huge pages in KVM/ARM and KVM/ARM64.  The pud_huge checking on
the unmap path may feel a bit silly as the pud_huge check is always
defined to false, but the compiler should be smart about this.

Note: This deals only with VMAs marked as huge which are allocated by
users through hugetlbfs only.  Transparent huge pages can only be
detected by looking at the underlying pages (or the page tables
themselves) and this patch so far simply maps these on a page-by-page
level in the Stage-2 page tables.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

ad361f09

17 Oct, 2013 15 commits

KVM: ARM: Update comments for kvm_handle_wfi · 86ed81aa

Christoffer Dall authored Oct 15, 2013

Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.

Also renames the function to kvm_handle_wfx like is done on arm64 for
consistency and uber-correctness.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

86ed81aa

ARM: KVM: Yield CPU when vcpu executes a WFE · 58d5ec8f

Marc Zyngier authored Oct 08, 2013

On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.

Quick test to estimate the performance: hackbench 1 process 1000

2xA15 host (baseline):	1.843s

2xA15 guest w/o patch:	2.083s
4xA15 guest w/o patch:	80.212s
8xA15 guest w/o patch:	Could not be bothered to find out

2xA15 guest w/ patch:	2.102s
4xA15 guest w/ patch:	3.205s
8xA15 guest w/ patch:	6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

58d5ec8f

kvm: powerpc: book3s: drop is_hv_enabled · a78b55d1

Aneesh Kumar K.V authored Oct 07, 2013

drop is_hv_enabled, because that should not be a callback property
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

a78b55d1

kvm: powerpc: book3s: Allow the HV and PR selection per virtual machine · cbbc58d4

Aneesh Kumar K.V authored Oct 07, 2013

This moves the kvmppc_ops callbacks to be a per VM entity. This
enables us to select HV and PR mode when creating a VM. We also
allow both kvm-hv and kvm-pr kernel module to be loaded. To
achieve this we move /dev/kvm ownership to kvm.ko module. Depending on
which KVM mode we select during VM creation we take a reference
count on respective module
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: fix coding style]
Signed-off-by: Alexander Graf <agraf@suse.de>

cbbc58d4

Powerpc KVM work is based on a commit after rc4. · 13acfd57
Gleb Natapov authored Oct 17, 2013
```
Merging master into next to satisfy the dependencies.

Conflicts:
	arch/arm/kvm/reset.c
```
13acfd57

kvm: Add struct kvm arg to memslot APIs · 5587027c

Aneesh Kumar K.V authored Oct 07, 2013

We will use that in the later patch to find the kvm ops handler
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

5587027c

kvm: powerpc: book3s: Support building HV and PR KVM as module · 2ba9f0d8

Aneesh Kumar K.V authored Oct 07, 2013

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: squash in compile fix]
Signed-off-by: Alexander Graf <agraf@suse.de>

2ba9f0d8

kvm: powerpc: booke: Move booke related tracepoints to separate header · dba291f2
Aneesh Kumar K.V authored Oct 07, 2013
```
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
```
dba291f2

kvm: powerpc: book3s: pr: move PR related tracepoints to a separate header · 72c12535

Aneesh Kumar K.V authored Oct 07, 2013

This patch moves PR related tracepoints to a separate header. This
enables in converting PR to a kernel module which will be done in
later patches
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

72c12535

kvm: powerpc: book3s: Add is_hv_enabled to kvmppc_ops · 699cc876

Aneesh Kumar K.V authored Oct 07, 2013

This help us to identify whether we are running with hypervisor mode KVM
enabled. The change is needed so that we can have both HV and PR kvm
enabled in the same kernel.

If both HV and PR KVM are included, interrupts come in to the HV version
of the kvmppc_interrupt code, which then jumps to the PR handler,
renamed to kvmppc_interrupt_pr, if the guest is a PR guest.

Allowing both PR and HV in the same kernel required some changes to
kvm_dev_ioctl_check_extension(), since the values returned now can't
be selected with #ifdefs as much as previously. We look at is_hv_enabled
to return the right value when checking for capabilities.For capabilities that
are only provided by HV KVM, we return the HV value only if
is_hv_enabled is true. For capabilities provided by PR KVM but not HV,
we return the PR value only if is_hv_enabled is false.

NOTE: in later patch we replace is_hv_enabled with a static inline
function comparing kvm_ppc_ops
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

699cc876

kvm: powerpc: book3s: Cleanup interrupt handling code · dd96b2c2

Aneesh Kumar K.V authored Oct 07, 2013

With this patch if HV is included, interrupts come in to the HV version
of the kvmppc_interrupt code, which then jumps to the PR handler,
renamed to kvmppc_interrupt_pr, if the guest is a PR guest. This helps
in enabling both HV and PR, which we do in later patch
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

dd96b2c2

kvm: powerpc: Add kvmppc_ops callback · 3a167bea

Aneesh Kumar K.V authored Oct 07, 2013

This patch add a new callback kvmppc_ops. This will help us in enabling
both HV and PR KVM together in the same kernel. The actual change to
enable them together is done in the later patch in the series.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: squash in booke changes]
Signed-off-by: Alexander Graf <agraf@suse.de>

3a167bea

kvm: powerpc: book3s: Add a new config variable CONFIG_KVM_BOOK3S_HV_POSSIBLE · 9975f5e3

Aneesh Kumar K.V authored Oct 07, 2013

This help ups to select the relevant code in the kernel code
when we later move HV and PR bits as seperate modules. The patch
also makes the config options for PR KVM selectable
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

9975f5e3

kvm: powerpc: book3s: pr: Rename KVM_BOOK3S_PR to KVM_BOOK3S_PR_POSSIBLE · 7aa79938

Aneesh Kumar K.V authored Oct 07, 2013

With later patches supporting PR kvm as a kernel module, the changes
that has to be built into the main kernel binary to enable PR KVM module
is now selected via KVM_BOOK3S_PR_POSSIBLE
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

7aa79938

kvm: powerpc: book3s: move book3s_64_vio_hv.c into the main kernel binary · 066212e0

Paul Mackerras authored Oct 07, 2013

Since the code in book3s_64_vio_hv.c is called from real mode with HV
KVM, and therefore has to be built into the main kernel binary, this
makes it always built-in rather than part of the KVM module. It gets
called from the KVM module by PR KVM, so this adds an EXPORT_SYMBOL_GPL().
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

066212e0