1. 10 Apr, 2015 1 commit
  2. 08 Apr, 2015 18 commits
    • Wanpeng Li's avatar
      kvm: mmu: lazy collapse small sptes into large sptes · 3ea3b7fa
      Wanpeng Li authored
      Dirty logging tracks sptes in 4k granularity, meaning that large sptes
      have to be split.  If live migration is successful, the guest in the
      source machine will be destroyed and large sptes will be created in the
      destination. However, the guest continues to run in the source machine
      (for example if live migration fails), small sptes will remain around
      and cause bad performance.
      
      This patch introduce lazy collapsing of small sptes into large sptes.
      The rmap will be scanned in ioctl context when dirty logging is stopped,
      dropping those sptes which can be collapsed into a single large-page spte.
      Later page faults will create the large-page sptes.
      Reviewed-by: default avatarXiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
      Message-Id: <1428046825-6905-1-git-send-email-wanpeng.li@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3ea3b7fa
    • Nadav Amit's avatar
      KVM: x86: Clear CR2 on VCPU reset · 1119022c
      Nadav Amit authored
      CR2 is not cleared as it should after reset.  See Intel SDM table named "IA-32
      Processor States Following Power-up, Reset, or INIT".
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Message-Id: <1427933438-12782-5-git-send-email-namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      1119022c
    • Nadav Amit's avatar
      KVM: x86: DR0-DR3 are not clear on reset · ae561ede
      Nadav Amit authored
      DR0-DR3 are not cleared as they should during reset and when they are set from
      userspace.  It appears to be caused by c77fb5fe ("KVM: x86: Allow the guest
      to run with dirty debug registers").
      
      Force their reload on these situations.
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Message-Id: <1427933438-12782-4-git-send-email-namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      ae561ede
    • Nadav Amit's avatar
      KVM: x86: BSP in MSR_IA32_APICBASE is writable · 58d269d8
      Nadav Amit authored
      After reset, the CPU can change the BSP, which will be used upon INIT.  Reset
      should return the BSP which QEMU asked for, and therefore handled accordingly.
      
      To quote: "If the MP protocol has completed and a BSP is chosen, subsequent
      INITs (either to a specific processor or system wide) do not cause the MP
      protocol to be repeated."
      [Intel SDM 8.4.2: MP Initialization Protocol Requirements and Restrictions]
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Message-Id: <1427933438-12782-3-git-send-email-namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      58d269d8
    • Radim Krčmář's avatar
      KVM: x86: simplify kvm_apic_map · 3b5a5ffa
      Radim Krčmář authored
      recalculate_apic_map() uses two passes over all VCPUs.  This is a relic
      from time when we selected a global mode in the first pass and set up
      the optimized table in the second pass (to have a consistent mode).
      
      Recent changes made mixed mode unoptimized and we can do it in one pass.
      Format of logical MDA is a function of the mode, so we encode it in
      apic_logical_id() and drop obsoleted variables from the struct.
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-5-git-send-email-rkrcmar@redhat.com>
      [Add lid_bits temporary in apic_logical_id. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3b5a5ffa
    • Radim Krčmář's avatar
      KVM: x86: avoid logical_map when it is invalid · 3548a259
      Radim Krčmář authored
      We want to support mixed modes and the easiest solution is to avoid
      optimizing those weird and unlikely scenarios.
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-4-git-send-email-rkrcmar@redhat.com>
      [Add comment above KVM_APIC_MODE_* defines. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3548a259
    • Radim Krčmář's avatar
      KVM: x86: fix mixed APIC mode broadcast · 9ea369b0
      Radim Krčmář authored
      Broadcast allowed only one global APIC mode, but mixed modes are
      theoretically possible.  x2APIC IPI doesn't mean 0xff as broadcast,
      the rest does.
      
      x2APIC broadcasts are accepted by xAPIC.  If we take SDM to be logical,
      even addreses beginning with 0xff should be accepted, but real hardware
      disagrees.  This patch aims for simple code by considering most of real
      behavior as undefined.
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-3-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      9ea369b0
    • Radim Krčmář's avatar
      KVM: x86: use MDA for interrupt matching · 03d2249e
      Radim Krčmář authored
      In mixed modes, we musn't deliver xAPIC IPIs like x2APIC and vice versa.
      Instead of preserving the information in apic_send_ipi(), we regain it
      by converting all destinations into correct MDA in the slow path.
      This allows easier reasoning about subsequent matching.
      
      Our kvm_apic_broadcast() had an interesting design decision: it didn't
      consider IOxAPIC 0xff as broadcast in x2APIC mode ...
      everything worked because IOxAPIC can't set that in physical mode and
      logical mode considered it as a message for first 8 VCPUs.
      This patch interprets IOxAPIC 0xff as x2APIC broadcast.
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1423766494-26150-2-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      03d2249e
    • Arseny Solokha's avatar
      kvm/ppc/mpic: drop unused IRQ_testbit · 19456060
      Arseny Solokha authored
      Drop unused static procedure which doesn't have callers within its
      translation unit. It had been already removed independently in QEMU[1]
      from the OpenPIC implementation borrowed from the kernel.
      
      [1] https://lists.gnu.org/archive/html/qemu-devel/2014-06/msg01812.htmlSigned-off-by: default avatarArseny Solokha <asolokha@kb.kras.ru>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1424768706-23150-3-git-send-email-asolokha@kb.kras.ru>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      19456060
    • Eugene Korenevsky's avatar
      KVM: nVMX: remove unnecessary double caching of MAXPHYADDR · 92d71bc6
      Eugene Korenevsky authored
      After speed-up of cpuid_maxphyaddr() it can be called frequently:
      instead of heavyweight enumeration of CPUID entries it returns a cached
      pre-computed value. It is also inlined now. So caching its result became
      unnecessary and can be removed.
      Signed-off-by: default avatarEugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205644.GA1258@gnote>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      92d71bc6
    • Eugene Korenevsky's avatar
      KVM: nVMX: checks for address bits beyond MAXPHYADDR on VM-entry · 9090422f
      Eugene Korenevsky authored
      On each VM-entry CPU should check the following VMCS fields for zero bits
      beyond physical address width:
      -  APIC-access address
      -  virtual-APIC address
      -  posted-interrupt descriptor address
      This patch adds these checks required by Intel SDM.
      Signed-off-by: default avatarEugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205627.GA1244@gnote>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      9090422f
    • Eugene Korenevsky's avatar
      KVM: x86: cache maxphyaddr CPUID leaf in struct kvm_vcpu · 5a4f55cd
      Eugene Korenevsky authored
      cpuid_maxphyaddr(), which performs lot of memory accesses is called
      extensively across KVM, especially in nVMX code.
      
      This patch adds a cached value of maxphyaddr to vcpu.arch to reduce the
      pressure onto CPU cache and simplify the code of cpuid_maxphyaddr()
      callers. The cached value is initialized in kvm_arch_vcpu_init() and
      reloaded every time CPUID is updated by usermode. It is obvious that
      these reloads occur infrequently.
      Signed-off-by: default avatarEugene Korenevsky <ekorenevsky@gmail.com>
      Message-Id: <20150329205612.GA1223@gnote>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      5a4f55cd
    • Radim Krčmář's avatar
      KVM: vmx: pass error code with internal error #2 · 80f0e95d
      Radim Krčmář authored
      Exposing the on-stack error code with internal error is cheap and
      potentially useful.
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <1428001865-32280-1-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      80f0e95d
    • Radim Krčmář's avatar
      x86: vdso: fix pvclock races with task migration · 80f7fdb1
      Radim Krčmář authored
      If we were migrated right after __getcpu, but before reading the
      migration_count, we wouldn't notice that we read TSC of a different
      VCPU, nor that KVM's bug made pvti invalid, as only migration_count
      on source VCPU is increased.
      
      Change vdso instead of updating migration_count on destination.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Fixes: 0a4e6be9 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"")
      Message-Id: <1428000263-11892-1-git-send-email-rkrcmar@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      80f7fdb1
    • Paolo Bonzini's avatar
      KVM: remove kvm_read_hva and kvm_read_hva_atomic · 3180a7fc
      Paolo Bonzini authored
      The corresponding write functions just use __copy_to_user.  Do the
      same on the read side.
      
      This reverts what's left of commit 86ab8cff (KVM: introduce
      gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic, 2012-08-21)
      
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1427976500-28533-1-git-send-email-pbonzini@redhat.com>
      3180a7fc
    • Paolo Bonzini's avatar
      KVM: x86: optimize delivery of TSC deadline timer interrupt · 9c8fd1ba
      Paolo Bonzini authored
      The newly-added tracepoint shows the following results on
      the tscdeadline_latency test:
      
              qemu-kvm-8387  [002]  6425.558974: kvm_vcpu_wakeup:      poll time 10407 ns
              qemu-kvm-8387  [002]  6425.558984: kvm_vcpu_wakeup:      poll time 0 ns
              qemu-kvm-8387  [002]  6425.561242: kvm_vcpu_wakeup:      poll time 10477 ns
              qemu-kvm-8387  [002]  6425.561251: kvm_vcpu_wakeup:      poll time 0 ns
      
      and so on.  This is because we need to go through kvm_vcpu_block again
      after the timer IRQ is injected.  Avoid it by polling once before
      entering kvm_vcpu_block.
      
      On my machine (Xeon E5 Sandy Bridge) this removes about 500 cycles (7%)
      from the latency of the TSC deadline timer.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      9c8fd1ba
    • Paolo Bonzini's avatar
      KVM: x86: extract blocking logic from __vcpu_run · 362c698f
      Paolo Bonzini authored
      Rename the old __vcpu_run to vcpu_run, and extract part of it to a new
      function vcpu_block.
      
      The next patch will add a new condition in vcpu_block, avoid extra
      indentation.
      Reviewed-by: default avatarDavid Matlack <dmatlack@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      362c698f
    • Wanpeng Li's avatar
      kvm: x86: fix x86 eflags fixed bit · 35fd68a3
      Wanpeng Li authored
      Guest can't be booted w/ ept=0, there is a message dumped as below:
      
      If you're running a guest on an Intel machine without unrestricted mode
      support, the failure can be most likely due to the guest entering an invalid
      state for Intel VT. For example, the guest maybe running in big real mode
      which is not supported on less recent Intel processors.
      
      EAX=00000011 EBX=f000d2f6 ECX=00006cac EDX=000f8956
      ESI=bffbdf62 EDI=00000000 EBP=00006c68 ESP=00006c68
      EIP=0000d187 EFL=00000004 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
      ES =e000 000e0000 ffffffff 00809300 DPL=0 DS16 [-WA]
      CS =f000 000f0000 ffffffff 00809b00 DPL=0 CS16 [-RA]
      SS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      DS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      FS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      GS =0000 00000000 ffffffff 00809300 DPL=0 DS16 [-WA]
      LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
      TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
      GDT=     000f6a80 00000037
      IDT=     000f6abe 00000000
      CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
      DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
      DR6=00000000ffff0ff0 DR7=0000000000000400
      EFER=0000000000000000
      Code=01 1e b8 6a 2e 0f 01 16 74 6a 0f 20 c0 66 83 c8 01 0f 22 c0 <66> ea 8f d1 0f 00 08 00 b8 10 00 00 00 8e d8 8e c0 8e d0 8e e0 8e e8 89 c8 ff e2 89 c1 b8X
      
      X86 eflags bit 1 is fixed set, which means that 1 << 1 is set instead of 1,
      this patch fix it.
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
      Message-Id: <1428473294-6633-1-git-send-email-wanpeng.li@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      35fd68a3
  3. 07 Apr, 2015 3 commits
  4. 31 Mar, 2015 9 commits
  5. 30 Mar, 2015 9 commits