1. 10 Sep, 2019 2 commits
    • KVM: x86: Manually calculate reserved bits when loading PDPTRS · 16cfacc8
      Sean Christopherson authored
      Manually generate the PDPTR reserved bit mask when explicitly loading
      PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
      current paging mode, which is unlikely to be PAE paging in the vast
      majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
      __set_sregs(), etc...  This can cause KVM to incorrectly signal a bad
      PDPTR, or more likely, miss a reserved bit check and subsequently fail
      a VM-Enter due to a bad VMCS.GUEST_PDPTR.
      
      Add a one-off helper to generate the reserved bits instead of sharing
      code across the MMU's calculations and the PDPTR emulation.  The PDPTR
      reserved bits are basically set in stone, and pushing a helper into
      the MMU's calculation adds unnecessary complexity without improving
      readability.
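
      As a rough, self-contained sketch of what such a one-off helper computes
      (rsvd_bits() mirrors the kernel helper of the same name, and maxphyaddr
      stands in for cpuid_maxphyaddr()):

      	#include <stdint.h>

      	/* Build a mask with bits lo..hi (inclusive) set. */
      	static uint64_t rsvd_bits(int lo, int hi)
      	{
      		return ((2ULL << (hi - lo)) - 1) << lo;
      	}

      	/*
      	 * PAE PDPTE reserved bits: bits 2:1, bits 8:5, and everything
      	 * from MAXPHYADDR up through bit 63 are reserved regardless of
      	 * the paging mode currently tracked by the MMU.
      	 */
      	static uint64_t pdptr_rsvd_bits(int maxphyaddr)
      	{
      		return rsvd_bits(maxphyaddr, 63) | rsvd_bits(5, 8) |
      		       rsvd_bits(1, 2);
      	}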
      
      Opportunistically fix/update the comment for load_pdptrs().
      
      Note, the buggy commit also introduced a deliberate functional change,
      "Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
      effectively (and correctly) reverted by commit cd9ae5fe ("KVM: x86:
      Fix page-tables reserved bits").  A bit of SDM archaeology shows that
      the SDM from late 2008 had a bug (likely a copy+paste error) where it
      listed bits 6:5 as AVL and A for PDPTEs used for 4KB entries but reserved
      for 2MB entries.  I.e. the SDM contradicted itself, and bits 6:5 are, and
      always have been, reserved.
      
      Fixes: 20c466b5 ("KVM: Use rsvd_bits_mask in load_pdptrs()")
      Cc: stable@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Reported-by: Doug Reiland <doug.reiland@intel.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Disable posted interrupts for non-standard IRQ delivery modes · fdcf7562
      Alexander Graf authored
      We can easily route hardware interrupts directly into VM context when
      they target the "Fixed" or "LowPriority" delivery modes.
      
      However, for delivery modes such as "SMI" or "INIT", we need to go via
      KVM code to actually put the vCPU into a different mode of operation,
      so we cannot post the interrupt.
      
      Add code in the VMX and SVM PI logic to explicitly refuse to establish
      posted mappings for advanced IRQ delivery modes. This reflects the logic
      in __apic_accept_irq() which also only ever passes Fixed and LowPriority
      interrupts as posted interrupts into the guest.
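
      The check amounts to a predicate along these lines (a simplified sketch;
      the delivery-mode encodings are the standard local APIC ones, and struct
      lapic_irq is pared down from the kernel's kvm_lapic_irq):

      	#include <stdbool.h>
      	#include <stdint.h>

      	#define APIC_DM_FIXED	0x000
      	#define APIC_DM_LOWEST	0x100

      	struct lapic_irq {
      		uint32_t delivery_mode;
      	};

      	/*
      	 * Only Fixed and LowPriority interrupts may be posted directly
      	 * into the guest; SMI, NMI, INIT and friends must go through
      	 * KVM so the vCPU can be put into the right mode of operation.
      	 */
      	static bool irq_is_postable(const struct lapic_irq *irq)
      	{
      		return irq->delivery_mode == APIC_DM_FIXED ||
      		       irq->delivery_mode == APIC_DM_LOWEST;
      	}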
      
      This fixes a bug I hit with code that configures real hardware to
      inject virtual SMIs into my guest.
      Signed-off-by: Alexander Graf <graf@amazon.com>
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 14 Aug, 2019 4 commits
    • KVM: x86: svm: remove redundant assignment of var new_entry · c8e174b3
      Miaohe Lin authored
      new_entry is reassigned a new value on the next line, so the initial
      assignment is redundant; remove it.
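
      The pattern being removed looks roughly like this (an illustrative
      sketch, not the verbatim kernel code; entry, page and the mask are
      stand-ins):

      	/* Dead store: the value read here is never used... */
      	new_entry = READ_ONCE(*entry);
      	/* ...because it is unconditionally overwritten right away. */
      	new_entry = __sme_set(page_to_phys(page) | ENTRY_VALID_MASK);
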
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MAINTAINERS: add KVM x86 reviewers · ed4e7b05
      Paolo Bonzini authored
      This is probably overdue---KVM x86 has quite a few contributors that
      usually review each other's patches, which is really helpful to me.
      Formalize this by listing them as reviewers.  I am including people
      with various expertise:
      
      - Joerg for SVM (with designated reviewers, it makes more sense to have
      him in the main KVM/x86 stanza)
      
      - Sean for MMU and VMX
      
      - Jim for VMX
      
      - Vitaly for Hyper-V and possibly SVM
      
      - Wanpeng for LAPIC and paravirtualization.
      
      Please ack if you are okay with this arrangement, otherwise speak up.
      
      In other news, Radim is going to leave Red Hat soon.  However, he has
      not been very much involved in upstream KVM development for some time,
      and in the immediate future he is still going to help maintain kvm/queue
      while I am on vacation.  Since not much is going to change, I will let
      him decide whether he wants to keep the maintainer role after he leaves.
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: Wanpeng Li <wanpengli@tencent.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MAINTAINERS: change list for KVM/s390 · 74260dc2
      Paolo Bonzini authored
      KVM/s390 does not have a list of its own, and linux-s390 is in the
      loop anyway thanks to the generic arch/s390 match.  So use the generic
      KVM list for s390 patches.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: skip populating logical dest map if apic is not sw enabled · b14c876b
      Radim Krcmar authored
      recalculate_apic_map() does not sanitize the LDR, and it's possible that
      multiple bits are set. In that case, a previously valid entry
      can be overwritten by an invalid one.
      
      This condition is hit when booting a 32-bit, >8-CPU RHEL6 guest and then
      triggering a crash to boot a kdump kernel. This is the sequence of
      events (a sketch of the fix follows the list):
      1. Linux boots in bigsmp mode and enables PhysFlat; however, it still
      writes to the LDR, which will probably never be used.
      2. When booting into kdump, the stale LDR values remain, as they are
      not cleared by the guest and there isn't an APIC reset.
      3. kdump boots with 1 CPU and uses Logical Destination Mode, but the
      logical map has been overwritten and points to an inactive vCPU.
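
      A minimal, self-contained sketch of the fix, with simplified stand-ins
      for the kernel's recalculate_apic_map() internals:

      	#include <stdbool.h>
      	#include <stdint.h>

      	struct apic {
      		bool	 sw_enabled;	/* software-enable bit in SPIV */
      		uint32_t ldr;		/* logical destination register */
      	};

      	/* Flat xAPIC model: one logical-map slot per bit of LDR[31:24]. */
      	static void populate_logical_map(struct apic *log_map[8],
      					 struct apic *apic)
      	{
      		uint32_t lid = apic->ldr >> 24;

      		/*
      		 * The fix: a stale LDR on a sw-disabled APIC must never
      		 * overwrite a valid entry, so skip it entirely.
      		 */
      		if (!apic->sw_enabled)
      			return;

      		for (int bit = 0; bit < 8; bit++)
      			if (lid & (1u << bit))
      				log_map[bit] = apic;
      	}
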
      Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: Bandan Das <bsd@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 05 Aug, 2019 2 commits
    • KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block · 5eeaf10e
      Marc Zyngier authored
      Since commit 328e5664 ("KVM: arm/arm64: vgic: Defer
      touching GICH_VMCR to vcpu_load/put"), we leave ICH_VMCR_EL2 (or
      its GICv2 equivalent) loaded as long as we can, only syncing it
      back when we're scheduled out.
      
      There is a small snag with that though: kvm_vgic_vcpu_pending_irq(),
      which is indirectly called from kvm_vcpu_check_block(), needs to
      evaluate the guest's view of ICC_PMR_EL1. At the point where we
      call kvm_vcpu_check_block(), the vcpu is still loaded, and any
      changes to PMR are not visible in memory until we do a vcpu_put().
      
      Things go really south if the guest does the following:
      
      	mov x0, #0	// or any small value masking interrupts
      	msr ICC_PMR_EL1, x0
      
      	[vcpu preempted, then rescheduled, VMCR sampled]
      
      	mov x0, #0xff	// allow all interrupts
      	msr ICC_PMR_EL1, x0
      	wfi		// traps to EL2, so VMCR is sampled
      
      	[interrupt arrives just after WFI]
      
      Here, the hypervisor's view of PMR is zero, while the guest has enabled
      its interrupts. kvm_vgic_vcpu_pending_irq() will then say that no
      interrupts are pending (despite an interrupt being received) and we'll
      block for no reason. If the guest doesn't have a periodic interrupt
      firing once it has blocked, it will stay there forever.
      
      To avoid this unfortunate situation, let's resync VMCR from
      kvm_arch_vcpu_blocking(), ensuring that a following kvm_vcpu_check_block()
      will observe the latest value of PMR.
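
      In code terms, the fix is essentially the following hook (a sketch;
      kvm_vgic_vmcr_sync() stands for the helper that writes the live
      GICv2/GICv3 VMCR back to its in-memory copy):

      	void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
      	{
      		/*
      		 * About to block, most likely on WFI: flush the live
      		 * ICH_VMCR_EL2 (or GICH_VMCR) state back to memory so
      		 * that kvm_vgic_vcpu_pending_irq() sees the guest's
      		 * current PMR instead of a stale copy.
      		 */
      		preempt_disable();
      		kvm_vgic_vmcr_sync(vcpu);
      		preempt_enable();
      	}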
      
      This has been found by booting an arm64 Linux guest with the pseudo NMI
      feature, and thus using interrupt priorities to mask interrupts instead
      of the usual PSTATE masking.
      
      Cc: stable@vger.kernel.org # 4.12
      Fixes: 328e5664 ("KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put")
      Signed-off-by: Marc Zyngier <maz@kernel.org>
    • x86: kvm: remove useless calls to kvm_para_available · 57b76bdb
      Paolo Bonzini authored
      Most code in arch/x86/kernel/kvm.c is called through x86_hyper_kvm, and thus only
      runs if KVM has been detected.  There is no need to check again for the CPUID
      base.
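
      For context, kvm_para_available() boils down to probing the hypervisor
      CPUID range for KVM's signature; a user-space approximation (x86-only,
      GCC/Clang inline asm, checking just the base leaf):

      	#include <stdbool.h>
      	#include <stdint.h>
      	#include <string.h>

      	static void cpuid(uint32_t leaf, uint32_t *a, uint32_t *b,
      			  uint32_t *c, uint32_t *d)
      	{
      		__asm__ volatile("cpuid"
      				 : "=a"(*a), "=b"(*b), "=c"(*c), "=d"(*d)
      				 : "a"(leaf), "c"(0));
      	}

      	/* KVM puts "KVMKVMKVM\0\0\0" in EBX/ECX/EDX of leaf 0x40000000. */
      	static bool kvm_signature_present(void)
      	{
      		uint32_t a, b, c, d;
      		char sig[13] = { 0 };

      		cpuid(0x40000000, &a, &b, &c, &d);
      		memcpy(sig + 0, &b, 4);
      		memcpy(sig + 4, &c, 4);
      		memcpy(sig + 8, &d, 4);
      		return strcmp(sig, "KVMKVMKVM") == 0;
      	}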
      
      Cc: Sergio Lopez <slp@redhat.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>