1. 10 Jun, 2022 1 commit
  2. 09 Jun, 2022 31 commits
  3. 08 Jun, 2022 8 commits
    • KVM: x86: PIT: Preserve state of speaker port data bit · b1728622
      Paul Durrant authored
      Currently the state of the speaker port (0x61) data bit (bit 1) is not
      saved in the exported state (kvm_pit_state2) and hence is lost when
      re-constructing guest state.
      
      This patch removes the 'speaker_data_port' field from kvm_kpit_state and
      instead tracks the state using a new KVM_PIT_FLAGS_SPEAKER_DATA_ON flag
      defined in the API.
      Signed-off-by: Paul Durrant <pdurrant@amazon.com>
      Message-Id: <20220531124421.1427-1-pdurrant@amazon.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
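The flag-based tracking described in this commit can be pictured with a minimal user-space sketch. Everything prefixed `sketch_` or `SKETCH_` is hypothetical: the struct is not the real `kvm_pit_state2` layout, and the bit position is picked for illustration rather than taken from the KVM API headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration only. */
#define SKETCH_PIT_FLAGS_SPEAKER_DATA_ON (1u << 1)

/* Hypothetical stand-in for the exported PIT state; only the flags word is modelled. */
struct sketch_pit_state {
    uint32_t flags;
};

/* Record the speaker data bit in the flags word instead of a separate field. */
static void sketch_set_speaker_data(struct sketch_pit_state *s, bool on)
{
    if (on)
        s->flags |= SKETCH_PIT_FLAGS_SPEAKER_DATA_ON;
    else
        s->flags &= ~SKETCH_PIT_FLAGS_SPEAKER_DATA_ON;
}

/* Read the speaker data bit back from the flags word. */
static bool sketch_speaker_data_on(const struct sketch_pit_state *s)
{
    return (s->flags & SKETCH_PIT_FLAGS_SPEAKER_DATA_ON) != 0;
}
```

Because the bit lives in the exported flags word, copying the state (as a save/restore cycle would) carries the speaker data bit along with it.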
    • KVM: VMX: Reject kvm_intel if an inconsistent VMCS config is detected · 3dbec44d
      Sean Christopherson authored
      Add an on-by-default module param, error_on_inconsistent_vmcs_config, to
      allow rejecting the load of kvm_intel if an inconsistent VMCS config is
      detected.  Continuing on with an inconsistent, degraded config is
      undesirable in the vast majority of use cases, e.g. may result in a
      misconfigured VM, poor performance due to lack of fast MSR switching, or
      even security issues in the unlikely event the guest is relying on MPX.
      
      Practically speaking, an inconsistent VMCS config should never be
      encountered in a production quality environment, e.g. on bare metal it
      indicates a silicon defect (or a disturbing lack of validation by the
      hardware vendor), and in a virtualized machine (KVM as L1) it indicates a
      buggy/misconfigured L0 VMM/hypervisor.
      
      Provide a module param to override the behavior for testing purposes, or
      in the unlikely scenario that KVM is deployed on a flawed-but-usable CPU
      or virtual machine.
      
      Note, what is or isn't an inconsistency is somewhat subjective, e.g. one
      might argue that LOAD_EFER without SAVE_EFER is an inconsistency.  KVM's
      unofficial guideline for an "inconsistency" is either scenarios that are
      completely nonsensical, e.g. the existing checks on having EPT/VPID knobs
      without EPT/VPID, and/or scenarios that prevent KVM from virtualizing or
      utilizing a feature, e.g. the unpaired entry/exit controls checks.  Other
      checks that fall into one or both of the covered scenarios could be added
      in the future, e.g. asserting that a VMCS control is available if and
      only if the associated feature is supported on bare metal.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220527170658.3571367-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
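The reject-or-degrade policy described in this commit might look roughly like the sketch below, using the "EPT knobs without EPT" case from the message as the example inconsistency. The variable reuses the parameter name from the commit, but it is a plain global here, and `struct sketch_caps` and `sketch_check_config()` are hypothetical names that do not reflect KVM's actual data structures.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the module parameter; on by default, as the commit describes. */
static bool error_on_inconsistent_vmcs_config = true;

/* Hypothetical feature summary, not KVM's vmcs_config layout. */
struct sketch_caps {
    bool has_ept;
    bool has_ept_knobs; /* EPT-related capability bits were reported */
};

/*
 * Returns -EIO when an inconsistency is found and strict mode is on.
 * Otherwise the nonsensical knobs are dropped and setup continues.
 */
static int sketch_check_config(struct sketch_caps *caps)
{
    if (!caps->has_ept && caps->has_ept_knobs) {
        fprintf(stderr, "sketch: EPT knobs reported without EPT\n");
        if (error_on_inconsistent_vmcs_config)
            return -EIO;
        caps->has_ept_knobs = false; /* degrade instead of rejecting */
    }
    return 0;
}
```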
    • KVM: VMX: Sanitize VM-Entry/VM-Exit control pairs at kvm_intel load time · f5a81d0e
      Sean Christopherson authored
      Sanitize the VM-Entry/VM-Exit control pairs (load+load or load+clear)
      during setup instead of checking both controls in a pair at runtime.  If
      only one control is supported, KVM will report the associated feature as
      not available, but will leave the supported control bit set in the VMCS
      config, which could lead to corruption of host state.  E.g. if only the
      VM-Entry control is supported and the feature is not dynamically toggled,
      KVM will set the control in all VMCSes and load zeros without restoring
      host state.
      
      Note, while this is technically a bug fix, practically speaking no sane
      CPU or VMM would support only one control.  KVM's behavior of checking
      both controls is mostly pedantry.
      
      Cc: Chenyi Qiang <chenyi.qiang@intel.com>
      Cc: Lei Wang <lei4.wang@intel.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220527170658.3571367-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
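One way to picture the load-time sanitization is a table of paired controls that is walked once during setup, clearing both bits of any pair that is only half supported. This is a sketch under assumptions: the pair table, the bit values, and the `sketch_` names are invented for illustration and are not the real VMCS control encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of a VM-Entry control with its VM-Exit counterpart. */
struct sketch_ctrl_pair {
    uint32_t entry_control;
    uint32_t exit_control;
};

/* Illustrative bit values only. */
static const struct sketch_ctrl_pair sketch_pairs[] = {
    { 1u << 0, 1u << 0 },
    { 1u << 1, 1u << 2 },
};

/*
 * Clear both controls of every pair where exactly one side is supported.
 * Returns the number of pairs that were dropped.
 */
static unsigned int sketch_sanitize_pairs(uint32_t *entry_ctrls,
                                          uint32_t *exit_ctrls)
{
    unsigned int dropped = 0;
    size_t i;

    for (i = 0; i < sizeof(sketch_pairs) / sizeof(sketch_pairs[0]); i++) {
        uint32_t n_ctrl = sketch_pairs[i].entry_control;
        uint32_t x_ctrl = sketch_pairs[i].exit_control;

        /* Both set or both clear: the pair is consistent. */
        if (!(*entry_ctrls & n_ctrl) == !(*exit_ctrls & x_ctrl))
            continue;

        *entry_ctrls &= ~n_ctrl;
        *exit_ctrls &= ~x_ctrl;
        dropped++;
    }
    return dropped;
}
```

With the config sanitized once at setup, later code can test a single control of a pair rather than both.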
    • KVM: x86/pmu: Accept 0 for absent PMU MSRs when host-initiated if !enable_pmu · 8e6a58e2
      Like Xu authored
      Whenever an MSR is part of KVM_GET_MSR_INDEX_LIST, as is the case for
      MSR_K7_EVNTSEL0 or MSR_F15H_PERF_CTL0, it must always be retrievable
      and settable with KVM_GET_MSRS and KVM_SET_MSRS.
      
      Accept a zero value for these MSRs to obey the contract.
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220601031925.59693-1-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
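The contract described in this commit can be sketched as a pair of accessors for an MSR that is listed but has no backing PMU. The function names and the "0 on success, 1 on rejection" return convention are assumptions made for this illustration, not KVM's actual handlers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical handler for a write to a listed PMU MSR when the vPMU is
 * disabled. Only a host-initiated write of zero is accepted.
 */
static int sketch_set_absent_pmu_msr(bool host_initiated, uint64_t data)
{
    if (host_initiated && data == 0)
        return 0;
    return 1;
}

/* Host-initiated reads succeed and report zero; guest reads are rejected. */
static int sketch_get_absent_pmu_msr(bool host_initiated, uint64_t *data)
{
    if (!host_initiated)
        return 1;
    *data = 0;
    return 0;
}
```

This keeps a userspace save/restore loop over the MSR index list working even when the PMU is disabled, while the guest still cannot touch the absent MSRs.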
    • KVM: x86/pmu: Restrict advanced features based on module enable_pmu · 6ef25aa0
      Like Xu authored
      Once the vPMU is disabled, KVM does not expose features such as
      PEBS (by clearing kvm_pmu_cap.pebs_ept), legacy LBR and ARCH_LBR,
      the CPUID 0xA leaf, the PDCM bit and MSR_IA32_PERF_CAPABILITIES, or
      PT_MODE_HOST_GUEST mode.
      
      What these features have in common is that they rely on the
      underlying PMU counters, use the host perf_event subsystem as a
      back-end resource requester, or share part of the IRQ delivery path.
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220601031925.59693-2-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/pmu: Avoid exposing Intel BTS feature · b9181c8e
      Like Xu authored
      The BTS feature (including the ability to set the BTS and BTINT
      bits in the DEBUGCTL MSR) is currently unsupported on KVM.
      
      However, a PEBS-enabled guest may still try to use the BTS facility, e.g.:
          perf record -e branches:u -c 1 -d ls
      which results in the following call trace:
      
       [] unchecked MSR access error: WRMSR to 0x1d9 (tried to write 0x00000000000003c0)
              at rIP: 0xffffffff810745e4 (native_write_msr+0x4/0x20)
       [] Call Trace:
       []  intel_pmu_enable_bts+0x5d/0x70
       []  bts_event_add+0x54/0x70
       []  event_sched_in+0xee/0x290
      
      Because there is no CPUID indicator or perf_capabilities bit field
      to advertise BTS support, tell the guest that the Intel BTS feature
      is unavailable by setting the BTS_UNAVAIL bit in IA32_MISC_ENABLE.
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220601031925.59693-3-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
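The hint described in this commit amounts to forcing one bit in the guest-visible IA32_MISC_ENABLE value. The sketch below assumes the "BTS unavailable" bit is bit 11, as documented in the Intel SDM; the macro and function names are hypothetical and are not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the "BTS unavailable" bit in IA32_MISC_ENABLE. */
#define SKETCH_MISC_ENABLE_BTS_UNAVAIL (1ULL << 11)

/* Guest-visible value with BTS always marked unavailable. */
static uint64_t sketch_hide_bts(uint64_t misc_enable)
{
    return misc_enable | SKETCH_MISC_ENABLE_BTS_UNAVAIL;
}

/* What a guest-side driver might check before touching the BTS bits. */
static bool sketch_guest_may_use_bts(uint64_t misc_enable)
{
    return (misc_enable & SKETCH_MISC_ENABLE_BTS_UNAVAIL) == 0;
}
```

A guest that honours the bit skips BTS setup entirely and therefore does not attempt the DEBUGCTL write shown in the trace.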
    • KVM: x86/pmu: Update global enable_pmu when PMU is undetected · d7808f73
      Like Xu authored
      On some virt platforms (L1 guest w/o PMU), the value of module parameter
      'enable_pmu' for nested L2 guests should be updated at initialisation.
      
      Because AMD and Hygon have no concept of an "architectural PMU" and
      report version 0 (prior to Zen 4), while the theoretically available
      counters number at least AMD64_NUM_COUNTERS, the check_hw_exists()
      utility is reused in the initialisation call path.
      
      Opportunistically update Intel specific comments.
      
      Fixes: 8eeac7e999e8 ("KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220518170118.66263-3-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86: events: Do not return bogus capabilities if PMU is broken · 916e3a4f
      Paolo Bonzini authored
      If the PMU is broken due to firmware issues, check_hw_exists() will return
      false but perf_get_x86_pmu_capability() will still return data from x86_pmu.
      Likewise, if some of the hotplug callbacks cannot be installed, the contents
      of x86_pmu will not be reverted.
      
      Handle the failure in both cases by clearing x86_pmu if init_hw_perf_events()
      fails or reverts to software events only.
      Co-developed-by: Like Xu <likexu@tencent.com>
      Signed-off-by: Like Xu <likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
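The idea of clearing the PMU description on failure, so that a later capability query cannot return stale data, can be sketched as follows. The struct, its fields, the probed values, and every `sketch_` name are hypothetical and stand in for the kernel's `x86_pmu` and `perf_get_x86_pmu_capability()` without mirroring them.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily reduced PMU description. */
struct sketch_pmu {
    int version;
    int num_counters;
};

static struct sketch_pmu sketch_pmu_state;

/* Capability query: reports whatever the global description holds. */
static void sketch_get_capability(struct sketch_pmu *cap)
{
    *cap = sketch_pmu_state;
}

/*
 * Initialisation: fill in the description, then wipe it again if the
 * hardware check fails so no bogus capabilities are reported later.
 */
static int sketch_init_pmu(bool hw_exists)
{
    sketch_pmu_state.version = 2;      /* illustrative probe results */
    sketch_pmu_state.num_counters = 4;

    if (!hw_exists) {
        memset(&sketch_pmu_state, 0, sizeof(sketch_pmu_state));
        return -1;
    }
    return 0;
}
```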