Commits · d04c77d231223563405e8874fa7edfdc65e545fe · Kirill Smelkov / linux

26 Jul, 2024 5 commits

KVM: guest_memfd: delay folio_mark_uptodate() until after successful preparation · d04c77d2

Paolo Bonzini authored Jul 11, 2024

The up-to-date flag as is now is not too useful; it tells guest_memfd not
to overwrite the contents of a folio, but it doesn't say that the page
is ready to be mapped into the guest. For encrypted guests, mapping
a private page requires that the "preparation" phase has succeeded,
and at the same time the same page cannot be prepared twice.

So, ensure that folio_mark_uptodate() is only called on a prepared page. If
kvm_gmem_prepare_folio() or the post_populate callback fail, the folio
will not be marked up-to-date; it's not a problem to call clear_highpage()
again on such a page prior to the next preparation attempt.
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d04c77d2

KVM: guest_memfd: return folio from __kvm_gmem_get_pfn() · d0d87226

Paolo Bonzini authored Jul 11, 2024

Right now this is simply more consistent and avoids use of pfn_to_page()
and put_page().  It will be put to more use in upcoming patches, to
ensure that the up-to-date flag is set at the very end of both the
kvm_gmem_get_pfn() and kvm_gmem_populate() flows.
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d0d87226

KVM: x86: disallow pre-fault for SNP VMs before initialization · 5932ca41

Paolo Bonzini authored Jul 17, 2024

KVM_PRE_FAULT_MEMORY for an SNP guest can race with
sev_gmem_post_populate() in bad ways. The following sequence for
instance can potentially trigger an RMP fault:

  thread A, sev_gmem_post_populate: called
  thread B, sev_gmem_prepare: places below 'pfn' in a private state in RMP
  thread A, sev_gmem_post_populate: *vaddr = kmap_local_pfn(pfn + i);
  thread A, sev_gmem_post_populate: copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
  RMP #PF

Fix this by only allowing KVM_PRE_FAULT_MEMORY to run after a guest's
initial private memory contents have been finalized via
KVM_SEV_SNP_LAUNCH_FINISH.

Beyond fixing this issue, it just sort of makes sense to enforce this,
since the KVM_PRE_FAULT_MEMORY documentation states:

  "KVM maps memory as if the vCPU generated a stage-2 read page fault"

which sort of implies we should be acting on the same guest state that a
vCPU would see post-launch after the initial guest memory is all set up.
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5932ca41

KVM: Documentation: Fix title underline too short warning · c2adcf05

Chang Yu authored Jul 23, 2024

Fix "WARNING: Title underline too short" by extending title line to the
proper length.
Signed-off-by: Chang Yu <marcus.yu.56@gmail.com>
Message-ID: <ZqB3lofbzMQh5Q-5@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c2adcf05

KVM: x86: Eliminate log spam from limited APIC timer periods · 0005ca20

Jim Mattson authored Jul 24, 2024

SAP's vSMP MemoryONE continuously requests a local APIC timer period
less than 500 us, resulting in the following kernel log spam:

  kvm: vcpu 15: requested 70240 ns lapic timer period limited to 500000 ns
  kvm: vcpu 19: requested 52848 ns lapic timer period limited to 500000 ns
  kvm: vcpu 15: requested 70256 ns lapic timer period limited to 500000 ns
  kvm: vcpu 9: requested 70256 ns lapic timer period limited to 500000 ns
  kvm: vcpu 9: requested 70208 ns lapic timer period limited to 500000 ns
  kvm: vcpu 9: requested 387520 ns lapic timer period limited to 500000 ns
  kvm: vcpu 9: requested 70160 ns lapic timer period limited to 500000 ns
  kvm: vcpu 66: requested 205744 ns lapic timer period limited to 500000 ns
  kvm: vcpu 9: requested 70224 ns lapic timer period limited to 500000 ns
  kvm: vcpu 9: requested 70256 ns lapic timer period limited to 500000 ns
  limit_periodic_timer_frequency: 7569 callbacks suppressed
  ...

To eliminate this spam, change the pr_info_ratelimited() in
limit_periodic_timer_frequency() to pr_info_once().
Reported-by: James Houghton <jthoughton@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-ID: <20240724190640.2449291-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

0005ca20

17 Jul, 2024 1 commit

crypto: ccp: Add the SNP_VLEK_LOAD command · 332d2c1d

Michael Roth authored May 01, 2024

When requesting an attestation report a guest is able to specify whether
it wants SNP firmware to sign the report using either a Versioned Chip
Endorsement Key (VCEK), which is derived from chip-unique secrets, or a
Versioned Loaded Endorsement Key (VLEK) which is obtained from an AMD
Key Derivation Service (KDS) and derived from seeds allocated to
enrolled cloud service providers (CSPs).

For VLEK keys, an SNP_VLEK_LOAD SNP firmware command is used to load
them into the system after obtaining them from the KDS. Add a
corresponding userspace interface so to allow the loading of VLEK keys
into the system.

See SEV-SNP Firmware ABI 1.54, SNP_VLEK_LOAD for more details.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240501085210.2213060-21-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

332d2c1d

16 Jul, 2024 20 commits

KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops · 5d766508

Wei Wang authored May 07, 2024

Similar to kvm_x86_call(), kvm_pmu_call() is added to streamline the usage
of static calls of kvm_pmu_ops, which improves code readability.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240507133103.15052-4-wei.w.wang@intel.comSigned-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5d766508

KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops · 89604647

Wei Wang authored May 07, 2024

Introduces kvm_x86_call(), to streamline the usage of static calls of
kvm_x86_ops. The current implementation of these calls is verbose and
could lead to alignment challenges. This makes the code susceptible to
exceeding the "80 columns per single line of code" limit as defined in
the coding-style document. Another issue with the existing implementation
is that the addition of kvm_x86_ prefix to hooks at the static_call sites
hinders code readability and navigation. kvm_x86_call() is added to
improve code readability and maintainability, while adhering to the coding
style guidelines.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240507133103.15052-3-wei.w.wang@intel.comSigned-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

89604647

KVM: x86: Replace static_call_cond() with static_call() · f4854bf7

Wei Wang authored May 07, 2024

The use of static_call_cond() is essentially the same as static_call() on
x86 (e.g. static_call() now handles a NULL pointer as a NOP), so replace
it with static_call() to simplify the code.

Link: https://lore.kernel.org/all/3916caa1dcd114301a49beafa5030eca396745c1.1679456900.git.jpoimboe@kernel.org/Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240507133103.15052-2-wei.w.wang@intel.comSigned-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f4854bf7

Merge branch 'kvm-6.11-sev-attestation' into HEAD · bc9cd5a2

Paolo Bonzini authored Jul 16, 2024

The GHCB 2.0 specification defines 2 GHCB request types to allow SNP guests
to send encrypted messages/requests to firmware: SNP Guest Requests and SNP
Extended Guest Requests. These encrypted messages are used for things like
servicing attestation requests issued by the guest. Implementing support for
these is required to be fully GHCB-compliant.

For the most part, KVM only needs to handle forwarding these requests to
firmware (to be issued via the SNP_GUEST_REQUEST firmware command defined
in the SEV-SNP Firmware ABI), and then forwarding the encrypted response to
the guest.

However, in the case of SNP Extended Guest Requests, the host is also
able to provide the certificate data corresponding to the endorsement key
used by firmware to sign attestation report requests. This certificate data
is provided by userspace because:

1) It allows for different keys/key types to be used for each particular
guest with requiring any sort of KVM API to configure the certificate
table in advance on a per-guest basis.

2) It provides additional flexibility with how attestation requests might
be handled during live migration where the certificate data for
source/dest might be different.

3) It allows all synchronization between certificates and firmware/signing
key updates to be handled purely by userspace rather than requiring
some in-kernel mechanism to facilitate it. [1]

To support fetching certificate data from userspace, a new KVM exit type will
be needed to handle fetching the certificate from userspace. An attempt to
define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle this
was introduced in v1 of this patchset, but is still being discussed by
community, so for now this patchset only implements a stub version of SNP
Extended Guest Requests that does not provide certificate data, but is still
enough to provide compliance with the GHCB 2.0 spec.

bc9cd5a2

KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event · 74458e48

Michael Roth authored Jul 01, 2024

Version 2 of GHCB specification added support for the SNP Extended Guest
Request Message NAE event. This event serves a nearly identical purpose
to the previously-added SNP_GUEST_REQUEST event, but for certain message
types it allows the guest to supply a buffer to be used for additional
information in some cases.

Currently the GHCB spec only defines extended handling of this sort in
the case of attestation requests, where the additional buffer is used to
supply a table of certificate data corresponding to the attestion
report's signing key. Support for this extended handling will require
additional KVM APIs to handle coordinating with userspace.

Whether or not the hypervisor opts to provide this certificate data is
optional. However, support for processing SNP_EXTENDED_GUEST_REQUEST
GHCB requests is required by the GHCB 2.0 specification for SNP guests,
so for now implement a stub implementation that provides an empty
certificate table to the guest if it supplies an additional buffer, but
otherwise behaves identically to SNP_GUEST_REQUEST.
Reviewed-by: Carlos Bilbao <carlos.bilbao.osdev@gmail.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240701223148.3798365-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

74458e48

x86/sev: Move sev_guest.h into common SEV header · f55f3c3a

Michael Roth authored Jul 01, 2024

sev_guest.h currently contains various definitions relating to the
format of SNP_GUEST_REQUEST commands to SNP firmware. Currently only the
sev-guest driver makes use of them, but when the KVM side of this is
implemented there's a need to parse the SNP_GUEST_REQUEST header to
determine whether additional information needs to be provided to the
guest. Prepare for this by moving those definitions to a common header
that's shared by host/guest code so that KVM can also make use of them.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240701223148.3798365-3-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f55f3c3a

KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event · 88caf544

Brijesh Singh authored Jul 01, 2024

Version 2 of GHCB specification added support for the SNP Guest Request
Message NAE event. The event allows for an SEV-SNP guest to make
requests to the SEV-SNP firmware through the hypervisor using the
SNP_GUEST_REQUEST API defined in the SEV-SNP firmware specification.

This is used by guests primarily to request attestation reports from
firmware. There are other request types are available as well, but the
specifics of what guest requests are being made generally does not
affect how they are handled by the hypervisor, which only serves as a
proxy for the guest requests and firmware responses.

Implement handling for these events.

When an SNP Guest Request is issued, the guest will provide its own
request/response pages, which could in theory be passed along directly
to firmware. However, these pages would need special care:

  - Both pages are from shared guest memory, so they need to be
    protected from migration/etc. occurring while firmware reads/writes
    to them. At a minimum, this requires elevating the ref counts and
    potentially needing an explicit pinning of the memory. This places
    additional restrictions on what type of memory backends userspace
    can use for shared guest memory since there would be some reliance
    on using refcounted pages.

  - The response page needs to be switched to Firmware-owned state
    before the firmware can write to it, which can lead to potential
    host RMP #PFs if the guest is misbehaved and hands the host a
    guest page that KVM is writing to for other reasons (e.g. virtio
    buffers).

Both of these issues can be avoided completely by using
separately-allocated bounce pages for both the request/response pages
and passing those to firmware instead. So that's the approach taken
here.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
[mdr: ensure FW command failures are indicated to guest, drop extended
 request handling to be re-written as separate patch, massage commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240701223148.3798365-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

88caf544

KVM: x86: Suppress MMIO that is triggered during task switch emulation · 2a1fc7dc

Sean Christopherson authored Jul 12, 2024

Explicitly suppress userspace emulated MMIO exits that are triggered when
emulating a task switch as KVM doesn't support userspace MMIO during
complex (multi-step) emulation. Silently ignoring the exit request can
result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to
userspace for some other reason prior to purging mmio_needed.

See commit 0dc90226 ("KVM: x86: Suppress pending MMIO write exits if
emulator detects exception") for more details on KVM's limitations with
respect to emulated MMIO during complex emulator flows.

Reported-by: syzbot+2fb9f8ed752c01bc9a3f@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240712144841.1230591-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2a1fc7dc

KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro · 9fe17d2a

Sean Christopherson authored Jul 12, 2024

Tweak the definition of make_huge_page_split_spte() to eliminate an
unnecessarily long line, and opportunistically initialize child_spte to
make it more obvious that the child is directly derived from the huge
parent.

No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240712151335.1242633-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9fe17d2a

KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE · 3d4415ed

Sean Christopherson authored Jul 12, 2024

Bug the VM instead of simply warning if KVM tries to split a SPTE that is
non-present or not-huge.  KVM is guaranteed to end up in a broken state as
the callers fully expect a valid SPTE, e.g. the shadow MMU will add an
rmap entry, and all MMUs will account the expected small page.  Returning
'0' is also technically wrong now that SHADOW_NONPRESENT_VALUE exists,
i.e. would cause KVM to create a potential #VE SPTE.

While it would be possible to have the callers gracefully handle failure,
doing so would provide no practical value as the scenario really should be
impossible, while the error handling would add a non-trivial amount of
noise.

Fixes: a3fe5dbd ("KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled")
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240712151335.1242633-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

3d4415ed

Merge tag 'kvm-x86-vmx-6.11' of https://github.com/kvm-x86/linux into HEAD · 208a352a

Paolo Bonzini authored Jul 16, 2024

KVM VMX changes for 6.11

 - Remove an unnecessary EPT TLB flush when enabling hardware.

 - Fix a series of bugs that cause KVM to fail to detect nested pending posted
   interrupts as valid wake eents for a vCPU executing HLT in L2 (with
   HLT-exiting disable by L1).

 - Misc cleanups

208a352a

Merge tag 'kvm-x86-svm-6.11' of https://github.com/kvm-x86/linux into HEAD · 1229cbef

Paolo Bonzini authored Jul 16, 2024

KVM SVM changes for 6.11

 - Make per-CPU save_area allocations NUMA-aware.

 - Force sev_es_host_save_area() to be inlined to avoid calling into an
   instrumentable function from noinstr code.

1229cbef

Merge tag 'kvm-x86-selftests-6.11' of https://github.com/kvm-x86/linux into HEAD · dbfd50cb

Paolo Bonzini authored Jul 16, 2024

KVM selftests for 6.11

 - Remove dead code in the memslot modification stress test.

 - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.

 - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the
   log for tests that create lots of VMs.

 - Make the PMU counters test less flaky when counting LLC cache misses by
   doing CLFLUSH{OPT} in every loop iteration.

dbfd50cb

Merge tag 'kvm-x86-pmu-6.11' of https://github.com/kvm-x86/linux into HEAD · cda231cd

Paolo Bonzini authored Jul 16, 2024

KVM x86/pmu changes for 6.11

 - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads
   '0' and writes from userspace are ignored.

 - Update to the newfangled Intel CPU FMS infrastructure.

 - Use macros instead of open-coded literals to clean up KVM's manipulation of
   FIXED_CTR_CTRL MSRs.

cda231cd

Merge tag 'kvm-x86-mtrrs-6.11' of https://github.com/kvm-x86/linux into HEAD · 5c5ddf71

Paolo Bonzini authored Jul 16, 2024

KVM x86 MTRR virtualization removal

Remove support for virtualizing MTRRs on Intel CPUs, along with a nasty CR0.CD
hack, and instead always honor guest PAT on CPUs that support self-snoop.

5c5ddf71

Merge tag 'kvm-x86-mmu-6.11' of https://github.com/kvm-x86/linux into HEAD · 34b69ede

Paolo Bonzini authored Jul 16, 2024

KVM x86 MMU changes for 6.11

 - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't
   hold leafs SPTEs.

 - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager
   page splitting to avoid stalling vCPUs when splitting huge pages.

 - Misc cleanups

34b69ede

Merge tag 'kvm-x86-misc-6.11' of https://github.com/kvm-x86/linux into HEAD · 5dcc1e76

Paolo Bonzini authored Jul 16, 2024

KVM x86 misc changes for 6.11

 - Add a global struct to consolidate tracking of host values, e.g. EFER, and
   move "shadow_phys_bits" into the structure as "maxphyaddr".

 - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
   bus frequency, because TDX.

 - Print the name of the APICv/AVIC inhibits in the relevant tracepoint.

 - Clean up KVM's handling of vendor specific emulation to consistently act on
   "compatible with Intel/AMD", versus checking for a specific vendor.

 - Misc cleanups

5dcc1e76

Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD · 86014c1e

Paolo Bonzini authored Jul 16, 2024

KVM generic changes for 6.11

 - Enable halt poll shrinking by default, as Intel found it to be a clear win.

 - Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.

 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().

 - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.

 - Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.

 - A few minor cleanups

86014c1e

Merge tag 'kvm-x86-fixes-6.10-11' of https://github.com/kvm-x86/linux into HEAD · f4501e8b

Paolo Bonzini authored Jul 16, 2024

KVM Xen:

Fix a bug where KVM fails to check the validity of an incoming userspace
virtual address and tries to activate a gfn_to_pfn_cache with a kernel address.

f4501e8b

Merge tag 'kvmarm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · 1c5a0b55

Paolo Bonzini authored Jul 16, 2024

KVM/arm64 changes for 6.11

 - Initial infrastructure for shadow stage-2 MMUs, as part of nested
   virtualization enablement

 - Support for userspace changes to the guest CTR_EL0 value, enabling
   (in part) migration of VMs between heterogenous hardware

 - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
   the protocol

 - FPSIMD/SVE support for nested, including merged trap configuration
   and exception routing

 - New command-line parameter to control the WFx trap behavior under KVM

 - Introduce kCFI hardening in the EL2 hypervisor

 - Fixes + cleanups for handling presence/absence of FEAT_TCRX

 - Miscellaneous fixes + documentation updates

1c5a0b55

14 Jul, 2024 8 commits

Merge branch kvm-arm64/docs into kvmarm/next · bb032b23

Oliver Upton authored Jul 14, 2024

* kvm-arm64/docs:
  : KVM Documentation fixes, courtesy of Changyuan Lyu
  :
  : Small set of typo fixes / corrections to the KVM API documentation
  : relating to MSIs and arm64 VGIC UAPI.
  MAINTAINERS: Include documentation in KVM/arm64 entry
  KVM: Documentation: Correct the VGIC V2 CPU interface addr space size
  KVM: Documentation: Enumerate allowed value macros of `irq_type`
  KVM: Documentation: Fix typo `BFD`
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

bb032b23

Merge branch kvm-arm64/nv-tcr2 into kvmarm/next · bc2e3253

Oliver Upton authored Jul 14, 2024

* kvm-arm64/nv-tcr2:
  : Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier
  :
  : Series addresses a couple gaps that are present in KVM (from cover
  : letter):
  :
  :   - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly
  :     save/restore stuff.
  :
  :   - trap bit description and routing: none, obviously, since we make a
  :     point in not trapping.
  KVM: arm64: Honor trap routing for TCR2_EL1
  KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
  KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
  KVM: arm64: Get rid of HCRX_GUEST_FLAGS
  KVM: arm64: Correctly honor the presence of FEAT_TCRX
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

bc2e3253

Merge branch kvm-arm64/nv-sve into kvmarm/next · 8c2899e7

Oliver Upton authored Jul 14, 2024

* kvm-arm64/nv-sve:
  : CPTR_EL2, FPSIMD/SVE support for nested
  :
  : This series brings support for honoring the guest hypervisor's CPTR_EL2
  : trap configuration when running a nested guest, along with support for
  : FPSIMD/SVE usage at L1 and L2.
  KVM: arm64: Allow the use of SVE+NV
  KVM: arm64: nv: Add additional trap setup for CPTR_EL2
  KVM: arm64: nv: Add trap description for CPTR_EL2
  KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
  KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
  KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
  KVM: arm64: nv: Handle CPACR_EL1 traps
  KVM: arm64: Spin off helper for programming CPTR traps
  KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
  KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
  KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
  KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
  KVM: arm64: nv: Handle ZCR_EL2 traps
  KVM: arm64: nv: Forward SVE traps to guest hypervisor
  KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

8c2899e7

Merge branch kvm-arm64/el2-kcfi into kvmarm/next · 1270dad3

Oliver Upton authored Jul 14, 2024

* kvm-arm64/el2-kcfi:
  : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi
  :
  : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect
  : branches in the EL2 hypervisor. Unlike kernel support for the feature,
  : CFI failures at EL2 are always fatal.
  KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
  KVM: arm64: Introduce print_nvhe_hyp_panic helper
  arm64: Introduce esr_brk_comment, esr_is_cfi_brk
  KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
  KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
  KVM: arm64: nVHE: Simplify invalid_host_el2_vect
  KVM: arm64: Fix __pkvm_init_switch_pgd call ABI
  KVM: arm64: Fix clobbered ELR in sync abort/SError
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

1270dad3

Merge branch kvm-arm64/ctr-el0 into kvmarm/next · 377d0e5d

Oliver Upton authored Jul 14, 2024

* kvm-arm64/ctr-el0:
  : Support for user changes to CTR_EL0, courtesy of Sebastian Ott
  :
  : Allow userspace to change the guest-visible value of CTR_EL0 for a VM,
  : so long as the requested value represents a subset of features supported
  : by hardware. In other words, prevent the VMM from over-promising the
  : capabilities of hardware.
  :
  : Make this happen by fitting CTR_EL0 into the existing infrastructure for
  : feature ID registers.
  KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset
  KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
  KVM: selftests: arm64: Test writes to CTR_EL0
  KVM: arm64: rename functions for invariant sys regs
  KVM: arm64: show writable masks for feature registers
  KVM: arm64: Treat CTR_EL0 as a VM feature ID register
  KVM: arm64: unify code to prepare traps
  KVM: arm64: nv: Use accessors for modifying ID registers
  KVM: arm64: Add helper for writing ID regs
  KVM: arm64: Use read-only helper for reading VM ID registers
  KVM: arm64: Make idregs debugfs iterator search sysreg table directly
  KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

377d0e5d

Merge branch kvm-arm64/shadow-mmu into kvmarm/next · 435a9f60

Oliver Upton authored Jul 14, 2024

* kvm-arm64/shadow-mmu:
  : Shadow stage-2 MMU support for NV, courtesy of Marc Zyngier
  :
  : Initial implementation of shadow stage-2 page tables to support a guest
  : hypervisor. In the author's words:
  :
  :   So here's the 10000m (approximately 30000ft for those of you stuck
  :   with the wrong units) view of what this is doing:
  :
  :     - for each {VMID,VTTBR,VTCR} tuple the guest uses, we use a
  :       separate shadow s2_mmu context. This context has its own "real"
  :       VMID and a set of page tables that are the combination of the
  :       guest's S2 and the host S2, built dynamically one fault at a time.
  :
  :     - these shadow S2 contexts are ephemeral, and behave exactly as
  :       TLBs. For all intent and purposes, they *are* TLBs, and we discard
  :       them pretty often.
  :
  :     - TLB invalidation takes three possible paths:
  :
  :       * either this is an EL2 S1 invalidation, and we directly emulate
  :         it as early as possible
  :
  :       * or this is an EL1 S1 invalidation, and we need to apply it to
  :         the shadow S2s (plural!) that match the VMID set by the L1 guest
  :
  :       * or finally, this is affecting S2, and we need to teardown the
  :         corresponding part of the shadow S2s, which invalidates the TLBs
  KVM: arm64: nv: Truely enable nXS TLBI operations
  KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations
  KVM: arm64: nv: Add handling of range-based TLBI operations
  KVM: arm64: nv: Add handling of outer-shareable TLBI operations
  KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
  KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level
  KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations
  KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations
  KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations
  KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations
  KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1
  KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation
  KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives
  KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
  KVM: arm64: nv: Handle shadow stage 2 page faults
  KVM: arm64: nv: Implement nested Stage-2 page table walk logic
  KVM: arm64: nv: Support multiple nested Stage-2 mmu structures
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

435a9f60

Merge branch kvm-arm64/ffa-1p1 into kvmarm/next · a35d5b20

Oliver Upton authored Jul 14, 2024

* kvm-arm64/ffa-1p1:
  : Improvements to the pKVM FF-A Proxy, courtesy of Sebastian Ene
  :
  : Various minor improvements to how host FF-A calls are proxied with the
  : TEE, along with support for v1.1 of the protocol.
  KVM: arm64: Use FF-A 1.1 with pKVM
  KVM: arm64: Update the identification range for the FF-A smcs
  KVM: arm64: Add support for FFA_PARTITION_INFO_GET
  KVM: arm64: Trap FFA_VERSION host call in pKVM
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

a35d5b20

Merge branch kvm-arm64/misc into kvmarm/next · bd2e9513

Oliver Upton authored Jul 14, 2024

* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Provide a command-line parameter to statically control the WFx trap
  :    selection in KVM
  :
  :  - Make sysreg masks allocation accounted
  Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity"
  KVM: arm64: nv: Use GFP_KERNEL_ACCOUNT for sysreg_masks allocation
  KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity
  KVM: arm64: Add early_param to control WFx trapping
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

bd2e9513

12 Jul, 2024 6 commits

Merge tag 'loongarch-kvm-6.11' of... · c8b8b819

Paolo Bonzini authored Jul 12, 2024

Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD

LoongArch KVM changes for v6.11

1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.

c8b8b819

Merge tag 'kvm-s390-next-6.11-1' of... · f0a23883

Paolo Bonzini authored Jul 12, 2024

Merge tag 'kvm-s390-next-6.11-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

Assortment of tiny fixes which are not time critical:
- Rejecting memory region operations for ucontrol mode VMs
- Rewind the PSW on host intercepts for VSIE
- Remove unneeded include

f0a23883

Merge tag 'kvm-riscv-6.11-1' of https://github.com/kvm-riscv/linux into HEAD · 60d2b2f3

Paolo Bonzini authored Jul 12, 2024

KVM/riscv changes for 6.11

- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use guest files for IMSIC virtualization, when available

ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
extensions is coming through the RISC-V tree.

60d2b2f3

Merge branch 'kvm-prefault' into HEAD · f3996d4d

Paolo Bonzini authored Jul 12, 2024

Pre-population has been requested several times to mitigate KVM page faults
during guest boot or after live migration.  It is also required by TDX
before filling in the initial guest memory with measured contents.
Introduce it as a generic API.

f3996d4d

KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY · 9ff0e37c

Isaku Yamahata authored Apr 10, 2024

Add a test case to exercise KVM_PRE_FAULT_MEMORY and run the guest to access the
pre-populated area. It tests KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM
and KVM_X86_SW_PROTECTED_VM.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9ff0e37c

KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() · 6e01b760

Paolo Bonzini authored Jun 11, 2024

Wire KVM_PRE_FAULT_MEMORY ioctl to kvm_mmu_do_page_fault() to populate guest
memory. It can be called right after KVM_CREATE_VCPU creates a vCPU,
since at that point kvm_mmu_create() and kvm_init_mmu() are called and
the vCPU is ready to invoke the KVM page fault handler.

The helper function kvm_tdp_map_page() takes care of the logic to
process RET_PF_* return values and convert them to success or errno.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6e01b760