Commits · f0a23883fad4ec8a63faddb9639a92be2e007624 · Kirill Smelkov / linux

12 Jul, 2024 12 commits

Merge tag 'kvm-s390-next-6.11-1' of... · f0a23883

Paolo Bonzini authored Jul 12, 2024

Merge tag 'kvm-s390-next-6.11-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

Assortment of tiny fixes which are not time critical:
- Rejecting memory region operations for ucontrol mode VMs
- Rewind the PSW on host intercepts for VSIE
- Remove unneeded include

f0a23883

Merge tag 'kvm-riscv-6.11-1' of https://github.com/kvm-riscv/linux into HEAD · 60d2b2f3

Paolo Bonzini authored Jul 12, 2024

KVM/riscv changes for 6.11

- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use guest files for IMSIC virtualization, when available

ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
extensions is coming through the RISC-V tree.

60d2b2f3

Merge branch 'kvm-prefault' into HEAD · f3996d4d

Paolo Bonzini authored Jul 12, 2024

Pre-population has been requested several times to mitigate KVM page faults
during guest boot or after live migration.  It is also required by TDX
before filling in the initial guest memory with measured contents.
Introduce it as a generic API.

f3996d4d

KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY · 9ff0e37c

Isaku Yamahata authored Apr 10, 2024

Add a test case to exercise KVM_PRE_FAULT_MEMORY and run the guest to access the
pre-populated area. It tests KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM
and KVM_X86_SW_PROTECTED_VM.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9ff0e37c

KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() · 6e01b760

Paolo Bonzini authored Jun 11, 2024

Wire KVM_PRE_FAULT_MEMORY ioctl to kvm_mmu_do_page_fault() to populate guest
memory. It can be called right after KVM_CREATE_VCPU creates a vCPU,
since at that point kvm_mmu_create() and kvm_init_mmu() are called and
the vCPU is ready to invoke the KVM page fault handler.

The helper function kvm_tdp_map_page() takes care of the logic to
process RET_PF_* return values and convert them to success or errno.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6e01b760

KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level · 58ef2469

Paolo Bonzini authored Jul 10, 2024

The guest memory population logic will need to know what page size or level
(4K, 2M, ...) is mapped.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <eabc3f3e5eb03b370cadf6e1901ea34d7a020adc.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

58ef2469

KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" · f5e7f00c

Sean Christopherson authored Jun 12, 2024

Move the accounting of the result of kvm_mmu_do_page_fault() to its
callers, as only pf_fixed is common to guest page faults and async #PFs,
and upcoming support KVM_PRE_FAULT_MEMORY won't bump _any_ stats.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f5e7f00c

KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler · 5186ec22

Sean Christopherson authored Jun 12, 2024

Account stat.pf_taken in kvm_mmu_page_fault(), i.e. the actual page fault
handler, instead of conditionally bumping it in kvm_mmu_do_page_fault().
The "real" page fault handler is the only path that should ever increment
the number of taken page faults, as all other paths that "do page fault"
are by definition not handling faults that occurred in the guest.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5186ec22

KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory · bc1a5cd0

Isaku Yamahata authored Apr 10, 2024

Add a new ioctl KVM_PRE_FAULT_MEMORY in the KVM common code. It iterates on the
memory range and calls the arch-specific function. The implementation is
optional and enabled by a Kconfig symbol.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <819322b8f25971f2b9933bfa4506e618508ad782.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bc1a5cd0

KVM: Document KVM_PRE_FAULT_MEMORY ioctl · 9aed7a6c

Isaku Yamahata authored Apr 10, 2024

Adds documentation of KVM_PRE_FAULT_MEMORY ioctl. [1]

It populates guest memory.  It doesn't do extra operations on the
underlying technology-specific initialization [2].  For example,
CoCo-related operations won't be performed.  Concretely for TDX, this API
won't invoke TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific APIs
are required for such operations.

The key point is to adapt of vcpu ioctl instead of VM ioctl.  First,
populating guest memory requires vcpu.  If it is VM ioctl, we need to pick
one vcpu somehow.  Secondly, vcpu ioctl allows each vcpu to invoke this
ioctl in parallel.  It helps to scale regarding guest memory size, e.g.,
hundreds of GB.

[1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
[2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9aed7a6c

Merge branch 'kvm-tdx-prep-1-truncated' into HEAD · eb162c94
Paolo Bonzini authored Jul 12, 2024
```
A rename and refactoring extracted from the preparatory series for
Intel TDX support in KVM's MMU.
```
eb162c94

mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE · 27e6a24a

Paolo Bonzini authored Jul 11, 2024

The flags AS_UNMOVABLE and AS_INACCESSIBLE were both added just for guest_memfd;
AS_UNMOVABLE is already in existing versions of Linux, while AS_INACCESSIBLE was
acked for inclusion in 6.11.

But really, they are the same thing: only guest_memfd uses them, at least for
now, and guest_memfd pages are unmovable because they should not be
accessed by the CPU.

So merge them into one; use the AS_INACCESSIBLE name which is more comprehensive.
At the same time, this fixes an embarrassing bug where AS_INACCESSIBLE was used
as a bit mask, despite it being just a bit index.

The bug was mostly benign, because AS_INACCESSIBLE's bit representation (1010)
corresponded to setting AS_UNEVICTABLE (which is already set) and AS_ENOSPC
(except no async writes can happen on the guest_memfd).  So the AS_INACCESSIBLE
flag simply had no effect.

Fixes: 1d23040c ("KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode")
Fixes: c72ceafb ("mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory")
Cc: linux-mm@kvack.org
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

27e6a24a

04 Jul, 2024 2 commits

kvm: s390: Reject memory region operations for ucontrol VMs · 7816e589

Christoph Schlameuss authored Jun 24, 2024

This change rejects the KVM_SET_USER_MEMORY_REGION and
KVM_SET_USER_MEMORY_REGION2 ioctls when called on a ucontrol VM.
This is necessary since ucontrol VMs have kvm->arch.gmap set to 0 and
would thus result in a null pointer dereference further in.
Memory management needs to be performed in userspace and using the
ioctls KVM_S390_UCAS_MAP and KVM_S390_UCAS_UNMAP.

Also improve s390 specific documentation for KVM_SET_USER_MEMORY_REGION
and KVM_SET_USER_MEMORY_REGION2.
Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com>
Fixes: 27e0393f ("KVM: s390: ucontrol: per vcpu address spaces")
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20240624095902.29375-1-schlameuss@linux.ibm.comSigned-off-by: Janosch Frank <frankja@linux.ibm.com>
[frankja@linux.ibm.com: commit message spelling fix, subject prefix fix]
Message-ID: <20240624095902.29375-1-schlameuss@linux.ibm.com>

7816e589

KVM: s390: vsie: retry SIE instruction on host intercepts · 33a729a1

Eric Farman authored Mar 01, 2024

It's possible that SIE exits for work that the host needs to perform
rather than something that is intended for the guest.

A Linux guest will ignore this intercept code since there is nothing
for it to do, but a more robust solution would rewind the PSW back to
the SIE instruction. This will transparently resume the guest once
the host completes its work, without the guest needing to process
what is effectively a NOP and re-issue SIE itself.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20240301204342.3217540-1-farman@linux.ibm.comSigned-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-ID: <20240301204342.3217540-1-farman@linux.ibm.com>

33a729a1

03 Jul, 2024 1 commit

KVM: s390: remove useless include · 98f77038

Claudio Imbrenda authored Jul 02, 2024

arch/s390/include/asm/kvm_host.h includes linux/kvm_host.h, but
linux/kvm_host.h includes asm/kvm_host.h .

It turns out that arch/s390/include/asm/kvm_host.h only needs
linux/kvm_types.h, which it already includes.

Stop including linux/kvm_host.h from arch/s390/include/asm/kvm_host.h .

Due to the #ifdef guards, the code works as it is today, but it's ugly
and it will get in the way of future patches.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20240702155606.71398-1-imbrenda@linux.ibm.comSigned-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-ID: <20240702155606.71398-1-imbrenda@linux.ibm.com>

98f77038

26 Jun, 2024 5 commits

RISC-V: KVM: Redirect AMO load/store access fault traps to guest · e3256183

Yu-Wei Hsu authored Apr 29, 2024

The KVM RISC-V does not delegate AMO load/store access fault traps to
VS-mode (hedeleg) so typically M-mode takes these traps and redirects
them back to HS-mode. However, upon returning from M-mode, the KVM
RISC-V running in HS-mode terminates VS-mode software.

The KVM RISC-V should redirect AMO load/store access fault traps back
to VS-mode and let the VS-mode trap handler determine the next steps.
Signed-off-by: Yu-Wei Hsu <betterman5240@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240429092113.70695-1-betterman5240@gmail.comSigned-off-by: Anup Patel <anup@brainfault.org>

e3256183

perf kvm/riscv: Port perf kvm stat to RISC-V · da7b1b52

Shenlin Liang authored Apr 22, 2024

'perf kvm stat report/record' generates a statistical analysis of KVM
events and can be used to analyze guest exit reasons.

"report" reports statistical analysis of guest exit events.

To record kvm events on the host:
 # perf kvm stat record -a

To report kvm VM EXIT events:
 # perf kvm stat report --event=vmexit
Signed-off-by: Shenlin Liang <liangshenlin@eswincomputing.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240422080833.8745-3-liangshenlin@eswincomputing.comSigned-off-by: Anup Patel <anup@brainfault.org>

da7b1b52

RISCV: KVM: add tracepoints for entry and exit events · 91195a90

Shenlin Liang authored Apr 22, 2024

Like other architectures, RISCV KVM also needs to add these event
tracepoints to count the number of times kvm guest entry/exit.
Signed-off-by: Shenlin Liang <liangshenlin@eswincomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240422080833.8745-2-liangshenlin@eswincomputing.comSigned-off-by: Anup Patel <anup@brainfault.org>

91195a90

RISC-V: KVM: Use IMSIC guest files when available · 33853392

Anup Patel authored Apr 11, 2024

Let us discover and use IMSIC guest files from the IMSIC global
config provided by the IMSIC irqchip driver.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240411090639.237119-3-apatel@ventanamicro.comSigned-off-by: Anup Patel <anup@brainfault.org>

33853392

RISC-V: KVM: Share APLIC and IMSIC defines with irqchip drivers · e5b088c1

Anup Patel authored Apr 11, 2024

We have common APLIC and IMSIC headers available under
include/linux/irqchip/ directory which are used by APLIC
and IMSIC irqchip drivers. Let us replace the use of
kvm_aia_*.h headers with include/linux/irqchip/riscv-*.h
headers.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240411090639.237119-2-apatel@ventanamicro.comSigned-off-by: Anup Patel <anup@brainfault.org>

e5b088c1

20 Jun, 2024 10 commits

KVM: x86/tdp_mmu: Take a GFN in kvm_tdp_mmu_fast_pf_get_last_sptep() · c2f38f75

Rick Edgecombe authored Jun 19, 2024

Pass fault->gfn into kvm_tdp_mmu_fast_pf_get_last_sptep(), instead of
passing fault->addr and then converting it to a GFN.

Future changes will make fault->addr and fault->gfn differ when running
TDX guests. The GFN will be conceptually the same as it is for normal VMs,
but fault->addr may contain a TDX specific bit that differentiates between
"shared" and "private" memory. This bit will be used to direct faults to
be handled on different roots, either the normal "direct" root or a new
type of root that handles private memory. The TDP iterators will process
the traditional GFN concept and apply the required TDX specifics depending
on the root type. For this reason, it needs to operate on regular GFN and
not the addr, which may contain these special TDX specific bits.

Today kvm_tdp_mmu_fast_pf_get_last_sptep() takes fault->addr and then
immediately converts it to a GFN with a bit shift. However, this would
unfortunately retain the TDX specific bits in what is supposed to be a
traditional GFN. Excluding TDX's needs, it is also is unnecessary to pass
fault->addr and convert it to a GFN when the GFN is already on hand.

So instead just pass the GFN into kvm_tdp_mmu_fast_pf_get_last_sptep() and
use it directly.
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20240619223614.290657-9-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c2f38f75

KVM: x86/tdp_mmu: Rename REMOVED_SPTE to FROZEN_SPTE · 964cea81

Rick Edgecombe authored Jun 19, 2024

Rename REMOVED_SPTE to FROZEN_SPTE so that it can be used for other
multi-part operations.

REMOVED_SPTE is used as a non-present intermediate value for multi-part
operations that can happen when a thread doesn't have an MMU write lock.
Today these operations are when removing PTEs.

However, future changes will want to use the same concept for setting a
PTE. In that case the REMOVED_SPTE name does not quite fit. So rename it
to FROZEN_SPTE so it can be used for both types of operations.

Also rename the relevant helpers and comments that refer to "removed"
within the context of the SPTE value. Take care to not update naming
referring the "remove" operations, which are still distinct.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20240619223614.290657-2-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

964cea81

Merge branch 'kvm-6.10-fixes' into HEAD · 02b0d3b9
Paolo Bonzini authored Jun 20, 2024

02b0d3b9

KVM: x86/tdp_mmu: Sprinkle __must_check · 8a4e2742

Isaku Yamahata authored Jan 22, 2024

The TDP MMU function __tdp_mmu_set_spte_atomic uses a cmpxchg64 to replace
the SPTE value and returns -EBUSY on failure.  The caller must check the
return value and retry.  Add __must_check to it, as well as to two more
functions that forward the return value of __tdp_mmu_set_spte_atomic to
their caller.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Message-Id: <8f7d5a1b241bf5351eaab828d1a1efe5c17699ca.1705965635.git.isaku.yamahata@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

8a4e2742

KVM: interrupt kvm_gmem_populate() on signals · d8147384

Paolo Bonzini authored Jun 07, 2024

kvm_gmem_populate() is a potentially lengthy operation that can involve
multiple calls to the firmware.  Interrupt it if a signal arrives.

Fixes: 1f6c06b1 ("KVM: guest_memfd: Add interface for populating gmem pages with user data")
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Cc: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d8147384

KVM: Discard zero mask with function kvm_dirty_ring_reset · 676f819c

Bibo Mao authored Jun 13, 2024

Function kvm_reset_dirty_gfn may be called with parameters cur_slot /
cur_offset / mask are all zero, it does not represent real dirty page.
It is not necessary to clear dirty page in this condition. Also return
value of macro __fls() is undefined if mask is zero which is called in
funciton kvm_reset_dirty_gfn(). Here just return.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240613122803.1031511-1-maobibo@loongson.cn>
[Move the conditional inside kvm_reset_dirty_gfn; suggested by
 Sean Christopherson. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

676f819c

virt: guest_memfd: fix reference leak on hwpoisoned page · c31745d2

Paolo Bonzini authored Jun 11, 2024

If kvm_gmem_get_pfn() detects an hwpoisoned page, it returns -EHWPOISON
but it does not put back the reference that kvm_gmem_get_folio() had
grabbed.  Add the forgotten folio_put().

Fixes: a7800aa8 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
Cc: stable@vger.kernel.org
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c31745d2

kvm: do not account temporary allocations to kmem · f474092c

Alexey Dobriyan authored Jun 10, 2024

Some allocations done by KVM are temporary, they are created as result
of program actions, but can't exists for arbitrary long times.

They should have been GFP_TEMPORARY (rip!).

OTOH, kvm-nx-lpage-recovery and kvm-pit kernel threads exist for as long
as VM exists but their task_struct memory is not accounted.
This is story for another day.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Message-ID: <c0122f66-f428-417e-a360-b25fc0f154a0@p183>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f474092c

MAINTAINERS: Drop Wanpeng Li as a Reviewer for KVM Paravirt support · b0185890

Sean Christopherson authored Jun 10, 2024

Drop Wanpeng as a KVM PARAVIRT reviewer as his @tencent.com email is
bouncing, and according to lore[*], the last activity from his @gmail.com
address was almost two years ago.

[*] https://lore.kernel.org/all/CANRm+Cwj29M9HU3=JRUOaKDR+iDKgr0eNMWQi0iLkR5THON-bg@mail.gmail.com

Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240610163427.3359426-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b0185890

KVM: x86: Always sync PIR to IRR prior to scanning I/O APIC routes · f3ced000

Sean Christopherson authored Jun 10, 2024

Sync pending posted interrupts to the IRR prior to re-scanning I/O APIC
routes, irrespective of whether the I/O APIC is emulated by userspace or
by KVM.  If a level-triggered interrupt routed through the I/O APIC is
pending or in-service for a vCPU, KVM needs to intercept EOIs on said
vCPU even if the vCPU isn't the destination for the new routing, e.g. if
servicing an interrupt using the old routing races with I/O APIC
reconfiguration.

Commit fceb3a36 ("KVM: x86: ioapic: Fix level-triggered EOI and
userspace I/OAPIC reconfigure race") fixed the common cases, but
kvm_apic_pending_eoi() only checks if an interrupt is in the local
APIC's IRR or ISR, i.e. misses the uncommon case where an interrupt is
pending in the PIR.

Failure to intercept EOI can manifest as guest hangs with Windows 11 if
the guest uses the RTC as its timekeeping source, e.g. if the VMM doesn't
expose a more modern form of time to the guest.

Cc: stable@vger.kernel.org
Cc: Adamos Ttofari <attofari@amazon.de>
Cc: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240611014845.82795-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f3ced000

06 Jun, 2024 1 commit

KVM: selftests: Fix RISC-V compilation · 0fc670d0

Andrew Jones authored Jun 03, 2024

Due to commit 2b7deea3 ("Revert "kvm: selftests: move base
kvm_util.h declarations to kvm_util_base.h"") kvm selftests now
requires explicitly including ucall_common.h when needed. The commit
added the directives everywhere they were needed at the time, but, by
merge time, new places had been merged for RISC-V. Add those now to
fix RISC-V's compilation.

Fixes: dee7ea42 ("Merge tag 'kvm-x86-selftests_utils-6.10' of https://github.com/kvm-x86/linux into HEAD")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240603122045.323064-2-ajones@ventanamicro.comSigned-off-by: Anup Patel <anup@brainfault.org>

0fc670d0

05 Jun, 2024 3 commits

KVM: SNP: Fix LBR Virtualization for SNP guest · f99b0522

Ravi Bangoria authored Jun 05, 2024

SEV-ES and thus SNP guest mandates LBR Virtualization to be _always_ ON.
Although commit b7e4be0a ("KVM: SEV-ES: Delegate LBR virtualization
to the processor") did the correct change for SEV-ES guests, it missed
the SNP. Fix it.
Reported-by: Srikanth Aithal <sraithal@amd.com>
Fixes: b7e4be0a ("KVM: SEV-ES: Delegate LBR virtualization to the processor")
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Message-ID: <20240605114810.1304-1-ravi.bangoria@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f99b0522

KVM: x86/mmu: Don't save mmu_invalidate_seq after checking private attr · db574f2f

Tao Su authored May 28, 2024

Drop the second snapshot of mmu_invalidate_seq in kvm_faultin_pfn().
Before checking the mismatch of private vs. shared, mmu_invalidate_seq is
saved to fault->mmu_seq, which can be used to detect an invalidation
related to the gfn occurred, i.e. KVM will not install a mapping in page
table if fault->mmu_seq != mmu_invalidate_seq.

Currently there is a second snapshot of mmu_invalidate_seq, which may not
be same as the first snapshot in kvm_faultin_pfn(), i.e. the gfn attribute
may be changed between the two snapshots, but the gfn may be mapped in
page table without hindrance. Therefore, drop the second snapshot as it
has no obvious benefits.

Fixes: f6adeae8 ("KVM: x86/mmu: Handle no-slot faults at the beginning of kvm_faultin_pfn()")
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240528102234.2162763-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

db574f2f

Merge tag 'kvmarm-fixes-6.10-1' of... · 45ce0314

Paolo Bonzini authored Jun 05, 2024

Merge tag 'kvmarm-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.10, take #1

- Large set of FP/SVE fixes for pKVM, addressing the fallout
  from the per-CPU data rework and making sure that the host
  is not involved in the FP/SVE switching any more

- Allow FEAT_BTI to be enabled with NV now that FEAT_PAUTH
  is copletely supported

- Fix for the respective priorities of Failed PAC, Illegal
  Execution state and Instruction Abort exceptions

- Fix the handling of AArch32 instruction traps failing their
  condition code, which was broken by the introduction of
  ESR_EL2.ISS2

- Allow vpcus running in AArch32 state to be restored in
  System mode

- Fix AArch32 GPR restore that would lose the 64 bit state
  under some conditions

45ce0314

04 Jun, 2024 6 commits

KVM: arm64: Ensure that SME controls are disabled in protected mode · afb91f5f

Fuad Tabba authored Jun 03, 2024

KVM (and pKVM) do not support SME guests. Therefore KVM ensures
that the host's SME state is flushed and that SME controls for
enabling access to ZA storage and for streaming are disabled.

pKVM needs to protect against a buggy/malicious host. Ensure that
it wouldn't run a guest when protected mode is enabled should any
of the SME controls be enabled.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-10-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>

afb91f5f

KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx format · a69283ae

Fuad Tabba authored Jun 03, 2024

When setting/clearing CPACR bits for EL0 and EL1, use the ELx
format of the bits, which covers both. This makes the code
clearer, and reduces the chances of accidentally missing a bit.

No functional change intended.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-9-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>

a69283ae

KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVM · 1696fc21

Fuad Tabba authored Jun 03, 2024

Now that we have introduced finalize_init_hyp_mode(), lets
consolidate the initializing of the host_data fpsimd_state and
sve state.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240603122852.3923848-8-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>

1696fc21

KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM · b5b99556

Fuad Tabba authored Jun 03, 2024

When running in protected mode we don't want to leak protected
guest state to the host, including whether a guest has used
fpsimd/sve. Therefore, eagerly restore the host state on guest
exit when running in protected mode, which happens only if the
guest has used fpsimd/sve.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-7-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>

b5b99556

KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM · 66d5b53e

Fuad Tabba authored Jun 03, 2024

Protected mode needs to maintain (save/restore) the host's sve
state, rather than relying on the host kernel to do that. This is
to avoid leaking information to the host about guests and the
type of operations they are performing.

As a first step towards that, allocate memory mapped at hyp, per
cpu, for the host sve state. The following patch will use this
memory to save/restore the host state.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-6-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>

66d5b53e

KVM: arm64: Specialize handling of host fpsimd state on trap · e511e08a

Fuad Tabba authored Jun 03, 2024

In subsequent patches, n/vhe will diverge on saving the host
fpsimd/sve state when taking a guest fpsimd/sve trap. Add a
specialized helper to handle it.

No functional change intended.
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20240603122852.3923848-5-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>

e511e08a