1. 01 Jun, 2022 3 commits
  2. 25 May, 2022 24 commits
    • KVM: Do not pin pages tracked by gfn=>pfn caches · 85165781
      Sean Christopherson authored
      Put the reference to any struct page mapped/tracked by a gfn=>pfn cache
      upon inserting the pfn into its associated cache, as opposed to putting
      the reference only when the cache is done using the pfn.  In other words,
      don't pin pages while they're in the cache.  One of the major roles of
      the gfn=>pfn cache is to play nicely with invalidation events, i.e. it
      exists in large part so that KVM doesn't rely on pinning pages.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220429210025.3293691-9-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Fix multiple races in gfn=>pfn cache refresh · 58cd407c
      Sean Christopherson authored
      Rework the gfn=>pfn cache (gpc) refresh logic to address multiple races
      between the cache itself, and between the cache and mmu_notifier events.
      
      The existing refresh code attempts to guard against races with the
      mmu_notifier by speculatively marking the cache valid, and then marking
      it invalid if a mmu_notifier invalidation occurs.  That handles the case
      where an invalidation occurs between dropping and re-acquiring gpc->lock,
      but it doesn't handle the scenario where the cache is refreshed after the
      cache was invalidated by the notifier, but before the notifier elevates
      mmu_notifier_count.  The gpc refresh can't use the "retry" helper as its
      invalidation occurs _before_ mmu_notifier_count is elevated and before
      mmu_notifier_range_start is set/updated.
      
        CPU0                                    CPU1
        ----                                    ----
      
        gfn_to_pfn_cache_invalidate_start()
        |
        -> gpc->valid = false;
                                                kvm_gfn_to_pfn_cache_refresh()
                                                |
                                                |-> gpc->valid = true;
      
                                                hva_to_pfn_retry()
                                                |
                                                -> acquire kvm->mmu_lock
                                                   kvm->mmu_notifier_count == 0
                                                   mmu_seq == kvm->mmu_notifier_seq
                                                   drop kvm->mmu_lock
                                                   return pfn 'X'
        acquire kvm->mmu_lock
        kvm_inc_notifier_count()
        drop kvm->mmu_lock()
        kernel frees pfn 'X'
                                                kvm_gfn_to_pfn_cache_check()
                                                |
                                                |-> gpc->valid == true
      
                                                caller accesses freed pfn 'X'
      
      Key off of mn_active_invalidate_count to detect that a pfncache refresh
      needs to wait for an in-progress mmu_notifier invalidation.  While
      mn_active_invalidate_count is not guaranteed to be stable, it is
      guaranteed to be elevated prior to an invalidation acquiring gpc->lock,
      so either the refresh will see an active invalidation and wait, or the
      invalidation will run after the refresh completes.
      
      Speculatively marking the cache valid is itself flawed, as a concurrent
      kvm_gfn_to_pfn_cache_check() would see a valid cache with stale pfn/khva
      values.  The KVM Xen use case explicitly allows/wants multiple users;
      even though the caches are allocated per vCPU, __kvm_xen_has_interrupt()
      can read a different vCPU's cache (or caches).  Address this race by invalidating
      the cache prior to dropping gpc->lock (this is made possible by fixing
      the above mmu_notifier race).
      
      Complicating all of this is the fact that both the hva=>pfn resolution
      and mapping of the kernel address can sleep, i.e. must be done outside
      of gpc->lock.
      
      Fix the above races in one fell swoop; trying to fix each individual race
      is largely pointless and essentially impossible to test, e.g. closing one
      hole just shifts the focus to the other hole.
      
      Fixes: 982ed0de ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
      Cc: stable@vger.kernel.org
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Mingwei Zhang <mizhang@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220429210025.3293691-8-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Fully serialize gfn=>pfn cache refresh via mutex · 93984f19
      Sean Christopherson authored
      Protect gfn=>pfn cache refresh with a mutex to fully serialize refreshes.
      The refresh logic doesn't protect against:
      
      - concurrent unmaps, or refreshes with different GPAs (which may or may not
        happen in practice, for example if a cache is only used under vcpu->mutex;
        but it's allowed in the code)
      
      - a false negative on the memslot generation.  If the first refresh sees
        a stale memslot generation, it will refresh the hva and generation before
        moving on to the hva=>pfn translation.  If it then drops gpc->lock, a
        different user of the cache can come along, acquire gpc->lock, see that
        the memslot generation is fresh, and skip the hva=>pfn update due to the
        userspace address also matching (because it too was updated).
      
      The refresh path can already sleep during hva=>pfn resolution, so wrap
      the refresh with a mutex to ensure that any given refresh runs to
      completion before other callers can start their refresh.
      
      Cc: stable@vger.kernel.org
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220429210025.3293691-7-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Do not incorporate page offset into gfn=>pfn cache user address · 3ba2c95e
      Sean Christopherson authored
      Don't adjust the userspace address in the gfn=>pfn cache by the page
      offset from the gpa.  KVM should never use the user address directly, and
      all KVM operations that translate a user address to something else
      require the user address to be page aligned.  Ignoring the offset will
      allow the cache to reuse a gfn=>hva translation in the unlikely event
      that the page offset of the gpa changes, but the gfn does not.  And more
      importantly, not having to (un)adjust the user address will simplify a
      future bug fix.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220429210025.3293691-6-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Put the extra pfn reference when reusing a pfn in the gpc cache · 3dddf65b
      Sean Christopherson authored
      Put the struct page reference to pfn acquired by hva_to_pfn() when the
      old and new pfns for a gfn=>pfn cache match.  The cache already has a
      reference via the old/current pfn, and will only put one reference when
      the cache is done with the pfn.
      
      Fixes: 982ed0de ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220429210025.3293691-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Drop unused @gpa param from gfn=>pfn cache's __release_gpc() helper · 345b0fd6
      Sean Christopherson authored
      Drop the @pga param from __release_gpc() and rename the helper to make it
      more obvious that the cache itself is not being released.  The helper
      will be reused by a future commit to release a pfn+khva combination that
      is _never_ associated with the cache, at which point the current name
      would go from slightly misleading to blatantly wrong.
      
      No functional change intended.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220429210025.3293691-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: set_msr_mce: Permit guests to ignore single-bit ECC errors · 0471a7bd
      Lev Kujawski authored
      Certain guest operating systems (e.g., UNIXWARE) clear bit 0 of
      MC1_CTL to ignore single-bit ECC data errors.  Single-bit ECC data
      errors are always correctable and thus are safe to ignore because they
      are informational in nature rather than signaling a loss of data
      integrity.
      
      Prior to this patch, these guests would crash upon writing MC1_CTL,
      with resultant error messages like the following:
      
      error: kvm run failed Operation not permitted
      EAX=fffffffe EBX=fffffffe ECX=00000404 EDX=ffffffff
      ESI=ffffffff EDI=00000001 EBP=fffdaba4 ESP=fffdab20
      EIP=c01333a5 EFL=00000246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
      ES =0108 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
      CS =0100 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
      SS =0108 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
      DS =0108 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
      FS =0000 00000000 ffffffff 00c00000
      GS =0000 00000000 ffffffff 00c00000
      LDT=0118 c1026390 00000047 00008200 DPL=0 LDT
      TR =0110 ffff5af0 00000067 00008b00 DPL=0 TSS32-busy
      GDT=     ffff5020 000002cf
      IDT=     ffff52f0 000007ff
      CR0=8001003b CR2=00000000 CR3=0100a000 CR4=00000230
      DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
      DR6=ffff0ff0 DR7=00000400
      EFER=0000000000000000
      Code=08 89 01 89 51 04 c3 8b 4c 24 08 8b 01 8b 51 04 8b 4c 24 04 <0f>
      30 c3 f7 05 a4 6d ff ff 10 00 00 00 74 03 0f 31 c3 33 c0 33 d2 c3 8d
      74 26 00 0f 31 c3
      Signed-off-by: Lev Kujawski <lkujaw@member.fsf.org>
      Message-Id: <20220521081511.187388-1-lkujaw@member.fsf.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Print VM-instruction error as unsigned · cc07e60b
      Jim Mattson authored
      Change the printf format character from 'd' to 'u' for the
      VM-instruction error in vmwrite_error().
      
      Fixes: 6aa8b732 ("[PATCH] kvm: userspace interface")
      Reported-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Message-Id: <20220510224035.1792952-2-jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Print VM-instruction error when it may be helpful · 8e39efd8
      David Matlack authored
      Include the value of the "VM-instruction error" field from the current
      VMCS (if any) in the error message for VMCLEAR and VMPTRLD, since each
      of these instructions may result in more than one VM-instruction
      error. Previously, this field was only reported for VMWRITE errors.
      Signed-off-by: David Matlack <dmatlack@google.com>
      [Rebased and refactored code; dropped the error number for INVVPID and
      INVEPT; reworded commit message.]
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Message-Id: <20220510224035.1792952-1-jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest · ffd1925a
      Yanfei Xu authored
      When the kernel handles a VM-Exit caused by an external interrupt or NMI,
      it always sets kvm_intr_type to record whether it is handling an IRQ or
      an NMI.  A PMI may arrive as either.
      
      However, intel_pt PMIs are only generated for HARDWARE perf events, and
      HARDWARE events are always configured to generate NMIs.  Use
      kvm_handling_nmi_from_guest() to precisely identify if the intel_pt PMI
      came from the guest; this avoids false positives if an intel_pt PMI/NMI
      arrives while the host is handling an unrelated IRQ VM-Exit.
      
      Fixes: db215756 ("KVM: x86: More precisely identify NMI from guest when handling PMI")
      Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
      Message-Id: <20220523140821.1345605-1-yanfei.xu@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: x86: Sync the new name of the test case to .gitignore · 366d4a12
      Like Xu authored
      Fix a side effect of the so-called opportunistic change in the commit
      being fixed: the test's new file name must also be reflected in .gitignore.
      
      Fixes: dc8a9febbab0 ("KVM: selftests: x86: Fix test failure on arch lbr capable platforms")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20220518170118.66263-2-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86, kvm: use correct GFP flags for preemption disabled · baec4f5a
      Paolo Bonzini authored
      Commit ddd7ed842627 ("x86/kvm: Alloc dummy async #PF token outside of
      raw spinlock") leads to the following Smatch static checker warning:
      
      	arch/x86/kernel/kvm.c:212 kvm_async_pf_task_wake()
      	warn: sleeping in atomic context
      
      arch/x86/kernel/kvm.c
          202         raw_spin_lock(&b->lock);
          203         n = _find_apf_task(b, token);
          204         if (!n) {
          205                 /*
          206                  * Async #PF not yet handled, add a dummy entry for the token.
          207                  * Allocating the token must be down outside of the raw lock
          208                  * as the allocator is preemptible on PREEMPT_RT kernels.
          209                  */
          210                 if (!dummy) {
          211                         raw_spin_unlock(&b->lock);
      --> 212                         dummy = kzalloc(sizeof(*dummy), GFP_KERNEL);
                                                                      ^^^^^^^^^^
      Smatch thinks the caller has preempt disabled.  The `smdb.py preempt
      kvm_async_pf_task_wake` output call tree is:
      
      sysvec_kvm_asyncpf_interrupt() <- disables preempt
      -> __sysvec_kvm_asyncpf_interrupt()
         -> kvm_async_pf_task_wake()
      
      The caller is this:
      
      arch/x86/kernel/kvm.c
         290        DEFINE_IDTENTRY_SYSVEC(sysvec_kvm_asyncpf_interrupt)
         291        {
         292                struct pt_regs *old_regs = set_irq_regs(regs);
         293                u32 token;
         294
         295                ack_APIC_irq();
         296
         297                inc_irq_stat(irq_hv_callback_count);
         298
         299                if (__this_cpu_read(apf_reason.enabled)) {
         300                        token = __this_cpu_read(apf_reason.token);
         301                        kvm_async_pf_task_wake(token);
         302                        __this_cpu_write(apf_reason.token, 0);
         303                        wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);
         304                }
         305
         306                set_irq_regs(old_regs);
         307        }
      
      The DEFINE_IDTENTRY_SYSVEC() is a wrapper that calls this function
      from the call_on_irqstack_cond().  It's inside the call_on_irqstack_cond()
      where preempt is disabled (unless it's already disabled).  The
      irq_enter/exit_rcu() functions disable/enable preempt.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer · 619f51da
      Wanpeng Li authored
      The timer is disarmed when switching between TSC deadline and other modes;
      however, a pending timer injection may still be in flight, so accurately
      remove any traces of the previous mode.
      
      Fixes: 44275932 ("KVM: x86: thoroughly disarm LAPIC timer around TSC deadline switch")
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm: Alloc dummy async #PF token outside of raw spinlock · 0547758a
      Sean Christopherson authored
      Drop the raw spinlock in kvm_async_pf_task_wake() before allocating the
      dummy async #PF token; the allocator is preemptible on PREEMPT_RT
      kernels and must not be called from truly atomic contexts.
      
      Opportunistically document why it's ok to loop on allocation failure,
      i.e. why the function won't get stuck in an infinite loop.
      Reported-by: Yajun Deng <yajun.deng@linux.dev>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: avoid calling x86 emulator without a decoded instruction · fee060cd
      Sean Christopherson authored
      Whenever x86_decode_emulated_instruction() detects a breakpoint, it
      returns the value that kvm_vcpu_check_breakpoint() writes into its
      pass-by-reference second argument.  Unfortunately this is completely
      bogus because the expected outcome of x86_decode_emulated_instruction
      is an EMULATION_* value.
      
      Then, if kvm_vcpu_check_breakpoint() does "*r = 0" (corresponding to
      a KVM_EXIT_DEBUG userspace exit), it is misunderstood as EMULATION_OK
      and x86_emulate_instruction() is called without having decoded the
      instruction.  This causes various havoc from running with a stale
      emulation context.
      
      The fix is to move the call to kvm_vcpu_check_breakpoint() where it was
      before commit 4aa2691d ("KVM: x86: Factor out x86 instruction
      emulation with decoding") introduced x86_decode_emulated_instruction().
      The other caller of the function does not need breakpoint checks,
      because it is invoked as part of a vmexit and the processor has already
      checked those before executing the instruction that #GP'd.
      
      This fixes CVE-2022-1852.
      Reported-by: Qiuhao Li <qiuhao@sysec.org>
      Reported-by: Gaoning Pan <pgn@zju.edu.cn>
      Reported-by: Yongkang Jia <kangel@zju.edu.cn>
      Fixes: 4aa2691d ("KVM: x86: Factor out x86 instruction emulation with decoding")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220311032801.3467418-2-seanjc@google.com>
      [Rewrote commit message according to Qiuhao's report, since a patch
       already existed to fix the bug. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak · d22d2474
      Ashish Kalra authored
      For some SEV ioctl interfaces, the length parameter that is passed may be
      less than or equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data
      that PSP firmware returns. In this case, kmalloc will allocate memory
      that is the size of the input rather than the size of the data.
      Since PSP firmware doesn't fully overwrite the allocated buffer, these
      SEV ioctl interfaces may return uninitialized kernel slab memory.
      Reported-by: Andy Nguyen <theflow@google.com>
      Suggested-by: David Rientjes <rientjes@google.com>
      Suggested-by: Peter Gonda <pgonda@google.com>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Fixes: eaf78265 ("KVM: SVM: Move SEV code to separate file")
      Fixes: 2c07ded0 ("KVM: SVM: add support for SEV attestation command")
      Fixes: 4cfdd47d ("KVM: SVM: Add KVM_SEV SEND_START command")
      Fixes: d3d1af85 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command")
      Fixes: eba04b20 ("KVM: x86: Account a variety of miscellaneous allocations")
      Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
      Reviewed-by: Peter Gonda <pgonda@google.com>
      Message-Id: <20220516154310.3685678-1-Ashish.Kalra@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) · d187ba53
      Sean Christopherson authored
      Set the starting uABI size of KVM's guest FPU to 'struct kvm_xsave',
      i.e. to KVM's historical uABI size.  When saving FPU state for userspace,
      KVM (well, now the FPU) sets the FP+SSE bits in the XSAVE header even if
      the host doesn't support XSAVE.  Setting the XSAVE header allows the VM
      to be migrated to a host that does support XSAVE without the new host
      having to handle FPU state that may or may not be compatible with XSAVE.
      
      Setting the uABI size to the host's default size results in out-of-bounds
      writes (setting the FP+SSE bits) and data corruption (that is thankfully
      caught by KASAN) when running on hosts without XSAVE, e.g. on Core2 CPUs.
      
      WARN if the default size is larger than KVM's historical uABI size; all
      features that can push the FPU size beyond the historical size must be
      opt-in.
      
        ==================================================================
        BUG: KASAN: slab-out-of-bounds in fpu_copy_uabi_to_guest_fpstate+0x86/0x130
        Read of size 8 at addr ffff888011e33a00 by task qemu-build/681
        CPU: 1 PID: 681 Comm: qemu-build Not tainted 5.18.0-rc5-KASAN-amd64 #1
        Hardware name:  /DG35EC, BIOS ECG3510M.86A.0118.2010.0113.1426 01/13/2010
        Call Trace:
         <TASK>
         dump_stack_lvl+0x34/0x45
         print_report.cold+0x45/0x575
         kasan_report+0x9b/0xd0
         fpu_copy_uabi_to_guest_fpstate+0x86/0x130
         kvm_arch_vcpu_ioctl+0x72a/0x1c50 [kvm]
         kvm_vcpu_ioctl+0x47f/0x7b0 [kvm]
         __x64_sys_ioctl+0x5de/0xc90
         do_syscall_64+0x31/0x50
         entry_SYSCALL_64_after_hwframe+0x44/0xae
         </TASK>
        Allocated by task 0:
        (stack is not available)
        The buggy address belongs to the object at ffff888011e33800
         which belongs to the cache kmalloc-512 of size 512
        The buggy address is located 0 bytes to the right of
         512-byte region [ffff888011e33800, ffff888011e33a00)
        The buggy address belongs to the physical page:
        page:0000000089cd4adb refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11e30
        head:0000000089cd4adb order:2 compound_mapcount:0 compound_pincount:0
        flags: 0x4000000000010200(slab|head|zone=1)
        raw: 4000000000010200 dead000000000100 dead000000000122 ffff888001041c80
        raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
        page dumped because: kasan: bad access detected
        Memory state around the buggy address:
         ffff888011e33900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         ffff888011e33980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        >ffff888011e33a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                           ^
         ffff888011e33a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
         ffff888011e33b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ==================================================================
        Disabling lock debugging due to kernel taint
      
      Fixes: be50b206 ("kvm: x86: Add support for getting/setting expanded xstate buffer")
      Fixes: c60427dd ("x86/fpu: Add uabi_size to guest_fpu")
      Reported-by: Zdenek Kaspar <zkaspar82@gmail.com>
      Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Tested-by: Zdenek Kaspar <zkaspar82@gmail.com>
      Message-Id: <20220504001219.983513-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • s390/uv_uapi: depend on CONFIG_S390 · eb3de2d8
      Paolo Bonzini authored
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge tag 'kvm-s390-next-5.19-1' of... · 1644e270
      Paolo Bonzini authored
      Merge tag 'kvm-s390-next-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      KVM: s390: Fix and feature for 5.19
      
      - ultravisor communication device driver
      - fix TEID on terminating storage key ops
    • Merge tag 'kvm-riscv-5.19-1' of https://github.com/kvm-riscv/linux into HEAD · b699da3d
      Paolo Bonzini authored
      KVM/riscv changes for 5.19
      
      - Added Sv57x4 support for G-stage page table
      - Added range based local HFENCE functions
      - Added remote HFENCE functions based on VCPU requests
      - Added ISA extension registers in ONE_REG interface
      - Updated KVM RISC-V maintainers entry to cover selftests support
    • Merge tag 'kvmarm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · 47e8eec8
      Paolo Bonzini authored
      KVM/arm64 updates for 5.19
      
      - Add support for the ARMv8.6 WFxT extension
      
      - Guard pages for the EL2 stacks
      
      - Trap and emulate AArch32 ID registers to hide unsupported features
      
      - Ability to select and save/restore the set of hypercalls exposed
        to the guest
      
      - Support for PSCI-initiated suspend in collaboration with userspace
      
      - GICv3 register-based LPI invalidation support
      
      - Move host PMU event merging into the vcpu data structure
      
      - GICv3 ITS save/restore fixes
      
      - The usual set of small-scale cleanups and fixes
      
      [Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated
       from 4 to 6. - Paolo]
    • KVM: selftests: x86: Fix test failure on arch lbr capable platforms · 825be3b5
      Yang Weijiang authored
      On Arch LBR capable platforms, LBR_FMT in the perf capabilities MSR is
      0x3f, so the last format test will fail. Use a truly invalid format (0x30)
      for the test if it's running on these platforms. Opportunistically change
      the file name to reflect the tests actually carried out.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
      Message-Id: <20220512084046.105479-1-weijiang.yang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: LAPIC: Trace LAPIC timer expiration on every vmentry · e0ac5351
      Wanpeng Li authored
      In commit ec0671d5 ("KVM: LAPIC: Delay trace_kvm_wait_lapic_expire
      tracepoint to after vmexit", 2019-06-04), trace_kvm_wait_lapic_expire
      was moved after guest_exit_irqoff() because invoking tracepoints within
      kvm_guest_enter/kvm_guest_exit caused a lockdep splat.
      
      These days this is not necessary, because commit 87fa7f3e ("x86/kvm:
      Move context tracking where it belongs", 2020-07-09) restricted
      the RCU extended quiescent state to be closer to vmentry/vmexit.
      Moving the tracepoint back to __kvm_wait_lapic_expire is more accurate,
      because it will be reported even if vcpu_enter_guest causes multiple
      vmentries via the IPI/Timer fast paths, and it allows the removal of
      advance_expire_delta.
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1650961551-38390-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 20 May, 2022 13 commits