- 11 Jun, 2020 10 commits
-
-
Paolo Bonzini authored
__kvm_set_memory_region does not use the hva at all, so trying to catch use-after-delete is pointless and, worse, it fails access_ok now that we apply it to all memslots including private kernel ones. This fixes an AVIC regression. Fixes: 09d952c9 ("KVM: check userspace_addr for all memslots") Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
It was reported that older GCCs compile smm_test in a way that breaks it completely: kvm_exit: reason EXIT_CPUID rip 0x4014db info 0 0 func 7ffffffd idx 830 rax 0 rbx 0 rcx 0 rdx 0, cpuid entry not found ... kvm_exit: reason EXIT_MSR rip 0x40abd9 info 0 0 kvm_msr: msr_read 487 = 0x0 (#GP) ... Note, '7ffffffd' was supposed to be '80000001' as we're checking for SVM. Dropping '-O2' from compiler flags help. Turns out, asm block in sync_with_host() is wrong. We us 'in 0xe, %%al' instruction to sync with the host and in 'AL' register we actually pass the parameter (stage) but after sync 'AL' gets written to but GCC thinks the value is still there and uses it to compute 'EAX' for 'cpuid'. smm_test can't fully use standard ucall() framework as we need to write a very simple SMI handler there. Fix the immediate issue by making RAX input/output operand. While on it, make sync_with_host() static inline. Reported-by: Marcelo Bandeira Condotta <mcondotta@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610164116.770811-1-vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
'Page not present' event may or may not get injected depending on guest's state. If the event wasn't injected, there is no need to inject the corresponding 'page ready' event as the guest may get confused. E.g. Linux thinks that the corresponding 'page not present' event wasn't delivered *yet* and allocates a 'dummy entry' for it. This entry is never freed. Note, 'wakeup all' events have no corresponding 'page not present' event and always get injected. s390 seems to always be able to inject 'page not present', the change is effectively a nop. Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610175532.779793-2-vkuznets@redhat.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208081Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
schedule_work() returns 'false' only when the work is already on the queue and this can't happen as kvm_setup_async_pf() always allocates a new one. Also, to avoid potential race, it makes sense to to schedule_work() at the very end after we've added it to the queue. While on it, do some minor cleanup. gfn_to_pfn_async() mentioned in a comment does not currently exist and, moreover, we can check kvm_is_error_hva() at the very beginning, before we try to allocate work so 'retry_sync' label can go away completely. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610175532.779793-1-vkuznets@redhat.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Colin Ian King authored
The pointer s is being assigned a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Message-Id: <20200609233121.1118683-1-colin.king@canonical.com> Fixes: 7837699f ("KVM: In kernel PIT model") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Felipe Franciosi authored
When userspace configures KVM_GUESTDBG_SINGLESTEP, KVM will manage the presence of X86_EFLAGS_TF via kvm_set/get_rflags on vcpus. The actual rflag bit is therefore hidden from callers. That includes init_emulate_ctxt() which uses the value returned from kvm_get_flags() to set ctxt->tf. As a result, x86_emulate_instruction() will skip a single step, leaving singlestep_rip stale and not returning to userspace. This resolves the issue by observing the vcpu guest_debug configuration alongside ctxt->tf in x86_emulate_instruction(), performing the single step if set. Cc: stable@vger.kernel.org Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Message-Id: <20200519081048.8204-1-felipe@nutanix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
KVM_CAP_HYPERV_ENLIGHTENED_VMCS will be reported as supported even when nested VMX is not, fix evmcs_test/hyperv_cpuid tests to check for both. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610135847.754289-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
state_test/smm_test use KVM_CAP_NESTED_STATE check as an indicator for nested VMX/SVM presence and this is incorrect. Check for the required features dirrectly. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610135847.754289-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Using a topic branch so that stable branches can simply cherry-pick the patch. Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Consult only the basic exit reason, i.e. bits 15:0 of vmcs.EXIT_REASON, when determining whether a nested VM-Exit should be reflected into L1 or handled by KVM in L0. For better or worse, the switch statement in nested_vmx_exit_reflected() currently defaults to "true", i.e. reflects any nested VM-Exit without dedicated logic. Because the case statements only contain the basic exit reason, any VM-Exit with modifier bits set will be reflected to L1, even if KVM intended to handle it in L0. Practically speaking, this only affects EXIT_REASON_MCE_DURING_VMENTRY, i.e. a #MC that occurs on nested VM-Enter would be incorrectly routed to L1, as "failed VM-Entry" is the only modifier that KVM can currently encounter. The SMM modifiers will never be generated as KVM doesn't support/employ a SMI Transfer Monitor. Ditto for "exit from enclave", as KVM doesn't yet support virtualizing SGX, i.e. it's impossible to enter an enclave in a KVM guest (L1 or L2). Fixes: 644d711a ("KVM: nVMX: Deciding if L0 or L1 should handle an L2 exit") Cc: Jim Mattson <jmattson@google.com> Cc: Xiaoyao Li <xiaoyao.li@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200227174430.26371-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 09 Jun, 2020 2 commits
-
-
Sean Christopherson authored
Make x86_fpu_cache static now that FPU allocation and destruction is handled entirely by common x86 code. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200608180218.20946-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Explicitly set the VA width to 48 bits for the x86_64-only PXXV48_4K VM mode instead of asserting the guest VA width is 48 bits. The fact that KVM supports 5-level paging is irrelevant unless the selftests opt-in to 5-level paging by setting CR4.LA57 for the guest. The overzealous assert prevents running the selftests on a kernel with 5-level paging enabled. Incorporate LA57 into the assert instead of removing the assert entirely as a sanity check of KVM's CPUID output. Fixes: 567a9f1e ("KVM: selftests: Introduce VM_MODE_PXXV48_4K") Reported-by: Sergio Perez Gonzalez <sergio.perez.gonzalez@intel.com> Cc: Adriana Cervantes Jimenez <adriana.cervantes.jimenez@intel.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200528021530.28091-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 08 Jun, 2020 6 commits
-
-
Eiichi Tsukata authored
Commit b1394e74 ("KVM: x86: fix APIC page invalidation") tried to fix inappropriate APIC page invalidation by re-introducing arch specific kvm_arch_mmu_notifier_invalidate_range() and calling it from kvm_mmu_notifier_invalidate_range_start. However, the patch left a possible race where the VMCS APIC address cache is updated *before* it is unmapped: (Invalidator) kvm_mmu_notifier_invalidate_range_start() (Invalidator) kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD) (KVM VCPU) vcpu_enter_guest() (KVM VCPU) kvm_vcpu_reload_apic_access_page() (Invalidator) actually unmap page Because of the above race, there can be a mismatch between the host physical address stored in the APIC_ACCESS_PAGE VMCS field and the host physical address stored in the EPT entry for the APIC GPA (0xfee0000). When this happens, the processor will not trap APIC accesses, and will instead show the raw contents of the APIC-access page. Because Windows OS periodically checks for unexpected modifications to the LAPIC register, this will show up as a BSOD crash with BugCheck CRITICAL_STRUCTURE_CORRUPTION (109) we are currently seeing in https://bugzilla.redhat.com/show_bug.cgi?id=1751017. The root cause of the issue is that kvm_arch_mmu_notifier_invalidate_range() cannot guarantee that no additional references are taken to the pages in the range before kvm_mmu_notifier_invalidate_range_end(). Fortunately, this case is supported by the MMU notifier API, as documented in include/linux/mmu_notifier.h: * If the subsystem * can't guarantee that no additional references are taken to * the pages in the range, it has to implement the * invalidate_range() notifier to remove any references taken * after invalidate_range_start(). The fix therefore is to reload the APIC-access page field in the VMCS from kvm_mmu_notifier_invalidate_range() instead of ..._range_start(). Cc: stable@vger.kernel.org Fixes: b1394e74 ("KVM: x86: fix APIC page invalidation") Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=197951Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Message-Id: <20200606042627.61070-1-eiichi.tsukata@nutanix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
is_intercept takes an INTERCEPT_* constant, not SVM_EXIT_*; because of this, the compiler was removing the body of the conditionals, as if is_intercept returned 0. This unveils a latent bug: when clearing the VINTR intercept, int_ctl must also be changed in the L1 VMCB (svm->nested.hsave), just like the intercept itself is also changed in the L1 VMCB. Otherwise V_IRQ remains set and, due to the VINTR intercept being clear, we get a spurious injection of a vector 0 interrupt on the next L2->L1 vmexit. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
GCC10 fails to build vmx_preemption_timer_test: gcc -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 -fno-stack-protector -fno-PIE -I../../../../tools/include -I../../../../tools/arch/x86/include -I../../../../usr/include/ -Iinclude -Ix86_64 -Iinclude/x86_64 -I.. -pthread -no-pie x86_64/evmcs_test.c ./linux/tools/testing/selftests/kselftest_harness.h ./linux/tools/testing/selftests/kselftest.h ./linux/tools/testing/selftests/kvm/libkvm.a -o ./linux/tools/testing/selftests/kvm/x86_64/evmcs_test /usr/bin/ld: ./linux/tools/testing/selftests/kvm/libkvm.a(vmx.o): ./linux/tools/testing/selftests/kvm/include/x86_64/vmx.h:603: multiple definition of `ctrl_exit_rev'; /tmp/ccMQpvNt.o: ./linux/tools/testing/selftests/kvm/include/x86_64/vmx.h:603: first defined here /usr/bin/ld: ./linux/tools/testing/selftests/kvm/libkvm.a(vmx.o): ./linux/tools/testing/selftests/kvm/include/x86_64/vmx.h:602: multiple definition of `ctrl_pin_rev'; /tmp/ccMQpvNt.o: ./linux/tools/testing/selftests/kvm/include/x86_64/vmx.h:602: first defined here ... ctrl_exit_rev/ctrl_pin_rev/basic variables are only used in vmx_preemption_timer_test.c, just move them there. Fixes: 8d7fbf01 ("KVM: selftests: VMX preemption timer migration test") Reported-by: Marcelo Bandeira Condotta <mcondotta@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200608112346.593513-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Add x86_64/debug_regs to .gitignore. Reported-by: Marcelo Bandeira Condotta <mcondotta@redhat.com> Fixes: 449aa906 ("KVM: selftests: Add KVM_SET_GUEST_DEBUG test") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200608112346.593513-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
handle_vmptrst()/handle_vmread() stopped injecting #PF unconditionally and switched to nested_vmx_handle_memory_failure() which just kills the guest with KVM_EXIT_INTERNAL_ERROR in case of MMIO access, zeroing 'exception' in kvm_write_guest_virt_system() is not needed anymore. This reverts commit 541ab2ae. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200605115906.532682-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Syzbot reports the following issue: WARNING: CPU: 0 PID: 6819 at arch/x86/kvm/x86.c:618 kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618 ... Call Trace: ... RIP: 0010:kvm_inject_emulated_page_fault+0x210/0x290 arch/x86/kvm/x86.c:618 ... nested_vmx_get_vmptr+0x1f9/0x2a0 arch/x86/kvm/vmx/nested.c:4638 handle_vmon arch/x86/kvm/vmx/nested.c:4767 [inline] handle_vmon+0x168/0x3a0 arch/x86/kvm/vmx/nested.c:4728 vmx_handle_exit+0x29c/0x1260 arch/x86/kvm/vmx/vmx.c:6067 'exception' we're trying to inject with kvm_inject_emulated_page_fault() comes from: nested_vmx_get_vmptr() kvm_read_guest_virt() kvm_read_guest_virt_helper() vcpu->arch.walk_mmu->gva_to_gpa() but it is only set when GVA to GPA conversion fails. In case it doesn't but we still fail kvm_vcpu_read_guest_page(), X86EMUL_IO_NEEDED is returned and nested_vmx_get_vmptr() calls kvm_inject_emulated_page_fault() with zeroed 'exception'. This happen when the argument is MMIO. Paolo also noticed that nested_vmx_get_vmptr() is not the only place in KVM code where kvm_read/write_guest_virt*() return result is mishandled. VMX instructions along with INVPCID have the same issue. This was already noticed before, e.g. see commit 541ab2ae ("KVM: x86: work around leak of uninitialized stack contents") but was never fully fixed. KVM could've handled the request correctly by going to userspace and performing I/O but there doesn't seem to be a good need for such requests in the first place. Introduce vmx_handle_memory_failure() as an interim solution. Note, nested_vmx_get_vmptr() now has three possible outcomes: OK, PF, KVM_EXIT_INTERNAL_ERROR and callers need to know if userspace exit is needed (for KVM_EXIT_INTERNAL_ERROR) in case of failure. We don't seem to have a good enum describing this tristate, just add "int *ret" to nested_vmx_get_vmptr() interface to pass the information. Reported-by: syzbot+2a7156e11dc199bdbd8a@syzkaller.appspotmail.com Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200605115906.532682-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 Jun, 2020 2 commits
-
-
Paolo Bonzini authored
Instructions starting with 0f18 up to 0f1f are reserved nops, except those that were assigned to MPX. These include the endbr markers used by CET. List them correctly in the opcode table. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Marcelo reports that kvm selftests fail to build with "make ARCH=x86_64": gcc -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 -fno-stack-protector -fno-PIE -I../../../../tools/include -I../../../../tools/arch/x86_64/include -I../../../../usr/include/ -Iinclude -Ilib -Iinclude/x86_64 -I.. -c lib/kvm_util.c -o /var/tmp/20200604202744-bin/lib/kvm_util.o In file included from lib/kvm_util.c:11: include/x86_64/processor.h:14:10: fatal error: asm/msr-index.h: No such file or directory #include <asm/msr-index.h> ^~~~~~~~~~~~~~~~~ compilation terminated. "make ARCH=x86", however, works. The problem is that arch specific headers for x86_64 live in 'tools/arch/x86/include', not in 'tools/arch/x86_64/include'. Fixes: 66d69e08 ("selftests: fix kvm relocatable native/cross builds and installs") Reported-by: Marcelo Bandeira Condotta <mcondotta@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200605142028.550068-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 04 Jun, 2020 20 commits
-
-
Paolo Bonzini authored
Merge tag 'kvm-ppc-next-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD PPC KVM update for 5.8 - Updates and bug fixes for secure guest support - Other minor bug fixes and cleanups.
-
Anthony Yznaga authored
Consolidate the code and correct the comments to show that the actions taken to update existing mappings to disable or enable dirty logging are not necessary when creating, moving, or deleting a memslot. Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Message-Id: <1591128450-11977-4-git-send-email-anthony.yznaga@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Anthony Yznaga authored
On large memory guests it has been observed that creating a memslot for a very large range can take noticeable amount of time. Investigation showed that the time is spent walking the rmaps to update existing sptes to remove write access or set/clear dirty bits to support dirty logging. These rmap walks are unnecessary when creating or moving a memslot. A newly created memslot will not have any existing mappings, and the existing mappings of a moved memslot will have been invalidated and flushed. Any mappings established once the new/moved memslot becomes visible will be set using the properties of the new slot. Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Message-Id: <1591128450-11977-3-git-send-email-anthony.yznaga@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Anthony Yznaga authored
There's no write access to remove. An existing memslot cannot be updated to set or clear KVM_MEM_READONLY, and any mappings established in a newly created or moved read-only memslot will already be read-only. Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com> Message-Id: <1591128450-11977-2-git-send-email-anthony.yznaga@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Denis Efremov authored
Replace opencoded alloc and copy with vmemdup_user(). Signed-off-by: Denis Efremov <efremov@linux.com> Message-Id: <20200603101131.2107303-1-efremov@linux.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Remove KVM_DEBUG_FS, which can easily be misconstrued as controlling KVM-as-a-host. The sole user of CONFIG_KVM_DEBUG_FS was removed by commit cfd8983f ("x86, locking/spinlocks: Remove ticket (spin)lock implementation"). Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200528031121.28904-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
This patch enable KVM support for Loongson-3 by selecting HAVE_KVM, but only enable KVM/VZ on Loongson-3A R4+ (because VZ of early processors are incomplete). Besides, Loongson-3 support SMP guests, so we clear the linked load bit of LLAddr in kvm_vz_vcpu_load() if the guest has more than one VCPUs. Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-15-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
This patch add more MMIO load/store instructions emulation, which can be observed in QXL and some other device drivers: 1, LWL, LWR, LDW, LDR, SWL, SWR, SDL and SDR for all MIPS; 2, GSLBX, GSLHX, GSLWX, GSLDX, GSSBX, GSSHX, GSSWX and GSSDX for Loongson-3. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-14-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Loongson-3 has CONFIG6 and DIAG registers which need to be emulated. CONFIG6 is mostly used to enable/disable FTLB and SFB, while DIAG is mostly used to flush BTB, ITLB, DTLB, VTLB and FTLB. Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-13-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Loongson-3 overrides lwc2 instructions to implement CPUCFG and CSR read/write functions. These instructions all cause guest exit so CSR doesn't benifit KVM guest (and there are always legacy methods to provide the same functions as CSR). So, we only emulate CPUCFG and let it return a reduced feature list (which means the virtual CPU doesn't have any other advanced features, including CSR) in KVM. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-12-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
This patch add Loongson-3 Virtual IPI interrupt support in the kernel. The current implementation of IPI emulation in QEMU is based on GIC for MIPS, but Loongson-3 doesn't use GIC. Furthermore, IPI emulation in QEMU is too expensive for performance (because of too many context switches between Host and Guest). With current solution, the IPI delay may even cause RCU stall warnings in a multi-core Guest. So, we design a faster solution that emulate IPI interrupt in kernel (only used by Loongson-3 now). Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-11-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
In current implementation, MIPS KVM uses IP2, IP3, IP4 and IP7 for external interrupt, two kinds of IPIs and timer interrupt respectively, but Loongson-3 based machines prefer to use IP2, IP3, IP6 and IP7 for two kinds of external interrupts, IPI and timer interrupt. So we define two priority-irq mapping tables: kvm_loongson3_priority_to_irq[] for Loongson-3, and kvm_default_priority_to_irq[] for others. The virtual interrupt infrastructure is updated to deliver all types of interrupts from IP2, IP3, IP4, IP6 and IP7. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-10-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Loongson-3's indexed cache operations need a node-id in the address, but in KVM guest the node-id may be incorrect. So, let indexed cache operations cause guest exit on Loongson-3. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-9-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
KVM guest has two levels of address translation: guest tlb translates GVA to GPA, and root tlb translates GPA to HPA. By default guest's CCA is controlled by guest tlb, but Loongson-3 maintains all cache coherency by hardware (including multi-core coherency and I/O DMA coherency) so it prefers all guest mappings be cacheable mappings. Thus, we use root tlb to control guest's CCA for Loongson-3. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-8-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Loongson-3 has lddir/ldpte instructions and their related CP0 registers are the same as HTW. So we introduce a cpu_guest_has_ldpte flag and use it to indicate whether we need to save/restore HTW related CP0 registers (PWBase, PWSize, PWField and PWCtl). Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-7-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Loongson-3 can use lddir/ldpte instuctions to accelerate page table walking, so use them to lookup gpa_mm.pgd. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-6-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Add EVENTFD support for KVM/MIPS, which is needed by VHOST. Tested on Loongson-3 platform. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-5-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Huacai Chen authored
Loongson-3 based machines can have as many as 16 CPUs, and so does memory slots, so increase KVM_MAX_VCPUS and KVM_USER_MEM_SLOTS to 16. Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Message-Id: <1590220602-3547-4-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Xing Li authored
If a CPU support more than 32bit vmbits (which is true for 64bit CPUs), VPN2_MASK set to fixed 0xffffe000 will lead to a wrong EntryHi in some functions such as _kvm_mips_host_tlb_inv(). The cpu_vmbits definition of 32bit CPU in cpu-features.h is 31, so we still use the old definition. Cc: Stable <stable@vger.kernel.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Xing Li <lixing@loongson.cn> [Huacai: Improve commit messages] Signed-off-by: Huacai Chen <chenhc@lemote.com> Message-Id: <1590220602-3547-3-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Xing Li authored
The code in decode_config4() of arch/mips/kernel/cpu-probe.c asid_mask = MIPS_ENTRYHI_ASID; if (config4 & MIPS_CONF4_AE) asid_mask |= MIPS_ENTRYHI_ASIDX; set_cpu_asid_mask(c, asid_mask); set asid_mask to cpuinfo->asid_mask. So in order to support variable ASID_MASK, KVM_ENTRYHI_ASID should also be changed to cpu_asid_mask(&boot_cpu_data). Cc: Stable <stable@vger.kernel.org> #4.9+ Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Xing Li <lixing@loongson.cn> [Huacai: Change current_cpu_data to boot_cpu_data for optimization] Signed-off-by: Huacai Chen <chenhc@lemote.com> Message-Id: <1590220602-3547-2-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-