- 08 Sep, 2016 11 commits
-
-
Marc Zyngier authored
It would make some sense to share the conditional execution code between 32 and 64bit. In order to achieve this, let's move that code to virt/kvm/arm/aarch32.c. While we're at it, drop a superfluous BUG_ON() that wasn't that useful. Following patches will migrate the 32bit port to that code base. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
In order to make emulate.c more generic, move the arch-specific manupulation bits out of emulate.c. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Vladimir Murzin authored
SCTLR_EL2.SPAN bit controls what happens with the PSTATE.PAN bit on an exception. However, this bit has no effect on the PSTATE.PAN when HCR_EL2.E2H or HCR_EL2.TGE is unset. Thus when VHE is used and exception taken from a guest PSTATE.PAN bit left unchanged and we continue with a value guest has set. To address that always reset PSTATE.PAN on entry from EL1. Fixes: 1f364c8c ("arm64: VHE: Add support for running Linux in EL2 mode") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: <stable@vger.kernel.org> # v4.6+ Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
When rewriting the assembly code to C code, it was useful to have exported aliases or static functions so that we could keep the existing common C code unmodified and at the same time rewrite arm64 from assembly to C code, and later do the arm part. Now when both are done, we really don't need this level of indirection anymore, and it's time to save a few lines and brain cells. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Mark Rutland authored
Now that 32-bit KVM no longer performs cache maintenance for page table updates, we no longer need empty stubs for arm64. Remove them. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Mark Rutland authored
When modifying Stage-2 page tables, we perform cache maintenance to account for non-coherent page table walks. However, this is unnecessary, as page table walks are guaranteed to be coherent in the presence of the virtualization extensions. Per ARM DDI 0406C.c, section B1.7 ("The Virtualization Extensions"), the virtualization extensions mandate the multiprocessing extensions. Per ARM DDI 0406C.c, section B3.10.1 ("General TLB maintenance requirements"), as described in the sub-section titled "TLB maintenance operations and the memory order model", this maintenance is not required in the presence of the multiprocessing extensions. Hence, we need not perform this cache maintenance when modifying Stage-2 entries. This patch removes the logic for performing the redundant maintenance. To ensure visibility and ordering of updates, a dsb(ishst) that was otherwise implicit in the maintenance is folded into kvm_set_pmd() and kvm_set_pte(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
As kvm_set_routing_entry() was changing prototype between 4.7 and 4.8, an ugly hack was put in place in order to survive both building in -next and the merge window. Now that everything has been merged, let's dump the compatibility hack for good. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Shanker Donthineni authored
We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant, the same information is available in tpidr_el2. The function __guest_exit() calling convention is slightly modified, caller only pushes the regs x0-x1 to stack instead of regs x0-x3. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
Just a rename so we can implement a v3-specific function later. We take the chance to get rid of the V2/V3 ops comments as well. No functional change. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
-
Christoffer Dall authored
As we are about to deal with multiple data types and situations where the vgic should not be initialized when doing userspace accesses on the register attributes, factor out the functionality of vgic_attr_regs_access into smaller bits which can be reused by a new function later. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
-
Christoffer Dall authored
Factor out the GICv3 and ITS-specific documentation into a separate documentation file. Add description for how to access distributor, redistributor, and CPU interface registers for GICv3 in this new file, and add a group for accessing level triggered IRQ information for GICv3 as well. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
- 07 Sep, 2016 8 commits
-
-
Jan Dakinevich authored
Expose the feature to L1 hypervisor if host CPU supports it, since certain hypervisors requires it for own purposes. According to Intel SDM A.1, if CPU supports the feature, VMX_INSTRUCTION_INFO field of VMCS will contain detailed information about INS/OUTS instructions handling. This field is already copied to VMCS12 for L1 hypervisor (see prepare_vmcs12 routine) independently feature presence. Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
setup_vmcs_config takes a pointer to the vmcs_config global. The indirection is somewhat pointless, but just keep things consistent for now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
handle_external_intr does not enable interrupts anymore, vcpu_enter_guest does it after calling guest_exit_irqoff. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
These are mostly related to nested VMX. They needn't have a loglevel as high as KERN_WARN, and mustn't be allowed to pollute the host logs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Jan Dakinevich authored
If EPT support is exposed to L1 hypervisor, guest linear-address field of VMCS should contain GVA of L2, the access to which caused EPT violation. Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Bhaktipriya Shridhar authored
The workqueue "irqfd_cleanup_wq" queues a single work item &irqfd->shutdown and hence doesn't require ordering. It is a host-wide workqueue for issuing deferred shutdown requests aggregated from all vm* instances. It is not being used on a memory reclaim path. Hence, it has been converted to use system_wq. The work item has been flushed in kvm_irqfd_release(). The workqueue "wqueue" queues a single work item &timer->expired and hence doesn't require ordering. Also, it is not being used on a memory reclaim path. Hence, it has been converted to use system_wq. System workqueues have been able to handle high level of concurrency for a long time now and hence it's not required to have a singlethreaded workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue created with create_singlethread_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantee unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wanpeng Li authored
Commit 61abdbe0 ("kvm: x86: make lapic hrtimer pinned") pins the emulated lapic timer. This patch does the same for the emulated nested preemption timer to avoid vmexit an unrelated vCPU and the latency of kicking IPI to another vCPU. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Liang Li authored
The validity check for the guest line address is inefficient, check the invalid value instead of enumerating the valid ones. Signed-off-by: Liang Li <liang.z.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 Sep, 2016 10 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuPaolo Bonzini authored
Merge IOMMU bits for virtualization of interrupt injection into virtual machines.
-
Suravee Suthikulpanit authored
Introduce struct iommu_dev_data.use_vapic flag, which IOMMU driver uses to determine if it should enable vAPIC support, by setting the ga_mode bit in the device's interrupt remapping table entry. Currently, it is enabled for all pass-through device if vAPIC mode is enabled. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
This patch implements irq_set_vcpu_affinity() function to set up interrupt remapping table entry with vapic mode for pass-through devices. In case requirements for vapic mode are not met, it falls back to set up the IRTE in legacy mode. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
Introduces a new IOMMU API, amd_iommu_update_ga(), which allows KVM (SVM) to update existing posted interrupt IOMMU IRTE when load/unload vcpu. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
This patch adds AMD IOMMU guest virtual APIC log (GALOG) handler. When IOMMU hardware receives an interrupt targeting a blocking vcpu, it creates an entry in the GALOG, and generates an interrupt to notify the AMD IOMMU driver. At this point, the driver processes the log entry, and notify the SVM driver via the registered iommu_ga_log_notifier function. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
This patch adds support to detect and initialize IOMMU Guest vAPIC log (GALOG). By default, it also enable GALog interrupt to notify IOMMU driver when GA Log entry is created. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
This patch enables support for the new 128-bit IOMMU IRTE format, which can be used for both legacy and vapic interrupt remapping modes. It replaces the existing operations on IRTE, which can only support the older 32-bit IRTE format, with calls to the new struct amd_irt_ops. It also provides helper functions for setting up, accessing, and updating interrupt remapping table entries in different mode. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
Currently, IOMMU support two interrupt remapping table entry formats, 32-bit (legacy) and 128-bit (GA). The spec also implies that it might support additional modes/formats in the future. So, this patch introduces the new struct amd_irte_ops, which allows the same code to work with different irte formats by providing hooks for various operations on an interrupt remapping table entry. Suggested-by: Joerg Roedel <joro@8bytes.org> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
Move existing unions and structs for accessing/managing IRTE to a proper header file. This is mainly to simplify variable declarations in subsequent patches. Besides, this patch also introduces new struct irte_ga for the new 128-bit IRTE format. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Suravee Suthikulpanit authored
This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir, which can be used to specify different interrupt remapping mode for passthrough devices to VM guest: * legacy: Legacy interrupt remapping (w/ 32-bit IRTE) * vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit IRTE) Note that in vapic mode, it can also supports legacy interrupt remapping for non-passthrough devices with the 128-bit IRTE. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 19 Aug, 2016 4 commits
-
-
Luwei Kang authored
Expose AVX512DQ, AVX512BW, AVX512VL feature to guest. Its spec can be found at: https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdfSigned-off-by: Luwei Kang <luwei.kang@intel.com> [Resolved a trivial conflict with removed F(PCOMMIT).] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Bandan Das authored
That parameter isn't used in these functions, it's probably a historical artifact. Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wanpeng Li authored
APIC map table is recalculated during reset APIC ID to the initial value when enabling LAPIC. This patch move the recalculate_apic_map() to the next branch since we don't need to recalculate apic map twice in current codes. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
James Hogan authored
When mapping a page into the guest we error check using is_error_pfn(), however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an error HVA for the page. This can only happen on MIPS right now due to unusual memslot management (e.g. being moved / removed / resized), or with an Enhanced Virtual Memory (EVA) configuration where the default KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed in a later patch). This case will be treated as a pfn of zero, mapping the first page of physical memory into the guest. It would appear the MIPS KVM port wasn't updated prior to being merged (in v3.10) to take commit 81c52c56 ("KVM: do not treat noslot pfn as a error pfn") into account (merged v3.8), which converted a bunch of is_error_pfn() calls to is_error_noslot_pfn(). Switch to using is_error_noslot_pfn() instead to catch this case properly. Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.10.y- Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 18 Aug, 2016 5 commits
-
-
Paolo Bonzini authored
Merge tag 'kvm-arm-for-v4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Fixes for v4.8-rc3 This tag contains the following fixes on top of v4.8-rc1: - ITS init issues - ITS error handling issues - ITS IRQ leakage fix - Plug a couple of ITS race conditions - An erratum workaround for timers - Some removal of misleading use of errors and comments - A fix for GICv3 on 32-bit guests
-
Peter Feiner authored
When the host supported TSC scaling, L2 would use a TSC multiplier of 0, which causes a VM entry failure. Now L2's TSC uses the same multiplier as L1. Signed-off-by: Peter Feiner <pfeiner@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the write with vmcs02 as the current VMCS. This will incorrectly apply modifications intended for vmcs01 to vmcs02 and L2 can use it to gain access to L0's x2APIC registers by disabling virtualized x2APIC while using msr bitmap that assumes enabled. Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the current VMCS. An alternative solution would temporarily make vmcs01 the current VMCS, but it requires more care. Fixes: 8d14695f ("x86, apicv: add virtual x2apic support") Reported-by: Jim Mattson <jmattson@google.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Radim Krčmář authored
msr bitmap can be used to avoid a VM exit (interception) on guest MSR accesses. In some configurations of VMX controls, the guest can even directly access host's x2APIC MSRs. See SDM 29.5 VIRTUALIZING MSR-BASED APIC ACCESSES. L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI. To do so, L1 would first trick KVM to disable all possible interceptions by enabling APICv features and then would turn those features off; nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would not intercept previously enabled MSRs even though they were not safe with the new configuration. Correctly re-enabling interceptions is not enough as a second bug would still allow L1+L2 to access host's MSRs: msr bitmap was shared for all VMCSs, so L1 could trigger a race to get the desired combination of msr bitmap and VMX controls. This fix allocates a msr bitmap for every L1 VCPU, allows only safe x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would have to intercept everything anyway. Fixes: 3af18d9c ("KVM: nVMX: Prepare for using hardware MSR bitmap") Reported-by: Jim Mattson <jmattson@google.com> Suggested-by: Wincy Van <fanwenyi0529@gmail.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix Fietkau. 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme. 3) Fix some tg3 ethtool logic bugs, and one that would cause no interrupts to be generated when rx-coalescing is set to 0. From Satish Baddipadige and Siva Reddy Kallam. 4) QLCNIC mailbox corruption and napi budget handling fix from Manish Chopra. 5) Fix fib_trie logic when walking the trie during /proc/net/route output than can access a stale node pointer. From David Forster. 6) Several sctp_diag fixes from Phil Sutter. 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel. 8) Checksum fixup fixes in bpf from Daniel Borkmann. 9) Memork leaks in nfnetlink, from Liping Zhang. 10) Use after free in rxrpc, from David Howells. 11) Use after free in new skb_array code of macvtap driver, from Jason Wang. 12) Calipso resource leak, from Colin Ian King. 13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang. 14) Fix bpf non-linear packet write helpers, from Daniel Borkmann. 15) Fix lockdep splats in macsec, from Sabrina Dubroca. 16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF handling. 17) Various tc-action bug fixes, from CONG Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) net_sched: allow flushing tc police actions net_sched: unify the init logic for act_police net_sched: convert tcf_exts from list to pointer array net_sched: move tc offload macros to pkt_cls.h net_sched: fix a typo in tc_for_each_action() net_sched: remove an unnecessary list_del() net_sched: remove the leftover cleanup_a() mlxsw: spectrum: Allow packets to be trapped from any PG mlxsw: spectrum: Unmap 802.1Q FID before destroying it mlxsw: spectrum: Add missing rollbacks in error path mlxsw: reg: Fix missing op field fill-up mlxsw: spectrum: Trap loop-backed packets mlxsw: spectrum: Add missing packet traps mlxsw: spectrum: Mark port as active before registering it mlxsw: spectrum: Create PVID vPort before registering netdevice mlxsw: spectrum: Remove redundant errors from the code mlxsw: spectrum: Don't return upon error in removal path i40e: check for and deal with non-contiguous TCs ixgbe: Re-enable ability to toggle VLAN filtering ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths ...
-
- 17 Aug, 2016 2 commits
-
-
David S. Miller authored
Cong Wang says: ==================== net_sched: tc action fixes and updates This patchset fixes a few regressions caused by the previous code refactor and more. Thanks to Jamal for catching them! Note, patch 3/7 and 4/7 are not strictly necessary for this patchset, I just want to carry them together. --- v4: adjust an indention for Jamal add two more patches v3: avoid list for fast path, suggested by Jamal v2: replace flex_array with regular dynamic array keep tcf_action_stats_update() in act_api.h fix macro typos found by Amir ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Roman Mashak authored
The act_police uses its own code to walk the action hashtable, which leads to that we could not flush standalone tc police actions, so just switch to tcf_generic_walker() like other actions. (Joint work from Roman and Cong.) Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-