Commit b9085bcb authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM update from Paolo Bonzini:
 "Fairly small update, but there are some interesting new features.

  Common:
     Optional support for adding a small amount of polling on each HLT
     instruction executed in the guest (or equivalent for other
     architectures).  This can improve latency up to 50% on some
     scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests).  This
     also has to be enabled manually for now, but the plan is to
     auto-tune this in the future.

  ARM/ARM64:
     The highlights are support for GICv3 emulation and dirty page
     tracking

  s390:
     Several optimizations and bugfixes.  Also a first: a feature
     exposed by KVM (UUID and long guest name in /proc/sysinfo) before
     it is available in IBM's hypervisor! :)

  MIPS:
     Bugfixes.

  x86:
     Support for PML (page modification logging, a new feature in
     Broadwell Xeons that speeds up dirty page tracking), nested
     virtualization improvements (nested APICv---a nice optimization),
     usual round of emulation fixes.

     There is also a new option to reduce latency of the TSC deadline
     timer in the guest; this needs to be tuned manually.

     Some commits are common between this pull and Catalin's; I see you
     have already included his tree.

  Powerpc:
     Nothing yet.

     The KVM/PPC changes will come in through the PPC maintainers,
     because I haven't received them yet and I might end up being
     offline for some part of next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
  KVM: ia64: drop kvm.h from installed user headers
  KVM: x86: fix build with !CONFIG_SMP
  KVM: x86: emulate: correct page fault error code for NoWrite instructions
  KVM: Disable compat ioctl for s390
  KVM: s390: add cpu model support
  KVM: s390: use facilities and cpu_id per KVM
  KVM: s390/CPACF: Choose crypto control block format
  s390/kernel: Update /proc/sysinfo file with Extended Name and UUID
  KVM: s390: reenable LPP facility
  KVM: s390: floating irqs: fix user triggerable endless loop
  kvm: add halt_poll_ns module parameter
  kvm: remove KVM_MMIO_SIZE
  KVM: MIPS: Don't leak FPU/DSP to guest
  KVM: MIPS: Disable HTW while in guest
  KVM: nVMX: Enable nested posted interrupt processing
  KVM: nVMX: Enable nested virtual interrupt delivery
  KVM: nVMX: Enable nested apic register virtualization
  KVM: nVMX: Make nested control MSRs per-cpu
  KVM: nVMX: Enable nested virtualize x2apic mode
  KVM: nVMX: Prepare for using hardware MSR bitmap
  ...
parents c7d7b986 6557bada
......@@ -612,11 +612,14 @@ Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error
Creates an interrupt controller model in the kernel. On x86, creates a virtual
ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23
only go to the IOAPIC. On ARM/arm64, a GIC is
created. On s390, a dummy irq routing table is created.
Creates an interrupt controller model in the kernel.
On x86, creates a virtual ioapic, a virtual PIC (two PICs, nested), and sets up
future vcpus to have a local APIC. IRQ routing for GSIs 0-15 is set to both
PIC and IOAPIC; GSI 16-23 only go to the IOAPIC.
On ARM/arm64, a GICv2 is created. Any other GIC versions require the usage of
KVM_CREATE_DEVICE, which also supports creating a GICv2. Using
KVM_CREATE_DEVICE is preferred over KVM_CREATE_IRQCHIP for GICv2.
On s390, a dummy irq routing table is created.
Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.
......@@ -2312,7 +2315,7 @@ struct kvm_s390_interrupt {
type can be one of the following:
KVM_S390_SIGP_STOP (vcpu) - sigp restart
KVM_S390_SIGP_STOP (vcpu) - sigp stop; optional flags in parm
KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu) - restart
......@@ -3225,3 +3228,23 @@ userspace from doing that.
If the hcall number specified is not one that has an in-kernel
implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
error.
7.2 KVM_CAP_S390_USER_SIGP
Architectures: s390
Parameters: none
This capability controls which SIGP orders will be handled completely in user
space. With this capability enabled, all fast orders will be handled completely
in the kernel:
- SENSE
- SENSE RUNNING
- EXTERNAL CALL
- EMERGENCY SIGNAL
- CONDITIONAL EMERGENCY SIGNAL
All other orders will be handled completely in user space.
Only privileged operation exceptions will be checked for in the kernel (or even
in the hardware prior to interception). If this capability is not enabled, the
old way of handling SIGP orders is used (partially in kernel and user space).
......@@ -3,22 +3,42 @@ ARM Virtual Generic Interrupt Controller (VGIC)
Device types supported:
KVM_DEV_TYPE_ARM_VGIC_V2 ARM Generic Interrupt Controller v2.0
KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0
Only one VGIC instance may be instantiated through either this API or the
legacy KVM_CREATE_IRQCHIP api. The created VGIC will act as the VM interrupt
controller, requiring emulated user-space devices to inject interrupts to the
VGIC instead of directly to CPUs.
Creating a guest GICv3 device requires a host GICv3 as well.
GICv3 implementations with hardware compatibility support allow a guest GICv2
as well.
Groups:
KVM_DEV_ARM_VGIC_GRP_ADDR
Attributes:
KVM_VGIC_V2_ADDR_TYPE_DIST (rw, 64-bit)
Base address in the guest physical address space of the GIC distributor
register mappings.
register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V2.
This address needs to be 4K aligned and the region covers 4 KByte.
KVM_VGIC_V2_ADDR_TYPE_CPU (rw, 64-bit)
Base address in the guest physical address space of the GIC virtual cpu
interface register mappings.
interface register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V2.
This address needs to be 4K aligned and the region covers 4 KByte.
KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit)
Base address in the guest physical address space of the GICv3 distributor
register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
This address needs to be 64K aligned and the region covers 64 KByte.
KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit)
Base address in the guest physical address space of the GICv3
redistributor register mappings. There are two 64K pages for each
VCPU and all of the redistributor pages are contiguous.
Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
This address needs to be 64K aligned.
KVM_DEV_ARM_VGIC_GRP_DIST_REGS
Attributes:
......@@ -36,6 +56,7 @@ Groups:
the register.
Limitations:
- Priorities are not implemented, and registers are RAZ/WI
- Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
Errors:
-ENODEV: Getting or setting this register is not yet supported
-EBUSY: One or more VCPUs are running
......@@ -68,6 +89,7 @@ Groups:
Limitations:
- Priorities are not implemented, and registers are RAZ/WI
- Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
Errors:
-ENODEV: Getting or setting this register is not yet supported
-EBUSY: One or more VCPUs are running
......@@ -81,3 +103,14 @@ Groups:
-EINVAL: Value set is out of the expected range
-EBUSY: Value has already be set, or GIC has already been initialized
with default values.
KVM_DEV_ARM_VGIC_GRP_CTRL
Attributes:
KVM_DEV_ARM_VGIC_CTRL_INIT
request the initialization of the VGIC, no additional parameter in
kvm_device_attr.addr.
Errors:
-ENXIO: VGIC not properly configured as required prior to calling
this attribute
-ENODEV: no online VCPU
-ENOMEM: memory shortage when allocating vgic internal data
......@@ -24,3 +24,62 @@ Returns: 0
Clear the CMMA status for all guest pages, so any pages the guest marked
as unused are again used any may not be reclaimed by the host.
1.3. ATTRIBUTE KVM_S390_VM_MEM_LIMIT_SIZE
Parameters: in attr->addr the address for the new limit of guest memory
Returns: -EFAULT if the given address is not accessible
-EINVAL if the virtual machine is of type UCONTROL
-E2BIG if the given guest memory is to big for that machine
-EBUSY if a vcpu is already defined
-ENOMEM if not enough memory is available for a new shadow guest mapping
0 otherwise
Allows userspace to query the actual limit and set a new limit for
the maximum guest memory size. The limit will be rounded up to
2048 MB, 4096 GB, 8192 TB respectively, as this limit is governed by
the number of page table levels.
2. GROUP: KVM_S390_VM_CPU_MODEL
Architectures: s390
2.1. ATTRIBUTE: KVM_S390_VM_CPU_MACHINE (r/o)
Allows user space to retrieve machine and kvm specific cpu related information:
struct kvm_s390_vm_cpu_machine {
__u64 cpuid; # CPUID of host
__u32 ibc; # IBC level range offered by host
__u8 pad[4];
__u64 fac_mask[256]; # set of cpu facilities enabled by KVM
__u64 fac_list[256]; # set of cpu facilities offered by host
}
Parameters: address of buffer to store the machine related cpu data
of type struct kvm_s390_vm_cpu_machine*
Returns: -EFAULT if the given address is not accessible from kernel space
-ENOMEM if not enough memory is available to process the ioctl
0 in case of success
2.2. ATTRIBUTE: KVM_S390_VM_CPU_PROCESSOR (r/w)
Allows user space to retrieve or request to change cpu related information for a vcpu:
struct kvm_s390_vm_cpu_processor {
__u64 cpuid; # CPUID currently (to be) used by this vcpu
__u16 ibc; # IBC level currently (to be) used by this vcpu
__u8 pad[6];
__u64 fac_list[256]; # set of cpu facilities currently (to be) used
# by this vcpu
}
KVM does not enforce or limit the cpu model data in any form. Take the information
retrieved by means of KVM_S390_VM_CPU_MACHINE as hint for reasonable configuration
setups. Instruction interceptions triggered by additionally set facilitiy bits that
are not handled by KVM need to by imlemented in the VM driver code.
Parameters: address of buffer to store/set the processor related cpu
data of type struct kvm_s390_vm_cpu_processor*.
Returns: -EBUSY in case 1 or more vcpus are already activated (only in write case)
-EFAULT if the given address is not accessible from kernel space
-ENOMEM if not enough memory is available to process the ioctl
0 in case of success
......@@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
extern void __kvm_flush_vm_context(void);
extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
#endif
......
......@@ -23,6 +23,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_mmio.h>
#include <asm/kvm_arm.h>
#include <asm/cputype.h>
unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
......@@ -177,9 +178,9 @@ static inline u32 kvm_vcpu_hvc_get_imm(struct kvm_vcpu *vcpu)
return kvm_vcpu_get_hsr(vcpu) & HSR_HVC_IMM_MASK;
}
static inline unsigned long kvm_vcpu_get_mpidr(struct kvm_vcpu *vcpu)
static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
{
return vcpu->arch.cp15[c0_MPIDR];
return vcpu->arch.cp15[c0_MPIDR] & MPIDR_HWID_BITMASK;
}
static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
......
......@@ -68,6 +68,7 @@ struct kvm_arch {
/* Interrupt controller */
struct vgic_dist vgic;
int max_vcpus;
};
#define KVM_NR_MEM_OBJS 40
......@@ -144,6 +145,7 @@ struct kvm_vm_stat {
};
struct kvm_vcpu_stat {
u32 halt_successful_poll;
u32 halt_wakeup;
};
......@@ -231,6 +233,10 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
int kvm_perf_init(void);
int kvm_perf_teardown(void);
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
static inline void kvm_arch_hardware_disable(void) {}
static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
......
......@@ -37,6 +37,7 @@ struct kvm_exit_mmio {
u8 data[8];
u32 len;
bool is_write;
void *private;
};
static inline void kvm_prepare_mmio(struct kvm_run *run,
......
......@@ -115,6 +115,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
pmd_val(*pmd) |= L_PMD_S2_RDWR;
}
static inline void kvm_set_s2pte_readonly(pte_t *pte)
{
pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
}
static inline bool kvm_s2pte_readonly(pte_t *pte)
{
return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
}
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
{
pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
}
static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
{
return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
}
/* Open coded p*d_addr_end that can deal with 64bit addresses */
#define kvm_pgd_addr_end(addr, end) \
({ u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK; \
......
......@@ -129,6 +129,7 @@
#define L_PTE_S2_RDONLY (_AT(pteval_t, 1) << 6) /* HAP[1] */
#define L_PTE_S2_RDWR (_AT(pteval_t, 3) << 6) /* HAP[2:1] */
#define L_PMD_S2_RDONLY (_AT(pmdval_t, 1) << 6) /* HAP[1] */
#define L_PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
/*
......
......@@ -175,6 +175,8 @@ struct kvm_arch_memory_slot {
#define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
#define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
#define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3
#define KVM_DEV_ARM_VGIC_GRP_CTRL 4
#define KVM_DEV_ARM_VGIC_CTRL_INIT 0
/* KVM_IRQ_LINE irq field index values */
#define KVM_ARM_IRQ_TYPE_SHIFT 24
......
......@@ -21,8 +21,10 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_ARCH_TLB_FLUSH_ALL
select KVM_MMIO
select KVM_ARM_HOST
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select SRCU
depends on ARM_VIRT_EXT && ARM_LPAE
---help---
......
......@@ -22,4 +22,5 @@ obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o
obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic.o
obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
obj-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
obj-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
......@@ -132,6 +132,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
/* Mark the initial VMID generation invalid */
kvm->arch.vmid_gen = 0;
/* The maximum number of VCPUs is limited by the host's GIC model */
kvm->arch.max_vcpus = kvm_vgic_get_max_vcpus();
return ret;
out_free_stage2_pgd:
kvm_free_stage2_pgd(kvm);
......@@ -218,6 +221,11 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
goto out;
}
if (id >= kvm->arch.max_vcpus) {
err = -EINVAL;
goto out;
}
vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
if (!vcpu) {
err = -ENOMEM;
......@@ -241,9 +249,8 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
return ERR_PTR(err);
}
int kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
{
return 0;
}
void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
......@@ -777,9 +784,39 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
}
}
/**
* kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
* @kvm: kvm instance
* @log: slot id and address to which we copy the log
*
* Steps 1-4 below provide general overview of dirty page logging. See
* kvm_get_dirty_log_protect() function description for additional details.
*
* We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
* always flush the TLB (step 4) even if previous step failed and the dirty
* bitmap may be corrupt. Regardless of previous outcome the KVM logging API
* does not preclude user space subsequent dirty log read. Flushing TLB ensures
* writes will be marked dirty for next log read.
*
* 1. Take a snapshot of the bit and clear it if needed.
* 2. Write protect the corresponding page.
* 3. Copy the snapshot to the userspace.
* 4. Flush TLB's if needed.
*/
int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
{
return -EINVAL;
bool is_dirty = false;
int r;
mutex_lock(&kvm->slots_lock);
r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
if (is_dirty)
kvm_flush_remote_tlbs(kvm);
mutex_unlock(&kvm->slots_lock);
return r;
}
static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
......@@ -811,7 +848,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
switch (ioctl) {
case KVM_CREATE_IRQCHIP: {
if (vgic_present)
return kvm_vgic_create(kvm);
return kvm_vgic_create(kvm, KVM_DEV_TYPE_ARM_VGIC_V2);
else
return -ENXIO;
}
......@@ -1035,6 +1072,19 @@ static void check_kvm_target_cpu(void *ret)
*(int *)ret = kvm_target_cpu();
}
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
{
struct kvm_vcpu *vcpu;
int i;
mpidr &= MPIDR_HWID_BITMASK;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (mpidr == kvm_vcpu_get_mpidr_aff(vcpu))
return vcpu;
}
return NULL;
}
/**
* Initialize Hyp-mode and memory mappings on all CPUs.
*/
......
......@@ -87,11 +87,13 @@ static int handle_dabt_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
*/
static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
trace_kvm_wfi(*vcpu_pc(vcpu));
if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE)
if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE) {
trace_kvm_wfx(*vcpu_pc(vcpu), true);
kvm_vcpu_on_spin(vcpu);
else
} else {
trace_kvm_wfx(*vcpu_pc(vcpu), false);
kvm_vcpu_block(vcpu);
}
kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
......
......@@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
bx lr
ENDPROC(__kvm_tlb_flush_vmid_ipa)
/**
* void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
*
* Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing address
* parameter
*/
ENTRY(__kvm_tlb_flush_vmid)
b __kvm_tlb_flush_vmid_ipa
ENDPROC(__kvm_tlb_flush_vmid)
/********************************************************************
* Flush TLBs and instruction caches of all CPUs inside the inner-shareable
* domain, for all VMIDs
......
This diff is collapsed.
......@@ -22,6 +22,7 @@
#include <asm/cputype.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_psci.h>
#include <asm/kvm_host.h>
/*
* This is an implementation of the Power State Coordination Interface
......@@ -66,25 +67,17 @@ static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
{
struct kvm *kvm = source_vcpu->kvm;
struct kvm_vcpu *vcpu = NULL, *tmp;
struct kvm_vcpu *vcpu = NULL;
wait_queue_head_t *wq;
unsigned long cpu_id;
unsigned long context_id;
unsigned long mpidr;
phys_addr_t target_pc;
int i;
cpu_id = *vcpu_reg(source_vcpu, 1);
cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
if (vcpu_mode_is_32bit(source_vcpu))
cpu_id &= ~((u32) 0);
kvm_for_each_vcpu(i, tmp, kvm) {
mpidr = kvm_vcpu_get_mpidr(tmp);
if ((mpidr & MPIDR_HWID_BITMASK) == (cpu_id & MPIDR_HWID_BITMASK)) {
vcpu = tmp;
break;
}
}
vcpu = kvm_mpidr_to_vcpu(kvm, cpu_id);
/*
* Make sure the caller requested a valid CPU and that the CPU is
......@@ -155,7 +148,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
* then ON else OFF
*/
kvm_for_each_vcpu(i, tmp, kvm) {
mpidr = kvm_vcpu_get_mpidr(tmp);
mpidr = kvm_vcpu_get_mpidr_aff(tmp);
if (((mpidr & target_affinity_mask) == target_affinity) &&
!tmp->arch.pause) {
return PSCI_0_2_AFFINITY_LEVEL_ON;
......
......@@ -140,19 +140,22 @@ TRACE_EVENT(kvm_emulate_cp15_imp,
__entry->CRm, __entry->Op2)
);
TRACE_EVENT(kvm_wfi,
TP_PROTO(unsigned long vcpu_pc),
TP_ARGS(vcpu_pc),
TRACE_EVENT(kvm_wfx,
TP_PROTO(unsigned long vcpu_pc, bool is_wfe),
TP_ARGS(vcpu_pc, is_wfe),
TP_STRUCT__entry(
__field( unsigned long, vcpu_pc )
__field( bool, is_wfe )
),
TP_fast_assign(
__entry->vcpu_pc = vcpu_pc;
__entry->is_wfe = is_wfe;
),
TP_printk("guest executed wfi at: 0x%08lx", __entry->vcpu_pc)
TP_printk("guest executed wf%c at: 0x%08lx",
__entry->is_wfe ? 'e' : 'i', __entry->vcpu_pc)
);
TRACE_EVENT(kvm_unmap_hva,
......
......@@ -96,6 +96,7 @@
#define ESR_ELx_COND_SHIFT (20)
#define ESR_ELx_COND_MASK (UL(0xF) << ESR_ELx_COND_SHIFT)
#define ESR_ELx_WFx_ISS_WFE (UL(1) << 0)
#define ESR_ELx_xVC_IMM_MASK ((1UL << 16) - 1)
#ifndef __ASSEMBLY__
#include <asm/types.h>
......
......@@ -126,6 +126,7 @@ extern char __kvm_hyp_vector[];
extern void __kvm_flush_vm_context(void);
extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
......
......@@ -29,6 +29,7 @@
#include <asm/kvm_asm.h>
#include <asm/kvm_mmio.h>
#include <asm/ptrace.h>
#include <asm/cputype.h>
unsigned long *vcpu_reg32(const struct kvm_vcpu *vcpu, u8 reg_num);
unsigned long *vcpu_spsr32(const struct kvm_vcpu *vcpu);
......@@ -140,6 +141,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
}
static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
}
static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_ISV);
......@@ -201,9 +207,9 @@ static inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)
return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC_TYPE;
}
static inline unsigned long kvm_vcpu_get_mpidr(struct kvm_vcpu *vcpu)
static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
{
return vcpu_sys_reg(vcpu, MPIDR_EL1);
return vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
}
static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
......
......@@ -59,6 +59,9 @@ struct kvm_arch {
/* VTTBR value associated with above pgd and vmid */
u64 vttbr;
/* The maximum number of vCPUs depends on the used GIC model */
int max_vcpus;
/* Interrupt controller */
struct vgic_dist vgic;
......@@ -159,6 +162,7 @@ struct kvm_vm_stat {
};
struct kvm_vcpu_stat {
u32 halt_successful_poll;
u32 halt_wakeup;
};
......@@ -196,6 +200,7 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
u64 kvm_call_hyp(void *hypfn, ...);
void force_vm_exit(const cpumask_t *mask);
void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
int exception_index);
......@@ -203,6 +208,8 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
int kvm_perf_init(void);
int kvm_perf_teardown(void);
struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
phys_addr_t pgd_ptr,
unsigned long hyp_stack_ptr,
......
......@@ -40,6 +40,7 @@ struct kvm_exit_mmio {
u8 data[8];
u32 len;
bool is_write;
void *private;
};
static inline void kvm_prepare_mmio(struct kvm_run *run,
......
......@@ -118,6 +118,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
pmd_val(*pmd) |= PMD_S2_RDWR;
}
static inline void kvm_set_s2pte_readonly(pte_t *pte)
{
pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;
}
static inline bool kvm_s2pte_readonly(pte_t *pte)
{
return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
}
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
{
pmd_val(*pmd) = (pmd_val(*pmd) & ~PMD_S2_RDWR) | PMD_S2_RDONLY;
}
static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
{
return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
}
#define kvm_pgd_addr_end(addr, end) pgd_addr_end(addr, end)
#define kvm_pud_addr_end(addr, end) pud_addr_end(addr, end)
#define kvm_pmd_addr_end(addr, end) pmd_addr_end(addr, end)
......
......@@ -119,6 +119,7 @@
#define PTE_S2_RDONLY (_AT(pteval_t, 1) << 6) /* HAP[2:1] */
#define PTE_S2_RDWR (_AT(pteval_t, 3) << 6) /* HAP[2:1] */
#define PMD_S2_RDONLY (_AT(pmdval_t, 1) << 6) /* HAP[2:1] */
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
/*
......
......@@ -78,6 +78,13 @@ struct kvm_regs {
#define KVM_VGIC_V2_DIST_SIZE 0x1000
#define KVM_VGIC_V2_CPU_SIZE 0x2000
/* Supported VGICv3 address types */
#define KVM_VGIC_V3_ADDR_TYPE_DIST 2
#define KVM_VGIC_V3_ADDR_TYPE_REDIST 3
#define KVM_VGIC_V3_DIST_SIZE SZ_64K
#define KVM_VGIC_V3_REDIST_SIZE (2 * SZ_64K)
#define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */
#define KVM_ARM_VCPU_EL1_32BIT 1 /* CPU running a 32bit VM */
#define KVM_ARM_VCPU_PSCI_0_2 2 /* CPU uses PSCI v0.2 */
......@@ -161,6 +168,8 @@ struct kvm_arch_memory_slot {
#define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
#define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
#define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3
#define KVM_DEV_ARM_VGIC_GRP_CTRL 4
#define KVM_DEV_ARM_VGIC_CTRL_INIT 0
/* KVM_IRQ_LINE irq field index values */
#define KVM_ARM_IRQ_TYPE_SHIFT 24
......
......@@ -140,6 +140,7 @@ int main(void)
DEFINE(VGIC_V2_CPU_ELRSR, offsetof(struct vgic_cpu, vgic_v2.vgic_elrsr));
DEFINE(VGIC_V2_CPU_APR, offsetof(struct vgic_cpu, vgic_v2.vgic_apr));
DEFINE(VGIC_V2_CPU_LR, offsetof(struct vgic_cpu, vgic_v2.vgic_lr));
DEFINE(VGIC_V3_CPU_SRE, offsetof(struct vgic_cpu, vgic_v3.vgic_sre));
DEFINE(VGIC_V3_CPU_HCR, offsetof(struct vgic_cpu, vgic_v3.vgic_hcr));
DEFINE(VGIC_V3_CPU_VMCR, offsetof(struct vgic_cpu, vgic_v3.vgic_vmcr));
DEFINE(VGIC_V3_CPU_MISR, offsetof(struct vgic_cpu, vgic_v3.vgic_misr));
......
......@@ -22,10 +22,12 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_ARCH_TLB_FLUSH_ALL
select KVM_MMIO
select KVM_ARM_HOST
select KVM_ARM_VGIC
select KVM_ARM_TIMER
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select SRCU
---help---
Support hosting virtualized guest machines.
......
......@@ -21,7 +21,9 @@ kvm-$(CONFIG_KVM_ARM_HOST) += guest.o reset.o sys_regs.o sys_regs_generic_v8.o
kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic.o
kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2.o
kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v2-emul.o
kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v2-switch.o
kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3.o
kvm-$(CONFIG_KVM_ARM_VGIC) += $(KVM)/arm/vgic-v3-emul.o
kvm-$(CONFIG_KVM_ARM_VGIC) += vgic-v3-switch.o
kvm-$(CONFIG_KVM_ARM_TIMER) += $(KVM)/arm/arch_timer.o
......@@ -28,12 +28,18 @@
#include <asm/kvm_mmu.h>
#include <asm/kvm_psci.h>
#define CREATE_TRACE_POINTS
#include "trace.h"
typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
int ret;
trace_kvm_hvc_arm64(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
kvm_vcpu_hvc_get_imm(vcpu));
ret = kvm_psci_call(vcpu);
if (ret < 0) {
kvm_inject_undefined(vcpu);
......@@ -63,10 +69,13 @@ static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
*/
static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
if (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WFx_ISS_WFE)
if (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WFx_ISS_WFE) {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true);
kvm_vcpu_on_spin(vcpu);
else
} else {
trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
kvm_vcpu_block(vcpu);
}
kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
......
......@@ -1032,6 +1032,28 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
ret
ENDPROC(__kvm_tlb_flush_vmid_ipa)
/**
* void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
* @struct kvm *kvm - pointer to kvm structure
*
* Invalidates all Stage 1 and 2 TLB entries for current VMID.
*/
ENTRY(__kvm_tlb_flush_vmid)
dsb ishst
kern_hyp_va x0
ldr x2, [x0, #KVM_VTTBR]
msr vttbr_el2, x2
isb
tlbi vmalls12e1is
dsb ish
isb
msr vttbr_el2, xzr
ret
ENDPROC(__kvm_tlb_flush_vmid)
ENTRY(__kvm_flush_vm_context)
dsb ishst
tlbi alle1is
......
......@@ -113,6 +113,27 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
return true;
}
/*
* Trap handler for the GICv3 SGI generation system register.
* Forward the request to the VGIC emulation.
* The cp15_64 code makes sure this automatically works
* for both AArch64 and AArch32 accesses.
*/
static bool access_gic_sgi(struct kvm_vcpu *vcpu,
const struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
u64 val;
if (!p->is_write)
return read_from_write_only(vcpu, p);
val = *vcpu_reg(vcpu, p->Rt);
vgic_v3_dispatch_sgi(vcpu, val);
return true;
}
static bool trap_raz_wi(struct kvm_vcpu *vcpu,
const struct sys_reg_params *p,
const struct sys_reg_desc *r)
......@@ -200,10 +221,19 @@ static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
static void reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{
u64 mpidr;
/*
* Simply map the vcpu_id into the Aff0 field of the MPIDR.
* Map the vcpu_id into the first three affinity level fields of
* the MPIDR. We limit the number of VCPUs in level 0 due to a
* limitation to 16 CPUs in that level in the ICC_SGIxR registers
* of the GICv3 to be able to address each CPU directly when
* sending IPIs.
*/
vcpu_sys_reg(vcpu, MPIDR_EL1) = (1UL << 31) | (vcpu->vcpu_id & 0xff);
mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
vcpu_sys_reg(vcpu, MPIDR_EL1) = (1ULL << 31) | mpidr;
}
/* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
......@@ -373,6 +403,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b0000), Op2(0b000),
NULL, reset_val, VBAR_EL1, 0 },
/* ICC_SGI1R_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1011), Op2(0b101),
access_gic_sgi },
/* ICC_SRE_EL1 */
{ Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1100), Op2(0b101),
trap_raz_wi },
......@@ -605,6 +638,8 @@ static const struct sys_reg_desc cp14_64_regs[] = {
* register).
*/
static const struct sys_reg_desc cp15_regs[] = {
{ Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi },
{ Op1( 0), CRn( 1), CRm( 0), Op2( 0), access_vm_reg, NULL, c1_SCTLR },
{ Op1( 0), CRn( 2), CRm( 0), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
{ Op1( 0), CRn( 2), CRm( 0), Op2( 1), access_vm_reg, NULL, c2_TTBR1 },
......@@ -652,6 +687,7 @@ static const struct sys_reg_desc cp15_regs[] = {
static const struct sys_reg_desc cp15_64_regs[] = {
{ Op1( 0), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR0 },
{ Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi },
{ Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, c2_TTBR1 },
};
......
#if !defined(_TRACE_ARM64_KVM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_ARM64_KVM_H
#include <linux/tracepoint.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm
TRACE_EVENT(kvm_wfx_arm64,
TP_PROTO(unsigned long vcpu_pc, bool is_wfe),
TP_ARGS(vcpu_pc, is_wfe),
TP_STRUCT__entry(
__field(unsigned long, vcpu_pc)
__field(bool, is_wfe)
),
TP_fast_assign(
__entry->vcpu_pc = vcpu_pc;
__entry->is_wfe = is_wfe;
),
TP_printk("guest executed wf%c at: 0x%08lx",
__entry->is_wfe ? 'e' : 'i', __entry->vcpu_pc)
);
TRACE_EVENT(kvm_hvc_arm64,
TP_PROTO(unsigned long vcpu_pc, unsigned long r0, unsigned long imm),
TP_ARGS(vcpu_pc, r0, imm),
TP_STRUCT__entry(
__field(unsigned long, vcpu_pc)
__field(unsigned long, r0)
__field(unsigned long, imm)
),
TP_fast_assign(
__entry->vcpu_pc = vcpu_pc;
__entry->r0 = r0;
__entry->imm = imm;
),
TP_printk("HVC at 0x%08lx (r0: 0x%08lx, imm: 0x%lx)",
__entry->vcpu_pc, __entry->r0, __entry->imm)
);
#endif /* _TRACE_ARM64_KVM_H */
#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH .
#undef TRACE_INCLUDE_FILE
#define TRACE_INCLUDE_FILE trace
/* This part must be outside protection */
#include <trace/define_trace.h>
......@@ -148,17 +148,18 @@
* x0: Register pointing to VCPU struct
*/
.macro restore_vgic_v3_state
// Disable SRE_EL1 access. Necessary, otherwise
// ICH_VMCR_EL2.VFIQEn becomes one, and FIQ happens...
msr_s ICC_SRE_EL1, xzr
isb
// Compute the address of struct vgic_cpu
add x3, x0, #VCPU_VGIC_CPU
// Restore all interesting registers
ldr w4, [x3, #VGIC_V3_CPU_HCR]
ldr w5, [x3, #VGIC_V3_CPU_VMCR]
ldr w25, [x3, #VGIC_V3_CPU_SRE]
msr_s ICC_SRE_EL1, x25
// make sure SRE is valid before writing the other registers
isb
msr_s ICH_HCR_EL2, x4
msr_s ICH_VMCR_EL2, x5
......@@ -244,9 +245,12 @@
dsb sy
// Prevent the guest from touching the GIC system registers
// if SRE isn't enabled for GICv3 emulation
cbnz x25, 1f
mrs_s x5, ICC_SRE_EL2
and x5, x5, #~ICC_SRE_EL2_ENABLE
msr_s ICC_SRE_EL2, x5
1:
.endm
ENTRY(__save_vgic_v3_state)
......
......@@ -18,7 +18,6 @@ header-y += intrinsics.h
header-y += ioctl.h
header-y += ioctls.h
header-y += ipcbuf.h
header-y += kvm.h
header-y += kvm_para.h
header-y += mman.h
header-y += msgbuf.h
......
......@@ -120,6 +120,7 @@ struct kvm_vcpu_stat {
u32 resvd_inst_exits;
u32 break_inst_exits;
u32 flush_dcache_exits;
u32 halt_successful_poll;
u32 halt_wakeup;
};
......
......@@ -434,7 +434,7 @@ __kvm_mips_return_to_guest:
/* Setup status register for running guest in UM */
.set at
or v1, v1, (ST0_EXL | KSU_USER | ST0_IE)
and v1, v1, ~ST0_CU0
and v1, v1, ~(ST0_CU0 | ST0_MX)
.set noat
mtc0 v1, CP0_STATUS
ehb
......
......@@ -15,9 +15,11 @@
#include <linux/vmalloc.h>
#include <linux/fs.h>
#include <linux/bootmem.h>
#include <asm/fpu.h>
#include <asm/page.h>
#include <asm/cacheflush.h>
#include <asm/mmu_context.h>
#include <asm/pgtable.h>
#include <linux/kvm_host.h>
......@@ -47,6 +49,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "resvd_inst", VCPU_STAT(resvd_inst_exits), KVM_STAT_VCPU },
{ "break_inst", VCPU_STAT(break_inst_exits), KVM_STAT_VCPU },
{ "flush_dcache", VCPU_STAT(flush_dcache_exits), KVM_STAT_VCPU },
{ "halt_successful_poll", VCPU_STAT(halt_successful_poll), KVM_STAT_VCPU },
{ "halt_wakeup", VCPU_STAT(halt_wakeup), KVM_STAT_VCPU },
{NULL}
};
......@@ -378,6 +381,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
vcpu->mmio_needed = 0;
}
lose_fpu(1);
local_irq_disable();
/* Check if we have any exceptions/interrupts pending */
kvm_mips_deliver_interrupts(vcpu,
......@@ -385,8 +390,14 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
kvm_guest_enter();
/* Disable hardware page table walking while in guest */
htw_stop();
r = __kvm_mips_vcpu_run(run, vcpu);
/* Re-enable HTW before enabling interrupts */
htw_start();
kvm_guest_exit();
local_irq_enable();
......@@ -832,9 +843,8 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
return -ENOIOCTLCMD;
}
int kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
{
return 0;
}
int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
......@@ -980,9 +990,6 @@ static void kvm_mips_set_c0_status(void)
{
uint32_t status = read_c0_status();
if (cpu_has_fpu)
status |= (ST0_CU1);
if (cpu_has_dsp)
status |= (ST0_MX);
......@@ -1002,6 +1009,9 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
enum emulation_result er = EMULATE_DONE;
int ret = RESUME_GUEST;
/* re-enable HTW before enabling interrupts */
htw_start();
/* Set a default exit reason */
run->exit_reason = KVM_EXIT_UNKNOWN;
run->ready_for_interrupt_injection = 1;
......@@ -1136,6 +1146,9 @@ int kvm_mips_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
}
}
/* Disable HTW before returning to guest or host */
htw_stop();
return ret;
}
......
......@@ -107,6 +107,7 @@ struct kvm_vcpu_stat {
u32 emulated_inst_exits;
u32 dec_exits;
u32 ext_intr_exits;
u32 halt_successful_poll;
u32 halt_wakeup;
u32 dbell_exits;
u32 gdbell_exits;
......
......@@ -52,6 +52,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "dec", VCPU_STAT(dec_exits) },
{ "ext_intr", VCPU_STAT(ext_intr_exits) },
{ "queue_intr", VCPU_STAT(queue_intr) },
{ "halt_successful_poll", VCPU_STAT(halt_successful_poll), },
{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
{ "pf_storage", VCPU_STAT(pf_storage) },
{ "sp_storage", VCPU_STAT(sp_storage) },
......
......@@ -62,6 +62,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
{ "inst_emu", VCPU_STAT(emulated_inst_exits) },
{ "dec", VCPU_STAT(dec_exits) },
{ "ext_intr", VCPU_STAT(ext_intr_exits) },
{ "halt_successful_poll", VCPU_STAT(halt_successful_poll) },
{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
{ "doorbell", VCPU_STAT(dbell_exits) },
{ "guest doorbell", VCPU_STAT(gdbell_exits) },
......
......@@ -623,9 +623,8 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id)
return vcpu;
}
int kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
{
return 0;
}
void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
......
......@@ -35,11 +35,13 @@
#define KVM_NR_IRQCHIPS 1
#define KVM_IRQCHIP_NUM_PINS 4096
#define SIGP_CTRL_C 0x00800000
#define SIGP_CTRL_C 0x80
#define SIGP_CTRL_SCN_MASK 0x3f
struct sca_entry {
atomic_t ctrl;
__u32 reserved;
__u8 reserved0;
__u8 sigp_ctrl;
__u16 reserved[3];
__u64 sda;
__u64 reserved2[2];
} __attribute__((packed));
......@@ -87,7 +89,8 @@ struct kvm_s390_sie_block {
atomic_t cpuflags; /* 0x0000 */
__u32 : 1; /* 0x0004 */
__u32 prefix : 18;
__u32 : 13;
__u32 : 1;
__u32 ibc : 12;
__u8 reserved08[4]; /* 0x0008 */
#define PROG_IN_SIE (1<<0)
__u32 prog0c; /* 0x000c */
......@@ -132,7 +135,9 @@ struct kvm_s390_sie_block {
__u8 reserved60; /* 0x0060 */
__u8 ecb; /* 0x0061 */
__u8 ecb2; /* 0x0062 */
__u8 reserved63[1]; /* 0x0063 */
#define ECB3_AES 0x04
#define ECB3_DEA 0x08
__u8 ecb3; /* 0x0063 */
__u32 scaol; /* 0x0064 */
__u8 reserved68[4]; /* 0x0068 */
__u32 todpr; /* 0x006c */
......@@ -159,6 +164,7 @@ struct kvm_s390_sie_block {
__u64 tecmc; /* 0x00e8 */
__u8 reservedf0[12]; /* 0x00f0 */
#define CRYCB_FORMAT1 0x00000001
#define CRYCB_FORMAT2 0x00000003
__u32 crycbd; /* 0x00fc */
__u64 gcr[16]; /* 0x0100 */
__u64 gbea; /* 0x0180 */
......@@ -192,6 +198,7 @@ struct kvm_vcpu_stat {
u32 exit_stop_request;
u32 exit_validity;
u32 exit_instruction;
u32 halt_successful_poll;
u32 halt_wakeup;
u32 instruction_lctl;
u32 instruction_lctlg;
......@@ -378,14 +385,11 @@ struct kvm_s390_interrupt_info {
struct kvm_s390_emerg_info emerg;
struct kvm_s390_extcall_info extcall;
struct kvm_s390_prefix_info prefix;
struct kvm_s390_stop_info stop;
struct kvm_s390_mchk_info mchk;
};
};
/* for local_interrupt.action_flags */
#define ACTION_STORE_ON_STOP (1<<0)
#define ACTION_STOP_ON_STOP (1<<1)
struct kvm_s390_irq_payload {
struct kvm_s390_io_info io;
struct kvm_s390_ext_info ext;
......@@ -393,6 +397,7 @@ struct kvm_s390_irq_payload {
struct kvm_s390_emerg_info emerg;
struct kvm_s390_extcall_info extcall;
struct kvm_s390_prefix_info prefix;
struct kvm_s390_stop_info stop;
struct kvm_s390_mchk_info mchk;
};
......@@ -401,7 +406,6 @@ struct kvm_s390_local_interrupt {
struct kvm_s390_float_interrupt *float_int;
wait_queue_head_t *wq;
atomic_t *cpuflags;
unsigned int action_bits;
DECLARE_BITMAP(sigp_emerg_pending, KVM_MAX_VCPUS);
struct kvm_s390_irq_payload irq;
unsigned long pending_irqs;
......@@ -470,7 +474,6 @@ struct kvm_vcpu_arch {
};
struct gmap *gmap;
struct kvm_guestdbg_info_arch guestdbg;
#define KVM_S390_PFAULT_TOKEN_INVALID (-1UL)
unsigned long pfault_token;
unsigned long pfault_select;
unsigned long pfault_compare;
......@@ -504,13 +507,39 @@ struct s390_io_adapter {
#define MAX_S390_IO_ADAPTERS ((MAX_ISC + 1) * 8)
#define MAX_S390_ADAPTER_MAPS 256
/* maximum size of facilities and facility mask is 2k bytes */
#define S390_ARCH_FAC_LIST_SIZE_BYTE (1<<11)
#define S390_ARCH_FAC_LIST_SIZE_U64 \
(S390_ARCH_FAC_LIST_SIZE_BYTE / sizeof(u64))
#define S390_ARCH_FAC_MASK_SIZE_BYTE S390_ARCH_FAC_LIST_SIZE_BYTE
#define S390_ARCH_FAC_MASK_SIZE_U64 \
(S390_ARCH_FAC_MASK_SIZE_BYTE / sizeof(u64))
struct s390_model_fac {
/* facilities used in SIE context */
__u64 sie[S390_ARCH_FAC_LIST_SIZE_U64];
/* subset enabled by kvm */
__u64 kvm[S390_ARCH_FAC_LIST_SIZE_U64];
};
struct kvm_s390_cpu_model {
struct s390_model_fac *fac;
struct cpuid cpu_id;
unsigned short ibc;
};
struct kvm_s390_crypto {
struct kvm_s390_crypto_cb *crycb;
__u32 crycbd;
__u8 aes_kw;
__u8 dea_kw;
};
struct kvm_s390_crypto_cb {
__u8 reserved00[128]; /* 0x0000 */
__u8 reserved00[72]; /* 0x0000 */
__u8 dea_wrapping_key_mask[24]; /* 0x0048 */
__u8 aes_wrapping_key_mask[32]; /* 0x0060 */
__u8 reserved80[128]; /* 0x0080 */
};
struct kvm_arch{
......@@ -523,12 +552,15 @@ struct kvm_arch{
int use_irqchip;
int use_cmma;
int user_cpu_state_ctrl;
int user_sigp;
struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
wait_queue_head_t ipte_wq;
int ipte_lock_count;
struct mutex ipte_mutex;
spinlock_t start_stop_lock;
struct kvm_s390_cpu_model model;
struct kvm_s390_crypto crypto;
u64 epoch;
};
#define KVM_HVA_ERR_BAD (-1UL)
......
......@@ -31,7 +31,8 @@ struct sclp_cpu_entry {
u8 reserved0[2];
u8 : 3;
u8 siif : 1;
u8 : 4;
u8 sigpif : 1;
u8 : 3;
u8 reserved2[10];
u8 type;
u8 reserved1;
......@@ -69,6 +70,7 @@ int memcpy_hsa(void *dest, unsigned long src, size_t count, int mode);
unsigned long sclp_get_hsa_size(void);
void sclp_early_detect(void);
int sclp_has_siif(void);
int sclp_has_sigpif(void);
unsigned int sclp_get_ibc(void);
long _sclp_print_early(const char *);
......
......@@ -15,6 +15,7 @@
#define __ASM_S390_SYSINFO_H
#include <asm/bitsperlong.h>
#include <linux/uuid.h>
struct sysinfo_1_1_1 {
unsigned char p:1;
......@@ -116,10 +117,13 @@ struct sysinfo_3_2_2 {
char name[8];
unsigned int caf;
char cpi[16];
char reserved_1[24];
char reserved_1[3];
char ext_name_encoding;
unsigned int reserved_2;
uuid_be uuid;
} vm[8];
char reserved_544[3552];
char reserved_3[1504];
char ext_names[8][256];
};
extern int topology_max_mnest;
......
......@@ -57,10 +57,44 @@ struct kvm_s390_io_adapter_req {
/* kvm attr_group on vm fd */
#define KVM_S390_VM_MEM_CTRL 0
#define KVM_S390_VM_TOD 1
#define KVM_S390_VM_CRYPTO 2
#define KVM_S390_VM_CPU_MODEL 3
/* kvm attributes for mem_ctrl */
#define KVM_S390_VM_MEM_ENABLE_CMMA 0
#define KVM_S390_VM_MEM_CLR_CMMA 1
#define KVM_S390_VM_MEM_LIMIT_SIZE 2
/* kvm attributes for KVM_S390_VM_TOD */
#define KVM_S390_VM_TOD_LOW 0
#define KVM_S390_VM_TOD_HIGH 1
/* kvm attributes for KVM_S390_VM_CPU_MODEL */
/* processor related attributes are r/w */
#define KVM_S390_VM_CPU_PROCESSOR 0
struct kvm_s390_vm_cpu_processor {
__u64 cpuid;
__u16 ibc;
__u8 pad[6];
__u64 fac_list[256];
};
/* machine related attributes are r/o */
#define KVM_S390_VM_CPU_MACHINE 1
struct kvm_s390_vm_cpu_machine {
__u64 cpuid;
__u32 ibc;
__u8 pad[4];
__u64 fac_mask[256];
__u64 fac_list[256];
};
/* kvm attributes for crypto */
#define KVM_S390_VM_CRYPTO_ENABLE_AES_KW 0
#define KVM_S390_VM_CRYPTO_ENABLE_DEA_KW 1
#define KVM_S390_VM_CRYPTO_DISABLE_AES_KW 2
#define KVM_S390_VM_CRYPTO_DISABLE_DEA_KW 3
/* for KVM_GET_REGS and KVM_SET_REGS */
struct kvm_regs {
......@@ -107,6 +141,9 @@ struct kvm_guest_debug_arch {
struct kvm_hw_breakpoint __user *hw_bp;
};
/* for KVM_SYNC_PFAULT and KVM_REG_S390_PFTOKEN */
#define KVM_S390_PFAULT_TOKEN_INVALID 0xffffffffffffffffULL
#define KVM_SYNC_PREFIX (1UL << 0)
#define KVM_SYNC_GPRS (1UL << 1)
#define KVM_SYNC_ACRS (1UL << 2)
......
......@@ -204,6 +204,33 @@ static void stsi_2_2_2(struct seq_file *m, struct sysinfo_2_2_2 *info)
}
}
static void print_ext_name(struct seq_file *m, int lvl,
struct sysinfo_3_2_2 *info)
{
if (info->vm[lvl].ext_name_encoding == 0)
return;
if (info->ext_names[lvl][0] == 0)
return;
switch (info->vm[lvl].ext_name_encoding) {
case 1: /* EBCDIC */
EBCASC(info->ext_names[lvl], sizeof(info->ext_names[lvl]));
break;
case 2: /* UTF-8 */
break;
default:
return;
}
seq_printf(m, "VM%02d Extended Name: %-.256s\n", lvl,
info->ext_names[lvl]);
}
static void print_uuid(struct seq_file *m, int i, struct sysinfo_3_2_2 *info)
{
if (!memcmp(&info->vm[i].uuid, &NULL_UUID_BE, sizeof(uuid_be)))
return;
seq_printf(m, "VM%02d UUID: %pUb\n", i, &info->vm[i].uuid);
}
static void stsi_3_2_2(struct seq_file *m, struct sysinfo_3_2_2 *info)
{
int i;
......@@ -221,6 +248,8 @@ static void stsi_3_2_2(struct seq_file *m, struct sysinfo_3_2_2 *info)
seq_printf(m, "VM%02d CPUs Configured: %d\n", i, info->vm[i].cpus_configured);
seq_printf(m, "VM%02d CPUs Standby: %d\n", i, info->vm[i].cpus_standby);
seq_printf(m, "VM%02d CPUs Reserved: %d\n", i, info->vm[i].cpus_reserved);
print_ext_name(m, i, info);
print_uuid(m, i, info);
}
}
......
......@@ -357,8 +357,8 @@ static unsigned long guest_translate(struct kvm_vcpu *vcpu, unsigned long gva,
union asce asce;
ctlreg0.val = vcpu->arch.sie_block->gcr[0];
edat1 = ctlreg0.edat && test_vfacility(8);
edat2 = edat1 && test_vfacility(78);
edat1 = ctlreg0.edat && test_kvm_facility(vcpu->kvm, 8);
edat2 = edat1 && test_kvm_facility(vcpu->kvm, 78);
asce.val = get_vcpu_asce(vcpu);
if (asce.r)
goto real_address;
......
......@@ -68,18 +68,27 @@ static int handle_noop(struct kvm_vcpu *vcpu)
static int handle_stop(struct kvm_vcpu *vcpu)
{
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
int rc = 0;
unsigned int action_bits;
uint8_t flags, stop_pending;
vcpu->stat.exit_stop_request++;
trace_kvm_s390_stop_request(vcpu->arch.local_int.action_bits);
action_bits = vcpu->arch.local_int.action_bits;
/* delay the stop if any non-stop irq is pending */
if (kvm_s390_vcpu_has_irq(vcpu, 1))
return 0;
/* avoid races with the injection/SIGP STOP code */
spin_lock(&li->lock);
flags = li->irq.stop.flags;
stop_pending = kvm_s390_is_stop_irq_pending(vcpu);
spin_unlock(&li->lock);
if (!(action_bits & ACTION_STOP_ON_STOP))
trace_kvm_s390_stop_request(stop_pending, flags);
if (!stop_pending)
return 0;
if (action_bits & ACTION_STORE_ON_STOP) {
if (flags & KVM_S390_STOP_FLAG_STORE_STATUS) {
rc = kvm_s390_vcpu_store_status(vcpu,
KVM_S390_STORE_STATUS_NOADDR);
if (rc)
......@@ -279,11 +288,13 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
irq.type = KVM_S390_INT_CPU_TIMER;
break;
case EXT_IRQ_EXTERNAL_CALL:
if (kvm_s390_si_ext_call_pending(vcpu))
return 0;
irq.type = KVM_S390_INT_EXTERNAL_CALL;
irq.u.extcall.code = vcpu->arch.sie_block->extcpuaddr;
break;
rc = kvm_s390_inject_vcpu(vcpu, &irq);
/* ignore if another external call is already pending */
if (rc == -EBUSY)
return 0;
return rc;
default:
return -EOPNOTSUPP;
}
......@@ -307,17 +318,19 @@ static int handle_mvpg_pei(struct kvm_vcpu *vcpu)
kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
/* Make sure that the source is paged-in */
srcaddr = kvm_s390_real_to_abs(vcpu, vcpu->run->s.regs.gprs[reg2]);
if (kvm_is_error_gpa(vcpu->kvm, srcaddr))
return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
rc = guest_translate_address(vcpu, vcpu->run->s.regs.gprs[reg2],
&srcaddr, 0);
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
rc = kvm_arch_fault_in_page(vcpu, srcaddr, 0);
if (rc != 0)
return rc;
/* Make sure that the destination is paged-in */
dstaddr = kvm_s390_real_to_abs(vcpu, vcpu->run->s.regs.gprs[reg1]);
if (kvm_is_error_gpa(vcpu->kvm, dstaddr))
return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
rc = guest_translate_address(vcpu, vcpu->run->s.regs.gprs[reg1],
&dstaddr, 1);
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
rc = kvm_arch_fault_in_page(vcpu, dstaddr, 1);
if (rc != 0)
return rc;
......
This diff is collapsed.
This diff is collapsed.
......@@ -18,12 +18,10 @@
#include <linux/hrtimer.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include <asm/facility.h>
typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
/* declare vfacilities extern */
extern unsigned long *vfacilities;
/* Transactional Memory Execution related macros */
#define IS_TE_ENABLED(vcpu) ((vcpu->arch.sie_block->ecb & 0x10))
#define TDB_FORMAT1 1
......@@ -127,6 +125,12 @@ static inline void kvm_s390_set_psw_cc(struct kvm_vcpu *vcpu, unsigned long cc)
vcpu->arch.sie_block->gpsw.mask |= cc << 44;
}
/* test availability of facility in a kvm intance */
static inline int test_kvm_facility(struct kvm *kvm, unsigned long nr)
{
return __test_facility(nr, kvm->arch.model.fac->kvm);
}
/* are cpu states controlled by user space */
static inline int kvm_s390_user_cpu_state_ctrl(struct kvm *kvm)
{
......@@ -183,7 +187,8 @@ int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu);
void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu);
/* is cmma enabled */
bool kvm_s390_cmma_enabled(struct kvm *kvm);
int test_vfacility(unsigned long nr);
unsigned long kvm_s390_fac_list_mask_size(void);
extern unsigned long kvm_s390_fac_list_mask[];
/* implemented in diag.c */
int kvm_s390_handle_diag(struct kvm_vcpu *vcpu);
......@@ -228,11 +233,13 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
struct kvm_s390_irq *s390irq);
/* implemented in interrupt.c */
int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
int kvm_s390_vcpu_has_irq(struct kvm_vcpu *vcpu, int exclude_stop);
int psw_extint_disabled(struct kvm_vcpu *vcpu);
void kvm_s390_destroy_adapters(struct kvm *kvm);
int kvm_s390_si_ext_call_pending(struct kvm_vcpu *vcpu);
int kvm_s390_ext_call_pending(struct kvm_vcpu *vcpu);
extern struct kvm_device_ops kvm_flic_ops;
int kvm_s390_is_stop_irq_pending(struct kvm_vcpu *vcpu);
void kvm_s390_clear_stop_irq(struct kvm_vcpu *vcpu);
/* implemented in guestdbg.c */
void kvm_s390_backup_guest_per_regs(struct kvm_vcpu *vcpu);
......
......@@ -337,19 +337,24 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
static int handle_stfl(struct kvm_vcpu *vcpu)
{
int rc;
unsigned int fac;
vcpu->stat.instruction_stfl++;
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
/*
* We need to shift the lower 32 facility bits (bit 0-31) from a u64
* into a u32 memory representation. They will remain bits 0-31.
*/
fac = *vcpu->kvm->arch.model.fac->sie >> 32;
rc = write_guest_lc(vcpu, offsetof(struct _lowcore, stfl_fac_list),
vfacilities, 4);
&fac, sizeof(fac));
if (rc)
return rc;
VCPU_EVENT(vcpu, 5, "store facility list value %x",
*(unsigned int *) vfacilities);
trace_kvm_s390_handle_stfl(vcpu, *(unsigned int *) vfacilities);
VCPU_EVENT(vcpu, 5, "store facility list value %x", fac);
trace_kvm_s390_handle_stfl(vcpu, fac);
return 0;
}
......
......@@ -26,15 +26,17 @@ static int __sigp_sense(struct kvm_vcpu *vcpu, struct kvm_vcpu *dst_vcpu,
struct kvm_s390_local_interrupt *li;
int cpuflags;
int rc;
int ext_call_pending;
li = &dst_vcpu->arch.local_int;
cpuflags = atomic_read(li->cpuflags);
if (!(cpuflags & (CPUSTAT_ECALL_PEND | CPUSTAT_STOPPED)))
ext_call_pending = kvm_s390_ext_call_pending(dst_vcpu);
if (!(cpuflags & CPUSTAT_STOPPED) && !ext_call_pending)
rc = SIGP_CC_ORDER_CODE_ACCEPTED;
else {
*reg &= 0xffffffff00000000UL;
if (cpuflags & CPUSTAT_ECALL_PEND)
if (ext_call_pending)
*reg |= SIGP_STATUS_EXT_CALL_PENDING;
if (cpuflags & CPUSTAT_STOPPED)
*reg |= SIGP_STATUS_STOPPED;
......@@ -96,7 +98,7 @@ static int __sigp_conditional_emergency(struct kvm_vcpu *vcpu,
}
static int __sigp_external_call(struct kvm_vcpu *vcpu,
struct kvm_vcpu *dst_vcpu)
struct kvm_vcpu *dst_vcpu, u64 *reg)
{
struct kvm_s390_irq irq = {
.type = KVM_S390_INT_EXTERNAL_CALL,
......@@ -105,45 +107,31 @@ static int __sigp_external_call(struct kvm_vcpu *vcpu,
int rc;
rc = kvm_s390_inject_vcpu(dst_vcpu, &irq);
if (!rc)
if (rc == -EBUSY) {
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_EXT_CALL_PENDING;
return SIGP_CC_STATUS_STORED;
} else if (rc == 0) {
VCPU_EVENT(vcpu, 4, "sent sigp ext call to cpu %x",
dst_vcpu->vcpu_id);
return rc ? rc : SIGP_CC_ORDER_CODE_ACCEPTED;
}
static int __inject_sigp_stop(struct kvm_vcpu *dst_vcpu, int action)
{
struct kvm_s390_local_interrupt *li = &dst_vcpu->arch.local_int;
int rc = SIGP_CC_ORDER_CODE_ACCEPTED;
spin_lock(&li->lock);
if (li->action_bits & ACTION_STOP_ON_STOP) {
/* another SIGP STOP is pending */
rc = SIGP_CC_BUSY;
goto out;
}
if ((atomic_read(li->cpuflags) & CPUSTAT_STOPPED)) {
if ((action & ACTION_STORE_ON_STOP) != 0)
rc = -ESHUTDOWN;
goto out;
}
set_bit(IRQ_PEND_SIGP_STOP, &li->pending_irqs);
li->action_bits |= action;
atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
kvm_s390_vcpu_wakeup(dst_vcpu);
out:
spin_unlock(&li->lock);
return rc;
return rc ? rc : SIGP_CC_ORDER_CODE_ACCEPTED;
}
static int __sigp_stop(struct kvm_vcpu *vcpu, struct kvm_vcpu *dst_vcpu)
{
struct kvm_s390_irq irq = {
.type = KVM_S390_SIGP_STOP,
};
int rc;
rc = __inject_sigp_stop(dst_vcpu, ACTION_STOP_ON_STOP);
VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x", dst_vcpu->vcpu_id);
rc = kvm_s390_inject_vcpu(dst_vcpu, &irq);
if (rc == -EBUSY)
rc = SIGP_CC_BUSY;
else if (rc == 0)
VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x",
dst_vcpu->vcpu_id);
return rc;
}
......@@ -151,21 +139,19 @@ static int __sigp_stop(struct kvm_vcpu *vcpu, struct kvm_vcpu *dst_vcpu)
static int __sigp_stop_and_store_status(struct kvm_vcpu *vcpu,
struct kvm_vcpu *dst_vcpu, u64 *reg)
{
struct kvm_s390_irq irq = {
.type = KVM_S390_SIGP_STOP,
.u.stop.flags = KVM_S390_STOP_FLAG_STORE_STATUS,
};
int rc;
rc = __inject_sigp_stop(dst_vcpu, ACTION_STOP_ON_STOP |
ACTION_STORE_ON_STOP);
rc = kvm_s390_inject_vcpu(dst_vcpu, &irq);
if (rc == -EBUSY)
rc = SIGP_CC_BUSY;
else if (rc == 0)
VCPU_EVENT(vcpu, 4, "sent sigp stop and store status to cpu %x",
dst_vcpu->vcpu_id);
if (rc == -ESHUTDOWN) {
/* If the CPU has already been stopped, we still have
* to save the status when doing stop-and-store. This
* has to be done after unlocking all spinlocks. */
rc = kvm_s390_store_status_unloaded(dst_vcpu,
KVM_S390_STORE_STATUS_NOADDR);
}
return rc;
}
......@@ -197,41 +183,33 @@ static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter)
static int __sigp_set_prefix(struct kvm_vcpu *vcpu, struct kvm_vcpu *dst_vcpu,
u32 address, u64 *reg)
{
struct kvm_s390_local_interrupt *li;
struct kvm_s390_irq irq = {
.type = KVM_S390_SIGP_SET_PREFIX,
.u.prefix.address = address & 0x7fffe000u,
};
int rc;
li = &dst_vcpu->arch.local_int;
/*
* Make sure the new value is valid memory. We only need to check the
* first page, since address is 8k aligned and memory pieces are always
* at least 1MB aligned and have at least a size of 1MB.
*/
address &= 0x7fffe000u;
if (kvm_is_error_gpa(vcpu->kvm, address)) {
if (kvm_is_error_gpa(vcpu->kvm, irq.u.prefix.address)) {
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_INVALID_PARAMETER;
return SIGP_CC_STATUS_STORED;
}
spin_lock(&li->lock);
/* cpu must be in stopped state */
if (!(atomic_read(li->cpuflags) & CPUSTAT_STOPPED)) {
rc = kvm_s390_inject_vcpu(dst_vcpu, &irq);
if (rc == -EBUSY) {
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_INCORRECT_STATE;
rc = SIGP_CC_STATUS_STORED;
goto out_li;
return SIGP_CC_STATUS_STORED;
} else if (rc == 0) {
VCPU_EVENT(vcpu, 4, "set prefix of cpu %02x to %x",
dst_vcpu->vcpu_id, irq.u.prefix.address);
}
li->irq.prefix.address = address;
set_bit(IRQ_PEND_SET_PREFIX, &li->pending_irqs);
kvm_s390_vcpu_wakeup(dst_vcpu);
rc = SIGP_CC_ORDER_CODE_ACCEPTED;
VCPU_EVENT(vcpu, 4, "set prefix of cpu %02x to %x", dst_vcpu->vcpu_id,
address);
out_li:
spin_unlock(&li->lock);
return rc;
}
......@@ -242,9 +220,7 @@ static int __sigp_store_status_at_addr(struct kvm_vcpu *vcpu,
int flags;
int rc;
spin_lock(&dst_vcpu->arch.local_int.lock);
flags = atomic_read(dst_vcpu->arch.local_int.cpuflags);
spin_unlock(&dst_vcpu->arch.local_int.lock);
if (!(flags & CPUSTAT_STOPPED)) {
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_INCORRECT_STATE;
......@@ -291,8 +267,9 @@ static int __prepare_sigp_re_start(struct kvm_vcpu *vcpu,
/* handle (RE)START in user space */
int rc = -EOPNOTSUPP;
/* make sure we don't race with STOP irq injection */
spin_lock(&li->lock);
if (li->action_bits & ACTION_STOP_ON_STOP)
if (kvm_s390_is_stop_irq_pending(dst_vcpu))
rc = SIGP_CC_BUSY;
spin_unlock(&li->lock);
......@@ -333,7 +310,7 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
break;
case SIGP_EXTERNAL_CALL:
vcpu->stat.instruction_sigp_external_call++;
rc = __sigp_external_call(vcpu, dst_vcpu);
rc = __sigp_external_call(vcpu, dst_vcpu, status_reg);
break;
case SIGP_EMERGENCY_SIGNAL:
vcpu->stat.instruction_sigp_emergency++;
......@@ -394,6 +371,53 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
return rc;
}
static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu, u8 order_code)
{
if (!vcpu->kvm->arch.user_sigp)
return 0;
switch (order_code) {
case SIGP_SENSE:
case SIGP_EXTERNAL_CALL:
case SIGP_EMERGENCY_SIGNAL:
case SIGP_COND_EMERGENCY_SIGNAL:
case SIGP_SENSE_RUNNING:
return 0;
/* update counters as we're directly dropping to user space */
case SIGP_STOP:
vcpu->stat.instruction_sigp_stop++;
break;
case SIGP_STOP_AND_STORE_STATUS:
vcpu->stat.instruction_sigp_stop_store_status++;
break;
case SIGP_STORE_STATUS_AT_ADDRESS:
vcpu->stat.instruction_sigp_store_status++;
break;
case SIGP_SET_PREFIX:
vcpu->stat.instruction_sigp_prefix++;
break;
case SIGP_START:
vcpu->stat.instruction_sigp_start++;
break;
case SIGP_RESTART:
vcpu->stat.instruction_sigp_restart++;
break;
case SIGP_INITIAL_CPU_RESET:
vcpu->stat.instruction_sigp_init_cpu_reset++;
break;
case SIGP_CPU_RESET:
vcpu->stat.instruction_sigp_cpu_reset++;
break;
default:
vcpu->stat.instruction_sigp_unknown++;
}
VCPU_EVENT(vcpu, 4, "sigp order %u: completely handled in user space",
order_code);
return 1;
}
int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
{
int r1 = (vcpu->arch.sie_block->ipa & 0x00f0) >> 4;
......@@ -408,6 +432,8 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
order_code = kvm_s390_get_base_disp_rs(vcpu);
if (handle_sigp_order_in_user_space(vcpu, order_code))
return -EOPNOTSUPP;
if (r1 % 2)
parameter = vcpu->run->s.regs.gprs[r1];
......
......@@ -209,19 +209,21 @@ TRACE_EVENT(kvm_s390_request_resets,
* Trace point for a vcpu's stop requests.
*/
TRACE_EVENT(kvm_s390_stop_request,
TP_PROTO(unsigned int action_bits),
TP_ARGS(action_bits),
TP_PROTO(unsigned char stop_irq, unsigned char flags),
TP_ARGS(stop_irq, flags),
TP_STRUCT__entry(
__field(unsigned int, action_bits)
__field(unsigned char, stop_irq)
__field(unsigned char, flags)
),
TP_fast_assign(
__entry->action_bits = action_bits;
__entry->stop_irq = stop_irq;
__entry->flags = flags;
),
TP_printk("stop request, action_bits = %08x",
__entry->action_bits)
TP_printk("stop request, stop irq = %u, flags = %08x",
__entry->stop_irq, __entry->flags)
);
......
......@@ -208,6 +208,7 @@ struct x86_emulate_ops {
void (*get_cpuid)(struct x86_emulate_ctxt *ctxt,
u32 *eax, u32 *ebx, u32 *ecx, u32 *edx);
void (*set_nmi_mask)(struct x86_emulate_ctxt *ctxt, bool masked);
};
typedef u32 __attribute__((vector_size(16))) sse128_t;
......
......@@ -38,8 +38,6 @@
#define KVM_PRIVATE_MEM_SLOTS 3
#define KVM_MEM_SLOTS_NUM (KVM_USER_MEM_SLOTS + KVM_PRIVATE_MEM_SLOTS)
#define KVM_MMIO_SIZE 16
#define KVM_PIO_PAGE_OFFSET 1
#define KVM_COALESCED_MMIO_PAGE_OFFSET 2
......@@ -51,7 +49,7 @@
| X86_CR0_NW | X86_CR0_CD | X86_CR0_PG))
#define CR3_L_MODE_RESERVED_BITS 0xFFFFFF0000000000ULL
#define CR3_PCID_INVD (1UL << 63)
#define CR3_PCID_INVD BIT_64(63)
#define CR4_RESERVED_BITS \
(~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
| X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
......@@ -160,6 +158,18 @@ enum {
#define DR7_FIXED_1 0x00000400
#define DR7_VOLATILE 0xffff2bff
#define PFERR_PRESENT_BIT 0
#define PFERR_WRITE_BIT 1
#define PFERR_USER_BIT 2
#define PFERR_RSVD_BIT 3
#define PFERR_FETCH_BIT 4
#define PFERR_PRESENT_MASK (1U << PFERR_PRESENT_BIT)
#define PFERR_WRITE_MASK (1U << PFERR_WRITE_BIT)
#define PFERR_USER_MASK (1U << PFERR_USER_BIT)
#define PFERR_RSVD_MASK (1U << PFERR_RSVD_BIT)
#define PFERR_FETCH_MASK (1U << PFERR_FETCH_BIT)
/* apic attention bits */
#define KVM_APIC_CHECK_VAPIC 0
/*
......@@ -615,6 +625,8 @@ struct kvm_arch {
#ifdef CONFIG_KVM_MMU_AUDIT
int audit_point;
#endif
bool boot_vcpu_runs_old_kvmclock;
};
struct kvm_vm_stat {
......@@ -643,6 +655,7 @@ struct kvm_vcpu_stat {
u32 irq_window_exits;
u32 nmi_window_exits;
u32 halt_exits;
u32 halt_successful_poll;
u32 halt_wakeup;
u32 request_irq_exits;
u32 irq_exits;
......@@ -787,6 +800,31 @@ struct kvm_x86_ops {
int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
void (*sched_in)(struct kvm_vcpu *kvm, int cpu);
/*
* Arch-specific dirty logging hooks. These hooks are only supposed to
* be valid if the specific arch has hardware-accelerated dirty logging
* mechanism. Currently only for PML on VMX.
*
* - slot_enable_log_dirty:
* called when enabling log dirty mode for the slot.
* - slot_disable_log_dirty:
* called when disabling log dirty mode for the slot.
* also called when slot is created with log dirty disabled.
* - flush_log_dirty:
* called before reporting dirty_bitmap to userspace.
* - enable_log_dirty_pt_masked:
* called when reenabling log dirty for the GFNs in the mask after
* corresponding bits are cleared in slot->dirty_bitmap.
*/
void (*slot_enable_log_dirty)(struct kvm *kvm,
struct kvm_memory_slot *slot);
void (*slot_disable_log_dirty)(struct kvm *kvm,
struct kvm_memory_slot *slot);
void (*flush_log_dirty)(struct kvm *kvm);
void (*enable_log_dirty_pt_masked)(struct kvm *kvm,
struct kvm_memory_slot *slot,
gfn_t offset, unsigned long mask);
};
struct kvm_arch_async_pf {
......@@ -819,8 +857,15 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
u64 dirty_mask, u64 nx_mask, u64 x_mask);
void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
struct kvm_memory_slot *memslot);
void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
struct kvm_memory_slot *memslot);
void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
struct kvm_memory_slot *memslot);
void kvm_mmu_slot_set_dirty(struct kvm *kvm,
struct kvm_memory_slot *memslot);
void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
struct kvm_memory_slot *slot,
gfn_t gfn_offset, unsigned long mask);
void kvm_mmu_zap_all(struct kvm *kvm);
......
......@@ -69,6 +69,7 @@
#define SECONDARY_EXEC_PAUSE_LOOP_EXITING 0x00000400
#define SECONDARY_EXEC_ENABLE_INVPCID 0x00001000
#define SECONDARY_EXEC_SHADOW_VMCS 0x00004000
#define SECONDARY_EXEC_ENABLE_PML 0x00020000
#define SECONDARY_EXEC_XSAVES 0x00100000
......@@ -121,6 +122,7 @@ enum vmcs_field {
GUEST_LDTR_SELECTOR = 0x0000080c,
GUEST_TR_SELECTOR = 0x0000080e,
GUEST_INTR_STATUS = 0x00000810,
GUEST_PML_INDEX = 0x00000812,
HOST_ES_SELECTOR = 0x00000c00,
HOST_CS_SELECTOR = 0x00000c02,
HOST_SS_SELECTOR = 0x00000c04,
......@@ -140,6 +142,8 @@ enum vmcs_field {
VM_EXIT_MSR_LOAD_ADDR_HIGH = 0x00002009,
VM_ENTRY_MSR_LOAD_ADDR = 0x0000200a,
VM_ENTRY_MSR_LOAD_ADDR_HIGH = 0x0000200b,
PML_ADDRESS = 0x0000200e,
PML_ADDRESS_HIGH = 0x0000200f,
TSC_OFFSET = 0x00002010,
TSC_OFFSET_HIGH = 0x00002011,
VIRTUAL_APIC_PAGE_ADDR = 0x00002012,
......
......@@ -364,6 +364,9 @@
#define MSR_IA32_UCODE_WRITE 0x00000079
#define MSR_IA32_UCODE_REV 0x0000008b
#define MSR_IA32_SMM_MONITOR_CTL 0x0000009b
#define MSR_IA32_SMBASE 0x0000009e
#define MSR_IA32_PERF_STATUS 0x00000198
#define MSR_IA32_PERF_CTL 0x00000199
#define INTEL_PERF_CTL_MASK 0xffff
......
......@@ -56,6 +56,7 @@
#define EXIT_REASON_MSR_READ 31
#define EXIT_REASON_MSR_WRITE 32
#define EXIT_REASON_INVALID_STATE 33
#define EXIT_REASON_MSR_LOAD_FAIL 34
#define EXIT_REASON_MWAIT_INSTRUCTION 36
#define EXIT_REASON_MONITOR_INSTRUCTION 39
#define EXIT_REASON_PAUSE_INSTRUCTION 40
......@@ -72,6 +73,7 @@
#define EXIT_REASON_XSETBV 55
#define EXIT_REASON_APIC_WRITE 56
#define EXIT_REASON_INVPCID 58
#define EXIT_REASON_PML_FULL 62
#define EXIT_REASON_XSAVES 63
#define EXIT_REASON_XRSTORS 64
......@@ -116,10 +118,14 @@
{ EXIT_REASON_APIC_WRITE, "APIC_WRITE" }, \
{ EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \
{ EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \
{ EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \
{ EXIT_REASON_INVD, "INVD" }, \
{ EXIT_REASON_INVVPID, "INVVPID" }, \
{ EXIT_REASON_INVPCID, "INVPCID" }, \
{ EXIT_REASON_XSAVES, "XSAVES" }, \
{ EXIT_REASON_XRSTORS, "XRSTORS" }
#define VMX_ABORT_SAVE_GUEST_MSR_FAIL 1
#define VMX_ABORT_LOAD_HOST_MSR_FAIL 4
#endif /* _UAPIVMX_H */
......@@ -39,6 +39,7 @@ config KVM
select PERF_EVENTS
select HAVE_KVM_MSI
select HAVE_KVM_CPU_RELAX_INTERCEPT
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_VFIO
select SRCU
---help---
......
This diff is collapsed.
......@@ -98,7 +98,7 @@ static inline struct kvm_ioapic *ioapic_irqchip(struct kvm *kvm)
}
void kvm_rtc_eoi_tracking_restore_one(struct kvm_vcpu *vcpu);
int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
int short_hand, unsigned int dest, int dest_mode);
int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2);
void kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, int vector,
......
......@@ -138,7 +138,7 @@ int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot)
gfn += page_size >> PAGE_SHIFT;
cond_resched();
}
return 0;
......@@ -306,6 +306,8 @@ static void kvm_iommu_put_pages(struct kvm *kvm,
kvm_unpin_pages(kvm, pfn, unmap_pages);
gfn += unmap_pages;
cond_resched();
}
}
......
......@@ -33,6 +33,7 @@
#include <asm/page.h>
#include <asm/current.h>
#include <asm/apicdef.h>
#include <asm/delay.h>
#include <linux/atomic.h>
#include <linux/jump_label.h>
#include "kvm_cache_regs.h"
......@@ -327,17 +328,24 @@ static u8 count_vectors(void *bitmap)
return count;
}
void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
void __kvm_apic_update_irr(u32 *pir, void *regs)
{
u32 i, pir_val;
struct kvm_lapic *apic = vcpu->arch.apic;
for (i = 0; i <= 7; i++) {
pir_val = xchg(&pir[i], 0);
if (pir_val)
*((u32 *)(apic->regs + APIC_IRR + i * 0x10)) |= pir_val;
*((u32 *)(regs + APIC_IRR + i * 0x10)) |= pir_val;
}
}
EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);
void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
{
struct kvm_lapic *apic = vcpu->arch.apic;
__kvm_apic_update_irr(pir, apic->regs);
}
EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
static inline void apic_set_irr(int vec, struct kvm_lapic *apic)
......@@ -405,7 +413,7 @@ static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
* because the processor can modify ISR under the hood. Instead
* just set SVI.
*/
if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
if (unlikely(kvm_x86_ops->hwapic_isr_update))
kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
else {
++apic->isr_count;
......@@ -453,7 +461,7 @@ static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
* on the other hand isr_count and highest_isr_cache are unused
* and must be left alone.
*/
if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
if (unlikely(kvm_x86_ops->hwapic_isr_update))
kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
apic_find_highest_isr(apic));
else {
......@@ -580,55 +588,48 @@ static void apic_set_tpr(struct kvm_lapic *apic, u32 tpr)
apic_update_ppr(apic);
}
static int kvm_apic_broadcast(struct kvm_lapic *apic, u32 dest)
static bool kvm_apic_broadcast(struct kvm_lapic *apic, u32 dest)
{
return dest == (apic_x2apic_mode(apic) ?
X2APIC_BROADCAST : APIC_BROADCAST);
}
int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u32 dest)
static bool kvm_apic_match_physical_addr(struct kvm_lapic *apic, u32 dest)
{
return kvm_apic_id(apic) == dest || kvm_apic_broadcast(apic, dest);
}
int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u32 mda)
static bool kvm_apic_match_logical_addr(struct kvm_lapic *apic, u32 mda)
{
int result = 0;
u32 logical_id;
if (kvm_apic_broadcast(apic, mda))
return 1;
return true;
if (apic_x2apic_mode(apic)) {
logical_id = kvm_apic_get_reg(apic, APIC_LDR);
return logical_id & mda;
}
logical_id = GET_APIC_LOGICAL_ID(kvm_apic_get_reg(apic, APIC_LDR));
if (apic_x2apic_mode(apic))
return ((logical_id >> 16) == (mda >> 16))
&& (logical_id & mda & 0xffff) != 0;
logical_id = GET_APIC_LOGICAL_ID(logical_id);
switch (kvm_apic_get_reg(apic, APIC_DFR)) {
case APIC_DFR_FLAT:
if (logical_id & mda)
result = 1;
break;
return (logical_id & mda) != 0;
case APIC_DFR_CLUSTER:
if (((logical_id >> 4) == (mda >> 0x4))
&& (logical_id & mda & 0xf))
result = 1;
break;
return ((logical_id >> 4) == (mda >> 4))
&& (logical_id & mda & 0xf) != 0;
default:
apic_debug("Bad DFR vcpu %d: %08x\n",
apic->vcpu->vcpu_id, kvm_apic_get_reg(apic, APIC_DFR));
break;
return false;
}
return result;
}
int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
int short_hand, unsigned int dest, int dest_mode)
{
int result = 0;
struct kvm_lapic *target = vcpu->arch.apic;
apic_debug("target %p, source %p, dest 0x%x, "
......@@ -638,29 +639,21 @@ int kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
ASSERT(target);
switch (short_hand) {
case APIC_DEST_NOSHORT:
if (dest_mode == 0)
/* Physical mode. */
result = kvm_apic_match_physical_addr(target, dest);
if (dest_mode == APIC_DEST_PHYSICAL)
return kvm_apic_match_physical_addr(target, dest);
else
/* Logical mode. */
result = kvm_apic_match_logical_addr(target, dest);
break;
return kvm_apic_match_logical_addr(target, dest);
case APIC_DEST_SELF:
result = (target == source);
break;
return target == source;
case APIC_DEST_ALLINC:
result = 1;
break;
return true;
case APIC_DEST_ALLBUT:
result = (target != source);
break;
return target != source;
default:
apic_debug("kvm: apic: Bad dest shorthand value %x\n",
short_hand);
break;
return false;
}
return result;
}
bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
......@@ -693,7 +686,7 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
ret = true;
if (irq->dest_mode == 0) { /* physical mode */
if (irq->dest_mode == APIC_DEST_PHYSICAL) {
if (irq->dest_id >= ARRAY_SIZE(map->phys_map))
goto out;
......@@ -1076,25 +1069,72 @@ static void apic_timer_expired(struct kvm_lapic *apic)
{
struct kvm_vcpu *vcpu = apic->vcpu;
wait_queue_head_t *q = &vcpu->wq;
struct kvm_timer *ktimer = &apic->lapic_timer;
/*
* Note: KVM_REQ_PENDING_TIMER is implicitly checked in
* vcpu_enter_guest.
*/
if (atomic_read(&apic->lapic_timer.pending))
return;
atomic_inc(&apic->lapic_timer.pending);
/* FIXME: this code should not know anything about vcpus */
kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
kvm_set_pending_timer(vcpu);
if (waitqueue_active(q))
wake_up_interruptible(q);
if (apic_lvtt_tscdeadline(apic))
ktimer->expired_tscdeadline = ktimer->tscdeadline;
}
/*
* On APICv, this test will cause a busy wait
* during a higher-priority task.
*/
static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
u32 reg = kvm_apic_get_reg(apic, APIC_LVTT);
if (kvm_apic_hw_enabled(apic)) {
int vec = reg & APIC_VECTOR_MASK;
void *bitmap = apic->regs + APIC_ISR;
if (kvm_x86_ops->deliver_posted_interrupt)
bitmap = apic->regs + APIC_IRR;
if (apic_test_vector(vec, bitmap))
return true;
}
return false;
}
void wait_lapic_expire(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
u64 guest_tsc, tsc_deadline;
if (!kvm_vcpu_has_lapic(vcpu))
return;
if (apic->lapic_timer.expired_tscdeadline == 0)
return;
if (!lapic_timer_int_injected(vcpu))
return;
tsc_deadline = apic->lapic_timer.expired_tscdeadline;
apic->lapic_timer.expired_tscdeadline = 0;
guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
/* __delay is delay_tsc whenever the hardware has TSC, thus always. */
if (guest_tsc < tsc_deadline)
__delay(tsc_deadline - guest_tsc);
}
static void start_apic_timer(struct kvm_lapic *apic)
{
ktime_t now;
atomic_set(&apic->lapic_timer.pending, 0);
if (apic_lvtt_period(apic) || apic_lvtt_oneshot(apic)) {
......@@ -1140,6 +1180,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
/* lapic timer in tsc deadline mode */
u64 guest_tsc, tscdeadline = apic->lapic_timer.tscdeadline;
u64 ns = 0;
ktime_t expire;
struct kvm_vcpu *vcpu = apic->vcpu;
unsigned long this_tsc_khz = vcpu->arch.virtual_tsc_khz;
unsigned long flags;
......@@ -1154,8 +1195,10 @@ static void start_apic_timer(struct kvm_lapic *apic)
if (likely(tscdeadline > guest_tsc)) {
ns = (tscdeadline - guest_tsc) * 1000000ULL;
do_div(ns, this_tsc_khz);
expire = ktime_add_ns(now, ns);
expire = ktime_sub_ns(expire, lapic_timer_advance_ns);
hrtimer_start(&apic->lapic_timer.timer,
ktime_add_ns(now, ns), HRTIMER_MODE_ABS);
expire, HRTIMER_MODE_ABS);
} else
apic_timer_expired(apic);
......@@ -1745,7 +1788,9 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
if (kvm_x86_ops->hwapic_irr_update)
kvm_x86_ops->hwapic_irr_update(vcpu,
apic_find_highest_irr(apic));
kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
if (unlikely(kvm_x86_ops->hwapic_isr_update))
kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
apic_find_highest_isr(apic));
kvm_make_request(KVM_REQ_EVENT, vcpu);
kvm_rtc_eoi_tracking_restore_one(vcpu);
}
......
......@@ -14,6 +14,7 @@ struct kvm_timer {
u32 timer_mode;
u32 timer_mode_mask;
u64 tscdeadline;
u64 expired_tscdeadline;
atomic_t pending; /* accumulated triggered timers */
};
......@@ -56,9 +57,8 @@ u64 kvm_lapic_get_base(struct kvm_vcpu *vcpu);
void kvm_apic_set_version(struct kvm_vcpu *vcpu);
void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr);
void __kvm_apic_update_irr(u32 *pir, void *regs);
void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u32 dest);
int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u32 mda);
int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
unsigned long *dest_map);
int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
......@@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
void wait_lapic_expire(struct kvm_vcpu *vcpu);
#endif
This diff is collapsed.
......@@ -44,18 +44,6 @@
#define PT_DIRECTORY_LEVEL 2
#define PT_PAGE_TABLE_LEVEL 1
#define PFERR_PRESENT_BIT 0
#define PFERR_WRITE_BIT 1
#define PFERR_USER_BIT 2
#define PFERR_RSVD_BIT 3
#define PFERR_FETCH_BIT 4
#define PFERR_PRESENT_MASK (1U << PFERR_PRESENT_BIT)
#define PFERR_WRITE_MASK (1U << PFERR_WRITE_BIT)
#define PFERR_USER_MASK (1U << PFERR_USER_BIT)
#define PFERR_RSVD_MASK (1U << PFERR_RSVD_BIT)
#define PFERR_FETCH_MASK (1U << PFERR_FETCH_BIT)
static inline u64 rsvd_bits(int s, int e)
{
return ((1ULL << (e - s + 1)) - 1) << s;
......@@ -81,9 +69,8 @@ enum {
};
int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
bool execonly);
void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu);
void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly);
void update_permission_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
bool ept);
......
......@@ -2003,8 +2003,8 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu,
static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
{
kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
WARN_ON(mmu_is_nested(vcpu));
kvm_init_shadow_mmu(vcpu);
vcpu->arch.mmu.set_cr3 = nested_svm_set_tdp_cr3;
vcpu->arch.mmu.get_cr3 = nested_svm_get_tdp_cr3;
vcpu->arch.mmu.get_pdptr = nested_svm_get_tdp_pdptr;
......
......@@ -848,6 +848,24 @@ TRACE_EVENT(kvm_track_tsc,
#endif /* CONFIG_X86_64 */
/*
* Tracepoint for PML full VMEXIT.
*/
TRACE_EVENT(kvm_pml_full,
TP_PROTO(unsigned int vcpu_id),
TP_ARGS(vcpu_id),
TP_STRUCT__entry(
__field( unsigned int, vcpu_id )
),
TP_fast_assign(
__entry->vcpu_id = vcpu_id;
),
TP_printk("vcpu %d: PML full", __entry->vcpu_id)
);
TRACE_EVENT(kvm_ple_window,
TP_PROTO(bool grow, unsigned int vcpu_id, int new, int old),
TP_ARGS(grow, vcpu_id, new, old),
......@@ -914,6 +932,26 @@ TRACE_EVENT(kvm_pvclock_update,
__entry->flags)
);
TRACE_EVENT(kvm_wait_lapic_expire,
TP_PROTO(unsigned int vcpu_id, s64 delta),
TP_ARGS(vcpu_id, delta),
TP_STRUCT__entry(
__field( unsigned int, vcpu_id )
__field( s64, delta )
),
TP_fast_assign(
__entry->vcpu_id = vcpu_id;
__entry->delta = delta;
),
TP_printk("vcpu %u: delta %lld (%s)",
__entry->vcpu_id,
__entry->delta,
__entry->delta < 0 ? "early" : "late")
);
#endif /* _TRACE_KVM_H */
#undef TRACE_INCLUDE_PATH
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment