Commit 89f88337 authored by Linus Torvalds

Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
parents 9e2d59ad 6b73a960
......@@ -219,19 +219,6 @@ allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.
On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
threads in one or more virtual CPU cores. (This is because the
hardware requires all the hardware threads in a CPU core to be in the
same partition.) The KVM_CAP_PPC_SMT capability indicates the number
of vcpus per virtual core (vcore). The vcore id is obtained by
dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
given vcore will always be in the same physical core as each other
(though that might be a different physical core from time to time).
Userspace can control the threading (SMT) mode of the guest by its
allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.
For virtual cpus that have been created with S390 user controlled virtual
machines, the resulting vcpu fd can be memory mapped at page offset
KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
......@@ -345,7 +332,7 @@ struct kvm_sregs {
__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
};
/* ppc -- see arch/powerpc/include/asm/kvm.h */
/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
......@@ -892,12 +879,12 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.
The flags field supports two flag, KVM_MEM_LOG_DIRTY_PAGES, which instructs
kvm to keep track of writes to memory within the slot. See KVM_GET_DIRTY_LOG
ioctl. The KVM_CAP_READONLY_MEM capability indicates the availability of the
KVM_MEM_READONLY flag. When this flag is set for a memory region, KVM only
allows read accesses. Writes will be posted to userspace as KVM_EXIT_MMIO
exits.
The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
writes to memory within the slot; see the KVM_GET_DIRTY_LOG ioctl for how to
use it. The latter can be set, if the KVM_CAP_READONLY_MEM capability allows it,
to make a new slot read-only. In this case, writes to this memory will be
posted to userspace as KVM_EXIT_MMIO exits.
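
As a quick illustration of the two flags, here is a minimal userspace sketch
that registers a read-only slot. It assumes an open VM file descriptor
(vm_fd), that availability of KVM_MEM_READONLY has already been confirmed via
KVM_CHECK_EXTENSION(KVM_CAP_READONLY_MEM), and the helper name map_rom_slot is
purely illustrative:

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Illustrative helper: back a guest physical range with an anonymous
 * mapping and register it as a read-only memory slot. */
static int map_rom_slot(int vm_fd, __u32 slot, __u64 gpa, __u64 size)
{
	struct kvm_userspace_memory_region region;
	void *backing;

	backing = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (backing == MAP_FAILED)
		return -1;

	memset(&region, 0, sizeof(region));
	region.slot = slot;
	region.flags = KVM_MEM_READONLY;  /* guest writes become KVM_EXIT_MMIO */
	region.guest_phys_addr = gpa;
	region.memory_size = size;
	region.userspace_addr = (__u64)(unsigned long)backing;

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
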
When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
......@@ -931,7 +918,7 @@ documentation when it pops into existence).
4.37 KVM_ENABLE_CAP
Capability: KVM_CAP_ENABLE_CAP
Architectures: ppc
Architectures: ppc, s390
Type: vcpu ioctl
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error
......@@ -1792,6 +1779,7 @@ registers, find a list below:
PPC | KVM_REG_PPC_VPA_SLB | 128
PPC | KVM_REG_PPC_VPA_DTL | 128
PPC | KVM_REG_PPC_EPCR | 32
PPC | KVM_REG_PPC_EPR | 32
ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:
......@@ -2108,6 +2096,14 @@ KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
I/O interruption parameters in parm (subchannel) and parm64 (intparm,
interruption subclass)
KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
machine check interrupt code in parm64 (note that
machine checks needing further payload are not
supported by this ioctl)
Note that the vcpu ioctl is asynchronous to vcpu execution.
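
To make the compound KVM_S390_INT_IO value listed above concrete, a hedged
sketch of injecting a floating I/O interrupt from userspace follows. The
encoding of parm and parm64 mirrors the parsing done in kvm_s390_inject_vm()
later in this diff; vm_fd and the helper name are illustrative assumptions:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Illustrative helper: inject a floating I/O interrupt for the given
 * subchannel with the given interruption parameters. */
static int inject_io_interrupt(int vm_fd, __u32 type,
			       __u16 subchannel_id, __u16 subchannel_nr,
			       __u32 io_int_parm, __u32 io_int_word)
{
	struct kvm_s390_interrupt irq = {
		/* e.g. KVM_S390_INT_IO(0, cssid, ssid, schid) for a
		 * classic subchannel, KVM_S390_INT_IO(1, 0, 0, 0) for an
		 * adapter interrupt */
		.type   = type,
		/* parm: subchannel id in the upper, number in the lower 16 bits */
		.parm   = ((__u32)subchannel_id << 16) | subchannel_nr,
		/* parm64: intparm in the upper 32 bits, interruption word
		 * (including the subclass) in the lower 32 bits */
		.parm64 = ((__u64)io_int_parm << 32) | io_int_word,
	};

	return ioctl(vm_fd, KVM_S390_INTERRUPT, &irq);
}
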
......@@ -2359,8 +2355,8 @@ executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR
and KVM_EXIT_PAPR the corresponding
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_DCR,
KVM_EXIT_PAPR and KVM_EXIT_EPR the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN. The kernel side will first finish
incomplete operations and then check for pending signals. Userspace
......@@ -2463,6 +2459,41 @@ The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).
/* KVM_EXIT_S390_TSCH */
struct {
__u16 subchannel_id;
__u16 subchannel_nr;
__u32 io_int_parm;
__u32 io_int_word;
__u32 ipb;
__u8 dequeued;
} s390_tsch;
s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
interrupt for the target subchannel has been dequeued and subchannel_id,
subchannel_nr, io_int_parm and io_int_word contain the parameters for that
interrupt. ipb is needed for instruction parameter decoding.
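
For orientation, a hedged sketch of the userspace side of this exit is shown
below. It assumes run points at the mmap'ed kvm_run structure of a vcpu with
KVM_CAP_S390_CSS_SUPPORT enabled (see section 6.4 further down) and a uapi
header that already contains the s390_tsch member added by this series;
handle_io_interrupt() and decode_tsch() are illustrative placeholders, not
KVM interfaces:

#include <linux/kvm.h>

/* Illustrative placeholders for the userspace channel subsystem model. */
extern void handle_io_interrupt(__u16 schid, __u16 schnr, __u32 parm, __u32 word);
extern void decode_tsch(__u32 ipb);

static void handle_exit_s390_tsch(struct kvm_run *run)
{
	/* Only meaningful when run->exit_reason == KVM_EXIT_S390_TSCH. */
	if (run->s390_tsch.dequeued)
		handle_io_interrupt(run->s390_tsch.subchannel_id,
				    run->s390_tsch.subchannel_nr,
				    run->s390_tsch.io_int_parm,
				    run->s390_tsch.io_int_word);

	/* ipb is needed to decode the intercepted instruction. */
	decode_tsch(run->s390_tsch.ipb);
}
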
/* KVM_EXIT_EPR */
struct {
__u32 epr;
} epr;
On FSL BookE PowerPC chips, the interrupt controller has a fast path
interrupt acknowledge path to the core. When the core successfully
delivers an interrupt, it automatically populates the EPR register with
the interrupt vector number and acknowledges the interrupt inside
the interrupt controller.
In case the interrupt controller lives in user space, we need to do
the interrupt acknowledge cycle through it to fetch the next to be
delivered interrupt vector using this exit.
It gets triggered whenever KVM_CAP_PPC_EPR is enabled and an
external interrupt has just been delivered into the guest. User space
should put the acknowledged interrupt vector into the 'epr' field.
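
A minimal sketch of that acknowledge cycle from the userspace side might look
as follows; vcpu_fd, run and ack_external_irq() (a stand-in for the userspace
interrupt controller's acknowledge hook) are illustrative assumptions, and the
epr member of kvm_run is the one introduced by this series:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Illustrative stand-in for the user space interrupt controller's
 * interrupt acknowledge cycle. */
extern __u32 ack_external_irq(void);

static int complete_epr_exit(int vcpu_fd, struct kvm_run *run)
{
	/* Only meaningful when run->exit_reason == KVM_EXIT_EPR. */
	run->epr.epr = ack_external_irq();

	/* The vector is transferred into the guest EPR on the next KVM_RUN. */
	return ioctl(vcpu_fd, KVM_RUN, 0);
}
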
/* Fix the size of the union. */
char padding[256];
};
......@@ -2584,3 +2615,34 @@ For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
- The tsize field of mas1 shall be set to 4K on TLB0, even though the
hardware ignores this value for TLB0.
6.4 KVM_CAP_S390_CSS_SUPPORT
Architectures: s390
Parameters: none
Returns: 0 on success; -1 on error
This capability enables support for handling of channel I/O instructions.
TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
handled in-kernel, while the other I/O instructions are passed to userspace.
When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
SUBCHANNEL intercepts.
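
A minimal sketch of enabling this from userspace, assuming vcpu_fd is an open
vcpu file descriptor (the capability is enabled through the KVM_ENABLE_CAP
vcpu ioctl documented in section 4.37); the helper name is illustrative:

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

static int enable_css_support(int vcpu_fd)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_S390_CSS_SUPPORT;	/* no arguments needed */

	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
}
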
6.5 KVM_CAP_PPC_EPR
Architectures: ppc
Parameters: args[0] defines whether the proxy facility is active
Returns: 0 on success; -1 on error
This capability enables or disables the delivery of interrupts through the
external proxy facility.
When enabled (args[0] != 0), every time the guest gets an external interrupt
delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
to receive the topmost interrupt vector.
When disabled (args[0] == 0), behavior is as if this facility is unsupported.
When this capability is enabled, KVM_EXIT_EPR can occur.
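
For illustration, a hedged sketch of toggling the facility through the
KVM_ENABLE_CAP vcpu ioctl; vcpu_fd and the helper name are assumptions:

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

static int set_epr_proxy(int vcpu_fd, int enable)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_PPC_EPR;
	cap.args[0] = enable;	/* non-zero: deliver external irqs via KVM_EXIT_EPR */

	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
}
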
......@@ -187,13 +187,6 @@ Shadow pages contain the following information:
perform a reverse map from a pte to a gfn. When role.direct is set, any
element of this array can be calculated from the gfn field when used, in
this case, the array of gfns is not allocated. See role.direct and gfn.
slot_bitmap:
A bitmap containing one bit per memory slot. If the page contains a pte
mapping a page from memory slot n, then bit n of slot_bitmap will be set
(if a page is aliased among several slots, then it is not guaranteed that
all slots will be marked).
Used during dirty logging to avoid scanning a shadow page if none of its
pages need tracking.
root_count:
A counter keeping track of how many hardware registers (guest cr3 or
pdptrs) are now pointing at the page. While this counter is nonzero, the
......
......@@ -23,9 +23,7 @@
#ifndef __ASM_KVM_HOST_H
#define __ASM_KVM_HOST_H
#define KVM_MEMORY_SLOTS 32
/* memory slots that does not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 4
#define KVM_USER_MEM_SLOTS 32
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
......
......@@ -955,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
kvm_mem.guest_phys_addr;
kvm_userspace_mem.memory_size = kvm_mem.memory_size;
r = kvm_vm_ioctl_set_memory_region(kvm,
&kvm_userspace_mem, 0);
&kvm_userspace_mem, false);
if (r)
goto out;
break;
......@@ -1580,7 +1580,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
int user_alloc)
bool user_alloc)
{
unsigned long i;
unsigned long pfn;
......@@ -1611,7 +1611,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
int user_alloc)
bool user_alloc)
{
return;
}
......@@ -1834,7 +1834,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
mutex_lock(&kvm->slots_lock);
r = -EINVAL;
if (log->slot >= KVM_MEMORY_SLOTS)
if (log->slot >= KVM_USER_MEM_SLOTS)
goto out;
memslot = id_to_memslot(kvm->memslots, log->slot);
......
......@@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
#define kvm_apic_present(x) (true)
#define kvm_lapic_enabled(x) (true)
static inline bool kvm_apic_vid_enabled(void)
{
/* IA64 has no apicv support, do nothing here */
return false;
}
#endif
......@@ -37,10 +37,8 @@
#define KVM_MAX_VCPUS NR_CPUS
#define KVM_MAX_VCORES NR_CPUS
#define KVM_MEMORY_SLOTS 32
/* memory slots that does not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 4
#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
#define KVM_USER_MEM_SLOTS 32
#define KVM_MEM_SLOTS_NUM KVM_USER_MEM_SLOTS
#ifdef CONFIG_KVM_MMIO
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
......@@ -523,6 +521,8 @@ struct kvm_vcpu_arch {
u8 sane;
u8 cpu_type;
u8 hcall_needed;
u8 epr_enabled;
u8 epr_needed;
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
......
......@@ -44,12 +44,11 @@ enum emulation_result {
EMULATE_DO_DCR, /* kvm_run filled with DCR request */
EMULATE_FAIL, /* can't emulate this instruction */
EMULATE_AGAIN, /* something went wrong. go again */
EMULATE_DO_PAPR, /* kvm_run filled with PAPR request */
};
extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
extern char kvmppc_handlers_start[];
extern unsigned long kvmppc_handler_len;
extern void kvmppc_handler_highmem(void);
extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
......@@ -263,6 +262,15 @@ static inline void kvm_linear_init(void)
{}
#endif
static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
{
#ifdef CONFIG_KVM_BOOKE_HV
mtspr(SPRN_GEPR, epr);
#elif defined(CONFIG_BOOKE)
vcpu->arch.epr = epr;
#endif
}
int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
struct kvm_config_tlb *cfg);
int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
......
......@@ -956,8 +956,6 @@
#define SPRN_SPRG_RSCRATCH_DBG SPRN_SPRG9
#define SPRN_SPRG_WSCRATCH_DBG SPRN_SPRG9
#endif
#define SPRN_SPRG_RVCPU SPRN_SPRG1
#define SPRN_SPRG_WVCPU SPRN_SPRG1
#endif
#ifdef CONFIG_8xx
......
......@@ -56,6 +56,7 @@
#define SPRN_SPRG7W 0x117 /* Special Purpose Register General 7 Write */
#define SPRN_EPCR 0x133 /* Embedded Processor Control Register */
#define SPRN_DBCR2 0x136 /* Debug Control Register 2 */
#define SPRN_DBCR4 0x233 /* Debug Control Register 4 */
#define SPRN_MSRP 0x137 /* MSR Protect Register */
#define SPRN_IAC3 0x13A /* Instruction Address Compare 3 */
#define SPRN_IAC4 0x13B /* Instruction Address Compare 4 */
......
......@@ -114,7 +114,10 @@ struct kvm_regs {
/* Embedded Floating Point (SPE) -- IVOR32-34 if KVM_SREGS_E_IVOR */
#define KVM_SREGS_E_SPE (1 << 9)
/* External Proxy (EXP) -- EPR */
/*
* DEPRECATED! USE ONE_REG FOR THIS ONE!
* External Proxy (EXP) -- EPR
*/
#define KVM_SREGS_EXP (1 << 10)
/* External PID (E.PD) -- EPSC/EPLC */
......@@ -412,5 +415,6 @@ struct kvm_get_htab_header {
#define KVM_REG_PPC_VPA_DTL (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x84)
#define KVM_REG_PPC_EPCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x85)
#define KVM_REG_PPC_EPR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x86)
#endif /* __LINUX_KVM_POWERPC_H */
......@@ -118,7 +118,7 @@ int main(void)
#ifdef CONFIG_KVM_BOOK3S_32_HANDLER
DEFINE(THREAD_KVM_SVCPU, offsetof(struct thread_struct, kvm_shadow_vcpu));
#endif
#ifdef CONFIG_KVM_BOOKE_HV
#if defined(CONFIG_KVM) && defined(CONFIG_BOOKE)
DEFINE(THREAD_KVM_VCPU, offsetof(struct thread_struct, kvm_vcpu));
#endif
......
......@@ -10,7 +10,8 @@ common-objs-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o \
eventfd.o)
CFLAGS_44x_tlb.o := -I.
CFLAGS_e500_tlb.o := -I.
CFLAGS_e500_mmu.o := -I.
CFLAGS_e500_mmu_host.o := -I.
CFLAGS_emulate.o := -I.
common-objs-y += powerpc.o emulate.o
......@@ -35,7 +36,8 @@ kvm-e500-objs := \
booke_emulate.o \
booke_interrupts.o \
e500.o \
e500_tlb.o \
e500_mmu.o \
e500_mmu_host.o \
e500_emulate.o
kvm-objs-$(CONFIG_KVM_E500V2) := $(kvm-e500-objs)
......@@ -45,7 +47,8 @@ kvm-e500mc-objs := \
booke_emulate.o \
bookehv_interrupts.o \
e500mc.o \
e500_tlb.o \
e500_mmu.o \
e500_mmu_host.o \
e500_emulate.o
kvm-objs-$(CONFIG_KVM_E500MC) := $(kvm-e500mc-objs)
......
......@@ -34,6 +34,8 @@
#define OP_31_XOP_MTSRIN 242
#define OP_31_XOP_TLBIEL 274
#define OP_31_XOP_TLBIE 306
/* Opcode is officially reserved, reuse it as sc 1 when sc 1 doesn't trap */
#define OP_31_XOP_FAKE_SC1 308
#define OP_31_XOP_SLBMTE 402
#define OP_31_XOP_SLBIE 434
#define OP_31_XOP_SLBIA 498
......@@ -170,6 +172,32 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
vcpu->arch.mmu.tlbie(vcpu, addr, large);
break;
}
#ifdef CONFIG_KVM_BOOK3S_64_PR
case OP_31_XOP_FAKE_SC1:
{
/* SC 1 papr hypercalls */
ulong cmd = kvmppc_get_gpr(vcpu, 3);
int i;
if ((vcpu->arch.shared->msr & MSR_PR) ||
!vcpu->arch.papr_enabled) {
emulated = EMULATE_FAIL;
break;
}
if (kvmppc_h_pr(vcpu, cmd) == EMULATE_DONE)
break;
run->papr_hcall.nr = cmd;
for (i = 0; i < 9; ++i) {
ulong gpr = kvmppc_get_gpr(vcpu, 4 + i);
run->papr_hcall.args[i] = gpr;
}
emulated = EMULATE_DO_PAPR;
break;
}
#endif
case OP_31_XOP_EIOIO:
break;
case OP_31_XOP_SLBMTE:
......@@ -427,6 +455,7 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
case SPRN_PMC3_GEKKO:
case SPRN_PMC4_GEKKO:
case SPRN_WPAR_GEKKO:
case SPRN_MSSSR0:
break;
unprivileged:
default:
......@@ -523,6 +552,7 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
case SPRN_PMC3_GEKKO:
case SPRN_PMC4_GEKKO:
case SPRN_WPAR_GEKKO:
case SPRN_MSSSR0:
*spr_val = 0;
break;
default:
......
......@@ -1549,7 +1549,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
mutex_lock(&kvm->slots_lock);
r = -EINVAL;
if (log->slot >= KVM_MEMORY_SLOTS)
if (log->slot >= KVM_USER_MEM_SLOTS)
goto out;
memslot = id_to_memslot(kvm->memslots, log->slot);
......
......@@ -762,6 +762,11 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
run->exit_reason = KVM_EXIT_MMIO;
r = RESUME_HOST_NV;
break;
case EMULATE_DO_PAPR:
run->exit_reason = KVM_EXIT_PAPR_HCALL;
vcpu->arch.hcall_needed = 1;
r = RESUME_HOST_NV;
break;
default:
BUG();
}
......
......@@ -182,6 +182,14 @@ static void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu,
kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_INST_STORAGE);
}
static void kvmppc_core_queue_alignment(struct kvm_vcpu *vcpu, ulong dear_flags,
ulong esr_flags)
{
vcpu->arch.queued_dear = dear_flags;
vcpu->arch.queued_esr = esr_flags;
kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_ALIGNMENT);
}
void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong esr_flags)
{
vcpu->arch.queued_esr = esr_flags;
......@@ -300,13 +308,22 @@ static void set_guest_esr(struct kvm_vcpu *vcpu, u32 esr)
#endif
}
static unsigned long get_guest_epr(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_KVM_BOOKE_HV
return mfspr(SPRN_GEPR);
#else
return vcpu->arch.epr;
#endif
}
/* Deliver the interrupt of the corresponding priority, if possible. */
static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
unsigned int priority)
{
int allowed = 0;
ulong msr_mask = 0;
bool update_esr = false, update_dear = false;
bool update_esr = false, update_dear = false, update_epr = false;
ulong crit_raw = vcpu->arch.shared->critical;
ulong crit_r1 = kvmppc_get_gpr(vcpu, 1);
bool crit;
......@@ -330,9 +347,13 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
keep_irq = true;
}
if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled)
update_epr = true;
switch (priority) {
case BOOKE_IRQPRIO_DTLB_MISS:
case BOOKE_IRQPRIO_DATA_STORAGE:
case BOOKE_IRQPRIO_ALIGNMENT:
update_dear = true;
/* fall through */
case BOOKE_IRQPRIO_INST_STORAGE:
......@@ -346,7 +367,6 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
case BOOKE_IRQPRIO_SPE_FP_DATA:
case BOOKE_IRQPRIO_SPE_FP_ROUND:
case BOOKE_IRQPRIO_AP_UNAVAIL:
case BOOKE_IRQPRIO_ALIGNMENT:
allowed = 1;
msr_mask = MSR_CE | MSR_ME | MSR_DE;
int_class = INT_CLASS_NONCRIT;
......@@ -408,6 +428,8 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
set_guest_esr(vcpu, vcpu->arch.queued_esr);
if (update_dear == true)
set_guest_dear(vcpu, vcpu->arch.queued_dear);
if (update_epr == true)
kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
new_msr &= msr_mask;
#if defined(CONFIG_64BIT)
......@@ -581,6 +603,11 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
kvmppc_core_check_exceptions(vcpu);
if (vcpu->requests) {
/* Exception delivery raised request; start over */
return 1;
}
if (vcpu->arch.shared->msr & MSR_WE) {
local_irq_enable();
kvm_vcpu_block(vcpu);
......@@ -610,6 +637,13 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
r = 0;
}
if (kvm_check_request(KVM_REQ_EPR_EXIT, vcpu)) {
vcpu->run->epr.epr = 0;
vcpu->arch.epr_needed = true;
vcpu->run->exit_reason = KVM_EXIT_EPR;
r = 0;
}
return r;
}
......@@ -945,6 +979,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
r = RESUME_GUEST;
break;
case BOOKE_INTERRUPT_ALIGNMENT:
kvmppc_core_queue_alignment(vcpu, vcpu->arch.fault_dear,
vcpu->arch.fault_esr);
r = RESUME_GUEST;
break;
#ifdef CONFIG_KVM_BOOKE_HV
case BOOKE_INTERRUPT_HV_SYSCALL:
if (!(vcpu->arch.shared->msr & MSR_PR)) {
......@@ -1388,6 +1428,11 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
&vcpu->arch.dbg_reg.dac[dac], sizeof(u64));
break;
}
case KVM_REG_PPC_EPR: {
u32 epr = get_guest_epr(vcpu);
r = put_user(epr, (u32 __user *)(long)reg->addr);
break;
}
#if defined(CONFIG_64BIT)
case KVM_REG_PPC_EPCR:
r = put_user(vcpu->arch.epcr, (u32 __user *)(long)reg->addr);
......@@ -1420,6 +1465,13 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
(u64 __user *)(long)reg->addr, sizeof(u64));
break;
}
case KVM_REG_PPC_EPR: {
u32 new_epr;
r = get_user(new_epr, (u32 __user *)(long)reg->addr);
if (!r)
kvmppc_set_epr(vcpu, new_epr);
break;
}
#if defined(CONFIG_64BIT)
case KVM_REG_PPC_EPCR: {
u32 new_epcr;
......@@ -1556,7 +1608,9 @@ int __init kvmppc_booke_init(void)
{
#ifndef CONFIG_KVM_BOOKE_HV
unsigned long ivor[16];
unsigned long *handler = kvmppc_booke_handler_addr;
unsigned long max_ivor = 0;
unsigned long handler_len;
int i;
/* We install our own exception handlers by hijacking IVPR. IVPR must
......@@ -1589,14 +1643,16 @@ int __init kvmppc_booke_init(void)
for (i = 0; i < 16; i++) {
if (ivor[i] > max_ivor)
max_ivor = ivor[i];
max_ivor = i;
handler_len = handler[i + 1] - handler[i];
memcpy((void *)kvmppc_booke_handlers + ivor[i],
kvmppc_handlers_start + i * kvmppc_handler_len,
kvmppc_handler_len);
(void *)handler[i], handler_len);
}
flush_icache_range(kvmppc_booke_handlers,
kvmppc_booke_handlers + max_ivor + kvmppc_handler_len);
handler_len = handler[max_ivor + 1] - handler[max_ivor];
flush_icache_range(kvmppc_booke_handlers, kvmppc_booke_handlers +
ivor[max_ivor] + handler_len);
#endif /* !BOOKE_HV */
return 0;
}
......
......@@ -65,6 +65,7 @@
(1 << BOOKE_IRQPRIO_CRITICAL))
extern unsigned long kvmppc_booke_handlers;
extern unsigned long kvmppc_booke_handler_addr[];
void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr);
void kvmppc_mmu_msr_notify(struct kvm_vcpu *vcpu, u32 old_msr);
......
......@@ -269,6 +269,9 @@ int kvmppc_booke_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
case SPRN_ESR:
*spr_val = vcpu->arch.shared->esr;
break;
case SPRN_EPR:
*spr_val = vcpu->arch.epr;
break;
case SPRN_CSRR0:
*spr_val = vcpu->arch.csrr0;
break;
......
......@@ -45,18 +45,21 @@
(1<<BOOKE_INTERRUPT_DEBUG))
#define NEED_DEAR_MASK ((1<<BOOKE_INTERRUPT_DATA_STORAGE) | \
(1<<BOOKE_INTERRUPT_DTLB_MISS))
(1<<BOOKE_INTERRUPT_DTLB_MISS) | \
(1<<BOOKE_INTERRUPT_ALIGNMENT))
#define NEED_ESR_MASK ((1<<BOOKE_INTERRUPT_DATA_STORAGE) | \
(1<<BOOKE_INTERRUPT_INST_STORAGE) | \
(1<<BOOKE_INTERRUPT_PROGRAM) | \
(1<<BOOKE_INTERRUPT_DTLB_MISS))
(1<<BOOKE_INTERRUPT_DTLB_MISS) | \
(1<<BOOKE_INTERRUPT_ALIGNMENT))
.macro KVM_HANDLER ivor_nr scratch srr0
_GLOBAL(kvmppc_handler_\ivor_nr)
/* Get pointer to vcpu and record exit number. */
mtspr \scratch , r4
mfspr r4, SPRN_SPRG_RVCPU
mfspr r4, SPRN_SPRG_THREAD
lwz r4, THREAD_KVM_VCPU(r4)
stw r3, VCPU_GPR(R3)(r4)
stw r5, VCPU_GPR(R5)(r4)
stw r6, VCPU_GPR(R6)(r4)
......@@ -73,6 +76,14 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
bctr
.endm
.macro KVM_HANDLER_ADDR ivor_nr
.long kvmppc_handler_\ivor_nr
.endm
.macro KVM_HANDLER_END
.long kvmppc_handlers_end
.endm
_GLOBAL(kvmppc_handlers_start)
KVM_HANDLER BOOKE_INTERRUPT_CRITICAL SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
KVM_HANDLER BOOKE_INTERRUPT_MACHINE_CHECK SPRN_SPRG_RSCRATCH_MC SPRN_MCSRR0
......@@ -93,9 +104,7 @@ KVM_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND SPRN_SPRG_RSCRATCH0 SPRN_SRR0
_GLOBAL(kvmppc_handler_len)
.long kvmppc_handler_1 - kvmppc_handler_0
_GLOBAL(kvmppc_handlers_end)
/* Registers:
* SPRG_SCRATCH0: guest r4
......@@ -402,9 +411,6 @@ lightweight_exit:
lwz r8, kvmppc_booke_handlers@l(r8)
mtspr SPRN_IVPR, r8
/* Save vcpu pointer for the exception handlers. */
mtspr SPRN_SPRG_WVCPU, r4
lwz r5, VCPU_SHARED(r4)
/* Can't switch the stack pointer until after IVPR is switched,
......@@ -463,6 +469,31 @@ lightweight_exit:
lwz r4, VCPU_GPR(R4)(r4)
rfi
.data
.align 4
.globl kvmppc_booke_handler_addr
kvmppc_booke_handler_addr:
KVM_HANDLER_ADDR BOOKE_INTERRUPT_CRITICAL
KVM_HANDLER_ADDR BOOKE_INTERRUPT_MACHINE_CHECK
KVM_HANDLER_ADDR BOOKE_INTERRUPT_DATA_STORAGE
KVM_HANDLER_ADDR BOOKE_INTERRUPT_INST_STORAGE
KVM_HANDLER_ADDR BOOKE_INTERRUPT_EXTERNAL
KVM_HANDLER_ADDR BOOKE_INTERRUPT_ALIGNMENT
KVM_HANDLER_ADDR BOOKE_INTERRUPT_PROGRAM
KVM_HANDLER_ADDR BOOKE_INTERRUPT_FP_UNAVAIL
KVM_HANDLER_ADDR BOOKE_INTERRUPT_SYSCALL
KVM_HANDLER_ADDR BOOKE_INTERRUPT_AP_UNAVAIL
KVM_HANDLER_ADDR BOOKE_INTERRUPT_DECREMENTER
KVM_HANDLER_ADDR BOOKE_INTERRUPT_FIT
KVM_HANDLER_ADDR BOOKE_INTERRUPT_WATCHDOG
KVM_HANDLER_ADDR BOOKE_INTERRUPT_DTLB_MISS
KVM_HANDLER_ADDR BOOKE_INTERRUPT_ITLB_MISS
KVM_HANDLER_ADDR BOOKE_INTERRUPT_DEBUG
KVM_HANDLER_ADDR BOOKE_INTERRUPT_SPE_UNAVAIL
KVM_HANDLER_ADDR BOOKE_INTERRUPT_SPE_FP_DATA
KVM_HANDLER_ADDR BOOKE_INTERRUPT_SPE_FP_ROUND
KVM_HANDLER_END /*Always keep this in end*/
#ifdef CONFIG_SPE
_GLOBAL(kvmppc_save_guest_spe)
cmpi 0,r3,0
......
......@@ -491,6 +491,9 @@ static int __init kvmppc_e500_init(void)
{
int r, i;
unsigned long ivor[3];
/* Process remaining handlers above the generic first 16 */
unsigned long *handler = &kvmppc_booke_handler_addr[16];
unsigned long handler_len;
unsigned long max_ivor = 0;
r = kvmppc_core_check_processor_compat();
......@@ -506,15 +509,16 @@ static int __init kvmppc_e500_init(void)
ivor[1] = mfspr(SPRN_IVOR33);
ivor[2] = mfspr(SPRN_IVOR34);
for (i = 0; i < 3; i++) {
if (ivor[i] > max_ivor)
max_ivor = ivor[i];
if (ivor[i] > ivor[max_ivor])
max_ivor = i;
handler_len = handler[i + 1] - handler[i];
memcpy((void *)kvmppc_booke_handlers + ivor[i],
kvmppc_handlers_start + (i + 16) * kvmppc_handler_len,
kvmppc_handler_len);
(void *)handler[i], handler_len);
}
flush_icache_range(kvmppc_booke_handlers,
kvmppc_booke_handlers + max_ivor + kvmppc_handler_len);
handler_len = handler[max_ivor + 1] - handler[max_ivor];
flush_icache_range(kvmppc_booke_handlers, kvmppc_booke_handlers +
ivor[max_ivor] + handler_len);
return kvm_init(NULL, sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
}
......
......@@ -28,6 +28,7 @@
#define E500_TLB_VALID 1
#define E500_TLB_BITMAP 2
#define E500_TLB_TLB0 (1 << 2)
struct tlbe_ref {
pfn_t pfn;
......
/*
* Copyright (C) 2008-2013 Freescale Semiconductor, Inc. All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License, version 2, as
* published by the Free Software Foundation.
*/
#ifndef KVM_E500_MMU_HOST_H
#define KVM_E500_MMU_HOST_H
void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
int esel);
int e500_mmu_host_init(struct kvmppc_vcpu_e500 *vcpu_e500);
void e500_mmu_host_uninit(struct kvmppc_vcpu_e500 *vcpu_e500);
#endif /* KVM_E500_MMU_HOST_H */
......@@ -150,8 +150,6 @@ static int kvmppc_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
case SPRN_TBWL: break;
case SPRN_TBWU: break;
case SPRN_MSSSR0: break;
case SPRN_DEC:
vcpu->arch.dec = spr_val;
kvmppc_emulate_dec(vcpu);
......@@ -202,9 +200,6 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt)
case SPRN_PIR:
spr_val = vcpu->vcpu_id;
break;
case SPRN_MSSSR0:
spr_val = 0;
break;
/* Note: mftb and TBRL/TBWL are user-accessible, so
* the guest can always access the real TB anyways.
......
......@@ -237,7 +237,8 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu)
r = RESUME_HOST;
break;
default:
BUG();
WARN_ON(1);
r = RESUME_GUEST;
}
return r;
......@@ -305,6 +306,7 @@ int kvm_dev_ioctl_check_extension(long ext)
#ifdef CONFIG_BOOKE
case KVM_CAP_PPC_BOOKE_SREGS:
case KVM_CAP_PPC_BOOKE_WATCHDOG:
case KVM_CAP_PPC_EPR:
#else
case KVM_CAP_PPC_SEGSTATE:
case KVM_CAP_PPC_HIOR:
......@@ -412,7 +414,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
int user_alloc)
bool user_alloc)
{
return kvmppc_core_prepare_memory_region(kvm, memslot, mem);
}
......@@ -420,7 +422,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
int user_alloc)
bool user_alloc)
{
kvmppc_core_commit_memory_region(kvm, mem, old);
}
......@@ -720,6 +722,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
for (i = 0; i < 9; ++i)
kvmppc_set_gpr(vcpu, 4 + i, run->papr_hcall.args[i]);
vcpu->arch.hcall_needed = 0;
#ifdef CONFIG_BOOKE
} else if (vcpu->arch.epr_needed) {
kvmppc_set_epr(vcpu, run->epr.epr);
vcpu->arch.epr_needed = 0;
#endif
}
r = kvmppc_vcpu_run(run, vcpu);
......@@ -761,6 +768,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
r = 0;
vcpu->arch.papr_enabled = true;
break;
case KVM_CAP_PPC_EPR:
r = 0;
vcpu->arch.epr_enabled = cap->args[0];
break;
#ifdef CONFIG_BOOKE
case KVM_CAP_PPC_BOOKE_WATCHDOG:
r = 0;
......
......@@ -41,6 +41,7 @@ enum interruption_class {
IRQIO_CSC,
IRQIO_PCI,
IRQIO_MSI,
IRQIO_VIR,
NMI_NMI,
CPU_RST,
NR_ARCH_IRQS
......
......@@ -20,9 +20,7 @@
#include <asm/cpu.h>
#define KVM_MAX_VCPUS 64
#define KVM_MEMORY_SLOTS 32
/* memory slots that does not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 4
#define KVM_USER_MEM_SLOTS 32
struct sca_entry {
atomic_t scn;
......@@ -76,8 +74,11 @@ struct kvm_s390_sie_block {
__u64 epoch; /* 0x0038 */
__u8 reserved40[4]; /* 0x0040 */
#define LCTL_CR0 0x8000
#define LCTL_CR6 0x0200
#define LCTL_CR14 0x0002
__u16 lctl; /* 0x0044 */
__s16 icpua; /* 0x0046 */
#define ICTL_LPSW 0x00400000
__u32 ictl; /* 0x0048 */
__u32 eca; /* 0x004c */
__u8 icptcode; /* 0x0050 */
......@@ -127,6 +128,7 @@ struct kvm_vcpu_stat {
u32 deliver_prefix_signal;
u32 deliver_restart_signal;
u32 deliver_program_int;
u32 deliver_io_int;
u32 exit_wait_state;
u32 instruction_stidp;
u32 instruction_spx;
......@@ -187,6 +189,11 @@ struct kvm_s390_emerg_info {
__u16 code;
};
struct kvm_s390_mchk_info {
__u64 cr14;
__u64 mcic;
};
struct kvm_s390_interrupt_info {
struct list_head list;
u64 type;
......@@ -197,6 +204,7 @@ struct kvm_s390_interrupt_info {
struct kvm_s390_emerg_info emerg;
struct kvm_s390_extcall_info extcall;
struct kvm_s390_prefix_info prefix;
struct kvm_s390_mchk_info mchk;
};
};
......@@ -254,6 +262,7 @@ struct kvm_arch{
debug_info_t *dbf;
struct kvm_s390_float_interrupt float_int;
struct gmap *gmap;
int css_support;
};
extern int sie64a(struct kvm_s390_sie_block *, u64 *);
......
......@@ -81,6 +81,7 @@ static const struct irq_class irqclass_sub_desc[NR_ARCH_IRQS] = {
[IRQIO_CSC] = {.name = "CSC", .desc = "[I/O] CHSC Subchannel"},
[IRQIO_PCI] = {.name = "PCI", .desc = "[I/O] PCI Interrupt" },
[IRQIO_MSI] = {.name = "MSI", .desc = "[I/O] MSI Interrupt" },
[IRQIO_VIR] = {.name = "VIR", .desc = "[I/O] Virtual I/O Devices"},
[NMI_NMI] = {.name = "NMI", .desc = "[NMI] Machine Check"},
[CPU_RST] = {.name = "RST", .desc = "[CPU] CPU Restart"},
};
......
......@@ -26,27 +26,20 @@ static int handle_lctlg(struct kvm_vcpu *vcpu)
{
int reg1 = (vcpu->arch.sie_block->ipa & 0x00f0) >> 4;
int reg3 = vcpu->arch.sie_block->ipa & 0x000f;
int base2 = vcpu->arch.sie_block->ipb >> 28;
int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16) +
((vcpu->arch.sie_block->ipb & 0xff00) << 4);
u64 useraddr;
int reg, rc;
vcpu->stat.instruction_lctlg++;
if ((vcpu->arch.sie_block->ipb & 0xff) != 0x2f)
return -EOPNOTSUPP;
useraddr = disp2;
if (base2)
useraddr += vcpu->run->s.regs.gprs[base2];
useraddr = kvm_s390_get_base_disp_rsy(vcpu);
if (useraddr & 7)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
reg = reg1;
VCPU_EVENT(vcpu, 5, "lctlg r1:%x, r3:%x,b2:%x,d2:%x", reg1, reg3, base2,
disp2);
VCPU_EVENT(vcpu, 5, "lctlg r1:%x, r3:%x, addr:%llx", reg1, reg3,
useraddr);
trace_kvm_s390_handle_lctl(vcpu, 1, reg1, reg3, useraddr);
do {
......@@ -68,23 +61,19 @@ static int handle_lctl(struct kvm_vcpu *vcpu)
{
int reg1 = (vcpu->arch.sie_block->ipa & 0x00f0) >> 4;
int reg3 = vcpu->arch.sie_block->ipa & 0x000f;
int base2 = vcpu->arch.sie_block->ipb >> 28;
int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16);
u64 useraddr;
u32 val = 0;
int reg, rc;
vcpu->stat.instruction_lctl++;
useraddr = disp2;
if (base2)
useraddr += vcpu->run->s.regs.gprs[base2];
useraddr = kvm_s390_get_base_disp_rs(vcpu);
if (useraddr & 3)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
VCPU_EVENT(vcpu, 5, "lctl r1:%x, r3:%x,b2:%x,d2:%x", reg1, reg3, base2,
disp2);
VCPU_EVENT(vcpu, 5, "lctl r1:%x, r3:%x, addr:%llx", reg1, reg3,
useraddr);
trace_kvm_s390_handle_lctl(vcpu, 0, reg1, reg3, useraddr);
reg = reg1;
......@@ -104,14 +93,31 @@ static int handle_lctl(struct kvm_vcpu *vcpu)
return 0;
}
static intercept_handler_t instruction_handlers[256] = {
static const intercept_handler_t eb_handlers[256] = {
[0x2f] = handle_lctlg,
[0x8a] = kvm_s390_handle_priv_eb,
};
static int handle_eb(struct kvm_vcpu *vcpu)
{
intercept_handler_t handler;
handler = eb_handlers[vcpu->arch.sie_block->ipb & 0xff];
if (handler)
return handler(vcpu);
return -EOPNOTSUPP;
}
static const intercept_handler_t instruction_handlers[256] = {
[0x01] = kvm_s390_handle_01,
[0x82] = kvm_s390_handle_lpsw,
[0x83] = kvm_s390_handle_diag,
[0xae] = kvm_s390_handle_sigp,
[0xb2] = kvm_s390_handle_b2,
[0xb7] = handle_lctl,
[0xb9] = kvm_s390_handle_b9,
[0xe5] = kvm_s390_handle_e5,
[0xeb] = handle_lctlg,
[0xeb] = handle_eb,
};
static int handle_noop(struct kvm_vcpu *vcpu)
......@@ -258,6 +264,7 @@ static const intercept_handler_t intercept_funcs[] = {
[0x0C >> 2] = handle_instruction_and_prog,
[0x10 >> 2] = handle_noop,
[0x14 >> 2] = handle_noop,
[0x18 >> 2] = handle_noop,
[0x1C >> 2] = kvm_s390_handle_wait,
[0x20 >> 2] = handle_validity,
[0x28 >> 2] = handle_stop,
......
......@@ -21,11 +21,31 @@
#include "gaccess.h"
#include "trace-s390.h"
#define IOINT_SCHID_MASK 0x0000ffff
#define IOINT_SSID_MASK 0x00030000
#define IOINT_CSSID_MASK 0x03fc0000
#define IOINT_AI_MASK 0x04000000
static int is_ioint(u64 type)
{
return ((type & 0xfffe0000u) != 0xfffe0000u);
}
static int psw_extint_disabled(struct kvm_vcpu *vcpu)
{
return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_EXT);
}
static int psw_ioint_disabled(struct kvm_vcpu *vcpu)
{
return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_IO);
}
static int psw_mchk_disabled(struct kvm_vcpu *vcpu)
{
return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_MCHECK);
}
static int psw_interrupts_disabled(struct kvm_vcpu *vcpu)
{
if ((vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PER) ||
......@@ -35,6 +55,13 @@ static int psw_interrupts_disabled(struct kvm_vcpu *vcpu)
return 1;
}
static u64 int_word_to_isc_bits(u32 int_word)
{
u8 isc = (int_word & 0x38000000) >> 27;
return (0x80 >> isc) << 24;
}
static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
struct kvm_s390_interrupt_info *inti)
{
......@@ -67,7 +94,22 @@ static int __interrupt_is_deliverable(struct kvm_vcpu *vcpu,
case KVM_S390_SIGP_SET_PREFIX:
case KVM_S390_RESTART:
return 1;
case KVM_S390_MCHK:
if (psw_mchk_disabled(vcpu))
return 0;
if (vcpu->arch.sie_block->gcr[14] & inti->mchk.cr14)
return 1;
return 0;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
if (psw_ioint_disabled(vcpu))
return 0;
if (vcpu->arch.sie_block->gcr[6] &
int_word_to_isc_bits(inti->io.io_int_word))
return 1;
return 0;
default:
printk(KERN_WARNING "illegal interrupt type %llx\n",
inti->type);
BUG();
}
return 0;
......@@ -93,6 +135,7 @@ static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
CPUSTAT_IO_INT | CPUSTAT_EXT_INT | CPUSTAT_STOP_INT,
&vcpu->arch.sie_block->cpuflags);
vcpu->arch.sie_block->lctl = 0x0000;
vcpu->arch.sie_block->ictl &= ~ICTL_LPSW;
}
static void __set_cpuflag(struct kvm_vcpu *vcpu, u32 flag)
......@@ -116,6 +159,18 @@ static void __set_intercept_indicator(struct kvm_vcpu *vcpu,
case KVM_S390_SIGP_STOP:
__set_cpuflag(vcpu, CPUSTAT_STOP_INT);
break;
case KVM_S390_MCHK:
if (psw_mchk_disabled(vcpu))
vcpu->arch.sie_block->ictl |= ICTL_LPSW;
else
vcpu->arch.sie_block->lctl |= LCTL_CR14;
break;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
if (psw_ioint_disabled(vcpu))
__set_cpuflag(vcpu, CPUSTAT_IO_INT);
else
vcpu->arch.sie_block->lctl |= LCTL_CR6;
break;
default:
BUG();
}
......@@ -297,6 +352,73 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
exception = 1;
break;
case KVM_S390_MCHK:
VCPU_EVENT(vcpu, 4, "interrupt: machine check mcic=%llx",
inti->mchk.mcic);
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->mchk.cr14,
inti->mchk.mcic);
rc = kvm_s390_vcpu_store_status(vcpu,
KVM_S390_STORE_STATUS_PREFIXED);
if (rc == -EFAULT)
exception = 1;
rc = put_guest_u64(vcpu, __LC_MCCK_CODE, inti->mchk.mcic);
if (rc == -EFAULT)
exception = 1;
rc = copy_to_guest(vcpu, __LC_MCK_OLD_PSW,
&vcpu->arch.sie_block->gpsw, sizeof(psw_t));
if (rc == -EFAULT)
exception = 1;
rc = copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
__LC_MCK_NEW_PSW, sizeof(psw_t));
if (rc == -EFAULT)
exception = 1;
break;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
{
__u32 param0 = ((__u32)inti->io.subchannel_id << 16) |
inti->io.subchannel_nr;
__u64 param1 = ((__u64)inti->io.io_int_parm << 32) |
inti->io.io_int_word;
VCPU_EVENT(vcpu, 4, "interrupt: I/O %llx", inti->type);
vcpu->stat.deliver_io_int++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
param0, param1);
rc = put_guest_u16(vcpu, __LC_SUBCHANNEL_ID,
inti->io.subchannel_id);
if (rc == -EFAULT)
exception = 1;
rc = put_guest_u16(vcpu, __LC_SUBCHANNEL_NR,
inti->io.subchannel_nr);
if (rc == -EFAULT)
exception = 1;
rc = put_guest_u32(vcpu, __LC_IO_INT_PARM,
inti->io.io_int_parm);
if (rc == -EFAULT)
exception = 1;
rc = put_guest_u32(vcpu, __LC_IO_INT_WORD,
inti->io.io_int_word);
if (rc == -EFAULT)
exception = 1;
rc = copy_to_guest(vcpu, __LC_IO_OLD_PSW,
&vcpu->arch.sie_block->gpsw, sizeof(psw_t));
if (rc == -EFAULT)
exception = 1;
rc = copy_from_guest(vcpu, &vcpu->arch.sie_block->gpsw,
__LC_IO_NEW_PSW, sizeof(psw_t));
if (rc == -EFAULT)
exception = 1;
break;
}
default:
BUG();
}
......@@ -518,6 +640,61 @@ void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
}
}
void kvm_s390_deliver_pending_machine_checks(struct kvm_vcpu *vcpu)
{
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
struct kvm_s390_float_interrupt *fi = vcpu->arch.local_int.float_int;
struct kvm_s390_interrupt_info *n, *inti = NULL;
int deliver;
__reset_intercept_indicators(vcpu);
if (atomic_read(&li->active)) {
do {
deliver = 0;
spin_lock_bh(&li->lock);
list_for_each_entry_safe(inti, n, &li->list, list) {
if ((inti->type == KVM_S390_MCHK) &&
__interrupt_is_deliverable(vcpu, inti)) {
list_del(&inti->list);
deliver = 1;
break;
}
__set_intercept_indicator(vcpu, inti);
}
if (list_empty(&li->list))
atomic_set(&li->active, 0);
spin_unlock_bh(&li->lock);
if (deliver) {
__do_deliver_interrupt(vcpu, inti);
kfree(inti);
}
} while (deliver);
}
if (atomic_read(&fi->active)) {
do {
deliver = 0;
spin_lock(&fi->lock);
list_for_each_entry_safe(inti, n, &fi->list, list) {
if ((inti->type == KVM_S390_MCHK) &&
__interrupt_is_deliverable(vcpu, inti)) {
list_del(&inti->list);
deliver = 1;
break;
}
__set_intercept_indicator(vcpu, inti);
}
if (list_empty(&fi->list))
atomic_set(&fi->active, 0);
spin_unlock(&fi->lock);
if (deliver) {
__do_deliver_interrupt(vcpu, inti);
kfree(inti);
}
} while (deliver);
}
}
int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code)
{
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
......@@ -540,12 +717,50 @@ int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code)
return 0;
}
struct kvm_s390_interrupt_info *kvm_s390_get_io_int(struct kvm *kvm,
u64 cr6, u64 schid)
{
struct kvm_s390_float_interrupt *fi;
struct kvm_s390_interrupt_info *inti, *iter;
if ((!schid && !cr6) || (schid && cr6))
return NULL;
mutex_lock(&kvm->lock);
fi = &kvm->arch.float_int;
spin_lock(&fi->lock);
inti = NULL;
list_for_each_entry(iter, &fi->list, list) {
if (!is_ioint(iter->type))
continue;
if (cr6 &&
((cr6 & int_word_to_isc_bits(iter->io.io_int_word)) == 0))
continue;
if (schid) {
if (((schid & 0x00000000ffff0000) >> 16) !=
iter->io.subchannel_id)
continue;
if ((schid & 0x000000000000ffff) !=
iter->io.subchannel_nr)
continue;
}
inti = iter;
break;
}
if (inti)
list_del_init(&inti->list);
if (list_empty(&fi->list))
atomic_set(&fi->active, 0);
spin_unlock(&fi->lock);
mutex_unlock(&kvm->lock);
return inti;
}
int kvm_s390_inject_vm(struct kvm *kvm,
struct kvm_s390_interrupt *s390int)
{
struct kvm_s390_local_interrupt *li;
struct kvm_s390_float_interrupt *fi;
struct kvm_s390_interrupt_info *inti;
struct kvm_s390_interrupt_info *inti, *iter;
int sigcpu;
inti = kzalloc(sizeof(*inti), GFP_KERNEL);
......@@ -569,6 +784,29 @@ int kvm_s390_inject_vm(struct kvm *kvm,
case KVM_S390_SIGP_STOP:
case KVM_S390_INT_EXTERNAL_CALL:
case KVM_S390_INT_EMERGENCY:
kfree(inti);
return -EINVAL;
case KVM_S390_MCHK:
VM_EVENT(kvm, 5, "inject: machine check parm64:%llx",
s390int->parm64);
inti->type = s390int->type;
inti->mchk.cr14 = s390int->parm; /* upper bits are not used */
inti->mchk.mcic = s390int->parm64;
break;
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
if (s390int->type & IOINT_AI_MASK)
VM_EVENT(kvm, 5, "%s", "inject: I/O (AI)");
else
VM_EVENT(kvm, 5, "inject: I/O css %x ss %x schid %04x",
s390int->type & IOINT_CSSID_MASK,
s390int->type & IOINT_SSID_MASK,
s390int->type & IOINT_SCHID_MASK);
inti->type = s390int->type;
inti->io.subchannel_id = s390int->parm >> 16;
inti->io.subchannel_nr = s390int->parm & 0x0000ffffu;
inti->io.io_int_parm = s390int->parm64 >> 32;
inti->io.io_int_word = s390int->parm64 & 0x00000000ffffffffull;
break;
default:
kfree(inti);
return -EINVAL;
......@@ -579,7 +817,22 @@ int kvm_s390_inject_vm(struct kvm *kvm,
mutex_lock(&kvm->lock);
fi = &kvm->arch.float_int;
spin_lock(&fi->lock);
list_add_tail(&inti->list, &fi->list);
if (!is_ioint(inti->type))
list_add_tail(&inti->list, &fi->list);
else {
u64 isc_bits = int_word_to_isc_bits(inti->io.io_int_word);
/* Keep I/O interrupts sorted in isc order. */
list_for_each_entry(iter, &fi->list, list) {
if (!is_ioint(iter->type))
continue;
if (int_word_to_isc_bits(iter->io.io_int_word)
<= isc_bits)
continue;
break;
}
list_add_tail(&inti->list, &iter->list);
}
atomic_set(&fi->active, 1);
sigcpu = find_first_bit(fi->idle_mask, KVM_MAX_VCPUS);
if (sigcpu == KVM_MAX_VCPUS) {
......@@ -651,8 +904,15 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
inti->type = s390int->type;
inti->emerg.code = s390int->parm;
break;
case KVM_S390_MCHK:
VCPU_EVENT(vcpu, 5, "inject: machine check parm64:%llx",
s390int->parm64);
inti->type = s390int->type;
inti->mchk.mcic = s390int->parm64;
break;
case KVM_S390_INT_VIRTIO:
case KVM_S390_INT_SERVICE:
case KVM_S390_INT_IO_MIN...KVM_S390_INT_IO_MAX:
default:
kfree(inti);
return -EINVAL;
......
......@@ -140,6 +140,8 @@ int kvm_dev_ioctl_check_extension(long ext)
#endif
case KVM_CAP_SYNC_REGS:
case KVM_CAP_ONE_REG:
case KVM_CAP_ENABLE_CAP:
case KVM_CAP_S390_CSS_SUPPORT:
r = 1;
break;
case KVM_CAP_NR_VCPUS:
......@@ -234,6 +236,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
if (!kvm->arch.gmap)
goto out_nogmap;
}
kvm->arch.css_support = 0;
return 0;
out_nogmap:
debug_unregister(kvm->arch.dbf);
......@@ -659,6 +664,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
case KVM_EXIT_INTR:
case KVM_EXIT_S390_RESET:
case KVM_EXIT_S390_UCONTROL:
case KVM_EXIT_S390_TSCH:
break;
default:
BUG();
......@@ -766,6 +772,14 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr)
} else
prefix = 0;
/*
* The guest FPRS and ACRS are in the host FPRS/ACRS due to the lazy
* copying in vcpu load/put. Lets update our copies before we save
* it into the save area
*/
save_fp_regs(&vcpu->arch.guest_fpregs);
save_access_regs(vcpu->run->s.regs.acrs);
if (__guestcopy(vcpu, addr + offsetof(struct save_area, fp_regs),
vcpu->arch.guest_fpregs.fprs, 128, prefix))
return -EFAULT;
......@@ -810,6 +824,29 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr)
return 0;
}
static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
struct kvm_enable_cap *cap)
{
int r;
if (cap->flags)
return -EINVAL;
switch (cap->cap) {
case KVM_CAP_S390_CSS_SUPPORT:
if (!vcpu->kvm->arch.css_support) {
vcpu->kvm->arch.css_support = 1;
trace_kvm_s390_enable_css(vcpu->kvm);
}
r = 0;
break;
default:
r = -EINVAL;
break;
}
return r;
}
long kvm_arch_vcpu_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
......@@ -896,6 +933,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
r = 0;
break;
}
case KVM_ENABLE_CAP:
{
struct kvm_enable_cap cap;
r = -EFAULT;
if (copy_from_user(&cap, argp, sizeof(cap)))
break;
r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
break;
}
default:
r = -ENOTTY;
}
......@@ -930,7 +976,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
int user_alloc)
bool user_alloc)
{
/* A few sanity checks. We can have exactly one memory slot which has
to start at guest virtual zero and which has to be located at a
......@@ -960,7 +1006,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
int user_alloc)
bool user_alloc)
{
int rc;
......
......@@ -65,21 +65,67 @@ static inline void kvm_s390_set_prefix(struct kvm_vcpu *vcpu, u32 prefix)
vcpu->arch.sie_block->ihcpu = 0xffff;
}
static inline u64 kvm_s390_get_base_disp_s(struct kvm_vcpu *vcpu)
{
u32 base2 = vcpu->arch.sie_block->ipb >> 28;
u32 disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16);
return (base2 ? vcpu->run->s.regs.gprs[base2] : 0) + disp2;
}
static inline void kvm_s390_get_base_disp_sse(struct kvm_vcpu *vcpu,
u64 *address1, u64 *address2)
{
u32 base1 = (vcpu->arch.sie_block->ipb & 0xf0000000) >> 28;
u32 disp1 = (vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16;
u32 base2 = (vcpu->arch.sie_block->ipb & 0xf000) >> 12;
u32 disp2 = vcpu->arch.sie_block->ipb & 0x0fff;
*address1 = (base1 ? vcpu->run->s.regs.gprs[base1] : 0) + disp1;
*address2 = (base2 ? vcpu->run->s.regs.gprs[base2] : 0) + disp2;
}
static inline u64 kvm_s390_get_base_disp_rsy(struct kvm_vcpu *vcpu)
{
u32 base2 = vcpu->arch.sie_block->ipb >> 28;
u32 disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16) +
((vcpu->arch.sie_block->ipb & 0xff00) << 4);
/* The displacement is a 20bit _SIGNED_ value */
if (disp2 & 0x80000)
disp2+=0xfff00000;
return (base2 ? vcpu->run->s.regs.gprs[base2] : 0) + (long)(int)disp2;
}
static inline u64 kvm_s390_get_base_disp_rs(struct kvm_vcpu *vcpu)
{
u32 base2 = vcpu->arch.sie_block->ipb >> 28;
u32 disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16);
return (base2 ? vcpu->run->s.regs.gprs[base2] : 0) + disp2;
}
int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer);
void kvm_s390_tasklet(unsigned long parm);
void kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu);
void kvm_s390_deliver_pending_machine_checks(struct kvm_vcpu *vcpu);
int kvm_s390_inject_vm(struct kvm *kvm,
struct kvm_s390_interrupt *s390int);
int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
struct kvm_s390_interrupt *s390int);
int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code);
int kvm_s390_inject_sigp_stop(struct kvm_vcpu *vcpu, int action);
struct kvm_s390_interrupt_info *kvm_s390_get_io_int(struct kvm *kvm,
u64 cr6, u64 schid);
/* implemented in priv.c */
int kvm_s390_handle_b2(struct kvm_vcpu *vcpu);
int kvm_s390_handle_e5(struct kvm_vcpu *vcpu);
int kvm_s390_handle_01(struct kvm_vcpu *vcpu);
int kvm_s390_handle_b9(struct kvm_vcpu *vcpu);
int kvm_s390_handle_lpsw(struct kvm_vcpu *vcpu);
int kvm_s390_handle_priv_eb(struct kvm_vcpu *vcpu);
/* implemented in sigp.c */
int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu);
......
......@@ -137,8 +137,10 @@ static int __inject_sigp_stop(struct kvm_s390_local_interrupt *li, int action)
inti->type = KVM_S390_SIGP_STOP;
spin_lock_bh(&li->lock);
if ((atomic_read(li->cpuflags) & CPUSTAT_STOPPED))
if ((atomic_read(li->cpuflags) & CPUSTAT_STOPPED)) {
kfree(inti);
goto out;
}
list_add_tail(&inti->list, &li->list);
atomic_set(&li->active, 1);
atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags);
......@@ -324,8 +326,6 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
{
int r1 = (vcpu->arch.sie_block->ipa & 0x00f0) >> 4;
int r3 = vcpu->arch.sie_block->ipa & 0x000f;
int base2 = vcpu->arch.sie_block->ipb >> 28;
int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16);
u32 parameter;
u16 cpu_addr = vcpu->run->s.regs.gprs[r3];
u8 order_code;
......@@ -336,9 +336,7 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
return kvm_s390_inject_program_int(vcpu,
PGM_PRIVILEGED_OPERATION);
order_code = disp2;
if (base2)
order_code += vcpu->run->s.regs.gprs[base2];
order_code = kvm_s390_get_base_disp_rs(vcpu);
if (r1 % 2)
parameter = vcpu->run->s.regs.gprs[r1];
......
......@@ -141,13 +141,13 @@ TRACE_EVENT(kvm_s390_inject_vcpu,
* Trace point for the actual delivery of interrupts.
*/
TRACE_EVENT(kvm_s390_deliver_interrupt,
TP_PROTO(unsigned int id, __u64 type, __u32 data0, __u64 data1),
TP_PROTO(unsigned int id, __u64 type, __u64 data0, __u64 data1),
TP_ARGS(id, type, data0, data1),
TP_STRUCT__entry(
__field(int, id)
__field(__u32, inttype)
__field(__u32, data0)
__field(__u64, data0)
__field(__u64, data1)
),
......@@ -159,7 +159,7 @@ TRACE_EVENT(kvm_s390_deliver_interrupt,
),
TP_printk("deliver interrupt (vcpu %d): type:%x (%s) " \
"data:%08x %016llx",
"data:%08llx %016llx",
__entry->id, __entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->data0, __entry->data1)
......@@ -204,6 +204,26 @@ TRACE_EVENT(kvm_s390_stop_request,
);
/*
* Trace point for enabling channel I/O instruction support.
*/
TRACE_EVENT(kvm_s390_enable_css,
TP_PROTO(void *kvm),
TP_ARGS(kvm),
TP_STRUCT__entry(
__field(void *, kvm)
),
TP_fast_assign(
__entry->kvm = kvm;
),
TP_printk("enabling channel I/O support (kvm @ %p)\n",
__entry->kvm)
);
#endif /* _TRACE_KVMS390_H */
/* This part must be outside protection */
......
......@@ -33,10 +33,10 @@
#define KVM_MAX_VCPUS 254
#define KVM_SOFT_MAX_VCPUS 160
#define KVM_MEMORY_SLOTS 32
/* memory slots that does not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 4
#define KVM_MEM_SLOTS_NUM (KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS)
#define KVM_USER_MEM_SLOTS 125
/* memory slots that are not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 3
#define KVM_MEM_SLOTS_NUM (KVM_USER_MEM_SLOTS + KVM_PRIVATE_MEM_SLOTS)
#define KVM_MMIO_SIZE 16
......@@ -219,11 +219,6 @@ struct kvm_mmu_page {
u64 *spt;
/* hold the gfn of each spte inside spt */
gfn_t *gfns;
/*
* One bit set per slot which has memory
* in this shadow page.
*/
DECLARE_BITMAP(slot_bitmap, KVM_MEM_SLOTS_NUM);
bool unsync;
int root_count; /* Currently serving as active root */
unsigned int unsync_children;
......@@ -502,6 +497,13 @@ struct kvm_vcpu_arch {
u64 msr_val;
struct gfn_to_hva_cache data;
} pv_eoi;
/*
* Indicates whether the access faulted on the guest's own page tables;
* set while fixing a page fault and used to detect an otherwise
* unhandleable instruction.
*/
bool write_fault_to_shadow_pgtable;
};
struct kvm_lpage_info {
......@@ -697,6 +699,11 @@ struct kvm_x86_ops {
void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
void (*enable_irq_window)(struct kvm_vcpu *vcpu);
void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
int (*vm_has_apicv)(struct kvm *kvm);
void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
void (*hwapic_isr_update)(struct kvm *kvm, int isr);
void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
int (*get_tdp_level)(void);
u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
......@@ -991,6 +998,7 @@ int kvm_age_hva(struct kvm *kvm, unsigned long hva);
int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v);
int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
......
......@@ -27,7 +27,7 @@ static inline bool kvm_check_and_clear_guest_paused(void)
*
* Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively.
* The hypercall number should be placed in rax and the return value will be
* placed in rax. No other registers will be clobbered unless explicited
* placed in rax. No other registers will be clobbered unless explicitly
* noted by the particular hypercall.
*/
......
......@@ -57,9 +57,12 @@
#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001
#define SECONDARY_EXEC_ENABLE_EPT 0x00000002
#define SECONDARY_EXEC_RDTSCP 0x00000008
#define SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE 0x00000010
#define SECONDARY_EXEC_ENABLE_VPID 0x00000020
#define SECONDARY_EXEC_WBINVD_EXITING 0x00000040
#define SECONDARY_EXEC_UNRESTRICTED_GUEST 0x00000080
#define SECONDARY_EXEC_APIC_REGISTER_VIRT 0x00000100
#define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY 0x00000200
#define SECONDARY_EXEC_PAUSE_LOOP_EXITING 0x00000400
#define SECONDARY_EXEC_ENABLE_INVPCID 0x00001000
......@@ -97,6 +100,7 @@ enum vmcs_field {
GUEST_GS_SELECTOR = 0x0000080a,
GUEST_LDTR_SELECTOR = 0x0000080c,
GUEST_TR_SELECTOR = 0x0000080e,
GUEST_INTR_STATUS = 0x00000810,
HOST_ES_SELECTOR = 0x00000c00,
HOST_CS_SELECTOR = 0x00000c02,
HOST_SS_SELECTOR = 0x00000c04,
......@@ -124,6 +128,14 @@ enum vmcs_field {
APIC_ACCESS_ADDR_HIGH = 0x00002015,
EPT_POINTER = 0x0000201a,
EPT_POINTER_HIGH = 0x0000201b,
EOI_EXIT_BITMAP0 = 0x0000201c,
EOI_EXIT_BITMAP0_HIGH = 0x0000201d,
EOI_EXIT_BITMAP1 = 0x0000201e,
EOI_EXIT_BITMAP1_HIGH = 0x0000201f,
EOI_EXIT_BITMAP2 = 0x00002020,
EOI_EXIT_BITMAP2_HIGH = 0x00002021,
EOI_EXIT_BITMAP3 = 0x00002022,
EOI_EXIT_BITMAP3_HIGH = 0x00002023,
GUEST_PHYSICAL_ADDRESS = 0x00002400,
GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401,
VMCS_LINK_POINTER = 0x00002800,
......@@ -346,9 +358,9 @@ enum vmcs_field {
#define AR_RESERVD_MASK 0xfffe0f00
#define TSS_PRIVATE_MEMSLOT (KVM_MEMORY_SLOTS + 0)
#define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (KVM_MEMORY_SLOTS + 1)
#define IDENTITY_PAGETABLE_PRIVATE_MEMSLOT (KVM_MEMORY_SLOTS + 2)
#define TSS_PRIVATE_MEMSLOT (KVM_USER_MEM_SLOTS + 0)
#define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (KVM_USER_MEM_SLOTS + 1)
#define IDENTITY_PAGETABLE_PRIVATE_MEMSLOT (KVM_USER_MEM_SLOTS + 2)
#define VMX_NR_VPIDS (1 << 16)
#define VMX_VPID_EXTENT_SINGLE_CONTEXT 1
......
......@@ -62,10 +62,12 @@
#define EXIT_REASON_MCE_DURING_VMENTRY 41
#define EXIT_REASON_TPR_BELOW_THRESHOLD 43
#define EXIT_REASON_APIC_ACCESS 44
#define EXIT_REASON_EOI_INDUCED 45
#define EXIT_REASON_EPT_VIOLATION 48
#define EXIT_REASON_EPT_MISCONFIG 49
#define EXIT_REASON_WBINVD 54
#define EXIT_REASON_XSETBV 55
#define EXIT_REASON_APIC_WRITE 56
#define EXIT_REASON_INVPCID 58
#define VMX_EXIT_REASONS \
......@@ -103,7 +105,12 @@
{ EXIT_REASON_APIC_ACCESS, "APIC_ACCESS" }, \
{ EXIT_REASON_EPT_VIOLATION, "EPT_VIOLATION" }, \
{ EXIT_REASON_EPT_MISCONFIG, "EPT_MISCONFIG" }, \
{ EXIT_REASON_WBINVD, "WBINVD" }
{ EXIT_REASON_WBINVD, "WBINVD" }, \
{ EXIT_REASON_APIC_WRITE, "APIC_WRITE" }, \
{ EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \
{ EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \
{ EXIT_REASON_INVD, "INVD" }, \
{ EXIT_REASON_INVPCID, "INVPCID" }
#endif /* _UAPIVMX_H */
......@@ -218,6 +218,9 @@ static void kvm_shutdown(void)
void __init kvmclock_init(void)
{
unsigned long mem;
int size;
size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
if (!kvm_para_available())
return;
......@@ -231,16 +234,14 @@ void __init kvmclock_init(void)
printk(KERN_INFO "kvm-clock: Using msrs %x and %x",
msr_kvm_system_time, msr_kvm_wall_clock);
mem = memblock_alloc(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS,
PAGE_SIZE);
mem = memblock_alloc(size, PAGE_SIZE);
if (!mem)
return;
hv_clock = __va(mem);
if (kvm_register_clock("boot clock")) {
hv_clock = NULL;
memblock_free(mem,
sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
memblock_free(mem, size);
return;
}
pv_time_ops.sched_clock = kvm_clock_read;
......@@ -275,7 +276,7 @@ int __init kvm_setup_vsyscall_timeinfo(void)
struct pvclock_vcpu_time_info *vcpu_time;
unsigned int size;
size = sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS;
size = PAGE_ALIGN(sizeof(struct pvclock_vsyscall_time_info)*NR_CPUS);
preempt_disable();
cpu = smp_processor_id();
......
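The point of this change is that the per-CPU pvclock area is now sized with PAGE_ALIGN() once, and that same page-aligned size is used for the memblock_alloc(), the error-path memblock_free(), and the later vsyscall mapping, so they can no longer disagree. A small userspace-style sketch of the rounding, assuming a 4 KiB page; the byte count is a stand-in, not the real structure size:

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

int main(void)
{
	unsigned long raw = 2560;	/* stand-in for sizeof(info) * NR_CPUS */
	unsigned long aligned = PAGE_ALIGN(raw);

	/* Allocation and free would both use 'aligned', never 'raw'. */
	printf("raw=%lu aligned=%lu\n", raw, aligned);
	return 0;
}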
This diff is collapsed.
......@@ -122,7 +122,6 @@ static s64 __kpit_elapsed(struct kvm *kvm)
*/
remaining = hrtimer_get_remaining(&ps->timer);
elapsed = ps->period - ktime_to_ns(remaining);
elapsed = mod_64(elapsed, ps->period);
return elapsed;
}
......
......@@ -241,6 +241,8 @@ int kvm_pic_read_irq(struct kvm *kvm)
int irq, irq2, intno;
struct kvm_pic *s = pic_irqchip(kvm);
s->output = 0;
pic_lock(s);
irq = pic_get_irq(&s->pics[0]);
if (irq >= 0) {
......
......@@ -37,50 +37,82 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
/*
* check if there is a pending interrupt from a
* non-APIC source, without intack.
*/
static int kvm_cpu_has_extint(struct kvm_vcpu *v)
{
if (kvm_apic_accept_pic_intr(v))
return pic_irqchip(v->kvm)->output; /* PIC */
else
return 0;
}
/*
* check if there is an injectable interrupt:
* when virtual interrupt delivery is enabled,
* an interrupt from the APIC is handled by hardware,
* so we don't need to check it here.
*/
int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
{
if (!irqchip_in_kernel(v->kvm))
return v->arch.interrupt.pending;
if (kvm_cpu_has_extint(v))
return 1;
if (kvm_apic_vid_enabled(v->kvm))
return 0;
return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
}
/*
* check if there is a pending interrupt, without
* intack.
*/
int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
{
struct kvm_pic *s;
if (!irqchip_in_kernel(v->kvm))
return v->arch.interrupt.pending;
if (kvm_apic_has_interrupt(v) == -1) { /* LAPIC */
if (kvm_apic_accept_pic_intr(v)) {
s = pic_irqchip(v->kvm); /* PIC */
return s->output;
} else
return 0;
}
return 1;
if (kvm_cpu_has_extint(v))
return 1;
return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
}
EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
/*
* Read the pending interrupt vector (from a
* non-APIC source) and intack it.
*/
static int kvm_cpu_get_extint(struct kvm_vcpu *v)
{
if (kvm_cpu_has_extint(v))
return kvm_pic_read_irq(v->kvm); /* PIC */
return -1;
}
/*
* Read the pending interrupt vector and intack it.
*/
int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
{
struct kvm_pic *s;
int vector;
if (!irqchip_in_kernel(v->kvm))
return v->arch.interrupt.nr;
vector = kvm_get_apic_interrupt(v); /* APIC */
if (vector == -1) {
if (kvm_apic_accept_pic_intr(v)) {
s = pic_irqchip(v->kvm);
s->output = 0; /* PIC */
vector = kvm_pic_read_irq(v->kvm);
}
}
return vector;
vector = kvm_cpu_get_extint(v);
if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
return vector; /* PIC */
return kvm_get_apic_interrupt(v); /* APIC */
}
EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
{
......
......@@ -140,31 +140,56 @@ static inline int apic_enabled(struct kvm_lapic *apic)
(LVT_MASK | APIC_MODE_MASK | APIC_INPUT_POLARITY | \
APIC_LVT_REMOTE_IRR | APIC_LVT_LEVEL_TRIGGER)
static inline int apic_x2apic_mode(struct kvm_lapic *apic)
{
return apic->vcpu->arch.apic_base & X2APIC_ENABLE;
}
static inline int kvm_apic_id(struct kvm_lapic *apic)
{
return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
}
static inline u16 apic_cluster_id(struct kvm_apic_map *map, u32 ldr)
void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
struct kvm_lapic_irq *irq,
u64 *eoi_exit_bitmap)
{
u16 cid;
ldr >>= 32 - map->ldr_bits;
cid = (ldr >> map->cid_shift) & map->cid_mask;
struct kvm_lapic **dst;
struct kvm_apic_map *map;
unsigned long bitmap = 1;
int i;
BUG_ON(cid >= ARRAY_SIZE(map->logical_map));
rcu_read_lock();
map = rcu_dereference(vcpu->kvm->arch.apic_map);
return cid;
}
if (unlikely(!map)) {
__set_bit(irq->vector, (unsigned long *)eoi_exit_bitmap);
goto out;
}
static inline u16 apic_logical_id(struct kvm_apic_map *map, u32 ldr)
{
ldr >>= (32 - map->ldr_bits);
return ldr & map->lid_mask;
if (irq->dest_mode == 0) { /* physical mode */
if (irq->delivery_mode == APIC_DM_LOWEST ||
irq->dest_id == 0xff) {
__set_bit(irq->vector,
(unsigned long *)eoi_exit_bitmap);
goto out;
}
dst = &map->phys_map[irq->dest_id & 0xff];
} else {
u32 mda = irq->dest_id << (32 - map->ldr_bits);
dst = map->logical_map[apic_cluster_id(map, mda)];
bitmap = apic_logical_id(map, mda);
}
for_each_set_bit(i, &bitmap, 16) {
if (!dst[i])
continue;
if (dst[i]->vcpu == vcpu) {
__set_bit(irq->vector,
(unsigned long *)eoi_exit_bitmap);
break;
}
}
out:
rcu_read_unlock();
}
static void recalculate_apic_map(struct kvm *kvm)
......@@ -230,6 +255,8 @@ static void recalculate_apic_map(struct kvm *kvm)
if (old)
kfree_rcu(old, rcu);
kvm_ioapic_make_eoibitmap_request(kvm);
}
static inline void kvm_apic_set_id(struct kvm_lapic *apic, u8 id)
......@@ -345,6 +372,10 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
{
int result;
/*
* Note that irr_pending is just a hint. It will always be
* true when virtual interrupt delivery is enabled.
*/
if (!apic->irr_pending)
return -1;
......@@ -461,6 +492,8 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
static inline int apic_find_highest_isr(struct kvm_lapic *apic)
{
int result;
/* Note that isr_count is always 1 when virtual interrupt delivery is enabled */
if (!apic->isr_count)
return -1;
if (likely(apic->highest_isr_cache != -1))
......@@ -740,6 +773,19 @@ int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2)
return vcpu1->arch.apic_arb_prio - vcpu2->arch.apic_arb_prio;
}
static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
{
if (!(kvm_apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) &&
kvm_ioapic_handles_vector(apic->vcpu->kvm, vector)) {
int trigger_mode;
if (apic_test_vector(vector, apic->regs + APIC_TMR))
trigger_mode = IOAPIC_LEVEL_TRIG;
else
trigger_mode = IOAPIC_EDGE_TRIG;
kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
}
}
static int apic_set_eoi(struct kvm_lapic *apic)
{
int vector = apic_find_highest_isr(apic);
......@@ -756,19 +802,26 @@ static int apic_set_eoi(struct kvm_lapic *apic)
apic_clear_isr(vector, apic);
apic_update_ppr(apic);
if (!(kvm_apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) &&
kvm_ioapic_handles_vector(apic->vcpu->kvm, vector)) {
int trigger_mode;
if (apic_test_vector(vector, apic->regs + APIC_TMR))
trigger_mode = IOAPIC_LEVEL_TRIG;
else
trigger_mode = IOAPIC_EDGE_TRIG;
kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
}
kvm_ioapic_send_eoi(apic, vector);
kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
return vector;
}
/*
* This interface assumes a trap-like exit, which has already performed
* the desired side effects, including the vISR and vPPR updates.
*/
void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector)
{
struct kvm_lapic *apic = vcpu->arch.apic;
trace_kvm_eoi(apic, vector);
kvm_ioapic_send_eoi(apic, vector);
kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
}
EXPORT_SYMBOL_GPL(kvm_apic_set_eoi_accelerated);
static void apic_send_ipi(struct kvm_lapic *apic)
{
u32 icr_low = kvm_apic_get_reg(apic, APIC_ICR);
......@@ -1212,6 +1265,21 @@ void kvm_lapic_set_eoi(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
/* emulate an APIC access in a trap-like manner */
void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
{
u32 val = 0;
/* hardware has already done the conditional check and instruction decode */
offset &= 0xff0;
apic_reg_read(vcpu->arch.apic, offset, 4, &val);
/* TODO: optimize to emulate only the side effects, without the extra register write */
apic_reg_write(vcpu->arch.apic, offset, val);
}
EXPORT_SYMBOL_GPL(kvm_apic_write_nodecode);
void kvm_free_lapic(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
......@@ -1288,6 +1356,7 @@ u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu)
void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
{
u64 old_value = vcpu->arch.apic_base;
struct kvm_lapic *apic = vcpu->arch.apic;
if (!apic) {
......@@ -1309,11 +1378,16 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
value &= ~MSR_IA32_APICBASE_BSP;
vcpu->arch.apic_base = value;
if (apic_x2apic_mode(apic)) {
u32 id = kvm_apic_id(apic);
u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));
kvm_apic_set_ldr(apic, ldr);
if ((old_value ^ value) & X2APIC_ENABLE) {
if (value & X2APIC_ENABLE) {
u32 id = kvm_apic_id(apic);
u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));
kvm_apic_set_ldr(apic, ldr);
kvm_x86_ops->set_virtual_x2apic_mode(vcpu, true);
} else
kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false);
}
apic->base_address = apic->vcpu->arch.apic_base &
MSR_IA32_APICBASE_BASE;
......@@ -1359,8 +1433,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
apic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
apic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
}
apic->irr_pending = false;
apic->isr_count = 0;
apic->irr_pending = kvm_apic_vid_enabled(vcpu->kvm);
apic->isr_count = kvm_apic_vid_enabled(vcpu->kvm);
apic->highest_isr_cache = -1;
update_divide_count(apic);
atomic_set(&apic->lapic_timer.pending, 0);
......@@ -1575,8 +1649,10 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
update_divide_count(apic);
start_apic_timer(apic);
apic->irr_pending = true;
apic->isr_count = count_vectors(apic->regs + APIC_ISR);
apic->isr_count = kvm_apic_vid_enabled(vcpu->kvm) ?
1 : count_vectors(apic->regs + APIC_ISR);
apic->highest_isr_cache = -1;
kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
kvm_make_request(KVM_REQ_EVENT, vcpu);
}
......
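For reference, the x2APIC logical ID derived in kvm_lapic_set_base() above packs the cluster number into the high 16 bits and a one-hot CPU bit into the low 16 bits. A quick standalone check of the formula with an illustrative APIC ID:

#include <stdio.h>

int main(void)
{
	unsigned int id = 0x23;		/* illustrative x2APIC ID */
	unsigned int ldr = ((id >> 4) << 16) | (1u << (id & 0xf));

	/* cluster 0x2, bit 3 within the cluster -> 0x00020008 */
	printf("ldr=0x%08x\n", ldr);
	return 0;
}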
......@@ -64,6 +64,9 @@ int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu);
u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu);
void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data);
void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset);
void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector);
void kvm_lapic_set_vapic_addr(struct kvm_vcpu *vcpu, gpa_t vapic_addr);
void kvm_lapic_sync_from_vapic(struct kvm_vcpu *vcpu);
void kvm_lapic_sync_to_vapic(struct kvm_vcpu *vcpu);
......@@ -124,4 +127,35 @@ static inline int kvm_lapic_enabled(struct kvm_vcpu *vcpu)
return kvm_apic_present(vcpu) && kvm_apic_sw_enabled(vcpu->arch.apic);
}
static inline int apic_x2apic_mode(struct kvm_lapic *apic)
{
return apic->vcpu->arch.apic_base & X2APIC_ENABLE;
}
static inline bool kvm_apic_vid_enabled(struct kvm *kvm)
{
return kvm_x86_ops->vm_has_apicv(kvm);
}
static inline u16 apic_cluster_id(struct kvm_apic_map *map, u32 ldr)
{
u16 cid;
ldr >>= 32 - map->ldr_bits;
cid = (ldr >> map->cid_shift) & map->cid_mask;
BUG_ON(cid >= ARRAY_SIZE(map->logical_map));
return cid;
}
static inline u16 apic_logical_id(struct kvm_apic_map *map, u32 ldr)
{
ldr >>= (32 - map->ldr_bits);
return ldr & map->lid_mask;
}
void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
struct kvm_lapic_irq *irq,
u64 *eoi_bitmap);
#endif
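apic_cluster_id() and apic_logical_id(), now exposed from the header so the EOI-exitmap code can use them, simply slice the shifted LDR using the parameters cached in kvm_apic_map. A standalone sketch with purely illustrative parameter values (the real ones are computed in recalculate_apic_map(), which is not part of this hunk):

#include <stdio.h>

struct map_params {
	unsigned int ldr_bits, cid_shift, cid_mask, lid_mask;
};

int main(void)
{
	/* Illustrative, cluster-style parameters only. */
	struct map_params map = { .ldr_bits = 8, .cid_shift = 4,
				  .cid_mask = 0xf, .lid_mask = 0xf };
	unsigned int ldr = 0x23000000;	/* LDR as stored in the APIC register */
	unsigned int v = ldr >> (32 - map.ldr_bits);

	printf("cluster=%u logical=0x%x\n",
	       (v >> map.cid_shift) & map.cid_mask, v & map.lid_mask);
	return 0;
}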
This diff is collapsed.
......@@ -195,12 +195,6 @@ DEFINE_EVENT(kvm_mmu_page_class, kvm_mmu_prepare_zap_page,
TP_ARGS(sp)
);
DEFINE_EVENT(kvm_mmu_page_class, kvm_mmu_delay_free_pages,
TP_PROTO(struct kvm_mmu_page *sp),
TP_ARGS(sp)
);
TRACE_EVENT(
mark_mmio_spte,
TP_PROTO(u64 *sptep, gfn_t gfn, unsigned access),
......