Commit 01227a88 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Gleb Natapov:
 "Highlights of the updates are:

  general:
   - new emulated device API
   - legacy device assignment is now optional
   - irqfd interface is more generic and can be shared between arches

  x86:
   - VMCS shadow support and other nested VMX improvements
   - APIC virtualization and Posted Interrupt hardware support
   - Optimize mmio spte zapping

  ppc:
    - BookE: in-kernel MPIC emulation with irqfd support
    - Book3S: in-kernel XICS emulation (incomplete)
    - Book3S: HV: migration fixes
    - BookE: more debug support preparation
    - BookE: e6500 support

  ARM:
   - reworking of Hyp idmaps

  s390:
   - ioeventfd for virtio-ccw

  And many other bug fixes, cleanups and improvements"

* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  kvm: Add compat_ioctl for device control API
  KVM: x86: Account for failing enable_irq_window for NMI window request
  KVM: PPC: Book3S: Add API for in-kernel XICS emulation
  kvm/ppc/mpic: fix missing unlock in set_base_addr()
  kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
  kvm/ppc/mpic: remove users
  kvm/ppc/mpic: fix mmio region lists when multiple guests used
  kvm/ppc/mpic: remove default routes from documentation
  kvm: KVM_CAP_IOMMU only available with device assignment
  ARM: KVM: iterate over all CPUs for CPU compatibility check
  KVM: ARM: Fix spelling in error message
  ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
  KVM: ARM: Fix API documentation for ONE_REG encoding
  ARM: KVM: promote vfp_host pointer to generic host cpu context
  ARM: KVM: add architecture specific hook for capabilities
  ARM: KVM: perform HYP initilization for hotplugged CPUs
  ARM: KVM: switch to a dual-step HYP init code
  ARM: KVM: rework HYP page table freeing
  ARM: KVM: enforce maximum size for identity mapped code
  ARM: KVM: move to a KVM provided HYP idmap
  ...
parents 9e687946 db6ae615
......@@ -1486,15 +1486,23 @@ struct kvm_ioeventfd {
__u8 pad[36];
};
For the special case of virtio-ccw devices on s390, the ioevent is matched
to a subchannel/virtqueue tuple instead.
The following flags are defined:
#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN (1 << kvm_ioeventfd_flag_nr_deassign)
#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
(1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)
If datamatch flag is set, the event will be signaled only if the written value
to the registered address is equal to datamatch in struct kvm_ioeventfd.
For virtio-ccw devices, addr contains the subchannel id and datamatch the
virtqueue index.
4.60 KVM_DIRTY_TLB
......@@ -1780,27 +1788,48 @@ registers, find a list below:
PPC | KVM_REG_PPC_VPA_DTL | 128
PPC | KVM_REG_PPC_EPCR | 32
PPC | KVM_REG_PPC_EPR | 32
PPC | KVM_REG_PPC_TCR | 32
PPC | KVM_REG_PPC_TSR | 32
PPC | KVM_REG_PPC_OR_TSR | 32
PPC | KVM_REG_PPC_CLEAR_TSR | 32
PPC | KVM_REG_PPC_MAS0 | 32
PPC | KVM_REG_PPC_MAS1 | 32
PPC | KVM_REG_PPC_MAS2 | 64
PPC | KVM_REG_PPC_MAS7_3 | 64
PPC | KVM_REG_PPC_MAS4 | 32
PPC | KVM_REG_PPC_MAS6 | 32
PPC | KVM_REG_PPC_MMUCFG | 32
PPC | KVM_REG_PPC_TLB0CFG | 32
PPC | KVM_REG_PPC_TLB1CFG | 32
PPC | KVM_REG_PPC_TLB2CFG | 32
PPC | KVM_REG_PPC_TLB3CFG | 32
PPC | KVM_REG_PPC_TLB0PS | 32
PPC | KVM_REG_PPC_TLB1PS | 32
PPC | KVM_REG_PPC_TLB2PS | 32
PPC | KVM_REG_PPC_TLB3PS | 32
PPC | KVM_REG_PPC_EPTCFG | 32
PPC | KVM_REG_PPC_ICP_STATE | 64
ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:
ARM core registers have the following id bit patterns:
0x4002 0000 0010 <index into the kvm_regs struct:16>
0x4020 0000 0010 <index into the kvm_regs struct:16>
ARM 32-bit CP15 registers have the following id bit patterns:
0x4002 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
ARM 64-bit CP15 registers have the following id bit patterns:
0x4003 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
ARM CCSIDR registers are demultiplexed by CSSELR value:
0x4002 0000 0011 00 <csselr:8>
0x4020 0000 0011 00 <csselr:8>
ARM 32-bit VFP control registers have the following id bit patterns:
0x4002 0000 0012 1 <regno:12>
0x4020 0000 0012 1 <regno:12>
ARM 64-bit FP registers have the following id bit patterns:
0x4002 0000 0012 0 <regno:12>
0x4030 0000 0012 0 <regno:12>
4.69 KVM_GET_ONE_REG
......@@ -2161,6 +2190,76 @@ header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.
4.79 KVM_CREATE_DEVICE
Capability: KVM_CAP_DEVICE_CTRL
Type: vm ioctl
Parameters: struct kvm_create_device (in/out)
Returns: 0 on success, -1 on error
Errors:
ENODEV: The device type is unknown or unsupported
EEXIST: Device already created, and this type of device may not
be instantiated multiple times
Other error conditions may be defined by individual device types or
have their standard meanings.
Creates an emulated device in the kernel. The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).
Individual devices should not define flags. Attributes should be used
for specifying any behavior that is not implied by the device type
number.
struct kvm_create_device {
__u32 type; /* in: KVM_DEV_TYPE_xxx */
__u32 fd; /* out: device handle */
__u32 flags; /* in: KVM_CREATE_DEVICE_xxx */
};
4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
Capability: KVM_CAP_DEVICE_CTRL
Type: device ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
EPERM: The attribute cannot (currently) be accessed this way
(e.g. read-only attribute, or attribute that only makes
sense when the device is in a different state)
Other error conditions may be defined by individual device types.
Gets/sets a specified piece of device configuration and/or state. The
semantics are device-specific. See individual device documentation in
the "devices" directory. As with ONE_REG, the size of the data
transferred is defined by the particular attribute.
struct kvm_device_attr {
__u32 flags; /* no flags currently defined */
__u32 group; /* device-defined */
__u64 attr; /* group-defined */
__u64 addr; /* userspace address of attr data */
};
4.81 KVM_HAS_DEVICE_ATTR
Capability: KVM_CAP_DEVICE_CTRL
Type: device ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
Tests whether a device supports a particular attribute. A successful
return indicates the attribute is implemented. It does not necessarily
indicate that the attribute can be read or written in the device's
current state. "addr" is ignored.
4.77 KVM_ARM_VCPU_INIT
......@@ -2243,6 +2342,25 @@ and distributor interface, the ioctl must be called after calling
KVM_CREATE_IRQCHIP, but before calling KVM_RUN on any of the VCPUs. Calling
this ioctl twice for any of the base addresses will return -EEXIST.
4.82 KVM_PPC_RTAS_DEFINE_TOKEN
Capability: KVM_CAP_PPC_RTAS
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_rtas_token_args
Returns: 0 on success, -1 on error
Defines a token value for a RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel. The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation. If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel. If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.
5. The kvm_run structure
------------------------
......@@ -2646,3 +2764,19 @@ to receive the topmost interrupt vector.
When disabled (args[0] == 0), behavior is as if this facility is unsupported.
When this capability is enabled, KVM_EXIT_EPR can occur.
6.6 KVM_CAP_IRQ_MPIC
Architectures: ppc
Parameters: args[0] is the MPIC device fd
args[1] is the MPIC CPU number for this vcpu
This capability connects the vcpu to an in-kernel MPIC device.
6.7 KVM_CAP_IRQ_XICS
Architectures: ppc
Parameters: args[0] is the XICS device fd
args[1] is the XICS CPU number (server ID) for this vcpu
This capability connects the vcpu to an in-kernel XICS device.
This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
MPIC interrupt controller
=========================
Device types supported:
KVM_DEV_TYPE_FSL_MPIC_20 Freescale MPIC v2.0
KVM_DEV_TYPE_FSL_MPIC_42 Freescale MPIC v4.2
Only one MPIC instance, of any type, may be instantiated. The created
MPIC will act as the system interrupt controller, connecting to each
vcpu's interrupt inputs.
Groups:
KVM_DEV_MPIC_GRP_MISC
Attributes:
KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
Base address of the 256 KiB MPIC register space. Must be
naturally aligned. A value of zero disables the mapping.
Reset value is zero.
KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
Access an MPIC register, as if the access were made from the guest.
"attr" is the byte offset into the MPIC register space. Accesses
must be 4-byte aligned.
MSIs may be signaled by using this attribute group to write
to the relevant MSIIR.
KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
IRQ input line for each standard openpic source. 0 is inactive and 1
is active, regardless of interrupt sense.
For edge-triggered interrupts: Writing 1 is considered an activating
edge, and writing 0 is ignored. Reading returns 1 if a previously
signaled edge has not been acknowledged, and 0 otherwise.
"attr" is the IRQ number. IRQ numbers for standard sources are the
byte offset of the relevant IVPR from EIVPR0, divided by 32.
IRQ Routing:
The MPIC emulation supports IRQ routing. Only a single MPIC device can
be instantiated. Once that device has been created, it's available as
irqchip id 0.
This irqchip 0 has 256 interrupt pins, which expose the interrupts in
the main array of interrupt sources (a.k.a. "SRC" interrupts).
The numbering is the same as the MPIC device tree binding -- based on
the register offset from the beginning of the sources array, without
regard to any subdivisions in chip documentation such as "internal"
or "external" interrupts.
Access to non-SRC interrupts is not implemented through IRQ routing mechanisms.
XICS interrupt controller
Device type supported: KVM_DEV_TYPE_XICS
Groups:
KVM_DEV_XICS_SOURCES
Attributes: One per interrupt source, indexed by the source number.
This device emulates the XICS (eXternal Interrupt Controller
Specification) defined in PAPR. The XICS has a set of interrupt
sources, each identified by a 20-bit source number, and a set of
Interrupt Control Presentation (ICP) entities, also called "servers",
each associated with a virtual CPU.
The ICP entities are created by enabling the KVM_CAP_IRQ_ARCH
capability for each vcpu, specifying KVM_CAP_IRQ_XICS in args[0] and
the interrupt server number (i.e. the vcpu number from the XICS's
point of view) in args[1] of the kvm_enable_cap struct. Each ICP has
64 bits of state which can be read and written using the
KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls on the vcpu. The 64 bit
state word has the following bitfields, starting at the
least-significant end of the word:
* Unused, 16 bits
* Pending interrupt priority, 8 bits
Zero is the highest priority, 255 means no interrupt is pending.
* Pending IPI (inter-processor interrupt) priority, 8 bits
Zero is the highest priority, 255 means no IPI is pending.
* Pending interrupt source number, 24 bits
Zero means no interrupt pending, 2 means an IPI is pending
* Current processor priority, 8 bits
Zero is the highest priority, meaning no interrupts can be
delivered, and 255 is the lowest priority.
Each source has 64 bits of state that can be read and written using
the KVM_GET_DEVICE_ATTR and KVM_SET_DEVICE_ATTR ioctls, specifying the
KVM_DEV_XICS_SOURCES attribute group, with the attribute number being
the interrupt source number. The 64 bit state word has the following
bitfields, starting from the least-significant end of the word:
* Destination (server number), 32 bits
This specifies where the interrupt should be sent, and is the
interrupt server number specified for the destination vcpu.
* Priority, 8 bits
This is the priority specified for this interrupt source, where 0 is
the highest priority and 255 is the lowest. An interrupt with a
priority of 255 will never be delivered.
* Level sensitive flag, 1 bit
This bit is 1 for a level-sensitive interrupt source, or 0 for
edge-sensitive (or MSI).
* Masked flag, 1 bit
This bit is set to 1 if the interrupt is masked (cannot be delivered
regardless of its priority), for example by the ibm,int-off RTAS
call, or 0 if it is not masked.
* Pending flag, 1 bit
This bit is 1 if the source has a pending interrupt, otherwise 0.
Only one XICS instance may be created per VM.
......@@ -8,7 +8,6 @@
#define __idmap __section(.idmap.text) noinline notrace
extern pgd_t *idmap_pgd;
extern pgd_t *hyp_pgd;
void setup_mm_for_reboot(void);
......
......@@ -87,7 +87,7 @@ struct kvm_vcpu_fault_info {
u32 hyp_pc; /* PC when exception was taken from Hyp mode */
};
typedef struct vfp_hard_struct kvm_kernel_vfp_t;
typedef struct vfp_hard_struct kvm_cpu_context_t;
struct kvm_vcpu_arch {
struct kvm_regs regs;
......@@ -105,8 +105,10 @@ struct kvm_vcpu_arch {
struct kvm_vcpu_fault_info fault;
/* Floating point registers (VFP and Advanced SIMD/NEON) */
kvm_kernel_vfp_t vfp_guest;
kvm_kernel_vfp_t *vfp_host;
struct vfp_hard_struct vfp_guest;
/* Host FP context */
kvm_cpu_context_t *host_cpu_context;
/* VGIC state */
struct vgic_cpu vgic_cpu;
......@@ -188,23 +190,38 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
int exception_index);
static inline void __cpu_init_hyp_mode(unsigned long long pgd_ptr,
static inline void __cpu_init_hyp_mode(unsigned long long boot_pgd_ptr,
unsigned long long pgd_ptr,
unsigned long hyp_stack_ptr,
unsigned long vector_ptr)
{
unsigned long pgd_low, pgd_high;
pgd_low = (pgd_ptr & ((1ULL << 32) - 1));
pgd_high = (pgd_ptr >> 32ULL);
/*
* Call initialization code, and switch to the full blown
* HYP code. The init code doesn't need to preserve these registers as
* r1-r3 and r12 are already callee save according to the AAPCS.
* Note that we slightly misuse the prototype by casing the pgd_low to
* a void *.
* Call initialization code, and switch to the full blown HYP
* code. The init code doesn't need to preserve these
* registers as r0-r3 are already callee saved according to
* the AAPCS.
* Note that we slightly misuse the prototype by casing the
* stack pointer to a void *.
*
* We don't have enough registers to perform the full init in
* one go. Install the boot PGD first, and then install the
* runtime PGD, stack pointer and vectors. The PGDs are always
* passed as the third argument, in order to be passed into
* r2-r3 to the init code (yes, this is compliant with the
* PCS!).
*/
kvm_call_hyp((void *)pgd_low, pgd_high, hyp_stack_ptr, vector_ptr);
kvm_call_hyp(NULL, 0, boot_pgd_ptr);
kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
}
static inline int kvm_arch_dev_ioctl_check_extension(long ext)
{
return 0;
}
int kvm_perf_init(void);
int kvm_perf_teardown(void);
#endif /* __ARM_KVM_HOST_H__ */
......@@ -19,21 +19,33 @@
#ifndef __ARM_KVM_MMU_H__
#define __ARM_KVM_MMU_H__
#include <asm/cacheflush.h>
#include <asm/pgalloc.h>
#include <asm/idmap.h>
#include <asm/memory.h>
#include <asm/page.h>
/*
* We directly use the kernel VA for the HYP, as we can directly share
* the mapping (HTTBR "covers" TTBR1).
*/
#define HYP_PAGE_OFFSET_MASK (~0UL)
#define HYP_PAGE_OFFSET_MASK UL(~0)
#define HYP_PAGE_OFFSET PAGE_OFFSET
#define KERN_TO_HYP(kva) (kva)
/*
* Our virtual mapping for the boot-time MMU-enable code. Must be
* shared across all the page-tables. Conveniently, we use the vectors
* page, where no kernel data will ever be shared with HYP.
*/
#define TRAMPOLINE_VA UL(CONFIG_VECTORS_BASE)
#ifndef __ASSEMBLY__
#include <asm/cacheflush.h>
#include <asm/pgalloc.h>
int create_hyp_mappings(void *from, void *to);
int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
void free_hyp_pmds(void);
void free_boot_hyp_pgd(void);
void free_hyp_pgds(void);
int kvm_alloc_stage2_pgd(struct kvm *kvm);
void kvm_free_stage2_pgd(struct kvm *kvm);
......@@ -45,6 +57,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
phys_addr_t kvm_mmu_get_httbr(void);
phys_addr_t kvm_mmu_get_boot_httbr(void);
phys_addr_t kvm_get_idmap_vector(void);
int kvm_mmu_init(void);
void kvm_clear_hyp_idmap(void);
......@@ -114,4 +128,8 @@ static inline void coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn)
}
}
#define kvm_flush_dcache_to_poc(a,l) __cpuc_flush_dcache_area((a), (l))
#endif /* !__ASSEMBLY__ */
#endif /* __ARM_KVM_MMU_H__ */
......@@ -158,7 +158,7 @@ int main(void)
DEFINE(VCPU_MIDR, offsetof(struct kvm_vcpu, arch.midr));
DEFINE(VCPU_CP15, offsetof(struct kvm_vcpu, arch.cp15));
DEFINE(VCPU_VFP_GUEST, offsetof(struct kvm_vcpu, arch.vfp_guest));
DEFINE(VCPU_VFP_HOST, offsetof(struct kvm_vcpu, arch.vfp_host));
DEFINE(VCPU_VFP_HOST, offsetof(struct kvm_vcpu, arch.host_cpu_context));
DEFINE(VCPU_REGS, offsetof(struct kvm_vcpu, arch.regs));
DEFINE(VCPU_USR_REGS, offsetof(struct kvm_vcpu, arch.regs.usr_regs));
DEFINE(VCPU_SVC_REGS, offsetof(struct kvm_vcpu, arch.regs.svc_regs));
......
......@@ -20,7 +20,7 @@
VMLINUX_SYMBOL(__idmap_text_start) = .; \
*(.idmap.text) \
VMLINUX_SYMBOL(__idmap_text_end) = .; \
ALIGN_FUNCTION(); \
. = ALIGN(32); \
VMLINUX_SYMBOL(__hyp_idmap_text_start) = .; \
*(.hyp.idmap.text) \
VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;
......@@ -315,3 +315,8 @@ SECTIONS
*/
ASSERT((__proc_info_end - __proc_info_begin), "missing CPU support")
ASSERT((__arch_info_end - __arch_info_begin), "no machine record defined")
/*
* The HYP init code can't be more than a page long.
* The above comment applies as well.
*/
ASSERT(((__hyp_idmap_text_end - __hyp_idmap_text_start) <= PAGE_SIZE), "HYP init code too big")
......@@ -41,9 +41,9 @@ config KVM_ARM_HOST
Provides host support for ARM processors.
config KVM_ARM_MAX_VCPUS
int "Number maximum supported virtual CPUs per VM"
depends on KVM_ARM_HOST
default 4
int "Number maximum supported virtual CPUs per VM" if KVM_ARM_HOST
default 4 if KVM_ARM_HOST
default 0
help
Static number of max supported virtual CPUs per VM.
......
......@@ -18,6 +18,6 @@ kvm-arm-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
obj-y += kvm-arm.o init.o interrupts.o
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
obj-y += coproc.o coproc_a15.o mmio.o psci.o
obj-y += coproc.o coproc_a15.o mmio.o psci.o perf.o
obj-$(CONFIG_KVM_ARM_VGIC) += vgic.o
obj-$(CONFIG_KVM_ARM_TIMER) += arch_timer.o
......@@ -22,6 +22,7 @@
#include <linux/kvm_host.h>
#include <linux/interrupt.h>
#include <clocksource/arm_arch_timer.h>
#include <asm/arch_timer.h>
#include <asm/kvm_vgic.h>
......@@ -64,7 +65,7 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
timer->cntv_ctl |= 1 << 1; /* Mask the interrupt in the guest */
timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
vcpu->arch.timer_cpu.irq->irq,
vcpu->arch.timer_cpu.irq->level);
......@@ -133,8 +134,8 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
cycle_t cval, now;
u64 ns;
/* Check if the timer is enabled and unmasked first */
if ((timer->cntv_ctl & 3) != 1)
if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
return;
cval = timer->cntv_cval;
......
......@@ -16,6 +16,7 @@
* Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include <linux/cpu.h>
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
......@@ -48,7 +49,7 @@ __asm__(".arch_extension virt");
#endif
static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
static kvm_kernel_vfp_t __percpu *kvm_host_vfp_state;
static kvm_cpu_context_t __percpu *kvm_host_cpu_state;
static unsigned long hyp_default_vectors;
/* Per-CPU variable containing the currently running vcpu. */
......@@ -206,7 +207,7 @@ int kvm_dev_ioctl_check_extension(long ext)
r = KVM_MAX_VCPUS;
break;
default:
r = 0;
r = kvm_arch_dev_ioctl_check_extension(ext);
break;
}
return r;
......@@ -218,27 +219,18 @@ long kvm_arch_dev_ioctl(struct file *filp,
return -EINVAL;
}
int kvm_arch_set_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
int user_alloc)
{
return 0;
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
bool user_alloc)
enum kvm_mr_change change)
{
return 0;
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
bool user_alloc)
const struct kvm_memory_slot *old,
enum kvm_mr_change change)
{
}
......@@ -326,7 +318,7 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
vcpu->cpu = cpu;
vcpu->arch.vfp_host = this_cpu_ptr(kvm_host_vfp_state);
vcpu->arch.host_cpu_context = this_cpu_ptr(kvm_host_cpu_state);
/*
* Check whether this vcpu requires the cache to be flushed on
......@@ -639,7 +631,8 @@ static int vcpu_interrupt_line(struct kvm_vcpu *vcpu, int number, bool level)
return 0;
}
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
bool line_status)
{
u32 irq = irq_level->irq;
unsigned int irq_type, vcpu_idx, irq_num;
......@@ -794,30 +787,48 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
}
static void cpu_init_hyp_mode(void *vector)
static void cpu_init_hyp_mode(void *dummy)
{
unsigned long long boot_pgd_ptr;
unsigned long long pgd_ptr;
unsigned long hyp_stack_ptr;
unsigned long stack_page;
unsigned long vector_ptr;
/* Switch from the HYP stub to our own HYP init vector */
__hyp_set_vectors((unsigned long)vector);
__hyp_set_vectors(kvm_get_idmap_vector());
boot_pgd_ptr = (unsigned long long)kvm_mmu_get_boot_httbr();
pgd_ptr = (unsigned long long)kvm_mmu_get_httbr();
stack_page = __get_cpu_var(kvm_arm_hyp_stack_page);
hyp_stack_ptr = stack_page + PAGE_SIZE;
vector_ptr = (unsigned long)__kvm_hyp_vector;
__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
}
static int hyp_init_cpu_notify(struct notifier_block *self,
unsigned long action, void *cpu)
{
switch (action) {
case CPU_STARTING:
case CPU_STARTING_FROZEN:
cpu_init_hyp_mode(NULL);
break;
}
return NOTIFY_OK;
}
static struct notifier_block hyp_init_cpu_nb = {
.notifier_call = hyp_init_cpu_notify,
};
/**
* Inits Hyp-mode on all online CPUs
*/
static int init_hyp_mode(void)
{
phys_addr_t init_phys_addr;
int cpu;
int err = 0;
......@@ -849,24 +860,6 @@ static int init_hyp_mode(void)
per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
}
/*
* Execute the init code on each CPU.
*
* Note: The stack is not mapped yet, so don't do anything else than
* initializing the hypervisor mode on each CPU using a local stack
* space for temporary storage.
*/
init_phys_addr = virt_to_phys(__kvm_hyp_init);
for_each_online_cpu(cpu) {
smp_call_function_single(cpu, cpu_init_hyp_mode,
(void *)(long)init_phys_addr, 1);
}
/*
* Unmap the identity mapping
*/
kvm_clear_hyp_idmap();
/*
* Map the Hyp-code called directly from the host
*/
......@@ -890,33 +883,38 @@ static int init_hyp_mode(void)
}
/*
* Map the host VFP structures
* Map the host CPU structures
*/
kvm_host_vfp_state = alloc_percpu(kvm_kernel_vfp_t);
if (!kvm_host_vfp_state) {
kvm_host_cpu_state = alloc_percpu(kvm_cpu_context_t);
if (!kvm_host_cpu_state) {
err = -ENOMEM;
kvm_err("Cannot allocate host VFP state\n");
kvm_err("Cannot allocate host CPU state\n");
goto out_free_mappings;
}
for_each_possible_cpu(cpu) {
kvm_kernel_vfp_t *vfp;
kvm_cpu_context_t *cpu_ctxt;
vfp = per_cpu_ptr(kvm_host_vfp_state, cpu);
err = create_hyp_mappings(vfp, vfp + 1);
cpu_ctxt = per_cpu_ptr(kvm_host_cpu_state, cpu);
err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1);
if (err) {
kvm_err("Cannot map host VFP state: %d\n", err);
goto out_free_vfp;
kvm_err("Cannot map host CPU state: %d\n", err);
goto out_free_context;
}
}
/*
* Execute the init code on each CPU.
*/
on_each_cpu(cpu_init_hyp_mode, NULL, 1);
/*
* Init HYP view of VGIC
*/
err = kvm_vgic_hyp_init();
if (err)
goto out_free_vfp;
goto out_free_context;
#ifdef CONFIG_KVM_ARM_VGIC
vgic_present = true;
......@@ -929,12 +927,19 @@ static int init_hyp_mode(void)
if (err)
goto out_free_mappings;
#ifndef CONFIG_HOTPLUG_CPU
free_boot_hyp_pgd();
#endif
kvm_perf_init();
kvm_info("Hyp mode initialized successfully\n");
return 0;
out_free_vfp:
free_percpu(kvm_host_vfp_state);
out_free_context:
free_percpu(kvm_host_cpu_state);
out_free_mappings:
free_hyp_pmds();
free_hyp_pgds();
out_free_stack_pages:
for_each_possible_cpu(cpu)
free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
......@@ -943,27 +948,42 @@ static int init_hyp_mode(void)
return err;
}
static void check_kvm_target_cpu(void *ret)
{
*(int *)ret = kvm_target_cpu();
}
/**
* Initialize Hyp-mode and memory mappings on all CPUs.
*/
int kvm_arch_init(void *opaque)
{
int err;
int ret, cpu;
if (!is_hyp_mode_available()) {
kvm_err("HYP mode not available\n");
return -ENODEV;
}
if (kvm_target_cpu() < 0) {
kvm_err("Target CPU not supported!\n");
return -ENODEV;
for_each_online_cpu(cpu) {
smp_call_function_single(cpu, check_kvm_target_cpu, &ret, 1);
if (ret < 0) {
kvm_err("Error, CPU %d not supported!\n", cpu);
return -ENODEV;
}
}
err = init_hyp_mode();
if (err)
goto out_err;
err = register_cpu_notifier(&hyp_init_cpu_nb);
if (err) {
kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
goto out_err;
}
kvm_coproc_table_init();
return 0;
out_err:
......@@ -973,6 +993,7 @@ int kvm_arch_init(void *opaque)
/* NOP: Compiling as a module not supported */
void kvm_arch_exit(void)
{
kvm_perf_teardown();
}
static int arm_init(void)
......
......@@ -21,13 +21,33 @@
#include <asm/asm-offsets.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_mmu.h>
/********************************************************************
* Hypervisor initialization
* - should be called with:
* r0,r1 = Hypervisor pgd pointer
* r2 = top of Hyp stack (kernel VA)
* r3 = pointer to hyp vectors
* r0 = top of Hyp stack (kernel VA)
* r1 = pointer to hyp vectors
* r2,r3 = Hypervisor pgd pointer
*
* The init scenario is:
* - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
* runtime stack, runtime vectors
* - Enable the MMU with the boot pgd
* - Jump to a target into the trampoline page (remember, this is the same
* physical page!)
* - Now switch to the runtime pgd (same VA, and still the same physical
* page!)
* - Invalidate TLBs
* - Set stack and vectors
* - Profit! (or eret, if you only care about the code).
*
* As we only have four registers available to pass parameters (and we
* need six), we split the init in two phases:
* - Phase 1: r0 = 0, r1 = 0, r2,r3 contain the boot PGD.
* Provides the basic HYP init, and enable the MMU.
* - Phase 2: r0 = ToS, r1 = vectors, r2,r3 contain the runtime PGD.
* Switches to the runtime PGD, set stack and vectors.
*/
.text
......@@ -47,22 +67,25 @@ __kvm_hyp_init:
W(b) .
__do_hyp_init:
cmp r0, #0 @ We have a SP?
bne phase2 @ Yes, second stage init
@ Set the HTTBR to point to the hypervisor PGD pointer passed
mcrr p15, 4, r0, r1, c2
mcrr p15, 4, r2, r3, c2
@ Set the HTCR and VTCR to the same shareability and cacheability
@ settings as the non-secure TTBCR and with T0SZ == 0.
mrc p15, 4, r0, c2, c0, 2 @ HTCR
ldr r12, =HTCR_MASK
bic r0, r0, r12
ldr r2, =HTCR_MASK
bic r0, r0, r2
mrc p15, 0, r1, c2, c0, 2 @ TTBCR
and r1, r1, #(HTCR_MASK & ~TTBCR_T0SZ)
orr r0, r0, r1
mcr p15, 4, r0, c2, c0, 2 @ HTCR
mrc p15, 4, r1, c2, c1, 2 @ VTCR
ldr r12, =VTCR_MASK
bic r1, r1, r12
ldr r2, =VTCR_MASK
bic r1, r1, r2
bic r0, r0, #(~VTCR_HTCR_SH) @ clear non-reusable HTCR bits
orr r1, r0, r1
orr r1, r1, #(KVM_VTCR_SL0 | KVM_VTCR_T0SZ | KVM_VTCR_S)
......@@ -85,24 +108,41 @@ __do_hyp_init:
@ - Memory alignment checks: enabled
@ - MMU: enabled (this code must be run from an identity mapping)
mrc p15, 4, r0, c1, c0, 0 @ HSCR
ldr r12, =HSCTLR_MASK
bic r0, r0, r12
ldr r2, =HSCTLR_MASK
bic r0, r0, r2
mrc p15, 0, r1, c1, c0, 0 @ SCTLR
ldr r12, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
and r1, r1, r12
ARM( ldr r12, =(HSCTLR_M | HSCTLR_A) )
THUMB( ldr r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE) )
orr r1, r1, r12
ldr r2, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
and r1, r1, r2
ARM( ldr r2, =(HSCTLR_M | HSCTLR_A) )
THUMB( ldr r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE) )
orr r1, r1, r2
orr r0, r0, r1
isb
mcr p15, 4, r0, c1, c0, 0 @ HSCR
isb
@ Set stack pointer and return to the kernel
mov sp, r2
@ End of init phase-1
eret
phase2:
@ Set stack pointer
mov sp, r0
@ Set HVBAR to point to the HYP vectors
mcr p15, 4, r3, c12, c0, 0 @ HVBAR
mcr p15, 4, r1, c12, c0, 0 @ HVBAR
@ Jump to the trampoline page
ldr r0, =TRAMPOLINE_VA
adr r1, target
bfi r0, r1, #0, #PAGE_SHIFT
mov pc, r0
target: @ We're now in the trampoline code, switch page tables
mcrr p15, 4, r2, r3, c2
isb
@ Invalidate the old TLBs
mcr p15, 4, r0, c8, c7, 0 @ TLBIALLH
dsb
eret
......
This diff is collapsed.
/*
* Based on the x86 implementation.
*
* Copyright (C) 2012 ARM Ltd.
* Author: Marc Zyngier <marc.zyngier@arm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <linux/perf_event.h>
#include <linux/kvm_host.h>
#include <asm/kvm_emulate.h>
static int kvm_is_in_guest(void)
{
return kvm_arm_get_running_vcpu() != NULL;
}
static int kvm_is_user_mode(void)
{
struct kvm_vcpu *vcpu;
vcpu = kvm_arm_get_running_vcpu();
if (vcpu)
return !vcpu_mode_priv(vcpu);
return 0;
}
static unsigned long kvm_get_guest_ip(void)
{
struct kvm_vcpu *vcpu;
vcpu = kvm_arm_get_running_vcpu();
if (vcpu)
return *vcpu_pc(vcpu);
return 0;
}
static struct perf_guest_info_callbacks kvm_guest_cbs = {
.is_in_guest = kvm_is_in_guest,
.is_user_mode = kvm_is_user_mode,
.get_guest_ip = kvm_get_guest_ip,
};
int kvm_perf_init(void)
{
return perf_register_guest_info_callbacks(&kvm_guest_cbs);
}
int kvm_perf_teardown(void)
{
return perf_unregister_guest_info_callbacks(&kvm_guest_cbs);
}
......@@ -8,7 +8,6 @@
#include <asm/pgtable.h>
#include <asm/sections.h>
#include <asm/system_info.h>
#include <asm/virt.h>
pgd_t *idmap_pgd;
......@@ -83,37 +82,10 @@ static void identity_mapping_add(pgd_t *pgd, const char *text_start,
} while (pgd++, addr = next, addr != end);
}
#if defined(CONFIG_ARM_VIRT_EXT) && defined(CONFIG_ARM_LPAE)
pgd_t *hyp_pgd;
extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
static int __init init_static_idmap_hyp(void)
{
hyp_pgd = kzalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL);
if (!hyp_pgd)
return -ENOMEM;
pr_info("Setting up static HYP identity map for 0x%p - 0x%p\n",
__hyp_idmap_text_start, __hyp_idmap_text_end);
identity_mapping_add(hyp_pgd, __hyp_idmap_text_start,
__hyp_idmap_text_end, PMD_SECT_AP1);
return 0;
}
#else
static int __init init_static_idmap_hyp(void)
{
return 0;
}
#endif
extern char __idmap_text_start[], __idmap_text_end[];
static int __init init_static_idmap(void)
{
int ret;
idmap_pgd = pgd_alloc(&init_mm);
if (!idmap_pgd)
return -ENOMEM;
......@@ -123,12 +95,10 @@ static int __init init_static_idmap(void)
identity_mapping_add(idmap_pgd, __idmap_text_start,
__idmap_text_end, 0);
ret = init_static_idmap_hyp();
/* Flush L1 for the hardware to see this page table content */
flush_cache_louis();
return ret;
return 0;
}
early_initcall(init_static_idmap);
......
......@@ -26,6 +26,7 @@
#define KVM_USER_MEM_SLOTS 32
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
#define KVM_IRQCHIP_NUM_PINS KVM_IOAPIC_NUM_PINS
/* define exit reasons from vmm to kvm*/
#define EXIT_REASON_VM_PANIC 0
......
......@@ -27,7 +27,6 @@
/* Select x86 specific features in <linux/kvm.h> */
#define __KVM_HAVE_IOAPIC
#define __KVM_HAVE_IRQ_LINE
#define __KVM_HAVE_DEVICE_ASSIGNMENT
/* Architectural interrupt line count. */
#define KVM_NR_INTERRUPTS 256
......
......@@ -21,12 +21,11 @@ config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on BROKEN
depends on HAVE_KVM && MODULES
# for device assignment:
depends on PCI
depends on BROKEN
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
select KVM_APIC_ARCHITECTURE
select KVM_MMIO
---help---
......@@ -50,6 +49,17 @@ config KVM_INTEL
Provides support for KVM on Itanium 2 processors equipped with the VT
extensions.
config KVM_DEVICE_ASSIGNMENT
bool "KVM legacy PCI device assignment support"
depends on KVM && PCI && IOMMU_API
default y
---help---
Provide support for legacy PCI device assignment through KVM. The
kernel now also supports a full featured userspace device driver
framework through VFIO, which supersedes much of this support.
If unsure, say Y.
source drivers/vhost/Kconfig
endif # VIRTUALIZATION
......@@ -49,10 +49,10 @@ ccflags-y := -Ivirt/kvm -Iarch/ia64/kvm/
asflags-y := -Ivirt/kvm -Iarch/ia64/kvm/
common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
coalesced_mmio.o irq_comm.o assigned-dev.o)
coalesced_mmio.o irq_comm.o)
ifeq ($(CONFIG_IOMMU_API),y)
common-objs += $(addprefix ../../../virt/kvm/, iommu.o)
ifeq ($(CONFIG_KVM_DEVICE_ASSIGNMENT),y)
common-objs += $(addprefix ../../../virt/kvm/, assigned-dev.o iommu.o)
endif
kvm-objs := $(common-objs) kvm-ia64.o kvm_fw.o
......
......@@ -204,9 +204,11 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_COALESCED_MMIO:
r = KVM_COALESCED_MMIO_PAGE_OFFSET;
break;
#ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
case KVM_CAP_IOMMU:
r = iommu_present(&pci_bus_type);
break;
#endif
default:
r = 0;
}
......@@ -924,13 +926,15 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
return 0;
}
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event)
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
bool line_status)
{
if (!irqchip_in_kernel(kvm))
return -ENXIO;
irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
irq_event->irq, irq_event->level);
irq_event->irq, irq_event->level,
line_status);
return 0;
}
......@@ -942,24 +946,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
int r = -ENOTTY;
switch (ioctl) {
case KVM_SET_MEMORY_REGION: {
struct kvm_memory_region kvm_mem;
struct kvm_userspace_memory_region kvm_userspace_mem;
r = -EFAULT;
if (copy_from_user(&kvm_mem, argp, sizeof kvm_mem))
goto out;
kvm_userspace_mem.slot = kvm_mem.slot;
kvm_userspace_mem.flags = kvm_mem.flags;
kvm_userspace_mem.guest_phys_addr =
kvm_mem.guest_phys_addr;
kvm_userspace_mem.memory_size = kvm_mem.memory_size;
r = kvm_vm_ioctl_set_memory_region(kvm,
&kvm_userspace_mem, false);
if (r)
goto out;
break;
}
case KVM_CREATE_IRQCHIP:
r = -EFAULT;
r = kvm_ioapic_init(kvm);
......@@ -1384,9 +1370,7 @@ void kvm_arch_sync_events(struct kvm *kvm)
void kvm_arch_destroy_vm(struct kvm *kvm)
{
kvm_iommu_unmap_guest(kvm);
#ifdef KVM_CAP_DEVICE_ASSIGNMENT
kvm_free_all_assigned_devices(kvm);
#endif
kfree(kvm->arch.vioapic);
kvm_release_vm_pages(kvm);
}
......@@ -1578,9 +1562,8 @@ int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages)
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
bool user_alloc)
enum kvm_mr_change change)
{
unsigned long i;
unsigned long pfn;
......@@ -1610,8 +1593,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
bool user_alloc)
const struct kvm_memory_slot *old,
enum kvm_mr_change change)
{
return;
}
......
......@@ -27,10 +27,4 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
#define kvm_apic_present(x) (true)
#define kvm_lapic_enabled(x) (true)
static inline bool kvm_apic_vid_enabled(void)
{
/* IA64 has no apicv supporting, do nothing here */
return false;
}
#endif
......@@ -270,6 +270,9 @@
#define H_SET_MODE 0x31C
#define MAX_HCALL_OPCODE H_SET_MODE
/* Platform specific hcalls, used by KVM */
#define H_RTAS 0xf000
#ifndef __ASSEMBLY__
/**
......
......@@ -142,6 +142,8 @@ extern int kvmppc_mmu_hv_init(void);
extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, bool data);
extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, bool data);
extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec);
extern void kvmppc_book3s_dequeue_irqprio(struct kvm_vcpu *vcpu,
unsigned int vec);
extern void kvmppc_inject_interrupt(struct kvm_vcpu *vcpu, int vec, u64 flags);
extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
bool upper, u32 val);
......@@ -156,7 +158,8 @@ void kvmppc_clear_ref_hpte(struct kvm *kvm, unsigned long *hptep,
unsigned long pte_index);
extern void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long addr,
unsigned long *nb_ret);
extern void kvmppc_unpin_guest_page(struct kvm *kvm, void *addr);
extern void kvmppc_unpin_guest_page(struct kvm *kvm, void *addr,
unsigned long gpa, bool dirty);
extern long kvmppc_virtmode_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
long pte_index, unsigned long pteh, unsigned long ptel);
extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
......@@ -458,6 +461,8 @@ static inline bool kvmppc_critical_section(struct kvm_vcpu *vcpu)
#define OSI_SC_MAGIC_R4 0x77810F9B
#define INS_DCBZ 0x7c0007ec
/* TO = 31 for unconditional trap */
#define INS_TW 0x7fe00008
/* LPIDs we support with this build -- runtime limit may be lower */
#define KVMPPC_NR_LPIDS (LPID_RSVD + 1)
......
......@@ -268,4 +268,17 @@ static inline int is_vrma_hpte(unsigned long hpte_v)
(HPTE_V_1TB_SEG | (VRMA_VSID << (40 - 16)));
}
#ifdef CONFIG_KVM_BOOK3S_64_HV
/*
* Note modification of an HPTE; set the HPTE modified bit
* if anyone is interested.
*/
static inline void note_hpte_modification(struct kvm *kvm,
struct revmap_entry *rev)
{
if (atomic_read(&kvm->arch.hpte_mod_interest))
rev->guest_rpte |= HPTE_GR_MODIFIED;
}
#endif /* CONFIG_KVM_BOOK3S_64_HV */
#endif /* __ASM_KVM_BOOK3S_64_H__ */
......@@ -20,6 +20,11 @@
#ifndef __ASM_KVM_BOOK3S_ASM_H__
#define __ASM_KVM_BOOK3S_ASM_H__
/* XICS ICP register offsets */
#define XICS_XIRR 4
#define XICS_MFRR 0xc
#define XICS_IPI 2 /* interrupt source # for IPIs */
#ifdef __ASSEMBLY__
#ifdef CONFIG_KVM_BOOK3S_HANDLER
......@@ -81,10 +86,11 @@ struct kvmppc_host_state {
#ifdef CONFIG_KVM_BOOK3S_64_HV
u8 hwthread_req;
u8 hwthread_state;
u8 host_ipi;
struct kvm_vcpu *kvm_vcpu;
struct kvmppc_vcore *kvm_vcore;
unsigned long xics_phys;
u32 saved_xirr;
u64 dabr;
u64 host_mmcr[3];
u32 host_pmc[8];
......
......@@ -26,6 +26,8 @@
/* LPIDs we support with this build -- runtime limit may be lower */
#define KVMPPC_NR_LPIDS 64
#define KVMPPC_INST_EHPRIV 0x7c00021c
static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val)
{
vcpu->arch.gpr[num] = val;
......
......@@ -44,6 +44,10 @@
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
#endif
/* These values are internal and can be increased later */
#define KVM_NR_IRQCHIPS 1
#define KVM_IRQCHIP_NUM_PINS 256
#if !defined(CONFIG_KVM_440)
#include <linux/mmu_notifier.h>
......@@ -188,6 +192,10 @@ struct kvmppc_linear_info {
int type;
};
/* XICS components, defined in book3s_xics.c */
struct kvmppc_xics;
struct kvmppc_icp;
/*
* The reverse mapping array has one entry for each HPTE,
* which stores the guest's view of the second word of the HPTE
......@@ -255,6 +263,13 @@ struct kvm_arch {
#endif /* CONFIG_KVM_BOOK3S_64_HV */
#ifdef CONFIG_PPC_BOOK3S_64
struct list_head spapr_tce_tables;
struct list_head rtas_tokens;
#endif
#ifdef CONFIG_KVM_MPIC
struct openpic *mpic;
#endif
#ifdef CONFIG_KVM_XICS
struct kvmppc_xics *xics;
#endif
};
......@@ -301,11 +316,13 @@ struct kvmppc_vcore {
* that a guest can register.
*/
struct kvmppc_vpa {
unsigned long gpa; /* Current guest phys addr */
void *pinned_addr; /* Address in kernel linear mapping */
void *pinned_end; /* End of region */
unsigned long next_gpa; /* Guest phys addr for update */
unsigned long len; /* Number of bytes required */
u8 update_pending; /* 1 => update pinned_addr from next_gpa */
bool dirty; /* true => area has been modified by kernel */
};
struct kvmppc_pte {
......@@ -359,6 +376,11 @@ struct kvmppc_slb {
#define KVMPPC_BOOKE_MAX_IAC 4
#define KVMPPC_BOOKE_MAX_DAC 2
/* KVMPPC_EPR_USER takes precedence over KVMPPC_EPR_KERNEL */
#define KVMPPC_EPR_NONE 0 /* EPR not supported */
#define KVMPPC_EPR_USER 1 /* exit to userspace to fill EPR */
#define KVMPPC_EPR_KERNEL 2 /* in-kernel irqchip */
struct kvmppc_booke_debug_reg {
u32 dbcr0;
u32 dbcr1;
......@@ -370,6 +392,12 @@ struct kvmppc_booke_debug_reg {
u64 dac[KVMPPC_BOOKE_MAX_DAC];
};
#define KVMPPC_IRQ_DEFAULT 0
#define KVMPPC_IRQ_MPIC 1
#define KVMPPC_IRQ_XICS 2
struct openpic;
struct kvm_vcpu_arch {
ulong host_stack;
u32 host_pid;
......@@ -502,8 +530,11 @@ struct kvm_vcpu_arch {
spinlock_t wdt_lock;
struct timer_list wdt_timer;
u32 tlbcfg[4];
u32 tlbps[4];
u32 mmucfg;
u32 eptcfg;
u32 epr;
u32 crit_save;
struct kvmppc_booke_debug_reg dbg_reg;
#endif
gpa_t paddr_accessed;
......@@ -521,7 +552,7 @@ struct kvm_vcpu_arch {
u8 sane;
u8 cpu_type;
u8 hcall_needed;
u8 epr_enabled;
u8 epr_flags; /* KVMPPC_EPR_xxx */
u8 epr_needed;
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
......@@ -548,6 +579,13 @@ struct kvm_vcpu_arch {
unsigned long magic_page_pa; /* phys addr to map the magic page to */
unsigned long magic_page_ea; /* effect. addr to map the magic page to */
int irq_type; /* one of KVM_IRQ_* */
int irq_cpu_id;
struct openpic *mpic; /* KVM_IRQ_MPIC */
#ifdef CONFIG_KVM_XICS
struct kvmppc_icp *icp; /* XICS presentation controller */
#endif
#ifdef CONFIG_KVM_BOOK3S_64_HV
struct kvm_vcpu_arch_shared shregs;
......@@ -588,5 +626,6 @@ struct kvm_vcpu_arch {
#define KVM_MMIO_REG_FQPR 0x0060
#define __KVM_HAVE_ARCH_WQP
#define __KVM_HAVE_CREATE_DEVICE
#endif /* __POWERPC_KVM_HOST_H__ */
......@@ -44,7 +44,7 @@ enum emulation_result {
EMULATE_DO_DCR, /* kvm_run filled with DCR request */
EMULATE_FAIL, /* can't emulate this instruction */
EMULATE_AGAIN, /* something went wrong. go again */
EMULATE_DO_PAPR, /* kvm_run filled with PAPR request */
EMULATE_EXIT_USER, /* emulation requires exit to user-space */
};
extern int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
......@@ -104,8 +104,7 @@ extern void kvmppc_core_queue_dec(struct kvm_vcpu *vcpu);
extern void kvmppc_core_dequeue_dec(struct kvm_vcpu *vcpu);
extern void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq);
extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq);
extern void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu);
extern void kvmppc_core_flush_tlb(struct kvm_vcpu *vcpu);
extern int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
......@@ -131,6 +130,7 @@ extern long kvmppc_prepare_vrma(struct kvm *kvm,
extern void kvmppc_map_vrma(struct kvm_vcpu *vcpu,
struct kvm_memory_slot *memslot, unsigned long porder);
extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);
extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
struct kvm_create_spapr_tce *args);
extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
......@@ -152,7 +152,7 @@ extern int kvmppc_core_prepare_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem);
extern void kvmppc_core_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old);
const struct kvm_memory_slot *old);
extern int kvm_vm_ioctl_get_smmu_info(struct kvm *kvm,
struct kvm_ppc_smmu_info *info);
extern void kvmppc_core_flush_memslot(struct kvm *kvm,
......@@ -165,6 +165,18 @@ extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
extern int kvm_vm_ioctl_rtas_define_token(struct kvm *kvm, void __user *argp);
extern int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu);
extern void kvmppc_rtas_tokens_free(struct kvm *kvm);
extern int kvmppc_xics_set_xive(struct kvm *kvm, u32 irq, u32 server,
u32 priority);
extern int kvmppc_xics_get_xive(struct kvm *kvm, u32 irq, u32 *server,
u32 *priority);
extern int kvmppc_xics_int_on(struct kvm *kvm, u32 irq);
extern int kvmppc_xics_int_off(struct kvm *kvm, u32 irq);
/*
* Cuts out inst bits with ordering according to spec.
* That means the leftmost bit is zero. All given bits are included.
......@@ -246,12 +258,29 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id, union kvmppc_one_reg *);
void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid);
struct openpic;
#ifdef CONFIG_KVM_BOOK3S_64_HV
static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
{
paca[cpu].kvm_hstate.xics_phys = addr;
}
static inline u32 kvmppc_get_xics_latch(void)
{
u32 xirr = get_paca()->kvm_hstate.saved_xirr;
get_paca()->kvm_hstate.saved_xirr = 0;
return xirr;
}
static inline void kvmppc_set_host_ipi(int cpu, u8 host_ipi)
{
paca[cpu].kvm_hstate.host_ipi = host_ipi;
}
extern void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu);
extern void kvm_linear_init(void);
#else
......@@ -260,6 +289,46 @@ static inline void kvmppc_set_xics_phys(int cpu, unsigned long addr)
static inline void kvm_linear_init(void)
{}
static inline u32 kvmppc_get_xics_latch(void)
{
return 0;
}
static inline void kvmppc_set_host_ipi(int cpu, u8 host_ipi)
{}
static inline void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu)
{
kvm_vcpu_kick(vcpu);
}
#endif
#ifdef CONFIG_KVM_XICS
static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
{
return vcpu->arch.irq_type == KVMPPC_IRQ_XICS;
}
extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
extern int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu, unsigned long server);
extern int kvm_vm_ioctl_xics_irq(struct kvm *kvm, struct kvm_irq_level *args);
extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd);
extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu);
extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
struct kvm_vcpu *vcpu, u32 cpu);
#else
static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
{ return 0; }
static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
static inline int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu,
unsigned long server)
{ return -EINVAL; }
static inline int kvm_vm_ioctl_xics_irq(struct kvm *kvm,
struct kvm_irq_level *args)
{ return -ENOTTY; }
static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
{ return 0; }
#endif
static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
......@@ -271,6 +340,32 @@ static inline void kvmppc_set_epr(struct kvm_vcpu *vcpu, u32 epr)
#endif
}
#ifdef CONFIG_KVM_MPIC
void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
int kvmppc_mpic_connect_vcpu(struct kvm_device *dev, struct kvm_vcpu *vcpu,
u32 cpu);
void kvmppc_mpic_disconnect_vcpu(struct openpic *opp, struct kvm_vcpu *vcpu);
#else
static inline void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu)
{
}
static inline int kvmppc_mpic_connect_vcpu(struct kvm_device *dev,
struct kvm_vcpu *vcpu, u32 cpu)
{
return -EINVAL;
}
static inline void kvmppc_mpic_disconnect_vcpu(struct openpic *opp,
struct kvm_vcpu *vcpu)
{
}
#endif /* CONFIG_KVM_MPIC */
int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
struct kvm_config_tlb *cfg);
int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
......@@ -283,8 +378,15 @@ void kvmppc_init_lpid(unsigned long nr_lpids);
static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
{
/* Clear i-cache for new pages */
struct page *page;
/*
* We can only access pages that the kernel maps
* as memory. Bail out for unmapped ones.
*/
if (!pfn_valid(pfn))
return;
/* Clear i-cache for new pages */
page = pfn_to_page(pfn);
if (!test_bit(PG_arch_1, &page->flags)) {
flush_dcache_icache_page(page);
......@@ -324,4 +426,6 @@ static inline ulong kvmppc_get_ea_indexed(struct kvm_vcpu *vcpu, int ra, int rb)
return ea;
}
extern void xics_wake_cpu(int cpu);
#endif /* __POWERPC_KVM_PPC_H__ */
......@@ -300,6 +300,7 @@
#define LPCR_PECE1 0x00002000 /* decrementer can cause exit */
#define LPCR_PECE2 0x00001000 /* machine check etc can cause exit */
#define LPCR_MER 0x00000800 /* Mediated External Exception */
#define LPCR_MER_SH 11
#define LPCR_LPES 0x0000000c
#define LPCR_LPES0 0x00000008 /* LPAR Env selector 0 */
#define LPCR_LPES1 0x00000004 /* LPAR Env selector 1 */
......
......@@ -25,6 +25,8 @@
/* Select powerpc specific features in <linux/kvm.h> */
#define __KVM_HAVE_SPAPR_TCE
#define __KVM_HAVE_PPC_SMT
#define __KVM_HAVE_IRQCHIP
#define __KVM_HAVE_IRQ_LINE
struct kvm_regs {
__u64 pc;
......@@ -272,8 +274,31 @@ struct kvm_debug_exit_arch {
/* for KVM_SET_GUEST_DEBUG */
struct kvm_guest_debug_arch {
struct {
/* H/W breakpoint/watchpoint address */
__u64 addr;
/*
* Type denotes h/w breakpoint, read watchpoint, write
* watchpoint or watchpoint (both read and write).
*/
#define KVMPPC_DEBUG_NONE 0x0
#define KVMPPC_DEBUG_BREAKPOINT (1UL << 1)
#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2)
#define KVMPPC_DEBUG_WATCH_READ (1UL << 3)
__u32 type;
__u32 reserved;
} bp[16];
};
/* Debug related defines */
/*
* kvm_guest_debug->control is a 32 bit field. The lower 16 bits are generic
* and upper 16 bits are architecture specific. Architecture specific defines
* that ioctl is for setting hardware breakpoint or software breakpoint.
*/
#define KVM_GUESTDBG_USE_SW_BP 0x00010000
#define KVM_GUESTDBG_USE_HW_BP 0x00020000
/* definition of registers in kvm_run */
struct kvm_sync_regs {
};
......@@ -299,6 +324,12 @@ struct kvm_allocate_rma {
__u64 rma_size;
};
/* for KVM_CAP_PPC_RTAS */
struct kvm_rtas_token_args {
char name[120];
__u64 token; /* Use a token of 0 to undefine a mapping */
};
struct kvm_book3e_206_tlb_entry {
__u32 mas8;
__u32 mas1;
......@@ -359,6 +390,26 @@ struct kvm_get_htab_header {
__u16 n_invalid;
};
/* Per-vcpu XICS interrupt controller state */
#define KVM_REG_PPC_ICP_STATE (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8c)
#define KVM_REG_PPC_ICP_CPPR_SHIFT 56 /* current proc priority */
#define KVM_REG_PPC_ICP_CPPR_MASK 0xff
#define KVM_REG_PPC_ICP_XISR_SHIFT 32 /* interrupt status field */
#define KVM_REG_PPC_ICP_XISR_MASK 0xffffff
#define KVM_REG_PPC_ICP_MFRR_SHIFT 24 /* pending IPI priority */
#define KVM_REG_PPC_ICP_MFRR_MASK 0xff
#define KVM_REG_PPC_ICP_PPRI_SHIFT 16 /* pending irq priority */
#define KVM_REG_PPC_ICP_PPRI_MASK 0xff
/* Device control API: PPC-specific devices */
#define KVM_DEV_MPIC_GRP_MISC 1
#define KVM_DEV_MPIC_BASE_ADDR 0 /* 64-bit */
#define KVM_DEV_MPIC_GRP_REGISTER 2 /* 32-bit */
#define KVM_DEV_MPIC_GRP_IRQ_ACTIVE 3 /* 32-bit */
/* One-Reg API: PPC-specific registers */
#define KVM_REG_PPC_HIOR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
#define KVM_REG_PPC_IAC1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x2)
#define KVM_REG_PPC_IAC2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x3)
......@@ -417,4 +468,47 @@ struct kvm_get_htab_header {
#define KVM_REG_PPC_EPCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x85)
#define KVM_REG_PPC_EPR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x86)
/* Timer Status Register OR/CLEAR interface */
#define KVM_REG_PPC_OR_TSR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x87)
#define KVM_REG_PPC_CLEAR_TSR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x88)
#define KVM_REG_PPC_TCR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x89)
#define KVM_REG_PPC_TSR (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8a)
/* Debugging: Special instruction for software breakpoint */
#define KVM_REG_PPC_DEBUG_INST (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8b)
/* MMU registers */
#define KVM_REG_PPC_MAS0 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8c)
#define KVM_REG_PPC_MAS1 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x8d)
#define KVM_REG_PPC_MAS2 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8e)
#define KVM_REG_PPC_MAS7_3 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8f)
#define KVM_REG_PPC_MAS4 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x90)
#define KVM_REG_PPC_MAS6 (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x91)
#define KVM_REG_PPC_MMUCFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x92)
/*
* TLBnCFG fields TLBnCFG_N_ENTRY and TLBnCFG_ASSOC can be changed only using
* KVM_CAP_SW_TLB ioctl
*/
#define KVM_REG_PPC_TLB0CFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x93)
#define KVM_REG_PPC_TLB1CFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x94)
#define KVM_REG_PPC_TLB2CFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x95)
#define KVM_REG_PPC_TLB3CFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x96)
#define KVM_REG_PPC_TLB0PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x97)
#define KVM_REG_PPC_TLB1PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x98)
#define KVM_REG_PPC_TLB2PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x99)
#define KVM_REG_PPC_TLB3PS (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9a)
#define KVM_REG_PPC_EPTCFG (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0x9b)
/* PPC64 eXternal Interrupt Controller Specification */
#define KVM_DEV_XICS_GRP_SOURCES 1 /* 64-bit source attributes */
/* Layout of 64-bit source attribute values */
#define KVM_XICS_DESTINATION_SHIFT 0
#define KVM_XICS_DESTINATION_MASK 0xffffffffULL
#define KVM_XICS_PRIORITY_SHIFT 32
#define KVM_XICS_PRIORITY_MASK 0xff
#define KVM_XICS_LEVEL_SENSITIVE (1ULL << 40)
#define KVM_XICS_MASKED (1ULL << 41)
#define KVM_XICS_PENDING (1ULL << 42)
#endif /* __LINUX_KVM_POWERPC_H */
......@@ -480,6 +480,7 @@ int main(void)
DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
DEFINE(VCPU_VPA_DIRTY, offsetof(struct kvm_vcpu, arch.vpa.dirty));
#endif
#ifdef CONFIG_PPC_BOOK3S
DEFINE(VCPU_VCPUID, offsetof(struct kvm_vcpu, vcpu_id));
......@@ -576,6 +577,8 @@ int main(void)
HSTATE_FIELD(HSTATE_KVM_VCPU, kvm_vcpu);
HSTATE_FIELD(HSTATE_KVM_VCORE, kvm_vcore);
HSTATE_FIELD(HSTATE_XICS_PHYS, xics_phys);
HSTATE_FIELD(HSTATE_SAVED_XIRR, saved_xirr);
HSTATE_FIELD(HSTATE_HOST_IPI, host_ipi);
HSTATE_FIELD(HSTATE_MMCR, host_mmcr);
HSTATE_FIELD(HSTATE_PMC, host_pmc);
HSTATE_FIELD(HSTATE_PURR, host_purr);
......@@ -599,6 +602,7 @@ int main(void)
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save));
#endif /* CONFIG_PPC_BOOK3S */
#endif /* CONFIG_KVM */
......
......@@ -124,6 +124,18 @@ int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
return kvmppc_set_sregs_ivor(vcpu, sregs);
}
int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
return -EINVAL;
}
int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
return -EINVAL;
}
struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
{
struct kvmppc_vcpu_44x *vcpu_44x;
......
......@@ -136,21 +136,41 @@ config KVM_E500V2
If unsure, say N.
config KVM_E500MC
bool "KVM support for PowerPC E500MC/E5500 processors"
bool "KVM support for PowerPC E500MC/E5500/E6500 processors"
depends on PPC_E500MC
select KVM
select KVM_MMIO
select KVM_BOOKE_HV
select MMU_NOTIFIER
---help---
Support running unmodified E500MC/E5500 (32-bit) guest kernels in
virtual machines on E500MC/E5500 host processors.
Support running unmodified E500MC/E5500/E6500 guest kernels in
virtual machines on E500MC/E5500/E6500 host processors.
This module provides access to the hardware capabilities through
a character device node named /dev/kvm.
If unsure, say N.
config KVM_MPIC
bool "KVM in-kernel MPIC emulation"
depends on KVM && E500
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_MSI
help
Enable support for emulating MPIC devices inside the
host kernel, rather than relying on userspace to emulate.
Currently, support is limited to certain versions of
Freescale's MPIC implementation.
config KVM_XICS
bool "KVM in-kernel XICS emulation"
depends on KVM_BOOK3S_64 && !KVM_MPIC
---help---
Include support for the XICS (eXternal Interrupt Controller
Specification) interrupt controller architecture used on
IBM POWER (pSeries) servers.
source drivers/vhost/Kconfig
endif # VIRTUALIZATION
......@@ -72,12 +72,18 @@ kvm-book3s_64-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
book3s_hv.o \
book3s_hv_interrupts.o \
book3s_64_mmu_hv.o
kvm-book3s_64-builtin-xics-objs-$(CONFIG_KVM_XICS) := \
book3s_hv_rm_xics.o
kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HV) := \
book3s_hv_rmhandlers.o \
book3s_hv_rm_mmu.o \
book3s_64_vio_hv.o \
book3s_hv_ras.o \
book3s_hv_builtin.o
book3s_hv_builtin.o \
$(kvm-book3s_64-builtin-xics-objs-y)
kvm-book3s_64-objs-$(CONFIG_KVM_XICS) += \
book3s_xics.o
kvm-book3s_64-module-objs := \
../../../virt/kvm/kvm_main.o \
......@@ -86,6 +92,7 @@ kvm-book3s_64-module-objs := \
emulate.o \
book3s.o \
book3s_64_vio.o \
book3s_rtas.o \
$(kvm-book3s_64-objs-y)
kvm-objs-$(CONFIG_KVM_BOOK3S_64) := $(kvm-book3s_64-module-objs)
......@@ -103,6 +110,9 @@ kvm-book3s_32-objs := \
book3s_32_mmu.o
kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs)
kvm-objs-$(CONFIG_KVM_MPIC) += mpic.o
kvm-objs-$(CONFIG_HAVE_KVM_IRQ_ROUTING) += $(addprefix ../../../virt/kvm/, irqchip.o)
kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
obj-$(CONFIG_KVM_440) += kvm.o
......
......@@ -104,7 +104,7 @@ static int kvmppc_book3s_vec2irqprio(unsigned int vec)
return prio;
}
static void kvmppc_book3s_dequeue_irqprio(struct kvm_vcpu *vcpu,
void kvmppc_book3s_dequeue_irqprio(struct kvm_vcpu *vcpu,
unsigned int vec)
{
unsigned long old_pending = vcpu->arch.pending_exceptions;
......@@ -160,8 +160,7 @@ void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
kvmppc_book3s_queue_irqprio(vcpu, vec);
}
void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq)
void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu)
{
kvmppc_book3s_dequeue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL);
kvmppc_book3s_dequeue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL_LEVEL);
......@@ -530,6 +529,21 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
val = get_reg_val(reg->id, vcpu->arch.vscr.u[3]);
break;
#endif /* CONFIG_ALTIVEC */
case KVM_REG_PPC_DEBUG_INST: {
u32 opcode = INS_TW;
r = copy_to_user((u32 __user *)(long)reg->addr,
&opcode, sizeof(u32));
break;
}
#ifdef CONFIG_KVM_XICS
case KVM_REG_PPC_ICP_STATE:
if (!vcpu->arch.icp) {
r = -ENXIO;
break;
}
val = get_reg_val(reg->id, kvmppc_xics_get_icp(vcpu));
break;
#endif /* CONFIG_KVM_XICS */
default:
r = -EINVAL;
break;
......@@ -592,6 +606,16 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
vcpu->arch.vscr.u[3] = set_reg_val(reg->id, val);
break;
#endif /* CONFIG_ALTIVEC */
#ifdef CONFIG_KVM_XICS
case KVM_REG_PPC_ICP_STATE:
if (!vcpu->arch.icp) {
r = -ENXIO;
break;
}
r = kvmppc_xics_set_icp(vcpu,
set_reg_val(reg->id, val));
break;
#endif /* CONFIG_KVM_XICS */
default:
r = -EINVAL;
break;
......@@ -607,6 +631,12 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
return 0;
}
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
return -EINVAL;
}
void kvmppc_decrementer_func(unsigned long data)
{
struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
......
......@@ -893,7 +893,10 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
/* Harvest R and C */
rcbits = hptep[1] & (HPTE_R_R | HPTE_R_C);
*rmapp |= rcbits << KVMPPC_RMAP_RC_SHIFT;
rev[i].guest_rpte = ptel | rcbits;
if (rcbits & ~rev[i].guest_rpte) {
rev[i].guest_rpte = ptel | rcbits;
note_hpte_modification(kvm, &rev[i]);
}
}
unlock_rmap(rmapp);
hptep[0] &= ~HPTE_V_HVLOCK;
......@@ -976,7 +979,10 @@ static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
/* Now check and modify the HPTE */
if ((hptep[0] & HPTE_V_VALID) && (hptep[1] & HPTE_R_R)) {
kvmppc_clear_ref_hpte(kvm, hptep, i);
rev[i].guest_rpte |= HPTE_R_R;
if (!(rev[i].guest_rpte & HPTE_R_R)) {
rev[i].guest_rpte |= HPTE_R_R;
note_hpte_modification(kvm, &rev[i]);
}
ret = 1;
}
hptep[0] &= ~HPTE_V_HVLOCK;
......@@ -1080,7 +1086,10 @@ static int kvm_test_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
hptep[1] &= ~HPTE_R_C;
eieio();
hptep[0] = (hptep[0] & ~HPTE_V_ABSENT) | HPTE_V_VALID;
rev[i].guest_rpte |= HPTE_R_C;
if (!(rev[i].guest_rpte & HPTE_R_C)) {
rev[i].guest_rpte |= HPTE_R_C;
note_hpte_modification(kvm, &rev[i]);
}
ret = 1;
}
hptep[0] &= ~HPTE_V_HVLOCK;
......@@ -1090,11 +1099,30 @@ static int kvm_test_clear_dirty(struct kvm *kvm, unsigned long *rmapp)
return ret;
}
static void harvest_vpa_dirty(struct kvmppc_vpa *vpa,
struct kvm_memory_slot *memslot,
unsigned long *map)
{
unsigned long gfn;
if (!vpa->dirty || !vpa->pinned_addr)
return;
gfn = vpa->gpa >> PAGE_SHIFT;
if (gfn < memslot->base_gfn ||
gfn >= memslot->base_gfn + memslot->npages)
return;
vpa->dirty = false;
if (map)
__set_bit_le(gfn - memslot->base_gfn, map);
}
long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot,
unsigned long *map)
{
unsigned long i;
unsigned long *rmapp;
struct kvm_vcpu *vcpu;
preempt_disable();
rmapp = memslot->arch.rmap;
......@@ -1103,6 +1131,15 @@ long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot,
__set_bit_le(i, map);
++rmapp;
}
/* Harvest dirty bits from VPA and DTL updates */
/* Note: we never modify the SLB shadow buffer areas */
kvm_for_each_vcpu(i, vcpu, kvm) {
spin_lock(&vcpu->arch.vpa_update_lock);
harvest_vpa_dirty(&vcpu->arch.vpa, memslot, map);
harvest_vpa_dirty(&vcpu->arch.dtl, memslot, map);
spin_unlock(&vcpu->arch.vpa_update_lock);
}
preempt_enable();
return 0;
}
......@@ -1114,7 +1151,7 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
unsigned long gfn = gpa >> PAGE_SHIFT;
struct page *page, *pages[1];
int npages;
unsigned long hva, psize, offset;
unsigned long hva, offset;
unsigned long pa;
unsigned long *physp;
int srcu_idx;
......@@ -1146,14 +1183,9 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
}
srcu_read_unlock(&kvm->srcu, srcu_idx);
psize = PAGE_SIZE;
if (PageHuge(page)) {
page = compound_head(page);
psize <<= compound_order(page);
}
offset = gpa & (psize - 1);
offset = gpa & (PAGE_SIZE - 1);
if (nb_ret)
*nb_ret = psize - offset;
*nb_ret = PAGE_SIZE - offset;
return page_address(page) + offset;
err:
......@@ -1161,11 +1193,31 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa,
return NULL;
}
void kvmppc_unpin_guest_page(struct kvm *kvm, void *va)
void kvmppc_unpin_guest_page(struct kvm *kvm, void *va, unsigned long gpa,
bool dirty)
{
struct page *page = virt_to_page(va);
struct kvm_memory_slot *memslot;
unsigned long gfn;
unsigned long *rmap;
int srcu_idx;
put_page(page);
if (!dirty || !kvm->arch.using_mmu_notifiers)
return;
/* We need to mark this page dirty in the rmap chain */
gfn = gpa >> PAGE_SHIFT;
srcu_idx = srcu_read_lock(&kvm->srcu);
memslot = gfn_to_memslot(kvm, gfn);
if (memslot) {
rmap = &memslot->arch.rmap[gfn - memslot->base_gfn];
lock_rmap(rmap);
*rmap |= KVMPPC_RMAP_CHANGED;
unlock_rmap(rmap);
}
srcu_read_unlock(&kvm->srcu, srcu_idx);
}
/*
......@@ -1193,16 +1245,36 @@ struct kvm_htab_ctx {
#define HPTE_SIZE (2 * sizeof(unsigned long))
/*
* Returns 1 if this HPT entry has been modified or has pending
* R/C bit changes.
*/
static int hpte_dirty(struct revmap_entry *revp, unsigned long *hptp)
{
unsigned long rcbits_unset;
if (revp->guest_rpte & HPTE_GR_MODIFIED)
return 1;
/* Also need to consider changes in reference and changed bits */
rcbits_unset = ~revp->guest_rpte & (HPTE_R_R | HPTE_R_C);
if ((hptp[0] & HPTE_V_VALID) && (hptp[1] & rcbits_unset))
return 1;
return 0;
}
static long record_hpte(unsigned long flags, unsigned long *hptp,
unsigned long *hpte, struct revmap_entry *revp,
int want_valid, int first_pass)
{
unsigned long v, r;
unsigned long rcbits_unset;
int ok = 1;
int valid, dirty;
/* Unmodified entries are uninteresting except on the first pass */
dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
dirty = hpte_dirty(revp, hptp);
if (!first_pass && !dirty)
return 0;
......@@ -1223,16 +1295,28 @@ static long record_hpte(unsigned long flags, unsigned long *hptp,
while (!try_lock_hpte(hptp, HPTE_V_HVLOCK))
cpu_relax();
v = hptp[0];
/* re-evaluate valid and dirty from synchronized HPTE value */
valid = !!(v & HPTE_V_VALID);
dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
/* Harvest R and C into guest view if necessary */
rcbits_unset = ~revp->guest_rpte & (HPTE_R_R | HPTE_R_C);
if (valid && (rcbits_unset & hptp[1])) {
revp->guest_rpte |= (hptp[1] & (HPTE_R_R | HPTE_R_C)) |
HPTE_GR_MODIFIED;
dirty = 1;
}
if (v & HPTE_V_ABSENT) {
v &= ~HPTE_V_ABSENT;
v |= HPTE_V_VALID;
valid = 1;
}
/* re-evaluate valid and dirty from synchronized HPTE value */
valid = !!(v & HPTE_V_VALID);
if ((flags & KVM_GET_HTAB_BOLTED_ONLY) && !(v & HPTE_V_BOLTED))
valid = 0;
r = revp->guest_rpte | (hptp[1] & (HPTE_R_R | HPTE_R_C));
dirty = !!(revp->guest_rpte & HPTE_GR_MODIFIED);
r = revp->guest_rpte;
/* only clear modified if this is the right sort of entry */
if (valid == want_valid && dirty) {
r &= ~HPTE_GR_MODIFIED;
......@@ -1288,7 +1372,7 @@ static ssize_t kvm_htab_read(struct file *file, char __user *buf,
/* Skip uninteresting entries, i.e. clean on not-first pass */
if (!first_pass) {
while (i < kvm->arch.hpt_npte &&
!(revp->guest_rpte & HPTE_GR_MODIFIED)) {
!hpte_dirty(revp, hptp)) {
++i;
hptp += 2;
++revp;
......
......@@ -194,7 +194,9 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu,
run->papr_hcall.args[i] = gpr;
}
emulated = EMULATE_DO_PAPR;
run->exit_reason = KVM_EXIT_PAPR_HCALL;
vcpu->arch.hcall_needed = 1;
emulated = EMULATE_EXIT_USER;
break;
}
#endif
......
......@@ -66,6 +66,31 @@
static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
void kvmppc_fast_vcpu_kick(struct kvm_vcpu *vcpu)
{
int me;
int cpu = vcpu->cpu;
wait_queue_head_t *wqp;
wqp = kvm_arch_vcpu_wq(vcpu);
if (waitqueue_active(wqp)) {
wake_up_interruptible(wqp);
++vcpu->stat.halt_wakeup;
}
me = get_cpu();
/* CPU points to the first thread of the core */
if (cpu != me && cpu >= 0 && cpu < nr_cpu_ids) {
int real_cpu = cpu + vcpu->arch.ptid;
if (paca[real_cpu].kvm_hstate.xics_phys)
xics_wake_cpu(real_cpu);
else if (cpu_online(cpu))
smp_send_reschedule(cpu);
}
put_cpu();
}
/*
* We use the vcpu_load/put functions to measure stolen time.
* Stolen time is counted as time when either the vcpu is able to
......@@ -259,7 +284,7 @@ static unsigned long do_h_register_vpa(struct kvm_vcpu *vcpu,
len = ((struct reg_vpa *)va)->length.hword;
else
len = ((struct reg_vpa *)va)->length.word;
kvmppc_unpin_guest_page(kvm, va);
kvmppc_unpin_guest_page(kvm, va, vpa, false);
/* Check length */
if (len > nb || len < sizeof(struct reg_vpa))
......@@ -359,13 +384,13 @@ static void kvmppc_update_vpa(struct kvm_vcpu *vcpu, struct kvmppc_vpa *vpap)
va = NULL;
nb = 0;
if (gpa)
va = kvmppc_pin_guest_page(kvm, vpap->next_gpa, &nb);
va = kvmppc_pin_guest_page(kvm, gpa, &nb);
spin_lock(&vcpu->arch.vpa_update_lock);
if (gpa == vpap->next_gpa)
break;
/* sigh... unpin that one and try again */
if (va)
kvmppc_unpin_guest_page(kvm, va);
kvmppc_unpin_guest_page(kvm, va, gpa, false);
}
vpap->update_pending = 0;
......@@ -375,12 +400,15 @@ static void kvmppc_update_vpa(struct kvm_vcpu *vcpu, struct kvmppc_vpa *vpap)
* has changed the mappings underlying guest memory,
* so unregister the region.
*/
kvmppc_unpin_guest_page(kvm, va);
kvmppc_unpin_guest_page(kvm, va, gpa, false);
va = NULL;
}
if (vpap->pinned_addr)
kvmppc_unpin_guest_page(kvm, vpap->pinned_addr);
kvmppc_unpin_guest_page(kvm, vpap->pinned_addr, vpap->gpa,
vpap->dirty);
vpap->gpa = gpa;
vpap->pinned_addr = va;
vpap->dirty = false;
if (va)
vpap->pinned_end = va + vpap->len;
}
......@@ -472,6 +500,7 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
/* order writing *dt vs. writing vpa->dtl_idx */
smp_wmb();
vpa->dtl_idx = ++vcpu->arch.dtl_index;
vcpu->arch.dtl.dirty = true;
}
int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
......@@ -479,7 +508,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
unsigned long req = kvmppc_get_gpr(vcpu, 3);
unsigned long target, ret = H_SUCCESS;
struct kvm_vcpu *tvcpu;
int idx;
int idx, rc;
switch (req) {
case H_ENTER:
......@@ -515,6 +544,28 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
kvmppc_get_gpr(vcpu, 5),
kvmppc_get_gpr(vcpu, 6));
break;
case H_RTAS:
if (list_empty(&vcpu->kvm->arch.rtas_tokens))
return RESUME_HOST;
rc = kvmppc_rtas_hcall(vcpu);
if (rc == -ENOENT)
return RESUME_HOST;
else if (rc == 0)
break;
/* Send the error out to userspace via KVM_RUN */
return rc;
case H_XIRR:
case H_CPPR:
case H_EOI:
case H_IPI:
if (kvmppc_xics_enabled(vcpu)) {
ret = kvmppc_xics_hcall(vcpu, req);
break;
} /* fallthrough */
default:
return RESUME_HOST;
}
......@@ -913,15 +964,19 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
return ERR_PTR(err);
}
static void unpin_vpa(struct kvm *kvm, struct kvmppc_vpa *vpa)
{
if (vpa->pinned_addr)
kvmppc_unpin_guest_page(kvm, vpa->pinned_addr, vpa->gpa,
vpa->dirty);
}
void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu)
{
spin_lock(&vcpu->arch.vpa_update_lock);
if (vcpu->arch.dtl.pinned_addr)
kvmppc_unpin_guest_page(vcpu->kvm, vcpu->arch.dtl.pinned_addr);
if (vcpu->arch.slb_shadow.pinned_addr)
kvmppc_unpin_guest_page(vcpu->kvm, vcpu->arch.slb_shadow.pinned_addr);
if (vcpu->arch.vpa.pinned_addr)
kvmppc_unpin_guest_page(vcpu->kvm, vcpu->arch.vpa.pinned_addr);
unpin_vpa(vcpu->kvm, &vcpu->arch.dtl);
unpin_vpa(vcpu->kvm, &vcpu->arch.slb_shadow);
unpin_vpa(vcpu->kvm, &vcpu->arch.vpa);
spin_unlock(&vcpu->arch.vpa_update_lock);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, vcpu);
......@@ -955,7 +1010,6 @@ static void kvmppc_end_cede(struct kvm_vcpu *vcpu)
}
extern int __kvmppc_vcore_entry(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
extern void xics_wake_cpu(int cpu);
static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
struct kvm_vcpu *vcpu)
......@@ -1330,9 +1384,12 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
break;
vc->runner = vcpu;
n_ceded = 0;
list_for_each_entry(v, &vc->runnable_threads, arch.run_list)
list_for_each_entry(v, &vc->runnable_threads, arch.run_list) {
if (!v->arch.pending_exceptions)
n_ceded += v->arch.ceded;
else
v->arch.ceded = 0;
}
if (n_ceded == vc->n_runnable)
kvmppc_vcore_blocked(vc);
else
......@@ -1645,12 +1702,12 @@ int kvmppc_core_prepare_memory_region(struct kvm *kvm,
void kvmppc_core_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old)
const struct kvm_memory_slot *old)
{
unsigned long npages = mem->memory_size >> PAGE_SHIFT;
struct kvm_memory_slot *memslot;
if (npages && old.npages) {
if (npages && old->npages) {
/*
* If modifying a memslot, reset all the rmap dirty bits.
* If this is a new memslot, we don't need to do anything
......@@ -1827,6 +1884,7 @@ int kvmppc_core_init_vm(struct kvm *kvm)
cpumask_setall(&kvm->arch.need_tlb_flush);
INIT_LIST_HEAD(&kvm->arch.spapr_tce_tables);
INIT_LIST_HEAD(&kvm->arch.rtas_tokens);
kvm->arch.rma = NULL;
......@@ -1872,6 +1930,8 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
kvm->arch.rma = NULL;
}
kvmppc_rtas_tokens_free(kvm);
kvmppc_free_hpt(kvm);
WARN_ON(!list_empty(&kvm->arch.spapr_tce_tables));
}
......
......@@ -97,17 +97,6 @@ void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
}
EXPORT_SYMBOL_GPL(kvmppc_add_revmap_chain);
/*
* Note modification of an HPTE; set the HPTE modified bit
* if anyone is interested.
*/
static inline void note_hpte_modification(struct kvm *kvm,
struct revmap_entry *rev)
{
if (atomic_read(&kvm->arch.hpte_mod_interest))
rev->guest_rpte |= HPTE_GR_MODIFIED;
}
/* Remove this HPTE from the chain for a real page */
static void remove_revmap_chain(struct kvm *kvm, long pte_index,
struct revmap_entry *rev,
......
This diff is collapsed.
......@@ -79,10 +79,6 @@ _GLOBAL(kvmppc_hv_entry_trampoline)
* *
*****************************************************************************/
#define XICS_XIRR 4
#define XICS_QIRR 0xc
#define XICS_IPI 2 /* interrupt source # for IPIs */
/*
* We come in here when wakened from nap mode on a secondary hw thread.
* Relocation is off and most register values are lost.
......@@ -101,50 +97,51 @@ kvm_start_guest:
li r0,1
stb r0,PACA_NAPSTATELOST(r13)
/* get vcpu pointer, NULL if we have no vcpu to run */
ld r4,HSTATE_KVM_VCPU(r13)
cmpdi cr1,r4,0
/* were we napping due to cede? */
lbz r0,HSTATE_NAPPING(r13)
cmpwi r0,0
bne kvm_end_cede
/*
* We weren't napping due to cede, so this must be a secondary
* thread being woken up to run a guest, or being woken up due
* to a stray IPI. (Or due to some machine check or hypervisor
* maintenance interrupt while the core is in KVM.)
*/
/* Check the wake reason in SRR1 to see why we got here */
mfspr r3,SPRN_SRR1
rlwinm r3,r3,44-31,0x7 /* extract wake reason field */
cmpwi r3,4 /* was it an external interrupt? */
bne 27f
/*
* External interrupt - for now assume it is an IPI, since we
* should never get any other interrupts sent to offline threads.
* Only do this for secondary threads.
*/
beq cr1,25f
lwz r3,VCPU_PTID(r4)
cmpwi r3,0
beq 27f
25: ld r5,HSTATE_XICS_PHYS(r13)
li r0,0xff
li r6,XICS_QIRR
li r7,XICS_XIRR
bne 27f /* if not */
ld r5,HSTATE_XICS_PHYS(r13)
li r7,XICS_XIRR /* if it was an external interrupt, */
lwzcix r8,r5,r7 /* get and ack the interrupt */
sync
clrldi. r9,r8,40 /* get interrupt source ID. */
beq 27f /* none there? */
cmpwi r9,XICS_IPI
bne 26f
beq 28f /* none there? */
cmpwi r9,XICS_IPI /* was it an IPI? */
bne 29f
li r0,0xff
li r6,XICS_MFRR
stbcix r0,r5,r6 /* clear IPI */
26: stwcix r8,r5,r7 /* EOI the interrupt */
27: /* XXX should handle hypervisor maintenance interrupts etc. here */
stwcix r8,r5,r7 /* EOI the interrupt */
sync /* order loading of vcpu after that */
/* reload vcpu pointer after clearing the IPI */
/* get vcpu pointer, NULL if we have no vcpu to run */
ld r4,HSTATE_KVM_VCPU(r13)
cmpdi r4,0
/* if we have no vcpu to run, go back to sleep */
beq kvm_no_guest
b kvmppc_hv_entry
/* were we napping due to cede? */
lbz r0,HSTATE_NAPPING(r13)
cmpwi r0,0
bne kvm_end_cede
27: /* XXX should handle hypervisor maintenance interrupts etc. here */
b kvm_no_guest
28: /* SRR1 said external but ICP said nope?? */
b kvm_no_guest
29: /* External non-IPI interrupt to offline secondary thread? help?? */
stw r8,HSTATE_SAVED_XIRR(r13)
b kvm_no_guest
.global kvmppc_hv_entry
kvmppc_hv_entry:
......@@ -260,6 +257,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
lwz r5, LPPACA_YIELDCOUNT(r3)
addi r5, r5, 1
stw r5, LPPACA_YIELDCOUNT(r3)
li r6, 1
stb r6, VCPU_VPA_DIRTY(r4)
25:
/* Load up DAR and DSISR */
ld r5, VCPU_DAR(r4)
......@@ -485,20 +484,20 @@ toc_tlbie_lock:
mtctr r6
mtxer r7
ld r10, VCPU_PC(r4)
ld r11, VCPU_MSR(r4)
kvmppc_cede_reentry: /* r4 = vcpu, r13 = paca */
ld r6, VCPU_SRR0(r4)
ld r7, VCPU_SRR1(r4)
ld r10, VCPU_PC(r4)
ld r11, VCPU_MSR(r4) /* r11 = vcpu->arch.msr & ~MSR_HV */
/* r11 = vcpu->arch.msr & ~MSR_HV */
rldicl r11, r11, 63 - MSR_HV_LG, 1
rotldi r11, r11, 1 + MSR_HV_LG
ori r11, r11, MSR_ME
/* Check if we can deliver an external or decrementer interrupt now */
ld r0,VCPU_PENDING_EXC(r4)
li r8,(1 << BOOK3S_IRQPRIO_EXTERNAL)
oris r8,r8,(1 << BOOK3S_IRQPRIO_EXTERNAL_LEVEL)@h
lis r8,(1 << BOOK3S_IRQPRIO_EXTERNAL_LEVEL)@h
and r0,r0,r8
cmpdi cr1,r0,0
andi. r0,r11,MSR_EE
......@@ -526,10 +525,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
/* Move SRR0 and SRR1 into the respective regs */
5: mtspr SPRN_SRR0, r6
mtspr SPRN_SRR1, r7
li r0,0
stb r0,VCPU_CEDED(r4) /* cancel cede */
fast_guest_return:
li r0,0
stb r0,VCPU_CEDED(r4) /* cancel cede */
mtspr SPRN_HSRR0,r10
mtspr SPRN_HSRR1,r11
......@@ -676,17 +675,99 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
cmpwi r12,BOOK3S_INTERRUPT_SYSCALL
beq hcall_try_real_mode
/* Check for mediated interrupts (could be done earlier really ...) */
/* Only handle external interrupts here on arch 206 and later */
BEGIN_FTR_SECTION
cmpwi r12,BOOK3S_INTERRUPT_EXTERNAL
bne+ 1f
andi. r0,r11,MSR_EE
beq 1f
mfspr r5,SPRN_LPCR
andi. r0,r5,LPCR_MER
bne bounce_ext_interrupt
1:
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
b ext_interrupt_to_host
END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_206)
/* External interrupt ? */
cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL
bne+ ext_interrupt_to_host
/* External interrupt, first check for host_ipi. If this is
* set, we know the host wants us out so let's do it now
*/
do_ext_interrupt:
lbz r0, HSTATE_HOST_IPI(r13)
cmpwi r0, 0
bne ext_interrupt_to_host
/* Now read the interrupt from the ICP */
ld r5, HSTATE_XICS_PHYS(r13)
li r7, XICS_XIRR
cmpdi r5, 0
beq- ext_interrupt_to_host
lwzcix r3, r5, r7
rlwinm. r0, r3, 0, 0xffffff
sync
beq 3f /* if nothing pending in the ICP */
/* We found something in the ICP...
*
* If it's not an IPI, stash it in the PACA and return to
* the host, we don't (yet) handle directing real external
* interrupts directly to the guest
*/
cmpwi r0, XICS_IPI
bne ext_stash_for_host
/* It's an IPI, clear the MFRR and EOI it */
li r0, 0xff
li r6, XICS_MFRR
stbcix r0, r5, r6 /* clear the IPI */
stwcix r3, r5, r7 /* EOI it */
sync
/* We need to re-check host IPI now in case it got set in the
* meantime. If it's clear, we bounce the interrupt to the
* guest
*/
lbz r0, HSTATE_HOST_IPI(r13)
cmpwi r0, 0
bne- 1f
/* Allright, looks like an IPI for the guest, we need to set MER */
3:
/* Check if any CPU is heading out to the host, if so head out too */
ld r5, HSTATE_KVM_VCORE(r13)
lwz r0, VCORE_ENTRY_EXIT(r5)
cmpwi r0, 0x100
bge ext_interrupt_to_host
/* See if there is a pending interrupt for the guest */
mfspr r8, SPRN_LPCR
ld r0, VCPU_PENDING_EXC(r9)
/* Insert EXTERNAL_LEVEL bit into LPCR at the MER bit position */
rldicl. r0, r0, 64 - BOOK3S_IRQPRIO_EXTERNAL_LEVEL, 63
rldimi r8, r0, LPCR_MER_SH, 63 - LPCR_MER_SH
beq 2f
/* And if the guest EE is set, we can deliver immediately, else
* we return to the guest with MER set
*/
andi. r0, r11, MSR_EE
beq 2f
mtspr SPRN_SRR0, r10
mtspr SPRN_SRR1, r11
li r10, BOOK3S_INTERRUPT_EXTERNAL
li r11, (MSR_ME << 1) | 1 /* synthesize MSR_SF | MSR_ME */
rotldi r11, r11, 63
2: mr r4, r9
mtspr SPRN_LPCR, r8
b fast_guest_return
/* We raced with the host, we need to resend that IPI, bummer */
1: li r0, IPI_PRIORITY
stbcix r0, r5, r6 /* set the IPI */
sync
b ext_interrupt_to_host
ext_stash_for_host:
/* It's not an IPI and it's for the host, stash it in the PACA
* before exit, it will be picked up by the host ICP driver
*/
stw r3, HSTATE_SAVED_XIRR(r13)
ext_interrupt_to_host:
guest_exit_cont: /* r9 = vcpu, r12 = trap, r13 = paca */
/* Save DEC */
......@@ -829,7 +910,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_201)
beq 44f
ld r8,HSTATE_XICS_PHYS(r6) /* get thread's XICS reg addr */
li r0,IPI_PRIORITY
li r7,XICS_QIRR
li r7,XICS_MFRR
stbcix r0,r7,r8 /* trigger the IPI */
44: srdi. r3,r3,1
addi r6,r6,PACA_SIZE
......@@ -1018,6 +1099,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
lwz r3, LPPACA_YIELDCOUNT(r8)
addi r3, r3, 1
stw r3, LPPACA_YIELDCOUNT(r8)
li r3, 1
stb r3, VCPU_VPA_DIRTY(r9)
25:
/* Save PMU registers if requested */
/* r8 and cr0.eq are live here */
......@@ -1350,11 +1433,19 @@ hcall_real_table:
.long 0 /* 0x58 */
.long 0 /* 0x5c */
.long 0 /* 0x60 */
.long 0 /* 0x64 */
.long 0 /* 0x68 */
.long 0 /* 0x6c */
.long 0 /* 0x70 */
.long 0 /* 0x74 */
#ifdef CONFIG_KVM_XICS
.long .kvmppc_rm_h_eoi - hcall_real_table
.long .kvmppc_rm_h_cppr - hcall_real_table
.long .kvmppc_rm_h_ipi - hcall_real_table
.long 0 /* 0x70 - H_IPOLL */
.long .kvmppc_rm_h_xirr - hcall_real_table
#else
.long 0 /* 0x64 - H_EOI */
.long 0 /* 0x68 - H_CPPR */
.long 0 /* 0x6c - H_IPI */
.long 0 /* 0x70 - H_IPOLL */
.long 0 /* 0x74 - H_XIRR */
#endif
.long 0 /* 0x78 */
.long 0 /* 0x7c */
.long 0 /* 0x80 */
......@@ -1405,15 +1496,6 @@ ignore_hdec:
mr r4,r9
b fast_guest_return
bounce_ext_interrupt:
mr r4,r9
mtspr SPRN_SRR0,r10
mtspr SPRN_SRR1,r11
li r10,BOOK3S_INTERRUPT_EXTERNAL
li r11,(MSR_ME << 1) | 1 /* synthesize MSR_SF | MSR_ME */
rotldi r11,r11,63
b fast_guest_return
_GLOBAL(kvmppc_h_set_dabr)
std r4,VCPU_DABR(r3)
/* Work around P7 bug where DABR can get corrupted on mtspr */
......@@ -1519,6 +1601,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_206)
b .
kvm_end_cede:
/* get vcpu pointer */
ld r4, HSTATE_KVM_VCPU(r13)
/* Woken by external or decrementer interrupt */
ld r1, HSTATE_HOST_R1(r13)
......@@ -1558,6 +1643,16 @@ kvm_end_cede:
li r0,0
stb r0,HSTATE_NAPPING(r13)
/* Check the wake reason in SRR1 to see why we got here */
mfspr r3, SPRN_SRR1
rlwinm r3, r3, 44-31, 0x7 /* extract wake reason field */
cmpwi r3, 4 /* was it an external interrupt? */
li r12, BOOK3S_INTERRUPT_EXTERNAL
mr r9, r4
ld r10, VCPU_PC(r9)
ld r11, VCPU_MSR(r9)
beq do_ext_interrupt /* if so */
/* see if any other thread is already exiting */
lwz r0,VCORE_ENTRY_EXIT(r5)
cmpwi r0,0x100
......@@ -1577,8 +1672,7 @@ kvm_cede_prodded:
/* we've ceded but we want to give control to the host */
kvm_cede_exit:
li r3,H_TOO_HARD
blr
b hcall_real_fallback
/* Try to handle a machine check in real mode */
machine_check_realmode:
......@@ -1626,7 +1720,7 @@ secondary_nap:
beq 37f
sync
li r0, 0xff
li r6, XICS_QIRR
li r6, XICS_MFRR
stbcix r0, r5, r6 /* clear the IPI */
stwcix r3, r5, r7 /* EOI it */
37: sync
......
......@@ -762,9 +762,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
run->exit_reason = KVM_EXIT_MMIO;
r = RESUME_HOST_NV;
break;
case EMULATE_DO_PAPR:
run->exit_reason = KVM_EXIT_PAPR_HCALL;
vcpu->arch.hcall_needed = 1;
case EMULATE_EXIT_USER:
r = RESUME_HOST_NV;
break;
default:
......@@ -1283,7 +1281,7 @@ int kvmppc_core_prepare_memory_region(struct kvm *kvm,
void kvmppc_core_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old)
const struct kvm_memory_slot *old)
{
}
......@@ -1298,6 +1296,7 @@ int kvmppc_core_init_vm(struct kvm *kvm)
{
#ifdef CONFIG_PPC64
INIT_LIST_HEAD(&kvm->arch.spapr_tce_tables);
INIT_LIST_HEAD(&kvm->arch.rtas_tokens);
#endif
if (firmware_has_feature(FW_FEATURE_SET_MODE)) {
......
......@@ -227,6 +227,13 @@ static int kvmppc_h_pr_put_tce(struct kvm_vcpu *vcpu)
return EMULATE_DONE;
}
static int kvmppc_h_pr_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
{
long rc = kvmppc_xics_hcall(vcpu, cmd);
kvmppc_set_gpr(vcpu, 3, rc);
return EMULATE_DONE;
}
int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
{
switch (cmd) {
......@@ -246,6 +253,20 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
vcpu->stat.halt_wakeup++;
return EMULATE_DONE;
case H_XIRR:
case H_CPPR:
case H_EOI:
case H_IPI:
if (kvmppc_xics_enabled(vcpu))
return kvmppc_h_pr_xics_hcall(vcpu, cmd);
break;
case H_RTAS:
if (list_empty(&vcpu->kvm->arch.rtas_tokens))
return RESUME_HOST;
if (kvmppc_rtas_hcall(vcpu))
break;
kvmppc_set_gpr(vcpu, 3, 0);
return EMULATE_DONE;
}
return EMULATE_FAIL;
......
/*
* Copyright 2012 Michael Ellerman, IBM Corporation.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License, version 2, as
* published by the Free Software Foundation.
*/
#include <linux/kernel.h>
#include <linux/kvm_host.h>
#include <linux/kvm.h>
#include <linux/err.h>
#include <asm/uaccess.h>
#include <asm/kvm_book3s.h>
#include <asm/kvm_ppc.h>
#include <asm/hvcall.h>
#include <asm/rtas.h>
#ifdef CONFIG_KVM_XICS
static void kvm_rtas_set_xive(struct kvm_vcpu *vcpu, struct rtas_args *args)
{
u32 irq, server, priority;
int rc;
if (args->nargs != 3 || args->nret != 1) {
rc = -3;
goto out;
}
irq = args->args[0];
server = args->args[1];
priority = args->args[2];
rc = kvmppc_xics_set_xive(vcpu->kvm, irq, server, priority);
if (rc)
rc = -3;
out:
args->rets[0] = rc;
}
static void kvm_rtas_get_xive(struct kvm_vcpu *vcpu, struct rtas_args *args)
{
u32 irq, server, priority;
int rc;
if (args->nargs != 1 || args->nret != 3) {
rc = -3;
goto out;
}
irq = args->args[0];
server = priority = 0;
rc = kvmppc_xics_get_xive(vcpu->kvm, irq, &server, &priority);
if (rc) {
rc = -3;
goto out;
}
args->rets[1] = server;
args->rets[2] = priority;
out:
args->rets[0] = rc;
}
static void kvm_rtas_int_off(struct kvm_vcpu *vcpu, struct rtas_args *args)
{
u32 irq;
int rc;
if (args->nargs != 1 || args->nret != 1) {
rc = -3;
goto out;
}
irq = args->args[0];
rc = kvmppc_xics_int_off(vcpu->kvm, irq);
if (rc)
rc = -3;
out:
args->rets[0] = rc;
}
static void kvm_rtas_int_on(struct kvm_vcpu *vcpu, struct rtas_args *args)
{
u32 irq;
int rc;
if (args->nargs != 1 || args->nret != 1) {
rc = -3;
goto out;
}
irq = args->args[0];
rc = kvmppc_xics_int_on(vcpu->kvm, irq);
if (rc)
rc = -3;
out:
args->rets[0] = rc;
}
#endif /* CONFIG_KVM_XICS */
struct rtas_handler {
void (*handler)(struct kvm_vcpu *vcpu, struct rtas_args *args);
char *name;
};
static struct rtas_handler rtas_handlers[] = {
#ifdef CONFIG_KVM_XICS
{ .name = "ibm,set-xive", .handler = kvm_rtas_set_xive },
{ .name = "ibm,get-xive", .handler = kvm_rtas_get_xive },
{ .name = "ibm,int-off", .handler = kvm_rtas_int_off },
{ .name = "ibm,int-on", .handler = kvm_rtas_int_on },
#endif
};
struct rtas_token_definition {
struct list_head list;
struct rtas_handler *handler;
u64 token;
};
static int rtas_name_matches(char *s1, char *s2)
{
struct kvm_rtas_token_args args;
return !strncmp(s1, s2, sizeof(args.name));
}
static int rtas_token_undefine(struct kvm *kvm, char *name)
{
struct rtas_token_definition *d, *tmp;
lockdep_assert_held(&kvm->lock);
list_for_each_entry_safe(d, tmp, &kvm->arch.rtas_tokens, list) {
if (rtas_name_matches(d->handler->name, name)) {
list_del(&d->list);
kfree(d);
return 0;
}
}
/* It's not an error to undefine an undefined token */
return 0;
}
static int rtas_token_define(struct kvm *kvm, char *name, u64 token)
{
struct rtas_token_definition *d;
struct rtas_handler *h = NULL;
bool found;
int i;
lockdep_assert_held(&kvm->lock);
list_for_each_entry(d, &kvm->arch.rtas_tokens, list) {
if (d->token == token)
return -EEXIST;
}
found = false;
for (i = 0; i < ARRAY_SIZE(rtas_handlers); i++) {
h = &rtas_handlers[i];
if (rtas_name_matches(h->name, name)) {
found = true;
break;
}
}
if (!found)
return -ENOENT;
d = kzalloc(sizeof(*d), GFP_KERNEL);
if (!d)
return -ENOMEM;
d->handler = h;
d->token = token;
list_add_tail(&d->list, &kvm->arch.rtas_tokens);
return 0;
}
int kvm_vm_ioctl_rtas_define_token(struct kvm *kvm, void __user *argp)
{
struct kvm_rtas_token_args args;
int rc;
if (copy_from_user(&args, argp, sizeof(args)))
return -EFAULT;
mutex_lock(&kvm->lock);
if (args.token)
rc = rtas_token_define(kvm, args.name, args.token);
else
rc = rtas_token_undefine(kvm, args.name);
mutex_unlock(&kvm->lock);
return rc;
}
int kvmppc_rtas_hcall(struct kvm_vcpu *vcpu)
{
struct rtas_token_definition *d;
struct rtas_args args;
rtas_arg_t *orig_rets;
gpa_t args_phys;
int rc;
/* r4 contains the guest physical address of the RTAS args */
args_phys = kvmppc_get_gpr(vcpu, 4);
rc = kvm_read_guest(vcpu->kvm, args_phys, &args, sizeof(args));
if (rc)
goto fail;
/*
* args->rets is a pointer into args->args. Now that we've
* copied args we need to fix it up to point into our copy,
* not the guest args. We also need to save the original
* value so we can restore it on the way out.
*/
orig_rets = args.rets;
args.rets = &args.args[args.nargs];
mutex_lock(&vcpu->kvm->lock);
rc = -ENOENT;
list_for_each_entry(d, &vcpu->kvm->arch.rtas_tokens, list) {
if (d->token == args.token) {
d->handler->handler(vcpu, &args);
rc = 0;
break;
}
}
mutex_unlock(&vcpu->kvm->lock);
if (rc == 0) {
args.rets = orig_rets;
rc = kvm_write_guest(vcpu->kvm, args_phys, &args, sizeof(args));
if (rc)
goto fail;
}
return rc;
fail:
/*
* We only get here if the guest has called RTAS with a bogus
* args pointer. That means we can't get to the args, and so we
* can't fail the RTAS call. So fail right out to userspace,
* which should kill the guest.
*/
return rc;
}
void kvmppc_rtas_tokens_free(struct kvm *kvm)
{
struct rtas_token_definition *d, *tmp;
lockdep_assert_held(&kvm->lock);
list_for_each_entry_safe(d, tmp, &kvm->arch.rtas_tokens, list) {
list_del(&d->list);
kfree(d);
}
}
This diff is collapsed.
/*
* Copyright 2012 Michael Ellerman, IBM Corporation.
* Copyright 2012 Benjamin Herrenschmidt, IBM Corporation
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License, version 2, as
* published by the Free Software Foundation.
*/
#ifndef _KVM_PPC_BOOK3S_XICS_H
#define _KVM_PPC_BOOK3S_XICS_H
/*
* We use a two-level tree to store interrupt source information.
* There are up to 1024 ICS nodes, each of which can represent
* 1024 sources.
*/
#define KVMPPC_XICS_MAX_ICS_ID 1023
#define KVMPPC_XICS_ICS_SHIFT 10
#define KVMPPC_XICS_IRQ_PER_ICS (1 << KVMPPC_XICS_ICS_SHIFT)
#define KVMPPC_XICS_SRC_MASK (KVMPPC_XICS_IRQ_PER_ICS - 1)
/*
* Interrupt source numbers below this are reserved, for example
* 0 is "no interrupt", and 2 is used for IPIs.
*/
#define KVMPPC_XICS_FIRST_IRQ 16
#define KVMPPC_XICS_NR_IRQS ((KVMPPC_XICS_MAX_ICS_ID + 1) * \
KVMPPC_XICS_IRQ_PER_ICS)
/* Priority value to use for disabling an interrupt */
#define MASKED 0xff
/* State for one irq source */
struct ics_irq_state {
u32 number;
u32 server;
u8 priority;
u8 saved_priority;
u8 resend;
u8 masked_pending;
u8 asserted; /* Only for LSI */
u8 exists;
};
/* Atomic ICP state, updated with a single compare & swap */
union kvmppc_icp_state {
unsigned long raw;
struct {
u8 out_ee:1;
u8 need_resend:1;
u8 cppr;
u8 mfrr;
u8 pending_pri;
u32 xisr;
};
};
/* One bit per ICS */
#define ICP_RESEND_MAP_SIZE (KVMPPC_XICS_MAX_ICS_ID / BITS_PER_LONG + 1)
struct kvmppc_icp {
struct kvm_vcpu *vcpu;
unsigned long server_num;
union kvmppc_icp_state state;
unsigned long resend_map[ICP_RESEND_MAP_SIZE];
/* Real mode might find something too hard, here's the action
* it might request from virtual mode
*/
#define XICS_RM_KICK_VCPU 0x1
#define XICS_RM_CHECK_RESEND 0x2
#define XICS_RM_REJECT 0x4
u32 rm_action;
struct kvm_vcpu *rm_kick_target;
u32 rm_reject;
/* Debug stuff for real mode */
union kvmppc_icp_state rm_dbgstate;
struct kvm_vcpu *rm_dbgtgt;
};
struct kvmppc_ics {
struct mutex lock;
u16 icsid;
struct ics_irq_state irq_state[KVMPPC_XICS_IRQ_PER_ICS];
};
struct kvmppc_xics {
struct kvm *kvm;
struct kvm_device *dev;
struct dentry *dentry;
u32 max_icsid;
bool real_mode;
bool real_mode_dbg;
struct kvmppc_ics *ics[KVMPPC_XICS_MAX_ICS_ID + 1];
};
static inline struct kvmppc_icp *kvmppc_xics_find_server(struct kvm *kvm,
u32 nr)
{
struct kvm_vcpu *vcpu = NULL;
int i;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (vcpu->arch.icp && nr == vcpu->arch.icp->server_num)
return vcpu->arch.icp;
}
return NULL;
}
static inline struct kvmppc_ics *kvmppc_xics_find_ics(struct kvmppc_xics *xics,
u32 irq, u16 *source)
{
u32 icsid = irq >> KVMPPC_XICS_ICS_SHIFT;
u16 src = irq & KVMPPC_XICS_SRC_MASK;
struct kvmppc_ics *ics;
if (source)
*source = src;
if (icsid > KVMPPC_XICS_MAX_ICS_ID)
return NULL;
ics = xics->ics[icsid];
if (!ics)
return NULL;
return ics;
}
#endif /* _KVM_PPC_BOOK3S_XICS_H */
......@@ -222,8 +222,7 @@ void kvmppc_core_queue_external(struct kvm_vcpu *vcpu,
kvmppc_booke_queue_irqprio(vcpu, prio);
}
void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
struct kvm_interrupt *irq)
void kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu)
{
clear_bit(BOOKE_IRQPRIO_EXTERNAL, &vcpu->arch.pending_exceptions);
clear_bit(BOOKE_IRQPRIO_EXTERNAL_LEVEL, &vcpu->arch.pending_exceptions);
......@@ -347,7 +346,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
keep_irq = true;
}
if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_enabled)
if ((priority == BOOKE_IRQPRIO_EXTERNAL) && vcpu->arch.epr_flags)
update_epr = true;
switch (priority) {
......@@ -428,8 +427,14 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
set_guest_esr(vcpu, vcpu->arch.queued_esr);
if (update_dear == true)
set_guest_dear(vcpu, vcpu->arch.queued_dear);
if (update_epr == true)
kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
if (update_epr == true) {
if (vcpu->arch.epr_flags & KVMPPC_EPR_USER)
kvm_make_request(KVM_REQ_EPR_EXIT, vcpu);
else if (vcpu->arch.epr_flags & KVMPPC_EPR_KERNEL) {
BUG_ON(vcpu->arch.irq_type != KVMPPC_IRQ_MPIC);
kvmppc_mpic_set_epr(vcpu);
}
}
new_msr &= msr_mask;
#if defined(CONFIG_64BIT)
......@@ -746,6 +751,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu)
kvmppc_core_queue_program(vcpu, ESR_PIL);
return RESUME_HOST;
case EMULATE_EXIT_USER:
return RESUME_HOST;
default:
BUG();
}
......@@ -1148,6 +1156,18 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
return r;
}
static void kvmppc_set_tsr(struct kvm_vcpu *vcpu, u32 new_tsr)
{
u32 old_tsr = vcpu->arch.tsr;
vcpu->arch.tsr = new_tsr;
if ((old_tsr ^ vcpu->arch.tsr) & (TSR_ENW | TSR_WIS))
arm_next_watchdog(vcpu);
update_timer_ints(vcpu);
}
/* Initial guest state: 16MB mapping 0 -> 0, PC = 0, MSR = 0, R1 = 16MB */
int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
{
......@@ -1287,16 +1307,8 @@ static int set_sregs_base(struct kvm_vcpu *vcpu,
kvmppc_emulate_dec(vcpu);
}
if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR) {
u32 old_tsr = vcpu->arch.tsr;
vcpu->arch.tsr = sregs->u.e.tsr;
if ((old_tsr ^ vcpu->arch.tsr) & (TSR_ENW | TSR_WIS))
arm_next_watchdog(vcpu);
update_timer_ints(vcpu);
}
if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR)
kvmppc_set_tsr(vcpu, sregs->u.e.tsr);
return 0;
}
......@@ -1409,84 +1421,134 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
{
int r = -EINVAL;
int r = 0;
union kvmppc_one_reg val;
int size;
long int i;
size = one_reg_size(reg->id);
if (size > sizeof(val))
return -EINVAL;
switch (reg->id) {
case KVM_REG_PPC_IAC1:
case KVM_REG_PPC_IAC2:
case KVM_REG_PPC_IAC3:
case KVM_REG_PPC_IAC4: {
int iac = reg->id - KVM_REG_PPC_IAC1;
r = copy_to_user((u64 __user *)(long)reg->addr,
&vcpu->arch.dbg_reg.iac[iac], sizeof(u64));
case KVM_REG_PPC_IAC4:
i = reg->id - KVM_REG_PPC_IAC1;
val = get_reg_val(reg->id, vcpu->arch.dbg_reg.iac[i]);
break;
}
case KVM_REG_PPC_DAC1:
case KVM_REG_PPC_DAC2: {
int dac = reg->id - KVM_REG_PPC_DAC1;
r = copy_to_user((u64 __user *)(long)reg->addr,
&vcpu->arch.dbg_reg.dac[dac], sizeof(u64));
case KVM_REG_PPC_DAC2:
i = reg->id - KVM_REG_PPC_DAC1;
val = get_reg_val(reg->id, vcpu->arch.dbg_reg.dac[i]);
break;
}
case KVM_REG_PPC_EPR: {
u32 epr = get_guest_epr(vcpu);
r = put_user(epr, (u32 __user *)(long)reg->addr);
val = get_reg_val(reg->id, epr);
break;
}
#if defined(CONFIG_64BIT)
case KVM_REG_PPC_EPCR:
r = put_user(vcpu->arch.epcr, (u32 __user *)(long)reg->addr);
val = get_reg_val(reg->id, vcpu->arch.epcr);
break;
#endif
case KVM_REG_PPC_TCR:
val = get_reg_val(reg->id, vcpu->arch.tcr);
break;
case KVM_REG_PPC_TSR:
val = get_reg_val(reg->id, vcpu->arch.tsr);
break;
case KVM_REG_PPC_DEBUG_INST:
val = get_reg_val(reg->id, KVMPPC_INST_EHPRIV);
break;
default:
r = kvmppc_get_one_reg(vcpu, reg->id, &val);
break;
}
if (r)
return r;
if (copy_to_user((char __user *)(unsigned long)reg->addr, &val, size))
r = -EFAULT;
return r;
}
int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
{
int r = -EINVAL;
int r = 0;
union kvmppc_one_reg val;
int size;
long int i;
size = one_reg_size(reg->id);
if (size > sizeof(val))
return -EINVAL;
if (copy_from_user(&val, (char __user *)(unsigned long)reg->addr, size))
return -EFAULT;
switch (reg->id) {
case KVM_REG_PPC_IAC1:
case KVM_REG_PPC_IAC2:
case KVM_REG_PPC_IAC3:
case KVM_REG_PPC_IAC4: {
int iac = reg->id - KVM_REG_PPC_IAC1;
r = copy_from_user(&vcpu->arch.dbg_reg.iac[iac],
(u64 __user *)(long)reg->addr, sizeof(u64));
case KVM_REG_PPC_IAC4:
i = reg->id - KVM_REG_PPC_IAC1;
vcpu->arch.dbg_reg.iac[i] = set_reg_val(reg->id, val);
break;
}
case KVM_REG_PPC_DAC1:
case KVM_REG_PPC_DAC2: {
int dac = reg->id - KVM_REG_PPC_DAC1;
r = copy_from_user(&vcpu->arch.dbg_reg.dac[dac],
(u64 __user *)(long)reg->addr, sizeof(u64));
case KVM_REG_PPC_DAC2:
i = reg->id - KVM_REG_PPC_DAC1;
vcpu->arch.dbg_reg.dac[i] = set_reg_val(reg->id, val);
break;
}
case KVM_REG_PPC_EPR: {
u32 new_epr;
r = get_user(new_epr, (u32 __user *)(long)reg->addr);
if (!r)
kvmppc_set_epr(vcpu, new_epr);
u32 new_epr = set_reg_val(reg->id, val);
kvmppc_set_epr(vcpu, new_epr);
break;
}
#if defined(CONFIG_64BIT)
case KVM_REG_PPC_EPCR: {
u32 new_epcr;
r = get_user(new_epcr, (u32 __user *)(long)reg->addr);
if (r == 0)
kvmppc_set_epcr(vcpu, new_epcr);
u32 new_epcr = set_reg_val(reg->id, val);
kvmppc_set_epcr(vcpu, new_epcr);
break;
}
#endif
case KVM_REG_PPC_OR_TSR: {
u32 tsr_bits = set_reg_val(reg->id, val);
kvmppc_set_tsr_bits(vcpu, tsr_bits);
break;
}
case KVM_REG_PPC_CLEAR_TSR: {
u32 tsr_bits = set_reg_val(reg->id, val);
kvmppc_clr_tsr_bits(vcpu, tsr_bits);
break;
}
case KVM_REG_PPC_TSR: {
u32 tsr = set_reg_val(reg->id, val);
kvmppc_set_tsr(vcpu, tsr);
break;
}
case KVM_REG_PPC_TCR: {
u32 tcr = set_reg_val(reg->id, val);
kvmppc_set_tcr(vcpu, tcr);
break;
}
default:
r = kvmppc_set_one_reg(vcpu, reg->id, &val);
break;
}
return r;
}
int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg)
{
return -EINVAL;
}
int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
{
return -ENOTSUPP;
......@@ -1531,7 +1593,7 @@ int kvmppc_core_prepare_memory_region(struct kvm *kvm,
void kvmppc_core_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old)
const struct kvm_memory_slot *old)
{
}
......
......@@ -54,8 +54,7 @@
(1<<BOOKE_INTERRUPT_DTLB_MISS) | \
(1<<BOOKE_INTERRUPT_ALIGNMENT))
.macro KVM_HANDLER ivor_nr scratch srr0
_GLOBAL(kvmppc_handler_\ivor_nr)
.macro __KVM_HANDLER ivor_nr scratch srr0
/* Get pointer to vcpu and record exit number. */
mtspr \scratch , r4
mfspr r4, SPRN_SPRG_THREAD
......@@ -76,6 +75,43 @@ _GLOBAL(kvmppc_handler_\ivor_nr)
bctr
.endm
.macro KVM_HANDLER ivor_nr scratch srr0
_GLOBAL(kvmppc_handler_\ivor_nr)
__KVM_HANDLER \ivor_nr \scratch \srr0
.endm
.macro KVM_DBG_HANDLER ivor_nr scratch srr0
_GLOBAL(kvmppc_handler_\ivor_nr)
mtspr \scratch, r4
mfspr r4, SPRN_SPRG_THREAD
lwz r4, THREAD_KVM_VCPU(r4)
stw r3, VCPU_CRIT_SAVE(r4)
mfcr r3
mfspr r4, SPRN_CSRR1
andi. r4, r4, MSR_PR
bne 1f
/* debug interrupt happened in enter/exit path */
mfspr r4, SPRN_CSRR1
rlwinm r4, r4, 0, ~MSR_DE
mtspr SPRN_CSRR1, r4
lis r4, 0xffff
ori r4, r4, 0xffff
mtspr SPRN_DBSR, r4
mfspr r4, SPRN_SPRG_THREAD
lwz r4, THREAD_KVM_VCPU(r4)
mtcr r3
lwz r3, VCPU_CRIT_SAVE(r4)
mfspr r4, \scratch
rfci
1: /* debug interrupt happened in guest */
mtcr r3
mfspr r4, SPRN_SPRG_THREAD
lwz r4, THREAD_KVM_VCPU(r4)
lwz r3, VCPU_CRIT_SAVE(r4)
mfspr r4, \scratch
__KVM_HANDLER \ivor_nr \scratch \srr0
.endm
.macro KVM_HANDLER_ADDR ivor_nr
.long kvmppc_handler_\ivor_nr
.endm
......@@ -100,7 +136,7 @@ KVM_HANDLER BOOKE_INTERRUPT_FIT SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_WATCHDOG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
KVM_HANDLER BOOKE_INTERRUPT_DTLB_MISS SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_ITLB_MISS SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
KVM_DBG_HANDLER BOOKE_INTERRUPT_DEBUG SPRN_SPRG_RSCRATCH_CRIT SPRN_CSRR0
KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA SPRN_SPRG_RSCRATCH0 SPRN_SRR0
KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND SPRN_SPRG_RSCRATCH0 SPRN_SRR0
......
......@@ -425,6 +425,20 @@ int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
return kvmppc_set_sregs_ivor(vcpu, sregs);
}
int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
int r = kvmppc_get_one_reg_e500_tlb(vcpu, id, val);
return r;
}
int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
int r = kvmppc_get_one_reg_e500_tlb(vcpu, id, val);
return r;
}
struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
{
struct kvmppc_vcpu_e500 *vcpu_e500;
......
......@@ -23,6 +23,10 @@
#include <asm/mmu-book3e.h>
#include <asm/tlb.h>
enum vcpu_ftr {
VCPU_FTR_MMU_V2
};
#define E500_PID_NUM 3
#define E500_TLB_NUM 2
......@@ -131,6 +135,10 @@ void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500);
void kvmppc_get_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
int kvmppc_set_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs);
int kvmppc_get_one_reg_e500_tlb(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val);
int kvmppc_set_one_reg_e500_tlb(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val);
#ifdef CONFIG_KVM_E500V2
unsigned int kvmppc_e500_get_sid(struct kvmppc_vcpu_e500 *vcpu_e500,
......@@ -295,4 +303,18 @@ static inline unsigned int get_tlbmiss_tid(struct kvm_vcpu *vcpu)
#define get_tlb_sts(gtlbe) (MAS1_TS)
#endif /* !BOOKE_HV */
static inline bool has_feature(const struct kvm_vcpu *vcpu,
enum vcpu_ftr ftr)
{
bool has_ftr;
switch (ftr) {
case VCPU_FTR_MMU_V2:
has_ftr = ((vcpu->arch.mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2);
break;
default:
return false;
}
return has_ftr;
}
#endif /* KVM_E500_H */
......@@ -284,6 +284,16 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
case SPRN_TLB1CFG:
*spr_val = vcpu->arch.tlbcfg[1];
break;
case SPRN_TLB0PS:
if (!has_feature(vcpu, VCPU_FTR_MMU_V2))
return EMULATE_FAIL;
*spr_val = vcpu->arch.tlbps[0];
break;
case SPRN_TLB1PS:
if (!has_feature(vcpu, VCPU_FTR_MMU_V2))
return EMULATE_FAIL;
*spr_val = vcpu->arch.tlbps[1];
break;
case SPRN_L1CSR0:
*spr_val = vcpu_e500->l1csr0;
break;
......@@ -307,6 +317,15 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val)
case SPRN_MMUCFG:
*spr_val = vcpu->arch.mmucfg;
break;
case SPRN_EPTCFG:
if (!has_feature(vcpu, VCPU_FTR_MMU_V2))
return EMULATE_FAIL;
/*
* Legacy Linux guests access EPTCFG register even if the E.PT
* category is disabled in the VM. Give them a chance to live.
*/
*spr_val = vcpu->arch.eptcfg;
break;
/* extra exceptions */
case SPRN_IVOR32:
......
......@@ -596,6 +596,140 @@ int kvmppc_set_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
return 0;
}
int kvmppc_get_one_reg_e500_tlb(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
int r = 0;
long int i;
switch (id) {
case KVM_REG_PPC_MAS0:
*val = get_reg_val(id, vcpu->arch.shared->mas0);
break;
case KVM_REG_PPC_MAS1:
*val = get_reg_val(id, vcpu->arch.shared->mas1);
break;
case KVM_REG_PPC_MAS2:
*val = get_reg_val(id, vcpu->arch.shared->mas2);
break;
case KVM_REG_PPC_MAS7_3:
*val = get_reg_val(id, vcpu->arch.shared->mas7_3);
break;
case KVM_REG_PPC_MAS4:
*val = get_reg_val(id, vcpu->arch.shared->mas4);
break;
case KVM_REG_PPC_MAS6:
*val = get_reg_val(id, vcpu->arch.shared->mas6);
break;
case KVM_REG_PPC_MMUCFG:
*val = get_reg_val(id, vcpu->arch.mmucfg);
break;
case KVM_REG_PPC_EPTCFG:
*val = get_reg_val(id, vcpu->arch.eptcfg);
break;
case KVM_REG_PPC_TLB0CFG:
case KVM_REG_PPC_TLB1CFG:
case KVM_REG_PPC_TLB2CFG:
case KVM_REG_PPC_TLB3CFG:
i = id - KVM_REG_PPC_TLB0CFG;
*val = get_reg_val(id, vcpu->arch.tlbcfg[i]);
break;
case KVM_REG_PPC_TLB0PS:
case KVM_REG_PPC_TLB1PS:
case KVM_REG_PPC_TLB2PS:
case KVM_REG_PPC_TLB3PS:
i = id - KVM_REG_PPC_TLB0PS;
*val = get_reg_val(id, vcpu->arch.tlbps[i]);
break;
default:
r = -EINVAL;
break;
}
return r;
}
int kvmppc_set_one_reg_e500_tlb(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
int r = 0;
long int i;
switch (id) {
case KVM_REG_PPC_MAS0:
vcpu->arch.shared->mas0 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_MAS1:
vcpu->arch.shared->mas1 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_MAS2:
vcpu->arch.shared->mas2 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_MAS7_3:
vcpu->arch.shared->mas7_3 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_MAS4:
vcpu->arch.shared->mas4 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_MAS6:
vcpu->arch.shared->mas6 = set_reg_val(id, *val);
break;
/* Only allow MMU registers to be set to the config supported by KVM */
case KVM_REG_PPC_MMUCFG: {
u32 reg = set_reg_val(id, *val);
if (reg != vcpu->arch.mmucfg)
r = -EINVAL;
break;
}
case KVM_REG_PPC_EPTCFG: {
u32 reg = set_reg_val(id, *val);
if (reg != vcpu->arch.eptcfg)
r = -EINVAL;
break;
}
case KVM_REG_PPC_TLB0CFG:
case KVM_REG_PPC_TLB1CFG:
case KVM_REG_PPC_TLB2CFG:
case KVM_REG_PPC_TLB3CFG: {
/* MMU geometry (N_ENTRY/ASSOC) can be set only using SW_TLB */
u32 reg = set_reg_val(id, *val);
i = id - KVM_REG_PPC_TLB0CFG;
if (reg != vcpu->arch.tlbcfg[i])
r = -EINVAL;
break;
}
case KVM_REG_PPC_TLB0PS:
case KVM_REG_PPC_TLB1PS:
case KVM_REG_PPC_TLB2PS:
case KVM_REG_PPC_TLB3PS: {
u32 reg = set_reg_val(id, *val);
i = id - KVM_REG_PPC_TLB0PS;
if (reg != vcpu->arch.tlbps[i])
r = -EINVAL;
break;
}
default:
r = -EINVAL;
break;
}
return r;
}
static int vcpu_mmu_geometry_update(struct kvm_vcpu *vcpu,
struct kvm_book3e_206_tlb_params *params)
{
vcpu->arch.tlbcfg[0] &= ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
if (params->tlb_sizes[0] <= 2048)
vcpu->arch.tlbcfg[0] |= params->tlb_sizes[0];
vcpu->arch.tlbcfg[0] |= params->tlb_ways[0] << TLBnCFG_ASSOC_SHIFT;
vcpu->arch.tlbcfg[1] &= ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
vcpu->arch.tlbcfg[1] |= params->tlb_sizes[1];
vcpu->arch.tlbcfg[1] |= params->tlb_ways[1] << TLBnCFG_ASSOC_SHIFT;
return 0;
}
int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
struct kvm_config_tlb *cfg)
{
......@@ -692,16 +826,8 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
vcpu_e500->gtlb_offset[0] = 0;
vcpu_e500->gtlb_offset[1] = params.tlb_sizes[0];
vcpu->arch.mmucfg = mfspr(SPRN_MMUCFG) & ~MMUCFG_LPIDSIZE;
vcpu->arch.tlbcfg[0] &= ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
if (params.tlb_sizes[0] <= 2048)
vcpu->arch.tlbcfg[0] |= params.tlb_sizes[0];
vcpu->arch.tlbcfg[0] |= params.tlb_ways[0] << TLBnCFG_ASSOC_SHIFT;
vcpu->arch.tlbcfg[1] &= ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
vcpu->arch.tlbcfg[1] |= params.tlb_sizes[1];
vcpu->arch.tlbcfg[1] |= params.tlb_ways[1] << TLBnCFG_ASSOC_SHIFT;
/* Update vcpu's MMU geometry based on SW_TLB input */
vcpu_mmu_geometry_update(vcpu, &params);
vcpu_e500->shared_tlb_pages = pages;
vcpu_e500->num_shared_tlb_pages = num_pages;
......@@ -737,6 +863,39 @@ int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
return 0;
}
/* Vcpu's MMU default configuration */
static int vcpu_mmu_init(struct kvm_vcpu *vcpu,
struct kvmppc_e500_tlb_params *params)
{
/* Initialize RASIZE, PIDSIZE, NTLBS and MAVN fields with host values*/
vcpu->arch.mmucfg = mfspr(SPRN_MMUCFG) & ~MMUCFG_LPIDSIZE;
/* Initialize TLBnCFG fields with host values and SW_TLB geometry*/
vcpu->arch.tlbcfg[0] = mfspr(SPRN_TLB0CFG) &
~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
vcpu->arch.tlbcfg[0] |= params[0].entries;
vcpu->arch.tlbcfg[0] |= params[0].ways << TLBnCFG_ASSOC_SHIFT;
vcpu->arch.tlbcfg[1] = mfspr(SPRN_TLB1CFG) &
~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
vcpu->arch.tlbcfg[1] |= params[1].entries;
vcpu->arch.tlbcfg[1] |= params[1].ways << TLBnCFG_ASSOC_SHIFT;
if (has_feature(vcpu, VCPU_FTR_MMU_V2)) {
vcpu->arch.tlbps[0] = mfspr(SPRN_TLB0PS);
vcpu->arch.tlbps[1] = mfspr(SPRN_TLB1PS);
vcpu->arch.mmucfg &= ~MMUCFG_LRAT;
/* Guest mmu emulation currently doesn't handle E.PT */
vcpu->arch.eptcfg = 0;
vcpu->arch.tlbcfg[0] &= ~TLBnCFG_PT;
vcpu->arch.tlbcfg[1] &= ~TLBnCFG_IND;
}
return 0;
}
int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
{
struct kvm_vcpu *vcpu = &vcpu_e500->vcpu;
......@@ -781,18 +940,7 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
if (!vcpu_e500->g2h_tlb1_map)
goto err;
/* Init TLB configuration register */
vcpu->arch.tlbcfg[0] = mfspr(SPRN_TLB0CFG) &
~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
vcpu->arch.tlbcfg[0] |= vcpu_e500->gtlb_params[0].entries;
vcpu->arch.tlbcfg[0] |=
vcpu_e500->gtlb_params[0].ways << TLBnCFG_ASSOC_SHIFT;
vcpu->arch.tlbcfg[1] = mfspr(SPRN_TLB1CFG) &
~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC);
vcpu->arch.tlbcfg[1] |= vcpu_e500->gtlb_params[1].entries;
vcpu->arch.tlbcfg[1] |=
vcpu_e500->gtlb_params[1].ways << TLBnCFG_ASSOC_SHIFT;
vcpu_mmu_init(vcpu, vcpu_e500->gtlb_params);
kvmppc_recalc_tlb1map_range(vcpu_e500);
return 0;
......
......@@ -177,6 +177,8 @@ int kvmppc_core_check_processor_compat(void)
r = 0;
else if (strcmp(cur_cpu_spec->cpu_name, "e5500") == 0)
r = 0;
else if (strcmp(cur_cpu_spec->cpu_name, "e6500") == 0)
r = 0;
else
r = -ENOTSUPP;
......@@ -260,6 +262,20 @@ int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
return kvmppc_set_sregs_ivor(vcpu, sregs);
}
int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
int r = kvmppc_get_one_reg_e500_tlb(vcpu, id, val);
return r;
}
int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
union kvmppc_one_reg *val)
{
int r = kvmppc_set_one_reg_e500_tlb(vcpu, id, val);
return r;
}
struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id)
{
struct kvmppc_vcpu_e500 *vcpu_e500;
......
......@@ -38,6 +38,7 @@
#define OP_31_XOP_TRAP 4
#define OP_31_XOP_LWZX 23
#define OP_31_XOP_DCBST 54
#define OP_31_XOP_TRAP_64 68
#define OP_31_XOP_DCBF 86
#define OP_31_XOP_LBZX 87
......@@ -370,6 +371,7 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
emulated = kvmppc_emulate_mtspr(vcpu, sprn, rs);
break;
case OP_31_XOP_DCBST:
case OP_31_XOP_DCBF:
case OP_31_XOP_DCBI:
/* Do nothing. The guest is performing dcbi because
......
#ifndef __IRQ_H
#define __IRQ_H
#include <linux/kvm_host.h>
static inline int irqchip_in_kernel(struct kvm *kvm)
{
int ret = 0;
#ifdef CONFIG_KVM_MPIC
ret = ret || (kvm->arch.mpic != NULL);
#endif
#ifdef CONFIG_KVM_XICS
ret = ret || (kvm->arch.xics != NULL);
#endif
smp_rmb();
return ret;
}
#endif
This diff is collapsed.
This diff is collapsed.
......@@ -51,6 +51,12 @@ static struct icp_ipl __iomem *icp_native_regs[NR_CPUS];
static inline unsigned int icp_native_get_xirr(void)
{
int cpu = smp_processor_id();
unsigned int xirr;
/* Handled an interrupt latched by KVM */
xirr = kvmppc_get_xics_latch();
if (xirr)
return xirr;
return in_be32(&icp_native_regs[cpu]->xirr.word);
}
......@@ -138,6 +144,7 @@ static unsigned int icp_native_get_irq(void)
static void icp_native_cause_ipi(int cpu, unsigned long data)
{
kvmppc_set_host_ipi(cpu, 1);
icp_native_set_qirr(cpu, IPI_PRIORITY);
}
......@@ -151,6 +158,7 @@ static irqreturn_t icp_native_ipi_action(int irq, void *dev_id)
{
int cpu = smp_processor_id();
kvmppc_set_host_ipi(cpu, 0);
icp_native_set_qirr(cpu, 0xff);
return smp_ipi_demux();
......
......@@ -44,5 +44,6 @@ header-y += termios.h
header-y += types.h
header-y += ucontext.h
header-y += unistd.h
header-y += virtio-ccw.h
header-y += vtoc.h
header-y += zcrypt.h
/*
* Definitions for virtio-ccw devices.
*
* Copyright IBM Corp. 2013
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License (version 2 only)
* as published by the Free Software Foundation.
*
* Author(s): Cornelia Huck <cornelia.huck@de.ibm.com>
*/
#ifndef __KVM_VIRTIO_CCW_H
#define __KVM_VIRTIO_CCW_H
/* Alignment of vring buffers. */
#define KVM_VIRTIO_CCW_RING_ALIGN 4096
/* Subcode for diagnose 500 (virtio hypercall). */
#define KVM_S390_VIRTIO_CCW_NOTIFY 3
#endif
......@@ -22,6 +22,7 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_EVENTFD
---help---
Support hosting paravirtualized guest machines using the SIE
virtualization capability on the mainframe. This should work
......
......@@ -6,7 +6,7 @@
# it under the terms of the GNU General Public License (version 2 only)
# as published by the Free Software Foundation.
common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o)
common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o eventfd.o)
ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
......
......@@ -13,6 +13,7 @@
#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include <asm/virtio-ccw.h>
#include "kvm-s390.h"
#include "trace.h"
#include "trace-s390.h"
......@@ -104,6 +105,29 @@ static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
return -EREMOTE;
}
static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu)
{
int ret, idx;
/* No virtio-ccw notification? Get out quickly. */
if (!vcpu->kvm->arch.css_support ||
(vcpu->run->s.regs.gprs[1] != KVM_S390_VIRTIO_CCW_NOTIFY))
return -EOPNOTSUPP;
idx = srcu_read_lock(&vcpu->kvm->srcu);
/*
* The layout is as follows:
* - gpr 2 contains the subchannel id (passed as addr)
* - gpr 3 contains the virtqueue index (passed as datamatch)
*/
ret = kvm_io_bus_write(vcpu->kvm, KVM_VIRTIO_CCW_NOTIFY_BUS,
vcpu->run->s.regs.gprs[2],
8, &vcpu->run->s.regs.gprs[3]);
srcu_read_unlock(&vcpu->kvm->srcu, idx);
/* kvm_io_bus_write returns -EOPNOTSUPP if it found no match. */
return ret < 0 ? ret : 0;
}
int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
{
int code = (vcpu->arch.sie_block->ipb & 0xfff0000) >> 16;
......@@ -118,6 +142,8 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
return __diag_time_slice_end_directed(vcpu);
case 0x308:
return __diag_ipl_functions(vcpu);
case 0x500:
return __diag_virtio_hypercall(vcpu);
default:
return -EOPNOTSUPP;
}
......
This diff is collapsed.
......@@ -43,12 +43,10 @@ static int handle_lctlg(struct kvm_vcpu *vcpu)
trace_kvm_s390_handle_lctl(vcpu, 1, reg1, reg3, useraddr);
do {
rc = get_guest_u64(vcpu, useraddr,
&vcpu->arch.sie_block->gcr[reg]);
if (rc == -EFAULT) {
kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
break;
}
rc = get_guest(vcpu, vcpu->arch.sie_block->gcr[reg],
(u64 __user *) useraddr);
if (rc)
return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
useraddr += 8;
if (reg == reg3)
break;
......@@ -78,11 +76,9 @@ static int handle_lctl(struct kvm_vcpu *vcpu)
reg = reg1;
do {
rc = get_guest_u32(vcpu, useraddr, &val);
if (rc == -EFAULT) {
kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
break;
}
rc = get_guest(vcpu, val, (u32 __user *) useraddr);
if (rc)
return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
vcpu->arch.sie_block->gcr[reg] &= 0xffffffff00000000ul;
vcpu->arch.sie_block->gcr[reg] |= val;
useraddr += 4;
......
This diff is collapsed.
......@@ -142,12 +142,16 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_ONE_REG:
case KVM_CAP_ENABLE_CAP:
case KVM_CAP_S390_CSS_SUPPORT:
case KVM_CAP_IOEVENTFD:
r = 1;
break;
case KVM_CAP_NR_VCPUS:
case KVM_CAP_MAX_VCPUS:
r = KVM_MAX_VCPUS;
break;
case KVM_CAP_NR_MEMSLOTS:
r = KVM_USER_MEM_SLOTS;
break;
case KVM_CAP_S390_COW:
r = MACHINE_HAS_ESOP;
break;
......@@ -632,8 +636,7 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
} else {
VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction");
trace_kvm_s390_sie_fault(vcpu);
kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
rc = 0;
rc = kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
}
}
VCPU_EVENT(vcpu, 6, "exit sie icptcode %d",
......@@ -974,22 +977,13 @@ int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages)
/* Section: memory related */
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
bool user_alloc)
enum kvm_mr_change change)
{
/* A few sanity checks. We can have exactly one memory slot which has
to start at guest virtual zero and which has to be located at a
page boundary in userland and which has to end at a page boundary.
The memory in userland is ok to be fragmented into various different
vmas. It is okay to mmap() and munmap() stuff in this slot after
doing this call at any time */
if (mem->slot)
return -EINVAL;
if (mem->guest_phys_addr)
return -EINVAL;
/* A few sanity checks. We can have memory slots which have to be
located/ended at a segment boundary (1MB). The memory in userland is
ok to be fragmented into various different vmas. It is okay to mmap()
and munmap() stuff in this slot after doing this call at any time */
if (mem->userspace_addr & 0xffffful)
return -EINVAL;
......@@ -997,19 +991,26 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
if (mem->memory_size & 0xffffful)
return -EINVAL;
if (!user_alloc)
return -EINVAL;
return 0;
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
bool user_alloc)
const struct kvm_memory_slot *old,
enum kvm_mr_change change)
{
int rc;
/* If the basics of the memslot do not change, we do not want
* to update the gmap. Every update causes several unnecessary
* segment translation exceptions. This is usually handled just
* fine by the normal fault handler + gmap, but it will also
* cause faults on the prefix page of running guest CPUs.
*/
if (old->userspace_addr == mem->userspace_addr &&
old->base_gfn * PAGE_SIZE == mem->guest_phys_addr &&
old->npages * PAGE_SIZE == mem->memory_size)
return;
rc = gmap_map_segment(kvm->arch.gmap, mem->userspace_addr,
mem->guest_phys_addr, mem->memory_size);
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -28,6 +28,7 @@
/* Interrupt handlers registered during init_IRQ */
extern void apic_timer_interrupt(void);
extern void x86_platform_ipi(void);
extern void kvm_posted_intr_ipi(void);
extern void error_interrupt(void);
extern void irq_work_interrupt(void);
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment