- 12 Sep, 2016 1 commit
-
-
Suresh Warrier authored
Currently, KVM switches back to the host to handle any external interrupt (when the interrupt is received while running in the guest). This patch updates real-mode KVM to check if an interrupt is generated by a passthrough adapter that is owned by this guest. If so, the real mode KVM will directly inject the corresponding virtual interrupt to the guest VCPU's ICS and also EOI the interrupt in hardware. In short, the interrupt is handled entirely in real mode in the guest context without switching back to the host. In some rare cases, the interrupt cannot be completely handled in real mode, for instance, a VCPU that is sleeping needs to be woken up. In this case, KVM simply switches back to the host with trap reason set to 0x500. This works, but it is clearly not very efficient. A following patch will distinguish this case and handle it correctly in the host. Note that we can use the existing check_too_hard() routine even though we are not in a hypercall to determine if there is unfinished business that needs to be completed in host virtual mode. The patch assumes that the mapping between hardware interrupt IRQ and virtual IRQ to be injected to the guest already exists for the PCI passthrough interrupts that need to be handled in real mode. If the mapping does not exist, KVM falls back to the default existing behavior. The KVM real mode code reads mappings from the mapped array in the passthrough IRQ map without taking any lock. We carefully order the loads and stores of the fields in the kvmppc_irq_map data structure using memory barriers to avoid an inconsistent mapping being seen by the reader. Thus, although it is possible to miss a map entry, it is not possible to read a stale value. [paulus@ozlabs.org - get irq_chip from irq_map rather than pimap, pulled out powernv eoi change into a separate patch, made kvmppc_read_intr get the vcpu from the paca rather than being passed in, rewrote the logic at the end of kvmppc_read_intr to avoid deep indentation, simplified logic in book3s_hv_rmhandlers.S since we were always restoring SRR0/1 anyway, get rid of the cached array (just use the mapped array), removed the kick_all_cpus_sync() call, clear saved_xirr PACA field when we handle the interrupt in real mode, fix compilation with CONFIG_KVM_XICS=n.] Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 09 Sep, 2016 9 commits
-
-
Suresh Warrier authored
Add the irq_bypass_add_producer and irq_bypass_del_producer functions. These functions get called whenever a GSI is being defined for a guest. They create/remove the mapping between host real IRQ numbers and the guest GSI. Add the following helper functions to manage the passthrough IRQ map. kvmppc_set_passthru_irq() Creates a mapping in the passthrough IRQ map that maps a host IRQ to a guest GSI. It allocates the structure (one per guest VM) the first time it is called. kvmppc_clr_passthru_irq() Removes the passthrough IRQ map entry given a guest GSI. The passthrough IRQ map structure is not freed even when the number of mapped entries goes to zero. It is only freed when the VM is destroyed. [paulus@ozlabs.org - modified to use is_pnv_opal_msi() rather than requiring all passed-through interrupts to use the same irq_chip; changed deletion so it zeroes out the r_hwirq field rather than copying the last entry down and decrementing the number of entries.] Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suresh Warrier authored
This patch introduces an IRQ mapping structure, the kvmppc_passthru_irqmap structure that is to be used to map the real hardware IRQ in the host with the virtual hardware IRQ (gsi) that is injected into a guest by KVM for passthrough adapters. Currently, we assume a separate IRQ mapping structure for each guest. Each kvmppc_passthru_irqmap has a mapping arrays, containing all defined real<->virtual IRQs. [paulus@ozlabs.org - removed irq_chip field from struct kvmppc_passthru_irqmap; changed parameter for kvmppc_get_passthru_irqmap from struct kvm_vcpu * to struct kvm *, removed small cached array.] Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suresh Warrier authored
Select IRQ_BYPASS_MANAGER for PPC when CONFIG_KVM is set. Add the PPC producer functions for add and del producer. [paulus@ozlabs.org - Moved new functions from book3s.c to powerpc.c so booke compiles; added kvm_arch_has_irq_bypass implementation.] Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suresh Warrier authored
Modify kvmppc_read_intr to make it a C function. Because it is called from kvmppc_check_wake_reason, any of the assembler code that calls either kvmppc_read_intr or kvmppc_check_wake_reason now has to assume that the volatile registers might have been modified. This also adds in the optimization of clearing saved_xirr in the case where we completely handle and EOI an IPI. Without this, the next device interrupt will require two trips through the host interrupt handling code. [paulus@ozlabs.org - made kvmppc_check_wake_reason create a stack frame when it is calling kvmppc_read_intr, which means we can set r12 to the trap number (0x500) after the call to kvmppc_read_intr, instead of using r31. Also moved the deliver_guest_interrupt label so as to restore XER and CTR, plus other minor tweaks.] Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
This merges the topic branch 'kvm-ppc-infrastructure' into kvm-ppc-next so that I can then apply further patches that need the changes in the kvm-ppc-infrastructure branch. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paolo Bonzini authored
hmi.c functions are unused unless sibling_subcore_state is nonzero, and that in turn happens only if KVM is in use. So move the code to arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE rather than CONFIG_PPC_BOOK3S_64. The sibling_subcore_state is also included in struct paca_struct only if KVM is supported by the kernel. Cc: Daniel Axtens <dja@axtens.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: kvm-ppc@vger.kernel.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suresh Warrier authored
This adds a new function pnv_opal_pci_msi_eoi() which does the part of end-of-interrupt (EOI) handling of an MSI which involves doing an OPAL call. This function can be called in real mode. This doesn't just export pnv_ioda2_msi_eoi() because that does a call to icp_native_eoi(), which does not work in real mode. This also adds a function, is_pnv_opal_msi(), which KVM can call to check whether an interrupt is one for which we should be calling pnv_opal_pci_msi_eoi() when we need to do an EOI. [paulus@ozlabs.org - split out the addition of pnv_opal_pci_msi_eoi() from Suresh's patch "KVM: PPC: Book3S HV: Handle passthrough interrupts in guest"; added is_pnv_opal_msi(); wrote description.] Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suresh Warrier authored
Add simple cache inhibited accessors for memory mapped I/O. Unlike the accessors built from the DEF_MMIO_* macros, these don't include any hardware memory barriers, callers need to manage memory barriers on their own. These can only be called in hypervisor real mode. Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com> [paulus@ozlabs.org - added line to comment] Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
This replaces a 2-D search through an array with a simple 8-bit table lookup for determining the actual and/or base page size for a HPT entry. The encoding in the second doubleword of the HPTE is designed to encode the actual and base page sizes without using any more bits than would be needed for a 4k page number, by using between 1 and 8 low-order bits of the RPN (real page number) field to encode the page sizes. A single "large page" bit in the first doubleword indicates that these low-order bits are to be interpreted like this. We can determine the page sizes by using the low-order 8 bits of the RPN to look up a 256-entry table. For actual page sizes less than 1MB, some of the upper bits of these 8 bits are going to be real address bits, but we can cope with that by replicating the entries for those smaller page sizes. While we're at it, let's move the hpte_page_size() and hpte_base_page_size() functions from a KVM-specific header to a header for 64-bit HPT systems, since this computation doesn't have anything specifically to do with KVM. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 08 Sep, 2016 5 commits
-
-
Suraj Jitindar Singh authored
vcpu stats are used to collect information about a vcpu which can be viewed in the debugfs. For example halt_attempted_poll and halt_successful_poll are used to keep track of the number of times the vcpu attempts to and successfully polls. These stats are currently not used on powerpc. Implement incrementation of the halt_attempted_poll and halt_successful_poll vcpu stats for powerpc. Since these stats are summed over all the vcpus for all running guests it doesn't matter which vcpu they are attributed to, thus we choose the current runner vcpu of the vcore. Also add new vcpu stats: halt_poll_success_ns, halt_poll_fail_ns and halt_wait_ns to be used to accumulate the total time spend polling successfully, polling unsuccessfully and waiting respectively, and halt_successful_wait to accumulate the number of times the vcpu waits. Given that halt_poll_success_ns, halt_poll_fail_ns and halt_wait_ns are expressed in nanoseconds it is necessary to represent these as 64-bit quantities, otherwise they would overflow after only about 4 seconds. Given that the total time spend either polling or waiting will be known and the number of times that each was done, it will be possible to determine the average poll and wait times. This will give the ability to tune the kvm module parameters based on the calculated average wait and poll times. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suraj Jitindar Singh authored
vms and vcpus have statistics associated with them which can be viewed within the debugfs. Currently it is assumed within the vcpu_stat_get() and vm_stat_get() functions that all of these statistics are represented as u32s, however the next patch adds some u64 vcpu statistics. Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly. Since vcpu statistics are per vcpu, they will only be updated by a single vcpu at a time so this shouldn't present a problem on 32-bit machines which can't atomically increment 64-bit numbers. However vm statistics could potentially be updated by multiple vcpus from that vm at a time. To avoid the overhead of atomics make all vm statistics ulong such that they are 64-bit on 64-bit systems where they can be atomically incremented and are 32-bit on 32-bit systems which may not be able to atomically increment 64-bit numbers. Modify vm_stat_get() to expect ulongs. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Matlack <dmatlack@google.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suraj Jitindar Singh authored
This patch introduces new halt polling functionality into the kvm_hv kernel module. When a vcore is idle it will poll for some period of time before scheduling itself out. When all of the runnable vcpus on a vcore have ceded (and thus the vcore is idle) we schedule ourselves out to allow something else to run. In the event that we need to wake up very quickly (for example an interrupt arrives), we are required to wait until we get scheduled again. Implement halt polling so that when a vcore is idle, and before scheduling ourselves, we poll for vcpus in the runnable_threads list which have pending exceptions or which leave the ceded state. If we poll successfully then we can get back into the guest very quickly without ever scheduling ourselves, otherwise we schedule ourselves out as before. There exists generic halt_polling code in virt/kvm_main.c, however on powerpc the polling conditions are different to the generic case. It would be nice if we could just implement an arch specific kvm_check_block() function, but there is still other arch specific things which need to be done for kvm_hv (for example manipulating vcore states) which means that a separate implementation is the best option. Testing of this patch with a TCP round robin test between two guests with virtio network interfaces has found a decrease in round trip time of ~15us on average. A performance gain is only seen when going out of and back into the guest often and quickly, otherwise there is no net benefit from the polling. The polling interval is adjusted such that when we are often scheduled out for long periods of time it is reduced, and when we often poll successfully it is increased. The rate at which the polling interval increases or decreases, and the maximum polling interval, can be set through module parameters. Based on the implementation in the generic kvm module by Wanpeng Li and Paolo Bonzini, and on direction from Paul Mackerras. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suraj Jitindar Singh authored
The struct kvmppc_vcore is a structure used to store various information about a virtual core for a kvm guest. The runnable_threads element of the struct provides a list of all of the currently runnable vcpus on the core (those in the KVMPPC_VCPU_RUNNABLE state). The previous implementation of this list was a linked_list. The next patch requires that the list be able to be iterated over without holding the vcore lock. Reimplement the runnable_threads list in the kvmppc_vcore struct as an array. Implement function to iterate over valid entries in the array and update access sites accordingly. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Suraj Jitindar Singh authored
The next commit will introduce a member to the kvmppc_vcore struct which references MAX_SMT_THREADS which is defined in kvm_book3s_asm.h, however this file isn't included in kvm_host.h directly. Thus compiling for certain platforms such as pmac32_defconfig and ppc64e_defconfig with KVM fails due to MAX_SMT_THREADS not being defined. Move the struct kvmppc_vcore definition to kvm_book3s.h which explicitly includes kvm_book3s_asm.h. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 25 Aug, 2016 1 commit
-
-
Paul Mackerras authored
As discussed recently on the kvm mailing list, David Gibson's intention in commit 178a7875 ("vfio: Enable VFIO device for powerpc", 2016-02-01) was to have the KVM VFIO device built in on all powerpc platforms. This patch adds the "select KVM_VFIO" statement that makes this happen. Currently, arch/powerpc/kvm/Makefile doesn't include vfio.o for the 64-bit kvm module, because the list of objects doesn't use the $(common-objs-y) list. The reason it doesn't is because we don't necessarily want coalesced_mmio.o or emulate.o (for example if HV KVM is the only target), and common-objs-y includes both. Since this is confusing, this patch adjusts the definitions so that we now use $(common-objs-y) in the list for the 64-bit kvm.ko module, emulate.o is removed from common-objs-y and added in the places that need it, and the inclusion of coalesced_mmio.o now depends on CONFIG_KVM_MMIO. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 19 Aug, 2016 2 commits
-
-
Paul Mackerras authored
It doesn't make sense to create irqfds for a VM that doesn't have in-kernel interrupt controller emulation. There is an existing interface for architecture code to tell the irqfd code whether or not any interrupt controller has been initialized, called kvm_arch_intc_initialized(), so let's implement that for powerpc. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Paul Mackerras authored
It turns out that if userspace creates a pseries-type VM without in-kernel XICS (interrupt controller) emulation, and then connects an eventfd to the VM as an irqfd, and the eventfd gets signalled, that the code will try to deliver an interrupt via the non-existent XICS object and crash the host kernel with a NULL pointer dereference. To fix this, we check for the presence of the XICS object before trying to deliver the interrupt, and return with an error if not. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
- 15 Aug, 2016 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linuxLinus Torvalds authored
Pull thermal updates from Zhang Rui: - Fix a race condition when updating cooling device, which may lead to a situation where a thermal governor never updates the cooling device. From Michele Di Giorgio. - Fix a zero division error when disabling the forced idle injection from the intel powerclamp. From Petr Mladek. - Add suspend/resume callback for intel_pch_thermal thermal driver. From Srinivas Pandruvada. - Another two fixes for clocking cooling driver and hwmon sysfs I/F. From Michele Di Giorgio and Kuninori Morimoto. [ Hmm. That suspend/resume callback for intel_pch_thermal doesn't look like a fix, but I'm letting it slide.. - Linus ] * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: clock_cooling: Fix missing mutex_init() thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs thermal: fix race condition when updating cooling device thermal/powerclamp: Prevent division by zero when counting interval thermal: intel_pch_thermal: Add suspend/resume callback
-
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommuLinus Torvalds authored
Pull m68knommu fix from Greg Ungerer: "This contains only a single fix for a register corruption problem on certain types of m68k flat format binaries" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68knommu: fix user a5 register being overwritten
-
- 14 Aug, 2016 2 commits
-
-
Linus Torvalds authored
Merge tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull h8300 and unicore32 architecture fixes from Guenter Roeck: "Two patches to fix h8300 and unicore32 builds. unicore32 builds have been broken since v4.6. The fix has been available in -next since March of this year. h8300 builds have been broken since the last commit window. The fix has been available in -next since June of this year" * tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: h8300: Add missing include file to asm/io.h unicore32: mm: Add missing parameter to arch_vma_access_permitted
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Catalin Marinas: - support for nr_cpus= command line argument (maxcpus was previously changed to allow secondary CPUs to be hot-plugged) - ARM PMU interrupt handling fix - fix potential TLB conflict in the hibernate code - improved handling of EL1 instruction aborts (better error reporting) - removal of useless jprobes code for stack saving/restoring - defconfig updates * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO arm64: defconfig: add options for virtualization and containers arm64: hibernate: handle allocation failures arm64: hibernate: avoid potential TLB conflict arm64: Handle el1 synchronous instruction aborts cleanly arm64: Remove stack duplicating code from jprobes drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock arm64: Support hard limit of cpu count by nr_cpus
-
- 13 Aug, 2016 4 commits
-
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Radim Krčmář: "KVM: - lock kvm_device list to prevent corruption on device creation. PPC: - split debugfs initialization from creation of the xics device to unlock the newly taken kvm lock earlier. s390: - prevent userspace from triggering two WARN_ON_ONCE. MIPS: - fix several issues in the management of TLB faults (Cc: stable)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MIPS: KVM: Propagate kseg0/mapped tlb fault errors MIPS: KVM: Fix gfn range check in kseg0 tlb faults MIPS: KVM: Add missing gfn range check MIPS: KVM: Fix mapped fault broken commpage handling KVM: Protect device ops->create and list_add with kvm->lock KVM: PPC: Move xics_debugfs_init out of create KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed KVM: s390: set the prefix initially properly
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: - an NVMe fix from Gabriel, fixing a suspend/resume issue on some setups - addition of a few missing entries in the block queue sysfs documentation, from Joe - a fix for a sparse shadow warning for the bvec iterator, from Johannes - a writeback deadlock involving raid issuing barriers, and not flushing the plug when we wakeup the flusher threads. From Konstantin - a set of patches for the NVMe target/loop/rdma code, from Roland and Sagi * 'for-linus' of git://git.kernel.dk/linux-block: bvec: avoid variable shadowing warning doc: update block/queue-sysfs.txt entries nvme: Suspend all queues before deletion mm, writeback: flush plugged IO in wakeup_flusher_threads() nvme-rdma: Remove unused includes nvme-rdma: start async event handler after reconnecting to a controller nvmet: Fix controller serial number inconsistency nvmet-rdma: Don't use the inline buffer in order to avoid allocation for small reads nvmet-rdma: Correctly handle RDMA device hot removal nvme-rdma: Make sure to shutdown the controller if we can nvme-loop: Remove duplicate call to nvme_remove_namespaces nvme-rdma: Free the I/O tags when we delete the controller nvme-rdma: Remove duplicate call to nvme_remove_namespaces nvme-rdma: Fix device removal handling nvme-rdma: Queue ns scanning after a sucessful reconnection nvme-rdma: Don't leak uninitialized memory in connect request private data
-
Guenter Roeck authored
h8300 builds fail with arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’ arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’ arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’ and many related errors. Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix") Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
-
Guenter Roeck authored
unicore32 fails to compile with the following errors. mm/memory.c: In function ‘__handle_mm_fault’: mm/memory.c:3381: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘check_vma_flags’: mm/gup.c:456: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘vma_permits_fault’: mm/gup.c:640: error: too many arguments to function ‘arch_vma_access_permitted’ Fixes: d61172b4 ("mm/core, x86/mm/pkeys: Differentiate instruction fetches") Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
-
- 12 Aug, 2016 13 commits
-
-
git://github.com/awilliam/linux-vfioLinus Torvalds authored
Pull VFIO fix from Alex Williamson: "Fix oops when dereferencing empty data (Alex Williamson)" * tag 'vfio-v4.8-rc2' of git://github.com/awilliam/linux-vfio: vfio/pci: Fix NULL pointer oops in error interrupt setup handling
-
git://linux-nfs.org/~bfields/linuxLinus Torvalds authored
Pull nfsd fixes from Bruce Fields: "Fixes for the dentry refcounting leak I introduced in 4.8-rc1, and for races in the LOCK code which appear to go back to the big nfsd state lock removal from 3.17" * tag 'nfsd-4.8-1' of git://linux-nfs.org/~bfields/linux: nfsd: don't return an unhashed lock stateid after taking mutex nfsd: Fix race between FREE_STATEID and LOCK nfsd: fix dentry refcounting on create
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management fixes from Rafael Wysocki: "Two hibernation fixes allowing it to work with the recently added randomization of the kernel identity mapping base on x86-64 and one cpufreq driver regression fix. Specifics: - Fix the x86 identity mapping creation helpers to avoid the assumption that the base address of the mapping will always be aligned at the PGD level, as it may be aligned at the PUD level if address space randomization is enabled (Rafael Wysocki). - Fix the hibernation core to avoid executing tracing functions before restoring the processor state completely during resume (Thomas Garnier). - Fix a recently introduced regression in the powernv cpufreq driver that causes it to crash due to an out-of-bounds array access (Akshay Adiga)" * tag 'pm-4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly cpufreq: powernv: Fix crash in gpstate_timer_handler()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Ingo Molnar: "This is bigger than usual - the reason is partly a pent-up stream of fixes after the merge window and partly accidental. The fixes are: - five patches to fix a boot failure on Andy Lutomirsky's laptop - four SGI UV platform fixes - KASAN fix - warning fix - documentation update - swap entry definition fix - pkeys fix - irq stats fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/x2apic, smp/hotplug: Don't use before alloc in x2apic_cluster_probe() x86/efi: Allocate a trampoline if needed in efi_free_boot_services() x86/boot: Rework reserve_real_mode() to allow multiple tries x86/boot: Defer setup_real_mode() to early_initcall time x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly x86/boot: Run reserve_bios_regions() after we initialize the memory map x86/irq: Do not substract irq_tlb_count from irq_call_count x86/mm: Fix swap entry comment and macro x86/mm/kaslr: Fix -Wformat-security warning x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systems x86/platform/UV: Fix problem with UV4 BIOS providing incorrect PXM values x86/platform/UV: Fix bug with iounmap() of the UV4 EFI System Table causing a crash x86/platform/UV: Fix problem with UV4 Socket IDs not being contiguous x86/entry: Clarify the RF saving/restoring situation with SYSCALL/SYSRET x86/mm: Disable preemption during CR3 read+write x86/mm/KASLR: Increase BRK pages for KASLR memory randomization x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fixes from Ingo Molnar: "Misc fixes: a /dev/rtc regression fix, two APIC timer period calibration fixes, an ARM clocksource driver fix and a NOHZ power use regression fix" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hpet: Fix /dev/rtc breakage caused by RTC cleanup x86/timers/apic: Inform TSC deadline clockevent device about recalibration x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error timers: Fix get_next_timer_interrupt() computation clocksource/arm_arch_timer: Force per-CPU interrupt to be level-triggered
-
Rafael J. Wysocki authored
* pm-sleep: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly * pm-cpufreq: cpufreq: powernv: Fix crash in gpstate_timer_handler()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixes from Ingo Molnar: "Misc fixes: cputime fixes, two deadline scheduler fixes and a cgroups scheduling fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/cputime: Fix omitted ticks passed in parameter sched/cputime: Fix steal time accounting sched/deadline: Fix lock pinning warning during CPU hotplug sched/cputime: Mitigate performance regression in times()/clock_gettime() sched/fair: Fix typo in sync_throttle() sched/deadline: Fix wrap-around in DL heap
-
Thomas Garnier authored
Restore the processor state before calling any other functions to ensure per-CPU variables can be used with KASLR memory randomization. Tracing functions use per-CPU variables (GS based on x86) and one was called just before restoring the processor state fully. It resulted in a double fault when both the tracing & the exception handler functions tried to use a per-CPU variable. Fixes: bb3632c6 (PM / sleep: trace events for suspend/resume) Reported-and-tested-by: Borislav Petkov <bp@suse.de> Reported-by: Jiri Kosina <jikos@kernel.org> Tested-by: Rafael J. Wysocki <rafael@kernel.org> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Garnier <thgarnie@google.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus two uncore-PMU fixes, an uprobes fix, a perf-cgroups fix and an AUX events fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add enable_box for client MSR uncore perf/x86/intel/uncore: Fix uncore num_counters uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions perf/core: Set cgroup in CPU contexts for new cgroup events perf/core: Fix sideband list-iteration vs. event ordering NULL pointer deference crash perf probe ppc64le: Fix probe location when using DWARF perf probe: Add function to post process kernel trace events tools: Sync cpufeatures headers with the kernel toops: Sync tools/include/uapi/linux/bpf.h with the kernel tools: Sync cpufeatures.h and vmx.h with the kernel perf probe: Support signedness casting perf stat: Avoid skew when reading events perf probe: Fix module name matching perf probe: Adjust map->reloc offset when finding kernel symbol from map perf hists: Trim libtraceevent trace_seq buffers perf script: Add 'bpf-output' field to usage message
-
Jeff Layton authored
nfsd4_lock will take the st_mutex before working with the stateid it gets, but between the time when we drop the cl_lock and take the mutex, the stateid could become unhashed (a'la FREE_STATEID). If that happens the lock stateid returned to the client will be forgotten. Fix this by first moving the st_mutex acquisition into lookup_or_create_lock_state. Then, have it check to see if the lock stateid is still hashed after taking the mutex. If it's not, then put the stateid and try the find/create again. Signed-off-by: Jeff Layton <jlayton@redhat.com> Tested-by: Alexey Kodanev <alexey.kodanev@oracle.com> Cc: stable@vger.kernel.org # feb9dad5 nfsd: Always lock state exclusively. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull locking fixes from Ingo Molnar: "Misc fixes: lockstat fix, futex fix on !MMU systems, big endian fix for qrwlocks and a race fix for pvqspinlocks" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/pvqspinlock: Fix a bug in qstat_read() locking/pvqspinlock: Fix double hash race locking/qrwlock: Fix write unlock bug on big endian systems futex: Assume all mappings are private on !MMU systems
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fix from Ingo Molnar: "A fix for an MSI regression" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi: Make sure PCI MSIs are activated early
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull EFI fixes from Ingo Molnar: "A fix for EFI capsules and an SGI UV platform fix" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/capsule: Allocate whole capsule into virtual memory x86/platform/uv: Skip UV runtime services mapping in the efi_runtime_disabled case
-