- 07 Apr, 2022 4 commits
-
-
Oliver Upton authored
In order to correctly destroy a VM, all references to the VM must be freed. The arch_timer selftest creates a VGIC for the guest, which itself holds a reference to the VM. Close the GIC FD when cleaning up a VM. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220406235615.1447180-4-oupton@google.com
-
Oliver Upton authored
dirty_log_perf_test instantiates a VGICv3 for the guest (if supported by hardware) to reduce the overhead of guest exits. However, the test does not actually close the GIC fd when cleaning up the VM between test iterations, meaning that the VM is never actually destroyed in the kernel. While this is generally a bad idea, the bug was detected from the kernel spewing about duplicate debugfs entries as subsequent VMs happen to reuse the same FD even though the debugfs directory is still present. Abstract away the notion of setup/cleanup of the GIC FD from the test by creating arch-specific helpers for test setup/cleanup. Close the GIC FD on VM cleanup and do nothing for the other architectures. Fixes: c340f789 ("KVM: selftests: Add vgic initialization for dirty log perf test for ARM") Reviewed-by: Jing Zhang <jingzhangos@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220406235615.1447180-3-oupton@google.com
-
Oliver Upton authored
Unfortunately, there is no guarantee that KVM was able to instantiate a debugfs directory for a particular VM. To that end, KVM shouldn't even attempt to create new debugfs files in this case. If the specified parent dentry is NULL, debugfs_create_file() will instantiate files at the root of debugfs. For arm64, it is possible to create the vgic-state file outside of a VM directory, the file is not cleaned up when a VM is destroyed. Nonetheless, the corresponding struct kvm is freed when the VM is destroyed. Nip the problem in the bud for all possible errant debugfs file creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing, debugfs_create_file() will fail instead of creating the file in the root directory. Cc: stable@kernel.org Fixes: 929f45e3 ("kvm: no need to check return value of debugfs_create functions") Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220406235615.1447180-2-oupton@google.com
-
Andrew Jones authored
When testing a kernel with commit a5905d6a ("KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated") get-reg-list output vregs: Number blessed registers: 234 vregs: Number registers: 238 vregs: There are 1 new registers. Consider adding them to the blessed reg list with the following lines: KVM_REG_ARM_FW_REG(3), vregs: PASS ... That output inspired two changes: 1) add the new register to the blessed list and 2) explain why "Number registers" is actually four larger than "Number blessed registers" (on the system used for testing), even though only one register is being stated as new. The reason is that some registers are host dependent and they get filtered out when comparing with the blessed list. The system used for the test apparently had three filtered registers. Signed-off-by: Andrew Jones <drjones@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220316125129.392128-1-drjones@redhat.com
-
- 06 Apr, 2022 7 commits
-
-
Reiji Watanabe authored
Introduce a test for aarch64 that ensures non-mixed-width vCPUs (all 64bit vCPUs or all 32bit vcPUs) can be configured, and mixed-width vCPUs cannot be configured. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220329031924.619453-3-reijiw@google.com
-
Reiji Watanabe authored
KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs for a guest. At vCPU reset, vcpu_allowed_register_width() checks if the vcpu's register width is consistent with all other vCPUs'. Since the checking is done even against vCPUs that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet, the uninitialized vCPUs are erroneously treated as 64bit vCPU, which causes the function to incorrectly detect a mixed-width VM. Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED bits for kvm->arch.flags. A value of the EL1_32BIT bit indicates that the guest needs to be configured with all 32bit or 64bit vCPUs, and a value of the REG_WIDTH_CONFIGURED bit indicates if a value of the EL1_32BIT bit is valid (already set up). Values in those bits are set at the first KVM_ARM_VCPU_INIT for the guest based on KVM_ARM_VCPU_EL1_32BIT configuration for the vCPU. Check vcpu's register width against those new bits at the vcpu's KVM_ARM_VCPU_INIT (instead of against other vCPUs' register width). Fixes: 66e94d5c ("KVM: arm64: Prevent mixed-width VM creation") Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220329031924.619453-2-reijiw@google.com
-
Yu Zhe authored
Remove unnecessary casts. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220329102059.268983-1-yuzhe@nfschina.com
-
Oliver Upton authored
It is possible to take a stage-2 permission fault on a page larger than PAGE_SIZE. For example, when running a guest backed by 2M HugeTLB, KVM eagerly maps at the largest possible block size. When dirty logging is enabled on a memslot, KVM does *not* eagerly split these 2M stage-2 mappings and instead clears the write bit on the pte. Since dirty logging is always performed at PAGE_SIZE granularity, KVM lazily splits these 2M block mappings down to PAGE_SIZE in the stage-2 fault handler. This operation must be done under the write lock. Since commit f783ef1c ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging"), the stage-2 fault handler conditionally takes the read lock on permission faults with dirty logging enabled. To that end, it is possible to split a 2M block mapping while only holding the read lock. The problem is demonstrated by running kvm_page_table_test with 2M anonymous HugeTLB, which splats like so: WARNING: CPU: 5 PID: 15276 at arch/arm64/kvm/hyp/pgtable.c:153 stage2_map_walk_leaf+0x124/0x158 [...] Call trace: stage2_map_walk_leaf+0x124/0x158 stage2_map_walker+0x5c/0xf0 __kvm_pgtable_walk+0x100/0x1d4 __kvm_pgtable_walk+0x140/0x1d4 __kvm_pgtable_walk+0x140/0x1d4 kvm_pgtable_walk+0xa0/0xf8 kvm_pgtable_stage2_map+0x15c/0x198 user_mem_abort+0x56c/0x838 kvm_handle_guest_abort+0x1fc/0x2a4 handle_exit+0xa4/0x120 kvm_arch_vcpu_ioctl_run+0x200/0x448 kvm_vcpu_ioctl+0x588/0x664 __arm64_sys_ioctl+0x9c/0xd4 invoke_syscall+0x4c/0x144 el0_svc_common+0xc4/0x190 do_el0_svc+0x30/0x8c el0_svc+0x28/0xcc el0t_64_sync_handler+0x84/0xe4 el0t_64_sync+0x1a4/0x1a8 Fix the issue by only acquiring the read lock if the guest faulted on a PAGE_SIZE granule w/ dirty logging enabled. Add a WARN to catch locking bugs in future changes. Fixes: f783ef1c ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging") Cc: Jing Zhang <jingzhangos@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220401194652.950240-1-oupton@google.com
-
Oliver Upton authored
We already sanitize the guest's PSCI version when it is being written by userspace, rejecting unsupported version numbers. Additionally, the 'minor' parameter to kvm_psci_1_x_call() is a constant known at compile time for all callsites. Though it is benign, the additional check against the PSCI kvm_psci_1_x_call() is unnecessary and likely to be missed the next time KVM raises its maximum PSCI version. Drop the check altogether and rely on sanitization when the PSCI version is set by userspace. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220322183538.2757758-4-oupton@google.com
-
Oliver Upton authored
The SMCCC does not allow the SMC64 calling convention to be used from AArch32. While KVM checks to see if the calling convention is allowed in PSCI_1_0_FN_PSCI_FEATURES, it does not actually prevent calls to unadvertised PSCI v1.0+ functions. Hoist the check to see if the requested function is allowed into kvm_psci_call(), thereby preventing SMC64 calls from AArch32 for all PSCI versions. Fixes: d43583b8 ("KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest") Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220322183538.2757758-3-oupton@google.com
-
Oliver Upton authored
The only valid calling SMC calling convention from an AArch32 state is SMC32. Disallow any PSCI function that sets the SMC64 function ID bit when called from AArch32 rather than comparing against known SMC64 PSCI functions. Note that without this change KVM advertises the SMC64 flavor of SYSTEM_RESET2 to AArch32 guests. Fixes: d43583b8 ("KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest") Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220322183538.2757758-2-oupton@google.com
-
- 03 Apr, 2022 8 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull more tracing updates from Steven Rostedt: - Rename the staging files to give them some meaning. Just stage1,stag2,etc, does not show what they are for - Check for NULL from allocation in bootconfig - Hold event mutex for dyn_event call in user events - Mark user events to broken (to work on the API) - Remove eBPF updates from user events - Remove user events from uapi header to keep it from being installed. - Move ftrace_graph_is_dead() into inline as it is called from hot paths and also convert it into a static branch. * tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Move user_events.h temporarily out of include/uapi ftrace: Make ftrace_graph_is_dead() a static branch tracing: Set user_events to BROKEN tracing/user_events: Remove eBPF interfaces tracing/user_events: Hold event_mutex during dyn_event_add proc: bootconfig: Add null pointer check tracing: Rename the staging files for trace_events
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds authored
Pull clk fix from Stephen Boyd: "A single revert to fix a boot regression seen when clk_put() started dropping rate range requests. It's best to keep various systems booting so we'll kick this out and try again next time" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: Revert "clk: Drop the rate range on clk_put()"
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: "A set of x86 fixes and updates: - Make the prctl() for enabling dynamic XSTATE components correct so it adds the newly requested feature to the permission bitmap instead of overwriting it. Add a selftest which validates that. - Unroll string MMIO for encrypted SEV guests as the hypervisor cannot emulate it. - Handle supervisor states correctly in the FPU/XSTATE code so it takes the feature set of the fpstate buffer into account. The feature sets can differ between host and guest buffers. Guest buffers do not contain supervisor states. So far this was not an issue, but with enabling PASID it needs to be handled in the buffer offset calculation and in the permission bitmaps. - Avoid a gazillion of repeated CPUID invocations in by caching the values early in the FPU/XSTATE code. - Enable CONFIG_WERROR in x86 defconfig. - Make the X86 defconfigs more useful by adapting them to Y2022 reality" * tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu/xstate: Consolidate size calculations x86/fpu/xstate: Handle supervisor states in XSTATE permissions x86/fpu/xsave: Handle compacted offsets correctly with supervisor states x86/fpu: Cache xfeature flags from CPUID x86/fpu/xsave: Initialize offset/size cache early x86/fpu: Remove unused supervisor only offsets x86/fpu: Remove redundant XCOMP_BV initialization x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO x86/config: Make the x86 defconfigs a bit more usable x86/defconfig: Enable WERROR selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull RT signal fix from Thomas Gleixner: "Revert the RT related signal changes. They need to be reworked and generalized" * tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull more dma-mapping updates from Christoph Hellwig: - fix a regression in dma remap handling vs AMD memory encryption (me) - finally kill off the legacy PCI DMA API (Christophe JAILLET) * tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: move pgprot_decrypted out of dma_pgprot PCI/doc: cleanup references to the legacy PCI DMA API PCI: Remove the deprecated "pci-dma-compat.h" API
-
git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds authored
Pull ARM fixes from Russell King: - avoid unnecessary rebuilds for library objects - fix return value of __setup handlers - fix invalid input check for "crashkernel=" kernel option - silence KASAN warnings in unwind_frame * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9191/1: arm/stacktrace, kasan: Silence KASAN warnings in unwind_frame() ARM: 9190/1: kdump: add invalid input check for 'crashkernel=0' ARM: 9187/1: JIVE: fix return value of __setup handler ARM: 9189/1: decompressor: fix unneeded rebuilds of library objects
-
Stephen Boyd authored
This reverts commit 7dabfa2b. There are multiple reports that this breaks boot on various systems. The common theme is that orphan clks are having rates set on them when that isn't expected. Let's revert it out for now so that -rc1 boots. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Reported-by: Tony Lindgren <tony@atomide.com> Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://lore.kernel.org/r/366a0232-bb4a-c357-6aa8-636e398e05eb@samsung.com Cc: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20220403022818.39572-1-sboyd@kernel.org
-
- 02 Apr, 2022 21 commits
-
-
Linus Torvalds authored
Merge tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: - Avoid SEGV if core.cpus isn't set in 'perf stat'. - Stop depending on .git files for building PERF-VERSION-FILE, used in 'perf --version', fixing some perf tools build scenarios. - Convert tracepoint.py example to python3. - Update UAPI header copies from the kernel sources: socket, mman-common, msr-index, KVM, i915 and cpufeatures. - Update copy of libbpf's hashmap.c. - Directly return instead of using local ret variable in evlist__create_syswide_maps(), found by coccinelle. * tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf python: Convert tracepoint.py example to python3 perf evlist: Directly return instead of using local ret variable perf cpumap: More cpu map reuse by merge. perf cpumap: Add is_subset function perf evlist: Rename cpus to user_requested_cpus perf tools: Stop depending on .git files for building PERF-VERSION-FILE tools headers cpufeatures: Sync with the kernel sources tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools headers UAPI: Sync linux/kvm.h with the kernel sources tools kvm headers arm64: Update KVM headers from the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers UAPI: Sync asm-generic/mman-common.h with the kernel perf beauty: Update copy of linux/socket.h with the kernel sources perf tools: Update copy of libbpf's hashmap.c perf stat: Avoid SEGV if core.cpus isn't set
-
Linus Torvalds authored
Merge tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix empty $(PYTHON) expansion. - Fix UML, which got broken by the attempt to suppress Clang warnings. - Fix warning message in modpost. * tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: restore the warning message for missing symbol versions Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS" kbuild: Remove '-mno-global-merge' kbuild: fix empty ${PYTHON} in scripts/link-vmlinux.sh kconfig: remove stale comment about removed kconfig_print_symbol()
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds authored
Pull MIPS fixes from Thomas Bogendoerfer: - build fix for gpio - fix crc32 build problems - check for failed memory allocations * tag 'mips_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: crypto: Fix CRC32 code MIPS: rb532: move GPIOD definition into C-files MIPS: lantiq: check the return value of kzalloc() mips: sgi-ip22: add a check for the return of kzalloc()
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull kvm fixes from Paolo Bonzini: - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr - Documentation improvements - Prevent module exit until all VMs are freed - PMU Virtualization fixes - Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences - Other miscellaneous bugfixes * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) KVM: x86: fix sending PV IPI KVM: x86/mmu: do compare-and-exchange of gPTE via the user address KVM: x86: Remove redundant vm_entry_controls_clearbit() call KVM: x86: cleanup enter_rmode() KVM: x86: SVM: fix tsc scaling when the host doesn't support it kvm: x86: SVM: remove unused defines KVM: x86: SVM: move tsc ratio definitions to svm.h KVM: x86: SVM: fix avic spec based definitions again KVM: MIPS: remove reference to trap&emulate virtualization KVM: x86: document limitations of MSR filtering KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr KVM: x86/emulator: Emulate RDPID only if it is enabled in guest KVM: x86/pmu: Fix and isolate TSX-specific performance event logic KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs KVM: x86: Trace all APICv inhibit changes and capture overall status KVM: x86: Add wrappers for setting/clearing APICv inhibits KVM: x86: Make APICv inhibit reasons an enum and cleanup naming KVM: X86: Handle implicit supervisor access with SMAP KVM: X86: Rename variable smap to not_smap in permission_fault() ...
-
Masahiro Yamada authored
This log message was accidentally chopped off. I was wondering why this happened, but checking the ML log, Mark precisely followed my suggestion [1]. I just used "..." because I was too lazy to type the sentence fully. Sorry for the confusion. [1]: https://lore.kernel.org/all/CAK7LNAR6bXXk9-ZzZYpTqzFqdYbQsZHmiWspu27rtsFxvfRuVA@mail.gmail.com/ Fixes: 4a679593 ("kbuild: modpost: Explicitly warn about unprototyped symbols") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block driver fix from Jens Axboe: "Got two reports on nbd spewing warnings on load now, which is a regression from a commit that went into your tree yesterday. Revert the problematic change for now" * tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block: Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"
-
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds authored
Pull pci fix from Bjorn Helgaas: - Fix Hyper-V "defined but not used" build issue added during merge window (YueHaibing) * tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: hv: Remove unused hv_set_msi_entry_from_desc()
-
Linus Torvalds authored
Merge tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform updates from Benson Leung: "cros_ec_typec: - Check for EC device - Fix a crash when using the cros_ec_typec driver on older hardware not capable of typec commands - Make try power role optional - Mux configuration reorganization series from Prashant cros_ec_debugfs: - Fix use after free. Thanks Tzung-bi sensorhub: - cros_ec_sensorhub fixup - Split trace include file misc: - Add new mailing list for chrome-platform development: chrome-platform@lists.linux.dev Now with patchwork!" * tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_debugfs: detach log reader wq from devm platform: chrome: Split trace include file platform/chrome: cros_ec_typec: Update mux flags during partner removal platform/chrome: cros_ec_typec: Configure muxes at start of port update platform/chrome: cros_ec_typec: Get mux state inside configure_mux platform/chrome: cros_ec_typec: Move mux flag checks platform/chrome: cros_ec_typec: Check for EC device platform/chrome: cros_ec_typec: Make try power role optional MAINTAINERS: platform-chrome: Add new chrome-platform@lists.linux.dev list
-
Jens Axboe authored
This reverts commit 6d35d04a. Both Gabriel and Borislav report that this commit casues a regression with nbd: sysfs: cannot create duplicate filename '/dev/block/43:0' Revert it before 5.18-rc1 and we'll investigage this separately in due time. Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/Reported-by: Gabriel L. Somlo <somlo@cmu.edu> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Eric Dumazet authored
Commit 7ea1a012 ("watch_queue: Free the alloc bitmap when the watch_queue is torn down") took care of the bitmap, but not the page array. BUG: memory leak unreferenced object 0xffff88810d9bc140 (size 32): comm "syz-executor335", pid 3603, jiffies 4294946994 (age 12.840s) hex dump (first 32 bytes): 40 a7 40 04 00 ea ff ff 00 00 00 00 00 00 00 00 @.@............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: kmalloc_array include/linux/slab.h:621 [inline] kcalloc include/linux/slab.h:652 [inline] watch_queue_set_size+0x12f/0x2e0 kernel/watch_queue.c:251 pipe_ioctl+0x82/0x140 fs/pipe.c:632 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] Reported-by: syzbot+25ea042ae28f3888727a@syzkaller.appspotmail.com Fixes: c73be61c ("pipe: Add general notification queue support") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Cc: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20220322004654.618274-1-eric.dumazet@gmail.com/Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Steven Rostedt (Google) authored
After being merged, user_events become more visible to a wider audience that have concerns with the current API. It is too late to fix this for this release, but instead of a full revert, just mark it as BROKEN (which prevents it from being selected in make config). Then we can work finding a better API. If that fails, then it will need to be completely reverted. To not have the code silently bitrot, still allow building it with COMPILE_TEST. And to prevent the uapi header from being installed, then later changed, and then have an old distro user space see the old version, move the header file out of the uapi directory. Surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.comSuggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Steven Rostedt (Google) authored
While user_events API is under development and has been marked for broken to not let the API become fixed, move the header file out of the uapi directory. This is to prevent it from being installed, then later changed, and then have an old distro user space update with a new kernel, where applications see the user_events being available, but the old header is in place, and then they get compiled incorrectly. Also, surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.homeSuggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Christophe Leroy authored
ftrace_graph_is_dead() is used on hot paths, it just reads a variable in memory and is not worth suffering function call constraints. For instance, at entry of prepare_ftrace_return(), inlining it avoids saving prepare_ftrace_return() parameters to stack and restoring them after calling ftrace_graph_is_dead(). While at it using a static branch is even more performant and is rather well adapted considering that the returned value will almost never change. Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool by a static branch. The performance improvement is noticeable. Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.euSigned-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt (Google) authored
After being merged, user_events become more visible to a wider audience that have concerns with the current API. It is too late to fix this for this release, but instead of a full revert, just mark it as BROKEN (which prevents it from being selected in make config). Then we can work finding a better API. If that fails, then it will need to be completely reverted. Link: https://lore.kernel.org/all/2059213643.196683.1648499088753.JavaMail.zimbra@efficios.com/ Link: https://lkml.kernel.org/r/20220330155835.5e1f6669@gandalf.local.homeSigned-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Beau Belgrave authored
Remove eBPF interfaces within user_events to ensure they are fully reviewed. Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/ Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.comSuggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Beau Belgrave authored
Make sure the event_mutex is properly held during dyn_event_add call. This is required when adding dynamic events. Link: https://lkml.kernel.org/r/20220328223225.1992-1-beaub@linux.microsoft.comReported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Lv Ruyi authored
kzalloc is a memory allocation function which can return NULL when some internal memory errors happen. It is safer to add null pointer check. Link: https://lkml.kernel.org/r/20220329104004.2376879-1-lv.ruyi@zte.com.cn Cc: stable@vger.kernel.org Fixes: c1a3c360 ("proc: bootconfig: Add /proc/bootconfig to show boot config list") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt (Google) authored
When looking for implementation of different phases of the creation of the TRACE_EVENT() macro, it is pretty useless when all helper macro redefinitions are in files labeled "stageX_defines.h". Rename them to state which phase the files are for. For instance, when looking for the defines that are used to create the event fields, seeing "stage4_event_fields.h" gives the developer a good idea that the defines are in that file. Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Li RongQing authored
If apic_id is less than min, and (max - apic_id) is greater than KVM_IPI_CLUSTER_SIZE, then the third check condition is satisfied but the new apic_id does not fit the bitmask. In this case __send_ipi_mask should send the IPI. This is mostly theoretical, but it can happen if the apic_ids on three iterations of the loop are for example 1, KVM_IPI_CLUSTER_SIZE, 0. Fixes: aaffcfd1 ("KVM: X86: Implement PV IPIs in linux guest") Signed-off-by: Li RongQing <lirongqing@baidu.com> Message-Id: <1646814944-51801-1-git-send-email-lirongqing@baidu.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
FNAME(cmpxchg_gpte) is an inefficient mess. It is at least decent if it can go through get_user_pages_fast(), but if it cannot then it tries to use memremap(); that is not just terribly slow, it is also wrong because it assumes that the VM_PFNMAP VMA is contiguous. The right way to do it would be to do the same thing as hva_to_pfn_remapped() does since commit add6a0cd ("KVM: MMU: try to fix up page faults before giving up", 2016-07-05), using follow_pte() and fixup_user_fault() to determine the correct address to use for memremap(). To do this, one could for example extract hva_to_pfn() for use outside virt/kvm/kvm_main.c. But really there is no reason to do that either, because there is already a perfectly valid address to do the cmpxchg() on, only it is a userspace address. That means doing user_access_begin()/user_access_end() and writing the code in assembly to handle exceptions correctly. Worse, the guest PTE can be 8-byte even on i686 so there is the extra complication of using cmpxchg8b to account for. But at least it is an efficient mess. (Thanks to Linus for suggesting improvement on the inline assembly). Reported-by: Qiuhao Li <qiuhao@sysec.org> Reported-by: Gaoning Pan <pgn@zju.edu.cn> Reported-by: Yongkang Jia <kangel@zju.edu.cn> Reported-by: syzbot+6cde2282daa792c49ab8@syzkaller.appspotmail.com Debugged-by: Tadeusz Struk <tadeusz.struk@linaro.org> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Fixes: bd53cb35 ("X86/KVM: Handle PFNs outside of kernel reach when touching GPTEs") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Zhenzhong Duan authored
When emulating exit from long mode, EFER_LMA is cleared with vmx_set_efer(). This will already unset the VM_ENTRY_IA32E_MODE control bit as requested by SDM, so there is no need to unset VM_ENTRY_IA32E_MODE again in exit_lmode() explicitly. In case EFER isn't supported by hardware, long mode isn't supported, so exit_lmode() cannot be reached. Note that, thanks to the shadow controls mechanism, this change doesn't eliminate vmread or vmwrite. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20220311102643.807507-3-zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-