- 25 Apr, 2019 3 commits
-
-
Christoffer Dall authored
When a VCPU never runs before a guest exists, but we set timer registers up via ioctls, the associated hrtimer might never get cancelled. Since we moved vcpu_load/put into the arch-specific implementations and only have load/put for KVM_RUN, we won't ever have a scheduled hrtimer for emulating a timer when modifying the timer state via an ioctl from user space. All we need to do is make sure that we pick up the right state when we load the timer state next time userspace calls KVM_RUN again. We also do not need to worry about this interacting with the bg_timer, because if we were in WFI from the guest, and somehow ended up in a kvm_arm_timer_set_reg, it means that: 1. the VCPU thread has received a signal, 2. we have called vcpu_load when being scheduled in again, 3. we have called vcpu_put when we returned to userspace for it to issue another ioctl And therefore will not have a bg_timer programmed and the event is treated as a spurious wakeup from WFI if userspace decides to run the vcpu again even if there are not virtual interrupts. This fixes stray virtual timer interrupts triggered by an expiring hrtimer, which happens after a failed live migration, for instance. Fixes: bee038a6 ("KVM: arm/arm64: Rework the timer code to use a timer_map") Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Reported-by: Andre Przywara <andre.przywara@arm.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Suzuki K Poulose authored
With commit a80868f3, we no longer ensure that the THP page is properly aligned in the guest IPA. Skip the stage2 huge mapping for unaligned IPA backed by transparent hugepages. Fixes: a80868f3 ("KVM: arm/arm64: Enforce PTE mappings at stage2 when needed") Reported-by: Eric Auger <eric.auger@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Chirstoffer Dall <christoffer.dall@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Zheng Xiang <zhengxiang9@huawei.com> Cc: Andrew Murray <andrew.murray@arm.com> Cc: Eric Auger <eric.auger@redhat.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Andrew Jones authored
A failed KVM_ARM_VCPU_INIT should not set the vcpu target, as the vcpu target is used by kvm_vcpu_initialized() to determine if other vcpu ioctls may proceed. We need to set the target before calling kvm_reset_vcpu(), but if that call fails, we should then unset it and clear the feature bitmap while we're at it. Signed-off-by: Andrew Jones <drjones@redhat.com> [maz: Simplified patch, completed commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 03 Apr, 2019 1 commit
-
-
Marc Zyngier authored
When disabling LPIs (for example on reset) at the redistributor level, it is expected that LPIs that was pending in the CPU interface are eventually retired. Currently, this is not what is happening, and these LPIs will stay in the ap_list, eventually being acknowledged by the vcpu (which didn't quite expect this behaviour). The fix is thus to retire these LPIs from the list of pending interrupts as we disable LPIs. Reported-by: Heyi Guo <guoheyi@huawei.com> Tested-by: Heyi Guo <guoheyi@huawei.com> Fixes: 0e4e82f1 ("KVM: arm64: vgic-its: Enable ITS emulation as a virtual MSI controller") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 30 Mar, 2019 1 commit
-
-
Wei Huang authored
Recently the generic timer test of kvm-unit-tests failed to complete (stalled) when a physical timer is being used. This issue is caused by incorrect update of CNTP_CVAL when CNTP_TVAL is being accessed, introduced by 'Commit 84135d3d ("KVM: arm/arm64: consolidate arch timer trap handlers")'. According to Arm ARM, the read/write behavior of accesses to the TVAL registers is expected to be: * READ: TimerValue = (CompareValue – (Counter - Offset) * WRITE: CompareValue = ((Counter - Offset) + Sign(TimerValue) This patch fixes the TVAL read/write code path according to the specification. Fixes: 84135d3d ("KVM: arm/arm64: consolidate arch timer trap handlers") Signed-off-by: Wei Huang <wei@redhat.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 28 Mar, 2019 1 commit
-
-
Zenghui Yu authored
Some comments in virt/kvm/arm/mmu.c are outdated. Update them to reflect the current state of the code. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 20 Mar, 2019 2 commits
-
-
YueHaibing authored
Fix sparse warnings: arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1732:5: warning: symbol 'vgic_its_has_attr_regs' was not declared. Should it be static? arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic-its.c:1753:5: warning: symbol 'vgic_its_attr_regs_access' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> [maz: fixed subject] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Suzuki K Poulose authored
We rely on the mmu_notifier call backs to handle the split/merge of huge pages and thus we are guaranteed that, while creating a block mapping, either the entire block is unmapped at stage2 or it is missing permission. However, we miss a case where the block mapping is split for dirty logging case and then could later be made block mapping, if we cancel the dirty logging. This not only creates inconsistent TLB entries for the pages in the the block, but also leakes the table pages for PMD level. Handle this corner case for the huge mappings at stage2 by unmapping the non-huge mapping for the block. This could potentially release the upper level table. So we need to restart the table walk once we unmap the range. Fixes : ad361f09 ("KVM: ARM: Support hugetlbfs backed huge pages") Reported-by: Zheng Xiang <zhengxiang9@huawei.com> Cc: Zheng Xiang <zhengxiang9@huawei.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 19 Mar, 2019 5 commits
-
-
Suzuki K Poulose authored
commit 6794ad54 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings") made the checks to skip huge mappings, stricter. However it introduced a bug where we still use huge mappings, ignoring the flag to use PTE mappings, by not reseting the vma_pagesize to PAGE_SIZE. Also, the checks do not cover the PUD huge pages, that was under review during the same period. This patch fixes both the issues. Fixes : 6794ad54 ("KVM: arm/arm64: Fix unintended stage 2 PMD mappings") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Marc Zyngier authored
Calling kvm_is_visible_gfn() implies that we're parsing the memslots, and doing this without the srcu lock is frown upon: [12704.164532] ============================= [12704.164544] WARNING: suspicious RCU usage [12704.164560] 5.1.0-rc1-00008-g600025238f51-dirty #16 Tainted: G W [12704.164573] ----------------------------- [12704.164589] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [12704.164602] other info that might help us debug this: [12704.164616] rcu_scheduler_active = 2, debug_locks = 1 [12704.164631] 6 locks held by qemu-system-aar/13968: [12704.164644] #0: 000000007ebdae4f (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [12704.164691] #1: 000000007d751022 (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [12704.164726] #2: 00000000219d2706 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164761] #3: 00000000a760aecd (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164794] #4: 000000000ef8e31d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164827] #5: 000000007a872093 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [12704.164861] stack backtrace: [12704.164878] CPU: 2 PID: 13968 Comm: qemu-system-aar Tainted: G W 5.1.0-rc1-00008-g600025238f51-dirty #16 [12704.164887] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [12704.164896] Call trace: [12704.164910] dump_backtrace+0x0/0x138 [12704.164920] show_stack+0x24/0x30 [12704.164934] dump_stack+0xbc/0x104 [12704.164946] lockdep_rcu_suspicious+0xcc/0x110 [12704.164958] gfn_to_memslot+0x174/0x190 [12704.164969] kvm_is_visible_gfn+0x28/0x70 [12704.164980] vgic_its_check_id.isra.0+0xec/0x1e8 [12704.164991] vgic_its_save_tables_v0+0x1ac/0x330 [12704.165001] vgic_its_set_attr+0x298/0x3a0 [12704.165012] kvm_device_ioctl_attr+0x9c/0xd8 [12704.165022] kvm_device_ioctl+0x8c/0xf8 [12704.165035] do_vfs_ioctl+0xc8/0x960 [12704.165045] ksys_ioctl+0x8c/0xa0 [12704.165055] __arm64_sys_ioctl+0x28/0x38 [12704.165067] el0_svc_common+0xd8/0x138 [12704.165078] el0_svc_handler+0x38/0x78 [12704.165089] el0_svc+0x8/0xc Make sure the lock is taken when doing this. Fixes: bf308242 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Marc Zyngier authored
When halting a guest, QEMU flushes the virtual ITS caches, which amounts to writing to the various tables that the guest has allocated. When doing this, we fail to take the srcu lock, and the kernel shouts loudly if running a lockdep kernel: [ 69.680416] ============================= [ 69.680819] WARNING: suspicious RCU usage [ 69.681526] 5.1.0-rc1-00008-g600025238f51-dirty #18 Not tainted [ 69.682096] ----------------------------- [ 69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [ 69.683225] [ 69.683225] other info that might help us debug this: [ 69.683225] [ 69.683975] [ 69.683975] rcu_scheduler_active = 2, debug_locks = 1 [ 69.684598] 6 locks held by qemu-system-aar/4097: [ 69.685059] #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [ 69.686087] #1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [ 69.686919] #2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.687698] #3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.688475] #4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.689978] #5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.690729] [ 69.690729] stack backtrace: [ 69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty #18 [ 69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [ 69.692831] Call trace: [ 69.694072] lockdep_rcu_suspicious+0xcc/0x110 [ 69.694490] gfn_to_memslot+0x174/0x190 [ 69.694853] kvm_write_guest+0x50/0xb0 [ 69.695209] vgic_its_save_tables_v0+0x248/0x330 [ 69.695639] vgic_its_set_attr+0x298/0x3a0 [ 69.696024] kvm_device_ioctl_attr+0x9c/0xd8 [ 69.696424] kvm_device_ioctl+0x8c/0xf8 [ 69.696788] do_vfs_ioctl+0xc8/0x960 [ 69.697128] ksys_ioctl+0x8c/0xa0 [ 69.697445] __arm64_sys_ioctl+0x28/0x38 [ 69.697817] el0_svc_common+0xd8/0x138 [ 69.698173] el0_svc_handler+0x38/0x78 [ 69.698528] el0_svc+0x8/0xc The fix is to obviously take the srcu lock, just like we do on the read side of things since bf308242. One wonders why this wasn't fixed at the same time, but hey... Fixes: bf308242 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Marc Zyngier authored
The normal interrupt flow is not to enable the vgic when no virtual interrupt is to be injected (i.e. the LRs are empty). But when a guest is likely to use GICv4 for LPIs, we absolutely need to switch it on at all times. Otherwise, VLPIs only get delivered when there is something in the LRs, which doesn't happen very often. Reported-by: Nianyao Tang <tangnianyao@huawei.com> Tested-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Marc Zyngier authored
We've become very cautious to now always reset the vcpu when nothing is loaded on the physical CPU. To do so, we now disable preemption and do a kvm_arch_vcpu_put() to make sure we have all the state in memory (and that it won't be loaded behind out back). This now causes issues with resetting the PMU, which calls into perf. Perf itself uses mutexes, which clashes with the lack of preemption. It is worth realizing that the PMU is fully emulated, and that no PMU state is ever loaded on the physical CPU. This means we can perfectly reset the PMU outside of the non-preemptible section. Fixes: e761a927 ("KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded") Reported-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 17 Mar, 2019 14 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds authored
Pull more Kbuild updates from Masahiro Yamada: - add more Build-Depends to Debian source package - prefix header search paths with $(srctree)/ - make modpost show verbose section mismatch warnings - avoid hard-coded CROSS_COMPILE for h8300 - fix regression for Debian make-kpkg command - add semantic patch to detect missing put_device() - fix some warnings of 'make deb-pkg' - optimize NOSTDINC_FLAGS evaluation - add warnings about redundant generic-y - clean up Makefiles and scripts * tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: remove stale lxdialog/.gitignore kbuild: force all architectures except um to include mandatory-y kbuild: warn redundant generic-y Revert "modsign: Abort modules_install when signing fails" kbuild: Make NOSTDINC_FLAGS a simply expanded variable kbuild: deb-pkg: avoid implicit effects coccinelle: semantic code search for missing put_device() kbuild: pkg: grep include/config/auto.conf instead of $KCONFIG_CONFIG kbuild: deb-pkg: introduce is_enabled and if_enabled_echo to builddeb kbuild: deb-pkg: add CONFIG_ prefix to kernel config options kbuild: add workaround for Debian make-kpkg kbuild: source include/config/auto.conf instead of ${KCONFIG_CONFIG} unicore32: simplify linker script generation for decompressor h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- kbuild: move archive command to scripts/Makefile.lib modpost: always show verbose warning for section mismatch ia64: prefix header search path with $(srctree)/ libfdt: prefix header search paths with $(srctree)/ deb-pkg: generate correct build dependencies
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 asm updates from Thomas Gleixner: "Two cleanup patches removing dead conditionals and unused code" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Remove unused __constant_c_x_memset() macro and inlines x86/asm: Remove dead __GNUC__ conditionals
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Thomas Gleixner: "Three fixes for the fallout from the TSX errata workaround: - Prevent memory corruption caused by a unchecked out of bound array index. - Two trivial fixes to address compiler warnings" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Make dev_attr_allow_tsx_force_abort static perf/x86: Fixup typo in stub functions perf/x86/intel: Fix memory corruption
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds authored
Pull xen fix from Juergen Gross: "A fix for a Xen bug introduced by David's series for excluding ballooned pages in vmcores" * tag 'for-linus-5.1b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Fix mapping PG_offline pages to user space
-
git://github.com/martinetd/linuxLinus Torvalds authored
Pull 9p updates from Dominique Martinet: "Here is a 9p update for 5.1; there honestly hasn't been much. Two fixes (leak on invalid mount argument and possible deadlock on i_size update on 32bit smp) and a fall-through warning cleanup" * tag '9p-for-5.1' of git://github.com/martinetd/linux: 9p/net: fix memory leak in p9_client_create 9p: use inode->i_lock to protect i_size_write() under 32-bit 9p: mark expected switch fall-through
-
kbuild test robot authored
Fixes: 400816f6 ("perf/x86/intel: Implement support for TSX Force Abort") Signed-off-by: kbuild test robot <lkp@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: kbuild-all@01.org Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190313184243.GA10820@lkp-sb-ep06
-
Masahiro Yamada authored
When this .gitignore was added, lxdialog was an independent hostprogs-y. Now that all objects in lxdialog/ are directly linked to mconf, the lxdialog is no longer generated. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Masahiro Yamada authored
Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes the common Kbuild.asm file. Factor out the duplicated include directives to scripts/Makefile.asm-generic so that no architecture would opt out of the mandatory-y mechanism. um is not forced to include mandatory-y since it is a very exceptional case which does not support UAPI. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Masahiro Yamada authored
The generic-y is redundant under the following condition: - arch has its own implementation - the same header is added to generated-y - the same header is added to mandatory-y If a redundant generic-y is found, the warning like follows is displayed: scripts/Makefile.asm-generic:20: redundant generic-y found in arch/arm/include/asm/Kbuild: timex.h I fixed up arch Kbuild files found by this. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Douglas Anderson authored
This reverts commit caf6fe91. The commit was fine but is no longer needed as of commit 3a2429e1 ("kbuild: change if_changed_rule for multi-line recipe"). Let's go back to using ";" to be consistent. For some discussion, see: https://lkml.kernel.org/r/CAK7LNASde0Q9S5GKeQiWhArfER4S4wL1=R_FW8q0++_X3T5=hQ@mail.gmail.comSigned-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Douglas Anderson authored
During a simple no-op (nothing changed) build I saw 39 invocations of the C compiler with the argument "-print-file-name=include". We don't need to call the C compiler 39 times for this--one time will suffice. Let's change NOSTDINC_FLAGS to a simply expanded variable to avoid this since there doesn't appear to be any reason it should be recursively expanded. On my build this shaved ~400 ms off my "no-op" build. Note that the recursive expansion seems to date back to the (really old) commit e8f5bdb0 ("[PATCH] Makefile include path ordering"). It's a little unclear to me if the point of that patch was to switch the variable to be recursively expanded (which it did) or to avoid directly assigning to NOSTDINC_FLAGS (AKA to switch to +=) because someone else (out of tree?) was setting it. I presume later since if the only goal was to switch to recursive expansion the patch would have just removed the ":". Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Arseny Maslennikov authored
* The man page for dpkg-source(1) notes: > -b, --build directory [format-specific-parameters] > Build a source package (--build since dpkg 1.17.14). > <...> > > dpkg-source will build the source package with the first > format found in this ordered list: the format indicated > with the --format command line option, the format > indicated in debian/source/format, “1.0”. The fallback > to “1.0” is deprecated and will be removed at some point > in the future, you should always document the desired > source format in debian/source/format. See section > SOURCE PACKAGE FORMATS for an extensive description of > the various source package formats. Thus it would be more foolproof to explicitly use 1.0 (as we always did) than to rely on dpkg-source's defaults. * In a similar vein, debian/rules is not made executable by mkdebian, and dpkg-source warns about that but still silently fixes the file. Let's be explicit once again. Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Wen Yang authored
The of_find_device_by_node() takes a reference to the underlying device structure, we should release that reference. The implementation of this semantic code search is: In a function, for a local variable returned by calling of_find_device_by_node(), a, if it is released by a function such as put_device()/of_dev_put()/platform_device_put() after the last use, it is considered that there is no reference leak; b, if it is passed back to the caller via dev_get_drvdata()/platform_get_drvdata()/get_device(), etc., the reference will be released in other functions, and the current function also considers that there is no reference leak; c, for the rest of the situation, the current function should release the reference by calling put_device, this code search will report the corresponding error message. By using this semantic code search, we have found some object reference leaks, such as: commit 11907e9d ("ASoC: fsl-asoc-card: fix object reference leaks in fsl_asoc_card_probe") commit a12085d1 ("mtd: rawnand: atmel: fix possible object reference leak") commit 11493f26 ("mtd: rawnand: jz4780: fix possible object reference leak") There are still dozens of reference leaks in the current kernel code. Further, for the case of b, the object returned to other functions may also have a reference leak, we will continue to develop other cocci scripts to further check the reference leak. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Reviewed-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Markus Elfring <Markus.Elfring@web.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
- 16 Mar, 2019 9 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds authored
Pull pidfd system call from Christian Brauner: "This introduces the ability to use file descriptors from /proc/<pid>/ as stable handles on struct pid. Even if a pid is recycled the handle will not change. For a start these fds can be used to send signals to the processes they refer to. With the ability to use /proc/<pid> fds as stable handles on struct pid we can fix a long-standing issue where after a process has exited its pid can be reused by another process. If a caller sends a signal to a reused pid it will end up signaling the wrong process. With this patchset we enable a variety of use cases. One obvious example is that we can now safely delegate an important part of process management - sending signals - to processes other than the parent of a given process by sending file descriptors around via scm rights and not fearing that the given process will have been recycled in the meantime. It also allows for easy testing whether a given process is still alive or not by sending signal 0 to a pidfd which is quite handy. There has been some interest in this feature e.g. from systems management (systemd, glibc) and container managers. I have requested and gotten comments from glibc to make sure that this syscall is suitable for their needs as well. In the future I expect it to take on most other pid-based signal syscalls. But such features are left for the future once they are needed. This has been sitting in linux-next for quite a while and has not caused any issues. It comes with selftests which verify basic functionality and also test that a recycled pid cannot be signaled via a pidfd. Jon has written about a prior version of this patchset. It should cover the basic functionality since not a lot has changed since then: https://lwn.net/Articles/773459/ The commit message for the syscall itself is extensively documenting the syscall, including it's functionality and extensibility" * tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add tests for pidfd_send_signal() signal: add pidfd_send_signal() syscall
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds authored
Pull device-dax updates from Dan Williams: "New device-dax infrastructure to allow persistent memory and other "reserved" / performance differentiated memories, to be assigned to the core-mm as "System RAM". Some users want to use persistent memory as additional volatile memory. They are willing to cope with potential performance differences, for example between DRAM and 3D Xpoint, and want to use typical Linux memory management apis rather than a userspace memory allocator layered over an mmap() of a dax file. The administration model is to decide how much Persistent Memory (pmem) to use as System RAM, create a device-dax-mode namespace of that size, and then assign it to the core-mm. The rationale for device-dax is that it is a generic memory-mapping driver that can be layered over any "special purpose" memory, not just pmem. On subsequent boots udev rules can be used to restore the memory assignment. One implication of using pmem as RAM is that mlock() no longer keeps data off persistent media. For this reason it is recommended to enable NVDIMM Security (previously merged for 5.0) to encrypt pmem contents at rest. We considered making this recommendation an actively enforced requirement, but in the end decided to leave it as a distribution / administrator policy to allow for emulation and test environments that lack security capable NVDIMMs. Summary: - Replace the /sys/class/dax device model with /sys/bus/dax, and include a compat driver so distributions can opt-in to the new ABI. - Allow for an alternative driver for the device-dax address-range - Introduce the 'kmem' driver to hotplug / assign a device-dax address-range to the core-mm. - Arrange for the device-dax target-node to be onlined so that the newly added memory range can be uniquely referenced by numa apis" NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because we currently have special - and very annoying rules in the kernel about accessing PMEM only with the "MC safe" accessors, because machine checks inside the regular repeat string copy functions can be fatal in some (not described) circumstances. And apparently the PMEM modules can cause that a lot more than regular RAM. The argument is that this happens because PMEM doesn't necessarily get scrubbed at boot like RAM does, but that is planned to be added for the user space tooling. Quoting Dan from another email: "The exposure can be reduced in the volatile-RAM case by scanning for and clearing errors before it is onlined as RAM. The userspace tooling for that can be in place before v5.1-final. There's also runtime notifications of errors via acpi_nfit_uc_error_notify() from background scrubbers on the DIMM devices. With that mechanism the kernel could proactively clear newly discovered poison in the volatile case, but that would be additional development more suitable for v5.2. I understand the concern, and the need to highlight this issue by tapping the brakes on feature development, but I don't see PMEM as RAM making the situation worse when the exposure is also there via DAX in the PMEM case. Volatile-RAM is arguably a safer use case since it's possible to repair pages where the persistent case needs active application coordination" * tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: "Hotplug" persistent memory for use like normal RAM mm/resource: Let walk_system_ram_range() search child resources mm/memory-hotplug: Allow memory resources to be children mm/resource: Move HMM pr_debug() deeper into resource code mm/resource: Return real error codes from walk failures device-dax: Add a 'modalias' attribute to DAX 'bus' devices device-dax: Add a 'target_node' attribute device-dax: Auto-bind device after successful new_id acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node device-dax: Add /sys/class/dax backwards compatibility device-dax: Add support for a dax override driver device-dax: Move resource pinning+mapping into the common driver device-dax: Introduce bus + driver model device-dax: Start defining a dax bus model device-dax: Remove multi-resource infrastructure device-dax: Kill dax_region base device-dax: Kill dax_region ida
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull more SCSI updates from James Bottomley: "This is the final round of mostly small fixes and performance improvements to our initial submit. The main regression fix is the ia64 simscsi build failure which was missed in the serial number elimination conversion" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: ia64: simscsi: use request tag instead of serial_number scsi: aacraid: Fix performance issue on logical drives scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup() scsi: libiscsi: Hold back_lock when calling iscsi_complete_task scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port scsi: hisi_sas: Set PHY linkrate when disconnected scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO scsi: hisi_sas: Change return variable type in phy_up_v3_hw() scsi: qla2xxx: check for kstrtol() failure scsi: lpfc: fix 32-bit format string warning scsi: lpfc: fix unused variable warning scsi: target: tcmu: Switch to bitmap_zalloc() scsi: libiscsi: fall back to sendmsg for slab pages scsi: qla2xxx: avoid printf format warning scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check scsi: ufs: hisi: fix ufs_hba_variant_ops passing scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show ...
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull more block layer changes from Jens Axboe: "This is a collection of both stragglers, and fixes that came in after I finalized the initial pull. This contains: - An MD pull request from Song, with a few minor fixes - Set of NVMe patches via Christoph - Pull request from Konrad, with a few fixes for xen/blkback - pblk fix IO calculation fix (Javier) - Segment calculation fix for pass-through (Ming) - Fallthrough annotation for blkcg (Mathieu)" * tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block: (25 commits) blkcg: annotate implicit fall through nvme-tcp: support C2HData with SUCCESS flag nvmet: ignore EOPNOTSUPP for discard nvme: add proper write zeroes setup for the multipath device nvme: add proper discard setup for the multipath device nvme: remove nvme_ns_config_oncs nvme: disable Write Zeroes for qemu controllers nvmet-fc: bring Disconnect into compliance with FC-NVME spec nvmet-fc: fix issues with targetport assoc_list list walking nvme-fc: reject reconnect if io queue count is reduced to zero nvme-fc: fix numa_node when dev is null nvme-fc: use nr_phys_segments to determine existence of sgl nvme-loop: init nvmet_ctrl fatal_err_work when allocate nvme: update comment to make the code easier to read nvme: put ns_head ref if namespace fails allocation nvme-trace: fix cdw10 buffer overrun nvme: don't warn on block content change effects nvme: add get-feature to admin cmds tracer md: Fix failed allocation of md_register_thread It's wrong to add len to sector_nr in raid10 reshape twice ...
-
git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfixes: - Fix an Oops in SUNRPC back channel tracepoints - Fix a SUNRPC client regression when handling oversized replies - Fix the minimal size for SUNRPC reply buffer allocation - rpc_decode_header() must always return a non-zero value on error - Fix a typo in pnfs_update_layout() Cleanup: - Remove redundant check for the reply length in call_decode()" * tag 'nfs-for-5.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Remove redundant check for the reply length in call_decode() SUNRPC: Handle the SYSTEM_ERR rpc error SUNRPC: rpc_decode_header() must always return a non-zero value on error SUNRPC: Use the ENOTCONN error on socket disconnect SUNRPC: Fix the minimal size for reply buffer allocation SUNRPC: Fix a client regression when handling oversized replies pNFS: Fix a typo in pnfs_update_layout fix null pointer deref in tracepoints in back channel
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: "One fix to prevent runtime allocation of 16GB pages when running in a VM (as opposed to bare metal), because it doesn't work. A small fix to our recently added KCOV support to exempt some more code from being instrumented. Plus a few minor build fixes, a small dead code removal and a defconfig update. Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Christophe Leroy, Jason Yan, Joel Stanley, Mahesh Salgaonkar, Mathieu Malaterre" * tag 'powerpc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Include <asm/nmi.h> header file to fix a warning powerpc/powernv: Fix compile without CONFIG_TRACEPOINTS powerpc/mm: Disable kcov for SLB routines powerpc: remove dead code in head_fsl_booke.S powerpc/configs: Sync skiroot defconfig powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs mount infrastructure fix from Al Viro: "Fixup for sysfs braino. Capabilities checks for sysfs mount do include those on netns, but only if CONFIG_NET_NS is enabled. Sorry, should've caught that earlier..." * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix sysfs_init_fs_context() in !CONFIG_NET_NS case
-
Al Viro authored
Permission checks on current's netns should be done only when netns are enabled. Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Fixes: 23bf1b6bSigned-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull more smb3 updates from Steve French: "Various tracing and debugging improvements, crediting fixes, some cleanup, and important fallocate fix (fixes three xfstests) and lock fix. Summary: - Various additional dynamic tracing tracepoints - Debugging improvements (including ability to query the server via SMB3 fsctl from userspace tools which can help with stats and debugging) - One minor performance improvement (root directory inode caching) - Crediting (SMB3 flow control) fixes - Some cleanup (docs and to mknod) - Important fixes: one to smb3 implementation of fallocate zero range (which fixes three xfstests) and a POSIX lock fix" * tag '5.1-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6: (22 commits) CIFS: fix POSIX lock leak and invalid ptr deref SMB3: Allow SMB3 FSCTL queries to be sent to server from tools cifs: fix incorrect handling of smb2_set_sparse() return in smb3_simple_falloc smb2: fix typo in definition of a few error flags CIFS: make mknod() an smb_version_op cifs: minor documentation updates cifs: remove unused value pointed out by Coverity SMB3: passthru query info doesn't check for SMB3 FSCTL passthru smb3: add dynamic tracepoints for simple fallocate and zero range cifs: fix smb3_zero_range so it can expand the file-size when required cifs: add SMB2_ioctl_init/free helpers to be used with compounding smb3: Add dynamic trace points for various compounded smb3 ops cifs: cache FILE_ALL_INFO for the shared root handle smb3: display volume serial number for shares in /proc/fs/cifs/DebugData cifs: simplify how we handle credits in compound_send_recv() smb3: add dynamic tracepoint for timeout waiting for credits smb3: display security information in /proc/fs/cifs/DebugData more accurately cifs: add a timeout argument to wait_for_free_credits cifs: prevent starvation in wait_for_free_credits for multi-credit requests cifs: wait_for_free_credits() make it possible to wait for >=1 credits ...
-
- 15 Mar, 2019 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/rw/umlLinus Torvalds authored
Pull UML updates from Richard Weinberger: "Bugfix for the UML block device driver" * 'for-linus-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Fix for a possible OOPS in ubd initialization um: Remove duplicated include from vector_user.c
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM updates from Paolo Bonzini: "ARM: - some cleanups - direct physical timer assignment - cache sanitization for 32-bit guests s390: - interrupt cleanup - introduction of the Guest Information Block - preparation for processor subfunctions in cpu models PPC: - bug fixes and improvements, especially related to machine checks and protection keys x86: - many, many cleanups, including removing a bunch of MMU code for unnecessary optimizations - AVIC fixes Generic: - memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits) kvm: vmx: fix formatting of a comment KVM: doc: Document the life cycle of a VM and its resources MAINTAINERS: Add KVM selftests to existing KVM entry Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char() KVM: PPC: Fix compilation when KVM is not enabled KVM: Minor cleanups for kvm_main.c KVM: s390: add debug logging for cpu model subfunctions KVM: s390: implement subfunction processor calls arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2 KVM: arm/arm64: Remove unused timer variable KVM: PPC: Book3S: Improve KVM reference counting KVM: PPC: Book3S HV: Fix build failure without IOMMU support Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()" x86: kvmguest: use TSC clocksource if invariant TSC is exposed KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes() KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull tracing fixes and cleanups from Steven Rostedt: "This contains a series of last minute clean ups, small fixes and error checks" * tag 'trace-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/probe: Verify alloc_trace_*probe() result tracing/probe: Check event/group naming rule at parsing tracing/probe: Check the size of argument name and body tracing/probe: Check event name length correctly tracing/probe: Check maxactive error cases tracing: kdb: Fix ftdump to not sleep trace/probes: Remove kernel doc style from non kernel doc comment tracing/probes: Make reserved_field_names static
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds authored
Pull IOMMU fix from Joerg Roedel: "Fix a NULL-pointer dereference issue in the ACPI device matching code of the AMD IOMMU driver" * tag 'iommu-fix-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix NULL dereference bug in match_hid_uid
-