- 21 Jul, 2017 40 commits
-
-
Eric W. Biederman authored
commit 296990de upstream. Andrei Vagin pointed out that time to executue propagate_umount can go non-linear (and take a ludicrious amount of time) when the mount propogation trees of the mounts to be unmunted by a lazy unmount overlap. Make the walk of the mount propagation trees nearly linear by remembering which mounts have already been visited, allowing subsequent walks to detect when walking a mount propgation tree or a subtree of a mount propgation tree would be duplicate work and to skip them entirely. Walk the list of mounts whose propgatation trees need to be traversed from the mount highest in the mount tree to mounts lower in the mount tree so that odds are higher that the code will walk the largest trees first, allowing later tree walks to be skipped entirely. Add cleanup_umount_visitation to remover the code's memory of which mounts have been visited. Add the functions last_slave and skip_propagation_subtree to allow skipping appropriate parts of the mount propagation tree without needing to change the logic of the rest of the code. A script to generate overlapping mount propagation trees: $ cat runs.h set -e mount -t tmpfs zdtm /mnt mkdir -p /mnt/1 /mnt/2 mount -t tmpfs zdtm /mnt/1 mount --make-shared /mnt/1 mkdir /mnt/1/1 iteration=10 if [ -n "$1" ] ; then iteration=$1 fi for i in $(seq $iteration); do mount --bind /mnt/1/1 /mnt/1/1 done mount --rbind /mnt/1 /mnt/2 TIMEFORMAT='%Rs' nr=$(( ( 2 ** ( $iteration + 1 ) ) + 1 )) echo -n "umount -l /mnt/1 -> $nr " time umount -l /mnt/1 nr=$(cat /proc/self/mountinfo | grep zdtm | wc -l ) time umount -l /mnt/2 $ for i in $(seq 9 19); do echo $i; unshare -Urm bash ./run.sh $i; done Here are the performance numbers with and without the patch: mhash | 8192 | 8192 | 1048576 | 1048576 mounts | before | after | before | after ------------------------------------------------ 1025 | 0.040s | 0.016s | 0.038s | 0.019s 2049 | 0.094s | 0.017s | 0.080s | 0.018s 4097 | 0.243s | 0.019s | 0.206s | 0.023s 8193 | 1.202s | 0.028s | 1.562s | 0.032s 16385 | 9.635s | 0.036s | 9.952s | 0.041s 32769 | 60.928s | 0.063s | 44.321s | 0.064s 65537 | | 0.097s | | 0.097s 131073 | | 0.233s | | 0.176s 262145 | | 0.653s | | 0.344s 524289 | | 2.305s | | 0.735s 1048577 | | 7.107s | | 2.603s Andrei Vagin reports fixing the performance problem is part of the work to fix CVE-2016-6213. Fixes: a05964f3 ("[PATCH] shared mounts handling: umount") Reported-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric W. Biederman authored
commit 99b19d16 upstream. While investigating some poor umount performance I realized that in the case of overlapping mount trees where some of the mounts are locked the code has been failing to unmount all of the mounts it should have been unmounting. This failure to unmount all of the necessary mounts can be reproduced with: $ cat locked_mounts_test.sh mount -t tmpfs test-base /mnt mount --make-shared /mnt mkdir -p /mnt/b mount -t tmpfs test1 /mnt/b mount --make-shared /mnt/b mkdir -p /mnt/b/10 mount -t tmpfs test2 /mnt/b/10 mount --make-shared /mnt/b/10 mkdir -p /mnt/b/10/20 mount --rbind /mnt/b /mnt/b/10/20 unshare -Urm --propagation unchaged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi' sleep 1 umount -l /mnt/b wait %% $ unshare -Urm ./locked_mounts_test.sh This failure is corrected by removing the prepass that marks mounts that may be umounted. A first pass is added that umounts mounts if possible and if not sets mount mark if they could be unmounted if they weren't locked and adds them to a list to umount possibilities. This first pass reconsiders the mounts parent if it is on the list of umount possibilities, ensuring that information of umoutability will pass from child to mount parent. A second pass then walks through all mounts that are umounted and processes their children unmounting them or marking them for reparenting. A last pass cleans up the state on the mounts that could not be umounted and if applicable reparents them to their first parent that remained mounted. While a bit longer than the old code this code is much more robust as it allows information to flow up from the leaves and down from the trunk making the order in which mounts are encountered in the umount propgation tree irrelevant. Fixes: 0c56fe31 ("mnt: Don't propagate unmounts to locked mounts") Reviewed-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric W. Biederman authored
commit 570487d3 upstream. It was observed that in some pathlogical cases that the current code does not unmount everything it should. After investigation it was determined that the issue is that mnt_change_mntpoint can can change which mounts are available to be unmounted during mount propagation which is wrong. The trivial reproducer is: $ cat ./pathological.sh mount -t tmpfs test-base /mnt cd /mnt mkdir 1 2 1/1 mount --bind 1 1 mount --make-shared 1 mount --bind 1 2 mount --bind 1/1 1/1 mount --bind 1/1 1/1 echo grep test-base /proc/self/mountinfo umount 1/1 echo grep test-base /proc/self/mountinfo $ unshare -Urm ./pathological.sh The expected output looks like: 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 49 54 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 50 53 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 51 49 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 54 47 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 53 48 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 52 50 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 The output without the fix looks like: 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 49 54 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 50 53 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 51 49 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 54 47 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 53 48 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 52 50 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 52 48 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 That last mount in the output was in the propgation tree to be unmounted but was missed because the mnt_change_mountpoint changed it's parent before the walk through the mount propagation tree observed it. Fixes: 1064f874 ("mnt: Tuck mounts under others instead of creating shadow/side mounts.") Acked-by: Andrei Vagin <avagin@virtuozzo.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Kelley authored
commit 13b9abfc upstream. Extend the disabling of preemption to include the hypercall so that another thread can't get the CPU and corrupt the per-cpu page used for hypercall arguments. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Hovold authored
commit 3360acdf upstream. Make sure to deregister and release the nvmem device and underlying memory on registration errors. Note that the private data must be freed using put_device() once the struct device has been initialised. Also note that there's a related reference leak in the deregistration function as reported by Mika Westerberg which is being fixed separately. Fixes: b6c217ab ("nvmem: Add backwards compatibility support for older EEPROM drivers.") Fixes: eace75cf ("nvmem: Add a simple NVMEM framework for nvmem providers") Cc: Andrew Lunn <andrew@lunn.ch> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul E. McKenney authored
commit 6b5fc3a1 upstream. Wait/wakeup operations do not guarantee ordering on their own. Instead, either locking or memory barriers are required. This commit therefore adds memory barriers to wake_nocb_leader() and nocb_leader_wait(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Adam Borowski authored
commit 6987dc8a upstream. Only read access is checked before this call. Actually, at the moment this is not an issue, as every in-tree arch does the same manual checks for VERIFY_READ vs VERIFY_WRITE, relying on the MMU to tell them apart, but this wasn't the case in the past and may happen again on some odd arch in the future. If anyone cares about 3.7 and earlier, this is a security hole (untested) on real 80386 CPUs. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dong Bo authored
commit 48f99c8e upstream. Like arch/arm/, we inherit the READ_IMPLIES_EXEC personality flag across fork(). This is undesirable for a number of reasons: * ELF files that don't require executable stack can end up with it anyway * We end up performing un-necessary I-cache maintenance when mapping what should be non-executable pages * Restricting what is executable is generally desirable when defending against overflow attacks This patch clears the personality flag when setting up the personality for newly spwaned native tasks. Given that semi-recent AArch64 toolchains emit a non-executable PT_GNU_STACK header, userspace applications can already not rely on READ_IMPLIES_EXEC so shouldn't be adversely affected by this change. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dong Bo <dongbo4@huawei.com> [will: added comment to compat code, rewrote commit message] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 88cda007 upstream. Contrary to popular belief, PPIs connected to a GICv3 to not have an affinity field similar to that of GICv2. That is consistent with the fact that GICv3 is designed to accomodate thousands of CPUs, and fitting them as a bitmap in a byte is... difficult. Fixes: adbc3695 ("arm64: dts: add the Marvell Armada 3700 family and a development board") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Balbir Singh authored
commit 1e2a516e upstream. This patch fixes a crash seen while doing a kexec from radix mode to hash mode. Key 0 is special in hash and used in the RPN by default, we set the key values to 0 today. In radix mode key 0 is used to control supervisor<->user access. In hash key 0 is used by default, so the first instruction after the switch causes a crash on kexec. Commit 3b10d009 ("powerpc/mm/radix: Prevent kernel execution of user space") introduced the setting of IAMR and AMOR values to prevent execution of user mode instructions from supervisor mode. We need to clean up these SPR's on kexec. Fixes: 3b10d009 ("powerpc/mm/radix: Prevent kernel execution of user space") Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit da029c11 upstream. To avoid pathological stack usage or the need to special-case setuid execs, just limit all arg stack usage to at most 75% of _STK_LIM (6MB). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit a73dc537 upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For s390 the position could be 0x10000, but that is needlessly close to the NULL address. Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.orgSigned-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit 47ebb09d upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.orgSigned-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit 02445990 upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, to match ARM. This could be 0x8000, the standard ET_EXEC load address, but that is needlessly close to the NULL address, and anyone running arm compat PIE will have an MMU, so the tight mapping is not needed. Link: http://lkml.kernel.org/r/1498251600-132458-4-git-send-email-keescook@chromium.orgSigned-off-by: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit 6a9af90a upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. 4MB is chosen here mainly to have parity with x86, where this is the traditional minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For ARM the position could be 0x8000, the standard ET_EXEC load address, but that is needlessly close to the NULL address, and anyone running PIE on 32-bit ARM will have an MMU, so the tight mapping is not needed. Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.orgSigned-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit eab09532 upstream. The ELF_ET_DYN_BASE position was originally intended to keep loaders away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2 /bin/cat" might cause the subsequent load of /bin/cat into where the loader had been loaded.) With the advent of PIE (ET_DYN binaries with an INTERP Program Header), ELF_ET_DYN_BASE continued to be used since the kernel was only looking at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the top 1/3rd of the TASK_SIZE, a substantial portion of the address space is unused. For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are loaded above the mmap region. This means they can be made to collide (CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with pathological stack regions. Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap region in all cases, and will now additionally avoid programs falling back to the mmap region by enforcing MAP_FIXED for program loads (i.e. if it would have collided with the stack, now it will fail to load instead of falling back to the mmap region). To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP) are loaded into the mmap region, leaving space available for either an ET_EXEC binary with a fixed location or PIE being loaded into mmap by the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE, which means architectures can now safely lower their values without risk of loaders colliding with their subsequently loaded programs. For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and suggestions on how to implement this solution. Fixes: d1fd836d ("mm: split ET_DYN ASLR from mmap ASLR") Link: http://lkml.kernel.org/r/20170621173201.GA114489@beastSigned-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Pratyush Anand <panand@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Cyril Bur authored
commit 8d81ae05 upstream. As of perl 5, version 26, subversion 0 (v5.26.0) some new warnings have occurred when running checkpatch. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){ <-- HERE \s*/ at scripts/checkpatch.pl line 3544. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){ <-- HERE \s*/ at scripts/checkpatch.pl line 3885. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(\+.*(?:do|\))){ <-- HERE / at scripts/checkpatch.pl line 4374. It seems perfectly reasonable to do as the warning suggests and simply escape the left brace in these three locations. Link: http://lkml.kernel.org/r/20170607060135.17384-1-cyrilbur@gmail.comSigned-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sahitya Tummala authored
commit b17c070f upstream. __list_lru_walk_one() acquires nlru spin lock (nlru->lock) for longer duration if there are more number of items in the lru list. As per the current code, it can hold the spin lock for upto maximum UINT_MAX entries at a time. So if there are more number of items in the lru list, then "BUG: spinlock lockup suspected" is observed in the below path: spin_bug+0x90 do_raw_spin_lock+0xfc _raw_spin_lock+0x28 list_lru_add+0x28 dput+0x1c8 path_put+0x20 terminate_walk+0x3c path_lookupat+0x100 filename_lookup+0x6c user_path_at_empty+0x54 SyS_faccessat+0xd0 el0_svc_naked+0x24 This nlru->lock is acquired by another CPU in this path - d_lru_shrink_move+0x34 dentry_lru_isolate_shrink+0x48 __list_lru_walk_one.isra.10+0x94 list_lru_walk_node+0x40 shrink_dcache_sb+0x60 do_remount_sb+0xbc do_emergency_remount+0xb0 process_one_work+0x228 worker_thread+0x2e0 kthread+0xf4 ret_from_fork+0x10 Fix this lockup by reducing the number of entries to be shrinked from the lru list to 1024 at once. Also, add cond_resched() before processing the lru list again. Link: http://marc.info/?t=149722864900001&r=1&w=2 Link: http://lkml.kernel.org/r/1498707575-2472-1-git-send-email-stummala@codeaurora.orgSigned-off-by: Sahitya Tummala <stummala@codeaurora.org> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sahitya Tummala authored
commit 2c80cd57 upstream. list_lru_count_node() iterates over all memcgs to get the total number of entries on the node but it can race with memcg_drain_all_list_lrus(), which migrates the entries from a dead cgroup to another. This can return incorrect number of entries from list_lru_count_node(). Fix this by keeping track of entries per node and simply return it in list_lru_count_node(). Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.orgSigned-off-by: Sahitya Tummala <stummala@codeaurora.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marcin Nowakowski authored
commit c0d80dda upstream. core_kernel_text is used by MIPS in its function graph trace processing, so having this method traced leads to an infinite set of recursive calls such as: Call Trace: ftrace_return_to_handler+0x50/0x128 core_kernel_text+0x10/0x1b8 prepare_ftrace_return+0x6c/0x114 ftrace_graph_caller+0x20/0x44 return_to_handler+0x10/0x30 return_to_handler+0x0/0x30 return_to_handler+0x0/0x30 ftrace_ops_no_ops+0x114/0x1bc core_kernel_text+0x10/0x1b8 core_kernel_text+0x10/0x1b8 core_kernel_text+0x10/0x1b8 ftrace_ops_no_ops+0x114/0x1bc core_kernel_text+0x10/0x1b8 prepare_ftrace_return+0x6c/0x114 ftrace_graph_caller+0x20/0x44 (...) Mark the function notrace to avoid it being traced. Link: http://lkml.kernel.org/r/1498028607-6765-1-git-send-email-marcin.nowakowski@imgtec.comSigned-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Meyer <thomas@m3y3r.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kirill A. Shutemov authored
commit bbf29ffc upstream. Reinette reported the following crash: BUG: Bad page state in process log2exe pfn:57600 page:ffffea00015d8000 count:0 mapcount:0 mapping: (null) index:0x20200 flags: 0x4000000000040019(locked|uptodate|dirty|swapbacked) raw: 4000000000040019 0000000000000000 0000000000020200 00000000ffffffff raw: ffffea00015d8020 ffffea00015d8020 0000000000000000 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x1(locked) Modules linked in: rfcomm 8021q bnep intel_rapl x86_pkg_temp_thermal coretemp efivars btusb btrtl btbcm pwm_lpss_pci snd_hda_codec_hdmi btintel pwm_lpss snd_hda_codec_realtek snd_soc_skl snd_hda_codec_generic snd_soc_skl_ipc spi_pxa2xx_platform snd_soc_sst_ipc snd_soc_sst_dsp i2c_designware_platform i2c_designware_core snd_hda_ext_core snd_soc_sst_match snd_hda_intel snd_hda_codec mei_me snd_hda_core mei snd_soc_rt286 snd_soc_rl6347a snd_soc_core efivarfs CPU: 1 PID: 354 Comm: log2exe Not tainted 4.12.0-rc7-test-test #19 Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0027.2016.1108.1529 11/08/2016 Call Trace: bad_page+0x16a/0x1f0 free_pages_check_bad+0x117/0x190 free_hot_cold_page+0x7b1/0xad0 __put_page+0x70/0xa0 madvise_free_huge_pmd+0x627/0x7b0 madvise_free_pte_range+0x6f8/0x1150 __walk_page_range+0x6b5/0xe30 walk_page_range+0x13b/0x310 madvise_free_page_range.isra.16+0xad/0xd0 madvise_free_single_vma+0x2e4/0x470 SyS_madvise+0x8ce/0x1450 If somebody frees the page under us and we hold the last reference to it, put_page() would attempt to free the page before unlocking it. The fix is trivial reorder of operations. Dave said: "I came up with the exact same patch. For posterity, here's the test case, generated by syzkaller and trimmed down by Reinette: https://www.sr71.net/~dave/intel/log2.c And the config that helps detect this: https://www.sr71.net/~dave/intel/config-log2" Fixes: b8d3c4c3 ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called") Link: http://lkml.kernel.org/r/20170628101249.17879-1-kirill.shutemov@linux.intel.comSigned-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Rientjes authored
commit 9a04dbcf upstream. The motivation for commit abb2ea7d ("compiler, clang: suppress warning for unused static inline functions") was to suppress clang's warnings about unused static inline functions. For configs without CONFIG_OPTIMIZE_INLINING enabled, such as any non-x86 architecture, `inline' in the kernel implies that __attribute__((always_inline)) is used. Some code depends on that behavior, see https://lkml.org/lkml/2017/6/13/918: net/built-in.o: In function `__xchg_mb': arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99' arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99 The full fix would be to identify these breakages and annotate the functions with __always_inline instead of `inline'. But since we are late in the 4.12-rc cycle, simply carry forward the forced inlining behavior and work toward moving arm64, and other architectures, toward CONFIG_OPTIMIZE_INLINING behavior. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706261552200.1075@chino.kir.corp.google.comSigned-off-by: David Rientjes <rientjes@google.com> Reported-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ben Hutchings authored
commit 98dcea0c upstream. liblockdep has been broken since commit 75dd602a ("lockdep: Fix lock_chain::base size"), as that adds a check that MAX_LOCK_DEPTH is within the range of lock_chain::depth and in liblockdep it is much too large. That should have resulted in a compiler error, but didn't because: - the check uses ARRAY_SIZE(), which isn't yet defined in liblockdep so is assumed to be an (undeclared) function - putting a function call inside a BUILD_BUG_ON() expression quietly turns it into some nonsense involving a variable-length array It did produce a compiler warning, but I didn't notice because liblockdep already produces too many warnings if -Wall is enabled (which I'll fix shortly). Even before that commit, which reduced lock_chain::depth from 8 bits to 6, MAX_LOCK_DEPTH was too large. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/20170525130005.5947-3-alexander.levin@verizon.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit 649aa242 upstream. This is because of commit f98db601 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") in which switch_mm_irqs_off() is called by the scheduler, vs switch_mm() which is used by use_mm(). This patch lets the parisc code mirror the x86 and powerpc code, ie. it disables interrupts in switch_mm(), and optimises the scheduler case by defining switch_mm_irqs_off(). Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Bogendoerfer authored
commit 33f9e024 upstream. Enabling parport pc driver on a B2600 (and probably other 64bit PARISC systems) produced following BUG: CPU: 0 PID: 1 Comm: swapper Not tainted 4.12.0-rc5-30198-g1132d5e7 #156 task: 000000009e050000 task.stack: 000000009e04c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Not tainted r00-03 000000ff0806ff0f 000000009e04c990 0000000040871b78 000000009e04cac0 r04-07 0000000040c14de0 ffffffffffffffff 000000009e07f098 000000009d82d200 r08-11 000000009d82d210 0000000000000378 0000000000000000 0000000040c345e0 r12-15 0000000000000005 0000000040c345e0 0000000000000000 0000000040c9d5e0 r16-19 0000000040c345e0 00000000f00001c4 00000000f00001bc 0000000000000061 r20-23 000000009e04ce28 0000000000000010 0000000000000010 0000000040b89e40 r24-27 0000000000000003 0000000000ffffff 000000009d82d210 0000000040c14de0 r28-31 0000000000000000 000000009e04ca90 000000009e04cb40 0000000000000000 sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000404aece0 00000000404aece4 IIR: 03ffe01f ISR: 0000000010340000 IOR: 000001781304cac8 CPU: 0 CR30: 000000009e04c000 CR31: 00000000e2976de2 ORIG_R28: 0000000000000200 IAOQ[0]: sba_dma_supported+0x80/0xd0 IAOQ[1]: sba_dma_supported+0x84/0xd0 RP(r2): parport_pc_probe_port+0x178/0x1200 Cause is a call to dma_coerce_mask_and_coherenet in parport_pc_probe_port, which PARISC DMA API doesn't handle very nicely. This commit gives back DMA_ERROR_CODE for DMA API calls, if device isn't capable of DMA transaction. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit b0f94efd upstream. Architectures with a compat syscall table must put compat_sys_keyctl() in it, not sys_keyctl(). The parisc architecture was not doing this; fix it. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit 24746231 upstream. When a process runs out of stack the parisc kernel wrongly faults with SIGBUS instead of the expected SIGSEGV signal. This example shows how the kernel faults: do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000] trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000 The vma->vm_end value is the first address which does not belong to the vma, so adjust the check to include vma->vm_end to the range for which to send the SIGSEGV signal. This patch unbreaks building the debian libsigsegv package. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Suzuki K Poulose authored
commit 866d7c1b upstream. The GICv3 driver doesn't check if the target CPU for gic_set_affinity is valid before going ahead and making the changes. This triggers the following splat with KASAN: [ 141.189434] BUG: KASAN: global-out-of-bounds in gic_set_affinity+0x8c/0x140 [ 141.189704] Read of size 8 at addr ffff200009741d20 by task swapper/1/0 [ 141.189958] [ 141.190158] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.12.0-rc7 [ 141.190458] Hardware name: Foundation-v8A (DT) [ 141.190658] Call trace: [ 141.190908] [<ffff200008089d70>] dump_backtrace+0x0/0x328 [ 141.191224] [<ffff20000808a1b4>] show_stack+0x14/0x20 [ 141.191507] [<ffff200008504c3c>] dump_stack+0xa4/0xc8 [ 141.191858] [<ffff20000826c19c>] print_address_description+0x13c/0x250 [ 141.192219] [<ffff20000826c5c8>] kasan_report+0x210/0x300 [ 141.192547] [<ffff20000826ad54>] __asan_load8+0x84/0x98 [ 141.192874] [<ffff20000854eeec>] gic_set_affinity+0x8c/0x140 [ 141.193158] [<ffff200008148b14>] irq_do_set_affinity+0x54/0xb8 [ 141.193473] [<ffff200008148d2c>] irq_set_affinity_locked+0x64/0xf0 [ 141.193828] [<ffff200008148e00>] __irq_set_affinity+0x48/0x78 [ 141.194158] [<ffff200008bc48a4>] arm_perf_starting_cpu+0x104/0x150 [ 141.194513] [<ffff2000080d73bc>] cpuhp_invoke_callback+0x17c/0x1f8 [ 141.194783] [<ffff2000080d94ec>] notify_cpu_starting+0x8c/0xb8 [ 141.195130] [<ffff2000080911ec>] secondary_start_kernel+0x15c/0x200 [ 141.195390] [<0000000080db81b4>] 0x80db81b4 [ 141.195603] [ 141.195685] The buggy address belongs to the variable: [ 141.196012] __cpu_logical_map+0x200/0x220 [ 141.196176] [ 141.196315] Memory state around the buggy address: [ 141.196586] ffff200009741c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 141.196913] ffff200009741c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 141.197158] >ffff200009741d00: 00 00 00 00 fa fa fa fa 00 00 00 00 00 00 00 00 [ 141.197487] ^ [ 141.197758] ffff200009741d80: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00 [ 141.198060] ffff200009741e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 141.198358] ================================================================== [ 141.198609] Disabling lock debugging due to kernel taint [ 141.198961] CPU1: Booted secondary processor [410fd051] This patch adds the check to make sure the cpu is valid. Fixes: commit 021f6537 ("irqchip: gic-v3: Initial support for GICv3") Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Williamson authored
commit e323369b upstream. Unset-KVM and decrement-assignment only when we find the group in our list. Otherwise we can get out of sync if the user triggers this for groups that aren't currently on our list. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Mackerras authored
commit 00c14757 upstream. This fixes a typo where the wrong loop index was used to index the kvmppc_xive_vcpu.queues[] array in xive_pre_save_scan(). The variable i contains the vcpu number; we need to index queues[] using j, which iterates from 0 to KVMPPC_XIVE_Q_COUNT-1. The effect of this bug is that things that save the interrupt controller state, such as "virsh dump", on a VM with more than 8 vCPUs, result in xive_pre_save_queue() getting called on a bogus queue structure, usually resulting in a crash like this: [ 501.821107] Unable to handle kernel paging request for data at address 0x00000084 [ 501.821212] Faulting instruction address: 0xc008000004c7c6f8 [ 501.821234] Oops: Kernel access of bad area, sig: 11 [#1] [ 501.821305] SMP NR_CPUS=1024 [ 501.821307] NUMA [ 501.821376] PowerNV [ 501.821470] Modules linked in: vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel kvm_hv nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm tg3 ptp pps_core [ 501.822477] CPU: 3 PID: 3934 Comm: live_migration Not tainted 4.11.0-4.git8caa70f.el7.centos.ppc64le #1 [ 501.822633] task: c0000003f9e3ae80 task.stack: c0000003f9ed4000 [ 501.822745] NIP: c008000004c7c6f8 LR: c008000004c7c628 CTR: 0000000030058018 [ 501.822877] REGS: c0000003f9ed7980 TRAP: 0300 Not tainted (4.11.0-4.git8caa70f.el7.centos.ppc64le) [ 501.823030] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> [ 501.823047] CR: 28022244 XER: 00000000 [ 501.823203] CFAR: c008000004c7c77c DAR: 0000000000000084 DSISR: 40000000 SOFTE: 1 [ 501.823203] GPR00: c008000004c7c628 c0000003f9ed7c00 c008000004c91450 00000000000000ff [ 501.823203] GPR04: c0000003f5580000 c0000003f559bf98 9000000000009033 0000000000000000 [ 501.823203] GPR08: 0000000000000084 0000000000000000 00000000000001e0 9000000000001003 [ 501.823203] GPR12: c00000000008a7d0 c00000000fdc1b00 000000000a9a0000 0000000000000000 [ 501.823203] GPR16: 00000000402954e8 000000000a9a0000 0000000000000004 0000000000000000 [ 501.823203] GPR20: 0000000000000008 c000000002e8f180 c000000002e8f1e0 0000000000000001 [ 501.823203] GPR24: 0000000000000008 c0000003f5580008 c0000003f4564018 c000000002e8f1e8 [ 501.823203] GPR28: 00003ff6e58bdc28 c0000003f4564000 0000000000000000 0000000000000000 [ 501.825441] NIP [c008000004c7c6f8] xive_get_attr+0x3b8/0x5b0 [kvm] [ 501.825671] LR [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm] [ 501.825887] Call Trace: [ 501.825991] [c0000003f9ed7c00] [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm] (unreliable) [ 501.826312] [c0000003f9ed7cd0] [c008000004c62ec4] kvm_device_ioctl_attr+0x64/0xa0 [kvm] [ 501.826581] [c0000003f9ed7d20] [c008000004c62fcc] kvm_device_ioctl+0xcc/0xf0 [kvm] [ 501.826843] [c0000003f9ed7d40] [c000000000350c70] do_vfs_ioctl+0xd0/0x8c0 [ 501.827060] [c0000003f9ed7de0] [c000000000351534] SyS_ioctl+0xd4/0xf0 [ 501.827282] [c0000003f9ed7e30] [c00000000000b8e0] system_call+0x38/0xfc [ 501.827496] Instruction dump: [ 501.827632] 419e0078 3b760008 e9160008 83fb000c 83db0010 80fb0008 2f280000 60000000 [ 501.827901] 60000000 60420000 419a0050 7be91764 <7d284c2c> 552a0ffe 7f8af040 419e003c [ 501.828176] ---[ end trace 2d0529a5bbbbafed ]--- Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hu Huajun authored
commit 02d50cda upstream. When reading the cntpct_el0 in guest with VHE (Virtual Host Extension) enabled in host, the "Unsupported guest sys_reg access" error reported. The reason is cnthctl_el2.EL1PCTEN is not enabled, which is expected to be done in kvm_timer_init_vhe(). The problem is kvm_timer_init_vhe is called by cpu_init_hyp_mode, and which is called when VHE is disabled. This patch remove the incorrect call to kvm_timer_init_vhe() from cpu_init_hyp_mode(), and calls kvm_timer_init_vhe() to enable cnthctl_el2.EL1PCTEN in cpu_hyp_reinit(). Fixes: 488f94d7 ("KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems") Signed-off-by: Hu Huajun <huhuajun@huawei.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Deucher authored
commit 6653ebd4 upstream. This was missing for gfx6. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Srinivas Dasari authored
commit 0a27844c upstream. nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from userspace with NL80211_NAN_FUNC_SERVICE_ID. Fixes: a442b761 ("cfg80211: add add_nan_func / del_nan_func") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Srinivas Dasari authored
commit 9361df14 upstream. nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, the wireless drivers may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum WLAN_PMKID_LEN bytes are received from userspace with NL80211_ATTR_PMKID. Fixes: 67fbb16b ("nl80211: PMKSA caching support") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Srinivas Dasari authored
commit d7f13f74 upstream. validate_scan_freqs() retrieves frequencies from attributes nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with nla_get_u32(), which reads 4 bytes from each attribute without validating the size of data received. Attributes nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy. Validate size of each attribute before parsing to avoid potential buffer overread. Fixes: 2a519311 ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Srinivas Dasari authored
commit 8feb69c7 upstream. Buffer overread may happen as nl80211_set_station() reads 4 bytes from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without validating the size of data received when userspace sends less than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE. Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid the buffer overread. Fixes: 3b1c5a53 ("{cfg,nl}80211: mesh power mode primitives and userspace access") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Kiper authored
commit 457ea3f7 upstream. Otherwise e.g. Xen dom0 on x86_64 EFI platforms crashes. In theory we can check EFI_PARAVIRT too, however, EFI_MEMMAP looks more targeted and covers more cases. Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andrew.cooper3@citrix.com Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: linux-efi@vger.kernel.org Cc: matt@codeblueprint.co.uk Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1498128697-12943-2-git-send-email-daniel.kiper@oracle.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter S. Housel authored
commit 5ea59db8 upstream. An earlier change to this function (3bdae810) fixed a leak in the case of an unsuccessful call to brcmf_sdiod_buffrw(). However, the glom_skb buffer, used for emulating a scattering read, is never used or referenced after its contents are copied into the destination buffers, and therefore always needs to be freed by the end of the function. Fixes: 3bdae810 ("brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain") Fixes: a413e39a ("brcmfmac: fix brcmf_sdcard_recv_chain() for host without sg support") Signed-off-by: Peter S. Housel <housel@acm.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe Jaillet authored
commit 57c00f2f upstream. If 'wiphy_new()' fails, we leak 'ops'. Add a new label in the error handling path to free it in such a case. Fixes: 5c22fb85 ("brcmfmac: add wowl gtk rekeying offload support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nitin Gupta authored
[ Upstream commit dbd2667a ] The function assumes that each PMD points to head of a huge page. This is not correct as a PMD can point to start of any 8M region with a, say 256M, hugepage. The fix ensures that it points to the correct head of any PMD huge page. Cc: Julian Calaby <julian.calaby@gmail.com> Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-