- 22 Jun, 2020 29 commits
-
-
Xiaochun Lee authored
commit 1574051e upstream. The Intel C620 Platform Controller Hub has MROM functions that have non-PCI registers (undocumented in the public spec) where BAR 0 is supposed to be, which results in messages like this: pci 0000:00:11.0: [Firmware Bug]: reg 0x30: invalid BAR (can't size) Mark these MROM functions as having non-compliant BARs so we don't try to probe any of them. There are no other BARs on these devices. See the Intel C620 Series Chipset Platform Controller Hub Datasheet, May 2019, Document Number 336067-007US, sec 2.1, 35.5, 35.6. [bhelgaas: commit log, add 0xa26d] Link: https://lore.kernel.org/r/1589513467-17070-1-git-send-email-lixiaochun.2888@163.comSigned-off-by: Xiaochun Lee <lixc17@lenovo.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bob Haarman authored
commit d8ad6d39 upstream. 'jiffies' and 'jiffies_64' are meant to alias (two different symbols that share the same address). Most architectures make the symbols alias to the same address via a linker script assignment in their arch/<arch>/kernel/vmlinux.lds.S: jiffies = jiffies_64; which is effectively a definition of jiffies. jiffies and jiffies_64 are both forward declared for all architectures in include/linux/jiffies.h. jiffies_64 is defined in kernel/time/timer.c. x86_64 was peculiar in that it wasn't doing the above linker script assignment, but rather was: 1. defining jiffies in arch/x86/kernel/time.c instead via the linker script. 2. overriding the symbol jiffies_64 from kernel/time/timer.c in arch/x86/kernel/vmlinux.lds.s via 'jiffies_64 = jiffies;'. As Fangrui notes: In LLD, symbol assignments in linker scripts override definitions in object files. GNU ld appears to have the same behavior. It would probably make sense for LLD to error "duplicate symbol" but GNU ld is unlikely to adopt for compatibility reasons. This results in an ODR violation (UB), which seems to have survived thus far. Where it becomes harmful is when; 1. -fno-semantic-interposition is used: As Fangrui notes: Clang after LLVM commit 5b22bcc2b70d ("[X86][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local") defaults to -fno-semantic-interposition similar semantics which help -fpic/-fPIC code avoid GOT/PLT when the referenced symbol is defined within the same translation unit. Unlike GCC -fno-semantic-interposition, Clang emits such relocations referencing local symbols for non-pic code as well. This causes references to jiffies to refer to '.Ljiffies$local' when jiffies is defined in the same translation unit. Likewise, references to jiffies_64 become references to '.Ljiffies_64$local' in translation units that define jiffies_64. Because these differ from the names used in the linker script, they will not be rewritten to alias one another. 2. Full LTO Full LTO effectively treats all source files as one translation unit, causing these local references to be produced everywhere. When the linker processes the linker script, there are no longer any references to jiffies_64' anywhere to replace with 'jiffies'. And thus '.Ljiffies$local' and '.Ljiffies_64$local' no longer alias at all. In the process of porting patches enabling Full LTO from arm64 to x86_64, spooky bugs have been observed where the kernel appeared to boot, but init doesn't get scheduled. Avoid the ODR violation by matching other architectures and define jiffies only by linker script. For -fno-semantic-interposition + Full LTO, there is no longer a global definition of jiffies for the compiler to produce a local symbol which the linker script won't ensure aliases to jiffies_64. Fixes: 40747ffa ("asmlinkage: Make jiffies visible") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Alistair Delva <adelva@google.com> Debugged-by: Nick Desaulniers <ndesaulniers@google.com> Debugged-by: Sami Tolvanen <samitolvanen@google.com> Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Bob Haarman <inglorion@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # build+boot on Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/852 Link: https://lkml.kernel.org/r/20200602193100.229287-1-inglorion@google.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qu Wenruo authored
[ Upstream commit f556faa4 ] Although we have tree level check at tree read runtime, it's completely based on its parent level. We still need to do accurate level check to avoid invalid tree blocks sneak into kernel space. The check itself is simple, for leaf its level should always be 0. For nodes its level should be in range [1, BTRFS_MAX_LEVEL - 1]. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Miklos Szeredi authored
commit 530f32fc upstream. Avi Kivity reports that on fuse filesystems running in a user namespace asyncronous fsync fails with EOVERFLOW. The reason is that f_ops->fsync() is called with the creds of the kthread performing aio work instead of the creds of the process originally submitting IOCB_CMD_FSYNC. Fuse sends the creds of the caller in the request header and it needs to translate the uid and gid into the server's user namespace. Since the kthread is running in init_user_ns, the translation will fail and the operation returns an error. It can be argued that fsync doesn't actually need any creds, but just zeroing out those fields in the header (as with requests that currently don't take creds) is a backward compatibility risk. Instead of working around this issue in fuse, solve the core of the problem by calling the filesystem with the proper creds. Reported-by: Avi Kivity <avi@scylladb.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Fixes: c9582eb0 ("fuse: Fail all requests with invalid uids or gids") Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Waiman Long authored
[ Upstream commit d4eaa283 ] For kvmalloc'ed data object that contains sensitive information like cryptographic keys, we need to make sure that the buffer is always cleared before freeing it. Using memset() alone for buffer clearing may not provide certainty as the compiler may compile it away. To be sure, the special memzero_explicit() has to be used. This patch introduces a new kvfree_sensitive() for freeing those sensitive data objects allocated by kvmalloc(). The relevant places where kvfree_sensitive() can be used are modified to use it. Fixes: 4f088249 ("KEYS: Avoid false positive ENOMEM error on key read") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Joe Perches <joe@perches.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Rientjes <rientjes@google.com> Cc: Uladzislau Rezki <urezki@gmail.com> Link: http://lkml.kernel.org/r/20200407200318.11711-1-longman@redhat.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Masami Hiramatsu authored
[ Upstream commit c6aab66a ] Since the commit 6a13a0d7 ("ftrace/kprobe: Show the maxactive number on kprobe_events") introduced to show the instance number of kretprobe events, the length of the 1st format of the kprobe event will not 1, but it can be longer. This caused a parser error in perf-probe. Skip the length check the 1st format of the kprobe event to accept this instance number. Without this fix: # perf probe -a vfs_read%return Added new event: probe:vfs_read__return (on vfs_read%return) You can now use it in all perf tools, such as: perf record -e probe:vfs_read__return -aR sleep 1 # perf probe -l Semantic error :Failed to parse event name: r16:probe/vfs_read__return Error: Failed to show event list. And with this fixes: # perf probe -a vfs_read%return ... # perf probe -l probe:vfs_read__return (on vfs_read%return) Fixes: 6a13a0d7 ("ftrace/kprobe: Show the maxactive number on kprobe_events") Reported-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Yuxuan Shui <yshuiv7@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207587 Link: http://lore.kernel.org/lkml/158877535215.26469.1113127926699134067.stgit@devnote2Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Kim Phillips authored
[ Upstream commit e2abfc04 ] Commit 21b5ee59 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF") mistakenly added erratum #1054 as an OS Visible Workaround (OSVW) ID 0. Erratum #1054 is not OSVW ID 0 [1], so make it a legacy erratum. There would never have been a false positive on older hardware that has OSVW bit 0 set, since the IRPERF feature was not available. However, save a couple of RDMSR executions per thread, on modern system configurations that correctly set non-zero values in their OSVW_ID_Length MSRs. [1] Revision Guide for AMD Family 17h Models 00h-0Fh Processors. The revision guide is available from the bugzilla link below. Fixes: 21b5ee59 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200417143356.26054-1-kim.phillips@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Jason Gunthorpe authored
[ Upstream commit eb356e6d ] If is_closed is set, and the event list is empty, then read() will return -EIO without blocking. After setting is_closed in ib_uverbs_free_event_queue(), we do trigger a wake_up on the poll_wait, but the fops->poll() function does not check it, so poll will continue to sleep on an empty list. Fixes: 14e23bd6 ("RDMA/core: Fix locking in ib_uverbs_event_read") Link: https://lore.kernel.org/r/0-v1-ace813388969+48859-uverbs_poll_fix%25jgg@mellanox.comReviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Masashi Honma authored
[ Upstream commit 450edd28 ] Some devices like TP-Link TL-WN722N produces this kind of messages frequently. kernel: ath: phy0: Short RX data len, dropping (dlen: 4) This warning is useful for developers to recognize that the device (Wi-Fi dongle or USB hub etc) is noisy but not for general users. So this patch make this warning to debug message. Reported-By: Denis <pro.denis@protonmail.com> Ref: https://bugzilla.kernel.org/show_bug.cgi?id=207539 Fixes: cd486e62 ("ath9k_htc: Discard undersized packets") Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200504214443.4485-1-masashi.honma@gmail.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Cédric Le Goater authored
[ Upstream commit a101950f ] Commit 1ca3dec2 ("powerpc/xive: Prevent page fault issues in the machine crash handler") fixed an issue in the FW assisted dump of machines using hash MMU and the XIVE interrupt mode under the POWER hypervisor. It forced the mapping of the ESB page of interrupts being mapped in the Linux IRQ number space to make sure the 'crash kexec' sequence worked during such an event. But it didn't handle the un-mapping. This mapping is now blocking the removal of a passthrough IO adapter under the POWER hypervisor because it expects the guest OS to have cleared all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations". Remove these mapping in the IRQ data cleanup routine. Under KVM, this cleanup is not required because the ESB pages for the adapter interrupts are un-mapped from the guest by the hypervisor in the KVM XIVE native device. This is now redundant but it's harmless. Fixes: 1ca3dec2 ("powerpc/xive: Prevent page fault issues in the machine crash handler") Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200429075122.1216388-2-clg@kaod.orgSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Thomas Falcon authored
[ Upstream commit 78468899 ] VNIC protocol version is reported in big-endian format, but it is not byteswapped before logging. Fix that, and remove version comparison as only one protocol version exists at this time. Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Dennis Kadioglu authored
[ Upstream commit 642aa86e ] The Lenovo Thinkpad T470s I own has a different touchpad with "LEN007a" instead of the already included PNP ID "LEN006c". However, my touchpad seems to work well without any problems using RMI. So this patch adds the other PNP ID. Signed-off-by: Dennis Kadioglu <denk@eclipso.email> Link: https://lore.kernel.org/r/ff770543cd53ae818363c0fe86477965@mail.eclipso.deSigned-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Jens Axboe authored
[ Upstream commit 18f855e5 ] Stefano reported a crash with using SQPOLL with io_uring: BUG: kernel NULL pointer dereference, address: 00000000000003b0 CPU: 2 PID: 1307 Comm: io_uring-sq Not tainted 5.7.0-rc7 #11 RIP: 0010:task_numa_work+0x4f/0x2c0 Call Trace: task_work_run+0x68/0xa0 io_sq_thread+0x252/0x3d0 kthread+0xf9/0x130 ret_from_fork+0x35/0x40 which is task_numa_work() oopsing on current->mm being NULL. The task work is queued by task_tick_numa(), which checks if current->mm is NULL at the time of the call. But this state isn't necessarily persistent, if the kthread is using use_mm() to temporarily adopt the mm of a task. Change the task_tick_numa() check to exclude kernel threads in general, as it doesn't make sense to attempt ot balance for kthreads anyway. Reported-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/865de121-8190-5d30-ece5-3b097dc74431@kernel.dkSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Fredrik Strupe authored
[ Upstream commit 3866f217 ] call_undef_hook() in traps.c applies the same instr_mask for both 16-bit and 32-bit thumb instructions. If instr_mask then is only 16 bits wide (0xffff as opposed to 0xffffffff), the first half-word of 32-bit thumb instructions will be masked out. This makes the function match 32-bit thumb instructions where the second half-word is equal to instr_val, regardless of the first half-word. The result in this case is that all undefined 32-bit thumb instructions with the second half-word equal to 0xde01 (udf #1) work as breakpoints and will raise a SIGTRAP instead of a SIGILL, instead of just the one intended 16-bit instruction. An example of such an instruction is 0xeaa0de01, which is unallocated according to Arm ARM and should raise a SIGILL, but instead raises a SIGTRAP. This patch fixes the issue by setting all the bits in instr_mask, which will still match the intended 16-bit thumb instruction (where the upper half is always 0), but not any 32-bit thumb instructions. Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Fredrik Strupe <fredrik@strupe.net> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Stephan Gerhold authored
[ Upstream commit 3f8f7705 ] MMS345L is another first generation touch screen from Melfas, which uses the same registers as MMS152. However, using I2C_M_NOSTART for it causes errors when reading: i2c i2c-0: sendbytes: NAK bailout. mms114 0-0048: __mms114_read_reg: i2c transfer failed (-5) The driver works fine as soon as I2C_M_NOSTART is removed. Reviewed-by: Andi Shyti <andi@etezian.org> Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Link: https://lore.kernel.org/r/20200405170904.61512-1-stephan@gerhold.net [dtor: removed separate mms345l handling, made everyone use standard transfer mode, propagated the 10bit addressing flag to the read part of the transfer as well.] Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Su Kang Yin authored
commit e1de42fd ("crypto: talitos - fix ECB algs ivsize") wrongly modified CBC algs ivsize instead of ECB aggs ivsize. This restore the CBC algs original ivsize of removes ECB's ones. Fixes: e1de42fd ("crypto: talitos - fix ECB algs ivsize") Signed-off-by: Su Kang Yin <cantona@cantona.net> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qu Wenruo authored
commit 62fdaa52 upstream. [BUG] With crafted image, btrfs will panic at btree operations: kernel BUG at fs/btrfs/ctree.c:3894! invalid opcode: 0000 [#1] SMP PTI CPU: 0 PID: 1138 Comm: btrfs-transacti Not tainted 5.0.0-rc8+ #9 RIP: 0010:__push_leaf_left+0x6b6/0x6e0 RSP: 0018:ffffc0bd4128b990 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffa0a4ab8f0e38 RCX: 0000000000000000 RDX: ffffa0a280000000 RSI: 0000000000000000 RDI: ffffa0a4b3814000 RBP: ffffc0bd4128ba38 R08: 0000000000001000 R09: ffffc0bd4128b948 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000240 R13: ffffa0a4b556fb60 R14: ffffa0a4ab8f0af0 R15: ffffa0a4ab8f0af0 FS: 0000000000000000(0000) GS:ffffa0a4b7a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2461c80020 CR3: 000000022b32a006 CR4: 00000000000206f0 Call Trace: ? _cond_resched+0x1a/0x50 push_leaf_left+0x179/0x190 btrfs_del_items+0x316/0x470 btrfs_del_csums+0x215/0x3a0 __btrfs_free_extent.isra.72+0x5a7/0xbe0 __btrfs_run_delayed_refs+0x539/0x1120 btrfs_run_delayed_refs+0xdb/0x1b0 btrfs_commit_transaction+0x52/0x950 ? start_transaction+0x94/0x450 transaction_kthread+0x163/0x190 kthread+0x105/0x140 ? btrfs_cleanup_transaction+0x560/0x560 ? kthread_destroy_worker+0x50/0x50 ret_from_fork+0x35/0x40 Modules linked in: ---[ end trace c2425e6e89b5558f ]--- [CAUSE] The offending csum tree looks like this: checksum tree key (CSUM_TREE ROOT_ITEM 0) node 29741056 level 1 items 14 free 107 generation 19 owner CSUM_TREE ... key (EXTENT_CSUM EXTENT_CSUM 85975040) block 29630464 gen 17 key (EXTENT_CSUM EXTENT_CSUM 89911296) block 29642752 gen 17 <<< key (EXTENT_CSUM EXTENT_CSUM 92274688) block 29646848 gen 17 ... leaf 29630464 items 6 free space 1 generation 17 owner CSUM_TREE item 0 key (EXTENT_CSUM EXTENT_CSUM 85975040) itemoff 3987 itemsize 8 range start 85975040 end 85983232 length 8192 ... leaf 29642752 items 0 free space 3995 generation 17 owner 0 ^ empty leaf invalid owner ^ leaf 29646848 items 1 free space 602 generation 17 owner CSUM_TREE item 0 key (EXTENT_CSUM EXTENT_CSUM 92274688) itemoff 627 itemsize 3368 range start 92274688 end 95723520 length 3448832 So we have a corrupted csum tree where one tree leaf is completely empty, causing unbalanced btree, thus leading to unexpected btree balance error. [FIX] For this particular case, we handle it in two directions to catch it: - Check if the tree block is empty through btrfs_verify_level_key() So that invalid tree blocks won't be read out through btrfs_search_slot() and its variants. - Check 0 tree owner in tree checker NO tree is using 0 as its tree owner, detect it and reject at tree block read time. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202821Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Vikash Bansal <bvikas@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anand Jain authored
commit 09ba3bc9 upstream. Both btrfs_find_device() and find_device() does the same thing except that the latter does not take the seed device onto account in the device scanning context. We can merge them. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [4.19.y backport notes: Vikash : - To apply this patch, a portion of commit e4319cd9 was used to change the first argument of function "btrfs_find_device" from "struct btrfs_fs_info" to "struct btrfs_fs_devices". Signed-off-by: Vikash Bansal <bvikas@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe Leroy authored
commit ab10ae1c upstream. The range passed to user_access_begin() by strncpy_from_user() and strnlen_user() starts at 'src' and goes up to the limit of userspace although reads will be limited by the 'count' param. On 32 bits powerpc (book3s/32) access has to be granted for each 256Mbytes segment and the cost increases with the number of segments to unlock. Limit the range with 'count' param. Fixes: 594cc251 ("make 'user_access_begin()' do 'access_ok()'") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Will Deacon authored
commit 6e693b3f upstream. Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'") makes the access_ok() check part of the user_access_begin() preceding a series of 'unsafe' accesses. This has the desirable effect of ensuring that all 'unsafe' accesses have been range-checked, without having to pick through all of the callsites to verify whether the appropriate checking has been made. However, the consolidated range check does not inhibit speculation, so it is still up to the caller to ensure that they are not susceptible to any speculative side-channel attacks for user addresses that ultimately fail the access_ok() check. This is an oversight, so use __uaccess_begin_nospec() to ensure that speculation is inhibited until the access_ok() check has passed. Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stafford Horne authored
commit 9cb2feb4 upstream. The commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'") exposed incorrect implementations of access_ok() macro in several architectures. This change fixes 2 issues found in OpenRISC. OpenRISC was not properly using parenthesis for arguments and also using arguments twice. This patch fixes those 2 issues. I test booted this patch with v5.0-rc1 on qemu and it's working fine. Cc: Guenter Roeck <linux@roeck-us.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit 94bd8a05 upstream. Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'") broke both alpha and SH booting in qemu, as noticed by Guenter Roeck. It turns out that the bug wasn't actually in that commit itself (which would have been surprising: it was mostly a no-op), but in how the addition of access_ok() to the strncpy_from_user() and strnlen_user() functions now triggered the case where those functions would test the access of the very last byte of the user address space. The string functions actually did that user range test before too, but they did it manually by just comparing against user_addr_max(). But with user_access_begin() doing the check (using "access_ok()"), it now exposed problems in the architecture implementations of that function. For example, on alpha, the access_ok() helper macro looked like this: #define __access_ok(addr, size) \ ((get_fs().seg & (addr | size | (addr+size))) == 0) and what it basically tests is of any of the high bits get set (the USER_DS masking value is 0xfffffc0000000000). And that's completely wrong for the "addr+size" check. Because it's off-by-one for the case where we check to the very end of the user address space, which is exactly what the strn*_user() functions do. Why? Because "addr+size" will be exactly the size of the address space, so trying to access the last byte of the user address space will fail the __access_ok() check, even though it shouldn't. As a result, the user string accessor functions failed consistently - because they literally don't know how long the string is going to be, and the max access is going to be that last byte of the user address space. Side note: that alpha macro is buggy for another reason too - it re-uses the arguments twice. And SH has another version of almost the exact same bug: #define __addr_ok(addr) \ ((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg) so far so good: yes, a user address must be below the limit. But then: #define __access_ok(addr, size) \ (__addr_ok((addr) + (size))) is wrong with the exact same off-by-one case: the case when "addr+size" is exactly _equal_ to the limit is actually perfectly fine (think "one byte access at the last address of the user address space") The SH version is actually seriously buggy in another way: it doesn't actually check for overflow, even though it did copy the _comment_ that talks about overflow. So it turns out that both SH and alpha actually have completely buggy implementations of access_ok(), but they happened to work in practice (although the SH overflow one is a serious serious security bug, not that anybody likely cares about SH security). This fixes the problems by using a similar macro on both alpha and SH. It isn't trying to be clever, the end address is based on this logic: unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b; which basically says "add start and length, and then subtract one unless the length was zero". We can't subtract one for a zero length, or we'd just hit an underflow instead. For a lot of access_ok() users the length is a constant, so this isn't actually as expensive as it initially looks. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
commit 594cc251 upstream. Originally, the rule used to be that you'd have to do access_ok() separately, and then user_access_begin() before actually doing the direct (optimized) user access. But experience has shown that people then decide not to do access_ok() at all, and instead rely on it being implied by other operations or similar. Which makes it very hard to verify that the access has actually been range-checked. If you use the unsafe direct user accesses, hardware features (either SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged Access Never - on ARM) do force you to use user_access_begin(). But nothing really forces the range check. By putting the range check into user_access_begin(), we actually force people to do the right thing (tm), and the range check vill be visible near the actual accesses. We have way too long a history of people trying to avoid them. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Miles Chen <miles.chen@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lorenz Bauer authored
commit 634efb75 ("selftests: bpf: Reset global state between reuseport test runs") uses a macro RET_IF which doesn't exist in the v4.19 tree. It is defined as follows: #define RET_IF(condition, tag, format...) ({ if (CHECK_FAIL(condition)) { printf(tag " " format); return; } }) CHECK_FAIL in turn is defined as: #define CHECK_FAIL(condition) ({ int __ret = !!(condition); int __save_errno = errno; if (__ret) { test__fail(); fprintf(stdout, "%s:FAIL:%d\n", __func__, __LINE__); } errno = __save_errno; __ret; }) Replace occurences of RET_IF with CHECK. This will abort the test binary if clearing the intermediate state fails. Fixes: 634efb75 ("selftests: bpf: Reset global state between reuseport test runs") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Willem de Bruijn authored
[ Upstream commit 96aa1b22 ] Tun in IFF_NAPI_FRAGS mode calls napi_gro_frags. Unlike netif_rx and netif_gro_receive, this expects skb->data to point to the mac layer. But skb_probe_transport_header, __skb_get_hash_symmetric, and xdp_do_generic in tun_get_user need skb->data to point to the network header. Flow dissection also needs skb->protocol set, so eth_type_trans has to be called. Ensure the link layer header lies in linear as eth_type_trans pulls ETH_HLEN. Then take the same code paths for frags as for not frags. Push the link layer header back just before calling napi_gro_frags. By pulling up to ETH_HLEN from frag0 into linear, this disables the frag0 optimization in the special case when IFF_NAPI_FRAGS is used with zero length iov[0] (and thus empty skb->linear). Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Petar Penkov <ppenkov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ido Schimmel authored
[ Upstream commit 8066e6b4 ] When proxy mode is enabled the vxlan device might reply to Neighbor Solicitation (NS) messages on behalf of remote hosts. In case the NS message includes the "Source link-layer address" option [1], the vxlan device will use the specified address as the link-layer destination address in its reply. To avoid an infinite loop, break out of the options parsing loop when encountering an option with length zero and disregard the NS message. This is consistent with the IPv6 ndisc code and RFC 4886 which states that "Nodes MUST silently discard an ND packet that contains an option with length zero" [2]. [1] https://tools.ietf.org/html/rfc4861#section-4.3 [2] https://tools.ietf.org/html/rfc4861#section-4.6 Fixes: 4b29dba9 ("vxlan: fix nonfunctional neigh_reduce()") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ido Schimmel authored
[ Upstream commit 53fc6852 ] When neighbor suppression is enabled the bridge device might reply to Neighbor Solicitation (NS) messages on behalf of remote hosts. In case the NS message includes the "Source link-layer address" option [1], the bridge device will use the specified address as the link-layer destination address in its reply. To avoid an infinite loop, break out of the options parsing loop when encountering an option with length zero and disregard the NS message. This is consistent with the IPv6 ndisc code and RFC 4886 which states that "Nodes MUST silently discard an ND packet that contains an option with length zero" [2]. [1] https://tools.ietf.org/html/rfc4861#section-4.3 [2] https://tools.ietf.org/html/rfc4861#section-4.6 Fixes: ed842fae ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alla Segal <allas@mellanox.com> Tested-by: Alla Segal <allas@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
[ Upstream commit e8224bfe ] found by smatch: drivers/net/net_failover.c:65 net_failover_open() error: we previously assumed 'primary_dev' could be null (see line 43) Fixes: cfc80d9a ("net: Introduce net_failover driver") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hangbin Liu authored
[ Upstream commit 79a1f0cc ] Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present. Previously the checking logic looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol != IPPROTO_TCP) break; After commit b6f61189 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP was blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; break; else break; Then after commit 82c9ae44 ("ipv6: fix restrict IPV6_ADDRFORM operation") UDP/UDPLITE were blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; if (sk->sk_protocol == IPPROTO_TCP) do_some_check; if (sk->sk_protocol != IPPROTO_TCP) break; Fix it by using Eric's code and simply remove the break in TCP check, which looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; else break; Fixes: 82c9ae44 ("ipv6: fix restrict IPV6_ADDRFORM operation") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 Jun, 2020 11 commits
-
-
Greg Kroah-Hartman authored
-
Greg Kroah-Hartman authored
This reverts commit 95fde2e4 which is commit 9ca41539 upstream. It was backported incorrectly, Paul writes at: https://lore.kernel.org/r/20200607203425.GD23662@windriver.com I happened to notice this commit: 9ca41539 - "net/mlx5: Annotate mutex destroy for root ns" ...was backported to 4.19 and 5.4 and v5.6 in linux-stable. It patches del_sw_root_ns() - which only exists after v5.7-rc7 from: 6eb7a268 - "net/mlx5: Don't maintain a case of del_sw_func being null" which creates the one line del_sw_root_ns stub function around kfree(node) by breaking it out of tree_put_node(). In the absense of del_sw_root_ns - the backport finds an identical one line kfree stub fcn - named del_sw_prio from this earlier commit: 139ed6c6 - "net/mlx5: Fix steering memory leak" [in v4.15-rc5] and then puts the mutex_destroy() into that (wrong) function, instead of putting it into tree_put_node where the root ns case used to be hand Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Roi Dayan <roid@mellanox.com> Cc: Mark Bloch <markb@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oleg Nesterov authored
commit 013b2deb upstream. uprobe_write_opcode() must not cross page boundary; prepare_uprobe() relies on arch_uprobe_analyze_insn() which should validate "vaddr" but some architectures (csky, s390, and sparc) don't do this. We can remove the BUG_ON() check in prepare_uprobe() and validate the offset early in __uprobe_register(). The new IS_ALIGNED() check matches the alignment check in arch_prepare_kprobe() on supported architectures, so I think that all insns must be aligned to UPROBE_SWBP_INSN_SIZE. Another problem is __update_ref_ctr() which was wrong from the very beginning, it can read/write outside of kmap'ed page unless "vaddr" is aligned to sizeof(short), __uprobe_register() should check this too. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ check for ref_ctr_offset removed for backport - gregkh ] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Josh Poimboeuf authored
commit 3798cc4d upstream Make the docs match the code. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mark Gross authored
commit 7222a1b5 upstream Add documentation for the SRBDS vulnerability and its mitigation. [ bp: Massage. jpoimboe: sysfs table strings. ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mark Gross authored
commit 7e5b3c26 upstream SRBDS is an MDS-like speculative side channel that can leak bits from the random number generator (RNG) across cores and threads. New microcode serializes the processor access during the execution of RDRAND and RDSEED. This ensures that the shared buffer is overwritten before it is released for reuse. While it is present on all affected CPU models, the microcode mitigation is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the cases where TSX is not supported or has been disabled with TSX_CTRL. The mitigation is activated by default on affected processors and it increases latency for RDRAND and RDSEED instructions. Among other effects this will reduce throughput from /dev/urandom. * Enable administrator to configure the mitigation off when desired using either mitigations=off or srbds=off. * Export vulnerability status via sysfs * Rename file-scoped macros to apply for non-whitelist table initializations. [ bp: Massage, - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g, - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in, - flip check in cpu_set_bug_bits() to save an indentation level, - reflow comments. jpoimboe: s/Mitigated/Mitigation/ in user-visible strings tglx: Dropped the fused off magic for now ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mark Gross authored
commit 93920f61 upstream To make cpu_matches() reusable for other matching tables, have it take a pointer to a x86_cpu_id table as an argument. [ bp: Flip arguments order. ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mark Gross authored
commit e9d71445 upstream Intel uses the same family/model for several CPUs. Sometimes the stepping must be checked to tell them apart. On x86 there can be at most 16 steppings. Add a steppings bitmask to x86_cpu_id and a X86_MATCH_VENDOR_FAMILY_MODEL_STEPPING_FEATURE macro and support for matching against family/model/stepping. [ bp: Massage. tglx: Lightweight variant for backporting ] Signed-off-by: Mark Gross <mgross@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Srinivas Kandagatla authored
commit 8d9eb0d6 upstream. qfprom has different address spaces for read and write. Reads are always done from corrected address space, where as writes are done on raw address space. Writing to corrected address space is invalid and ignored, so it does not make sense to have this support in the driver which only supports corrected address space regions at the moment. Fixes: 4ab11996 ("nvmem: qfprom: Add Qualcomm QFPROM support.") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200522113341.7728-1-srinivas.kandagatla@linaro.orgSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oliver Neukum authored
commit 97fe8099 upstream. If buffers are iterated over in the error case, the lower limits for quirky devices must be heeded. Signed-off-by: Oliver Neukum <oneukum@suse.com> Reported-by: Jean Rene Dawin <jdawin@math.uni-bielefeld.de> Fixes: a4e7279c ("cdc-acm: introduce a cool down") Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200526124420.22160-1-oneukum@suse.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pascal Terjan authored
commit 15ea976a upstream. The value in shared headers was fixed 9 years ago in commit 8d661f1e ("ieee80211: correct IEEE80211_ADDBA_PARAM_BUF_SIZE_MASK macro") and while looking at using shared headers for other duplicated constants I noticed this driver uses the old value. The macros are also defined twice in this file so I am deleting the second definition. Signed-off-by: Pascal Terjan <pterjan@google.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200523211247.23262-1-pterjan@google.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-