- 30 Nov, 2015 40 commits
-
-
Filipe Manana authored
commit f1cd1f0b upstream. When listing a inode's xattrs we have a time window where we race against a concurrent operation for adding a new hard link for our inode that makes us not return any xattr to user space. In order for this to happen, the first xattr of our inode needs to be at slot 0 of a leaf and the previous leaf must still have room for an inode ref (or extref) item, and this can happen because an inode's listxattrs callback does not lock the inode's i_mutex (nor does the VFS does it for us), but adding a hard link to an inode makes the VFS lock the inode's i_mutex before calling the inode's link callback. If we have the following leafs: Leaf X (has N items) Leaf Y [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ] [ (257 XATTR_ITEM 12345), ... ] slot N - 2 slot N - 1 slot 0 The race illustrated by the following sequence diagram is possible: CPU 1 CPU 2 btrfs_listxattr() searches for key (257 XATTR_ITEM 0) gets path with path->nodes[0] == leaf X and path->slots[0] == N because path->slots[0] is >= btrfs_header_nritems(leaf X), it calls btrfs_next_leaf() btrfs_next_leaf() releases the path adds key (257 INODE_REF 666) to the end of leaf X (slot N), and leaf X now has N + 1 items searches for the key (257 INODE_REF 256), with path->keep_locks == 1, because that is the last key it saw in leaf X before releasing the path ends up at leaf X again and it verifies that the key (257 INODE_REF 256) is no longer the last key in leaf X, so it returns with path->nodes[0] == leaf X and path->slots[0] == N, pointing to the new item with key (257 INODE_REF 666) btrfs_listxattr's loop iteration sees that the type of the key pointed by the path is different from the type BTRFS_XATTR_ITEM_KEY and so it breaks the loop and stops looking for more xattr items --> the application doesn't get any xattr listed for our inode So fix this by breaking the loop only if the key's type is greater than BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Peter Oberparleiter authored
commit 863e02d0 upstream. Writing a number to /sys/bus/scsi/devices/<sdev>/queue_ramp_up_period returns the value of that number instead of the number of bytes written. This behavior can confuse programs expecting POSIX write() semantics. Fix this by returning the number of bytes written instead. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Peter Zijlstra authored
commit b71b437e upstream. Arnaldo reported that tracepoint filters seem to misbehave (ie. not apply) on inherited events. The fix is obvious; filters are only set on the actual (parent) event, use the normal pattern of using this parent event for filters. This is safe because each child event has a reference to it. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20151102095051.GN17308@twins.programming.kicks-ass.netSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Jurgen Kramer authored
commit 16771c7c upstream. This patch adds native DSD support for the Aune X1S 32BIT/384 DSD DAC Signed-off-by: Jurgen Kramer <gtmkramer@xs4all.nl> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Filipe Manana authored
commit 1d512cb7 upstream. If we are using the NO_HOLES feature, we have a tiny time window when running delalloc for a nodatacow inode where we can race with a concurrent link or xattr add operation leading to a BUG_ON. This happens because at run_delalloc_nocow() we end up casting a leaf item of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a file extent item (struct btrfs_file_extent_item) and then analyse its extent type field, which won't match any of the expected extent types (values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an explicit BUG_ON(1). The following sequence diagram shows how the race happens when running a no-cow dellaloc range [4K, 8K[ for inode 257 and we have the following neighbour leafs: Leaf X (has N items) Leaf Y [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ] [ (257 EXTENT_DATA 8192), ... ] slot N - 2 slot N - 1 slot 0 (Note the implicit hole for inode 257 regarding the [0, 8K[ range) CPU 1 CPU 2 run_dealloc_nocow() btrfs_lookup_file_extent() --> searches for a key with value (257 EXTENT_DATA 4096) in the fs/subvol tree --> returns us a path with path->nodes[0] == leaf X and path->slots[0] == N because path->slots[0] is >= btrfs_header_nritems(leaf X), it calls btrfs_next_leaf() btrfs_next_leaf() --> releases the path hard link added to our inode, with key (257 INODE_REF 500) added to the end of leaf X, so leaf X now has N + 1 keys --> searches for the key (257 INODE_REF 256), because it was the last key in leaf X before it released the path, with path->keep_locks set to 1 --> ends up at leaf X again and it verifies that the key (257 INODE_REF 256) is no longer the last key in the leaf, so it returns with path->nodes[0] == leaf X and path->slots[0] == N, pointing to the new item with key (257 INODE_REF 500) the loop iteration of run_dealloc_nocow() does not break out the loop and continues because the key referenced in the path at path->nodes[0] and path->slots[0] is for inode 257, its type is < BTRFS_EXTENT_DATA_KEY and its offset (500) is less then our delalloc range's end (8192) the item pointed by the path, an inode reference item, is (incorrectly) interpreted as a file extent item and we get an invalid extent type, leading to the BUG_ON(1): if (extent_type == BTRFS_FILE_EXTENT_REG || extent_type == BTRFS_FILE_EXTENT_PREALLOC) { (...) } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) { (...) } else { BUG_ON(1) } The same can happen if a xattr is added concurrently and ends up having a key with an offset smaller then the delalloc's range end. So fix this by skipping keys with a type smaller than BTRFS_EXTENT_DATA_KEY. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Filipe Manana authored
commit aeafbf84 upstream. While running a stress test I got the following warning triggered: [191627.672810] ------------[ cut here ]------------ [191627.673949] WARNING: CPU: 8 PID: 8447 at fs/btrfs/file.c:779 __btrfs_drop_extents+0x391/0xa50 [btrfs]() (...) [191627.701485] Call Trace: [191627.702037] [<ffffffff8145f077>] dump_stack+0x4f/0x7b [191627.702992] [<ffffffff81095de5>] ? console_unlock+0x356/0x3a2 [191627.704091] [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb [191627.705380] [<ffffffffa0664499>] ? __btrfs_drop_extents+0x391/0xa50 [btrfs] [191627.706637] [<ffffffff8104b46d>] warn_slowpath_null+0x1a/0x1c [191627.707789] [<ffffffffa0664499>] __btrfs_drop_extents+0x391/0xa50 [btrfs] [191627.709155] [<ffffffff8115663c>] ? cache_alloc_debugcheck_after.isra.32+0x171/0x1d0 [191627.712444] [<ffffffff81155007>] ? kmemleak_alloc_recursive.constprop.40+0x16/0x18 [191627.714162] [<ffffffffa06570c9>] insert_reserved_file_extent.constprop.40+0x83/0x24e [btrfs] [191627.715887] [<ffffffffa065422b>] ? start_transaction+0x3bb/0x610 [btrfs] [191627.717287] [<ffffffffa065b604>] btrfs_finish_ordered_io+0x273/0x4e2 [btrfs] [191627.728865] [<ffffffffa065b888>] finish_ordered_fn+0x15/0x17 [btrfs] [191627.730045] [<ffffffffa067d688>] normal_work_helper+0x14c/0x32c [btrfs] [191627.731256] [<ffffffffa067d96a>] btrfs_endio_write_helper+0x12/0x14 [btrfs] [191627.732661] [<ffffffff81061119>] process_one_work+0x24c/0x4ae [191627.733822] [<ffffffff810615b0>] worker_thread+0x206/0x2c2 [191627.734857] [<ffffffff810613aa>] ? process_scheduled_works+0x2f/0x2f [191627.736052] [<ffffffff810613aa>] ? process_scheduled_works+0x2f/0x2f [191627.737349] [<ffffffff810669a6>] kthread+0xef/0xf7 [191627.738267] [<ffffffff810f3b3a>] ? time_hardirqs_on+0x15/0x28 [191627.739330] [<ffffffff810668b7>] ? __kthread_parkme+0xad/0xad [191627.741976] [<ffffffff81465592>] ret_from_fork+0x42/0x70 [191627.743080] [<ffffffff810668b7>] ? __kthread_parkme+0xad/0xad [191627.744206] ---[ end trace bbfddacb7aaada8d ]--- $ cat -n fs/btrfs/file.c 691 int __btrfs_drop_extents(struct btrfs_trans_handle *trans, (...) 758 btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); 759 if (key.objectid > ino || 760 key.type > BTRFS_EXTENT_DATA_KEY || key.offset >= end) 761 break; 762 763 fi = btrfs_item_ptr(leaf, path->slots[0], 764 struct btrfs_file_extent_item); 765 extent_type = btrfs_file_extent_type(leaf, fi); 766 767 if (extent_type == BTRFS_FILE_EXTENT_REG || 768 extent_type == BTRFS_FILE_EXTENT_PREALLOC) { (...) 774 } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) { (...) 778 } else { 779 WARN_ON(1); 780 extent_end = search_start; 781 } (...) This happened because the item we were processing did not match a file extent item (its key type != BTRFS_EXTENT_DATA_KEY), and even on this case we cast the item to a struct btrfs_file_extent_item pointer and then find a type field value that does not match any of the expected values (BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]). This scenario happens due to a tiny time window where a race can happen as exemplified below. For example, consider the following scenario where we're using the NO_HOLES feature and we have the following two neighbour leafs: Leaf X (has N items) Leaf Y [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ] [ (257 EXTENT_DATA 8192), ... ] slot N - 2 slot N - 1 slot 0 Our inode 257 has an implicit hole in the range [0, 8K[ (implicit rather than explicit because NO_HOLES is enabled). Now if our inode has an ordered extent for the range [4K, 8K[ that is finishing, the following can happen: CPU 1 CPU 2 btrfs_finish_ordered_io() insert_reserved_file_extent() __btrfs_drop_extents() Searches for the key (257 EXTENT_DATA 4096) through btrfs_lookup_file_extent() Key not found and we get a path where path->nodes[0] == leaf X and path->slots[0] == N Because path->slots[0] is >= btrfs_header_nritems(leaf X), we call btrfs_next_leaf() btrfs_next_leaf() releases the path inserts key (257 INODE_REF 4096) at the end of leaf X, leaf X now has N + 1 keys, and the new key is at slot N btrfs_next_leaf() searches for key (257 INODE_REF 256), with path->keep_locks set to 1, because it was the last key it saw in leaf X finds it in leaf X again and notices it's no longer the last key of the leaf, so it returns 0 with path->nodes[0] == leaf X and path->slots[0] == N (which is now < btrfs_header_nritems(leaf X)), pointing to the new key (257 INODE_REF 4096) __btrfs_drop_extents() casts the item at path->nodes[0], slot path->slots[0], to a struct btrfs_file_extent_item - it does not skip keys for the target inode with a type less than BTRFS_EXTENT_DATA_KEY (BTRFS_INODE_REF_KEY < BTRFS_EXTENT_DATA_KEY) sees a bogus value for the type field triggering the WARN_ON in the trace shown above, and sets extent_end = search_start (4096) does the if-then-else logic to fixup 0 length extent items created by a past bug from hole punching: if (extent_end == key.offset && extent_end >= search_start) goto delete_extent_item; that evaluates to true and it ends up deleting the key pointed to by path->slots[0], (257 INODE_REF 4096), from leaf X The same could happen for example for a xattr that ends up having a key with an offset value that matches search_start (very unlikely but not impossible). So fix this by ensuring that keys smaller than BTRFS_EXTENT_DATA_KEY are skipped, never casted to struct btrfs_file_extent_item and never deleted by accident. Also protect against the unexpected case of getting a key for a lower inode number by skipping that key and issuing a warning. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Helge Deller authored
commit d0cf62fb upstream. This patch fixes some bugs and partly cleans up the parisc uapi header files to what glibc defined: - compat_semid64_ds was wrong and did not take the endianess into account - ipc64_perm exported userspace types which broke building userspace packages on debian (e.g. trinity) - ipc64_perm needs to use a 32bit mode_t on 64bit kernel - msqid64_ds and semid64_ds needs unsigned longs for various struct members - shmid64_ds exported size_t instead of __kernel_size_t And finally add some compile-time checks for the sizes of those structs to avoid future breakage. Runtime-tested with the Linux Test Project (LTP) testsuite. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Borislav Petkov authored
commit 04633df0 upstream. When we get loaded by a 64-bit bootloader, kernel entry point is startup_64 in head_64.S. We don't trust any and all bootloaders because some will fiddle with CPU configuration so we go ahead and massage each CPU into sanity again. For example, some dell BIOSes have this XD disable feature which set IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround for other OSes but Linux sure doesn't need it. A similar thing is present in the Surface 3 firmware - see https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit only on the BSP: # rdmsr -a 0x1a0 400850089 850089 850089 850089 I know, right?! There's not even an off switch in there. So fix all those cases by sanitizing the 64-bit entry point too. For that, make verify_cpu() callable in 64-bit mode also. Requested-and-debugged-by: "H. Peter Anvin" <hpa@zytor.com> Reported-and-tested-by: Bastien Nocera <bugzilla@hadess.net> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1446739076-21303-1-git-send-email-bp@alien8.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Greg Thelen authored
commit 0f930902 upstream. Since 5cec38ac ("fs, seq_file: fallback to vmalloc instead of oom kill processes") seq_buf_alloc() avoids calling the oom killer for PAGE_SIZE or smaller allocations; but larger allocations can use the oom killer via vmalloc(). Thus reads of small files can return ENOMEM, but larger files use the oom killer to avoid ENOMEM. The effect of this bug is that reads from /proc and other virtual filesystems can return ENOMEM instead of the preferred behavior - oom killing something (possibly the calling process). I don't know of anyone except Google who has noticed the issue. I suspect the fix is more needed in smaller systems where there isn't any reclaimable memory. But these seem like the kinds of systems which probably don't use the oom killer for production situations. Memory overcommit requires use of the oom killer to select a victim regardless of file size. Enable oom killer for small seq_buf_alloc() allocations. Fixes: 5cec38ac ("fs, seq_file: fallback to vmalloc instead of oom kill processes") Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Mathias Krause authored
commit 3824657c upstream. The following statement of ABI/testing/dev-kmsg is not quite right: It is not possible to inject messages from userspace with the facility number LOG_KERN (0), to make sure that the origin of the messages can always be reliably determined. Userland actually can inject messages with a facility of 0 by abusing the fact that the facility is stored in a u8 data type. By using a facility which is a multiple of 256 the assignment of msg->facility in log_store() implicitly truncates it to 0, i.e. LOG_KERN, allowing users of /dev/kmsg to spoof kernel messages as shown below: The following call... # printf '<%d>Kernel panic - not syncing: beer empty\n' 0 >/dev/kmsg ...leads to the following log entry (dmesg -x | tail -n 1): user :emerg : [ 66.137758] Kernel panic - not syncing: beer empty However, this call... # printf '<%d>Kernel panic - not syncing: beer empty\n' 0x800 >/dev/kmsg ...leads to the slightly different log entry (note the kernel facility): kern :emerg : [ 74.177343] Kernel panic - not syncing: beer empty Fix that by limiting the user provided facility to 8 bit right from the beginning and catch the truncation early. Fixes: 7ff9554b ("printk: convert byte-buffer to variable-length...") Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Alex Elder <elder@linaro.org> Cc: Joe Perches <joe@perches.com> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Oleg Nesterov authored
commit 54708d28 upstream. The commit 96d0df79 ("proc: make proc_fd_permission() thread-friendly") fixed the access to /proc/self/fd from sub-threads, but introduced another problem: a sub-thread can't access /proc/<tid>/fd/ or /proc/thread-self/fd if generic_permission() fails. Change proc_fd_permission() to check same_thread_group(pid_task(), current). Fixes: 96d0df79 ("proc: make proc_fd_permission() thread-friendly") Reported-by: "Jin, Yihua" <yihua.jin@intel.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Takashi Iwai authored
commit 60603950 upstream. Another Lifebook machine that needs the same quirk as other similar models to make the driver working. Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=883192Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Zi Shen Lim authored
commit 14e589ff upstream. Turns out in the case of modulo by zero in a BPF program: A = A % X; (X == 0) the expected behavior is to terminate with return value 0. The bug in JIT is exposed by a new test case [1]. [1] https://lkml.org/lkml/2015/11/4/499Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Reported-by: Yang Shi <yang.shi@linaro.org> Reported-by: Xi Wang <xi.wang@gmail.com> CC: Alexei Starovoitov <ast@plumgrid.com> Fixes: e54bcde3 ("arm64: eBPF JIT compiler") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Zi Shen Lim authored
commit 251599e1 upstream. In the case of division by zero in a BPF program: A = A / X; (X == 0) the expected behavior is to terminate with return value 0. This is confirmed by the test case introduced in commit 86bf1721 ("test_bpf: add tests checking that JIT/interpreter sets A and X to 0."). Reported-by: Yang Shi <yang.shi@linaro.org> Tested-by: Yang Shi <yang.shi@linaro.org> CC: Xi Wang <xi.wang@gmail.com> CC: Alexei Starovoitov <ast@plumgrid.com> CC: linux-arm-kernel@lists.infradead.org CC: linux-kernel@vger.kernel.org Fixes: e54bcde3 ("arm64: eBPF JIT compiler") Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Michal Hocko authored
commit c12176d3 upstream. Commit 424cdc14 ("memcg: convert threshold to bytes") has fixed a regression introduced by 3e32cb2e ("mm: memcontrol: lockless page counters") where thresholds were silently converted to use page units rather than bytes when interpreting the user input. The fix is not complete, though, as properly pointed out by Ben Hutchings during stable backport review. The page count is converted to bytes but unsigned long is used to hold the value which would be obviously not sufficient for 32b systems with more than 4G thresholds. The same applies to usage as taken from mem_cgroup_usage which might overflow. Let's remove this bytes vs. pages internal tracking differences and handle thresholds in page units internally. Chage mem_cgroup_usage() to return the value in page units and revert 424cdc14 because this should be sufficient for the consistent handling. mem_cgroup_read_u64 as the only users of mem_cgroup_usage outside of the threshold handling code is converted to give the proper in bytes result. It is doing that already for page_counter output so this is more consistent as well. The value presented to the userspace is still in bytes units. Fixes: 424cdc14 ("memcg: convert threshold to bytes") Fixes: 3e32cb2e ("mm: memcontrol: lockless page counters") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Ben Hutchings <ben@decadent.org.uk> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> From: Michal Hocko <mhocko@kernel.org> Subject: memcg-fix-thresholds-for-32b-architectures-fix Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Johannes Weiner <hannes@cmpxchg.org> From: Andrew Morton <akpm@linux-foundation.org> Subject: memcg-fix-thresholds-for-32b-architectures-fix-fix don't attempt to inline mem_cgroup_usage() The compiler ignores the inline anwyay. And __always_inlining it adds 600 bytes of goop to the .o file. Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Catalin Marinas authored
commit d4322d88 upstream. On systems with a KMALLOC_MIN_SIZE of 128 (arm64, some mips and powerpc configurations defining ARCH_DMA_MINALIGN to 128), the first kmalloc_caches[] entry to be initialised after slab_early_init = 0 is "kmalloc-128" with index 7. Depending on the debug kernel configuration, sizeof(struct kmem_cache) can be larger than 128 resulting in an INDEX_NODE of 8. Commit 8fc9cf42 ("slab: make more slab management structure off the slab") enables off-slab management objects for sizes starting with PAGE_SIZE >> 5 (128 bytes for a 4KB page configuration) and the creation of the "kmalloc-128" cache would try to place the management objects off-slab. However, since KMALLOC_MIN_SIZE is already 128 and freelist_size == 32 in __kmem_cache_create(), kmalloc_slab(freelist_size) returns NULL (kmalloc_caches[7] not populated yet). This triggers the following bug on arm64: kernel BUG at /work/Linux/linux-2.6-aarch64/mm/slab.c:2283! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.3.0-rc4+ #540 Hardware name: Juno (DT) PC is at __kmem_cache_create+0x21c/0x280 LR is at __kmem_cache_create+0x210/0x280 [...] Call trace: __kmem_cache_create+0x21c/0x280 create_boot_cache+0x48/0x80 create_kmalloc_cache+0x50/0x88 create_kmalloc_caches+0x4c/0xf4 kmem_cache_init+0x100/0x118 start_kernel+0x214/0x33c This patch introduces an OFF_SLAB_MIN_SIZE definition to avoid off-slab management objects for sizes equal to or smaller than KMALLOC_MIN_SIZE. Fixes: 8fc9cf42 ("slab: make more slab management structure off the slab") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Christoph Hellwig authored
commit 40998193 upstream. When dropping a lock while iterating a list we must restart the search as other threads could have manipulated the list under us. Without this we can get stuck in an endless loop. This bug was introduced by commit bc3f02a7 Author: Dan Williams <djbw@fb.com> Date: Tue Aug 28 22:12:10 2012 -0700 [SCSI] scsi_remove_target: fix softlockup regression on hot remove Which was itself trying to fix a reported soft lockup issue http://thread.gmane.org/gmane.linux.kernel/1348679 However, we believe even with this revert of the original patch, the soft lockup problem has been fixed by commit f2495e22 Author: James Bottomley <JBottomley@Parallels.com> Date: Tue Jan 21 07:01:41 2014 -0800 [SCSI] dual scan thread bug fix Thanks go to Dan Williams <dan.j.williams@intel.com> for tracking all this prior history down. Reported-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: bc3f02a7Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Stefan Richter authored
commit 100ceb66 upstream. Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo controllers: Often or even most of the time, the controller is initialized with the message "added OHCI v1.10 device as card 0, 4 IR + 0 IT contexts, quirks 0x10". With 0 isochronous transmit DMA contexts (IT contexts), applications like audio output are impossible. However, OHCI-1394 demands that at least 4 IT contexts are implemented by the link layer controller, and indeed JMicron JMB38x do implement four of them. Only their IsoXmitIntMask register is unreliable at early access. With my own JMB381 single function controller I found: - I can reproduce the problem with a lower probability than Craig's. - If I put a loop around the section which clears and reads IsoXmitIntMask, then either the first or the second attempt will return the correct initial mask of 0x0000000f. I never encountered a case of needing more than a second attempt. - Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet) before the first write, the subsequent read will return the correct result. - If I merely ignore a wrong read result and force the known real result, later isochronous transmit DMA usage works just fine. So let's just fix this chip bug up by the latter method. Tested with JMB381 on kernel 3.13 and 4.3. Since OHCI-1394 generally requires 4 IT contexts at a minium, this workaround is simply applied whenever the initial read of IsoXmitIntMask returns 0, regardless whether it's a JMicron chip or not. I never heard of this issue together with any other chip though. I am not 100% sure that this fix works on the OHCI-1394 part of JMB380 and JMB388 combo controllers exactly the same as on the JMB381 single- function controller, but so far I haven't had a chance to let an owner of a combo chip run a patched kernel. Strangely enough, IsoRecvIntMask is always reported correctly, even though it is probed right before IsoXmitIntMask. Reported-by: Clifford Dunn Reported-by: Craig Moore <craig.moore@qenos.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Alexandra Yates authored
commit 5cf92c8b upstream. Adding Intel codename Lewisburg platform device IDs for audio. [rearranged the position by tiwai] Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Takashi Iwai authored
commit c932b98c upstream. HP ProBook 6550b needs the same pin fixup applied to other HP B-series laptops with docks for making its headphone and dock headphone jacks working properly. We just need to add the codec SSID to the list. Bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=191971Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Krzysztof Kozlowski authored
commit 824ead03 upstream. During probe if the regulator could not be enabled, the error exit path would still disable it. This could lead to unbalanced counter of regulator enable/disable. The patch moves code for getting and enabling the regulator from exynos_map_dt_data() to probe function because it is really not a part of getting Device Tree properties. Acked-by: Lukasz Majewski <l.majewski@samsung.com> Tested-by: Lukasz Majewski <l.majewski@samsung.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: 5f09a5cb ("thermal: exynos: Disable the regulator on probe failure") Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Radim Krčmář authored
commit 656ec4a4 upstream. The comment in code had it mostly right, but we enable paging for emulated real mode regardless of EPT. Without EPT (which implies emulated real mode), secondary VCPUs won't start unless we disable SM[AE]P when the guest doesn't use paging. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Li Bin authored
commit 2ee8a74f upstream. By now, the recordmcount only records the function that in following sections: .text/.ref.text/.sched.text/.spinlock.text/.irqentry.text/ .kprobes.text/.text.unlikely For the function that not in these sections, the call mcount will be in place and not be replaced when kernel boot up. And it will bring performance overhead, such as do_mem_abort (in .exception.text section). This patch make the call mcount to nop for this case in recordmcount. Link: http://lkml.kernel.org/r/1446019445-14421-1-git-send-email-huawei.libin@huawei.com Link: http://lkml.kernel.org/r/1446193864-24593-4-git-send-email-huawei.libin@huawei.com Cc: <lkp@intel.com> Cc: <catalin.marinas@arm.com> Cc: <takahiro.akashi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
libin authored
commit c84da8b9 upstream. In nop_mcount, shdr->sh_offset and welp->r_offset should handle endianness properly, otherwise it will trigger Segmentation fault if the recordmcount main and file.o have different endianness. Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.comSigned-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Max Filippov authored
commit ab45fb14 upstream. There are multiple factors adding to the issue in different configurations: - commit 17290231 ("xtensa: add fixup for double exception raised in window overflow") added function window_overflow_restore_a0_fixup to double exception vector overlapping reset vector location of secondary processor cores. - on MMUv2 cores RESET_VECTOR1_VADDR may point to uncached kernel memory making code overlapping depend on cache type and size, so that without cache or with WT cache reset vector code overwrites double exception code, making issue even harder to detect. - on MMUv3 cores RESET_VECTOR1_VADDR may point to unmapped area, as MMUv3 cores change virtual address map to match MMUv2 layout, but reset vector virtual address is given for the original MMUv3 mapping. - physical memory region of the secondary reset vector is not reserved in the physical memory map, and thus may be allocated and overwritten at arbitrary moment. Fix it as follows: - move window_overflow_restore_a0_fixup code to .text section. - define RESET_VECTOR1_VADDR so that it points to reset vector in the cacheable MMUv2 map for cores with MMU. - reserve reset vector region in the physical memory map. Drop separate literal section and build mxhead.S with text section literals. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Arik Nemtsov authored
commit 254d3dfe upstream. In TDLS channel-switch operations the chandef can sometimes be NULL. Avoid an oops in the trace code for these cases and just print a chandef full of zeros. Fixes: a7a6bdd0 ("mac80211: introduce TDLS channel switch ops") Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Ola Olsson authored
commit 4baf6bea upstream. If parse_acl_data succeeds but the subsequent parsing of smps attributes fails, there will be a memory leak due to early returns. Fix that by moving the ACL parsing later. Fixes: 18998c38 ("cfg80211: allow requesting SMPS mode on ap start") Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Janusz.Dziedzic@tieto.com authored
commit 519ee691 upstream. In case of one shot NOA the interval can be 0, catch that instead of potentially (depending on the driver) crashing like this: divide error: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffc08e891c>] ieee80211_extend_absent_time+0x6c/0xb0 [mac80211] [<ffffffffc08e8a17>] ieee80211_update_p2p_noa+0xb7/0xe0 [mac80211] [<ffffffffc069cc30>] ath9k_p2p_ps_timer+0x170/0x190 [ath9k] [<ffffffffc070adf8>] ath_gen_timer_isr+0xc8/0xf0 [ath9k_hw] [<ffffffffc0691156>] ath9k_tasklet+0x296/0x2f0 [ath9k] [<ffffffff8107ad65>] tasklet_action+0xe5/0xf0 [...] Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
sumit.saxena@avagotech.com authored
commit 323c4a02 upstream. This is an issue on SMAP enabled CPUs and 32 bit apps running on 64 bit OS. Do not access user memory from kernel code. The SMAP bit restricts accessing user memory from kernel code. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Gabriele Paoloni authored
commit fa3b7cba upstream. The first argument of dw_pcie_cfg_read/write() is a 32-bit aligned address. The second argument is the byte offset into a 32-bit word, and dw_pcie_cfg_read/write() only look at the low two bits. SPEAr13xx used dw_pcie_cfg_read() and dw_pcie_cfg_write() incorrectly: it passed important address bits in the second argument, where they were ignored. Pass the complete 32-bit word address in the first argument and only the 2-bit offset into that word in the second argument. Without this fix, SPEAr13xx host will never work with few buggy gen1 card which connects with only gen1 host and also with any endpoint which would generate a read request of more than 128 bytes. [bhelgaas: changelog] Reported-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Max Filippov authored
commit 5029615e upstream. Build-time fixes: - make lbeg/lend/lcount save/restore conditional on kernel entry; - don't clear lcount in platform_restart functions unconditionally. Run-time fixes: - use correct end of range register in __endla paired with __loopt, not the unused temporary register. This fixes .bss zero-initialization. Update comments in asmmacro.h; - don't clobber a10 in the usercopy that leads to access to unmapped memory. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Herbert Xu authored
commit 4afa5f96 upstream. The hash_accept call fails to work on sockets that have not received any data. For some algorithm implementations it may cause crashes. This patch fixes this by ensuring that we only export and import on sockets that have received data. Reported-by: Harsh Jain <harshjain.prof@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Jani Nikula authored
commit 9be64eee upstream. Reported-by: Keith Webb <khwebb@gmail.com> Suggested-by: Keith Webb <khwebb@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=106671Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446209424-28801-1-git-send-email-jani.nikula@intel.comSigned-off-by: Kamal Mostafa <kamal@canonical.com>
-
Mauricio Faria de Oliveira authored
commit 47796938 upstream. This reverts commit a1989b33. That commit introduced a regression at least for the case of the SG_IO ioctl() running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there are no active paths: the ioctl() fails with the ENOTTY errno immediately rather than blocking due to queue_if_no_path until a path becomes active, for example. That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices (qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]) from multipath devices; which leads to SCSI/filesystem errors in such a guest. More general scenarios can hit that regression too. The following demonstration employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective (some output & user changes omitted for brevity and comments added for clarity). Reverting that commit restores normal operation (queueing) in failing scenarios; tested on linux-next (next-20151022). 1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM) $ cat sg_simple0.c ... see [3] ... $ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c $ gcc sgio_inquiry.c -o sgio_inquiry 2) The ioctl() works fine with active paths present. # multipath -l 85ag56 85ag56 (...) dm-19 IBM ,2145 size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw |-+- policy='service-time 0' prio=0 status=active | |- 8:0:11:0 sdz 65:144 active undef running | `- 9:0:9:0 sdbf 67:144 active undef running `-+- policy='service-time 0' prio=0 status=enabled |- 8:0:12:0 sdae 65:224 active undef running `- 9:0:12:0 sdbo 68:32 active undef running $ ./sgio_inquiry /dev/mapper/85ag56 Some of the INQUIRY command's response: IBM 2145 0000 INQUIRY duration=0 millisecs, resid=0 3) The ioctl() fails with ENOTTY errno with _no_ active paths present, for unprivileged users (rather than blocking due to queue_if_no_path). # for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \ do multipathd -k"fail path $path"; done # multipath -l 85ag56 85ag56 (...) dm-19 IBM ,2145 size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw |-+- policy='service-time 0' prio=0 status=enabled | |- 8:0:11:0 sdz 65:144 failed undef running | `- 9:0:9:0 sdbf 67:144 failed undef running `-+- policy='service-time 0' prio=0 status=enabled |- 8:0:12:0 sdae 65:224 failed undef running `- 9:0:12:0 sdbo 68:32 failed undef running $ ./sgio_inquiry /dev/mapper/85ag56 sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285); it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl(). $ dmesg <...> [] device-mapper: multipath: Failing path 65:144. [] device-mapper: multipath: Failing path 67:144. [] device-mapper: multipath: Failing path 65:224. [] device-mapper: multipath: Failing path 68:32. [] sgio_inquiry: sending ioctl 2285 to a partition! 5) The ioctl() only works if the SYS_CAP_RAWIO capability is present (then queueing happens -- in this example, queue_if_no_path is set); this is due to a conditional check in scsi_verify_blk_ioctl(). # capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56' sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device # ./sgio_inquiry /dev/mapper/85ag56 & [1] 72830 # cat /proc/72830/stack [<c00000171c0df700>] 0xc00000171c0df700 [<c000000000015934>] __switch_to+0x204/0x350 [<c000000000152d4c>] msleep+0x5c/0x80 [<c00000000077dfb0>] dm_blk_ioctl+0x70/0x170 [<c000000000487c40>] blkdev_ioctl+0x2b0/0x9b0 [<c0000000003128e4>] block_ioctl+0x64/0xd0 [<c0000000002dd3b0>] do_vfs_ioctl+0x490/0x780 [<c0000000002dd774>] SyS_ioctl+0xd4/0xf0 [<c000000000009358>] system_call+0x38/0xd0 6) This is the function call chain exercised in this analysis: SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c -> do_vfs_ioctl() -> vfs_ioctl() ... error = filp->f_op->unlocked_ioctl(filp, cmd, arg); ... -> dm_blk_ioctl() @ drivers/md/dm.c -> multipath_ioctl() @ drivers/md/dm-mpath.c ... (bdev = NULL, due to no active paths) ... if (!bdev || <...>) { int err = scsi_verify_blk_ioctl(NULL, cmd); if (err) r = err; } ... -> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c ... if (bd && bd == bd->bd_contains) // not taken (bd = NULL) return 0; ... if (capable(CAP_SYS_RAWIO)) // not taken (unprivileged user) return 0; ... printk_ratelimited(KERN_WARNING "%s: sending ioctl %x to a partition!\n" <...>); return -ENOIOCTLCMD; <- ... return r ? : <...> <- ... if (error == -ENOIOCTLCMD) error = -ENOTTY; out: return error; ... Links: [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52 [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device') [3] http://tldp.org/HOWTO/SCSI-Generic-HOWTO/pexample.html (Revision 1.2, 2002-05-03) Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Brian Norris authored
commit f3c63795 upstream. Commit 073db4a5 ("mtd: fix: avoid race condition when accessing mtd->usecount") fixed a race condition but due to poor ordering of the mutex acquisition, introduced a potential deadlock. The deadlock can occur, for example, when rmmod'ing the m25p80 module, which will delete one or more MTDs, along with any corresponding mtdblock devices. This could potentially race with an acquisition of the block device as follows. -> blktrans_open() -> mutex_lock(&dev->lock); -> mutex_lock(&mtd_table_mutex); -> del_mtd_device() -> mutex_lock(&mtd_table_mutex); -> blktrans_notify_remove() -> del_mtd_blktrans_dev() -> mutex_lock(&dev->lock); This is a classic (potential) ABBA deadlock, which can be fixed by making the A->B ordering consistent everywhere. There was no real purpose to the ordering in the original patch, AFAIR, so this shouldn't be a problem. This ordering was actually already present in del_mtd_blktrans_dev(), for one, where the function tried to ensure that its caller already held mtd_table_mutex before it acquired &dev->lock: if (mutex_trylock(&mtd_table_mutex)) { mutex_unlock(&mtd_table_mutex); BUG(); } So, reverse the ordering of acquisition of &dev->lock and &mtd_table_mutex so we always acquire mtd_table_mutex first. Snippets of the lockdep output follow: # modprobe -r m25p80 [ 53.419251] [ 53.420838] ====================================================== [ 53.427300] [ INFO: possible circular locking dependency detected ] [ 53.433865] 4.3.0-rc6 #96 Not tainted [ 53.437686] ------------------------------------------------------- [ 53.444220] modprobe/372 is trying to acquire lock: [ 53.449320] (&new->lock){+.+...}, at: [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc [ 53.457271] [ 53.457271] but task is already holding lock: [ 53.463372] (mtd_table_mutex){+.+.+.}, at: [<c0439994>] del_mtd_device+0x18/0x100 [ 53.471321] [ 53.471321] which lock already depends on the new lock. [ 53.471321] [ 53.479856] [ 53.479856] the existing dependency chain (in reverse order) is: [ 53.487660] -> #1 (mtd_table_mutex){+.+.+.}: [ 53.492331] [<c043fc5c>] blktrans_open+0x34/0x1a4 [ 53.497879] [<c01afce0>] __blkdev_get+0xc4/0x3b0 [ 53.503364] [<c01b0bb8>] blkdev_get+0x108/0x320 [ 53.508743] [<c01713c0>] do_dentry_open+0x218/0x314 [ 53.514496] [<c0180454>] path_openat+0x4c0/0xf9c [ 53.519959] [<c0182044>] do_filp_open+0x5c/0xc0 [ 53.525336] [<c0172758>] do_sys_open+0xfc/0x1cc [ 53.530716] [<c000f740>] ret_fast_syscall+0x0/0x1c [ 53.536375] -> #0 (&new->lock){+.+...}: [ 53.540587] [<c063f124>] mutex_lock_nested+0x38/0x3cc [ 53.546504] [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc [ 53.552606] [<c043f164>] blktrans_notify_remove+0x7c/0x84 [ 53.558891] [<c04399f0>] del_mtd_device+0x74/0x100 [ 53.564544] [<c043c670>] del_mtd_partitions+0x80/0xc8 [ 53.570451] [<c0439aa0>] mtd_device_unregister+0x24/0x48 [ 53.576637] [<c046ce6c>] spi_drv_remove+0x1c/0x34 [ 53.582207] [<c03de0f0>] __device_release_driver+0x88/0x114 [ 53.588663] [<c03de19c>] device_release_driver+0x20/0x2c [ 53.594843] [<c03dd9e8>] bus_remove_device+0xd8/0x108 [ 53.600748] [<c03dacc0>] device_del+0x10c/0x210 [ 53.606127] [<c03dadd0>] device_unregister+0xc/0x20 [ 53.611849] [<c046d878>] __unregister+0x10/0x20 [ 53.617211] [<c03da868>] device_for_each_child+0x50/0x7c [ 53.623387] [<c046eae8>] spi_unregister_master+0x58/0x8c [ 53.629578] [<c03e12f0>] release_nodes+0x15c/0x1c8 [ 53.635223] [<c03de0f8>] __device_release_driver+0x90/0x114 [ 53.641689] [<c03de900>] driver_detach+0xb4/0xb8 [ 53.647147] [<c03ddc78>] bus_remove_driver+0x4c/0xa0 [ 53.652970] [<c00cab50>] SyS_delete_module+0x11c/0x1e4 [ 53.658976] [<c000f740>] ret_fast_syscall+0x0/0x1c [ 53.664621] [ 53.664621] other info that might help us debug this: [ 53.664621] [ 53.672979] Possible unsafe locking scenario: [ 53.672979] [ 53.679169] CPU0 CPU1 [ 53.683900] ---- ---- [ 53.688633] lock(mtd_table_mutex); [ 53.692383] lock(&new->lock); [ 53.698306] lock(mtd_table_mutex); [ 53.704658] lock(&new->lock); [ 53.707946] [ 53.707946] *** DEADLOCK *** Fixes: 073db4a5 ("mtd: fix: avoid race condition when accessing mtd->usecount") Reported-by: Felipe Balbi <balbi@ti.com> Tested-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Marek Vasut authored
commit 562b103a upstream. The sizeof() is invoked on an incorrect variable, likely due to some copy-paste error, and this might result in memory corruption. Fix this. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: netdev@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Robin Murphy authored
commit 5accd17d upstream. For reasons not entirely apparent, but now enshrined in history, the architectural mapping of AArch32 banked registers to AArch64 registers actually orders SP_<mode> and LR_<mode> backwards compared to the intuitive r13/r14 order, for all modes except FIQ. Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding subtle bugs with KVM and AArch32 guests. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
David Hildenbrand authored
commit c5c2c393 upstream. We seemed to have missed a few corner cases in commit f6c137ff ("KVM: s390: randomize sca address"). The SCA has a maximum size of 2112 bytes. By setting the sca_offset to some unlucky numbers, we exceed the page. 0x7c0 (1984) -> Fits exactly 0x7d0 (2000) -> 16 bytes out 0x7e0 (2016) -> 32 bytes out 0x7f0 (2032) -> 48 bytes out One VCPU entry is 32 bytes long. For the last two cases, we actually write data to the other page. 1. The address of the VCPU. 2. Injection/delivery/clearing of SIGP externall calls via SIGP IF. Especially the 2. happens regularly. So this could produce two problems: 1. The guest losing/getting external calls. 2. Random memory overwrites in the host. So this problem happens on every 127 + 128 created VM with 64 VCPUs. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
sumit.saxena@avagotech.com authored
commit 357ae967 upstream. Do not use PAGE_SIZE marco to calculate max_sectors per I/O request. Driver code assumes PAGE_SIZE will be always 4096 which can lead to wrongly calculated value if PAGE_SIZE is not 4096. This issue was reported in Ubuntu Bugzilla Bug #1475166. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
sumit.saxena@avagotech.com authored
commit 0d5b47a7 upstream. Expose non-disk (TAPE drive, CD-ROM) unconditionally. Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-