- 24 Apr, 2018 7 commits
-
-
Claudio Imbrenda authored
commit a38c015f upstream. When using KSM with use_zero_pages, we replace anonymous pages containing only zeroes with actual zero pages, which are not anonymous. We need to do proper accounting of the mm counters, otherwise we will get wrong values in /proc and a BUG message in dmesg when tearing down the mm. Link: http://lkml.kernel.org/r/1522931274-15552-1-git-send-email-imbrenda@linux.vnet.ibm.com Fixes: e86c59b1 ("mm/ksm: improve deduplication of zero pages with colouring") Signed-off-by:
Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Richard Weinberger authored
commit b5094b7f upstream. While UBI and UBIFS seem to work at first sight with MLC NAND, you will most likely lose all your data upon a power-cut or due to read/write disturb. In order to protect users from bad surprises, refuse to attach to MLC NAND. Cc: stable@vger.kernel.org Signed-off-by:
Richard Weinberger <richard@nod.at> Acked-by:
Boris Brezillon <boris.brezillon@bootlin.com> Acked-by:
Artem Bityutskiy <dedekind1@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Romain Izard authored
commit 78a8dfba upstream. When opening a device with write access, ubiblock_open returns an error code. Currently, this error code is -EPERM, but this is not the right value. The open function for other block devices returns -EROFS when opening read-only devices with FMODE_WRITE set. When used with dm-verity, the veritysetup userspace tool is expecting EROFS, and refuses to use the ubiblock device. Use -EROFS for ubiblock as well. As a result, veritysetup accepts the ubiblock device as valid. Cc: stable@vger.kernel.org Fixes: 9d54c8a3 (UBI: R/O block driver on top of UBI volumes) Signed-off-by:
Romain Izard <romain.izard.pro@gmail.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Richard Weinberger authored
commit 29b7a6fa upstream. At this point UBI volumes have already been free()'ed and fastmap can no longer access these data structures. Reported-by:
Martin Townsend <mtownsend1973@gmail.com> Fixes: 74cdaf24 ("UBI: Fastmap: Fix memory leaks while closing the WL sub-system") Cc: stable@vger.kernel.org Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Richard Weinberger authored
commit aac17948 upstream. If ubifs_wbuf_sync() fails we must not write a master node with the dirty marker cleared. Otherwise it is possible that in case of an IO error while syncing we mark the filesystem as clean and UBIFS refuses to recover upon next mount. Cc: <stable@vger.kernel.org> Fixes: 1e51764a ("UBIFS: add new flash file system") Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
George Cherian authored
commit 3d41386d upstream. With commit e948bc8f (cpufreq: Cap the default transition delay value to 10 ms) the cpufreq was not honouring the delay passed via ACPI (PCCT). Due to which on ARM based platforms using CPPC the cpufreq governor tries to change the frequency of CPUs faster than expected. This leads to continuous error messages like the following. " ACPI CPPC: PCC check channel failed. Status=0 " Earlier (without above commit) the default transition delay was taken form the value passed from PCCT. Use the same value provided by PCCT to set the transition_delay_us. Fixes: e948bc8f (cpufreq: Cap the default transition delay value to 10 ms) Signed-off-by:
George Cherian <george.cherian@cavium.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tejun Heo authored
commit 28b0f8a6 upstream. A tty is hung up by __tty_hangup() setting file->f_op to hung_up_tty_fops, which is skipped on ttys whose write operation isn't tty_write(). This means that, for example, /dev/console whose write op is redirected_tty_write() is never actually marked hung up. Because n_tty_read() uses the hung up status to decide whether to abort the waiting readers, the lack of hung-up marking can lead to the following scenario. 1. A session contains two processes. The leader and its child. The child ignores SIGHUP. 2. The leader exits and starts disassociating from the controlling terminal (/dev/console). 3. __tty_hangup() skips setting f_op to hung_up_tty_fops. 4. SIGHUP is delivered and ignored. 5. tty_ldisc_hangup() is invoked. It wakes up the waits which should clear the read lockers of tty->ldisc_sem. 6. The reader wakes up but because tty_hung_up_p() is false, it doesn't abort and goes back to sleep while read-holding tty->ldisc_sem. 7. The leader progresses to tty_ldisc_lock() in tty_ldisc_hangup() and is now stuck in D sleep indefinitely waiting for tty->ldisc_sem. The following is Alan's explanation on why some ttys aren't hung up. http://lkml.kernel.org/r/20171101170908.6ad08580@alans-desktop 1. It broke the serial consoles because they would hang up and close down the hardware. With tty_port that *should* be fixable properly for any cases remaining. 2. The console layer was (and still is) completely broken and doens't refcount properly. So if you turn on console hangups it breaks (as indeed does freeing consoles and half a dozen other things). As neither can be fixed quickly, this patch works around the problem by introducing a new flag, TTY_HUPPING, which is used solely to tell n_tty_read() that hang-up is in progress for the console and the readers should be aborted regardless of the hung-up status of the device. The following is a sample hung task warning caused by this issue. INFO: task agetty:2662 blocked for more than 120 seconds. Not tainted 4.11.3-dbg-tty-lockup-02478-gfd6c7ee-dirty #28 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 0 2662 1 0x00000086 Call Trace: __schedule+0x267/0x890 schedule+0x36/0x80 schedule_timeout+0x23c/0x2e0 ldsem_down_write+0xce/0x1f6 tty_ldisc_lock+0x16/0x30 tty_ldisc_hangup+0xb3/0x1b0 __tty_hangup+0x300/0x410 disassociate_ctty+0x6c/0x290 do_exit+0x7ef/0xb00 do_group_exit+0x3f/0xa0 get_signal+0x1b3/0x5d0 do_signal+0x28/0x660 exit_to_usermode_loop+0x46/0x86 do_syscall_64+0x9c/0xb0 entry_SYSCALL64_slow_path+0x25/0x25 The following is the repro. Run "$PROG /dev/console". The parent process hangs in D state. #include <sys/types.h> #include <sys/stat.h> #include <sys/wait.h> #include <sys/ioctl.h> #include <fcntl.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <signal.h> #include <time.h> #include <termios.h> int main(int argc, char **argv) { struct sigaction sact = { .sa_handler = SIG_IGN }; struct timespec ts1s = { .tv_sec = 1 }; pid_t pid; int fd; if (argc < 2) { fprintf(stderr, "test-hung-tty /dev/$TTY\n"); return 1; } /* fork a child to ensure that it isn't already the session leader */ pid = fork(); if (pid < 0) { perror("fork"); return 1; } if (pid > 0) { /* top parent, wait for everyone */ while (waitpid(-1, NULL, 0) >= 0) ; if (errno != ECHILD) perror("waitpid"); return 0; } /* new session, start a new session and set the controlling tty */ if (setsid() < 0) { perror("setsid"); return 1; } fd = open(argv[1], O_RDWR); if (fd < 0) { perror("open"); return 1; } if (ioctl(fd, TIOCSCTTY, 1) < 0) { perror("ioctl"); return 1; } /* fork a child, sleep a bit and exit */ pid = fork(); if (pid < 0) { perror("fork"); return 1; } if (pid > 0) { nanosleep(&ts1s, NULL); printf("Session leader exiting\n"); exit(0); } /* * The child ignores SIGHUP and keeps reading from the controlling * tty. Because SIGHUP is ignored, the child doesn't get killed on * parent exit and the bug in n_tty makes the read(2) block the * parent's control terminal hangup attempt. The parent ends up in * D sleep until the child is explicitly killed. */ sigaction(SIGHUP, &sact, NULL); printf("Child reading tty\n"); while (1) { char buf[1024]; if (read(fd, buf, sizeof(buf)) < 0) { perror("read"); return 1; } } return 0; } Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@llwyncelyn.cymru> Cc: stable@vger.kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 19 Apr, 2018 33 commits
-
-
Greg Kroah-Hartman authored
-
J. Bruce Fields authored
commit 880a3a53 upstream. We're neglecting to clear the umask after it's set, which can cause a later unrelated rpc to (incorrectly) use the same umask if it happens to be processed by the same thread. There's a more subtle problem here too: An NFSv4 compound request is decoded all in one pass before any operations are executed. Currently we're setting current->fs->umask at the time we decode the compound. In theory a single compound could contain multiple creates each setting a umask. In that case we'd end up using whichever umask was passed in the *last* operation as the umask for all the creates, whether that was correct or not. So, we should just be saving the umask at decode time and waiting to set it until we actually process the corresponding operation. In practice it's unlikely any client would do multiple creates in a single compound. And even if it did they'd likely be from the same process (hence carry the same umask). So this is a little academic, but we should get it right anyway. Fixes: 47057abd (nfsd: add support for the umask attribute) Cc: stable@vger.kernel.org Reported-by:
Lucash Stach <l.stach@pengutronix.de> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mike Kravetz authored
commit 5df63c2a upstream. This is a fix for a regression in 32 bit kernels caused by an invalid check for pgoff overflow in hugetlbfs mmap setup. The check incorrectly specified that the size of a loff_t was the same as the size of a long. The regression prevents mapping hugetlbfs files at offsets greater than 4GB on 32 bit kernels. On 32 bit kernels conversion from a page based unsigned long can not overflow a loff_t byte offset. Therefore, skip this check if sizeof(unsigned long) != sizeof(loff_t). Link: http://lkml.kernel.org/r/20180330145402.5053-1-mike.kravetz@oracle.com Fixes: 63489f8e ("hugetlbfs: check for pgoff value overflow") Reported-by:
Dan Rue <dan.rue@linaro.org> Signed-off-by:
Mike Kravetz <mike.kravetz@oracle.com> Tested-by:
Anders Roxell <anders.roxell@linaro.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Yisheng Xie <xieyisheng1@huawei.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Nic Losby <blurbdust@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Simon Gaiser authored
commit 2a22ee6c upstream. Commit fd8aa909 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") made a subtle change to the semantic of xenbus_dev_request_and_reply() and xenbus_transaction_end(). Before on an error response to XS_TRANSACTION_END xenbus_dev_request_and_reply() would not decrement the active transaction counter. But xenbus_transaction_end() has always counted the transaction as finished regardless of the response. The new behavior is that xenbus_dev_request_and_reply() and xenbus_transaction_end() will always count the transaction as finished regardless the response code (handled in xs_request_exit()). But xenbus_dev_frontend tries to end a transaction on closing of the device if the XS_TRANSACTION_END failed before. Trying to close the transaction twice corrupts the reference count. So fix this by also considering a transaction closed if we have sent XS_TRANSACTION_END once regardless of the return code. Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd8aa909 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Signed-off-by:
Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Amir Goldstein authored
commit 3ec9b3fa upstream. As of now if we encounter an opaque dir while looking for a dentry, we set d->last=true. This means that there is no need to look further in any of the lower layers. This works fine as long as there are no redirets or relative redircts. But what if there is an absolute redirect on the children dentry of opaque directory. We still need to continue to look into next lower layer. This patch fixes it. Here is an example to demonstrate the issue. Say you have following setup. upper: /redirect (redirect=/a/b/c) lower1: /a/[b]/c ([b] is opaque) (c has absolute redirect=/a/b/d/) lower0: /a/b/d/foo Now "redirect" dir should merge with lower1:/a/b/c/ and lower0:/a/b/d. Note, despite the fact lower1:/a/[b] is opaque, we need to continue to look into lower0 because children c has an absolute redirect. Following is a reproducer. Watch me make foo disappear: $ mkdir lower middle upper work work2 merged $ mkdir lower/origin $ touch lower/origin/foo $ mount -t overlay none merged/ \ -olowerdir=lower,upperdir=middle,workdir=work2 $ mkdir merged/pure $ mv merged/origin merged/pure/redirect $ umount merged $ mount -t overlay none merged/ \ -olowerdir=middle:lower,upperdir=upper,workdir=work $ mv merged/pure/redirect merged/redirect Now you see foo inside a twice redirected merged dir: $ ls merged/redirect foo $ umount merged $ mount -t overlay none merged/ \ -olowerdir=middle:lower,upperdir=upper,workdir=work After mount cycle you don't see foo inside the same dir: $ ls merged/redirect During middle layer lookup, the opaqueness of middle/pure is left in the lookup state and then middle/pure/redirect is wrongly treated as opaque. Fixes: 02b69b28 ("ovl: lookup redirects") Cc: <stable@vger.kernel.org> #v4.10 Signed-off-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Vivek Goyal <vgoyal@redhat.com> Signed-off-by:
Miklos Szeredi <mszeredi@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ming Lei authored
commit bffa9909 upstream. From commit 4b855ad3 ("blk-mq: Create hctx for each present CPU), blk-mq doesn't remap queue after CPU topo is changed, that said when some of these offline CPUs become online, they are still mapped to hctx 0, then hctx 0 may become the bottleneck of IO dispatch and completion. This patch sets up the mapping from the beginning, and aligns to queue mapping for PCI device (blk_mq_pci_map_queues()). Cc: Stefan Haberland <sth@linux.vnet.ibm.com> Cc: Keith Busch <keith.busch@intel.com> Cc: stable@vger.kernel.org Fixes: 4b855ad3 ("blk-mq: Create hctx for each present CPU) Tested-by:
Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yury Norov authored
commit 8351760f upstream. syzbot is catching stalls at __bitmap_parselist() (https://syzkaller.appspot.com/bug?id=ad7e0351fbc90535558514a71cd3edc11681997a). The trigger is unsigned long v = 0; bitmap_parselist("7:,", &v, BITS_PER_LONG); which results in hitting infinite loop at while (a <= b) { off = min(b - a + 1, used_size); bitmap_set(maskp, a, off); a += group_size; } due to used_size == group_size == 0. Link: http://lkml.kernel.org/r/20180404162647.15763-1-ynorov@caviumnetworks.com Fixes: 0a5ce083 ("lib/bitmap.c: make bitmap_parselist() thread-safe and much faster") Signed-off-by:
Yury Norov <ynorov@caviumnetworks.com> Reported-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by:
syzbot <syzbot+6887cbb011c8054e8a3d@syzkaller.appspotmail.com> Cc: Noam Camus <noamca@mellanox.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yunlong Song authored
commit b94929d9 upstream. Commit 7a20b8a6 ("f2fs: allocate node and hot data in the beginning of partition") introduces another mount option, heap, to reset it back. But it does not do anything for heap mode, so fix it. Cc: stable@vger.kernel.org Signed-off-by:
Yunlong Song <yunlong.song@huawei.com> Reviewed-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit f3aefb6a upstream. make_checksum_hmac_md5() is allocating an HMAC transform and doing crypto API calls in the following order: crypto_ahash_init() crypto_ahash_setkey() crypto_ahash_digest() This is wrong because it makes no sense to init() the request before a key has been set, given that the initial state depends on the key. And digest() is short for init() + update() + final(), so in this case there's no need to explicitly call init() at all. Before commit 9fa68f62 ("crypto: hash - prevent using keyed hashes without setting key") the extra init() had no real effect, at least for the software HMAC implementation. (There are also hardware drivers that implement HMAC-MD5, and it's not immediately obvious how gracefully they handle init() before setkey().) But now the crypto API detects this incorrect initialization and returns -ENOKEY. This is breaking NFS mounts in some cases. Fix it by removing the incorrect call to crypto_ahash_init(). Reported-by:
Michael Young <m.a.young@durham.ac.uk> Fixes: 9fa68f62 ("crypto: hash - prevent using keyed hashes without setting key") Fixes: fffdaef2 ("gss_krb5: Add support for rc4-hmac encryption") Cc: stable@vger.kernel.org Signed-off-by:
Eric Biggers <ebiggers@google.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Toke Høiland-Jørgensen authored
commit 182b1917 upstream. When ath9k was switched over to use the mac80211 intermediate queues, node cleanup now drains the mac80211 queues. However, this call path is not protected by rcu_read_lock() as it was previously entirely internal to the driver which uses its own locking. This leads to a possible rcu_dereference() without holding rcu_read_lock(); but only if a station is cleaned up while having packets queued on the TXQ. Fix this by adding the rcu_read_lock() to the caller in ath9k. Fixes: 50f08edf ("ath9k: Switch to using mac80211 intermediate software queues.") Cc: stable@vger.kernel.org Reported-by:
Ben Greear <greearb@candelatech.com> Signed-off-by:
Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by:
Kalle Valo <kvalo@codeaurora.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marek Szyprowski authored
commit 0c4c5860 upstream. Initialize data->config_lock mutex before it is used by the driver code. This fixes following warning on Odroid XU3 boards: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc7-next-20180115-00001-gb75575dee3f2 #107 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c0111504>] (unwind_backtrace) from [<c010dbec>] (show_stack+0x10/0x14) [<c010dbec>] (show_stack) from [<c09b3f74>] (dump_stack+0x90/0xc8) [<c09b3f74>] (dump_stack) from [<c0179528>] (register_lock_class+0x1c0/0x59c) [<c0179528>] (register_lock_class) from [<c017bd1c>] (__lock_acquire+0x78/0x1850) [<c017bd1c>] (__lock_acquire) from [<c017de30>] (lock_acquire+0xc8/0x2b8) [<c017de30>] (lock_acquire) from [<c09ca59c>] (__mutex_lock+0x60/0xa0c) [<c09ca59c>] (__mutex_lock) from [<c09cafd0>] (mutex_lock_nested+0x1c/0x24) [<c09cafd0>] (mutex_lock_nested) from [<c068b0d0>] (ina2xx_set_shunt+0x70/0xb0) [<c068b0d0>] (ina2xx_set_shunt) from [<c068b218>] (ina2xx_probe+0x88/0x1b0) [<c068b218>] (ina2xx_probe) from [<c0673d90>] (i2c_device_probe+0x1e0/0x2d0) [<c0673d90>] (i2c_device_probe) from [<c053a268>] (driver_probe_device+0x2b8/0x4a0) [<c053a268>] (driver_probe_device) from [<c053a54c>] (__driver_attach+0xfc/0x120) [<c053a54c>] (__driver_attach) from [<c05384cc>] (bus_for_each_dev+0x58/0x7c) [<c05384cc>] (bus_for_each_dev) from [<c0539590>] (bus_add_driver+0x174/0x250) [<c0539590>] (bus_add_driver) from [<c053b5e0>] (driver_register+0x78/0xf4) [<c053b5e0>] (driver_register) from [<c0675ef0>] (i2c_register_driver+0x38/0xa8) [<c0675ef0>] (i2c_register_driver) from [<c0102b40>] (do_one_initcall+0x48/0x18c) [<c0102b40>] (do_one_initcall) from [<c0e00df0>] (kernel_init_freeable+0x110/0x1d4) [<c0e00df0>] (kernel_init_freeable) from [<c09c8120>] (kernel_init+0x8/0x114) [<c09c8120>] (kernel_init) from [<c01010b4>] (ret_from_fork+0x14/0x20) Fixes: 5d389b12 ("hwmon: (ina2xx) Make calibration register value fixed") Signed-off-by:
Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yazen Ghannam authored
commit 27bd5950 upstream. The block address is saved after the block is initialized when threshold_init_device() is called. Use the saved block address, if available, rather than trying to rediscover it. This will avoid a call trace, when resuming from suspend, due to the rdmsr_safe_on_cpu() call in get_block_address(). The rdmsr_safe_on_cpu() call issues an IPI but we're running with interrupts disabled. This triggers: WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0 Signed-off-by:
Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14.x Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180221101900.10326-8-bp@alien8.deSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yazen Ghannam authored
commit 68627a69 upstream. Currently, bank 4 is reserved on Fam17h, so we chose not to initialize bank 4 in the smca_banks array. This means that when we check if a bank is initialized, like during boot or resume, we will see that bank 4 is not initialized and try to initialize it. This will cause a call trace, when resuming from suspend, due to rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue an IPI but we're running with interrupts disabled. This triggers: WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0 ... Reserved banks will be read-as-zero, so their MCA_IPID register will be zero. So, like the smca_banks array, the threshold_banks array will not have an entry for a reserved bank since all its MCA_MISC* registers will be zero. Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0. Use the "Reserved" type when checking if a bank is reserved. It's possible that other bank numbers may be reserved on future systems. Don't try to find the block address on reserved banks. Signed-off-by:
Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14.x Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.deSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yazen Ghannam authored
commit e5d6a126 upstream. Pass the bank number to smca_get_bank_type() since that's all we need. Also, we should compare the bank number to MAX_NR_BANKS (size of the smca_banks array) not the number of bank types. Bank types are reused for multiple banks, so the number of types can be different from the number of banks in a system and thus we could return an invalid bank type. Signed-off-by:
Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # 4.14.x Cc: <stable@vger.kernel.org> # 4.14.x: 11cf8877 x86/MCE/AMD: Define a function to get SMCA bank type Cc: <stable@vger.kernel.org> # 4.14.x: c6708d50 x86/MCE: Report only DRAM ECC as memory errors on AMD systems Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20180221101900.10326-6-bp@alien8.deSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yazen Ghannam authored
commit c6708d50 upstream. The MCA_STATUS[ErrorCodeExt] field is very bank type specific. We currently check if the ErrorCodeExt value is 0x0 or 0x8 in mce_is_memory_error(), but we don't check the bank number. This means that we could flag non-memory errors as memory errors. We know that we want to flag DRAM ECC errors as memory errors, so let's do those cases first. We can add more cases later when needed. Define a wrapper function in mce_amd.c so we can use SMCA enums. [ bp: Remove brackets around return statements. ] Signed-off-by:
Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171207203955.118171-2-Yazen.Ghannam@amd.comSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sudhir Sreedharan authored
commit 7972326a upstream. This can be reproduced by bind/unbind the driver multiple times in AM3517 board. Analysis revealed that rtl8187_start() was invoked before probe finishes(ie. before the mutex is initialized). INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 821 Comm: wpa_supplicant Not tainted 4.9.80-dirty #250 Hardware name: Generic AM3517 (Flattened Device Tree) [<c010e0d8>] (unwind_backtrace) from [<c010beac>] (show_stack+0x10/0x14) [<c010beac>] (show_stack) from [<c017401c>] (register_lock_class+0x4f4/0x55c) [<c017401c>] (register_lock_class) from [<c0176fe0>] (__lock_acquire+0x74/0x1938) [<c0176fe0>] (__lock_acquire) from [<c0178cfc>] (lock_acquire+0xfc/0x23c) [<c0178cfc>] (lock_acquire) from [<c08aa2f8>] (mutex_lock_nested+0x50/0x3b0) [<c08aa2f8>] (mutex_lock_nested) from [<c05f5bf8>] (rtl8187_start+0x2c/0xd54) [<c05f5bf8>] (rtl8187_start) from [<c082dea0>] (drv_start+0xa8/0x320) [<c082dea0>] (drv_start) from [<c084d1d4>] (ieee80211_do_open+0x2bc/0x8e4) [<c084d1d4>] (ieee80211_do_open) from [<c069be94>] (__dev_open+0xb8/0x120) [<c069be94>] (__dev_open) from [<c069c11c>] (__dev_change_flags+0x88/0x14c) [<c069c11c>] (__dev_change_flags) from [<c069c1f8>] (dev_change_flags+0x18/0x48) [<c069c1f8>] (dev_change_flags) from [<c0710b08>] (devinet_ioctl+0x738/0x840) [<c0710b08>] (devinet_ioctl) from [<c067925c>] (sock_ioctl+0x164/0x2f4) [<c067925c>] (sock_ioctl) from [<c02883f8>] (do_vfs_ioctl+0x8c/0x9d0) [<c02883f8>] (do_vfs_ioctl) from [<c0288da8>] (SyS_ioctl+0x6c/0x7c) [<c0288da8>] (SyS_ioctl) from [<c0107760>] (ret_fast_syscall+0x0/0x1c) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = cd1ec000 [00000000] *pgd=8d1de831, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 821 Comm: wpa_supplicant Not tainted 4.9.80-dirty #250 Hardware name: Generic AM3517 (Flattened Device Tree) task: ce73eec0 task.stack: cd1ea000 PC is at mutex_lock_nested+0xe8/0x3b0 LR is at mutex_lock_nested+0xd0/0x3b0 Cc: stable@vger.kernel.org Signed-off-by:
Sudhir Sreedharan <ssreedharan@mvista.com> Signed-off-by:
Kalle Valo <kvalo@codeaurora.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hans de Goede authored
commit bb5208b3 upstream. Older devices with a serdev attached bcm bt hci, use an Interrupt ACPI resource to describe the IRQ (rather then a GpioInt resource). These device seem to all claim the IRQ is active-high and seem to all need a DMI quirk to treat it as active-low. Instead simply always assume that Interrupt resource specified IRQs are always active-low. This fixes the bt device not being able to wake the host from runtime- suspend on the: Asus T100TAM, Asus T200TA, Lenovo Yoga2 and the Toshiba Encore, without the need to add 4 new DMI quirks for these models. This also allows us to remove 2 DMI quirks for the Asus T100TA and Asus T100CHI series. Likely the 2 remaining quirks can also be removed but I could not find a DSDT of these devices to verify this. Cc: stable@vger.kernel.org Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198953 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1554835Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Szymon Janc authored
commit 082f2300 upstream. Local random address needs to be updated before creating connection if RPA from LE Direct Advertising Report was resolved in host. Otherwise remote device might ignore connection request due to address mismatch. This was affecting following qualification test cases: GAP/CONN/SCEP/BV-03-C, GAP/CONN/GCEP/BV-05-C, GAP/CONN/DCEP/BV-05-C Before patch: < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #11350 [hci0] 84680.231216 Address: 56:BC:E8:24:11:68 (Resolvable) Identity type: Random (0x01) Identity: F2:F1:06:3D:9C:42 (Static) > HCI Event: Command Complete (0x0e) plen 4 #11351 [hci0] 84680.246022 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #11352 [hci0] 84680.246417 Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Own address type: Random (0x01) Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02) > HCI Event: Command Complete (0x0e) plen 4 #11353 [hci0] 84680.248854 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #11354 [hci0] 84680.249466 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #11355 [hci0] 84680.253222 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 18 #11356 [hci0] 84680.458387 LE Direct Advertising Report (0x0b) Num reports: 1 Event type: Connectable directed - ADV_DIRECT_IND (0x01) Address type: Random (0x01) Address: 53:38:DA:46:8C:45 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Direct address type: Random (0x01) Direct address: 7C:D6:76:8C:DF:82 (Resolvable) Identity type: Random (0x01) Identity: F2:F1:06:3D:9C:42 (Static) RSSI: -74 dBm (0xb6) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #11357 [hci0] 84680.458737 Scanning: Disabled (0x00) Filter duplicates: Disabled (0x00) > HCI Event: Command Complete (0x0e) plen 4 #11358 [hci0] 84680.469982 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #11359 [hci0] 84680.470444 Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Filter policy: White list is not used (0x00) Peer address type: Random (0x01) Peer address: 53:38:DA:46:8C:45 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Own address type: Random (0x01) Min connection interval: 30.00 msec (0x0018) Max connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 #11360 [hci0] 84680.474971 LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection Cancel (0x08|0x000e) plen 0 #11361 [hci0] 84682.545385 > HCI Event: Command Complete (0x0e) plen 4 #11362 [hci0] 84682.551014 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #11363 [hci0] 84682.551074 LE Connection Complete (0x01) Status: Unknown Connection Identifier (0x02) Handle: 0 Role: Master (0x00) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Connection interval: 0.00 msec (0x0000) Connection latency: 0 (0x0000) Supervision timeout: 0 msec (0x0000) Master clock accuracy: 0x00 After patch: < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #210 [hci0] 667.152459 Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Own address type: Random (0x01) Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02) > HCI Event: Command Complete (0x0e) plen 4 #211 [hci0] 667.153613 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #212 [hci0] 667.153704 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #213 [hci0] 667.154584 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 18 #214 [hci0] 667.182619 LE Direct Advertising Report (0x0b) Num reports: 1 Event type: Connectable directed - ADV_DIRECT_IND (0x01) Address type: Random (0x01) Address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Direct address type: Random (0x01) Direct address: 7C:C1:57:A5:B7:A8 (Resolvable) Identity type: Random (0x01) Identity: F4:28:73:5D:38:B0 (Static) RSSI: -70 dBm (0xba) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #215 [hci0] 667.182704 Scanning: Disabled (0x00) Filter duplicates: Disabled (0x00) > HCI Event: Command Complete (0x0e) plen 4 #216 [hci0] 667.183599 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #217 [hci0] 667.183645 Address: 7C:C1:57:A5:B7:A8 (Resolvable) Identity type: Random (0x01) Identity: F4:28:73:5D:38:B0 (Static) > HCI Event: Command Complete (0x0e) plen 4 #218 [hci0] 667.184590 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #219 [hci0] 667.184613 Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Filter policy: White list is not used (0x00) Peer address type: Random (0x01) Peer address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Own address type: Random (0x01) Min connection interval: 30.00 msec (0x0018) Max connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 #220 [hci0] 667.186558 LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #221 [hci0] 667.485824 LE Connection Complete (0x01) Status: Success (0x00) Handle: 0 Role: Master (0x00) Peer address type: Random (0x01) Peer address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Master clock accuracy: 0x07 @ MGMT Event: Device Connected (0x000b) plen 13 {0x0002} [hci0] 667.485996 LE Address: 11:22:33:44:55:66 (OUI 11-22-33) Flags: 0x00000000 Data length: 0 Signed-off-by:
Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Al Viro authored
commit 30ce4d19 upstream. missed it in "kill struct filename.separate" several years ago. Cc: stable@vger.kernel.org Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael S. Tsirkin authored
commit c61611f7 upstream. get_user_pages_fast is supposed to be a faster drop-in equivalent of get_user_pages. As such, callers expect it to return a negative return code when passed an invalid address, and never expect it to return 0 when passed a positive number of pages, since its documentation says: * Returns number of pages pinned. This may be fewer than the number * requested. If nr_pages is 0 or negative, returns 0. If no pages * were pinned, returns -errno. When get_user_pages_fast fall back on get_user_pages this is exactly what happens. Unfortunately the implementation is inconsistent: it returns 0 if passed a kernel address, confusing callers: for example, the following is pretty common but does not appear to do the right thing with a kernel address: ret = get_user_pages_fast(addr, 1, writeable, &page); if (ret < 0) return ret; Change get_user_pages_fast to return -EFAULT when supplied a kernel address to make it match expectations. All callers have been audited for consistency with the documented semantics. Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com Fixes: 5b65c467 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()") Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Gorbik authored
commit 15deb080 upstream. When loadparm is set in reipl parm block, the kernel should also set DIAG308_FLAGS_LP_VALID flag. This fixes loadparm ignoring during z/VM fcp -> ccw reipl and kvm direct boot -> ccw reipl. Cc: <stable@vger.kernel.org> Reviewed-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Julian Wiedmann authored
commit 0cf1e051 upstream. On an Output queue, both EMPTY and PENDING buffer states imply that the buffer is ready for completion-processing by the upper-layer drivers. So for a non-QEBSM Output queue, get_buf_states() merges mixed batches of PENDING and EMPTY buffers into one large batch of EMPTY buffers. The upper-layer driver (ie. qeth) later distuingishes PENDING from EMPTY by inspecting the slsb_state for QDIO_OUTBUF_STATE_FLAG_PENDING. But the merge logic in get_buf_states() contains a bug that causes us to erronously also merge ERROR buffers into such a batch of EMPTY buffers (ERROR is 0xaf, EMPTY is 0xa1; so ERROR & EMPTY == EMPTY). Effectively, most outbound ERROR buffers are currently discarded silently and processed as if they had succeeded. Note that this affects _all_ non-QEBSM device types, not just IQD with CQ. Fix it by explicitly spelling out the exact conditions for merging. For extracting the "get initial state" part out of the loop, this relies on the fact that get_buf_states() is never called with a count of 0. The QEBSM path already strictly requires this, and the two callers with variable 'count' make sure of it. Fixes: 104ea556 ("qdio: support asynchronous delivery of storage blocks") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by:
Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by:
Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by:
Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Julian Wiedmann authored
commit dae55b6f upstream. Immediate retry of EQBS after CCQ 96 means that we potentially misreport the state of buffers inspected during the first EQBS call. This occurs when 1. the first EQBS finds all inspected buffers still in the initial state set by the driver (ie INPUT EMPTY or OUTPUT PRIMED), 2. the EQBS terminates early with CCQ 96, and 3. by the time that the second EQBS comes around, the state of those previously inspected buffers has changed. If the state reported by the second EQBS is 'driver-owned', all we know is that the previous buffers are driver-owned now as well. But we can't tell if they all have the same state. So for instance - the second EQBS reports OUTPUT EMPTY, but any number of the previous buffers could be OUTPUT ERROR by now, - the second EQBS reports OUTPUT ERROR, but any number of the previous buffers could be OUTPUT EMPTY by now. Effectively, this can result in both over- and underreporting of errors. If the state reported by the second EQBS is 'HW-owned', that doesn't guarantee that the previous buffers have not been switched to driver-owned in the mean time. So for instance - the second EQBS reports INPUT EMPTY, but any number of the previous buffers could be INPUT PRIMED (or INPUT ERROR) by now. This would result in failure to process pending work on the queue. If it's the final check before yielding initiative, this can cause a (temporary) queue stall due to IRQ avoidance. Fixes: 25f269f1 ("[S390] qdio: EQBS retry after CCQ 96") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by:
Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by:
Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 8d0d8ed3 upstream. Commit 1cf03c00 "nfit: scrub and register regions in a workqueue" mistakenly attempts to register a region per BLK aperture. There is nothing to register for individual apertures as they belong as a set to a BLK aperture group that are registered with a corresponding DIMM-control-region. Filter them for registration to prevent some needless devm_kzalloc() allocations. Cc: <stable@vger.kernel.org> Fixes: 1cf03c00 ("nfit: scrub and register regions in a workqueue") Reviewed-by:
Dave Jiang <dave.jiang@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tetsuo Handa authored
commit 1e047eaa upstream. syzbot is reporting deadlocks at __blkdev_get() [1]. ---------------------------------------- [ 92.493919] systemd-udevd D12696 525 1 0x00000000 [ 92.495891] Call Trace: [ 92.501560] schedule+0x23/0x80 [ 92.502923] schedule_preempt_disabled+0x5/0x10 [ 92.504645] __mutex_lock+0x416/0x9e0 [ 92.510760] __blkdev_get+0x73/0x4f0 [ 92.512220] blkdev_get+0x12e/0x390 [ 92.518151] do_dentry_open+0x1c3/0x2f0 [ 92.519815] path_openat+0x5d9/0xdc0 [ 92.521437] do_filp_open+0x7d/0xf0 [ 92.527365] do_sys_open+0x1b8/0x250 [ 92.528831] do_syscall_64+0x6e/0x270 [ 92.530341] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 92.931922] 1 lock held by systemd-udevd/525: [ 92.933642] #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x73/0x4f0 ---------------------------------------- The reason of deadlock turned out that wait_event_interruptible() in blk_queue_enter() got stuck with bdev->bd_mutex held at __blkdev_put() due to q->mq_freeze_depth == 1. ---------------------------------------- [ 92.787172] a.out S12584 634 633 0x80000002 [ 92.789120] Call Trace: [ 92.796693] schedule+0x23/0x80 [ 92.797994] blk_queue_enter+0x3cb/0x540 [ 92.803272] generic_make_request+0xf0/0x3d0 [ 92.807970] submit_bio+0x67/0x130 [ 92.810928] submit_bh_wbc+0x15e/0x190 [ 92.812461] __block_write_full_page+0x218/0x460 [ 92.815792] __writepage+0x11/0x50 [ 92.817209] write_cache_pages+0x1ae/0x3d0 [ 92.825585] generic_writepages+0x5a/0x90 [ 92.831865] do_writepages+0x43/0xd0 [ 92.836972] __filemap_fdatawrite_range+0xc1/0x100 [ 92.838788] filemap_write_and_wait+0x24/0x70 [ 92.840491] __blkdev_put+0x69/0x1e0 [ 92.841949] blkdev_close+0x16/0x20 [ 92.843418] __fput+0xda/0x1f0 [ 92.844740] task_work_run+0x87/0xb0 [ 92.846215] do_exit+0x2f5/0xba0 [ 92.850528] do_group_exit+0x34/0xb0 [ 92.852018] SyS_exit_group+0xb/0x10 [ 92.853449] do_syscall_64+0x6e/0x270 [ 92.854944] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 92.943530] 1 lock held by a.out/634: [ 92.945105] #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x1e0 ---------------------------------------- The reason of q->mq_freeze_depth == 1 turned out that loop_set_status() forgot to call blk_mq_unfreeze_queue() at error paths for info->lo_encrypt_type != NULL case. ---------------------------------------- [ 37.509497] CPU: 2 PID: 634 Comm: a.out Tainted: G W 4.16.0+ #457 [ 37.513608] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [ 37.518832] RIP: 0010:blk_freeze_queue_start+0x17/0x40 [ 37.521778] RSP: 0018:ffffb0c2013e7c60 EFLAGS: 00010246 [ 37.524078] RAX: 0000000000000000 RBX: ffff8b07b1519798 RCX: 0000000000000000 [ 37.527015] RDX: 0000000000000002 RSI: ffffb0c2013e7cc0 RDI: ffff8b07b1519798 [ 37.529934] RBP: ffffb0c2013e7cc0 R08: 0000000000000008 R09: 47a189966239b898 [ 37.532684] R10: dad78b99b278552f R11: 9332dca72259d5ef R12: ffff8b07acd73678 [ 37.535452] R13: 0000000000004c04 R14: 0000000000000000 R15: ffff8b07b841e940 [ 37.538186] FS: 00007fede33b9740(0000) GS:ffff8b07b8e80000(0000) knlGS:0000000000000000 [ 37.541168] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 37.543590] CR2: 00000000206fdf18 CR3: 0000000130b30006 CR4: 00000000000606e0 [ 37.546410] Call Trace: [ 37.547902] blk_freeze_queue+0x9/0x30 [ 37.549968] loop_set_status+0x67/0x3c0 [loop] [ 37.549975] loop_set_status64+0x3b/0x70 [loop] [ 37.549986] lo_ioctl+0x223/0x810 [loop] [ 37.549995] blkdev_ioctl+0x572/0x980 [ 37.550003] block_ioctl+0x34/0x40 [ 37.550006] do_vfs_ioctl+0xa7/0x6d0 [ 37.550017] ksys_ioctl+0x6b/0x80 [ 37.573076] SyS_ioctl+0x5/0x10 [ 37.574831] do_syscall_64+0x6e/0x270 [ 37.576769] entry_SYSCALL_64_after_hwframe+0x42/0xb7 ---------------------------------------- [1] https://syzkaller.appspot.com/bug?id=cd662bc3f6022c0979d01a262c318fab2ee9b56fSigned-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by:
syzbot <bot+48594378e9851eab70bcd6f99327c7db58c5a28a@syzkaller.appspotmail.com> Fixes: ecdd0959 ("block/loop: fix race between I/O and set_status") Cc: Ming Lei <tom.leiming@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: stable <stable@vger.kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Johansen authored
commit b5beb07a upstream. Resource auditing is using the peer field which is not available when the rlim data struct is used, because it is a different element of the same union. Accessing peer during resource auditing could cause garbage log entries or even oops the kernel. Move the rlim data block into the same struct as the peer field so they can be used together. CC: <stable@vger.kernel.org> Fixes: 86b92cb7 ("apparmor: move resource checks to using labels") Signed-off-by:
John Johansen <john.johansen@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Johansen authored
commit 040d9e2b upstream. The .ns_name should not be virtualized by the current ns view. It needs to report the ns base name as that is being used during startup as part of determining apparmor policy namespace support. BugLink: http://bugs.launchpad.net/bugs/1746463 Fixes: d9f02d9c ("apparmor: fix display of ns name") Cc: Stable <stable@vger.kernel.org> Reported-by:
Serge Hallyn <serge@hallyn.com> Tested-by:
Serge Hallyn <serge@hallyn.com> Signed-off-by:
John Johansen <john.johansen@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Johansen authored
commit 98cf5bbf upstream. The existence test is not being properly logged as the signal mapping maps it to the last entry in the named signal table. This is done to help catch bugs by making the 0 mapped signal value invalid so that we can catch the signal value not being filled in. When fixing the off-by-one comparision logic the reporting of the existence test was broken, because the logic behind the mapped named table was hidden. Fix this by adding a define for the name lookup and using it. Cc: Stable <stable@vger.kernel.org> Fixes: f7dc4c9a ("apparmor: fix off-by-one comparison on MAXMAPPED_SIG") Signed-off-by:
John Johansen <john.johansen@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bill Kuzeja authored
commit 6d634067 upstream. The code that fixes the crashes in the following commit introduced a small memory leak: commit 6a2cf8d3 ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure") Fixing this requires a bit of reworking, which I've explained. Also provide some code cleanup. There is a small window in qla2x00_probe_one where if qla2x00_alloc_queues fails, we end up never freeing req and rsp and leak 0xc0 and 0xc8 bytes respectively (the sizes of req and rsp). I originally put in checks to test for this condition which were based on the incorrect assumption that if ha->rsp_q_map and ha->req_q_map were allocated, then rsp and req were allocated as well. This is incorrect. There is a window between these allocations: ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp); goto probe_hw_failed; [if successful, both rsp and req allocated] base_vha = qla2x00_create_host(sht, ha); goto probe_hw_failed; ret = qla2x00_request_irqs(ha, rsp); goto probe_failed; if (qla2x00_alloc_queues(ha, req, rsp)) { goto probe_failed; [if successful, now ha->rsp_q_map and ha->req_q_map allocated] To simplify this, we should just set req and rsp to NULL after we free them. Sounds simple enough? The problem is that req and rsp are pointers defined in the qla2x00_probe_one and they are not always passed by reference to the routines that free them. Here are paths which can free req and rsp: PATH 1: qla2x00_probe_one ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp); [req and rsp are passed by reference, but if this fails, we currently do not NULL out req and rsp. Easily fixed] PATH 2: qla2x00_probe_one failing in qla2x00_request_irqs or qla2x00_alloc_queues probe_failed: qla2x00_free_device(base_vha); qla2x00_free_req_que(ha, req) qla2x00_free_rsp_que(ha, rsp) PATH 3: qla2x00_probe_one: failing in qla2x00_mem_alloc or qla2x00_create_host probe_hw_failed: qla2x00_free_req_que(ha, req) qla2x00_free_rsp_que(ha, rsp) PATH 1: This should currently work, but it doesn't because rsp and rsp are not set to NULL in qla2x00_mem_alloc. Easily remedied. PATH 2: req and rsp aren't passed in at all to qla2x00_free_device but are derived from ha->req_q_map[0] and ha->rsp_q_map[0]. These are only set up if qla2x00_alloc_queues succeeds. In qla2x00_free_queues, we are protected from crashing if these don't exist because req_qid_map and rsp_qid_map are only set on their allocation. We are guarded in this way: for (cnt = 0; cnt < ha->max_req_queues; cnt++) { if (!test_bit(cnt, ha->req_qid_map)) continue; PATH 3: This works. We haven't freed req or rsp yet (or they were never allocated if qla2x00_mem_alloc failed), so we'll attempt to free them here. To summarize, there are a few small changes to make this work correctly and (and for some cleanup): 1) (For PATH 1) Set *rsp and *req to NULL in case of failure in qla2x00_mem_alloc so these are correctly set to NULL back in qla2x00_probe_one 2) After jumping to probe_failed: and calling qla2x00_free_device, explicitly set rsp and req to NULL so further calls with these pointers do not crash, i.e. the free queue calls in the probe_hw_failed section we fall through to. 3) Fix return code check in the call to qla2x00_alloc_queues. We currently drop the return code on the floor. The probe fails but the caller of the probe doesn't have an error code, so it attaches to pci. This can result in a crash on module shutdown. 4) Remove unnecessary NULL checks in qla2x00_free_req_que, qla2x00_free_rsp_que, and the egregious NULL checks before kfrees and vfrees in qla2x00_mem_free. I tested this out running a scenario where the card breaks at various times during initialization. I made sure I forced every error exit path in qla2x00_probe_one. Cc: <stable@vger.kernel.org> # v4.16 Fixes: 6a2cf8d3 ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure") Signed-off-by:
Bill Kuzeja <william.kuzeja@stratus.com> Acked-by:
Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yazen Ghannam authored
commit 11cf8877 upstream. Scalable MCA systems have various types of banks. The bank's type can determine how we handle errors from it. For example, if a bank represents a UMC (Unified Memory Controller) then we will need to convert its address from a normalized address to a system physical address before handling the error. [ bp: Verify m->bank is within range and use bank pointer. ] Signed-off-by:
Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171207203955.118171-1-Yazen.Ghannam@amd.comSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit c02216ac upstream. In randconfig testing, we sometimes get this warning: drivers/gpu/drm/radeon/radeon_object.c: In function 'radeon_bo_create': drivers/gpu/drm/radeon/radeon_object.c:242:2: error: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Werror=cpp] #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \ This is rather annoying since almost all other code produces no build-time output unless we have found a real bug. We already fixed this in the amdgpu driver in commit 31bb90f1 ("drm/amdgpu: shut up #warning for compile testing") by adding a CONFIG_COMPILE_TEST check last year and agreed to do the same here, but both Michel and I then forgot about it until I came across the issue again now. For stable kernels, as this is one of very few remaining randconfig warnings in 4.14. Cc: stable@vger.kernel.org Link: https://patchwork.kernel.org/patch/9550009/Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Michel Dänzer <michel.daenzer@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Prashant Bhole authored
commit 621b6d2e upstream. A use-after-free bug was caught by KASAN while running usdt related code (BCC project. bcc/tests/python/test_usdt2.py): ================================================================== BUG: KASAN: use-after-free in uprobe_perf_close+0x222/0x3b0 Read of size 4 at addr ffff880384f9b4a4 by task test_usdt2.py/870 CPU: 4 PID: 870 Comm: test_usdt2.py Tainted: G W 4.16.0-next-20180409 #215 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0xc7/0x15b ? show_regs_print_info+0x5/0x5 ? printk+0x9c/0xc3 ? kmsg_dump_rewind_nolock+0x6e/0x6e ? uprobe_perf_close+0x222/0x3b0 print_address_description+0x83/0x3a0 ? uprobe_perf_close+0x222/0x3b0 kasan_report+0x1dd/0x460 ? uprobe_perf_close+0x222/0x3b0 uprobe_perf_close+0x222/0x3b0 ? probes_open+0x180/0x180 ? free_filters_list+0x290/0x290 trace_uprobe_register+0x1bb/0x500 ? perf_event_attach_bpf_prog+0x310/0x310 ? probe_event_disable+0x4e0/0x4e0 perf_uprobe_destroy+0x63/0xd0 _free_event+0x2bc/0xbd0 ? lockdep_rcu_suspicious+0x100/0x100 ? ring_buffer_attach+0x550/0x550 ? kvm_sched_clock_read+0x1a/0x30 ? perf_event_release_kernel+0x3e4/0xc00 ? __mutex_unlock_slowpath+0x12e/0x540 ? wait_for_completion+0x430/0x430 ? lock_downgrade+0x3c0/0x3c0 ? lock_release+0x980/0x980 ? do_raw_spin_trylock+0x118/0x150 ? do_raw_spin_unlock+0x121/0x210 ? do_raw_spin_trylock+0x150/0x150 perf_event_release_kernel+0x5d4/0xc00 ? put_event+0x30/0x30 ? fsnotify+0xd2d/0xea0 ? sched_clock_cpu+0x18/0x1a0 ? __fsnotify_update_child_dentry_flags.part.0+0x1b0/0x1b0 ? pvclock_clocksource_read+0x152/0x2b0 ? pvclock_read_flags+0x80/0x80 ? kvm_sched_clock_read+0x1a/0x30 ? sched_clock_cpu+0x18/0x1a0 ? pvclock_clocksource_read+0x152/0x2b0 ? locks_remove_file+0xec/0x470 ? pvclock_read_flags+0x80/0x80 ? fcntl_setlk+0x880/0x880 ? ima_file_free+0x8d/0x390 ? lockdep_rcu_suspicious+0x100/0x100 ? ima_file_check+0x110/0x110 ? fsnotify+0xea0/0xea0 ? kvm_sched_clock_read+0x1a/0x30 ? rcu_note_context_switch+0x600/0x600 perf_release+0x21/0x40 __fput+0x264/0x620 ? fput+0xf0/0xf0 ? do_raw_spin_unlock+0x121/0x210 ? do_raw_spin_trylock+0x150/0x150 ? SyS_fchdir+0x100/0x100 ? fsnotify+0xea0/0xea0 task_work_run+0x14b/0x1e0 ? task_work_cancel+0x1c0/0x1c0 ? copy_fd_bitmaps+0x150/0x150 ? vfs_read+0xe5/0x260 exit_to_usermode_loop+0x17b/0x1b0 ? trace_event_raw_event_sys_exit+0x1a0/0x1a0 do_syscall_64+0x3f6/0x490 ? syscall_return_slowpath+0x2c0/0x2c0 ? lockdep_sys_exit+0x1f/0xaa ? syscall_return_slowpath+0x1a3/0x2c0 ? lockdep_sys_exit+0x1f/0xaa ? prepare_exit_to_usermode+0x11c/0x1e0 ? enter_from_user_mode+0x30/0x30 random: crng init done ? __put_user_4+0x1c/0x30 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7f41d95f9340 RSP: 002b:00007fffe71e4268 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 000000000000000d RCX: 00007f41d95f9340 RDX: 0000000000000000 RSI: 0000000000002401 RDI: 000000000000000d RBP: 0000000000000000 R08: 00007f41ca8ff700 R09: 00007f41d996dd1f R10: 00007fffe71e41e0 R11: 0000000000000246 R12: 00007fffe71e4330 R13: 0000000000000000 R14: fffffffffffffffc R15: 00007fffe71e4290 Allocated by task 870: kasan_kmalloc+0xa0/0xd0 kmem_cache_alloc_node+0x11a/0x430 copy_process.part.19+0x11a0/0x41c0 _do_fork+0x1be/0xa20 do_syscall_64+0x198/0x490 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Freed by task 0: __kasan_slab_free+0x12e/0x180 kmem_cache_free+0x102/0x4d0 free_task+0xfe/0x160 __put_task_struct+0x189/0x290 delayed_put_task_struct+0x119/0x250 rcu_process_callbacks+0xa6c/0x1b60 __do_softirq+0x238/0x7ae The buggy address belongs to the object at ffff880384f9b480 which belongs to the cache task_struct of size 12928 It occurs because task_struct is freed before perf_event which refers to the task and task flags are checked while teardown of the event. perf_event_alloc() assigns task_struct to hw.target of perf_event, but there is no reference counting for it. As a fix we get_task_struct() in perf_event_alloc() at above mentioned assignment and put_task_struct() in _free_event(). Signed-off-by:
Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Reviewed-by:
Oleg Nesterov <oleg@redhat.com> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 63b6da39 ("perf: Fix perf_event_exit_task() race") Link: http://lkml.kernel.org/r/20180409100346.6416-1-bhole_prashant_q7@lab.ntt.co.jpSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Adrian Hunter authored
commit 91d29b28 upstream. timestamp_insn_cnt is used to estimate the timestamp based on the number of instructions since the last known timestamp. If the estimate is not accurate enough decoding might not be correctly synchronized with side-band events causing more trace errors. However there are always timestamps following an overflow, so the estimate is not needed and can indeed result in more errors. Suppress the estimate by setting timestamp_insn_cnt to zero. Signed-off-by:
Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.comSigned-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-