- 27 May, 2015 9 commits
-
-
mancha security authored
commit 0b053c95 upstream. OPTIMIZER_HIDE_VAR(), as defined when using gcc, is insufficient to ensure protection from dead store optimization. For the random driver and crypto drivers, calls are emitted ...

  $ gdb vmlinux
  (gdb) disassemble memzero_explicit
  Dump of assembler code for function memzero_explicit:
    0xffffffff813a18b0 <+0>:   push   %rbp
    0xffffffff813a18b1 <+1>:   mov    %rsi,%rdx
    0xffffffff813a18b4 <+4>:   xor    %esi,%esi
    0xffffffff813a18b6 <+6>:   mov    %rsp,%rbp
    0xffffffff813a18b9 <+9>:   callq  0xffffffff813a7120 <memset>
    0xffffffff813a18be <+14>:  pop    %rbp
    0xffffffff813a18bf <+15>:  retq
  End of assembler dump.

  (gdb) disassemble extract_entropy
  [...]
    0xffffffff814a5009 <+313>: mov    %r12,%rdi
    0xffffffff814a500c <+316>: mov    $0xa,%esi
    0xffffffff814a5011 <+321>: callq  0xffffffff813a18b0 <memzero_explicit>
    0xffffffff814a5016 <+326>: mov    -0x48(%rbp),%rax
  [...]

... but should we use facilities such as LTO in the future, OPTIMIZER_HIDE_VAR() is not sufficient to prevent gcc from possibly evicting the memset(). We have to use a compiler barrier instead.

Minimal test example when we assume memzero_explicit() would *not* be a call, but would have been *inlined* instead:

  static inline void memzero_explicit(void *s, size_t count)
  {
          memset(s, 0, count);
          <foo>
  }

  int main(void)
  {
          char buff[20];

          snprintf(buff, sizeof(buff) - 1, "test");
          printf("%s", buff);
          memzero_explicit(buff, sizeof(buff));
          return 0;
  }

With <foo> := OPTIMIZER_HIDE_VAR():

  (gdb) disassemble main
  Dump of assembler code for function main:
  [...]
    0x0000000000400464 <+36>:  callq  0x400410 <printf@plt>
    0x0000000000400469 <+41>:  xor    %eax,%eax
    0x000000000040046b <+43>:  add    $0x28,%rsp
    0x000000000040046f <+47>:  retq
  End of assembler dump.

With <foo> := barrier():

  (gdb) disassemble main
  Dump of assembler code for function main:
  [...]
    0x0000000000400464 <+36>:  callq  0x400410 <printf@plt>
    0x0000000000400469 <+41>:  movq   $0x0,(%rsp)
    0x0000000000400471 <+49>:  movq   $0x0,0x8(%rsp)
    0x000000000040047a <+58>:  movl   $0x0,0x10(%rsp)
    0x0000000000400482 <+66>:  xor    %eax,%eax
    0x0000000000400484 <+68>:  add    $0x28,%rsp
    0x0000000000400488 <+72>:  retq
  End of assembler dump.

As can be seen, the movq, movq, movl stores are emitted via the inlined memset().

Reference: http://thread.gmane.org/gmane.linux.kernel.cryptoapi/13764/
Fixes: d4c5efdb ("random: add and use memzero_explicit() for clearing data")
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: mancha security <mancha1@zoho.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
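The end state is small; a minimal sketch of the barrier-based helper, assuming the kernel's gcc memory-clobber barrier (simplified from lib/string.c):

    /* gcc memory clobber: tells the optimizer that memory may be read
     * or written here, so the preceding memset() cannot be elided */
    #define barrier() __asm__ __volatile__("" : : : "memory")

    void memzero_explicit(void *s, size_t count)
    {
            memset(s, 0, count);
            barrier();
    }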
-
Nicolas Iooss authored
commit a3fa71c4 upstream. In struct wl18xx_acx_rx_rate_stat, rx_frames_per_rates field is an array, not a number. This means WL18XX_DEBUGFS_FWSTATS_FILE can't be used to display this field in debugfs (it would display a pointer, not the actual data). Use WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY instead. This bug has been found by adding a __printf attribute to wl1271_format_buffer. gcc complained about "format '%u' expects argument of type 'unsigned int', but argument 5 has type 'u32 *'". Fixes: c5d94169 ("wl18xx: use new fw stats structures") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
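The underlying C pitfall, illustrated with a self-contained userspace analogue (struct and sizes here are invented for illustration, not the driver's definitions): passing an array field to a scalar "%u" format hands printf a decayed pointer, which is exactly what gcc's __printf check flagged.

    #include <stdio.h>

    struct fw_stats {
            unsigned int rx_frames;              /* scalar: "%u" is fine */
            unsigned int rx_frames_per_rates[8]; /* array: decays to u32 * */
    };

    int main(void)
    {
            struct fw_stats s = { .rx_frames = 42 };
            unsigned int i;

            printf("%u\n", s.rx_frames);
            /* printf("%u\n", s.rx_frames_per_rates); would print a pointer;
             * an _ARRAY-style helper iterates the elements instead: */
            for (i = 0; i < 8; i++)
                    printf("%u ", s.rx_frames_per_rates[i]);
            printf("\n");
            return 0;
    }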
-
Sabrina Dubroca authored
commit 08e83316 upstream. There is a race condition between e1000_change_mtu's cleanups and netpoll, when we change the MTU across jumbo size:

Changing MTU frees all the rx buffers:
  e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
    e1000_clean_rx_ring

Then, close to the end of e1000_change_mtu:
  pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
    e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag

And when we come back to do the rest of the MTU change:
  e1000_up -> e1000_configure -> e1000_configure_rx ->
    e1000_alloc_jumbo_rx_buffers

alloc_jumbo finds the buffers already != NULL, since data (shared with page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage, or at least not what is expected when in jumbo state. This results in an unusable adapter (packets don't get through), and a NULL pointer dereference on the next call to e1000_clean_rx_ring (other mtu change, link down, shutdown):

  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
  [...]
  Call Trace:
   [<ffffffff81195445>] put_page+0x55/0x60
   [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
   [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
   [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
   [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
   [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
   [<ffffffff81647050>] dev_set_mtu+0xa0/0x140
   [<ffffffff81664218>] do_setlink+0x218/0xac0
   [<ffffffff814459e9>] ? nla_parse+0xb9/0x120
   [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
   [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
   [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
   [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260

By setting the allocator to a dummy version, netpoll can't mess up our rx buffers. The allocator is set back to a sane value in e1000_configure_rx.

Fixes: edbbb3ca ("e1000: implement jumbo receive with partial descriptors")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
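The shape of the fix, sketched from the commit description (function names as described there; details simplified):

    /* do-nothing allocator: anything netpoll triggers while the device
     * is mid-MTU-change simply does not refill rx buffers */
    static void e1000_alloc_dummy_rx_buffers(struct e1000_adapter *adapter,
                                             struct e1000_rx_ring *rx_ring,
                                             int cleaned_count)
    {
    }

    /* in e1000_change_mtu(), before tearing the rings down: */
    adapter->alloc_rx_buf = e1000_alloc_dummy_rx_buffers;
    /* e1000_configure_rx() later restores the real allocator */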
-
Jeff Layton authored
commit 5d05e54a upstream. Chuck pointed out a problem that crept in with commit 6ffa30d3 ("nfs: don't call blocking operations while !TASK_RUNNING"). Linux counts tasks in uninterruptible sleep against the load average, so this caused the system's load average to be pinned at a minimum of 1 whenever an NFSv4.1+ mount was active. Not a huge problem, but it's probably worth fixing before we get too many complaints about it. This patch converts the code back to using TASK_INTERRUPTIBLE sleep and simply has it flush any signals on each loop iteration. In practice no one should really be signalling this thread at all, so I think this is reasonably safe. With this change, there's also no need to game the hung task watchdog, so we can also convert the schedule_timeout call back to a normal schedule. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 6ffa30d3 ("nfs: don't call blocking operations while !TASK_RUNNING") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
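Sketched, the resulting wait loop looks roughly like this (a simplification of the callback thread; the have_callback_work()/process_one_callback() helpers are illustrative, not the real function names):

    while (!kthread_should_stop()) {
            prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE);
            if (have_callback_work(serv)) {         /* illustrative helper */
                    finish_wait(&serv->sv_cb_waitq, &wq);
                    process_one_callback(serv);     /* illustrative helper */
            } else {
                    schedule();     /* interruptible: not counted in loadavg */
                    finish_wait(&serv->sv_cb_waitq, &wq);
            }
            flush_signals(current); /* nobody should signal us; discard any */
    }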
-
Jeff Layton authored
commit 6ffa30d3 upstream. Bruce reported seeing this warning pop when mounting using v4.1:

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 1121 at kernel/sched/core.c:7300 __might_sleep+0xbd/0xd0()
  do not call blocking ops when !TASK_RUNNING; state=1 set at
    [<ffffffff810ff58f>] prepare_to_wait+0x2f/0x90
  Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd
    grace sunrpc fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack
    ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
    ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
    ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat
    nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
    iptable_mangle iptable_security iptable_raw snd_hda_codec_generic
    snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer
    ppdev joydev snd virtio_console virtio_balloon pcspkr serio_raw parport_pc
    parport pvpanic floppy soundcore i2c_piix4 virtio_blk virtio_net qxl
    drm_kms_helper ttm drm virtio_pci virtio_ring ata_generic virtio pata_acpi
  CPU: 1 PID: 1121 Comm: nfsv4.1-svc Not tainted 3.19.0-rc4+ #25
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
   0000000000000000 000000004e5e3f73 ffff8800b998fb48 ffffffff8186ac78
   0000000000000000 ffff8800b998fba0 ffff8800b998fb88 ffffffff810ac9da
   ffff8800b998fb68 ffffffff81c923e7 00000000000004d9 0000000000000000
  Call Trace:
   [<ffffffff8186ac78>] dump_stack+0x4c/0x65
   [<ffffffff810ac9da>] warn_slowpath_common+0x8a/0xc0
   [<ffffffff810aca65>] warn_slowpath_fmt+0x55/0x70
   [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90
   [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90
   [<ffffffff810dd2ad>] __might_sleep+0xbd/0xd0
   [<ffffffff8124c973>] kmem_cache_alloc_trace+0x243/0x430
   [<ffffffff810d941e>] ? groups_alloc+0x3e/0x130
   [<ffffffff810d941e>] groups_alloc+0x3e/0x130
   [<ffffffffa0301b1e>] svcauth_unix_accept+0x16e/0x290 [sunrpc]
   [<ffffffffa0300571>] svc_authenticate+0xe1/0xf0 [sunrpc]
   [<ffffffffa02fc564>] svc_process_common+0x244/0x6a0 [sunrpc]
   [<ffffffffa02fd044>] bc_svc_process+0x1c4/0x260 [sunrpc]
   [<ffffffffa03d5478>] nfs41_callback_svc+0x128/0x1f0 [nfsv4]
   [<ffffffff810ff970>] ? wait_woken+0xc0/0xc0
   [<ffffffffa03d5350>] ? nfs4_callback_svc+0x60/0x60 [nfsv4]
   [<ffffffff810d45bf>] kthread+0x11f/0x140
   [<ffffffff810ea815>] ? local_clock+0x15/0x30
   [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250
   [<ffffffff81874bfc>] ret_from_fork+0x7c/0xb0
   [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250
  ---[ end trace 675220a11e30f4f2 ]---

nfs41_callback_svc does most of its work while in TASK_INTERRUPTIBLE, which is just wrong. Fix that by finishing the wait immediately if we've found that the list has something on it. Also, we don't expect this kthread to accept signals, so we should be using a TASK_UNINTERRUPTIBLE sleep instead. That, however, opens us up to hung task warnings from the watchdog, so have the schedule_timeout wake up every 60s if there's no callback activity.

Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Len Brown authored
sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance

commit b253149b upstream. In Linux-3.9 we removed the mwait_idle() loop: 69fb3676 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param"). The reasoning was that modern machines should be sufficiently happy during the boot process using the default_idle() HALT loop, until cpuidle loads and either acpi_idle or intel_idle invoke the newer MWAIT-with-hints idle loop. But two machines reported problems:

1. Certain Core2-era machines support MWAIT-C1 and HALT only. MWAIT-C1 is preferred for optimal power and performance. But if they support just C1, cpuidle never loads and so they use the boot-time default idle loop forever.

2. Some laptops will boot-hang if HALT is used, but will boot successfully if MWAIT is used. This appears to be a hidden assumption in BIOS SMI, that is presumably valid on the proprietary OS where the BIOS was validated. https://bugzilla.kernel.org/show_bug.cgi?id=60770

So here we effectively revert the patch above, restoring the mwait_idle() loop. However, we don't bother restoring the idle=mwait cmdline parameter, since it appears to add no value.

Maintainer notes:
- For 3.9, simply revert 69fb3676
- For 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in context
- For 3.11, 3.12, 3.13, this patch applies cleanly

Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Mike Galbraith <bitbucket@online.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Malone <ibmalone@gmail.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
[ Ported to recent kernels. ]
[ Mike: 3.10 backport ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
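A simplified sketch of the restored loop's core, assuming the x86 MONITOR/MWAIT primitives from arch/x86 (the actual patch adds extra quirk handling, e.g. for the CLFLUSH-before-MONITOR erratum):

    static void mwait_idle(void)
    {
            if (!need_resched()) {
                    /* arm MONITOR on this task's flags word so that a
                     * remote set of TIF_NEED_RESCHED wakes us from MWAIT */
                    __monitor((void *)&current_thread_info()->flags, 0, 0);
                    smp_mb();
                    if (!need_resched())
                            __sti_mwait(0, 0); /* C1 MWAIT, IRQs on at wake */
                    else
                            local_irq_enable();
            } else {
                    local_irq_enable();
            }
    }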
-
Krzysztof Kozlowski authored
commit 1915a718 upstream. The return value of power_supply_register() call was not checked and even on error probe() function returned 0. If registering failed then during unbind the driver tried to unregister power supply which was not actually registered. This could lead to memory corruption because power_supply_unregister() unconditionally cleans up given power supply. Fix this by checking return status of power_supply_register() call. In case of failure, clean up sysfs entries and fail the probe. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: 9be0fcb5 ("compal-laptop: add JHL90, battery & hwmon interface") Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz> [backport to 3.12] Signed-off-by: Kamal Mostafa <kamal@canonical.com>
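The shape of the fix, sketched (error-path label and attribute-group name here are illustrative, not the driver's exact identifiers):

    ret = power_supply_register(&client->dev, &data->psy);
    if (ret < 0)
            goto err_sysfs;  /* registration failed: clean up, fail probe */

    return 0;

    err_sysfs:
            /* undo the sysfs entries created earlier in probe() */
            sysfs_remove_group(&client->dev.kobj, &compal_attr_group);
            return ret;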
-
Al Viro authored
commit 3cab989a upstream. Calling unlazy_walk() in walk_component() and do_last() when we find a symlink that needs to be followed doesn't acquire a reference to vfsmount. That's fine when the symlink is on the same vfsmount as the parent directory (which is almost always the case), but it's not always true - one _can_ manage to bind a symlink on top of something. And in such cases we end up with excessive mntput(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Dmitry Torokhov authored
commit 9535c475 upstream. The hardware, according to the specs, is limited to 256-byte transfers, and the current driver has no protection in case users attempt to do larger transfers. The code will just stomp over the status register and mayhem ensues. Let's split larger transfers into digestible chunks. Doing this allows the Atmel MXT driver on Pixel 1 to function properly (it hasn't since commit 9d8dc3e5 "Input: atmel_mxt_ts - implement T44 message handling", which tries to consume multiple touchscreen/touchpad reports in a single transaction). Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
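The general shape of such chunking, sketched (the 256-byte cap is the hardware's per-transaction limit from the commit; the macro and do_one_xfer() helper are illustrative):

    #define GMBUS_MAX_XFER  256     /* hardware per-transaction limit */

    static int xfer_in_chunks(u8 *buf, size_t len)
    {
            while (len) {
                    size_t chunk = min_t(size_t, len, GMBUS_MAX_XFER);
                    int ret = do_one_xfer(buf, chunk); /* illustrative */

                    if (ret)
                            return ret;
                    buf += chunk;
                    len -= chunk;
            }
            return 0;
    }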
-
- 26 May, 2015 31 commits
-
-
James Bottomley authored
commit 56cbd0cc upstream. mvsas is giving a General protection fault when it encounters an expander attached ATA device. Analysis of mvs_task_prep_ata() shows that the driver is assuming all ATA devices are locally attached and obtaining the phy mask by indexing the local phy table (in the HBA structure) with the phy id. Since expanders have many more phys than the HBA, this is causing the index into the HBA phy table to overflow and returning rubbish as the pointer. mvs_task_prep_ssp() instead does the phy mask using the port properties. Mirror this in mvs_task_prep_ata() to fix the panic. Reported-by: Adam Talbot <ajtalbot1@gmail.com> Tested-by: Adam Talbot <ajtalbot1@gmail.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Oleg Nesterov authored
commit b72c1869 upstream. ptrace_resume() is called when the tracee is still __TASK_TRACED. We set tracee->exit_code and then wake_up_state() changes tracee->state. If the tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T) wrongly looks like another report from the tracee. This confuses the debugger, and since wait_task_stopped() clears ->exit_code the tracee can miss a signal.

Test-case:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <sys/ptrace.h>
    #include <pthread.h>
    #include <assert.h>

    int pid;

    void *waiter(void *arg)
    {
            int stat;

            for (;;) {
                    assert(pid == wait(&stat));
                    assert(WIFSTOPPED(stat));
                    if (WSTOPSIG(stat) == SIGHUP)
                            continue;
                    assert(WSTOPSIG(stat) == SIGCONT);
                    printf("ERR! extra/wrong report:%x\n", stat);
            }
    }

    int main(void)
    {
            pthread_t thread;

            pid = fork();
            if (!pid) {
                    assert(ptrace(PTRACE_TRACEME, 0, 0, 0) == 0);
                    for (;;)
                            kill(getpid(), SIGHUP);
            }

            assert(pthread_create(&thread, NULL, waiter, NULL) == 0);
            for (;;)
                    ptrace(PTRACE_CONT, pid, 0, SIGCONT);

            return 0;
    }

Note for stable: the bug is very old, but without 9899d11f ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL") the fix should use lock_task_sighand(child).

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Pavel Labath <labath@google.com>
Tested-by: Pavel Labath <labath@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Yann Droneaud authored
commit 66578b0b upstream. In a call to ib_umem_get(), if the address is 0x0 and the size is already page aligned, the check added in commit 8494057a ("IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic") will refuse to register a memory region that could otherwise be valid (provided the vm.mmap_min_addr sysctl and mmap_low_allowed SELinux knobs allow userspace to map something at address 0x0). This patch allows such registration again: ib_umem_get() should probably not care about the base address, provided the memory can be pinned with get_user_pages(). There are two possible overflows, in (addr + size) and in PAGE_ALIGN(addr + size); this patch keeps ensuring neither of them happens, while allowing memory to be pinned at address 0x0. In any case, a size equal to 0 is no longer (partially) handled here, as 0-length memory regions are disallowed by an earlier check. Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com Cc: Shachar Raindel <raindel@mellanox.com> Cc: Jack Morgenstein <jackm@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Yann Droneaud authored
commit 8abaae62 upstream. If ib_umem_get() is called with a size equal to 0 and a non-page-aligned address, one page will be pinned and a 0-sized umem will be returned to the caller. This should not be allowed: it's not expected for a memory region to have a size equal to 0. This patch adds a check to explicitly refuse to register a 0-sized region. Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com Cc: Shachar Raindel <raindel@mellanox.com> Cc: Jack Morgenstein <jackm@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
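Taken together with the commit above, the entry checks in ib_umem_get() end up shaped roughly like this (a sketch built from the two commit descriptions, not the exact upstream diff):

    /* 0-sized regions are never valid */
    if (!size)
            return ERR_PTR(-EINVAL);

    /* reject integer overflow in the address arithmetic, but
     * deliberately allow addr == 0 with a page-aligned size */
    if (((addr + size) < addr) ||
        PAGE_ALIGN(addr + size) < (addr + size))
            return ERR_PTR(-EINVAL);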
-
Michael Davidson authored
commit a87938b2 upstream. With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down address allocation strategy, load_elf_binary() will attempt to map a PIE binary into an address range immediately below mm->mmap_base. Unfortunately, load_elf_binary() does not take account of the need to allocate sufficient space for the entire binary, which means that, while the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent PT_LOAD segment(s) end up being mapped above mm->mmap_base into the area that is supposed to be the "gap" between the stack and the binary. Since the size of the "gap" on x86_64 is only guaranteed to be 128MB, this means that binaries with large data segments > 128MB can end up mapping part of their data segment over their stack, resulting in corruption of the stack (and of the data segment once the binary starts to run). Any PIE binary with a data segment > 128MB is vulnerable to this, although address randomization means that the actual gap between the stack and the end of the binary is normally greater than 128MB. The larger the data segment of the binary, the higher the probability of failure. Fix this by calculating the total size of the binary in the same way as load_elf_interp(). Signed-off-by: Michael Davidson <md@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
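A sketch of the fix's shape in load_elf_binary(), recalled rather than quoted (the helper mirrors what load_elf_interp() already did):

    /* reserve address space for the whole image, not just the first
     * PT_LOAD segment, so later segments can't land past mmap_base */
    total_size = total_mapping_size(elf_phdata, loc->elf_ex.e_phnum);

    map_addr = elf_map(bprm->file, load_bias + vaddr, elf_ppnt,
                       elf_prot, elf_flags, total_size);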
-
Gerald Schaefer authored
commit 97534127 upstream. Commit 61f77eda ("mm/hugetlb: reduce arch dependent code around follow_huge_*") broke follow_huge_pmd() on s390, where pmd and pte layout differ and using pte_page() on a huge pmd will return wrong results. Using pmd_page() instead fixes this. All architectures that were touched by that commit have pmd_page() defined, so this should not break anything on other architectures. Fixes: 61f77eda "mm/hugetlb: reduce arch dependent code around follow_huge_*" Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.cz>, Andrea Arcangeli <aarcange@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
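The essence of the fix is a one-line accessor change in follow_huge_pmd(), roughly (a sketch; the offset arithmetic follows the surrounding code):

    /* use the pmd accessor: on s390 the pmd and pte layouts differ,
     * so pte_page() on a huge pmd returns a bogus page pointer */
    page = pmd_page(*pmd) + ((address & ~PMD_MASK) >> PAGE_SHIFT);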
-
Nicholas Bellinger authored
commit c8e63985 upstream. This patch fixes a bug for COMPARE_AND_WRITE handling with fabrics using SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC. It adds the missing allocation for cmd->t_bidi_data_sg within transport_generic_new_cmd() that is used by COMPARE_AND_WRITE for the initial READ payload, even if the fabric is already providing a pre-allocated buffer for cmd->t_data_sg. Also, fix zero-length COMPARE_AND_WRITE handling within the compare_and_write_callback() and target_complete_ok_work() to queue the response, skipping the initial READ. This fixes COMPARE_AND_WRITE emulation with loopback, vhost, and xen-backend fabric drivers using SG_TO_MEM_NOALLOC. Reported-by: Christoph Hellwig <hch@lst.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Daniel Vetter authored
commit 37ef01ab upstream. We stopped handling them in

  commit aaecdf61
  Author: Daniel Vetter <daniel.vetter@ffwll.ch>
  Date:   Tue Nov 4 15:52:22 2014 +0100

      drm/i915: Stop gathering error states for CS error interrupts

but just clearing is apparently not enough: A sufficiently dead gpu left behind by firmware (*cough* coreboot *cough*) can keep the gpu in an endless loop of such interrupts, eventually leading to the nmi firing. And definitely to what looks like a machine hang. Since we don't even enable these interrupts on gen5+, let's do the same on earlier platforms.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93171
Tested-by: Mono <mono-for-kernel-org@donderklumpen.de>
Tested-by: info@gluglug.org.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Lv Zheng authored
commit 2b876010 upstream. ACPICA commit aacf863cfffd46338e268b7415f7435cae93b451. It is reported that on a physically 64-bit addressed machine, a 32-bit kernel can trigger crashes in accessing memory regions that are beyond the 32-bit boundary. The region field's start address should still be 32-bit compliant, but after a calculation (adding some offsets), it may exceed the 32-bit boundary. This case is rare and buggy, but there are real BIOSes leaked with such issues (see References below). This patch fixes this gap by always defining IO addresses as 64-bit, and allows OSPMs to optimize it for a real 32-bit machine to reduce the size of the internal objects. Internal acpi_physical_address usages in the structures that can be fixed by this change include:

1. struct acpi_object_region:
   acpi_physical_address address;
2. struct acpi_address_range:
   acpi_physical_address start_address;
   acpi_physical_address end_address;
3. struct acpi_mem_space_context:
   acpi_physical_address address;
4. struct acpi_table_desc:
   acpi_physical_address address;

See known issue 1 for other usages. Note that acpi_io_address, which is used for ACPI_PROCESSOR, may also suffer from the same problem, so this patch changes it accordingly. For iasl, it will enforce acpi_physical_address as 32-bit to generate 32-bit OSPM compatible tables on 32-bit platforms; we need to define ACPI_32BIT_PHYSICAL_ADDRESS for it in acenv.h.

Known issues:

1. Cleanup of mapped virtual address. In struct acpi_mem_space_context, acpi_physical_address is used as a virtual address:
   acpi_physical_address mapped_physical_address;
   It is better to introduce acpi_virtual_address or use acpi_size instead. This patch doesn't make such a change, because this should be done along with a change to acpi_os_map_memory()/acpi_os_unmap_memory(). There should be no functional problem in leaving this unchanged, except that only this structure is enlarged unexpectedly.

Link: https://github.com/acpica/acpica/commit/aacf863c
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=87971
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=79501
Reported-and-tested-by: Paul Menzel <paulepanter@users.sourceforge.net>
Reported-and-tested-by: Sial Nije <sialnije@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Anton Blanchard authored
commit 9a5cbce4 upstream. We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH (currently 127), but we forgot to do the same for 64bit backtraces. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
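Sketched, the fix gives the 64-bit walker the same bound the 32-bit one already had (recalled from the powerpc perf callchain code; treat helper names and the saved-LR frame offset as approximate):

    /* cap the walk: previously the 64-bit path followed the frame
     * chain without any depth limit */
    while (entry->nr < PERF_MAX_STACK_DEPTH) {
            fp = (unsigned long __user *)sp;
            if (!valid_user_sp(sp, 1) || read_user_stack_64(fp, &next_sp))
                    return;
            if (read_user_stack_64(&fp[2], &lr))    /* saved-LR slot */
                    return;
            perf_callchain_store(entry, lr);
            sp = next_sp;
    }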
-
Filipe Manana authored
commit ccccf3d6 upstream. If we attempt to clone a 0 length region into a file we can end up inserting a range in the inode's extent_io tree with a start offset that is greater than the end offset, which immediately triggers the following warning:

  [ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
  [ 3914.620886] BTRFS: end < start 4095 4096
  (...)
  [ 3914.638093] Call Trace:
  [ 3914.638636]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
  [ 3914.639620]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
  [ 3914.640789]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
  [ 3914.642041]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
  [ 3914.643236]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
  [ 3914.644441]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
  [ 3914.645711]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
  [ 3914.646914]  [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33
  [ 3914.648058]  [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs]
  [ 3914.650105]  [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs]
  [ 3914.651361]  [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs]
  [ 3914.652761]  [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs]
  [ 3914.654128]  [<ffffffff811226dd>] ? might_fault+0x58/0xb5
  [ 3914.655320]  [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs]
  (...)
  [ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]---

This later makes the inode eviction handler enter an infinite loop that keeps dumping the following warning over and over:

  [ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
  [ 3915.119913] BTRFS: end < start 4095 4096
  (...)
  [ 3915.137394] Call Trace:
  [ 3915.137913]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
  [ 3915.139154]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
  [ 3915.140316]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
  [ 3915.141505]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
  [ 3915.142709]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
  [ 3915.143849]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
  [ 3915.145120]  [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs]
  [ 3915.146352]  [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50
  [ 3915.147565]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
  [ 3915.148785]  [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33
  [ 3915.149931]  [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs]
  [ 3915.151154]  [<ffffffff81168904>] evict+0xa0/0x148
  [ 3915.152094]  [<ffffffff811689e5>] dispose_list+0x39/0x43
  [ 3915.153081]  [<ffffffff81169564>] evict_inodes+0xdc/0xeb
  [ 3915.154062]  [<ffffffff81154418>] generic_shutdown_super+0x49/0xef
  [ 3915.155193]  [<ffffffff811546d1>] kill_anon_super+0x13/0x1e
  [ 3915.156274]  [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
  (...)
  [ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]---

So just bail out of the clone ioctl if the length of the region to clone is zero, without locking any extent range, in order to prevent this issue (same behaviour as a pwrite with a 0 length, for example). This is trivial to reproduce. For example, the steps for the test I just made for fstests:

  mkfs.btrfs -f SCRATCH_DEV
  mount SCRATCH_DEV $SCRATCH_MNT
  touch $SCRATCH_MNT/foo
  touch $SCRATCH_MNT/bar
  $CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
  umount $SCRATCH_MNT

A test case for fstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
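The guard itself is tiny; shaped roughly like this in the clone ioctl, placed before any extent range is locked (a sketch of the check the commit describes):

    /* a 0-length clone is a no-op, like a 0-length pwrite; bail out
     * before locking an inverted [off, off + len - 1] range */
    if (len == 0)
            return 0;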
-
Filipe Manana authored
commit 113e8283 upstream. If we pass a length of 0 to the extent_same ioctl, we end up locking an extent range with a start offset greater than its end offset (if the destination file's offset is greater than zero). This results in a warning from extent_io.c:insert_state through the following call chain:

  btrfs_extent_same()
    btrfs_double_lock()
      lock_extent_range()
        lock_extent(inode->io_tree, offset, offset + len - 1)
          lock_extent_bits()
            __set_extent_bit()
              insert_state()
                --> WARN_ON(end < start)

This leads to an infinite loop when evicting the inode. This is the same problem that my previous patch titled "Btrfs: fix inode eviction infinite loop after cloning into it" addressed, but for the extent_same ioctl instead of the clone ioctl.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Heiko Carstens authored
commit d7441949 upstream. Sebastian reported a crash caused by a jump label mismatch after resume. This happens because we do not save the kernel text section during suspend and therefore also do not restore it during resume, but use the kernel image that restores the old system. This means that after a suspend/resume cycle we lost all modifications done to the kernel text section. The reason for this is the pfn_is_nosave() function, which incorrectly returns that read-only pages don't need to be saved. This is incorrect since we mark the kernel text section read-only. We still need to make sure not to save and restore pages contained within NSS and DCSS segments. To fix this, add an extra case for the kernel text section and only save those pages if they are not contained within an NSS segment. Fixes the following crash (and the above bugs as well):

  Jump label code mismatch at netif_receive_skb_internal+0x28/0xd0
  Found:    c0 04 00 00 00 00
  Expected: c0 f4 00 00 00 11
  New:      c0 04 00 00 00 00
  Kernel panic - not syncing: Corrupted kernel text
  CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.19.0-01975-gb1b096e70f23 #4
  Call Trace:
    [<0000000000113972>] show_stack+0x72/0xf0
    [<000000000081f15e>] dump_stack+0x6e/0x90
    [<000000000081c4e8>] panic+0x108/0x2b0
    [<000000000081be64>] jump_label_bug.isra.2+0x104/0x108
    [<0000000000112176>] __jump_label_transform+0x9e/0xd0
    [<00000000001121e6>] __sm_arch_jump_label_transform+0x3e/0x50
    [<00000000001d1136>] multi_cpu_stop+0x12e/0x170
    [<00000000001d1472>] cpu_stopper_thread+0xb2/0x168
    [<000000000015d2ac>] smpboot_thread_fn+0x134/0x1b0
    [<0000000000158baa>] kthread+0x10a/0x110
    [<0000000000824a86>] kernel_thread_starter+0x6/0xc

Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Max Filippov authored
commit 24e94454 upstream.

- don't lock lp->lock in the iss_net_timer for the call of iss_net_poll, it will lock it itself;
- invert order of lp->lock and opened_lock acquisition in the iss_net_open to make it consistent with iss_net_poll;
- replace spin_lock with spin_lock_bh when acquiring locks used in iss_net_timer from non-atomic context;
- replace spin_lock_irqsave with spin_lock_bh in the iss_net_start_xmit as the driver doesn't use lp->lock in the hard IRQ context;
- replace __SPIN_LOCK_UNLOCKED(lp.lock) with spin_lock_init, otherwise lockdep is unhappy about using non-static key.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Michael Gernoth authored
commit 91bf0c2d upstream. The functions snd_emu10k1_proc_spdif_read and snd_emu1010_fpga_read acquire the emu_lock before accessing the FPGA. The function used to access the FPGA (snd_emu1010_fpga_read) also tries to take the emu_lock which causes a deadlock. Remove the outer locking in the proc-functions (guarding only the already safe fpga read) to prevent this deadlock. [removed superfluous flags variables too -- tiwai] Signed-off-by: Michael Gernoth <michael@gernoth.net> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
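The deadlock is a classic nested acquisition of a non-recursive spinlock; sketched below with the names from the commit message (the proc handler body is illustrative):

    /* before the fix: proc read handler */
    spin_lock_irqsave(&emu->emu_lock, flags);
    snd_emu1010_fpga_read(emu, reg, &val);  /* also takes emu_lock: deadlock */
    spin_unlock_irqrestore(&emu->emu_lock, flags);

    /* after the fix: rely on the fpga accessor's own locking */
    snd_emu1010_fpga_read(emu, reg, &val);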
-
K. Y. Srinivasan authored
commit 8de58074 upstream. We may exit this function without properly freeing up the mappings we may have acquired. Fix the bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Gregory CLEMENT authored
commit 61819549 upstream. Level IRQ handlers and edge IRQ handlers are managed by two different sets of registers, but currently the driver uses the same mask for both registers. This leads to issues with the following scenario: first, an IRQ is requested on a GPIO to be triggered on an edge; after this, another IRQ is requested for a GPIO of the same bank but triggered on level. The first one will then also be set up to be triggered on level, which leads to an interrupt storm. The two kinds of handler are already associated with two different irq chip types. With this patch the driver uses a private mask for each one, which solves this issue. It has been tested on an Armada XP based board and on an Armada 375 board. For both boards, with this patch applied, there is no such interrupt storm when running the previous scenario. This bug was already fixed, in a different way, in the legacy version of this driver by Evgeniy Dushistov: 9ece8839 "ARM: orion: Fix for certain sequence of request_irq can cause irq storm". The fact that the new version of the gpio driver could be affected had been discussed there: http://thread.gmane.org/gmane.linux.ports.arm.kernel/344670/focus=364012 Reported-by: Evgeniy A. Dushistov <dushistov@mail.ru> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Radim Krčmář authored
commit 80f7fdb1 upstream. If we were migrated right after __getcpu, but before reading the migration_count, we wouldn't notice that we read the TSC of a different VCPU, nor that KVM's bug made pvti invalid, as only the migration_count on the source VCPU is increased. Change the vdso instead of updating migration_count on the destination. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Fixes: 0a4e6be9 ("x86: kvm: Revert "remove sched notifier for cross-cpu migrations"") Message-Id: <1428000263-11892-1-git-send-email-rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Sagi Grimberg authored
commit 4a579da2 upstream. Before we reach the connection-established state, we may get an error event. In this case the core won't tear down this connection (it never established it), so we take care of freeing it ourselves. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit bbc78c07 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 59c9904c upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> [ luis: backported to 3.16: - file rename: drivers/usb/isp1760/isp1760-hcd.c -> drivers/usb/host/isp1760-hcd.c ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 74bd7b69 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 08debfb1 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 7a606ac2 upstream. While this driver was already using a 50ms resume timeout, let's make sure everybody uses the same macro so it's easy to fix later should anything go wrong. It also gives a more "stable" expectation to Linux users. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
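For reference, this whole series hinges on one shared constant, roughly like this (a sketch: the macro's exact value and header location are recalled, not verified against this tree):

    /* include/linux/usb.h (sketch) */
    #define USB_RESUME_TIMEOUT      40      /* ms */

    /* individual host/gadget drivers then replace their assorted
     * hard-coded resume delays with the common macro: */
    msleep(USB_RESUME_TIMEOUT);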
-
Felipe Balbi authored
commit 84c0d178 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> [ luis: backported to 3.16: - replaced 'hub_wq' by 'khubd' in comment ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 595227db upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 7e136bb7 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 8c0ae657 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit 309be239 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Based on original work by Bin Liu <b-liu@ti.com>. Cc: Bin Liu <b-liu@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> [ kamal: backport to 3.13-stable: context; fewer USB_RESUME_TIMEOUT sites ] Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit b8fb6f79 upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Felipe Balbi authored
commit ea16328f upstream. Make sure we're using the new macro, so our resume signaling will always pass certification. Signed-off-by: Felipe Balbi <balbi@ti.com> [ luis: backported to 3.16: - replaced 'hub_wq' by 'khubd' in comment ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-