- 18 Sep, 2015 40 commits
-
-
Al Viro authored
commit 3cab989a upstream. Calling unlazy_walk() in walk_component() and do_last() when we find a symlink that needs to be followed doesn't acquire a reference to vfsmount. That's fine when the symlink is on the same vfsmount as the parent directory (which is almost always the case), but it's not always true - one _can_ manage to bind a symlink on top of something. And in such cases we end up with excessive mntput(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [lizf: Backported to 3.4: drop the changes to do_last()] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Jeff Layton authored
commit 5d05e54a upstream. Chuck pointed out a problem that crept in with commit 6ffa30d3 (nfs: don't call blocking operations while !TASK_RUNNING). Linux counts tasks in uninterruptible sleep against the load average, so this caused the system's load average to be pinned at at least 1 when there was a NFSv4.1+ mount active. Not a huge problem, but it's probably worth fixing before we get too many complaints about it. This patch converts the code back to use TASK_INTERRUPTIBLE sleep, simply has it flush any signals on each loop iteration. In practice no one should really be signalling this thread at all, so I think this is reasonably safe. With this change, there's also no need to game the hung task watchdog so we can also convert the schedule_timeout call back to a normal schedule. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Fixes: commit 6ffa30d3 (“nfs: don't call blocking . . .”) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Jeff Layton authored
commit 6ffa30d3 upstream. Bruce reported seeing this warning pop when mounting using v4.1: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1121 at kernel/sched/core.c:7300 __might_sleep+0xbd/0xd0() do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810ff58f>] prepare_to_wait+0x2f/0x90 Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer ppdev joydev snd virtio_console virtio_balloon pcspkr serio_raw parport_pc parport pvpanic floppy soundcore i2c_piix4 virtio_blk virtio_net qxl drm_kms_helper ttm drm virtio_pci virtio_ring ata_generic virtio pata_acpi CPU: 1 PID: 1121 Comm: nfsv4.1-svc Not tainted 3.19.0-rc4+ #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014 0000000000000000 000000004e5e3f73 ffff8800b998fb48 ffffffff8186ac78 0000000000000000 ffff8800b998fba0 ffff8800b998fb88 ffffffff810ac9da ffff8800b998fb68 ffffffff81c923e7 00000000000004d9 0000000000000000 Call Trace: [<ffffffff8186ac78>] dump_stack+0x4c/0x65 [<ffffffff810ac9da>] warn_slowpath_common+0x8a/0xc0 [<ffffffff810aca65>] warn_slowpath_fmt+0x55/0x70 [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90 [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90 [<ffffffff810dd2ad>] __might_sleep+0xbd/0xd0 [<ffffffff8124c973>] kmem_cache_alloc_trace+0x243/0x430 [<ffffffff810d941e>] ? groups_alloc+0x3e/0x130 [<ffffffff810d941e>] groups_alloc+0x3e/0x130 [<ffffffffa0301b1e>] svcauth_unix_accept+0x16e/0x290 [sunrpc] [<ffffffffa0300571>] svc_authenticate+0xe1/0xf0 [sunrpc] [<ffffffffa02fc564>] svc_process_common+0x244/0x6a0 [sunrpc] [<ffffffffa02fd044>] bc_svc_process+0x1c4/0x260 [sunrpc] [<ffffffffa03d5478>] nfs41_callback_svc+0x128/0x1f0 [nfsv4] [<ffffffff810ff970>] ? wait_woken+0xc0/0xc0 [<ffffffffa03d5350>] ? nfs4_callback_svc+0x60/0x60 [nfsv4] [<ffffffff810d45bf>] kthread+0x11f/0x140 [<ffffffff810ea815>] ? local_clock+0x15/0x30 [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250 [<ffffffff81874bfc>] ret_from_fork+0x7c/0xb0 [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250 ---[ end trace 675220a11e30f4f2 ]--- nfs41_callback_svc does most of its work while in TASK_INTERRUPTIBLE, which is just wrong. Fix that by finishing the wait immediately if we've found that the list has something on it. Also, we don't expect this kthread to accept signals, so we should be using a TASK_UNINTERRUPTIBLE sleep instead. That however, opens us up hung task warnings from the watchdog, so have the schedule_timeout wake up every 60s if there's no callback activity. Reported-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Giuseppe Cantavenera authored
commit bb7ffbf2 upstream. nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...) in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id. The following was observed on a MIPS 32-core processor: kernel: Call Trace: kernel: [<ffffffffc00bc5e4>] rpc_pipefs_event+0x7c/0x158 [nfsd] kernel: [<ffffffff8017a2a0>] notifier_call_chain+0x70/0xb8 kernel: [<ffffffff8017a4e4>] __blocking_notifier_call_chain+0x4c/0x70 kernel: [<ffffffff8053aff8>] rpc_fill_super+0xf8/0x1a0 kernel: [<ffffffff8022204c>] mount_ns+0xb4/0xf0 kernel: [<ffffffff80222b48>] mount_fs+0x50/0x1f8 kernel: [<ffffffff8023dc00>] vfs_kern_mount+0x58/0xf0 kernel: [<ffffffff802404ac>] do_mount+0x27c/0xa28 kernel: [<ffffffff80240cf0>] SyS_mount+0x98/0xe8 kernel: [<ffffffff80135d24>] handle_sys64+0x44/0x68 kernel: kernel: Code: 0040f809 00000000 2e020001 <00020336> 3c12c00d 3c02801a de100000 6442eb98 0040f809 kernel: ---[ end trace 7471374335809536 ]--- Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before registering rpc_pipefs_event(...) with the notifier chain. Signed-off-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com> Signed-off-by: Lorenzo Restelli <lorenzo.restelli.ext@nokia.com> Reviewed-by: Kinlong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Dan Carpenter authored
commit 13f6b191 upstream. Using the indenting we can see the curly braces were obviously intended. This is a static checker fix, but my guess is that we don't read enough bytes, because we don't calculate "t_len" correctly. Fixes: f1d82698 ('memstick: use fully asynchronous request processing') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Oleg Nesterov authored
commit b72c1869 upstream. ptrace_resume() is called when the tracee is still __TASK_TRACED. We set tracee->exit_code and then wake_up_state() changes tracee->state. If the tracer's sub-thread does wait() in between, task_stopped_code(ptrace => T) wrongly looks like another report from tracee. This confuses debugger, and since wait_task_stopped() clears ->exit_code the tracee can miss a signal. Test-case: #include <stdio.h> #include <unistd.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <pthread.h> #include <assert.h> int pid; void *waiter(void *arg) { int stat; for (;;) { assert(pid == wait(&stat)); assert(WIFSTOPPED(stat)); if (WSTOPSIG(stat) == SIGHUP) continue; assert(WSTOPSIG(stat) == SIGCONT); printf("ERR! extra/wrong report:%x\n", stat); } } int main(void) { pthread_t thread; pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); for (;;) kill(getpid(), SIGHUP); } assert(pthread_create(&thread, NULL, waiter, NULL) == 0); for (;;) ptrace(PTRACE_CONT, pid, 0, SIGCONT); return 0; } Note for stable: the bug is very old, but without 9899d11f "ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL" the fix should use lock_task_sighand(child). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Pavel Labath <labath@google.com> Tested-by: Pavel Labath <labath@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nicolas Iooss authored
commit d43698e8 upstream. Commit 2473238e ("ihex: add support for CS:IP/EIP records") removes the "default:" statement in the switch block, making the "return usage();" line dead code and ihex2fw silently ignoring unknown options. Restore this statement. This bug was found by building with HOSTCC=clang and adding -Wunreachable-code-return to HOSTCFLAGS. Fixes: 2473238e ("ihex: add support for CS:IP/EIP records") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Christoph Hellwig authored
commit 16b8528d upstream. We only want to steer the I/O completion towards a queue, but don't actually access any per-CPU data, so the raw_ version is fine to use and avoids the warnings when using smp_processor_id(). Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Andy Lutomirski <luto@kernel.org> Tested-by: Andy Lutomirski <luto@kernel.org> Acked-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> [lizf: Backported to 3.4: drop the changes to megasas_build_dcdb_fusion()] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Erez Shitrit authored
commit ca9b590c upstream. The current code decreases from the mss size (which is the gso_size from the kernel skb) the size of the packet headers. It shouldn't do that because the mss that comes from the stack (e.g IPoIB) includes only the tcp payload without the headers. The result is indication to the HW that each packet that the HW sends is smaller than what it could be, and too many packets will be sent for big messages. An easy way to demonstrate one more aspect of the problem is by configuring the ipoib mtu to be less than 2*hlen (2*56) and then run app sending big TCP messages. This will tell the HW to send packets with giant (negative value which under unsigned arithmetics becomes a huge positive one) length and the QP moves to SQE state. Fixes: b832be1e ('IB/mlx4: Add IPoIB LSO support') Reported-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Yann Droneaud authored
commit 8abaae62 upstream. If ib_umem_get() is called with a size equal to 0 and an non-page aligned address, one page will be pinned and a 0-sized umem will be returned to the caller. This should not be allowed: it's not expected for a memory region to have a size equal to 0. This patch adds a check to explicitly refuse to register a 0-sized region. Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com Cc: Shachar Raindel <raindel@mellanox.com> Cc: Jack Morgenstein <jackm@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ben Collins authored
commit 0618764c upstream. I suspect this doesn't show up for most anyone because software algorithms typically don't have a sense of being too busy. However, when working with the Freescale CAAM driver it will return -EBUSY on occasion under heavy -- which resulted in dm-crypt deadlock. After checking the logic in some other drivers, the scheme for crypt_convert() and it's callback, kcryptd_async_done(), were not correctly laid out to properly handle -EBUSY or -EINPROGRESS. Fix this by using the completion for both -EBUSY and -EINPROGRESS. Now crypt_convert()'s use of completion is comparable to af_alg_wait_for_completion(). Similarly, kcryptd_async_done() follows the pattern used in af_alg_complete(). Before this fix dm-crypt would lockup within 1-2 minutes running with the CAAM driver. Fix was regression tested against software algorithms on PPC32 and x86_64, and things seem perfectly happy there as well. Signed-off-by: Ben Collins <ben.c@servergy.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Michael Davidson authored
commit a87938b2 upstream. With CONFIG_ARCH_BINFMT_ELF_RANDOMIZE_PIE enabled, and a normal top-down address allocation strategy, load_elf_binary() will attempt to map a PIE binary into an address range immediately below mm->mmap_base. Unfortunately, load_elf_ binary() does not take account of the need to allocate sufficient space for the entire binary which means that, while the first PT_LOAD segment is mapped below mm->mmap_base, the subsequent PT_LOAD segment(s) end up being mapped above mm->mmap_base into the are that is supposed to be the "gap" between the stack and the binary. Since the size of the "gap" on x86_64 is only guaranteed to be 128MB this means that binaries with large data segments > 128MB can end up mapping part of their data segment over their stack resulting in corruption of the stack (and the data segment once the binary starts to run). Any PIE binary with a data segment > 128MB is vulnerable to this although address randomization means that the actual gap between the stack and the end of the binary is normally greater than 128MB. The larger the data segment of the binary the higher the probability of failure. Fix this by calculating the total size of the binary in the same way as load_elf_interp(). Signed-off-by: Michael Davidson <md@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Lv Zheng authored
commit 2b876010 upstream. ACPICA commit aacf863cfffd46338e268b7415f7435cae93b451 It is reported that on a physically 64-bit addressed machine, 32-bit kernel can trigger crashes in accessing the memory regions that are beyond the 32-bit boundary. The region field's start address should still be 32-bit compliant, but after a calculation (adding some offsets), it may exceed the 32-bit boundary. This case is rare and buggy, but there are real BIOSes leaked with such issues (see References below). This patch fixes this gap by always defining IO addresses as 64-bit, and allows OSPMs to optimize it for a real 32-bit machine to reduce the size of the internal objects. Internal acpi_physical_address usages in the structures that can be fixed by this change include: 1. struct acpi_object_region: acpi_physical_address address; 2. struct acpi_address_range: acpi_physical_address start_address; acpi_physical_address end_address; 3. struct acpi_mem_space_context; acpi_physical_address address; 4. struct acpi_table_desc acpi_physical_address address; See known issues 1 for other usages. Note that acpi_io_address which is used for ACPI_PROCESSOR may also suffer from same problem, so this patch changes it accordingly. For iasl, it will enforce acpi_physical_address as 32-bit to generate 32-bit OSPM compatible tables on 32-bit platforms, we need to define ACPI_32BIT_PHYSICAL_ADDRESS for it in acenv.h. Known issues: 1. Cleanup of mapped virtual address In struct acpi_mem_space_context, acpi_physical_address is used as a virtual address: acpi_physical_address mapped_physical_address; It is better to introduce acpi_virtual_address or use acpi_size instead. This patch doesn't make such a change. Because this should be done along with a change to acpi_os_map_memory()/acpi_os_unmap_memory(). There should be no functional problem to leave this unchanged except that only this structure is enlarged unexpectedly. Link: https://github.com/acpica/acpica/commit/aacf863c Reference: https://bugzilla.kernel.org/show_bug.cgi?id=87971 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=79501Reported-and-tested-by: Paul Menzel <paulepanter@users.sourceforge.net> Reported-and-tested-by: Sial Nije <sialnije@gmail.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Anton Blanchard authored
commit 9a5cbce4 upstream. We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH (currently 127), but we forgot to do the same for 64bit backtraces. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Filipe Manana authored
commit ccccf3d6 upstream. If we attempt to clone a 0 length region into a file we can end up inserting a range in the inode's extent_io tree with a start offset that is greater then the end offset, which triggers immediately the following warning: [ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]() [ 3914.620886] BTRFS: end < start 4095 4096 (...) [ 3914.638093] Call Trace: [ 3914.638636] [<ffffffff81425fd9>] dump_stack+0x4c/0x65 [ 3914.639620] [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb [ 3914.640789] [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs] [ 3914.642041] [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48 [ 3914.643236] [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs] [ 3914.644441] [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs] [ 3914.645711] [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs] [ 3914.646914] [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33 [ 3914.648058] [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs] [ 3914.650105] [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs] [ 3914.651361] [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs] [ 3914.652761] [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs] [ 3914.654128] [<ffffffff811226dd>] ? might_fault+0x58/0xb5 [ 3914.655320] [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs] (...) [ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]--- This later makes the inode eviction handler enter an infinite loop that keeps dumping the following warning over and over: [ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]() [ 3915.119913] BTRFS: end < start 4095 4096 (...) [ 3915.137394] Call Trace: [ 3915.137913] [<ffffffff81425fd9>] dump_stack+0x4c/0x65 [ 3915.139154] [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb [ 3915.140316] [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs] [ 3915.141505] [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48 [ 3915.142709] [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs] [ 3915.143849] [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs] [ 3915.145120] [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs] [ 3915.146352] [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50 [ 3915.147565] [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs] [ 3915.148785] [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33 [ 3915.149931] [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs] [ 3915.151154] [<ffffffff81168904>] evict+0xa0/0x148 [ 3915.152094] [<ffffffff811689e5>] dispose_list+0x39/0x43 [ 3915.153081] [<ffffffff81169564>] evict_inodes+0xdc/0xeb [ 3915.154062] [<ffffffff81154418>] generic_shutdown_super+0x49/0xef [ 3915.155193] [<ffffffff811546d1>] kill_anon_super+0x13/0x1e [ 3915.156274] [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs] (...) [ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]--- So just bail out of the clone ioctl if the length of the region to clone is zero, without locking any extent range, in order to prevent this issue (same behaviour as a pwrite with a 0 length for example). This is trivial to reproduce. For example, the steps for the test I just made for fstests: mkfs.btrfs -f SCRATCH_DEV mount SCRATCH_DEV $SCRATCH_MNT touch $SCRATCH_MNT/foo touch $SCRATCH_MNT/bar $CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar umount $SCRATCH_MNT A test case for fstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Heiko Carstens authored
commit d7441949 upstream. Sebastian reported a crash caused by a jump label mismatch after resume. This happens because we do not save the kernel text section during suspend and therefore also do not restore it during resume, but use the kernel image that restores the old system. This means that after a suspend/resume cycle we lost all modifications done to the kernel text section. The reason for this is the pfn_is_nosave() function, which incorrectly returns that read-only pages don't need to be saved. This is incorrect since we mark the kernel text section read-only. We still need to make sure to not save and restore pages contained within NSS and DCSS segment. To fix this add an extra case for the kernel text section and only save those pages if they are not contained within an NSS segment. Fixes the following crash (and the above bugs as well): Jump label code mismatch at netif_receive_skb_internal+0x28/0xd0 Found: c0 04 00 00 00 00 Expected: c0 f4 00 00 00 11 New: c0 04 00 00 00 00 Kernel panic - not syncing: Corrupted kernel text CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.19.0-01975-gb1b096e70f23 #4 Call Trace: [<0000000000113972>] show_stack+0x72/0xf0 [<000000000081f15e>] dump_stack+0x6e/0x90 [<000000000081c4e8>] panic+0x108/0x2b0 [<000000000081be64>] jump_label_bug.isra.2+0x104/0x108 [<0000000000112176>] __jump_label_transform+0x9e/0xd0 [<00000000001121e6>] __sm_arch_jump_label_transform+0x3e/0x50 [<00000000001d1136>] multi_cpu_stop+0x12e/0x170 [<00000000001d1472>] cpu_stopper_thread+0xb2/0x168 [<000000000015d2ac>] smpboot_thread_fn+0x134/0x1b0 [<0000000000158baa>] kthread+0x10a/0x110 [<0000000000824a86>] kernel_thread_starter+0x6/0xc Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [lizf: Backported to 3.4: add necessary includes] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nicolas Dichtel authored
commit bd2cba07 upstream. This command is missing. Fixes: 3a2dfbe8 ("xfrm: Notify changes in UDP encapsulation via netlink") CC: Martin Willi <martin@strongswan.org> Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nicolas Dichtel authored
commit 8d465bb7 upstream. This command is missing. Fixes: 5c79de6e ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE") Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nicolas Dichtel authored
commit b0b59b00 upstream. This command is missing. Fixes: 97a64b45 ("[XFRM]: Introduce XFRM_MSG_REPORT.") Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Dave Olson authored
commit f7e9e358 upstream. This problem appears to have been introduced in 2.6.29 by commit 93197a36 "Rewrite sysfs processor cache info code". This caused lscpu to error out on at least e500v2 devices, eg: error: cannot open /sys/devices/system/cpu/cpu0/cache/index2/size: No such file or directory Some embedded powerpc systems use cache-size in DTS for the unified L2 cache size, not d-cache-size, so we need to allow for both DTS names. Added a new CACHE_TYPE_UNIFIED_D cache_type_info structure to handle this. Fixes: 93197a36 ("powerpc: Rewrite sysfs processor cache info code") Signed-off-by: Dave Olson <olson@cumulusnetworks.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Huacai Chen authored
commit a843d00d upstream. We found that TLB mismatch not only happens after kernel resume, but also happens during snapshot restore. So move it to the beginning of swsusp_arch_suspend(). Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/9621/Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Michael Gernoth authored
commit 91bf0c2d upstream. The functions snd_emu10k1_proc_spdif_read and snd_emu1010_fpga_read acquire the emu_lock before accessing the FPGA. The function used to access the FPGA (snd_emu1010_fpga_read) also tries to take the emu_lock which causes a deadlock. Remove the outer locking in the proc-functions (guarding only the already safe fpga read) to prevent this deadlock. [removed superfluous flags variables too -- tiwai] Signed-off-by: Michael Gernoth <michael@gernoth.net> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
K. Y. Srinivasan authored
commit 8de58074 upstream. We may exit this function without properly freeing up the maapings we may have acquired. Fix the bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Aravind Gopalakrishnan authored
commit b4491592 upstream. The comment line regarding IOMMU_INIT and IOMMU_INIT_FINISH macros is incorrect: "The standard vs the _FINISH differs in that the _FINISH variant will continue detecting other IOMMUs in the call list..." It should be "..the *standard* variant will continue detecting..." Fix that. Also, make it readable while at it. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: konrad.wilk@oracle.com Fixes: 6e963669 ("x86, iommu: Update header comments with appropriate naming") Link: http://lkml.kernel.org/r/1428508017-5316-1-git-send-email-Aravind.Gopalakrishnan@amd.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nicolas Dichtel authored
commit 5b5800fa upstream. These commands are missing. Fixes: 28d8909b ("[XFRM]: Export SAD info.") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nicolas Dichtel authored
commit 5e6deeba upstream. This command is missing. Fixes: ecfd6b18 ("[XFRM]: Export SPD info") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Sowmini Varadhan authored
commit ebe96e64 upstream. AF_RDS, PF_RDS and SOL_RDS are available in header files, and there is no need to get their values from /proc. Document this correctly. Fixes: 0c5f9b88 ("RDS: Documentation") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ulrik De Bie authored
commit bd884149 upstream. On ASUS TP500LN and X750JN, the touchpad absolute mode is reset each time set_rate is done. In order to fix this, we will verify the firmware version, and if it matches the one in those laptops, the set_rate function is overloaded with a function elantech_set_rate_restore_reg_07 that performs the set_rate with the original function, followed by a restore of reg_07 (the register that sets the absolute mode on elantech v4 hardware). Also the ASUS TP500LN and X750JN firmware version, capabilities, and button constellation is added to elantech.c Reported-and-tested-by: George Moutsopoulos <gmoutso@yahoo.co.uk> Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Alexander Duyck authored
commit 2e7056c4 upstream. Looking over the implementation for jhash2 and comparing it to jhash_3words I realized that the two hashes were in fact very different. Doing a bit of digging led me to "The new jhash implementation" in which lookup2 was supposed to have been replaced with lookup3. In reviewing the patch I noticed that jhash2 had originally initialized a and b to JHASH_GOLDENRATIO and c to initval, but after the patch a, b, and c were initialized to initval + (length << 2) + JHASH_INITVAL. However the changes in jhash_3words simply replaced the initialization of a and b with JHASH_INITVAL. This change corrects what I believe was an oversight so that a, b, and c in jhash_3words all have the same value added consisting of initval + (length << 2) + JHASH_INITVAL so that jhash2 and jhash_3words will now produce the same hash result given the same inputs. Fixes: 60d509c8 ("The new jhash implementation") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Lukas Czerner authored
commit e12fb972 upstream. Previously commit 14ece102 added a support for for syncing parent directory of newly created inodes to make sure that the inode is not lost after a power failure in no-journal mode. However this does not work in majority of cases, namely: - if the directory has inline data - if the directory is already indexed - if the directory already has at least one block and: - the new entry fits into it - or we've successfully converted it to indexed So in those cases we might lose the inode entirely even after fsync in the no-journal mode. This also includes ext2 default mode obviously. I've noticed this while running xfstest generic/321 and even though the test should fail (we need to run fsck after a crash in no-journal mode) I could not find a newly created entries even when if it was fsynced before. Fix this by adjusting the ext4_add_entry() successful exit paths to set the inode EXT4_STATE_NEWENTRY so that fsync has the chance to fsync the parent directory as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Frank Mayhar <fmayhar@google.com> [lizf: Backported to 3.4: remove a change from return to goto, as that doesn't exist in 3.4] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Pascal Huerst authored
commit 74ff9602 upstream. The delay time after a reset in the codec probe callback was too short, and did not work on certain hw because the codec needs more time to power on. This increases the delay time from 1us to 1ms. Signed-off-by: Pascal Huerst <pascal.huerst@gmail.com> Acked-by: Brian Austin <brian.austin@cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Larry Finger authored
commit 2f92b314 upstream. USB ID 2001:330d is used for a D-Link DWA-131. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Andrey Ryabinin authored
commit 8defb336 upstream. Usually ELF_ET_DYN_BASE is 2/3 of TASK_SIZE. With 3G/1G user/kernel split this is not so, because 2*TASK_SIZE overflows 32 bits, so the actual value of ELF_ET_DYN_BASE is: (2 * TASK_SIZE / 3) = 0x2a000000 When ASLR is disabled PIE binaries will load at ELF_ET_DYN_BASE address. On 32bit platforms AddressSanitzer uses addresses [0x20000000 - 0x40000000] for shadow memory [1]. So ASan doesn't work for PIE binaries when ASLR disabled as it fails to map shadow memory. Also after Kees's 'split ET_DYN ASLR from mmap ASLR' patchset PIE binaries has a high chance of loading somewhere in between [0x2a000000 - 0x40000000] even if ASLR enabled. This makes ASan with PIE absolutely incompatible. Fix overflow by dividing TASK_SIZE prior to multiplying. After this patch ELF_ET_DYN_BASE equals to (for CONFIG_VMSPLIT_3G=y): (TASK_SIZE / 3 * 2) = 0x7f555554 [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm#MappingSigned-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Reported-by: Maria Guseva <m.guseva@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
David Sterba authored
commit 3c3b04d1 upstream. Due to insufficient check in btrfs_is_valid_xattr, this unexpectedly works: $ touch file $ setfattr -n user. -v 1 file $ getfattr -d file user.="1" ie. the missing attribute name after the namespace. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94291Reported-by: William Douglas <william.douglas@intel.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> [lizf: Backported to 3.4: - 3.4 doesn't support XATTR_BTRFS_PREFIX] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Filipe Manana authored
commit dcc82f47 upstream. While committing a transaction we free the log roots before we write the new super block. Freeing the log roots implies marking the disk location of every node/leaf (metadata extent) as pinned before the new super block is written. This is to prevent the disk location of log metadata extents from being reused before the new super block is written, otherwise we would have a corrupted log tree if before the new super block is written a crash/reboot happens and the location of any log tree metadata extent ended up being reused and rewritten. Even though we pinned the log tree's metadata extents, we were issuing a discard against them if the fs was mounted with the -o discard option, resulting in corruption of the log tree if a crash/reboot happened before writing the new super block - the next time the fs was mounted, during the log replay process we would find nodes/leafs of the log btree with a content full of zeroes, causing the process to fail and require the use of the tool btrfs-zero-log to wipeout the log tree (and all data previously fsynced becoming lost forever). Fix this by not doing a discard when pinning an extent. The discard will be done later when it's safe (after the new super block is committed) at extent-tree.c:btrfs_finish_extent_commit(). Fixes: e688b725 (Btrfs: fix extent pinning bugs in the tree log) Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
K. Y. Srinivasan authored
commit 73cffdb6 upstream. Don't wait after sending request for offers to the host. This wait is unnecessary and simply adds 5 seconds to the boot time. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Nishanth Menon authored
commit f4831605 upstream. time_init invokes timer64_init (which is __init annotation) since all of these are invoked at init time, lets maintain consistency by ensuring time_init is marked appropriately as well. This fixes the following warning with CONFIG_DEBUG_SECTION_MISMATCH=y WARNING: vmlinux.o(.text+0x3bfc): Section mismatch in reference from the function time_init() to the function .init.text:timer64_init() The function time_init() references the function __init timer64_init(). This is often because time_init lacks a __init annotation or the annotation of timer64_init is wrong. Fixes: 546a3954 ("C6X: time management") Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Brian Norris authored
commit 299d0c5b upstream. The comparison from the previous line seems to have been erroneously (partially) copied-and-pasted onto the next. The second line should be checking req.bytes, not req.lnum. Coverity CID #139400 Signed-off-by: Brian Norris <computersforpeace@gmail.com> [rw: Fixed comparison] Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Brian Norris authored
commit f16db807 upstream. In some of the 'out_not_moved' error paths, lnum may be used uninitialized. Don't ignore the warning; let's fix it. This uninitialized variable doesn't have much visible effect in the end, since we just schedule the PEB for erasure, and its LEB number doesn't really matter (it just gets printed in debug messages). But let's get it straight anyway. Coverity CID #113449 Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Brian Norris authored
commit d74adbdb upstream. If aeb->len >= vol->reserved_pebs, we should not be writing aeb into the PEB->LEB mapping. Caught by Coverity, CID #711212. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-