- 06 Aug, 2015 9 commits
-
-
Sanidhya Kashyap authored
commit ce657611 upstream. There is a possibility of nothing being allocated to the new_opts in case of memory pressure, therefore return ENOMEM for such case. Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Rafael J. Wysocki authored
commit 0294112e upstream. This effectively reverts the following three commits: 7bc10388 ACPI / resources: free memory on error in add_region_before() 0f1b414d ACPI / PNP: Avoid conflicting resource reservations b9a5e5e1 ACPI / init: Fix the ordering of acpi_reserve_resources() (commit b9a5e5e1 introduced regressions some of which, but not all, were addressed by commit 0f1b414d and commit 7bc10388 was a fixup on top of the latter) and causes ACPI fixed hardware resources to be reserved at the fs_initcall_sync stage of system initialization. The story is as follows. First, a boot regression was reported due to an apparent resource reservation ordering change after a commit that shouldn't lead to such changes. Investigation led to the conclusion that the problem happened because acpi_reserve_resources() was executed at the device_initcall() stage of system initialization which wasn't strictly ordered with respect to driver initialization (and with respect to the initialization of the pcieport driver in particular), so a random change causing the device initcalls to be run in a different order might break things. The response to that was to attempt to run acpi_reserve_resources() as soon as we knew that ACPI would be in use (commit b9a5e5e1). However, that turned out to be too early, because it caused resource reservations made by the PNP system driver to fail on at least one system and that failure was addressed by commit 0f1b414d. That fix still turned out to be insufficient, though, because calling acpi_reserve_resources() before the fs_initcall stage of system initialization caused a boot regression to happen on the eCAFE EC-800-H20G/S netbook. That meant that we only could call acpi_reserve_resources() at the fs_initcall initialization stage or later, but then we might just as well call it after the PNP initalization in which case commit 0f1b414d wouldn't be necessary any more. For this reason, the changes made by commit 0f1b414d are reverted (along with a memory leak fixup on top of that commit), the changes made by commit b9a5e5e1 that went too far are reverted too and acpi_reserve_resources() is changed into fs_initcall_sync, which will cause it to be executed after the PNP subsystem initialization (which is an fs_initcall) and before device initcalls (including the pcieport driver initialization) which should avoid the initial issue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581 Link: http://marc.info/?t=143092384600002&r=1&w=2 Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 Link: http://marc.info/?t=143389402600001&r=1&w=2 Fixes: b9a5e5e1 "ACPI / init: Fix the ordering of acpi_reserve_resources()" Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Michal Hocko authored
commit 7444a072 upstream. ext4_free_blocks is looping around the allocation request and mimics __GFP_NOFAIL behavior without any allocation fallback strategy. Let's remove the open coded loop and replace it with __GFP_NOFAIL. Without the flag the allocator has no way to find out never-fail requirement and cannot help in any way. Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Eryu Guan authored
commit 8974fec7 upstream. Currently ext4_ind_migrate() doesn't correctly handle a file which contains a hole at the beginning of the file. This caused the migration to be done incorrectly, and then if there is a subsequent following delayed allocation write to the "hole", this would reclaim the same data blocks again and results in fs corruption. # assmuing 4k block size ext4, with delalloc enabled # skip the first block and write to the second block xfs_io -fc "pwrite 4k 4k" -c "fsync" /mnt/ext4/testfile # converting to indirect-mapped file, which would move the data blocks # to the beginning of the file, but extent status cache still marks # that region as a hole chattr -e /mnt/ext4/testfile # delayed allocation writes to the "hole", reclaim the same data block # again, results in i_blocks corruption xfs_io -c "pwrite 0 4k" /mnt/ext4/testfile umount /mnt/ext4 e2fsck -nf /dev/sda6 ... Inode 53, i_blocks is 16, should be 8. Fix? no ... Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Eryu Guan authored
commit d6f123a9 upstream. Currently the check in ext4_ind_migrate() is not enough before doing the real conversion: a) delayed allocated extents could bypass the check on eh->eh_entries and eh->eh_depth This can be demonstrated by this script xfs_io -fc "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/ext4/testfile chattr -e /mnt/ext4/testfile where testfile has two extents but still be converted to non-extent based file format. b) only extent length is checked but not the offset, which would result in data lose (delalloc) or fs corruption (nodelalloc), because non-extent based file only supports at most (12 + 2^10 + 2^20 + 2^30) blocks This can be demostrated by xfs_io -fc "pwrite 5T 4k" /mnt/ext4/testfile chattr -e /mnt/ext4/testfile sync If delalloc is enabled, dmesg prints EXT4-fs warning (device dm-4): ext4_block_to_path:105: block 1342177280 > max in inode 53 EXT4-fs (dm-4): Delayed block allocation failed for inode 53 at logical offset 1342177280 with max blocks 1 with error 5 EXT4-fs (dm-4): This should not happen!! Data will be lost If delalloc is disabled, e2fsck -nf shows corruption Inode 53, i_size is 5497558142976, should be 4096. Fix? no Fix the two issues by a) forcing all delayed allocation blocks to be allocated before checking eh->eh_depth and eh->eh_entries b) limiting the last logical block of the extent is within direct map Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Lukas Czerner authored
commit 9705acd6 upstream. On delalloc enabled file system on invalidatepage operation in ext4_da_page_release_reservation() we want to clear the delayed buffer and remove the extent covering the delayed buffer from the extent status tree. However currently there is a bug where on the systems with page size > block size we will always remove extents from the start of the page regardless where the actual delayed buffers are positioned in the page. This leads to the errors like this: EXT4-fs warning (device loop0): ext4_da_release_space:1225: ext4_da_release_space: ino 13, to_free 1 with only 0 reserved data blocks This however can cause data loss on writeback time if the file system is in ENOSPC condition because we're releasing reservation for someones else delayed buffer. Fix this by only removing extents that corresponds to the part of the page we want to invalidate. This problem is reproducible by the following fio receipt (however I was only able to reproduce it with fio-2.1 or older. [global] bs=8k iodepth=1024 iodepth_batch=60 randrepeat=1 size=1m directory=/mnt/test numjobs=20 [job1] ioengine=sync bs=1k direct=1 rw=randread filename=file1:file2 [job2] ioengine=libaio rw=randwrite direct=1 filename=file1:file2 [job3] bs=1k ioengine=posixaio rw=randwrite direct=1 filename=file1:file2 [job5] bs=1k ioengine=sync rw=randread filename=file1:file2 [job7] ioengine=libaio rw=randwrite filename=file1:file2 [job8] ioengine=posixaio rw=randwrite filename=file1:file2 [job10] ioengine=mmap rw=randwrite bs=1k filename=file1:file2 [job11] ioengine=mmap rw=randwrite direct=1 filename=file1:file2 Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Filipe Manana authored
commit e4545de5 upstream. If we do an append write to a file (which increases its inode's i_size) that does not have the flag BTRFS_INODE_NEEDS_FULL_SYNC set in its inode, and the previous transaction added a new hard link to the file, which sets the flag BTRFS_INODE_COPY_EVERYTHING in the file's inode, and then fsync the file, the inode's new i_size isn't logged. This has the consequence that after the fsync log is replayed, the file size remains what it was before the append write operation, which means users/applications will not be able to read the data that was successsfully fsync'ed before. This happens because neither the inode item nor the delayed inode get their i_size updated when the append write is made - doing so would require starting a transaction in the buffered write path, something that we do not do intentionally for performance reasons. Fix this by making sure that when the flag BTRFS_INODE_COPY_EVERYTHING is set the inode is logged with its current i_size (log the in-memory inode into the log tree). This issue is not a recent regression and is easy to reproduce with the following test case for fstests: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" here=`pwd` tmp=/tmp/$$ status=1 # failure is the default! _cleanup() { _cleanup_flakey rm -f $tmp.* } trap "_cleanup; exit \$status" 0 1 2 3 15 # get standard environment, filters and checks . ./common/rc . ./common/filter . ./common/dmflakey # real QA test starts here _supported_fs generic _supported_os Linux _need_to_be_root _require_scratch _require_dm_flakey _require_metadata_journaling $SCRATCH_DEV _crash_and_mount() { # Simulate a crash/power loss. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey # Allow writes again and mount. This makes the fs replay its fsync log. _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey } rm -f $seqres.full _scratch_mkfs >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create the test file with some initial data and then fsync it. # The fsync here is only needed to trigger the issue in btrfs, as it causes the # the flag BTRFS_INODE_NEEDS_FULL_SYNC to be removed from the btrfs inode. $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 32k" \ -c "fsync" \ $SCRATCH_MNT/foo | _filter_xfs_io sync # Add a hard link to our file. # On btrfs this sets the flag BTRFS_INODE_COPY_EVERYTHING on the btrfs inode, # which is a necessary condition to trigger the issue. ln $SCRATCH_MNT/foo $SCRATCH_MNT/bar # Sync the filesystem to force a commit of the current btrfs transaction, this # is a necessary condition to trigger the bug on btrfs. sync # Now append more data to our file, increasing its size, and fsync the file. # In btrfs because the inode flag BTRFS_INODE_COPY_EVERYTHING was set and the # write path did not update the inode item in the btree nor the delayed inode # item (in memory struture) in the current transaction (created by the fsync # handler), the fsync did not record the inode's new i_size in the fsync # log/journal. This made the data unavailable after the fsync log/journal is # replayed. $XFS_IO_PROG -c "pwrite -S 0xbb 32K 32K" \ -c "fsync" \ $SCRATCH_MNT/foo | _filter_xfs_io echo "File content after fsync and before crash:" od -t x1 $SCRATCH_MNT/foo _crash_and_mount echo "File content after crash and log replay:" od -t x1 $SCRATCH_MNT/foo status=0 exit The expected file output before and after the crash/power failure expects the appended data to be available, which is: 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0100000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb * 0200000 Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Filipe Manana authored
commit ae9d8f17 upstream. While the inode cache caching kthread is calling btrfs_unpin_free_ino(), we could have a concurrent call to btrfs_return_ino() that adds a new entry to the root's free space cache of pinned inodes. This concurrent call does not acquire the fs_info->commit_root_sem before adding a new entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem because the caching kthread calls btrfs_unpin_free_ino() after setting the caching state to BTRFS_CACHE_FINISHED and therefore races with the task calling btrfs_return_ino(), which is adding a new entry, while the former (caching kthread) is navigating the cache's rbtree, removing and freeing nodes from the cache's rbtree without acquiring the spinlock that protects the rbtree. This race resulted in memory corruption due to double free of struct btrfs_free_space objects because both tasks can end up doing freeing the same objects. Note that adding a new entry can result in merging it with other entries in the cache, in which case those entries are freed. This is particularly important as btrfs_free_space structures are also used for the block group free space caches. This memory corruption can be detected by a debugging kernel, which reports it with the following trace: [132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': double free detected [132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G W 4.1.0-rc5-btrfs-next-10+ #1 [132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 [132408.505075] ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 ffffffff81095dce [132408.505075] ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e ffff880163d73d68 [132408.505075] ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 ffffffffa0505b5f [132408.505075] Call Trace: [132408.505075] [<ffffffff8145eec7>] dump_stack+0x4f/0x7b [132408.505075] [<ffffffff81095dce>] ? console_unlock+0x356/0x3a2 [132408.505075] [<ffffffff81154e1e>] __slab_error.isra.28+0x25/0x36 [132408.505075] [<ffffffff81155733>] __cache_free+0xe2/0x4b6 [132408.505075] [<ffffffffa054a95a>] ? __btrfs_add_free_space+0x2f0/0x343 [btrfs] [132408.505075] [<ffffffffa0505b5f>] ? btrfs_unpin_free_ino+0x8e/0x99 [btrfs] [132408.505075] [<ffffffff810f3b30>] ? time_hardirqs_off+0x15/0x28 [132408.505075] [<ffffffff81084d42>] ? trace_hardirqs_off+0xd/0xf [132408.505075] [<ffffffff811563a1>] ? kfree+0xb6/0x14e [132408.505075] [<ffffffff811563d0>] kfree+0xe5/0x14e [132408.505075] [<ffffffffa0505b5f>] btrfs_unpin_free_ino+0x8e/0x99 [btrfs] [132408.505075] [<ffffffffa0505e08>] caching_kthread+0x29e/0x2d9 [btrfs] [132408.505075] [<ffffffffa0505b6a>] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs] [132408.505075] [<ffffffff8106698f>] kthread+0xef/0xf7 [132408.505075] [<ffffffff810f3b08>] ? time_hardirqs_on+0x15/0x28 [132408.505075] [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad [132408.505075] [<ffffffff814653d2>] ret_from_fork+0x42/0x70 [132408.505075] [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad [132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b. [132409.501654] slab: double free detected in cache 'btrfs_free_space', objp ffff880023e7d320 [132409.503355] ------------[ cut here ]------------ [132409.504241] kernel BUG at mm/slab.c:2571! Therefore fix this by having btrfs_unpin_free_ino() acquire the lock that protects the rbtree while doing the searches and removing entries. Fixes: 1c70d8fb ("Btrfs: fix inode caching vs tree log") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Filipe Manana authored
commit c3f4a168 upstream. The free space entries are allocated using kmem_cache_zalloc(), through __btrfs_add_free_space(), therefore we should use kmem_cache_free() and not kfree() to avoid any confusion and any potential problem. Looking at the kfree() definition at mm/slab.c it has the following comment: /* * (...) * * Don't free memory not originally allocated by kmalloc() * or you will run into trouble. */ So better be safe and use kmem_cache_free(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
- 04 Aug, 2015 1 commit
-
-
Al Viro authored
commit 451a2886 upstream. unfortunately, allowing an arbitrary 16bit value means a possibility of overflow in the calculation of total number of pages in bio_map_user_iov() - we rely on there being no more than PAGE_SIZE members of sum in the first loop there. If that sum wraps around, we end up allocating too small array of pointers to pages and it's easy to overflow it in the second loop. X-Coverup: TINC (and there's no lumber cartel either) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [bwh: s/MAX_UIOVEC/UIO_MAXIOV/. This was fixed upstream by commit fdc81f45 ("sg_start_req(): use import_iovec()"), but we don't have that function.] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reference: CVE-2015-5707 Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
- 30 Jul, 2015 9 commits
-
-
Colin Ian King authored
commit ca4da5dd upstream. __key_link_end is not freeing the associated array edit structure and this leads to a 512 byte memory leak each time an identical existing key is added with add_key(). The reason the add_key() system call returns okay is that key_create_or_update() calls __key_link_begin() before checking to see whether it can update a key directly rather than adding/replacing - which it turns out it can. Thus __key_link() is not called through __key_instantiate_and_link() and __key_link_end() must cancel the edit. CVE-2015-1333 Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Lutomirski authored
commit 810bc075 upstream. We have a tricky bug in the nested NMI code: if we see RSP pointing to the NMI stack on NMI entry from kernel mode, we assume that we are executing a nested NMI. This isn't quite true. A malicious userspace program can point RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to happen while RSP is still pointing at the NMI stack. Fix it with a sneaky trick. Set DF in the region of code that the RSP check is intended to detect. IRET will clear DF atomically. (Note: other than paravirt, there's little need for all this complexity. We could check RIP instead of RSP.) Fixes CVE-2015-3291. Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> [bwh: Backported to 4.0: adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> CVE-2015-3291 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Lutomirski authored
commit a27507ca upstream. Check the repeat_nmi .. end_repeat_nmi special case first. The next patch will rework the RSP check and, as a side effect, the RSP check will no longer detect repeat_nmi .. end_repeat_nmi, so we'll need this ordering of the checks. Note: this is more subtle than it appears. The check for repeat_nmi .. end_repeat_nmi jumps straight out of the NMI code instead of adjusting the "iret" frame to force a repeat. This is necessary, because the code between repeat_nmi and end_repeat_nmi sets "NMI executing" and then writes to the "iret" frame itself. If a nested NMI comes in and modifies the "iret" frame while repeat_nmi is also modifying it, we'll end up with garbage. The old code got this right, as does the new code, but the new code is a bit more explicit. If we were to move the check right after the "NMI executing" check, then we'd get it wrong and have random crashes. This is a prerequisite for the fix for CVE-2015-3291. Signed-off-by: Andy Lutomirski <luto@kernel.org> [bwh: Backported to 4.0: adjust filename, spacing] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Lutomirski authored
commit 0b22930e upstream. I found the nested NMI documentation to be difficult to follow. Improve the comments. Signed-off-by: Andy Lutomirski <luto@kernel.org> [bwh: Backported to 4.0: adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Lutomirski authored
commit 9b6e6a83 upstream. Returning to userspace is tricky: IRET can fail, and ESPFIX can rearrange the stack prior to IRET. The NMI nesting fixup relies on a precise stack layout and atomic IRET. Rather than trying to teach the NMI nesting fixup to handle ESPFIX and failed IRET, punt: run NMIs that came from user mode on the normal kernel stack. This will make some nested NMIs visible to C code, but the C code is okay with that. As a side effect, this should speed up perf: it eliminates an RDMSR when NMIs come from user mode. Fixes CVE-2015-3290. Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> [bwh: Backported to 4.0: - Adjust filename, context - s/restore_c_regs_and_iret/restore_args/ - Use kernel_stack + KERNEL_STACK_OFFSET instead of cpu_current_top_of_stack] [luto: Open-coded return path to avoid dependency on partial pt_regs details] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> CVE-2015-3290, CVE-2015-5157 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Lutomirski authored
commit 0e181bb5 upstream. Now that do_nmi saves cr2, we don't need to save it in asm. This is a prerequisity for the fix for CVE-2015-3290. Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> [bwh: Backported to 4.0: adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Lutomirski authored
commit 9d050416 upstream. 32-bit kernels handle nested NMIs in C. Enable the exact same handling on 64-bit kernels as well. This isn't currently necessary, but it will become necessary once the asm code starts allowing limited nesting. This is a prerequisite for the fix for CVE-2015-3290. Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Denys Vlasenko authored
commit a30b0085 upstream. Jumping to the very next instruction is not very useful: jmp label label: Removing the jump. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1428439424-7258-5-git-send-email-dvlasenk@redhat.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Denys Vlasenko authored
commit 0784b364 upstream. No code changes. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427899858-7165-1-git-send-email-dvlasenk@redhat.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
- 28 Jul, 2015 1 commit
-
-
Kamal Mostafa authored
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
- 23 Jul, 2015 20 commits
-
-
Michal Kazior authored
commit ab499db8 upstream. There was a possible race between ieee80211_reconfig() and ieee80211_delayed_tailroom_dec(). This could result in inability to transmit data if driver crashed during roaming or rekeying and subsequent skbs with insufficient tailroom appeared. This race was probably never seen in the wild because a device driver would have to crash AND recover within 0.5s which is very unlikely. I was able to prove this race exists after changing the delay to 10s locally and crashing ath10k via debugfs immediately after GTK rekeying. In case of ath10k the counter went below 0. This was harmless but other drivers which actually require tailroom (e.g. for WEP ICV or MMIC) could end up with the counter at 0 instead of >0 and introduce insufficient skb tailroom failures because mac80211 would not resize skbs appropriately anymore. Fixes: 8d1f7ecd ("mac80211: defer tailroom counter manipulation when roaming") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Dan Carpenter authored
commit c8fd51dc upstream. These defines are used like this: if (test_bit(I2C_HID_STARTED, &ihid->flags)) The intent was to use bits 0, 1, and 2 but because of the extra shifts we're using bits 1, 2, and 4. It's harmless becuase it's done consistently but it's not the intent and static checkers will complain. Fixes: 4a200c3b ('HID: i2c-hid: introduce HID over i2c specification implementation') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Konstantin Khlebnikov authored
commit c8fff7bc upstream. Node 0 might be offline as well as any other numa node, in this case kernel cannot handle memory allocation and crashes. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: 0c3f061c ("of: implement of_node_to_nid as a weak function") Signed-off-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Jesper Dangaard Brouer authored
commit d079abd1 upstream. Too many spaces were introduced in commit 63adc6fb ("pktgen: cleanup checkpatch warnings"), thus misaligning "src_min:" to other columns. Fixes: 63adc6fb ("pktgen: cleanup checkpatch warnings") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Vasily Averin authored
commit d194e5d6 upstream. The final version of commit 637241a9 ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are processed incorrectly: - open of /dev/kmsg checks syslog access permissions by using check_syslog_permissions() where security_syslog() is not called if dmesg_restrict is set. - syslog syscall and /proc/kmsg calls do_syslog() where security_syslog can be executed twice (inside check_syslog_permissions() and then directly in do_syslog()) With this patch security_syslog() is called once only in all syslog-related operations regardless of dmesg_restrict value. Fixes: 637241a9 ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Chuck Lever authored
commit d683cc49 upstream. When encoding the NFSACL SETACL operation, reserve just the estimated size of the ACL rather than a fixed maximum. This eliminates needless zero padding on the wire that the server ignores. Fixes: ee5dc773 ('NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!"') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Dan Carpenter authored
commit e3958e9d upstream. These are used like: set_bit(WORK_LINK_UP, &priv->work_pending); The problem is that set_bit() takes the actual bit number and not a mask so static checkers get upset. It doesn't affect run time because we do it consistently, but we may as well clean it up. Fixes: 6010ce07 ('rndis_wlan: do link-down state change in worker thread') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Uwe Kleine-König authored
commit e5babdf9 upstream. Since commit bd31b859 (which is in 3.2-rc1) nw_gpio_lock is a raw spinlock that needs usage of the corresponding raw functions. This fixes: drivers/mtd/maps/dc21285.c: In function 'nw_en_write': drivers/mtd/maps/dc21285.c:41:340: warning: passing argument 1 of 'spinlock_check' from incompatible pointer type spin_lock_irqsave(&nw_gpio_lock, flags); In file included from include/linux/seqlock.h:35:0, from include/linux/time.h:5, from include/linux/stat.h:18, from include/linux/module.h:10, from drivers/mtd/maps/dc21285.c:8: include/linux/spinlock.h:299:102: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *' static inline raw_spinlock_t *spinlock_check(spinlock_t *lock) ^ drivers/mtd/maps/dc21285.c:43:25: warning: passing argument 1 of 'spin_unlock_irqrestore' from incompatible pointer type spin_unlock_irqrestore(&nw_gpio_lock, flags); ^ In file included from include/linux/seqlock.h:35:0, from include/linux/time.h:5, from include/linux/stat.h:18, from include/linux/module.h:10, from drivers/mtd/maps/dc21285.c:8: include/linux/spinlock.h:370:91: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *' static inline void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags) Fixes: bd31b859 ("locking, ARM: Annotate low level hw locks as raw") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Arnd Bergmann authored
commit ffb6e0c9 upstream. The platform_sysrq_reset_seq code was intended as a way for an embedded platform to provide its own sysrq sequence at compile time. After over two years, nobody has started using it in an upstream kernel, and the platforms that were interested in it have moved on to devicetree, which can be used to configure the sequence without requiring kernel changes. The method is also incompatible with the way that most architectures build support for multiple platforms into a single kernel. Now the code is producing warnings when built with gcc-5.1: drivers/tty/sysrq.c: In function 'sysrq_init': drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds] key = platform_sysrq_reset_seq[i]; We could fix this, but it seems unlikely that it will ever be used, so let's just remove the code instead. We still have the option to pass the sequence either in DT, using the kernel command line, or using the /sys/module/sysrq/parameters/reset_seq file. Fixes: 154b7a48 ("Input: sysrq - allow specifying alternate reset sequence") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Ding Wang authored
commit 29535f7b upstream. The current handler of MMC_BLK_CMD_ERR in mmc_blk_issue_rw_rq function may cause new coming request permanent missing when the ongoing request (previoulsy started) complete end. The problem scenario is as follows: (1) Request A is ongoing; (2) Request B arrived, and finally mmc_blk_issue_rw_rq() is called; (3) Request A encounters the MMC_BLK_CMD_ERR error; (4) In the error handling of MMC_BLK_CMD_ERR, suppose mmc_blk_cmd_err() end request A completed and return zero. Continue the error handling, suppose mmc_blk_reset() reset device success; (5) Continue the execution, while loop completed because variable ret is zero now; (6) Finally, mmc_blk_issue_rw_rq() return without processing request B. The process related to the missing request may wait that IO request complete forever, possibly crashing the application or hanging the system. Fix this issue by starting new request when reset success. Signed-off-by: Ding Wang <justin.wang@spreadtrum.com> Fixes: 67716327 ("mmc: block: add eMMC hardware reset support") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Julian Anastasov authored
commit 2c51a97f upstream. The lockless lookups can return entry that is unlinked. Sometimes they get reference before last neigh_cleanup_and_release, sometimes they do not need reference. Later, any modification attempts may result in the following problems: 1. entry is not destroyed immediately because neigh_update can start the timer for dead entry, eg. on change to NUD_REACHABLE state. As result, entry lives for some time but is invisible and out of control. 2. __neigh_event_send can run in parallel with neigh_destroy while refcnt=0 but if timer is started and expired refcnt can reach 0 for second time leading to second neigh_destroy and possible crash. Thanks to Eric Dumazet and Ying Xue for their work and analyze on the __neigh_event_send change. Fixes: 767e97e1 ("neigh: RCU conversion of struct neighbour") Fixes: a263b309 ("ipv4: Make neigh lookups directly in output packet path.") Fixes: 6fd6ce20 ("ipv6: Do not depend on rt->n in ip6_finish_output2().") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [ kamal: backport to 3.13-stable: no __neigh_set_probe_once() ] Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Dan Carpenter authored
commit 474ff0ae upstream. My static checker complains that: sound/soc/fsl/imx-wm8962.c:196 imx_wm8962_probe() warn: we tested 'ret' before and it was 'false' The intent was that we use "ret" to check imx_audmux_v2_configure_port(). Fixes: 8de2ae2a ('ASoC: fsl: add imx-wm8962 machine driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Otherwise, Acked-by: Nicolin Chen <nicoleotsuka@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Uwe Kleine-König authored
commit 530c11d4 upstream. The omap watchdog has the annoying behaviour that writes to most registers don't have any effect when the watchdog is already running. Quoting the AM335x reference manual: To modify the timer counter value (the WDT_WCRR register), prescaler ratio (the WDT_WCLR[4:2] PTV bit field), delay configuration value (the WDT_WDLY[31:0] DLY_VALUE bit field), or the load value (the WDT_WLDR[31:0] TIMER_LOAD bit field), the watchdog timer must be disabled by using the start/stop sequence (the WDT_WSPR register). Currently the timer is stopped in the .probe callback but still there are possibilities that yield to a situation where omap_wdt_start is entered with the timer running (e.g. when /dev/watchdog is closed without stopping and then reopened). In such a case programming the timeout silently fails! To circumvent this stop the timer before reprogramming. Assuming one of the first things the watchdog user does is setting the timeout explicitly nothing too bad should happen because this explicit setting works fine. Fixes: 7768a13c ("[PATCH] OMAP: Add Watchdog driver support") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Naoya Horiguchi authored
commit 641844f5 upstream. Currently the initial value of order in dissolve_free_huge_page is 64 or 32, which leads to the following warning in static checker: mm/hugetlb.c:1203 dissolve_free_huge_pages() warn: potential right shift more than type allows '9,18,64' This is a potential risk of infinite loop, because 1 << order (== 0) is used in for-loop like this: for (pfn =3D start_pfn; pfn < end_pfn; pfn +=3D 1 << order) ... So this patch fixes it by using global minimum_order calculated at boot time. text data bss dec hex filename 28313 469 84236 113018 1b97a mm/hugetlb.o 28256 473 84236 112965 1b945 mm/hugetlb.o (patched) Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Li Zhong authored
commit d0177639 upstream. It is possible for some platforms, such as powerpc to set HPAGE_SHIFT to 0 to indicate huge pages not supported. When this is the case, hugetlbfs could be disabled during boot time: hugetlbfs: disabling because there are no supported hugepage sizes Then in dissolve_free_huge_pages(), order is kept maximum (64 for 64bits), and the for loop below won't end: for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << order) As suggested by Naoya, below fix checks hugepages_supported() before calling dissolve_free_huge_pages(). [rientjes@google.com: no legitimate reason to call dissolve_free_huge_pages() when !hugepages_supported()] Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ kamal: 3.13-stable prereq for 641844f5 mm/hugetlb: introduce minimum hugepage order ] Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Nishanth Aravamudan authored
commit 457c1b27 upstream. Currently, I am seeing the following when I `mount -t hugetlbfs /none /dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`. I think it's related to the fact that hugetlbfs is properly not correctly setting itself up in this state?: Unable to handle kernel paging request for data at address 0x00000031 Faulting instruction address: 0xc000000000245710 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=2048 NUMA pSeries .... In KVM guests on Power, in a guest not backed by hugepages, we see the following: AnonHugePages: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 64 kB HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages are not supported at boot-time, but this is only checked in hugetlb_init(). Extract the check to a helper function, and use it in a few relevant places. This does make hugetlbfs not supported (not registered at all) in this environment. I believe this is fine, as there are no valid hugepages and that won't change at runtime. [akpm@linux-foundation.org: use pr_info(), per Mel] [akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined] Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ kamal: 3.13-stable prereq for 641844f5 mm/hugetlb: introduce minimum hugepage order ] Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Michal Kazior authored
commit 6cbfb1bb upstream. It was possible for mac80211 to be coerced into an unexpected flow causing sdata union to become corrupted. Station pointer was put into sdata->u.vlan.sta memory location while it was really master AP's sdata->u.ap.next_beacon. This led to station entry being later freed as next_beacon before __sta_info_flush() in ieee80211_stop_ap() and a subsequent invalid pointer dereference crash. The problem was that ieee80211_ptr->use_4addr wasn't cleared on interface type changes. This could be reproduced with the following steps: # host A and host B have just booted; no # wpa_s/hostapd running; all vifs are down host A> iw wlan0 set type station host A> iw wlan0 set 4addr on host A> printf 'interface=wlan0\nssid=4addrcrash\nchannel=1\nwds_sta=1' > /tmp/hconf host A> hostapd -B /tmp/conf host B> iw wlan0 set 4addr on host B> ifconfig wlan0 up host B> iw wlan0 connect -w hostAssid host A> pkill hostapd # host A crashed: [ 127.928192] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c8 [ 127.929014] IP: [<ffffffff816f4f32>] __sta_info_flush+0xac/0x158 ... [ 127.934578] [<ffffffff8170789e>] ieee80211_stop_ap+0x139/0x26c [ 127.934578] [<ffffffff8100498f>] ? dump_trace+0x279/0x28a [ 127.934578] [<ffffffff816dc661>] __cfg80211_stop_ap+0x84/0x191 [ 127.934578] [<ffffffff816dc7ad>] cfg80211_stop_ap+0x3f/0x58 [ 127.934578] [<ffffffff816c5ad6>] nl80211_stop_ap+0x1b/0x1d [ 127.934578] [<ffffffff815e53f8>] genl_family_rcv_msg+0x259/0x2b5 Note: This isn't a revert of f8cdddb8 ("cfg80211: check iface combinations only when iface is running") as far as functionality is considered because b6a55015 ("cfg80211/mac80211: move more combination checks to mac80211") moved the logic somewhere else already. Fixes: f8cdddb8 ("cfg80211: check iface combinations only when iface is running") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Satish Ashok authored
commit 754bc547 upstream. When a port goes through a link down/up the multicast router configuration is not restored. Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 0909e117 ("bridge: Add multicast_router sysfs entries") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Dan Carpenter authored
commit 83ed07c5 upstream. Static checkers complain that the current condition is never true. It seems pretty likely that it's a typo and "URB" was intended instead of "USB". Fixes: 3d97ff63 ('usbdevfs: Use scatter-gather lists for large bulk transfers') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Nathan Fontenot authored
commit 2222ce0f upstream. Failure return from dlpar_configure_connector when dlpar adding cpus results in leaking references to the cpus parent device node. Move the call to of_node_put() prior to checking the result of dlpar_configure_connector. Fixes: 8d5ff320 ("powerpc/pseries: Make dlpar_configure_connector parent node aware") Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-