1. 06 Aug, 2015 12 commits
    • iio: inv-mpu: Specify the expected format/precision for write channels · 6b1380d6
      Adriana Reus authored
      commit 6a3c45bb upstream.
      
      The gyroscope needs IIO_VAL_INT_PLUS_NANO for the scale channel; unless
      a format is specified, writes are parsed as IIO_VAL_INT_PLUS_MICRO by
      default. The format needs to be specified explicitly so that write
      operations into scale behave as expected.
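
      A minimal sketch of the kind of callback involved (illustrative; only the
      IIO core symbols are real names, the function itself is a stand-in),
      hooked up via the .write_raw_get_fmt member of struct iio_info:

        static int example_write_raw_get_fmt(struct iio_dev *indio_dev,
                                             struct iio_chan_spec const *chan,
                                             long mask)
        {
                switch (mask) {
                case IIO_CHAN_INFO_SCALE:
                        /* gyro scale is expressed with nano precision */
                        return IIO_VAL_INT_PLUS_NANO;
                default:
                        return IIO_VAL_INT_PLUS_MICRO;
                }
        }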
      Signed-off-by: Adriana Reus <adriana.reus@intel.com>
      Signed-off-by: Jonathan Cameron <jic23@kernel.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      6b1380d6
    • freeing unlinked file indefinitely delayed · f34efdfe
      Al Viro authored
      commit 75a6f82a upstream.
      
      	Normally opening a file, unlinking it and then closing will have
      the inode freed upon close() (provided that it's not otherwise busy and
      has no remaining links, of course).  However, there's one case where that
      does *not* happen.  Namely, if you open it by fhandle with cold dcache,
      then unlink() and close().
      
      	In the normal case, d_delete() in unlink(2) notices that the dentry
      is busy and unhashes it; on the final dput() it will be forcibly evicted
      from the dcache, triggering iput() and inode removal.  In this case,
      though, we end up with *two* dentries - a disconnected one (created by
      open-by-fhandle) and a regular one (used by unlink()).  The latter will
      have its reference to the inode dropped just fine, but the former will
      not - it's considered hashed (it is on the ->s_anon list), so it will
      stay around until memory pressure finally does it in.  As a result, the
      final iput() is delayed indefinitely.  It's trivial to reproduce -
      
      #define _GNU_SOURCE
      #include <stdlib.h>
      #include <unistd.h>
      #include <fcntl.h>
      #include <sys/stat.h>

      void flush_dcache(void)
      {
              system("mount -o remount,rw /");
      }
      
      static char buf[20 * 1024 * 1024];
      
      int main(void)
      {
              int fd;
              union {
                      struct file_handle f;
                      char buf[MAX_HANDLE_SZ];
              } x;
              int m;
      
              x.f.handle_bytes = sizeof(x);
              chdir("/root");
              mkdir("foo", 0700);
              fd = open("foo/bar", O_CREAT | O_RDWR, 0600);
              close(fd);
              name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0);
              flush_dcache();
              fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR);
              unlink("foo/bar");
              write(fd, buf, sizeof(buf));
              system("df .");			/* 20Mb eaten */
              close(fd);
              system("df .");			/* should've freed those 20Mb */
              flush_dcache();
              system("df .");			/* should be the same as #2 */
      }
      
      will spit out something like
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 283282     21692  93% /
      - the inode gets freed only when the dentry is finally evicted (here we
      trigger that by remount; normally it would have happened in response to
      memory pressure, hell knows when).
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      [ kamal: backport to 3.19-stable: no fast_dput() ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      f34efdfe
    • 9p: don't leave a half-initialized inode sitting around · 72069d8f
      Al Viro authored
      commit 0a73d0a2 upstream.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      72069d8f
    • hpfs: kstrdup() out of memory handling · be901bdf
      Sanidhya Kashyap authored
      commit ce657611 upstream.
      
      kstrdup() may return NULL for new_opts under memory pressure, so return
      -ENOMEM in that case.
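
      A minimal sketch of the pattern (illustrative function, not the actual
      hpfs remount code):

        static int example_remount(struct super_block *sb, char *data)
        {
                char *new_opts = kstrdup(data, GFP_KERNEL);

                if (!new_opts)
                        return -ENOMEM; /* allocation failed, bail out early */

                /* ... parse options; free new_opts on every exit path ... */
                kfree(new_opts);
                return 0;
        }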
      Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
      Signed-off-by: Mikulas Patocka <mikulas@twibright.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      be901bdf
    • ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage · 3e47907f
      Rafael J. Wysocki authored
      commit 0294112e upstream.
      
      This effectively reverts the following three commits:
      
       7bc10388 ACPI / resources: free memory on error in add_region_before()
       0f1b414d ACPI / PNP: Avoid conflicting resource reservations
       b9a5e5e1 ACPI / init: Fix the ordering of acpi_reserve_resources()
      
      (commit b9a5e5e1 introduced regressions some of which, but not
      all, were addressed by commit 0f1b414d and commit 7bc10388
      was a fixup on top of the latter) and causes ACPI fixed hardware
      resources to be reserved at the fs_initcall_sync stage of system
      initialization.
      
      The story is as follows.  First, a boot regression was reported due
      to an apparent resource reservation ordering change after a commit
      that shouldn't lead to such changes.  Investigation led to the
      conclusion that the problem happened because acpi_reserve_resources()
      was executed at the device_initcall() stage of system initialization
      which wasn't strictly ordered with respect to driver initialization
      (and with respect to the initialization of the pcieport driver in
      particular), so a random change causing the device initcalls to be
      run in a different order might break things.
      
      The response to that was to attempt to run acpi_reserve_resources()
      as soon as we knew that ACPI would be in use (commit b9a5e5e1).
      However, that turned out to be too early, because it caused resource
      reservations made by the PNP system driver to fail on at least one
      system and that failure was addressed by commit 0f1b414d.
      
      That fix still turned out to be insufficient, though, because
      calling acpi_reserve_resources() before the fs_initcall stage of
      system initialization caused a boot regression to happen on the
      eCAFE EC-800-H20G/S netbook.  That meant that we only could call
      acpi_reserve_resources() at the fs_initcall initialization stage
      or later, but then we might just as well call it after the PNP
      initialization, in which case commit 0f1b414d wouldn't be
      necessary any more.
      
      For this reason, the changes made by commit 0f1b414d are reverted
      (along with a memory leak fixup on top of that commit), the changes
      made by commit b9a5e5e1 that went too far are reverted too and
      acpi_reserve_resources() is changed into fs_initcall_sync, which
      will cause it to be executed after the PNP subsystem initialization
      (which is an fs_initcall) and before device initcalls (including
      the pcieport driver initialization) which should avoid the initial
      issue.
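
      A minimal sketch of the ordering mechanism (illustrative function name;
      only the initcall macros are real): initcall levels run in a fixed order,
      so an fs_initcall_sync runs after every plain fs_initcall (such as the
      PNP system driver's) and before every device_initcall (such as the
      pcieport driver's).

        static int __init example_reserve_resources(void)
        {
                /* reserve the ACPI fixed hardware resources here */
                return 0;
        }
        fs_initcall_sync(example_reserve_resources);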
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581
      Link: http://marc.info/?t=143092384600002&r=1&w=2
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
      Link: http://marc.info/?t=143389402600001&r=1&w=2
      Fixes: b9a5e5e1 "ACPI / init: Fix the ordering of acpi_reserve_resources()"
      Reported-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      3e47907f
    • ext4: replace open coded nofail allocation in ext4_free_blocks() · 61bd00c1
      Michal Hocko authored
      commit 7444a072 upstream.
      
      ext4_free_blocks() loops around the allocation request, mimicking
      __GFP_NOFAIL behavior without any allocation fallback strategy. Let's
      remove the open-coded loop and replace it with __GFP_NOFAIL. Without the
      flag the allocator has no way to learn about the never-fail requirement
      and cannot help in any way.
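
      A minimal sketch of the before/after pattern (illustrative; the cache
      name is assumed, not copied from the patch):

        /* before: open-coded never-fail loop */
        do {
                entry = kmem_cache_alloc(ext4_free_data_cachep, GFP_NOFS);
        } while (!entry);

        /* after: tell the allocator the request must not fail */
        entry = kmem_cache_alloc(ext4_free_data_cachep,
                                 GFP_NOFS | __GFP_NOFAIL);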
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      61bd00c1
    • ext4: correctly migrate a file with a hole at the beginning · 4e792b2a
      Eryu Guan authored
      commit 8974fec7 upstream.
      
      Currently ext4_ind_migrate() doesn't correctly handle a file which
      contains a hole at the beginning of the file.  This causes the migration
      to be done incorrectly, and if there is a subsequent delayed allocation
      write to the "hole", the same data blocks are reclaimed again, resulting
      in fs corruption.
      
        # assuming 4k block size ext4, with delalloc enabled
        # skip the first block and write to the second block
        xfs_io -fc "pwrite 4k 4k" -c "fsync" /mnt/ext4/testfile
      
        # converting to indirect-mapped file, which would move the data blocks
        # to the beginning of the file, but extent status cache still marks
        # that region as a hole
        chattr -e /mnt/ext4/testfile
      
        # delayed allocation writes to the "hole", reclaim the same data block
        # again, results in i_blocks corruption
        xfs_io -c "pwrite 0 4k" /mnt/ext4/testfile
        umount /mnt/ext4
        e2fsck -nf /dev/sda6
        ...
        Inode 53, i_blocks is 16, should be 8.  Fix? no
        ...
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      4e792b2a
    • ext4: be more strict when migrating to non-extent based file · 261c583a
      Eryu Guan authored
      commit d6f123a9 upstream.
      
      Currently the check in ext4_ind_migrate() is not enough before doing the
      real conversion:
      
      a) delayed allocated extents could bypass the check on eh->eh_entries
         and eh->eh_depth
      
      This can be demonstrated by this script
      
        xfs_io -fc "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/ext4/testfile
        chattr -e /mnt/ext4/testfile
      
      where testfile has two extents but is still converted to the non-extent
      based file format.
      
      b) only the extent length is checked, not the offset, which would result
         in data loss (delalloc) or fs corruption (nodelalloc), because a
         non-extent based file only supports at most (12 + 2^10 + 2^20 + 2^30)
         blocks
      
      This can be demonstrated by
      
        xfs_io -fc "pwrite 5T 4k" /mnt/ext4/testfile
        chattr -e /mnt/ext4/testfile
        sync
      
      If delalloc is enabled, dmesg prints
        EXT4-fs warning (device dm-4): ext4_block_to_path:105: block 1342177280 > max in inode 53
        EXT4-fs (dm-4): Delayed block allocation failed for inode 53 at logical offset 1342177280 with max blocks 1 with error 5
        EXT4-fs (dm-4): This should not happen!! Data will be lost
      
      If delalloc is disabled, e2fsck -nf shows corruption
        Inode 53, i_size is 5497558142976, should be 4096.  Fix? no
      
      Fix the two issues by
      
      a) forcing all delayed allocation blocks to be allocated before checking
         eh->eh_depth and eh->eh_entries
      b) ensuring the last logical block of the extent is within the range
         addressable by a non-extent based file
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      261c583a
    • ext4: fix reservation release on invalidatepage for delalloc fs · e726980a
      Lukas Czerner authored
      commit 9705acd6 upstream.
      
      On a delalloc-enabled file system, in the invalidatepage operation,
      ext4_da_page_release_reservation() wants to clear the delayed buffer and
      remove the extent covering the delayed buffer from the extent status
      tree.
      
      However, there is currently a bug where, on systems with page size >
      block size, we always remove extents from the start of the page
      regardless of where the actual delayed buffers are positioned in the
      page.  This leads to errors like this:
      
      EXT4-fs warning (device loop0): ext4_da_release_space:1225:
      ext4_da_release_space: ino 13, to_free 1 with only 0 reserved data
      blocks
      
      This can cause data loss at writeback time if the file system is in an
      ENOSPC condition, because we're releasing the reservation for someone
      else's delayed buffer.
      
      Fix this by only removing extents that correspond to the part of the
      page we want to invalidate.
      
      This problem is reproducible with the following fio recipe (however, I
      was only able to reproduce it with fio-2.1 or older).
      
      [global]
      bs=8k
      iodepth=1024
      iodepth_batch=60
      randrepeat=1
      size=1m
      directory=/mnt/test
      numjobs=20
      [job1]
      ioengine=sync
      bs=1k
      direct=1
      rw=randread
      filename=file1:file2
      [job2]
      ioengine=libaio
      rw=randwrite
      direct=1
      filename=file1:file2
      [job3]
      bs=1k
      ioengine=posixaio
      rw=randwrite
      direct=1
      filename=file1:file2
      [job5]
      bs=1k
      ioengine=sync
      rw=randread
      filename=file1:file2
      [job7]
      ioengine=libaio
      rw=randwrite
      filename=file1:file2
      [job8]
      ioengine=posixaio
      rw=randwrite
      filename=file1:file2
      [job10]
      ioengine=mmap
      rw=randwrite
      bs=1k
      filename=file1:file2
      [job11]
      ioengine=mmap
      rw=randwrite
      direct=1
      filename=file1:file2
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      e726980a
    • Btrfs: fix fsync data loss after append write · f5880941
      Filipe Manana authored
      commit e4545de5 upstream.
      
      If we do an append write to a file (which increases its inode's i_size)
      that does not have the flag BTRFS_INODE_NEEDS_FULL_SYNC set in its inode,
      and the previous transaction added a new hard link to the file, which sets
      the flag BTRFS_INODE_COPY_EVERYTHING in the file's inode, and then fsync
      the file, the inode's new i_size isn't logged. This has the consequence
      that after the fsync log is replayed, the file size remains what it was
      before the append write operation, which means users/applications will
      not be able to read the data that was successfully fsync'ed before.
      
      This happens because neither the inode item nor the delayed inode get
      their i_size updated when the append write is made - doing so would
      require starting a transaction in the buffered write path, something that
      we do not do intentionally for performance reasons.
      
      Fix this by making sure that when the flag BTRFS_INODE_COPY_EVERYTHING is
      set the inode is logged with its current i_size (log the in-memory inode
      into the log tree).
      
      This issue is not a recent regression and is easy to reproduce with the
      following test case for fstests:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        here=`pwd`
        tmp=/tmp/$$
        status=1	# failure is the default!
      
        _cleanup()
        {
                _cleanup_flakey
                rm -f $tmp.*
        }
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
        . ./common/dmflakey
      
        # real QA test starts here
        _supported_fs generic
        _supported_os Linux
        _need_to_be_root
        _require_scratch
        _require_dm_flakey
        _require_metadata_journaling $SCRATCH_DEV
      
        _crash_and_mount()
        {
                # Simulate a crash/power loss.
                _load_flakey_table $FLAKEY_DROP_WRITES
                _unmount_flakey
                # Allow writes again and mount. This makes the fs replay its fsync log.
                _load_flakey_table $FLAKEY_ALLOW_WRITES
                _mount_flakey
        }
      
        rm -f $seqres.full
      
        _scratch_mkfs >> $seqres.full 2>&1
        _init_flakey
        _mount_flakey
      
        # Create the test file with some initial data and then fsync it.
        # The fsync here is only needed to trigger the issue in btrfs, as it causes
        # the flag BTRFS_INODE_NEEDS_FULL_SYNC to be removed from the btrfs inode.
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 32k" \
                        -c "fsync" \
                        $SCRATCH_MNT/foo | _filter_xfs_io
        sync
      
        # Add a hard link to our file.
        # On btrfs this sets the flag BTRFS_INODE_COPY_EVERYTHING on the btrfs inode,
        # which is a necessary condition to trigger the issue.
        ln $SCRATCH_MNT/foo $SCRATCH_MNT/bar
      
        # Sync the filesystem to force a commit of the current btrfs transaction, this
        # is a necessary condition to trigger the bug on btrfs.
        sync
      
        # Now append more data to our file, increasing its size, and fsync the file.
        # In btrfs because the inode flag BTRFS_INODE_COPY_EVERYTHING was set and the
        # write path did not update the inode item in the btree nor the delayed inode
        # item (in-memory structure) in the current transaction (created by the fsync
        # handler), the fsync did not record the inode's new i_size in the fsync
        # log/journal. This made the data unavailable after the fsync log/journal is
        # replayed.
        $XFS_IO_PROG -c "pwrite -S 0xbb 32K 32K" \
                     -c "fsync" \
                     $SCRATCH_MNT/foo | _filter_xfs_io
      
        echo "File content after fsync and before crash:"
        od -t x1 $SCRATCH_MNT/foo
      
        _crash_and_mount
      
        echo "File content after crash and log replay:"
        od -t x1 $SCRATCH_MNT/foo
      
        status=0
        exit
      
      The expected file output, both before and after the crash/power failure,
      shows the appended data available, which is:
      
        0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
        *
        0100000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
        *
        0200000
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      f5880941
    • Btrfs: fix race between caching kthread and returning inode to inode cache · 13321418
      Filipe Manana authored
      commit ae9d8f17 upstream.
      
      While the inode cache caching kthread is calling btrfs_unpin_free_ino(),
      we could have a concurrent call to btrfs_return_ino() that adds a new
      entry to the root's free space cache of pinned inodes. This concurrent
      call does not acquire the fs_info->commit_root_sem before adding a new
      entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem
      because the caching kthread calls btrfs_unpin_free_ino() after setting
      the caching state to BTRFS_CACHE_FINISHED and therefore races with
      the task calling btrfs_return_ino(), which is adding a new entry, while
      the former (caching kthread) is navigating the cache's rbtree, removing
      and freeing nodes from the cache's rbtree without acquiring the spinlock
      that protects the rbtree.
      
      This race resulted in memory corruption due to a double free of struct
      btrfs_free_space objects, because both tasks can end up freeing the
      same objects. Note that adding a new entry can result in merging it with
      other entries in the cache, in which case those entries are freed.
      This is particularly important as btrfs_free_space structures are also
      used for the block group free space caches.
      
      This memory corruption can be detected by a debugging kernel, which
      reports it with the following trace:
      
      [132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': double free detected
      [132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G        W       4.1.0-rc5-btrfs-next-10+ #1
      [132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
      [132408.505075]  ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 ffffffff81095dce
      [132408.505075]  ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e ffff880163d73d68
      [132408.505075]  ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 ffffffffa0505b5f
      [132408.505075] Call Trace:
      [132408.505075]  [<ffffffff8145eec7>] dump_stack+0x4f/0x7b
      [132408.505075]  [<ffffffff81095dce>] ? console_unlock+0x356/0x3a2
      [132408.505075]  [<ffffffff81154e1e>] __slab_error.isra.28+0x25/0x36
      [132408.505075]  [<ffffffff81155733>] __cache_free+0xe2/0x4b6
      [132408.505075]  [<ffffffffa054a95a>] ? __btrfs_add_free_space+0x2f0/0x343 [btrfs]
      [132408.505075]  [<ffffffffa0505b5f>] ? btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
      [132408.505075]  [<ffffffff810f3b30>] ? time_hardirqs_off+0x15/0x28
      [132408.505075]  [<ffffffff81084d42>] ? trace_hardirqs_off+0xd/0xf
      [132408.505075]  [<ffffffff811563a1>] ? kfree+0xb6/0x14e
      [132408.505075]  [<ffffffff811563d0>] kfree+0xe5/0x14e
      [132408.505075]  [<ffffffffa0505b5f>] btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
      [132408.505075]  [<ffffffffa0505e08>] caching_kthread+0x29e/0x2d9 [btrfs]
      [132408.505075]  [<ffffffffa0505b6a>] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs]
      [132408.505075]  [<ffffffff8106698f>] kthread+0xef/0xf7
      [132408.505075]  [<ffffffff810f3b08>] ? time_hardirqs_on+0x15/0x28
      [132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
      [132408.505075]  [<ffffffff814653d2>] ret_from_fork+0x42/0x70
      [132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
      [132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b.
      [132409.501654] slab: double free detected in cache 'btrfs_free_space', objp ffff880023e7d320
      [132409.503355] ------------[ cut here ]------------
      [132409.504241] kernel BUG at mm/slab.c:2571!
      
      Therefore fix this by having btrfs_unpin_free_ino() acquire the lock
      that protects the rbtree while doing the searches and removing entries.
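
      A minimal sketch of the idea behind the fix (illustrative fragment, not
      the actual function body): take the spinlock that protects the cache's
      rbtree before walking it and removing entries, so a concurrent
      btrfs_return_ino() cannot modify the tree underneath us.

        spin_lock(&ctl->tree_lock);
        while ((node = rb_first(&ctl->free_space_offset)) != NULL) {
                info = rb_entry(node, struct btrfs_free_space, offset_index);
                rb_erase(&info->offset_index, &ctl->free_space_offset);
                /* ... unpin the inode range covered by this entry ... */
                kmem_cache_free(btrfs_free_space_cachep, info);
        }
        spin_unlock(&ctl->tree_lock);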
      
      Fixes: 1c70d8fb ("Btrfs: fix inode caching vs tree log")
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      13321418
    • Btrfs: use kmem_cache_free when freeing entry in inode cache · 6c7f16ec
      Filipe Manana authored
      commit c3f4a168 upstream.
      
      The free space entries are allocated using kmem_cache_zalloc(),
      through __btrfs_add_free_space(), therefore we should use
      kmem_cache_free() and not kfree() to avoid any confusion and
      any potential problem. Looking at the kfree() definition at
      mm/slab.c it has the following comment:
      
        /*
         * (...)
         *
         * Don't free memory not originally allocated by kmalloc()
         * or you will run into trouble.
         */
      
      So better be safe and use kmem_cache_free().
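
      A minimal sketch of the pattern (illustrative; "info" stands for the
      free-space entry being released):

        /* wrong for a slab-cache allocation: */
        kfree(info);

        /* matches the kmem_cache_zalloc() that allocated it: */
        kmem_cache_free(btrfs_free_space_cachep, info);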
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Chris Mason <clm@fb.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      6c7f16ec
  2. 04 Aug, 2015 1 commit
  3. 30 Jul, 2015 9 commits
  4. 28 Jul, 2015 1 commit
  5. 23 Jul, 2015 17 commits
    • mac80211: prevent possible crypto tx tailroom corruption · aa9d4028
      Michal Kazior authored
      commit ab499db8 upstream.
      
      There was a possible race between
      ieee80211_reconfig() and
      ieee80211_delayed_tailroom_dec(). This could
      result in an inability to transmit data if the
      driver crashed during roaming or rekeying and
      subsequent skbs with insufficient tailroom
      appeared.
      
      This race was probably never seen in the wild
      because a device driver would have to crash AND
      recover within 0.5s which is very unlikely.
      
      I was able to prove this race exists after
      changing the delay to 10s locally and crashing
      ath10k via debugfs immediately after GTK
      rekeying. In case of ath10k the counter went below
      0. This was harmless but other drivers which
      actually require tailroom (e.g. for WEP ICV or
      MMIC) could end up with the counter at 0 instead
      of >0 and introduce insufficient skb tailroom
      failures because mac80211 would not resize skbs
      appropriately anymore.
      
      Fixes: 8d1f7ecd ("mac80211: defer tailroom counter manipulation when roaming")
      Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      aa9d4028
    • HID: i2c-hid: fix harmless test_bit() issue · ff0474b4
      Dan Carpenter authored
      commit c8fd51dc upstream.
      
      These defines are used like this:
      
      	if (test_bit(I2C_HID_STARTED, &ihid->flags))
      
      The intent was to use bits 0, 1, and 2 but because of the extra shifts
      we're using bits 1, 2, and 4.  It's harmless because it's done
      consistently but it's not the intent and static checkers will complain.
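
      A minimal sketch of the mismatch (the values are illustrative; only
      I2C_HID_STARTED is named above): test_bit()/set_bit() take a bit number,
      not a mask, so defining the flags as shifted masks silently selects
      different bits.

        /* before: masks passed where bit numbers are expected */
        #define I2C_HID_STARTED         (1 << 0)   /* test_bit() uses bit 1 */
        #define I2C_HID_RESET_PENDING   (1 << 1)   /* bit 2 */
        #define I2C_HID_READ_PENDING    (1 << 2)   /* bit 4 */

        /* after: plain bit numbers */
        #define I2C_HID_STARTED         0
        #define I2C_HID_RESET_PENDING   1
        #define I2C_HID_READ_PENDING    2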
      
      Fixes: 4a200c3b ('HID: i2c-hid: introduce HID over i2c specification implementation')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      ff0474b4
    • of: return NUMA_NO_NODE from fallback of_node_to_nid() · d82d7501
      Konstantin Khlebnikov authored
      commit c8fff7bc upstream.
      
      Node 0 might be offline, like any other NUMA node; in that case the
      kernel cannot handle memory allocation and crashes.
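
      A minimal sketch of the fallback (illustrative; architectures with a real
      NUMA mapping override this weak default):

        int __weak of_node_to_nid(struct device_node *np)
        {
                /* was 0, which may be an offline node; let MM pick one */
                return NUMA_NO_NODE;
        }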
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Fixes: 0c3f061c ("of: implement of_node_to_nid as a weak function")
      Signed-off-by: Grant Likely <grant.likely@linaro.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      d82d7501
    • pktgen: adjust spacing in proc file interface output · 4ac38fc9
      Jesper Dangaard Brouer authored
      commit d079abd1 upstream.
      
      Too many spaces were introduced in commit 63adc6fb ("pktgen: cleanup
      checkpatch warnings"), thus misaligning "src_min:" to other columns.
      
      Fixes: 63adc6fb ("pktgen: cleanup checkpatch warnings")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      4ac38fc9
    • security_syslog() should be called once only · ce458596
      Vasily Averin authored
      commit d194e5d6 upstream.
      
      The final version of commit 637241a9 ("kmsg: honor dmesg_restrict
      sysctl on /dev/kmsg") lost a few hooks; as a result, security_syslog() is
      handled incorrectly:
      
      - open of /dev/kmsg checks syslog access permissions by using
        check_syslog_permissions() where security_syslog() is not called if
        dmesg_restrict is set.
      
      - syslog syscall and /proc/kmsg calls do_syslog() where security_syslog
        can be executed twice (inside check_syslog_permissions() and then
        directly in do_syslog())
      
      With this patch security_syslog() is called once only in all
      syslog-related operations regardless of dmesg_restrict value.
      
      Fixes: 637241a9 ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg")
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Eric Paris <eparis@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      ce458596
    • NFS: Fix size of NFSACL SETACL operations · 5648c103
      Chuck Lever authored
      commit d683cc49 upstream.
      
      When encoding the NFSACL SETACL operation, reserve just the estimated
      size of the ACL rather than a fixed maximum. This eliminates needless
      zero padding on the wire that the server ignores.
      
      Fixes: ee5dc773 ('NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!"')
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      5648c103
    • rndis_wlan: harmless issue calling set_bit() · 9c24c7a2
      Dan Carpenter authored
      commit e3958e9d upstream.
      
      These are used like:
      
      	set_bit(WORK_LINK_UP, &priv->work_pending);
      
      The problem is that set_bit() takes the actual bit number and not a mask
      so static checkers get upset.  It doesn't affect run time because we do
      it consistently, but we may as well clean it up.
      
      Fixes: 6010ce07 ('rndis_wlan: do link-down state change in worker thread')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      9c24c7a2
    • mtd: dc21285: use raw spinlock functions for nw_gpio_lock · f15929a9
      Uwe Kleine-König authored
      commit e5babdf9 upstream.
      
      Since commit bd31b859 (which is in 3.2-rc1) nw_gpio_lock is a raw spinlock
      that needs usage of the corresponding raw functions.
      
      This fixes:
      
        drivers/mtd/maps/dc21285.c: In function 'nw_en_write':
        drivers/mtd/maps/dc21285.c:41:340: warning: passing argument 1 of 'spinlock_check' from incompatible pointer type
          spin_lock_irqsave(&nw_gpio_lock, flags);
      
        In file included from include/linux/seqlock.h:35:0,
                         from include/linux/time.h:5,
                         from include/linux/stat.h:18,
                         from include/linux/module.h:10,
                         from drivers/mtd/maps/dc21285.c:8:
        include/linux/spinlock.h:299:102: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *'
         static inline raw_spinlock_t *spinlock_check(spinlock_t *lock)
                                                                                                              ^
        drivers/mtd/maps/dc21285.c:43:25: warning: passing argument 1 of 'spin_unlock_irqrestore' from incompatible pointer type
          spin_unlock_irqrestore(&nw_gpio_lock, flags);
                                 ^
        In file included from include/linux/seqlock.h:35:0,
                         from include/linux/time.h:5,
                         from include/linux/stat.h:18,
                         from include/linux/module.h:10,
                         from drivers/mtd/maps/dc21285.c:8:
        include/linux/spinlock.h:370:91: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *'
         static inline void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
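
      A minimal sketch of the change (illustrative fragment): since
      nw_gpio_lock is a raw_spinlock_t, the raw locking variants must be used.

        unsigned long flags;

        raw_spin_lock_irqsave(&nw_gpio_lock, flags);
        /* ... poke the NetWinder GPIO/CPLD state ... */
        raw_spin_unlock_irqrestore(&nw_gpio_lock, flags);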
      
      Fixes: bd31b859 ("locking, ARM: Annotate low level hw locks as raw")
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Brian Norris <computersforpeace@gmail.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      f15929a9
    • tty: remove platform_sysrq_reset_seq · b909b574
      Arnd Bergmann authored
      commit ffb6e0c9 upstream.
      
      The platform_sysrq_reset_seq code was intended as a way for an embedded
      platform to provide its own sysrq sequence at compile time. After over two
      years, nobody has started using it in an upstream kernel, and the platforms
      that were interested in it have moved on to devicetree, which can be used
      to configure the sequence without requiring kernel changes. The method is
      also incompatible with the way that most architectures build support for
      multiple platforms into a single kernel.
      
      Now the code is producing warnings when built with gcc-5.1:
      
      drivers/tty/sysrq.c: In function 'sysrq_init':
      drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds]
         key = platform_sysrq_reset_seq[i];
      
      We could fix this, but it seems unlikely that it will ever be used, so
      let's just remove the code instead. We still have the option to pass the
      sequence either in DT, using the kernel command line, or using the
      /sys/module/sysrq/parameters/reset_seq file.
      
      Fixes: 154b7a48 ("Input: sysrq - allow specifying alternate reset sequence")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      b909b574
    • mmc: card: Fixup request missing in mmc_blk_issue_rw_rq · b21cbf58
      Ding Wang authored
      commit 29535f7b upstream.
      
      The current handling of MMC_BLK_CMD_ERR in the mmc_blk_issue_rw_rq
      function may cause a newly arrived request to be permanently missed when
      the ongoing (previously started) request completes.
      
      The problem scenario is as follows:
      (1) Request A is ongoing;
      (2) Request B arrived, and finally mmc_blk_issue_rw_rq() is called;
      (3) Request A encounters the MMC_BLK_CMD_ERR error;
      (4) In the error handling of MMC_BLK_CMD_ERR, suppose mmc_blk_cmd_err()
          ends request A as completed and returns zero. Continuing the error
          handling, suppose mmc_blk_reset() resets the device successfully;
      (5) Continuing the execution, the while loop terminates because variable
          ret is now zero;
      (6) Finally, mmc_blk_issue_rw_rq() returns without processing request B.
      
      The process that issued the missing request may wait forever for that IO
      to complete, possibly crashing the application or hanging the system.
      
      Fix this issue by starting the new request when the reset succeeds.
      Signed-off-by: Ding Wang <justin.wang@spreadtrum.com>
      Fixes: 67716327 ("mmc: block: add eMMC hardware reset support")
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      b21cbf58
    • neigh: do not modify unlinked entries · 8f95a0d1
      Julian Anastasov authored
      commit 2c51a97f upstream.
      
      The lockless lookups can return an entry that is already unlinked.
      Sometimes they get a reference before the last
      neigh_cleanup_and_release, sometimes they do not need a reference.
      Later, any modification attempts may result in the following problems:
      
      1. The entry is not destroyed immediately because neigh_update
      can start the timer for a dead entry, e.g. on a change to NUD_REACHABLE
      state. As a result, the entry lives for some time but is invisible
      and out of control.
      
      2. __neigh_event_send can run in parallel with neigh_destroy
      while refcnt=0, but if the timer is started and expires, refcnt can
      reach 0 a second time, leading to a second neigh_destroy and a
      possible crash.
      
      Thanks to Eric Dumazet and Ying Xue for their work and analysis
      on the __neigh_event_send change.
      
      Fixes: 767e97e1 ("neigh: RCU conversion of struct neighbour")
      Fixes: a263b309 ("ipv4: Make neigh lookups directly in output packet path.")
      Fixes: 6fd6ce20 ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [ kamal: backport to 3.13-stable: no __neigh_set_probe_once() ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      8f95a0d1
    • ASoC: imx-wm8962: Add a missing error check · 6aee2e89
      Dan Carpenter authored
      commit 474ff0ae upstream.
      
      My static checker complains that:
      
      	sound/soc/fsl/imx-wm8962.c:196 imx_wm8962_probe() warn:
      	we tested 'ret' before and it was 'false'
      
      The intent was that we use "ret" to check imx_audmux_v2_configure_port().
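
      A minimal sketch of the missing check (illustrative; the port/register
      arguments are stand-ins, not the driver's actual values):

        ret = imx_audmux_v2_configure_port(ext_port, ptcr, pdcr);
        if (ret) {
                dev_err(&pdev->dev, "audmux external port setup failed\n");
                return ret;
        }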
      
      Fixes: 8de2ae2a ('ASoC: fsl: add imx-wm8962 machine driver')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Otherwise, Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      6aee2e89
    • watchdog: omap: assert the counter being stopped before reprogramming · 10d08999
      Uwe Kleine-König authored
      commit 530c11d4 upstream.
      
      The omap watchdog has the annoying behaviour that writes to most
      registers don't have any effect when the watchdog is already running.
      Quoting the AM335x reference manual:
      
      	To modify the timer counter value (the WDT_WCRR register),
      	prescaler ratio (the WDT_WCLR[4:2] PTV bit field), delay
      	configuration value (the WDT_WDLY[31:0] DLY_VALUE bit field), or
      	the load value (the WDT_WLDR[31:0] TIMER_LOAD bit field), the
      	watchdog timer must be disabled by using the start/stop sequence
      	(the WDT_WSPR register).
      
      Currently the timer is stopped in the .probe callback, but there are
      still paths that lead to a situation where omap_wdt_start is entered
      with the timer running (e.g. when /dev/watchdog is closed without
      stopping and then reopened). In such a case programming the timeout
      silently fails!
      
      To circumvent this stop the timer before reprogramming.
      
      Assuming one of the first things the watchdog user does is setting the
      timeout explicitly nothing too bad should happen because this explicit
      setting works fine.
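
      A minimal sketch of the idea (the struct and helper names are stand-ins
      for the driver's own, not its actual code):

        static int example_wdt_start(struct omap_wdt_dev *wdev)
        {
                /* counter must be stopped before WCLR/WLDR/WCRR are written */
                omap_wdt_disable(wdev);   /* stop sequence via WDT_WSPR */

                omap_wdt_set_timer(wdev, wdev->timer_margin);
                omap_wdt_reload(wdev);

                omap_wdt_enable(wdev);    /* start sequence via WDT_WSPR */
                return 0;
        }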
      
      Fixes: 7768a13c ("[PATCH] OMAP: Add Watchdog driver support")
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      10d08999
    • mm/hugetlb: introduce minimum hugepage order · 198039dd
      Naoya Horiguchi authored
      commit 641844f5 upstream.
      
      Currently the initial value of order in dissolve_free_huge_page is 64 or
      32, which leads to the following warning in static checker:
      
        mm/hugetlb.c:1203 dissolve_free_huge_pages()
        warn: potential right shift more than type allows '9,18,64'
      
      This is a potential risk of an infinite loop, because 1 << order (which
      can evaluate to 0) is used in a for-loop like this:
      
        for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << order)
            ...
      
      So this patch fixes it by using the global minimum_order calculated at
      boot time.
      
          text    data     bss     dec     hex filename
         28313     469   84236  113018   1b97a mm/hugetlb.o
         28256     473   84236  112965   1b945 mm/hugetlb.o (patched)
      
      Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      198039dd
    • mm: fix potential infinite loop in dissolve_free_huge_pages() · 78c2699f
      Li Zhong authored
      commit d0177639 upstream.
      
      It is possible for some platforms, such as powerpc to set HPAGE_SHIFT to
      0 to indicate huge pages not supported.
      
      When this is the case, hugetlbfs could be disabled during boot time:
      hugetlbfs: disabling because there are no supported hugepage sizes
      
      Then in dissolve_free_huge_pages(), order is kept at its maximum (64 on
      64-bit), and the for loop below never ends: for (pfn = start_pfn; pfn <
      end_pfn; pfn += 1 << order)
      
      As suggested by Naoya, the fix below checks hugepages_supported() before
      calling dissolve_free_huge_pages().
      
      [rientjes@google.com: no legitimate reason to call dissolve_free_huge_pages() when !hugepages_supported()]
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [ kamal: 3.13-stable prereq for
        641844f5 mm/hugetlb: introduce minimum hugepage order ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      78c2699f
    • hugetlb: ensure hugepage access is denied if hugepages are not supported · 462e6bf1
      Nishanth Aravamudan authored
      commit 457c1b27 upstream.
      
      Currently, I am seeing the following when I `mount -t hugetlbfs /none
      /dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`.  I think it's
      related to the fact that hugetlbfs is not correctly setting itself up in
      this state:
      
        Unable to handle kernel paging request for data at address 0x00000031
        Faulting instruction address: 0xc000000000245710
        Oops: Kernel access of bad area, sig: 11 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        ....
      
      In KVM guests on Power, in a guest not backed by hugepages, we see the
      following:
      
        AnonHugePages:         0 kB
        HugePages_Total:       0
        HugePages_Free:        0
        HugePages_Rsvd:        0
        HugePages_Surp:        0
        Hugepagesize:         64 kB
      
      HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
      are not supported at boot-time, but this is only checked in
      hugetlb_init().  Extract the check to a helper function, and use it in a
      few relevant places.
      
      This does make hugetlbfs not supported (not registered at all) in this
      environment.  I believe this is fine, as there are no valid hugepages
      and that won't change at runtime.
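
      A minimal sketch of how the helper ends up being used (illustrative; the
      exact upstream definition and call sites may differ) - it boils down to
      "HPAGE_SHIFT != 0", and callers bail out early when it is false:

        if (!hugepages_supported()) {
                pr_info("hugetlbfs: disabling because there are no supported hugepage sizes\n");
                return -ENOTSUPP;
        }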
      
      [akpm@linux-foundation.org: use pr_info(), per Mel]
      [akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined]
      Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [ kamal: 3.13-stable prereq for
        641844f5 mm/hugetlb: introduce minimum hugepage order ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      462e6bf1
    • cfg80211: ignore netif running state when changing iftype · c0fdda83
      Michal Kazior authored
      commit 6cbfb1bb upstream.
      
      It was possible for mac80211 to be coerced into an
      unexpected flow causing the sdata union to become
      corrupted. A station pointer was put into the
      sdata->u.vlan.sta memory location while it was
      really the master AP's sdata->u.ap.next_beacon.
      This led to the station entry later being freed as
      next_beacon before __sta_info_flush() in
      ieee80211_stop_ap() and a subsequent invalid
      pointer dereference crash.
      
      The problem was that ieee80211_ptr->use_4addr
      wasn't cleared on interface type changes.
      
      This could be reproduced with the following steps:
      
       # host A and host B have just booted; no
       # wpa_s/hostapd running; all vifs are down
       host A> iw wlan0 set type station
       host A> iw wlan0 set 4addr on
       host A> printf 'interface=wlan0\nssid=4addrcrash\nchannel=1\nwds_sta=1' > /tmp/hconf
       host A> hostapd -B /tmp/hconf
       host B> iw wlan0 set 4addr on
       host B> ifconfig wlan0 up
       host B> iw wlan0 connect -w hostAssid
       host A> pkill hostapd
       # host A crashed:
      
       [  127.928192] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c8
       [  127.929014] IP: [<ffffffff816f4f32>] __sta_info_flush+0xac/0x158
       ...
       [  127.934578]  [<ffffffff8170789e>] ieee80211_stop_ap+0x139/0x26c
       [  127.934578]  [<ffffffff8100498f>] ? dump_trace+0x279/0x28a
       [  127.934578]  [<ffffffff816dc661>] __cfg80211_stop_ap+0x84/0x191
       [  127.934578]  [<ffffffff816dc7ad>] cfg80211_stop_ap+0x3f/0x58
       [  127.934578]  [<ffffffff816c5ad6>] nl80211_stop_ap+0x1b/0x1d
       [  127.934578]  [<ffffffff815e53f8>] genl_family_rcv_msg+0x259/0x2b5
      
      Note: This isn't a revert of f8cdddb8
      ("cfg80211: check iface combinations only when
      iface is running") as far as functionality is
      considered because b6a55015 ("cfg80211/mac80211:
      move more combination checks to mac80211") moved
      the logic somewhere else already.
      
      Fixes: f8cdddb8 ("cfg80211: check iface combinations only when iface is running")
      Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      c0fdda83