1. 22 Oct, 2016 40 commits
    • David Howells's avatar
      cachefiles: Fix attempt to read i_blocks after deleting file [ver #2] · 336f2e1e
      David Howells authored
      commit a818101d upstream.
      
      An NULL-pointer dereference happens in cachefiles_mark_object_inactive()
      when it tries to read i_blocks so that it can tell the cachefilesd daemon
      how much space it's making available.
      
      The problem is that cachefiles_drop_object() calls
      cachefiles_mark_object_inactive() after calling cachefiles_delete_object()
      because the object being marked active staves off attempts to (re-)use the
      file at that filename until after it has been deleted.  This means that
      d_inode is NULL by the time we come to try to access it.
      
      To fix the problem, have the caller of cachefiles_mark_object_inactive()
      supply the number of blocks freed up.
      
      Without this, the following oops may occur:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      IP: [<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles]
      ...
      CPU: 11 PID: 527 Comm: kworker/u64:4 Tainted: G          I    ------------   3.10.0-470.el7.x86_64 #1
      Hardware name: Hewlett-Packard HP Z600 Workstation/0B54h, BIOS 786G4 v03.19 03/11/2011
      Workqueue: fscache_object fscache_object_work_func [fscache]
      task: ffff880035edaf10 ti: ffff8800b77c0000 task.ti: ffff8800b77c0000
      RIP: 0010:[<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles]
      RSP: 0018:ffff8800b77c3d70  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8800bf6cc400 RCX: 0000000000000034
      RDX: 0000000000000000 RSI: ffff880090ffc710 RDI: ffff8800bf761ef8
      RBP: ffff8800b77c3d88 R08: 2000000000000000 R09: 0090ffc710000000
      R10: ff51005d2ff1c400 R11: 0000000000000000 R12: ffff880090ffc600
      R13: ffff8800bf6cc520 R14: ffff8800bf6cc400 R15: ffff8800bf6cc498
      FS:  0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000098 CR3: 00000000019ba000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       ffff880090ffc600 ffff8800bf6cc400 ffff8800867df140 ffff8800b77c3db0
       ffffffffa06c48cb ffff880090ffc600 ffff880090ffc180 ffff880090ffc658
       ffff8800b77c3df0 ffffffffa085d846 ffff8800a96b8150 ffff880090ffc600
      Call Trace:
       [<ffffffffa06c48cb>] cachefiles_drop_object+0x6b/0xf0 [cachefiles]
       [<ffffffffa085d846>] fscache_drop_object+0xd6/0x1e0 [fscache]
       [<ffffffffa085d615>] fscache_object_work_func+0xa5/0x200 [fscache]
       [<ffffffff810a605b>] process_one_work+0x17b/0x470
       [<ffffffff810a6e96>] worker_thread+0x126/0x410
       [<ffffffff810a6d70>] ? rescuer_thread+0x460/0x460
       [<ffffffff810ae64f>] kthread+0xcf/0xe0
       [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff81695418>] ret_from_fork+0x58/0x90
       [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140
      
      The oopsing code shows:
      
      	callq  0xffffffff810af6a0 <wake_up_bit>
      	mov    0xf8(%r12),%rax
      	mov    0x30(%rax),%rax
      	mov    0x98(%rax),%rax   <---- oops here
      	lock add %rax,0x130(%rbx)
      
      where this is:
      
      	d_backing_inode(object->dentry)->i_blocks
      
      Fixes: a5b3a80b (CacheFiles: Provide read-and-reset release counters for cachefilesd)
      Reported-by: default avatarJianhong Yin <jiyin@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Reviewed-by: default avatarSteve Dickson <steved@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      336f2e1e
    • Miklos Szeredi's avatar
      vfs: move permission checking into notify_change() for utimes(NULL) · 7bf99896
      Miklos Szeredi authored
      commit f2b20f6e upstream.
      
      This fixes a bug where the permission was not properly checked in
      overlayfs.  The testcase is ltp/utimensat01.
      
      It is also cleaner and safer to do the permission checking in the vfs
      helper instead of the caller.
      
      This patch introduces an additional ia_valid flag ATTR_TOUCH (since
      touch(1) is the most obvious user of utimes(NULL)) that is passed into
      notify_change whenever the conditions for this special permission checking
      mode are met.
      Reported-by: default avatarAihua Zhang <zhangaihua1@huawei.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Tested-by: default avatarAihua Zhang <zhangaihua1@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bf99896
    • Marcelo Ricardo Leitner's avatar
      dlm: free workqueues after the connections · 6b746940
      Marcelo Ricardo Leitner authored
      commit 3a8db798 upstream.
      
      After backporting commit ee44b4bc ("dlm: use sctp 1-to-1 API")
      series to a kernel with an older workqueue which didn't use RCU yet, it
      was noticed that we are freeing the workqueues in dlm_lowcomms_stop()
      too early as free_conn() will try to access that memory for canceling
      the queued works if any.
      
      This issue was introduced by commit 0d737a8c as before it such
      attempt to cancel the queued works wasn't performed, so the issue was
      not present.
      
      This patch fixes it by simply inverting the free order.
      
      Fixes: 0d737a8c ("dlm: fix race while closing connections")
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid Teigland <teigland@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b746940
    • Marcelo Cerri's avatar
      crypto: vmx - Fix memory corruption caused by p8_ghash · 3fae7862
      Marcelo Cerri authored
      commit 80da44c2 upstream.
      
      This patch changes the p8_ghash driver to use ghash-generic as a fixed
      fallback implementation. This allows the correct value of descsize to be
      defined directly in its shash_alg structure and avoids problems with
      incorrect buffer sizes when its state is exported or imported.
      Reported-by: default avatarJan Stancek <jstancek@redhat.com>
      Fixes: cc333cd6 ("crypto: vmx - Adding GHASH routines for VMX module")
      Signed-off-by: default avatarMarcelo Cerri <marcelo.cerri@canonical.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3fae7862
    • Marcelo Cerri's avatar
      crypto: ghash-generic - move common definitions to a new header file · e15e0b84
      Marcelo Cerri authored
      commit a397ba82 upstream.
      
      Move common values and types used by ghash-generic to a new header file
      so drivers can directly use ghash-generic as a fallback implementation.
      
      Fixes: cc333cd6 ("crypto: vmx - Adding GHASH routines for VMX module")
      Signed-off-by: default avatarMarcelo Cerri <marcelo.cerri@canonical.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e15e0b84
    • Jan Kara's avatar
      ext4: unmap metadata when zeroing blocks · fb13b62a
      Jan Kara authored
      commit 9b623df6 upstream.
      
      When zeroing blocks for DAX allocations, we also have to unmap aliases
      in the block device mappings.  Otherwise writeback can overwrite zeros
      with stale data from block device page cache.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb13b62a
    • gmail's avatar
      ext4: release bh in make_indexed_dir · ff50a724
      gmail authored
      commit e81d4477 upstream.
      
      The commit 6050d47a: "ext4: bail out from make_indexed_dir() on
      first error" could end up leaking bh2 in the error path.
      
      [ Also avoid renaming bh2 to bh, which just confuses things --tytso ]
      Signed-off-by: default avataryangsheng <yngsion@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ff50a724
    • Ross Zwisler's avatar
      ext4: allow DAX writeback for hole punch · 99fa4c50
      Ross Zwisler authored
      commit cca32b7e upstream.
      
      Currently when doing a DAX hole punch with ext4 we fail to do a writeback.
      This is because the logic around filemap_write_and_wait_range() in
      ext4_punch_hole() only looks for dirty page cache pages in the radix tree,
      not for dirty DAX exceptional entries.
      Signed-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      99fa4c50
    • Eric Biggers's avatar
      ext4: fix memory leak when symlink decryption fails · 5373f6cc
      Eric Biggers authored
      commit dcce7a46 upstream.
      
      This bug was introduced in v4.8-rc1.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5373f6cc
    • Fabian Frederick's avatar
      ext4: fix memory leak in ext4_insert_range() · 7a689387
      Fabian Frederick authored
      commit edf15aa1 upstream.
      
      Running xfstests generic/013 with kmemleak gives the following:
      
      unreferenced object 0xffff8801d3d27de0 (size 96):
        comm "fsstress", pid 4941, jiffies 4294860168 (age 53.485s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff818eaaf3>] kmemleak_alloc+0x23/0x40
          [<ffffffff81179805>] __kmalloc+0xf5/0x1d0
          [<ffffffff8122ef5c>] ext4_find_extent+0x1ec/0x2f0
          [<ffffffff8123530c>] ext4_insert_range+0x34c/0x4a0
          [<ffffffff81235942>] ext4_fallocate+0x4e2/0x8b0
          [<ffffffff81181334>] vfs_fallocate+0x134/0x210
          [<ffffffff8118203f>] SyS_fallocate+0x3f/0x60
          [<ffffffff818efa9b>] entry_SYSCALL_64_fastpath+0x13/0x8f
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Problem seems mitigated by dropping refs and freeing path
      when there's no path[depth].p_ext
      Signed-off-by: default avatarFabian Frederick <fabf@skynet.be>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7a689387
    • wangguang's avatar
      ext4: bugfix for mmaped pages in mpage_release_unused_pages() · 30ac674a
      wangguang authored
      commit 4e800c03 upstream.
      
      Pages clear buffers after ext4 delayed block allocation failed,
      However, it does not clean its pte_dirty flag.
      if the pages unmap ,in cording to the pte_dirty ,
      unmap_page_range may try to call __set_page_dirty,
      
      which may lead to the bugon at
      mpage_prepare_extent_to_map:head = page_buffers(page);.
      
      This patch just call clear_page_dirty_for_io to clean pte_dirty
      at mpage_release_unused_pages for pages mmaped.
      
      Steps to reproduce the bug:
      
      (1) mmap a file in ext4
      	addr = (char *)mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED,
      	       	            fd, 0);
      	memset(addr, 'i', 4096);
      
      (2) return EIO at
      
      	ext4_writepages->mpage_map_and_submit_extent->mpage_map_one_extent
      
      which causes this log message to be print:
      
                      ext4_msg(sb, KERN_CRIT,
                              "Delayed block allocation failed for "
                              "inode %lu at logical offset %llu with"
                              " max blocks %u with error %d",
                              inode->i_ino,
                              (unsigned long long)map->m_lblk,
                              (unsigned)map->m_len, -err);
      
      (3)Unmap the addr cause warning at
      
      	__set_page_dirty:WARN_ON_ONCE(warn && !PageUptodate(page));
      
      (4) wait for a minute,then bugon happen.
      Signed-off-by: default avatarwangguang <wangguang03@zte.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      30ac674a
    • Daeho Jeong's avatar
      ext4: reinforce check of i_dtime when clearing high fields of uid and gid · eac6c9e4
      Daeho Jeong authored
      commit 93e3b4e6 upstream.
      
      Now, ext4_do_update_inode() clears high 16-bit fields of uid/gid
      of deleted and evicted inode to fix up interoperability with old
      kernels. However, it checks only i_dtime of an inode to determine
      whether the inode was deleted and evicted, and this is very risky,
      because i_dtime can be used for the pointer maintaining orphan inode
      list, too. We need to further check whether the i_dtime is being
      used for the orphan inode list even if the i_dtime is not NULL.
      
      We found that high 16-bit fields of uid/gid of inode are unintentionally
      and permanently cleared when the inode truncation is just triggered,
      but not finished, and the inode metadata, whose high uid/gid bits are
      cleared, is written on disk, and the sudden power-off follows that
      in order.
      Signed-off-by: default avatarDaeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: default avatarHobin Woo <hobin.woo@samsung.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eac6c9e4
    • Eric Whitney's avatar
      ext4: enforce online defrag restriction for encrypted files · ddcd9969
      Eric Whitney authored
      commit 14fbd4aa upstream.
      
      Online defragging of encrypted files is not currently implemented.
      However, the move extent ioctl can still return successfully when
      called.  For example, this occurs when xfstest ext4/020 is run on an
      encrypted file system, resulting in a corrupted test file and a
      corresponding test failure.
      
      Until the proper functionality is implemented, fail the move extent
      ioctl if either the original or donor file is encrypted.
      Signed-off-by: default avatarEric Whitney <enwlinux@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ddcd9969
    • Jan Kara's avatar
      jbd2: fix lockdep annotation in add_transaction_credits() · 4cd2546a
      Jan Kara authored
      commit e03a9976 upstream.
      
      Thomas has reported a lockdep splat hitting in
      add_transaction_credits(). The problem is that that function calls
      jbd2_might_wait_for_commit() while holding j_state_lock which is wrong
      (we do not really wait for transaction commit while holding that lock).
      
      Fix the problem by moving jbd2_might_wait_for_commit() into places where
      we are ready to wait for transaction commit and thus j_state_lock is
      unlocked.
      
      Fixes: 1eaa566dReported-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4cd2546a
    • Wei Fang's avatar
      vfs,mm: fix a dead loop in truncate_inode_pages_range() · 3d549dcf
      Wei Fang authored
      commit c2a9737f upstream.
      
      We triggered a deadloop in truncate_inode_pages_range() on 32 bits
      architecture with the test case bellow:
      
      	...
      	fd = open();
      	write(fd, buf, 4096);
      	preadv64(fd, &iovec, 1, 0xffffffff000);
      	ftruncate(fd, 0);
      	...
      
      Then ftruncate() will not return forever.
      
      The filesystem used in this case is ubifs, but it can be triggered on
      many other filesystems.
      
      When preadv64() is called with offset=0xffffffff000, a page with
      index=0xffffffff will be added to the radix tree of ->mapping.  Then
      this page can be found in ->mapping with pagevec_lookup().  After that,
      truncate_inode_pages_range(), which is called in ftruncate(), will fall
      into an infinite loop:
      
       - find a page with index=0xffffffff, since index>=end, this page won't
         be truncated
      
       - index++, and index become 0
      
       - the page with index=0xffffffff will be found again
      
      The data type of index is unsigned long, so index won't overflow to 0 on
      64 bits architecture in this case, and the dead loop won't happen.
      
      Since truncate_inode_pages_range() is executed with holding lock of
      inode->i_rwsem, any operation related with this lock will be blocked,
      and a hung task will happen, e.g.:
      
        INFO: task truncate_test:3364 blocked for more than 120 seconds.
        ...
           call_rwsem_down_write_failed+0x17/0x30
           generic_file_write_iter+0x32/0x1c0
           ubifs_write_iter+0xcc/0x170
           __vfs_write+0xc4/0x120
           vfs_write+0xb2/0x1b0
           SyS_write+0x46/0xa0
      
      The page with index=0xffffffff added to ->mapping is useless.  Fix this
      by checking the read position before allocating pages.
      
      Link: http://lkml.kernel.org/r/1475151010-40166-1-git-send-email-fangwei1@huawei.comSigned-off-by: default avatarWei Fang <fangwei1@huawei.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d549dcf
    • Gerald Schaefer's avatar
      mm/hugetlb: fix memory offline with hugepage size > memory block size · d9cf9c31
      Gerald Schaefer authored
      commit 2247bb33 upstream.
      
      Patch series "mm/hugetlb: memory offline issues with hugepages", v4.
      
      This addresses several issues with hugepages and memory offline.  While
      the first patch fixes a panic, and is therefore rather important, the
      last patch is just a performance optimization.
      
      The second patch fixes a theoretical issue with reserved hugepages,
      while still leaving some ugly usability issue, see description.
      
      This patch (of 3):
      
      dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
      list corruption and addressing exception when trying to set a memory
      block offline that is part (but not the first part) of a "gigantic"
      hugetlb page with a size > memory block size.
      
      When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
      will trigger directly.  In the other case we will run into an addressing
      exception later, because dissolve_free_huge_page() will not work on the
      head page of the compound hugetlb page which will result in a NULL
      hstate from page_hstate().
      
      To fix this, first remove the VM_BUG_ON() because it is wrong, and then
      use the compound head page in dissolve_free_huge_page().  This means
      that an unused pre-allocated gigantic page that has any part of itself
      inside the memory block that is going offline will be dissolved
      completely.  Losing an unused gigantic hugepage is preferable to failing
      the memory offline, for example in the situation where a (possibly
      faulty) memory DIMM needs to go offline.
      
      Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
      Link: http://lkml.kernel.org/r/20160926172811.94033-2-gerald.schaefer@de.ibm.comSigned-off-by: default avatarGerald Schaefer <gerald.schaefer@de.ibm.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Rui Teng <rui.teng@linux.vnet.ibm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9cf9c31
    • Manfred Spraul's avatar
      ipc/sem.c: fix complex_count vs. simple op race · a9d465be
      Manfred Spraul authored
      commit 5864a2fd upstream.
      
      Commit 6d07b68c ("ipc/sem.c: optimize sem_lock()") introduced a
      race:
      
      sem_lock has a fast path that allows parallel simple operations.
      There are two reasons why a simple operation cannot run in parallel:
       - a non-simple operations is ongoing (sma->sem_perm.lock held)
       - a complex operation is sleeping (sma->complex_count != 0)
      
      As both facts are stored independently, a thread can bypass the current
      checks by sleeping in the right positions.  See below for more details
      (or kernel bugzilla 105651).
      
      The patch fixes that by creating one variable (complex_mode)
      that tracks both reasons why parallel operations are not possible.
      
      The patch also updates stale documentation regarding the locking.
      
      With regards to stable kernels:
      The patch is required for all kernels that include the
      commit 6d07b68c ("ipc/sem.c: optimize sem_lock()") (3.10?)
      
      The alternative is to revert the patch that introduced the race.
      
      The patch is safe for backporting, i.e. it makes no assumptions
      about memory barriers in spin_unlock_wait().
      
      Background:
      Here is the race of the current implementation:
      
      Thread A: (simple op)
      - does the first "sma->complex_count == 0" test
      
      Thread B: (complex op)
      - does sem_lock(): This includes an array scan. But the scan can't
        find Thread A, because Thread A does not own sem->lock yet.
      - the thread does the operation, increases complex_count,
        drops sem_lock, sleeps
      
      Thread A:
      - spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
      - sleeps before the complex_count test
      
      Thread C: (complex op)
      - does sem_lock (no array scan, complex_count==1)
      - wakes up Thread B.
      - decrements complex_count
      
      Thread A:
      - does the complex_count test
      
      Bug:
      Now both thread A and thread C operate on the same array, without
      any synchronization.
      
      Fixes: 6d07b68c ("ipc/sem.c: optimize sem_lock()")
      Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com
      Reported-by: <felixh@informatik.uni-bremen.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <1vier1@web.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9d465be
    • Brian King's avatar
      scsi: ibmvfc: Fix I/O hang when port is not mapped · f653028e
      Brian King authored
      commit 07d0e9a8 upstream.
      
      If a VFC port gets unmapped in the VIOS, it may not respond with a CRQ
      init complete following H_REG_CRQ. If this occurs, we can end up having
      called scsi_block_requests and not a resulting unblock until the init
      complete happens, which may never occur, and we end up hanging I/O
      requests.  This patch ensures the host action stay set to
      IBMVFC_HOST_ACTION_TGT_DEL so we move all rports into devloss state and
      unblock unless we receive an init complete.
      Signed-off-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Acked-by: default avatarTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f653028e
    • Borislav Petkov's avatar
      scsi: arcmsr: Simplify user_len checking · 0f247610
      Borislav Petkov authored
      commit 4bd173c3 upstream.
      
      Do the user_len check first and then the ver_addr allocation so that we
      can save us the kfree() on the error path when user_len is >
      ARCMSR_API_DATA_BUFLEN.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Marco Grassi <marco.gra@gmail.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Tomas Henzl <thenzl@redhat.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0f247610
    • Dan Carpenter's avatar
      scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer() · cf4dc8d4
      Dan Carpenter authored
      commit 7bc2b55a upstream.
      
      We need to put an upper bound on "user_len" so the memcpy() doesn't
      overflow.
      Reported-by: default avatarMarco Grassi <marco.gra@gmail.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf4dc8d4
    • Eric W. Biederman's avatar
      autofs: Fix automounts by using current_real_cred()->uid · 2f9aa718
      Eric W. Biederman authored
      commit 069d5ac9 upstream.
      
      Seth Forshee reports that in 4.8-rcN some automounts are failing
      because the requesting the automount changed.
      
      The relevant call path is:
      follow_automount()
          ->d_automount
          autofs4_d_automount
             autofs4_mount_wait
                 autofs4_wait
      
      In autofs4_wait wq_uid and wq_gid are set to current_uid() and
      current_gid respectively.  With follow_automount now overriding creds
      uid that we export to userspace changes and that breaks existing
      setups.
      
      To remove the regression set wq_uid and wq_gid from
      current_real_cred()->uid and current_real_cred()->gid respectively.
      This restores the current behavior as current->real_cred is identical
      to current->cred except when override creds are used.
      
      Fixes: aeaa4a79 ("fs: Call d_automount with the filesystems creds")
      Reported-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Tested-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f9aa718
    • Justin Maggard's avatar
      async_pq_val: fix DMA memory leak · dfca701b
      Justin Maggard authored
      commit c8475090 upstream.
      
      Add missing dmaengine_unmap_put(), so we don't OOM during RAID6 sync.
      
      Fixes: 1786b943 ("async_pq_val: convert to dmaengine_unmap_data")
      Signed-off-by: default avatarJustin Maggard <jmaggard@netgear.com>
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dfca701b
    • Mike Galbraith's avatar
      reiserfs: Unlock superblock before calling reiserfs_quota_on_mount() · 5c55afa3
      Mike Galbraith authored
      commit 420902c9 upstream.
      
      If we hold the superblock lock while calling reiserfs_quota_on_mount(), we can
      deadlock our own worker - mount blocks kworker/3:2, sleeps forever more.
      
      crash> ps|grep UN
          715      2   3  ffff880220734d30  UN   0.0       0      0  [kworker/3:2]
         9369   9341   2  ffff88021ffb7560  UN   1.3  493404 123184  Xorg
         9665   9664   3  ffff880225b92ab0  UN   0.0   47368    812  udisks-daemon
        10635  10403   3  ffff880222f22c70  UN   0.0   14904    936  mount
      crash> bt ffff880220734d30
      PID: 715    TASK: ffff880220734d30  CPU: 3   COMMAND: "kworker/3:2"
       #0 [ffff8802244c3c20] schedule at ffffffff8144584b
       #1 [ffff8802244c3cc8] __rt_mutex_slowlock at ffffffff814472b3
       #2 [ffff8802244c3d28] rt_mutex_slowlock at ffffffff814473f5
       #3 [ffff8802244c3dc8] reiserfs_write_lock at ffffffffa05f28fd [reiserfs]
       #4 [ffff8802244c3de8] flush_async_commits at ffffffffa05ec91d [reiserfs]
       #5 [ffff8802244c3e08] process_one_work at ffffffff81073726
       #6 [ffff8802244c3e68] worker_thread at ffffffff81073eba
       #7 [ffff8802244c3ec8] kthread at ffffffff810782e0
       #8 [ffff8802244c3f48] kernel_thread_helper at ffffffff81450064
      crash> rd ffff8802244c3cc8 10
      ffff8802244c3cc8:  ffffffff814472b3 ffff880222f23250   .rD.....P2."....
      ffff8802244c3cd8:  0000000000000000 0000000000000286   ................
      ffff8802244c3ce8:  ffff8802244c3d30 ffff880220734d80   0=L$.....Ms ....
      ffff8802244c3cf8:  ffff880222e8f628 0000000000000000   (.."............
      ffff8802244c3d08:  0000000000000000 0000000000000002   ................
      crash> struct rt_mutex ffff880222e8f628
      struct rt_mutex {
        wait_lock = {
          raw_lock = {
            slock = 65537
          }
        },
        wait_list = {
          node_list = {
            next = 0xffff8802244c3d48,
            prev = 0xffff8802244c3d48
          }
        },
        owner = 0xffff880222f22c71,
        save_state = 0
      }
      crash> bt 0xffff880222f22c70
      PID: 10635  TASK: ffff880222f22c70  CPU: 3   COMMAND: "mount"
       #0 [ffff8802216a9868] schedule at ffffffff8144584b
       #1 [ffff8802216a9910] schedule_timeout at ffffffff81446865
       #2 [ffff8802216a99a0] wait_for_common at ffffffff81445f74
       #3 [ffff8802216a9a30] flush_work at ffffffff810712d3
       #4 [ffff8802216a9ab0] schedule_on_each_cpu at ffffffff81074463
       #5 [ffff8802216a9ae0] invalidate_bdev at ffffffff81178aba
       #6 [ffff8802216a9af0] vfs_load_quota_inode at ffffffff811a3632
       #7 [ffff8802216a9b50] dquot_quota_on_mount at ffffffff811a375c
       #8 [ffff8802216a9b80] finish_unfinished at ffffffffa05dd8b0 [reiserfs]
       #9 [ffff8802216a9cc0] reiserfs_fill_super at ffffffffa05de825 [reiserfs]
          RIP: 00007f7b9303997a  RSP: 00007ffff443c7a8  RFLAGS: 00010202
          RAX: 00000000000000a5  RBX: ffffffff8144ef12  RCX: 00007f7b932e9ee0
          RDX: 00007f7b93d9a400  RSI: 00007f7b93d9a3e0  RDI: 00007f7b93d9a3c0
          RBP: 00007f7b93d9a2c0   R8: 00007f7b93d9a550   R9: 0000000000000001
          R10: ffffffffc0ed040e  R11: 0000000000000202  R12: 000000000000040e
          R13: 0000000000000000  R14: 00000000c0ed040e  R15: 00007ffff443ca20
          ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b
      Signed-off-by: default avatarMike Galbraith <efault@gmx.de>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarMike Galbraith <mgalbraith@suse.de>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c55afa3
    • Nicolas Iooss's avatar
      ASoC: Intel: Atom: add a missing star in a memcpy call · dad17541
      Nicolas Iooss authored
      commit 61ab0d40 upstream.
      
      In sst_prepare_and_post_msg(), when a response is received in "block",
      the following code gets executed:
      
          *data = kzalloc(block->size, GFP_KERNEL);
          memcpy(data, (void *) block->data, block->size);
      
      The memcpy() call overwrites the content of the *data pointer instead of
      filling the newly-allocated memory (which pointer is hold by *data).
      Fix this by merging kzalloc+memcpy into a single kmemdup() call.
      
      Thanks Joe Perches for suggesting using kmemdup()
      
      Fixes: 60dc8dba ("ASoC: Intel: sst: Add some helper functions")
      Signed-off-by: default avatarNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dad17541
    • John Hsu's avatar
      ASoC: nau8825: fix bug in FLL parameter · 9b3aaaa8
      John Hsu authored
      commit a8961cae upstream.
      
      In the FLL parameter calculation, the FVCO should choose the maximum one.
      The patch is to fix the bug about the wrong FVCO chosen.
      Signed-off-by: default avatarJohn Hsu <KCHSU0@nuvoton.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b3aaaa8
    • Rafał Miłecki's avatar
      brcmfmac: use correct skb freeing helper when deleting flowring · 9119232c
      Rafał Miłecki authored
      commit 7f00ee2b upstream.
      
      Flowrings contain skbs waiting for transmission that were passed to us
      by netif. It means we checked every one of them looking for 802.1x
      Ethernet type. When deleting flowring we have to use freeing function
      that will check for 802.1x type as well.
      
      Freeing skbs without a proper check was leading to counter not being
      properly decreased. This was triggering a WARNING every time
      brcmf_netdev_wait_pend8021x was called.
      Signed-off-by: default avatarRafał Miłecki <rafal@milecki.pl>
      Acked-by: default avatarArend van Spriel <arend@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9119232c
    • Rafał Miłecki's avatar
      brcmfmac: fix memory leak in brcmf_fill_bss_param · 5de3caef
      Rafał Miłecki authored
      commit 23e9c128 upstream.
      
      This function is called from get_station callback which means that every
      time user space was getting/dumping station(s) we were leaking 2 KiB.
      Signed-off-by: default avatarRafał Miłecki <rafal@milecki.pl>
      Fixes: 1f0dc59a ("brcmfmac: rework .get_station() callback")
      Acked-by: default avatarArend van Spriel <arend.vanspriel@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5de3caef
    • Nicolas Iooss's avatar
      brcmfmac: fix pmksa->bssid usage · 4a50f927
      Nicolas Iooss authored
      commit 7703773e upstream.
      
      The struct cfg80211_pmksa defines its bssid field as:
      
          const u8 *bssid;
      
      contrary to struct brcmf_pmksa, which uses:
      
          u8 bssid[ETH_ALEN];
      
      Therefore in brcmf_cfg80211_del_pmksa(), &pmksa->bssid takes the address
      of this field (of type u8**), not the one of its content (which would be
      u8*).  Remove the & operator to make brcmf_dbg("%pM") and memcmp()
      behave as expected.
      
      This bug have been found using a custom static checker (which checks the
      usage of %p... attributes at build time).  It has been introduced in
      commit 6c404f34 ("brcmfmac: Cleanup pmksa cache handling code"),
      which replaced pmksa->bssid by &pmksa->bssid while refactoring the code,
      without modifying struct cfg80211_pmksa definition.
      
      Replace &pmk[i].bssid with pmk[i].bssid too to make the code clearer,
      this change does not affect the semantic.
      
      Fixes: 6c404f34 ("brcmfmac: Cleanup pmksa cache handling code")
      Signed-off-by: default avatarNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a50f927
    • Johannes Weiner's avatar
      mm: filemap: don't plant shadow entries without radix tree node · 7d5d3b13
      Johannes Weiner authored
      commit d3798ae8 upstream.
      
      When the underflow checks were added to workingset_node_shadow_dec(),
      they triggered immediately:
      
        kernel BUG at ./include/linux/swap.h:276!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: isofs usb_storage fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_REJECT nf_reject_ipv6
         soundcore wmi acpi_als pinctrl_sunrisepoint kfifo_buf tpm_tis industrialio acpi_pad pinctrl_intel tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt
        CPU: 0 PID: 20929 Comm: blkid Not tainted 4.8.0-rc8-00087-gbe67d60b #1
        Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016
        task: ffff8faa93ecd940 task.stack: ffff8faa7f478000
        RIP: page_cache_tree_insert+0xf1/0x100
        Call Trace:
          __add_to_page_cache_locked+0x12e/0x270
          add_to_page_cache_lru+0x4e/0xe0
          mpage_readpages+0x112/0x1d0
          blkdev_readpages+0x1d/0x20
          __do_page_cache_readahead+0x1ad/0x290
          force_page_cache_readahead+0xaa/0x100
          page_cache_sync_readahead+0x3f/0x50
          generic_file_read_iter+0x5af/0x740
          blkdev_read_iter+0x35/0x40
          __vfs_read+0xe1/0x130
          vfs_read+0x96/0x130
          SyS_read+0x55/0xc0
          entry_SYSCALL_64_fastpath+0x13/0x8f
        Code: 03 00 48 8b 5d d8 65 48 33 1c 25 28 00 00 00 44 89 e8 75 19 48 83 c4 18 5b 41 5c 41 5d 41 5e 5d c3 0f 0b 41 bd ef ff ff ff eb d7 <0f> 0b e8 88 68 ef ff 0f 1f 84 00
        RIP  page_cache_tree_insert+0xf1/0x100
      
      This is a long-standing bug in the way shadow entries are accounted in
      the radix tree nodes. The shrinker needs to know when radix tree nodes
      contain only shadow entries, no pages, so node->count is split in half
      to count shadows in the upper bits and pages in the lower bits.
      
      Unfortunately, the radix tree implementation doesn't know of this and
      assumes all entries are in node->count. When there is a shadow entry
      directly in root->rnode and the tree is later extended, the radix tree
      implementation will copy that entry into the new node and and bump its
      node->count, i.e. increases the page count bits. Once the shadow gets
      removed and we subtract from the upper counter, node->count underflows
      and triggers the warning. Afterwards, without node->count reaching 0
      again, the radix tree node is leaked.
      
      Limit shadow entries to when we have actual radix tree nodes and can
      count them properly. That means we lose the ability to detect refaults
      from files that had only the first page faulted in at eviction time.
      
      Fixes: 449dd698 ("mm: keep page cache radix tree nodes in check")
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Reported-and-tested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d5d3b13
    • Dave Chinner's avatar
      xfs: change mailing list address · bbf4e0b9
      Dave Chinner authored
      commit 541d48f0 upstream.
      
      oss.sgi.com is going away, move contact details over to vger.
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbf4e0b9
    • Guilherme G Piccoli's avatar
      i40e: avoid NULL pointer dereference and recursive errors on early PCI error · c5b4bb79
      Guilherme G Piccoli authored
      commit edfc23ee upstream.
      
      Although rare, it's possible to hit PCI error early on device
      probe, meaning possibly some structs are not entirely initialized,
      and some might even be completely uninitialized, leading to NULL
      pointer dereference.
      
      The i40e driver currently presents a "bad" behavior if device hits
      such early PCI error: firstly, the struct i40e_pf might not be
      attached to pci_dev yet, leading to a NULL pointer dereference on
      access to pf->state.
      
      Even checking if the struct is NULL and avoiding the access in that
      case isn't enough, since the driver cannot recover from PCI error
      that early; in our experiments we saw multiple failures on kernel
      log, like:
      
        [549.664] i40e 0007:01:00.1: Initial pf_reset failed: -15
        [549.664] i40e: probe of 0007:01:00.1 failed with error -15
        [...]
        [871.644] i40e 0007:01:00.1: The driver for the device stopped because the
        device firmware failed to init. Try updating your NVM image.
        [871.644] i40e: probe of 0007:01:00.1 failed with error -32
        [...]
        [872.516] i40e 0007:01:00.0: ARQ: Unknown event 0x0000 ignored
      
      Between the first probe failure (error -15) and the second (error -32)
      another PCI error happened due to the first bad probe. Also, driver
      started to flood console with those ARQ event messages.
      
      This patch will prevent these issues by allowing error recovery
      mechanism to remove the failed device from the system instead of
      trying to recover from early PCI errors during device probe.
      Signed-off-by: default avatarGuilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
      Acked-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5b4bb79
    • Johannes Weiner's avatar
      mm: filemap: fix mapping->nrpages double accounting in fuse · 65fc3ba5
      Johannes Weiner authored
      commit 3ddf40e8 upstream.
      
      Commit 22f2ac51 ("mm: workingset: fix crash in shadow node shrinker
      caused by replace_page_cache_page()") switched replace_page_cache() from
      raw radix tree operations to page_cache_tree_insert() but didn't take
      into account that the latter function, unlike the raw radix tree op,
      handles mapping->nrpages.  As a result, that counter is bumped for each
      page replacement rather than balanced out even.
      
      The mapping->nrpages counter is used to skip needless radix tree walks
      when invalidating, truncating, syncing inodes without pages, as well as
      statistics for userspace.  Since the error is positive, we'll do more
      page cache tree walks than necessary; we won't miss a necessary one.
      And we'll report more buffer pages to userspace than there are.  The
      error is limited to fuse inodes.
      
      Fixes: 22f2ac51 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()")
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      65fc3ba5
    • Miklos Szeredi's avatar
      fuse: fix killing s[ug]id in setattr · 0e2993ef
      Miklos Szeredi authored
      commit a09f99ed upstream.
      
      Fuse allowed VFS to set mode in setattr in order to clear suid/sgid on
      chown and truncate, and (since writeback_cache) write.  The problem with
      this is that it'll potentially restore a stale mode.
      
      The poper fix would be to let the filesystems do the suid/sgid clearing on
      the relevant operations.  Possibly some are already doing it but there's no
      way we can detect this.
      
      So fix this by refreshing and recalculating the mode.  Do this only if
      ATTR_KILL_S[UG]ID is set to not destroy performance for writes.  This is
      still racy but the size of the window is reduced.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e2993ef
    • Miklos Szeredi's avatar
      fuse: invalidate dir dentry after chmod · f72bae37
      Miklos Szeredi authored
      commit 5e2b8828 upstream.
      
      Without "default_permissions" the userspace filesystem's lookup operation
      needs to perform the check for search permission on the directory.
      
      If directory does not allow search for everyone (this is quite rare) then
      userspace filesystem has to set entry timeout to zero to make sure
      permissions are always performed.
      
      Changing the mode bits of the directory should also invalidate the
      (previously cached) dentry to make sure the next lookup will have a chance
      of updating the timeout, if needed.
      Reported-by: default avatarJean-Pierre André <jean-pierre.andre@wanadoo.fr>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f72bae37
    • Miklos Szeredi's avatar
      fuse: listxattr: verify xattr list · 66b8e7fc
      Miklos Szeredi authored
      commit cb3ae6d2 upstream.
      
      Make sure userspace filesystem is returning a well formed list of xattr
      names (zero or more nonzero length, null terminated strings).
      
      [Michael Theall: only verify in the nonzero size case]
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66b8e7fc
    • Marcin Wojtas's avatar
      clk: mvebu: dynamically allocate resources in Armada CP110 system controller · a4be7458
      Marcin Wojtas authored
      commit a0245eb7 upstream.
      
      Original commit, which added support for Armada CP110 system controller
      used global variables for storing all clock information. It worked
      fine for Armada 7k SoC, with single CP110 block. After dual-CP110 Armada 8k
      was introduced, the data got overwritten and corrupted.
      
      This patch fixes the issue by allocating resources dynamically in the
      driver probe and storing it as platform drvdata.
      
      Fixes: d3da3eae ("clk: mvebu: new driver for Armada CP110 system ...")
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Reviewed-by: default avatarThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: default avatarStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4be7458
    • Marcin Wojtas's avatar
      clk: mvebu: fix setting unwanted flags in CP110 gate clock · a6cf0bc8
      Marcin Wojtas authored
      commit ad715b26 upstream.
      
      Armada CP110 system controller comprises its own routine responsble
      for registering gate clocks. Among others 'flags' field in
      struct clk_init_data was not set, using a random values, which
      may cause an unpredicted behavior.
      
      This patch fixes the problem by resetting all fields of clk_init_data
      before assigning values for all gated clocks of Armada 7k/8k SoCs family.
      
      Fixes: d3da3eae ("clk: mvebu: new driver for Armada CP110 system ...")
      Signed-off-by: default avatarMarcin Wojtas <mw@semihalf.com>
      Signed-off-by: default avatarStephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a6cf0bc8
    • Mike Marciniszyn's avatar
      IB/hfi1: Fix defered ack race with qp destroy · 51b2e351
      Mike Marciniszyn authored
      commit 72f53af2 upstream.
      
      There is a a bug in defered ack stuff that causes a race with the
      destroy of a QP.
      
      A packet causes a defered ack to be pended by putting the QP
      into an rcd queue.
      
      A return from the driver interrupt processing will process that rcd
      queue of QPs and attempt to do a direct send of the ack.   At this
      point no locks are held and the above QP could now be put in the reset
      state in the qp destroy logic.   A refcount protects the QP while it
      is in the rcd queue so it isn't going anywhere yet.
      
      If the direct send fails to allocate a pio buffer,
      hfi1_schedule_send() is called to trigger sending an ack from the
      send engine. There is no state test in that code path.
      
      The refcount is then dropped from the driver.c caller
      potentially allowing the qp destroy to continue from its
      refcount wait in parallel with the workqueue scheduling of the qp.
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51b2e351
    • Peng Fan's avatar
      drivers: base: dma-mapping: page align the size when unmap_kernel_range · 9ae3f9e1
      Peng Fan authored
      commit 85714108 upstream.
      
      When dma_common_free_remap, the input parameter 'size' may not
      be page aligned. And, met kernel warning when doing iommu dma
      for usb on i.MX8 platform:
      "
      WARNING: CPU: 0 PID: 869 at mm/vmalloc.c:70 vunmap_page_range+0x1cc/0x1d0()
      Modules linked in:
      CPU: 0 PID: 869 Comm: kworker/u8:2 Not tainted 4.1.12-00444-gc5f9d1d-dirty #147
      Hardware name: Freescale i.MX8DV Sabreauto (DT)
      Workqueue: ci_otg ci_otg_work
      Call trace:
      [<ffffffc000089920>] dump_backtrace+0x0/0x124
      [<ffffffc000089a54>] show_stack+0x10/0x1c
      [<ffffffc0006d1e6c>] dump_stack+0x84/0xc8
      [<ffffffc0000b4568>] warn_slowpath_common+0x98/0xd0
      [<ffffffc0000b4664>] warn_slowpath_null+0x14/0x20
      [<ffffffc000170348>] vunmap_page_range+0x1c8/0x1d0
      [<ffffffc000170388>] unmap_kernel_range+0x20/0x88
      [<ffffffc000460ad0>] dma_common_free_remap+0x74/0x84
      [<ffffffc0000940d8>] __iommu_free_attrs+0x9c/0x178
      [<ffffffc0005032bc>] ehci_mem_cleanup+0x140/0x194
      [<ffffffc000503548>] ehci_stop+0x8c/0xdc
      [<ffffffc0004e8258>] usb_remove_hcd+0xf0/0x1cc
      [<ffffffc000516bc0>] host_stop+0x1c/0x58
      [<ffffffc000514240>] ci_otg_work+0xdc/0x120
      [<ffffffc0000c9c34>] process_one_work+0x134/0x33c
      [<ffffffc0000c9f78>] worker_thread+0x13c/0x47c
      [<ffffffc0000cf43c>] kthread+0xd8/0xf0
      "
      
      For dma_common_pages_remap:
      dma_common_pages_remap
         |->get_vm_area_caller
              |->__get_vm_area_node
                  |->size = PAGE_ALIGN(size);   Round up to page aligned
      
      So, in dma_common_free_remap, we also need a page aligned size,
      pass 'PAGE_ALIGN(size)' to unmap_kernel_range.
      Signed-off-by: default avatarPeng Fan <van.freenix@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ae3f9e1
    • Alexander Usyskin's avatar
      mei: amthif: fix deadlock in initialization during a reset · 3ec2e370
      Alexander Usyskin authored
      commit e728ae27 upstream.
      
      The device lock was unnecessary obtained in bus rescan work before the
      amthif client search.  That causes incorrect lock ordering and task
      hang:
      ...
      [88004.613213] INFO: task kworker/1:14:21832 blocked for more than 120 seconds.
      ...
      [88004.645934] Workqueue: events mei_cl_bus_rescan_work
      ...
      
      The correct lock order is
       cl_bus_lock
        device_lock
         me_clients_rwsem
      
      Move device_lock into amthif init function that called
      after me_clients_rwsem is released.
      
      This fixes regression introduced by commit:
      commit 025fb792 ("mei: split amthif client init from end of clients enumeration")
      Signed-off-by: default avatarAlexander Usyskin <alexander.usyskin@intel.com>
      Signed-off-by: default avatarTomas Winkler <tomas.winkler@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ec2e370