1. 25 May, 2019 20 commits
  2. 21 May, 2019 20 commits
    • Linux 4.9.178 · a5f56b52
      Greg Kroah-Hartman authored
      a5f56b52
    • KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes · 0dd8bef5
      Sean Christopherson authored
      commit 11988499 upstream.
      
      KVM allows userspace to violate consistency checks related to the
      guest's CPUID model to some degree.  Generally speaking, userspace has
      carte blanche when it comes to guest state so long as jamming invalid
      state won't negatively affect the host.
      
      Currently this seems to be a non-issue as most of the interesting
      EFER checks are missing, e.g. NX and LME, but those will be added
      shortly.  Proactively exempt userspace from the CPUID checks so as not
      to break userspace.
      
      Note, the efer_reserved_bits check still applies to userspace writes as
      that mask reflects the host's capabilities, e.g. KVM shouldn't allow a
      guest to run with NX=1 if it has been disabled in the host.
      
      Fixes: d8017474 ("KVM: SVM: Only allow setting of EFER_SVME when CPUID SVM is set")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0dd8bef5
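The rule this commit describes can be sketched in user-space C. This is a minimal illustration under stated assumptions, not KVM's code: the helper name `efer_write_allowed`, the `guest_has_svm` flag, and the parameter layout are invented for the example. Only the EFER bit positions (NX is bit 11, SVME is bit 12) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_NX   (1ULL << 11)  /* no-execute enable */
#define EFER_SVME (1ULL << 12)  /* AMD SVM enable */

/*
 * Hypothetical helper: decide whether a write of `efer` is acceptable.
 * `reserved_bits` models the host-capability mask (always enforced);
 * `guest_has_svm` models the guest CPUID check (skipped for the host).
 */
static bool efer_write_allowed(uint64_t efer, uint64_t reserved_bits,
                               bool guest_has_svm, bool host_initiated)
{
    /* Host capabilities apply to everyone, including userspace. */
    if (efer & reserved_bits)
        return false;

    /* Userspace (host-initiated) writes bypass guest CPUID checks. */
    if (host_initiated)
        return true;

    if ((efer & EFER_SVME) && !guest_has_svm)
        return false;

    return true;
}
```

In this sketch a host-initiated write of SVME succeeds even when the guest CPUID lacks SVM, while a write of a host-reserved bit such as NX fails for both guest and host.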
    • ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug · 51776204
      Michał Wadowski authored
      commit 56df90b6 upstream.
      
      Add a patch for the Realtek codec in the Lenovo B50-70 that fixes the
      inverted internal microphone channel.
      The IdeaPad Y410P has the same PCI SSID as the Lenovo B50-70, but its
      existing entry is a noise fix that did not seem to help in later
      kernel versions.
      So I replaced the IdeaPad Y410P device description with the B50-70 and
      applied the inverted microphone fix.
      
      Bugzilla: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1524215
      Signed-off-by: Michał Wadowski <wadosm@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      51776204
    • ext4: fix data corruption caused by overlapping unaligned and aligned IO · fa089775
      Lukas Czerner authored
      commit 57a0da28 upstream.
      
      Unaligned AIO must be serialized because the zeroing of partial blocks
      of unaligned AIO can result in data corruption in case it's overlapping
      another in flight IO.
      
      Currently we wait for all unwritten extents before we submit unaligned
      AIO, which protects data when the unaligned AIO follows overlapping
      IO. However, if an unaligned AIO is followed by overlapping aligned AIO,
      we can still end up corrupting data.
      
      To fix this, we must make sure that the unaligned AIO is the only IO in
      flight by waiting for unwritten extents conversion not just before the
      IO submission, but right after it as well.
      
      This problem can be reproduced by xfstest generic/538
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa089775
    • ext4: zero out the unused memory region in the extent tree block · ab6d14e8
      Sriram Rajagopalan authored
      commit 592acbf1 upstream.
      
      This commit zeroes out the unused memory region in the buffer_head
      corresponding to the extent metablock after writing the extent header
      and the corresponding extent node entries.
      
      This is done to prevent random uninitialized data from getting into
      the filesystem when the extent block is synced.
      
      This fixes CVE-2019-11833.
      Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab6d14e8
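The idea of clearing the tail of a metadata block can be illustrated with a small user-space sketch. The structures and the function `toy_fill_extent_block` are simplified stand-ins invented for this example and do not reproduce ext4's actual layout code. The magic value is only there to make the header look plausible.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for an extent header and entry (12 bytes each). */
struct toy_extent_header { uint16_t magic, entries, max, depth; uint32_t generation; };
struct toy_extent        { uint32_t block; uint16_t len, start_hi; uint32_t start_lo; };

/*
 * Write a header plus `n` entries into `block`, then zero the unused tail
 * so stale memory contents never reach the on-disk image.
 * Returns the number of bytes actually used, or 0 if they do not fit.
 */
static size_t toy_fill_extent_block(unsigned char *block, size_t block_size,
                                    const struct toy_extent *ext, uint16_t n)
{
    size_t used = sizeof(struct toy_extent_header) + (size_t)n * sizeof(*ext);
    struct toy_extent_header hdr = { .magic = 0xF30A, .entries = n };

    if (used > block_size)
        return 0;

    hdr.max = (uint16_t)((block_size - sizeof(hdr)) / sizeof(*ext));
    memcpy(block, &hdr, sizeof(hdr));
    memcpy(block + sizeof(hdr), ext, (size_t)n * sizeof(*ext));

    /* The step this kind of fix adds: clear everything after the last entry. */
    memset(block + used, 0, block_size - used);
    return used;
}
```

Without the final `memset()`, whatever happened to be in the buffer beyond the last entry would be written out along with the valid data.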
    • fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount · 54e35658
      Jiufei Xue authored
      commit ec084de9 upstream.
      
      synchronize_rcu() does not wait for call_rcu() callbacks, so an inode
      wb switch may not yet have been queued on the workqueue after
      synchronize_rcu() returns.  Thus previously scheduled switches may not
      be finished even after flushing the workqueue, which causes the NULL
      pointer dereference shown below.
      
        VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds.  Have a nice day...
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000278
          evict+0xb3/0x180
          iput+0x1b0/0x230
          inode_switch_wbs_work_fn+0x3c0/0x6a0
          worker_thread+0x4e/0x490
          ? process_one_work+0x410/0x410
          kthread+0xe6/0x100
          ret_from_fork+0x39/0x50
      
      Replace the synchronize_rcu() call with an rcu_barrier() to wait for all
      pending callbacks to finish.  Also increment isw_nr_in_flight after
      call_rcu() in inode_switch_wbs(), which makes more sense.
      
      Link: http://lkml.kernel.org/r/20190429024108.54150-1-jiufei.xue@linux.alibaba.com
      Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Suggested-by: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      54e35658
    • writeback: synchronize sync(2) against cgroup writeback membership switches · 1cfaba5b
      Tejun Heo authored
      commit 7fc5854f upstream.
      
      sync_inodes_sb() can race against cgwb (cgroup writeback) membership
      switches and fail to writeback some inodes.  For example, if an inode
      switches to another wb while sync_inodes_sb() is in progress, the new
      wb might not be visible to bdi_split_work_to_wbs() at all or the inode
      might jump from a wb which hasn't issued writebacks yet to one which
      already has.
      
      This patch adds backing_dev_info->wb_switch_rwsem to synchronize cgwb
      switch path against sync_inodes_sb() so that sync_inodes_sb() is
      guaranteed to see all the target wbs and inodes can't jump wbs to
      escape syncing.
      
      v2: Fixed misplaced rwsem init.  Spotted by Jiufei.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jiufei Xue <xuejiufei@gmail.com>
      Link: http://lkml.kernel.org/r/dc694ae2-f07f-61e1-7097-7c8411cee12d@gmail.com
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1cfaba5b
    • fib_rules: fix error in backport of e9919a24 ("fib_rules: return 0...") · d5c71a7c
      Greg Kroah-Hartman authored
      When commit e9919a24 ("fib_rules: return 0 directly if an exactly
      same rule exists when NLM_F_EXCL not supplied") was backported to 4.9.y,
      it changed the logic a bit as err should have been reset before exiting
      the test, like it happens in the original logic.
      
      If this is not set, errors happen :(
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Reported-by: David Ahern <dsahern@gmail.com>
      Reported-by: Florian Westphal <fw@strlen.de>
      Cc: Hangbin Liu <liuhangbin@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5c71a7c
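The class of bug described here, a stale error value surviving an early exit, can be sketched as follows. This is a hypothetical model: `toy_newrule` and its two flags are invented for illustration, and the initial `-EINVAL` merely stands in for whatever value `err` held from earlier steps.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of adding a rule: `exists` says an identical rule is
 * already installed, `excl` models NLM_F_EXCL being set on the request.
 */
static int toy_newrule(bool exists, bool excl)
{
    int err = -EINVAL;      /* stale value left over from earlier steps */

    if (exists) {
        err = 0;            /* the reset: a duplicate is not an error ... */
        if (excl)
            err = -EEXIST;  /* ... unless the caller asked for exclusivity */
        goto out;
    }

    err = 0;                /* the rule would be inserted here */
out:
    return err;
}
```

If the `err = 0` reset inside the `exists` branch is dropped, a duplicate rule without `NLM_F_EXCL` would return the stale error instead of 0.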
    • crypto: arm/aes-neonbs - don't access already-freed walk.iv · 7f9290f7
      Eric Biggers authored
      commit 767f015e upstream.
      
      If the user-provided IV needs to be aligned to the algorithm's
      alignmask, then skcipher_walk_virt() copies the IV into a new aligned
      buffer walk.iv.  But skcipher_walk_virt() can fail afterwards, and then
      if the caller unconditionally accesses walk.iv, it's a use-after-free.
      
      arm32 xts-aes-neonbs doesn't set an alignmask, so currently it isn't
      affected by this despite unconditionally accessing walk.iv.  However
      this is more subtle than desired, and it was actually broken prior to
      the alignmask being removed by commit cc477bf6 ("crypto: arm/aes -
      replace bit-sliced OpenSSL NEON code").  Thus, update xts-aes-neonbs to
      start checking the return value of skcipher_walk_virt().
      
      Fixes: e4e7f10b ("ARM: add support for bit sliced AES using NEON instructions")
      Cc: <stable@vger.kernel.org> # v3.13+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      7f9290f7
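The pattern of checking the walk's return value before touching its IV can be modelled in user space. Everything below is a toy: `toy_walk`, `toy_walk_start` and `toy_encrypt` are invented names that only mimic the shape of the skcipher walk API, and the forced `fail` flag is an assumption used to trigger the error path.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define TOY_IV_LEN 16

struct toy_walk { unsigned char *iv; };  /* models the walk's iv pointer */

/*
 * Models the failure mode: the IV is copied into a fresh buffer, then a
 * later step fails and the buffer is released before the error is returned.
 */
static int toy_walk_start(struct toy_walk *w, const unsigned char *iv, int fail)
{
    w->iv = malloc(TOY_IV_LEN);
    if (!w->iv)
        return -ENOMEM;
    memcpy(w->iv, iv, TOY_IV_LEN);

    if (fail) {
        free(w->iv);
        w->iv = NULL;   /* the toy clears it; real code may leave it dangling */
        return -EINVAL;
    }
    return 0;
}

/* Caller pattern: bail out before touching the walk's IV. */
static int toy_encrypt(const unsigned char *iv, int fail, unsigned char *tweak)
{
    struct toy_walk w;
    int err = toy_walk_start(&w, iv, fail);

    if (err)
        return err;     /* never read w.iv after a failed start */

    memcpy(tweak, w.iv, TOY_IV_LEN);
    free(w.iv);
    return 0;
}
```

A caller that skipped the `if (err)` check would read from a buffer that has already been freed whenever the start step fails.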
    • crypto: salsa20 - don't access already-freed walk.iv · 91078439
      Eric Biggers authored
      commit edaf28e9 upstream.
      
      If the user-provided IV needs to be aligned to the algorithm's
      alignmask, then skcipher_walk_virt() copies the IV into a new aligned
      buffer walk.iv.  But skcipher_walk_virt() can fail afterwards, and then
      if the caller unconditionally accesses walk.iv, it's a use-after-free.
      
      salsa20-generic doesn't set an alignmask, so currently it isn't affected
      by this despite unconditionally accessing walk.iv.  However this is more
      subtle than desired, and it was actually broken prior to the alignmask
      being removed by commit b62b3db7 ("crypto: salsa20-generic - cleanup
      and convert to skcipher API").
      
      Since salsa20-generic does not update the IV and does not need any IV
      alignment, update it to use req->iv instead of walk.iv.
      
      Fixes: 2407d608 ("[CRYPTO] salsa20: Salsa20 stream cipher")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      91078439
    • crypto: gcm - fix incompatibility between "gcm" and "gcm_base" · 62d629a5
      Eric Biggers authored
      commit f699594d upstream.
      
      GCM instances can be created by either the "gcm" template, which only
      allows choosing the block cipher, e.g. "gcm(aes)"; or by "gcm_base",
      which allows choosing the ctr and ghash implementations, e.g.
      "gcm_base(ctr(aes-generic),ghash-generic)".
      
      However, a "gcm_base" instance prevents a "gcm" instance from being
      registered using the same implementations.  Nor will the instance be
      found by lookups of "gcm".  This can be used as a denial of service.
      Moreover, "gcm_base" instances are never tested by the crypto
      self-tests, even if there are compatible "gcm" tests.
      
      The root cause of these problems is that instances of the two templates
      use different cra_names.  Therefore, fix these problems by making
      "gcm_base" instances set the same cra_name as "gcm" instances, e.g.
      "gcm(aes)" instead of "gcm_base(ctr(aes-generic),ghash-generic)".
      
      This requires extracting the block cipher name from the name of the ctr
      algorithm.  It also requires starting to verify that the algorithms are
      really ctr and ghash, not something else entirely.  But it would be
      bizarre if anyone were actually using non-gcm-compatible algorithms with
      gcm_base, so this shouldn't break anyone in practice.
      
      Fixes: d00aa19b ("[CRYPTO] gcm: Allow block cipher parameter")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      62d629a5
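Deriving the "gcm" name from the ctr algorithm's name might look roughly like the sketch below. The function `toy_gcm_name_from_ctr` and its error codes are assumptions made for illustration. It expects the generic algorithm name (for example "ctr(aes)"), not a driver name.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "gcm(<cipher>)" from a ctr algorithm name such as "ctr(aes)".
 * Returns 0 on success, -EINVAL if the name does not start with "ctr(",
 * or -ENAMETOOLONG if the result does not fit in `out`.
 */
static int toy_gcm_name_from_ctr(const char *ctr_name, char *out, size_t out_len)
{
    int n;

    /* Reject anything that is not a ctr template instance. */
    if (strncmp(ctr_name, "ctr(", 4) != 0)
        return -EINVAL;

    /* ctr_name + 4 is e.g. "aes)", so the closing parenthesis comes for free. */
    n = snprintf(out, out_len, "gcm(%s", ctr_name + 4);
    if (n < 0 || (size_t)n >= out_len)
        return -ENAMETOOLONG;

    return 0;
}
```

With a shared name such as "gcm(aes)", instances built through either template resolve to the same lookup key, which is the property the commit message is after.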
    • crypto: gcm - Fix error return code in crypto_gcm_create_common() · 2f95ee0b
      Wei Yongjun authored
      commit 9b40f79c upstream.
      
      Fix to return error code -EINVAL from the invalid alg ivsize error
      handling case instead of 0, as done elsewhere in this function.
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f95ee0b
    • ipmi:ssif: compare block number correctly for multi-part return messages · a2a2a146
      Kamlakant Patel authored
      commit 55be8658 upstream.
      
      According to the IPMI spec, the block number is a number that is
      incremented, starting with 0, for each new block of message data
      returned using the middle transaction.

      Here, 'blocknum' is data[0], which always starts from zero (0), while
      'ssif_info->multi_pos' starts from 1.
      So we need to add 1 to blocknum when comparing it with multi_pos.
      
      Fixes: 7d6380cd ("ipmi:ssif: Fix handling of multi-part return messages").
      Reported-by: Kiran Kolukuluru <kirank@ami.com>
      Signed-off-by: Kamlakant Patel <kamlakantp@marvell.com>
      Message-Id: <1556106615-18722-1-git-send-email-kamlakantp@marvell.com>
      [Also added a debug log if the block numbers don't match.]
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: stable@vger.kernel.org # 4.4
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a2a2a146
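The off-by-one between the two counters can be shown with a tiny sketch. The helper name `toy_block_in_sequence` is invented, and the sketch covers only the comparison itself, not any other conditions the driver may apply.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * `blocknum` is the 0-based block number reported by the device (data[0]);
 * `multi_pos` is the driver's 1-based position counter.
 */
static bool toy_block_in_sequence(uint8_t blocknum, unsigned int multi_pos)
{
    /* Shift the 0-based value by one before comparing. */
    return (unsigned int)blocknum + 1 == multi_pos;
}
```

Comparing `blocknum` directly against `multi_pos` would reject every correctly ordered block, since the two counters never hold the same value for the same block.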
    • bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim() · 7366d0cd
      Coly Li authored
      commit 1bee2add upstream.
      
      In journal_reclaim(), ja->cur_idx of each cache is updated to reclaim
      available journal buckets. The variable 'int n' counts how many caches
      were successfully reclaimed, and n is then stored in c->journal.key
      by SET_KEY_PTRS(). Later, in journal_write_unlocked(), a for_each_cache()
      loop writes the jset data onto each cache.

      The problem is that if all journal buckets on each cache are full, the
      following code in journal_reclaim(),
      
      529 for_each_cache(ca, c, iter) {
      530       struct journal_device *ja = &ca->journal;
      531       unsigned int next = (ja->cur_idx + 1) % ca->sb.njournal_buckets;
      532
      533       /* No space available on this device */
      534       if (next == ja->discard_idx)
      535               continue;
      536
      537       ja->cur_idx = next;
      538       k->ptr[n++] = MAKE_PTR(0,
      539                         bucket_to_sector(c, ca->sb.d[ja->cur_idx]),
      540                         ca->sb.nr_this_dev);
      541 }
      542
      543 bkey_init(k);
      544 SET_KEY_PTRS(k, n);
      
      If there is no available bucket to reclaim, the if() condition at line
      534 is always true, and n remains 0. Then at line 544, SET_KEY_PTRS()
      sets the KEY_PTRS field of c->journal.key to 0.

      Setting the KEY_PTRS field of c->journal.key to 0 is wrong, because in
      journal_write_unlocked() the journal data is written in the following loop,
      
      649	for (i = 0; i < KEY_PTRS(k); i++) {
      650-671		submit journal data to cache device
      672	}
      
      If the KEY_PTRS field is set to 0 in journal_reclaim(), the journal data
      won't be written to the cache device here. If the system crashes or
      reboots before the bkeys of the lost journal entries are written into
      btree nodes, data corruption will be reported during bcache reload after
      the reboot.

      Indeed, since there is only one cache in a cache set, there is no need
      to set the KEY_PTRS field in journal_reclaim() at all. But in order to
      keep the for_each_cache() logic consistent for now, this patch fixes the
      above problem by not setting KEY_PTRS of the journal key to 0 if there
      is no bucket available to reclaim.
      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7366d0cd
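A guard in the spirit of this fix can be sketched with toy data structures. `toy_cache`, `toy_key` and `toy_journal_reclaim` are simplified stand-ins invented for the example. The real code works on bkeys and cache sets rather than plain integers.

```c
#include <stddef.h>

#define TOY_MAX_CACHES 4

struct toy_cache { unsigned int cur_idx, discard_idx, nbuckets; };
struct toy_key   { unsigned int nptrs; unsigned int ptr[TOY_MAX_CACHES]; };

/*
 * Advance each cache to its next journal bucket if one is free.
 * The pointer count is only updated when at least one bucket was
 * reclaimed, so a fully occupied journal leaves the previous count intact.
 */
static unsigned int toy_journal_reclaim(struct toy_cache *ca, size_t ncaches,
                                        struct toy_key *k)
{
    unsigned int n = 0;

    for (size_t i = 0; i < ncaches && i < TOY_MAX_CACHES; i++) {
        unsigned int next = (ca[i].cur_idx + 1) % ca[i].nbuckets;

        if (next == ca[i].discard_idx)  /* no space available on this device */
            continue;

        ca[i].cur_idx = next;
        k->ptr[n++] = next;
    }

    if (n)                              /* never store a pointer count of 0 */
        k->nptrs = n;

    return n;
}
```

With the guard in place, a later write loop bounded by `k->nptrs` still has somewhere to send the journal data even when no new bucket could be reclaimed.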
    • bcache: fix a race between cache register and cacheset unregister · 3946cbbe
      Liang Chen authored
      commit a4b732a2 upstream.
      
      There is a race between cache device register and cache set unregister.
      For an already registered cache device, register_bcache will call
      bch_is_open to iterate through all cachesets and check every cache
      there. The race occurs if cache_set_free executes at the same time and
      clears the caches right before ca is dereferenced in bch_is_open_cache.
      To close the race, let's make sure the clean up work is protected by
      the bch_register_lock as well.
      
      This issue can be reproduced as follows,
      while true; do echo /dev/XXX> /sys/fs/bcache/register ; done&
      while true; do echo 1> /sys/block/XXX/bcache/set/unregister ; done &
      
      and results in the following oops,
      
      [  +0.000053] BUG: unable to handle kernel NULL pointer dereference at 0000000000000998
      [  +0.000457] #PF error: [normal kernel read fault]
      [  +0.000464] PGD 800000003ca9d067 P4D 800000003ca9d067 PUD 3ca9c067 PMD 0
      [  +0.000388] Oops: 0000 [#1] SMP PTI
      [  +0.000269] CPU: 1 PID: 3266 Comm: bash Not tainted 5.0.0+ #6
      [  +0.000346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
      [  +0.000472] RIP: 0010:register_bcache+0x1829/0x1990 [bcache]
      [  +0.000344] Code: b0 48 83 e8 50 48 81 fa e0 e1 10 c0 0f 84 a9 00 00 00 48 89 c6 48 89 ca 0f b7 ba 54 04 00 00 4c 8b 82 60 0c 00 00 85 ff 74 2f <49> 3b a8 98 09 00 00 74 4e 44 8d 47 ff 31 ff 49 c1 e0 03 eb 0d
      [  +0.000839] RSP: 0018:ffff92ee804cbd88 EFLAGS: 00010202
      [  +0.000328] RAX: ffffffffc010e190 RBX: ffff918b5c6b5000 RCX: ffff918b7d8e0000
      [  +0.000399] RDX: ffff918b7d8e0000 RSI: ffffffffc010e190 RDI: 0000000000000001
      [  +0.000398] RBP: ffff918b7d318340 R08: 0000000000000000 R09: ffffffffb9bd2d7a
      [  +0.000385] R10: ffff918b7eb253c0 R11: ffffb95980f51200 R12: ffffffffc010e1a0
      [  +0.000411] R13: fffffffffffffff2 R14: 000000000000000b R15: ffff918b7e232620
      [  +0.000384] FS:  00007f955bec2740(0000) GS:ffff918b7eb00000(0000) knlGS:0000000000000000
      [  +0.000420] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  +0.000801] CR2: 0000000000000998 CR3: 000000003cad6000 CR4: 00000000001406e0
      [  +0.000837] Call Trace:
      [  +0.000682]  ? _cond_resched+0x10/0x20
      [  +0.000691]  ? __kmalloc+0x131/0x1b0
      [  +0.000710]  kernfs_fop_write+0xfa/0x170
      [  +0.000733]  __vfs_write+0x2e/0x190
      [  +0.000688]  ? inode_security+0x10/0x30
      [  +0.000698]  ? selinux_file_permission+0xd2/0x120
      [  +0.000752]  ? security_file_permission+0x2b/0x100
      [  +0.000753]  vfs_write+0xa8/0x1a0
      [  +0.000676]  ksys_write+0x4d/0xb0
      [  +0.000699]  do_syscall_64+0x3a/0xf0
      [  +0.000692]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3946cbbe
    • Btrfs: do not start a transaction at iterate_extent_inodes() · c44e237e
      Filipe Manana authored
      commit bfc61c36 upstream.
      
      When finding out which inodes have references on a particular extent, done
      by backref.c:iterate_extent_inodes(), from the BTRFS_IOC_LOGICAL_INO (both
      v1 and v2) ioctl and from scrub we use the transaction join API to grab a
      reference on the currently running transaction, since in order to give
      accurate results we need to inspect the delayed references of the currently
      running transaction.
      
      However, if there is currently no running transaction, the join operation
      will create a new transaction. This is inefficient as the transaction will
      eventually be committed, doing unnecessary IO and introducing a potential
      point of failure that will lead to a transaction abort due to -ENOSPC, as
      recently reported [1].
      
      That's because the join creates the transaction but does not reserve any
      space, so when attempting to update the root item of the root passed to
      btrfs_join_transaction(), during the transaction commit, we can end up
      failing with -ENOSPC. Users of a join operation are supposed to actually
      do some filesystem changes and reserve space by some means, which is not
      the case for iterate_extent_inodes(); it is a read-only operation in all
      contexts from which it is called.
      
      The reported [1] -ENOSPC failure stack trace is the following:
      
       heisenberg kernel: ------------[ cut here ]------------
       heisenberg kernel: BTRFS: Transaction aborted (error -28)
       heisenberg kernel: WARNING: CPU: 0 PID: 7137 at fs/btrfs/root-tree.c:136 btrfs_update_root+0x22b/0x320 [btrfs]
      (...)
       heisenberg kernel: CPU: 0 PID: 7137 Comm: btrfs-transacti Not tainted 4.19.0-4-amd64 #1 Debian 4.19.28-2
       heisenberg kernel: Hardware name: FUJITSU LIFEBOOK U757/FJNB2A5, BIOS Version 1.21 03/19/2018
       heisenberg kernel: RIP: 0010:btrfs_update_root+0x22b/0x320 [btrfs]
      (...)
       heisenberg kernel: RSP: 0018:ffffb5448828bd40 EFLAGS: 00010286
       heisenberg kernel: RAX: 0000000000000000 RBX: ffff8ed56bccef50 RCX: 0000000000000006
       heisenberg kernel: RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff8ed6bda166a0
       heisenberg kernel: RBP: 00000000ffffffe4 R08: 00000000000003df R09: 0000000000000007
       heisenberg kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff8ed63396a078
       heisenberg kernel: R13: ffff8ed092d7c800 R14: ffff8ed64f5db028 R15: ffff8ed6bd03d068
       heisenberg kernel: FS:  0000000000000000(0000) GS:ffff8ed6bda00000(0000) knlGS:0000000000000000
       heisenberg kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       heisenberg kernel: CR2: 00007f46f75f8000 CR3: 0000000310a0a002 CR4: 00000000003606f0
       heisenberg kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       heisenberg kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       heisenberg kernel: Call Trace:
       heisenberg kernel:  commit_fs_roots+0x166/0x1d0 [btrfs]
       heisenberg kernel:  ? _cond_resched+0x15/0x30
       heisenberg kernel:  ? btrfs_run_delayed_refs+0xac/0x180 [btrfs]
       heisenberg kernel:  btrfs_commit_transaction+0x2bd/0x870 [btrfs]
       heisenberg kernel:  ? start_transaction+0x9d/0x3f0 [btrfs]
       heisenberg kernel:  transaction_kthread+0x147/0x180 [btrfs]
       heisenberg kernel:  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
       heisenberg kernel:  kthread+0x112/0x130
       heisenberg kernel:  ? kthread_bind+0x30/0x30
       heisenberg kernel:  ret_from_fork+0x35/0x40
       heisenberg kernel: ---[ end trace 05de912e30e012d9 ]---
      
      So fix that by using the attach API, which does not create a transaction
      when there is currently no running transaction.
      
      [1] https://lore.kernel.org/linux-btrfs/b2a668d7124f1d3e410367f587926f622b3f03a4.camel@scientia.net/
      Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c44e237e
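The difference between joining and attaching can be modelled with a toy state machine. The names `toy_fs`, `toy_join`, `toy_attach` and `toy_iterate` are invented for illustration, and the `-ENOENT` return for "no running transaction" is how this sketch models the attach behaviour.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_fs { bool running; int created; };  /* `created` counts new transactions */

/* Join: reuse the running transaction or create a new one. */
static int toy_join(struct toy_fs *fs)
{
    if (!fs->running) {
        fs->running = true;
        fs->created++;
    }
    return 0;
}

/* Attach: reuse the running transaction, never create one. */
static int toy_attach(struct toy_fs *fs)
{
    return fs->running ? 0 : -ENOENT;
}

/*
 * Read-only walk: returns 1 if the delayed references of a running
 * transaction must be inspected, 0 if there is none, negative on error.
 */
static int toy_iterate(struct toy_fs *fs)
{
    int err = toy_attach(fs);

    if (err == -ENOENT)
        return 0;       /* nothing running: search without a transaction */
    if (err)
        return err;
    return 1;
}
```

Because the read-only path uses attach, it never leaves behind a transaction that must later be committed without any space having been reserved for it.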
    • ext4: fix ext4_show_options for file systems w/o journal · d5258b8a
      Debabrata Banerjee authored
      commit 50b29d8f upstream.
      
      Instead of removing EXT4_MOUNT_JOURNAL_CHECKSUM from s_def_mount_opt as
      I assume was intended, all other options were blown away leading to
      _ext4_show_options() output being incorrect.
      
      Fixes: 1e381f60 ("ext4: do not allow journal_opts for fs w/o journal")
      Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5258b8a
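One way a "remove a single flag" operation can end up wiping every other flag is a missing bitwise complement. The sketch below shows that pattern. The flag values and function names are hypothetical and chosen only for the example; they are not ext4's definitions.

```c
#include <stdint.h>

/* Hypothetical flag values chosen for the example only. */
#define TOY_MOUNT_JOURNAL_CHECKSUM 0x0800u
#define TOY_MOUNT_ERRORS_RO        0x0020u
#define TOY_MOUNT_BLOCK_VALIDITY   0x0002u

/* Buggy form: keeps only the journal checksum bit, wiping all others. */
static uint32_t toy_mask_buggy(uint32_t def_opts)
{
    return def_opts & TOY_MOUNT_JOURNAL_CHECKSUM;
}

/* Intended form: clears the journal checksum bit, keeps the rest. */
static uint32_t toy_mask_fixed(uint32_t def_opts)
{
    return def_opts & ~TOY_MOUNT_JOURNAL_CHECKSUM;
}
```

Any code that later compares current options against the defaults, as an option-printing routine would, sees the wrong defaults after the buggy form has run.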
    • ext4: actually request zeroing of inode table after grow · 083b19c7
      Kirill Tkhai authored
      commit 310a997f upstream.
      
      The number of block groups can never decrease,
      since only online grow is supported.

      But after a grow has occurred, we have to zero the inode tables
      of the newly created block groups.
      
      Fixes: 19c5246d ("ext4: add new online resize interface")
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      083b19c7
    • jbd2: check superblock mapped prior to committing · bd8f3bc2
      Jiufei Xue authored
      commit 742b06b5 upstream.
      
      We hit a BUG at fs/buffer.c:3057 if we detached the nbd device
      before unmounting the ext4 filesystem.
      
      The typical chain of events leading to the BUG:
      jbd2_write_superblock
        submit_bh
          submit_bh_wbc
            BUG_ON(!buffer_mapped(bh));
      
      The block device is removed and all the pages are invalidated. JBD2
      was trying to write journal superblock to the block device which is
      no longer present.
      
      Fix this by checking the journal superblock's buffer head prior to
      submitting.
      Reported-by: Eric Ren <renzhen@linux.alibaba.com>
      Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bd8f3bc2
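Checking a buffer's state before submitting it can be sketched as below. `toy_bh`, `toy_submit` and `toy_write_superblock` are invented stand-ins, and the choice of `-EIO` as the error code is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_bh { bool mapped; int submitted; };  /* models a buffer head */

/* Stand-in for the submit step: must only be called on a mapped buffer. */
static void toy_submit(struct toy_bh *bh)
{
    bh->submitted++;
}

/* Refuse to write the superblock when its buffer is no longer mapped. */
static int toy_write_superblock(struct toy_bh *bh)
{
    if (!bh->mapped)
        return -EIO;    /* device is gone: report an error instead of crashing */

    toy_submit(bh);
    return 0;
}
```

Returning an error here turns what would otherwise be a fatal assertion in the submit path into a condition the caller can handle.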
    • tty/vt: fix write/write race in ioctl(KDSKBSENT) handler · 948c9cec
      Sergei Trofimovich authored
      commit 46ca3f73 upstream.
      
      The bug manifests as an attempt to access deallocated memory:
      
          BUG: unable to handle kernel paging request at ffff9c8735448000
          #PF error: [PROT] [WRITE]
          PGD 288a05067 P4D 288a05067 PUD 288a07067 PMD 7f60c2063 PTE 80000007f5448161
          Oops: 0003 [#1] PREEMPT SMP
          CPU: 6 PID: 388 Comm: loadkeys Tainted: G         C        5.0.0-rc6-00153-g5ded5871 #91
          Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
          RIP: 0010:__memmove+0x81/0x1a0
          Code: 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00 00 66 90 48 89 d1 4c 8b 5c 16 f8 4c 8d 54 17 f8 48 c1 e9 03 <f3> 48 a5 4d 89 1a e9 0c 01 00 00 0f 1f 40 00 48 89 d1 4c 8b 1e 49
          RSP: 0018:ffffa1b9002d7d08 EFLAGS: 00010203
          RAX: ffff9c873541af43 RBX: ffff9c873541af43 RCX: 00000c6f105cd6bf
          RDX: 0000637882e986b6 RSI: ffff9c8735447ffb RDI: ffff9c8735447ffb
          RBP: ffff9c8739cd3800 R08: ffff9c873b802f00 R09: 00000000fffff73b
          R10: ffffffffb82b35f1 R11: 00505b1b004d5b1b R12: 0000000000000000
          R13: ffff9c873541af3d R14: 000000000000000b R15: 000000000000000c
          FS:  00007f450c390580(0000) GS:ffff9c873f180000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffff9c8735448000 CR3: 00000007e213c002 CR4: 00000000000606e0
          Call Trace:
           vt_do_kdgkb_ioctl+0x34d/0x440
           vt_ioctl+0xba3/0x1190
           ? __bpf_prog_run32+0x39/0x60
           ? mem_cgroup_commit_charge+0x7b/0x4e0
           tty_ioctl+0x23f/0x920
           ? preempt_count_sub+0x98/0xe0
           ? __seccomp_filter+0x67/0x600
           do_vfs_ioctl+0xa2/0x6a0
           ? syscall_trace_enter+0x192/0x2d0
           ksys_ioctl+0x3a/0x70
           __x64_sys_ioctl+0x16/0x20
           do_syscall_64+0x54/0xe0
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The bug manifests on systemd systems with multiple vtcon devices:
        # cat /sys/devices/virtual/vtconsole/vtcon0/name
        (S) dummy device
        # cat /sys/devices/virtual/vtconsole/vtcon1/name
        (M) frame buffer device
      
      There, systemd runs the 'loadkeys' tool in parallel for each vtcon
      instance. This causes two parallel ioctl(KDSKBSENT) calls to
      race into adding the same entry into the 'func_table' array at:
      
          drivers/tty/vt/keyboard.c:vt_do_kdgkb_ioctl()
      
      The function has no locking around writes to 'func_table'.
      
      The simplest reproducer is an initramfs with the following
      init on an 8-CPU x86_64 machine:
      
          #!/bin/sh
      
          loadkeys -q windowkeys ru4 &
          loadkeys -q windowkeys ru4 &
          loadkeys -q windowkeys ru4 &
          loadkeys -q windowkeys ru4 &
      
          loadkeys -q windowkeys ru4 &
          loadkeys -q windowkeys ru4 &
          loadkeys -q windowkeys ru4 &
          loadkeys -q windowkeys ru4 &
          wait
      
      The change adds lock on write path only. Reads are still racy.
      
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: Jiri Slaby <jslaby@suse.com>
      Link: https://lkml.org/lkml/2019/2/17/256
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      948c9cec