1. 30 Oct, 2014 9 commits
    • Sage Weil's avatar
      Btrfs: fix race in WAIT_SYNC ioctl · 7d6d0aa7
      Sage Weil authored
      commit 42383020 upstream.
      
      We check whether transid is already committed via last_trans_committed and
      then search through trans_list for pending transactions.  If
      last_trans_committed is updated by btrfs_commit_transaction after we check
      it (there is no locking), we will fail to find the committed transaction
      and return EINVAL to the caller.  This has been observed occasionally by
      ceph-osd (which uses this ioctl heavily).
      
      Fix by rechecking whether the provided transid <= last_trans_committed
      after the search fails, and if so return 0.
      Signed-off-by: default avatarSage Weil <sage@redhat.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d6d0aa7
    • Josef Bacik's avatar
      Btrfs: fix build_backref_tree issue with multiple shared blocks · b341d2a4
      Josef Bacik authored
      commit bbe90514 upstream.
      
      Marc Merlin sent me a broken fs image months ago where it would blow up in the
      upper->checked BUG_ON() in build_backref_tree.  This is because we had a
      scenario like this
      
      block a -- level 4 (not shared)
         |
      block b -- level 3 (reloc block, shared)
         |
      block c -- level 2 (not shared)
         |
      block d -- level 1 (shared)
         |
      block e -- level 0 (shared)
      
      We go to build a backref tree for block e, we notice block d is shared and add
      it to the list of blocks to lookup it's backrefs for.  Now when we loop around
      we will check edges for the block, so we will see we looked up block c last
      time.  So we lookup block d and then see that the block that points to it is
      block c and we can just skip that edge since we've already been up this path.
      The problem is because we clear need_check when we see block d (as it is shared)
      we never add block b as needing to be checked.  And because block c is in our
      path already we bail out before we walk up to block b and add it to the backref
      check list.
      
      To fix this we need to reset need_check if we trip over a block that doesn't
      need to be checked.  This will make sure that any subsequent blocks in the path
      as we're walking up afterwards are added to the list to be processed.  With this
      patch I can now mount Marc's fs image and it'll complete the balance without
      panicing.  Thanks,
      Reported-by: default avatarMarc MERLIN <marc@merlins.org>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b341d2a4
    • Josef Bacik's avatar
      Btrfs: cleanup error handling in build_backref_tree · eb7ddab5
      Josef Bacik authored
      commit 75bfb9af upstream.
      
      When balance panics it tends to panic in the
      
      BUG_ON(!upper->checked);
      
      test, because it means it couldn't build the backref tree properly.  This is
      annoying to users and frankly a recoverable error, nothing in this function is
      actually fatal since it is just an in-memory building of the backrefs for a
      given bytenr.  So go through and change all the BUG_ON()'s to ASSERT()'s, and
      fix the BUG_ON(!upper->checked) thing to just return an error.
      
      This patch also fixes the error handling so it tears down the work we've done
      properly.  This code was horribly broken since we always just panic'ed instead
      of actually erroring out, so it needed to be completely re-worked.  With this
      patch my broken image no longer panics when I mount it.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eb7ddab5
    • Josef Bacik's avatar
      Btrfs: try not to ENOSPC on log replay · 581aa18a
      Josef Bacik authored
      commit 1d52c78a upstream.
      
      When doing log replay we may have to update inodes, which traditionally goes
      through our delayed inode stuff.  This will try to move space over from the
      trans handle, but we don't reserve space in our trans handle on replay since we
      don't know how much we will need, so instead we try to flush.  But because we
      have a trans handle open we won't flush anything, so if we are out of reserve
      space we will simply return ENOSPC.  Since we know that if an operation made it
      into the log then we definitely had space before the box bought the farm then we
      don't need to worry about doing this space reservation.  Use the
      fs_info->log_root_recovering flag to skip the delayed inode stuff and update the
      item directly.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      581aa18a
    • Josef Bacik's avatar
      Btrfs: don't do async reclaim during log replay · 206b329c
      Josef Bacik authored
      commit f6acfd50 upstream.
      
      Trying to reproduce a log enospc bug I hit a panic in the async reclaim code
      during log replay.  This is because we use fs_info->fs_root as our root for
      shrinking and such.  Technically we can use whatever root we want, but let's
      just not allow async reclaim while we're doing log replay.  Thanks,
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      206b329c
    • Liu Bo's avatar
      Btrfs: fix up bounds checking in lseek · aee22378
      Liu Bo authored
      commit 4d1a40c6 upstream.
      
      An user reported this, it is because that lseek's SEEK_SET/SEEK_CUR/SEEK_END
      allow a negative value for @offset, but btrfs's SEEK_DATA/SEEK_HOLE don't
      prepare for that and convert the negative @offset into unsigned type,
      so we get (end < start) warning.
      
      [ 1269.835374] ------------[ cut here ]------------
      [ 1269.836809] WARNING: CPU: 0 PID: 1241 at fs/btrfs/extent_io.c:430 insert_state+0x11d/0x140()
      [ 1269.838816] BTRFS: end < start 4094 18446744073709551615
      [ 1269.840334] CPU: 0 PID: 1241 Comm: a.out Tainted: G        W      3.16.0+ #306
      [ 1269.858229] Call Trace:
      [ 1269.858612]  [<ffffffff81801a69>] dump_stack+0x4e/0x68
      [ 1269.858952]  [<ffffffff8107894c>] warn_slowpath_common+0x8c/0xc0
      [ 1269.859416]  [<ffffffff81078a36>] warn_slowpath_fmt+0x46/0x50
      [ 1269.859929]  [<ffffffff813b0fbd>] insert_state+0x11d/0x140
      [ 1269.860409]  [<ffffffff813b1396>] __set_extent_bit+0x3b6/0x4e0
      [ 1269.860805]  [<ffffffff813b21c7>] lock_extent_bits+0x87/0x200
      [ 1269.861697]  [<ffffffff813a5b28>] btrfs_file_llseek+0x148/0x2a0
      [ 1269.862168]  [<ffffffff811f201e>] SyS_lseek+0xae/0xc0
      [ 1269.862620]  [<ffffffff8180b212>] system_call_fastpath+0x16/0x1b
      [ 1269.862970] ---[ end trace 4d33ea885832054b ]---
      
      This assumes that btrfs starts finding DATA/HOLE from the beginning of file
      if the assigned @offset is negative.
      
      Also we add alignment for lock_extent_bits 's range.
      Reported-by: default avatarToralf Förster <toralf.foerster@gmx.de>
      Signed-off-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aee22378
    • Filipe Manana's avatar
      Btrfs: add missing compression property remove in btrfs_ioctl_setflags · 31b89b44
      Filipe Manana authored
      commit 78a017a2 upstream.
      
      The behaviour of a 'chattr -c' consists of getting the current flags,
      clearing the FS_COMPR_FL bit and then sending the result to the set
      flags ioctl - this means the bit FS_NOCOMP_FL isn't set in the flags
      passed to the ioctl. This results in the compression property not being
      cleared from the inode - it was cleared only if the bit FS_NOCOMP_FL
      was set in the received flags.
      
      Reproducer:
      
          $ mkfs.btrfs -f /dev/sdd
          $ mount /dev/sdd /mnt && cd /mnt
          $ mkdir a
          $ chattr +c a
          $ touch a/file
          $ lsattr a/file
          --------c------- a/file
          $ chattr -c a
          $ touch a/file2
          $ lsattr a/file2
          --------c------- a/file2
          $ lsattr -d a
          ---------------- a
      Reported-by: default avatarAndreas Schneider <asn@cryptomilk.org>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      31b89b44
    • Qu Wenruo's avatar
      btrfs: Fix a deadlock in btrfs_dev_replace_finishing() · 88579aa5
      Qu Wenruo authored
      commit 12b894cb upstream.
      
      btrfs-transacion:5657
      [stack snip]
      btrfs_bio_map()
          btrfs_bio_counter_inc_blocked()
              percpu_counter_inc(&fs_info->bio_counter)  ###bio_counter > 0(A)
              __btrfs_bio_map()
                  btrfs_dev_replace_lock()
                      mutex_lock(dev_replace->lock)	   ###wait mutex(B)
      
      btrfs:32612
      [stack snip]
      btrfs_dev_replace_start()
          btrfs_dev_replace_lock()
      	mutex_lock(dev_replace->lock)		   ###hold mutex(B)
          btrfs_dev_replace_finishing()
              btrfs_rm_dev_replace_blocked()
                  wait until percpu_counter_sum == 0	   ###wait on bio_counter(A)
      
      This bug can be triggered quite easily by the following test script:
      http://pastebin.com/MQmb37Cy
      
      This patch will fix the ABBA problem by calling
      btrfs_dev_replace_unlock() before btrfs_rm_dev_replace_blocked().
      
      The consistency of btrfs devices list and their superblocks is protected
      by device_list_mutex, not btrfs_dev_replace_lock/unlock().
      So it is safe the move btrfs_dev_replace_unlock() before
      btrfs_rm_dev_replace_blocked().
      Reported-by: default avatarZhao Lei <zhaolei@cn.fujitsu.com>
      Signed-off-by: default avatarQu Wenruo <quwenruo@cn.fujitsu.com>
      Cc: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      88579aa5
    • David Sterba's avatar
      btrfs: wake up transaction thread from SYNC_FS ioctl · 4abbb927
      David Sterba authored
      commit 2fad4e83 upstream.
      
      The transaction thread may want to do more work, namely it pokes the
      cleaner ktread that will start processing uncleaned subvols.
      
      This can be triggered by user via the 'btrfs fi sync' command, otherwise
      there was a delay up to 30 seconds before the cleaner started to clean
      old snapshots.
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.cz>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4abbb927
  2. 15 Oct, 2014 31 commits