1. 06 May, 2013 40 commits
    • Btrfs: fix unblocked autodefraggers when remount · f42a34b2
      Miao Xie authored
      The new mount options are set only after the remount arguments have
      been parsed, so it is wrong to check whether autodefrag has been
      turned off in btrfs_remount_prepare(). Fix it.
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: add a rb_tree to improve performance of ulist search · f7f82b81
      Wang Shilong authored
      Walking the backref tree and btrfs quota both rely heavily on ulist.
      This patch uses an rb_tree to speed up ulist searches.

      The original code always checks whether an element already exists
      before adding a new one, and that check costs O(n).

      Add an rb_tree to the ulist; it is used only to speed up the search.
      I also did some measurements with quota enabled.

      fsstress -p 4 -n 10000

      Without this patch:
              run 1           run 2           run 3           run 4           average
      real    0m51.058s       2m4.745s        1m28.222s       1m5.137s
      user    0m0.035s        0m0.041s        0m0.105s        0m0.100s
      sys     0m12.009s       0m11.246s       0m10.901s       0m10.999s       0m11.287s

      With this patch:
              run 1           run 2           run 3           run 4           average
      real    0m55.295s       0m50.960s       1m2.214s        0m48.273s
      user    0m0.053s        0m0.095s        0m0.135s        0m0.107s
      sys     0m7.766s        0m6.013s        0m6.319s        0m6.030s        0m6.532s

      After applying the patch, the average sys time is down by ~42% (11.287s -> 6.532s).
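The idea of keeping the list for ordered iteration and adding a second structure purely for lookups can be sketched as follows. This is a minimal model, not the kernel code: the class name `IndexedUlist` is hypothetical, and a sorted Python list searched with `bisect` stands in for the rb_tree.

```python
import bisect


class IndexedUlist:
    """Append-only list of values plus a sorted index used only for lookups."""

    def __init__(self):
        self.nodes = []    # insertion order, what callers iterate over
        self._index = []   # sorted copy of the values, stand-in for the rb_tree

    def _find(self, val):
        # Binary search: O(log n) comparisons instead of scanning self.nodes.
        pos = bisect.bisect_left(self._index, val)
        found = pos < len(self._index) and self._index[pos] == val
        return found, pos

    def add(self, val):
        """Return 1 if val was added, 0 if it was already present."""
        found, pos = self._find(val)
        if found:
            return 0
        self.nodes.append(val)
        # list.insert is O(n) in Python; a balanced tree keeps this O(log n).
        self._index.insert(pos, val)
        return 1


ul = IndexedUlist()
results = [ul.add(v) for v in (30, 10, 20, 10, 30)]
print(results, ul.nodes)
```

The duplicate check no longer touches the insertion-ordered list, which is the part the patch description identifies as the O(n) cost.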
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: allow omitting stream header and end-cmd for btrfs send · c2c71324
      Stefan Behrens authored
      Two new flags are added to allow omitting the stream header and the
      end command for btrfs send streams. This is used in cases where you
      send multiple snapshots back-to-back in one stream.
      
      This used to be encoded like this (with 2 snapshots in this example):
      <stream header> + <sequence of commands> + <end cmd> +
      <stream header> + <sequence of commands> + <end cmd> + EOF
      
      The new format (if the two new flags are used) is this one:
      <stream header> + <sequence of commands> +
                        <sequence of commands> + <end cmd>
      
      Note that the currently existing receivers treat <end cmd> only as
      an indication that a new <stream header> is following. This means
      you can just skip the sequence <end cmd> <stream header> without
      losing compatibility. As long as an EOF follows, the currently
      existing receivers handle the new format (if the two new flags are
      used) exactly like the old one.
      
      So what is the benefit of this change? The goal is to be able to use
      a single stream (one TCP connection) to multiplex a request/response
      handshake plus Btrfs send streams, all in the same stream. In this
      case you cannot evaluate an EOF condition as an end of the Btrfs send
      stream. You need something else, and the <end cmd> is just perfect
      for this purpose.
      
      The summary is:
      The format change is driven by the need to send several Btrfs send
      streams over a single TCP connection, with the ability for a repeated
      request/response handshake in the middle. And this format change does
      not break any existing tool; it is completely compatible.

      You could compare the old behaviour of the Btrfs send stream to that
      of FTP, where you need a separate request/response channel and
      newly opened data transfer channels for each file, while the new
      behaviour is more like HTTP, using a single stream for everything.
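The compatibility argument can be checked with a toy model of a receiver. This is only a sketch: the token strings and the function `received_commands` are hypothetical, and the receiver logic is reduced to the one property described above, namely that an end command merely announces a possible new header.

```python
HDR, END = "<stream header>", "<end cmd>"


def received_commands(stream):
    """Model of a receiver that reads tokens until EOF (end of the list)."""
    commands = []
    expect_header = True
    for token in stream:
        if expect_header:
            if token != HDR:
                raise ValueError("expected a stream header")
            expect_header = False
        elif token == END:
            # An end command only means that a new header may follow.
            expect_header = True
        else:
            commands.append(token)
    return commands


# Two snapshots sent back-to-back, old format and new format.
old_format = [HDR, "write a", "write b", END, HDR, "write c", END]
new_format = [HDR, "write a", "write b", "write c", END]

print(received_commands(old_format) == received_commands(new_format))
```

Both encodings yield the same command sequence in this model, while the new one still carries an explicit end marker that a multiplexing protocol can use instead of EOF.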
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: make __merge_refs() return type be void · 692206b1
      Wang Shilong authored
      __merge_refs() always returns 0, so it is unnecessary
      for the caller to check the return value.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: remove some BUG_ONs() when walking backref tree · 1149ab6b
      Wang Shilong authored
      The only error return value of __add_prelim_ref() is -ENOMEM,
      so just return the error rather than triggering BUG_ON().
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: use tree_root to avoid edquot when disabling quota · 92f183aa
      Wang Shilong authored
      Steps to reproduce:
      	mkfs.btrfs <disk>
      	mount <disk> <mnt>
      	btrfs quota enable <mnt>
      	btrfs sub create <mnt>/subv
      	btrfs qgroup limit 10K <mnt>/subv
      	btrfs quota disable <mnt>/subv
      
      It is wrong for qgroup to reserve space when disabling quota,
      so just use tree_root to avoid EDQUOT when disabling quota.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix a warning when updating qgroup limit · ddb47afa
      Wang Shilong authored
      Steps to reproduce:
      	mkfs.btrfs <disk>
      	mount <disk> <mnt>
      	btrfs quota enable <mnt>
      	btrfs qgroup limit 0/1 <mnt>
      	dmesg
      
      If the relevant qgroup doesn't exist, the flag
      'BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT' will be set and a noisy
      message printed. This is wrong; we can just move find_qgroup_rb()
      before update_qgroup_limit_item(). This doesn't change the logic of
      the function, but it avoids the unnecessary message and the wrongly
      set flag.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix missing check in the btrfs_qgroup_inherit() · 3f5e2d3b
      Wang Shilong authored
      The original code forgot to check 'inherit'; we should
      guarantee that all the qgroups in the struct 'inherit' exist.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix missing check before creating a qgroup relation · b7fef4f5
      Wang Shilong authored
      Steps to reproduce:
      		mkfs.btrfs <disk>
      		mount <disk> <mnt>
      		btrfs quota enable <mnt>
      		btrfs qgroup assign 0/1 1/1 <mnt>
      		umount <mnt>
      		btrfs-debug-tree <disk> | grep QGROUP
      If we want to add a qgroup relation, we should guarantee that
      'src' and 'dst' exist; otherwise the qgroup relation should
      not be created.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: remove some unnecessary spin_lock usages · 58400fce
      Wang Shilong authored
      A mutex lock now protects all the user change operations, so we
      don't have to hold the spin_lock when calling find_qgroup_rb() to
      check whether a qgroup exists.

      Besides, enabling/disabling quota is single-threaded by the time
      operations get here. The spin lock is still needed first to clear
      quota_root when disabling quota, and to complete the final
      assignment when enabling quota.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: introduce a mutex lock for btrfs quota operations · f2f6ed3d
      Wang Shilong authored
      The original code has one spin_lock 'qgroup_lock' to protect the quota
      configuration in memory. If we want to add a BTRFS_QGROUP_INFO_KEY,
      it is first added to the B-tree and then the configuration in memory
      is updated. However, a race condition may happen between these operations.
      For example:
      	->add_qgroup_info_item()
      		->add_qgroup_rb()
      
      For the above case, del_qgroup_info_item() may happen just before
      add_qgroup_rb().
      
      What's worse, when we want to add a qgroup relation:
      	->add_qgroup_relation_item()
      		->add_qgroup_relations()
      
      We don't have any checks whether 'src' and 'dst' exist before
      add_qgroup_relation_item(), a race condition can also happen for
      the above case.
      
      To avoid the race conditions and have all the necessary checks, we introduce
      a mutex lock 'qgroup_ioctl_lock' and protect all the user change operations
      with it.
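A rough model of the locking scheme, assuming nothing about the real kernel data structures: the class `QgroupConfig`, its sets, and the qgroup ids are hypothetical, and `threading.Lock` plays the role of 'qgroup_ioctl_lock'. The point is that the on-disk update, the in-memory update, and the existence checks all happen inside one critical section.

```python
import threading


class QgroupConfig:
    def __init__(self):
        self.ioctl_lock = threading.Lock()  # stand-in for qgroup_ioctl_lock
        self.on_disk = set()      # stand-in for items in the quota tree
        self.in_memory = set()    # stand-in for the in-memory configuration
        self.relations = set()

    def create(self, qgroupid):
        # Both updates happen under the lock, so a concurrent remove()
        # cannot run between them.
        with self.ioctl_lock:
            self.on_disk.add(qgroupid)
            self.in_memory.add(qgroupid)

    def remove(self, qgroupid):
        with self.ioctl_lock:
            self.on_disk.discard(qgroupid)
            self.in_memory.discard(qgroupid)

    def assign(self, src, dst):
        # The existence check and the insertion form one atomic step.
        with self.ioctl_lock:
            if src not in self.in_memory or dst not in self.in_memory:
                return False
            self.relations.add((src, dst))
            return True


cfg = QgroupConfig()
cfg.create("0/257")
print(cfg.assign("0/257", "1/1"))  # False, since 1/1 does not exist yet
```

A sleeping lock fits here because the protected section includes tree updates that may block, which a spin lock cannot cover.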
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: creating the subvolume qgroup automatically when enabling quota · 7708f029
      Wang Shilong authored
      Create the qgroups for subvolumes and snapshots (including the root
      subvolume) automatically when enabling quota.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: abort unlink trans in missed error case · d4e3991b
      Zach Brown authored
      __btrfs_unlink_inode() aborts its transaction when it sees errors after
      it removes the directory item.  But it missed the case where
      btrfs_del_dir_entries_in_log() returns an error.  If this happens then
      the unlink appears to fail but the items have been removed without
      updating the directory size.  The directory then has leaked bytes in
      i_size and can never be removed.
      
      Adding the missing transaction abort at least makes this failure
      consistent with the other failure cases.
      
      I noticed this while reading the code after someone on irc reported
      having a directory with i_size but no entries.  I tested it by forcing
      btrfs_del_dir_entries_in_log() to return -ENOMEM.
      Signed-off-by: Zach Brown <zab@redhat.com>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: ignore device open failures in __btrfs_open_devices · f63e0cca
      Eric Sandeen authored
      This:
      
         # mkfs.btrfs /dev/sdb{1,2} ; wipefs -a /dev/sdb1; mount /dev/sdb2 /mnt/test
      
      would lead to a blkdev open/close mismatch when the mount fails, and
      a permanently busy (opened O_EXCL) sdb2:
      
         # wipefs -a /dev/sdb2
         wipefs: error: /dev/sdb2: probing initialization failed: Device or resource busy
      
      It's because btrfs_open_devices() may open some devices, fail on
      the last one, and return that failure stored in "ret".  The mount
      then fails, but the caller does not clean up the open devices.
      
      Chris assures me that:
      
      "btrfs_open_devices just means: go off and open every bdev you can from
      this uuid.  It should return success if we opened any of them at all."
      
      So change the logic to ignore any open failures; just skip processing
      of that device.  Later on it's decided whether we have enough devices
      to continue.
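The changed control flow can be sketched like this. It is an illustration only: `open_devices` and `fake_open` are hypothetical names, and the failing device is taken from the reproducer above, where /dev/sdb1 has been wiped.

```python
def open_devices(paths, open_one):
    """Open every device that can be opened and skip the ones that fail."""
    opened = []
    for path in paths:
        try:
            opened.append(open_one(path))
        except OSError:
            continue  # ignore this device and keep going
    return opened


def fake_open(path):
    # Hypothetical opener: /dev/sdb1 was wiped, so opening it fails.
    if path == "/dev/sdb1":
        raise OSError("no btrfs signature")
    return "handle:" + path


handles = open_devices(["/dev/sdb1", "/dev/sdb2"], fake_open)
print(handles)
```

Because no single failure is propagated, the caller always gets the full list of what was opened and can decide afterwards whether that is enough to continue.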
      Reported-by: Jan Safranek <jsafrane@redhat.com>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: improve the performance of the csums lookup · e4100d98
      Miao Xie authored
      A bio very likely contains several blocks, and it is very
      inefficient to get their csums one by one. This patch fixes
      that by getting the csums in a batch.

      According to the result of the following test, the execution time of
      __btrfs_lookup_bio_sums() is down by ~28% (300us -> 217us).

       # dd if=<mnt>/file of=/dev/null bs=1M count=1024
      Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix bad extent logging · 09a2a8f9
      Josef Bacik authored
      A user sent me a btrfs-image of a file system that was panicking on mount during
      the log recovery.  I had originally thought these problems were from a bug in
      the free space cache code, but that was just a symptom of the problem.  The
      problem is if your application does something like this
      
      [prealloc][prealloc][prealloc]
      
      the internal extent maps will merge those all together into one extent map, even
      though on disk they are 3 separate extents.  So if you go to write into one of
      these ranges the extent map will be right since we use the physical extent when
      doing the write, but when we log the extents they will use the wrong sizes for
      the remainder prealloc space.  If this doesn't happen to trip up the free space
      cache (which it won't in a lot of cases) then you will get bogus entries in your
      extent tree which will screw stuff up later.  The data and such will still work,
      but everything else is broken.  This patch fixes this by not allowing extents
      that are on the modified list to be merged.  This has the side effect that we
      are no longer adding everything to the modified list all the time, which means
      we now have to call btrfs_drop_extents every time we log an extent into the
      tree.  So this allows me to drop all this speciality code I was using to get
      around calling btrfs_drop_extents.  With this patch the testcase I've created no
      longer creates a bogus file system after replaying the log.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: log ram bytes properly · cc95bef6
      Josef Bacik authored
      When logging changed extents I was logging ram_bytes as the current length,
      which isn't correct, it's supposed to be the ram bytes of the original extent.
      This is for compression where even if we split the extent we need to know the
      ram bytes so when we uncompress the extent we know how big it will be.  This was
      still working out right with compression for some reason but I think we were
      getting lucky.  It was definitely off for prealloc which is why I noticed it,
      btrfsck was complaining about it.  With this patch btrfsck no longer complains
      after a log replay.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: don't wait on ordered extents if we have a trans open · 98ad69cf
      Josef Bacik authored
      Dave was hitting a lockdep warning because we're now properly taking the ordered
      operations mutex in the ordered wait stuff.  This is because in some cases we will
      have a trans handle when we are flushing delalloc space, but we can't wait on
      ordered extents because we could potentially deadlock, so fix this by not doing
      the wait if we have a trans handle.  Thanks,
      Reported-and-tested-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix error handling in make/read block group · 8c579fe7
      Josef Bacik authored
      I noticed that we will add a block group to the space info before we add it to
      the block group cache rb tree, so we could potentially allocate from the block
      group before it's able to be searched for.  I don't think this is too much of
      a problem, the race window is microscopic, but just in case move the tree
      insertion to above the space info linking.  This makes it easier to adjust the
      error handling as well, so we can remove a couple of BUG_ON(ret)'s and have real
      error handling setup for these scenarios.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix double free in the iterate_extent_inodes() · 5c2d867f
      Wang Shilong authored
      If btrfs_find_all_roots() fails, 'roots' has either already been freed
      or was never allocated, so we must not free it again outside
      btrfs_find_all_roots(). Fix it.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: kill some BUG_ONs() in the find_parent_nodes() · f1723939
      Wang Shilong authored
      The only reason BUG_ON() triggers in these places is ENOMEM.

      Return ENOMEM rather than triggering BUG_ON(); the caller will
      abort the transaction, thus avoiding the kernel panic.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
      Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: compare relevant parts of delayed tree refs · 41b0fc42
      Josef Bacik authored
      A user reported a panic while running a balance.  What was happening was he was
      relocating a block, which added the reference to the relocation tree.  Then
      relocation would walk through the relocation tree and drop that reference and
      free that block, and then it would walk down a snapshot which referenced the
      same block and add another ref to the block.  The problem is this was all
      happening in the same transaction, so the parent block was free'ed up when we
      drop our reference which was immediately available for allocation, and then it
      was used _again_ to add a reference for the same block from a different
      snapshot.  This resulted in something like this in the delayed ref tree
      
      add ref to 90234880, parent=2067398656, ref_root 1766, level 1
      del ref to 90234880, parent=2067398656, ref_root 18446744073709551608, level 1
      add ref to 90234880, parent=2067398656, ref_root 1767, level 1
      
      as you can see the ref_root's don't match, because when we inc the ref we use
      the header owner, which is the original tree the block belonged to, instead of
      the data reloc tree.  Then when we remove the extent we use the reloc tree
      objectid.  But none of this matters, since it is a shared reference which means
      only the parent matters.  When the delayed ref stuff runs it adds all the
      increments first, and then does all the drops, to make sure that we don't delete
      the ref if we net a positive ref count.  But tree blocks aren't allowed to have
      multiple refs from the same block, so this panics when it tries to add the
      second ref.  We need the add and the drop to cancel each other out in memory so
      we only do the final add.
      
      So to fix this we need to adjust how the delayed refs are added to the tree.
      Only the ref_root matters when it is a normal backref, and only the parent
      matters when it is a shared backref.  So make our decision based on what ref
      type we have.  This allows us to keep the ref_root in memory in case anybody
      wants to use it for something else, and it allows the delayed refs to be merged
      properly so we don't end up with this panic.
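The comparison rule can be illustrated with the three delayed refs quoted above. This is a simplified model rather than the kernel's delayed ref code: the `TreeRef` tuple, `merge_key` and `net_counts` are hypothetical names, and merging is reduced to summing +1 and -1 per key.

```python
from collections import namedtuple

TreeRef = namedtuple("TreeRef", "action bytenr parent ref_root shared")


def merge_key(ref):
    # Shared backref: only the parent matters. Normal backref: only ref_root.
    owner = ("parent", ref.parent) if ref.shared else ("root", ref.ref_root)
    return (ref.bytenr, owner)


def net_counts(refs):
    """Sum +1 for each add and -1 for each drop, per merge key."""
    counts = {}
    for ref in refs:
        delta = 1 if ref.action == "add" else -1
        key = merge_key(ref)
        counts[key] = counts.get(key, 0) + delta
    return counts


refs = [
    TreeRef("add", 90234880, 2067398656, 1766, True),
    TreeRef("del", 90234880, 2067398656, 18446744073709551608, True),
    TreeRef("add", 90234880, 2067398656, 1767, True),
]

counts = net_counts(refs)
# Comparing ref_root as well would keep the three entries apart.
old_keys = {(r.bytenr, r.parent, r.ref_root) for r in refs}
print(counts, len(old_keys))
```

With the parent-only key the add, drop and add collapse into a single net add, whereas a key that also includes ref_root leaves three distinct entries and therefore two adds for the same parent.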
      
      With this patch the user's image no longer panics on mount, and it has a clean
      fsck after a normal mount/umount cycle.  Thanks,
      
      Cc: stable@vger.kernel.org
      Reported-by: Roman Mamedov <rm@romanrm.ru>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix infinite loop when we abort on mount · cf79ffb5
      Josef Bacik authored
      Testing my enospc log code I managed to abort a transaction during mount, which
      put me into an infinite loop.  This is because of two things, first we don't
      reset trans_no_join if we abort during transaction commit, which will force
      anybody trying to start a transaction to just loop endlessly waiting for it to
      be set to 0.  But this is still just a symptom, the second issue is we don't set
      the fs state to error during errors on mount.  This is because we don't want to
      do the flip read only thing during mount, but we still really want to set the fs
      state to an error to keep us from even getting to the trans_no_join check.  So
      fix both of these things, make sure to reset trans_no_join if we abort during a
      commit, and make sure we set the fs state to error no matter if we're mounting
      or not.  This should keep us from getting into this infinite loop again.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: fix a warning when disabling quota · c9a9dbf2
      Wang Shilong authored
      Steps to reproduce:
      	mkfs.btrfs <disk>
      	mount <disk> <mnt>
      	btrfs quota enable <mnt>
      	btrfs sub create <mnt>/subv
      
      	i=1
      	while [ $i -le 10000 ]
      	do
      		dd if=/dev/zero of=<mnt>/subv/data_$i bs=1K count=1
      		i=$(($i+1))
      		if [ $i -eq 500 ]
      		then
      			btrfs quota disable <mnt>
      		fi
      	done
      	dmesg
      Obviously, this WARN_ON() is unnecessary and easily triggered.
      Just remove it.
      Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: pass NULL instead of 0 · 6b67a320
      Liu Bo authored
      set_extent_bit()'s (u64 *failed_start) expects NULL not 0.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: document mount options in Documentation/fs/btrfs.txt · c854a990
      Eric Sandeen authored
      Document all current btrfs mount options.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: make subvol creation/deletion killable in the early stages · 5c50c9b8
      David Sterba authored
      The subvolume ioctls block on the parent directory mutex, which can be
      held by other concurrent snapshot activity for a long time. Give the
      user at least some chance to get out of this situation by allowing
      a kill signal to interrupt the wait.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • David Sterba's avatar
      94ef7280
    • btrfs: make orphan cleanup less verbose · 4884b476
      David Sterba authored
      The messages
      
        btrfs: unlinked 123 orphans
        btrfs: truncated 456 orphans
      
      are not useful to regular users and raise questions whether there are
      problems with the filesystem.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: deprecate subvolrootid mount option · 5e2a4b25
      David Sterba authored
      This mount option was a workaround from when subvol= assumed a path
      relative to the default subvolume, not the toplevel one. This was fixed
      a long time ago and subvolrootid has no effect.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: Include the device in most error printk()s · c2cf52eb
      Simon Kirby authored
      With more than one btrfs volume mounted, it can be very difficult to find
      out which volume is hitting an error. btrfs_error() will print this, but
      it is currently rigged as more of a fatal error handler, while many of
      the printk()s are currently for debugging and yet-unhandled cases.
      
      This patch just changes the functions where the device information is
      already available. Some cases remain where the root or fs_info is not
      passed to the function emitting the error.
      
      This may introduce some confusion with volumes backed by multiple devices
      emitting errors referring to the primary device in the set instead of the
      one on which the error occurred.
      
      Use btrfs_printk(fs_info, format, ...) rather than writing the device
      string every time, and introduce macro wrappers ala XFS for brevity.
      Since the function already cannot be used for continuations, print a
      newline as part of the btrfs_printk() message rather than at each caller.
      Signed-off-by: Simon Kirby <sim@hostway.ca>
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: update kconfig title · aa825914
      David Sterba authored
      The Kconfig title does not make much sense after the cleanup of the
      CONFIG_EXPERIMENTAL option; align the wording with other filesystems.
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: clean snapshots one by one · 9d1a2a3a
      David Sterba authored
      Each time pick one dead root from the list and let the caller know if
      it's needed to continue. This should improve responsiveness during
      umount and balance which at some point waits for cleaning all currently
      queued dead roots.
      
      A new dead root is added to the end of the list, so the snapshots
      disappear in the order of deletion.
      
      The snapshot cleaning work is now done only from the cleaner thread and the
      others wake it if needed.
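The one-at-a-time scheme can be sketched as a queue plus a step function. The names `add_dead_root` and `clean_one_dead_root` are hypothetical, and the return convention (True when a root was cleaned, so the caller should call again) is an assumption made for this sketch.

```python
from collections import deque

dead_roots = deque()


def add_dead_root(root):
    dead_roots.append(root)  # new dead roots go to the end of the list


def clean_one_dead_root(cleaned):
    """Clean at most one dead root; return True if one was cleaned."""
    if not dead_roots:
        return False
    cleaned.append(dead_roots.popleft())  # oldest deletion first
    return True


for name in ("snap1", "snap2", "snap3"):
    add_dead_root(name)

cleaned = []
while clean_one_dead_root(cleaned):
    pass  # between steps the cleaner thread can check for umount or yield

print(cleaned)
```

Returning after each root gives the caller a point between snapshots where it can stop, which is what allows umount and balance to avoid waiting for the whole queue.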
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Zhi Yong Wu's avatar
    • Zhi Yong Wu's avatar
    • Btrfs: share stop worker code · 7abadb64
      Liu Bo authored
      Share the identical code for stopping workers.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: add a incompatible format change for smaller metadata extent refs · 3173a18f
      Josef Bacik authored
      We currently store the first key of the tree block inside the reference for the
      tree block in the extent tree.  This takes up quite a bit of space.  Make a new
      key type for metadata which holds the level as the offset and completely removes
      storing the btrfs_tree_block_info inside the extent ref.  This reduces the size
      from 51 bytes to 33 bytes per extent reference for each tree block.  In practice
      this results in a 30-35% decrease in the size of our extent tree, which means we
      COW less and can keep more of the extent tree in memory which makes our heavy
      metadata operations go much faster.  This is not an automatic format change, you
      must enable it at mkfs time or with btrfstune.  This patch deals with having
      metadata stored as either the old format or the new format so it is easy to
      convert.  Thanks,
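As a quick check of the numbers quoted above, the per-reference saving works out as follows. Only the two sizes from the text are used; the variable names are illustrative.

```python
OLD_REF_SIZE = 51  # bytes per tree block extent reference, from the text
NEW_REF_SIZE = 33  # bytes with the new metadata key type, from the text

saved = OLD_REF_SIZE - NEW_REF_SIZE
saving_pct = 100.0 * saved / OLD_REF_SIZE
print(f"{saved} bytes saved per reference ({saving_pct:.1f}%)")
```

This gives 18 bytes, or about 35.3% per reference, which sits at the upper end of the 30-35% reduction reported for the extent tree as a whole.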
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: use helper to cleanup tree roots · be283b2e
      Liu Bo authored
      free_root_pointers() has been introduced to clean up all of the tree
      roots, so just use it instead.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: cleanup unused arguments of btrfs_csum_data · b0496686
      Liu Bo authored
      Argument 'root' is no longer used in btrfs_csum_data().
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • btrfs: clean up transaction abort messages · 08748810
      David Sterba authored
      The transaction abort stacktrace is printed only once per module
      lifetime, but we'd like to see it each time it happens per mounted
      filesystem.  Introduce a fs_state flag that records it.
      
      Tweak the messages around abort:
      * add error number to the first abort
      * print the exact negative errno from btrfs_decode_error
      * clean up btrfs_decode_error and callers
      * no dots at the end of the messages
      Signed-off-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>