1. 16 Jan, 2012 30 commits
  2. 11 Jan, 2012 10 commits
    • Btrfs: fix possible deadlock when opening a seed device · b367e47f
      Li Zefan authored
      The correct lock order is uuid_mutex -> volume_mutex -> chunk_mutex,
      but when we mount a filesystem which has backing seed devices, we have
      this lock chain:
      
          open_ctree()
              lock(chunk_mutex);
              read_chunk_tree();
                  read_one_dev();
                      open_seed_devices();
                          lock(uuid_mutex);
      
      and then we hit a lockdep splat.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: update global block_rsv when creating a new block group · c7c144db
      Li Zefan authored
      A bug was triggered while using a seed device:
      
          # mkfs.btrfs /dev/loop1
          # btrfstune -S 1 /dev/loop1
          # mount /dev/loop1 /mnt
          # btrfs dev add /dev/loop2 /mnt
      
      btrfs: block rsv returned -28
      ------------[ cut here ]------------
      WARNING: at fs/btrfs/extent-tree.c:5969 btrfs_alloc_free_block+0x166/0x396 [btrfs]()
      ...
      Call Trace:
      ...
      [<f7b7c31c>] btrfs_cow_block+0x101/0x147 [btrfs]
      [<f7b7eaa6>] btrfs_search_slot+0x1b8/0x55f [btrfs]
      [<f7b7f844>] btrfs_insert_empty_items+0x42/0x7f [btrfs]
      [<f7b7f8c1>] btrfs_insert_item+0x40/0x7e [btrfs]
      [<f7b8ac02>] btrfs_make_block_group+0x243/0x2aa [btrfs]
      [<f7bb3f53>] __btrfs_alloc_chunk+0x672/0x70e [btrfs]
      [<f7bb41ff>] init_first_rw_device+0x77/0x13c [btrfs]
      [<f7bb5a62>] btrfs_init_new_device+0x664/0x9fd [btrfs]
      [<f7bbb65a>] btrfs_ioctl+0x694/0xdbe [btrfs]
      [<c04f55f7>] do_vfs_ioctl+0x496/0x4cc
      [<c04f5660>] sys_ioctl+0x33/0x4f
      [<c07b9edf>] sysenter_do_call+0x12/0x38
      ---[ end trace 906adac595facc7d ]---
      
      Since the seed device is read-only, there is no usable space in the
      filesystem. When we then add a sprout device, the kernel creates a
      METADATA block group and a SYSTEM block group, which provide free space
      we can reserve, but the reservation still fails because the global
      block_rsv hasn't been updated accordingly.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: rewrite btrfs_trim_block_group() · 7fe1e641
      Li Zefan authored
      There are various bugs in block group trimming:
      
      - It may trim from an offset smaller than the user-specified offset.
      - It may trim beyond the user-specified range.
      - It may leak free space for extents smaller than the specified minlen.
      - It may truncate the last trimmed extent and thus leak free space.
      - With mixed extents+bitmaps, some extents may not be trimmed.
      - With mixed extents+bitmaps, some bitmaps may not be trimmed (possibly
        none at all). Even for those that are trimmed, not all the free
        space in the bitmaps will be trimmed.
      
      I rewrite btrfs_trim_block_group() and break it into two functions.
      One is to trim extents only, and the other is to trim bitmaps only.
      
      Before patching:
      
      	# fstrim -v /mnt/
      	/mnt/: 1496465408 bytes were trimmed
      
      After patching:
      
      	# fstrim -v /mnt/
      	/mnt/: 2193768448 bytes were trimmed
      
      And this matches the total free space:
      
      	# btrfs fi df /mnt
      	Data: total=3.58GB, used=1.79GB
      	System, DUP: total=8.00MB, used=4.00KB
      	System: total=4.00MB, used=0.00
      	Metadata, DUP: total=205.12MB, used=97.14MB
      	Metadata: total=8.00MB, used=0.00
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: simplify calculation of stripe length for discard operation · ec9ef7a1
      Li Zefan authored
      For btrfs raid, while discarding a range of space, we'll need to know
      the start offset and length to discard for each device, and it's done
      in btrfs_map_block().
      
      However, the calculation is a bit complex for raid0 and raid10, so I
      reimplement it based on the fact that:
      
              dev1          dev2           dev3    (raid0)
              -----------------------------------
              s0 s3 s6      s1 s4 s7       s2 s5
      
      Each device has (total_stripes / nr_dev) stripes, or one more.
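
      As a worked version of this rule, the helper below counts the stripes
      that land on one device. It is a sketch under assumptions:
      stripes_on_dev() and its first_dev parameter (the device holding the
      first stripe of the range) are hypothetical and do not appear in the
      patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of stripes on device 'dev_index' (0-based) when 'total_stripes'
 * consecutive stripes are laid out round-robin over 'nr_dev' devices,
 * with the first stripe on device 'first_dev'.
 */
static uint64_t stripes_on_dev(uint64_t total_stripes, uint32_t nr_dev,
                               uint32_t first_dev, uint32_t dev_index)
{
    assert(nr_dev > 0 && first_dev < nr_dev && dev_index < nr_dev);

    uint64_t base = total_stripes / nr_dev;  /* every device gets this many */
    uint64_t extra = total_stripes % nr_dev; /* this many devices get one more */

    /* Position of the device counted from the one holding the first stripe. */
    uint32_t pos = (dev_index + nr_dev - first_dev) % nr_dev;

    return base + (pos < extra ? 1 : 0);
}
```

      For the layout in the diagram (8 stripes, 3 devices, starting on dev1)
      this gives 3, 3 and 2 stripes for dev1, dev2 and dev3.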
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: don't pre-allocate btrfs bio · de11cc12
      Li Zefan authored
      We pre-allocate a btrfs bio with fixed size, and then may re-allocate
      memory if we find stripes are bigger than the fixed size. But this
      pre-allocation is not necessary.
      
      Also, we don't have to calculate the stripe number twice.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: don't pass a trans handle unnecessarily in volumes.c · 125ccb0a
      Li Zefan authored
      Some functions never use the transaction handle passed to them.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: reserve metadata space in btrfs_ioctl_setflags() · 4da6f1a3
      Li Zefan authored
      Check and reserve space for btrfs_update_inode().
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: remove BUG_ON()s in btrfs_ioctl_setflags() · f062abf0
      Li Zefan authored
      We can recover from errors and return -errno to user space.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: check the return value of io_ctl_init() · 706efc66
      Li Zefan authored
      It can return -ENOMEM.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
    • Btrfs: avoid possible NULL deref in io_ctl_drop_pages() · a1ee5a45
      Li Zefan authored
      If we run into some failure path in io_ctl_prepare_pages(),
      io_ctl->pages[] array may have some NULL pointers.
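
      The guard this calls for can be sketched in user space as below.
      fake_page and drop_pages() are hypothetical stand-ins for the kernel's
      page handling and are not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted page. */
struct fake_page {
    int refcount;
};

/*
 * Drop one reference on every populated slot and clear it. Slots left
 * NULL by a failed prepare step are skipped instead of dereferenced.
 * Returns the number of pages released.
 */
static int drop_pages(struct fake_page **pages, int num_pages)
{
    assert(pages != NULL);

    int dropped = 0;
    for (int i = 0; i < num_pages; i++) {
        if (!pages[i])
            continue; /* never populated; nothing to release */
        pages[i]->refcount--;
        pages[i] = NULL;
        dropped++;
    }
    return dropped;
}
```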
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>