1. 27 Oct, 2016 1 commit
    • btrfs: fix races on root_log_ctx lists · 570dd450
      Chris Mason authored
      btrfs_remove_all_log_ctxs takes a shortcut where it avoids walking the
      list because it knows all of the waiters are patiently waiting for the
      commit to finish.
      
      But, there's a small race where btrfs_sync_log can remove itself from
      the list if it finds a log commit is already done.  Also, it uses
      list_del_init() to remove itself from the list, but there's no way to
      know if btrfs_remove_all_log_ctxs has already run, so we don't know for
      sure if it is safe to call list_del_init().
      
      This gets rid of all the shortcuts for btrfs_remove_all_log_ctxs(), and
      just calls it with the proper locking.
      
      This is part two of the corruption fixed by cbd60aa7.  I should have
      done this in the first place, but convinced myself the optimizations were
      safe.  A 12 hour run of dbench 2048 will eventually trigger a list debug
      WARN_ON for the list_del_init() in btrfs_sync_log().
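
The locking rule described above can be pictured with a small userspace sketch. Everything here is an assumption made for illustration: `struct log_ctx`, `struct log_root`, `add_waiter`, `remove_all_ctxs` and `waiter_leave` are hypothetical stand-ins, a pthread mutex plays the role of the lock protecting the list, and the list helpers are a minimal imitation of the kernel's list API rather than the kernel code itself. The idea is that both the bulk removal and a waiter's self-removal take the same lock, and that `list_del_init()` leaves a node self-linked so a later removal of the same node is harmless.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
    n->prev = n;
    n->next = n;
}

static int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink and re-initialize: the node ends up self-linked, so unlinking
 * it a second time only touches the node itself. */
static void list_del_init(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical waiter context and list owner. */
struct log_ctx {
    int ret;
    struct list_node link;
};

struct log_root {
    pthread_mutex_t lock;
    struct list_node waiters;
};

static void log_root_init(struct log_root *root)
{
    pthread_mutex_init(&root->lock, NULL);
    list_init(&root->waiters);
}

static void add_waiter(struct log_root *root, struct log_ctx *ctx)
{
    pthread_mutex_lock(&root->lock);
    list_add_tail(&ctx->link, &root->waiters);
    pthread_mutex_unlock(&root->lock);
}

/* Bulk removal: always walk the list under the lock, no shortcuts. */
static void remove_all_ctxs(struct log_root *root, int error)
{
    pthread_mutex_lock(&root->lock);
    while (!list_is_empty(&root->waiters)) {
        struct list_node *n = root->waiters.next;
        struct log_ctx *ctx =
            (struct log_ctx *)((char *)n - offsetof(struct log_ctx, link));

        list_del_init(n);
        ctx->ret = error;
    }
    assert(list_is_empty(&root->waiters));
    pthread_mutex_unlock(&root->lock);
}

/* Self-removal by a waiter: same lock, so it cannot race the bulk removal. */
static void waiter_leave(struct log_root *root, struct log_ctx *ctx)
{
    pthread_mutex_lock(&root->lock);
    list_del_init(&ctx->link);
    pthread_mutex_unlock(&root->lock);
}
```

With this shape, a waiter that leaves after the bulk removal has already run simply unlinks a self-linked node, and neither path ever observes a half-updated list.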
      
      Fixes: d1433deb
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      cc: stable@vger.kernel.org # 3.15+
      Signed-off-by: Chris Mason <clm@fb.com>
  2. 18 Oct, 2016 1 commit
  3. 12 Oct, 2016 2 commits
    • Merge branch 'fst-fixes' of... · d9ed71e5
      Chris Mason authored
      Merge branch 'fst-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.9
      Signed-off-by: Chris Mason <clm@fb.com>
    • Btrfs: fix incremental send failure caused by balance · d5e84fd8
      Filipe Manana authored
      Commit 95155585 ("Btrfs: send, don't bug on inconsistent snapshots")
      removed some BUG_ON() statements (replacing them with returning errors
      to user space and logging error messages) when a snapshot is in an
      inconsistent state due to failures to update a delayed inode item (ENOMEM
      or ENOSPC) after adding/updating/deleting references, xattrs or file
      extent items.
      
      However there is a case, when no errors happen, where a file extent item
      can be modified without having the corresponding inode item updated. This
      case happens during balance under very specific timings, when relocation
      is in the stage where it updates data pointers and a leaf that contains
      file extent items is COWed. When that happens file extent items get their
      disk_bytenr field updated to a new value that reflects the post relocation
      logical address of the extent, without updating their respective inode
      items (as there is nothing that needs to be updated on them). This is
      performed at relocation.c:replace_file_extents() through
      relocation.c:btrfs_reloc_cow_block().
      
      So make an incremental send deal with this case and don't do any processing
      for a file extent item that got its disk_bytenr field updated by relocation,
      since the extent's data is the same as the one pointed by the file extent
      item in the parent snapshot.
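
A minimal sketch of the kind of check this implies, under stated assumptions: `struct file_extent` is a hypothetical, simplified view of a file extent item (not the on-disk btrfs structure), and `only_disk_bytenr_changed` is an illustrative helper name, not a function from send.c. It treats a changed item as relocation-only when `disk_bytenr` differs between the parent and the send snapshot while every other field matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified in-memory view of a file extent item. */
struct file_extent {
    uint64_t generation;
    uint64_t disk_bytenr;    /* logical address of the extent */
    uint64_t disk_num_bytes;
    uint64_t offset;
    uint64_t num_bytes;
    uint64_t ram_bytes;
    uint8_t compression;
    uint8_t type;
};

/*
 * Return true when the only difference between the parent's item and the
 * send snapshot's item is disk_bytenr, i.e. the extent was moved (as
 * relocation does) but still describes the same data.
 */
static bool only_disk_bytenr_changed(const struct file_extent *parent,
                                     const struct file_extent *child)
{
    assert(parent != NULL && child != NULL);

    return parent->disk_bytenr != child->disk_bytenr &&
           parent->generation == child->generation &&
           parent->disk_num_bytes == child->disk_num_bytes &&
           parent->offset == child->offset &&
           parent->num_bytes == child->num_bytes &&
           parent->ram_bytes == child->ram_bytes &&
           parent->compression == child->compression &&
           parent->type == child->type;
}
```

A caller comparing the two snapshots could skip processing for an item when this returns true, and fall back to its normal handling (or error reporting) otherwise.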
      
      After the recent commit mentioned above this case resulted in EIO errors
      returned to user space (and an error message logged to dmesg/syslog) when
      doing an incremental send, while before it, it resulted in hitting a
      BUG_ON leading to the following trace:
      
      [  952.206705] ------------[ cut here ]------------
      [  952.206714] kernel BUG at ../fs/btrfs/send.c:5653!
      [  952.206719] Internal error: Oops - BUG: 0 [#1] SMP
      [  952.209854] Modules linked in: st dm_mod nls_utf8 isofs fuse nf_log_ipv6 xt_pkttype xt_physdev br_netfilter nf_log_ipv4 nf_log_common xt_LOG xt_limit ebtable_filter ebtables af_packet bridge stp llc ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables xfs libcrc32c nls_iso8859_1 nls_cp437 vfat fat joydev aes_ce_blk ablk_helper cryptd snd_intel8x0 aes_ce_cipher snd_ac97_codec ac97_bus snd_pcm ghash_ce sha2_ce sha1_ce snd_timer snd virtio_net soundcore btrfs xor sr_mod cdrom hid_generic usbhid raid6_pq virtio_blk virtio_scsi bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_mmio xhci_pci xhci_hcd usbcore usb_common virtio_pci virtio_ring virtio drm sg efivarfs
      [  952.228333] Supported: Yes
      [  952.228908] CPU: 0 PID: 12779 Comm: snapperd Not tainted 4.4.14-50-default #1
      [  952.230329] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      [  952.231683] task: ffff800058e94100 ti: ffff8000d866c000 task.ti: ffff8000d866c000
      [  952.233279] PC is at changed_cb+0x9f4/0xa48 [btrfs]
      [  952.234375] LR is at changed_cb+0x58/0xa48 [btrfs]
      [  952.236552] pc : [<ffff7ffffc39de7c>] lr : [<ffff7ffffc39d4e0>] pstate: 80000145
      [  952.238049] sp : ffff8000d866fa20
      [  952.238732] x29: ffff8000d866fa20 x28: 0000000000000019
      [  952.239840] x27: 00000000000028d5 x26: 00000000000024a2
      [  952.241008] x25: 0000000000000002 x24: ffff8000e66e92f0
      [  952.242131] x23: ffff8000b8c76800 x22: ffff800092879140
      [  952.243238] x21: 0000000000000002 x20: ffff8000d866fb78
      [  952.244348] x19: ffff8000b8f8c200 x18: 0000000000002710
      [  952.245607] x17: 0000ffff90d42480 x16: ffff800000237dc0
      [  952.246719] x15: 0000ffff90de7510 x14: ab000c000a2faf08
      [  952.247835] x13: 0000000000577c2b x12: ab000c000b696665
      [  952.248981] x11: 2e65726f632f6966 x10: 652d34366d72612f
      [  952.250101] x9 : 32627572672f746f x8 : ab000c00092f1671
      [  952.251352] x7 : 8000000000577c2b x6 : ffff800053eadf45
      [  952.252468] x5 : 0000000000000000 x4 : ffff80005e169494
      [  952.253582] x3 : 0000000000000004 x2 : ffff8000d866fb78
      [  952.254695] x1 : 000000000003e2a3 x0 : 000000000003e2a4
      [  952.255803]
      [  952.256150] Process snapperd (pid: 12779, stack limit = 0xffff8000d866c020)
      [  952.257516] Stack: (0xffff8000d866fa20 to 0xffff8000d8670000)
      [  952.258654] fa20: ffff8000d866fae0 ffff7ffffc308fc0 ffff800092879140 ffff8000e66e92f0
      [  952.260219] fa40: 0000000000000035 ffff800055de6000 ffff8000b8c76800 ffff8000d866fb78
      [  952.261745] fa60: 0000000000000002 00000000000024a2 00000000000028d5 0000000000000019
      [  952.263269] fa80: ffff8000d866fae0 ffff7ffffc3090f0 ffff8000d866fae0 ffff7ffffc309128
      [  952.264797] faa0: ffff800092879140 ffff8000e66e92f0 0000000000000035 ffff800055de6000
      [  952.268261] fac0: ffff8000b8c76800 ffff8000d866fb78 0000000000000002 0000000000001000
      [  952.269822] fae0: ffff8000d866fbc0 ffff7ffffc39ecfc ffff8000b8f8c200 ffff8000b8f8c368
      [  952.271368] fb00: ffff8000b8f8c378 ffff800055de6000 0000000000000001 ffff8000ecb17500
      [  952.272893] fb20: ffff8000b8c76800 ffff800092879140 ffff800062b6d000 ffff80007a9e2470
      [  952.274420] fb40: ffff8000b8f8c208 0000000005784000 ffff8000580a8000 ffff8000b8f8c200
      [  952.276088] fb60: ffff7ffffc39d488 00000002b8f8c368 0000000000000000 000000000003e2a4
      [  952.280275] fb80: 000000000000006c ffff7ffffc39ec00 000000000003e2a4 000000000000006c
      [  952.283219] fba0: ffff8000b8f8c300 0000000000000100 0000000000000001 ffff8000ecb17500
      [  952.286166] fbc0: ffff8000d866fcd0 ffff7ffffc3643c0 ffff8000f8842700 0000ffff8ffe9278
      [  952.289136] fbe0: 0000000040489426 ffff800055de6000 0000ffff8ffe9278 0000000040489426
      [  952.292083] fc00: 000000000000011d 000000000000001d ffff80007a9e4598 ffff80007a9e43e8
      [  952.294959] fc20: ffff8000b8c7693f 0000000000003b24 0000000000000019 ffff8000b8f8c218
      [  952.301161] fc40: 00000001d866fc70 ffff8000b8c76800 0000000000000128 ffffffffffffff84
      [  952.305749] fc60: ffff800058e941ff 0000000000003a58 ffff8000d866fcb0 ffff8000000f7390
      [  952.308875] fc80: 000000000000012a 0000000000010290 ffff8000d866fc00 000000000000007b
      [  952.311915] fca0: 0000000000010290 ffff800046c1b100 74732d7366727462 000001006d616572
      [  952.314937] fcc0: ffff8000fffc4100 cb88537fdc8ba60e ffff8000d866fe10 ffff8000002499e8
      [  952.318008] fce0: 0000000040489426 ffff8000f8842700 0000ffff8ffe9278 ffff80007a9e4598
      [  952.321321] fd00: 0000ffff8ffe9278 0000000040489426 000000000000011d 000000000000001d
      [  952.324280] fd20: ffff80000072c000 ffff8000d866c000 ffff8000d866fda0 ffff8000000e997c
      [  952.327156] fd40: ffff8000fffc4180 00000000000031ed ffff8000fffc4180 ffff800046c1b7d4
      [  952.329895] fd60: 0000000000000140 0000ffff907ea170 000000000000011d 00000000000000dc
      [  952.334641] fd80: ffff80000072c000 ffff8000d866c000 0000000000000000 0000000000000002
      [  952.338002] fda0: ffff8000d866fdd0 ffff8000000ebacc ffff800046c1b080 ffff800046c1b7d4
      [  952.340724] fdc0: ffff8000d866fdf0 ffff8000000db67c 0000000000000040 ffff800000e69198
      [  952.343415] fde0: 0000ffff8ffea790 00000000000031ed ffff8000d866fe20 ffff800000254000
      [  952.346101] fe00: 000000000000001d 0000000000000004 ffff8000d866fe90 ffff800000249d3c
      [  952.348980] fe20: ffff8000f8842700 0000000000000000 ffff8000f8842701 0000000000000008
      [  952.351696] fe40: ffff8000d866fe70 0000000000000008 ffff8000d866fe90 ffff800000249cf8
      [  952.354387] fe60: ffff8000f8842700 0000ffff8ffe9170 ffff8000f8842701 0000000000000008
      [  952.357083] fe80: 0000ffff8ffe9278 ffff80008ff85500 0000ffff8ffe90c0 ffff800000085c84
      [  952.359800] fea0: 0000000000000000 0000ffff8ffe9170 ffffffffffffffff 0000ffff90d473bc
      [  952.365351] fec0: 0000000000000000 0000000000000015 0000000000000008 0000000040489426
      [  952.369550] fee0: 0000ffff8ffe9278 0000ffff907ea790 0000ffff907ea170 0000ffff907ea790
      [  952.372416] ff00: 0000ffff907ea170 0000000000000000 000000000000001d 0000000000000004
      [  952.375223] ff20: 0000ffff90a32220 00000000003d0f00 0000ffff907ea0a0 0000ffff8ffe8f30
      [  952.378099] ff40: 0000ffff9100f554 0000ffff91147000 0000ffff91117bc0 0000ffff90d473b0
      [  952.381115] ff60: 0000ffff9100f620 0000ffff880069b0 0000ffff8ffe9170 0000ffff8ffe91a0
      [  952.384003] ff80: 0000ffff8ffe9160 0000ffff8ffe9140 0000ffff88006990 0000ffff8ffe9278
      [  952.386860] ffa0: 0000ffff88008a60 0000ffff8ffe9480 0000ffff88014ca0 0000ffff8ffe90c0
      [  952.389654] ffc0: 0000ffff910be8e8 0000ffff8ffe90c0 0000ffff90d473bc 0000000000000000
      [  952.410986] ffe0: 0000000000000008 000000000000001d 6e2079747265706f 72616d223d656d61
      [  952.415497] Call trace:
      [  952.417403] [<ffff7ffffc39de7c>] changed_cb+0x9f4/0xa48 [btrfs]
      [  952.420023] [<ffff7ffffc308fc0>] btrfs_compare_trees+0x500/0x6b0 [btrfs]
      [  952.422759] [<ffff7ffffc39ecfc>] btrfs_ioctl_send+0xb4c/0xe10 [btrfs]
      [  952.425601] [<ffff7ffffc3643c0>] btrfs_ioctl+0x374/0x29a4 [btrfs]
      [  952.428031] [<ffff8000002499e8>] do_vfs_ioctl+0x33c/0x600
      [  952.430360] [<ffff800000249d3c>] SyS_ioctl+0x90/0xa4
      [  952.432552] [<ffff800000085c84>] el0_svc_naked+0x38/0x3c
      [  952.434803] Code: 2a1503e0 17fffdac b9404282 17ffff28 (d4210000)
      [  952.437457] ---[ end trace 9afd7090c466cf15 ]---
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
  4. 10 Oct, 2016 1 commit
  5. 03 Oct, 2016 7 commits
  6. 26 Sep, 2016 28 commits
    • Btrfs: remove unnecessary btrfs_mark_buffer_dirty in split_leaf · 196e0249
      Liu Bo authored
      When we're not able to get enough space by splitting a leaf, we create
      a new sibling leaf instead, and it's possible that we return a
      zero-nritem sibling leaf and mark it dirty before it's in a consistent
      state.  With CONFIG_BTRFS_FS_CHECK_INTEGRITY=y, the integrity check in
      check_leaf will panic because of this zero-nritem non-root leaf.
      
      This removes the unnecessary btrfs_mark_buffer_dirty.
      Reported-by: Filipe Manana <fdmanana@gmail.com>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: don't BUG() during drop snapshot · 4867268c
      Josef Bacik authored
      Really there's lots of things that can go wrong here, kill all the
      BUG_ON()'s and replace the logic ones with ASSERT()'s and return EIO
      instead.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      [ switched to btrfs_err, errors go to common label ]
      Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: fix btrfs_no_printk stub helper · 2fd57fcb
      Arnd Bergmann authored
      The addition of btrfs_no_printk() caused a build failure when
      CONFIG_PRINTK is disabled:
      
      fs/btrfs/send.c: In function 'send_rename':
      fs/btrfs/ctree.h:3367:2: error: implicit declaration of function 'btrfs_no_printk' [-Werror=implicit-function-declaration]
      
      This moves the helper outside of that #ifdef so it is always
      defined, and changes the existing #ifdef to refer to that
      helper as well for consistency.
      
      Fixes: 47c57058ff2c ("btrfs: btrfs_debug should consume fs_info when DEBUG is not defined")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: memset to avoid stale content in btree leaf · 851cd173
      Liu Bo authored
      This is an additional patch to
      "Btrfs: memset to avoid stale content in btree node block".
      
      This uses memset to initialize the unused space in a leaf to avoid
      potential stale content, which may be incurred by pushing items
      between sibling leaves.
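
To illustrate the idea, here is a hedged sketch of zeroing the gap in a leaf. It assumes a simplified layout in which item headers grow forward from the block header and item data grows backward from the end of the block; the constants and the helper name `zero_unused_leaf_space` are illustrative assumptions, not the actual btrfs definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sizes only; not taken from the btrfs on-disk format. */
#define LEAF_BLOCK_SIZE  4096u
#define LEAF_HEADER_SIZE 101u
#define LEAF_ITEM_SIZE   25u

/*
 * Zero the unused gap of a leaf: everything between the end of the item
 * header array and the start of the item data (data_start is the offset
 * of the lowest data byte in the block).
 */
static void zero_unused_leaf_space(unsigned char *block, size_t nritems,
                                   size_t data_start)
{
    size_t items_end = LEAF_HEADER_SIZE + nritems * LEAF_ITEM_SIZE;

    assert(items_end <= data_start && data_start <= LEAF_BLOCK_SIZE);
    memset(block + items_end, 0, data_start - items_end);
}
```

After items are pushed to a sibling, the bytes they used to occupy fall inside this gap, so clearing it keeps old item headers or data from lingering in the block.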
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: parent_start initialization cleanup · 0f5053eb
      Goldwyn Rodrigues authored
      Code cleanup. parent_start is initialized multiple times when it is
      not necessary to do so.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: Remove already completed TODO comment · 6cea66e5
      Goldwyn Rodrigues authored
      Fixes: 7cf5b976 ("btrfs: qgroup: Cleanup old inaccurate facilities")
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: Do not reassign count in btrfs_run_delayed_refs · dd12d5b8
      Goldwyn Rodrigues authored
      Code cleanup. count is already (unsigned long)-1. That is the reason
      run_all was set. Do not reassign it to (unsigned long)-1.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: fix a possible umount deadlock · 0ccd0528
      Anand Jain authored
      btrfs_show_devname() takes the device_list_mutex, and a call to
      blkdev_put() can sometimes lead the VFS to call into this function.
      So, for now, call blkdev_put() outside of device_list_mutex.
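
The general pattern, detaching work under the lock and performing it after the lock is dropped, can be sketched in userspace as follows. This is an illustration under assumptions, not the btrfs code: `struct device_list`, `close_devices` and `close_bdev` are hypothetical names, an `int` handle stands in for a block device, and a pthread mutex stands in for device_list_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVICES 8

/* Hypothetical device entry; bdev is -1 when no block device is open. */
struct device {
    int bdev;
};

struct device_list {
    pthread_mutex_t lock;     /* stand-in for device_list_mutex */
    struct device devs[MAX_DEVICES];
    size_t count;
    int closed_outside_lock;  /* bookkeeping for the demo only */
};

static void device_list_init(struct device_list *list)
{
    pthread_mutex_init(&list->lock, NULL);
    list->count = 0;
    list->closed_outside_lock = 0;
}

/*
 * Stand-in for blkdev_put(): it may end up taking other locks, so it must
 * not run with the list lock held. The trylock succeeds only when the
 * caller is not holding the list lock.
 */
static void close_bdev(struct device_list *list, int bdev)
{
    (void)bdev;
    if (pthread_mutex_trylock(&list->lock) == 0) {
        list->closed_outside_lock++;
        pthread_mutex_unlock(&list->lock);
    }
}

static void close_devices(struct device_list *list)
{
    int pending[MAX_DEVICES];
    size_t n = 0;

    /* Step 1: under the lock, only detach the handles. */
    pthread_mutex_lock(&list->lock);
    assert(list->count <= MAX_DEVICES);
    for (size_t i = 0; i < list->count; i++) {
        if (list->devs[i].bdev >= 0) {
            pending[n++] = list->devs[i].bdev;
            list->devs[i].bdev = -1;
        }
    }
    pthread_mutex_unlock(&list->lock);

    /* Step 2: release the handles with the lock dropped. */
    for (size_t i = 0; i < n; i++)
        close_bdev(list, pending[i]);
}
```

Because the close step never runs under the list lock, it cannot add the list lock to the front of a dependency chain like the one lockdep reports.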
      
      [  983.284212] ======================================================
      [  983.290401] [ INFO: possible circular locking dependency detected ]
      [  983.296677] 4.8.0-rc5-ceph-00023-g1b39cec2 #1 Not tainted
      [  983.302081] -------------------------------------------------------
      [  983.308357] umount/21720 is trying to acquire lock:
      [  983.313243]  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff9128ec51>] blkdev_put+0x31/0x150
      [  983.321264]
      [  983.321264] but task is already holding lock:
      [  983.327101]  (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs]
      [  983.337839]
      [  983.337839] which lock already depends on the new lock.
      [  983.337839]
      [  983.346024]
      [  983.346024] the existing dependency chain (in reverse order) is:
      [  983.353512]
      -> #4 (&fs_devs->device_list_mutex){+.+...}:
      [  983.359096]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.365143]        [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.371521]        [<ffffffffc02d8116>] btrfs_show_devname+0x36/0x1f0 [btrfs]
      [  983.378710]        [<ffffffff9129523e>] show_vfsmnt+0x4e/0x150
      [  983.384593]        [<ffffffff9126ffc7>] m_show+0x17/0x20
      [  983.389957]        [<ffffffff91276405>] seq_read+0x2b5/0x3b0
      [  983.395669]        [<ffffffff9124c808>] __vfs_read+0x28/0x100
      [  983.401464]        [<ffffffff9124eb3b>] vfs_read+0xab/0x150
      [  983.407080]        [<ffffffff9124ec32>] SyS_read+0x52/0xb0
      [  983.412609]        [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  983.419617]
      -> #3 (namespace_sem){++++++}:
      [  983.424024]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.430074]        [<ffffffff918239e9>] down_write+0x49/0x80
      [  983.435785]        [<ffffffff91272457>] lock_mount+0x67/0x1c0
      [  983.441582]        [<ffffffff91272ab2>] do_add_mount+0x32/0xf0
      [  983.447458]        [<ffffffff9127363a>] finish_automount+0x5a/0xc0
      [  983.453682]        [<ffffffff91259513>] follow_managed+0x1b3/0x2a0
      [  983.459912]        [<ffffffff9125b750>] lookup_fast+0x300/0x350
      [  983.465875]        [<ffffffff9125d6e7>] path_openat+0x3a7/0xaa0
      [  983.471846]        [<ffffffff9125ef75>] do_filp_open+0x85/0xe0
      [  983.477731]        [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0
      [  983.483702]        [<ffffffff9124c4de>] SyS_open+0x1e/0x20
      [  983.489240]        [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  983.496254]
      -> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}:
      [  983.501798]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.507855]        [<ffffffff918239e9>] down_write+0x49/0x80
      [  983.513558]        [<ffffffff91366237>] start_creating+0x87/0x100
      [  983.519703]        [<ffffffff91366647>] debugfs_create_dir+0x17/0x100
      [  983.526195]        [<ffffffff911df153>] bdi_register+0x93/0x210
      [  983.532165]        [<ffffffff911df313>] bdi_register_owner+0x43/0x70
      [  983.538570]        [<ffffffff914080fb>] device_add_disk+0x1fb/0x450
      [  983.544888]        [<ffffffff91580226>] loop_add+0x1e6/0x290
      [  983.550596]        [<ffffffff91fec358>] loop_init+0x10b/0x14f
      [  983.556394]        [<ffffffff91002207>] do_one_initcall+0xa7/0x180
      [  983.562618]        [<ffffffff91f932e0>] kernel_init_freeable+0x1cc/0x266
      [  983.569370]        [<ffffffff918174be>] kernel_init+0xe/0x100
      [  983.575166]        [<ffffffff9182620f>] ret_from_fork+0x1f/0x40
      [  983.581131]
      -> #1 (loop_index_mutex){+.+.+.}:
      [  983.585801]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.591858]        [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.598256]        [<ffffffff9157ed3f>] lo_open+0x1f/0x60
      [  983.603704]        [<ffffffff9128eec3>] __blkdev_get+0x123/0x400
      [  983.609757]        [<ffffffff9128f4ea>] blkdev_get+0x34a/0x350
      [  983.615639]        [<ffffffff9128f554>] blkdev_open+0x64/0x80
      [  983.621428]        [<ffffffff9124aff6>] do_dentry_open+0x1c6/0x2d0
      [  983.627651]        [<ffffffff9124c029>] vfs_open+0x69/0x80
      [  983.633181]        [<ffffffff9125db74>] path_openat+0x834/0xaa0
      [  983.639152]        [<ffffffff9125ef75>] do_filp_open+0x85/0xe0
      [  983.645035]        [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0
      [  983.650999]        [<ffffffff9124c4de>] SyS_open+0x1e/0x20
      [  983.656535]        [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      [  983.663541]
      -> #0 (&bdev->bd_mutex){+.+.+.}:
      [  983.668107]        [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0
      [  983.674510]        [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.680561]        [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.686967]        [<ffffffff9128ec51>] blkdev_put+0x31/0x150
      [  983.692761]        [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs]
      [  983.699699]        [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs]
      [  983.707178]        [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs]
      [  983.714380]        [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs]
      [  983.721061]        [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs]
      [  983.727908]        [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100
      [  983.734744]        [<ffffffff91250f56>] kill_anon_super+0x16/0x30
      [  983.740888]        [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs]
      [  983.747909]        [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80
      [  983.754745]        [<ffffffff912515fd>] deactivate_super+0x5d/0x70
      [  983.760977]        [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80
      [  983.766773]        [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20
      [  983.772738]        [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0
      [  983.778708]        [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4
      [  983.785373]        [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0
      [  983.792212]        [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1
      [  983.799225]
      [  983.799225] other info that might help us debug this:
      [  983.799225]
      [  983.807291] Chain exists of:
        &bdev->bd_mutex --> namespace_sem --> &fs_devs->device_list_mutex
      
      [  983.816521]  Possible unsafe locking scenario:
      [  983.816521]
      [  983.822489]        CPU0                    CPU1
      [  983.827043]        ----                    ----
      [  983.831599]   lock(&fs_devs->device_list_mutex);
      [  983.836289]                                lock(namespace_sem);
      [  983.842268]                                lock(&fs_devs->device_list_mutex);
      [  983.849478]   lock(&bdev->bd_mutex);
      [  983.853127]
      [  983.853127]  *** DEADLOCK ***
      [  983.853127]
      [  983.859113] 3 locks held by umount/21720:
      [  983.863145]  #0:  (&type->s_umount_key#35){++++..}, at: [<ffffffff912515f5>] deactivate_super+0x55/0x70
      [  983.872713]  #1:  (uuid_mutex){+.+.+.}, at: [<ffffffffc033d8d3>] btrfs_close_devices+0x23/0xa0 [btrfs]
      [  983.882206]  #2:  (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs]
      [  983.893422]
      [  983.893422] stack backtrace:
      [  983.897824] CPU: 6 PID: 21720 Comm: umount Not tainted 4.8.0-rc5-ceph-00023-g1b39cec2 #1
      [  983.905958] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
      [  983.913492]  0000000000000000 ffff8c8a53c17a38 ffffffff91429521 ffffffff9260f4f0
      [  983.921018]  ffffffff92642760 ffff8c8a53c17a88 ffffffff911b2b04 0000000000000050
      [  983.928542]  ffffffff9237d620 ffff8c8a5294aee0 ffff8c8a5294aeb8 ffff8c8a5294aee0
      [  983.936072] Call Trace:
      [  983.938545]  [<ffffffff91429521>] dump_stack+0x85/0xc4
      [  983.943715]  [<ffffffff911b2b04>] print_circular_bug+0x1fb/0x20c
      [  983.949748]  [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0
      [  983.955613]  [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0
      [  983.961123]  [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150
      [  983.966550]  [<ffffffff91823125>] mutex_lock_nested+0x65/0x350
      [  983.972407]  [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150
      [  983.977832]  [<ffffffff9128ec51>] blkdev_put+0x31/0x150
      [  983.983101]  [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs]
      [  983.989500]  [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs]
      [  983.996415]  [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs]
      [  984.003068]  [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs]
      [  984.009189]  [<ffffffff9126cc5e>] ? evict_inodes+0x15e/0x170
      [  984.014881]  [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs]
      [  984.021176]  [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100
      [  984.027476]  [<ffffffff91250f56>] kill_anon_super+0x16/0x30
      [  984.033082]  [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs]
      [  984.039548]  [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80
      [  984.045839]  [<ffffffff912515fd>] deactivate_super+0x5d/0x70
      [  984.051525]  [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80
      [  984.056774]  [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20
      [  984.062201]  [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0
      [  984.067625]  [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4
      [  984.073747]  [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0
      [  984.080038]  [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1
      Reported-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: fix memory leak in do_walk_down · a958eab0
      Liu Bo authored
      The extent buffer 'next' needs to be free'd conditionally.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: btrfs_debug should consume fs_info when DEBUG is not defined · c01f5f96
      Jeff Mahoney authored
      We can hit unused variable warnings when btrfs_debug and friends are
      just aliases for no_printk.  This is due to the fs_info not getting
      consumed by the function call, which can happen if convenience
      variables are used.  This patch adds a new btrfs_no_printk static inline
      that consumes the convenience variable and does nothing else.  It
      silences the unused variable warning and has no impact on the generated
      code:
      
      $ size fs/btrfs/extent_io.o*
         text	   data	    bss	    dec	    hex	filename
        44072	    152	     32	  44256	   ace0	fs/btrfs/extent_io.o.btrfs_no_printk
        44072	    152	     32	  44256	   ace0	fs/btrfs/extent_io.o.no_printk
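
A userspace sketch of the technique, with assumed names: `struct fs_info`, `fs_no_printk`, `fs_printk`, `fs_debug` and `count_items` are hypothetical stand-ins for the btrfs helpers. When `DEBUG` is not defined, the debug macro expands to an empty, format-checked static inline function, so its arguments (including a convenience variable) still count as used.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

struct fs_info {
    const char *name;
};

#if defined(__GNUC__)
#define PRINTF_LIKE(f, a) __attribute__((format(printf, f, a)))
#else
#define PRINTF_LIKE(f, a)
#endif

/* Consumes its arguments and does nothing else; the format string is
 * still type-checked by the compiler where the attribute is supported. */
static inline PRINTF_LIKE(2, 3)
void fs_no_printk(const struct fs_info *fs_info, const char *fmt, ...)
{
    (void)fs_info;
    (void)fmt;
}

#ifdef DEBUG
static inline PRINTF_LIKE(2, 3)
void fs_printk(const struct fs_info *fs_info, const char *fmt, ...)
{
    va_list ap;

    fprintf(stderr, "%s: ", fs_info->name);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
}
#define fs_debug(fs_info, ...) fs_printk(fs_info, __VA_ARGS__)
#else
#define fs_debug(fs_info, ...) fs_no_printk(fs_info, __VA_ARGS__)
#endif

static int count_items(const struct fs_info *info, int nr)
{
    /* Convenience variable whose only use is the debug message. */
    const struct fs_info *fs_info = info;

    assert(nr >= 0);
    fs_debug(fs_info, "counting %d items", nr);
    return nr;
}
```

If `fs_debug` instead expanded to nothing when `DEBUG` is off, `fs_info` in `count_items` would be flagged as unused; routing it through the empty inline avoids that.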
      
      Fixes: 27a0dd61 (Btrfs: make btrfs_debug match pr_debug handling related to DEBUG)
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: convert send's verbose_printk to btrfs_debug · 04ab956e
      Jeff Mahoney authored
      This was basically an open-coded, less flexible dynamic printk.  We can
      just use btrfs_debug instead.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: convert pr_* to btrfs_* where possible · ab8d0fc4
      Jeff Mahoney authored
      For many printks, we want to know which file system issued the message.
      
      This patch converts most pr_* calls to use the btrfs_* versions instead.
      In some cases, this means adding plumbing to allow call sites access to
      an fs_info pointer.
      
      fs/btrfs/check-integrity.c is left alone for another day.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: convert printk(KERN_* to use pr_* calls · 62e85577
      Jeff Mahoney authored
      This patch converts printk(KERN_* style messages to use the pr_* versions.
      
      One side effect is that anything that was KERN_DEBUG is now automatically
      a dynamic debug message.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: unsplit printed strings · 5d163e0e
      Jeff Mahoney authored
      CodingStyle chapter 2:
      "[...] never break user-visible strings such as printk messages,
      because that breaks the ability to grep for them."
      
      This patch unsplits user-visible strings.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: clean the old superblocks before freeing the device · cea67ab9
      Jeff Mahoney authored
      btrfs_rm_device frees the block device but then re-opens it using
      the saved device name.  A race exists between the close and the
      re-open that allows the block size to be changed.  The result
      is getting stuck forever in the reclaim loop in __getblk_slow.
      
      This patch moves the superblock cleanup before closing the block
      device, which is also consistent with other callers.  We also don't
      need a private copy of dev_name as the whole routine operates under
      the uuid_mutex.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: kill BUG_ON in run_delayed_tree_ref · 02794222
      Liu Bo authored
      In a corrupted btrfs image, we can come across this BUG_ON and
      get an unresponsive system, but if we return errors instead,
      the caller can handle everything gracefully by aborting the current
      transaction.
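
As a rough illustration of returning an error where a BUG_ON would have halted the system, consider the sketch below. The names `struct tree_ref`, `run_tree_ref` and `run_refs`, and the specific consistency condition, are assumptions made for the example and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical delayed reference for a tree block. */
struct tree_ref {
    unsigned long long bytenr;
    int ref_mod;
};

/* Report unexpected state and return an error instead of crashing. */
static int run_tree_ref(const struct tree_ref *ref)
{
    if (ref->ref_mod != 1) {
        fprintf(stderr, "tree block %llu has %d references rather than 1\n",
                ref->bytenr, ref->ref_mod);
        return -EIO;
    }
    return 0;
}

/* The caller stops at the first error and marks the transaction aborted. */
static int run_refs(const struct tree_ref *refs, size_t n, int *aborted)
{
    assert(refs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        int ret = run_tree_ref(&refs[i]);

        if (ret) {
            *aborted = 1;
            return ret;
        }
    }
    return 0;
}
```

The corrupted entry produces a log message and an aborted transaction, and the process keeps running.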
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: don't leak reloc root nodes on error · 6bdf131f
      Josef Bacik authored
      We don't track the reloc roots in any sort of normal way, so the only way the
      root/commit_root nodes get free'd is if the relocation finishes successfully and
      the reloc root is deleted.  Fix this by free'ing them in free_reloc_roots.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: squash lines for simple wrapper functions · e2c89907
      Masahiro Yamada authored
      Remove unneeded variables and assignments.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Btrfs: improve check_node to avoid reading corrupted nodes · 6b722c17
      Liu Bo authored
      We need to check items in a node to make sure that we're reading
      a valid one, otherwise we could get various crashes while processing
      delayed_refs.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      6b722c17
    • Liu Bo's avatar
      Btrfs: add error handling for extent buffer in print tree · a42cbec9
      Liu Bo authored
      Somehow we missed btrfs_print_tree the last time we
      updated error handling for read_extent_block().
      
      This keeps us from getting a NULL pointer panic when
      btrfs_print_tree's read_extent_block() fails.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      a42cbec9
    • Liu Bo's avatar
      Btrfs: remove BUG_ON in start_transaction · a43f7f82
      Liu Bo authored
      Since we can get errors from a concurrently aborted transaction,
      the condition checked by this BUG_ON in start_transaction no longer holds.
      
      For example, while flushing the free space cache inode's dirty pages:
      btrfs_finish_ordered_io
       -> btrfs_join_transaction_nolock
            (the transaction has been aborted.)
            -> BUG_ON(type == TRANS_JOIN_NOLOCK);
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      a43f7f82
    • Liu Bo's avatar
      Btrfs: memset to avoid stale content in btree node block · 3eb548ee
      Liu Bo authored
      While updating the btree, we may push items between sibling
      nodes/leaves.  In leaves, the data sections grow backwards from
      the end of the block, while nodes hold only key pairs, which
      are stored one by one from the start of the block.
      
      So we may push key pairs from one node to the next node to its
      right in the tree and then update the node's nritems to reflect
      the correct end, while leaving the stale content in the node.
      Someone could intentionally corrupt the fs image, bump nritems
      to reach the stale content, and cause various crashes.
      
      This takes the in-memory @nritems as the correct value and
      memsets the unused part of a btree node.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      3eb548ee
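      The idea in the commit above, zeroing every slot past nritems so
      that no stale key pair survives, can be sketched in user-space C. This
      is a simplified illustration under stated assumptions: struct key_ptr,
      struct node, NODE_CAPACITY and node_truncate() are hypothetical and do
      not mirror the real btrfs on-disk layout or helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NODE_CAPACITY 8   /* hypothetical; real nodes hold many more slots */

/* Hypothetical, simplified key pair stored in a node. */
struct key_ptr {
    uint64_t objectid;
    uint64_t blockptr;
};

/* Hypothetical, simplified btree node: slots are packed from the start. */
struct node {
    uint32_t nritems;                     /* number of valid slots */
    struct key_ptr ptrs[NODE_CAPACITY];
};

/*
 * Shrink the node to new_nritems valid slots (for example after pushing
 * the tail to a sibling) and zero everything past the new end, so a
 * later bogus nritems cannot expose stale key pairs.
 */
static void node_truncate(struct node *n, uint32_t new_nritems)
{
    assert(new_nritems <= NODE_CAPACITY);
    n->nritems = new_nritems;
    memset(&n->ptrs[new_nritems], 0,
           (NODE_CAPACITY - new_nritems) * sizeof(n->ptrs[0]));
}
```

      After the truncation, slots below nritems keep their contents and
      every slot at or above it reads as zero.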
    • Liu Bo's avatar
      Btrfs: return gracefully from balance if fs tree is corrupted · 3561b9db
      Liu Bo authored
      When relocating tree blocks, we first get block information from
      back references in the extent tree, then search the fs tree to
      find all parents of a block.
      
      However, if the fs tree is corrupted, e.g. if some items are
      missing, we can hit these WARN_ONs and BUG_ONs.
      
      This makes us print some error messages and return gracefully
      from balance.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      3561b9db
    • Josef Bacik's avatar
      Btrfs: kill BUG_ON()'s in btrfs_mark_extent_written · 9c8e63db
      Josef Bacik authored
      There is no reason to BUG_ON() here; fs corruption could easily cause these
      conditions.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      9c8e63db
    • Josef Bacik's avatar
      Btrfs: kill the start argument to read_extent_buffer_pages · 8436ea91
      Josef Bacik authored
      Nobody uses this; it makes no sense to do partial reads of extent buffers.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      8436ea91
    • Josef Bacik's avatar
      Btrfs: add a flags field to btrfs_fs_info · afcdd129
      Josef Bacik authored
      We have a lot of random ints in btrfs_fs_info that can be put into flags.  This
      is mostly equivalent, with the exception of how we deal with quota going on or
      off: we now set a flag when we are turning it on or off and deal with that
      appropriately, rather than just having a pending state that the current
      quota_enabled gets set to.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      afcdd129
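      The change described above, folding separate ints into one flags
      word with explicit "turning on" and "turning off" bits for quota, can
      be sketched in user-space C. This is a minimal illustration: the flag
      names and helpers below are hypothetical, and plain bit operations
      stand in for whatever atomic bit helpers the kernel code uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; one bit each instead of one int each. */
enum fs_flag {
    FS_QUOTA_ENABLED,
    FS_QUOTA_ENABLING,
    FS_QUOTA_DISABLING,
};

/* Hypothetical stand-in for the filesystem info structure. */
struct fs_info {
    unsigned long flags;
};

static unsigned long fs_flag_mask(enum fs_flag f)
{
    assert((unsigned int)f < sizeof(unsigned long) * 8);
    return 1UL << f;
}

static void fs_set_flag(struct fs_info *fs, enum fs_flag f)
{
    fs->flags |= fs_flag_mask(f);
}

static void fs_clear_flag(struct fs_info *fs, enum fs_flag f)
{
    fs->flags &= ~fs_flag_mask(f);
}

static bool fs_test_flag(const struct fs_info *fs, enum fs_flag f)
{
    return (fs->flags & fs_flag_mask(f)) != 0;
}

/* Apply a pending quota transition, e.g. at a commit point. */
static void fs_apply_pending_quota(struct fs_info *fs)
{
    if (fs_test_flag(fs, FS_QUOTA_ENABLING)) {
        fs_clear_flag(fs, FS_QUOTA_ENABLING);
        fs_set_flag(fs, FS_QUOTA_ENABLED);
    }
    if (fs_test_flag(fs, FS_QUOTA_DISABLING)) {
        fs_clear_flag(fs, FS_QUOTA_DISABLING);
        fs_clear_flag(fs, FS_QUOTA_ENABLED);
    }
}
```

      A request to enable or disable quota only sets the transition bit;
      the enabled bit changes when the pending transition is applied.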
    • Qu Wenruo's avatar
      btrfs: extend btrfs_set_extent_delalloc and its friends to support in-band dedupe and subpage size patchset · ba8b04c1
      Qu Wenruo authored
      Extend the parameters of btrfs_set_extent_delalloc() and
      extent_clear_unlock_delalloc() for both the in-band dedupe and the
      subpage sector size patchsets.
      
      This should reduce conflicts between the two patchsets and the
      effort to rebase them.
      
      Cc: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Cc: David Sterba <dsterba@suse.cz>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      ba8b04c1
    • Jeff Mahoney's avatar
      btrfs: add dynamic debug support · 897a41b1
      Jeff Mahoney authored
      We can re-use the dynamic debugging descriptor to make use of the dynamic
      debugging mechanism but still use our own printk interface.
      
      Defining the DEBUG macro works as it did before.  When it's defined,
      all of the messages are printed by default.  We can also enable all
      debug messages at boot or module-load time using the 'dyndbg' and
      'btrfs.dyndbg' options.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      897a41b1