    Btrfs: remove unnecessary locking of cleaner_mutex to avoid deadlock · 85e0a0f2
    Filipe Manana authored
    After commit e44163e1 ("btrfs: explictly delete unused block groups
    in close_ctree and ro-remount"), added in the 4.3 merge window, we have
    calls to btrfs_delete_unused_bgs() while holding the cleaner_mutex.
    This can cause a deadlock with a concurrent block group relocation (when
    a filesystem balance or shrink operation is in progress for example)
    because btrfs_delete_unused_bgs() locks delete_unused_bgs_mutex and the
    relocation path first locks delete_unused_bgs_mutex and then locks
    cleaner_mutex, resulting in a classic ABBA deadlock:
    
             CPU 0                                        CPU 1
    
    lock fs_info->cleaner_mutex
    
                                               __btrfs_balance() || btrfs_shrink_device()
                                                 lock fs_info->delete_unused_bgs_mutex
                                                 btrfs_relocate_chunk()
                                                   btrfs_relocate_block_group()
                                                     lock fs_info->cleaner_mutex
    btrfs_delete_unused_bgs()
      lock fs_info->delete_unused_bgs_mutex
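
    The interleaving above can be replayed in userspace as a rough sketch.
    This is not kernel code: the two pthread mutexes merely stand in for
    fs_info->cleaner_mutex and fs_info->delete_unused_bgs_mutex, the helper
    name old_ordering_deadlocks() is hypothetical, and a single thread plays
    both CPUs, using pthread_mutex_trylock() so the blocked state can be
    observed without actually hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins for the two btrfs mutexes (assumption: plain,
 * non-recursive pthread mutexes are a good enough model here). */
pthread_mutex_t cleaner_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t delete_unused_bgs_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Replays the CPU 0 / CPU 1 interleaving from the diagram.
 * Returns true when each side is stuck on the lock the other one holds. */
bool old_ordering_deadlocks(void)
{
    int rc;

    /* CPU 0: lock cleaner_mutex before calling the deletion path. */
    rc = pthread_mutex_lock(&cleaner_mutex);
    assert(rc == 0);

    /* CPU 1: balance/shrink locks delete_unused_bgs_mutex. */
    rc = pthread_mutex_lock(&delete_unused_bgs_mutex);
    assert(rc == 0);
    (void)rc;

    /* CPU 1: relocation now wants cleaner_mutex, which CPU 0 holds. */
    int cpu1 = pthread_mutex_trylock(&cleaner_mutex);

    /* CPU 0: deletion now wants delete_unused_bgs_mutex, held by CPU 1. */
    int cpu0 = pthread_mutex_trylock(&delete_unused_bgs_mutex);

    /* Release each mutex exactly once; the trylocks above are expected
     * to fail with EBUSY and therefore acquire nothing. */
    pthread_mutex_unlock(&delete_unused_bgs_mutex);
    pthread_mutex_unlock(&cleaner_mutex);

    return cpu0 == EBUSY && cpu1 == EBUSY;
}
```

    With blocking locks instead of trylock, both sides would wait forever,
    which is the ABBA deadlock described above.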
    
    Fix this by not taking the cleaner_mutex before calling
    btrfs_delete_unused_bgs() because it's no longer needed after
    commit 67c5e7d4 ("Btrfs: fix race between balance and unused block
    group deletion"). The mutex fs_info->delete_unused_bgs_mutex, the
    spinlock fs_info->unused_bgs_lock and a block group's spinlock are
    enough to get correct serialization between tasks running relocation
    and unused block group deletion (as well as between multiple tasks
    concurrently calling btrfs_delete_unused_bgs()).
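
    The effect of the fix can be sketched the same way in userspace. Again
    this is only an illustration under assumptions: the pthread mutexes
    stand in for the kernel mutexes, fixed_ordering_makes_progress() is a
    hypothetical helper, and one thread replays both CPUs. The deletion
    path no longer holds cleaner_mutex, so the relocation path can always
    take it and finish, after which the deletion path proceeds.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins for the two btrfs mutexes (assumption). */
pthread_mutex_t cleaner_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t delete_unused_bgs_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Replays the same race with the deletion path NOT holding
 * cleaner_mutex. Returns true when both sides eventually proceed. */
bool fixed_ordering_makes_progress(void)
{
    /* CPU 1: balance/shrink locks delete_unused_bgs_mutex. */
    int rc = pthread_mutex_lock(&delete_unused_bgs_mutex);
    assert(rc == 0);
    (void)rc;

    /* CPU 0: deletion path wants delete_unused_bgs_mutex and has to
     * wait, but it holds nothing that CPU 1 needs. */
    bool cpu0_waits =
        pthread_mutex_trylock(&delete_unused_bgs_mutex) == EBUSY;

    /* CPU 1: relocation takes cleaner_mutex, which is free now. */
    bool cpu1_proceeds = pthread_mutex_trylock(&cleaner_mutex) == 0;
    if (cpu1_proceeds)
        pthread_mutex_unlock(&cleaner_mutex);

    /* CPU 1: relocation done, release the outer mutex. */
    pthread_mutex_unlock(&delete_unused_bgs_mutex);

    /* CPU 0: the deletion path can now take the mutex and run. */
    bool cpu0_proceeds =
        pthread_mutex_trylock(&delete_unused_bgs_mutex) == 0;
    if (cpu0_proceeds)
        pthread_mutex_unlock(&delete_unused_bgs_mutex);

    return cpu0_waits && cpu1_proceeds && cpu0_proceeds;
}
```

    Only one side ever waits, and it waits on a lock whose holder is not
    blocked, so the wait is bounded.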
    
    This issue was discussed (on the mailing list) during the review of
    the patch titled "btrfs: explictly delete unused block groups in
    close_ctree and ro-remount" and it was agreed that acquiring the
    cleaner mutex had to be dropped after the patch titled
    "Btrfs: fix race between balance and unused block group deletion"
    got merged (both patches were submitted at about the same time, but
    one landed in kernel 4.2 and the other in the 4.3 merge window).

    Signed-off-by: Filipe Manana <fdmanana@suse.com>