    Btrfs: compare relevant parts of delayed tree refs · a2d8e3c7
    Josef Bacik authored
    commit 41b0fc42 upstream.
    
    A user reported a panic while running a balance.  What was happening was that
    he was relocating a block, which added a reference to the relocation tree.
    Then relocation would walk through the relocation tree, drop that reference
    and free that block, and then walk down a snapshot which referenced the same
    block and add another ref to the block.  The problem is that this all
    happened in the same transaction, so the parent block was freed when we
    dropped our reference and was immediately available for allocation, and then
    it was used _again_ to add a reference for the same block from a different
    snapshot.  This resulted in something like this in the delayed ref tree:
    
    add ref to 90234880, parent=2067398656, ref_root 1766, level 1
    del ref to 90234880, parent=2067398656, ref_root 18446744073709551608, level 1
    add ref to 90234880, parent=2067398656, ref_root 1767, level 1
    
    As you can see, the ref_roots don't match, because when we inc the ref we use
    the header owner, which is the original tree the block belonged to, instead of
    the data reloc tree.  Then when we remove the extent we use the reloc tree
    objectid.  But none of this matters, since it is a shared reference, which
    means only the parent matters.  When the delayed refs run, all the increments
    are added first and then all the drops are done, to make sure that we don't
    delete the ref if we net a positive ref count.  But tree blocks aren't allowed
    to have multiple refs from the same block, so this panics when it tries to add
    the second ref.  We need the add and the drop to cancel each other out in
    memory so we only do the final add.
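
The effect described above can be illustrated with a small user-space sketch.
It is not kernel code: struct ref_op, net_delta_by_parent() and unmerged_adds()
are hypothetical names invented here, and the model assumes that unmerged
entries are processed with all adds before any drop, as the text describes.
The three entries are the ones from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one delayed tree ref entry. */
struct ref_op {
    uint64_t bytenr;   /* block being referenced */
    uint64_t parent;   /* parent block of the shared backref */
    uint64_t ref_root; /* root recorded with the entry */
    int delta;         /* +1 for "add ref", -1 for "del ref" */
};

/* The three entries from the report; 18446744073709551608 is (u64)-8. */
static const struct ref_op reported_ops[] = {
    { 90234880ULL, 2067398656ULL, 1766ULL, +1 },
    { 90234880ULL, 2067398656ULL, 18446744073709551608ULL, -1 },
    { 90234880ULL, 2067398656ULL, 1767ULL, +1 },
};

/*
 * Net change for (bytenr, parent) when entries are merged on the parent
 * alone, ignoring ref_root. This is the outcome the fix aims for.
 */
int net_delta_by_parent(const struct ref_op *ops, size_t n,
                        uint64_t bytenr, uint64_t parent)
{
    assert(ops != NULL || n == 0);
    int net = 0;
    for (size_t i = 0; i < n; i++)
        if (ops[i].bytenr == bytenr && ops[i].parent == parent)
            net += ops[i].delta;
    return net;
}

/*
 * Number of adds attempted for (bytenr, parent) when nothing merges and
 * every add runs before any drop. More than one means a second insert of
 * the same shared ref, which is the failing case.
 */
int unmerged_adds(const struct ref_op *ops, size_t n,
                  uint64_t bytenr, uint64_t parent)
{
    assert(ops != NULL || n == 0);
    int adds = 0;
    for (size_t i = 0; i < n; i++)
        if (ops[i].bytenr == bytenr && ops[i].parent == parent &&
            ops[i].delta > 0)
            adds++;
    return adds;
}
```

For the reported entries, merging on the parent nets a single add, while the
unmerged case attempts two adds of the same shared ref before the drop is seen.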
    
    So to fix this we need to adjust how the delayed refs are added to the tree.
    Only the ref_root matters when it is a normal backref, and only the parent
    matters when it is a shared backref.  So make our decision based on what ref
    type we have.  This allows us to keep the ref_root in memory in case anybody
    wants to use it for something else, and it allows the delayed refs to be merged
    properly so we don't end up with this panic.
    
    With this patch the user's image no longer panics on mount, and it has a clean
    fsck after a normal mount/umount cycle.  Thanks,
    Reported-by: Roman Mamedov <rm@romanrm.ru>
    Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
delayed-ref.c 24.6 KB