    btrfs: fix block group item corruption after inserting new block group · 675dfe12
    Filipe Manana authored
    We can often end up inserting a block group item, for a new block group,
    with a wrong value for the used bytes field.
    
    This happens if, for the newly allocated block group, in the same
    transaction that created the block group, we have tasks allocating
    extents from it as well as tasks removing extents from it.
    
    For example:
    
    1) Task A creates a metadata block group X;
    
    2) Two extents are allocated from block group X, so its "used" field is
       updated to 32K, and its "commit_used" field remains as 0;
    
    3) Transaction commit starts, by some task B, and it enters
       btrfs_start_dirty_block_groups(). There it tries to update the block
       group item for block group X, which currently has its "used" field with
       a value of 32K. But that fails since the block group item was not yet
       inserted, and so on failure update_block_group_item() sets the
       "commit_used" field of the block group back to 0;
    
    4) The block group item is inserted by task A, when for example
       btrfs_create_pending_block_groups() is called when releasing its
       transaction handle. This results in insert_block_group_item() inserting
       the block group item in the extent tree (or block group tree), with a
       "used" field having a value of 32K, but without updating the
       "commit_used" field in the block group, which remains with value of 0;
    
    5) The two extents are freed from block group X, so its "used" field
       changes from 32K to 0;
    
    6) The transaction commit by task B continues and enters
       btrfs_write_dirty_block_groups(), which calls update_block_group_item()
       for block group X. There it decides to skip the block group item
       update, because "used" has a value of 0 and "commit_used" has a value
       of 0 too.
    
       As a result, we end up with a block group item having a 32K "used"
       field but no extents allocated from it.
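
    The sequence above can be replayed with a small userspace model. This is
    only a sketch: struct bg_model and the model_*() helpers are hypothetical
    stand-ins and not the kernel's data structures or functions, locking and
    the b-tree are left out, -1 stands in for the real error code, and the
    16K extent size is an assumption chosen so that two extents add up to
    32K.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a block group (not the kernel struct). */
struct bg_model {
	uint64_t used;        /* in-memory "used" bytes */
	uint64_t commit_used; /* "used" value believed to be in the item */
	bool item_inserted;   /* does the block group item exist in the tree? */
	uint64_t item_used;   /* "used" value stored in the modelled item */
};

/*
 * Models the behaviour described for update_block_group_item(): skip when
 * "used" equals "commit_used", and restore "commit_used" when the item
 * cannot be found.
 */
static int model_update_item(struct bg_model *bg)
{
	uint64_t old_commit_used = bg->commit_used;

	if (bg->used == bg->commit_used)
		return 0; /* nothing to write, update skipped */

	bg->commit_used = bg->used;
	if (!bg->item_inserted) {
		bg->commit_used = old_commit_used;
		return -1; /* stand-in for the lookup failure */
	}
	bg->item_used = bg->used;
	return 0;
}

/* Models inserting the block group item, optionally recording commit_used. */
static void model_insert_item(struct bg_model *bg, bool update_commit_used)
{
	bg->item_inserted = true;
	bg->item_used = bg->used;
	if (update_commit_used)
		bg->commit_used = bg->used;
}

/* Replays steps 1) to 6) and returns the "used" value left in the item. */
static uint64_t model_run_sequence(bool update_commit_used)
{
	const uint64_t extent_size = 16384; /* assumed size of each extent */
	struct bg_model bg = { 0 };         /* 1) new block group X */

	bg.used += 2 * extent_size;         /* 2) two extents allocated */
	(void)model_update_item(&bg);       /* 3) fails, item not inserted yet */
	model_insert_item(&bg, update_commit_used); /* 4) item inserted */
	assert(bg.used >= 2 * extent_size);
	bg.used -= 2 * extent_size;         /* 5) both extents freed */
	(void)model_update_item(&bg);       /* 6) update attempted at commit */

	return bg.item_used;
}
```

    In this model, model_run_sequence(false) returns 32768 even though the
    in-memory "used" value is back to 0, which is the mismatch described
    above. Passing true, so that the insert also records "commit_used",
    returns 0.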
    
    When this issue happens, a btrfs check reports an error like this:
    
       [1/7] checking root items
       [2/7] checking extents
       block group [1104150528 1073741824] used 39796736 but extent items used 0
       ERROR: errors found in extent allocation tree or chunk allocation
       (...)
    
    Fix this by making insert_block_group_item() update the block group's
    "commit_used" field.
    
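    As a rough illustration of why recording "commit_used" at insert time is
    enough, here is a userspace sketch. The struct and the sketch_*() helpers
    are hypothetical names for this example and are not the code changed by
    this patch; locking and the tree insertion itself are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block group and its item. */
struct bg_fix_sketch {
	uint64_t used;        /* in-memory "used" bytes */
	uint64_t commit_used; /* "used" value last written to the item */
	uint64_t item_used;   /* "used" value stored in the modelled item */
};

/* Sketch of the fixed insert: remember the value written to the item. */
static void sketch_insert_item_fixed(struct bg_fix_sketch *bg)
{
	bg->item_used = bg->used;
	bg->commit_used = bg->used;
}

/* The skip check: an update is needed only when the two values differ. */
static bool sketch_item_needs_update(const struct bg_fix_sketch *bg)
{
	return bg->used != bg->commit_used;
}

/* Insert with 32K used, free everything, then run the commit-time check. */
static uint64_t sketch_fixed_sequence(void)
{
	struct bg_fix_sketch bg = { .used = 32768 };

	sketch_insert_item_fixed(&bg);      /* item and commit_used are 32K */
	assert(bg.used >= 32768);
	bg.used -= 32768;                   /* both extents freed */

	if (sketch_item_needs_update(&bg)) { /* 0 != 32K, so not skipped */
		bg.item_used = bg.used;
		bg.commit_used = bg.used;
	}
	return bg.item_used;
}
```

    With "commit_used" set by the insert, the later check sees 0 versus 32K
    and rewrites the item, so sketch_fixed_sequence() returns 0.
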
    Fixes: 7248e0ce ("btrfs: skip update of block group item if used bytes are the same")
    CC: stable@vger.kernel.org # 6.2+
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
block-group.c 132 KB