    btrfs: avoid unnecessary resolution of indirect backrefs during fiemap · 6976201f
    Filipe Manana authored
    
    
    During fiemap, when determining if a data extent is shared or not, if we
    don't find that the extent is directly shared, then we need to determine if
    it's shared through subtrees. For that we need to resolve the indirect
    reference we found in order to figure out the path in the inode's fs tree,
    which is a path starting at the fs tree's root node and going down to the
    leaf that contains the file extent item that points to the data extent.
    We then proceed to determine if any extent buffer in that path is shared
    with other trees or not.
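
    As a rough illustration of that last step, the sketch below models the
    path as an array of nodes and reports the path as shared as soon as one
    node is referenced by more than one tree. The struct, its refs field and
    the function name are hypothetical simplifications made up for this
    example, not the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one extent buffer in the path (root node .. leaf). */
struct path_node {
	unsigned int refs; /* assumed: number of trees referencing this node */
};

/*
 * Walk the path and stop at the first node referenced by more than one
 * tree. An empty path is treated as not shared.
 */
static bool path_has_shared_node(const struct path_node *path, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (path[i].refs > 1)
			return true;
	}
	return false;
}
```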
    
    However, when the generation of the data extent is more recent than the
    last generation used to snapshot the root, we don't need to determine
    the path, since the data extent cannot be shared through snapshots.
    For this case we currently still determine the leaf of that path (at
    find_parent_nodes()), but then stop determining the other nodes in the
    path (at btrfs_is_data_extent_shared()) as it's pointless.
    
    So do the check of the data extent's generation earlier, at
    find_parent_nodes(), before trying to resolve the indirect reference to
    determine the leaf in the path. This saves us from doing one expensive
    b+tree search in the fs tree of our target inode, as well as other minor
    work.
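
    The reordering can be pictured with the minimal sketch below. It is not
    the kernel code: the counter struct and all function names are
    hypothetical, and resolve_leaf() merely stands in for the b+tree search
    that resolves the indirect reference. It only shows that checking the
    generation first skips that search for extents created after the last
    snapshot of the root.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter standing in for the cost of fs tree searches. */
struct lookup_stats {
	unsigned int fs_tree_searches;
};

/* True when the extent is newer than the last snapshot of the root. */
static bool created_after_last_snapshot(uint64_t extent_gen,
					uint64_t last_snapshot_gen)
{
	return extent_gen > last_snapshot_gen;
}

/* Stand-in for resolving the indirect reference to find the leaf. */
static void resolve_leaf(struct lookup_stats *stats)
{
	stats->fs_tree_searches++;
}

/* Old order: find the leaf first, then give up on the rest of the path. */
static void lookup_before(uint64_t extent_gen, uint64_t last_snapshot_gen,
			  struct lookup_stats *stats)
{
	resolve_leaf(stats);
	if (created_after_last_snapshot(extent_gen, last_snapshot_gen))
		return;
	/* The remaining nodes of the path would be determined here. */
}

/* New order: check the generation before searching the fs tree at all. */
static void lookup_after(uint64_t extent_gen, uint64_t last_snapshot_gen,
			 struct lookup_stats *stats)
{
	if (created_after_last_snapshot(extent_gen, last_snapshot_gen))
		return;
	resolve_leaf(stats);
	/* The remaining nodes of the path would be determined here. */
}
```

    For an extent older than the last snapshot both orders perform the same
    search, so only the newer-extent case gets cheaper.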
    
    The following test was run on a non-debug kernel (Debian's default kernel
    config):
    
       $ cat test-fiemap.sh
       #!/bin/bash
    
       DEV=/dev/sdi
       MNT=/mnt/sdi
    
       umount $DEV &> /dev/null
       mkfs.btrfs -f $DEV
       # Use compression to quickly create files with a lot of extents
       # (each with a size of 128K).
       mount -o compress=lzo $DEV $MNT
    
       # 40G gives 327680 extents, each with a size of 128K.
       xfs_io -f -c "pwrite -S 0xab -b 1M 0 40G" $MNT/foobar
    
       # Add some more files to increase the size of the fs and extent
       # trees (in the real world there's a lot of files and extents
       # from other files).
       xfs_io -f -c "pwrite -S 0xcd -b 1M 0 20G" $MNT/file1
       xfs_io -f -c "pwrite -S 0xef -b 1M 0 20G" $MNT/file2
       xfs_io -f -c "pwrite -S 0x73 -b 1M 0 20G" $MNT/file3
    
       umount $MNT
       mount -o compress=lzo $DEV $MNT
    
       start=$(date +%s%N)
       filefrag $MNT/foobar
       end=$(date +%s%N)
       dur=$(( (end - start) / 1000000 ))
       echo "fiemap took $dur milliseconds (metadata not cached)"
       echo
    
       start=$(date +%s%N)
       filefrag $MNT/foobar
       end=$(date +%s%N)
       dur=$(( (end - start) / 1000000 ))
       echo "fiemap took $dur milliseconds (metadata cached)"
    
       umount $MNT
    
    Before applying this patch:
    
       (...)
       /mnt/sdi/foobar: 327680 extents found
       fiemap took 1285 milliseconds (metadata not cached)
    
       /mnt/sdi/foobar: 327680 extents found
       fiemap took 742 milliseconds (metadata cached)
    
    After applying this patch:
    
       (...)
       /mnt/sdi/foobar: 327680 extents found
       fiemap took 689 milliseconds (metadata not cached)
    
       /mnt/sdi/foobar: 327680 extents found
       fiemap took 393 milliseconds (metadata cached)
    
    That's a 46.4% reduction for the metadata not cached case, and
    a 47.0% reduction for the cached metadata case.
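
    The percentages follow from the timings above as
    (before - after) / before. A small helper, whose name is made up for
    this example, reproduces the arithmetic:

```c
/* Relative reduction in percent when going from 'before' to 'after'. */
static double reduction_percent(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

    With the measured values, reduction_percent(1285, 689) is about 46.4
    and reduction_percent(742, 393) is about 47.0.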
    
    The test is somewhat limited in the sense that the gains may be higher in
    practice, because in the test the filesystem is small, so we have small
    fs and extent trees, and there's no concurrent access to the trees,
    therefore no lock contention there.

    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
backref.c 94.6 KB