    Btrfs: more efficient extent state insertions · 12cfbad9
    Filipe David Borba Manana authored
    Currently we do 2 traversals of an inode's extent_io_tree
    before inserting an extent state structure: 1 to see if a
    matching extent state already exists and 1 to do the insertion
    if the first traversal didn't find such an extent state.
    
    This change just combines those tree traversals into a single one.
    While running sysbench tests (random writes) I captured the number
    of elements in extent_io_tree trees for a while (into a procfs file
    backed by a seq_list from seq_file module) and got this histogram:
    
    Count: 9310
    Range: 51.000 - 21386.000; Mean: 11785.243; Median: 18743.500; Stddev: 8923.688
    Percentiles:  90th: 20985.000; 95th: 21155.000; 99th: 21369.000
      51.000 -   93.933:   693 ########
      93.933 -  172.314:   938 ##########
     172.314 -  315.408:   856 #########
     315.408 -  576.646:    95 #
     576.646 - 6415.830:   888 ##########
    6415.830 - 11713.809:  1024 ###########
    11713.809 - 21386.000:  4816 #####################################################
    
    With trees this large (median of about 18,700 elements), each
    extra traversal takes a significant amount of time that can
    easily be avoided.
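
    As a rough illustration of folding the lookup and the insertion
    into one descent, here is a minimal userspace sketch. It is not
    the kernel code: it uses a plain unbalanced binary tree instead of
    the kernel rb-tree, and struct range_node, search_for_insert(),
    insert_at(), find_or_insert() and the visited counter are all
    hypothetical names made up for this example. It also assumes that
    ranges never overlap and looks up only the start offset.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an extent state: an inclusive byte range
 * [start, end] stored in a plain (unbalanced) binary search tree. */
struct range_node {
    unsigned long long start, end;
    struct range_node *left, *right;
};

/* Number of nodes touched so far, only to make the cost visible. */
static unsigned long visited;

/*
 * Single traversal: return the node whose range contains 'offset',
 * or NULL if there is none. On a miss, *link_ret is set to the empty
 * child slot where a node for 'offset' belongs, so the caller can
 * insert without descending the tree a second time.
 */
static struct range_node *search_for_insert(struct range_node **root,
                                            unsigned long long offset,
                                            struct range_node ***link_ret)
{
    struct range_node **link = root;

    while (*link) {
        struct range_node *n = *link;

        visited++;
        if (offset < n->start)
            link = &n->left;
        else if (offset > n->end)
            link = &n->right;
        else
            return n;           /* existing range covers offset */
    }
    *link_ret = link;           /* remember where the search ended */
    return NULL;
}

/* Link a new node into the slot reported by search_for_insert(). */
static struct range_node *insert_at(struct range_node **link,
                                    unsigned long long start,
                                    unsigned long long end)
{
    struct range_node *n;

    assert(link != NULL && *link == NULL);
    n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    n->start = start;
    n->end = end;
    *link = n;
    return n;
}

/* Find-or-insert with exactly one descent of the tree. */
static struct range_node *find_or_insert(struct range_node **root,
                                         unsigned long long start,
                                         unsigned long long end)
{
    struct range_node **link = NULL;
    struct range_node *n = search_for_insert(root, start, &link);

    return n ? n : insert_at(link, start, end);
}
```

    A caller would start from an empty tree (struct range_node *root =
    NULL) and call find_or_insert(&root, start, end). With a real
    rb-tree the insertion step would also need the parent node so the
    tree can be rebalanced, which this sketch leaves out.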
    
    Ran the following sysbench tests, 5 times each, for sequential and
    random writes, and got the following results:
    
      sysbench --test=fileio --file-num=1 --file-total-size=2G \
        --file-test-mode=seqwr --num-threads=16 --file-block-size=65536 \
        --max-requests=0 --max-time=60 --file-io-mode=sync
    
      sysbench --test=fileio --file-num=1 --file-total-size=2G \
        --file-test-mode=rndwr --num-threads=16 --file-block-size=65536 \
        --max-requests=0 --max-time=60 --file-io-mode=sync
    
    Before this change:
    
    sequential writes: 69.28Mb/sec (average of 5 runs)
    random writes:     4.14Mb/sec  (average of 5 runs)
    
    After this change:
    
    sequential writes: 69.91Mb/sec (average of 5 runs)
    random writes:     5.69Mb/sec  (average of 5 runs)
    Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: Chris Mason <clm@fb.com>