    btrfs: use an efficient way to represent source of duplicated stripes · 1faf3885
    Qu Wenruo authored
    For btrfs dev-replace, we have to duplicate writes to the source
    device into the target device.
    
    For non-RAID56, all writes to the same mapped range carry the same
    content, so no per-stripe mapping is needed.
    (E.g. in btrfs_submit_bio(), for a non-RAID56 range we just submit the
    same write to all involved devices.)
    
    But for RAID56, all stripes contain different content, thus we must
    have a clear mapping of which stripe is duplicated from which original
    stripe.
    
    Currently we use a complex scheme based on the tgtdev_map[] array, e.g.:
    
     num_tgtdevs = 1
     tgtdev_map[0] = 0    <- Means stripes[0] is not involved in replace.
     tgtdev_map[1] = 3    <- Means stripes[1] is involved in replace,
    			 and it's duplicated to stripes[3].
     tgtdev_map[2] = 0    <- Means stripes[2] is not involved in replace.
    
    But this wastes some space, and ignores one important fact about
    dev-replace: there is at most one running replace.
    
    Thus we can change it to a fixed array to represent the mapping:
    
     replace_nr_stripes = 1
     replace_stripe_src = 1    <- Means stripes[1] is involved in replace,
    			      thus the extra stripe is a copy of
    			      stripes[1].
    
    This saves some space in bioc for RAID56 chunks with many devices,
    and gets rid of one variable-sized array in bioc.
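
    As an illustration only, the two representations can be compared with
    the simplified sketch below.  The struct and function names (old_map,
    new_map, old_find_src(), new_find_src()) are hypothetical stand-ins and
    not the kernel's struct btrfs_io_context; using -1 for "no source
    stripe" is also an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's btrfs_io_context. */
struct old_map {
	int num_tgtdevs;
	const int *tgtdev_map;	/* one entry per original stripe, 0 = not replaced */
};

struct new_map {
	int replace_nr_stripes;
	int replace_stripe_src;	/* index of the source stripe, -1 if none (assumed) */
};

/*
 * Old scheme: scan the per-stripe array to learn which original stripe
 * is duplicated into the extra stripe at index @dup.
 */
static int old_find_src(const struct old_map *m, int nr_orig, int dup)
{
	assert(m != NULL && m->tgtdev_map != NULL);
	for (int i = 0; i < nr_orig; i++) {
		if (m->tgtdev_map[i] == dup)
			return i;
	}
	return -1;
}

/* New scheme: the source index is stored directly, no array and no scan. */
static int new_find_src(const struct new_map *m)
{
	assert(m != NULL);
	return m->replace_nr_stripes ? m->replace_stripe_src : -1;
}
```

    With the example values above (three original stripes, stripes[1]
    duplicated to stripes[3]), both lookups yield 1, but only the old form
    needs storage proportional to the number of stripes.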
    
    Thus the patch involves the following changes:
    
    - Replace @num_tgtdevs and @tgtdev_map[] with @replace_nr_stripes
      and @replace_stripe_src.
    
      @num_tgtdevs is simply renamed to @replace_nr_stripes, while the
      mapping itself is completely changed.
    
    - Add extra ASSERT()s for RAID56 code
    
    - Only add two extra stripes for dev-replace cases,
      as we have an upper limit on how many dev-replace stripes we can have.
    
    - Unify the behavior of handle_ops_on_dev_replace()
      Previously handle_ops_on_dev_replace() took two different paths for
      WRITE and GET_READ_MIRRORS.
      Now unify them by always taking the WRITE path first (with at most 2
      replace stripes), then if we're doing GET_READ_MIRRORS and we have 2
      extra stripes, just drop one stripe.
    
    - Remove the @real_stripes argument from alloc_btrfs_io_context(),
      as we don't need the old variable length array any more.
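
    The unified flow described for handle_ops_on_dev_replace() can be
    sketched roughly as follows.  This is a minimal model, not the kernel
    code: struct ctx, enum op and add_replace_stripes() are hypothetical
    names, and which of the two extra stripes gets dropped is not stated
    in this message, so the sketch simply drops the last one.

```c
#include <assert.h>

/* Hypothetical operation codes and context for this sketch only. */
enum op { OP_WRITE, OP_GET_READ_MIRRORS };

struct ctx {
	int num_stripes;		/* original plus extra stripes */
	int replace_nr_stripes;		/* extra stripes added for dev-replace */
};

/*
 * Always take the WRITE path first: append one extra stripe for every
 * original stripe that sits on the replace source device (at most 2).
 * For GET_READ_MIRRORS, keep only one extra stripe afterwards.
 */
static void add_replace_stripes(struct ctx *c, enum op op, int nr_on_srcdev)
{
	assert(nr_on_srcdev >= 0 && nr_on_srcdev <= 2);

	/* Common (WRITE) path. */
	c->replace_nr_stripes = nr_on_srcdev;
	c->num_stripes += nr_on_srcdev;

	/* GET_READ_MIRRORS with 2 extra stripes: drop one (assumed: the last). */
	if (op == OP_GET_READ_MIRRORS && c->replace_nr_stripes == 2) {
		c->replace_nr_stripes--;
		c->num_stripes--;
	}
}
```

    In this model a WRITE with two matching stripes ends up with two extra
    stripes, while GET_READ_MIRRORS with the same input ends up with one.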
    
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>