• Qu Wenruo's avatar
    btrfs: don't merge pages into bio if their page offset is not contiguous · 4a445b7b
    Qu Wenruo authored
    [BUG]
    Zygo reported on latest development branch, he could hit
    ASSERT()/BUG_ON() caused crash when doing RAID5 recovery (intentionally
    corrupt one disk, and let btrfs to recover the data during read/scrub).
    
    And The following minimal reproducer can cause extent state leakage at
    rmmod time:
    
      mkfs.btrfs -f -d raid5 -m raid5 $dev1 $dev2 $dev3 -b 1G > /dev/null
      mount $dev1 $mnt
      fsstress -w -d $mnt -n 25 -s 1660807876
      sync
      fssum -A -f -w /tmp/fssum.saved $mnt
      umount $mnt
    
      # Wipe the dev1 but keeps its super block
      xfs_io -c "pwrite -S 0x0 1m 1023m" $dev1
      mount $dev1 $mnt
      fssum -r /tmp/fssum.saved $mnt > /dev/null
      umount $mnt
      rmmod btrfs
    
    This will lead to the following extent states leakage:
    
      BTRFS: state leak: start 499712 end 503807 state 5 in tree 1 refs 1
      BTRFS: state leak: start 495616 end 499711 state 5 in tree 1 refs 1
      BTRFS: state leak: start 491520 end 495615 state 5 in tree 1 refs 1
      BTRFS: state leak: start 487424 end 491519 state 5 in tree 1 refs 1
      BTRFS: state leak: start 483328 end 487423 state 5 in tree 1 refs 1
      BTRFS: state leak: start 479232 end 483327 state 5 in tree 1 refs 1
      BTRFS: state leak: start 475136 end 479231 state 5 in tree 1 refs 1
      BTRFS: state leak: start 471040 end 475135 state 5 in tree 1 refs 1
    
    [CAUSE]
    Since commit 7aa51232 ("btrfs: pass a btrfs_bio to
    btrfs_repair_one_sector"), we always use btrfs_bio->file_offset to
    determine the file offset of a page.
    
    But that usage assume that, one bio has all its page having a continuous
    page offsets.
    
    Unfortunately that's not true, btrfs only requires the logical bytenr
    contiguous when assembling its bios.
    
    From above script, we have one bio looks like this:
    
      fssum-27671  submit_one_bio: bio logical=217739264 len=36864
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=466944 <<<
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=724992 <<<
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=729088
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=733184
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=737280
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=741376
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=745472
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=749568
      fssum-27671  submit_one_bio:   r/i=5/261 page_offset=753664
    
    Note that the 1st and the 2nd page has non-contiguous page offsets.
    
    This means, at repair time, we will have completely wrong file offset
    passed in:
    
       kworker/u32:2-19927  btrfs_repair_one_sector: r/i=5/261 page_off=729088 file_off=475136 bio_offset=8192
    
    Since the file offset is incorrect, we latter incorrectly set the extent
    states, and no way to really release them.
    
    Thus later it causes the leakage.
    
    In fact, this can be even worse, since the file offset is incorrect, we
    can hit cases like the incorrect file offset belongs to a HOLE, and
    later cause btrfs_num_copies() to trigger error, finally hit
    BUG_ON()/ASSERT() later.
    
    [FIX]
    Add an extra condition in btrfs_bio_add_page() for uncompressed IO.
    
    Now we will have more strict requirement for bio pages:
    
    - They should all have the same mapping
      (the mapping check is already implied by the call chain)
    
    - Their logical bytenr should be adjacent
      This is the same as the old condition.
    
    - Their page_offset() (file offset) should be adjacent
      This is the new check.
      This would result a slightly increased amount of bios from btrfs
      (needs holes and inside the same stripe boundary to trigger).
    
      But this would greatly reduce the confusion, as it's pretty common
      to assume a btrfs bio would only contain continuous page cache.
    
    Later we may need extra cleanups, as we no longer needs to handle gaps
    between page offsets in endio functions.
    
    Currently this should be the minimal patch to fix commit 7aa51232
    ("btrfs: pass a btrfs_bio to btrfs_repair_one_sector").
    Reported-by: default avatarZygo Blaxell <ce3g8jdj@umail.furryterror.org>
    Fixes: 7aa51232 ("btrfs: pass a btrfs_bio to btrfs_repair_one_sector")
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    4a445b7b
extent_io.c 199 KB