• Qu Wenruo's avatar
    btrfs: fix hang when run_delalloc_range() failed · 968f2566
    Qu Wenruo authored
    [BUG]
    When running subpage preparation patches on x86, btrfs/125 will hang
    forever with one ordered extent never finished.
    
    [CAUSE]
    The test case btrfs/125 itself will always fail as the fix is never merged.
    
    When the test fails at balance, btrfs needs to cleanup the ordered
    extent in btrfs_cleanup_ordered_extents() for data reloc inode.
    
    The problem is in the sequence how we cleanup the page Order bit.
    
    Currently it works like:
    
      btrfs_cleanup_ordered_extents()
      |- find_get_page();
      |- btrfs_page_clear_ordered(page);
      |  Now the page doesn't have Ordered bit anymore.
      |  !!! This also includes the first (locked) page !!!
      |
      |- offset += PAGE_SIZE
      |  This is to skip the first page
      |- __endio_write_update_ordered()
         |- btrfs_mark_ordered_io_finished(NULL)
            Except the first page, all ordered extents are finished.
    
    Then the locked page is cleaned up in __extent_writepage():
    
      __extent_writepage()
      |- If (PageError(page))
      |- end_extent_writepage()
         |- btrfs_mark_ordered_io_finished(page)
            |- if (btrfs_test_page_ordered(page))
            |-  !!! The page gets skipped !!!
                The ordered extent is not decreased as the page doesn't
                have ordered bit anymore.
    
    This leaves the ordered extent with bytes_left == sectorsize, thus never
    finish.
    
    [FIX]
    The fix is to ensure we never clear page Ordered bit without running the
    ordered extent accounting.
    
    Here we choose to skip the locked page in
    btrfs_cleanup_ordered_extents() so that later end_extent_writepage() can
    properly finish the ordered extent.
    Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    968f2566
inode.c 298 KB