    xfs: fix crash and data corruption due to removal of busy COW extents · a1b7a4de
    Christoph Hellwig authored
    There is a race window between write_cache_pages calling
    clear_page_dirty_for_io and XFS calling set_page_writeback, in which
    the mapping for an inode is tagged neither as dirty, nor as writeback.
    
    If the COW shrinker hits in exactly that window we'll remove the delayed
    COW extents while writepages is still trying to write them back, which in
    release kernels will manifest as corruption of the bmap btree, and in debug
    kernels will trip the ASSERT about not calling xfs_bmapi_write with the
    COWFORK flag for holes.  A complex customer load manages to hit this
    window fairly reliably, probably by always having COW writeback in flight
    while the COW shrinker runs.
    
    This patch adds another check for having the I_DIRTY_PAGES flag set,
    which is still set during this race window.  While this fixes the problem
    I'm still not overly happy about the way the COW shrinker works as it
    still seems a bit fragile.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
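
The race and the extra check described in the message can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `struct model_inode`, `cow_busy_old`, `cow_busy_new`, and `model_race_window` are hypothetical names invented for illustration, and the booleans merely stand in for the I_DIRTY_PAGES flag and the dirty/writeback mapping tags that the message mentions.

```c
/* Userspace model only; none of these types or helpers exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the inode state the commit message talks about. */
struct model_inode {
    bool i_dirty_pages;     /* models I_DIRTY_PAGES being set on the inode */
    bool tagged_dirty;      /* models the mapping being tagged dirty */
    bool tagged_writeback;  /* models the mapping being tagged writeback */
};

/* Check as described before the fix: only the mapping tags are consulted. */
static bool cow_busy_old(const struct model_inode *ip)
{
    assert(ip != NULL);
    return ip->tagged_dirty || ip->tagged_writeback;
}

/* Check as described after the fix: I_DIRTY_PAGES also counts as busy. */
static bool cow_busy_new(const struct model_inode *ip)
{
    assert(ip != NULL);
    return ip->i_dirty_pages || cow_busy_old(ip);
}

/*
 * The window between clear_page_dirty_for_io and set_page_writeback:
 * neither tag is set, but I_DIRTY_PAGES is still set.
 */
static struct model_inode model_race_window(void)
{
    struct model_inode ip = {
        .i_dirty_pages = true,
        .tagged_dirty = false,
        .tagged_writeback = false,
    };
    return ip;
}
```

In this model, a shrinker that relies on `cow_busy_old` sees the inode in the race window as idle and would go on to remove its COW extents, while `cow_busy_new` reports it as busy and leaves the extents alone.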
xfs_icache.c 42.8 KB