    xfs: don't free cowblocks from under dirty pagecache on unshare · 4390f019
    Brian Foster authored
    fallocate unshare mode explicitly breaks extent sharing. When the
    command completes, it checks the data fork for any remaining shared
    extents to determine whether the reflink inode flag and COW fork
    preallocation can be removed. This logic doesn't consider in-core
    pagecache and I/O state, however, which means we can unsafely remove
    COW fork blocks that are still needed under certain conditions.
    
    For example, consider the following command sequence:
    
    xfs_io -fc "pwrite 0 1k" -c "reflink <file> 0 256k 1k" \
    	-c "pwrite 0 32k" -c "funshare 0 1k" <file>
    
    This allocates a data block at offset 0, shares it, and then
    overwrites it with a larger buffered write. The overwrite triggers
    COW fork preallocation, 32 blocks by default, which maps the entire
    32k write to delalloc in the COW fork. Apart from the shared block
    at offset 0, the range remains hole mapped in the data fork. The
    unshare command redirties and flushes the folio at offset 0,
    removing the only shared extent from the inode. Since the inode no
    longer maps shared extents, unshare purges the COW fork, possibly
    before the remaining 28k has been written back.
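
The sequence above can be modelled in userspace to make the 28k figure concrete. This is only a toy sketch, not kernel code: it assumes a 4k filesystem block size, tracks just the 8 blocks covered by the 32k write, and every name in it (toy_inode, toy_setup, toy_unshare_buggy, toy_orphaned_bytes) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE 4096u /* assumption: 4k filesystem blocks */
#define TOY_NBLOCKS    8u    /* 32k buffered write / 4k blocks */

enum toy_map { TOY_HOLE, TOY_SHARED, TOY_ALLOCATED, TOY_DELALLOC };

/* Hypothetical per-block view of one inode: both forks plus pagecache state. */
struct toy_inode {
	enum toy_map data_fork[TOY_NBLOCKS];
	enum toy_map cow_fork[TOY_NBLOCKS];
	bool dirty[TOY_NBLOCKS];
};

/* State after "pwrite 0 1k", "reflink", "pwrite 0 32k". */
static void toy_setup(struct toy_inode *ip)
{
	assert(ip != NULL);
	for (size_t i = 0; i < TOY_NBLOCKS; i++) {
		/* Only block 0 was ever allocated, and it is now shared. */
		ip->data_fork[i] = (i == 0) ? TOY_SHARED : TOY_HOLE;
		/* The overwrite maps the whole range to delalloc in the COW fork. */
		ip->cow_fork[i] = TOY_DELALLOC;
		ip->dirty[i] = true;
	}
}

/* "funshare 0 1k" with the unsafe behaviour described in the text. */
static void toy_unshare_buggy(struct toy_inode *ip)
{
	bool shared = false;

	assert(ip != NULL);

	/* Flushing the folio at offset 0 moves that block into the data fork. */
	ip->data_fork[0] = TOY_ALLOCATED;
	ip->cow_fork[0] = TOY_HOLE;
	ip->dirty[0] = false;

	/* Only the data fork is consulted; pagecache state is ignored. */
	for (size_t i = 0; i < TOY_NBLOCKS; i++)
		if (ip->data_fork[i] == TOY_SHARED)
			shared = true;

	/* No shared extents remain, so the whole COW fork is purged. */
	if (!shared)
		for (size_t i = 0; i < TOY_NBLOCKS; i++)
			ip->cow_fork[i] = TOY_HOLE;
}

/* Bytes of dirty pagecache that no fork maps any more. */
static size_t toy_orphaned_bytes(const struct toy_inode *ip)
{
	size_t blocks = 0;

	assert(ip != NULL);
	for (size_t i = 0; i < TOY_NBLOCKS; i++)
		if (ip->dirty[i] && ip->data_fork[i] == TOY_HOLE &&
		    ip->cow_fork[i] == TOY_HOLE)
			blocks++;
	return blocks * TOY_BLOCK_SIZE;
}
```

Calling toy_setup() followed by toy_unshare_buggy() leaves seven dirty blocks with no mapping in either fork, which is the 28k the text refers to.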
    
    This leaves dirty pagecache backed by holes, which writeback quietly
    skips, thus leaving clean, non-zeroed pagecache over holes in the
    file. To verify, fiemap shows holes in the first 32k of the file and
    reads return different data across a remount:
    
    $ xfs_io -c "fiemap -v" <file>
    <file>:
     EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
       ...
       1: [8..511]:        hole               504
       ...
    $ xfs_io -c "pread -v 4k 8" <file>
    00001000:  cd cd cd cd cd cd cd cd  ........
    $ umount <mnt>; mount <dev> <mnt>
    $ xfs_io -c "pread -v 4k 8" <file>
    00001000:  00 00 00 00 00 00 00 00  ........
    
    To avoid this problem, make unshare follow the same rules used for
    background cowblock scanning and never purge the COW fork for inodes
    with dirty pagecache or in-flight I/O.
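
The rule can be sketched as a single predicate. This is a userspace illustration under stated assumptions, not the kernel implementation: the struct, its fields, and toy_can_purge_cow_fork are hypothetical names that only mirror the conditions named in this message (remaining shared extents, dirty pagecache, in-flight I/O).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the inode state that the decision depends on. */
struct toy_inode_state {
	bool has_shared_extents;          /* data fork still maps shared blocks */
	bool has_dirty_pagecache;         /* dirty folios not yet written back */
	bool has_writeback_in_flight;     /* buffered writeback still running */
	unsigned int direct_io_in_flight; /* outstanding direct I/O requests */
};

/*
 * Return true only when dropping COW fork preallocation cannot strand
 * data: no sharing is left to break and nothing in memory or in flight
 * may still need the COW fork mappings.
 */
static bool toy_can_purge_cow_fork(const struct toy_inode_state *s)
{
	assert(s != NULL);

	if (s->has_shared_extents)
		return false;
	if (s->has_dirty_pagecache || s->has_writeback_in_flight)
		return false;
	if (s->direct_io_in_flight > 0)
		return false;
	return true;
}
```

In the example sequence, the inode has no shared extents left after the unshare but still has dirty pagecache, so a check of this shape would keep the COW fork in place until writeback completes.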
    
    Fixes: 46afb062 ("xfs: only flush the unshared range in xfs_reflink_unshare")
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Carlos Maiolino <cem@kernel.org>
xfs_icache.c 57.4 KB