    xfs: reserve blocks for truncating large realtime inode · d0489451
    Zhang Yi authored
    When doing an unaligned truncate down of a big realtime file,
    xfs_truncate_page() only zeroes out the tail of the EOF block.
    __xfs_bunmapi() should split the tail written extent and convert the
    part that lies beyond the EOF block to unwritten, but it cannot do so
    because the block reservation is zero in xfs_setattr_size(). This can
    expose stale data since commit '943bc088 ("iomap: don't increase i_size
    if it's not a write operation")'.
    
    If we truncate a file that contains a large enough written extent:
    
         |<    rtext    >|<    rtext    >|
      ...WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
            ^ (new EOF)      ^ old EOF
    
    Since we only zero out the tail of the EOF block, and
    xfs_itruncate_extents()->..->__xfs_bunmapi() unmaps only whole aligned
    extents, the file ends up in this state:
    
         |<    rtext    >|
      ...WWWzWWWWWWWWWWWWW
            ^ new EOF
    
    Then if we do an extending write like this, the blocks in the previous
    tail extent become stale:
    
         |<    rtext    >|
      ...WWWzSSSSSSSSSSSSS..........WWWWWWWWWWWWWWWWW
            ^ old EOF               ^ append start  ^ new EOF
    
    Fix this by reserving XFS_DIOSTRAT_SPACE_RES blocks for big realtime
    inodes.
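
The scenario above can be sketched as a toy user-space model. This is only an illustration under stated assumptions, not the kernel code: toy_truncate(), toy_stale_blocks(), the per-block state letters and the "resblks > 0" check are all hypothetical stand-ins for the behaviour described in this message (whole aligned extents are unmapped; the remainder of the tail extent is converted to unwritten only when blocks were reserved).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy per-block states; 'W' and '.' follow the diagrams, 'U' is unwritten. */
enum { BLK_HOLE = '.', BLK_WRITTEN = 'W', BLK_UNWRITTEN = 'U' };

/*
 * Hypothetical model of an unaligned truncate down on a big realtime file.
 *   map        one state character per file block
 *   nblocks    number of blocks in map
 *   rtextsize  blocks per realtime extent
 *   eof_block  index of the block that contains the new EOF
 *   resblks    blocks reserved for the truncate (0 models the bug)
 */
static void toy_truncate(char *map, size_t nblocks, size_t rtextsize,
                         size_t eof_block, size_t resblks)
{
    size_t i, aligned_end;

    assert(rtextsize > 0 && eof_block < nblocks);

    /* First block of the next whole realtime extent after the EOF block. */
    aligned_end = (eof_block / rtextsize + 1) * rtextsize;
    if (aligned_end > nblocks)
        aligned_end = nblocks;

    /* Whole aligned extents beyond the new EOF are unmapped. */
    for (i = aligned_end; i < nblocks; i++)
        map[i] = BLK_HOLE;

    /*
     * The rest of the tail extent stays allocated. In this model it is
     * converted to unwritten only if the truncate reserved blocks.
     */
    if (resblks > 0) {
        for (i = eof_block + 1; i < aligned_end; i++)
            if (map[i] == BLK_WRITTEN)
                map[i] = BLK_UNWRITTEN;
    }
}

/* Count blocks past the EOF block that would read back old data. */
static size_t toy_stale_blocks(const char *map, size_t nblocks,
                               size_t eof_block)
{
    size_t i, count = 0;

    for (i = eof_block + 1; i < nblocks; i++)
        if (map[i] == BLK_WRITTEN)
            count++;
    return count;
}
```

With 16 written blocks, an rtextsize of 8 and the new EOF in block 3, a call with resblks == 0 leaves blocks 4 to 7 written beyond EOF, which is the stale range marked S in the last diagram. The same call with a non-zero reservation leaves them unwritten, so a later extending write cannot expose them.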
    
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Link: https://lore.kernel.org/r/20240618142112.1315279-2-yi.zhang@huaweicloud.com
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
xfs_iops.c 35 KB