    xfs: fix rm_offset flag handling in rmap keys · 08c987de
    Darrick J. Wong authored
    Keys for extent interval records in the reverse mapping btree are
    supposed to be computed as follows:
    
    (physical block, owner, fork, is_btree, offset)
    
    This provides users the ability to look up a reverse mapping from a file
    block mapping record -- start with the physical block; then if there are
    multiple records for the same block, move on to the owner; then the
    inode fork type; and so on to the file offset.
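The intended ordering is a plain lexicographic comparison over the five fields. A minimal sketch, assuming a hypothetical unpacked key struct (the names `sketch_rmap_key` and `sketch_rmap_key_cmp` are illustrative and are not the kernel's on-disk format or API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical unpacked key; field names are illustrative only. */
struct sketch_rmap_key {
	uint64_t startblock;	/* physical block */
	uint64_t owner;		/* owning inode or metadata owner */
	bool	 attr_fork;	/* false = data fork, true = attr fork */
	bool	 is_btree;	/* block belongs to a bmbt */
	uint64_t offset;	/* file offset */
};

/* Three-way compare helper: returns -1, 0 or 1. */
#define SKETCH_CMP(a, b)	(((a) > (b)) - ((a) < (b)))

/*
 * Compare two keys field by field in the order
 * (physical block, owner, fork, is_btree, offset).
 * A later field is consulted only when all earlier fields are equal.
 */
int sketch_rmap_key_cmp(const struct sketch_rmap_key *a,
			const struct sketch_rmap_key *b)
{
	int d;

	if ((d = SKETCH_CMP(a->startblock, b->startblock)) != 0)
		return d;
	if ((d = SKETCH_CMP(a->owner, b->owner)) != 0)
		return d;
	if ((d = SKETCH_CMP(a->attr_fork, b->attr_fork)) != 0)
		return d;
	if ((d = SKETCH_CMP(a->is_btree, b->is_btree)) != 0)
		return d;
	return SKETCH_CMP(a->offset, b->offset);
}
```

With this ordering, two records for the same block and owner that differ only in fork type sort as distinct keys, which is what a lookup starting from a file block mapping record relies on.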
    
    Unfortunately, the code that creates rmap lookup keys from rmap records
    forgot to mask off the record attribute flags, leading to ondisk keys
    that look like this:
    
    (physical block, owner, fork, is_btree, unwritten state, offset)
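The missing mask can be illustrated as follows. This is a sketch only: it assumes the fork, bmbt and unwritten flags occupy the top three bits of a 64-bit packed offset, and the `SKETCH_*` macros and both helper functions are hypothetical names, not the kernel's.

```c
#include <stdint.h>

/* Assumed flag layout, for illustration only. */
#define SKETCH_OFF_ATTR_FORK	(UINT64_C(1) << 63)
#define SKETCH_OFF_BMBT_BLOCK	(UINT64_C(1) << 62)
#define SKETCH_OFF_UNWRITTEN	(UINT64_C(1) << 61)

/* Buggy behaviour: the record's packed offset is copied into the key as-is. */
uint64_t sketch_key_offset_unmasked(uint64_t rec_offset)
{
	return rec_offset;
}

/*
 * Intended behaviour: the unwritten state is a record attribute, not part
 * of the key, so it is cleared. Fork and bmbt bits are kept.
 */
uint64_t sketch_key_offset_masked(uint64_t rec_offset)
{
	return rec_offset & ~SKETCH_OFF_UNWRITTEN;
}
```

For a record whose packed offset is `SKETCH_OFF_UNWRITTEN | 42`, the unmasked variant leaks the unwritten bit into the key, while the masked variant yields a key offset of 42.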
    
    Fortunately, this has all worked ok for the past six years because the
    key comparison functions incorrectly ignore the fork/bmbt/unwritten
    information that's encoded in the on-disk offset.  This means that
    lookup comparisons are only done with:
    
    (physical block, owner, offset)
    
    Queries can (theoretically) return incorrect results because of this
    omission.  On consistent filesystems this isn't an issue because xattr
    and bmbt blocks cannot be shared and hence the comparisons succeed
    purely on the contents of the rm_startblock field.  For the one case
    where we support sharing (written data fork blocks) all flag bits are
    zero, so the omission in the comparison has no ill effects.
    
    Unfortunately, this bug prevents scrub from detecting incorrect fork and
    bmbt flag bits in the rmap btree, so we really do need to fix the
    compare code.  Old filesystems with the unwritten bit erroneously set in
    the rmap key struct will work fine on new kernels since we still ignore
    the unwritten bit.  New filesystems on older kernels will work fine
    since the old kernels never paid attention to the unwritten bit.
    
    A previous version of this patch forgot to keep the (un)written state
    flag masked during the comparison and caused a major regression in
    5.9.x since unwritten extent conversion can update an rmap record
    without requiring key updates.
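Because unwritten extent conversion can flip the unwritten bit in a record without touching the key, a comparison has to ignore that bit while still honouring the fork and bmbt bits. A minimal sketch of such an offset comparison, again assuming the three flags sit in the top bits of a 64-bit packed offset (all names below are hypothetical):

```c
#include <stdint.h>

/* Assumed flag layout, for illustration only. */
#define SKETCH_OFF_ATTR_FORK	(UINT64_C(1) << 63)
#define SKETCH_OFF_BMBT_BLOCK	(UINT64_C(1) << 62)
#define SKETCH_OFF_UNWRITTEN	(UINT64_C(1) << 61)

/* Strip the unwritten bit; keep fork, bmbt and the file offset. */
uint64_t sketch_offset_for_cmp(uint64_t packed_offset)
{
	return packed_offset & ~SKETCH_OFF_UNWRITTEN;
}

/* Three-way compare of two packed offsets, ignoring the unwritten bit. */
int sketch_offset_cmp(uint64_t a, uint64_t b)
{
	uint64_t x = sketch_offset_for_cmp(a);
	uint64_t y = sketch_offset_for_cmp(b);

	return (x > y) - (x < y);
}
```

Under this assumed layout, a key written with the unwritten bit set compares equal to the same key without it, while a key with a different fork or bmbt bit compares as different.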
    
    Note that blocks cannot go directly from data fork to attr fork without
    being deallocated and reallocated, nor can they be added to or removed
    from a bmbt without a free/alloc cycle, so this should not cause any
    regressions.
    
    Found by fuzzing keys[1].attrfork = ones on xfs/371.
    
    Fixes: 4b8ed677 ("xfs: add rmap btree operations")
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
xfs_rmap_btree.c 18.4 KB