• Dave Chinner's avatar
    xfs: fix log recovery transaction item reordering · a775ad77
    Dave Chinner authored
    There are several constraints that inode allocation and unlink
    logging impose on log recovery. These all stem from the fact that
    inode alloc/unlink are logged in buffers, but all other inode
    changes are logged in inode items. Hence there are ordering
    constraints that recovery must follow to ensure the correct result
    occurs.
    
    As it turns out, this ordering has been working mostly by chance
    than good management. The existing code moves all buffers except
    cancelled buffers to the head of the list, and everything else to
    the tail of the list. The problem with this is that is interleaves
    inode items with the buffer cancellation items, and hence whether
    the inode item in an cancelled buffer gets replayed is essentially
    left to chance.
    
    Further, this ordering causes problems for log recovery when inode
    CRCs are enabled. It typically replays the inode unlink buffer long before
    it replays the inode core changes, and so the CRC recorded in an
    unlink buffer is going to be invalid and hence any attempt to
    validate the inode in the buffer is going to fail. Hence we really
    need to enforce the ordering that the inode alloc/unlink code has
    expected log recovery to have since inode chunk de-allocation was
    introduced back in 2003...
    Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
    Reviewed-by: default avatarMark Tinguely <tinguely@sgi.com>
    Signed-off-by: default avatarBen Myers <bpm@sgi.com>
    a775ad77
xfs_log_recover.c 114 KB