    xfs: unpin stale inodes directly in IOP_COMMITTED · 1316d4da
    Dave Chinner authored
    When inodes are marked stale in a transaction, they are treated
    specially when the inode log item is being inserted into the AIL.
    It tries to avoid moving the log item forward in the AIL due to a
    race condition with writing the underlying buffer back to disk.
    This was "fixed" in commit de25c181 ("xfs: avoid moving stale inodes
    in the AIL").
    
    To avoid moving the item forward, we return a LSN smaller than the
    commit_lsn of the completing transaction, thereby trying to trick
    the commit code into not moving the inode forward at all. I'm not
    sure this ever worked as intended - it assumes the inode is already
    in the AIL, but I don't think the returned LSN would have been small
    enough to prevent moving the inode. It appears that the reason it
    worked is that the lower LSN of the inodes meant they were inserted
    into the AIL and flushed before the inode buffer (which was moved to
    the commit_lsn of the transaction).
    
    The big problem is that with delayed logging, the returning of the
    different LSN means insertion takes the slow, non-bulk path.  Worse
    yet is that insertion is to a position -before- the commit_lsn so it
    is doing an AIL traversal on every insertion, and has to walk over
    all the items that have already been inserted into the AIL. It's
    expensive.
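
The cost described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it models the AIL as a small array kept sorted by LSN and searched from the tail, which is a simplification and not the kernel's data structure. `toy_ail` and `toy_ail_insert` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t lsn_t;              /* hypothetical stand-in for xfs_lsn_t */

#define TOY_AIL_MAX 64

/* Toy AIL: LSNs kept in ascending order (assumption: array, not a list). */
struct toy_ail {
    lsn_t  lsn[TOY_AIL_MAX];
    size_t count;
};

/*
 * Insert an item at the given LSN, searching backwards from the tail.
 * Returns how many existing items had to be walked over to find the
 * insertion point.
 */
static size_t toy_ail_insert(struct toy_ail *ail, lsn_t lsn)
{
    size_t pos = ail->count;
    size_t walked = 0;

    assert(ail->count < TOY_AIL_MAX);

    /* Step over every item with a larger LSN, shifting it up one slot. */
    while (pos > 0 && ail->lsn[pos - 1] > lsn) {
        ail->lsn[pos] = ail->lsn[pos - 1];
        pos--;
        walked++;
    }
    ail->lsn[pos] = lsn;
    ail->count++;
    return walked;
}
```

In this model, items inserted at the commit_lsn land at the tail with no walking, while each item inserted at a smaller LSN has to step over everything from the same checkpoint that is already in place.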
    
    To compound the matter further, with delayed logging inodes are
    likely to go from clean to stale in a single checkpoint, which means
    they aren't even in the AIL at all when we come across them at AIL
    insertion time. Hence these were all getting inserted into the AIL
    when they simply do not need to be as inodes marked XFS_ISTALE are
    never written back.
    
    Transactional/recovery integrity is maintained in this case by the
    other items in the unlink transaction that were modified (e.g. the
    AGI btree blocks) and committed in the same checkpoint.
    
    So to fix this, simply unpin the stale inodes directly in
    xfs_inode_item_committed() and return -1 to indicate that the AIL
    insertion code does not need to do any further processing of these
    inodes.
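
The control flow of the fix can be sketched roughly as follows. This is a self-contained userspace model, not the kernel code: only `xfs_inode_item_committed()`, `XFS_ISTALE` and the -1 return value come from the description above, and every `sketch_` name, the struct layout and the caller are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t lsn_t;                  /* hypothetical stand-in for xfs_lsn_t */

#define SKETCH_ISTALE 0x1u              /* hypothetical stand-in for XFS_ISTALE */

/* Minimal model of an inode log item; fields are assumptions. */
struct sketch_inode_item {
    unsigned int flags;
    int          pin_count;
    bool         in_ail;
    lsn_t        ail_lsn;
};

static void sketch_unpin(struct sketch_inode_item *item)
{
    assert(item->pin_count > 0);
    item->pin_count--;
}

/*
 * Models the committed callback: a stale inode is unpinned here and -1
 * is returned so the caller does no further processing of it.
 */
static lsn_t sketch_item_committed(struct sketch_inode_item *item,
                                   lsn_t commit_lsn)
{
    if (item->flags & SKETCH_ISTALE) {
        sketch_unpin(item);
        return (lsn_t)-1;
    }
    return commit_lsn;
}

/* Models the commit path for one item (hypothetical caller). */
static void sketch_commit_one(struct sketch_inode_item *item,
                              lsn_t commit_lsn)
{
    lsn_t item_lsn = sketch_item_committed(item, commit_lsn);

    if (item_lsn == (lsn_t)-1)
        return;                         /* already unpinned, skip the AIL */

    item->in_ail = true;                /* normal path: insert, then unpin */
    item->ail_lsn = item_lsn;
    sketch_unpin(item);
}
```

In this model a stale item ends up unpinned and never enters the AIL, while any other item is inserted at the commit_lsn and so stays on the bulk path.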
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Alex Elder <aelder@sgi.com>