    xfs: fix inode lookup race · f30d500f
    Dave Chinner authored
    When we get concurrent lookups of the same inode that is not in the
    per-AG inode cache, there is a race condition that triggers warnings
    in unlock_new_inode() indicating that we are initialising an inode
    that isn't in the correct state for a new inode.
    
    When we do an inode lookup via a file handle or a bulkstat, we don't
    serialise lookups at a higher level through the dentry cache (i.e.
    pathless lookup), and so we can get concurrent lookups of the same
    inode.
    
    The race condition is between the insertion of the inode into the
    cache in the case of a cache miss and a concurrent lookup:
    
    Thread 1			Thread 2
    xfs_iget()
      xfs_iget_cache_miss()
        xfs_iread()
        lock radix tree
        radix_tree_insert()
    				rcu_read_lock
    				radix_tree_lookup
    				lock inode flags
    				XFS_INEW not set
    				igrab()
    				unlock inode flags
    				rcu_read_unlock
    				use uninitialised inode
    				.....
        lock inode flags
        set XFS_INEW
        unlock inode flags
        unlock radix tree
      xfs_setup_inode()
        inode flags = I_NEW
        unlock_new_inode()
          WARNING as inode flags != I_NEW
    
    This can lead to inode corruption, inode list corruption, etc, and
    is generally a bad thing to occur.
    
    Fix this by setting XFS_INEW before inserting the inode into the
    radix tree. This will ensure any concurrent lookup will find the new
    inode with XFS_INEW set and that forces the lookup to wait until the
    XFS_INEW flag is removed before allowing the lookup to succeed.
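
The ordering argument above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `model_inode`, `cache_slot`, `MODEL_INEW`, `model_lookup()` and the two insert helpers are hypothetical names invented for the example. A single pointer stands in for the per-AG radix tree, and the "concurrent" lookup is simulated by calling it at the point where Thread 2 would run in the diagram.

```c
#include <stddef.h>

#define MODEL_INEW 0x1u              /* stand-in for XFS_INEW (assumption) */

struct model_inode {
    unsigned int flags;
};

/* One-slot stand-in for the per-AG radix tree (assumption). */
static struct model_inode *cache_slot;

enum lookup_result {
    LOOKUP_MISS,                     /* nothing in the cache */
    LOOKUP_MUST_WAIT,                /* found, but still marked new */
    LOOKUP_HIT                       /* found and treated as initialised */
};

/* Lookup side: only the "new" flag tells it the inode is not ready. */
static enum lookup_result model_lookup(void)
{
    struct model_inode *ip = cache_slot;

    if (ip == NULL)
        return LOOKUP_MISS;
    if (ip->flags & MODEL_INEW)
        return LOOKUP_MUST_WAIT;
    return LOOKUP_HIT;
}

/* Old ordering: insert first, set the flag afterwards. */
static enum lookup_result insert_then_flag(struct model_inode *ip)
{
    enum lookup_result seen;

    cache_slot = ip;                 /* inode becomes visible */
    seen = model_lookup();           /* simulated concurrent lookup */
    ip->flags |= MODEL_INEW;         /* too late for that lookup */
    return seen;
}

/* Fixed ordering: set the flag first, then insert. */
static enum lookup_result flag_then_insert(struct model_inode *ip)
{
    ip->flags |= MODEL_INEW;         /* marked new before it is visible */
    cache_slot = ip;
    return model_lookup();           /* simulated concurrent lookup */
}
```

In this model, a lookup that lands in the window of `insert_then_flag()` gets `LOOKUP_HIT` on an inode that is still being set up, while the same lookup against `flag_then_insert()` gets `LOOKUP_MUST_WAIT`. The model leaves out locking, RCU and the retry loop entirely; it only shows why the flag has to be set before the inode becomes visible.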
    
    cc: <stable@vger.kernel.org> # for 3.0.x, 3.2.x
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ben Myers <bpm@sgi.com>
xfs_iget.c 19.7 KB