    xfs: fix low space alloc deadlock · 1dd0510f
    Dave Chinner authored
    I've recently encountered an ABBA deadlock with g/476. The upcoming
    changes seem to make this much easier to hit, but the underlying
    problem is a pre-existing one.
    
    Essentially, if we select an AG for allocation, then lock the AGF
    and then fail to allocate for some reason (e.g. minimum length
    requirements cannot be satisfied), then we drop out of the
    allocation with the AGF still locked.
    
    The caller then modifies the allocation constraints - usually
    loosening them up - and tries again. This can result in trying to
    access AGFs that are lower than the AGF we already have locked from
    the failed attempt. e.g. the failed attempt skipped several AGs
    before failing, so we have locked an AG higher than the start AG.
    Retrying the allocation from the start AG then causes us to violate
    AGF lock ordering and this can lead to deadlocks.
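    
    As a rough illustration of the ordering rule being violated, here
    is a toy userspace model. It is not XFS code: struct toy_tx,
    toy_note_locked() and toy_blocking_lock_is_safe() are made-up
    names, and the model assumes only that AGFs must be locked in
    ascending AG order within a transaction.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_AG_LOCKED (-1)

/* Hypothetical stand-in for a transaction: tracks the highest AGF held. */
struct toy_tx {
	int highest_locked_ag;	/* TOY_NO_AG_LOCKED when no AGF is held */
};

/*
 * Record that the AGF for agno is now locked. In the scenario described
 * above the lock stays held even when the allocation attempt fails.
 */
static void toy_note_locked(struct toy_tx *tx, int agno)
{
	assert(agno >= 0);
	if (agno > tx->highest_locked_ag)
		tx->highest_locked_ag = agno;
}

/*
 * A blocking lock on agno respects ascending order only if agno is not
 * below an AGF this transaction already holds.
 */
static bool toy_blocking_lock_is_safe(const struct toy_tx *tx, int agno)
{
	if (tx->highest_locked_ag == TOY_NO_AG_LOCKED)
		return true;
	return agno >= tx->highest_locked_ag;
}
```

    In this model, a failed attempt that started at AG 0 but ended up
    locking AG 3 makes a blocking retry from AG 0 unsafe, which is the
    ABBA pattern described above.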
    
    The deadlock exists even if allocation succeeds - we can do
    followup allocations in the same transaction for BMBT blocks that
    aren't guaranteed to be in the same AG as the original, and can move
    into higher AGs. Hence we really need to move the tp->t_firstblock
    tracking down into xfs_alloc_vextent() where it can be set when we
    exit with a locked AG.
    
    xfs_alloc_vextent() can also check there if the requested
    allocation falls within the allowed range of AGs set by
    tp->t_firstblock. If we can't allocate within the range set, we have
    to fail the allocation. If we are allowed to do non-blocking AGF
    locking, we can ignore the AG locking order limitations as we can
    use try-locks for the first iteration over requested AG range.
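    
    The decision described here can be sketched as a small standalone
    function. This is only a model under stated assumptions, not the
    actual xfs_alloc_vextent() logic: toy_check_ag(), enum toy_verdict
    and the min_agno parameter (standing in for the AG limit derived
    from tp->t_firstblock) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of checking a requested AG against the limit. */
enum toy_verdict {
	TOY_BLOCKING_OK,	/* AG is within the allowed range */
	TOY_TRYLOCK_ONLY,	/* AG is below the limit, try-lock only */
	TOY_FAIL,		/* AG is below the limit, must fail */
};

/*
 * min_agno: lowest AG a blocking lock may target, or -1 if the
 *           transaction holds no AGF yet (assumed encoding).
 * agno:     AG the allocation wants to use.
 * trylock_allowed: whether non-blocking AGF locking is permitted.
 */
static enum toy_verdict toy_check_ag(int min_agno, int agno,
				     bool trylock_allowed)
{
	assert(agno >= 0);
	if (min_agno < 0 || agno >= min_agno)
		return TOY_BLOCKING_OK;
	/* A try-lock never sleeps, so it cannot complete an ABBA cycle. */
	if (trylock_allowed)
		return TOY_TRYLOCK_ONLY;
	return TOY_FAIL;
}
```

    The TOY_TRYLOCK_ONLY case is why an allocation can still land in an
    AG below the limit: a try-lock that fails simply moves on rather
    than waiting on a lock held by another transaction.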
    
    This invalidates a set of post allocation asserts that check that
    the allocation is always above tp->t_firstblock if it is set.
    Because we can use try-locks to avoid the deadlock in some
    circumstances, having a pre-existing locked AGF doesn't always
    prevent allocation from lower order AGFs. Hence those ASSERTs need
    to be removed.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>