    bcachefs: more aggressive fast path write buffer key flushing · 873555f0
    Brian Foster authored
    The btree write buffer flush code is prone to causing journal
    deadlock because it uses and releases journal reservation space
    inefficiently. Journal space is not pre-reserved for write buffered
    keys (as is done for key cache keys, for example), because the
    write buffer flush side uses a fast path that attempts insertion
    without needing any reservation at all.
    
    The write buffer flush attempts to deal with this by inserting keys
    with the BTREE_INSERT_JOURNAL_RECLAIM flag, which makes a journal
    reservation that would otherwise block return an error instead. On
    the first such error, it falls back to a slow path that inserts in
    journal order and supports moving the associated journal pin
    forward.
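
The pre-change behavior described above can be sketched as a toy model. This is an illustration only, not the bcachefs code: `toy_key`, `toy_insert_nonblocking` and `toy_flush_break_on_error` are hypothetical names, and journal space is reduced to a simple counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a write buffered key; not a bcachefs type. */
struct toy_key {
	unsigned long journal_seq;	/* journal entry pinning this key */
	bool needs_res;			/* insert would need journal space */
	bool flushed;
};

/* Models a non-blocking insert: fails rather than waiting for space. */
static bool toy_insert_nonblocking(struct toy_key *k, size_t *free_res)
{
	assert(free_res != NULL);

	if (k->needs_res) {
		if (*free_res == 0)
			return false;	/* would block: report an error */
		(*free_res)--;
	}
	k->flushed = true;
	return true;
}

/*
 * Fast path that stops at the first failure. Returns the number of keys
 * left behind for the slow path, including keys that were never tried.
 */
static size_t toy_flush_break_on_error(struct toy_key *keys, size_t nr,
				       size_t *free_res)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!toy_insert_nonblocking(&keys[i], free_res))
			break;
	return nr - i;
}
```

In this model, a single early failure leaves every later key for the slow path, even keys that would not have needed any journal space.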
    
    The problem is that under pathological conditions (i.e. smaller log,
    larger write buffer and journal reservation pressure), we've seen
    instances where the fast path fails fairly quickly, having
    completed few insertions, and the slow path is then unable to push
    the journal pin forward far enough to free up the space it needs to
    completely flush the buffer. This problem is occasionally reproduced
    by fstest generic/333.
    
    To avoid this problem, update the fast path algorithm to skip key
    inserts that fail because the needed journal reservation cannot be
    acquired, rather than breaking out of the loop on the first such
    failure. Insert as many keys as possible, zap the sequence numbers
    of the inserted keys to mark them as processed, and then fall back
    to the slow path to process the remaining set in journal order.
    This reduces the amount of journal reservation that might be
    required to flush the entire buffer and increases the odds that the
    slow path is able to move the journal pin forward and free up space
    as keys are processed.
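
The skip-and-continue approach can be sketched as follows. This is a minimal model under stated assumptions, not the actual patch: `toy_key`, `toy_fast_path` and `toy_slow_path` are hypothetical names, journal space is a plain counter, and the slow path inserts are assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a write buffered key; not a bcachefs type. */
struct toy_key {
	unsigned long pos;		/* btree position (fast path order) */
	unsigned long journal_seq;	/* 0 means "already processed" */
	bool needs_res;			/* insert would need journal space */
};

static int toy_cmp_seq(const void *l, const void *r)
{
	const struct toy_key *a = l, *b = r;

	return (a->journal_seq > b->journal_seq) -
	       (a->journal_seq < b->journal_seq);
}

/*
 * Fast path: try every key. On a would-block failure, skip the key and
 * keep going. Returns how many keys still need the slow path.
 */
static size_t toy_fast_path(struct toy_key *keys, size_t nr, size_t *free_res)
{
	size_t skipped = 0;

	for (size_t i = 0; i < nr; i++) {
		if (keys[i].needs_res && *free_res == 0) {
			skipped++;		/* leave it for the slow path */
			continue;
		}
		if (keys[i].needs_res)
			(*free_res)--;
		keys[i].journal_seq = 0;	/* zap: mark as processed */
	}
	return skipped;
}

/*
 * Slow path: sort by journal sequence, skip zapped keys, and advance a
 * toy "pin" as each remaining key is flushed. Returns the final pin.
 */
static unsigned long toy_slow_path(struct toy_key *keys, size_t nr)
{
	unsigned long pin = 0;

	qsort(keys, nr, sizeof(keys[0]), toy_cmp_seq);
	for (size_t i = 0; i < nr; i++) {
		if (!keys[i].journal_seq)
			continue;		/* handled by the fast path */
		assert(keys[i].journal_seq >= pin);	/* journal order */
		pin = keys[i].journal_seq;
		keys[i].journal_seq = 0;
	}
	return pin;
}
```

Because zapped keys carry sequence number 0, they sort to the front and are skipped, so the slow path in this model only touches the keys the fast path could not insert.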
    
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
btree_write_buffer.c 8.9 KB