    xfs: cache last bitmap block in realtime allocator · e94b53ff
    Omar Sandoval authored
    Profiling a workload on a highly fragmented realtime device showed a ton
    of CPU cycles being spent in xfs_trans_read_buf() called by
    xfs_rtbuf_get(). Further tracing showed that much of that was repeated
    calls to xfs_rtbuf_get() for the same block of the realtime bitmap.
    These come from xfs_rtallocate_extent_block(): as it walks through
    ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and
    xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block
    is very fragmented, then this is _a lot_ of buffer lookups.
    
    The realtime allocator already passes around a cache of the last used
    realtime summary block to avoid repeated reads (the parameters rbpp and
    rsb). We can do the same for the realtime bitmap.
    
    This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches
    the most recently used block for both the realtime bitmap and summary.
    xfs_rtbuf_get() now handles the caching instead of the callers, which
    requires plumbing xfs_rtbuf_cache to more functions but also makes sure
    we don't miss anything.

    Signed-off-by: Omar Sandoval <osandov@fb.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
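
The caching idea described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: struct blk_buf, read_block(), release_block(), rtbuf_get(), rtbuf_cache_relse() and demo_reads() are hypothetical stand-ins, and the layout of struct rtbuf_cache here (one slot for the bitmap, one for the summary) is an assumption based only on the description above. It shows the lookup routine owning the cache, so that repeated requests for the same block cost a single read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not the kernel's struct xfs_buf. */
struct blk_buf {
	long blkno;	/* block number this buffer currently holds */
	int  held;	/* 1 while a reference is outstanding */
};

/* Assumed shape: most recently used block for bitmap and for summary. */
struct rtbuf_cache {
	struct blk_buf *bbuf;	/* cached bitmap block, or NULL */
	struct blk_buf *sbuf;	/* cached summary block, or NULL */
};

static int expensive_reads;	/* counts simulated buffer lookups */
static struct blk_buf pool[2];	/* one backing buffer per block kind */

/* Simulates the costly path (the buffer read seen in the profile). */
static struct blk_buf *read_block(int is_summary, long blkno)
{
	struct blk_buf *bp = &pool[is_summary];

	assert(!bp->held);	/* previous reference must have been dropped */
	bp->blkno = blkno;
	bp->held = 1;
	expensive_reads++;
	return bp;
}

static void release_block(struct blk_buf *bp)
{
	bp->held = 0;
}

/* The lookup routine handles caching itself, so callers cannot forget it. */
static struct blk_buf *rtbuf_get(struct rtbuf_cache *c, int is_summary,
				 long blkno)
{
	struct blk_buf **slot = is_summary ? &c->sbuf : &c->bbuf;

	if (*slot && (*slot)->blkno == blkno)
		return *slot;		/* cache hit: no read needed */
	if (*slot)
		release_block(*slot);	/* evict the previously cached block */
	*slot = read_block(is_summary, blkno);
	return *slot;
}

/* Drop whatever the cache still holds once the allocation is finished. */
static void rtbuf_cache_relse(struct rtbuf_cache *c)
{
	if (c->bbuf) {
		release_block(c->bbuf);
		c->bbuf = NULL;
	}
	if (c->sbuf) {
		release_block(c->sbuf);
		c->sbuf = NULL;
	}
}

/*
 * Walk nranges free ranges that all live in bitmap block 7, touching the
 * block twice per range, then move to bitmap block 8 and summary block 3.
 * Returns the number of simulated reads performed.
 */
static int demo_reads(int nranges)
{
	struct rtbuf_cache cache = { NULL, NULL };

	expensive_reads = 0;
	for (int i = 0; i < nranges; i++) {
		rtbuf_get(&cache, 0, 7);	/* e.g. checking a range */
		rtbuf_get(&cache, 0, 7);	/* e.g. searching forward */
	}
	rtbuf_get(&cache, 0, 8);
	rtbuf_get(&cache, 1, 3);
	rtbuf_cache_relse(&cache);
	return expensive_reads;
}
```

In this sketch the number of reads depends on how many distinct blocks are visited, not on how many ranges are walked within a block, which is the effect the commit is after for a heavily fragmented bitmap block.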
xfs_rtalloc.c 37.6 KB