    [PATCH] writeback scalability improvements · e64fa3db
    Andrew Morton authored
    The kernel has a number of problems wrt heavy write traffic to multiple
    spindles.  What keeps on happening is that all processes which are
    responsible for writeback get blocked on one of the queues and all the
    others fall idle.
    
    This happens in the balance_dirty_pages() path (balance_dirty() in 2.4)
    and in the page reclaim code, when a dirty page is found on the LRU.
    
    The latter is particularly bad because it causes "innocent" processes
    to be suspended for long periods due to the activity of heavy writers.
    
    The general idea is: the primary resource for writeback should be the
    process which is dirtying memory.  The secondary resource is the
    pdflush pool (although this is mainly for providing async writeback in
    the presence of light-moderate loads).  And the final
    oh-gee-we-screwed-up resource for writeback is a caller to
    shrink_cache().
    
    This patch addresses the balance_dirty_pages() path.  This code was
    initially modelled on the 2.4 writeback scheme: throttled processes
    write back all data regardless of its queue.  The patch changes this
    so that the balance_dirty_pages() caller only writes back pages
    which are dirty against the queue which that caller just dirtied.
    
    So the effect is a better allocation of writeback resources across the
    queues and increased parallelism.
    
    The per-queue writeback is implemented by using
    mapping->backing_dev_info as a search key during the walk across the
    superblocks and inodes.
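
    As a rough illustration of the search-key idea, here is a minimal
    userspace sketch, not the kernel code.  The struct layouts, the
    list linkage and the writeback_for_queue() helper are all
    hypothetical stand-ins; only the comparison against
    mapping->backing_dev_info is taken from the description above.
    Passing a NULL key models the old behaviour of writing back
    everything regardless of queue.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct backing_dev_info {
    const char *name;                 /* identifies one request queue */
};

struct address_space {
    struct backing_dev_info *backing_dev_info;
    int nr_dirty;                     /* dirty pages in this mapping */
};

struct inode {
    struct address_space mapping;
    struct inode *next;               /* next dirty inode on this superblock */
};

struct super_block {
    struct inode *dirty_inodes;
    struct super_block *next;
};

/*
 * Walk every superblock and every dirty inode, but only "write back"
 * mappings whose backing_dev_info matches the search key.  A NULL key
 * means "any queue".  Returns the number of pages written.
 */
static int writeback_for_queue(struct super_block *sbs,
                               struct backing_dev_info *bdi)
{
    int written = 0;

    for (struct super_block *sb = sbs; sb != NULL; sb = sb->next) {
        for (struct inode *inode = sb->dirty_inodes; inode != NULL;
             inode = inode->next) {
            struct address_space *mapping = &inode->mapping;

            /* Skip inodes which are dirty against some other queue. */
            if (bdi != NULL && mapping->backing_dev_info != bdi)
                continue;

            written += mapping->nr_dirty;
            mapping->nr_dirty = 0;    /* model: the pages are now clean */
        }
    }
    return written;
}
```

    In this model a throttled writer would pass the backing_dev_info of
    the mapping it just dirtied, so it never blocks on a queue it did
    not contribute to.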
    
    The patch also fixes an initialisation problem in
    block_dev.c:do_open(): it was setting up the blockdev's
    mapping->backing_dev_info too early, before the queue has been
    identified.
    
    Generally, this patch doesn't help much, because of the stalls in the
    page allocator.  I have a patch which mostly fixes that up, and taken
    together the kernel is achieving almost platter speed against six
    spindles, but only when the system has a small amount of memory.  More
    work is needed there.