    vmscan: do not unconditionally treat zones that fail zone_reclaim() as full · fa5e084e
    Mel Gorman authored
    On NUMA machines, the administrator can configure zone_reclaim_mode, which
    is a more targeted form of direct reclaim.  On machines with large NUMA
    distances, for example, zone_reclaim_mode defaults to 1, meaning that
    clean unmapped pages will be reclaimed if the zone watermarks are not
    being met.  The problem is that any failure of zone_reclaim() causes the
    zone to be marked full.
    
    This can cause situations where a zone is usable, but is being skipped
    because it has been considered full.  Take a situation where a large tmpfs
    mount is occupying a large percentage of memory overall.  The pages do not
    get cleaned or reclaimed by zone_reclaim(), but the zone gets marked full
    and the zonelist cache considers it not worth trying in the future.
    
    This patch makes zone_reclaim() return more fine-grained information about
    what occurred when zone_reclaim() failed.  The zone only gets marked full
    if it really is unreclaimable.  If the scan did not occur, or if not
    enough pages were reclaimed with the limited reclaim_mode, then the zone
    is simply skipped.
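
The decision described above can be sketched in plain C as follows.  This
is a minimal illustration of the changelog text, not the kernel code: the
names reclaim_result, zone_action and after_zone_reclaim() are hypothetical,
and watermark_ok stands in for whatever watermark check the allocator
performs after reclaim has done some work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes a zone_reclaim()-like function could report. */
enum reclaim_result {
	RECLAIM_NOSCAN,		/* the scan did not occur */
	RECLAIM_FULL,		/* scanned, but the zone is unreclaimable */
	RECLAIM_SOME,		/* reclaimed some pages, fewer than requested */
	RECLAIM_SUCCESS		/* reclaimed as many pages as requested */
};

/* Hypothetical actions the allocator can take for the zone. */
enum zone_action {
	ZONE_SKIP,		/* try the next zone; do not mark this one full */
	ZONE_MARK_FULL,		/* record the zone as full in the zonelist cache */
	ZONE_ALLOCATE		/* attempt the allocation from this zone */
};

/*
 * Map a reclaim outcome to an allocator action.  Only a truly
 * unreclaimable zone is marked full; whenever reclaim did some work,
 * the watermarks are rechecked before an allocation is attempted.
 */
enum zone_action after_zone_reclaim(enum reclaim_result res, bool watermark_ok)
{
	switch (res) {
	case RECLAIM_NOSCAN:
		return ZONE_SKIP;
	case RECLAIM_FULL:
		return ZONE_MARK_FULL;
	case RECLAIM_SOME:
	case RECLAIM_SUCCESS:
		return watermark_ok ? ZONE_ALLOCATE : ZONE_SKIP;
	}
	assert(!"unknown reclaim_result");
	return ZONE_SKIP;
}
```

For example, after_zone_reclaim(RECLAIM_SOME, false) yields ZONE_SKIP, so
a zone dominated by tmpfs pages is passed over for this allocation without
being recorded as full.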
    
    There is a side-effect to this patch.  Currently, if zone_reclaim()
    successfully reclaimed SWAP_CLUSTER_MAX pages, an allocation attempt would go
    ahead.  With this patch applied, zone watermarks are rechecked after
    zone_reclaim() does some work.
    
    This bug was introduced by commit 9276b1bc
    ("memory page_alloc zonelist caching speedup") way back in 2.6.19 when the
    zonelist_cache was introduced.  It was not intended that zone_reclaim()
    aggressively consider the zone to be full when it failed, as full direct
    reclaim can still be an option.  Due to the age of the bug, it should be
    considered a -stable candidate.
    
    Signed-off-by: Mel Gorman <mel@csn.ul.ie>
    Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
    Reviewed-by: Rik van Riel <riel@redhat.com>
    Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Cc: Christoph Lameter <cl@linux-foundation.org>
    Cc: <stable@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>