1. 09 Jul, 2011 1 commit
    • mm: vmscan: correct check for kswapd sleeping in sleeping_prematurely · 08951e54
      Mel Gorman authored
      During allocator-intensive workloads, kswapd will be woken frequently
      causing free memory to oscillate between the high and min watermark.  This
      is expected behaviour.  Unfortunately, if the highest zone is small, a
      problem occurs.
      
      This seems to happen most with recent Sandy Bridge laptops but it's
      probably a coincidence as some of these laptops just happen to have a
      small Normal zone.  The problem is almost always reproduced while copying
      large files: kswapd pegs at 100% CPU until the file is deleted or the
      cache is dropped.
      
      The problem is mostly down to sleeping_prematurely() keeping kswapd awake
      when the highest zone is small and unreclaimable.  It is compounded by the
      fact that we shrink slabs even when not shrinking zones, causing a lot of
      time to be spent in shrinkers and a lot of memory to be reclaimed.
      
      Patch 1 corrects sleeping_prematurely to check the zones matching
      	the classzone_idx instead of all zones.
      
      Patch 2 avoids shrinking slab when we are not shrinking a zone.
      
      Patch 3 notes that sleeping_prematurely is checking lower zones against
      	a high classzone, which is not what allocators or balance_pgdat()
      	are doing, leading to an artificial belief that kswapd should
      	still be awake.
      
      Patch 4 notes that when balance_pgdat() gives up on a high zone, the
      	decision is not communicated to sleeping_prematurely().
      
      This problem affects 2.6.38.8 for certain and is expected to affect 2.6.39
      and 3.0-rc4 as well.  If accepted, the patches need to go to -stable to be
      picked up by distros.  This series is against 3.0-rc4.  I've cc'd people that
      reported similar problems recently to see if they still suffer from the
      problem and if this fixes it.
      
      This patch: correct the check for kswapd sleeping in sleeping_prematurely()
      
      During allocator-intensive workloads, kswapd will be woken frequently
      causing free memory to oscillate between the high and min watermark.  This
      is expected behaviour.
      
      A problem occurs if the highest zone is small.  balance_pgdat() only
      considers unreclaimable zones when priority is DEF_PRIORITY but
      sleeping_prematurely considers all zones.  It's possible for this sequence
      to occur:
      
        1. kswapd wakes up and enters balance_pgdat()
        2. At DEF_PRIORITY, marks highest zone unreclaimable
        3. At DEF_PRIORITY-1, ignores highest zone when setting end_zone
        4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
           highest zone, clearing all_unreclaimable. Highest zone
           is still unbalanced
        5. kswapd returns and calls sleeping_prematurely
        6. sleeping_prematurely looks at *all* zones, not just the ones
           being considered by balance_pgdat. The highest small zone
           has all_unreclaimable cleared but the zone is not
           balanced. all_zones_ok is false so kswapd stays awake
      
      This patch corrects the behaviour of sleeping_prematurely to check the
      zones balance_pgdat() checked.
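
      The difference between the two checks can be sketched with a toy model.
      This is only an illustration under stated assumptions, not the kernel
      code: struct toy_zone, the toy_* helpers and the page counts are all
      hypothetical, and the real sleeping_prematurely() also deals with the
      allocation order, the remaining sleep time and proper watermark helpers,
      none of which are modelled here.  The example node is set up to resemble
      the state after step 4 above: the lower zones are balanced, while the
      small highest zone is unbalanced and has all_unreclaimable cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct zone. */
struct toy_zone {
	unsigned long free_pages;
	unsigned long high_wmark;	/* free pages needed to count as balanced */
	bool all_unreclaimable;		/* reclaim has given up on this zone */
};

static bool toy_zone_balanced(const struct toy_zone *z)
{
	return z->free_pages >= z->high_wmark;
}

/*
 * Returns true if kswapd should stay awake.  Only zones 0..last_idx are
 * examined, and zones flagged all_unreclaimable are skipped.
 */
static bool toy_should_stay_awake(const struct toy_zone *zones,
				  size_t nr_zones, size_t last_idx)
{
	assert(last_idx < nr_zones);

	for (size_t i = 0; i <= last_idx; i++) {
		if (zones[i].all_unreclaimable)
			continue;
		if (!toy_zone_balanced(&zones[i]))
			return true;
	}
	return false;
}

/* Behaviour described as the problem: every zone in the node is checked. */
static bool toy_sleeping_prematurely_all(const struct toy_zone *zones,
					 size_t nr_zones)
{
	return toy_should_stay_awake(zones, nr_zones, nr_zones - 1);
}

/* Behaviour described as the fix: stop at the highest zone that was checked. */
static bool toy_sleeping_prematurely_classzone(const struct toy_zone *zones,
					       size_t nr_zones,
					       size_t classzone_idx)
{
	return toy_should_stay_awake(zones, nr_zones, classzone_idx);
}

/* Made-up node: two balanced lower zones and a small unbalanced highest zone. */
static const struct toy_zone toy_node[3] = {
	{ .free_pages = 500, .high_wmark = 100, .all_unreclaimable = false },
	{ .free_pages = 900, .high_wmark = 300, .all_unreclaimable = false },
	{ .free_pages = 10,  .high_wmark = 50,  .all_unreclaimable = false },
};
```

      In this model, toy_sleeping_prematurely_all(toy_node, 3) keeps kswapd
      awake because of the small highest zone, whereas
      toy_sleeping_prematurely_classzone(toy_node, 3, 1) lets it sleep because
      only the zones up to index 1 are examined.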
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reported-by: Pádraig Brady <P@draigBrady.com>
      Tested-by: Pádraig Brady <P@draigBrady.com>
      Tested-by: Andrew Lutomirski <luto@mit.edu>
      Acked-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 08 Jul, 2011 5 commits
  3. 07 Jul, 2011 21 commits
  4. 06 Jul, 2011 13 commits