Commit def0fdae authored by Johannes Weiner, committed by Linus Torvalds

mm: memcontrol: fix NUMA round-robin reclaim at intermediate level

When a cgroup is reclaimed on behalf of a configured limit, reclaim
needs to round-robin through all NUMA nodes that hold pages of the memcg
in question.  However, when assembling the mask of candidate NUMA nodes,
the code only consults the *local* cgroup LRU counters, not the
recursive counters for the entire subtree.  Cgroup limits are frequently
configured against intermediate cgroups that do not have memory on their
own LRUs.  In this case, the node mask will always come up empty and
reclaim falls back to scanning only the current node.

If a cgroup subtree has some memory on one node but the processes are
bound to another node afterwards, the limit reclaim will never age or
reclaim that memory anymore.

To fix this, use the recursive LRU counts for a cgroup subtree to
determine which nodes hold memory of that cgroup.
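
The difference between local and recursive counts can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct toy_cgroup`, `pages_local()`, `pages_recursive()`, `scan_mask()` and `demo_masks()` are hypothetical names made up for this example, and the kernel keeps its per-node counters in the lruvec rather than walking the children as done here.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES	4
#define MAX_CHILDREN	4

/* Hypothetical stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_cgroup {
	unsigned long lru_pages[MAX_NODES];	/* pages on this group's own LRUs */
	struct toy_cgroup *children[MAX_CHILDREN];
	int nr_children;
};

/* Local count: only pages on this group's own LRUs for node nid. */
static unsigned long pages_local(const struct toy_cgroup *cg, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return cg->lru_pages[nid];
}

/* Recursive count: this group plus every descendant, for node nid. */
static unsigned long pages_recursive(const struct toy_cgroup *cg, int nid)
{
	unsigned long sum = pages_local(cg, nid);

	for (int i = 0; i < cg->nr_children; i++)
		sum += pages_recursive(cg->children[i], nid);
	return sum;
}

/* Build a bitmask with one bit set per node that holds pages. */
static unsigned int scan_mask(const struct toy_cgroup *cg, bool recursive)
{
	unsigned int mask = 0;

	for (int nid = 0; nid < MAX_NODES; nid++) {
		unsigned long n = recursive ? pages_recursive(cg, nid)
					    : pages_local(cg, nid);
		if (n)
			mask |= 1u << nid;
	}
	return mask;
}

/*
 * Intermediate group with a limit but no pages of its own; its only
 * child holds 512 pages on node 1.
 */
void demo_masks(unsigned int *local, unsigned int *recursive)
{
	struct toy_cgroup leaf = { .lru_pages = { [1] = 512 } };
	struct toy_cgroup parent = { .children = { &leaf }, .nr_children = 1 };

	*local = scan_mask(&parent, false);
	*recursive = scan_mask(&parent, true);
}
```

In this model, `demo_masks()` yields an empty mask from the local counts and a mask with only the bit for node 1 set from the recursive counts, which mirrors the case of a limit configured on an intermediate cgroup.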

The code has been broken like this forever, so it doesn't seem to be a
problem in practice.  I just noticed it while reviewing the way the LRU
counters are used in general.

Link: http://lkml.kernel.org/r/20190412151507.2769-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 42a30035
...
@@ -1507,13 +1507,13 @@ static bool test_mem_cgroup_node_reclaimable(struct mem_cgroup *memcg,
 {
 	struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
-	if (lruvec_page_state_local(lruvec, NR_INACTIVE_FILE) ||
-	    lruvec_page_state_local(lruvec, NR_ACTIVE_FILE))
+	if (lruvec_page_state(lruvec, NR_INACTIVE_FILE) ||
+	    lruvec_page_state(lruvec, NR_ACTIVE_FILE))
 		return true;
 	if (noswap || !total_swap_pages)
 		return false;
-	if (lruvec_page_state_local(lruvec, NR_INACTIVE_ANON) ||
-	    lruvec_page_state_local(lruvec, NR_ACTIVE_ANON))
+	if (lruvec_page_state(lruvec, NR_INACTIVE_ANON) ||
+	    lruvec_page_state(lruvec, NR_ACTIVE_ANON))
 		return true;
 	return false;
...