    memcg: infrastructure to flush memcg stats · aa48e47e
    Shakeel Butt authored
    At the moment memcg stats are read in four contexts:
    
    1. memcg stat user interfaces
    2. dirty throttling
    3. page fault
    4. memory reclaim
    
    Currently the kernel flushes the stats for the first two cases.  Flushing
    the stats for the remaining two cases may have a performance impact.
    Always flushing the memcg stats on the page fault code path may
    negatively impact the performance of applications.  In addition,
    flushing in the memory reclaim code path, though treated as a slowpath,
    can become a source of contention for the global lock taken for stat
    flushing, because when the system or a memcg is under memory pressure,
    many tasks may enter the reclaim path.
    
    This patch uses the following mechanisms to solve these challenges:
    
    1. Periodically flush the stats from the root memcg every 2 seconds.
       This bounds how long the stats can stay out of sync.
    
    2. Asynchronously flush the stats after a fixed number of stat updates.
       In the worst case the stats can be out of sync by O(nr_cpus * BATCH)
       for 2 seconds.
    
    3. To avoid a thundering herd of stat flushers, particularly from the
       memory reclaim context, introduce a memcg-local spinlock and allow
       only one active flusher at a time.  This could have been done with
       cgroup_rstat_lock, but that lock is used by other subsystems and by
       userspace reads of memcg stats.  So it is better to keep the flushers
       introduced by this patch decoupled from cgroup_rstat_lock.  However,
       we would have to use the irqsafe version of rstat flush, but that is
       fine as this code path flushes the whole tree and does the work for
       everyone.  No one will be waiting for that worker.
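
    The three mechanisms above can be modelled in a small userspace sketch.
    This is not the kernel code from this patch: it is a single-threaded
    illustration, and NR_CPUS, BATCH, stat_update(), try_flush() and
    pending_total() are hypothetical names chosen for the example.  A C11
    atomic_flag stands in for the memcg-local spinlock, and the flush runs
    inline at the point where the kernel would instead queue asynchronous
    work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical CPU count for this model */
#define BATCH   32  /* hypothetical per-CPU update threshold */

static long percpu_delta[NR_CPUS];    /* updates not yet folded into global_stat */
static long percpu_updates[NR_CPUS];  /* number of updates seen on each CPU */
static long global_stat;              /* value that stat readers would observe */
static atomic_flag flush_lock = ATOMIC_FLAG_INIT; /* models the memcg-local lock */

/* Sum of all updates that readers cannot see yet. */
static long pending_total(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += percpu_delta[cpu];
	return sum;
}

/*
 * Fold every per-CPU delta into the global value.  Only one flusher may be
 * active: if the lock is already held, return false immediately instead of
 * waiting, because the current holder does the work for everyone.
 * A periodic timer (every 2 seconds in the commit message) would call this too.
 */
static bool try_flush(void)
{
	if (atomic_flag_test_and_set(&flush_lock))
		return false;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		global_stat += percpu_delta[cpu];
		percpu_delta[cpu] = 0;
	}
	atomic_flag_clear(&flush_lock);
	return true;
}

/*
 * Record one stat update on the given CPU.  Every BATCH-th update on a CPU
 * requests a flush; the return value reports whether a flush actually ran.
 */
static bool stat_update(int cpu, long val)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	percpu_delta[cpu] += val;
	if (++percpu_updates[cpu] % BATCH == 0)
		return try_flush();
	return false;
}
```

    In this model each CPU can hold up to BATCH - 1 unflushed updates before
    it requests a flush, so the value seen by readers can lag by at most
    NR_CPUS * (BATCH - 1) updates, which matches the O(nr_cpus * BATCH)
    bound stated above.  A caller that finds the lock taken simply returns,
    which is the behaviour that avoids the thundering herd.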
    
    [shakeelb@google.com: fix sleep-in-wrong context bug]
      Link: https://lkml.kernel.org/r/20210716212137.1391164-2-shakeelb@google.com
    
    Link: https://lkml.kernel.org/r/20210714013948.270662-2-shakeelb@google.com
    Signed-off-by: Shakeel Butt <shakeelb@google.com>
    Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Michal Koutný <mkoutny@suse.com>
    Cc: Muchun Song <songmuchun@bytedance.com>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Tejun Heo <tj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>