    mm, oom: introduce memory.oom.group · 3d8b38eb
    Roman Gushchin authored
    For some workloads an intervention from the OOM killer can be painful.
    Killing a random task can bring the workload into an inconsistent state.
    
    Historically, there are two common solutions for this problem:
    1) enabling panic_on_oom,
    2) using a userspace daemon to monitor OOMs and kill
       all outstanding processes.
    
    Both approaches have their downsides: rebooting on each OOM is an obvious
    waste of capacity, and handling everything in userspace is tricky and
    requires a userspace agent that monitors all cgroups for OOMs.
    
    In most cases an in-kernel post-OOM cleanup mechanism can eliminate the
    need to enable panic_on_oom.  It can also simplify cgroup management for
    userspace applications.
    
    This commit introduces a new knob for cgroup v2 memory controller:
    memory.oom.group.  The knob determines whether the cgroup should be
    treated as an indivisible workload by the OOM killer.  If set, all tasks
    belonging to the cgroup or to its descendants (if the memory cgroup is not
    a leaf cgroup) are killed together or not at all.
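
    As an illustration of how userspace might toggle the knob, here is a
    minimal sketch.  It assumes cgroup v2 is mounted at /sys/fs/cgroup and
    that the target is a non-root cgroup; the helper names set_oom_group()
    and get_oom_group() are hypothetical and are not part of this patch.

```python
from pathlib import Path

# Assumption: cgroup v2 is mounted at the conventional location.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def set_oom_group(cgroup: str, enabled: bool, root: Path = CGROUP_ROOT) -> Path:
    """Write 1 or 0 to <root>/<cgroup>/memory.oom.group and return the path.

    Hypothetical helper: `cgroup` is a path relative to the cgroup v2 mount.
    """
    knob = root / cgroup / "memory.oom.group"
    knob.write_text("1\n" if enabled else "0\n")
    return knob


def get_oom_group(cgroup: str, root: Path = CGROUP_ROOT) -> bool:
    """Read the knob back and interpret "1" as enabled."""
    return (root / cgroup / "memory.oom.group").read_text().strip() == "1"
```

    Writing to the real file normally requires sufficient privileges on the
    cgroup directory; passing a different `root` allows a dry run against
    an ordinary directory.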
    
    To determine which cgroup has to be killed, we traverse the cgroup
    hierarchy from the victim task's cgroup up to the OOMing cgroup (or root)
    and look for the highest-level cgroup with memory.oom.group set.
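
    The traversal described above can be modeled in a few lines.  This is
    a toy userspace model, not the kernel code: the Cgroup class and
    find_oom_group() are hypothetical names chosen for illustration, and
    passing oom_domain=None stands in for a global OOM (walk up to root).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Cgroup:
    """Toy stand-in for a memory cgroup (illustrative only)."""
    name: str
    parent: Optional["Cgroup"] = None
    oom_group: bool = False  # models memory.oom.group being set


def find_oom_group(victim_cgroup: Cgroup,
                   oom_domain: Optional[Cgroup] = None) -> Optional[Cgroup]:
    """Return the highest-level cgroup with oom_group set, or None.

    Walks from the victim's cgroup towards the root and stops once the
    OOMing cgroup (oom_domain) has been examined.
    """
    found = None
    node = victim_cgroup
    while node is not None:
        if node.oom_group:
            found = node  # a later (higher) match overrides a lower one
        if node is oom_domain:
            break
        node = node.parent
    return found


# Example hierarchy: root -> a (group) -> b -> c (group)
root = Cgroup("root")
a = Cgroup("a", parent=root, oom_group=True)
b = Cgroup("b", parent=a)
c = Cgroup("c", parent=b, oom_group=True)
```

    With this hierarchy, an OOM in the root domain whose victim lives in
    "c" selects "a", while an OOM confined to "b" selects "c", because the
    walk never goes above the OOMing cgroup.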
    
    Tasks with the OOM protection (oom_score_adj set to -1000) are treated as
    an exception and are never killed.
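
    The exception for protected tasks amounts to a filter over the tasks
    of the selected cgroup.  The sketch below is a simplified model under
    that assumption; the Task class and tasks_to_kill() are hypothetical
    names, and only the value -1000 comes from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, List

OOM_SCORE_ADJ_MIN = -1000  # oom_score_adj value that marks a protected task


@dataclass
class Task:
    """Toy stand-in for a process (illustrative only)."""
    pid: int
    oom_score_adj: int = 0


def tasks_to_kill(group_tasks: Iterable[Task]) -> List[Task]:
    """Return every task of the group except the OOM-protected ones."""
    return [t for t in group_tasks if t.oom_score_adj != OOM_SCORE_ADJ_MIN]


# Example: pid 2 is protected and survives the group kill.
group = [Task(1), Task(2, oom_score_adj=OOM_SCORE_ADJ_MIN), Task(3, oom_score_adj=500)]
victims = tasks_to_kill(group)
```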
    
    This patch doesn't change the OOM victim selection algorithm.
    
    Link: http://lkml.kernel.org/r/20180802003201.817-4-guro@fb.com
    Signed-off-by: Roman Gushchin <guro@fb.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcontrol.c 170 KB