• Tejun Heo's avatar
    cgroup: implement cgroup.populated for the default hierarchy · 842b597e
    Tejun Heo authored
    cgroup users often need a way to determine when a cgroup's
    subhierarchy becomes empty so that it can be cleaned up.  cgroup
    currently provides release_agent for it; unfortunately, this mechanism
    is riddled with issues.
    
    * It delivers events by forking and execing a userland binary
      specified as the release_agent.  This is a long deprecated method of
      notification delivery.  It's extremely heavy, slow and cumbersome to
      integrate with larger infrastructure.
    
    * There is single monitoring point at the root.  There's no way to
      delegate management of a subtree.
    
    * The event isn't recursive.  It triggers when a cgroup doesn't have
      any tasks or child cgroups.  Events for internal nodes trigger only
      after all children are removed.  This again makes it impossible to
      delegate management of a subtree.
    
    * Events are filtered from the kernel side.  "notify_on_release" file
      is used to subscribe to or suppress release event.  This is
      unnecessarily complicated and probably done this way because event
      delivery itself was expensive.
    
    This patch implements interface file "cgroup.populated" which can be
    used to monitor whether the cgroup's subhierarchy has tasks in it or
    not.  Its value is 0 if there is no task in the cgroup and its
    descendants; otherwise, 1, and kernfs_notify() notificaiton is
    triggers when the value changes, which can be monitored through poll
    and [di]notify.
    
    This is a lot ligther and simpler and trivially allows delegating
    management of subhierarchy - subhierarchy monitoring can block further
    propgation simply by putting itself or another process in the root of
    the subhierarchy and monitor events that it's interested in from there
    without interfering with monitoring higher in the tree.
    
    v2: Patch description updated as per Serge.
    
    v3: "cgroup.subtree_populated" renamed to "cgroup.populated".  The
        subtree_ prefix was a bit confusing because
        "cgroup.subtree_control" uses it to denote the tree rooted at the
        cgroup sans the cgroup itself while the populated state includes
        the cgroup itself.
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    Acked-by: default avatarSerge Hallyn <serge.hallyn@ubuntu.com>
    Acked-by: default avatarLi Zefan <lizefan@huawei.com>
    Cc: Lennart Poettering <lennart@poettering.net>
    842b597e
cgroup.c 144 KB