1. 15 Apr, 2013 2 commits
  2. 14 Apr, 2013 1 commit
    • cgroup: make cgroup_path() not print double slashes · da1f296f
      Tejun Heo authored
      While reimplementing cgroup_path(), 65dff759 ("cgroup: fix
      cgroup_path() vs rename() race") introduced a bug where the path of a
      non-root cgroup would have two slashes at the beginning, which is
      caused by treating the root cgroup which has the name '/' like
      non-root cgroups.
      
       $ grep systemd /proc/self/cgroup
       1:name=systemd://user/root/1
      
      Fix it by special-casing the root cgroup and not looping over it in
      the normal path.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
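
The fix described in da1f296f can be illustrated with a small user-space sketch. It is not the kernel code: `struct cg` and `cg_path()` are hypothetical stand-ins, and the buffer handling is simplified. The point it shows is that the root, whose name is "/", returns early and is never visited by the loop that prepends "/" plus a name for each level.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct cgroup. */
struct cg {
    const char *name;   /* "/" for the root */
    struct cg *parent;  /* NULL for the root */
};

/*
 * Write the path of @cg into @buf, building it from the end backwards.
 * Returns 0 on success, -ENAMETOOLONG if @buf is too small.
 */
int cg_path(const struct cg *cg, char *buf, size_t buflen)
{
    char *start;

    assert(cg && buf);
    if (buflen < 2)
        return -ENAMETOOLONG;

    /* Special case: the root is just "/" and never enters the loop. */
    if (!cg->parent) {
        strcpy(buf, "/");
        return 0;
    }

    start = buf + buflen - 1;
    *start = '\0';

    /* Walk up from the leaf and stop before reaching the root. */
    for (; cg->parent; cg = cg->parent) {
        size_t len = strlen(cg->name);

        if ((size_t)(start - buf) < len + 1)
            return -ENAMETOOLONG;
        start -= len;
        memcpy(start, cg->name, len);
        *--start = '/';
    }

    /* Shift the finished string to the front of the buffer. */
    memmove(buf, start, (size_t)(buf + buflen - start));
    return 0;
}
```

If the loop also visited the root, it would prepend "/" followed by the root's own name "/", which is how the doubled slash in the grep output above comes about.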
  3. 12 Apr, 2013 1 commit
  4. 10 Apr, 2013 4 commits
    • perf: make perf_event cgroup hierarchical · ef824fa1
      Tejun Heo authored
      perf_event is one of a couple remaining cgroup controllers with broken
      hierarchy support.  Converting it to support hierarchy is almost
      trivial.  The only thing necessary is to consider a task belonging to
      a descendant cgroup as a match.  IOW, if the cgroup of the currently
      executing task (@cpuctx->cgrp) equals or is a descendant of the
      event's cgroup (@event->cgrp), then the event should be enabled.
      
      Implement hierarchy support and remove .broken_hierarchy tag along
      with the incorrect comment on what needs to be done for hierarchy
      support.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
    • cgroup: implement cgroup_is_descendant() · 78574cf9
      Li Zefan authored
      A couple of controllers want to determine whether two cgroups are in an
      ancestor/descendant relationship.  As it's more likely that the
      descendant is the primary subject of interest and there are other
      operations focusing on the descendants, let's ask is_descendant rather
      than is_ancestor.
      
      Implementation is trivial as the previous patch guarantees that all
      ancestors of a cgroup stay accessible as long as the cgroup is
      accessible.
      
      tj: Removed depth optimization, renamed from cgroup_is_ancestor(),
          rewrote descriptions.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
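
A descendant test of this kind can be sketched as a walk up the parent pointers. The names below (`struct cg_node`, `cg_is_descendant()`, `event_matches()`) are hypothetical user-space stand-ins, not the kernel's definitions. The sketch assumes, as the perf_event entry above puts it, that a cgroup counts as matching when it equals the ancestor or lies anywhere below it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cgroup. */
struct cg_node {
    struct cg_node *parent;  /* NULL for the root */
};

/* True if @cg is @ancestor itself or lies somewhere below it. */
bool cg_is_descendant(const struct cg_node *cg, const struct cg_node *ancestor)
{
    assert(ancestor);
    while (cg) {
        if (cg == ancestor)
            return true;
        cg = cg->parent;
    }
    return false;
}

/*
 * Hierarchical matching as described for perf_event: an event attached
 * to @event_cg applies when the running task's cgroup is @event_cg or
 * one of its descendants.
 */
bool event_matches(const struct cg_node *task_cg, const struct cg_node *event_cg)
{
    return cg_is_descendant(task_cg, event_cg);
}
```

The walk is only safe if every ancestor stays accessible for as long as the starting cgroup is, which is the guarantee the commit message refers to.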
    • cgroup: make sure parent won't be destroyed before its children · 415cf07a
      Li Zefan authored
      Suppose we rmdir a cgroup while there are still css refs; this cgroup
      won't be freed. Then we rmdir the parent cgroup, and the parent is freed
      immediately due to css ref draining to 0. Now it would be a disaster if
      the still-alive child cgroup tries to access its parent.
      
      Make sure this won't happen.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
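
The commit message does not say how the ordering is enforced. One common way to get this guarantee, shown here purely as an assumption, is for each child to hold a reference on its parent until the child itself is freed. `struct node`, `node_create()`, `node_put()` and `live_nodes` are hypothetical names for this user-space sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted node; not the kernel's struct cgroup. */
struct node {
    struct node *parent;
    int refcnt;
};

int live_nodes;  /* number of nodes not yet freed, for illustration */

struct node *node_create(struct node *parent)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return NULL;
    n->parent = parent;
    n->refcnt = 1;          /* the creator's reference */
    if (parent)
        parent->refcnt++;   /* the child pins its parent */
    live_nodes++;
    return n;
}

/* Drop one reference; freeing a node releases its pin on the parent. */
void node_put(struct node *n)
{
    while (n && --n->refcnt == 0) {
        struct node *parent = n->parent;

        free(n);
        live_nodes--;
        n = parent;
    }
}
```

With this scheme, dropping the creator's reference on the parent first leaves it alive until the last child goes away, so a still-alive child can always follow its parent pointer.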
    • cgroup: remove bind() method from cgroup_subsys. · 84cfb6ab
      Rami Rosen authored
      The bind() method of cgroup_subsys is not used in any of the
      controllers (cpuset, freezer, blkio, net_cls, memcg, net_prio,
      devices, perf, hugetlb, cpu and cpuacct).
      
      tj: Removed the entry on ->bind() from
          Documentation/cgroups/cgroups.txt.  Also updated a couple
          paragraphs which were suggesting that dynamic re-binding may be
          implemented.  It's not gonna.
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  5. 08 Apr, 2013 1 commit
  6. 07 Apr, 2013 6 commits
  7. 03 Apr, 2013 2 commits
  8. 20 Mar, 2013 6 commits
    • cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc() · 081aa458
      Li Zefan authored
      These two functions share most of the code.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • devcg: propagate local changes down the hierarchy · bd2953eb
      Aristeu Rozanski authored
      This patch makes exception changes propagate down the hierarchy, respecting
      local exceptions when possible.
      
      New exceptions allowing additional access to devices won't be propagated, but
      it'll be possible to add an exception to access all or part of the newly
      allowed device(s).
      
      New exceptions disallowing access to devices will be propagated down and the
      local group's exceptions will be revalidated for the new situation.
      Example:
            A
           / \
              B
      
          group        behavior          exceptions
          A            allow             "b 8:* rwm", "c 116:1 rw"
          B            deny              "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"
      
      If a new exception is added to group A:
      	# echo "c 116:* r" > A/devices.deny
      it'll propagate down and after revalidating B's local exceptions, the exception
      "c 116:2 rwm" will be removed.
      
      In case parent's exceptions change and local exceptions are not allowed anymore,
      they'll be deleted.
      
      v7:
      - do not allow behavior change when the cgroup has children
      - update documentation
      
      v6: fixed issues pointed out by Serge Hallyn
      - only copy parent's exceptions while propagating behavior if the local
        behavior is different
      - while propagating exceptions, do not clear and copy parent's: it'd be against
        the premise we don't propagate access to more devices
      
      v5: fixed issues pointed out by Serge Hallyn
      - updated documentation
      - not propagating when an exception is written to devices.allow
      - when propagating a new behavior, clean the local exceptions list if they're
        for a different behavior
      
      v4: fixed issues pointed out by Tejun Heo
      - separated function to walk the tree and collect valid propagation targets
      
      v3: fixed issues pointed out by Tejun Heo
      - update documentation
      - move css_online/css_offline changes to a new patch
      - use cgroup_for_each_descendant_pre() instead of own descendant walk
      - move exception_copy rework to a separate patch
      - move exception_clean rework to a separated patch
      
      v2: fixed issues pointed out by Tejun Heo
      - instead of keeping the local settings that won't apply anymore, remove them
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
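
The revalidation step in the A/B example of bd2953eb can be modelled roughly as follows. This is a deliberately simplified sketch and not the kernel's may_access() logic: `struct dev_ex`, `ex_overlaps()` and `revalidate()` are hypothetical, the "all devices" type is ignored, and a child's exception is dropped whenever it overlaps the parent's newly denied exception in type, major, minor and at least one access bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ACC_READ  1u
#define ACC_WRITE 2u
#define ACC_MKNOD 4u
#define DEV_ANY   (-1)  /* stands for '*' in "c 116:* r" */

/* Hypothetical exception entry, e.g. { 'c', 116, 2, rwm }. */
struct dev_ex {
    char type;        /* 'b' or 'c' */
    int major;        /* DEV_ANY for '*' */
    int minor;        /* DEV_ANY for '*' */
    unsigned access;  /* ACC_* bits */
};

/* Do the two exceptions cover a common device and access bit? */
bool ex_overlaps(const struct dev_ex *a, const struct dev_ex *b)
{
    if (a->type != b->type)
        return false;
    if (a->major != DEV_ANY && b->major != DEV_ANY && a->major != b->major)
        return false;
    if (a->minor != DEV_ANY && b->minor != DEV_ANY && a->minor != b->minor)
        return false;
    return (a->access & b->access) != 0;
}

/*
 * Drop every local exception of the child that the parent's new deny
 * exception no longer permits. Compacts @child in place and returns
 * the number of exceptions kept.
 */
size_t revalidate(struct dev_ex *child, size_t n, const struct dev_ex *denied)
{
    size_t kept = 0;

    assert(child && denied);
    for (size_t i = 0; i < n; i++)
        if (!ex_overlaps(&child[i], denied))
            child[kept++] = child[i];
    return kept;
}
```

Applied to group B from the example, denying "c 116:* r" in A leaves "c 1:3 rwm" and "b 3:* rwm" in place and removes "c 116:2 rwm".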
    • devcg: use css_online and css_offline · 1909554c
      Aristeu Rozanski authored
      Allocate resources and change behavior only when online. This is needed in
      order to determine if a node is suitable for hierarchy propagation or if it's
      being removed.
      
      Locking:
      Both functions take devcgroup_mutex to make changes to the device_cgroup
      structure. Hierarchy propagation will also take devcgroup_mutex before
      walking the tree, while the tree walk itself is protected by the RCU lock.
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • devcg: prepare may_access() for hierarchy support · c39a2a30
      Aristeu Rozanski authored
      Currently may_access() is only able to verify if an exception is valid for the
      current cgroup, which has the same behavior. With hierarchy, it'll also be used
      to verify if a cgroup's local exception is valid towards its parent cgroup,
      which might have different behavior.
      
      v2:
      - updated patch description
      - rebased on top of a new patch to expand the may_access() logic to make it
        more clear
      - fixed argument description order in may_access()
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • devcg: expand may_access() logic · 26898fdf
      Aristeu Rozanski authored
      In order to make the next patch more clear, expand may_access() logic.
      
      v2: may_access() returns bool now
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup: fix an off-by-one bug which may trigger BUG_ON() · 3ac1707a
      Li Zefan authored
      The 3rd parameter of flex_array_prealloc() is the number of elements,
      not the index of the last element.
      
      The effect of the bug is that, when opening cgroup.procs, a flex array
      is allocated and all elements of the array are allocated with the
      GFP_KERNEL flag except the last one, which is allocated with GFP_ATOMIC;
      if we fail to allocate memory for it, it'll trigger a BUG_ON().
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
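
The count-versus-last-index mix-up behind 3ac1707a is easy to reproduce with a toy model. The code below does not use the kernel's flex_array API; `struct toy_array`, `toy_prealloc()` and `toy_slot_ready()` are hypothetical, and the only property carried over from the commit message is that the third parameter is a number of elements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 16

/* Toy container: ready[i] records whether slot i was preallocated. */
struct toy_array {
    bool ready[TOY_SLOTS];
};

/*
 * Preallocate @nr_elements slots starting at @start.
 * The third parameter is a count, not the index of the last slot.
 */
int toy_prealloc(struct toy_array *a, size_t start, size_t nr_elements)
{
    assert(a);
    if (start > TOY_SLOTS || nr_elements > TOY_SLOTS - start)
        return -1;
    for (size_t i = start; i < start + nr_elements; i++)
        a->ready[i] = true;
    return 0;
}

bool toy_slot_ready(const struct toy_array *a, size_t idx)
{
    return idx < TOY_SLOTS && a->ready[idx];
}
```

For n elements, calling `toy_prealloc(&a, 0, n - 1)` leaves slot n - 1 unprepared, while `toy_prealloc(&a, 0, n)` covers it. In the kernel case the unprepared last element is the one that ends up being allocated later with GFP_ATOMIC.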
  9. 12 Mar, 2013 6 commits
  10. 05 Mar, 2013 3 commits
  11. 04 Mar, 2013 3 commits
    • cgroup: no need to check css refs for release notification · f50daa70
      Li Zefan authored
      We no longer fail rmdir() when there're still css refs, so we don't
      need to check css refs in check_for_release().
      
      This also avoids a bug. cgroup_has_css_refs() accesses subsys[i]
      without cgroup_mutex, so it can race with cgroup_unload_subsys().
      
      cgroup_has_css_refs()
      ...
        if (ss == NULL || ss->root != cgrp->root)
      
      If ss points to net_cls_subsys, and the cls_cgroup module is unloaded
      right after the former check but before the latter, the memory in which
      net_cls_subsys resides has become invalid.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cpuset: use cgroup_name() in cpuset_print_task_mems_allowed() · f440d98f
      Li Zefan authored
      Use cgroup_name() instead of cgrp->dentry->name. This makes the code
      a bit simpler.
      
      While at it, remove cpuset_name and make cpuset_nodelist a local variable
      to cpuset_print_task_mems_allowed().
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup: fix cgroup_path() vs rename() race · 65dff759
      Li Zefan authored
      rename() will change dentry->d_name. The result of this race can
      be worse than seeing a partially rewritten name: we might access
      a stale pointer, because rename() will re-allocate memory to hold
      a longer name.
      
      As accessing dentry->d_name must be protected by dentry->d_lock or
      the parent inode's i_mutex, while on the other hand cgroup_path() can
      be called with some irq-safe spinlocks held, we can't generate the
      cgroup path using dentry->d_name.
      
      Instead, we make a copy of dentry->d_name and save it in
      cgrp->name when a cgroup is created, and update cgrp->name at
      rename().
      
      v5: use flexible array instead of zero-size array.
      v4: - allocate root_cgroup_name and all root_cgroup->name points to it.
          - add cgroup_name() wrapper.
      v3: use kfree_rcu() instead of synchronize_rcu() in user-visible path.
      v2: make cgrp->name RCU safe.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
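
The approach in 65dff759 (keep a private copy of the name, replace the whole copy on rename) can be sketched in user space as below. Everything here is an assumption made for illustration: `struct cg_name`, `cg_name_alloc()` and `cg_rename()` are hypothetical, and there is no RCU. Where the changelog mentions kfree_rcu(), this sketch simply hands the old object back to the caller, who must free it only after no reader can still be using it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the name object hung off a cgroup. */
struct cg_name {
    size_t len;
    char name[];  /* C99 flexible array member, as in the v5 note */
};

/* Allocate a name object holding a private copy of @src. */
struct cg_name *cg_name_alloc(const char *src)
{
    size_t len;
    struct cg_name *n;

    assert(src);
    len = strlen(src);
    n = malloc(sizeof(*n) + len + 1);
    if (!n)
        return NULL;
    n->len = len;
    memcpy(n->name, src, len + 1);
    return n;
}

/*
 * Swap in a new name. On success returns 0 and hands the previous
 * object back through @old. The old object is never rewritten in
 * place, so a reader still holding it sees a complete, valid string.
 */
int cg_rename(struct cg_name **slot, const char *new_name, struct cg_name **old)
{
    struct cg_name *fresh = cg_name_alloc(new_name);

    if (!fresh)
        return -1;
    *old = *slot;
    *slot = fresh;
    return 0;
}
```

Because a rename allocates a fresh object instead of growing the existing buffer, a longer name never invalidates the memory an earlier reader is looking at, which is the stale-pointer problem described above.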
  12. 03 Mar, 2013 5 commits