    sched: Fix fork vs hotplug vs cpuset namespaces · fabf318e
    Peter Zijlstra authored
    There are a number of issues:
    
    1) TASK_WAKING vs cgroup_clone (cpusets)
    
    copy_process():
    
      sched_fork()
        child->state = TASK_WAKING; /* waiting for wake_up_new_task() */
      if (current->nsproxy != p->nsproxy)
         ns_cgroup_clone()
           cgroup_clone()
             mutex_lock(inode->i_mutex)
             mutex_lock(cgroup_mutex)
             cgroup_attach_task()
               ss->can_attach()
               ss->attach() [ -> cpuset_attach() ]
                 cpuset_attach_task()
                   set_cpus_allowed_ptr();
                     while (child->state == TASK_WAKING)
                       cpu_relax();
    will deadlock the system.
    
    
    2) cgroup_clone (cpusets) vs copy_process
    
    So even if the above would work we still have:
    
    copy_process():
    
      if (current->nsproxy != p->nsproxy)
         ns_cgroup_clone()
           cgroup_clone()
             mutex_lock(inode->i_mutex)
             mutex_lock(cgroup_mutex)
             cgroup_attach_task()
               ss->can_attach()
               ss->attach() [ -> cpuset_attach() ]
                 cpuset_attach_task()
                   set_cpus_allowed_ptr();
      ...
    
      p->cpus_allowed = current->cpus_allowed
    
    over-writing the modified cpus_allowed.
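
    The ordering problem in 2) can be sketched in user space. This is a
    minimal model, not kernel code: struct task, cpuset_attach_model() and
    fork_buggy_order() are hypothetical stand-ins, and a plain unsigned long
    bitmask replaces the kernel's cpumask type.

```c
#include <assert.h>

/* Hypothetical stand-in for task_struct; only the field discussed here. */
struct task {
    unsigned long cpus_allowed; /* bit n set => task may run on cpu n */
};

/* Models the cpuset attach path: restrict the child to the cpuset's mask. */
static void cpuset_attach_model(struct task *child, unsigned long cpuset_mask)
{
    assert(cpuset_mask != 0); /* an empty mask would leave nowhere to run */
    child->cpus_allowed = cpuset_mask;
}

/*
 * Models the order of events in 2): the attach happens first, then the
 * late copy from the parent runs and discards what the attach stored.
 * Returns the mask the child ends up with.
 */
static unsigned long fork_buggy_order(const struct task *parent,
                                      unsigned long cpuset_mask)
{
    struct task child = { 0 };

    cpuset_attach_model(&child, cpuset_mask);
    child.cpus_allowed = parent->cpus_allowed; /* clobbers the cpuset mask */
    return child.cpus_allowed;
}
```

    With a parent allowed on cpus 0-3 (0xF) and a cpuset limited to cpus 0-1
    (0x3), fork_buggy_order() returns 0xF: the child keeps the parent's mask
    and the cpuset restriction is lost.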
    
    
    3) fork() vs hotplug
    
      If we unplug the child's cpu after the sanity check done when the
      child is attached to the task_list, but before wake_up_new_task(),
      the child ends up assigned to an offline cpu.
    
    Solve all these issues by moving fork cpu selection into
    wake_up_new_task().
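
    Why choosing the cpu at wake-up time helps can be sketched in user
    space. This is a rough model under stated assumptions, not the patch
    itself: struct task, pick_cpu() and wake_up_new_task_model() are
    hypothetical names, masks are plain unsigned long bitmasks, and the
    "lowest usable cpu" policy is chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for task_struct. */
struct task {
    unsigned long cpus_allowed; /* bit n set => task may run on cpu n */
    int cpu;                    /* cpu currently assigned to the task */
};

/* Lowest-numbered cpu that is both allowed and online, or -1 if none. */
static int pick_cpu(unsigned long allowed, unsigned long online)
{
    unsigned long usable = allowed & online;

    for (int cpu = 0; usable != 0; cpu++, usable >>= 1)
        if (usable & 1UL)
            return cpu;
    return -1;
}

/*
 * Models wake-up-time selection: the cpu is picked from the allowed and
 * online masks as they are now, so changes made between fork and wake-up
 * (a cpuset attach, a hotplug event) are taken into account.
 */
static int wake_up_new_task_model(struct task *p, unsigned long online_now)
{
    assert(p);
    p->cpu = pick_cpu(p->cpus_allowed, online_now);
    return p->cpu;
}
```

    For a task allowed on cpus 1-2 (0x6) that was assigned cpu 1 at fork
    time, unplugging cpu 1 before wake-up (online mask 0xD) makes
    wake_up_new_task_model() move it to cpu 2 instead of leaving it on the
    offline cpu.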

    Reported-by: Serge E. Hallyn <serue@us.ibm.com>
    Tested-by: Serge E. Hallyn <serue@us.ibm.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <1264106190.4283.1314.camel@laptop>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>