    cpuset: remove sched domain hooks from cpusets · 607717a6
    Paul Jackson authored
    Remove the cpuset hooks that defined sched domains depending on the setting
    of the 'cpu_exclusive' flag.
    
    The cpu_exclusive flag can only be set on a child if it is set on the
    parent.
    
    This made that flag painfully unsuitable for use as a flag defining a
    partitioning of a system.
    
    It was entirely unobvious to a cpuset user what partitioning of sched
    domains they would be causing when they set that one cpu_exclusive bit on
    one cpuset, because it depended on what CPUs were in the remainder of that
    cpuset's siblings and child cpusets, after subtracting out other
    cpu_exclusive cpusets.
    
    Furthermore, there was no way on production systems to query the
    result.
    
    Using the cpu_exclusive flag for this was simply wrong from the get-go.
    
    Fortunately, it was sufficiently borked that, so far as I know, almost no
    successful use has been made of this.  One real-time group did use it to
    effectively isolate CPUs from any load balancing efforts.  They are willing
    to adapt to alternative mechanisms for this, such as some way to manipulate
    the list of isolated CPUs on a running system.  They can do without this
    present cpu_exclusive based mechanism while we develop an alternative.
    
    There is a real risk, to the best of my understanding, of users
    accidentally setting up partitioned scheduler domains, inhibiting desired
    load balancing across all their CPUs, due to the nonobvious (from the
    cpuset perspective) side effects of the cpu_exclusive flag.
    
    Furthermore, since there was no way on a running system to see what one was
    doing with sched domains, this change will be invisible to any using code.
    Unless they have real insight into the scheduler's load balancing choices, they
    will be unable to detect that this change has been made in the kernel's
    behaviour.
    
    Initial discussion on lkml of this patch has generated much comment.  My
    (probably controversial) take on that discussion is that it has reached a
    rough consensus that the current cpuset cpu_exclusive mechanism for
    defining sched domains is borked.  There is no consensus on the
    replacement.  But since we can remove this mechanism, and since its
    continued presence risks causing unwanted partitioning of the scheduler's
    load balancing, we should remove it while we can, as we proceed to work on
    the replacement scheduler domain mechanisms.
    
    Signed-off-by: Paul Jackson <pj@sgi.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Nick Piggin <nickpiggin@yahoo.com.au>
    Cc: Christoph Lameter <clameter@engr.sgi.com>
    Cc: Dinakar Guniguntala <dino@in.ibm.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cpusets.txt 25.2 KB