    [PATCH] more migration thread cleanups · b85f47ad
    Erich Focht authored
    I'm currently working on a node affine scheduler extension for NUMA
    machines, and the load balancer behaves a bit differently from the
    original. After a few boot failures with those slowly booting 16 CPU
    IA64 machines, I thought there must be a simpler solution than
    synchronizing with and waiting for the load balancer: just let
    migration_CPU0 do what it is designed for. My proposal is:
       - start all migration threads on CPU#0
       - initialize migration_CPU0 (trivial, reliable, as it already is on
         the right CPU)
       - let all other migration threads use set_cpus_allowed() to get to the
         right place
    
    The only synchronization needed is that the non-zero migration threads
    wait for migration_CPU0 to start working, which it will, as it is
    already on the right CPU. This saves quite a few lines of code.
    
    I first posted this to LKML on March 6th (BTW, the fix #1, too), and since
    then it has been tested on several big NUMA platforms: 16 CPU NEC AzusA
    (IA64) (also known as HP rx....), up to 32 CPU SGI IA64, and 16 CPU IBM
    NUMA-Q (IA32). There have been no more lock-ups at boot since then, so I
    consider it working.
    
    There is another good reason for this approach: the integration of the CPU
    hotplug patch with the new scheduler becomes easier. One just needs to
    create the new migration thread; it will move itself to the right CPU
    without any additional magic (which you would otherwise need, because the
    synchronizations won't be there at hotplug). Kimi Suganuma in the
    neighboring cube is currently working this out.