Commit ea3f01f8 authored by Ingo Molnar's avatar Ingo Molnar

sched: re-tune NUMA topologies

improve the sysbench ramp-up phase and its peak throughput on
a 16way NUMA box, by turning on WAKE_AFFINE:

             tip/sched   tip/sched+wake-affine
-------------------------------------------------
    1:             700              830    +15.65%
    2:            1465             1391    -5.28%
    4:            3017             3105    +2.81%
    8:            5100             6021    +15.30%
   16:           10725            10745    +0.19%
   32:           10135            10150    +0.16%
   64:            9338             9240    -1.06%
  128:            8599             8252    -4.21%
  256:            8475             8144    -4.07%
-------------------------------------------------
  SUM:           57558            57882    +0.56%

this change also improves lat_ctx from 6.69 usecs to 1.11 usec:

  $ ./lat_ctx -s 0 2
  "size=0k ovr=1.19
  2 1.11

  $ ./lat_ctx -s 0 2
  "size=0k ovr=1.22
  2 6.69

in sysbench it's an overall win with some weakness at the lots-of-clients
side. That happens because we now under-balance this workload
a bit. To counter that effect, turn on NEWIDLE:

              wake-idle          wake-idle+newidle
 -------------------------------------------------
     1:             830              834    +0.43%
     2:            1391             1401    +0.65%
     4:            3105             3091    -0.43%
     8:            6021             6046    +0.42%
    16:           10745            10736    -0.08%
    32:           10150            10206    +0.55%
    64:            9240             9533    +3.08%
   128:            8252             8355    +1.24%
   256:            8144             8384    +2.87%
 -------------------------------------------------
   SUM:           57882            58591    +1.21%

as a bonus this not only improves the many-clients case but
also improves the (more important) rampup phase.

sysbench is a workload that quickly breaks down if the
scheduler over-balances, so since it showed an improvement
under NEWIDLE this change is definitely good.
parent b3137bc8
...@@ -166,6 +166,8 @@ void arch_update_cpu_topology(void); ...@@ -166,6 +166,8 @@ void arch_update_cpu_topology(void);
.busy_idx = 3, \ .busy_idx = 3, \
.idle_idx = 3, \ .idle_idx = 3, \
.flags = SD_LOAD_BALANCE \ .flags = SD_LOAD_BALANCE \
| SD_BALANCE_NEWIDLE \
| SD_WAKE_AFFINE \
| SD_SERIALIZE, \ | SD_SERIALIZE, \
.last_balance = jiffies, \ .last_balance = jiffies, \
.balance_interval = 64, \ .balance_interval = 64, \
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment