    sched: fair group: fix overflow(was: fix divide by zero) · 2e084786
    Lai Jiangshan authored
    I found a bug that can be reproduced as follows (linux-2.6.26-rc5, x86-64),
    using any of 2^32, 2^33, ..., 2^63 as the shares value:
    
    # mkdir /dev/cpuctl
    # mount -t cgroup -o cpu cpuctl /dev/cpuctl
    # cd /dev/cpuctl
    # mkdir sub
    # echo 0x8000000000000000 > sub/cpu.shares
    # echo $$ > sub/tasks
    oops here! divide by zero.
    
    This is because do_div() expects its 2nd parameter (the divisor) to be
    32 bits, but unsigned long is 64 bits on x86_64, so a shares value whose
    low 32 bits are all zero is truncated to 0.
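
The truncation can be sketched in plain user-space C. This is only an
illustration, not kernel code: divisor_as_seen_by_32bit_div() is a
hypothetical helper that mimics a divide taking a 32-bit divisor.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns the divisor the
 * way a divide with a 32-bit divisor parameter would see it, i.e. only
 * the low 32 bits of the 64-bit value survive.
 */
static uint32_t divisor_as_seen_by_32bit_div(uint64_t divisor)
{
	return (uint32_t)divisor;
}
```

With this helper, 0x8000000000000000 (2^63) and 2^32 both come out as 0,
while an ordinary value such as 1024 is unchanged.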
    
    Peter Zijlstra pointed out that the sane thing to do is to limit the
    shares value to something smaller instead of using an even more
    expensive divide.
    
    I also found another bug caused by a shares value that is too large:
    
    pid1 and pid2 both have their affinity set to cpu#0;
    pid1 is attached to cg1 and pid2 is attached to cg2.
    
    if cg1/cpu.shares = 1024 cg2/cpu.shares = 2000000000
    then pid2 got 100% usage of cpu, and pid1 0%
    
    if cg1/cpu.shares = 1024 cg2/cpu.shares = 20000000000
    then pid2 got 0% usage of cpu, and pid1 100%
    
    In addition, the weight of a cfs_rq is the sum of the weights of the
    entities queued on that cfs_rq, so the shares value should be limited
    to a smaller value.
    
    I think that (1UL << 18) is a good limit:
    
    1) it's not too large: we can create a lot of groups before overflow
    2) it's several times the weight value for nice=-19 (not too small)
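
A minimal user-space sketch of such a limit, under stated assumptions:
clamp_shares(), max_groups_before_overflow() and the MIN_SHARES value are
hypothetical names and values chosen for illustration, and uint64_t
stands in for the 64-bit unsigned long of x86_64. Only the upper bound
(1 << 18) comes from the text above.

```c
#include <stdint.h>

#define MIN_SHARES 2ULL          /* assumed lower bound, illustration only */
#define MAX_SHARES (1ULL << 18)  /* the upper limit proposed above */

/* Clamp a requested shares value into [MIN_SHARES, MAX_SHARES]. */
static uint64_t clamp_shares(uint64_t shares)
{
	if (shares < MIN_SHARES)
		return MIN_SHARES;
	if (shares > MAX_SHARES)
		return MAX_SHARES;
	return shares;
}

/*
 * How many entities of maximum weight can be summed into one 64-bit
 * weight before the sum no longer fits.
 */
static uint64_t max_groups_before_overflow(void)
{
	return UINT64_MAX / MAX_SHARES;
}
```

Under these assumptions, the value from the reproducer (2^63) is clamped
to 2^18, and roughly 2^46 maximum-weight groups fit in a 64-bit sum.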
    
    Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
sched.c 213 KB