• Valentin Schneider's avatar
    sched/fair: Fix shift-out-of-bounds in load_balance() · 39a2a6eb
    Valentin Schneider authored
    Syzbot reported a handful of occurrences where an sd->nr_balance_failed can
    grow to much higher values than one would expect.
    
    A successful load_balance() resets it to 0; a failed one increments
    it. Once it gets to sd->cache_nice_tries + 3, this *should* trigger an
    active balance, which will either set it to sd->cache_nice_tries+1 or reset
    it to 0. However, in case the to-be-active-balanced task is not allowed to
    run on env->dst_cpu, then the increment is done without any further
    modification.
    
    This could then be repeated ad nauseam, and would explain the absurdly high
    values reported by syzbot (86, 149). VincentG noted there is value in
    letting sd->cache_nice_tries grow, so the shift itself should be
    fixed. That means preventing:
    
      """
      If the value of the right operand is negative or is greater than or equal
      to the width of the promoted left operand, the behavior is undefined.
      """
    
    Thus we need to cap the shift exponent to
      BITS_PER_TYPE(typeof(lefthand)) - 1.
    
    I had a look around for other similar cases via coccinelle:
    
      @expr@
      position pos;
      expression E1;
      expression E2;
      @@
      (
      E1 >> E2@pos
      |
      E1 >> E2@pos
      )
    
      @cst depends on expr@
      position pos;
      expression expr.E1;
      constant cst;
      @@
      (
      E1 >> cst@pos
      |
      E1 << cst@pos
      )
    
      @script:python depends on !cst@
      pos << expr.pos;
      exp << expr.E2;
      @@
      # Dirty hack to ignore constexpr
      if exp.upper() != exp:
         coccilib.report.print_report(pos[0], "Possible UB shift here")
    
    The only other match in kernel/sched is rq_clock_thermal() which employs
    sched_thermal_decay_shift, and that exponent is already capped to 10, so
    that one is fine.
    
    Fixes: 5a7f5559 ("sched/fair: Relax constraint on task's load during load balance")
    Reported-by: syzbot+d7581744d5fd27c9fbe1@syzkaller.appspotmail.com
    Signed-off-by: default avatarValentin Schneider <valentin.schneider@arm.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    Link: http://lore.kernel.org/r/000000000000ffac1205b9a2112f@google.com
    39a2a6eb
fair.c 298 KB