    timers: Fix get_next_timer_interrupt() with no timers pending · aebacb7f
    Nicolas Saenz Julienne authored
    31cd0e11 ("timers: Recalculate next timer interrupt only when
    necessary") subtly altered get_next_timer_interrupt()'s behaviour. The
    function no longer consistently returns KTIME_MAX with no timers
    pending.
    
    In order to decide if there are any timers pending we check whether the
    next expiry will happen NEXT_TIMER_MAX_DELTA jiffies from now.
    Unfortunately, the next expiry time and the timer base clock are no
    longer updated in unison. The former changes upon certain timer
    operations (enqueue, expire, detach), whereas the latter keeps track of
    jiffies as they move forward, which ultimately breaks the logic above.
    
    A simplified example:
    
    - Upon entering get_next_timer_interrupt() with:
    
    	jiffies = 1
    	base->clk = 0;
    	base->next_expiry = NEXT_TIMER_MAX_DELTA;
    
      'base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA', the function
      returns KTIME_MAX.
    
    - 'base->clk' is updated to the jiffies value.
    
    - The next time we enter get_next_timer_interrupt(), taking into account
      no timer operations happened:
    
    	base->clk = 1;
    	base->next_expiry = NEXT_TIMER_MAX_DELTA;
    
      'base->next_expiry != base->clk + NEXT_TIMER_MAX_DELTA', the function
      returns a valid expire time, which is incorrect.
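
    The sequence above can be reproduced with a small user-space model.
    This is only a sketch: 'struct model_base', the 'model_*' helpers and
    the value chosen for NEXT_TIMER_MAX_DELTA are hypothetical stand-ins,
    not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative stand-in; the kernel's actual value differs. */
#define NEXT_TIMER_MAX_DELTA 1000UL

/* Hypothetical, stripped-down model of 'struct timer_base'. */
struct model_base {
	unsigned long clk;		/* follows jiffies */
	unsigned long next_expiry;	/* changes only on timer operations */
};

/*
 * Old-style check: "no timers pending" is inferred from the distance
 * between next_expiry and clk.
 */
static bool model_idle_by_delta(const struct model_base *base)
{
	return base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA;
}

/* The base clock catches up with jiffies; next_expiry is left untouched. */
static void model_forward_clk(struct model_base *base, unsigned long jiffies)
{
	if ((long)(jiffies - base->clk) > 0)
		base->clk = jiffies;
}
```

    Starting from clk = 0 and next_expiry = NEXT_TIMER_MAX_DELTA, the
    check reports "idle". After model_forward_clk(&base, 1) it no longer
    does, although no timer was ever queued.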
    
    This ultimately might unnecessarily rearm sched's timer on nohz_full
    setups, and add latency to the system[1].
    
    So, introduce 'base->timers_pending'[2], update it every time
    'base->next_expiry' changes, and use it in get_next_timer_interrupt().
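
    A rough user-space illustration of the idea, assuming a simplified
    base structure: every name prefixed with 'sketch_' or 'SKETCH_' is
    hypothetical and the helpers do not mirror the actual kernel
    functions. It only shows why an explicit flag stays correct when the
    clock advances on its own.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_DELTA 1000UL		/* illustrative, not the kernel's value */
#define SKETCH_KTIME_MAX INT64_MAX	/* stands in for KTIME_MAX */

/* Hypothetical, stripped-down base with the new flag. */
struct sketch_base {
	unsigned long clk;
	unsigned long next_expiry;
	bool timers_pending;
};

/* Enqueue: next_expiry may change, so the flag is updated alongside it. */
static void sketch_enqueue(struct sketch_base *base, unsigned long expires)
{
	if (!base->timers_pending || (long)(expires - base->next_expiry) < 0) {
		base->next_expiry = expires;
		base->timers_pending = true;
	}
}

/* No timer left: next_expiry is parked far away and the flag is cleared. */
static void sketch_mark_idle(struct sketch_base *base)
{
	base->next_expiry = base->clk + SKETCH_MAX_DELTA;
	base->timers_pending = false;
}

/*
 * Returns the next expiry (in ticks, for simplicity), or SKETCH_KTIME_MAX
 * when nothing is pending. The decision no longer depends on clk.
 */
static int64_t sketch_next_event(const struct sketch_base *base)
{
	if (!base->timers_pending)
		return SKETCH_KTIME_MAX;
	return (int64_t)base->next_expiry;
}
```

    With this model, advancing 'clk' after sketch_mark_idle() does not
    change the answer: sketch_next_event() keeps returning
    SKETCH_KTIME_MAX until a timer is actually enqueued.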
    
    [1] See tick_nohz_stop_tick().
    [2] A quick pahole check on x86_64 and arm64 shows it doesn't make
        'struct timer_base' any bigger.
    
    Fixes: 31cd0e11 ("timers: Recalculate next timer interrupt only when necessary")
    Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>