• Thomas Gleixner's avatar
    clockevents: prevent stale tick update on offline cpu · 5e41d0d6
    Thomas Gleixner authored
    Taking a cpu offline removes the cpu from the online mask before the
    CPU_DEAD notification is done. The clock events layer does the cleanup
    of the dead CPU from the CPU_DEAD notifier chain. tick_do_timer_cpu is
    used to avoid xtime lock contention by assigning the task of jiffies
    xtime updates to one CPU. If a CPU is taken offline, then this
    assignment becomes stale. This went unnoticed because most of the time
    the offline CPU went dead before the online CPU reached __cpu_die(),
    where the CPU_DEAD state is checked. In the case that the offline CPU did
    not reach the DEAD state before we reach __cpu_die(), the code in there
    goes to sleep for 100ms. Due to the stale time update assignment, the
    system is stuck forever.
    
    Take the assignment away when a cpu is not longer in the cpu_online_mask.
    We do this in the last call to tick_nohz_stop_sched_tick() when the offline
    CPU is on the way to the final play_dead() idle entry.
    Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    5e41d0d6
tick-sched.c 16.3 KB