    clocksource: Suspend the watchdog temporarily when high read latency detected · b7082cdf
    Feng Tang authored
    
    
    Bugs have been reported on 8-socket x86 machines in which the TSC was
    wrongly disabled while the system was under heavy workload.
    
     [ 818.380354] clocksource: timekeeping watchdog on CPU336: hpet wd-wd read-back delay of 1203520ns
     [ 818.436160] clocksource: wd-tsc-wd read-back delay of 181880ns, clock-skew test skipped!
     [ 819.402962] clocksource: timekeeping watchdog on CPU338: hpet wd-wd read-back delay of 324000ns
     [ 819.448036] clocksource: wd-tsc-wd read-back delay of 337240ns, clock-skew test skipped!
     [ 819.880863] clocksource: timekeeping watchdog on CPU339: hpet read-back delay of 150280ns, attempt 3, marking unstable
     [ 819.936243] tsc: Marking TSC unstable due to clocksource watchdog
     [ 820.068173] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
     [ 820.092382] sched_clock: Marking unstable (818769414384, 1195404998)
     [ 820.643627] clocksource: Checking clocksource tsc synchronization from CPU 267 to CPUs 0,4,25,70,126,430,557,564.
     [ 821.067990] clocksource: Switched to clocksource hpet
    
    This can be reproduced by running memory intensive 'stream' tests,
    or some of the stress-ng subcases such as 'ioport'.
    
    The reason for these issues is that when the system is under heavy load,
    the read latency of the clocksources can be very high.  Even lightweight
    TSC reads can show high latencies, and latencies are much worse for
    external clocksources such as HPET or the ACPI PM timer.  These latencies
    can result in false-positive clocksource-unstable determinations.
    
    These issues were initially reported by a customer running on a production
    system, and this problem was reproduced on several generations of Xeon
    servers, especially when running the stress-ng test.  These Xeon servers
    were not production systems, but they did have the latest steppings
    and firmware.
    
    Given that the clocksource watchdog is a continual diagnostic check that
    runs twice a second, there is no need to rush it when the system is under
    heavy load.  Therefore, when high clocksource read latencies are
    detected, suspend the watchdog timer for 5 minutes.
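
    The policy described above can be sketched in userspace C.  This is only
    an illustration, not the code in clocksource.c: the names
    check_read_latency() and next_check_delay_ms(), the struct wd_sample
    type, the 100 us cutoff and the retry count are all assumptions made for
    the example.  Only the twice-a-second interval and the 5-minute
    suspension come from the text.

```c
#include <stdint.h>

/* Hypothetical values for illustration; the kernel's real thresholds differ. */
#define MAX_READ_DELAY_NS   100000ULL                 /* assumed "high latency" cutoff */
#define NORMAL_INTERVAL_MS  500ULL                    /* watchdog runs twice a second */
#define SUSPEND_INTERVAL_MS (5ULL * 60ULL * 1000ULL)  /* 5 minutes */
#define MAX_RETRIES         3                         /* assumed number of attempts */

enum wd_result { WD_OK, WD_SKIP };

/* One attempt: two watchdog reads bracketing a clocksource read. */
struct wd_sample {
	uint64_t wd_start_ns;
	uint64_t wd_end_ns;
};

/*
 * Return WD_OK as soon as one attempt has an acceptable read-back delay.
 * If every attempt (up to MAX_RETRIES) is too slow, return WD_SKIP so the
 * caller backs off instead of declaring the clocksource unstable.
 */
static enum wd_result check_read_latency(const struct wd_sample *samples, int n)
{
	for (int i = 0; i < n && i < MAX_RETRIES; i++) {
		uint64_t delay = samples[i].wd_end_ns - samples[i].wd_start_ns;

		if (delay <= MAX_READ_DELAY_NS)
			return WD_OK;
	}
	return WD_SKIP;
}

/* Pick the delay until the next watchdog run from the latency verdict. */
static uint64_t next_check_delay_ms(enum wd_result result)
{
	return result == WD_SKIP ? SUSPEND_INTERVAL_MS : NORMAL_INTERVAL_MS;
}
```

    Fed the three hpet read-back delays from the log above (1203520ns,
    324000ns and 150280ns), every attempt exceeds the assumed cutoff, so the
    sketch defers the next check by 5 minutes rather than marking the TSC
    unstable.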

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Acked-by: Waiman Long <longman@redhat.com>
    Cc: John Stultz <jstultz@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Stephen Boyd <sboyd@kernel.org>
    Cc: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>