• Thomas Gleixner's avatar
    x86/tsc: Force TSC_ADJUST register to value >= zero · 5bae1562
    Thomas Gleixner authored
    Roland reported that his DELL T5810 sports a value add BIOS which
    completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
    random negative TSC_ADJUST values, different on all CPUs. That renders the
    TSC useless because the sycnchronization check fails.
    
    Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
    the TSCs he needs to disable the TSC deadline timer, otherwise the machine
    just stops booting.
    
    Deeper investigation unearthed that the TSC deadline timer is sensitive to
    the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
    interrupt storm caused by the TSC deadline timer.
    
    This does not make any sense and it's hard to imagine what kind of hardware
    wreckage is behind that misfeature, but it's reliably reproducible on other
    systems which have TSC_ADJUST and TSC deadline timer.
    
    While it would be understandable that a big enough negative value which
    moves the resulting TSC readout into the negative space could have the
    described effect, this happens even with a adjust value of -1, which keeps
    the TSC readout definitely in the positive space. The compare register for
    the TSC deadline timer is set to a positive value larger than the TSC, but
    despite not having reached the deadline the interrupt is raised
    immediately. If this happens on the boot CPU, then the machine dies
    silently because this setup happens before the NMI watchdog is armed.
    
    Further experiments showed that any other adjustment of TSC_ADJUST works as
    expected as long as it stays in the positive range. The direction of the
    adjustment has no influence either. See the lkml link for further analysis.
    
    Yet another proof for the theory that timers are designed by janitors and
    the underlying (obviously undocumented) mechanisms which allow BIOSes to
    wreckage them are considered a feature. Well done Intel - NOT!
    
    To address this wreckage add the following sanity measures:
    
    - If the TSC_ADJUST value on the boot cpu is not 0, set it to 0
    
    - If the TSC_ADJUST value on any cpu is negative, set it to 0
    
    - Prevent the cross package synchronization mechanism from setting negative
      TSC_ADJUST values.
    Reported-and-tested-by: default avatarRoland Scheidegger <rscheidegger_lists@hispeed.ch>
    Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
    Cc: Kevin Stanton <kevin.b.stanton@intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Allen Hung <allen_hung@dell.com>
    Cc: Borislav Petkov <bp@alien8.de>
    Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.deSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    5bae1562
tsc.c 36.1 KB