    clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error · 839a9739
    Joe Korty authored
    The TVAL register is a 32-bit signed value, so only the lower 31 bits are
    available to specify when an interrupt is to occur in the near future.
    Attempting to specify a larger interval with TVAL results in a negative
    time delta, which means the timer fires immediately upon being programmed
    rather than at the expected future time.
    
    The solution is for Linux to declare that TVAL is a 31 bit register rather
    than give its true size of 32 bits.  This prevents Linux from programming
    TVAL with a too-large value.  Note that, prior to 5.16, this little trick
    was the standard way to handle TVAL in Linux, so there is nothing new
    happening here on that front.
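    
    As a rough userspace illustration of why declaring 31 bits helps (this
    is a sketch, not the kernel change itself; timer_mask(), clamp_delta()
    and tval_looks_expired() are hypothetical names made up for this
    example), clamping a requested delta to the declared width keeps bit 31
    clear:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: all-ones mask for a timer declared `bits` wide. */
uint64_t timer_mask(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Hypothetical helper: limit a requested delta to the declared width. */
uint64_t clamp_delta(uint64_t delta, unsigned int bits)
{
	uint64_t max = timer_mask(bits);

	return delta > max ? max : delta;
}

/* A TVAL value with bit 31 set reads as negative, i.e. already expired. */
bool tval_looks_expired(uint32_t tval)
{
	return (tval & UINT32_C(0x80000000)) != 0;
}
```

    With a declared width of 32, a delta of 3000000000 ticks passes through
    unchanged and has bit 31 set; with a declared width of 31 it is clamped
    to 0x7fffffff and stays positive.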
    
    The softlockup detector hides the issue, because it keeps generating
    short timer deadlines that are within the scope of the broken timer.
    
    With the detector disabled, the kernel starts using NO_HZ with much longer
    timer deadlines, which turns into an interrupt flood:
    
     11: 1124855130  949168462  758009394   76417474  104782230   30210281
             310890 1734323687     GICv2  29 Level     arch_timer
    
    And "much longer" isn't that long: it takes less than 43s to underflow
    TVAL at 50MHz (the frequency of the counter on XGene-1).
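    
    The 43s figure follows from dividing the usable positive range (2^31
    ticks) by the counter frequency. A small sketch of that arithmetic,
    where tval_range_seconds() is a hypothetical helper that assumes
    bits < 64:

```c
#include <stdint.h>

/* Seconds until a countdown with `bits` usable bits is exhausted at `hz`. */
double tval_range_seconds(unsigned int bits, uint64_t hz)
{
	return (double)(UINT64_C(1) << bits) / (double)hz;
}
```

    For 31 bits at 50000000 Hz this gives 2147483648 / 50000000, or roughly
    42.9 seconds.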
    
    Some comments on the v1 version of this patch by Marc Zyngier:
    
      XGene implements CVAL (a 64bit comparator) in terms of TVAL (a countdown
      register) instead of the other way around. TVAL being a 32bit register,
      the width of the counter should equally be 32.  However, TVAL is a
      *signed* value, and keeps counting down in the negative range once the
      timer fires.
    
      It means that any TVAL value with bit 31 set will fire immediately,
      as it cannot be distinguished from an already expired timer. Reducing
      the timer range back to a paltry 31 bits papers over the issue.
    
      Another problem cannot be fixed though, which is that the timer interrupt
      *must* be handled within the negative countdown period, or the interrupt
      will be lost (TVAL will roll over to a positive value, indicative of a
      new timer deadline).
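    
    The signed countdown described in these comments can be modelled in a
    few lines. This is only a simplified model under the assumption that
    TVAL behaves as a 32-bit down-counter whose timer condition is "zero or
    negative"; tval_after() and tval_condition_met() are hypothetical names,
    not driver functions:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * TVAL as seen `elapsed` ticks after it was programmed with `programmed`.
 * Unsigned subtraction wraps modulo 2^32, like a 32-bit down-counter.
 */
uint32_t tval_after(uint32_t programmed, uint32_t elapsed)
{
	return programmed - elapsed;
}

/* Timer condition in this model: TVAL is zero or negative (bit 31 set). */
bool tval_condition_met(uint32_t tval)
{
	return tval == 0 || (tval & UINT32_C(0x80000000)) != 0;
}
```

    In this model a timer programmed with 100 ticks is pending from tick 100
    through tick 100 + 2^31. One tick later TVAL reads 0x7fffffff, which is
    indistinguishable from a freshly programmed deadline, so an interrupt
    not handled within that window is lost.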
    
    Fixes: 012f1885 ("clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations")
    Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
    Signed-off-by: Marc Zyngier <maz@kernel.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20221024165422.GA51107@zipoli.concurrent-rt.com
    Link: https://lore.kernel.org/r/20221121145343.896018-1-maz@kernel.org
    
    [maz: revamped the commit message]