    KVM: PPC: Book3S HV: Enable migration of decrementer register · 5855564c
    Paul Mackerras authored
    This adds a register identifier for use with the one_reg interface
    to allow the decrementer expiry time to be read and written by
    userspace.  The decrementer expiry time is in guest timebase units
    and is equal to the sum of the decrementer and the guest timebase.
    (The expiry time is used rather than the decrementer value itself
    because, while the guest vcpu is not running, the expiry time stays
    constant whereas the decrementer value keeps changing.)
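
    The relationship between the two quantities can be sketched as
    follows.  This is only an illustration of the arithmetic described
    above; the helper names are hypothetical and are not taken from the
    kernel sources.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel code): convert between a decrementer
 * value and its expiry time, both in guest timebase units.
 */

/* Expiry time is the sum of the decrementer and the guest timebase. */
static uint64_t dec_expiry_from_dec(uint64_t guest_tb, int64_t dec)
{
    return guest_tb + (uint64_t)dec;
}

/*
 * Decrementer value seen at a given guest timebase for a fixed expiry.
 * The result is negative once the expiry time has passed.
 */
static int64_t dec_from_expiry(uint64_t guest_tb, uint64_t expiry)
{
    return (int64_t)(expiry - guest_tb);
}
```

    Saving the expiry on the source host and recomputing the
    decrementer from it on the destination gives the same result
    however long the vcpu was stopped in between.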
    
    Without this, a guest vcpu migrated to a new host will see its
    decrementer set to some random value.  On POWER8 and earlier, the
    decrementer is 32 bits wide and counts down at 512MHz, so the
    guest vcpu will potentially see no decrementer interrupts for up
    to about 4 seconds, which will lead to a stall.  With POWER9, the
    decrementer is now 56 bits wide, so the stall can be much longer
    (up to 2.23 years) and more noticeable.
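
    The figures above can be reproduced with a short calculation.  The
    sketch assumes that an interrupt is raised when the top bit of the
    decrementer becomes set, so the longest wait from a positive value
    is 2^(bits-1) ticks; that assumption is consistent with the 4 second
    and 2.23 year figures quoted here.  The function name and constant
    are hypothetical.

```c
#include <stdint.h>

/* Decrementer tick rate quoted in the text: 512 MHz. */
#define DEC_HZ 512000000ULL

/*
 * Longest possible wait, in seconds, before a decrementer of the given
 * width raises an interrupt.  Assumes the interrupt fires when the top
 * bit becomes set, i.e. after at most 2^(dec_bits - 1) ticks.
 */
static double max_stall_seconds(unsigned dec_bits)
{
    uint64_t ticks = 1ULL << (dec_bits - 1);
    return (double)ticks / (double)DEC_HZ;
}
```

    With these assumptions, max_stall_seconds(32) is a little over 4
    seconds, and max_stall_seconds(56) divided by the number of seconds
    in a year comes to roughly 2.23.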
    
    To help work around the problem in cases where userspace has not been
    updated to migrate the decrementer expiry time, we now set the
    default decrementer expiry at vcpu creation time to the current time
    rather than the maximum possible value.  This should mean an
    immediate decrementer interrupt when a migrated vcpu starts
    running.  In cases where the decrementer is 32 bits wide and more
    than 4 seconds elapse between the creation of the vcpu and when it
    first runs, the decrementer would have wrapped around to positive
    values and there may still be a stall - but this is no worse than
    the current situation.  In the large-decrementer case, we are sure
    to get an immediate decrementer interrupt (assuming the time from
    vcpu creation to first run is less than 2.23 years) and we thus
    avoid a very long stall.
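
    The 32-bit wrap-around case can be modelled as below.  This is a
    simplified sketch, not the kernel implementation: it assumes the
    default expiry equals the timebase at vcpu creation and that the
    32-bit decrementer holds the low 32 bits of (expiry - timebase),
    interpreted as a signed value.  All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the value a 32-bit decrementer would hold when the vcpu
 * first runs, if its expiry was set to the timebase at creation.
 * Assumption: the register keeps the low 32 bits of (expiry - tb).
 */
static int32_t dec32_at_first_run(uint64_t created_tb, uint64_t first_run_tb)
{
    uint64_t expiry = created_tb;   /* default expiry = creation time */
    return (int32_t)(uint32_t)(expiry - first_run_tb);
}

/* A negative decrementer means an interrupt is delivered immediately. */
static bool interrupt_pending(int32_t dec)
{
    return dec < 0;
}
```

    At 512MHz, a vcpu that first runs 1 second after creation sees a
    negative decrementer and takes the interrupt at once.  After 5
    seconds the truncated value has wrapped to a positive number, so
    the interrupt is delayed, which is the residual stall described
    above.
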
    Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
api.txt 143 KB