    KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts · 577a5119
    Paul Mackerras authored
    commit 959c5d51 upstream.
    
    Escalation interrupts are interrupts sent to the host by the XIVE
    hardware when it has an interrupt to deliver to a guest VCPU but that
    VCPU is not running anywhere in the system.  Hence we disable the
    escalation interrupt for the VCPU being run when we enter the guest
    and re-enable it when the guest does an H_CEDE hypercall indicating
    it is idle.
    
    It is possible that an escalation interrupt gets generated just as we
    are entering the guest.  In that case the escalation interrupt may be
    using a queue entry in one of the interrupt queues, and that queue
    entry may not have been processed when the guest exits with an H_CEDE.
    The existing entry code detects this situation and does not clear the
    vcpu->arch.xive_esc_on flag as an indication that there is a pending
    queue entry (if the queue entry gets processed, xive_esc_irq() will
    clear the flag).  There is a comment in the code saying that if the
    flag is still set on H_CEDE, we have to abort the cede rather than
    re-enabling the escalation interrupt, lest we end up with two
    occurrences of the escalation interrupt in the interrupt queue.
    
    However, the exit code doesn't do that; it aborts the cede in the sense
    that vcpu->arch.ceded gets cleared, but it still enables the escalation
    interrupt by setting the source's PQ bits to 00.  Instead we need to
    set the PQ bits to 10, indicating that an interrupt has been triggered.
    We also need to avoid setting vcpu->arch.xive_esc_on in this case
    (i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
    xive_esc_irq() will run at some point and clear it, and if we race with
    that we may end up with an incorrect result (i.e. xive_esc_on set when
    the escalation interrupt has just been handled).
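
    The intended behaviour on H_CEDE can be sketched in C as follows.  This
    is only an illustrative model, not the actual change, which is made in
    assembly: struct vcpu_model, enum esb_pq and cede_reenable_escalation()
    are hypothetical names, and the numeric PQ encodings are assumptions
    chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the escalation source's PQ bits. */
enum esb_pq {
	ESB_PQ_00 = 0x0,	/* escalation interrupt enabled */
	ESB_PQ_10 = 0x2,	/* an interrupt has been triggered */
};

/* Hypothetical stand-in for the vcpu->arch fields discussed above. */
struct vcpu_model {
	bool ceded;		/* models vcpu->arch.ceded */
	bool xive_esc_on;	/* models vcpu->arch.xive_esc_on */
	enum esb_pq esc_pq;	/* models the PQ bits of the escalation source */
};

/*
 * Model of the H_CEDE exit path with the fix applied.
 * The caller is assumed to have set v->ceded already.
 */
static void cede_reenable_escalation(struct vcpu_model *v)
{
	assert(v != NULL);

	if (v->xive_esc_on) {
		/*
		 * A queue entry may still be pending: abort the cede and
		 * set PQ to 10 so a second queue entry cannot be created.
		 * Leave xive_esc_on alone; xive_esc_irq() will clear it,
		 * and writing it here could race with that.
		 */
		v->ceded = false;
		v->esc_pq = ESB_PQ_10;
		return;
	}

	/* No pending entry: re-enable the escalation and record that. */
	v->esc_pq = ESB_PQ_00;
	v->xive_esc_on = true;
}
```

    In this model the flag is written only on the path where no queue entry
    can be outstanding, so xive_esc_irq() remains the only writer while an
    entry is pending.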
    
    It is extremely unlikely that having two queue entries would cause
    observable problems; theoretically it could cause queue overflow, but
    the CPU would have to have thousands of interrupts targeted to it for
    that to be possible.  However, this fix will also make it possible to
    determine accurately whether there is an unhandled escalation
    interrupt in the queue, which will be needed by the following patch.
    
    Fixes: 9b9b13a6 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded")
    Cc: stable@vger.kernel.org # v4.16+
    Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20190813100349.GD9567@blackberry
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
book3s_hv_rmhandlers.S 85.3 KB