• Ilias Stamatis's avatar
    KVM: Fix coalesced_mmio_has_room() to avoid premature userspace exit · 92f6d413
    Ilias Stamatis authored
    The following calculation used in coalesced_mmio_has_room() to check
    whether the ring buffer is full is wrong and results in premature exits if
    the start of the valid entries is in the first half of the ring buffer.
    
      avail = (ring->first - last - 1) % KVM_COALESCED_MMIO_MAX;
      if (avail == 0)
    	  /* full */
    
    Because negative values are handled using two's complement, and KVM
    computes the result as an unsigned value, the above will get a false
    positive if "first < last" and the ring is half-full.
    
    The above might have worked as expected in python for example:
      >>> (-86) % 170
      84
    
    However it doesn't work the same way in C.
    
      printf("avail: %d\n", (-86) % 170);
      printf("avail: %u\n", (-86) % 170);
      printf("avail: %u\n", (-86u) % 170u);
    
    Using gcc-11 these print:
    
      avail: -86
      avail: 4294967210
      avail: 0
    
    For illustration purposes, given a 4-bit integer and a ring size of 0xA
    (unsigned), 0xA == 0x1010 == -6, and thus (-6u % 0xA) == 0.
    
    Fix the calculation and allow all but one entries in the buffer to be
    used as originally intended.
    
    Note, KVM's behavior is self-healing to some extent, as KVM will allow the
    entire buffer to be used if ring->first is beyond the halfway point.  In
    other words, in the unlikely scenario that a use case benefits from being
    able to coalesce more than 86 entries at once, KVM will still provide such
    behavior, sometimes.
    
    Note #2, the % operator in C is not the modulo operator but the remainder
    operator. Modulo and remainder operators differ with respect to negative
    values.  But, the relevant values in KVM are all unsigned, so it's a moot
    point in this case anyway.
    
    Note #3, this is almost a pure revert of the buggy commit, plus a
    READ_ONCE() to provide additional safety.  Thue buggy commit justified the
    change with "it paves the way for making this function lockless", but it's
    not at all clear what was intended, nor is there any evidence that the
    buggy code was somehow safer.  (a) the fields in question were already
    accessed locklessly, from the perspective that they could be modified by
    userspace at any time, and (b) the lock guarding the ring itself was
    changed, but never dropped, i.e. whatever lockless scheme (SRCU?) was
    planned never landed.
    
    Fixes: 105f8d40 ("KVM: Calculate available entries in coalesced mmio ring")
    Signed-off-by: default avatarIlias Stamatis <ilstam@amazon.com>
    Reviewed-by: default avatarPaul Durrant <paul@xen.org>
    Link: https://lore.kernel.org/r/20240718193543.624039-2-ilstam@amazon.com
    [sean: rework changelog to clarify behavior, call out weirdness of buggy commit]
    Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
    92f6d413
coalesced_mmio.c 4.76 KB