    intel_idle: Add a "Long HLT" C1 state for the VM guest mode · 0fac214b
    Arjan van de Ven authored
    intel_idle will, for the bare metal case, usually have one or more deep
    power states that have the CPUIDLE_FLAG_TLB_FLUSHED flag set. When
    a state with this flag is selected by the cpuidle framework, it will also
    flush the TLBs as part of entering this state. The benefit of doing this is
    that the kernel does not need to wake the CPU out of this deep power state
    just to flush the TLBs, an operation whose latency can be very high due to
    the exit latency of deep power states.
    
    In a VM guest currently, this benefit of avoiding the wakeup does not exist,
    while the problem (long exit latency) is even more severe. Linux will need
    to wake up a vCPU (causing the host to either come out of a deep C state,
    or the VMM to have to deschedule something else to schedule the vCPU) which
    can take a very long time, adding a lot of latency to TLB flush operations
    (including munmap and others).
    
    To solve this, add a "Long HLT" C state to the state table for the VM guest
    case that has the CPUIDLE_FLAG_TLB_FLUSHED flag set.  The result of that is
    that for long idle periods (where the VMM is likely to do things that cause
    large latency) the cpuidle framework will flush the TLBs (and avoid the
    wakeups), while for short/quick idle durations, the existing behavior is
    retained.
    
    Now, there is still only "hlt" available in the guest, but for long idle,
    the host can go to a deeper state (say C6).  It is debatable what to set
    for the exit_latency and break-even point of this "Long HLT" state.  The
    good news is that intel_idle has these values available for the underlying
    CPU (even when mwait is not exposed).  The solution thus is to just use the
    latency and break-even point of the deepest state from the bare metal CPU,
    on the assumption that this is a reasonable estimate of the latency the
    VMM would cause.
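The approach described above can be illustrated with a small standalone sketch. It is not the code from intel_idle.c: the `sketch_` types, the flag value, the state names, and the selection rule are all hypothetical simplifications, and it assumes the bare-metal table is ordered from shallowest to deepest state. It only shows the idea of giving the "Long HLT" state the exit latency and break-even point of the deepest bare-metal state, so that it is chosen only for long idle periods.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CPUIDLE_FLAG_TLB_FLUSHED; not the kernel value. */
#define SKETCH_FLAG_TLB_FLUSHED (1u << 0)

/* Hypothetical, simplified stand-in for a cpuidle state description. */
struct sketch_state {
    const char *name;
    unsigned int flags;
    unsigned int exit_latency_us;     /* time needed to wake up */
    unsigned int target_residency_us; /* break-even point */
};

/*
 * Build a two-entry guest table: a plain HLT state and a "Long HLT" state
 * that copies latency and break-even from the deepest bare-metal state.
 * Assumes bare[] is ordered shallowest to deepest. Returns the entry count.
 */
size_t sketch_build_guest_table(const struct sketch_state *bare, size_t n_bare,
                                struct sketch_state guest[2])
{
    assert(n_bare > 0);
    const struct sketch_state *deepest = &bare[n_bare - 1];

    /* Short idle: existing behavior, no TLB flush on entry (values made up). */
    guest[0] = (struct sketch_state){ "HLT", 0, 1, 1 };
    /* Long idle: flush TLBs on entry so no wakeup is needed for a flush. */
    guest[1] = (struct sketch_state){ "Long HLT", SKETCH_FLAG_TLB_FLUSHED,
                                      deepest->exit_latency_us,
                                      deepest->target_residency_us };
    return 2;
}

/*
 * Simplified selection rule: pick the deepest state whose break-even point
 * fits the predicted idle time and whose exit latency is within the limit.
 * Falls back to state 0 when nothing deeper qualifies.
 */
int sketch_select(const struct sketch_state *table, size_t n,
                  unsigned int predicted_idle_us, unsigned int latency_limit_us)
{
    int best = 0;
    for (size_t i = 1; i < n; i++) {
        if (table[i].target_residency_us <= predicted_idle_us &&
            table[i].exit_latency_us <= latency_limit_us)
            best = (int)i;
    }
    return best;
}
```

With a made-up bare-metal table whose deepest state has a 133 us exit latency and a 600 us break-even point, a predicted idle of 50 us stays in plain HLT, while a predicted idle of 5000 us selects "Long HLT" as long as the latency limit allows it.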

    Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
intel_idle.c 61.4 KB