    intel_idle: Add support for using intel_idle in a VM guest using just hlt · 2f3d08f0
    Arjan van de Ven authored
    In a typical VM guest, the mwait instruction is not available, leaving
    only the 'hlt' instruction (which causes a VMEXIT to the host).
    
    So for this common case, intel_idle will detect the lack of mwait and
    fail to initialize, after which another idle method steps in and
    always uses hlt.
    
    Other (non-common) cases exist; the table below shows the before/after
    for these:
    
    +------------+--------------------------+-------------------------+
    | Hypervisor | Idle method before patch | Idle method after patch |
    | exposes    |                          |                         |
    +============+==========================+=========================+
    | nothing    | default_idle fallback    | intel_idle VM table     |
    | (common)   | (straight "hlt")         |                         |
    +------------+--------------------------+-------------------------+
    | mwait      | intel_idle mwait table   | intel_idle mwait table  |
    +------------+--------------------------+-------------------------+
    | ACPI       | ACPI C1 state ("hlt")    | intel_idle VM table     |
    +------------+--------------------------+-------------------------+
    
    This is only applicable to CPUs known by intel_idle. For the bare metal
    case, unknown CPU models will use the ACPI tables (when available) to
    get estimates for latency and break-even point for longer idle states.
    In guests, the common case is that ACPI tables are not available, but
    even when they are available, they can't and don't provide the latency
    information for the longer (mwait based) states. For this scenario
    (unknown CPU model), the default_idle mode (no ACPI) or ACPI C1 (ACPI
    available) will be used.
    
    By providing the capability to do this with the intel_idle driver, we
    can do better than the fallback or ACPI table methods. While this
    current change only gets us to the existing behavior, later patches in
    this series will add new capabilities such as optimized TLB flushing.
    
    In order to do this, a simplified version of the initialization
    function for VM guests is created, and this will be called if the CPU
    is recognized, but mwait is not supported, and we're in a VM guest.
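
    The decision described above can be sketched roughly as follows. This
    is a simplified illustration, not the code added by this patch: the
    struct, enum and function names are hypothetical, and every case that
    intel_idle's tables do not cover (including the bare metal ACPI path
    for unknown CPU models) is lumped into a single result.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the kernel's actual symbols. */
enum idle_choice {
    IDLE_INTEL_MWAIT_TABLE, /* regular mwait-based intel_idle table */
    IDLE_INTEL_VM_TABLE,    /* simplified hlt-only table for VM guests */
    IDLE_NOT_HANDLED_HERE,  /* e.g. default_idle or ACPI C1 takes over */
};

struct cpu_facts {
    bool model_known; /* CPU model is one intel_idle recognizes */
    bool has_mwait;   /* mwait is exposed to this (virtual) CPU */
    bool is_vm_guest; /* running under a hypervisor */
};

static enum idle_choice choose_idle(const struct cpu_facts *c)
{
    /* Unknown models are outside the scope of the VM table. */
    if (!c->model_known)
        return IDLE_NOT_HANDLED_HERE;

    /* mwait available: behavior is unchanged by the patch. */
    if (c->has_mwait)
        return IDLE_INTEL_MWAIT_TABLE;

    /* Known CPU, no mwait, VM guest: the new simplified path. */
    if (c->is_vm_guest)
        return IDLE_INTEL_VM_TABLE;

    return IDLE_NOT_HANDLED_HERE;
}
```

    In this sketch, a recognized CPU in a guest without mwait yields
    IDLE_INTEL_VM_TABLE, which corresponds to the "nothing" and "ACPI"
    rows of the table above.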
    
    One thing to note is that the max latency (and break-even point) of this
    C1 state is higher than that of the typical bare metal C1 state, because
    hlt causes a vmexit, and the cost of vmexit + hypervisor overhead +
    vmenter is typically on the order of up to 5 microseconds, even if the
    hypervisor does not actually go into a hardware power saving state.
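
    To illustrate why the higher latency matters, here is a minimal sketch
    of how an idle governor might weigh a state's exit latency and
    break-even point against the predicted idle time. The struct and
    function names are hypothetical, and all numbers are illustrative
    assumptions; only the 5 microsecond exit latency mirrors the upper
    estimate given above.

```c
#include <stdbool.h>

/* Hypothetical descriptor; loosely modeled on what an idle state table
 * needs to record, not the kernel's actual structure. */
struct idle_state_sketch {
    const char *name;
    unsigned int exit_latency_us;     /* worst-case time to resume */
    unsigned int target_residency_us; /* break-even idle duration */
};

/* Assumed values for illustration only. */
static const struct idle_state_sketch bare_metal_c1 = { "C1", 1, 1 };
static const struct idle_state_sketch vm_guest_c1 = { "C1 (hlt)", 5, 10 };

/* A state is worth entering only if the expected idle period reaches
 * its break-even point and its wakeup cost fits the latency limit. */
static bool state_is_worthwhile(const struct idle_state_sketch *s,
                                unsigned int predicted_idle_us,
                                unsigned int latency_limit_us)
{
    return s->exit_latency_us <= latency_limit_us &&
           s->target_residency_us <= predicted_idle_us;
}
```

    With these assumed numbers, a predicted idle period of 4 microseconds
    would justify the bare metal C1 state but not the guest's hlt-based
    C1, so a governor would likely prefer a shallower option such as
    polling for very short idle periods in a guest.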

    Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
    [ rjw: Dropped redundant checks from should_verify_mwait() ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
intel_idle.c 59.7 KB