Commit e6cf623b authored by Rafael J. Wysocki's avatar Rafael J. Wysocki

Merge branch 'intel_idle+acpi'

Merge changes updating the ACPI processor driver in order to export
acpi_processor_evaluate_cst() to the code outside of it and adding
ACPI support to the intel_idle driver based on that.

* intel_idle+acpi:
  Documentation: admin-guide: PM: Add intel_idle document
  intel_idle: Use ACPI _CST on server systems
  intel_idle: Add module parameter to prevent ACPI _CST from being used
  intel_idle: Allow ACPI _CST to be used for selected known processors
  cpuidle: Allow idle states to be disabled by default
  intel_idle: Use ACPI _CST for processor models without C-state tables
  intel_idle: Refactor intel_idle_cpuidle_driver_init()
  ACPI: processor: Export acpi_processor_evaluate_cst()
  ACPI: processor: Make ACPI_PROCESSOR_CSTATE depend on ACPI_PROCESSOR
  ACPI: processor: Clean up acpi_processor_evaluate_cst()
  ACPI: processor: Introduce acpi_processor_evaluate_cst()
  ACPI: processor: Export function to claim _CST control
parents cefb9409 a3299182
......@@ -196,6 +196,12 @@ Description:
does not reflect it. Likewise, if one enables a deep state but a
lighter state still is disabled, then this has no effect.
What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/default_status
Date: December 2019
KernelVersion: v5.6
Contact: Linux power management list <linux-pm@vger.kernel.org>
Description:
(RO) The default status of this state, "enabled" or "disabled".
What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/residency
Date: March 2014
......
......@@ -506,6 +506,9 @@ object corresponding to it, as follows:
``disable``
Whether or not this idle state is disabled.
``default_status``
The default status of this state, "enabled" or "disabled".
``latency``
Exit latency of the idle state in microseconds.
......
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
==============================================
``intel_idle`` CPU Idle Time Management Driver
==============================================
:Copyright: |copy| 2020 Intel Corporation
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
General Information
===================
``intel_idle`` is a part of the
:doc:`CPU idle time management subsystem <cpuidle>` in the Linux kernel
(``CPUIdle``). It is the default CPU idle time management driver for the
Nehalem and later generations of Intel processors, but the level of support for
a particular processor model in it depends on whether or not it recognizes that
processor model and may also depend on information coming from the platform
firmware. [To understand ``intel_idle`` it is necessary to know how ``CPUIdle``
works in general, so this is the time to get familiar with :doc:`cpuidle` if you
have not done that yet.]
``intel_idle`` uses the ``MWAIT`` instruction to inform the processor that the
logical CPU executing it is idle and so it may be possible to put some of the
processor's functional blocks into low-power states. That instruction takes two
arguments (passed in the ``EAX`` and ``ECX`` registers of the target CPU), the
first of which, referred to as a *hint*, can be used by the processor to
determine what can be done (for details refer to Intel Software Developer’s
Manual [1]_). Accordingly, ``intel_idle`` refuses to work with processors in
which the support for the ``MWAIT`` instruction has been disabled (for example,
via the platform firmware configuration menu) or which do not support that
instruction at all.
``intel_idle`` is not modular, so it cannot be unloaded, which means that the
only way to pass early-configuration-time parameters to it is via the kernel
command line.
.. _intel-idle-enumeration-of-states:
Enumeration of Idle States
==========================
Each ``MWAIT`` hint value is interpreted by the processor as a license to
reconfigure itself in a certain way in order to save energy. The processor
configurations (with reduced power draw) resulting from that are referred to
as C-states (in the ACPI terminology) or idle states. The list of meaningful
``MWAIT`` hint values and idle states (i.e. low-power configurations of the
processor) corresponding to them depends on the processor model and it may also
depend on the configuration of the platform.
In order to create a list of available idle states required by the ``CPUIdle``
subsystem (see :ref:`idle-states-representation` in :doc:`cpuidle`),
``intel_idle`` can use two sources of information: static tables of idle states
for different processor models included in the driver itself and the ACPI tables
of the system. The former are always used if the processor model at hand is
recognized by ``intel_idle`` and the latter are used if that is required for
the given processor model (which is the case for all server processor models
recognized by ``intel_idle``) or if the processor model is not recognized.
If the ACPI tables are going to be used for building the list of available idle
states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
objects corresponding to the CPUs in the system (refer to the ACPI specification
[2]_ for the description of ``_CST`` and its output package). Because the
``CPUIdle`` subsystem expects that the list of idle states supplied by the
driver will be suitable for all of the CPUs handled by it and ``intel_idle`` is
registered as the ``CPUIdle`` driver for all of the CPUs in the system, the
driver looks for the first ``_CST`` object returning at least one valid idle
state description and such that all of the idle states included in its return
package are of the FFH (Functional Fixed Hardware) type, which means that the
``MWAIT`` instruction is expected to be used to tell the processor that it can
enter one of them. The return package of that ``_CST`` is then assumed to be
applicable to all of the other CPUs in the system and the idle state
descriptions extracted from it are stored in a preliminary list of idle states
coming from the ACPI tables. [This step is skipped if ``intel_idle`` is
configured to ignore the ACPI tables; see `below <intel-idle-parameters_>`_.]
Next, the first (index 0) entry in the list of available idle states is
initialized to represent a "polling idle state" (a pseudo-idle state in which
the target CPU continuously fetches and executes instructions), and the
subsequent (real) idle state entries are populated as follows.
If the processor model at hand is recognized by ``intel_idle``, there is a
(static) table of idle state descriptions for it in the driver. In that case,
the "internal" table is the primary source of information on idle states and the
information from it is copied to the final list of available idle states. If
using the ACPI tables for the enumeration of idle states is not required
(depending on the processor model), all of the listed idle state are enabled by
default (so all of them will be taken into consideration by ``CPUIdle``
governors during CPU idle state selection). Otherwise, some of the listed idle
states may not be enabled by default if there are no matching entries in the
preliminary list of idle states coming from the ACPI tables. In that case user
space still can enable them later (on a per-CPU basis) with the help of
the ``disable`` idle state attribute in ``sysfs`` (see
:ref:`idle-states-representation` in :doc:`cpuidle`). This basically means that
the idle states "known" to the driver may not be enabled by default if they have
not been exposed by the platform firmware (through the ACPI tables).
If the given processor model is not recognized by ``intel_idle``, but it
supports ``MWAIT``, the preliminary list of idle states coming from the ACPI
tables is used for building the final list that will be supplied to the
``CPUIdle`` core during driver registration. For each idle state in that list,
the description, ``MWAIT`` hint and exit latency are copied to the corresponding
entry in the final list of idle states. The name of the idle state represented
by it (to be returned by the ``name`` idle state attribute in ``sysfs``) is
"CX_ACPI", where X is the index of that idle state in the final list (note that
the minimum value of X is 1, because 0 is reserved for the "polling" state), and
its target residency is based on the exit latency value. Specifically, for
C1-type idle states the exit latency value is also used as the target residency
(for compatibility with the majority of the "internal" tables of idle states for
various processor models recognized by ``intel_idle``) and for the other idle
state types (C2 and C3) the target residency value is 3 times the exit latency
(again, that is because it reflects the target residency to exit latency ratio
in the majority of cases for the processor models recognized by ``intel_idle``).
All of the idle states in the final list are enabled by default in this case.
.. _intel-idle-initialization:
Initialization
==============
The initialization of ``intel_idle`` starts with checking if the kernel command
line options forbid the use of the ``MWAIT`` instruction. If that is the case,
an error code is returned right away.
The next step is to check whether or not the processor model is known to the
driver, which determines the idle states enumeration method (see
`above <intel-idle-enumeration-of-states_>`_), and whether or not the processor
supports ``MWAIT`` (the initialization fails if that is not the case). Then,
the ``MWAIT`` support in the processor is enumerated through ``CPUID`` and the
driver initialization fails if the level of support is not as expected (for
example, if the total number of ``MWAIT`` substates returned is 0).
Next, if the driver is not configured to ignore the ACPI tables (see
`below <intel-idle-parameters_>`_), the idle states information provided by the
platform firmware is extracted from them.
Then, ``CPUIdle`` device objects are allocated for all CPUs and the list of
available idle states is created as explained
`above <intel-idle-enumeration-of-states_>`_.
Finally, ``intel_idle`` is registered with the help of cpuidle_register_driver()
as the ``CPUIdle`` driver for all CPUs in the system and a CPU online callback
for configuring individual CPUs is registered via cpuhp_setup_state(), which
(among other things) causes the callback routine to be invoked for all of the
CPUs present in the system at that time (each CPU executes its own instance of
the callback routine). That routine registers a ``CPUIdle`` device for the CPU
running it (which enables the ``CPUIdle`` subsystem to operate that CPU) and
optionally performs some CPU-specific initialization actions that may be
required for the given processor model.
.. _intel-idle-parameters:
Kernel Command Line Options and Module Parameters
=================================================
The *x86* architecture support code recognizes three kernel command line
options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
and ``idle=nomwait``. If any of them is present in the kernel command line, the
``MWAIT`` instruction is not allowed to be used, so the initialization of
``intel_idle`` will fail.
Apart from that there are two module parameters recognized by ``intel_idle``
itself that can be set via the kernel command line (they cannot be updated via
sysfs, so that is the only way to change their values).
The ``max_cstate`` parameter value is the maximum idle state index in the list
of idle states supplied to the ``CPUIdle`` core during the registration of the
driver. It is also the maximum number of regular (non-polling) idle states that
can be used by ``intel_idle``, so the enumeration of idle states is terminated
after finding that number of usable idle states (the other idle states that
potentially might have been used if ``max_cstate`` had been greater are not
taken into consideration at all). Setting ``max_cstate`` can prevent
``intel_idle`` from exposing idle states that are regarded as "too deep" for
some reason to the ``CPUIdle`` core, but it does so by making them effectively
invisible until the system is shut down and started again which may not always
be desirable. In practice, it is only really necessary to do that if the idle
states in question cannot be enabled during system startup, because in the
working state of the system the CPU power management quality of service (PM
QoS) feature can be used to prevent ``CPUIdle`` from touching those idle states
even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
kernel has been configured with ACPI support), can be set to make the driver
ignore the system's ACPI tables entirely (it is unset by default).
.. _intel-idle-core-and-package-idle-states:
Core and Package Levels of Idle States
======================================
Typically, in a processor supporting the ``MWAIT`` instruction there are (at
least) two levels of idle states (or C-states). One level, referred to as
"core C-states", covers individual cores in the processor, whereas the other
level, referred to as "package C-states", covers the entire processor package
and it may also involve other components of the system (GPUs, memory
controllers, I/O hubs etc.).
Some of the ``MWAIT`` hint values allow the processor to use core C-states only
(most importantly, that is the case for the ``MWAIT`` hint value corresponding
to the ``C1`` idle state), but the majority of them give it a license to put
the target core (i.e. the core containing the logical CPU executing ``MWAIT``
with the given hint value) into a specific core C-state and then (if possible)
to enter a specific package C-state at the deeper level. For example, the
``MWAIT`` hint value representing the ``C3`` idle state allows the processor to
put the target core into the low-power state referred to as "core ``C3``" (or
``CC3``), which happens if all of the logical CPUs (SMT siblings) in that core
have executed ``MWAIT`` with the ``C3`` hint value (or with a hint value
representing a deeper idle state), and in addition to that (in the majority of
cases) it gives the processor a license to put the entire package (possibly
including some non-CPU components such as a GPU or a memory controller) into the
low-power state referred to as "package ``C3``" (or ``PC3``), which happens if
all of the cores have gone into the ``CC3`` state and (possibly) some additional
conditions are satisfied (for instance, if the GPU is covered by ``PC3``, it may
be required to be in a certain GPU-specific low-power state for ``PC3`` to be
reachable).
As a rule, there is no simple way to make the processor use core C-states only
if the conditions for entering the corresponding package C-states are met, so
the logical CPU executing ``MWAIT`` with a hint value that is not core-level
only (like for ``C1``) must always assume that this may cause the processor to
enter a package C-state. [That is why the exit latency and target residency
values corresponding to the majority of ``MWAIT`` hint values in the "internal"
tables of idle states in ``intel_idle`` reflect the properties of package
C-states.] If using package C-states is not desirable at all, either
:ref:`PM QoS <cpu-pm-qos>` or the ``max_cstate`` module parameter of
``intel_idle`` described `above <intel-idle-parameters_>`_ must be used to
restrict the range of permissible idle states to the ones with core-level only
``MWAIT`` hint values (like ``C1``).
References
==========
.. [1] *Intel® 64 and IA-32 Architectures Software Developers Manual Volume 2B*,
https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-2b-manual.html
.. [2] *Advanced Configuration and Power Interface (ACPI) Specification*,
https://uefi.org/specifications
......@@ -8,6 +8,7 @@ Working-State Power Management
:maxdepth: 2
cpuidle
intel_idle
cpufreq
intel_pstate
intel_epb
......@@ -241,6 +241,7 @@ config ACPI_CPU_FREQ_PSS
config ACPI_PROCESSOR_CSTATE
def_bool y
depends on ACPI_PROCESSOR
depends on IA64 || X86
config ACPI_PROCESSOR_IDLE
......
......@@ -705,3 +705,185 @@ void __init acpi_processor_init(void)
acpi_scan_add_handler_with_hotplug(&processor_handler, "processor");
acpi_scan_add_handler(&processor_container_handler);
}
#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
/**
* acpi_processor_claim_cst_control - Request _CST control from the platform.
*/
bool acpi_processor_claim_cst_control(void)
{
static bool cst_control_claimed;
acpi_status status;
if (!acpi_gbl_FADT.cst_control || cst_control_claimed)
return true;
status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
acpi_gbl_FADT.cst_control, 8);
if (ACPI_FAILURE(status)) {
pr_warn("ACPI: Failed to claim processor _CST control\n");
return false;
}
cst_control_claimed = true;
return true;
}
EXPORT_SYMBOL_GPL(acpi_processor_claim_cst_control);
/**
* acpi_processor_evaluate_cst - Evaluate the processor _CST control method.
* @handle: ACPI handle of the processor object containing the _CST.
* @cpu: The numeric ID of the target CPU.
* @info: Object write the C-states information into.
*
* Extract the C-state information for the given CPU from the output of the _CST
* control method under the corresponding ACPI processor object (or processor
* device object) and populate @info with it.
*
* If any ACPI_ADR_SPACE_FIXED_HARDWARE C-states are found, invoke
* acpi_processor_ffh_cstate_probe() to verify them and update the
* cpu_cstate_entry data for @cpu.
*/
int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
struct acpi_processor_power *info)
{
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
union acpi_object *cst;
acpi_status status;
u64 count;
int last_index = 0;
int i, ret = 0;
status = acpi_evaluate_object(handle, "_CST", NULL, &buffer);
if (ACPI_FAILURE(status)) {
acpi_handle_debug(handle, "No _CST\n");
return -ENODEV;
}
cst = buffer.pointer;
/* There must be at least 2 elements. */
if (!cst || cst->type != ACPI_TYPE_PACKAGE || cst->package.count < 2) {
acpi_handle_warn(handle, "Invalid _CST output\n");
ret = -EFAULT;
goto end;
}
count = cst->package.elements[0].integer.value;
/* Validate the number of C-states. */
if (count < 1 || count != cst->package.count - 1) {
acpi_handle_warn(handle, "Inconsistent _CST data\n");
ret = -EFAULT;
goto end;
}
for (i = 1; i <= count; i++) {
union acpi_object *element;
union acpi_object *obj;
struct acpi_power_register *reg;
struct acpi_processor_cx cx;
/*
* If there is not enough space for all C-states, skip the
* excess ones and log a warning.
*/
if (last_index >= ACPI_PROCESSOR_MAX_POWER - 1) {
acpi_handle_warn(handle,
"No room for more idle states (limit: %d)\n",
ACPI_PROCESSOR_MAX_POWER - 1);
break;
}
memset(&cx, 0, sizeof(cx));
element = &cst->package.elements[i];
if (element->type != ACPI_TYPE_PACKAGE)
continue;
if (element->package.count != 4)
continue;
obj = &element->package.elements[0];
if (obj->type != ACPI_TYPE_BUFFER)
continue;
reg = (struct acpi_power_register *)obj->buffer.pointer;
obj = &element->package.elements[1];
if (obj->type != ACPI_TYPE_INTEGER)
continue;
cx.type = obj->integer.value;
/*
* There are known cases in which the _CST output does not
* contain C1, so if the type of the first state found is not
* C1, leave an empty slot for C1 to be filled in later.
*/
if (i == 1 && cx.type != ACPI_STATE_C1)
last_index = 1;
cx.address = reg->address;
cx.index = last_index + 1;
if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) {
if (!acpi_processor_ffh_cstate_probe(cpu, &cx, reg)) {
/*
* In the majority of cases _CST describes C1 as
* a FIXED_HARDWARE C-state, but if the command
* line forbids using MWAIT, use CSTATE_HALT for
* C1 regardless.
*/
if (cx.type == ACPI_STATE_C1 &&
boot_option_idle_override == IDLE_NOMWAIT) {
cx.entry_method = ACPI_CSTATE_HALT;
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
} else {
cx.entry_method = ACPI_CSTATE_FFH;
}
} else if (cx.type == ACPI_STATE_C1) {
/*
* In the special case of C1, FIXED_HARDWARE can
* be handled by executing the HLT instruction.
*/
cx.entry_method = ACPI_CSTATE_HALT;
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
} else {
continue;
}
} else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
cx.entry_method = ACPI_CSTATE_SYSTEMIO;
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI IOPORT 0x%x",
cx.address);
} else {
continue;
}
if (cx.type == ACPI_STATE_C1)
cx.valid = 1;
obj = &element->package.elements[2];
if (obj->type != ACPI_TYPE_INTEGER)
continue;
cx.latency = obj->integer.value;
obj = &element->package.elements[3];
if (obj->type != ACPI_TYPE_INTEGER)
continue;
memcpy(&info->states[++last_index], &cx, sizeof(cx));
}
acpi_handle_info(handle, "Found %d idle states\n", last_index);
info->count = last_index;
end:
kfree(buffer.pointer);
return ret;
}
EXPORT_SYMBOL_GPL(acpi_processor_evaluate_cst);
#endif /* CONFIG_ACPI_PROCESSOR_CSTATE */
......@@ -299,164 +299,24 @@ static int acpi_processor_get_power_info_default(struct acpi_processor *pr)
static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
{
acpi_status status;
u64 count;
int current_count;
int i, ret = 0;
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
union acpi_object *cst;
int ret;
if (nocst)
return -ENODEV;
current_count = 0;
status = acpi_evaluate_object(pr->handle, "_CST", NULL, &buffer);
if (ACPI_FAILURE(status)) {
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "No _CST, giving up\n"));
return -ENODEV;
}
cst = buffer.pointer;
/* There must be at least 2 elements */
if (!cst || (cst->type != ACPI_TYPE_PACKAGE) || cst->package.count < 2) {
pr_err("not enough elements in _CST\n");
ret = -EFAULT;
goto end;
}
count = cst->package.elements[0].integer.value;
ret = acpi_processor_evaluate_cst(pr->handle, pr->id, &pr->power);
if (ret)
return ret;
/* Validate number of power states. */
if (count < 1 || count != cst->package.count - 1) {
pr_err("count given by _CST is not valid\n");
ret = -EFAULT;
goto end;
}
/*
* It is expected that there will be at least 2 states, C1 and
* something else (C2 or C3), so fail if that is not the case.
*/
if (pr->power.count < 2)
return -EFAULT;
/* Tell driver that at least _CST is supported. */
pr->flags.has_cst = 1;
for (i = 1; i <= count; i++) {
union acpi_object *element;
union acpi_object *obj;
struct acpi_power_register *reg;
struct acpi_processor_cx cx;
memset(&cx, 0, sizeof(cx));
element = &(cst->package.elements[i]);
if (element->type != ACPI_TYPE_PACKAGE)
continue;
if (element->package.count != 4)
continue;
obj = &(element->package.elements[0]);
if (obj->type != ACPI_TYPE_BUFFER)
continue;
reg = (struct acpi_power_register *)obj->buffer.pointer;
if (reg->space_id != ACPI_ADR_SPACE_SYSTEM_IO &&
(reg->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE))
continue;
/* There should be an easy way to extract an integer... */
obj = &(element->package.elements[1]);
if (obj->type != ACPI_TYPE_INTEGER)
continue;
cx.type = obj->integer.value;
/*
* Some buggy BIOSes won't list C1 in _CST -
* Let acpi_processor_get_power_info_default() handle them later
*/
if (i == 1 && cx.type != ACPI_STATE_C1)
current_count++;
cx.address = reg->address;
cx.index = current_count + 1;
cx.entry_method = ACPI_CSTATE_SYSTEMIO;
if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) {
if (acpi_processor_ffh_cstate_probe
(pr->id, &cx, reg) == 0) {
cx.entry_method = ACPI_CSTATE_FFH;
} else if (cx.type == ACPI_STATE_C1) {
/*
* C1 is a special case where FIXED_HARDWARE
* can be handled in non-MWAIT way as well.
* In that case, save this _CST entry info.
* Otherwise, ignore this info and continue.
*/
cx.entry_method = ACPI_CSTATE_HALT;
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
} else {
continue;
}
if (cx.type == ACPI_STATE_C1 &&
(boot_option_idle_override == IDLE_NOMWAIT)) {
/*
* In most cases the C1 space_id obtained from
* _CST object is FIXED_HARDWARE access mode.
* But when the option of idle=halt is added,
* the entry_method type should be changed from
* CSTATE_FFH to CSTATE_HALT.
* When the option of idle=nomwait is added,
* the C1 entry_method type should be
* CSTATE_HALT.
*/
cx.entry_method = ACPI_CSTATE_HALT;
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
}
} else {
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI IOPORT 0x%x",
cx.address);
}
if (cx.type == ACPI_STATE_C1) {
cx.valid = 1;
}
obj = &(element->package.elements[2]);
if (obj->type != ACPI_TYPE_INTEGER)
continue;
cx.latency = obj->integer.value;
obj = &(element->package.elements[3]);
if (obj->type != ACPI_TYPE_INTEGER)
continue;
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
/*
* We support total ACPI_PROCESSOR_MAX_POWER - 1
* (From 1 through ACPI_PROCESSOR_MAX_POWER - 1)
*/
if (current_count >= (ACPI_PROCESSOR_MAX_POWER - 1)) {
pr_warn("Limiting number of power states to max (%d)\n",
ACPI_PROCESSOR_MAX_POWER);
pr_warn("Please increase ACPI_PROCESSOR_MAX_POWER if needed.\n");
break;
}
}
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found %d power states\n",
current_count));
/* Validate number of power states discovered */
if (current_count < 2)
ret = -EFAULT;
end:
kfree(buffer.pointer);
return ret;
return 0;
}
static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
......@@ -909,7 +769,6 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr)
static inline void acpi_processor_cstate_first_run_checks(void)
{
acpi_status status;
static int first_run;
if (first_run)
......@@ -921,13 +780,10 @@ static inline void acpi_processor_cstate_first_run_checks(void)
max_cstate);
first_run++;
if (acpi_gbl_FADT.cst_control && !nocst) {
status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
acpi_gbl_FADT.cst_control, 8);
if (ACPI_FAILURE(status))
ACPI_EXCEPTION((AE_INFO, status,
"Notifying BIOS of _CST ability failed"));
}
if (nocst)
return;
acpi_processor_claim_cst_control();
}
#else
......
......@@ -575,10 +575,14 @@ static int __cpuidle_register_device(struct cpuidle_device *dev)
if (!try_module_get(drv->owner))
return -EINVAL;
for (i = 0; i < drv->state_count; i++)
for (i = 0; i < drv->state_count; i++) {
if (drv->states[i].flags & CPUIDLE_FLAG_UNUSABLE)
dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_DRIVER;
if (drv->states[i].flags & CPUIDLE_FLAG_OFF)
dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_USER;
}
per_cpu(cpuidle_devices, dev->cpu) = dev;
list_add(&dev->device_list, &cpuidle_detected_devices);
......
......@@ -329,6 +329,14 @@ static ssize_t store_state_disable(struct cpuidle_state *state,
return size;
}
static ssize_t show_state_default_status(struct cpuidle_state *state,
struct cpuidle_state_usage *state_usage,
char *buf)
{
return sprintf(buf, "%s\n",
state->flags & CPUIDLE_FLAG_OFF ? "disabled" : "enabled");
}
define_one_state_ro(name, show_state_name);
define_one_state_ro(desc, show_state_desc);
define_one_state_ro(latency, show_state_exit_latency);
......@@ -339,6 +347,7 @@ define_one_state_ro(time, show_state_time);
define_one_state_rw(disable, show_state_disable, store_state_disable);
define_one_state_ro(above, show_state_above);
define_one_state_ro(below, show_state_below);
define_one_state_ro(default_status, show_state_default_status);
static struct attribute *cpuidle_state_default_attrs[] = {
&attr_name.attr,
......@@ -351,6 +360,7 @@ static struct attribute *cpuidle_state_default_attrs[] = {
&attr_disable.attr,
&attr_above.attr,
&attr_below.attr,
&attr_default_status.attr,
NULL
};
......
......@@ -41,6 +41,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/acpi.h>
#include <linux/kernel.h>
#include <linux/cpuidle.h>
#include <linux/tick.h>
......@@ -79,6 +80,7 @@ struct idle_cpu {
unsigned long auto_demotion_disable_flags;
bool byt_auto_demotion_disable_flag;
bool disable_promotion_to_c1e;
bool use_acpi;
};
static const struct idle_cpu *icpu;
......@@ -89,6 +91,11 @@ static void intel_idle_s2idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index);
static struct cpuidle_state *cpuidle_state_table;
/*
* Enable this state by default even if the ACPI _CST does not list it.
*/
#define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15)
/*
* Set this flag for states where the HW flushes the TLB for us
* and so we don't need cross-calls to keep it consistent.
......@@ -124,7 +131,7 @@ static struct cpuidle_state nehalem_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -161,7 +168,7 @@ static struct cpuidle_state snb_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -296,7 +303,7 @@ static struct cpuidle_state ivb_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -341,7 +348,7 @@ static struct cpuidle_state ivt_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 80,
.enter = &intel_idle,
......@@ -378,7 +385,7 @@ static struct cpuidle_state ivt_cstates_4s[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 250,
.enter = &intel_idle,
......@@ -415,7 +422,7 @@ static struct cpuidle_state ivt_cstates_8s[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 500,
.enter = &intel_idle,
......@@ -452,7 +459,7 @@ static struct cpuidle_state hsw_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -520,7 +527,7 @@ static struct cpuidle_state bdw_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -589,7 +596,7 @@ static struct cpuidle_state skl_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -658,7 +665,7 @@ static struct cpuidle_state skx_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -808,7 +815,7 @@ static struct cpuidle_state bxt_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -869,7 +876,7 @@ static struct cpuidle_state dnv_cstates[] = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
.flags = MWAIT2flg(0x01),
.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 10,
.target_residency = 20,
.enter = &intel_idle,
......@@ -944,6 +951,22 @@ static void intel_idle_s2idle(struct cpuidle_device *dev,
mwait_idle_with_hints(eax, ecx);
}
static bool intel_idle_verify_cstate(unsigned int mwait_hint)
{
unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1;
unsigned int num_substates = (mwait_substates >> mwait_cstate * 4) &
MWAIT_SUBSTATE_MASK;
/* Ignore the C-state if there are NO sub-states in CPUID for it. */
if (num_substates == 0)
return false;
if (mwait_cstate > 2 && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
mark_tsc_unstable("TSC halts in idle states deeper than C2");
return true;
}
static void __setup_broadcast_timer(bool on)
{
if (on)
......@@ -975,6 +998,13 @@ static const struct idle_cpu idle_cpu_nehalem = {
.disable_promotion_to_c1e = true,
};
static const struct idle_cpu idle_cpu_nhx = {
.state_table = nehalem_cstates,
.auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_atom = {
.state_table = atom_cstates,
};
......@@ -993,6 +1023,12 @@ static const struct idle_cpu idle_cpu_snb = {
.disable_promotion_to_c1e = true,
};
static const struct idle_cpu idle_cpu_snx = {
.state_table = snb_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_byt = {
.state_table = byt_cstates,
.disable_promotion_to_c1e = true,
......@@ -1013,6 +1049,7 @@ static const struct idle_cpu idle_cpu_ivb = {
static const struct idle_cpu idle_cpu_ivt = {
.state_table = ivt_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_hsw = {
......@@ -1020,11 +1057,23 @@ static const struct idle_cpu idle_cpu_hsw = {
.disable_promotion_to_c1e = true,
};
static const struct idle_cpu idle_cpu_hsx = {
.state_table = hsw_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_bdw = {
.state_table = bdw_cstates,
.disable_promotion_to_c1e = true,
};
static const struct idle_cpu idle_cpu_bdx = {
.state_table = bdw_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_skl = {
.state_table = skl_cstates,
.disable_promotion_to_c1e = true,
......@@ -1033,15 +1082,18 @@ static const struct idle_cpu idle_cpu_skl = {
static const struct idle_cpu idle_cpu_skx = {
.state_table = skx_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_avn = {
.state_table = avn_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_knl = {
.state_table = knl_cstates,
.use_acpi = true,
};
static const struct idle_cpu idle_cpu_bxt = {
......@@ -1052,20 +1104,21 @@ static const struct idle_cpu idle_cpu_bxt = {
static const struct idle_cpu idle_cpu_dnv = {
.state_table = dnv_cstates,
.disable_promotion_to_c1e = true,
.use_acpi = true,
};
static const struct x86_cpu_id intel_idle_ids[] __initconst = {
INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nehalem),
INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nhx),
INTEL_CPU_FAM6(NEHALEM, idle_cpu_nehalem),
INTEL_CPU_FAM6(NEHALEM_G, idle_cpu_nehalem),
INTEL_CPU_FAM6(WESTMERE, idle_cpu_nehalem),
INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nehalem),
INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nehalem),
INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nhx),
INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nhx),
INTEL_CPU_FAM6(ATOM_BONNELL, idle_cpu_atom),
INTEL_CPU_FAM6(ATOM_BONNELL_MID, idle_cpu_lincroft),
INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nehalem),
INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nhx),
INTEL_CPU_FAM6(SANDYBRIDGE, idle_cpu_snb),
INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snb),
INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snx),
INTEL_CPU_FAM6(ATOM_SALTWELL, idle_cpu_atom),
INTEL_CPU_FAM6(ATOM_SILVERMONT, idle_cpu_byt),
INTEL_CPU_FAM6(ATOM_SILVERMONT_MID, idle_cpu_tangier),
......@@ -1073,14 +1126,14 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
INTEL_CPU_FAM6(IVYBRIDGE, idle_cpu_ivb),
INTEL_CPU_FAM6(IVYBRIDGE_X, idle_cpu_ivt),
INTEL_CPU_FAM6(HASWELL, idle_cpu_hsw),
INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsw),
INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsx),
INTEL_CPU_FAM6(HASWELL_L, idle_cpu_hsw),
INTEL_CPU_FAM6(HASWELL_G, idle_cpu_hsw),
INTEL_CPU_FAM6(ATOM_SILVERMONT_D, idle_cpu_avn),
INTEL_CPU_FAM6(BROADWELL, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_G, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_D, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdx),
INTEL_CPU_FAM6(BROADWELL_D, idle_cpu_bdx),
INTEL_CPU_FAM6(SKYLAKE_L, idle_cpu_skl),
INTEL_CPU_FAM6(SKYLAKE, idle_cpu_skl),
INTEL_CPU_FAM6(KABYLAKE_L, idle_cpu_skl),
......@@ -1095,6 +1148,162 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
{}
};
#define INTEL_CPU_FAM6_MWAIT \
{ X86_VENDOR_INTEL, 6, X86_MODEL_ANY, X86_FEATURE_MWAIT, 0 }
static const struct x86_cpu_id intel_mwait_ids[] __initconst = {
INTEL_CPU_FAM6_MWAIT,
{}
};
static bool intel_idle_max_cstate_reached(int cstate)
{
if (cstate + 1 > max_cstate) {
pr_info("max_cstate %d reached\n", max_cstate);
return true;
}
return false;
}
#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
#include <acpi/processor.h>
static bool no_acpi __read_mostly;
module_param(no_acpi, bool, 0444);
MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
static struct acpi_processor_power acpi_state_table;
/**
* intel_idle_cst_usable - Check if the _CST information can be used.
*
* Check if all of the C-states listed by _CST in the max_cstate range are
* ACPI_CSTATE_FFH, which means that they should be entered via MWAIT.
*/
static bool intel_idle_cst_usable(void)
{
int cstate, limit;
limit = min_t(int, min_t(int, CPUIDLE_STATE_MAX, max_cstate + 1),
acpi_state_table.count);
for (cstate = 1; cstate < limit; cstate++) {
struct acpi_processor_cx *cx = &acpi_state_table.states[cstate];
if (cx->entry_method != ACPI_CSTATE_FFH)
return false;
}
return true;
}
static bool intel_idle_acpi_cst_extract(void)
{
unsigned int cpu;
if (no_acpi) {
pr_debug("Not allowed to use ACPI _CST\n");
return false;
}
for_each_possible_cpu(cpu) {
struct acpi_processor *pr = per_cpu(processors, cpu);
if (!pr)
continue;
if (acpi_processor_evaluate_cst(pr->handle, cpu, &acpi_state_table))
continue;
acpi_state_table.count++;
if (!intel_idle_cst_usable())
continue;
if (!acpi_processor_claim_cst_control()) {
acpi_state_table.count = 0;
return false;
}
return true;
}
pr_debug("ACPI _CST not found or not usable\n");
return false;
}
static void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv)
{
int cstate, limit = min_t(int, CPUIDLE_STATE_MAX, acpi_state_table.count);
/*
* If limit > 0, intel_idle_cst_usable() has returned 'true', so all of
* the interesting states are ACPI_CSTATE_FFH.
*/
for (cstate = 1; cstate < limit; cstate++) {
struct acpi_processor_cx *cx;
struct cpuidle_state *state;
if (intel_idle_max_cstate_reached(cstate))
break;
cx = &acpi_state_table.states[cstate];
state = &drv->states[drv->state_count++];
snprintf(state->name, CPUIDLE_NAME_LEN, "C%d_ACPI", cstate);
strlcpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
state->exit_latency = cx->latency;
/*
* For C1-type C-states use the same number for both the exit
* latency and target residency, because that is the case for
* C1 in the majority of the static C-states tables above.
* For the other types of C-states, however, set the target
* residency to 3 times the exit latency which should lead to
* a reasonable balance between energy-efficiency and
* performance in the majority of interesting cases.
*/
state->target_residency = cx->latency;
if (cx->type > ACPI_STATE_C1)
state->target_residency *= 3;
state->flags = MWAIT2flg(cx->address);
if (cx->type > ACPI_STATE_C2)
state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
state->enter = intel_idle;
state->enter_s2idle = intel_idle_s2idle;
}
}
static bool intel_idle_off_by_default(u32 mwait_hint)
{
int cstate, limit;
/*
* If there are no _CST C-states, do not disable any C-states by
* default.
*/
if (!acpi_state_table.count)
return false;
limit = min_t(int, CPUIDLE_STATE_MAX, acpi_state_table.count);
/*
* If limit > 0, intel_idle_cst_usable() has returned 'true', so all of
* the interesting states are ACPI_CSTATE_FFH.
*/
for (cstate = 1; cstate < limit; cstate++) {
if (acpi_state_table.states[cstate].address == mwait_hint)
return false;
}
return true;
}
#else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
static inline bool intel_idle_acpi_cst_extract(void) { return false; }
static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
#endif /* !CONFIG_ACPI_PROCESSOR_CSTATE */
/*
* intel_idle_probe()
*/
......@@ -1109,17 +1318,15 @@ static int __init intel_idle_probe(void)
}
id = x86_match_cpu(intel_idle_ids);
if (!id) {
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
boot_cpu_data.x86 == 6)
pr_debug("does not run on family %d model %d\n",
boot_cpu_data.x86, boot_cpu_data.x86_model);
return -ENODEV;
}
if (!boot_cpu_has(X86_FEATURE_MWAIT)) {
pr_debug("Please enable MWAIT in BIOS SETUP\n");
return -ENODEV;
if (id) {
if (!boot_cpu_has(X86_FEATURE_MWAIT)) {
pr_debug("Please enable MWAIT in BIOS SETUP\n");
return -ENODEV;
}
} else {
id = x86_match_cpu(intel_mwait_ids);
if (!id)
return -ENODEV;
}
if (boot_cpu_data.cpuid_level < CPUID_MWAIT_LEAF)
......@@ -1135,7 +1342,13 @@ static int __init intel_idle_probe(void)
pr_debug("MWAIT substates: 0x%x\n", mwait_substates);
icpu = (const struct idle_cpu *)id->driver_data;
cpuidle_state_table = icpu->state_table;
if (icpu) {
cpuidle_state_table = icpu->state_table;
if (icpu->use_acpi)
intel_idle_acpi_cst_extract();
} else if (!intel_idle_acpi_cst_extract()) {
return -ENODEV;
}
pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n",
boot_cpu_data.x86_model);
......@@ -1317,60 +1530,39 @@ static void intel_idle_state_table_update(void)
}
}
/*
* intel_idle_cpuidle_driver_init()
* allocate, initialize cpuidle_states
*/
static void __init intel_idle_cpuidle_driver_init(void)
static void intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
{
int cstate;
struct cpuidle_driver *drv = &intel_idle_driver;
intel_idle_state_table_update();
cpuidle_poll_state_init(drv);
drv->state_count = 1;
for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
int num_substates, mwait_hint, mwait_cstate;
unsigned int mwait_hint;
if ((cpuidle_state_table[cstate].enter == NULL) &&
(cpuidle_state_table[cstate].enter_s2idle == NULL))
if (intel_idle_max_cstate_reached(cstate))
break;
if (cstate + 1 > max_cstate) {
pr_info("max_cstate %d reached\n", max_cstate);
if (!cpuidle_state_table[cstate].enter &&
!cpuidle_state_table[cstate].enter_s2idle)
break;
}
mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint);
/* number of sub-states for this state in CPUID.MWAIT */
num_substates = (mwait_substates >> ((mwait_cstate + 1) * 4))
& MWAIT_SUBSTATE_MASK;
/* if NO sub-states for this state in CPUID, skip it */
if (num_substates == 0)
continue;
/* if state marked as disabled, skip it */
/* If marked as unusable, skip this state. */
if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_UNUSABLE) {
pr_debug("state %s is disabled\n",
cpuidle_state_table[cstate].name);
continue;
}
mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
if (!intel_idle_verify_cstate(mwait_hint))
continue;
if (((mwait_cstate + 1) > 2) &&
!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
mark_tsc_unstable("TSC halts in idle"
" states deeper than C2");
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
drv->states[drv->state_count] = /* structure copy */
cpuidle_state_table[cstate];
if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
drv->state_count += 1;
drv->state_count++;
}
if (icpu->byt_auto_demotion_disable_flag) {
......@@ -1379,6 +1571,24 @@ static void __init intel_idle_cpuidle_driver_init(void)
}
}
/*
* intel_idle_cpuidle_driver_init()
* allocate, initialize cpuidle_states
*/
static void __init intel_idle_cpuidle_driver_init(void)
{
struct cpuidle_driver *drv = &intel_idle_driver;
intel_idle_state_table_update();
cpuidle_poll_state_init(drv);
drv->state_count = 1;
if (icpu)
intel_idle_init_cstates_icpu(drv);
else
intel_idle_init_cstates_acpi(drv);
}
/*
* intel_idle_cpu_init()
......@@ -1397,6 +1607,9 @@ static int intel_idle_cpu_init(unsigned int cpu)
return -EIO;
}
if (!icpu)
return 0;
if (icpu->auto_demotion_disable_flags)
auto_demotion_disable();
......
......@@ -279,6 +279,21 @@ static inline bool invalid_phys_cpuid(phys_cpuid_t phys_id)
/* Validate the processor object's proc_id */
bool acpi_duplicate_processor_id(int proc_id);
/* Processor _CTS control */
struct acpi_processor_power;
#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
bool acpi_processor_claim_cst_control(void);
int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
struct acpi_processor_power *info);
#else
static inline bool acpi_processor_claim_cst_control(void) { return false; }
static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
struct acpi_processor_power *info)
{
return -ENODEV;
}
#endif
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
......
......@@ -77,6 +77,7 @@ struct cpuidle_state {
#define CPUIDLE_FLAG_COUPLED BIT(1) /* state applies to multiple cpus */
#define CPUIDLE_FLAG_TIMER_STOP BIT(2) /* timer is stopped on this state */
#define CPUIDLE_FLAG_UNUSABLE BIT(3) /* avoid using this state */
#define CPUIDLE_FLAG_OFF BIT(4) /* disable this state by default */
struct cpuidle_device_kobj;
struct cpuidle_state_kobj;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment