Commit 72ec9456 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'pm-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "Traditionally, cpufreq is the area with the greatest number of
  changes, but there are fewer of them than last time. There also is
  some activity in the generic power domains and the devfreq frameworks,
  a couple of system suspend and hibernation fixes and some assorted
  changes in other places.

  One new feature is the cpufreq change to allow the scheduler to pass
  hints to the governors' utilization update callbacks and some code
  rework based on that. Another one is the support for domain removal in
  the generic power domains framework. Also it is now possible to use
  hibernation with PAGE_POISONING_ZERO enabled and devfreq supports the
  RockChip DFI controller and the rk3399 DMC.

  The rest of the changes is mostly fixes and cleanups in a number of
  places.

  Specifics:

   - Add a mechanism for passing hints from the scheduler to cpufreq
     governors via their utilization update callbacks and use it to
     introduce "IOwait boosting" into the schedutil governor and
     intel_pstate that will make them boost performance if the enqueued
     task was previously waiting on I/O (Rafael Wysocki).

   - Fix a schedutil governor problem that causes it to overestimate
     utilization if SMT is in use (Steve Muckle).

   - Update defconfigs trying to use the schedutil governor as a module
     which is not possible any more (Javier Martinez Canillas).

   - Update the intel_pstate's pstate_sample tracepoint to take "IOwait
     boosting" into account (Srinivas Pandruvada).

   - Fix a problem in the cpufreq core causing it to mishandle the
     initialization of CPUs registered after the cpufreq driver (Viresh
     Kumar, Rafael Wysocki).

   - Make the cpufreq-dt driver support per-policy governor tunables,
     clean it up and update its Kconfig description (Viresh Kumar).

   - Add support for more ARM platforms to the cpufreq-dt driver
     (Chanwoo Choi, Dave Gerlach, Geert Uytterhoeven).

   - Make the cpufreq CPPC driver report frequencies in KHz to avoid
     user space compatiblility issues (Al Stone, Hoan Tran).

   - Clean up a few cpufreq drivers (st, kirkwood, SCPI) a bit (Colin
     Ian King, Markus Elfring).

   - Constify some local structures in the intel_pstate driver (Julia
     Lawall).

   - Add a Documentation/cpu-freq/ entry to MAINTAINERS (Jean Delvare).

   - Add support for PM domain removal to the generic power domains
     (genpd) framework, add new DT helper functions to it and make it
     always enable debugfs support if available (Jon Hunter, Tomeu
     Vizoso).

   - Clean up the generic power domains (genpd) framework and make it
     avoid measuring power-on and power-off latencies during system-wide
     PM transitions (Ulf Hansson).

   - Add support for the RockChip DFI controller and the rk3399 DMC to
     the devfreq framework (Lin Huang, Axel Lin, Arnd Bergmann).

   - Add COMPILE_TEST to the devfreq framework (Krzysztof Kozlowski,
     Stephen Rothwell).

   - Fix a minor issue in the exynos-ppmu devfreq driver and fix up
     devfreq Kconfig indentation style (Wei Yongjun, Jisheng Zhang).

   - Fix the system suspend interface to make suspend-to-idle work if
     platform suspend operations have not been registered (Sudeep
     Holla).

   - Make it possible to use hibernation with PAGE_POISONING_ZERO
     enabled (Anisse Astier).

   - Increas the default timeout of the system suspend/resume watchdog
     and make it depend on EXPERT (Chen Yu).

   - Make the operating performance points (OPP) framework avoid using
     OPPs that aren't supported by the platform and fix a build warning
     in it (Dave Gerlach, Arnd Bergmann).

   - Fix the ARM cpuidle driver's return value (Christophe Jaillet).

   - Make the SmartReflex AVS (Adaptive Voltage Scaling) driver use more
     common logging style (Joe Perches)"

* tag 'pm-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (58 commits)
  PM / OPP: Don't support OPP if it provides supported-hw but platform does not
  cpufreq: st: add missing \n to end of dev_err message
  cpufreq: kirkwood: add missing \n to end of dev_err messages
  PM / Domains: Rename pm_genpd_sync_poweron|poweroff()
  PM / Domains: Don't measure latency of ->power_on|off() during system PM
  PM / Domains: Remove redundant system PM callbacks
  PM / Domains: Simplify detaching a device from its genpd
  PM / devfreq: rk3399_dmc: Remove explictly regulator_put call in .remove
  PM / devfreq: rockchip: add PM_DEVFREQ_EVENT dependency
  PM / OPP: avoid maybe-uninitialized warning
  PM / Domains: Allow holes in genpd_data.domains array
  cpufreq: CPPC: Avoid overflow when calculating desired_perf
  cpufreq: ti: Use generic platdev driver
  cpufreq: intel_pstate: Add io_boost trace
  partial revert of "PM / devfreq: Add COMPILE_TEST for build coverage"
  cpufreq: intel_pstate: Use IOWAIT flag in Atom algorithm
  cpufreq: schedutil: Add iowait boosting
  cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition
  PM / Domains: Add support for removing nested PM domains by provider
  PM / Domains: Add support for removing PM domains
  ...
parents 7af8a0f8 993eb0ae
* Rockchip rk3399 DFI device
Required properties:
- compatible: Must be "rockchip,rk3399-dfi".
- reg: physical base address of each DFI and length of memory mapped region
- rockchip,pmu: phandle to the syscon managing the "pmu general register files"
- clocks: phandles for clock specified in "clock-names" property
- clock-names : the name of clock used by the DFI, must be "pclk_ddr_mon";
Example:
dfi: dfi@0xff630000 {
compatible = "rockchip,rk3399-dfi";
reg = <0x00 0xff630000 0x00 0x4000>;
rockchip,pmu = <&pmugrf>;
clocks = <&cru PCLK_DDR_MON>;
clock-names = "pclk_ddr_mon";
status = "disabled";
};
* Rockchip rk3399 DMC(Dynamic Memory Controller) device
Required properties:
- compatible: Must be "rockchip,rk3399-dmc".
- devfreq-events: Node to get DDR loading, Refer to
Documentation/devicetree/bindings/devfreq/
rockchip-dfi.txt
- interrupts: The interrupt number to the CPU. The interrupt
specifier format depends on the interrupt controller.
It should be DCF interrupts, when DDR dvfs finish,
it will happen.
- clocks: Phandles for clock specified in "clock-names" property
- clock-names : The name of clock used by the DFI, must be
"pclk_ddr_mon";
- operating-points-v2: Refer to Documentation/devicetree/bindings/power/opp.txt
for details.
- center-supply: DMC supply node.
- status: Marks the node enabled/disabled.
Following properties are ddr timing:
- rockchip,dram_speed_bin : Value reference include/dt-bindings/clock/ddr.h,
it select ddr3 cl-trp-trcd type, default value
"DDR3_DEFAULT".it must selected according to
"Speed Bin" in ddr3 datasheet, DO NOT use
smaller "Speed Bin" than ddr3 exactly is.
- rockchip,pd_idle : Config the PD_IDLE value, defined the power-down
idle period, memories are places into power-down
mode if bus is idle for PD_IDLE DFI clocks.
- rockchip,sr_idle : Configure the SR_IDLE value, defined the
selfrefresh idle period, memories are places
into self-refresh mode if bus is idle for
SR_IDLE*1024 DFI clocks (DFI clocks freq is
half of dram's clocks), defaule value is "0".
- rockchip,sr_mc_gate_idle : Defined the self-refresh with memory and
controller clock gating idle period, memories
are places into self-refresh mode and memory
controller clock arg gating if bus is idle for
sr_mc_gate_idle*1024 DFI clocks.
- rockchip,srpd_lite_idle : Defined the self-refresh power down idle
period, memories are places into self-refresh
power down mode if bus is idle for
srpd_lite_idle*1024 DFI clocks. This parameter
is for LPDDR4 only.
- rockchip,standby_idle : Defined the standby idle period, memories are
places into self-refresh than controller, pi,
phy and dram clock will gating if bus is idle
for standby_idle * DFI clocks.
- rockchip,dram_dll_disb_freq : It's defined the DDR3 dll bypass frequency in
MHz, when ddr freq less than DRAM_DLL_DISB_FREQ,
ddr3 dll will bypssed note: if dll was bypassed,
the odt also stop working.
- rockchip,phy_dll_disb_freq : Defined the PHY dll bypass frequency in
MHz (Mega Hz), when ddr freq less than
DRAM_DLL_DISB_FREQ, phy dll will bypssed.
note: phy dll and phy odt are independent.
- rockchip,ddr3_odt_disb_freq : When dram type is DDR3, this parameter defined
the odt disable frequency in MHz (Mega Hz),
when ddr frequency less then ddr3_odt_disb_freq,
the odt on dram side and controller side are
both disabled.
- rockchip,ddr3_drv : When dram type is DDR3, this parameter define
the dram side driver stength in ohm, default
value is DDR3_DS_40ohm.
- rockchip,ddr3_odt : When dram type is DDR3, this parameter define
the dram side ODT stength in ohm, default value
is DDR3_ODT_120ohm.
- rockchip,phy_ddr3_ca_drv : When dram type is DDR3, this parameter define
the phy side CA line(incluing command line,
address line and clock line) driver strength.
Default value is PHY_DRV_ODT_40.
- rockchip,phy_ddr3_dq_drv : When dram type is DDR3, this parameter define
the phy side DQ line(incluing DQS/DQ/DM line)
driver strength. default value is PHY_DRV_ODT_40.
- rockchip,phy_ddr3_odt : When dram type is DDR3, this parameter define the
phy side odt strength, default value is
PHY_DRV_ODT_240.
- rockchip,lpddr3_odt_disb_freq : When dram type is LPDDR3, this parameter defined
then odt disable frequency in MHz (Mega Hz),
when ddr frequency less then ddr3_odt_disb_freq,
the odt on dram side and controller side are
both disabled.
- rockchip,lpddr3_drv : When dram type is LPDDR3, this parameter define
the dram side driver stength in ohm, default
value is LP3_DS_34ohm.
- rockchip,lpddr3_odt : When dram type is LPDDR3, this parameter define
the dram side ODT stength in ohm, default value
is LP3_ODT_240ohm.
- rockchip,phy_lpddr3_ca_drv : When dram type is LPDDR3, this parameter define
the phy side CA line(incluing command line,
address line and clock line) driver strength.
default value is PHY_DRV_ODT_40.
- rockchip,phy_lpddr3_dq_drv : When dram type is LPDDR3, this parameter define
the phy side DQ line(incluing DQS/DQ/DM line)
driver strength. default value is
PHY_DRV_ODT_40.
- rockchip,phy_lpddr3_odt : When dram type is LPDDR3, this parameter define
the phy side odt strength, default value is
PHY_DRV_ODT_240.
- rockchip,lpddr4_odt_disb_freq : When dram type is LPDDR4, this parameter
defined the odt disable frequency in
MHz (Mega Hz), when ddr frequency less then
ddr3_odt_disb_freq, the odt on dram side and
controller side are both disabled.
- rockchip,lpddr4_drv : When dram type is LPDDR4, this parameter define
the dram side driver stength in ohm, default
value is LP4_PDDS_60ohm.
- rockchip,lpddr4_dq_odt : When dram type is LPDDR4, this parameter define
the dram side ODT on dqs/dq line stength in ohm,
default value is LP4_DQ_ODT_40ohm.
- rockchip,lpddr4_ca_odt : When dram type is LPDDR4, this parameter define
the dram side ODT on ca line stength in ohm,
default value is LP4_CA_ODT_40ohm.
- rockchip,phy_lpddr4_ca_drv : When dram type is LPDDR4, this parameter define
the phy side CA line(incluing command address
line) driver strength. default value is
PHY_DRV_ODT_40.
- rockchip,phy_lpddr4_ck_cs_drv : When dram type is LPDDR4, this parameter define
the phy side clock line and cs line driver
strength. default value is PHY_DRV_ODT_80.
- rockchip,phy_lpddr4_dq_drv : When dram type is LPDDR4, this parameter define
the phy side DQ line(incluing DQS/DQ/DM line)
driver strength. default value is PHY_DRV_ODT_80.
- rockchip,phy_lpddr4_odt : When dram type is LPDDR4, this parameter define
the phy side odt strength, default value is
PHY_DRV_ODT_60.
Example:
dmc_opp_table: dmc_opp_table {
compatible = "operating-points-v2";
opp00 {
opp-hz = /bits/ 64 <300000000>;
opp-microvolt = <900000>;
};
opp01 {
opp-hz = /bits/ 64 <666000000>;
opp-microvolt = <900000>;
};
};
dmc: dmc {
compatible = "rockchip,rk3399-dmc";
devfreq-events = <&dfi>;
interrupts = <GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&cru SCLK_DDRCLK>;
clock-names = "dmc_clk";
operating-points-v2 = <&dmc_opp_table>;
center-supply = <&ppvar_centerlogic>;
upthreshold = <15>;
downdifferential = <10>;
rockchip,ddr3_speed_bin = <21>;
rockchip,pd_idle = <0x40>;
rockchip,sr_idle = <0x2>;
rockchip,sr_mc_gate_idle = <0x3>;
rockchip,srpd_lite_idle = <0x4>;
rockchip,standby_idle = <0x2000>;
rockchip,dram_dll_dis_freq = <300>;
rockchip,phy_dll_dis_freq = <125>;
rockchip,auto_pd_dis_freq = <666>;
rockchip,ddr3_odt_dis_freq = <333>;
rockchip,ddr3_drv = <DDR3_DS_40ohm>;
rockchip,ddr3_odt = <DDR3_ODT_120ohm>;
rockchip,phy_ddr3_ca_drv = <PHY_DRV_ODT_40>;
rockchip,phy_ddr3_dq_drv = <PHY_DRV_ODT_40>;
rockchip,phy_ddr3_odt = <PHY_DRV_ODT_240>;
rockchip,lpddr3_odt_dis_freq = <333>;
rockchip,lpddr3_drv = <LP3_DS_34ohm>;
rockchip,lpddr3_odt = <LP3_ODT_240ohm>;
rockchip,phy_lpddr3_ca_drv = <PHY_DRV_ODT_40>;
rockchip,phy_lpddr3_dq_drv = <PHY_DRV_ODT_40>;
rockchip,phy_lpddr3_odt = <PHY_DRV_ODT_240>;
rockchip,lpddr4_odt_dis_freq = <333>;
rockchip,lpddr4_drv = <LP4_PDDS_60ohm>;
rockchip,lpddr4_dq_odt = <LP4_DQ_ODT_40ohm>;
rockchip,lpddr4_ca_odt = <LP4_CA_ODT_40ohm>;
rockchip,phy_lpddr4_ca_drv = <PHY_DRV_ODT_40>;
rockchip,phy_lpddr4_ck_cs_drv = <PHY_DRV_ODT_80>;
rockchip,phy_lpddr4_dq_drv = <PHY_DRV_ODT_80>;
rockchip,phy_lpddr4_odt = <PHY_DRV_ODT_60>;
status = "disabled";
};
......@@ -3284,6 +3284,7 @@ L: linux-pm@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
T: git git://git.linaro.org/people/vireshk/linux.git (For ARM Updates)
F: Documentation/cpu-freq/
F: drivers/cpufreq/
F: include/linux/cpufreq.h
......
......@@ -28,7 +28,7 @@ CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
CONFIG_CPUFREQ_DT=y
CONFIG_CPU_IDLE=y
CONFIG_ARM_EXYNOS_CPUIDLE=y
......
......@@ -135,7 +135,7 @@ CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
CONFIG_QORIQ_CPUFREQ=y
CONFIG_CPU_IDLE=y
CONFIG_ARM_CPUIDLE=y
......
This diff is collapsed.
......@@ -584,7 +584,6 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
struct clk *clk;
unsigned long freq, old_freq;
unsigned long u_volt, u_volt_min, u_volt_max;
unsigned long ou_volt, ou_volt_min, ou_volt_max;
int ret;
if (unlikely(!target_freq)) {
......@@ -620,11 +619,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
}
old_opp = _find_freq_ceil(opp_table, &old_freq);
if (!IS_ERR(old_opp)) {
ou_volt = old_opp->u_volt;
ou_volt_min = old_opp->u_volt_min;
ou_volt_max = old_opp->u_volt_max;
} else {
if (IS_ERR(old_opp)) {
dev_err(dev, "%s: failed to find current OPP for freq %lu (%ld)\n",
__func__, old_freq, PTR_ERR(old_opp));
}
......@@ -683,7 +678,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
restore_voltage:
/* This shouldn't harm even if the voltages weren't updated earlier */
if (!IS_ERR(old_opp))
_set_opp_voltage(dev, reg, ou_volt, ou_volt_min, ou_volt_max);
_set_opp_voltage(dev, reg, old_opp->u_volt,
old_opp->u_volt_min, old_opp->u_volt_max);
return ret;
}
......
......@@ -71,8 +71,18 @@ static bool _opp_is_supported(struct device *dev, struct opp_table *opp_table,
u32 version;
int ret;
if (!opp_table->supported_hw)
return true;
if (!opp_table->supported_hw) {
/*
* In the case that no supported_hw has been set by the
* platform but there is an opp-supported-hw value set for
* an OPP then the OPP should not be enabled as there is
* no way to see if the hardware supports it.
*/
if (of_find_property(np, "opp-supported-hw", NULL))
return false;
else
return true;
}
while (count--) {
ret = of_property_read_u32_index(np, "opp-supported-hw", count,
......
......@@ -194,7 +194,7 @@ config CPU_FREQ_GOV_CONSERVATIVE
If in doubt, say N.
config CPU_FREQ_GOV_SCHEDUTIL
tristate "'schedutil' cpufreq policy governor"
bool "'schedutil' cpufreq policy governor"
depends on CPU_FREQ && SMP
select CPU_FREQ_GOV_ATTR_SET
select IRQ_WORK
......@@ -208,9 +208,6 @@ config CPU_FREQ_GOV_SCHEDUTIL
frequency tipping point is at utilization/capacity equal to 80% in
both cases.
To compile this driver as a module, choose M here: the module will
be called cpufreq_schedutil.
If in doubt, say N.
comment "CPU frequency scaling drivers"
......@@ -225,7 +222,7 @@ config CPUFREQ_DT
help
This adds a generic DT based cpufreq driver for frequency management.
It supports both uniprocessor (UP) and symmetric multiprocessor (SMP)
systems which share clock and voltage across all CPUs.
systems.
If in doubt, say N.
......
......@@ -19,10 +19,19 @@
#include <linux/delay.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
#include <linux/dmi.h>
#include <linux/vmalloc.h>
#include <asm/unaligned.h>
#include <acpi/cppc_acpi.h>
/* Minimum struct length needed for the DMI processor entry we want */
#define DMI_ENTRY_PROCESSOR_MIN_LENGTH 48
/* Offest in the DMI processor structure for the max frequency */
#define DMI_PROCESSOR_MAX_SPEED 0x14
/*
* These structs contain information parsed from per CPU
* ACPI _CPC structures.
......@@ -32,6 +41,39 @@
*/
static struct cpudata **all_cpu_data;
/* Capture the max KHz from DMI */
static u64 cppc_dmi_max_khz;
/* Callback function used to retrieve the max frequency from DMI */
static void cppc_find_dmi_mhz(const struct dmi_header *dm, void *private)
{
const u8 *dmi_data = (const u8 *)dm;
u16 *mhz = (u16 *)private;
if (dm->type == DMI_ENTRY_PROCESSOR &&
dm->length >= DMI_ENTRY_PROCESSOR_MIN_LENGTH) {
u16 val = (u16)get_unaligned((const u16 *)
(dmi_data + DMI_PROCESSOR_MAX_SPEED));
*mhz = val > *mhz ? val : *mhz;
}
}
/* Look up the max frequency in DMI */
static u64 cppc_get_dmi_max_khz(void)
{
u16 mhz = 0;
dmi_walk(cppc_find_dmi_mhz, &mhz);
/*
* Real stupid fallback value, just in case there is no
* actual value set.
*/
mhz = mhz ? mhz : 1;
return (1000 * mhz);
}
static int cppc_cpufreq_set_target(struct cpufreq_policy *policy,
unsigned int target_freq,
unsigned int relation)
......@@ -42,7 +84,7 @@ static int cppc_cpufreq_set_target(struct cpufreq_policy *policy,
cpu = all_cpu_data[policy->cpu];
cpu->perf_ctrls.desired_perf = target_freq;
cpu->perf_ctrls.desired_perf = (u64)target_freq * policy->max / cppc_dmi_max_khz;
freqs.old = policy->cur;
freqs.new = target_freq;
......@@ -94,8 +136,10 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
return ret;
}
policy->min = cpu->perf_caps.lowest_perf;
policy->max = cpu->perf_caps.highest_perf;
cppc_dmi_max_khz = cppc_get_dmi_max_khz();
policy->min = cpu->perf_caps.lowest_perf * cppc_dmi_max_khz / cpu->perf_caps.highest_perf;
policy->max = cppc_dmi_max_khz;
policy->cpuinfo.min_freq = policy->min;
policy->cpuinfo.max_freq = policy->max;
policy->shared_type = cpu->shared_type;
......@@ -112,7 +156,8 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
cpu->cur_policy = policy;
/* Set policy->cur to max now. The governors will adjust later. */
policy->cur = cpu->perf_ctrls.desired_perf = cpu->perf_caps.highest_perf;
policy->cur = cppc_dmi_max_khz;
cpu->perf_ctrls.desired_perf = cpu->perf_caps.highest_perf;
ret = cppc_set_perf(cpu_num, &cpu->perf_ctrls);
if (ret)
......
......@@ -11,6 +11,8 @@
#include <linux/of.h>
#include <linux/platform_device.h>
#include "cpufreq-dt.h"
static const struct of_device_id machines[] __initconst = {
{ .compatible = "allwinner,sun4i-a10", },
{ .compatible = "allwinner,sun5i-a10s", },
......@@ -40,6 +42,7 @@ static const struct of_device_id machines[] __initconst = {
{ .compatible = "samsung,exynos5250", },
#ifndef CONFIG_BL_SWITCHER
{ .compatible = "samsung,exynos5420", },
{ .compatible = "samsung,exynos5433", },
{ .compatible = "samsung,exynos5800", },
#endif
......@@ -51,6 +54,7 @@ static const struct of_device_id machines[] __initconst = {
{ .compatible = "renesas,r8a7779", },
{ .compatible = "renesas,r8a7790", },
{ .compatible = "renesas,r8a7791", },
{ .compatible = "renesas,r8a7792", },
{ .compatible = "renesas,r8a7793", },
{ .compatible = "renesas,r8a7794", },
{ .compatible = "renesas,sh73a0", },
......@@ -68,6 +72,8 @@ static const struct of_device_id machines[] __initconst = {
{ .compatible = "sigma,tango4" },
{ .compatible = "ti,am33xx", },
{ .compatible = "ti,dra7", },
{ .compatible = "ti,omap2", },
{ .compatible = "ti,omap3", },
{ .compatible = "ti,omap4", },
......@@ -91,7 +97,8 @@ static int __init cpufreq_dt_platdev_init(void)
if (!match)
return -ENODEV;
return PTR_ERR_OR_ZERO(platform_device_register_simple("cpufreq-dt", -1,
NULL, 0));
return PTR_ERR_OR_ZERO(platform_device_register_data(NULL, "cpufreq-dt",
-1, match->data,
sizeof(struct cpufreq_dt_platform_data)));
}
device_initcall(cpufreq_dt_platdev_init);
......@@ -25,6 +25,8 @@
#include <linux/slab.h>
#include <linux/thermal.h>
#include "cpufreq-dt.h"
struct private_data {
struct device *cpu_dev;
struct thermal_cooling_device *cdev;
......@@ -353,6 +355,7 @@ static struct cpufreq_driver dt_cpufreq_driver = {
static int dt_cpufreq_probe(struct platform_device *pdev)
{
struct cpufreq_dt_platform_data *data = dev_get_platdata(&pdev->dev);
int ret;
/*
......@@ -366,7 +369,8 @@ static int dt_cpufreq_probe(struct platform_device *pdev)
if (ret)
return ret;
dt_cpufreq_driver.driver_data = dev_get_platdata(&pdev->dev);
if (data && data->have_governor_per_policy)
dt_cpufreq_driver.flags |= CPUFREQ_HAVE_GOVERNOR_PER_POLICY;
ret = cpufreq_register_driver(&dt_cpufreq_driver);
if (ret)
......
/*
* Copyright (C) 2016 Linaro
* Viresh Kumar <viresh.kumar@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef __CPUFREQ_DT_H__
#define __CPUFREQ_DT_H__
#include <linux/types.h>
struct cpufreq_dt_platform_data {
bool have_governor_per_policy;
};
#endif /* __CPUFREQ_DT_H__ */
......@@ -916,58 +916,18 @@ static struct kobj_type ktype_cpufreq = {
.release = cpufreq_sysfs_release,
};
static int add_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
static int add_cpu_dev_symlink(struct cpufreq_policy *policy,
struct device *dev)
{
struct device *cpu_dev;
pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
if (!policy)
return 0;
cpu_dev = get_cpu_device(cpu);
if (WARN_ON(!cpu_dev))
return 0;
return sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
}
static void remove_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
{
struct device *cpu_dev;
pr_debug("%s: Removing symlink for CPU: %u\n", __func__, cpu);
cpu_dev = get_cpu_device(cpu);
if (WARN_ON(!cpu_dev))
return;
sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
dev_dbg(dev, "%s: Adding symlink\n", __func__);
return sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
}
/* Add/remove symlinks for all related CPUs */
static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
static void remove_cpu_dev_symlink(struct cpufreq_policy *policy,
struct device *dev)
{
unsigned int j;
int ret = 0;
/* Some related CPUs might not be present (physically hotplugged) */
for_each_cpu(j, policy->real_cpus) {
ret = add_cpu_dev_symlink(policy, j);
if (ret)
break;
}
return ret;
}
static void cpufreq_remove_dev_symlink(struct cpufreq_policy *policy)
{
unsigned int j;
/* Some related CPUs might not be present (physically hotplugged) */
for_each_cpu(j, policy->real_cpus)
remove_cpu_dev_symlink(policy, j);
dev_dbg(dev, "%s: Removing symlink\n", __func__);
sysfs_remove_link(&dev->kobj, "cpufreq");
}
static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
......@@ -999,7 +959,7 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
return ret;
}
return cpufreq_add_dev_symlink(policy);
return 0;
}
__weak struct cpufreq_governor *cpufreq_default_governor(void)
......@@ -1073,13 +1033,9 @@ static void handle_update(struct work_struct *work)
static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
{
struct device *dev = get_cpu_device(cpu);
struct cpufreq_policy *policy;
int ret;
if (WARN_ON(!dev))
return NULL;
policy = kzalloc(sizeof(*policy), GFP_KERNEL);
if (!policy)
return NULL;
......@@ -1133,7 +1089,6 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy, bool notify)
down_write(&policy->rwsem);
cpufreq_stats_free_table(policy);
cpufreq_remove_dev_symlink(policy);
kobj = &policy->kobj;
cmp = &policy->kobj_unregister;
up_write(&policy->rwsem);
......@@ -1215,8 +1170,8 @@ static int cpufreq_online(unsigned int cpu)
if (new_policy) {
/* related_cpus should at least include policy->cpus. */
cpumask_copy(policy->related_cpus, policy->cpus);
/* Remember CPUs present at the policy creation time. */
cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
/* Clear mask of registered CPUs */
cpumask_clear(policy->real_cpus);
}
/*
......@@ -1331,6 +1286,8 @@ static int cpufreq_online(unsigned int cpu)
return ret;
}
static void cpufreq_offline(unsigned int cpu);
/**
* cpufreq_add_dev - the cpufreq interface for a CPU device.
* @dev: CPU device.
......@@ -1340,22 +1297,28 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
{
struct cpufreq_policy *policy;
unsigned cpu = dev->id;
int ret;
dev_dbg(dev, "%s: adding CPU%u\n", __func__, cpu);
if (cpu_online(cpu))
return cpufreq_online(cpu);
if (cpu_online(cpu)) {
ret = cpufreq_online(cpu);
if (ret)
return ret;
}
/*
* A hotplug notifier will follow and we will handle it as CPU online
* then. For now, just create the sysfs link, unless there is no policy
* or the link is already present.
*/
/* Create sysfs link on CPU registration */
policy = per_cpu(cpufreq_cpu_data, cpu);
if (!policy || cpumask_test_and_set_cpu(cpu, policy->real_cpus))
return 0;
return add_cpu_dev_symlink(policy, cpu);
ret = add_cpu_dev_symlink(policy, dev);
if (ret) {
cpumask_clear_cpu(cpu, policy->real_cpus);
cpufreq_offline(cpu);
}
return ret;
}
static void cpufreq_offline(unsigned int cpu)
......@@ -1436,7 +1399,7 @@ static void cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
cpufreq_offline(cpu);
cpumask_clear_cpu(cpu, policy->real_cpus);
remove_cpu_dev_symlink(policy, cpu);
remove_cpu_dev_symlink(policy, dev);
if (cpumask_empty(policy->real_cpus))
cpufreq_policy_free(policy, true);
......
......@@ -260,7 +260,7 @@ static void dbs_irq_work(struct irq_work *irq_work)
}
static void dbs_update_util_handler(struct update_util_data *data, u64 time,
unsigned long util, unsigned long max)
unsigned int flags)
{
struct cpu_dbs_info *cdbs = container_of(data, struct cpu_dbs_info, update_util);
struct policy_dbs_info *policy_dbs = cdbs->policy_dbs;
......
......@@ -181,6 +181,8 @@ struct _pid {
* @cpu: CPU number for this instance data
* @update_util: CPUFreq utility callback information
* @update_util_set: CPUFreq utility callback is set
* @iowait_boost: iowait-related boost fraction
* @last_update: Time of the last update.
* @pstate: Stores P state limits for this CPU
* @vid: Stores VID limits for this CPU
* @pid: Stores PID parameters for this CPU
......@@ -206,6 +208,7 @@ struct cpudata {
struct vid_data vid;
struct _pid pid;
u64 last_update;
u64 last_sample_time;
u64 prev_aperf;
u64 prev_mperf;
......@@ -216,6 +219,7 @@ struct cpudata {
struct acpi_processor_performance acpi_perf_data;
bool valid_pss_table;
#endif
unsigned int iowait_boost;
};
static struct cpudata **all_cpu_data;
......@@ -229,6 +233,7 @@ static struct cpudata **all_cpu_data;
* @p_gain_pct: PID proportional gain
* @i_gain_pct: PID integral gain
* @d_gain_pct: PID derivative gain
* @boost_iowait: Whether or not to use iowait boosting.
*
* Stores per CPU model static PID configuration data.
*/
......@@ -240,6 +245,7 @@ struct pstate_adjust_policy {
int p_gain_pct;
int d_gain_pct;
int i_gain_pct;
bool boost_iowait;
};
/**
......@@ -1029,7 +1035,7 @@ static struct cpu_defaults core_params = {
},
};
static struct cpu_defaults silvermont_params = {
static const struct cpu_defaults silvermont_params = {
.pid_policy = {
.sample_rate_ms = 10,
.deadband = 0,
......@@ -1037,6 +1043,7 @@ static struct cpu_defaults silvermont_params = {
.p_gain_pct = 14,
.d_gain_pct = 0,
.i_gain_pct = 4,
.boost_iowait = true,
},
.funcs = {
.get_max = atom_get_max_pstate,
......@@ -1050,7 +1057,7 @@ static struct cpu_defaults silvermont_params = {
},
};
static struct cpu_defaults airmont_params = {
static const struct cpu_defaults airmont_params = {
.pid_policy = {
.sample_rate_ms = 10,
.deadband = 0,
......@@ -1058,6 +1065,7 @@ static struct cpu_defaults airmont_params = {
.p_gain_pct = 14,
.d_gain_pct = 0,
.i_gain_pct = 4,
.boost_iowait = true,
},
.funcs = {
.get_max = atom_get_max_pstate,
......@@ -1071,7 +1079,7 @@ static struct cpu_defaults airmont_params = {
},
};
static struct cpu_defaults knl_params = {
static const struct cpu_defaults knl_params = {
.pid_policy = {
.sample_rate_ms = 10,
.deadband = 0,
......@@ -1091,7 +1099,7 @@ static struct cpu_defaults knl_params = {
},
};
static struct cpu_defaults bxt_params = {
static const struct cpu_defaults bxt_params = {
.pid_policy = {
.sample_rate_ms = 10,
.deadband = 0,
......@@ -1099,6 +1107,7 @@ static struct cpu_defaults bxt_params = {
.p_gain_pct = 14,
.d_gain_pct = 0,
.i_gain_pct = 4,
.boost_iowait = true,
},
.funcs = {
.get_max = core_get_max_pstate,
......@@ -1222,36 +1231,18 @@ static inline int32_t get_avg_pstate(struct cpudata *cpu)
static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
{
struct sample *sample = &cpu->sample;
u64 cummulative_iowait, delta_iowait_us;
u64 delta_iowait_mperf;
u64 mperf, now;
int32_t cpu_load;
int32_t busy_frac, boost;
cummulative_iowait = get_cpu_iowait_time_us(cpu->cpu, &now);
busy_frac = div_fp(sample->mperf, sample->tsc);
/*
* Convert iowait time into number of IO cycles spent at max_freq.
* IO is considered as busy only for the cpu_load algorithm. For
* performance this is not needed since we always try to reach the
* maximum P-State, so we are already boosting the IOs.
*/
delta_iowait_us = cummulative_iowait - cpu->prev_cummulative_iowait;
delta_iowait_mperf = div64_u64(delta_iowait_us * cpu->pstate.scaling *
cpu->pstate.max_pstate, MSEC_PER_SEC);
boost = cpu->iowait_boost;
cpu->iowait_boost >>= 1;
mperf = cpu->sample.mperf + delta_iowait_mperf;
cpu->prev_cummulative_iowait = cummulative_iowait;
if (busy_frac < boost)
busy_frac = boost;
/*
* The load can be estimated as the ratio of the mperf counter
* running at a constant frequency during active periods
* (C0) and the time stamp counter running at the same frequency
* also during C-states.
*/
cpu_load = div64_u64(int_tofp(100) * mperf, sample->tsc);
cpu->sample.busy_scaled = cpu_load;
return get_avg_pstate(cpu) - pid_calc(&cpu->pid, cpu_load);
sample->busy_scaled = busy_frac * 100;
return get_avg_pstate(cpu) - pid_calc(&cpu->pid, sample->busy_scaled);
}
static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu)
......@@ -1325,15 +1316,29 @@ static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
sample->mperf,
sample->aperf,
sample->tsc,
get_avg_frequency(cpu));
get_avg_frequency(cpu),
fp_toint(cpu->iowait_boost * 100));
}
static void intel_pstate_update_util(struct update_util_data *data, u64 time,
unsigned long util, unsigned long max)
unsigned int flags)
{
struct cpudata *cpu = container_of(data, struct cpudata, update_util);
u64 delta_ns = time - cpu->sample.time;
u64 delta_ns;
if (pid_params.boost_iowait) {
if (flags & SCHED_CPUFREQ_IOWAIT) {
cpu->iowait_boost = int_tofp(1);
} else if (cpu->iowait_boost) {
/* Clear iowait_boost if the CPU may have been idle. */
delta_ns = time - cpu->last_update;
if (delta_ns > TICK_NSEC)
cpu->iowait_boost = 0;
}
cpu->last_update = time;
}
delta_ns = time - cpu->sample.time;
if ((s64)delta_ns >= pid_params.sample_rate_ns) {
bool sample_taken = intel_pstate_sample(cpu, time);
......
......@@ -123,7 +123,7 @@ static int kirkwood_cpufreq_probe(struct platform_device *pdev)
priv.cpu_clk = of_clk_get_by_name(np, "cpu_clk");
if (IS_ERR(priv.cpu_clk)) {
dev_err(priv.dev, "Unable to get cpuclk");
dev_err(priv.dev, "Unable to get cpuclk\n");
return PTR_ERR(priv.cpu_clk);
}
......@@ -132,7 +132,7 @@ static int kirkwood_cpufreq_probe(struct platform_device *pdev)
priv.ddr_clk = of_clk_get_by_name(np, "ddrclk");
if (IS_ERR(priv.ddr_clk)) {
dev_err(priv.dev, "Unable to get ddrclk");
dev_err(priv.dev, "Unable to get ddrclk\n");
err = PTR_ERR(priv.ddr_clk);
goto out_cpu;
}
......@@ -142,7 +142,7 @@ static int kirkwood_cpufreq_probe(struct platform_device *pdev)
priv.powersave_clk = of_clk_get_by_name(np, "powersave");
if (IS_ERR(priv.powersave_clk)) {
dev_err(priv.dev, "Unable to get powersave");
dev_err(priv.dev, "Unable to get powersave\n");
err = PTR_ERR(priv.powersave_clk);
goto out_ddr;
}
......@@ -155,7 +155,7 @@ static int kirkwood_cpufreq_probe(struct platform_device *pdev)
if (!err)
return 0;
dev_err(priv.dev, "Failed to register cpufreq driver");
dev_err(priv.dev, "Failed to register cpufreq driver\n");
clk_disable_unprepare(priv.powersave_clk);
out_ddr:
......
......@@ -105,7 +105,6 @@ static int scpi_cpufreq_remove(struct platform_device *pdev)
static struct platform_driver scpi_cpufreq_platdrv = {
.driver = {
.name = "scpi-cpufreq",
.owner = THIS_MODULE,
},
.probe = scpi_cpufreq_probe,
.remove = scpi_cpufreq_remove,
......
......@@ -163,7 +163,7 @@ static int sti_cpufreq_set_opp_info(void)
reg_fields = sti_cpufreq_match();
if (!reg_fields) {
dev_err(dev, "This SoC doesn't support voltage scaling");
dev_err(dev, "This SoC doesn't support voltage scaling\n");
return -ENODEV;
}
......
......@@ -121,6 +121,7 @@ static int __init arm_idle_init(void)
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev) {
pr_err("Failed to allocate cpuidle device\n");
ret = -ENOMEM;
goto out_fail;
}
dev->cpu = cpu;
......
......@@ -76,7 +76,7 @@ comment "DEVFREQ Drivers"
config ARM_EXYNOS_BUS_DEVFREQ
tristate "ARM EXYNOS Generic Memory Bus DEVFREQ Driver"
depends on ARCH_EXYNOS
depends on ARCH_EXYNOS || COMPILE_TEST
select DEVFREQ_GOV_SIMPLE_ONDEMAND
select DEVFREQ_GOV_PASSIVE
select DEVFREQ_EVENT_EXYNOS_PPMU
......@@ -91,14 +91,26 @@ config ARM_EXYNOS_BUS_DEVFREQ
This does not yet operate with optimal voltages.
config ARM_TEGRA_DEVFREQ
tristate "Tegra DEVFREQ Driver"
depends on ARCH_TEGRA_124_SOC
select DEVFREQ_GOV_SIMPLE_ONDEMAND
select PM_OPP
help
This adds the DEVFREQ driver for the Tegra family of SoCs.
It reads ACTMON counters of memory controllers and adjusts the
operating frequencies and voltages with OPP support.
tristate "Tegra DEVFREQ Driver"
depends on ARCH_TEGRA_124_SOC
select DEVFREQ_GOV_SIMPLE_ONDEMAND
select PM_OPP
help
This adds the DEVFREQ driver for the Tegra family of SoCs.
It reads ACTMON counters of memory controllers and adjusts the
operating frequencies and voltages with OPP support.
config ARM_RK3399_DMC_DEVFREQ
tristate "ARM RK3399 DMC DEVFREQ Driver"
depends on ARCH_ROCKCHIP
select DEVFREQ_EVENT_ROCKCHIP_DFI
select DEVFREQ_GOV_SIMPLE_ONDEMAND
select PM_DEVFREQ_EVENT
select PM_OPP
help
This adds the DEVFREQ driver for the RK3399 DMC(Dynamic Memory Controller).
It sets the frequency for the memory controller and reads the usage counts
from hardware.
source "drivers/devfreq/event/Kconfig"
......
......@@ -8,6 +8,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE) += governor_passive.o
# DEVFREQ Drivers
obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ) += exynos-bus.o
obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ) += rk3399_dmc.o
obj-$(CONFIG_ARM_TEGRA_DEVFREQ) += tegra-devfreq.o
# DEVFREQ Event Drivers
......
......@@ -15,7 +15,7 @@ if PM_DEVFREQ_EVENT
config DEVFREQ_EVENT_EXYNOS_NOCP
tristate "EXYNOS NoC (Network On Chip) Probe DEVFREQ event Driver"
depends on ARCH_EXYNOS
depends on ARCH_EXYNOS || COMPILE_TEST
select PM_OPP
help
This add the devfreq-event driver for Exynos SoC. It provides NoC
......@@ -23,11 +23,18 @@ config DEVFREQ_EVENT_EXYNOS_NOCP
config DEVFREQ_EVENT_EXYNOS_PPMU
tristate "EXYNOS PPMU (Platform Performance Monitoring Unit) DEVFREQ event Driver"
depends on ARCH_EXYNOS
depends on ARCH_EXYNOS || COMPILE_TEST
select PM_OPP
help
This add the devfreq-event driver for Exynos SoC. It provides PPMU
(Platform Performance Monitoring Unit) counters to estimate the
utilization of each module.
config DEVFREQ_EVENT_ROCKCHIP_DFI
tristate "ROCKCHIP DFI DEVFREQ event Driver"
depends on ARCH_ROCKCHIP
help
This add the devfreq-event driver for Rockchip SoC. It provides DFI
(DDR Monitor Module) driver to count ddr load.
endif # PM_DEVFREQ_EVENT
......@@ -2,3 +2,4 @@
obj-$(CONFIG_DEVFREQ_EVENT_EXYNOS_NOCP) += exynos-nocp.o
obj-$(CONFIG_DEVFREQ_EVENT_EXYNOS_PPMU) += exynos-ppmu.o
obj-$(CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI) += rockchip-dfi.o
......@@ -406,8 +406,6 @@ static int of_get_devfreq_events(struct device_node *np,
of_property_read_string(node, "event-name", &desc[j].name);
j++;
of_node_put(node);
}
info->desc = desc;
......
/*
* Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd
* Author: Lin Huang <hl@rock-chips.com>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms and conditions of the GNU General Public License,
* version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*/
#include <linux/clk.h>
#include <linux/devfreq-event.h>
#include <linux/kernel.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/io.h>
#include <linux/mfd/syscon.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/regmap.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/of.h>
#define RK3399_DMC_NUM_CH 2
/* DDRMON_CTRL */
#define DDRMON_CTRL 0x04
#define CLR_DDRMON_CTRL (0x1f0000 << 0)
#define LPDDR4_EN (0x10001 << 4)
#define HARDWARE_EN (0x10001 << 3)
#define LPDDR3_EN (0x10001 << 2)
#define SOFTWARE_EN (0x10001 << 1)
#define SOFTWARE_DIS (0x10000 << 1)
#define TIME_CNT_EN (0x10001 << 0)
#define DDRMON_CH0_COUNT_NUM 0x28
#define DDRMON_CH0_DFI_ACCESS_NUM 0x2c
#define DDRMON_CH1_COUNT_NUM 0x3c
#define DDRMON_CH1_DFI_ACCESS_NUM 0x40
/* pmu grf */
#define PMUGRF_OS_REG2 0x308
#define DDRTYPE_SHIFT 13
#define DDRTYPE_MASK 7
enum {
DDR3 = 3,
LPDDR3 = 6,
LPDDR4 = 7,
UNUSED = 0xFF
};
struct dmc_usage {
u32 access;
u32 total;
};
/*
* The dfi controller can monitor DDR load. It has an upper and lower threshold
* for the operating points. Whenever the usage leaves these bounds an event is
* generated to indicate the DDR frequency should be changed.
*/
struct rockchip_dfi {
struct devfreq_event_dev *edev;
struct devfreq_event_desc *desc;
struct dmc_usage ch_usage[RK3399_DMC_NUM_CH];
struct device *dev;
void __iomem *regs;
struct regmap *regmap_pmu;
struct clk *clk;
};
static void rockchip_dfi_start_hardware_counter(struct devfreq_event_dev *edev)
{
struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
void __iomem *dfi_regs = info->regs;
u32 val;
u32 ddr_type;
/* get ddr type */
regmap_read(info->regmap_pmu, PMUGRF_OS_REG2, &val);
ddr_type = (val >> DDRTYPE_SHIFT) & DDRTYPE_MASK;
/* clear DDRMON_CTRL setting */
writel_relaxed(CLR_DDRMON_CTRL, dfi_regs + DDRMON_CTRL);
/* set ddr type to dfi */
if (ddr_type == LPDDR3)
writel_relaxed(LPDDR3_EN, dfi_regs + DDRMON_CTRL);
else if (ddr_type == LPDDR4)
writel_relaxed(LPDDR4_EN, dfi_regs + DDRMON_CTRL);
/* enable count, use software mode */
writel_relaxed(SOFTWARE_EN, dfi_regs + DDRMON_CTRL);
}
static void rockchip_dfi_stop_hardware_counter(struct devfreq_event_dev *edev)
{
struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
void __iomem *dfi_regs = info->regs;
writel_relaxed(SOFTWARE_DIS, dfi_regs + DDRMON_CTRL);
}
static int rockchip_dfi_get_busier_ch(struct devfreq_event_dev *edev)
{
struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
u32 tmp, max = 0;
u32 i, busier_ch = 0;
void __iomem *dfi_regs = info->regs;
rockchip_dfi_stop_hardware_counter(edev);
/* Find out which channel is busier */
for (i = 0; i < RK3399_DMC_NUM_CH; i++) {
info->ch_usage[i].access = readl_relaxed(dfi_regs +
DDRMON_CH0_DFI_ACCESS_NUM + i * 20) * 4;
info->ch_usage[i].total = readl_relaxed(dfi_regs +
DDRMON_CH0_COUNT_NUM + i * 20);
tmp = info->ch_usage[i].access;
if (tmp > max) {
busier_ch = i;
max = tmp;
}
}
rockchip_dfi_start_hardware_counter(edev);
return busier_ch;
}
static int rockchip_dfi_disable(struct devfreq_event_dev *edev)
{
struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
rockchip_dfi_stop_hardware_counter(edev);
clk_disable_unprepare(info->clk);
return 0;
}
static int rockchip_dfi_enable(struct devfreq_event_dev *edev)
{
struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
int ret;
ret = clk_prepare_enable(info->clk);
if (ret) {
dev_err(&edev->dev, "failed to enable dfi clk: %d\n", ret);
return ret;
}
rockchip_dfi_start_hardware_counter(edev);
return 0;
}
static int rockchip_dfi_set_event(struct devfreq_event_dev *edev)
{
return 0;
}
static int rockchip_dfi_get_event(struct devfreq_event_dev *edev,
struct devfreq_event_data *edata)
{
struct rockchip_dfi *info = devfreq_event_get_drvdata(edev);
int busier_ch;
busier_ch = rockchip_dfi_get_busier_ch(edev);
edata->load_count = info->ch_usage[busier_ch].access;
edata->total_count = info->ch_usage[busier_ch].total;
return 0;
}
static const struct devfreq_event_ops rockchip_dfi_ops = {
.disable = rockchip_dfi_disable,
.enable = rockchip_dfi_enable,
.get_event = rockchip_dfi_get_event,
.set_event = rockchip_dfi_set_event,
};
static const struct of_device_id rockchip_dfi_id_match[] = {
{ .compatible = "rockchip,rk3399-dfi" },
{ },
};
static int rockchip_dfi_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
struct rockchip_dfi *data;
struct resource *res;
struct devfreq_event_desc *desc;
struct device_node *np = pdev->dev.of_node, *node;
data = devm_kzalloc(dev, sizeof(struct rockchip_dfi), GFP_KERNEL);
if (!data)
return -ENOMEM;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
data->regs = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(data->regs))
return PTR_ERR(data->regs);
data->clk = devm_clk_get(dev, "pclk_ddr_mon");
if (IS_ERR(data->clk)) {
dev_err(dev, "Cannot get the clk dmc_clk\n");
return PTR_ERR(data->clk);
};
/* try to find the optional reference to the pmu syscon */
node = of_parse_phandle(np, "rockchip,pmu", 0);
if (node) {
data->regmap_pmu = syscon_node_to_regmap(node);
if (IS_ERR(data->regmap_pmu))
return PTR_ERR(data->regmap_pmu);
}
data->dev = dev;
desc = devm_kzalloc(dev, sizeof(*desc), GFP_KERNEL);
if (!desc)
return -ENOMEM;
desc->ops = &rockchip_dfi_ops;
desc->driver_data = data;
desc->name = np->name;
data->desc = desc;
data->edev = devm_devfreq_event_add_edev(&pdev->dev, desc);
if (IS_ERR(data->edev)) {
dev_err(&pdev->dev,
"failed to add devfreq-event device\n");
return PTR_ERR(data->edev);
}
platform_set_drvdata(pdev, data);
return 0;
}
static struct platform_driver rockchip_dfi_driver = {
.probe = rockchip_dfi_probe,
.driver = {
.name = "rockchip-dfi",
.of_match_table = rockchip_dfi_id_match,
},
};
module_platform_driver(rockchip_dfi_driver);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Lin Huang <hl@rock-chips.com>");
MODULE_DESCRIPTION("Rockchip DFI driver");
This diff is collapsed.
This diff is collapsed.
......@@ -215,29 +215,22 @@ static __init int exynos4_pm_init_power_domain(void)
/* Assign the child power domains to their parents */
for_each_matching_node(np, exynos_pm_domain_of_match) {
struct generic_pm_domain *child_domain, *parent_domain;
struct of_phandle_args args;
struct of_phandle_args child, parent;
args.np = np;
args.args_count = 0;
child_domain = of_genpd_get_from_provider(&args);
if (IS_ERR(child_domain))
continue;
child.np = np;
child.args_count = 0;
if (of_parse_phandle_with_args(np, "power-domains",
"#power-domain-cells", 0, &args) != 0)
continue;
parent_domain = of_genpd_get_from_provider(&args);
if (IS_ERR(parent_domain))
"#power-domain-cells", 0,
&parent) != 0)
continue;
if (pm_genpd_add_subdomain(parent_domain, child_domain))
if (of_genpd_add_subdomain(&parent, &child))
pr_warn("%s failed to add subdomain: %s\n",
parent_domain->name, child_domain->name);
parent.np->name, child.np->name);
else
pr_info("%s has as child subdomain: %s.\n",
parent_domain->name, child_domain->name);
parent.np->name, child.np->name);
}
return 0;
......
......@@ -140,7 +140,6 @@ static int board_staging_add_dev_domain(struct platform_device *pdev,
const char *domain)
{
struct of_phandle_args pd_args;
struct generic_pm_domain *pd;
struct device_node *np;
np = of_find_node_by_path(domain);
......@@ -151,14 +150,8 @@ static int board_staging_add_dev_domain(struct platform_device *pdev,
pd_args.np = np;
pd_args.args_count = 0;
pd = of_genpd_get_from_provider(&pd_args);
if (IS_ERR(pd)) {
pr_err("Cannot find genpd %s (%ld)\n", domain, PTR_ERR(pd));
return PTR_ERR(pd);
}
pr_debug("Found genpd %s for device %s\n", pd->name, pdev->name);
return pm_genpd_add_device(pd, &pdev->dev);
return of_genpd_add_device(&pd_args, &pdev->dev);
}
#else
static inline int board_staging_add_dev_domain(struct platform_device *pdev,
......
......@@ -148,11 +148,6 @@ static inline int devfreq_event_reset_event(struct devfreq_event_dev *edev)
return -EINVAL;
}
static inline void *devfreq_event_get_drvdata(struct devfreq_event_dev *edev)
{
return ERR_PTR(-EINVAL);
}
static inline struct devfreq_event_dev *devfreq_event_get_edev_by_phandle(
struct device *dev, int index)
{
......
......@@ -51,6 +51,8 @@ struct generic_pm_domain {
struct mutex lock;
struct dev_power_governor *gov;
struct work_struct power_off_work;
struct fwnode_handle *provider; /* Identity of the domain provider */
bool has_provider;
const char *name;
atomic_t sd_count; /* Number of subdomains with power "on" */
enum gpd_status status; /* Current state of the domain */
......@@ -116,7 +118,6 @@ static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
return to_gpd_data(dev->power.subsys_data->domain_data);
}
extern struct generic_pm_domain *pm_genpd_lookup_dev(struct device *dev);
extern int __pm_genpd_add_device(struct generic_pm_domain *genpd,
struct device *dev,
struct gpd_timing_data *td);
......@@ -129,6 +130,7 @@ extern int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
struct generic_pm_domain *target);
extern int pm_genpd_init(struct generic_pm_domain *genpd,
struct dev_power_governor *gov, bool is_off);
extern int pm_genpd_remove(struct generic_pm_domain *genpd);
extern struct dev_power_governor simple_qos_governor;
extern struct dev_power_governor pm_domain_always_on_gov;
......@@ -138,10 +140,6 @@ static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
{
return ERR_PTR(-ENOSYS);
}
static inline struct generic_pm_domain *pm_genpd_lookup_dev(struct device *dev)
{
return NULL;
}
static inline int __pm_genpd_add_device(struct generic_pm_domain *genpd,
struct device *dev,
struct gpd_timing_data *td)
......@@ -168,6 +166,10 @@ static inline int pm_genpd_init(struct generic_pm_domain *genpd,
{
return -ENOSYS;
}
static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
{
return -ENOTSUPP;
}
#endif
static inline int pm_genpd_add_device(struct generic_pm_domain *genpd,
......@@ -192,57 +194,57 @@ struct genpd_onecell_data {
unsigned int num_domains;
};
typedef struct generic_pm_domain *(*genpd_xlate_t)(struct of_phandle_args *args,
void *data);
#ifdef CONFIG_PM_GENERIC_DOMAINS_OF
int __of_genpd_add_provider(struct device_node *np, genpd_xlate_t xlate,
void *data);
int of_genpd_add_provider_simple(struct device_node *np,
struct generic_pm_domain *genpd);
int of_genpd_add_provider_onecell(struct device_node *np,
struct genpd_onecell_data *data);
void of_genpd_del_provider(struct device_node *np);
struct generic_pm_domain *of_genpd_get_from_provider(
struct of_phandle_args *genpdspec);
struct generic_pm_domain *__of_genpd_xlate_simple(
struct of_phandle_args *genpdspec,
void *data);
struct generic_pm_domain *__of_genpd_xlate_onecell(
struct of_phandle_args *genpdspec,
void *data);
extern int of_genpd_add_device(struct of_phandle_args *args,
struct device *dev);
extern int of_genpd_add_subdomain(struct of_phandle_args *parent,
struct of_phandle_args *new_subdomain);
extern struct generic_pm_domain *of_genpd_remove_last(struct device_node *np);
int genpd_dev_pm_attach(struct device *dev);
#else /* !CONFIG_PM_GENERIC_DOMAINS_OF */
static inline int __of_genpd_add_provider(struct device_node *np,
genpd_xlate_t xlate, void *data)
static inline int of_genpd_add_provider_simple(struct device_node *np,
struct generic_pm_domain *genpd)
{
return 0;
return -ENOTSUPP;
}
static inline void of_genpd_del_provider(struct device_node *np) {}
static inline struct generic_pm_domain *of_genpd_get_from_provider(
struct of_phandle_args *genpdspec)
static inline int of_genpd_add_provider_onecell(struct device_node *np,
struct genpd_onecell_data *data)
{
return NULL;
return -ENOTSUPP;
}
#define __of_genpd_xlate_simple NULL
#define __of_genpd_xlate_onecell NULL
static inline void of_genpd_del_provider(struct device_node *np) {}
static inline int genpd_dev_pm_attach(struct device *dev)
static inline int of_genpd_add_device(struct of_phandle_args *args,
struct device *dev)
{
return -ENODEV;
}
#endif /* CONFIG_PM_GENERIC_DOMAINS_OF */
static inline int of_genpd_add_provider_simple(struct device_node *np,
struct generic_pm_domain *genpd)
static inline int of_genpd_add_subdomain(struct of_phandle_args *parent,
struct of_phandle_args *new_subdomain)
{
return __of_genpd_add_provider(np, __of_genpd_xlate_simple, genpd);
return -ENODEV;
}
static inline int of_genpd_add_provider_onecell(struct device_node *np,
struct genpd_onecell_data *data)
static inline int genpd_dev_pm_attach(struct device *dev)
{
return -ENODEV;
}
static inline
struct generic_pm_domain *of_genpd_remove_last(struct device_node *np)
{
return __of_genpd_add_provider(np, __of_genpd_xlate_onecell, data);
return ERR_PTR(-ENOTSUPP);
}
#endif /* CONFIG_PM_GENERIC_DOMAINS_OF */
#ifdef CONFIG_PM
extern int dev_pm_domain_attach(struct device *dev, bool power_on);
......
......@@ -3469,15 +3469,20 @@ static inline unsigned long rlimit_max(unsigned int limit)
return task_rlimit_max(current, limit);
}
#define SCHED_CPUFREQ_RT (1U << 0)
#define SCHED_CPUFREQ_DL (1U << 1)
#define SCHED_CPUFREQ_IOWAIT (1U << 2)
#define SCHED_CPUFREQ_RT_DL (SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
#ifdef CONFIG_CPU_FREQ
struct update_util_data {
void (*func)(struct update_util_data *data,
u64 time, unsigned long util, unsigned long max);
void (*func)(struct update_util_data *data, u64 time, unsigned int flags);
};
void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
void (*func)(struct update_util_data *data, u64 time,
unsigned long util, unsigned long max));
void (*func)(struct update_util_data *data, u64 time,
unsigned int flags));
void cpufreq_remove_update_util_hook(int cpu);
#endif /* CONFIG_CPU_FREQ */
......
......@@ -245,6 +245,7 @@ static inline bool idle_should_freeze(void)
return unlikely(suspend_freeze_state == FREEZE_STATE_ENTER);
}
extern void __init pm_states_init(void);
extern void freeze_set_ops(const struct platform_freeze_ops *ops);
extern void freeze_wake(void);
......@@ -279,6 +280,7 @@ static inline bool pm_resume_via_firmware(void) { return false; }
static inline void suspend_set_ops(const struct platform_suspend_ops *ops) {}
static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
static inline bool idle_should_freeze(void) { return false; }
static inline void __init pm_states_init(void) {}
static inline void freeze_set_ops(const struct platform_freeze_ops *ops) {}
static inline void freeze_wake(void) {}
#endif /* !CONFIG_SUSPEND */
......
......@@ -69,7 +69,8 @@ TRACE_EVENT(pstate_sample,
u64 mperf,
u64 aperf,
u64 tsc,
u32 freq
u32 freq,
u32 io_boost
),
TP_ARGS(core_busy,
......@@ -79,7 +80,8 @@ TRACE_EVENT(pstate_sample,
mperf,
aperf,
tsc,
freq
freq,
io_boost
),
TP_STRUCT__entry(
......@@ -91,6 +93,7 @@ TRACE_EVENT(pstate_sample,
__field(u64, aperf)
__field(u64, tsc)
__field(u32, freq)
__field(u32, io_boost)
),
TP_fast_assign(
......@@ -102,9 +105,10 @@ TRACE_EVENT(pstate_sample,
__entry->aperf = aperf;
__entry->tsc = tsc;
__entry->freq = freq;
__entry->io_boost = io_boost;
),
TP_printk("core_busy=%lu scaled=%lu from=%lu to=%lu mperf=%llu aperf=%llu tsc=%llu freq=%lu ",
TP_printk("core_busy=%lu scaled=%lu from=%lu to=%lu mperf=%llu aperf=%llu tsc=%llu freq=%lu io_boost=%lu",
(unsigned long)__entry->core_busy,
(unsigned long)__entry->scaled_busy,
(unsigned long)__entry->from,
......@@ -112,7 +116,8 @@ TRACE_EVENT(pstate_sample,
(unsigned long long)__entry->mperf,
(unsigned long long)__entry->aperf,
(unsigned long long)__entry->tsc,
(unsigned long)__entry->freq
(unsigned long)__entry->freq,
(unsigned long)__entry->io_boost
)
);
......
......@@ -186,7 +186,7 @@ config PM_SLEEP_DEBUG
config DPM_WATCHDOG
bool "Device suspend/resume watchdog"
depends on PM_DEBUG && PSTORE
depends on PM_DEBUG && PSTORE && EXPERT
---help---
Sets up a watchdog timer to capture drivers that are
locked up attempting to suspend/resume a device.
......@@ -197,7 +197,7 @@ config DPM_WATCHDOG
config DPM_WATCHDOG_TIMEOUT
int "Watchdog timeout in seconds"
range 1 120
default 60
default 120
depends on DPM_WATCHDOG
config PM_TRACE
......
......@@ -306,8 +306,10 @@ static int create_image(int platform_mode)
if (error)
printk(KERN_ERR "PM: Error %d creating hibernation image\n",
error);
if (!in_suspend)
if (!in_suspend) {
events_check_enabled = false;
clear_free_pages();
}
platform_leave(platform_mode);
......@@ -1189,22 +1191,6 @@ static int __init nohibernate_setup(char *str)
return 1;
}
static int __init page_poison_nohibernate_setup(char *str)
{
#ifdef CONFIG_PAGE_POISONING_ZERO
/*
* The zeroing option for page poison skips the checks on alloc.
* since hibernation doesn't save free pages there's no way to
* guarantee the pages will still be zeroed.
*/
if (!strcmp(str, "on")) {
pr_info("Disabling hibernation due to page poisoning\n");
return nohibernate_setup(str);
}
#endif
return 1;
}
__setup("noresume", noresume_setup);
__setup("resume_offset=", resume_offset_setup);
__setup("resume=", resume_setup);
......@@ -1212,4 +1198,3 @@ __setup("hibernate=", hibernate_setup);
__setup("resumewait", resumewait_setup);
__setup("resumedelay=", resumedelay_setup);
__setup("nohibernate", nohibernate_setup);
__setup("page_poison=", page_poison_nohibernate_setup);
......@@ -644,6 +644,7 @@ static int __init pm_init(void)
return error;
hibernate_image_size_init();
hibernate_reserved_size_init();
pm_states_init();
power_kobj = kobject_create_and_add("power", NULL);
if (!power_kobj)
return -ENOMEM;
......
......@@ -110,6 +110,8 @@ extern int create_basic_memory_bitmaps(void);
extern void free_basic_memory_bitmaps(void);
extern int hibernate_preallocate_memory(void);
extern void clear_free_pages(void);
/**
* Auxiliary structure used for reading the snapshot image data and
* metadata from and writing them to the list of page backup entries
......
......@@ -1132,6 +1132,28 @@ void free_basic_memory_bitmaps(void)
pr_debug("PM: Basic memory bitmaps freed\n");
}
void clear_free_pages(void)
{
#ifdef CONFIG_PAGE_POISONING_ZERO
struct memory_bitmap *bm = free_pages_map;
unsigned long pfn;
if (WARN_ON(!(free_pages_map)))
return;
memory_bm_position_reset(bm);
pfn = memory_bm_next_pfn(bm);
while (pfn != BM_END_OF_MAP) {
if (pfn_valid(pfn))
clear_highpage(pfn_to_page(pfn));
pfn = memory_bm_next_pfn(bm);
}
memory_bm_position_reset(bm);
pr_info("PM: free pages cleared after restore\n");
#endif /* PAGE_POISONING_ZERO */
}
/**
* snapshot_additional_pages - Estimate the number of extra pages needed.
* @zone: Memory zone to carry out the computation for.
......
......@@ -118,10 +118,18 @@ static bool valid_state(suspend_state_t state)
*/
static bool relative_states;
void __init pm_states_init(void)
{
/*
* freeze state should be supported even without any suspend_ops,
* initialize pm_states accordingly here
*/
pm_states[PM_SUSPEND_FREEZE] = pm_labels[relative_states ? 0 : 2];
}
static int __init sleep_states_setup(char *str)
{
relative_states = !strncmp(str, "1", 1);
pm_states[PM_SUSPEND_FREEZE] = pm_labels[relative_states ? 0 : 2];
return 1;
}
......@@ -211,7 +219,7 @@ static int platform_suspend_begin(suspend_state_t state)
{
if (state == PM_SUSPEND_FREEZE && freeze_ops && freeze_ops->begin)
return freeze_ops->begin();
else if (suspend_ops->begin)
else if (suspend_ops && suspend_ops->begin)
return suspend_ops->begin(state);
else
return 0;
......@@ -221,7 +229,7 @@ static void platform_resume_end(suspend_state_t state)
{
if (state == PM_SUSPEND_FREEZE && freeze_ops && freeze_ops->end)
freeze_ops->end();
else if (suspend_ops->end)
else if (suspend_ops && suspend_ops->end)
suspend_ops->end();
}
......
......@@ -33,7 +33,7 @@ DEFINE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
*/
void cpufreq_add_update_util_hook(int cpu, struct update_util_data *data,
void (*func)(struct update_util_data *data, u64 time,
unsigned long util, unsigned long max))
unsigned int flags))
{
if (WARN_ON(!data || !func))
return;
......
......@@ -12,7 +12,6 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/cpufreq.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <trace/events/power.h>
......@@ -48,11 +47,14 @@ struct sugov_cpu {
struct sugov_policy *sg_policy;
unsigned int cached_raw_freq;
unsigned long iowait_boost;
unsigned long iowait_boost_max;
u64 last_update;
/* The fields below are only needed when sharing a policy. */
unsigned long util;
unsigned long max;
u64 last_update;
unsigned int flags;
};
static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
......@@ -144,24 +146,75 @@ static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
return cpufreq_driver_resolve_freq(policy, freq);
}
static void sugov_get_util(unsigned long *util, unsigned long *max)
{
struct rq *rq = this_rq();
unsigned long cfs_max;
cfs_max = arch_scale_cpu_capacity(NULL, smp_processor_id());
*util = min(rq->cfs.avg.util_avg, cfs_max);
*max = cfs_max;
}
static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
unsigned int flags)
{
if (flags & SCHED_CPUFREQ_IOWAIT) {
sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
} else if (sg_cpu->iowait_boost) {
s64 delta_ns = time - sg_cpu->last_update;
/* Clear iowait_boost if the CPU apprears to have been idle. */
if (delta_ns > TICK_NSEC)
sg_cpu->iowait_boost = 0;
}
}
static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
unsigned long *max)
{
unsigned long boost_util = sg_cpu->iowait_boost;
unsigned long boost_max = sg_cpu->iowait_boost_max;
if (!boost_util)
return;
if (*util * boost_max < *max * boost_util) {
*util = boost_util;
*max = boost_max;
}
sg_cpu->iowait_boost >>= 1;
}
static void sugov_update_single(struct update_util_data *hook, u64 time,
unsigned long util, unsigned long max)
unsigned int flags)
{
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
unsigned long util, max;
unsigned int next_f;
sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
if (!sugov_should_update_freq(sg_policy, time))
return;
next_f = util == ULONG_MAX ? policy->cpuinfo.max_freq :
get_next_freq(sg_cpu, util, max);
if (flags & SCHED_CPUFREQ_RT_DL) {
next_f = policy->cpuinfo.max_freq;
} else {
sugov_get_util(&util, &max);
sugov_iowait_boost(sg_cpu, &util, &max);
next_f = get_next_freq(sg_cpu, util, max);
}
sugov_update_commit(sg_policy, time, next_f);
}
static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
unsigned long util, unsigned long max)
unsigned long util, unsigned long max,
unsigned int flags)
{
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
struct cpufreq_policy *policy = sg_policy->policy;
......@@ -169,9 +222,11 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
u64 last_freq_update_time = sg_policy->last_freq_update_time;
unsigned int j;
if (util == ULONG_MAX)
if (flags & SCHED_CPUFREQ_RT_DL)
return max_f;
sugov_iowait_boost(sg_cpu, &util, &max);
for_each_cpu(j, policy->cpus) {
struct sugov_cpu *j_sg_cpu;
unsigned long j_util, j_max;
......@@ -186,41 +241,50 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
* frequency update and the time elapsed between the last update
* of the CPU utilization and the last frequency update is long
* enough, don't take the CPU into account as it probably is
* idle now.
* idle now (and clear iowait_boost for it).
*/
delta_ns = last_freq_update_time - j_sg_cpu->last_update;
if (delta_ns > TICK_NSEC)
if (delta_ns > TICK_NSEC) {
j_sg_cpu->iowait_boost = 0;
continue;
j_util = j_sg_cpu->util;
if (j_util == ULONG_MAX)
}
if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
return max_f;
j_util = j_sg_cpu->util;
j_max = j_sg_cpu->max;
if (j_util * max > j_max * util) {
util = j_util;
max = j_max;
}
sugov_iowait_boost(j_sg_cpu, &util, &max);
}
return get_next_freq(sg_cpu, util, max);
}
static void sugov_update_shared(struct update_util_data *hook, u64 time,
unsigned long util, unsigned long max)
unsigned int flags)
{
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
unsigned long util, max;
unsigned int next_f;
sugov_get_util(&util, &max);
raw_spin_lock(&sg_policy->update_lock);
sg_cpu->util = util;
sg_cpu->max = max;
sg_cpu->flags = flags;
sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
if (sugov_should_update_freq(sg_policy, time)) {
next_f = sugov_next_freq_shared(sg_cpu, util, max);
next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
sugov_update_commit(sg_policy, time, next_f);
}
......@@ -444,10 +508,13 @@ static int sugov_start(struct cpufreq_policy *policy)
sg_cpu->sg_policy = sg_policy;
if (policy_is_shared(policy)) {
sg_cpu->util = ULONG_MAX;
sg_cpu->util = 0;
sg_cpu->max = 0;
sg_cpu->flags = SCHED_CPUFREQ_RT;
sg_cpu->last_update = 0;
sg_cpu->cached_raw_freq = 0;
sg_cpu->iowait_boost = 0;
sg_cpu->iowait_boost_max = policy->cpuinfo.max_freq;
cpufreq_add_update_util_hook(cpu, &sg_cpu->update_util,
sugov_update_shared);
} else {
......@@ -495,28 +562,15 @@ static struct cpufreq_governor schedutil_gov = {
.limits = sugov_limits,
};
static int __init sugov_module_init(void)
{
return cpufreq_register_governor(&schedutil_gov);
}
static void __exit sugov_module_exit(void)
{
cpufreq_unregister_governor(&schedutil_gov);
}
MODULE_AUTHOR("Rafael J. Wysocki <rafael.j.wysocki@intel.com>");
MODULE_DESCRIPTION("Utilization-based CPU frequency selection");
MODULE_LICENSE("GPL");
#ifdef CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL
struct cpufreq_governor *cpufreq_default_governor(void)
{
return &schedutil_gov;
}
fs_initcall(sugov_module_init);
#else
module_init(sugov_module_init);
#endif
module_exit(sugov_module_exit);
static int __init sugov_register(void)
{
return cpufreq_register_governor(&schedutil_gov);
}
fs_initcall(sugov_register);
......@@ -735,9 +735,8 @@ static void update_curr_dl(struct rq *rq)
return;
}
/* kick cpufreq (see the comment in linux/cpufreq.h). */
if (cpu_of(rq) == smp_processor_id())
cpufreq_trigger_update(rq_clock(rq));
/* kick cpufreq (see the comment in kernel/sched/sched.h). */
cpufreq_update_this_cpu(rq, SCHED_CPUFREQ_DL);
schedstat_set(curr->se.statistics.exec_max,
max(curr->se.statistics.exec_max, delta_exec));
......
......@@ -2875,12 +2875,7 @@ static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
{
struct rq *rq = rq_of(cfs_rq);
int cpu = cpu_of(rq);
if (cpu == smp_processor_id() && &rq->cfs == cfs_rq) {
unsigned long max = rq->cpu_capacity_orig;
if (&this_rq()->cfs == cfs_rq) {
/*
* There are a few boundary cases this might miss but it should
* get called often enough that that should (hopefully) not be
......@@ -2897,8 +2892,7 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
*
* See cpu_util().
*/
cpufreq_update_util(rq_clock(rq),
min(cfs_rq->avg.util_avg, max), max);
cpufreq_update_util(rq_of(cfs_rq), 0);
}
}
......@@ -3159,10 +3153,7 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
static inline void update_load_avg(struct sched_entity *se, int not_used)
{
struct cfs_rq *cfs_rq = cfs_rq_of(se);
struct rq *rq = rq_of(cfs_rq);
cpufreq_trigger_update(rq_clock(rq));
cpufreq_update_util(rq_of(cfs_rq_of(se)), 0);
}
static inline void
......@@ -4509,6 +4500,14 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
struct cfs_rq *cfs_rq;
struct sched_entity *se = &p->se;
/*
* If in_iowait is set, the code below may not trigger any cpufreq
* utilization updates, so do it here explicitly with the IOWAIT flag
* passed.
*/
if (p->in_iowait)
cpufreq_update_this_cpu(rq, SCHED_CPUFREQ_IOWAIT);
for_each_sched_entity(se) {
if (se->on_rq)
break;
......
......@@ -957,9 +957,8 @@ static void update_curr_rt(struct rq *rq)
if (unlikely((s64)delta_exec <= 0))
return;
/* Kick cpufreq (see the comment in linux/cpufreq.h). */
if (cpu_of(rq) == smp_processor_id())
cpufreq_trigger_update(rq_clock(rq));
/* Kick cpufreq (see the comment in kernel/sched/sched.h). */
cpufreq_update_this_cpu(rq, SCHED_CPUFREQ_RT);
schedstat_set(curr->se.statistics.exec_max,
max(curr->se.statistics.exec_max, delta_exec));
......
......@@ -1763,27 +1763,13 @@ DECLARE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
/**
* cpufreq_update_util - Take a note about CPU utilization changes.
* @time: Current time.
* @util: Current utilization.
* @max: Utilization ceiling.
* @rq: Runqueue to carry out the update for.
* @flags: Update reason flags.
*
* This function is called by the scheduler on every invocation of
* update_load_avg() on the CPU whose utilization is being updated.
* This function is called by the scheduler on the CPU whose utilization is
* being updated.
*
* It can only be called from RCU-sched read-side critical sections.
*/
static inline void cpufreq_update_util(u64 time, unsigned long util, unsigned long max)
{
struct update_util_data *data;
data = rcu_dereference_sched(*this_cpu_ptr(&cpufreq_update_util_data));
if (data)
data->func(data, time, util, max);
}
/**
* cpufreq_trigger_update - Trigger CPU performance state evaluation if needed.
* @time: Current time.
*
* The way cpufreq is currently arranged requires it to evaluate the CPU
* performance state (frequency/voltage) on a regular basis to prevent it from
......@@ -1797,13 +1783,23 @@ static inline void cpufreq_update_util(u64 time, unsigned long util, unsigned lo
* but that really is a band-aid. Going forward it should be replaced with
* solutions targeted more specifically at RT and DL tasks.
*/
static inline void cpufreq_trigger_update(u64 time)
static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
struct update_util_data *data;
data = rcu_dereference_sched(*this_cpu_ptr(&cpufreq_update_util_data));
if (data)
data->func(data, rq_clock(rq), flags);
}
static inline void cpufreq_update_this_cpu(struct rq *rq, unsigned int flags)
{
cpufreq_update_util(time, ULONG_MAX, 0);
if (cpu_of(rq) == smp_processor_id())
cpufreq_update_util(rq, flags);
}
#else
static inline void cpufreq_update_util(u64 time, unsigned long util, unsigned long max) {}
static inline void cpufreq_trigger_update(u64 time) {}
static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
static inline void cpufreq_update_this_cpu(struct rq *rq, unsigned int flags) {}
#endif /* CONFIG_CPU_FREQ */
#ifdef arch_scale_freq_capacity
......
......@@ -76,8 +76,6 @@ config PAGE_POISONING_ZERO
no longer necessary to write zeros when GFP_ZERO is used on
allocation.
Enabling page poisoning with this option will disable hibernation
If unsure, say N
bool
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment