Commit 12dd08fa authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These make hibernation on 32-bit x86 systems work in all of the cases
  in which it works on 64-bit x86 ones, update the menu cpuidle governor
  and the "polling" state to make them more efficient, add more hardware
  support to cpufreq drivers and fix issues with some of them, fix a bug
  in the conservative cpufreq governor, fix the operating performance
  points (OPP) framework and make it more stable, update the devfreq
  subsystem to take changes in the APIs used by into account and clean
  up some things all over.

  Specifics:

   - Backport hibernation bug fixes from x86-64 to x86-32 and
     consolidate hibernation handling on x86 to allow 32-bit systems to
     work in all of the cases in which 64-bit ones work (Zhimin Gu, Chen
     Yu).

   - Fix hibernation documentation (Vladimir D. Seleznev).

   - Update the menu cpuidle governor to fix a couple of issues with it,
     make it more efficient in some cases and clean it up (Rafael
     Wysocki).

   - Rework the cpuidle polling state implementation to make it more
     efficient (Rafael Wysocki).

   - Clean up the cpuidle core somewhat (Fieah Lim).

   - Fix the cpufreq conservative governor to take policy limits into
     account properly in some cases (Rafael Wysocki).

   - Add support for retrieving guaranteed performance information to
     the ACPI CPPC library and make the intel_pstate driver use it to
     expose the CPU base frequency via sysfs on systems with the
     hardware-managed P-states (HWP) feature enabled (Srinivas
     Pandruvada).

   - Fix clang warning in the CPPC cpufreq driver (Nathan Chancellor).

   - Get rid of device_node.name printing from cpufreq (Rob Herring).

   - Remove unnecessary unlikely() from the cpufreq core (Igor Stoppa).

   - Add support for the r8a7744 SoC to the cpufreq-dt driver (Biju
     Das).

   - Update the dt-platdev cpufreq driver to allow RK3399 to have
     separate tunables per cluster (Dmitry Torokhov).

   - Fix the dma_alloc_coherent() usage in the tegra186 cpufreq driver
     (Christoph Hellwig).

   - Make the imx6q cpufreq driver read OCOTP through nvmem for
     imx6ul/imx6ull (Anson Huang).

   - Fix several bugs in the operating performance points (OPP)
     framework and make it more stable (Viresh Kumar, Dave Gerlach).

   - Update the devfreq subsystem to take changes in the APIs used by
     into account, fix some issues with it and make it stop print
     device_node.name directly (Bjorn Andersson, Enric Balletbo i Serra,
     Matthias Kaehlcke, Rob Herring, Vincent Donnefort, zhong jiang).

   - Prepare the generic power domains (genpd) framework for dealing
     with domains containing CPUs (Ulf Hansson).

   - Prevent sysfs attributes representing low-power S0 residency
     counters from being exposed if low-power S0 support is not
     indicated in ACPI FADT (Rajneesh Bhardwaj).

   - Get rid of custom CPU features macros for Intel CPUs from the
     intel_idle and RAPL drivers (Andy Shevchenko).

   - Update the tasks freezer to list tasks that refused to freeze and
     caused a system transition to a sleep state to be aborted (Todd
     Brandt).

   - Update the pm-graph set of tools to v5.2 (Todd Brandt).

   - Fix some issues in the cpupower utility (Anders Roxell, Prarit
     Bhargava)"

* tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
  PM / Domains: Document flags for genpd
  PM / Domains: Deal with multiple states but no governor in genpd
  PM / Domains: Don't treat zero found compatible idle states as an error
  cpuidle: menu: Avoid computations when result will be discarded
  cpuidle: menu: Drop redundant comparison
  cpufreq: tegra186: don't pass GFP_DMA32 to dma_alloc_coherent()
  cpufreq: conservative: Take limits changes into account properly
  Documentation: intel_pstate: Add base_frequency information
  cpufreq: intel_pstate: Add base_frequency attribute
  ACPI / CPPC: Add support for guaranteed performance
  cpuidle: menu: Simplify checks related to the polling state
  PM / tools: sleepgraph and bootgraph: upgrade to v5.2
  PM / tools: sleepgraph: first batch of v5.2 changes
  cpupower: Fix coredump on VMWare
  cpupower: Fix AMD Family 0x17 msr_pstate size
  cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull
  cpufreq: dt-platdev: allow RK3399 to have separate tunables per cluster
  cpuidle: poll_state: Revise loop termination condition
  cpuidle: menu: Move the latency_req == 0 special case check
  cpuidle: menu: Avoid computations for very close timers
  ...
parents 72f86d08 cc19b05e
......@@ -99,7 +99,7 @@ Description:
this file, the suspend image will be as small as possible.
Reading from this file will display the current image size
limit, which is set to 500 MB by default.
limit, which is set to around 2/5 of available RAM by default.
What: /sys/power/pm_trace
Date: August 2006
......
......@@ -465,6 +465,13 @@ Next, the following policy attributes have special meaning if
policy for the time interval between the last two invocations of the
driver's utilization update callback by the CPU scheduler for that CPU.
One more policy attribute is present if the `HWP feature is enabled in the
processor <Active Mode With HWP_>`_:
``base_frequency``
Shows the base frequency of the CPU. Any frequency above this will be
in the turbo frequency range.
The meaning of these attributes in the `passive mode <Passive Mode_>`_ is the
same as for other scaling drivers.
......
......@@ -56,7 +56,7 @@ If you want to limit the suspend image size to N bytes, do
echo N > /sys/power/image_size
before suspend (it is limited to 500 MB by default).
before suspend (it is limited to around 2/5 of available RAM by default).
. The resume process checks for the presence of the resume device,
if found, it then checks the contents for the hibernation image signature.
......
......@@ -2422,7 +2422,7 @@ menu "Power management and ACPI options"
config ARCH_HIBERNATION_HEADER
def_bool y
depends on X86_64 && HIBERNATION
depends on HIBERNATION
source "kernel/power/Kconfig"
......
......@@ -4,3 +4,11 @@
#else
# include <asm/suspend_64.h>
#endif
extern unsigned long restore_jump_address __visible;
extern unsigned long jump_address_phys;
extern unsigned long restore_cr3 __visible;
extern unsigned long temp_pgt __visible;
extern unsigned long relocated_restore_code __visible;
extern int relocate_restore_code(void);
/* Defined in hibernate_asm_32/64.S */
extern asmlinkage __visible int restore_image(void);
......@@ -32,4 +32,8 @@ struct saved_context {
unsigned long return_address;
} __attribute__((packed));
/* routines for saving/restoring kernel state */
extern char core_restore_code[];
extern char restore_registers[];
#endif /* _ASM_X86_SUSPEND_32_H */
......@@ -1251,7 +1251,7 @@ void __init setup_arch(char **cmdline_p)
x86_init.hyper.guest_late_init();
e820__reserve_resources();
e820__register_nosave_regions(max_low_pfn);
e820__register_nosave_regions(max_pfn);
x86_init.resources.reserve_resources();
......
......@@ -7,4 +7,4 @@ nostackp := $(call cc-option, -fno-stack-protector)
CFLAGS_cpu.o := $(nostackp)
obj-$(CONFIG_PM_SLEEP) += cpu.o
obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o
obj-$(CONFIG_HIBERNATION) += hibernate_$(BITS).o hibernate_asm_$(BITS).o hibernate.o
// SPDX-License-Identifier: GPL-2.0
/*
* Hibernation support for x86
*
* Copyright (c) 2007 Rafael J. Wysocki <rjw@sisk.pl>
* Copyright (c) 2002 Pavel Machek <pavel@ucw.cz>
* Copyright (c) 2001 Patrick Mochel <mochel@osdl.org>
*/
#include <linux/gfp.h>
#include <linux/smp.h>
#include <linux/suspend.h>
#include <linux/scatterlist.h>
#include <linux/kdebug.h>
#include <crypto/hash.h>
#include <asm/e820/api.h>
#include <asm/init.h>
#include <asm/proto.h>
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/mtrr.h>
#include <asm/sections.h>
#include <asm/suspend.h>
#include <asm/tlbflush.h>
/*
* Address to jump to in the last phase of restore in order to get to the image
* kernel's text (this value is passed in the image header).
*/
unsigned long restore_jump_address __visible;
unsigned long jump_address_phys;
/*
* Value of the cr3 register from before the hibernation (this value is passed
* in the image header).
*/
unsigned long restore_cr3 __visible;
unsigned long temp_pgt __visible;
unsigned long relocated_restore_code __visible;
/**
* pfn_is_nosave - check if given pfn is in the 'nosave' section
*/
int pfn_is_nosave(unsigned long pfn)
{
unsigned long nosave_begin_pfn;
unsigned long nosave_end_pfn;
nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
return pfn >= nosave_begin_pfn && pfn < nosave_end_pfn;
}
#define MD5_DIGEST_SIZE 16
struct restore_data_record {
unsigned long jump_address;
unsigned long jump_address_phys;
unsigned long cr3;
unsigned long magic;
u8 e820_digest[MD5_DIGEST_SIZE];
};
#if IS_BUILTIN(CONFIG_CRYPTO_MD5)
/**
* get_e820_md5 - calculate md5 according to given e820 table
*
* @table: the e820 table to be calculated
* @buf: the md5 result to be stored to
*/
static int get_e820_md5(struct e820_table *table, void *buf)
{
struct crypto_shash *tfm;
struct shash_desc *desc;
int size;
int ret = 0;
tfm = crypto_alloc_shash("md5", 0, 0);
if (IS_ERR(tfm))
return -ENOMEM;
desc = kmalloc(sizeof(struct shash_desc) + crypto_shash_descsize(tfm),
GFP_KERNEL);
if (!desc) {
ret = -ENOMEM;
goto free_tfm;
}
desc->tfm = tfm;
desc->flags = 0;
size = offsetof(struct e820_table, entries) +
sizeof(struct e820_entry) * table->nr_entries;
if (crypto_shash_digest(desc, (u8 *)table, size, buf))
ret = -EINVAL;
kzfree(desc);
free_tfm:
crypto_free_shash(tfm);
return ret;
}
static int hibernation_e820_save(void *buf)
{
return get_e820_md5(e820_table_firmware, buf);
}
static bool hibernation_e820_mismatch(void *buf)
{
int ret;
u8 result[MD5_DIGEST_SIZE];
memset(result, 0, MD5_DIGEST_SIZE);
/* If there is no digest in suspend kernel, let it go. */
if (!memcmp(result, buf, MD5_DIGEST_SIZE))
return false;
ret = get_e820_md5(e820_table_firmware, result);
if (ret)
return true;
return memcmp(result, buf, MD5_DIGEST_SIZE) ? true : false;
}
#else
static int hibernation_e820_save(void *buf)
{
return 0;
}
static bool hibernation_e820_mismatch(void *buf)
{
/* If md5 is not builtin for restore kernel, let it go. */
return false;
}
#endif
#ifdef CONFIG_X86_64
#define RESTORE_MAGIC 0x23456789ABCDEF01UL
#else
#define RESTORE_MAGIC 0x12345678UL
#endif
/**
* arch_hibernation_header_save - populate the architecture specific part
* of a hibernation image header
* @addr: address to save the data at
*/
int arch_hibernation_header_save(void *addr, unsigned int max_size)
{
struct restore_data_record *rdr = addr;
if (max_size < sizeof(struct restore_data_record))
return -EOVERFLOW;
rdr->magic = RESTORE_MAGIC;
rdr->jump_address = (unsigned long)restore_registers;
rdr->jump_address_phys = __pa_symbol(restore_registers);
/*
* The restore code fixes up CR3 and CR4 in the following sequence:
*
* [in hibernation asm]
* 1. CR3 <= temporary page tables
* 2. CR4 <= mmu_cr4_features (from the kernel that restores us)
* 3. CR3 <= rdr->cr3
* 4. CR4 <= mmu_cr4_features (from us, i.e. the image kernel)
* [in restore_processor_state()]
* 5. CR4 <= saved CR4
* 6. CR3 <= saved CR3
*
* Our mmu_cr4_features has CR4.PCIDE=0, and toggling
* CR4.PCIDE while CR3's PCID bits are nonzero is illegal, so
* rdr->cr3 needs to point to valid page tables but must not
* have any of the PCID bits set.
*/
rdr->cr3 = restore_cr3 & ~CR3_PCID_MASK;
return hibernation_e820_save(rdr->e820_digest);
}
/**
* arch_hibernation_header_restore - read the architecture specific data
* from the hibernation image header
* @addr: address to read the data from
*/
int arch_hibernation_header_restore(void *addr)
{
struct restore_data_record *rdr = addr;
if (rdr->magic != RESTORE_MAGIC) {
pr_crit("Unrecognized hibernate image header format!\n");
return -EINVAL;
}
restore_jump_address = rdr->jump_address;
jump_address_phys = rdr->jump_address_phys;
restore_cr3 = rdr->cr3;
if (hibernation_e820_mismatch(rdr->e820_digest)) {
pr_crit("Hibernate inconsistent memory map detected!\n");
return -ENODEV;
}
return 0;
}
int relocate_restore_code(void)
{
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
relocated_restore_code = get_safe_page(GFP_ATOMIC);
if (!relocated_restore_code)
return -ENOMEM;
memcpy((void *)relocated_restore_code, core_restore_code, PAGE_SIZE);
/* Make the page containing the relocated code executable */
pgd = (pgd_t *)__va(read_cr3_pa()) +
pgd_index(relocated_restore_code);
p4d = p4d_offset(pgd, relocated_restore_code);
if (p4d_large(*p4d)) {
set_p4d(p4d, __p4d(p4d_val(*p4d) & ~_PAGE_NX));
goto out;
}
pud = pud_offset(p4d, relocated_restore_code);
if (pud_large(*pud)) {
set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX));
goto out;
}
pmd = pmd_offset(pud, relocated_restore_code);
if (pmd_large(*pmd)) {
set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX));
goto out;
}
pte = pte_offset_kernel(pmd, relocated_restore_code);
set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX));
out:
__flush_tlb_all();
return 0;
}
......@@ -14,9 +14,7 @@
#include <asm/pgtable.h>
#include <asm/mmzone.h>
#include <asm/sections.h>
/* Defined in hibernate_asm_32.S */
extern int restore_image(void);
#include <asm/suspend.h>
/* Pointer to the temporary resume page tables */
pgd_t *resume_pg_dir;
......@@ -145,6 +143,32 @@ static inline void resume_init_first_level_page_table(pgd_t *pg_dir)
#endif
}
static int set_up_temporary_text_mapping(pgd_t *pgd_base)
{
pgd_t *pgd;
pmd_t *pmd;
pte_t *pte;
pgd = pgd_base + pgd_index(restore_jump_address);
pmd = resume_one_md_table_init(pgd);
if (!pmd)
return -ENOMEM;
if (boot_cpu_has(X86_FEATURE_PSE)) {
set_pmd(pmd + pmd_index(restore_jump_address),
__pmd((jump_address_phys & PMD_MASK) | pgprot_val(PAGE_KERNEL_LARGE_EXEC)));
} else {
pte = resume_one_page_table_init(pmd);
if (!pte)
return -ENOMEM;
set_pte(pte + pte_index(restore_jump_address),
__pte((jump_address_phys & PAGE_MASK) | pgprot_val(PAGE_KERNEL_EXEC)));
}
return 0;
}
asmlinkage int swsusp_arch_resume(void)
{
int error;
......@@ -154,22 +178,22 @@ asmlinkage int swsusp_arch_resume(void)
return -ENOMEM;
resume_init_first_level_page_table(resume_pg_dir);
error = set_up_temporary_text_mapping(resume_pg_dir);
if (error)
return error;
error = resume_physical_mapping_init(resume_pg_dir);
if (error)
return error;
temp_pgt = __pa(resume_pg_dir);
error = relocate_restore_code();
if (error)
return error;
/* We have got enough memory and from now on we cannot recover */
restore_image();
return 0;
}
/*
* pfn_is_nosave - check if given pfn is in the 'nosave' section
*/
int pfn_is_nosave(unsigned long pfn)
{
unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
}
......@@ -26,26 +26,6 @@
#include <asm/suspend.h>
#include <asm/tlbflush.h>
/* Defined in hibernate_asm_64.S */
extern asmlinkage __visible int restore_image(void);
/*
* Address to jump to in the last phase of restore in order to get to the image
* kernel's text (this value is passed in the image header).
*/
unsigned long restore_jump_address __visible;
unsigned long jump_address_phys;
/*
* Value of the cr3 register from before the hibernation (this value is passed
* in the image header).
*/
unsigned long restore_cr3 __visible;
unsigned long temp_level4_pgt __visible;
unsigned long relocated_restore_code __visible;
static int set_up_temporary_text_mapping(pgd_t *pgd)
{
pmd_t *pmd;
......@@ -141,46 +121,7 @@ static int set_up_temporary_mappings(void)
return result;
}
temp_level4_pgt = __pa(pgd);
return 0;
}
static int relocate_restore_code(void)
{
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
relocated_restore_code = get_safe_page(GFP_ATOMIC);
if (!relocated_restore_code)
return -ENOMEM;
memcpy((void *)relocated_restore_code, core_restore_code, PAGE_SIZE);
/* Make the page containing the relocated code executable */
pgd = (pgd_t *)__va(read_cr3_pa()) +
pgd_index(relocated_restore_code);
p4d = p4d_offset(pgd, relocated_restore_code);
if (p4d_large(*p4d)) {
set_p4d(p4d, __p4d(p4d_val(*p4d) & ~_PAGE_NX));
goto out;
}
pud = pud_offset(p4d, relocated_restore_code);
if (pud_large(*pud)) {
set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX));
goto out;
}
pmd = pmd_offset(pud, relocated_restore_code);
if (pmd_large(*pmd)) {
set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX));
goto out;
}
pte = pte_offset_kernel(pmd, relocated_restore_code);
set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX));
out:
__flush_tlb_all();
temp_pgt = __pa(pgd);
return 0;
}
......@@ -200,166 +141,3 @@ asmlinkage int swsusp_arch_resume(void)
restore_image();
return 0;
}
/*
* pfn_is_nosave - check if given pfn is in the 'nosave' section
*/
int pfn_is_nosave(unsigned long pfn)
{
unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
}
#define MD5_DIGEST_SIZE 16
struct restore_data_record {
unsigned long jump_address;
unsigned long jump_address_phys;
unsigned long cr3;
unsigned long magic;
u8 e820_digest[MD5_DIGEST_SIZE];
};
#define RESTORE_MAGIC 0x23456789ABCDEF01UL
#if IS_BUILTIN(CONFIG_CRYPTO_MD5)
/**
* get_e820_md5 - calculate md5 according to given e820 table
*
* @table: the e820 table to be calculated
* @buf: the md5 result to be stored to
*/
static int get_e820_md5(struct e820_table *table, void *buf)
{
struct crypto_shash *tfm;
struct shash_desc *desc;
int size;
int ret = 0;
tfm = crypto_alloc_shash("md5", 0, 0);
if (IS_ERR(tfm))
return -ENOMEM;
desc = kmalloc(sizeof(struct shash_desc) + crypto_shash_descsize(tfm),
GFP_KERNEL);
if (!desc) {
ret = -ENOMEM;
goto free_tfm;
}
desc->tfm = tfm;
desc->flags = 0;
size = offsetof(struct e820_table, entries) +
sizeof(struct e820_entry) * table->nr_entries;
if (crypto_shash_digest(desc, (u8 *)table, size, buf))
ret = -EINVAL;
kzfree(desc);
free_tfm:
crypto_free_shash(tfm);
return ret;
}
static void hibernation_e820_save(void *buf)
{
get_e820_md5(e820_table_firmware, buf);
}
static bool hibernation_e820_mismatch(void *buf)
{
int ret;
u8 result[MD5_DIGEST_SIZE];
memset(result, 0, MD5_DIGEST_SIZE);
/* If there is no digest in suspend kernel, let it go. */
if (!memcmp(result, buf, MD5_DIGEST_SIZE))
return false;
ret = get_e820_md5(e820_table_firmware, result);
if (ret)
return true;
return memcmp(result, buf, MD5_DIGEST_SIZE) ? true : false;
}
#else
static void hibernation_e820_save(void *buf)
{
}
static bool hibernation_e820_mismatch(void *buf)
{
/* If md5 is not builtin for restore kernel, let it go. */
return false;
}
#endif
/**
* arch_hibernation_header_save - populate the architecture specific part
* of a hibernation image header
* @addr: address to save the data at
*/
int arch_hibernation_header_save(void *addr, unsigned int max_size)
{
struct restore_data_record *rdr = addr;
if (max_size < sizeof(struct restore_data_record))
return -EOVERFLOW;
rdr->jump_address = (unsigned long)restore_registers;
rdr->jump_address_phys = __pa_symbol(restore_registers);
/*
* The restore code fixes up CR3 and CR4 in the following sequence:
*
* [in hibernation asm]
* 1. CR3 <= temporary page tables
* 2. CR4 <= mmu_cr4_features (from the kernel that restores us)
* 3. CR3 <= rdr->cr3
* 4. CR4 <= mmu_cr4_features (from us, i.e. the image kernel)
* [in restore_processor_state()]
* 5. CR4 <= saved CR4
* 6. CR3 <= saved CR3
*
* Our mmu_cr4_features has CR4.PCIDE=0, and toggling
* CR4.PCIDE while CR3's PCID bits are nonzero is illegal, so
* rdr->cr3 needs to point to valid page tables but must not
* have any of the PCID bits set.
*/
rdr->cr3 = restore_cr3 & ~CR3_PCID_MASK;
rdr->magic = RESTORE_MAGIC;
hibernation_e820_save(rdr->e820_digest);
return 0;
}
/**
* arch_hibernation_header_restore - read the architecture specific data
* from the hibernation image header
* @addr: address to read the data from
*/
int arch_hibernation_header_restore(void *addr)
{
struct restore_data_record *rdr = addr;
restore_jump_address = rdr->jump_address;
jump_address_phys = rdr->jump_address_phys;
restore_cr3 = rdr->cr3;
if (rdr->magic != RESTORE_MAGIC) {
pr_crit("Unrecognized hibernate image header format!\n");
return -EINVAL;
}
if (hibernation_e820_mismatch(rdr->e820_digest)) {
pr_crit("Hibernate inconsistent memory map detected!\n");
return -ENODEV;
}
return 0;
}
......@@ -12,6 +12,7 @@
#include <asm/page_types.h>
#include <asm/asm-offsets.h>
#include <asm/processor-flags.h>
#include <asm/frame.h>
.text
......@@ -24,13 +25,30 @@ ENTRY(swsusp_arch_suspend)
pushfl
popl saved_context_eflags
/* save cr3 */
movl %cr3, %eax
movl %eax, restore_cr3
FRAME_BEGIN
call swsusp_save
FRAME_END
ret
ENDPROC(swsusp_arch_suspend)
ENTRY(restore_image)
/* prepare to jump to the image kernel */
movl restore_jump_address, %ebx
movl restore_cr3, %ebp
movl mmu_cr4_features, %ecx
movl resume_pg_dir, %eax
subl $__PAGE_OFFSET, %eax
/* jump to relocated restore code */
movl relocated_restore_code, %eax
jmpl *%eax
/* code below has been relocated to a safe page */
ENTRY(core_restore_code)
movl temp_pgt, %eax
movl %eax, %cr3
jecxz 1f # cr4 Pentium and higher, skip if zero
......@@ -49,7 +67,7 @@ copy_loop:
movl pbe_address(%edx), %esi
movl pbe_orig_address(%edx), %edi
movl $1024, %ecx
movl $(PAGE_SIZE >> 2), %ecx
rep
movsl
......@@ -58,10 +76,13 @@ copy_loop:
.p2align 4,,7
done:
jmpl *%ebx
/* code below belongs to the image kernel */
.align PAGE_SIZE
ENTRY(restore_registers)
/* go back to the original page tables */
movl $swapper_pg_dir, %eax
subl $__PAGE_OFFSET, %eax
movl %eax, %cr3
movl %ebp, %cr3
movl mmu_cr4_features, %ecx
jecxz 1f # cr4 Pentium and higher, skip if zero
movl %ecx, %cr4; # turn PGE back on
......@@ -82,4 +103,8 @@ done:
xorl %eax, %eax
/* tell the hibernation core that we've just restored the memory */
movl %eax, in_suspend
ret
ENDPROC(restore_registers)
......@@ -59,7 +59,7 @@ ENTRY(restore_image)
movq restore_cr3(%rip), %r9
/* prepare to switch to temporary page tables */
movq temp_level4_pgt(%rip), %rax
movq temp_pgt(%rip), %rax
movq mmu_cr4_features(%rip), %rbx
/* prepare to copy image data to their original locations */
......
......@@ -117,11 +117,17 @@ static void lpit_update_residency(struct lpit_residency_info *info,
if (!info->iomem_addr)
return;
if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
return;
/* Silently fail, if cpuidle attribute group is not present */
sysfs_add_file_to_group(&cpu_subsys.dev_root->kobj,
&dev_attr_low_power_idle_system_residency_us.attr,
"cpuidle");
} else if (info->gaddr.space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) {
if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
return;
/* Silently fail, if cpuidle attribute group is not present */
sysfs_add_file_to_group(&cpu_subsys.dev_root->kobj,
&dev_attr_low_power_idle_cpu_residency_us.attr,
......
......@@ -1061,9 +1061,9 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
{
struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
struct cpc_register_resource *highest_reg, *lowest_reg,
*lowest_non_linear_reg, *nominal_reg,
*lowest_non_linear_reg, *nominal_reg, *guaranteed_reg,
*low_freq_reg = NULL, *nom_freq_reg = NULL;
u64 high, low, nom, min_nonlinear, low_f = 0, nom_f = 0;
u64 high, low, guaranteed, nom, min_nonlinear, low_f = 0, nom_f = 0;
int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpunum);
struct cppc_pcc_data *pcc_ss_data = NULL;
int ret = 0, regs_in_pcc = 0;
......@@ -1079,6 +1079,7 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
nominal_reg = &cpc_desc->cpc_regs[NOMINAL_PERF];
low_freq_reg = &cpc_desc->cpc_regs[LOWEST_FREQ];
nom_freq_reg = &cpc_desc->cpc_regs[NOMINAL_FREQ];
guaranteed_reg = &cpc_desc->cpc_regs[GUARANTEED_PERF];
/* Are any of the regs PCC ?*/
if (CPC_IN_PCC(highest_reg) || CPC_IN_PCC(lowest_reg) ||
......@@ -1107,6 +1108,9 @@ int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
cpc_read(cpunum, nominal_reg, &nom);
perf_caps->nominal_perf = nom;
cpc_read(cpunum, guaranteed_reg, &guaranteed);
perf_caps->guaranteed_perf = guaranteed;
cpc_read(cpunum, lowest_non_linear_reg, &min_nonlinear);
perf_caps->lowest_nonlinear_perf = min_nonlinear;
......
......@@ -467,6 +467,10 @@ static int genpd_power_off(struct generic_pm_domain *genpd, bool one_dev_on,
return -EAGAIN;
}
/* Default to shallowest state. */
if (!genpd->gov)
genpd->state_idx = 0;
if (genpd->power_off) {
int ret;
......@@ -1687,6 +1691,8 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
ret = genpd_set_default_power_state(genpd);
if (ret)
return ret;
} else if (!gov) {
pr_warn("%s : no governor for states\n", genpd->name);
}
device_initialize(&genpd->dev);
......@@ -2478,8 +2484,8 @@ static int genpd_iterate_idle_states(struct device_node *dn,
*
* Returns the device states parsed from the OF node. The memory for the states
* is allocated by this function and is the responsibility of the caller to
* free the memory after use. If no domain idle states is found it returns
* -EINVAL and in case of errors, a negative error code.
* free the memory after use. If any or zero compatible domain idle states is
* found it returns 0 and in case of errors, a negative error code is returned.
*/
int of_genpd_parse_idle_states(struct device_node *dn,
struct genpd_power_state **states, int *n)
......@@ -2488,8 +2494,14 @@ int of_genpd_parse_idle_states(struct device_node *dn,
int ret;
ret = genpd_iterate_idle_states(dn, NULL);
if (ret <= 0)
return ret < 0 ? ret : -EINVAL;
if (ret < 0)
return ret;
if (!ret) {
*states = NULL;
*n = 0;
return 0;
}
st = kcalloc(ret, sizeof(*st), GFP_KERNEL);
if (!st)
......
......@@ -428,7 +428,7 @@ MODULE_LICENSE("GPL");
late_initcall(cppc_cpufreq_init);
static const struct acpi_device_id cppc_acpi_ids[] = {
static const struct acpi_device_id cppc_acpi_ids[] __used = {
{ACPI_PROCESSOR_DEVICE_HID, },
{}
};
......
......@@ -58,6 +58,7 @@ static const struct of_device_id whitelist[] __initconst = {
{ .compatible = "renesas,r8a73a4", },
{ .compatible = "renesas,r8a7740", },
{ .compatible = "renesas,r8a7743", },
{ .compatible = "renesas,r8a7744", },
{ .compatible = "renesas,r8a7745", },
{ .compatible = "renesas,r8a7778", },
{ .compatible = "renesas,r8a7779", },
......@@ -78,7 +79,10 @@ static const struct of_device_id whitelist[] __initconst = {
{ .compatible = "rockchip,rk3328", },
{ .compatible = "rockchip,rk3366", },
{ .compatible = "rockchip,rk3368", },
{ .compatible = "rockchip,rk3399", },
{ .compatible = "rockchip,rk3399",
.data = &(struct cpufreq_dt_platform_data)
{ .have_governor_per_policy = true, },
},
{ .compatible = "st-ericsson,u8500", },
{ .compatible = "st-ericsson,u8540", },
......
......@@ -32,6 +32,7 @@ struct private_data {
struct device *cpu_dev;
struct thermal_cooling_device *cdev;
const char *reg_name;
bool have_static_opps;
};
static struct freq_attr *cpufreq_dt_attr[] = {
......@@ -204,6 +205,15 @@ static int cpufreq_init(struct cpufreq_policy *policy)
}
}
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv) {
ret = -ENOMEM;
goto out_put_regulator;
}
priv->reg_name = name;
priv->opp_table = opp_table;
/*
* Initialize OPP tables for all policy->cpus. They will be shared by
* all CPUs which have marked their CPUs shared with OPP bindings.
......@@ -214,7 +224,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
*
* OPPs might be populated at runtime, don't check for error here
*/
dev_pm_opp_of_cpumask_add_table(policy->cpus);
if (!dev_pm_opp_of_cpumask_add_table(policy->cpus))
priv->have_static_opps = true;
/*
* But we need OPP table to function so if it is not there let's
......@@ -240,19 +251,10 @@ static int cpufreq_init(struct cpufreq_policy *policy)
__func__, ret);
}
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv) {
ret = -ENOMEM;
goto out_free_opp;
}
priv->reg_name = name;
priv->opp_table = opp_table;
ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
if (ret) {
dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
goto out_free_priv;
goto out_free_opp;
}
priv->cpu_dev = cpu_dev;
......@@ -282,10 +284,11 @@ static int cpufreq_init(struct cpufreq_policy *policy)
out_free_cpufreq_table:
dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
out_free_priv:
kfree(priv);
out_free_opp:
if (priv->have_static_opps)
dev_pm_opp_of_cpumask_remove_table(policy->cpus);
kfree(priv);
out_put_regulator:
if (name)
dev_pm_opp_put_regulators(opp_table);
out_put_clk:
......@@ -300,6 +303,7 @@ static int cpufreq_exit(struct cpufreq_policy *policy)
cpufreq_cooling_unregister(priv->cdev);
dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
if (priv->have_static_opps)
dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
if (priv->reg_name)
dev_pm_opp_put_regulators(priv->opp_table);
......
......@@ -403,7 +403,7 @@ EXPORT_SYMBOL_GPL(cpufreq_freq_transition_begin);
void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs, int transition_failed)
{
if (unlikely(WARN_ON(!policy->transition_ongoing)))
if (WARN_ON(!policy->transition_ongoing))
return;
cpufreq_notify_post_transition(policy, freqs, transition_failed);
......
......@@ -80,8 +80,10 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
* changed in the meantime, so fall back to current frequency in that
* case.
*/
if (requested_freq > policy->max || requested_freq < policy->min)
if (requested_freq > policy->max || requested_freq < policy->min) {
requested_freq = policy->cur;
dbs_info->requested_freq = requested_freq;
}
freq_step = get_freq_step(cs_tuners, policy);
......@@ -92,7 +94,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
if (policy_dbs->idle_periods < UINT_MAX) {
unsigned int freq_steps = policy_dbs->idle_periods * freq_step;
if (requested_freq > freq_steps)
if (requested_freq > policy->min + freq_steps)
requested_freq -= freq_steps;
else
requested_freq = policy->min;
......
......@@ -12,6 +12,7 @@
#include <linux/cpu_cooling.h>
#include <linux/err.h>
#include <linux/module.h>
#include <linux/nvmem-consumer.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/pm_opp.h>
......@@ -290,20 +291,32 @@ static void imx6q_opp_check_speed_grading(struct device *dev)
#define OCOTP_CFG3_6ULL_SPEED_792MHZ 0x2
#define OCOTP_CFG3_6ULL_SPEED_900MHZ 0x3
static void imx6ul_opp_check_speed_grading(struct device *dev)
static int imx6ul_opp_check_speed_grading(struct device *dev)
{
u32 val;
int ret = 0;
if (of_find_property(dev->of_node, "nvmem-cells", NULL)) {
ret = nvmem_cell_read_u32(dev, "speed_grade", &val);
if (ret)
return ret;
} else {
struct device_node *np;
void __iomem *base;
u32 val;
np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp");
if (!np)
return;
return -ENOENT;
base = of_iomap(np, 0);
of_node_put(np);
if (!base) {
dev_err(dev, "failed to map ocotp\n");
goto put_node;
return -EFAULT;
}
val = readl_relaxed(base + OCOTP_CFG3);
iounmap(base);
}
/*
......@@ -314,7 +327,6 @@ static void imx6ul_opp_check_speed_grading(struct device *dev)
* 2b'11: 900000000Hz on i.MX6ULL only;
* We need to set the max speed of ARM according to fuse map.
*/
val = readl_relaxed(base + OCOTP_CFG3);
val >>= OCOTP_CFG3_SPEED_SHIFT;
val &= 0x3;
......@@ -334,9 +346,7 @@ static void imx6ul_opp_check_speed_grading(struct device *dev)
dev_warn(dev, "failed to disable 900MHz OPP\n");
}
iounmap(base);
put_node:
of_node_put(np);
return ret;
}
static int imx6q_cpufreq_probe(struct platform_device *pdev)
......@@ -394,10 +404,18 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
}
if (of_machine_is_compatible("fsl,imx6ul") ||
of_machine_is_compatible("fsl,imx6ull"))
imx6ul_opp_check_speed_grading(cpu_dev);
else
of_machine_is_compatible("fsl,imx6ull")) {
ret = imx6ul_opp_check_speed_grading(cpu_dev);
if (ret == -EPROBE_DEFER)
return ret;
if (ret) {
dev_err(cpu_dev, "failed to read ocotp: %d\n",
ret);
return ret;
}
} else {
imx6q_opp_check_speed_grading(cpu_dev);
}
/* Because we have added the OPPs here, we must free them */
free_opp = true;
......
......@@ -373,10 +373,28 @@ static void intel_pstate_set_itmt_prio(int cpu)
}
}
}
static int intel_pstate_get_cppc_guranteed(int cpu)
{
struct cppc_perf_caps cppc_perf;
int ret;
ret = cppc_get_perf_caps(cpu, &cppc_perf);
if (ret)
return ret;
return cppc_perf.guaranteed_perf;
}
#else
static void intel_pstate_set_itmt_prio(int cpu)
{
}
static int intel_pstate_get_cppc_guranteed(int cpu)
{
return -ENOTSUPP;
}
#endif
static void intel_pstate_init_acpi_perf_limits(struct cpufreq_policy *policy)
......@@ -699,9 +717,29 @@ static ssize_t show_energy_performance_preference(
cpufreq_freq_attr_rw(energy_performance_preference);
static ssize_t show_base_frequency(struct cpufreq_policy *policy, char *buf)
{
struct cpudata *cpu;
u64 cap;
int ratio;
ratio = intel_pstate_get_cppc_guranteed(policy->cpu);
if (ratio <= 0) {
rdmsrl_on_cpu(policy->cpu, MSR_HWP_CAPABILITIES, &cap);
ratio = HWP_GUARANTEED_PERF(cap);
}
cpu = all_cpu_data[policy->cpu];
return sprintf(buf, "%d\n", ratio * cpu->pstate.scaling);
}
cpufreq_freq_attr_ro(base_frequency);
static struct freq_attr *hwp_cpufreq_attrs[] = {
&energy_performance_preference,
&energy_performance_available_preferences,
&base_frequency,
NULL,
};
......
......@@ -84,9 +84,10 @@ static int __init armada_xp_pmsu_cpufreq_init(void)
ret = dev_pm_opp_add(cpu_dev, clk_get_rate(clk) / 2, 0);
if (ret) {
dev_pm_opp_remove(cpu_dev, clk_get_rate(clk));
clk_put(clk);
dev_err(cpu_dev, "Failed to register OPPs\n");
goto opp_register_failed;
return ret;
}
ret = dev_pm_opp_set_sharing_cpus(cpu_dev,
......@@ -99,11 +100,5 @@ static int __init armada_xp_pmsu_cpufreq_init(void)
platform_device_register_simple("cpufreq-dt", -1, NULL, 0);
return 0;
opp_register_failed:
/* As registering has failed remove all the opp for all cpus */
dev_pm_opp_cpumask_remove_table(cpu_possible_mask);
return ret;
}
device_initcall(armada_xp_pmsu_cpufreq_init);
......@@ -611,8 +611,8 @@ static int s5pv210_cpufreq_probe(struct platform_device *pdev)
for_each_compatible_node(np, NULL, "samsung,s5pv210-dmc") {
id = of_alias_get_id(np, "dmc");
if (id < 0 || id >= ARRAY_SIZE(dmc_base)) {
pr_err("%s: failed to get alias of dmc node '%s'\n",
__func__, np->name);
pr_err("%s: failed to get alias of dmc node '%pOFn'\n",
__func__, np);
of_node_put(np);
return id;
}
......
......@@ -121,7 +121,7 @@ static struct cpufreq_frequency_table *init_vhint_table(
void *virt;
virt = dma_alloc_coherent(bpmp->dev, sizeof(*data), &phys,
GFP_KERNEL | GFP_DMA32);
GFP_KERNEL);
if (!virt)
return ERR_PTR(-ENOMEM);
......
......@@ -247,17 +247,17 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
if (!cpuidle_state_is_coupled(drv, index))
local_irq_enable();
if (entered_state >= 0) {
/*
* Update cpuidle counters
* This can be moved to within driver enter routine,
* but that results in multiple copies of same code.
*/
diff = ktime_us_delta(time_end, time_start);
if (diff > INT_MAX)
diff = INT_MAX;
dev->last_residency = (int) diff;
if (entered_state >= 0) {
/* Update cpuidle counters */
/* This can be moved to within driver enter routine
* but that results in multiple copies of same code.
*/
dev->last_residency = (int)diff;
dev->states_usage[entered_state].time += dev->last_residency;
dev->states_usage[entered_state].usage++;
} else {
......
......@@ -80,7 +80,7 @@ static int ladder_select_state(struct cpuidle_driver *drv,
last_state = &ldev->states[last_idx];
last_residency = cpuidle_get_last_residency(dev) - drv->states[last_idx].exit_latency;
last_residency = dev->last_residency - drv->states[last_idx].exit_latency;
/* consider promotion */
if (last_idx < drv->state_count - 1 &&
......
......@@ -124,7 +124,6 @@ struct menu_device {
int tick_wakeup;
unsigned int next_timer_us;
unsigned int predicted_us;
unsigned int bucket;
unsigned int correction_factor[BUCKETS];
unsigned int intervals[INTERVALS];
......@@ -197,10 +196,11 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev);
* of points is below a threshold. If it is... then use the
* average of these 8 points as the estimated value.
*/
static unsigned int get_typical_interval(struct menu_device *data)
static unsigned int get_typical_interval(struct menu_device *data,
unsigned int predicted_us)
{
int i, divisor;
unsigned int max, thresh, avg;
unsigned int min, max, thresh, avg;
uint64_t sum, variance;
thresh = UINT_MAX; /* Discard outliers above this value */
......@@ -208,6 +208,7 @@ static unsigned int get_typical_interval(struct menu_device *data)
again:
/* First calculate the average of past intervals */
min = UINT_MAX;
max = 0;
sum = 0;
divisor = 0;
......@@ -218,8 +219,19 @@ static unsigned int get_typical_interval(struct menu_device *data)
divisor++;
if (value > max)
max = value;
if (value < min)
min = value;
}
}
/*
* If the result of the computation is going to be discarded anyway,
* avoid the computation altogether.
*/
if (min >= predicted_us)
return UINT_MAX;
if (divisor == INTERVALS)
avg = sum >> INTERVAL_SHIFT;
else
......@@ -286,10 +298,9 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
struct menu_device *data = this_cpu_ptr(&menu_devices);
int latency_req = cpuidle_governor_latency_req(dev->cpu);
int i;
int first_idx;
int idx;
unsigned int interactivity_req;
unsigned int expected_interval;
unsigned int predicted_us;
unsigned long nr_iowaiters, cpu_load;
ktime_t delta_next;
......@@ -298,50 +309,36 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
data->needs_update = 0;
}
/* Special case when user has set very strict latency requirement */
if (unlikely(latency_req == 0)) {
*stop_tick = false;
return 0;
}
/* determine the expected residency time, round up */
data->next_timer_us = ktime_to_us(tick_nohz_get_sleep_length(&delta_next));
get_iowait_load(&nr_iowaiters, &cpu_load);
data->bucket = which_bucket(data->next_timer_us, nr_iowaiters);
if (unlikely(drv->state_count <= 1 || latency_req == 0) ||
((data->next_timer_us < drv->states[1].target_residency ||
latency_req < drv->states[1].exit_latency) &&
!drv->states[0].disabled && !dev->states_usage[0].disable)) {
/*
* In this case state[0] will be used no matter what, so return
* it right away and keep the tick running.
*/
*stop_tick = false;
return 0;
}
/*
* Force the result of multiplication to be 64 bits even if both
* operands are 32 bits.
* Make sure to round up for half microseconds.
*/
data->predicted_us = DIV_ROUND_CLOSEST_ULL((uint64_t)data->next_timer_us *
predicted_us = DIV_ROUND_CLOSEST_ULL((uint64_t)data->next_timer_us *
data->correction_factor[data->bucket],
RESOLUTION * DECAY);
expected_interval = get_typical_interval(data);
expected_interval = min(expected_interval, data->next_timer_us);
first_idx = 0;
if (drv->states[0].flags & CPUIDLE_FLAG_POLLING) {
struct cpuidle_state *s = &drv->states[1];
unsigned int polling_threshold;
/*
* Default to a physical idle state, not to busy polling, unless
* a timer is going to trigger really really soon.
*/
polling_threshold = max_t(unsigned int, 20, s->target_residency);
if (data->next_timer_us > polling_threshold &&
latency_req > s->exit_latency && !s->disabled &&
!dev->states_usage[1].disable)
first_idx = 1;
}
/*
* Use the lowest expected idle interval to pick the idle state.
*/
data->predicted_us = min(data->predicted_us, expected_interval);
predicted_us = min(predicted_us, get_typical_interval(data, predicted_us));
if (tick_nohz_tick_stopped()) {
/*
......@@ -352,34 +349,46 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
* the known time till the closest timer event for the idle
* state selection.
*/
if (data->predicted_us < TICK_USEC)
data->predicted_us = ktime_to_us(delta_next);
if (predicted_us < TICK_USEC)
predicted_us = ktime_to_us(delta_next);
} else {
/*
* Use the performance multiplier and the user-configurable
* latency_req to determine the maximum exit latency.
*/
interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
interactivity_req = predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
if (latency_req > interactivity_req)
latency_req = interactivity_req;
}
expected_interval = data->predicted_us;
/*
* Find the idle state with the lowest power while satisfying
* our constraints.
*/
idx = -1;
for (i = first_idx; i < drv->state_count; i++) {
for (i = 0; i < drv->state_count; i++) {
struct cpuidle_state *s = &drv->states[i];
struct cpuidle_state_usage *su = &dev->states_usage[i];
if (s->disabled || su->disable)
continue;
if (idx == -1)
idx = i; /* first enabled state */
if (s->target_residency > data->predicted_us) {
if (data->predicted_us < TICK_USEC)
if (s->target_residency > predicted_us) {
/*
* Use a physical idle state, not busy polling, unless
* a timer is going to trigger soon enough.
*/
if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
s->exit_latency <= latency_req &&
s->target_residency <= data->next_timer_us) {
predicted_us = s->target_residency;
idx = i;
break;
}
if (predicted_us < TICK_USEC)
break;
if (!tick_nohz_tick_stopped()) {
......@@ -389,7 +398,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
* tick in that case and let the governor run
* again in the next iteration of the loop.
*/
expected_interval = drv->states[idx].target_residency;
predicted_us = drv->states[idx].target_residency;
break;
}
......@@ -403,7 +412,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
s->target_residency <= ktime_to_us(delta_next))
idx = i;
goto out;
return idx;
}
if (s->exit_latency > latency_req) {
/*
......@@ -412,7 +421,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
* expected idle duration so that the tick is retained
* as long as that target residency is low enough.
*/
expected_interval = drv->states[idx].target_residency;
predicted_us = drv->states[idx].target_residency;
break;
}
idx = i;
......@@ -426,7 +435,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
* expected idle duration is shorter than the tick period length.
*/
if (((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) ||
expected_interval < TICK_USEC) && !tick_nohz_tick_stopped()) {
predicted_us < TICK_USEC) && !tick_nohz_tick_stopped()) {
unsigned int delta_next_us = ktime_to_us(delta_next);
*stop_tick = false;
......@@ -450,10 +459,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
}
}
out:
data->last_state_idx = idx;
return data->last_state_idx;
return idx;
}
/**
......@@ -512,9 +518,19 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
* duration predictor do a better job next time.
*/
measured_us = 9 * MAX_INTERESTING / 10;
} else if ((drv->states[last_idx].flags & CPUIDLE_FLAG_POLLING) &&
dev->poll_time_limit) {
/*
* The CPU exited the "polling" state due to a time limit, so
* the idle duration prediction leading to the selection of that
* state was inaccurate. If a better prediction had been made,
* the CPU might have been woken up from idle by the next timer.
* Assume that to be the case.
*/
measured_us = data->next_timer_us;
} else {
/* measured value */
measured_us = cpuidle_get_last_residency(dev);
measured_us = dev->last_residency;
/* Deduct exit latency */
if (measured_us > 2 * target->exit_latency)
......
......@@ -9,7 +9,6 @@
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>
#define POLL_IDLE_TIME_LIMIT (TICK_NSEC / 16)
#define POLL_IDLE_RELAX_COUNT 200
static int __cpuidle poll_idle(struct cpuidle_device *dev,
......@@ -17,8 +16,11 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
{
u64 time_start = local_clock();
dev->poll_time_limit = false;
local_irq_enable();
if (!current_set_polling_and_test()) {
u64 limit = (u64)drv->states[1].target_residency * NSEC_PER_USEC;
unsigned int loop_count = 0;
while (!need_resched()) {
......@@ -27,10 +29,12 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
continue;
loop_count = 0;
if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
if (local_clock() - time_start > limit) {
dev->poll_time_limit = true;
break;
}
}
}
current_clr_polling();
return index;
......
......@@ -11,6 +11,7 @@
*/
#include <linux/kernel.h>
#include <linux/kmod.h>
#include <linux/sched.h>
#include <linux/errno.h>
#include <linux/err.h>
......@@ -28,9 +29,6 @@
#include <linux/of.h>
#include "governor.h"
#define MAX(a,b) ((a > b) ? a : b)
#define MIN(a,b) ((a < b) ? a : b)
static struct class *devfreq_class;
/*
......@@ -221,6 +219,49 @@ static struct devfreq_governor *find_devfreq_governor(const char *name)
return ERR_PTR(-ENODEV);
}
/**
* try_then_request_governor() - Try to find the governor and request the
* module if is not found.
* @name: name of the governor
*
* Search the list of devfreq governors and request the module and try again
* if is not found. This can happen when both drivers (the governor driver
* and the driver that call devfreq_add_device) are built as modules.
* devfreq_list_lock should be held by the caller. Returns the matched
* governor's pointer.
*/
static struct devfreq_governor *try_then_request_governor(const char *name)
{
struct devfreq_governor *governor;
int err = 0;
if (IS_ERR_OR_NULL(name)) {
pr_err("DEVFREQ: %s: Invalid parameters\n", __func__);
return ERR_PTR(-EINVAL);
}
WARN(!mutex_is_locked(&devfreq_list_lock),
"devfreq_list_lock must be locked.");
governor = find_devfreq_governor(name);
if (IS_ERR(governor)) {
mutex_unlock(&devfreq_list_lock);
if (!strncmp(name, DEVFREQ_GOV_SIMPLE_ONDEMAND,
DEVFREQ_NAME_LEN))
err = request_module("governor_%s", "simpleondemand");
else
err = request_module("governor_%s", name);
/* Restore previous state before return */
mutex_lock(&devfreq_list_lock);
if (err)
return NULL;
governor = find_devfreq_governor(name);
}
return governor;
}
static int devfreq_notify_transition(struct devfreq *devfreq,
struct devfreq_freqs *freqs, unsigned int state)
{
......@@ -280,14 +321,14 @@ int update_devfreq(struct devfreq *devfreq)
* max_freq
* min_freq
*/
max_freq = MIN(devfreq->scaling_max_freq, devfreq->max_freq);
min_freq = MAX(devfreq->scaling_min_freq, devfreq->min_freq);
max_freq = min(devfreq->scaling_max_freq, devfreq->max_freq);
min_freq = max(devfreq->scaling_min_freq, devfreq->min_freq);
if (min_freq && freq < min_freq) {
if (freq < min_freq) {
freq = min_freq;
flags &= ~DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use GLB */
}
if (max_freq && freq > max_freq) {
if (freq > max_freq) {
freq = max_freq;
flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
}
......@@ -534,10 +575,6 @@ static void devfreq_dev_release(struct device *dev)
list_del(&devfreq->node);
mutex_unlock(&devfreq_list_lock);
if (devfreq->governor)
devfreq->governor->event_handler(devfreq,
DEVFREQ_GOV_STOP, NULL);
if (devfreq->profile->exit)
devfreq->profile->exit(devfreq->dev.parent);
......@@ -646,9 +683,8 @@ struct devfreq *devfreq_add_device(struct device *dev,
mutex_unlock(&devfreq->lock);
mutex_lock(&devfreq_list_lock);
list_add(&devfreq->node, &devfreq_list);
governor = find_devfreq_governor(devfreq->governor_name);
governor = try_then_request_governor(devfreq->governor_name);
if (IS_ERR(governor)) {
dev_err(dev, "%s: Unable to find governor for the device\n",
__func__);
......@@ -664,18 +700,19 @@ struct devfreq *devfreq_add_device(struct device *dev,
__func__);
goto err_init;
}
list_add(&devfreq->node, &devfreq_list);
mutex_unlock(&devfreq_list_lock);
return devfreq;
err_init:
list_del(&devfreq->node);
mutex_unlock(&devfreq_list_lock);
device_unregister(&devfreq->dev);
devfreq_remove_device(devfreq);
devfreq = NULL;
err_dev:
if (devfreq)
kfree(devfreq);
err_out:
return ERR_PTR(err);
......@@ -693,6 +730,9 @@ int devfreq_remove_device(struct devfreq *devfreq)
if (!devfreq)
return -EINVAL;
if (devfreq->governor)
devfreq->governor->event_handler(devfreq,
DEVFREQ_GOV_STOP, NULL);
device_unregister(&devfreq->dev);
return 0;
......@@ -991,7 +1031,7 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
return -EINVAL;
mutex_lock(&devfreq_list_lock);
governor = find_devfreq_governor(str_governor);
governor = try_then_request_governor(str_governor);
if (IS_ERR(governor)) {
ret = PTR_ERR(governor);
goto out;
......@@ -1126,18 +1166,27 @@ static ssize_t min_freq_store(struct device *dev, struct device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
unsigned long value;
int ret;
unsigned long max;
ret = sscanf(buf, "%lu", &value);
if (ret != 1)
return -EINVAL;
mutex_lock(&df->lock);
max = df->max_freq;
if (value && max && value > max) {
if (value) {
if (value > df->max_freq) {
ret = -EINVAL;
goto unlock;
}
} else {
unsigned long *freq_table = df->profile->freq_table;
/* Get minimum frequency according to sorting order */
if (freq_table[0] < freq_table[df->profile->max_state - 1])
value = freq_table[0];
else
value = freq_table[df->profile->max_state - 1];
}
df->min_freq = value;
update_devfreq(df);
......@@ -1152,7 +1201,7 @@ static ssize_t min_freq_show(struct device *dev, struct device_attribute *attr,
{
struct devfreq *df = to_devfreq(dev);
return sprintf(buf, "%lu\n", MAX(df->scaling_min_freq, df->min_freq));
return sprintf(buf, "%lu\n", max(df->scaling_min_freq, df->min_freq));
}
static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
......@@ -1161,18 +1210,27 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr,
struct devfreq *df = to_devfreq(dev);
unsigned long value;
int ret;
unsigned long min;
ret = sscanf(buf, "%lu", &value);
if (ret != 1)
return -EINVAL;
mutex_lock(&df->lock);
min = df->min_freq;
if (value && min && value < min) {
if (value) {
if (value < df->min_freq) {
ret = -EINVAL;
goto unlock;
}
} else {
unsigned long *freq_table = df->profile->freq_table;
/* Get maximum frequency according to sorting order */
if (freq_table[0] < freq_table[df->profile->max_state - 1])
value = freq_table[df->profile->max_state - 1];
else
value = freq_table[0];
}
df->max_freq = value;
update_devfreq(df);
......@@ -1188,7 +1246,7 @@ static ssize_t max_freq_show(struct device *dev, struct device_attribute *attr,
{
struct devfreq *df = to_devfreq(dev);
return sprintf(buf, "%lu\n", MIN(df->scaling_max_freq, df->max_freq));
return sprintf(buf, "%lu\n", min(df->scaling_max_freq, df->max_freq));
}
static DEVICE_ATTR_RW(max_freq);
......
......@@ -535,8 +535,8 @@ static int of_get_devfreq_events(struct device_node *np,
if (i == ARRAY_SIZE(ppmu_events)) {
dev_warn(dev,
"don't know how to configure events : %s\n",
node->name);
"don't know how to configure events : %pOFn\n",
node);
continue;
}
......
......@@ -25,6 +25,9 @@
#define DEVFREQ_GOV_SUSPEND 0x4
#define DEVFREQ_GOV_RESUME 0x5
#define DEVFREQ_MIN_FREQ 0
#define DEVFREQ_MAX_FREQ ULONG_MAX
/**
* struct devfreq_governor - Devfreq policy governor
* @node: list node - contains registered devfreq governors
......@@ -54,9 +57,6 @@ struct devfreq_governor {
unsigned int event, void *data);
};
/* Caution: devfreq->lock must be locked before calling update_devfreq */
extern int update_devfreq(struct devfreq *devfreq);
extern void devfreq_monitor_start(struct devfreq *devfreq);
extern void devfreq_monitor_stop(struct devfreq *devfreq);
extern void devfreq_monitor_suspend(struct devfreq *devfreq);
......
......@@ -20,10 +20,7 @@ static int devfreq_performance_func(struct devfreq *df,
* target callback should be able to get floor value as
* said in devfreq.h
*/
if (!df->max_freq)
*freq = UINT_MAX;
else
*freq = df->max_freq;
*freq = DEVFREQ_MAX_FREQ;
return 0;
}
......
......@@ -20,7 +20,7 @@ static int devfreq_powersave_func(struct devfreq *df,
* target callback should be able to get ceiling value as
* said in devfreq.h
*/
*freq = df->min_freq;
*freq = DEVFREQ_MIN_FREQ;
return 0;
}
......
......@@ -27,7 +27,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
unsigned int dfso_upthreshold = DFSO_UPTHRESHOLD;
unsigned int dfso_downdifferential = DFSO_DOWNDIFFERENCTIAL;
struct devfreq_simple_ondemand_data *data = df->data;
unsigned long max = (df->max_freq) ? df->max_freq : UINT_MAX;
err = devfreq_update_stats(df);
if (err)
......@@ -47,7 +46,7 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
/* Assume MAX if it is going to be divided by zero */
if (stat->total_time == 0) {
*freq = max;
*freq = DEVFREQ_MAX_FREQ;
return 0;
}
......@@ -60,13 +59,13 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
/* Set MAX if it's busy enough */
if (stat->busy_time * 100 >
stat->total_time * dfso_upthreshold) {
*freq = max;
*freq = DEVFREQ_MAX_FREQ;
return 0;
}
/* Set MAX if we do not know the initial frequency */
if (stat->current_frequency == 0) {
*freq = max;
*freq = DEVFREQ_MAX_FREQ;
return 0;
}
......@@ -85,11 +84,6 @@ static int devfreq_simple_ondemand_func(struct devfreq *df,
b = div_u64(b, (dfso_upthreshold - dfso_downdifferential / 2));
*freq = (unsigned long) b;
if (df->min_freq && *freq < df->min_freq)
*freq = df->min_freq;
if (df->max_freq && *freq > df->max_freq)
*freq = df->max_freq;
return 0;
}
......
......@@ -26,19 +26,11 @@ static int devfreq_userspace_func(struct devfreq *df, unsigned long *freq)
{
struct userspace_data *data = df->data;
if (data->valid) {
unsigned long adjusted_freq = data->user_frequency;
if (df->max_freq && adjusted_freq > df->max_freq)
adjusted_freq = df->max_freq;
if (df->min_freq && adjusted_freq < df->min_freq)
adjusted_freq = df->min_freq;
*freq = adjusted_freq;
} else {
if (data->valid)
*freq = data->user_frequency;
else
*freq = df->previous_freq; /* No user freq specified yet */
}
return 0;
}
......
......@@ -1066,46 +1066,43 @@ static const struct idle_cpu idle_cpu_dnv = {
.disable_promotion_to_c1e = true,
};
#define ICPU(model, cpu) \
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&cpu }
static const struct x86_cpu_id intel_idle_ids[] __initconst = {
ICPU(INTEL_FAM6_NEHALEM_EP, idle_cpu_nehalem),
ICPU(INTEL_FAM6_NEHALEM, idle_cpu_nehalem),
ICPU(INTEL_FAM6_NEHALEM_G, idle_cpu_nehalem),
ICPU(INTEL_FAM6_WESTMERE, idle_cpu_nehalem),
ICPU(INTEL_FAM6_WESTMERE_EP, idle_cpu_nehalem),
ICPU(INTEL_FAM6_NEHALEM_EX, idle_cpu_nehalem),
ICPU(INTEL_FAM6_ATOM_PINEVIEW, idle_cpu_atom),
ICPU(INTEL_FAM6_ATOM_LINCROFT, idle_cpu_lincroft),
ICPU(INTEL_FAM6_WESTMERE_EX, idle_cpu_nehalem),
ICPU(INTEL_FAM6_SANDYBRIDGE, idle_cpu_snb),
ICPU(INTEL_FAM6_SANDYBRIDGE_X, idle_cpu_snb),
ICPU(INTEL_FAM6_ATOM_CEDARVIEW, idle_cpu_atom),
ICPU(INTEL_FAM6_ATOM_SILVERMONT1, idle_cpu_byt),
ICPU(INTEL_FAM6_ATOM_MERRIFIELD, idle_cpu_tangier),
ICPU(INTEL_FAM6_ATOM_AIRMONT, idle_cpu_cht),
ICPU(INTEL_FAM6_IVYBRIDGE, idle_cpu_ivb),
ICPU(INTEL_FAM6_IVYBRIDGE_X, idle_cpu_ivt),
ICPU(INTEL_FAM6_HASWELL_CORE, idle_cpu_hsw),
ICPU(INTEL_FAM6_HASWELL_X, idle_cpu_hsw),
ICPU(INTEL_FAM6_HASWELL_ULT, idle_cpu_hsw),
ICPU(INTEL_FAM6_HASWELL_GT3E, idle_cpu_hsw),
ICPU(INTEL_FAM6_ATOM_SILVERMONT2, idle_cpu_avn),
ICPU(INTEL_FAM6_BROADWELL_CORE, idle_cpu_bdw),
ICPU(INTEL_FAM6_BROADWELL_GT3E, idle_cpu_bdw),
ICPU(INTEL_FAM6_BROADWELL_X, idle_cpu_bdw),
ICPU(INTEL_FAM6_BROADWELL_XEON_D, idle_cpu_bdw),
ICPU(INTEL_FAM6_SKYLAKE_MOBILE, idle_cpu_skl),
ICPU(INTEL_FAM6_SKYLAKE_DESKTOP, idle_cpu_skl),
ICPU(INTEL_FAM6_KABYLAKE_MOBILE, idle_cpu_skl),
ICPU(INTEL_FAM6_KABYLAKE_DESKTOP, idle_cpu_skl),
ICPU(INTEL_FAM6_SKYLAKE_X, idle_cpu_skx),
ICPU(INTEL_FAM6_XEON_PHI_KNL, idle_cpu_knl),
ICPU(INTEL_FAM6_XEON_PHI_KNM, idle_cpu_knl),
ICPU(INTEL_FAM6_ATOM_GOLDMONT, idle_cpu_bxt),
ICPU(INTEL_FAM6_ATOM_GEMINI_LAKE, idle_cpu_bxt),
ICPU(INTEL_FAM6_ATOM_DENVERTON, idle_cpu_dnv),
INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nehalem),
INTEL_CPU_FAM6(NEHALEM, idle_cpu_nehalem),
INTEL_CPU_FAM6(NEHALEM_G, idle_cpu_nehalem),
INTEL_CPU_FAM6(WESTMERE, idle_cpu_nehalem),
INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nehalem),
INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nehalem),
INTEL_CPU_FAM6(ATOM_PINEVIEW, idle_cpu_atom),
INTEL_CPU_FAM6(ATOM_LINCROFT, idle_cpu_lincroft),
INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nehalem),
INTEL_CPU_FAM6(SANDYBRIDGE, idle_cpu_snb),
INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snb),
INTEL_CPU_FAM6(ATOM_CEDARVIEW, idle_cpu_atom),
INTEL_CPU_FAM6(ATOM_SILVERMONT1, idle_cpu_byt),
INTEL_CPU_FAM6(ATOM_MERRIFIELD, idle_cpu_tangier),
INTEL_CPU_FAM6(ATOM_AIRMONT, idle_cpu_cht),
INTEL_CPU_FAM6(IVYBRIDGE, idle_cpu_ivb),
INTEL_CPU_FAM6(IVYBRIDGE_X, idle_cpu_ivt),
INTEL_CPU_FAM6(HASWELL_CORE, idle_cpu_hsw),
INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsw),
INTEL_CPU_FAM6(HASWELL_ULT, idle_cpu_hsw),
INTEL_CPU_FAM6(HASWELL_GT3E, idle_cpu_hsw),
INTEL_CPU_FAM6(ATOM_SILVERMONT2, idle_cpu_avn),
INTEL_CPU_FAM6(BROADWELL_CORE, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_GT3E, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdw),
INTEL_CPU_FAM6(BROADWELL_XEON_D, idle_cpu_bdw),
INTEL_CPU_FAM6(SKYLAKE_MOBILE, idle_cpu_skl),
INTEL_CPU_FAM6(SKYLAKE_DESKTOP, idle_cpu_skl),
INTEL_CPU_FAM6(KABYLAKE_MOBILE, idle_cpu_skl),
INTEL_CPU_FAM6(KABYLAKE_DESKTOP, idle_cpu_skl),
INTEL_CPU_FAM6(SKYLAKE_X, idle_cpu_skx),
INTEL_CPU_FAM6(XEON_PHI_KNL, idle_cpu_knl),
INTEL_CPU_FAM6(XEON_PHI_KNM, idle_cpu_knl),
INTEL_CPU_FAM6(ATOM_GOLDMONT, idle_cpu_bxt),
INTEL_CPU_FAM6(ATOM_GEMINI_LAKE, idle_cpu_bxt),
INTEL_CPU_FAM6(ATOM_DENVERTON, idle_cpu_dnv),
{}
};
......
......@@ -48,9 +48,14 @@ static struct opp_device *_find_opp_dev(const struct device *dev,
static struct opp_table *_find_opp_table_unlocked(struct device *dev)
{
struct opp_table *opp_table;
bool found;
list_for_each_entry(opp_table, &opp_tables, node) {
if (_find_opp_dev(dev, opp_table)) {
mutex_lock(&opp_table->lock);
found = !!_find_opp_dev(dev, opp_table);
mutex_unlock(&opp_table->lock);
if (found) {
_get_opp_table_kref(opp_table);
return opp_table;
......@@ -313,7 +318,7 @@ int dev_pm_opp_get_opp_count(struct device *dev)
count = PTR_ERR(opp_table);
dev_dbg(dev, "%s: OPP table not found (%d)\n",
__func__, count);
return 0;
return count;
}
count = _get_opp_count(opp_table);
......@@ -754,7 +759,7 @@ static void _remove_opp_dev(struct opp_device *opp_dev,
kfree(opp_dev);
}
struct opp_device *_add_opp_dev(const struct device *dev,
static struct opp_device *_add_opp_dev_unlocked(const struct device *dev,
struct opp_table *opp_table)
{
struct opp_device *opp_dev;
......@@ -766,6 +771,7 @@ struct opp_device *_add_opp_dev(const struct device *dev,
/* Initialize opp-dev */
opp_dev->dev = dev;
list_add(&opp_dev->node, &opp_table->dev_list);
/* Create debugfs entries for the opp_table */
......@@ -777,7 +783,19 @@ struct opp_device *_add_opp_dev(const struct device *dev,
return opp_dev;
}
static struct opp_table *_allocate_opp_table(struct device *dev)
struct opp_device *_add_opp_dev(const struct device *dev,
struct opp_table *opp_table)
{
struct opp_device *opp_dev;
mutex_lock(&opp_table->lock);
opp_dev = _add_opp_dev_unlocked(dev, opp_table);
mutex_unlock(&opp_table->lock);
return opp_dev;
}
static struct opp_table *_allocate_opp_table(struct device *dev, int index)
{
struct opp_table *opp_table;
struct opp_device *opp_dev;
......@@ -791,6 +809,7 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
if (!opp_table)
return NULL;
mutex_init(&opp_table->lock);
INIT_LIST_HEAD(&opp_table->dev_list);
opp_dev = _add_opp_dev(dev, opp_table);
......@@ -799,7 +818,7 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
return NULL;
}
_of_init_opp_table(opp_table, dev);
_of_init_opp_table(opp_table, dev, index);
/* Find clk for the device */
opp_table->clk = clk_get(dev, NULL);
......@@ -812,7 +831,6 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
BLOCKING_INIT_NOTIFIER_HEAD(&opp_table->head);
INIT_LIST_HEAD(&opp_table->opp_list);
mutex_init(&opp_table->lock);
kref_init(&opp_table->kref);
/* Secure the device table modification */
......@@ -825,7 +843,7 @@ void _get_opp_table_kref(struct opp_table *opp_table)
kref_get(&opp_table->kref);
}
struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
static struct opp_table *_opp_get_opp_table(struct device *dev, int index)
{
struct opp_table *opp_table;
......@@ -836,31 +854,56 @@ struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
if (!IS_ERR(opp_table))
goto unlock;
opp_table = _allocate_opp_table(dev);
opp_table = _managed_opp(dev, index);
if (opp_table) {
if (!_add_opp_dev_unlocked(dev, opp_table)) {
dev_pm_opp_put_opp_table(opp_table);
opp_table = NULL;
}
goto unlock;
}
opp_table = _allocate_opp_table(dev, index);
unlock:
mutex_unlock(&opp_table_lock);
return opp_table;
}
struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
{
return _opp_get_opp_table(dev, 0);
}
EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_table);
struct opp_table *dev_pm_opp_get_opp_table_indexed(struct device *dev,
int index)
{
return _opp_get_opp_table(dev, index);
}
static void _opp_table_kref_release(struct kref *kref)
{
struct opp_table *opp_table = container_of(kref, struct opp_table, kref);
struct opp_device *opp_dev;
struct opp_device *opp_dev, *temp;
/* Release clk */
if (!IS_ERR(opp_table->clk))
clk_put(opp_table->clk);
opp_dev = list_first_entry(&opp_table->dev_list, struct opp_device,
node);
WARN_ON(!list_empty(&opp_table->opp_list));
_remove_opp_dev(opp_dev, opp_table);
list_for_each_entry_safe(opp_dev, temp, &opp_table->dev_list, node) {
/*
* The OPP table is getting removed, drop the performance state
* constraints.
*/
if (opp_table->genpd_performance_state)
dev_pm_genpd_set_performance_state((struct device *)(opp_dev->dev), 0);
/* dev_list must be empty now */
WARN_ON(!list_empty(&opp_table->dev_list));
_remove_opp_dev(opp_dev, opp_table);
}
mutex_destroy(&opp_table->lock);
list_del(&opp_table->node);
......@@ -869,6 +912,33 @@ static void _opp_table_kref_release(struct kref *kref)
mutex_unlock(&opp_table_lock);
}
void _opp_remove_all_static(struct opp_table *opp_table)
{
struct dev_pm_opp *opp, *tmp;
list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) {
if (!opp->dynamic)
dev_pm_opp_put(opp);
}
opp_table->parsed_static_opps = false;
}
static void _opp_table_list_kref_release(struct kref *kref)
{
struct opp_table *opp_table = container_of(kref, struct opp_table,
list_kref);
_opp_remove_all_static(opp_table);
mutex_unlock(&opp_table_lock);
}
void _put_opp_list_kref(struct opp_table *opp_table)
{
kref_put_mutex(&opp_table->list_kref, _opp_table_list_kref_release,
&opp_table_lock);
}
void dev_pm_opp_put_opp_table(struct opp_table *opp_table)
{
kref_put_mutex(&opp_table->kref, _opp_table_kref_release,
......@@ -896,7 +966,6 @@ static void _opp_kref_release(struct kref *kref)
kfree(opp);
mutex_unlock(&opp_table->lock);
dev_pm_opp_put_opp_table(opp_table);
}
void dev_pm_opp_get(struct dev_pm_opp *opp)
......@@ -940,11 +1009,15 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
if (found) {
dev_pm_opp_put(opp);
/* Drop the reference taken by dev_pm_opp_add() */
dev_pm_opp_put_opp_table(opp_table);
} else {
dev_warn(dev, "%s: Couldn't find OPP with freq: %lu\n",
__func__, freq);
}
/* Drop the reference taken by _find_opp_table() */
dev_pm_opp_put_opp_table(opp_table);
}
EXPORT_SYMBOL_GPL(dev_pm_opp_remove);
......@@ -1062,9 +1135,6 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
new_opp->opp_table = opp_table;
kref_init(&new_opp->kref);
/* Get a reference to the OPP table */
_get_opp_table_kref(opp_table);
ret = opp_debug_create_one(new_opp, opp_table);
if (ret)
dev_err(dev, "%s: Failed to register opp to debugfs (%d)\n",
......@@ -1543,8 +1613,9 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
return -ENOMEM;
ret = _opp_add_v1(opp_table, dev, freq, u_volt, true);
if (ret)
dev_pm_opp_put_opp_table(opp_table);
return ret;
}
EXPORT_SYMBOL_GPL(dev_pm_opp_add);
......@@ -1707,35 +1778,7 @@ int dev_pm_opp_unregister_notifier(struct device *dev,
}
EXPORT_SYMBOL(dev_pm_opp_unregister_notifier);
/*
* Free OPPs either created using static entries present in DT or even the
* dynamically added entries based on remove_all param.
*/
void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev,
bool remove_all)
{
struct dev_pm_opp *opp, *tmp;
/* Find if opp_table manages a single device */
if (list_is_singular(&opp_table->dev_list)) {
/* Free static OPPs */
list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) {
if (remove_all || !opp->dynamic)
dev_pm_opp_put(opp);
}
/*
* The OPP table is getting removed, drop the performance state
* constraints.
*/
if (opp_table->genpd_performance_state)
dev_pm_genpd_set_performance_state(dev, 0);
} else {
_remove_opp_dev(_find_opp_dev(dev, opp_table), opp_table);
}
}
void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all)
void _dev_pm_opp_find_and_remove_table(struct device *dev)
{
struct opp_table *opp_table;
......@@ -1752,8 +1795,12 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all)
return;
}
_dev_pm_opp_remove_table(opp_table, dev, remove_all);
_put_opp_list_kref(opp_table);
/* Drop reference taken by _find_opp_table() */
dev_pm_opp_put_opp_table(opp_table);
/* Drop reference taken while the OPP table was added */
dev_pm_opp_put_opp_table(opp_table);
}
......@@ -1766,6 +1813,6 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all)
*/
void dev_pm_opp_remove_table(struct device *dev)
{
_dev_pm_opp_find_and_remove_table(dev, true);
_dev_pm_opp_find_and_remove_table(dev);
}
EXPORT_SYMBOL_GPL(dev_pm_opp_remove_table);
......@@ -108,7 +108,8 @@ void dev_pm_opp_free_cpufreq_table(struct device *dev,
EXPORT_SYMBOL_GPL(dev_pm_opp_free_cpufreq_table);
#endif /* CONFIG_CPU_FREQ */
void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of)
void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask,
int last_cpu)
{
struct device *cpu_dev;
int cpu;
......@@ -116,6 +117,9 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of)
WARN_ON(cpumask_empty(cpumask));
for_each_cpu(cpu, cpumask) {
if (cpu == last_cpu)
break;
cpu_dev = get_cpu_device(cpu);
if (!cpu_dev) {
pr_err("%s: failed to get cpu%d device\n", __func__,
......@@ -123,10 +127,7 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of)
continue;
}
if (of)
dev_pm_opp_of_remove_table(cpu_dev);
else
dev_pm_opp_remove_table(cpu_dev);
_dev_pm_opp_find_and_remove_table(cpu_dev);
}
}
......@@ -140,7 +141,7 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of)
*/
void dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask)
{
_dev_pm_opp_cpumask_remove_table(cpumask, false);
_dev_pm_opp_cpumask_remove_table(cpumask, -1);
}
EXPORT_SYMBOL_GPL(dev_pm_opp_cpumask_remove_table);
......@@ -222,8 +223,10 @@ int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
cpumask_clear(cpumask);
if (opp_table->shared_opp == OPP_TABLE_ACCESS_SHARED) {
mutex_lock(&opp_table->lock);
list_for_each_entry(opp_dev, &opp_table->dev_list, node)
cpumask_set_cpu(opp_dev->dev->id, cpumask);
mutex_unlock(&opp_table->lock);
} else {
cpumask_set_cpu(cpu_dev->id, cpumask);
}
......
This diff is collapsed.
......@@ -126,9 +126,11 @@ enum opp_table_access {
* @dev_list: list of devices that share these OPPs
* @opp_list: table of opps
* @kref: for reference count of the table.
* @lock: mutex protecting the opp_list.
* @list_kref: for reference count of the OPP list.
* @lock: mutex protecting the opp_list and dev_list.
* @np: struct device_node pointer for opp's DT node.
* @clock_latency_ns_max: Max clock latency in nanoseconds.
* @parsed_static_opps: True if OPPs are initialized from DT.
* @shared_opp: OPP is shared between multiple devices.
* @suspend_opp: Pointer to OPP to be used during device suspend.
* @supported_hw: Array of version number to support.
......@@ -156,6 +158,7 @@ struct opp_table {
struct list_head dev_list;
struct list_head opp_list;
struct kref kref;
struct kref list_kref;
struct mutex lock;
struct device_node *np;
......@@ -164,6 +167,7 @@ struct opp_table {
/* For backward compatibility with v1 bindings */
unsigned int voltage_tolerance_v1;
bool parsed_static_opps;
enum opp_table_access shared_opp;
struct dev_pm_opp *suspend_opp;
......@@ -186,23 +190,26 @@ struct opp_table {
/* Routines internal to opp core */
void dev_pm_opp_get(struct dev_pm_opp *opp);
void _opp_remove_all_static(struct opp_table *opp_table);
void _get_opp_table_kref(struct opp_table *opp_table);
int _get_opp_count(struct opp_table *opp_table);
struct opp_table *_find_opp_table(struct device *dev);
struct opp_device *_add_opp_dev(const struct device *dev, struct opp_table *opp_table);
void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev, bool remove_all);
void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all);
void _dev_pm_opp_find_and_remove_table(struct device *dev);
struct dev_pm_opp *_opp_allocate(struct opp_table *opp_table);
void _opp_free(struct dev_pm_opp *opp);
int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, struct opp_table *opp_table, bool rate_not_available);
int _opp_add_v1(struct opp_table *opp_table, struct device *dev, unsigned long freq, long u_volt, bool dynamic);
void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of);
void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, int last_cpu);
struct opp_table *_add_opp_table(struct device *dev);
void _put_opp_list_kref(struct opp_table *opp_table);
#ifdef CONFIG_OF
void _of_init_opp_table(struct opp_table *opp_table, struct device *dev);
void _of_init_opp_table(struct opp_table *opp_table, struct device *dev, int index);
struct opp_table *_managed_opp(struct device *dev, int index);
#else
static inline void _of_init_opp_table(struct opp_table *opp_table, struct device *dev) {}
static inline void _of_init_opp_table(struct opp_table *opp_table, struct device *dev, int index) {}
static inline struct opp_table *_managed_opp(struct device *dev, int index) { return NULL; }
#endif
#ifdef CONFIG_DEBUG_FS
......
......@@ -1133,47 +1133,40 @@ static const struct rapl_defaults rapl_defaults_cht = {
.compute_time_window = rapl_compute_time_window_atom,
};
#define RAPL_CPU(_model, _ops) { \
.vendor = X86_VENDOR_INTEL, \
.family = 6, \
.model = _model, \
.driver_data = (kernel_ulong_t)&_ops, \
}
static const struct x86_cpu_id rapl_ids[] __initconst = {
RAPL_CPU(INTEL_FAM6_SANDYBRIDGE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_SANDYBRIDGE_X, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_IVYBRIDGE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_IVYBRIDGE_X, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_HASWELL_CORE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_HASWELL_ULT, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_HASWELL_GT3E, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_HASWELL_X, rapl_defaults_hsw_server),
RAPL_CPU(INTEL_FAM6_BROADWELL_CORE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_BROADWELL_GT3E, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_BROADWELL_XEON_D, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_BROADWELL_X, rapl_defaults_hsw_server),
RAPL_CPU(INTEL_FAM6_SKYLAKE_DESKTOP, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_SKYLAKE_MOBILE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_SKYLAKE_X, rapl_defaults_hsw_server),
RAPL_CPU(INTEL_FAM6_KABYLAKE_MOBILE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_KABYLAKE_DESKTOP, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_CANNONLAKE_MOBILE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_ATOM_SILVERMONT1, rapl_defaults_byt),
RAPL_CPU(INTEL_FAM6_ATOM_AIRMONT, rapl_defaults_cht),
RAPL_CPU(INTEL_FAM6_ATOM_MERRIFIELD, rapl_defaults_tng),
RAPL_CPU(INTEL_FAM6_ATOM_MOOREFIELD, rapl_defaults_ann),
RAPL_CPU(INTEL_FAM6_ATOM_GOLDMONT, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_ATOM_GEMINI_LAKE, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_ATOM_DENVERTON, rapl_defaults_core),
RAPL_CPU(INTEL_FAM6_XEON_PHI_KNL, rapl_defaults_hsw_server),
RAPL_CPU(INTEL_FAM6_XEON_PHI_KNM, rapl_defaults_hsw_server),
INTEL_CPU_FAM6(SANDYBRIDGE, rapl_defaults_core),
INTEL_CPU_FAM6(SANDYBRIDGE_X, rapl_defaults_core),
INTEL_CPU_FAM6(IVYBRIDGE, rapl_defaults_core),
INTEL_CPU_FAM6(IVYBRIDGE_X, rapl_defaults_core),
INTEL_CPU_FAM6(HASWELL_CORE, rapl_defaults_core),
INTEL_CPU_FAM6(HASWELL_ULT, rapl_defaults_core),
INTEL_CPU_FAM6(HASWELL_GT3E, rapl_defaults_core),
INTEL_CPU_FAM6(HASWELL_X, rapl_defaults_hsw_server),
INTEL_CPU_FAM6(BROADWELL_CORE, rapl_defaults_core),
INTEL_CPU_FAM6(BROADWELL_GT3E, rapl_defaults_core),
INTEL_CPU_FAM6(BROADWELL_XEON_D, rapl_defaults_core),
INTEL_CPU_FAM6(BROADWELL_X, rapl_defaults_hsw_server),
INTEL_CPU_FAM6(SKYLAKE_DESKTOP, rapl_defaults_core),
INTEL_CPU_FAM6(SKYLAKE_MOBILE, rapl_defaults_core),
INTEL_CPU_FAM6(SKYLAKE_X, rapl_defaults_hsw_server),
INTEL_CPU_FAM6(KABYLAKE_MOBILE, rapl_defaults_core),
INTEL_CPU_FAM6(KABYLAKE_DESKTOP, rapl_defaults_core),
INTEL_CPU_FAM6(CANNONLAKE_MOBILE, rapl_defaults_core),
INTEL_CPU_FAM6(ATOM_SILVERMONT1, rapl_defaults_byt),
INTEL_CPU_FAM6(ATOM_AIRMONT, rapl_defaults_cht),
INTEL_CPU_FAM6(ATOM_MERRIFIELD, rapl_defaults_tng),
INTEL_CPU_FAM6(ATOM_MOOREFIELD, rapl_defaults_ann),
INTEL_CPU_FAM6(ATOM_GOLDMONT, rapl_defaults_core),
INTEL_CPU_FAM6(ATOM_GEMINI_LAKE, rapl_defaults_core),
INTEL_CPU_FAM6(ATOM_DENVERTON, rapl_defaults_core),
INTEL_CPU_FAM6(XEON_PHI_KNL, rapl_defaults_hsw_server),
INTEL_CPU_FAM6(XEON_PHI_KNM, rapl_defaults_hsw_server),
{}
};
MODULE_DEVICE_TABLE(x86cpu, rapl_ids);
......
......@@ -104,6 +104,7 @@ enum cppc_regs {
* today.
*/
struct cppc_perf_caps {
u32 guaranteed_perf;
u32 highest_perf;
u32 nominal_perf;
u32 lowest_perf;
......
......@@ -81,6 +81,7 @@ struct cpuidle_device {
unsigned int registered:1;
unsigned int enabled:1;
unsigned int use_deepest_state:1;
unsigned int poll_time_limit:1;
unsigned int cpu;
int last_residency;
......@@ -99,16 +100,6 @@ struct cpuidle_device {
DECLARE_PER_CPU(struct cpuidle_device *, cpuidle_devices);
DECLARE_PER_CPU(struct cpuidle_device, cpuidle_dev);
/**
* cpuidle_get_last_residency - retrieves the last state's residency time
* @dev: the target CPU
*/
static inline int cpuidle_get_last_residency(struct cpuidle_device *dev)
{
return dev->last_residency;
}
/****************************
* CPUIDLE DRIVER INTERFACE *
****************************/
......
......@@ -198,6 +198,14 @@ extern void devm_devfreq_remove_device(struct device *dev,
extern int devfreq_suspend_device(struct devfreq *devfreq);
extern int devfreq_resume_device(struct devfreq *devfreq);
/**
* update_devfreq() - Reevaluate the device and configure frequency
* @devfreq: the devfreq device
*
* Note: devfreq->lock must be held
*/
extern int update_devfreq(struct devfreq *devfreq);
/* Helper functions for devfreq user device driver with OPP. */
extern struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
unsigned long *freq, u32 flags);
......
......@@ -17,11 +17,36 @@
#include <linux/notifier.h>
#include <linux/spinlock.h>
/* Defines used for the flags field in the struct generic_pm_domain */
#define GENPD_FLAG_PM_CLK (1U << 0) /* PM domain uses PM clk */
#define GENPD_FLAG_IRQ_SAFE (1U << 1) /* PM domain operates in atomic */
#define GENPD_FLAG_ALWAYS_ON (1U << 2) /* PM domain is always powered on */
#define GENPD_FLAG_ACTIVE_WAKEUP (1U << 3) /* Keep devices active if wakeup */
/*
* Flags to control the behaviour of a genpd.
*
* These flags may be set in the struct generic_pm_domain's flags field by a
* genpd backend driver. The flags must be set before it calls pm_genpd_init(),
* which initializes a genpd.
*
* GENPD_FLAG_PM_CLK: Instructs genpd to use the PM clk framework,
* while powering on/off attached devices.
*
* GENPD_FLAG_IRQ_SAFE: This informs genpd that its backend callbacks,
* ->power_on|off(), doesn't sleep. Hence, these
* can be invoked from within atomic context, which
* enables genpd to power on/off the PM domain,
* even when pm_runtime_is_irq_safe() returns true,
* for any of its attached devices. Note that, a
* genpd having this flag set, requires its
* masterdomains to also have it set.
*
* GENPD_FLAG_ALWAYS_ON: Instructs genpd to always keep the PM domain
* powered on.
*
* GENPD_FLAG_ACTIVE_WAKEUP: Instructs genpd to keep the PM domain powered
* on, in case any of its attached devices is used
* in the wakeup path to serve system wakeups.
*/
#define GENPD_FLAG_PM_CLK (1U << 0)
#define GENPD_FLAG_IRQ_SAFE (1U << 1)
#define GENPD_FLAG_ALWAYS_ON (1U << 2)
#define GENPD_FLAG_ACTIVE_WAKEUP (1U << 3)
enum gpd_status {
GPD_STATE_ACTIVE = 0, /* PM domain is active */
......
......@@ -79,6 +79,7 @@ struct dev_pm_set_opp_data {
#if defined(CONFIG_PM_OPP)
struct opp_table *dev_pm_opp_get_opp_table(struct device *dev);
struct opp_table *dev_pm_opp_get_opp_table_indexed(struct device *dev, int index);
void dev_pm_opp_put_opp_table(struct opp_table *opp_table);
unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp);
......@@ -136,6 +137,11 @@ static inline struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
return ERR_PTR(-ENOTSUPP);
}
static inline struct opp_table *dev_pm_opp_get_opp_table_indexed(struct device *dev, int index)
{
return ERR_PTR(-ENOTSUPP);
}
static inline void dev_pm_opp_put_opp_table(struct opp_table *opp_table) {}
static inline unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
......
......@@ -96,7 +96,7 @@ static int try_to_freeze_tasks(bool user_only)
if (wq_busy)
show_workqueue_state();
if (!wakeup) {
if (!wakeup || pm_debug_messages_on) {
read_lock(&tasklist_lock);
for_each_process_thread(g, p) {
if (p != current && !freezer_should_skip(p)
......
......@@ -145,7 +145,7 @@ struct config *prepare_default_config()
config->cpu = 0;
config->prio = SCHED_HIGH;
config->verbose = 0;
strncpy(config->governor, "ondemand", 8);
strncpy(config->governor, "ondemand", sizeof(config->governor));
config->output = stdout;
......
......@@ -200,6 +200,8 @@ static int get_boost_mode(unsigned int cpu)
printf(_(" Boost States: %d\n"), b_states);
printf(_(" Total States: %d\n"), pstate_no);
for (i = 0; i < pstate_no; i++) {
if (!pstates[i])
continue;
if (i < b_states)
printf(_(" Pstate-Pb%d: %luMHz (boost state)"
"\n"), i, pstates[i]);
......
......@@ -33,7 +33,7 @@ union msr_pstate {
unsigned vid:8;
unsigned iddval:8;
unsigned idddiv:2;
unsigned res1:30;
unsigned res1:31;
unsigned en:1;
} fam17h_bits;
unsigned long long val;
......@@ -119,6 +119,11 @@ int decode_pstates(unsigned int cpu, unsigned int cpu_family,
}
if (read_msr(cpu, MSR_AMD_PSTATE + i, &pstate.val))
return -1;
if ((cpu_family == 0x17) && (!pstate.fam17h_bits.en))
continue;
else if (!pstate.bits.en)
continue;
pstates[i] = get_cof(cpu_family, pstate);
}
*no = i;
......
......@@ -23,8 +23,8 @@ install : uninstall
install -m 644 config/suspend-x2-proc.cfg $(DESTDIR)$(PREFIX)/lib/pm-graph/config
install -d $(DESTDIR)$(PREFIX)/bin
ln -s $(DESTDIR)$(PREFIX)/lib/pm-graph/bootgraph.py $(DESTDIR)$(PREFIX)/bin/bootgraph
ln -s $(DESTDIR)$(PREFIX)/lib/pm-graph/sleepgraph.py $(DESTDIR)$(PREFIX)/bin/sleepgraph
ln -s ../lib/pm-graph/bootgraph.py $(DESTDIR)$(PREFIX)/bin/bootgraph
ln -s ../lib/pm-graph/sleepgraph.py $(DESTDIR)$(PREFIX)/bin/sleepgraph
install -d $(DESTDIR)$(PREFIX)/share/man/man8
install bootgraph.8 $(DESTDIR)$(PREFIX)/share/man/man8
......
......@@ -34,6 +34,10 @@ from datetime import datetime, timedelta
from subprocess import call, Popen, PIPE
import sleepgraph as aslib
def pprint(msg):
print(msg)
sys.stdout.flush()
# ----------------- CLASSES --------------------
# Class: SystemValues
......@@ -157,11 +161,11 @@ class SystemValues(aslib.SystemValues):
return cmdline
def manualRebootRequired(self):
cmdline = self.kernelParams()
print 'To generate a new timeline manually, follow these steps:\n'
print '1. Add the CMDLINE string to your kernel command line.'
print '2. Reboot the system.'
print '3. After reboot, re-run this tool with the same arguments but no command (w/o -reboot or -manual).\n'
print 'CMDLINE="%s"' % cmdline
pprint('To generate a new timeline manually, follow these steps:\n\n'\
'1. Add the CMDLINE string to your kernel command line.\n'\
'2. Reboot the system.\n'\
'3. After reboot, re-run this tool with the same arguments but no command (w/o -reboot or -manual).\n\n'\
'CMDLINE="%s"' % cmdline)
sys.exit()
def blGrub(self):
blcmd = ''
......@@ -431,7 +435,7 @@ def parseTraceLog(data):
if len(cg.list) < 1 or cg.invalid or (cg.end - cg.start == 0):
continue
if(not cg.postProcess()):
print('Sanity check failed for %s-%d' % (proc, pid))
pprint('Sanity check failed for %s-%d' % (proc, pid))
continue
# match cg data to devices
devname = data.deviceMatch(pid, cg)
......@@ -442,8 +446,8 @@ def parseTraceLog(data):
sysvals.vprint('%s callgraph found for %s %s-%d [%f - %f]' %\
(kind, cg.name, proc, pid, cg.start, cg.end))
elif len(cg.list) > 1000000:
print 'WARNING: the callgraph found for %s is massive! (%d lines)' %\
(devname, len(cg.list))
pprint('WARNING: the callgraph found for %s is massive! (%d lines)' %\
(devname, len(cg.list)))
# Function: retrieveLogs
# Description:
......@@ -528,7 +532,7 @@ def createBootGraph(data):
tMax = data.end
tTotal = tMax - t0
if(tTotal == 0):
print('ERROR: No timeline data')
pprint('ERROR: No timeline data')
return False
user_mode = '%.0f'%(data.tUserMode*1000)
last_init = '%.0f'%(tTotal*1000)
......@@ -734,7 +738,7 @@ def updateCron(restore=False):
op.close()
res = call([cmd, cronfile])
except Exception, e:
print 'Exception: %s' % str(e)
pprint('Exception: %s' % str(e))
shutil.move(backfile, cronfile)
res = -1
if res != 0:
......@@ -750,7 +754,7 @@ def updateGrub(restore=False):
call(sysvals.blexec, stderr=PIPE, stdout=PIPE,
env={'PATH': '.:/sbin:/usr/sbin:/usr/bin:/sbin:/bin'})
except Exception, e:
print 'Exception: %s\n' % str(e)
pprint('Exception: %s\n' % str(e))
return
# extract the option and create a grub config without it
sysvals.rootUser(True)
......@@ -797,7 +801,7 @@ def updateGrub(restore=False):
res = call(sysvals.blexec)
os.remove(grubfile)
except Exception, e:
print 'Exception: %s' % str(e)
pprint('Exception: %s' % str(e))
res = -1
# cleanup
shutil.move(tempfile, grubfile)
......@@ -821,7 +825,7 @@ def updateKernelParams(restore=False):
def doError(msg, help=False):
if help == True:
printHelp()
print 'ERROR: %s\n' % msg
pprint('ERROR: %s\n' % msg)
sysvals.outputResult({'error':msg})
sys.exit()
......@@ -829,52 +833,51 @@ def doError(msg, help=False):
# Description:
# print out the help text
def printHelp():
print('')
print('%s v%s' % (sysvals.title, sysvals.version))
print('Usage: bootgraph <options> <command>')
print('')
print('Description:')
print(' This tool reads in a dmesg log of linux kernel boot and')
print(' creates an html representation of the boot timeline up to')
print(' the start of the init process.')
print('')
print(' If no specific command is given the tool reads the current dmesg')
print(' and/or ftrace log and creates a timeline')
print('')
print(' Generates output files in subdirectory: boot-yymmdd-HHMMSS')
print(' HTML output: <hostname>_boot.html')
print(' raw dmesg output: <hostname>_boot_dmesg.txt')
print(' raw ftrace output: <hostname>_boot_ftrace.txt')
print('')
print('Options:')
print(' -h Print this help text')
print(' -v Print the current tool version')
print(' -verbose Print extra information during execution and analysis')
print(' -addlogs Add the dmesg log to the html output')
print(' -result fn Export a results table to a text file for parsing.')
print(' -o name Overrides the output subdirectory name when running a new test')
print(' default: boot-{date}-{time}')
print(' [advanced]')
print(' -fstat Use ftrace to add function detail and statistics (default: disabled)')
print(' -f/-callgraph Add callgraph detail, can be very large (default: disabled)')
print(' -maxdepth N limit the callgraph data to N call levels (default: 2)')
print(' -mincg ms Discard all callgraphs shorter than ms milliseconds (e.g. 0.001 for us)')
print(' -timeprec N Number of significant digits in timestamps (0:S, 3:ms, [6:us])')
print(' -expandcg pre-expand the callgraph data in the html output (default: disabled)')
print(' -func list Limit ftrace to comma-delimited list of functions (default: do_one_initcall)')
print(' -cgfilter S Filter the callgraph output in the timeline')
print(' -cgskip file Callgraph functions to skip, off to disable (default: cgskip.txt)')
print(' -bl name Use the following boot loader for kernel params (default: grub)')
print(' -reboot Reboot the machine automatically and generate a new timeline')
print(' -manual Show the steps to generate a new timeline manually (used with -reboot)')
print('')
print('Other commands:')
print(' -flistall Print all functions capable of being captured in ftrace')
print(' -sysinfo Print out system info extracted from BIOS')
print(' [redo]')
print(' -dmesg file Create HTML output using dmesg input (used with -ftrace)')
print(' -ftrace file Create HTML output using ftrace input (used with -dmesg)')
print('')
pprint('\n%s v%s\n'\
'Usage: bootgraph <options> <command>\n'\
'\n'\
'Description:\n'\
' This tool reads in a dmesg log of linux kernel boot and\n'\
' creates an html representation of the boot timeline up to\n'\
' the start of the init process.\n'\
'\n'\
' If no specific command is given the tool reads the current dmesg\n'\
' and/or ftrace log and creates a timeline\n'\
'\n'\
' Generates output files in subdirectory: boot-yymmdd-HHMMSS\n'\
' HTML output: <hostname>_boot.html\n'\
' raw dmesg output: <hostname>_boot_dmesg.txt\n'\
' raw ftrace output: <hostname>_boot_ftrace.txt\n'\
'\n'\
'Options:\n'\
' -h Print this help text\n'\
' -v Print the current tool version\n'\
' -verbose Print extra information during execution and analysis\n'\
' -addlogs Add the dmesg log to the html output\n'\
' -result fn Export a results table to a text file for parsing.\n'\
' -o name Overrides the output subdirectory name when running a new test\n'\
' default: boot-{date}-{time}\n'\
' [advanced]\n'\
' -fstat Use ftrace to add function detail and statistics (default: disabled)\n'\
' -f/-callgraph Add callgraph detail, can be very large (default: disabled)\n'\
' -maxdepth N limit the callgraph data to N call levels (default: 2)\n'\
' -mincg ms Discard all callgraphs shorter than ms milliseconds (e.g. 0.001 for us)\n'\
' -timeprec N Number of significant digits in timestamps (0:S, 3:ms, [6:us])\n'\
' -expandcg pre-expand the callgraph data in the html output (default: disabled)\n'\
' -func list Limit ftrace to comma-delimited list of functions (default: do_one_initcall)\n'\
' -cgfilter S Filter the callgraph output in the timeline\n'\
' -cgskip file Callgraph functions to skip, off to disable (default: cgskip.txt)\n'\
' -bl name Use the following boot loader for kernel params (default: grub)\n'\
' -reboot Reboot the machine automatically and generate a new timeline\n'\
' -manual Show the steps to generate a new timeline manually (used with -reboot)\n'\
'\n'\
'Other commands:\n'\
' -flistall Print all functions capable of being captured in ftrace\n'\
' -sysinfo Print out system info extracted from BIOS\n'\
' [redo]\n'\
' -dmesg file Create HTML output using dmesg input (used with -ftrace)\n'\
' -ftrace file Create HTML output using ftrace input (used with -dmesg)\n'\
'' % (sysvals.title, sysvals.version))
return True
# ----------------- MAIN --------------------
......@@ -895,7 +898,7 @@ if __name__ == '__main__':
printHelp()
sys.exit()
elif(arg == '-v'):
print("Version %s" % sysvals.version)
pprint("Version %s" % sysvals.version)
sys.exit()
elif(arg == '-verbose'):
sysvals.verbose = True
......@@ -1016,7 +1019,7 @@ if __name__ == '__main__':
print f
elif cmd == 'checkbl':
sysvals.getBootLoader()
print 'Boot Loader: %s\n%s' % (sysvals.bootloader, sysvals.blexec)
pprint('Boot Loader: %s\n%s' % (sysvals.bootloader, sysvals.blexec))
elif(cmd == 'sysinfo'):
sysvals.printSystemInfo(True)
sys.exit()
......
......@@ -27,6 +27,7 @@ ktime_get
# console calls
printk
dev_printk
__dev_printk
console_unlock
# memory handling
......
......@@ -105,7 +105,7 @@ override-dev-timeline-functions: true
# example: [color=#CC00CC]
#
# arglist: A list of arguments from registers/stack addresses. See URL:
# https://www.kernel.org/doc/Documentation/trace/kprobetrace.rst
# https://www.kernel.org/doc/Documentation/trace/kprobetrace.txt
#
# example: cpu=%di:s32
#
......@@ -170,7 +170,7 @@ pm_restore_console:
# example: [color=#CC00CC]
#
# arglist: A list of arguments from registers/stack addresses. See URL:
# https://www.kernel.org/doc/Documentation/trace/kprobetrace.rst
# https://www.kernel.org/doc/Documentation/trace/kprobetrace.txt
#
# example: port=+36(%di):s32
#
......
......@@ -65,9 +65,9 @@ During test, enable/disable runtime suspend for all devices. The test is delayed
by 5 seconds to allow runtime suspend changes to occur. The settings are restored
after the test is complete.
.TP
\fB-display \fIon/off\fR
Turn the display on or off for the test using the xset command. This helps
maintain the consistecy of test data for better comparison.
\fB-display \fIon/off/standby/suspend\fR
Switch the display to the requested mode for the test using the xset command.
This helps maintain the consistency of test data for better comparison.
.TP
\fB-skiphtml\fR
Run the test and capture the trace logs, but skip the timeline generation.
......@@ -183,6 +183,13 @@ Print out the contents of the ACPI Firmware Performance Data Table.
\fB-battery\fR
Print out battery status and current charge.
.TP
\fB-xon/-xoff/-xstandby/-xsuspend\fR
Test xset by attempting to switch the display to the given mode. This
is the same command which will be issued by \fB-display \fImode\fR.
.TP
\fB-xstat\fR
Get the current DPMS display mode.
.TP
\fB-sysinfo\fR
Print out system info extracted from BIOS. Reads /dev/mem directly instead of going through dmidecode.
.TP
......
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment