Commit b57b8496 authored by Tang Yizhou, committed by Jonathan Corbet

docs: scheduler: Convert schedutil.txt to ReST

All other scheduler documents have been converted to *.rst. Let's do
the same for schedutil.txt.

Also fixed some typos.

Signed-off-by: Tang Yizhou <tangyizhou@huawei.com>
Link: https://lore.kernel.org/r/20220312070751.16844-1-tangyizhou@huawei.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent ff136876
@@ -14,6 +14,7 @@ Linux Scheduler
 sched-domains
 sched-capacity
 sched-energy
+schedutil
 sched-nice-design
 sched-rt-group
 sched-stats
......
+=========
+Schedutil
+=========
+.. note::
-NOTE; all this assumes a linear relation between frequency and work capacity,
-we know this is flawed, but it is the best workable approximation.
+   All this assumes a linear relation between frequency and work capacity,
+   we know this is flawed, but it is the best workable approximation.
 PELT (Per Entity Load Tracking)
--------------------------------
+===============================
 With PELT we track some metrics across the various scheduler entities, from
 individual tasks to task-group slices to CPU runqueues. As the basis for this
@@ -38,8 +42,8 @@ while 'runnable' will increase to reflect the amount of contention.
 For more detail see: kernel/sched/pelt.c
-Frequency- / CPU Invariance
----------------------------
+Frequency / CPU Invariance
+==========================
 Because consuming the CPU for 50% at 1GHz is not the same as consuming the CPU
 for 50% at 2GHz, nor is running 50% on a LITTLE CPU the same as running 50% on
@@ -47,7 +51,7 @@ a big CPU, we allow architectures to scale the time delta with two ratios, one
 Dynamic Voltage and Frequency Scaling (DVFS) ratio and one microarch ratio.
 For simple DVFS architectures (where software is in full control) we trivially
-compute the ratio as:
+compute the ratio as::
             f_cur
   r_dvfs := -----
@@ -55,7 +59,7 @@ compute the ratio as:
 For more dynamic systems where the hardware is in control of DVFS we use
 hardware counters (Intel APERF/MPERF, ARMv8.4-AMU) to provide us this ratio.
-For Intel specifically, we use:
+For Intel specifically, we use::
            APERF
   f_cur := ----- * P0
@@ -87,7 +91,7 @@ For more detail see:
 UTIL_EST / UTIL_EST_FASTUP
---------------------------
+==========================
 Because periodic tasks have their averages decayed while they sleep, even
 though when running their expected utilization will be the same, they suffer a
@@ -106,7 +110,7 @@ For more detail see: kernel/sched/fair.c:util_est_dequeue()
 UCLAMP
-------
+======
 It is possible to set effective u_min and u_max clamps on each CFS or RT task;
 the runqueue keeps an max aggregate of these clamps for all running tasks.
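As an illustration of the max aggregation described in the UCLAMP paragraph above, the following is a minimal C sketch, not the kernel's implementation. The names `task_clamp`, `rq_clamp` and `rq_clamp_aggregate` are hypothetical, and the zero starting value used for an empty runqueue is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task clamp pair; the units are whatever scale the caller uses. */
struct task_clamp {
	unsigned int u_min;
	unsigned int u_max;
};

/* Hypothetical runqueue-level aggregate of the per-task clamps. */
struct rq_clamp {
	unsigned int u_min;
	unsigned int u_max;
};

/*
 * Take the maximum of u_min and the maximum of u_max over all running
 * tasks. Starting from zero for an empty runqueue is an assumption of
 * this sketch.
 */
struct rq_clamp rq_clamp_aggregate(const struct task_clamp *tasks, size_t n)
{
	struct rq_clamp rq = { 0, 0 };

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].u_min > rq.u_min)
			rq.u_min = tasks[i].u_min;
		if (tasks[i].u_max > rq.u_max)
			rq.u_max = tasks[i].u_max;
	}
	return rq;
}
```

With two running tasks clamped to (100, 512) and (200, 300), the sketch yields an aggregate of (200, 512): each bound is maximised independently, so the two values may come from different tasks.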
@@ -115,7 +119,7 @@ For more detail see: include/uapi/linux/sched/types.h
 Schedutil / DVFS
-----------------
+================
 Every time the scheduler load tracking is updated (task wakeup, task
 migration, time progression) we call out to schedutil to update the hardware
@@ -123,7 +127,7 @@ DVFS state.
 The basis is the CPU runqueue's 'running' metric, which per the above it is
 the frequency invariant utilization estimate of the CPU. From this we compute
-a desired frequency like:
+a desired frequency like::
              max( running, util_est ); if UTIL_EST
   u_cfs := { running;                  otherwise
@@ -135,7 +139,7 @@ a desired frequency like:
   f_des := min( f_max, 1.25 u * f_max )
-XXX IO-wait; when the update is due to a task wakeup from IO-completion we
+XXX IO-wait: when the update is due to a task wakeup from IO-completion we
 boost 'u' above.
 This frequency is then used to select a P-state/OPP or directly munged into a
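The two formulas visible in the Schedutil / DVFS hunks above (`u_cfs` and `f_des`) can be sketched in C as follows. This is an illustrative sketch only: the function names and the use of `double` fractions are assumptions, and the step that turns `u_cfs` into the final `u` is elided in this diff, so `u` is simply taken as an input here.

```c
#include <assert.h>

/*
 * u_cfs := max(running, util_est) if UTIL_EST is enabled, else running.
 * The have_util_est flag is a hypothetical stand-in for that feature check.
 */
double u_cfs(double running, double util_est, int have_util_est)
{
	if (have_util_est && util_est > running)
		return util_est;
	return running;
}

/*
 * f_des := min(f_max, 1.25 * u * f_max)
 * u is the overall utilization as a fraction of capacity; how it is
 * derived from u_cfs is not shown in this diff and is not reconstructed.
 */
double f_des(double u, double f_max)
{
	double f;

	assert(f_max > 0.0);

	f = 1.25 * u * f_max;
	return f < f_max ? f : f_max;
}
```

For example, with `f_max` of 2000 MHz, a utilization of 0.5 requests 1250 MHz, while a utilization of 0.9 would ask for 2250 MHz and is therefore capped at 2000 MHz. The 1.25 factor means the cap is reached once utilization hits 0.8.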
@@ -153,7 +157,7 @@ For more information see: kernel/sched/cpufreq_schedutil.c
 NOTES
------
+=====
 - On low-load scenarios, where DVFS is most relevant, the 'running' numbers
   will closely reflect utilization.
......