Commit 77095901 authored by Paul E. McKenney

doc: Update removal of RCU-bh/sched update machinery

The RCU-bh update API is now defined in terms of that of RCU-preempt and
RCU-sched, so this commit updates the documentation accordingly.

In addition, although RCU-sched persists in !PREEMPT kernels, in
the PREEMPT case its update API is now defined in terms of that of
RCU-preempt, so this commit also updates the documentation accordingly.

While in the area, this commit removes the documentation for the
now-obsolete synchronize_rcu_mult() and clarifies the Tasks RCU
documentation.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
parent ea24c125
...@@ -1374,8 +1374,7 @@ that is, if the CPU is currently idle. ...@@ -1374,8 +1374,7 @@ that is, if the CPU is currently idle.
Accessor Functions</a></h3> Accessor Functions</a></h3>
<p>The following listing shows the <p>The following listing shows the
<tt>rcu_get_root()</tt>, <tt>rcu_for_each_node_breadth_first</tt>, <tt>rcu_get_root()</tt>, <tt>rcu_for_each_node_breadth_first</tt> and
<tt>rcu_for_each_nonleaf_node_breadth_first()</tt>, and
<tt>rcu_for_each_leaf_node()</tt> function and macros: <tt>rcu_for_each_leaf_node()</tt> function and macros:
<pre> <pre>
...@@ -1388,13 +1387,9 @@ Accessor Functions</a></h3> ...@@ -1388,13 +1387,9 @@ Accessor Functions</a></h3>
7 for ((rnp) = &amp;(rsp)-&gt;node[0]; \ 7 for ((rnp) = &amp;(rsp)-&gt;node[0]; \
8 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++) 8 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++)
9 9
10 #define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \ 10 #define rcu_for_each_leaf_node(rsp, rnp) \
11 for ((rnp) = &amp;(rsp)-&gt;node[0]; \ 11 for ((rnp) = (rsp)-&gt;level[NUM_RCU_LVLS - 1]; \
12 (rnp) &lt; (rsp)-&gt;level[NUM_RCU_LVLS - 1]; (rnp)++) 12 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++)
13
14 #define rcu_for_each_leaf_node(rsp, rnp) \
15 for ((rnp) = (rsp)-&gt;level[NUM_RCU_LVLS - 1]; \
16 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++)
</pre> </pre>
<p>The <tt>rcu_get_root()</tt> simply returns a pointer to the <p>The <tt>rcu_get_root()</tt> simply returns a pointer to the
...@@ -1407,10 +1402,7 @@ macro takes advantage of the layout of the <tt>rcu_node</tt> ...@@ -1407,10 +1402,7 @@ macro takes advantage of the layout of the <tt>rcu_node</tt>
structures in the <tt>rcu_state</tt> structure's structures in the <tt>rcu_state</tt> structure's
<tt>-&gt;node[]</tt> array, performing a breadth-first traversal by <tt>-&gt;node[]</tt> array, performing a breadth-first traversal by
simply traversing the array in order. simply traversing the array in order.
The <tt>rcu_for_each_nonleaf_node_breadth_first()</tt> macro operates Similarly, the <tt>rcu_for_each_leaf_node()</tt> macro traverses only
similarly, but traverses only the first part of the array, thus excluding
the leaf <tt>rcu_node</tt> structures.
Finally, the <tt>rcu_for_each_leaf_node()</tt> macro traverses only
the last part of the array, thus traversing only the leaf the last part of the array, thus traversing only the leaf
<tt>rcu_node</tt> structures. <tt>rcu_node</tt> structures.
...@@ -1418,15 +1410,14 @@ the last part of the array, thus traversing only the leaf ...@@ -1418,15 +1410,14 @@ the last part of the array, thus traversing only the leaf
<tr><th>&nbsp;</th></tr> <tr><th>&nbsp;</th></tr>
<tr><th align="left">Quick Quiz:</th></tr> <tr><th align="left">Quick Quiz:</th></tr>
<tr><td> <tr><td>
What do <tt>rcu_for_each_nonleaf_node_breadth_first()</tt> and What does
<tt>rcu_for_each_leaf_node()</tt> do if the <tt>rcu_node</tt> tree <tt>rcu_for_each_leaf_node()</tt> do if the <tt>rcu_node</tt> tree
contains only a single node? contains only a single node?
</td></tr> </td></tr>
<tr><th align="left">Answer:</th></tr> <tr><th align="left">Answer:</th></tr>
<tr><td bgcolor="#ffffff"><font color="ffffff"> <tr><td bgcolor="#ffffff"><font color="ffffff">
In the single-node case, In the single-node case,
<tt>rcu_for_each_nonleaf_node_breadth_first()</tt> is a no-op <tt>rcu_for_each_leaf_node()</tt> traverses the single node.
and <tt>rcu_for_each_leaf_node()</tt> traverses the single node.
</font></td></tr> </font></td></tr>
<tr><td>&nbsp;</td></tr> <tr><td>&nbsp;</td></tr>
</table> </table>
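The traversal macros kept by the hunks above can be tried out in user space. The following is a minimal sketch, not kernel code: the two macros are copied from the listing, but the cut-down `rcu_node` and `rcu_state` structures, the two-level geometry (one root plus four leaves), and the helpers `init_state()`, `count_all_nodes()`, and `count_leaf_nodes()` are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical geometry: one root plus four leaves, laid out breadth-first. */
#define NUM_RCU_LVLS  2
#define NUM_RCU_NODES 5

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct rcu_node {
	int id;
};

struct rcu_state {
	struct rcu_node node[NUM_RCU_NODES];   /* all nodes, root first */
	struct rcu_node *level[NUM_RCU_LVLS];  /* first node of each level */
};

/* Macros as shown in the documentation listing. */
#define rcu_for_each_node_breadth_first(rsp, rnp) \
	for ((rnp) = &(rsp)->node[0]; \
	     (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)

#define rcu_for_each_leaf_node(rsp, rnp) \
	for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
	     (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)

/* Number the nodes and point each level at its first array element. */
void init_state(struct rcu_state *rsp)
{
	assert(rsp != NULL);
	for (int i = 0; i < NUM_RCU_NODES; i++)
		rsp->node[i].id = i;
	rsp->level[0] = &rsp->node[0];  /* root */
	rsp->level[1] = &rsp->node[1];  /* first leaf */
}

/* Count every node by walking the array in order. */
int count_all_nodes(struct rcu_state *rsp)
{
	struct rcu_node *rnp;
	int n = 0;

	rcu_for_each_node_breadth_first(rsp, rnp)
		n++;
	return n;
}

/* Count only the nodes in the last part of the array, that is, the leaves. */
int count_leaf_nodes(struct rcu_state *rsp)
{
	struct rcu_node *rnp;
	int n = 0;

	rcu_for_each_leaf_node(rsp, rnp)
		n++;
	return n;
}
```

With this geometry, calling `init_state()` on a `struct rcu_state` and then the two counters visits five nodes in the breadth-first walk and four in the leaf-only walk. If `NUM_RCU_LVLS` is 1 and `NUM_RCU_NODES` is 1, the last level starts at `node[0]`, so the leaf macro visits the single node, matching the Quick Quiz answer.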
......
...@@ -12,10 +12,9 @@ high efficiency and minimal disturbance, expedited grace periods accept ...@@ -12,10 +12,9 @@ high efficiency and minimal disturbance, expedited grace periods accept
lower efficiency and significant disturbance to attain shorter latencies. lower efficiency and significant disturbance to attain shorter latencies.
<p> <p>
There are three flavors of RCU (RCU-bh, RCU-preempt, and RCU-sched), There are two flavors of RCU (RCU-preempt and RCU-sched), with an earlier
but only two flavors of expedited grace periods because the RCU-bh third RCU-bh flavor having been implemented in terms of the other two.
expedited grace period maps onto the RCU-sched expedited grace period. Each of the two implementations is covered in its own section.
Each of the remaining two implementations is covered in its own section.
<ol> <ol>
<li> <a href="#Expedited Grace Period Design"> <li> <a href="#Expedited Grace Period Design">
......
...@@ -1306,8 +1306,6 @@ doing so would degrade real-time response. ...@@ -1306,8 +1306,6 @@ doing so would degrade real-time response.
<p> <p>
This non-requirement appeared with preemptible RCU. This non-requirement appeared with preemptible RCU.
If you need a grace period that waits on non-preemptible code regions, use
<a href="#Sched Flavor">RCU-sched</a>.
<h2><a name="Parallelism Facts of Life">Parallelism Facts of Life</a></h2> <h2><a name="Parallelism Facts of Life">Parallelism Facts of Life</a></h2>
...@@ -2165,14 +2163,9 @@ however, this is not a panacea because there would be severe restrictions ...@@ -2165,14 +2163,9 @@ however, this is not a panacea because there would be severe restrictions
on what operations those callbacks could invoke. on what operations those callbacks could invoke.
<p> <p>
Perhaps surprisingly, <tt>synchronize_rcu()</tt>, Perhaps surprisingly, <tt>synchronize_rcu()</tt> and
<a href="#Bottom-Half Flavor"><tt>synchronize_rcu_bh()</tt></a>
(<a href="#Bottom-Half Flavor">discussed below</a>),
<a href="#Sched Flavor"><tt>synchronize_sched()</tt></a>,
<tt>synchronize_rcu_expedited()</tt>, <tt>synchronize_rcu_expedited()</tt>,
<tt>synchronize_rcu_bh_expedited()</tt>, and will operate normally
<tt>synchronize_sched_expedited()</tt>
will all operate normally
during very early boot, the reason being that there is only one CPU during very early boot, the reason being that there is only one CPU
and preemption is disabled. and preemption is disabled.
This means that the call <tt>synchronize_rcu()</tt> (or friends) This means that the call <tt>synchronize_rcu()</tt> (or friends)
...@@ -2861,15 +2854,22 @@ The other four flavors are listed below, with requirements for each ...@@ -2861,15 +2854,22 @@ The other four flavors are listed below, with requirements for each
described in a separate section. described in a separate section.
<ol> <ol>
<li> <a href="#Bottom-Half Flavor">Bottom-Half Flavor</a> <li> <a href="#Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a>
<li> <a href="#Sched Flavor">Sched Flavor</a> <li> <a href="#Sched Flavor">Sched Flavor (Historical)</a>
<li> <a href="#Sleepable RCU">Sleepable RCU</a> <li> <a href="#Sleepable RCU">Sleepable RCU</a>
<li> <a href="#Tasks RCU">Tasks RCU</a> <li> <a href="#Tasks RCU">Tasks RCU</a>
<li> <a href="#Waiting for Multiple Grace Periods">
Waiting for Multiple Grace Periods</a>
</ol> </ol>
<h3><a name="Bottom-Half Flavor">Bottom-Half Flavor</a></h3> <h3><a name="Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a></h3>
<p>
The RCU-bh flavor of RCU has since been expressed in terms of
the other RCU flavors as part of a consolidation of the three
flavors into a single flavor.
The read-side API remains, and continues to disable softirq and to
be accounted for by lockdep.
Much of the material in this section is therefore strictly historical
in nature.
<p> <p>
The softirq-disable (AKA &ldquo;bottom-half&rdquo;, The softirq-disable (AKA &ldquo;bottom-half&rdquo;,
...@@ -2929,8 +2929,20 @@ includes ...@@ -2929,8 +2929,20 @@ includes
<tt>call_rcu_bh()</tt>, <tt>call_rcu_bh()</tt>,
<tt>rcu_barrier_bh()</tt>, and <tt>rcu_barrier_bh()</tt>, and
<tt>rcu_read_lock_bh_held()</tt>. <tt>rcu_read_lock_bh_held()</tt>.
However, the update-side APIs are now simple wrappers for other RCU
flavors, namely RCU-sched in CONFIG_PREEMPT=n kernels and RCU-preempt
otherwise.
<h3><a name="Sched Flavor">Sched Flavor</a></h3> <h3><a name="Sched Flavor">Sched Flavor (Historical)</a></h3>
<p>
The RCU-sched flavor of RCU has since been expressed in terms of
the other RCU flavors as part of a consolidation of the three
flavors into a single flavor.
The read-side API remains, and continues to disable preemption and to
be accounted for by lockdep.
Much of the material in this section is therefore strictly historical
in nature.
<p> <p>
Before preemptible RCU, waiting for an RCU grace period had the Before preemptible RCU, waiting for an RCU grace period had the
...@@ -3150,94 +3162,14 @@ The tasks-RCU API is quite compact, consisting only of ...@@ -3150,94 +3162,14 @@ The tasks-RCU API is quite compact, consisting only of
<tt>call_rcu_tasks()</tt>, <tt>call_rcu_tasks()</tt>,
<tt>synchronize_rcu_tasks()</tt>, and <tt>synchronize_rcu_tasks()</tt>, and
<tt>rcu_barrier_tasks()</tt>. <tt>rcu_barrier_tasks()</tt>.
In <tt>CONFIG_PREEMPT=n</tt> kernels, trampolines cannot be preempted,
<h3><a name="Waiting for Multiple Grace Periods"> so these APIs map to
Waiting for Multiple Grace Periods</a></h3> <tt>call_rcu()</tt>,
<tt>synchronize_rcu()</tt>, and
<p> <tt>rcu_barrier()</tt>, respectively.
Perhaps you have an RCU protected data structure that is accessed from In <tt>CONFIG_PREEMPT=y</tt> kernels, trampolines can be preempted,
RCU read-side critical sections, from softirq handlers, and from and these three APIs are therefore implemented by separate functions
hardware interrupt handlers. that check for voluntary context switches.
That is three flavors of RCU, the normal flavor, the bottom-half flavor,
and the sched flavor.
How to wait for a compound grace period?
<p>
The best approach is usually to &ldquo;just say no!&rdquo; and
insert <tt>rcu_read_lock()</tt> and <tt>rcu_read_unlock()</tt>
around each RCU read-side critical section, regardless of what
environment it happens to be in.
But suppose that some of the RCU read-side critical sections are
on extremely hot code paths, and that use of <tt>CONFIG_PREEMPT=n</tt>
is not a viable option, so that <tt>rcu_read_lock()</tt> and
<tt>rcu_read_unlock()</tt> are not free.
What then?
<p>
You <i>could</i> wait on all three grace periods in succession, as follows:
<blockquote>
<pre>
1 synchronize_rcu();
2 synchronize_rcu_bh();
3 synchronize_sched();
</pre>
</blockquote>
<p>
This works, but triples the update-side latency penalty.
In cases where this is not acceptable, <tt>synchronize_rcu_mult()</tt>
may be used to wait on all three flavors of grace period concurrently:
<blockquote>
<pre>
1 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched);
</pre>
</blockquote>
<p>
But what if it is necessary to also wait on SRCU?
This can be done as follows:
<blockquote>
<pre>
1 static void call_my_srcu(struct rcu_head *head,
2 void (*func)(struct rcu_head *head))
3 {
4 call_srcu(&amp;my_srcu, head, func);
5 }
6
7 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched, call_my_srcu);
</pre>
</blockquote>
<p>
If you needed to wait on multiple different flavors of SRCU
(but why???), you would need to create a wrapper function resembling
<tt>call_my_srcu()</tt> for each SRCU flavor.
<table>
<tr><th>&nbsp;</th></tr>
<tr><th align="left">Quick Quiz:</th></tr>
<tr><td>
But what if I need to wait for multiple RCU flavors, but I also need
the grace periods to be expedited?
</td></tr>
<tr><th align="left">Answer:</th></tr>
<tr><td bgcolor="#ffffff"><font color="ffffff">
If you are using expedited grace periods, there should be less penalty
for waiting on them in succession.
But if that is nevertheless a problem, you can use workqueues
or multiple kthreads to wait on the various expedited grace
periods concurrently.
</font></td></tr>
<tr><td>&nbsp;</td></tr>
</table>
<p>
Again, it is usually better to adjust the RCU read-side critical sections
to use a single flavor of RCU, but when this is not feasible, you can use
<tt>synchronize_rcu_mult()</tt>.
<h2><a name="Possible Future Changes">Possible Future Changes</a></h2> <h2><a name="Possible Future Changes">Possible Future Changes</a></h2>
...@@ -3248,12 +3180,6 @@ If this becomes a serious problem, it will be necessary to rework the ...@@ -3248,12 +3180,6 @@ If this becomes a serious problem, it will be necessary to rework the
grace-period state machine so as to avoid the need for the additional grace-period state machine so as to avoid the need for the additional
latency. latency.
<p>
Expedited grace periods scan the CPUs, so their latency and overhead
increases with increasing numbers of CPUs.
If this becomes a serious problem on large systems, it will be necessary
to do some redesign to avoid this scalability problem.
<p> <p>
RCU disables CPU hotplug in a few places, perhaps most notably in the RCU disables CPU hotplug in a few places, perhaps most notably in the
<tt>rcu_barrier()</tt> operations. <tt>rcu_barrier()</tt> operations.
...@@ -3298,11 +3224,6 @@ Please note that arrangements that require RCU to remap CPU numbers will ...@@ -3298,11 +3224,6 @@ Please note that arrangements that require RCU to remap CPU numbers will
require extremely good demonstration of need and full exploration of require extremely good demonstration of need and full exploration of
alternatives. alternatives.
<p>
There is an embarrassingly large number of flavors of RCU, and this
number has been increasing over time.
Perhaps it will be possible to combine some at some future date.
<p> <p>
RCU's various kthreads are reasonably recent additions. RCU's various kthreads are reasonably recent additions.
It is quite likely that adjustments will be required to more gracefully It is quite likely that adjustments will be required to more gracefully
......
...@@ -16,12 +16,9 @@ o A CPU looping in an RCU read-side critical section. ...@@ -16,12 +16,9 @@ o A CPU looping in an RCU read-side critical section.
o A CPU looping with interrupts disabled. o A CPU looping with interrupts disabled.
o A CPU looping with preemption disabled. This condition can o A CPU looping with preemption disabled.
result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
stalls.
o A CPU looping with bottom halves disabled. This condition can o A CPU looping with bottom halves disabled.
result in RCU-sched and RCU-bh stalls.
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule(). If the looping in the kernel is without invoking schedule(). If the looping in the kernel is
...@@ -87,9 +84,9 @@ o A hardware failure. This is quite unlikely, but has occurred ...@@ -87,9 +84,9 @@ o A hardware failure. This is quite unlikely, but has occurred
This resulted in a series of RCU CPU stall warnings, eventually This resulted in a series of RCU CPU stall warnings, eventually
leading the realization that the CPU had failed. leading the realization that the CPU had failed.
The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
warning. Note that SRCU does -not- have CPU stall warnings. Please note Note that SRCU does -not- have CPU stall warnings. Please note that
that RCU only detects CPU stalls when there is a grace period in progress. RCU only detects CPU stalls when there is a grace period in progress.
No grace period, no CPU stall warnings. No grace period, no CPU stall warnings.
To diagnose the cause of the stall, inspect the stack traces. To diagnose the cause of the stall, inspect the stack traces.
......
...@@ -934,7 +934,8 @@ c. Do you need to treat NMI handlers, hardirq handlers, ...@@ -934,7 +934,8 @@ c. Do you need to treat NMI handlers, hardirq handlers,
d. Do you need RCU grace periods to complete even in the face d. Do you need RCU grace periods to complete even in the face
of softirq monopolization of one or more of the CPUs? For of softirq monopolization of one or more of the CPUs? For
example, is your code subject to network-based denial-of-service example, is your code subject to network-based denial-of-service
attacks? If so, you need RCU-bh. attacks? If so, you should disable softirq across your readers,
for example, by using rcu_read_lock_bh().
e. Is your workload too update-intensive for normal use of e. Is your workload too update-intensive for normal use of
RCU, but inappropriate for other synchronization mechanisms? RCU, but inappropriate for other synchronization mechanisms?
......
...@@ -3534,14 +3534,14 @@ ...@@ -3534,14 +3534,14 @@
In kernels built with CONFIG_RCU_NOCB_CPU=y, set In kernels built with CONFIG_RCU_NOCB_CPU=y, set
the specified list of CPUs to be no-callback CPUs. the specified list of CPUs to be no-callback CPUs.
Invocation of these CPUs' RCU callbacks will Invocation of these CPUs' RCU callbacks will be
be offloaded to "rcuox/N" kthreads created for offloaded to "rcuox/N" kthreads created for that
that purpose, where "x" is "b" for RCU-bh, "p" purpose, where "x" is "p" for RCU-preempt, and
for RCU-preempt, and "s" for RCU-sched, and "N" "s" for RCU-sched, and "N" is the CPU number.
is the CPU number. This reduces OS jitter on the This reduces OS jitter on the offloaded CPUs,
offloaded CPUs, which can be useful for HPC and which can be useful for HPC and real-time
real-time workloads. It can also improve energy workloads. It can also improve energy efficiency
efficiency for asymmetric multiprocessors. for asymmetric multiprocessors.
rcu_nocb_poll [KNL] rcu_nocb_poll [KNL]
Rather than requiring that offloaded CPUs Rather than requiring that offloaded CPUs
......
...@@ -321,7 +321,7 @@ To reduce its OS jitter, do at least one of the following: ...@@ -321,7 +321,7 @@ To reduce its OS jitter, do at least one of the following:
to do. to do.
Name: Name:
rcuob/%d, rcuop/%d, and rcuos/%d rcuop/%d and rcuos/%d
Purpose: Purpose:
Offload RCU callbacks from the corresponding CPU. Offload RCU callbacks from the corresponding CPU.
......
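The kthread naming scheme described in the last two files ("rcuox/N", where "x" is "p" or "s" and "N" is the CPU number) can be illustrated with a small user-space helper. This is only a sketch: `rcuo_kthread_name()` is a hypothetical function written for this example, and the format string is an assumption derived from the documented names rather than a quotation of kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the documented name of a no-callbacks kthread.
 * flavor must be 'p' (RCU-preempt) or 's' (RCU-sched); the 'b' (RCU-bh)
 * variant is rejected because this commit removes it from the documentation.
 * Returns the snprintf() result, or -1 for an unknown flavor.
 */
int rcuo_kthread_name(char *buf, size_t len, char flavor, int cpu)
{
	assert(buf != NULL);
	if (flavor != 'p' && flavor != 's')
		return -1;
	return snprintf(buf, len, "rcuo%c/%d", flavor, cpu);
}
```

For example, asking for flavor `'p'` on CPU 3 fills the buffer with `rcuop/3`, which is the kind of name to look for in `ps` output when checking whether callbacks have been offloaded from a given CPU.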