Commit cee1352f authored by Linus Torvalds's avatar Linus Torvalds

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "The biggest change in this cycle is the conclusion of the big
  'simplify RCU to two primary flavors' consolidation work - i.e.
  there's a single RCU flavor for any kernel variant (PREEMPT and
  !PREEMPT):

    - Consolidate the RCU-bh, RCU-preempt, and RCU-sched flavors into a
      single flavor similar to RCU-sched in !PREEMPT kernels and into a
      single flavor similar to RCU-preempt (but also waiting on
      preempt-disabled sequences of code) in PREEMPT kernels.

      This branch also includes a refactoring of
      rcu_{nmi,irq}_{enter,exit}() from Byungchul Park.

    - Now that there is only one RCU flavor in any given running kernel,
      the many "rsp" pointers are no longer required, and this cleanup
      series removes them.

    - This branch carries out additional cleanups made possible by the
      RCU flavor consolidation, including inlining now-trivial
      functions, updating comments and definitions, and removing
      now-unneeded rcutorture scenarios.

    - Now that there is only one flavor of RCU in any running kernel,
      there is also only on rcu_data structure per CPU. This means that
      the rcu_dynticks structure can be merged into the rcu_data
      structure, a task taken on by this branch. This branch also
      contains a -rt-related fix from Mike Galbraith.

  There were also other updates:

    - Documentation updates, including some good-eye catches from Joel
      Fernandes.

    - SRCU updates, most notably changes enabling call_srcu() to be
      invoked very early in the boot sequence.

    - Torture-test updates, including some preliminary work towards
      making rcutorture better able to find problems that result in
      insufficient grace-period forward progress.

    - Initial changes to RCU to better promote forward progress of grace
      periods, including fixing a bug found by Marius Hillenbrand and
      David Woodhouse, with the fix suggested by Peter Zijlstra"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits)
  srcu: Make early-boot call_srcu() reuse workqueue lists
  rcutorture: Test early boot call_srcu()
  srcu: Make call_srcu() available during very early boot
  rcu: Convert rcu_state.ofl_lock to raw_spinlock_t
  rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed
  rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks
  rcu: Switch dyntick nesting counters to rcu_data structure
  rcu: Switch urgent quiescent-state requests to rcu_data structure
  rcu: Switch lazy counts to rcu_data structure
  rcu: Switch last accelerate/advance to rcu_data structure
  rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure
  rcu: Merge rcu_dynticks structure into rcu_data structure
  rcu: Remove unused rcu_dynticks_snap() from Tiny RCU
  rcu: Convert "1UL << x" to "BIT(x)"
  rcu: Avoid resched_cpu() when rescheduling the current CPU
  rcu: More aggressively enlist scheduler aid for nohz_full CPUs
  rcu: Compute jiffies_till_sched_qs from other kernel parameters
  rcu: Provide functions for determining if call_rcu() has been invoked
  rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure
  rcu: Motivate Tiny RCU forward progress
  ...
parents e2b623fb d0346559
......@@ -1227,9 +1227,11 @@ to overflow the counter, this approach corrects the
CPU enters the idle loop from process context.
</p><p>The <tt>-&gt;dynticks</tt> field counts the corresponding
CPU's transitions to and from dyntick-idle mode, so that this counter
has an even value when the CPU is in dyntick-idle mode and an odd
value otherwise.
CPU's transitions to and from either dyntick-idle or user mode, so
that this counter has an even value when the CPU is in dyntick-idle
mode or user mode and an odd value otherwise. The transitions to/from
user mode need to be counted for user mode adaptive-ticks support
(see timers/NO_HZ.txt).
</p><p>The <tt>-&gt;rcu_need_heavy_qs</tt> field is used
to record the fact that the RCU core code would really like to
......@@ -1372,8 +1374,7 @@ that is, if the CPU is currently idle.
Accessor Functions</a></h3>
<p>The following listing shows the
<tt>rcu_get_root()</tt>, <tt>rcu_for_each_node_breadth_first</tt>,
<tt>rcu_for_each_nonleaf_node_breadth_first()</tt>, and
<tt>rcu_get_root()</tt>, <tt>rcu_for_each_node_breadth_first</tt> and
<tt>rcu_for_each_leaf_node()</tt> function and macros:
<pre>
......@@ -1386,13 +1387,9 @@ Accessor Functions</a></h3>
7 for ((rnp) = &amp;(rsp)-&gt;node[0]; \
8 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++)
9
10 #define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \
11 for ((rnp) = &amp;(rsp)-&gt;node[0]; \
12 (rnp) &lt; (rsp)-&gt;level[NUM_RCU_LVLS - 1]; (rnp)++)
13
14 #define rcu_for_each_leaf_node(rsp, rnp) \
15 for ((rnp) = (rsp)-&gt;level[NUM_RCU_LVLS - 1]; \
16 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++)
10 #define rcu_for_each_leaf_node(rsp, rnp) \
11 for ((rnp) = (rsp)-&gt;level[NUM_RCU_LVLS - 1]; \
12 (rnp) &lt; &amp;(rsp)-&gt;node[NUM_RCU_NODES]; (rnp)++)
</pre>
<p>The <tt>rcu_get_root()</tt> simply returns a pointer to the
......@@ -1405,10 +1402,7 @@ macro takes advantage of the layout of the <tt>rcu_node</tt>
structures in the <tt>rcu_state</tt> structure's
<tt>-&gt;node[]</tt> array, performing a breadth-first traversal by
simply traversing the array in order.
The <tt>rcu_for_each_nonleaf_node_breadth_first()</tt> macro operates
similarly, but traverses only the first part of the array, thus excluding
the leaf <tt>rcu_node</tt> structures.
Finally, the <tt>rcu_for_each_leaf_node()</tt> macro traverses only
Similarly, the <tt>rcu_for_each_leaf_node()</tt> macro traverses only
the last part of the array, thus traversing only the leaf
<tt>rcu_node</tt> structures.
......@@ -1416,15 +1410,14 @@ the last part of the array, thus traversing only the leaf
<tr><th>&nbsp;</th></tr>
<tr><th align="left">Quick Quiz:</th></tr>
<tr><td>
What do <tt>rcu_for_each_nonleaf_node_breadth_first()</tt> and
What does
<tt>rcu_for_each_leaf_node()</tt> do if the <tt>rcu_node</tt> tree
contains only a single node?
</td></tr>
<tr><th align="left">Answer:</th></tr>
<tr><td bgcolor="#ffffff"><font color="ffffff">
In the single-node case,
<tt>rcu_for_each_nonleaf_node_breadth_first()</tt> is a no-op
and <tt>rcu_for_each_leaf_node()</tt> traverses the single node.
<tt>rcu_for_each_leaf_node()</tt> traverses the single node.
</font></td></tr>
<tr><td>&nbsp;</td></tr>
</table>
......
......@@ -12,10 +12,9 @@ high efficiency and minimal disturbance, expedited grace periods accept
lower efficiency and significant disturbance to attain shorter latencies.
<p>
There are three flavors of RCU (RCU-bh, RCU-preempt, and RCU-sched),
but only two flavors of expedited grace periods because the RCU-bh
expedited grace period maps onto the RCU-sched expedited grace period.
Each of the remaining two implementations is covered in its own section.
There are two flavors of RCU (RCU-preempt and RCU-sched), with an earlier
third RCU-bh flavor having been implemented in terms of the other two.
Each of the two implementations is covered in its own section.
<ol>
<li> <a href="#Expedited Grace Period Design">
......@@ -158,7 +157,7 @@ whether or not the current CPU is in an RCU read-side critical section.
The best that <tt>sync_sched_exp_handler()</tt> can do is to check
for idle, on the off-chance that the CPU went idle while the IPI
was in flight.
If the CPU is idle, then tt>sync_sched_exp_handler()</tt> reports
If the CPU is idle, then <tt>sync_sched_exp_handler()</tt> reports
the quiescent state.
<p>
......
......@@ -16,12 +16,9 @@ o A CPU looping in an RCU read-side critical section.
o A CPU looping with interrupts disabled.
o A CPU looping with preemption disabled. This condition can
result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
stalls.
o A CPU looping with preemption disabled.
o A CPU looping with bottom halves disabled. This condition can
result in RCU-sched and RCU-bh stalls.
o A CPU looping with bottom halves disabled.
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule(). If the looping in the kernel is
......@@ -87,9 +84,9 @@ o A hardware failure. This is quite unlikely, but has occurred
This resulted in a series of RCU CPU stall warnings, eventually
leading the realization that the CPU had failed.
The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall
warning. Note that SRCU does -not- have CPU stall warnings. Please note
that RCU only detects CPU stalls when there is a grace period in progress.
The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
Note that SRCU does -not- have CPU stall warnings. Please note that
RCU only detects CPU stalls when there is a grace period in progress.
No grace period, no CPU stall warnings.
To diagnose the cause of the stall, inspect the stack traces.
......
......@@ -934,7 +934,8 @@ c. Do you need to treat NMI handlers, hardirq handlers,
d. Do you need RCU grace periods to complete even in the face
of softirq monopolization of one or more of the CPUs? For
example, is your code subject to network-based denial-of-service
attacks? If so, you need RCU-bh.
attacks? If so, you should disable softirq across your readers,
for example, by using rcu_read_lock_bh().
e. Is your workload too update-intensive for normal use of
RCU, but inappropriate for other synchronization mechanisms?
......
......@@ -3540,14 +3540,14 @@
In kernels built with CONFIG_RCU_NOCB_CPU=y, set
the specified list of CPUs to be no-callback CPUs.
Invocation of these CPUs' RCU callbacks will
be offloaded to "rcuox/N" kthreads created for
that purpose, where "x" is "b" for RCU-bh, "p"
for RCU-preempt, and "s" for RCU-sched, and "N"
is the CPU number. This reduces OS jitter on the
offloaded CPUs, which can be useful for HPC and
real-time workloads. It can also improve energy
efficiency for asymmetric multiprocessors.
Invocation of these CPUs' RCU callbacks will be
offloaded to "rcuox/N" kthreads created for that
purpose, where "x" is "p" for RCU-preempt, and
"s" for RCU-sched, and "N" is the CPU number.
This reduces OS jitter on the offloaded CPUs,
which can be useful for HPC and real-time
workloads. It can also improve energy efficiency
for asymmetric multiprocessors.
rcu_nocb_poll [KNL]
Rather than requiring that offloaded CPUs
......@@ -3601,7 +3601,14 @@
Set required age in jiffies for a
given grace period before RCU starts
soliciting quiescent-state help from
rcu_note_context_switch().
rcu_note_context_switch(). If not specified, the
kernel will calculate a value based on the most
recent settings of rcutree.jiffies_till_first_fqs
and rcutree.jiffies_till_next_fqs.
This calculated value may be viewed in
rcutree.jiffies_to_sched_qs. Any attempt to
set rcutree.jiffies_to_sched_qs will be
cheerfully overwritten.
rcutree.jiffies_till_first_fqs= [KNL]
Set delay from grace-period initialization to
......@@ -3869,12 +3876,6 @@
rcupdate.rcu_self_test= [KNL]
Run the RCU early boot self tests
rcupdate.rcu_self_test_bh= [KNL]
Run the RCU bh early boot self tests
rcupdate.rcu_self_test_sched= [KNL]
Run the RCU sched early boot self tests
rdinit= [KNL]
Format: <full_path>
Run specified binary instead of /init from the ramdisk,
......
......@@ -321,7 +321,7 @@ To reduce its OS jitter, do at least one of the following:
to do.
Name:
rcuob/%d, rcuop/%d, and rcuos/%d
rcuop/%d and rcuos/%d
Purpose:
Offload RCU callbacks from the corresponding CPU.
......
......@@ -182,7 +182,7 @@ static inline void list_replace_rcu(struct list_head *old,
* @list: the RCU-protected list to splice
* @prev: points to the last element of the existing list
* @next: points to the first element of the existing list
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ...
* @sync: synchronize_rcu, synchronize_rcu_expedited, ...
*
* The list pointed to by @prev and @next can be RCU-read traversed
* concurrently with this function.
......@@ -240,7 +240,7 @@ static inline void __list_splice_init_rcu(struct list_head *list,
* designed for stacks.
* @list: the RCU-protected list to splice
* @head: the place in the existing list to splice the first list into
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ...
* @sync: synchronize_rcu, synchronize_rcu_expedited, ...
*/
static inline void list_splice_init_rcu(struct list_head *list,
struct list_head *head,
......@@ -255,7 +255,7 @@ static inline void list_splice_init_rcu(struct list_head *list,
* list, designed for queues.
* @list: the RCU-protected list to splice
* @head: the place in the existing list to splice the first list into
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ...
* @sync: synchronize_rcu, synchronize_rcu_expedited, ...
*/
static inline void list_splice_tail_init_rcu(struct list_head *list,
struct list_head *head,
......@@ -359,13 +359,12 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @type: the type of the struct this is embedded in.
* @member: the name of the list_head within the struct.
*
* This primitive may safely run concurrently with the _rcu list-mutation
* primitives such as list_add_rcu(), but requires some implicit RCU
* read-side guarding. One example is running within a special
* exception-time environment where preemption is disabled and where
* lockdep cannot be invoked (in which case updaters must use RCU-sched,
* as in synchronize_sched(), call_rcu_sched(), and friends). Another
* example is when items are added to the list, but never deleted.
* This primitive may safely run concurrently with the _rcu
* list-mutation primitives such as list_add_rcu(), but requires some
* implicit RCU read-side guarding. One example is running within a special
* exception-time environment where preemption is disabled and where lockdep
* cannot be invoked. Another example is when items are added to the list,
* but never deleted.
*/
#define list_entry_lockless(ptr, type, member) \
container_of((typeof(ptr))READ_ONCE(ptr), type, member)
......@@ -376,13 +375,12 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @head: the head for your list.
* @member: the name of the list_struct within the struct.
*
* This primitive may safely run concurrently with the _rcu list-mutation
* primitives such as list_add_rcu(), but requires some implicit RCU
* read-side guarding. One example is running within a special
* exception-time environment where preemption is disabled and where
* lockdep cannot be invoked (in which case updaters must use RCU-sched,
* as in synchronize_sched(), call_rcu_sched(), and friends). Another
* example is when items are added to the list, but never deleted.
* This primitive may safely run concurrently with the _rcu
* list-mutation primitives such as list_add_rcu(), but requires some
* implicit RCU read-side guarding. One example is running within a special
* exception-time environment where preemption is disabled and where lockdep
* cannot be invoked. Another example is when items are added to the list,
* but never deleted.
*/
#define list_for_each_entry_lockless(pos, head, member) \
for (pos = list_entry_lockless((head)->next, typeof(*pos), member); \
......
......@@ -48,23 +48,14 @@
#define ulong2long(a) (*(long *)(&(a)))
/* Exported common interfaces */
#ifdef CONFIG_PREEMPT_RCU
void call_rcu(struct rcu_head *head, rcu_callback_t func);
#else /* #ifdef CONFIG_PREEMPT_RCU */
#define call_rcu call_rcu_sched
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
void call_rcu_bh(struct rcu_head *head, rcu_callback_t func);
void call_rcu_sched(struct rcu_head *head, rcu_callback_t func);
void synchronize_sched(void);
void rcu_barrier_tasks(void);
void synchronize_rcu(void);
#ifdef CONFIG_PREEMPT_RCU
void __rcu_read_lock(void);
void __rcu_read_unlock(void);
void synchronize_rcu(void);
/*
* Defined as a macro as it is a very low level header included from
......@@ -88,11 +79,6 @@ static inline void __rcu_read_unlock(void)
preempt_enable();
}
static inline void synchronize_rcu(void)
{
synchronize_sched();
}
static inline int rcu_preempt_depth(void)
{
return 0;
......@@ -103,8 +89,6 @@ static inline int rcu_preempt_depth(void)
/* Internal to kernel */
void rcu_init(void);
extern int rcu_scheduler_active __read_mostly;
void rcu_sched_qs(void);
void rcu_bh_qs(void);
void rcu_check_callbacks(int user);
void rcu_report_dead(unsigned int cpu);
void rcutree_migrate_callbacks(int cpu);
......@@ -135,11 +119,10 @@ static inline void rcu_init_nohz(void) { }
* RCU_NONIDLE - Indicate idle-loop code that needs RCU readers
* @a: Code that RCU needs to pay attention to.
*
* RCU, RCU-bh, and RCU-sched read-side critical sections are forbidden
* in the inner idle loop, that is, between the rcu_idle_enter() and
* the rcu_idle_exit() -- RCU will happily ignore any such read-side
* critical sections. However, things like powertop need tracepoints
* in the inner idle loop.
* RCU read-side critical sections are forbidden in the inner idle loop,
* that is, between the rcu_idle_enter() and the rcu_idle_exit() -- RCU
* will happily ignore any such read-side critical sections. However,
* things like powertop need tracepoints in the inner idle loop.
*
* This macro provides the way out: RCU_NONIDLE(do_something_with_RCU())
* will tell RCU that it needs to pay attention, invoke its argument
......@@ -167,20 +150,16 @@ static inline void rcu_init_nohz(void) { }
if (READ_ONCE((t)->rcu_tasks_holdout)) \
WRITE_ONCE((t)->rcu_tasks_holdout, false); \
} while (0)
#define rcu_note_voluntary_context_switch(t) \
do { \
rcu_all_qs(); \
rcu_tasks_qs(t); \
} while (0)
#define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t)
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks(void);
void exit_tasks_rcu_start(void);
void exit_tasks_rcu_finish(void);
#else /* #ifdef CONFIG_TASKS_RCU */
#define rcu_tasks_qs(t) do { } while (0)
#define rcu_note_voluntary_context_switch(t) rcu_all_qs()
#define call_rcu_tasks call_rcu_sched
#define synchronize_rcu_tasks synchronize_sched
#define rcu_note_voluntary_context_switch(t) do { } while (0)
#define call_rcu_tasks call_rcu
#define synchronize_rcu_tasks synchronize_rcu
static inline void exit_tasks_rcu_start(void) { }
static inline void exit_tasks_rcu_finish(void) { }
#endif /* #else #ifdef CONFIG_TASKS_RCU */
......@@ -325,9 +304,8 @@ static inline void rcu_preempt_sleep_check(void) { }
* Helper functions for rcu_dereference_check(), rcu_dereference_protected()
* and rcu_assign_pointer(). Some of these could be folded into their
* callers, but they are left separate in order to ease introduction of
* multiple flavors of pointers to match the multiple flavors of RCU
* (e.g., __rcu_bh, * __rcu_sched, and __srcu), should this make sense in
* the future.
* multiple pointers markings to match different RCU implementations
* (e.g., __srcu), should this make sense in the future.
*/
#ifdef __CHECKER__
......@@ -686,14 +664,9 @@ static inline void rcu_read_unlock(void)
/**
* rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section
*
* This is equivalent of rcu_read_lock(), but to be used when updates
* are being done using call_rcu_bh() or synchronize_rcu_bh(). Since
* both call_rcu_bh() and synchronize_rcu_bh() consider completion of a
* softirq handler to be a quiescent state, a process in RCU read-side
* critical section must be protected by disabling softirqs. Read-side
* critical sections in interrupt context can use just rcu_read_lock(),
* though this should at least be commented to avoid confusing people
* reading the code.
* This is equivalent of rcu_read_lock(), but also disables softirqs.
* Note that anything else that disables softirqs can also serve as
* an RCU read-side critical section.
*
* Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh()
* must occur in the same context, for example, it is illegal to invoke
......@@ -726,10 +699,9 @@ static inline void rcu_read_unlock_bh(void)
/**
* rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section
*
* This is equivalent of rcu_read_lock(), but to be used when updates
* are being done using call_rcu_sched() or synchronize_rcu_sched().
* Read-side critical sections can also be introduced by anything that
* disables preemption, including local_irq_disable() and friends.
* This is equivalent of rcu_read_lock(), but disables preemption.
* Read-side critical sections can also be introduced by anything else
* that disables preemption, including local_irq_disable() and friends.
*
* Note that rcu_read_lock_sched() and the matching rcu_read_unlock_sched()
* must occur in the same context, for example, it is illegal to invoke
......@@ -885,4 +857,96 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
#endif /* #else #ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE */
/* Has the specified rcu_head structure been handed to call_rcu()? */
/*
* rcu_head_init - Initialize rcu_head for rcu_head_after_call_rcu()
* @rhp: The rcu_head structure to initialize.
*
* If you intend to invoke rcu_head_after_call_rcu() to test whether a
* given rcu_head structure has already been passed to call_rcu(), then
* you must also invoke this rcu_head_init() function on it just after
* allocating that structure. Calls to this function must not race with
* calls to call_rcu(), rcu_head_after_call_rcu(), or callback invocation.
*/
static inline void rcu_head_init(struct rcu_head *rhp)
{
rhp->func = (rcu_callback_t)~0L;
}
/*
* rcu_head_after_call_rcu - Has this rcu_head been passed to call_rcu()?
* @rhp: The rcu_head structure to test.
* @func: The function passed to call_rcu() along with @rhp.
*
* Returns @true if the @rhp has been passed to call_rcu() with @func,
* and @false otherwise. Emits a warning in any other case, including
* the case where @rhp has already been invoked after a grace period.
* Calls to this function must not race with callback invocation. One way
* to avoid such races is to enclose the call to rcu_head_after_call_rcu()
* in an RCU read-side critical section that includes a read-side fetch
* of the pointer to the structure containing @rhp.
*/
static inline bool
rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f)
{
if (READ_ONCE(rhp->func) == f)
return true;
WARN_ON_ONCE(READ_ONCE(rhp->func) != (rcu_callback_t)~0L);
return false;
}
/* Transitional pre-consolidation compatibility definitions. */
static inline void synchronize_rcu_bh(void)
{
synchronize_rcu();
}
static inline void synchronize_rcu_bh_expedited(void)
{
synchronize_rcu_expedited();
}
static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func)
{
call_rcu(head, func);
}
static inline void rcu_barrier_bh(void)
{
rcu_barrier();
}
static inline void synchronize_sched(void)
{
synchronize_rcu();
}
static inline void synchronize_sched_expedited(void)
{
synchronize_rcu_expedited();
}
static inline void call_rcu_sched(struct rcu_head *head, rcu_callback_t func)
{
call_rcu(head, func);
}
static inline void rcu_barrier_sched(void)
{
rcu_barrier();
}
static inline unsigned long get_state_synchronize_sched(void)
{
return get_state_synchronize_rcu();
}
static inline void cond_synchronize_sched(unsigned long oldstate)
{
cond_synchronize_rcu(oldstate);
}
#endif /* __LINUX_RCUPDATE_H */
......@@ -33,17 +33,17 @@ do { \
/**
* synchronize_rcu_mult - Wait concurrently for multiple grace periods
* @...: List of call_rcu() functions for the flavors to wait on.
* @...: List of call_rcu() functions for different grace periods to wait on
*
* This macro waits concurrently for multiple flavors of RCU grace periods.
* For example, synchronize_rcu_mult(call_rcu, call_rcu_bh) would wait
* on concurrent RCU and RCU-bh grace periods. Waiting on a give SRCU
* This macro waits concurrently for multiple types of RCU grace periods.
* For example, synchronize_rcu_mult(call_rcu, call_rcu_tasks) would wait
* on concurrent RCU and RCU-tasks grace periods. Waiting on a give SRCU
* domain requires you to write a wrapper function for that SRCU domain's
* call_srcu() function, supplying the corresponding srcu_struct.
*
* If Tiny RCU, tell _wait_rcu_gp() not to bother waiting for RCU
* or RCU-bh, given that anywhere synchronize_rcu_mult() can be called
* is automatically a grace period.
* If Tiny RCU, tell _wait_rcu_gp() does not bother waiting for RCU,
* given that anywhere synchronize_rcu_mult() can be called is automatically
* a grace period.
*/
#define synchronize_rcu_mult(...) \
_wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)
......
......@@ -27,12 +27,6 @@
#include <linux/ktime.h>
struct rcu_dynticks;
static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp)
{
return 0;
}
/* Never flag non-existent other CPUs! */
static inline bool rcu_eqs_special_set(int cpu) { return false; }
......@@ -46,53 +40,28 @@ static inline void cond_synchronize_rcu(unsigned long oldstate)
might_sleep();
}
static inline unsigned long get_state_synchronize_sched(void)
{
return 0;
}
static inline void cond_synchronize_sched(unsigned long oldstate)
{
might_sleep();
}
extern void rcu_barrier_bh(void);
extern void rcu_barrier_sched(void);
extern void rcu_barrier(void);
static inline void synchronize_rcu_expedited(void)
{
synchronize_sched(); /* Only one CPU, so pretty fast anyway!!! */
synchronize_rcu();
}
static inline void rcu_barrier(void)
static inline void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
{
rcu_barrier_sched(); /* Only one CPU, so only one list of callbacks! */
}
static inline void synchronize_rcu_bh(void)
{
synchronize_sched();
}
static inline void synchronize_rcu_bh_expedited(void)
{
synchronize_sched();
call_rcu(head, func);
}
static inline void synchronize_sched_expedited(void)
{
synchronize_sched();
}
void rcu_qs(void);
static inline void kfree_call_rcu(struct rcu_head *head,
rcu_callback_t func)
static inline void rcu_softirq_qs(void)
{
call_rcu(head, func);
rcu_qs();
}
#define rcu_note_context_switch(preempt) \
do { \
rcu_sched_qs(); \
rcu_qs(); \
rcu_tasks_qs(current); \
} while (0)
......@@ -108,6 +77,7 @@ static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
*/
static inline void rcu_virt_note_context_switch(int cpu) { }
static inline void rcu_cpu_stall_reset(void) { }
static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; }
static inline void rcu_idle_enter(void) { }
static inline void rcu_idle_exit(void) { }
static inline void rcu_irq_enter(void) { }
......@@ -115,6 +85,11 @@ static inline void rcu_irq_exit_irqson(void) { }
static inline void rcu_irq_enter_irqson(void) { }
static inline void rcu_irq_exit(void) { }
static inline void exit_rcu(void) { }
static inline bool rcu_preempt_need_deferred_qs(struct task_struct *t)
{
return false;
}
static inline void rcu_preempt_deferred_qs(struct task_struct *t) { }
#ifdef CONFIG_SRCU
void rcu_scheduler_starting(void);
#else /* #ifndef CONFIG_SRCU */
......
......@@ -30,6 +30,7 @@
#ifndef __LINUX_RCUTREE_H
#define __LINUX_RCUTREE_H
void rcu_softirq_qs(void);
void rcu_note_context_switch(bool preempt);
int rcu_needs_cpu(u64 basem, u64 *nextevt);
void rcu_cpu_stall_reset(void);
......@@ -44,41 +45,13 @@ static inline void rcu_virt_note_context_switch(int cpu)
rcu_note_context_switch(false);
}
void synchronize_rcu_bh(void);
void synchronize_sched_expedited(void);
void synchronize_rcu_expedited(void);
void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
/**
* synchronize_rcu_bh_expedited - Brute-force RCU-bh grace period
*
* Wait for an RCU-bh grace period to elapse, but use a "big hammer"
* approach to force the grace period to end quickly. This consumes
* significant time on all CPUs and is unfriendly to real-time workloads,
* so is thus not recommended for any sort of common-case code. In fact,
* if you are using synchronize_rcu_bh_expedited() in a loop, please
* restructure your code to batch your updates, and then use a single
* synchronize_rcu_bh() instead.
*
* Note that it is illegal to call this function while holding any lock
* that is acquired by a CPU-hotplug notifier. And yes, it is also illegal
* to call this function from a CPU-hotplug notifier. Failing to observe
* these restriction will result in deadlock.
*/
static inline void synchronize_rcu_bh_expedited(void)
{
synchronize_sched_expedited();
}
void rcu_barrier(void);
void rcu_barrier_bh(void);
void rcu_barrier_sched(void);
bool rcu_eqs_special_set(int cpu);
unsigned long get_state_synchronize_rcu(void);
void cond_synchronize_rcu(unsigned long oldstate);
unsigned long get_state_synchronize_sched(void);
void cond_synchronize_sched(unsigned long oldstate);
void rcu_idle_enter(void);
void rcu_idle_exit(void);
......@@ -93,7 +66,9 @@ void rcu_scheduler_starting(void);
extern int rcu_scheduler_active __read_mostly;
void rcu_end_inkernel_boot(void);
bool rcu_is_watching(void);
#ifndef CONFIG_PREEMPT
void rcu_all_qs(void);
#endif
/* RCUtree hotplug events */
int rcutree_prepare_cpu(unsigned int cpu);
......
......@@ -571,12 +571,8 @@ union rcu_special {
struct {
u8 blocked;
u8 need_qs;
u8 exp_need_qs;
/* Otherwise the compiler can store garbage here: */
u8 pad;
} b; /* Bits. */
u32 s; /* Set of bits. */
u16 s; /* Set of bits. */
};
enum perf_event_task_context {
......
......@@ -105,12 +105,13 @@ struct srcu_struct {
#define SRCU_STATE_SCAN2 2
#define __SRCU_STRUCT_INIT(name, pcpu_name) \
{ \
.sda = &pcpu_name, \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = 0 - 1, \
__SRCU_DEP_MAP_INIT(name) \
}
{ \
.sda = &pcpu_name, \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
__SRCU_DEP_MAP_INIT(name) \
}
/*
* Define and initialize a srcu struct at build time.
......
......@@ -77,7 +77,7 @@ void torture_shutdown_absorb(const char *title);
int torture_shutdown_init(int ssecs, void (*cleanup)(void));
/* Task stuttering, which forces load/no-load transitions. */
void stutter_wait(const char *title);
bool stutter_wait(const char *title);
int torture_stutter_init(int s);
/* Initialization and cleanup. */
......
......@@ -393,9 +393,8 @@ TRACE_EVENT(rcu_quiescent_state_report,
* Tracepoint for quiescent states detected by force_quiescent_state().
* These trace events include the type of RCU, the grace-period number
* that was blocked by the CPU, the CPU itself, and the type of quiescent
* state, which can be "dti" for dyntick-idle mode, "kick" when kicking
* a CPU that has been in dyntick-idle mode for too long, or "rqc" if the
* CPU got a quiescent state via its rcu_qs_ctr.
* state, which can be "dti" for dyntick-idle mode or "kick" when kicking
* a CPU that has been in dyntick-idle mode for too long.
*/
TRACE_EVENT(rcu_fqs,
......@@ -705,20 +704,20 @@ TRACE_EVENT(rcu_torture_read,
);
/*
* Tracepoint for _rcu_barrier() execution. The string "s" describes
* the _rcu_barrier phase:
* "Begin": _rcu_barrier() started.
* "EarlyExit": _rcu_barrier() piggybacked, thus early exit.
* "Inc1": _rcu_barrier() piggyback check counter incremented.
* "OfflineNoCB": _rcu_barrier() found callback on never-online CPU
* "OnlineNoCB": _rcu_barrier() found online no-CBs CPU.
* "OnlineQ": _rcu_barrier() found online CPU with callbacks.
* "OnlineNQ": _rcu_barrier() found online CPU, no callbacks.
* Tracepoint for rcu_barrier() execution. The string "s" describes
* the rcu_barrier phase:
* "Begin": rcu_barrier() started.
* "EarlyExit": rcu_barrier() piggybacked, thus early exit.
* "Inc1": rcu_barrier() piggyback check counter incremented.
* "OfflineNoCB": rcu_barrier() found callback on never-online CPU
* "OnlineNoCB": rcu_barrier() found online no-CBs CPU.
* "OnlineQ": rcu_barrier() found online CPU with callbacks.
* "OnlineNQ": rcu_barrier() found online CPU, no callbacks.
* "IRQ": An rcu_barrier_callback() callback posted on remote CPU.
* "IRQNQ": An rcu_barrier_callback() callback found no callbacks.
* "CB": An rcu_barrier_callback() invoked a callback, not the last.
* "LastCB": An rcu_barrier_callback() invoked the last callback.
* "Inc2": _rcu_barrier() piggyback check counter incremented.
* "Inc2": rcu_barrier() piggyback check counter incremented.
* The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument
* is the count of remaining callbacks, and "done" is the piggybacking count.
*/
......
......@@ -196,7 +196,7 @@ config RCU_BOOST
This option boosts the priority of preempted RCU readers that
block the current preemptible RCU grace period for too long.
This option also prevents heavy loads from blocking RCU
callback invocation for all flavors of RCU.
callback invocation.
Say Y here if you are working with real-time apps or heavy loads
Say N here if you are unsure.
......@@ -225,12 +225,12 @@ config RCU_NOCB_CPU
callback invocation to energy-efficient CPUs in battery-powered
asymmetric multiprocessors.
This option offloads callback invocation from the set of
CPUs specified at boot time by the rcu_nocbs parameter.
For each such CPU, a kthread ("rcuox/N") will be created to
invoke callbacks, where the "N" is the CPU being offloaded,
and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and
"s" for RCU-sched. Nothing prevents this kthread from running
This option offloads callback invocation from the set of CPUs
specified at boot time by the rcu_nocbs parameter. For each
such CPU, a kthread ("rcuox/N") will be created to invoke
callbacks, where the "N" is the CPU being offloaded, and where
the "p" for RCU-preempt (PREEMPT kernels) and "s" for RCU-sched
(!PREEMPT kernels). Nothing prevents this kthread from running
on the specified CPUs, but (1) the kthreads may be preempted
between each callback, and (2) affinity or cgroups can be used
to force the kthreads to run on whatever set of CPUs is desired.
......
......@@ -176,8 +176,9 @@ static inline unsigned long rcu_seq_diff(unsigned long new, unsigned long old)
/*
* debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
* by call_rcu() and rcu callback execution, and are therefore not part of the
* RCU API. Leaving in rcupdate.h because they are used by all RCU flavors.
* by call_rcu() and rcu callback execution, and are therefore not part
* of the RCU API. These are in rcupdate.h because they are used by all
* RCU implementations.
*/
#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
......@@ -223,6 +224,7 @@ void kfree(const void *);
*/
static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
{
rcu_callback_t f;
unsigned long offset = (unsigned long)head->func;
rcu_lock_acquire(&rcu_callback_map);
......@@ -233,7 +235,9 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
return true;
} else {
RCU_TRACE(trace_rcu_invoke_callback(rn, head);)
head->func(head);
f = head->func;
WRITE_ONCE(head->func, (rcu_callback_t)0L);
f(head);
rcu_lock_release(&rcu_callback_map);
return false;
}
......@@ -328,40 +332,35 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt)
}
}
/* Returns first leaf rcu_node of the specified RCU flavor. */
#define rcu_first_leaf_node(rsp) ((rsp)->level[rcu_num_lvls - 1])
/* Returns a pointer to the first leaf rcu_node structure. */
#define rcu_first_leaf_node() (rcu_state.level[rcu_num_lvls - 1])
/* Is this rcu_node a leaf? */
#define rcu_is_leaf_node(rnp) ((rnp)->level == rcu_num_lvls - 1)
/* Is this rcu_node the last leaf? */
#define rcu_is_last_leaf_node(rsp, rnp) ((rnp) == &(rsp)->node[rcu_num_nodes - 1])
#define rcu_is_last_leaf_node(rnp) ((rnp) == &rcu_state.node[rcu_num_nodes - 1])
/*
* Do a full breadth-first scan of the rcu_node structures for the
* specified rcu_state structure.
* Do a full breadth-first scan of the {s,}rcu_node structures for the
* specified state structure (for SRCU) or the only rcu_state structure
* (for RCU).
*/
#define rcu_for_each_node_breadth_first(rsp, rnp) \
for ((rnp) = &(rsp)->node[0]; \
(rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++)
#define srcu_for_each_node_breadth_first(sp, rnp) \
for ((rnp) = &(sp)->node[0]; \
(rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
#define rcu_for_each_node_breadth_first(rnp) \
srcu_for_each_node_breadth_first(&rcu_state, rnp)
/*
* Do a breadth-first scan of the non-leaf rcu_node structures for the
* specified rcu_state structure. Note that if there is a singleton
* rcu_node tree with but one rcu_node structure, this loop is a no-op.
* Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
* Note that if there is a singleton rcu_node tree with but one rcu_node
* structure, this loop -will- visit the rcu_node structure. It is still
* a leaf node, even if it is also the root node.
*/
#define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \
for ((rnp) = &(rsp)->node[0]; !rcu_is_leaf_node(rsp, rnp); (rnp)++)
/*
* Scan the leaves of the rcu_node hierarchy for the specified rcu_state
* structure. Note that if there is a singleton rcu_node tree with but
* one rcu_node structure, this loop -will- visit the rcu_node structure.
* It is still a leaf node, even if it is also the root node.
*/
#define rcu_for_each_leaf_node(rsp, rnp) \
for ((rnp) = rcu_first_leaf_node(rsp); \
(rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++)
#define rcu_for_each_leaf_node(rnp) \
for ((rnp) = rcu_first_leaf_node(); \
(rnp) < &rcu_state.node[rcu_num_nodes]; (rnp)++)
/*
* Iterate over all possible CPUs in a leaf RCU node.
......@@ -435,6 +434,12 @@ do { \
#endif /* #if defined(SRCU) || !defined(TINY_RCU) */
#ifdef CONFIG_SRCU
void srcu_init(void);
#else /* #ifdef CONFIG_SRCU */
static inline void srcu_init(void) { }
#endif /* #else #ifdef CONFIG_SRCU */
#ifdef CONFIG_TINY_RCU
/* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
static inline bool rcu_gp_is_normal(void) { return true; }
......@@ -515,29 +520,19 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
#ifdef CONFIG_TINY_RCU
static inline unsigned long rcu_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_bh_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_sched_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed_sched(void) { return 0; }
static inline unsigned long
srcu_batches_completed(struct srcu_struct *sp) { return 0; }
static inline void rcu_force_quiescent_state(void) { }
static inline void rcu_bh_force_quiescent_state(void) { }
static inline void rcu_sched_force_quiescent_state(void) { }
static inline void show_rcu_gp_kthreads(void) { }
static inline int rcu_get_gp_kthreads_prio(void) { return 0; }
#else /* #ifdef CONFIG_TINY_RCU */
unsigned long rcu_get_gp_seq(void);
unsigned long rcu_bh_get_gp_seq(void);
unsigned long rcu_sched_get_gp_seq(void);
unsigned long rcu_exp_batches_completed(void);
unsigned long rcu_exp_batches_completed_sched(void);
unsigned long srcu_batches_completed(struct srcu_struct *sp);
void show_rcu_gp_kthreads(void);
int rcu_get_gp_kthreads_prio(void);
void rcu_force_quiescent_state(void);
void rcu_bh_force_quiescent_state(void);
void rcu_sched_force_quiescent_state(void);
extern struct workqueue_struct *rcu_gp_wq;
extern struct workqueue_struct *rcu_par_gp_wq;
#endif /* #else #ifdef CONFIG_TINY_RCU */
......
......@@ -189,36 +189,6 @@ static struct rcu_perf_ops rcu_ops = {
.name = "rcu"
};
/*
* Definitions for rcu_bh perf testing.
*/
static int rcu_bh_perf_read_lock(void) __acquires(RCU_BH)
{
rcu_read_lock_bh();
return 0;
}
static void rcu_bh_perf_read_unlock(int idx) __releases(RCU_BH)
{
rcu_read_unlock_bh();
}
static struct rcu_perf_ops rcu_bh_ops = {
.ptype = RCU_BH_FLAVOR,
.init = rcu_sync_perf_init,
.readlock = rcu_bh_perf_read_lock,
.readunlock = rcu_bh_perf_read_unlock,
.get_gp_seq = rcu_bh_get_gp_seq,
.gp_diff = rcu_seq_diff,
.exp_completed = rcu_exp_batches_completed_sched,
.async = call_rcu_bh,
.gp_barrier = rcu_barrier_bh,
.sync = synchronize_rcu_bh,
.exp_sync = synchronize_rcu_bh_expedited,
.name = "rcu_bh"
};
/*
* Definitions for srcu perf testing.
*/
......@@ -305,36 +275,6 @@ static struct rcu_perf_ops srcud_ops = {
.name = "srcud"
};
/*
* Definitions for sched perf testing.
*/
static int sched_perf_read_lock(void)
{
preempt_disable();
return 0;
}
static void sched_perf_read_unlock(int idx)
{
preempt_enable();
}
static struct rcu_perf_ops sched_ops = {
.ptype = RCU_SCHED_FLAVOR,
.init = rcu_sync_perf_init,
.readlock = sched_perf_read_lock,
.readunlock = sched_perf_read_unlock,
.get_gp_seq = rcu_sched_get_gp_seq,
.gp_diff = rcu_seq_diff,
.exp_completed = rcu_exp_batches_completed_sched,
.async = call_rcu_sched,
.gp_barrier = rcu_barrier_sched,
.sync = synchronize_sched,
.exp_sync = synchronize_sched_expedited,
.name = "sched"
};
/*
* Definitions for RCU-tasks perf testing.
*/
......@@ -611,7 +551,7 @@ rcu_perf_cleanup(void)
kfree(writer_n_durations);
}
/* Do flavor-specific cleanup operations. */
/* Do torture-type-specific cleanup operations. */
if (cur_ops->cleanup != NULL)
cur_ops->cleanup();
......@@ -661,8 +601,7 @@ rcu_perf_init(void)
long i;
int firsterr = 0;
static struct rcu_perf_ops *perf_ops[] = {
&rcu_ops, &rcu_bh_ops, &srcu_ops, &srcud_ops, &sched_ops,
&tasks_ops,
&rcu_ops, &srcu_ops, &srcud_ops, &tasks_ops,
};
if (!torture_init_begin(perf_type, verbose))
......@@ -680,6 +619,7 @@ rcu_perf_init(void)
for (i = 0; i < ARRAY_SIZE(perf_ops); i++)
pr_cont(" %s", perf_ops[i]->name);
pr_cont("\n");
WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST));
firsterr = -EINVAL;
goto unwind;
}
......
This diff is collapsed.
......@@ -34,6 +34,8 @@
#include "rcu.h"
int rcu_scheduler_active __read_mostly;
static LIST_HEAD(srcu_boot_list);
static bool srcu_init_done;
static int init_srcu_struct_fields(struct srcu_struct *sp)
{
......@@ -46,6 +48,7 @@ static int init_srcu_struct_fields(struct srcu_struct *sp)
sp->srcu_gp_waiting = false;
sp->srcu_idx = 0;
INIT_WORK(&sp->srcu_work, srcu_drive_gp);
INIT_LIST_HEAD(&sp->srcu_work.entry);
return 0;
}
......@@ -179,8 +182,12 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
*sp->srcu_cb_tail = rhp;
sp->srcu_cb_tail = &rhp->next;
local_irq_restore(flags);
if (!READ_ONCE(sp->srcu_gp_running))
schedule_work(&sp->srcu_work);
if (!READ_ONCE(sp->srcu_gp_running)) {
if (likely(srcu_init_done))
schedule_work(&sp->srcu_work);
else if (list_empty(&sp->srcu_work.entry))
list_add(&sp->srcu_work.entry, &srcu_boot_list);
}
}
EXPORT_SYMBOL_GPL(call_srcu);
......@@ -204,3 +211,21 @@ void __init rcu_scheduler_starting(void)
{
rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
}
/*
* Queue work for srcu_struct structures with early boot callbacks.
* The work won't actually execute until the workqueue initialization
* phase that takes place after the scheduler starts.
*/
void __init srcu_init(void)
{
struct srcu_struct *sp;
srcu_init_done = true;
while (!list_empty(&srcu_boot_list)) {
sp = list_first_entry(&srcu_boot_list,
struct srcu_struct, srcu_work.entry);
list_del_init(&sp->srcu_work.entry);
schedule_work(&sp->srcu_work);
}
}
......@@ -51,6 +51,10 @@ module_param(exp_holdoff, ulong, 0444);
static ulong counter_wrap_check = (ULONG_MAX >> 2);
module_param(counter_wrap_check, ulong, 0444);
/* Early-boot callback-management, so early that no lock is required! */
static LIST_HEAD(srcu_boot_list);
static bool __read_mostly srcu_init_done;
static void srcu_invoke_callbacks(struct work_struct *work);
static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay);
static void process_srcu(struct work_struct *work);
......@@ -105,7 +109,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
rcu_init_levelspread(levelspread, num_rcu_lvl);
/* Each pass through this loop initializes one srcu_node structure. */
rcu_for_each_node_breadth_first(sp, snp) {
srcu_for_each_node_breadth_first(sp, snp) {
spin_lock_init(&ACCESS_PRIVATE(snp, lock));
WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) !=
ARRAY_SIZE(snp->srcu_data_have_cbs));
......@@ -235,7 +239,6 @@ static void check_init_srcu_struct(struct srcu_struct *sp)
{
unsigned long flags;
WARN_ON_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INIT);
/* The smp_load_acquire() pairs with the smp_store_release(). */
if (!rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq_needed))) /*^^^*/
return; /* Already initialized. */
......@@ -561,7 +564,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
/* Initiate callback invocation as needed. */
idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
rcu_for_each_node_breadth_first(sp, snp) {
srcu_for_each_node_breadth_first(sp, snp) {
spin_lock_irq_rcu_node(snp);
cbs = false;
last_lvl = snp >= sp->level[rcu_num_lvls - 1];
......@@ -701,7 +704,11 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
rcu_seq_state(sp->srcu_gp_seq) == SRCU_STATE_IDLE) {
WARN_ON_ONCE(ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed));
srcu_gp_start(sp);
queue_delayed_work(rcu_gp_wq, &sp->work, srcu_get_delay(sp));
if (likely(srcu_init_done))
queue_delayed_work(rcu_gp_wq, &sp->work,
srcu_get_delay(sp));
else if (list_empty(&sp->work.work.entry))
list_add(&sp->work.work.entry, &srcu_boot_list);
}
spin_unlock_irqrestore_rcu_node(sp, flags);
}
......@@ -980,7 +987,7 @@ EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);
* There are memory-ordering constraints implied by synchronize_srcu().
* On systems with more than one CPU, when synchronize_srcu() returns,
* each CPU is guaranteed to have executed a full memory barrier since
* the end of its last corresponding SRCU-sched read-side critical section
* the end of its last corresponding SRCU read-side critical section
* whose beginning preceded the call to synchronize_srcu(). In addition,
* each CPU having an SRCU read-side critical section that extends beyond
* the return from synchronize_srcu() is guaranteed to have executed a
......@@ -1308,3 +1315,17 @@ static int __init srcu_bootup_announce(void)
return 0;
}
early_initcall(srcu_bootup_announce);
void __init srcu_init(void)
{
struct srcu_struct *sp;
srcu_init_done = true;
while (!list_empty(&srcu_boot_list)) {
sp = list_first_entry(&srcu_boot_list, struct srcu_struct,
work.work.entry);
check_init_srcu_struct(sp);
list_del_init(&sp->work.work.entry);
queue_work(rcu_gp_wq, &sp->work.work);
}
}
......@@ -46,69 +46,27 @@ struct rcu_ctrlblk {
};
/* Definition for rcupdate control block. */
static struct rcu_ctrlblk rcu_sched_ctrlblk = {
.donetail = &rcu_sched_ctrlblk.rcucblist,
.curtail = &rcu_sched_ctrlblk.rcucblist,
static struct rcu_ctrlblk rcu_ctrlblk = {
.donetail = &rcu_ctrlblk.rcucblist,
.curtail = &rcu_ctrlblk.rcucblist,
};
static struct rcu_ctrlblk rcu_bh_ctrlblk = {
.donetail = &rcu_bh_ctrlblk.rcucblist,
.curtail = &rcu_bh_ctrlblk.rcucblist,
};
void rcu_barrier_bh(void)
{
wait_rcu_gp(call_rcu_bh);
}
EXPORT_SYMBOL(rcu_barrier_bh);
void rcu_barrier_sched(void)
{
wait_rcu_gp(call_rcu_sched);
}
EXPORT_SYMBOL(rcu_barrier_sched);
/*
* Helper function for rcu_sched_qs() and rcu_bh_qs().
* Also irqs are disabled to avoid confusion due to interrupt handlers
* invoking call_rcu().
*/
static int rcu_qsctr_help(struct rcu_ctrlblk *rcp)
{
if (rcp->donetail != rcp->curtail) {
rcp->donetail = rcp->curtail;
return 1;
}
return 0;
}
/*
* Record an rcu quiescent state. And an rcu_bh quiescent state while we
* are at it, given that any rcu quiescent state is also an rcu_bh
* quiescent state. Use "+" instead of "||" to defeat short circuiting.
*/
void rcu_sched_qs(void)
void rcu_barrier(void)
{
unsigned long flags;
local_irq_save(flags);
if (rcu_qsctr_help(&rcu_sched_ctrlblk) +
rcu_qsctr_help(&rcu_bh_ctrlblk))
raise_softirq(RCU_SOFTIRQ);
local_irq_restore(flags);
wait_rcu_gp(call_rcu);
}
EXPORT_SYMBOL(rcu_barrier);
/*
* Record an rcu_bh quiescent state.
*/
void rcu_bh_qs(void)
/* Record an rcu quiescent state. */
void rcu_qs(void)
{
unsigned long flags;
local_irq_save(flags);
if (rcu_qsctr_help(&rcu_bh_ctrlblk))
if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) {
rcu_ctrlblk.donetail = rcu_ctrlblk.curtail;
raise_softirq(RCU_SOFTIRQ);
}
local_irq_restore(flags);
}
......@@ -120,34 +78,33 @@ void rcu_bh_qs(void)
*/
void rcu_check_callbacks(int user)
{
if (user)
rcu_sched_qs();
if (user || !in_softirq())
rcu_bh_qs();
if (user) {
rcu_qs();
} else if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) {
set_tsk_need_resched(current);
set_preempt_need_resched();
}
}
/*
* Invoke the RCU callbacks on the specified rcu_ctrlkblk structure
* whose grace period has elapsed.
*/
static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
/* Invoke the RCU callbacks whose grace period has elapsed. */
static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused)
{
struct rcu_head *next, *list;
unsigned long flags;
/* Move the ready-to-invoke callbacks to a local list. */
local_irq_save(flags);
if (rcp->donetail == &rcp->rcucblist) {
if (rcu_ctrlblk.donetail == &rcu_ctrlblk.rcucblist) {
/* No callbacks ready, so just leave. */
local_irq_restore(flags);
return;
}
list = rcp->rcucblist;
rcp->rcucblist = *rcp->donetail;
*rcp->donetail = NULL;
if (rcp->curtail == rcp->donetail)
rcp->curtail = &rcp->rcucblist;
rcp->donetail = &rcp->rcucblist;
list = rcu_ctrlblk.rcucblist;
rcu_ctrlblk.rcucblist = *rcu_ctrlblk.donetail;
*rcu_ctrlblk.donetail = NULL;
if (rcu_ctrlblk.curtail == rcu_ctrlblk.donetail)
rcu_ctrlblk.curtail = &rcu_ctrlblk.rcucblist;
rcu_ctrlblk.donetail = &rcu_ctrlblk.rcucblist;
local_irq_restore(flags);
/* Invoke the callbacks on the local list. */
......@@ -162,37 +119,31 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
}
}
static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused)
{
__rcu_process_callbacks(&rcu_sched_ctrlblk);
__rcu_process_callbacks(&rcu_bh_ctrlblk);
}
/*
* Wait for a grace period to elapse. But it is illegal to invoke
* synchronize_sched() from within an RCU read-side critical section.
* Therefore, any legal call to synchronize_sched() is a quiescent
* state, and so on a UP system, synchronize_sched() need do nothing.
* Ditto for synchronize_rcu_bh(). (But Lai Jiangshan points out the
* benefits of doing might_sleep() to reduce latency.)
* synchronize_rcu() from within an RCU read-side critical section.
* Therefore, any legal call to synchronize_rcu() is a quiescent
* state, and so on a UP system, synchronize_rcu() need do nothing.
* (But Lai Jiangshan points out the benefits of doing might_sleep()
* to reduce latency.)
*
* Cool, huh? (Due to Josh Triplett.)
*/
void synchronize_sched(void)
void synchronize_rcu(void)
{
RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) ||
lock_is_held(&rcu_lock_map) ||
lock_is_held(&rcu_sched_lock_map),
"Illegal synchronize_sched() in RCU read-side critical section");
"Illegal synchronize_rcu() in RCU read-side critical section");
}
EXPORT_SYMBOL_GPL(synchronize_sched);
EXPORT_SYMBOL_GPL(synchronize_rcu);
/*
* Helper function for call_rcu() and call_rcu_bh().
* Post an RCU callback to be invoked after the end of an RCU grace
* period. But since we have but one CPU, that would be after any
* quiescent state.
*/
static void __call_rcu(struct rcu_head *head,
rcu_callback_t func,
struct rcu_ctrlblk *rcp)
void call_rcu(struct rcu_head *head, rcu_callback_t func)
{
unsigned long flags;
......@@ -201,39 +152,20 @@ static void __call_rcu(struct rcu_head *head,
head->next = NULL;
local_irq_save(flags);
*rcp->curtail = head;
rcp->curtail = &head->next;
*rcu_ctrlblk.curtail = head;
rcu_ctrlblk.curtail = &head->next;
local_irq_restore(flags);
if (unlikely(is_idle_task(current))) {
/* force scheduling for rcu_sched_qs() */
/* force scheduling for rcu_qs() */
resched_cpu(0);
}
}
/*
* Post an RCU callback to be invoked after the end of an RCU-sched grace
* period. But since we have but one CPU, that would be after any
* quiescent state.
*/
void call_rcu_sched(struct rcu_head *head, rcu_callback_t func)
{
__call_rcu(head, func, &rcu_sched_ctrlblk);
}
EXPORT_SYMBOL_GPL(call_rcu_sched);
/*
* Post an RCU bottom-half callback to be invoked after any subsequent
* quiescent state.
*/
void call_rcu_bh(struct rcu_head *head, rcu_callback_t func)
{
__call_rcu(head, func, &rcu_bh_ctrlblk);
}
EXPORT_SYMBOL_GPL(call_rcu_bh);
EXPORT_SYMBOL_GPL(call_rcu);
void __init rcu_init(void)
{
open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
rcu_early_boot_tests();
srcu_init();
}
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -203,11 +203,7 @@ void rcu_test_sync_prims(void)
if (!IS_ENABLED(CONFIG_PROVE_RCU))
return;
synchronize_rcu();
synchronize_rcu_bh();
synchronize_sched();
synchronize_rcu_expedited();
synchronize_rcu_bh_expedited();
synchronize_sched_expedited();
}
#if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU)
......@@ -298,7 +294,7 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_held);
*
* Check debug_lockdep_rcu_enabled() to prevent false positives during boot.
*
* Note that rcu_read_lock() is disallowed if the CPU is either idle or
* Note that rcu_read_lock_bh() is disallowed if the CPU is either idle or
* offline from an RCU perspective, so check for those as well.
*/
int rcu_read_lock_bh_held(void)
......@@ -336,7 +332,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
int i;
int j;
/* Initialize and register callbacks for each flavor specified. */
/* Initialize and register callbacks for each crcu_array element. */
for (i = 0; i < n; i++) {
if (checktiny &&
(crcu_array[i] == call_rcu ||
......@@ -472,6 +468,7 @@ int rcu_jiffies_till_stall_check(void)
}
return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
}
EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
void rcu_sysrq_start(void)
{
......@@ -701,19 +698,19 @@ static int __noreturn rcu_tasks_kthread(void *arg)
/*
* Wait for all pre-existing t->on_rq and t->nvcsw
* transitions to complete. Invoking synchronize_sched()
* transitions to complete. Invoking synchronize_rcu()
* suffices because all these transitions occur with
* interrupts disabled. Without this synchronize_sched(),
* interrupts disabled. Without this synchronize_rcu(),
* a read-side critical section that started before the
* grace period might be incorrectly seen as having started
* after the grace period.
*
* This synchronize_sched() also dispenses with the
* This synchronize_rcu() also dispenses with the
* need for a memory barrier on the first store to
* ->rcu_tasks_holdout, as it forces the store to happen
* after the beginning of the grace period.
*/
synchronize_sched();
synchronize_rcu();
/*
* There were callbacks, so we need to wait for an
......@@ -740,7 +737,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
* This does only part of the job, ensuring that all
* tasks that were previously exiting reach the point
* where they have disabled preemption, allowing the
* later synchronize_sched() to finish the job.
* later synchronize_rcu() to finish the job.
*/
synchronize_srcu(&tasks_rcu_exit_srcu);
......@@ -790,20 +787,20 @@ static int __noreturn rcu_tasks_kthread(void *arg)
* cause their RCU-tasks read-side critical sections to
* extend past the end of the grace period. However,
* because these ->nvcsw updates are carried out with
* interrupts disabled, we can use synchronize_sched()
* interrupts disabled, we can use synchronize_rcu()
* to force the needed ordering on all such CPUs.
*
* This synchronize_sched() also confines all
* This synchronize_rcu() also confines all
* ->rcu_tasks_holdout accesses to be within the grace
* period, avoiding the need for memory barriers for
* ->rcu_tasks_holdout accesses.
*
* In addition, this synchronize_sched() waits for exiting
* In addition, this synchronize_rcu() waits for exiting
* tasks to complete their final preempt_disable() region
* of execution, cleaning up after the synchronize_srcu()
* above.
*/
synchronize_sched();
synchronize_rcu();
/* Invoke the callbacks. */
while (list) {
......@@ -870,15 +867,10 @@ static void __init rcu_tasks_bootup_oddness(void)
#ifdef CONFIG_PROVE_RCU
/*
* Early boot self test parameters, one for each flavor
* Early boot self test parameters.
*/
static bool rcu_self_test;
static bool rcu_self_test_bh;
static bool rcu_self_test_sched;
module_param(rcu_self_test, bool, 0444);
module_param(rcu_self_test_bh, bool, 0444);
module_param(rcu_self_test_sched, bool, 0444);
static int rcu_self_test_counter;
......@@ -888,25 +880,16 @@ static void test_callback(struct rcu_head *r)
pr_info("RCU test callback executed %d\n", rcu_self_test_counter);
}
DEFINE_STATIC_SRCU(early_srcu);
static void early_boot_test_call_rcu(void)
{
static struct rcu_head head;
static struct rcu_head shead;
call_rcu(&head, test_callback);
}
static void early_boot_test_call_rcu_bh(void)
{
static struct rcu_head head;
call_rcu_bh(&head, test_callback);
}
static void early_boot_test_call_rcu_sched(void)
{
static struct rcu_head head;
call_rcu_sched(&head, test_callback);
if (IS_ENABLED(CONFIG_SRCU))
call_srcu(&early_srcu, &shead, test_callback);
}
void rcu_early_boot_tests(void)
......@@ -915,10 +898,6 @@ void rcu_early_boot_tests(void)
if (rcu_self_test)
early_boot_test_call_rcu();
if (rcu_self_test_bh)
early_boot_test_call_rcu_bh();
if (rcu_self_test_sched)
early_boot_test_call_rcu_sched();
rcu_test_sync_prims();
}
......@@ -930,16 +909,11 @@ static int rcu_verify_early_boot_tests(void)
if (rcu_self_test) {
early_boot_test_counter++;
rcu_barrier();
if (IS_ENABLED(CONFIG_SRCU)) {
early_boot_test_counter++;
srcu_barrier(&early_srcu);
}
}
if (rcu_self_test_bh) {
early_boot_test_counter++;
rcu_barrier_bh();
}
if (rcu_self_test_sched) {
early_boot_test_counter++;
rcu_barrier_sched();
}
if (rcu_self_test_counter != early_boot_test_counter) {
WARN_ON(1);
ret = -1;
......
......@@ -301,7 +301,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
pending >>= softirq_bit;
}
rcu_bh_qs();
if (__this_cpu_read(ksoftirqd) == current)
rcu_softirq_qs();
local_irq_disable();
pending = local_softirq_pending();
......
......@@ -573,7 +573,7 @@ static int stutter;
* Block until the stutter interval ends. This must be called periodically
* by all running kthreads that need to be subject to stuttering.
*/
void stutter_wait(const char *title)
bool stutter_wait(const char *title)
{
int spt;
......@@ -590,6 +590,7 @@ void stutter_wait(const char *title)
}
torture_shutdown_absorb(title);
}
return !!spt;
}
EXPORT_SYMBOL_GPL(stutter_wait);
......
......@@ -120,7 +120,6 @@ then
parse-build.sh $resdir/Make.out $title
else
# Build failed.
cp $builddir/Make*.out $resdir
cp $builddir/.config $resdir || :
echo Build failed, not running KVM, see $resdir.
if test -f $builddir.wait
......
......@@ -3,9 +3,7 @@ TREE02
TREE03
TREE04
TREE05
TREE06
TREE07
TREE08
TREE09
SRCU-N
SRCU-P
......
rcutorture.torture_type=srcud
rcupdate.rcu_self_test=1
rcutorture.torture_type=srcud
rcupdate.rcu_self_test=1
rcupdate.rcu_self_test=1
rcupdate.rcu_self_test_bh=1
rcutorture.torture_type=rcu_bh
rcutorture.torture_type=rcu_bh maxcpus=8 nr_cpus=43
maxcpus=8 nr_cpus=43
rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3
......
rcutorture.torture_type=rcu_bh rcutree.rcu_fanout_leaf=4 nohz_full=1-7
rcutree.rcu_fanout_leaf=4 nohz_full=1-7
rcutorture.torture_type=sched
rcupdate.rcu_self_test_sched=1
rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3
rcupdate.rcu_self_test=1
rcupdate.rcu_self_test=1
rcupdate.rcu_self_test_bh=1
rcupdate.rcu_self_test_sched=1
rcutree.rcu_fanout_exact=1
rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3
......
rcutorture.torture_type=sched
rcupdate.rcu_self_test=1
rcupdate.rcu_self_test_sched=1
rcutree.rcu_fanout_exact=1
rcu_nocbs=0-7
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment