Commit 389abd48 authored by Paul E. McKenney's avatar Paul E. McKenney

rcu: Avoid RCU-preempt expedited grace-period botch

Because rcu_read_unlock_special() samples rcu_preempted_readers_exp(rnp)
after dropping rnp->lock, the following sequence of events is possible:

1.	Task A exits its RCU read-side critical section, and removes
	itself from the ->blkd_tasks list, releases rnp->lock, and is
	then preempted.  Task B remains on the ->blkd_tasks list, and
	blocks the current expedited grace period.

2.	Task B exits from its RCU read-side critical section and removes
	itself from the ->blkd_tasks list.  Because it is the last task
	blocking the current expedited grace period, it ends that
	expedited grace period.

3.	Task A resumes, and samples rcu_preempted_readers_exp(rnp) which
	of course indicates that nothing is blocking the nonexistent
	expedited grace period. Task A is again preempted.

4.	Some other CPU starts an expedited grace period.  There are several
	tasks blocking this expedited grace period queued on the
	same rcu_node structure that Task A was using in step 1 above.

5.	Task A examines its state and incorrectly concludes that it was
	the last task blocking the expedited grace period on the current
	rcu_node structure.  It therefore reports completion up the
	rcu_node tree.

6.	The expedited grace period can then incorrectly complete before
	the tasks blocked on this same rcu_node structure exit their
	RCU read-side critical sections.  Arbitrarily bad things happen.

This commit therefore takes a snapshot of rcu_preempted_readers_exp(rnp)
prior to dropping the lock, so that only the last task thinks that it is
the last task, thus avoiding the failure scenario laid out above.
Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
parent af446b70
...@@ -312,6 +312,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) ...@@ -312,6 +312,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
{ {
int empty; int empty;
int empty_exp; int empty_exp;
int empty_exp_now;
unsigned long flags; unsigned long flags;
struct list_head *np; struct list_head *np;
#ifdef CONFIG_RCU_BOOST #ifdef CONFIG_RCU_BOOST
...@@ -382,8 +383,10 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) ...@@ -382,8 +383,10 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
/* /*
* If this was the last task on the current list, and if * If this was the last task on the current list, and if
* we aren't waiting on any CPUs, report the quiescent state. * we aren't waiting on any CPUs, report the quiescent state.
* Note that rcu_report_unblock_qs_rnp() releases rnp->lock. * Note that rcu_report_unblock_qs_rnp() releases rnp->lock,
* so we must take a snapshot of the expedited state.
*/ */
empty_exp_now = !rcu_preempted_readers_exp(rnp);
if (!empty && !rcu_preempt_blocked_readers_cgp(rnp)) { if (!empty && !rcu_preempt_blocked_readers_cgp(rnp)) {
trace_rcu_quiescent_state_report("preempt_rcu", trace_rcu_quiescent_state_report("preempt_rcu",
rnp->gpnum, rnp->gpnum,
...@@ -406,7 +409,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) ...@@ -406,7 +409,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t)
* If this was the last task on the expedited lists, * If this was the last task on the expedited lists,
* then we need to report up the rcu_node hierarchy. * then we need to report up the rcu_node hierarchy.
*/ */
if (!empty_exp && !rcu_preempted_readers_exp(rnp)) if (!empty_exp && empty_exp_now)
rcu_report_exp_rnp(&rcu_preempt_state, rnp); rcu_report_exp_rnp(&rcu_preempt_state, rnp);
} else { } else {
local_irq_restore(flags); local_irq_restore(flags);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment