    rcu: Fix grace-period hangs due to race with CPU offline · 1e64b15a
    Paul E. McKenney authored
    Without special fail-safe quiescent-state-propagation checks, grace-period
    hangs can result from the following scenario:
    
    1.	CPU 1 goes offline.
    
    2.	Because CPU 1 is the only CPU in the system blocking the current
    	grace period, the grace period ends as soon as
    	rcu_cleanup_dying_idle_cpu()'s call to rcu_report_qs_rnp()
    	returns.
    
    3.	At this point, the leaf rcu_node structure's ->lock is no longer
    	held: rcu_report_qs_rnp() has released it, as it must in order
    	to awaken the RCU grace-period kthread.
    
    4.	At this point, that same leaf rcu_node structure's ->qsmaskinitnext
    	field still records CPU 1 as being online.  This is absolutely
    	necessary because the scheduler uses RCU (in this case on the
    	wake-up path while awakening RCU's grace-period kthread), and
    	->qsmaskinitnext contains RCU's idea as to which CPUs are online.
    	Therefore, invoking rcu_report_qs_rnp() after clearing CPU 1's
    	bit from ->qsmaskinitnext would result in a lockdep-RCU splat
    	due to RCU being used from an offline CPU.
    
    5.	RCU's grace-period kthread awakens, sees that the old grace period
    	has completed and that a new one is needed.  It therefore starts
    	a new grace period, but because CPU 1's leaf rcu_node structure's
    	->qsmaskinitnext field still shows CPU 1 as being online, this new
    	grace period is initialized to wait for a quiescent state from the
    	now-offline CPU 1.
    
    6.	Without the fail-safe force-quiescent-state checks, there would
    	be no quiescent state from the now-offline CPU 1, which would
    	eventually result in RCU CPU stall warnings and memory exhaustion.
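    
    For illustration only, here is a self-contained C fragment that walks the
    above interleaving sequentially.  The names (struct leaf_rnp, qsmask,
    qsmaskinitnext) are simplified stand-ins loosely modeled on the kernel's
    rcu_node fields, not the actual implementation:
    
    	/* Hypothetical sequential model of the race; not kernel code. */
    	#include <stdio.h>
    
    	struct leaf_rnp {
    		unsigned long qsmask;		/* CPUs blocking the current GP. */
    		unsigned long qsmaskinitnext;	/* RCU's idea of online CPUs. */
    	};
    
    	int main(void)
    	{
    		struct leaf_rnp rnp = {
    			.qsmask = 1UL << 1,		/* CPU 1 blocks the GP. */
    			.qsmaskinitnext = 1UL << 1,	/* CPU 1 counted online. */
    		};
    
    		/* Steps 1-3: CPU 1 goes offline and reports its quiescent
    		 * state, ending the grace period; the leaf ->lock is dropped
    		 * so that the grace-period kthread can be awakened. */
    		rnp.qsmask &= ~(1UL << 1);
    
    		/* Step 4: ->qsmaskinitnext still shows CPU 1 online, because
    		 * the wakeup path itself uses RCU. */
    
    		/* Step 5: the grace-period kthread starts a new grace period
    		 * and seeds ->qsmask from the stale ->qsmaskinitnext... */
    		rnp.qsmask = rnp.qsmaskinitnext;
    
    		/* Step 6: ...so the new grace period waits on offline CPU 1. */
    		if (rnp.qsmask & (1UL << 1))
    			printf("new GP waiting on offline CPU 1: hang\n");
    		return 0;
    	}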
    
    It would be good to get rid of the special fail-safe quiescent-state
    propagation checks, and thus it would be good to fix things so that
    the above scenario cannot happen.  This commit therefore adds a new
    ->ofl_lock to the rcu_state structure.  This lock is held by rcu_gp_init()
    across the applying of buffered online and offline operations to the
    rcu_node tree, and it is also held by rcu_cleanup_dying_idle_cpu()
    when buffering a new offline operation.  This prevents rcu_gp_init()
    from acquiring the leaf rcu_node structure's ->lock during the interval
    between the point at which rcu_cleanup_dying_idle_cpu() invokes
    rcu_report_qs_rnp(), which releases ->lock, and the point at which that
    same lock is later re-acquired.
    This in turn prevents the failure scenario outlined above, and will
    hopefully eventually allow removal of the offline-CPU checks from the
    force-quiescent-state code path.
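    
    As a rough sketch of the intended lock ordering, the following
    self-contained pthread program mirrors the rule described above: both
    the path that applies buffered online/offline operations and the path
    that buffers a new offline operation take the outer lock first.  The
    names (ofl_lock, leaf_lock, qsmaskinitnext, offline_cpu1, gp_init)
    follow the description above but are illustrative stand-ins, not the
    kernel's actual spinlocks or function signatures:
    
    	/* Hypothetical userspace model of the ->ofl_lock rule. */
    	#include <pthread.h>
    	#include <stdio.h>
    
    	static pthread_mutex_t ofl_lock = PTHREAD_MUTEX_INITIALIZER;
    	static pthread_mutex_t leaf_lock = PTHREAD_MUTEX_INITIALIZER;
    	static unsigned long qsmaskinitnext = 1UL << 1;	/* CPU 1 online. */
    
    	/* Models rcu_cleanup_dying_idle_cpu() buffering an offline op. */
    	static void *offline_cpu1(void *arg)
    	{
    		(void)arg;
    		pthread_mutex_lock(&ofl_lock);
    		pthread_mutex_lock(&leaf_lock);
    		/* Report the quiescent state first, which drops the leaf
    		 * lock to awaken the grace-period kthread (step 3 above). */
    		pthread_mutex_unlock(&leaf_lock);
    		pthread_mutex_lock(&leaf_lock);
    		qsmaskinitnext &= ~(1UL << 1);	/* Buffer the offline op. */
    		pthread_mutex_unlock(&leaf_lock);
    		pthread_mutex_unlock(&ofl_lock);
    		return NULL;
    	}
    
    	/* Models rcu_gp_init() applying buffered online/offline ops. */
    	static void *gp_init(void *arg)
    	{
    		(void)arg;
    		pthread_mutex_lock(&ofl_lock);	/* Excluded from the window. */
    		pthread_mutex_lock(&leaf_lock);
    		printf("GP init sees qsmaskinitnext = %#lx\n", qsmaskinitnext);
    		pthread_mutex_unlock(&leaf_lock);
    		pthread_mutex_unlock(&ofl_lock);
    		return NULL;
    	}
    
    	int main(void)
    	{
    		pthread_t t1, t2;
    
    		pthread_create(&t1, NULL, offline_cpu1, NULL);
    		pthread_create(&t2, NULL, gp_init, NULL);
    		pthread_join(t1, NULL);
    		pthread_join(t2, NULL);
    		return 0;
    	}
    
    Because the offline path holds the outer lock across the release and
    re-acquisition of the leaf lock, the initialization path can sample
    ->qsmaskinitnext only before the offline operation begins or after it
    completes, never inside the window described above.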
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>