• Paul E. McKenney's avatar
    rcu: Reject memory-order-induced stall-warning false positives · 26cdfedf
    Paul E. McKenney authored
    If a system is idle from an RCU perspective for longer than specified
    by CONFIG_RCU_CPU_STALL_TIMEOUT, and if one CPU starts a grace period
    just as a second checks for CPU stalls, and if this second CPU happens
    to see the old value of rsp->jiffies_stall, it will incorrectly report a
    CPU stall.  This is quite rare, but apparently occurs deterministically
    on systems with about 6TB of memory.
    
    This commit therefore orders accesses to the data used to determine
    whether or not a CPU stall is in progress.  Grace-period initialization
    and cleanup first increments rsp->completed to mark the end of the
    previous grace period, then records the current jiffies in rsp->gp_start,
    then records the jiffies at which a stall can be expected to occur in
    rsp->jiffies_stall, and finally increments rsp->gpnum to mark the start
    of the new grace period.  Now, this ordering by itself does not prevent
    false positives.  For example, if grace-period initialization was delayed
    between recording rsp->gp_start and rsp->jiffies_stall, the CPU stall
    warning code might still see an old value of rsp->jiffies_stall.
    
    Therefore, this commit also orders the CPU stall warning accesses as
    well, loading rsp->gpnum and jiffies, then rsp->jiffies_stall, then
    rsp->gp_start, and finally rsp->completed.  This ordering means that
    the false-positive scenario in the previous paragraph would result
    in rsp->completed being greater than or equal to rsp->gpnum, which is
    never valid for a CPU stall, allowing the false positive to be rejected.
    Furthermore, any fetch that gets an old value of rsp->jiffies_stall
    must also get an old value of rsp->gpnum, which will again be rejected
    by the comparison of rsp->gpnum and rsp->completed.  Situations where
    rsp->gp_start is later than rsp->jiffies_stall are also rejected, as
    are situations where jiffies is less than rsp->jiffies_stall.
    
    Although use of unsynchronized accesses means that there are likely
    still some false-positive scenarios (synchronization has proven to be
    a very bad idea on large systems), this should get rid of a large class
    of these scenarios.
    Reported-by: default avatarFabian Herschel <fabian.herschel@suse.com>
    Reported-by: default avatarMichal Hocko <mhocko@suse.com>
    Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
    Tested-by: default avatarJochen Striepe <jochen@tolot.escape.de>
    26cdfedf
rcutree.c 104 KB