1. 09 Dec, 2021 20 commits
• Merge branches 'doc.2021.11.30c', 'exp.2021.12.07a', 'fastnohz.2021.11.30c',... · f80fe66c
      Paul E. McKenney authored
      Merge branches 'doc.2021.11.30c', 'exp.2021.12.07a', 'fastnohz.2021.11.30c', 'fixes.2021.11.30c', 'nocb.2021.12.09a', 'nolibc.2021.11.30c', 'tasks.2021.12.09a', 'torture.2021.12.07a' and 'torturescript.2021.11.30c' into HEAD
      
      doc.2021.11.30c: Documentation updates.
      exp.2021.12.07a: Expedited-grace-period fixes.
      fastnohz.2021.11.30c: Remove CONFIG_RCU_FAST_NO_HZ.
      fixes.2021.11.30c: Miscellaneous fixes.
      nocb.2021.12.09a: No-CB CPU updates.
      nolibc.2021.11.30c: Tiny in-kernel library updates.
      tasks.2021.12.09a: RCU-tasks updates, including update-side scalability.
      torture.2021.12.07a: Torture-test in-kernel module updates.
      torturescript.2021.11.30c: Torture-test scripting updates.
• rcu/nocb: Merge rcu_spawn_cpu_nocb_kthread() and rcu_spawn_one_nocb_kthread() · 10d47031
      Frederic Weisbecker authored
      The rcu_spawn_one_nocb_kthread() function is called only from
      rcu_spawn_cpu_nocb_kthread().  Therefore, inline the former into
      the latter, saving a few lines of code.
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Joel Fernandes <joel@joelfernandes.org>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu/nocb: Allow empty "rcu_nocbs" kernel parameter · d2cf0854
      Frederic Weisbecker authored
      Allow the rcu_nocbs kernel parameter to be specified just by itself,
      without specifying any CPUs.  This allows systems administrators to use
      "rcu_nocbs" to specify that none of the CPUs are to be offloaded at boot
time, but that any of them may be offloaded at runtime via cpusets.
      
      In contrast, if the "rcu_nocbs" or "nohz_full" kernel parameters are not
      specified at all, then not only are none of the CPUs offloaded at boot,
      none of them can be offloaded at runtime, either.
      
      While in the area, modernize the description of the "rcuo" kthreads'
      naming scheme.
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Joel Fernandes <joel@joelfernandes.org>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu/nocb: Create kthreads on all CPUs if "rcu_nocbs=" or "nohz_full=" are passed · 2cf4528d
      Frederic Weisbecker authored
      In order to be able to (de-)offload any CPU using cpusets in the future,
      create the NOCB data structures for all possible CPUs.  For now this is
      done only as long as the "rcu_nocbs=" or "nohz_full=" kernel parameters
      are passed to avoid the unnecessary overhead for most users.
      
      Note that the rcuog and rcuoc kthreads are not created until at least
      one of the corresponding CPUs comes online.  This approach avoids the
      creation of excess kthreads when firmware lies about the number of CPUs
      present on the system.
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Joel Fernandes <joel@joelfernandes.org>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu/nocb: Optimize kthreads and rdp initialization · a81aeaf7
      Frederic Weisbecker authored
Currently, cpumask_available() is used to prevent unwanted NOCB
initialization.  However, if neither the "rcu_nocbs=" nor the "nohz_full="
parameter is passed to a kernel built with CONFIG_CPUMASK_OFFSTACK=n,
the initialization path is still taken, running through all sorts of
needless operations and iterations on an empty cpumask.
      
      Fix this by relying on a real initialization state instead.  This also
      optimizes kthread creation, preventing needless iteration over all online
      CPUs when the kernel is booted without any offloaded CPUs.
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Joel Fernandes <joel@joelfernandes.org>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu/nocb: Prepare nocb_cb_wait() to start with a non-offloaded rdp · 8d970396
      Frederic Weisbecker authored
      In order to be able to toggle the offloaded state from cpusets, a nocb
      kthread will need to be created for all possible CPUs whenever either
      of the "rcu_nocbs=" or "nohz_full=" parameters are specified.
      
      Therefore, the nocb_cb_wait() kthread must be prepared to start running
      on a de-offloaded rdp.  To accomplish this, simply move the sleeping
      condition to the beginning of the nocb_cb_wait() function, which prevents
      this kthread from attempting to invoke callbacks before the corresponding
      CPU is offloaded.
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Joel Fernandes <joel@joelfernandes.org>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu/nocb: Remove rcu_node structure from nocb list when de-offloaded · 2ebc45c4
      Frederic Weisbecker authored
      The nocb_gp_wait() function iterates over all CPUs in its group,
      including even those CPUs that have been de-offloaded.  This is of
      course suboptimal, especially if none of the CPUs within the group are
      currently offloaded.  This will become even more of a problem once a
      nocb kthread is created for all possible CPUs.
      
Therefore, use a standard doubly linked list to link all the offloaded
rcu_data structures, and safely add or delete these structures as they
are offloaded or de-offloaded, respectively.
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Joel Fernandes <joel@joelfernandes.org>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Use fewer callbacks queues if callback flood ends · fd796e41
      Paul E. McKenney authored
      By default, when lock contention is encountered, the RCU Tasks flavors
      of RCU switch to using per-CPU queueing.  However, if the callback
      flood ends, per-CPU queueing continues to be used, which introduces
      significant additional overhead, especially for callback invocation,
      which fans out a series of workqueue handlers.
      
      This commit therefore switches back to single-queue operation if at the
      beginning of a grace period there are very few callbacks.  The definition
      of "very few" is set by the rcupdate.rcu_task_collapse_lim module
      parameter, which defaults to 10.  This switch happens in two phases,
      with the first phase causing future callbacks to be enqueued on CPU 0's
      queue, but with all queues continuing to be checked for grace periods
      and callback invocation.  The second phase checks to see if an RCU grace
      period has elapsed and if all remaining RCU-Tasks callbacks are queued
      on CPU 0.  If so, only CPU 0 is checked for future grace periods and
      callback operation.
      
      Of course, the return of contention anywhere during this process will
      result in returning to per-CPU callback queueing.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Use separate ->percpu_dequeue_lim for callback dequeueing · 2cee0789
      Paul E. McKenney authored
      Decreasing the number of callback queues is a bit tricky because it
      is necessary to handle callbacks that were queued before the number of
      queues decreased, but which were not ready to invoke until afterwards.
      This commit takes a first step in this direction by maintaining a separate
      ->percpu_dequeue_lim to control callback dequeueing, in addition to the
      existing ->percpu_enqueue_lim which now controls only enqueueing.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Use more callback queues if contention encountered · ab97152f
      Paul E. McKenney authored
      The rcupdate.rcu_task_enqueue_lim module parameter allows system
      administrators to tune the number of callback queues used by the RCU
Tasks flavors.  However, if callback storms are infrequent, it is better
to operate with a single queue on a given system unless and until that
system actually needs more queues.  Systems not needing
      more queues can then avoid the overhead of checking the extra queues
      and especially avoid the overhead of fanning workqueue handlers out to
      all CPUs to invoke callbacks.
      
      This commit therefore switches to using all the CPUs' callback queues if
      call_rcu_tasks_generic() encounters too much lock contention.  The amount
      of lock contention to tolerate defaults to 100 contended lock acquisitions
      per jiffy, and can be adjusted using the new rcupdate.rcu_task_contend_lim
      module parameter.
      
      Such switching is undertaken only if the rcupdate.rcu_task_enqueue_lim
      module parameter is negative, which is its default value (-1).
      This allows savvy systems administrators to set the number of queues
      to some known good value and to not have to worry about the kernel doing
      any second guessing.
      
      [ paulmck: Apply feedback from Guillaume Tucker and kernelci. ]
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Avoid raw-spinlocked wakeups from call_rcu_tasks_generic() · 3063b33a
      Paul E. McKenney authored
If the caller of call_rcu_tasks(), call_rcu_tasks_rude(),
      or call_rcu_tasks_trace() holds a raw spinlock, and then if
      call_rcu_tasks_generic() determines that the grace-period kthread must
      be awakened, then the wakeup might acquire a normal spinlock while a
      raw spinlock is held.  This results in lockdep splats when the
      kernel is built with CONFIG_PROVE_RAW_LOCK_NESTING=y.
      
      This commit therefore defers the wakeup using irq_work_queue().
      
      It would be nice to directly invoke wakeup when a raw spinlock is not
      held, but there is currently no way to check for this in all kernels.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Count trylocks to estimate call_rcu_tasks() contention · 7d13d30b
      Paul E. McKenney authored
      This commit converts the unconditional raw_spin_lock_rcu_node() lock
      acquisition in call_rcu_tasks_generic() to a trylock followed by an
      unconditional acquisition if the trylock fails.  If the trylock fails,
      the failure is counted, but the count is reset to zero on each new jiffy.
      
      This statistic will be used to determine when to move from a single
      callback queue to per-CPU callback queues.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Add rcupdate.rcu_task_enqueue_lim to set initial queueing · 8610b656
      Paul E. McKenney authored
      This commit adds a rcupdate.rcu_task_enqueue_lim module parameter that
      sets the initial number of callback queues to use for the RCU Tasks
      family of RCU implementations.  This parameter allows testing of various
      fanout values.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Make rcu_barrier_tasks*() handle multiple callback queues · ce9b1c66
      Paul E. McKenney authored
      Currently, rcu_barrier_tasks(), rcu_barrier_tasks_rude(),
      and rcu_barrier_tasks_trace() simply invoke the corresponding
      synchronize_rcu_tasks*() function.  This works because there is only
      one callback queue.
      
      However, there will soon be multiple callback queues.  This commit
      therefore scans the queues currently in use, entraining a callback on
      each non-empty queue.  Sequence numbers and reference counts are used
      to synchronize this process in a manner similar to the approach taken
      by rcu_barrier().
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations · d363f833
      Paul E. McKenney authored
      If there is a flood of callbacks, it is necessary to put multiple
      CPUs to work invoking those callbacks.  This commit therefore uses a
      workqueue-flooding approach to parallelize RCU Tasks callback execution.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Abstract invocations of callbacks · 57881863
      Paul E. McKenney authored
      This commit adds a rcu_tasks_invoke_cbs() function that invokes all
      ready callbacks on all of the per-CPU lists that are currently in use.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Abstract checking of callback lists · 4d1114c0
      Paul E. McKenney authored
      This commit adds a rcu_tasks_need_gpcb() function that returns an
      indication of whether another grace period is required, and if no grace
      period is required, whether there are callbacks that need to be invoked.
      The function scans all per-CPU lists currently in use.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Add a ->percpu_enqueue_lim to the rcu_tasks structure · 8dd593fd
      Paul E. McKenney authored
      This commit adds a ->percpu_enqueue_lim field to the rcu_tasks structure.
      This field contains two to the power of the ->percpu_enqueue_shift
      field, easing construction of iterators over the per-CPU queues that
      might contain RCU Tasks callbacks.  Such iterators will be introduced
      in later commits.
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Inspect stalled task's trc state in locked state · 65b629e7
      Neeraj Upadhyay authored
On an RCU Tasks Trace stall, inspect the RCU-tasks-trace-specific
state of the stalled task while it is locked down, using
try_invoke_on_locked_down_task(), to obtain a reliable trc state for
a non-running stalled task.
      
      This was tested using the following command:
      
      tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --configs TRACE01 \
      --bootargs "rcutorture.torture_type=tasks-tracing rcutorture.stall_cpu=10 \
      rcutorture.stall_cpu_block=1 rcupdate.rcu_task_stall_timeout=100" --trust-make
      
      As expected, this produced the following console output for running and
      sleeping tasks.
      
      [   21.520291] INFO: rcu_tasks_trace detected stalls on tasks:
      [   21.521292] P85: ... nesting: 1N cpu: 2
[   21.521966] task:rcu_torture_sta state:D stack:15080 pid:   85 ppid:     2 flags:0x00004000
      [   21.523384] Call Trace:
      [   21.523808]  __schedule+0x273/0x6e0
      [   21.524428]  schedule+0x35/0xa0
      [   21.524971]  schedule_timeout+0x1ed/0x270
      [   21.525690]  ? del_timer_sync+0x30/0x30
      [   21.526371]  ? rcu_torture_writer+0x720/0x720
      [   21.527106]  rcu_torture_stall+0x24a/0x270
      [   21.527816]  kthread+0x115/0x140
      [   21.528401]  ? set_kthread_struct+0x40/0x40
      [   21.529136]  ret_from_fork+0x22/0x30
      [   21.529766]  1 holdouts
      [   21.632300] INFO: rcu_tasks_trace detected stalls on tasks:
      [   21.632345] rcu_torture_stall end.
      [   21.633293] P85: .
[   21.633294] task:rcu_torture_sta state:R  running task stack:15080 pid:    85 ppid:     2 flags:0x00004000
      [   21.633299] Call Trace:
      [   21.633301]  ? vprintk_emit+0xab/0x180
      [   21.633306]  ? vprintk_emit+0x11a/0x180
      [   21.633308]  ? _printk+0x4d/0x69
      [   21.633311]  ? __default_send_IPI_shortcut+0x1f/0x40
      
      [ paulmck: Update to new v5.16 task_call_func() name. ]
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
• rcu-tasks: Use spin_lock_rcu_node() and friends · 381a4f3b
      Paul E. McKenney authored
      This commit renames the rcu_tasks_percpu structure's ->cbs_pcpu_lock
      to ->lock and then uses spin_lock_rcu_node() and friends to acquire and
      release this lock, preparing for upcoming commits that will spread the
      grace-period process across multiple CPUs and kthreads.
      
      [ paulmck: Apply feedback from kernel test robot. ]
Reported-by: Martin Lau <kafai@fb.com>
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
  2. 08 Dec, 2021 20 commits