Commit 7696f991 authored by Andrea Parri, committed by Ingo Molnar

sched/Documentation: Update wake_up() & co. memory-barrier guarantees

Both the implementation and the users' expectation [1] for the various
wakeup primitives have evolved over time, but the documentation has not
kept up with these changes: this patch brings it into 2018.

[1] http://lkml.kernel.org/r/20180424091510.GB4064@hirez.programming.kicks-ass.net

Also applied feedback from Alan Stern.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Akira Yokosawa <akiyks@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: parri.andrea@gmail.com
Link: http://lkml.kernel.org/r/20180716180605.16115-12-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent 3d85b270
@@ -2179,32 +2179,41 @@ or:
 	event_indicated = 1;
 	wake_up_process(event_daemon);
 
-A write memory barrier is implied by wake_up() and co. if and only if they
-wake something up.  The barrier occurs before the task state is cleared, and so
-sits between the STORE to indicate the event and the STORE to set TASK_RUNNING:
+A general memory barrier is executed by wake_up() if it wakes something up.
+If it doesn't wake anything up then a memory barrier may or may not be
+executed; you must not rely on it.  The barrier occurs before the task state
+is accessed, in particular, it sits between the STORE to indicate the event
+and the STORE to set TASK_RUNNING:
 
-	CPU 1				CPU 2
+	CPU 1 (Sleeper)			CPU 2 (Waker)
 	===============================	===============================
 	set_current_state();		STORE event_indicated
 	  smp_store_mb();		wake_up();
-	    STORE current->state	<write barrier>
-	    <general barrier>		STORE current->state
-	LOAD event_indicated
+	    STORE current->state	  ...
+	    <general barrier>		  <general barrier>
+	LOAD event_indicated		  if ((LOAD task->state) & TASK_NORMAL)
+					    STORE task->state
 
-To repeat, this write memory barrier is present if and only if something
-is actually awakened.  To see this, consider the following sequence of
-events, where X and Y are both initially zero:
+where "task" is the thread being woken up and it equals CPU 1's "current".
+
+To repeat, a general memory barrier is guaranteed to be executed by wake_up()
+if something is actually awakened, but otherwise there is no such guarantee.
+To see this, consider the following sequence of events, where X and Y are both
+initially zero:
 
 	CPU 1				CPU 2
 	===============================	===============================
-	X = 1;				STORE event_indicated
+	X = 1;				Y = 1;
 	smp_mb();			wake_up();
-	Y = 1;				wait_event(wq, Y == 1);
-	wake_up();			  load from Y sees 1, no memory barrier
-					load from X might see 0
+	LOAD Y				LOAD X
 
-In contrast, if a wakeup does occur, CPU 2's load from X would be guaranteed
-to see 1.
+If a wakeup does occur, one (at least) of the two loads must see 1.  If, on
+the other hand, a wakeup does not occur, both loads might see 0.
+
+wake_up_process() always executes a general memory barrier.  The barrier again
+occurs before the task state is accessed.  In particular, if the wake_up() in
+the previous snippet were replaced by a call to wake_up_process() then one of
+the two loads would be guaranteed to see 1.
 
 The available waker functions include:
@@ -2224,6 +2233,8 @@ The available waker functions include:
 	wake_up_poll();
 	wake_up_process();
 
+In terms of memory ordering, these functions all provide the same guarantees of
+a wake_up() (or stronger).
+
 [!] Note that the memory barriers implied by the sleeper and the waker do _not_
 order multiple stores before the wake-up with respect to loads of those stored
......
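The X = 1 / Y = 1 example above can be mimicked in userspace. The following is an editorial sketch in C11, not kernel code: seq_cst fences stand in for smp_mb() and for the full barrier that wake_up_process() would execute, and the names (run_sb_once, cpu1, cpu2) are invented for illustration. With a full barrier on both sides, at least one of the two loads must observe 1.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int X, Y;
static int r1, r2;

static void *cpu1(void *arg)
{
	atomic_store_explicit(&X, 1, memory_order_relaxed);  /* X = 1;   */
	atomic_thread_fence(memory_order_seq_cst);           /* smp_mb() */
	r1 = atomic_load_explicit(&Y, memory_order_relaxed); /* LOAD Y   */
	return NULL;
}

static void *cpu2(void *arg)
{
	atomic_store_explicit(&Y, 1, memory_order_relaxed);  /* Y = 1;                  */
	atomic_thread_fence(memory_order_seq_cst);           /* wake_up_process()'s mb  */
	r2 = atomic_load_explicit(&X, memory_order_relaxed); /* LOAD X                  */
	return NULL;
}

/* Run the litmus test once; with both fences, r1 + r2 >= 1 always holds. */
static int run_sb_once(void)
{
	pthread_t t1, t2;

	atomic_store(&X, 0);
	atomic_store(&Y, 0);
	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_create(&t2, NULL, cpu2, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return r1 + r2;
}
```

Drop either fence and the "both loads see 0" outcome becomes possible, which is exactly the plain wake_up()-without-a-wakeup case described above.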
@@ -167,8 +167,8 @@ struct task_group;
  *   need_sleep = false;
  *   wake_up_state(p, TASK_UNINTERRUPTIBLE);
  *
- * Where wake_up_state() (and all other wakeup primitives) imply enough
- * barriers to order the store of the variable against wakeup.
+ * where wake_up_state() executes a full memory barrier before accessing the
+ * task state.
  *
  * Wakeup will do: if (@state & p->state) p->state = TASK_RUNNING, that is,
  * once it observes the TASK_UNINTERRUPTIBLE store the waking CPU can issue a
......
@@ -22,8 +22,8 @@
  *
  * See also complete_all(), wait_for_completion() and related routines.
  *
- * It may be assumed that this function implies a write memory barrier before
- * changing the task state if and only if any tasks are woken up.
+ * If this function wakes up a task, it executes a full memory barrier before
+ * accessing the task state.
  */
 void complete(struct completion *x)
 {
@@ -44,8 +44,8 @@ EXPORT_SYMBOL(complete);
  *
  * This will wake up all threads waiting on this particular completion event.
  *
- * It may be assumed that this function implies a write memory barrier before
- * changing the task state if and only if any tasks are woken up.
+ * If this function wakes up a task, it executes a full memory barrier before
+ * accessing the task state.
  *
  * Since complete_all() sets the completion of @x permanently to done
  * to allow multiple waiters to finish, a call to reinit_completion()
......
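The guarantee documented for complete() can be illustrated with a userspace analogue built on a pthread mutex and condition variable. This is a hedged sketch, not the kernel implementation: the lock/unlock pair supplies the ordering that the kernel gets from the wakeup's full barrier, and the names (complete_ul, wait_for_completion_ul, sleep_and_check) are invented here.

```c
#include <assert.h>
#include <pthread.h>

struct completion_ul {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int done;
};

static int event_indicated;	/* plain data, ordered by the handoff below */
static struct completion_ul c = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

static void complete_ul(struct completion_ul *x)	/* complete() analogue */
{
	pthread_mutex_lock(&x->lock);
	x->done = 1;				/* "STORE task->state" stand-in */
	pthread_cond_signal(&x->cond);
	pthread_mutex_unlock(&x->lock);
}

static void wait_for_completion_ul(struct completion_ul *x)
{
	pthread_mutex_lock(&x->lock);
	while (!x->done)			/* cope with spurious wakeups */
		pthread_cond_wait(&x->cond, &x->lock);
	pthread_mutex_unlock(&x->lock);
}

static void *waker(void *arg)
{
	event_indicated = 1;	/* STORE to indicate the event ...       */
	complete_ul(&c);	/* ... ordered before the "wakeup" store */
	return NULL;
}

/* The woken side is guaranteed to observe event_indicated == 1. */
static int sleep_and_check(void)
{
	pthread_t t;

	pthread_create(&t, NULL, waker, NULL);
	wait_for_completion_ul(&c);
	pthread_join(t, NULL);
	return event_indicated;
}
```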
@@ -412,8 +412,8 @@ void wake_q_add(struct wake_q_head *head, struct task_struct *task)
 	 * its already queued (either by us or someone else) and will get the
 	 * wakeup due to that.
 	 *
-	 * This cmpxchg() implies a full barrier, which pairs with the write
-	 * barrier implied by the wakeup in wake_up_q().
+	 * This cmpxchg() executes a full barrier, which pairs with the full
+	 * barrier executed by the wakeup in wake_up_q().
	 */
	if (cmpxchg(&node->next, NULL, WAKE_Q_TAIL))
		return;
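The cmpxchg() claim in wake_q_add() can be sketched in userspace C11. This is an illustrative analogue only: a seq_cst compare-exchange models the full barrier the kernel's cmpxchg() implies, and wake_q_claim is a made-up name for the claim step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define WAKE_Q_TAIL ((void *)0x1)	/* sentinel, as in the kernel */

struct wake_q_node {
	_Atomic(void *) next;
};

/*
 * Try to claim the node for queueing.  Returns 1 if we claimed it
 * (only one caller ever can), 0 if it is already queued -- in which
 * case the earlier queuer's wakeup covers us.
 */
static int wake_q_claim(struct wake_q_node *node)
{
	void *expected = NULL;

	/* seq_cst cmpxchg: full-barrier analogue of the kernel cmpxchg() */
	return atomic_compare_exchange_strong(&node->next, &expected,
					      WAKE_Q_TAIL);
}
```

Exactly one caller wins the claim; everyone else can rely on the winner's subsequent wakeup, which is why the pairing with the barrier in wake_up_q() matters.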
@@ -441,8 +441,8 @@ void wake_up_q(struct wake_q_head *head)
 		task->wake_q.next = NULL;
 
 		/*
-		 * wake_up_process() implies a wmb() to pair with the queueing
-		 * in wake_q_add() so as not to miss wakeups.
+		 * wake_up_process() executes a full barrier, which pairs with
+		 * the queueing in wake_q_add() so as not to miss wakeups.
		 */
		wake_up_process(task);
		put_task_struct(task);
@@ -1879,8 +1879,7 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
  *     rq(c1)->lock (if not at the same time, then in that order).
  *  C) LOCK of the rq(c1)->lock scheduling in task
  *
- * Transitivity guarantees that B happens after A and C after B.
- * Note: we only require RCpc transitivity.
+ * Release/acquire chaining guarantees that B happens after A and C after B.
  * Note: the CPU doing B need not be c0 or c1
  *
  * Example:
@@ -1942,16 +1941,9 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
  *   UNLOCK rq(0)->lock
  *
  *
- * However; for wakeups there is a second guarantee we must provide, namely we
- * must observe the state that lead to our wakeup. That is, not only must our
- * task observe its own prior state, it must also observe the stores prior to
- * its wakeup.
- *
- * This means that any means of doing remote wakeups must order the CPU doing
- * the wakeup against the CPU the task is going to end up running on. This,
- * however, is already required for the regular Program-Order guarantee above,
- * since the waking CPU is the one issueing the ACQUIRE (smp_cond_load_acquire).
- *
+ * However, for wakeups there is a second guarantee we must provide, namely we
+ * must ensure that CONDITION=1 done by the caller can not be reordered with
+ * accesses to the task state; see try_to_wake_up() and set_current_state().
  */
 
 /**
@@ -1967,6 +1959,9 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
  * Atomic against schedule() which would dequeue a task, also see
  * set_current_state().
  *
+ * This function executes a full memory barrier before accessing the task
+ * state; see set_current_state().
+ *
  * Return: %true if @p->state changes (an actual wakeup was done),
  *	   %false otherwise.
  */
@@ -2141,8 +2136,7 @@ static void try_to_wake_up_local(struct task_struct *p, struct rq_flags *rf)
  *
  * Return: 1 if the process was woken up, 0 if it was already running.
  *
- * It may be assumed that this function implies a write memory barrier before
- * changing the task state if and only if any tasks are woken up.
+ * This function executes a full memory barrier before accessing the task state.
  */
 int wake_up_process(struct task_struct *p)
 {
......
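The "release/acquire chaining" wording in the ttwu_queue() comments above can be demonstrated with a small C11 analogue. This is a sketch, not kernel code: release stores and acquire loads on two flags stand in for the UNLOCK/LOCK handoffs on the rq locks, and all names (run_chain, t_a, t_b, t_c) are invented. Because happens-before is transitive through the chain, C is guaranteed to observe A's plain store.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int X;			/* plain data published by A */
static atomic_int f1, f2;	/* stand-ins for the rq lock handoffs */
static int r;

static void *t_a(void *arg)	/* A: publish, then "UNLOCK" */
{
	X = 1;
	atomic_store_explicit(&f1, 1, memory_order_release);
	return NULL;
}

static void *t_b(void *arg)	/* B: "LOCK" after A, then "UNLOCK" again */
{
	while (!atomic_load_explicit(&f1, memory_order_acquire))
		;
	atomic_store_explicit(&f2, 1, memory_order_release);
	return NULL;
}

static void *t_c(void *arg)	/* C: "LOCK" after B; must see A's store */
{
	while (!atomic_load_explicit(&f2, memory_order_acquire))
		;
	r = X;
	return NULL;
}

/* B happens after A and C after B, so the returned value is always 1. */
static int run_chain(void)
{
	pthread_t a, b, c;

	X = 0;
	atomic_store(&f1, 0);
	atomic_store(&f2, 0);
	pthread_create(&a, NULL, t_a, NULL);
	pthread_create(&b, NULL, t_b, NULL);
	pthread_create(&c, NULL, t_c, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_join(c, NULL);
	return r;
}
```

Note that the thread running B need not be either endpoint of the chain, mirroring the comment's "the CPU doing B need not be c0 or c1".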
@@ -134,8 +134,8 @@ static void __wake_up_common_lock(struct wait_queue_head *wq_head, unsigned int
  * @nr_exclusive: how many wake-one or wake-many threads to wake up
  * @key: is directly passed to the wakeup function
  *
- * It may be assumed that this function implies a write memory barrier before
- * changing the task state if and only if any tasks are woken up.
+ * If this function wakes up a task, it executes a full memory barrier before
+ * accessing the task state.
  */
 void __wake_up(struct wait_queue_head *wq_head, unsigned int mode,
 			int nr_exclusive, void *key)
@@ -180,8 +180,8 @@ EXPORT_SYMBOL_GPL(__wake_up_locked_key_bookmark);
  *
  * On UP it can prevent extra preemption.
  *
- * It may be assumed that this function implies a write memory barrier before
- * changing the task state if and only if any tasks are woken up.
+ * If this function wakes up a task, it executes a full memory barrier before
+ * accessing the task state.
  */
 void __wake_up_sync_key(struct wait_queue_head *wq_head, unsigned int mode,
 		      int nr_exclusive, void *key)
......