Commit 1c19b68a authored by Linus Torvalds

Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - pvqspinlock statistics fixes (Davidlohr Bueso)

   - flip atomic_fetch_or() arguments (Peter Zijlstra)

   - locktorture simplification (Paul E.  McKenney)

   - documentation updates (SeongJae Park, David Howells, Davidlohr
     Bueso, Paul E McKenney, Peter Zijlstra, Will Deacon)

   - various fixes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Flip atomic_fetch_or() arguments
  locking/pvqspinlock: Robustify init_qspinlock_stat()
  locking/pvqspinlock: Avoid double resetting of stats
  lcoking/locktorture: Simplify the torture_runnable computation
  locking/Documentation: Clarify that ACQUIRE applies to loads, RELEASE applies to stores
  locking/Documentation: State purpose of memory-barriers.txt
  locking/Documentation: Add disclaimer
  locking/Documentation/lockdep: Fix spelling mistakes
  locking/lockdep: Deinline register_lock_class(), save 2328 bytes
  locking/locktorture: Fix NULL pointer dereference for cleanup paths
  locking/locktorture: Fix deboosting NULL pointer dereference
  locking/Documentation: Mention smp_cond_acquire()
  locking/Documentation: Insert white spaces consistently
  locking/Documentation: Fix formatting inconsistencies
  locking/Documentation: Add missed subsection in TOC
  locking/Documentation: Fix missed s/lock/acquire renames
  locking/Documentation: Clarify relationship of barrier() to control dependencies
parents 49817c33 a1cc5bcf
...@@ -97,7 +97,7 @@ between any two lock-classes: ...@@ -97,7 +97,7 @@ between any two lock-classes:
<hardirq-safe> -> <hardirq-unsafe> <hardirq-safe> -> <hardirq-unsafe>
<softirq-safe> -> <softirq-unsafe> <softirq-safe> -> <softirq-unsafe>
The first rule comes from the fact the a hardirq-safe lock could be The first rule comes from the fact that a hardirq-safe lock could be
taken by a hardirq context, interrupting a hardirq-unsafe lock - and taken by a hardirq context, interrupting a hardirq-unsafe lock - and
thus could result in a lock inversion deadlock. Likewise, a softirq-safe thus could result in a lock inversion deadlock. Likewise, a softirq-safe
lock could be taken by an softirq context, interrupting a softirq-unsafe lock could be taken by an softirq context, interrupting a softirq-unsafe
...@@ -220,7 +220,7 @@ calculated, which hash is unique for every lock chain. The hash value, ...@@ -220,7 +220,7 @@ calculated, which hash is unique for every lock chain. The hash value,
when the chain is validated for the first time, is then put into a hash when the chain is validated for the first time, is then put into a hash
table, which hash-table can be checked in a lockfree manner. If the table, which hash-table can be checked in a lockfree manner. If the
locking chain occurs again later on, the hash table tells us that we locking chain occurs again later on, the hash table tells us that we
dont have to validate the chain again. don't have to validate the chain again.
Troubleshooting: Troubleshooting:
---------------- ----------------
......
...@@ -4,8 +4,40 @@ ...@@ -4,8 +4,40 @@
By: David Howells <dhowells@redhat.com> By: David Howells <dhowells@redhat.com>
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Will Deacon <will.deacon@arm.com>
Peter Zijlstra <peterz@infradead.org>
Contents: ==========
DISCLAIMER
==========
This document is not a specification; it is intentionally (for the sake of
brevity) and unintentionally (due to being human) incomplete. This document is
meant as a guide to using the various memory barriers provided by Linux, but
in case of any doubt (and there are many) please ask.
To repeat, this document is not a specification of what Linux expects from
hardware.
The purpose of this document is twofold:
(1) to specify the minimum functionality that one can rely on for any
particular barrier, and
(2) to provide a guide as to how to use the barriers that are available.
Note that an architecture can provide more than the minimum requirement
for any particular barrier, but if the architecure provides less than
that, that architecture is incorrect.
Note also that it is possible that a barrier may be a no-op for an
architecture because the way that arch works renders an explicit barrier
unnecessary in that case.
========
CONTENTS
========
(*) Abstract memory access model. (*) Abstract memory access model.
...@@ -31,15 +63,15 @@ Contents: ...@@ -31,15 +63,15 @@ Contents:
(*) Implicit kernel memory barriers. (*) Implicit kernel memory barriers.
- Locking functions. - Lock acquisition functions.
- Interrupt disabling functions. - Interrupt disabling functions.
- Sleep and wake-up functions. - Sleep and wake-up functions.
- Miscellaneous functions. - Miscellaneous functions.
(*) Inter-CPU locking barrier effects. (*) Inter-CPU acquiring barrier effects.
- Locks vs memory accesses. - Acquires vs memory accesses.
- Locks vs I/O accesses. - Acquires vs I/O accesses.
(*) Where are memory barriers needed? (*) Where are memory barriers needed?
...@@ -61,6 +93,7 @@ Contents: ...@@ -61,6 +93,7 @@ Contents:
(*) The things CPUs get up to. (*) The things CPUs get up to.
- And then there's the Alpha. - And then there's the Alpha.
- Virtual Machine Guests.
(*) Example uses. (*) Example uses.
...@@ -148,7 +181,7 @@ As a further example, consider this sequence of events: ...@@ -148,7 +181,7 @@ As a further example, consider this sequence of events:
CPU 1 CPU 2 CPU 1 CPU 2
=============== =============== =============== ===============
{ A == 1, B == 2, C = 3, P == &A, Q == &C } { A == 1, B == 2, C == 3, P == &A, Q == &C }
B = 4; Q = P; B = 4; Q = P;
P = &B D = *Q; P = &B D = *Q;
...@@ -430,8 +463,9 @@ And a couple of implicit varieties: ...@@ -430,8 +463,9 @@ And a couple of implicit varieties:
This acts as a one-way permeable barrier. It guarantees that all memory This acts as a one-way permeable barrier. It guarantees that all memory
operations after the ACQUIRE operation will appear to happen after the operations after the ACQUIRE operation will appear to happen after the
ACQUIRE operation with respect to the other components of the system. ACQUIRE operation with respect to the other components of the system.
ACQUIRE operations include LOCK operations and smp_load_acquire() ACQUIRE operations include LOCK operations and both smp_load_acquire()
operations. and smp_cond_acquire() operations. The later builds the necessary ACQUIRE
semantics from relying on a control dependency and smp_rmb().
Memory operations that occur before an ACQUIRE operation may appear to Memory operations that occur before an ACQUIRE operation may appear to
happen after it completes. happen after it completes.
...@@ -464,6 +498,11 @@ And a couple of implicit varieties: ...@@ -464,6 +498,11 @@ And a couple of implicit varieties:
This means that ACQUIRE acts as a minimal "acquire" operation and This means that ACQUIRE acts as a minimal "acquire" operation and
RELEASE acts as a minimal "release" operation. RELEASE acts as a minimal "release" operation.
A subset of the atomic operations described in atomic_ops.txt have ACQUIRE
and RELEASE variants in addition to fully-ordered and relaxed (no barrier
semantics) definitions. For compound atomics performing both a load and a
store, ACQUIRE semantics apply only to the load and RELEASE semantics apply
only to the store portion of the operation.
Memory barriers are only required where there's a possibility of interaction Memory barriers are only required where there's a possibility of interaction
between two CPUs or between a CPU and a device. If it can be guaranteed that between two CPUs or between a CPU and a device. If it can be guaranteed that
...@@ -517,7 +556,7 @@ following sequence of events: ...@@ -517,7 +556,7 @@ following sequence of events:
CPU 1 CPU 2 CPU 1 CPU 2
=============== =============== =============== ===============
{ A == 1, B == 2, C = 3, P == &A, Q == &C } { A == 1, B == 2, C == 3, P == &A, Q == &C }
B = 4; B = 4;
<write barrier> <write barrier>
WRITE_ONCE(P, &B) WRITE_ONCE(P, &B)
...@@ -544,7 +583,7 @@ between the address load and the data load: ...@@ -544,7 +583,7 @@ between the address load and the data load:
CPU 1 CPU 2 CPU 1 CPU 2
=============== =============== =============== ===============
{ A == 1, B == 2, C = 3, P == &A, Q == &C } { A == 1, B == 2, C == 3, P == &A, Q == &C }
B = 4; B = 4;
<write barrier> <write barrier>
WRITE_ONCE(P, &B); WRITE_ONCE(P, &B);
...@@ -813,9 +852,10 @@ In summary: ...@@ -813,9 +852,10 @@ In summary:
the same variable, then those stores must be ordered, either by the same variable, then those stores must be ordered, either by
preceding both of them with smp_mb() or by using smp_store_release() preceding both of them with smp_mb() or by using smp_store_release()
to carry out the stores. Please note that it is -not- sufficient to carry out the stores. Please note that it is -not- sufficient
to use barrier() at beginning of each leg of the "if" statement, to use barrier() at beginning of each leg of the "if" statement
as optimizing compilers do not necessarily respect barrier() because, as shown by the example above, optimizing compilers can
in this case. destroy the control dependency while respecting the letter of the
barrier() law.
(*) Control dependencies require at least one run-time conditional (*) Control dependencies require at least one run-time conditional
between the prior load and the subsequent store, and this between the prior load and the subsequent store, and this
...@@ -1794,6 +1834,7 @@ There are some more advanced barrier functions: ...@@ -1794,6 +1834,7 @@ There are some more advanced barrier functions:
(*) lockless_dereference(); (*) lockless_dereference();
This can be thought of as a pointer-fetch wrapper around the This can be thought of as a pointer-fetch wrapper around the
smp_read_barrier_depends() data-dependency barrier. smp_read_barrier_depends() data-dependency barrier.
...@@ -1858,7 +1899,7 @@ This is a variation on the mandatory write barrier that causes writes to weakly ...@@ -1858,7 +1899,7 @@ This is a variation on the mandatory write barrier that causes writes to weakly
ordered I/O regions to be partially ordered. Its effects may go beyond the ordered I/O regions to be partially ordered. Its effects may go beyond the
CPU->Hardware interface and actually affect the hardware at some level. CPU->Hardware interface and actually affect the hardware at some level.
See the subsection "Locks vs I/O accesses" for more information. See the subsection "Acquires vs I/O accesses" for more information.
=============================== ===============================
...@@ -1873,8 +1914,8 @@ provide more substantial guarantees, but these may not be relied upon outside ...@@ -1873,8 +1914,8 @@ provide more substantial guarantees, but these may not be relied upon outside
of arch specific code. of arch specific code.
ACQUIRING FUNCTIONS LOCK ACQUISITION FUNCTIONS
------------------- --------------------------
The Linux kernel has a number of locking constructs: The Linux kernel has a number of locking constructs:
...@@ -2090,9 +2131,9 @@ or: ...@@ -2090,9 +2131,9 @@ or:
event_indicated = 1; event_indicated = 1;
wake_up_process(event_daemon); wake_up_process(event_daemon);
A write memory barrier is implied by wake_up() and co. if and only if they wake A write memory barrier is implied by wake_up() and co. if and only if they
something up. The barrier occurs before the task state is cleared, and so sits wake something up. The barrier occurs before the task state is cleared, and so
between the STORE to indicate the event and the STORE to set TASK_RUNNING: sits between the STORE to indicate the event and the STORE to set TASK_RUNNING:
CPU 1 CPU 2 CPU 1 CPU 2
=============================== =============================== =============================== ===============================
...@@ -2486,9 +2527,9 @@ The following operations are special locking primitives: ...@@ -2486,9 +2527,9 @@ The following operations are special locking primitives:
clear_bit_unlock(); clear_bit_unlock();
__clear_bit_unlock(); __clear_bit_unlock();
These implement ACQUIRE-class and RELEASE-class operations. These should be used in These implement ACQUIRE-class and RELEASE-class operations. These should be
preference to other operations when implementing locking primitives, because used in preference to other operations when implementing locking primitives,
their implementations can be optimised on many architectures. because their implementations can be optimised on many architectures.
[!] Note that special memory barrier primitives are available for these [!] Note that special memory barrier primitives are available for these
situations because on some CPUs the atomic instructions used imply full memory situations because on some CPUs the atomic instructions used imply full memory
...@@ -2587,8 +2628,8 @@ functions: ...@@ -2587,8 +2628,8 @@ functions:
(*) inX(), outX(): (*) inX(), outX():
These are intended to talk to I/O space rather than memory space, but These are intended to talk to I/O space rather than memory space, but
that's primarily a CPU-specific concept. The i386 and x86_64 processors do that's primarily a CPU-specific concept. The i386 and x86_64 processors
indeed have special I/O space access cycles and instructions, but many do indeed have special I/O space access cycles and instructions, but many
CPUs don't have such a concept. CPUs don't have such a concept.
The PCI bus, amongst others, defines an I/O space concept which - on such The PCI bus, amongst others, defines an I/O space concept which - on such
...@@ -3040,8 +3081,9 @@ The Alpha defines the Linux kernel's memory barrier model. ...@@ -3040,8 +3081,9 @@ The Alpha defines the Linux kernel's memory barrier model.
See the subsection on "Cache Coherency" above. See the subsection on "Cache Coherency" above.
VIRTUAL MACHINE GUESTS VIRTUAL MACHINE GUESTS
------------------- ----------------------
Guests running within virtual machines might be affected by SMP effects even if Guests running within virtual machines might be affected by SMP effects even if
the guest itself is compiled without SMP support. This is an artifact of the guest itself is compiled without SMP support. This is an artifact of
...@@ -3058,6 +3100,7 @@ These are equivalent to smp_mb() etc counterparts in all other respects, ...@@ -3058,6 +3100,7 @@ These are equivalent to smp_mb() etc counterparts in all other respects,
in particular, they do not control MMIO effects: to control in particular, they do not control MMIO effects: to control
MMIO effects, use mandatory barriers. MMIO effects, use mandatory barriers.
============ ============
EXAMPLE USES EXAMPLE USES
============ ============
......
...@@ -560,11 +560,11 @@ static inline int atomic_dec_if_positive(atomic_t *v) ...@@ -560,11 +560,11 @@ static inline int atomic_dec_if_positive(atomic_t *v)
/** /**
* atomic_fetch_or - perform *p |= mask and return old value of *p * atomic_fetch_or - perform *p |= mask and return old value of *p
* @p: pointer to atomic_t
* @mask: mask to OR on the atomic_t * @mask: mask to OR on the atomic_t
* @p: pointer to atomic_t
*/ */
#ifndef atomic_fetch_or #ifndef atomic_fetch_or
static inline int atomic_fetch_or(atomic_t *p, int mask) static inline int atomic_fetch_or(int mask, atomic_t *p)
{ {
int old, val = atomic_read(p); int old, val = atomic_read(p);
......
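The effect of the flipped signature can be tried outside the kernel. The following is a minimal user-space sketch built on C11 atomics, not the kernel implementation: `sketch_fetch_or()` and `sketch_set_dep()` are hypothetical names, and `atomic_int` stands in for `atomic_t`. It takes the mask first and the pointer second, and returns the value held before the OR.

```c
#include <stdatomic.h>

/*
 * User-space sketch (C11 atomics), not the kernel implementation.
 * sketch_fetch_or() is a hypothetical name; it follows the new
 * (mask, pointer) argument order and returns the value *p held
 * before the OR.
 */
static inline int sketch_fetch_or(int mask, atomic_int *p)
{
	int val = atomic_load(p);

	/*
	 * Retry until the compare-exchange succeeds; on failure the
	 * current value of *p is written back into val.
	 */
	while (!atomic_compare_exchange_weak(p, &val, val | mask))
		;

	return val;
}

/* Returns 1 only for the caller that moved the mask from zero to non-zero. */
static inline int sketch_set_dep(atomic_int *dep, int bit)
{
	int prev = sketch_fetch_or(1 << bit, dep);

	return prev == 0;
}
```

Because the old value is returned, a caller can tell whether it was the one that set the first bit, which is the property a "kick only on the first dependency" caller would rely on.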
...@@ -708,7 +708,7 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass) ...@@ -708,7 +708,7 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
* yet. Otherwise we look it up. We cache the result in the lock object * yet. Otherwise we look it up. We cache the result in the lock object
* itself, so actual lookup of the hash should be once per lock object. * itself, so actual lookup of the hash should be once per lock object.
*/ */
static inline struct lock_class * static struct lock_class *
register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force) register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
{ {
struct lockdep_subclass_key *key; struct lockdep_subclass_key *key;
......
...@@ -75,12 +75,7 @@ struct lock_stress_stats { ...@@ -75,12 +75,7 @@ struct lock_stress_stats {
long n_lock_acquired; long n_lock_acquired;
}; };
#if defined(MODULE) int torture_runnable = IS_ENABLED(MODULE);
#define LOCKTORTURE_RUNNABLE_INIT 1
#else
#define LOCKTORTURE_RUNNABLE_INIT 0
#endif
int torture_runnable = LOCKTORTURE_RUNNABLE_INIT;
module_param(torture_runnable, int, 0444); module_param(torture_runnable, int, 0444);
MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init"); MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init");
...@@ -394,11 +389,11 @@ static void torture_rtmutex_boost(struct torture_random_state *trsp) ...@@ -394,11 +389,11 @@ static void torture_rtmutex_boost(struct torture_random_state *trsp)
if (!rt_task(current)) { if (!rt_task(current)) {
/* /*
* (1) Boost priority once every ~50k operations. When the * Boost priority once every ~50k operations. When the
* task tries to take the lock, the rtmutex it will account * task tries to take the lock, the rtmutex it will account
* for the new priority, and do any corresponding pi-dance. * for the new priority, and do any corresponding pi-dance.
*/ */
if (!(torture_random(trsp) % if (trsp && !(torture_random(trsp) %
(cxt.nrealwriters_stress * factor))) { (cxt.nrealwriters_stress * factor))) {
policy = SCHED_FIFO; policy = SCHED_FIFO;
param.sched_priority = MAX_RT_PRIO - 1; param.sched_priority = MAX_RT_PRIO - 1;
...@@ -748,6 +743,15 @@ static void lock_torture_cleanup(void) ...@@ -748,6 +743,15 @@ static void lock_torture_cleanup(void)
if (torture_cleanup_begin()) if (torture_cleanup_begin())
return; return;
/*
* Indicates early cleanup, meaning that the test has not run,
* such as when passing bogus args when loading the module. As
* such, only perform the underlying torture-specific cleanups,
* and avoid anything related to locktorture.
*/
if (!cxt.lwsa)
goto end;
if (writer_tasks) { if (writer_tasks) {
for (i = 0; i < cxt.nrealwriters_stress; i++) for (i = 0; i < cxt.nrealwriters_stress; i++)
torture_stop_kthread(lock_torture_writer, torture_stop_kthread(lock_torture_writer,
...@@ -776,6 +780,7 @@ static void lock_torture_cleanup(void) ...@@ -776,6 +780,7 @@ static void lock_torture_cleanup(void)
else else
lock_torture_print_module_parms(cxt.cur_ops, lock_torture_print_module_parms(cxt.cur_ops,
"End of test: SUCCESS"); "End of test: SUCCESS");
end:
torture_cleanup_end(); torture_cleanup_end();
} }
...@@ -870,6 +875,7 @@ static int __init lock_torture_init(void) ...@@ -870,6 +875,7 @@ static int __init lock_torture_init(void)
VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory"); VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
firsterr = -ENOMEM; firsterr = -ENOMEM;
kfree(cxt.lwsa); kfree(cxt.lwsa);
cxt.lwsa = NULL;
goto unwind; goto unwind;
} }
...@@ -878,6 +884,7 @@ static int __init lock_torture_init(void) ...@@ -878,6 +884,7 @@ static int __init lock_torture_init(void)
cxt.lrsa[i].n_lock_acquired = 0; cxt.lrsa[i].n_lock_acquired = 0;
} }
} }
lock_torture_print_module_parms(cxt.cur_ops, "Start of test"); lock_torture_print_module_parms(cxt.cur_ops, "Start of test");
/* Prepare torture context. */ /* Prepare torture context. */
......
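The two locktorture hunks above work together: the init path resets the pointer to NULL after freeing it, and the cleanup path uses that NULL as its "test never ran" signal. A minimal user-space sketch of the pattern follows, with `calloc()`/`free()` standing in for the kernel allocator. The struct, the function names and the `fail_second` switch are all hypothetical and exist only for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the torture context; only the two arrays matter. */
struct sketch_ctx {
	long *lwsa;	/* writer statistics */
	long *lrsa;	/* reader statistics */
};

/*
 * Returns 0 on success, -1 on failure. fail_second simulates the second
 * allocation running out of memory.
 */
static int sketch_init(struct sketch_ctx *c, size_t n, int fail_second)
{
	c->lwsa = calloc(n, sizeof(*c->lwsa));
	if (!c->lwsa)
		return -1;

	c->lrsa = fail_second ? NULL : calloc(n, sizeof(*c->lrsa));
	if (!c->lrsa) {
		free(c->lwsa);
		c->lwsa = NULL;	/* so cleanup can tell the test never ran */
		return -1;
	}
	return 0;
}

/* Returns 1 if the full teardown ran, 0 if it took the early-exit path. */
static int sketch_cleanup(struct sketch_ctx *c)
{
	if (!c->lwsa)
		return 0;	/* early cleanup: nothing test-specific to undo */

	free(c->lrsa);
	free(c->lwsa);
	c->lrsa = NULL;
	c->lwsa = NULL;
	return 1;
}
```

Without the reset to NULL in the failure path, the guard in the cleanup function would see a stale non-NULL pointer and proceed into the full teardown on freed memory.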
...@@ -191,8 +191,6 @@ static ssize_t qstat_write(struct file *file, const char __user *user_buf, ...@@ -191,8 +191,6 @@ static ssize_t qstat_write(struct file *file, const char __user *user_buf,
for (i = 0 ; i < qstat_num; i++) for (i = 0 ; i < qstat_num; i++)
WRITE_ONCE(ptr[i], 0); WRITE_ONCE(ptr[i], 0);
for (i = 0 ; i < qstat_num; i++)
WRITE_ONCE(ptr[i], 0);
} }
return count; return count;
} }
...@@ -214,10 +212,8 @@ static int __init init_qspinlock_stat(void) ...@@ -214,10 +212,8 @@ static int __init init_qspinlock_stat(void)
struct dentry *d_qstat = debugfs_create_dir("qlockstat", NULL); struct dentry *d_qstat = debugfs_create_dir("qlockstat", NULL);
int i; int i;
if (!d_qstat) { if (!d_qstat)
pr_warn("Could not create 'qlockstat' debugfs directory\n"); goto out;
return 0;
}
/* /*
* Create the debugfs files * Create the debugfs files
...@@ -227,12 +223,20 @@ static int __init init_qspinlock_stat(void) ...@@ -227,12 +223,20 @@ static int __init init_qspinlock_stat(void)
* performance. * performance.
*/ */
for (i = 0; i < qstat_num; i++) for (i = 0; i < qstat_num; i++)
debugfs_create_file(qstat_names[i], 0400, d_qstat, if (!debugfs_create_file(qstat_names[i], 0400, d_qstat,
(void *)(long)i, &fops_qstat); (void *)(long)i, &fops_qstat))
goto fail_undo;
if (!debugfs_create_file(qstat_names[qstat_reset_cnts], 0200, d_qstat,
(void *)(long)qstat_reset_cnts, &fops_qstat))
goto fail_undo;
debugfs_create_file(qstat_names[qstat_reset_cnts], 0200, d_qstat,
(void *)(long)qstat_reset_cnts, &fops_qstat);
return 0; return 0;
fail_undo:
debugfs_remove_recursive(d_qstat);
out:
pr_warn("Could not create 'qlockstat' debugfs entries\n");
return -ENOMEM;
} }
fs_initcall(init_qspinlock_stat); fs_initcall(init_qspinlock_stat);
......
...@@ -262,7 +262,7 @@ static void tick_nohz_dep_set_all(atomic_t *dep, ...@@ -262,7 +262,7 @@ static void tick_nohz_dep_set_all(atomic_t *dep,
{ {
int prev; int prev;
prev = atomic_fetch_or(dep, BIT(bit)); prev = atomic_fetch_or(BIT(bit), dep);
if (!prev) if (!prev)
tick_nohz_full_kick_all(); tick_nohz_full_kick_all();
} }
...@@ -292,7 +292,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit) ...@@ -292,7 +292,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum tick_dep_bits bit)
ts = per_cpu_ptr(&tick_cpu_sched, cpu); ts = per_cpu_ptr(&tick_cpu_sched, cpu);
prev = atomic_fetch_or(&ts->tick_dep_mask, BIT(bit)); prev = atomic_fetch_or(BIT(bit), &ts->tick_dep_mask);
if (!prev) { if (!prev) {
preempt_disable(); preempt_disable();
/* Perf needs local kick that is NMI safe */ /* Perf needs local kick that is NMI safe */
......