Commit 0e4cd9f2 authored by Catalin Marinas

Merge branch 'for-next/read-barrier-depends' into for-next/core

* for-next/read-barrier-depends:
  : Allow architectures to override __READ_ONCE()
  arm64: Reduce the number of header files pulled into vmlinux.lds.S
  compiler.h: Move compiletime_assert() macros into compiler_types.h
  checkpatch: Remove checks relating to [smp_]read_barrier_depends()
  include/linux: Remove smp_read_barrier_depends() from comments
  tools/memory-model: Remove smp_read_barrier_depends() from informal doc
  Documentation/barriers/kokr: Remove references to [smp_]read_barrier_depends()
  Documentation/barriers: Remove references to [smp_]read_barrier_depends()
  locking/barriers: Remove definitions for [smp_]read_barrier_depends()
  alpha: Replace smp_read_barrier_depends() usage with smp_[r]mb()
  vhost: Remove redundant use of read_barrier_depends() barrier
  asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h'
  asm/rwonce: Remove smp_read_barrier_depends() invocation
  alpha: Override READ_ONCE() with barriered implementation
  asm/rwonce: Allow __READ_ONCE to be overridden by the architecture
  compiler.h: Split {READ,WRITE}_ONCE definitions out into rwonce.h
  tools: bpf: Use local copy of headers including uapi/linux/filter.h
parents 18aa3bd5 5f1f7f6c
@@ -463,7 +463,7 @@ again without disrupting RCU readers.
This guarantee was only partially premeditated.  DYNIX/ptx used an
explicit memory barrier for publication, but had nothing resembling
``rcu_dereference()`` for subscription, nor did it have anything
resembling the dependency-ordering barrier that was later subsumed
into ``rcu_dereference()`` and later still into ``READ_ONCE()``.  The
need for these operations made itself known quite suddenly at a
late-1990s meeting with the DEC Alpha architects, back in the days when
......
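As background for the publication/subscription wording above, here is a minimal sketch of the pattern (illustrative only, not part of this commit; struct foo, gp and the two functions are made-up names):

struct foo {
	int a;
};

static struct foo __rcu *gp;

void publish(struct foo *p)
{
	p->a = 42;
	rcu_assign_pointer(gp, p);	/* orders the initialisation before the pointer store */
}

int subscribe(void)
{
	struct foo *q;
	int val = -1;

	rcu_read_lock();
	q = rcu_dereference(gp);	/* READ_ONCE() underneath; dependency-ordered on Alpha */
	if (q)
		val = q->a;		/* guaranteed to see the initialised value */
	rcu_read_unlock();
	return val;
}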
@@ -553,12 +553,12 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
DATA DEPENDENCY BARRIERS (HISTORICAL)
-------------------------------------
As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for
DEC Alpha, which means that about the only people who need to pay attention
to this section are those working on DEC Alpha architecture-specific code
and those working on READ_ONCE() itself.  For those who need it, and for
those who are interested in the history, here is the story of
data-dependency barriers.
The usage requirements of data dependency barriers are a little subtle, and
it's not always obvious that they're needed.  To illustrate, consider the
@@ -2708,144 +2708,6 @@ the properties of the memory window through which devices are accessed and/or
the use of any special device communication instructions the CPU may have.
CACHE COHERENCY
---------------
Life isn't quite as simple as it may appear above, however: for while the
caches are expected to be coherent, there's no guarantee that that coherency
will be ordered. This means that while changes made on one CPU will
eventually become visible on all CPUs, there's no guarantee that they will
become apparent in the same order on those other CPUs.
Consider dealing with a system that has a pair of CPUs (1 & 2), each of which
has a pair of parallel data caches (CPU 1 has A/B, and CPU 2 has C/D):
            :
            :                          +--------+
            :      +---------+         |        |
+--------+  : +--->| Cache A |<------->|        |
|        |  : |    +---------+         |        |
|  CPU 1 |<---+                        |        |
|        |  : |    +---------+         |        |
+--------+  : +--->| Cache B |<------->|        |
            :      +---------+         |        |
            :                          | Memory |
            :      +---------+         | System |
+--------+  : +--->| Cache C |<------->|        |
|        |  : |    +---------+         |        |
|  CPU 2 |<---+                        |        |
|        |  : |    +---------+         |        |
+--------+  : +--->| Cache D |<------->|        |
            :      +---------+         |        |
            :                          +--------+
            :
Imagine the system has the following properties:
(*) an odd-numbered cache line may be in cache A, cache C or it may still be
resident in memory;
(*) an even-numbered cache line may be in cache B, cache D or it may still be
resident in memory;
(*) while the CPU core is interrogating one cache, the other cache may be
making use of the bus to access the rest of the system - perhaps to
displace a dirty cacheline or to do a speculative load;
(*) each cache has a queue of operations that need to be applied to that cache
to maintain coherency with the rest of the system;
(*) the coherency queue is not flushed by normal loads to lines already
present in the cache, even though the contents of the queue may
potentially affect those loads.
Imagine, then, that two writes are made on the first CPU, with a write barrier
between them to guarantee that they will appear to reach that CPU's caches in
the requisite order:
CPU 1           CPU 2           COMMENT
=============== =============== =======================================
                                u == 0, v == 1 and p == &u, q == &u
v = 2;
smp_wmb();                      Make sure change to v is visible before
                                change to p
<A:modify v=2>                  v is now in cache A exclusively
p = &v;
<B:modify p=&v>                 p is now in cache B exclusively
The write memory barrier forces the other CPUs in the system to perceive that
the local CPU's caches have apparently been updated in the correct order. But
now imagine that the second CPU wants to read those values:
CPU 1           CPU 2           COMMENT
=============== =============== =======================================
...
                q = p;
                x = *q;
The above pair of reads may then fail to happen in the expected order, as the
cacheline holding p may get updated in one of the second CPU's caches while
the update to the cacheline holding v is delayed in the other of the second
CPU's caches by some other cache event:
CPU 1           CPU 2           COMMENT
=============== =============== =======================================
                                u == 0, v == 1 and p == &u, q == &u
v = 2;
smp_wmb();
<A:modify v=2>  <C:busy>
                <C:queue v=2>
p = &v;         q = p;
                <D:request p>
<B:modify p=&v> <D:commit p=&v>
                <D:read p>
                x = *q;
                <C:read *q>     Reads from v before v updated in cache
                <C:unbusy>
                <C:commit v=2>
Basically, while both cachelines will be updated on CPU 2 eventually, there's
no guarantee that, without intervention, the order of update will be the same
as that committed on CPU 1.
To intervene, we need to interpolate a data dependency barrier or a read
barrier between the loads (which as of v4.15 is supplied unconditionally
by the READ_ONCE() macro). This will force the cache to commit its
coherency queue before processing any further requests:
CPU 1           CPU 2           COMMENT
=============== =============== =======================================
                                u == 0, v == 1 and p == &u, q == &u
v = 2;
smp_wmb();
<A:modify v=2>  <C:busy>
                <C:queue v=2>
p = &v;         q = p;
                <D:request p>
<B:modify p=&v> <D:commit p=&v>
                <D:read p>
                smp_read_barrier_depends()
                <C:unbusy>
                <C:commit v=2>
                x = *q;
                <C:read *q>     Reads from v after v updated in cache
This sort of problem can be encountered on DEC Alpha processors as they have a
split cache that improves performance by making better use of the data bus.
While most CPUs do imply a data dependency barrier on the read when a memory
access depends on a read, not all do, so it may not be relied on.
Other CPUs may also have split caches, but must coordinate between the various
cachelets for normal memory accesses. The semantics of the Alpha removes the
need for hardware coordination in the absence of memory barriers, which
permitted Alpha to sport higher CPU clock rates back in the day. However,
please note that (again, as of v4.15) smp_read_barrier_depends() should not
be used except in Alpha arch-specific code and within the READ_ONCE() macro.
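For reference, the scenario walked through above written out as a minimal C sketch (illustrative only, not part of this commit; variable names follow the tables above):

int u = 0, v = 1;
int *p = &u;

/* CPU 1 */
void writer(void)
{
	v = 2;
	smp_wmb();		/* make the store to v visible before the store to p */
	WRITE_ONCE(p, &v);
}

/* CPU 2 */
int reader(void)
{
	int *q, x;

	q = READ_ONCE(p);	/* since v4.15 this orders the dependent load below;
				 * after this series it is a full barrier on Alpha */
	x = *q;			/* so a reader that sees q == &v must see x == 2 */
	return x;
}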
CACHE COHERENCY VS DMA
----------------------
@@ -3009,10 +2871,8 @@ caches with the memory coherence system, thus making it seem like pointer
changes vs new data occur in the right order.
The Alpha defines the Linux kernel's memory model, although as of v4.15
the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly
reduced its impact on the memory model.
See the subsection on "Cache Coherency" above.
VIRTUAL MACHINE GUESTS
......
@@ -577,7 +577,7 @@ ACQUIRE applies only to the load part of the operation, and RELEASE
DATA DEPENDENCY BARRIERS (HISTORICAL)
-------------------------------------
As of Linux kernel v4.15, smp_mb() was added to the READ_ONCE() code for
DEC Alpha, which means that the only people who need to pay attention to
this section are those writing DEC Alpha architecture-specific code and
those working on READ_ONCE() itself.  For those people, and for those
interested in the history, here is the story of the data dependency
barriers.
@@ -2664,144 +2664,6 @@ as long as program causality appears to be maintained, the CPU core
may also be controlled in this way.
CACHE COHERENCY
---------------
Life isn't as simple as described above, however: the caches are expected to
be coherent, but there is no guarantee that that coherency also applies to
ordering.  This means that while a change made on one CPU will eventually
become visible to every CPU in the system, there is no guarantee that it
will become visible to the other CPUs in the same order.

Consider dealing with a system that has two CPUs (1 & 2), each of which has
a pair of parallel data caches connected to it (CPU 1 has A/B, and CPU 2 has
C/D):

            :
            :                          +--------+
            :      +---------+         |        |
+--------+  : +--->| Cache A |<------->|        |
|        |  : |    +---------+         |        |
|  CPU 1 |<---+                        |        |
|        |  : |    +---------+         |        |
+--------+  : +--->| Cache B |<------->|        |
            :      +---------+         |        |
            :                          | Memory |
            :      +---------+         | System |
+--------+  : +--->| Cache C |<------->|        |
|        |  : |    +---------+         |        |
|  CPU 2 |<---+                        |        |
|        |  : |    +---------+         |        |
+--------+  : +--->| Cache D |<------->|        |
            :      +---------+         |        |
            :                          +--------+
            :

Imagine the system has the following properties:

(*) an odd-numbered cache line may be in cache A, cache C or in memory;

(*) an even-numbered cache line may be in cache B, cache D or in memory;

(*) while the CPU core is accessing one cache, the other cache may be using
    the bus to access the rest of the system - perhaps to write back a dirty
    cacheline or to perform a speculative load;

(*) each cache has a queue of operations that need to be applied to that
    cache to keep it coherent with the rest of the system;

(*) the coherency queue is not flushed by ordinary loads to lines already
    present in the cache, even though the operations in the queue could
    affect the results of those loads.

Now imagine that two writes are made on the first CPU, with a write barrier
between them to guarantee that they reach that CPU's caches in the requested
order:

CPU 1           CPU 2           COMMENT
=============== =============== =======================================
                                u == 0, v == 1 and p == &u, q == &u
v = 2;
smp_wmb();                      Make sure the change to v is visible
                                before the change to p
<A:modify v=2>                  v is now exclusively in cache A
p = &v;
<B:modify p=&v>                 p is now exclusively in cache B

The write memory barrier here makes the other CPUs in the system perceive
that CPU 1's caches have been updated in the correct order.  But now imagine
that the second CPU wants to read those values:

CPU 1           CPU 2           COMMENT
=============== =============== =======================================
...
                q = p;
                x = *q;

The above pair of reads may fail to happen in the expected order: while some
other cache event delays the update of the cacheline holding v in one of the
second CPU's caches, the cacheline holding p may already have been updated
in the second CPU's other cache:

CPU 1           CPU 2           COMMENT
=============== =============== =======================================
                                u == 0, v == 1 and p == &u, q == &u
v = 2;
smp_wmb();
<A:modify v=2>  <C:busy>
                <C:queue v=2>
p = &v;         q = p;
                <D:request p>
<B:modify p=&v> <D:commit p=&v>
                <D:read p>
                x = *q;
                <C:read *q>     Reads v before it is updated in the cache
                <C:unbusy>
                <C:commit v=2>

Basically, while both cachelines will eventually be updated on CPU 2, there
is no guarantee that, without intervention, the updates will become visible
in the same order in which they were made on CPU 1.

To intervene, a data dependency barrier or a read barrier must be placed
between the loads (which, as of v4.15, is done unconditionally by the
READ_ONCE() macro).  This forces the cache to process its coherency queue
before handling any further requests:

CPU 1           CPU 2           COMMENT
=============== =============== =======================================
                                u == 0, v == 1 and p == &u, q == &u
v = 2;
smp_wmb();
<A:modify v=2>  <C:busy>
                <C:queue v=2>
p = &v;         q = p;
                <D:request p>
<B:modify p=&v> <D:commit p=&v>
                <D:read p>
                smp_read_barrier_depends()
                <C:unbusy>
                <C:commit v=2>
                x = *q;
                <C:read *q>     Reads v after it is updated in the cache

This kind of problem can be encountered on DEC Alpha family processors, as
they have a split cache that can improve performance by making better use of
the data bus.  While most CPUs imply a data dependency barrier when one
read's memory access depends on another read, not all do, so this cannot be
relied upon.

Other CPUs may also have split caches, but they must coordinate between the
split caches even for ordinary memory accesses.  Alpha, by choosing the
weakest memory ordering semantics, removed the need for such coordination
when memory barriers are not explicitly used, which allowed Alpha to reach
higher CPU clock rates back in the day.  However, please note that (again,
as of v4.15) smp_read_barrier_depends() should not be used except in Alpha
architecture-specific code and within the READ_ONCE() macro.
CACHE COHERENCY VS DMA
----------------------
@@ -2962,10 +2824,8 @@ Some versions of the Alpha CPU have a split data cache, so that
it makes the discovery of the new data happen in the correct order.
The Linux kernel's memory barrier model was defined based on Alpha, but
since v4.15 the addition of smp_mb() to the READ_ONCE() code for Alpha has
greatly reduced Alpha's impact on the memory model.
See the "Cache Coherency" subsection above.
VIRTUAL MACHINE GUESTS
......
@@ -16,10 +16,10 @@
/*
 * To ensure dependency ordering is preserved for the _relaxed and
 * _release atomics, an smp_mb() is unconditionally inserted into the
 * _relaxed variants, which are used to build the barriered versions.
 * Avoid redundant back-to-back fences in the _acquire and _fence
 * versions.
 */
#define __atomic_acquire_fence()
#define __atomic_post_full_fence()
@@ -70,7 +70,7 @@ static inline int atomic_##op##_return_relaxed(int i, atomic_t *v) \
".previous" \
:"=&r" (temp), "=m" (v->counter), "=&r" (result) \
:"Ir" (i), "m" (v->counter) : "memory"); \
smp_mb(); \
return result; \
}
@@ -88,7 +88,7 @@ static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v) \
".previous" \
:"=&r" (temp), "=m" (v->counter), "=&r" (result) \
:"Ir" (i), "m" (v->counter) : "memory"); \
smp_mb(); \
return result; \
}
@@ -123,7 +123,7 @@ static __inline__ s64 atomic64_##op##_return_relaxed(s64 i, atomic64_t * v) \
".previous" \
:"=&r" (temp), "=m" (v->counter), "=&r" (result) \
:"Ir" (i), "m" (v->counter) : "memory"); \
smp_mb(); \
return result; \
}
@@ -141,7 +141,7 @@ static __inline__ s64 atomic64_fetch_##op##_relaxed(s64 i, atomic64_t * v) \
".previous" \
:"=&r" (temp), "=m" (v->counter), "=&r" (result) \
:"Ir" (i), "m" (v->counter) : "memory"); \
smp_mb(); \
return result; \
}
......
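To see why the unconditional smp_mb() in the _relaxed variants is sufficient, here is a simplified sketch of how the generic fallback layer (include/linux/atomic-fallback.h) builds the fully-ordered operation on top of them; the body is paraphrased, not taken from this commit:

static __always_inline int
atomic_add_return(int i, atomic_t *v)
{
	int ret;

	__atomic_pre_full_fence();
	ret = atomic_add_return_relaxed(i, v);	/* on Alpha: ends with smp_mb() (see above) */
	__atomic_post_full_fence();		/* defined to nothing on Alpha, avoiding a
						 * redundant back-to-back barrier */
	return ret;
}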
@@ -2,64 +2,15 @@
#ifndef __BARRIER_H
#define __BARRIER_H
#include <asm/compiler.h>
#define mb() __asm__ __volatile__("mb": : :"memory")
#define rmb() __asm__ __volatile__("mb": : :"memory")
#define wmb() __asm__ __volatile__("wmb": : :"memory")
#define __smp_load_acquire(p) \
({ \
	compiletime_assert_atomic_type(*p); \
	__READ_ONCE(*p); \
})
/**
 * read_barrier_depends - Flush all pending reads that subsequent reads
 * depend on.
 *
 * No data-dependent reads from memory-like regions are ever reordered
* over this barrier. All reads preceding this primitive are guaranteed
* to access memory (but not necessarily other CPUs' caches) before any
 * reads following this primitive that depend on the data returned by
* any of the preceding reads. This primitive is much lighter weight than
* rmb() on most CPUs, and is never heavier weight than is
* rmb().
*
* These ordering constraints are respected by both the local CPU
* and the compiler.
*
* Ordering is not guaranteed by anything other than these primitives,
* not even by data dependencies. See the documentation for
* memory_barrier() for examples and URLs to more information.
*
* For example, the following code would force ordering (the initial
* value of "a" is zero, "b" is one, and "p" is "&a"):
*
* <programlisting>
* CPU 0 CPU 1
*
* b = 2;
* memory_barrier();
* p = &b; q = p;
* read_barrier_depends();
* d = *q;
* </programlisting>
*
* because the read of "*q" depends on the read of "p" and these
* two reads are separated by a read_barrier_depends(). However,
* the following code, with the same initial values for "a" and "b":
*
* <programlisting>
* CPU 0 CPU 1
*
* a = 2;
* memory_barrier();
* b = 3; y = b;
* read_barrier_depends();
* x = a;
* </programlisting>
*
* does not enforce ordering, since there is no data dependency between
* the read of "a" and the read of "b". Therefore, on some CPUs, such
* as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
* in cases like this where there are no data dependencies.
*/
#define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
#ifdef CONFIG_SMP
#define __ASM_SMP_MB "\tmb\n"
......
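The __smp_load_acquire() added above is what backs smp_load_acquire() on Alpha; a small sketch of the acquire/release pairing it serves (illustrative only; msg, ready and the two functions are made-up names):

struct msg {
	int payload;
};

static struct msg m;
static int ready;

void producer(void)
{
	m.payload = 42;
	smp_store_release(&ready, 1);	/* publish: payload is ordered before ready */
}

int consumer(void)
{
	if (smp_load_acquire(&ready))	/* on Alpha this now expands to __READ_ONCE(),
					 * whose mb() provides the acquire ordering */
		return m.payload;	/* guaranteed to see 42 */
	return -1;
}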
@@ -277,9 +277,9 @@ extern inline pte_t pte_mkdirty(pte_t pte) { pte_val(pte) |= __DIRTY_BITS; retur
extern inline pte_t pte_mkyoung(pte_t pte) { pte_val(pte) |= __ACCESS_BITS; return pte; }
/*
 * The smp_rmb() in the following functions is required to order the load of
 * *dir (the pointer in the top level page table) with any subsequent load of
 * the returned pmd_t *ret (ret is data dependent on *dir).
 *
 * If this ordering is not enforced, the CPU might load an older value of
 * *ret, which may be uninitialized data. See mm/memory.c:__pte_alloc for
@@ -293,7 +293,7 @@ extern inline pte_t pte_mkyoung(pte_t pte) { pte_val(pte) |= __ACCESS_BITS; retu
extern inline pmd_t * pmd_offset(pud_t * dir, unsigned long address)
{
pmd_t *ret = (pmd_t *) pud_page_vaddr(*dir) + ((address >> PMD_SHIFT) & (PTRS_PER_PAGE - 1));
smp_rmb(); /* see above */
return ret;
}
#define pmd_offset pmd_offset
@@ -303,7 +303,7 @@ extern inline pte_t * pte_offset_kernel(pmd_t * dir, unsigned long address)
{
pte_t *ret = (pte_t *) pmd_page_vaddr(*dir)
+ ((address >> PAGE_SHIFT) & (PTRS_PER_PAGE - 1));
smp_rmb(); /* see above */
return ret;
}
#define pte_offset_kernel pte_offset_kernel
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2019 Google LLC.
*/
#ifndef __ASM_RWONCE_H
#define __ASM_RWONCE_H
#ifdef CONFIG_SMP
#include <asm/barrier.h>
/*
* Alpha is apparently daft enough to reorder address-dependent loads
* on some CPU implementations. Knock some common sense into it with
* a memory barrier in READ_ONCE().
*
* For the curious, more information about this unusual reordering is
* available in chapter 15 of the "perfbook":
*
* https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html
*
*/
#define __READ_ONCE(x) \
({ \
__unqual_scalar_typeof(x) __x = \
(*(volatile typeof(__x) *)(&(x))); \
mb(); \
(typeof(x))__x; \
})
#endif /* CONFIG_SMP */
#include <asm-generic/rwonce.h>
#endif /* __ASM_RWONCE_H */
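The shortlog entry "asm/rwonce: Allow __READ_ONCE to be overridden by the architecture" boils down to the pattern used by this Alpha header; a hypothetical arch/foo/include/asm/rwonce.h would do the same thing (sketch, not part of this commit):

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __ASM_RWONCE_H
#define __ASM_RWONCE_H

/* Hypothetical override: add whatever ordering this architecture needs. */
#define __READ_ONCE(x)							\
({									\
	__unqual_scalar_typeof(x) __x =					\
		(*(const volatile typeof(__x) *)&(x));			\
	/* arch-specific barrier would go here */			\
	(typeof(x))__x;							\
})

/* The generic header keeps an existing __READ_ONCE() definition. */
#include <asm-generic/rwonce.h>

#endif /* __ASM_RWONCE_H */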
@@ -7,6 +7,7 @@
#ifndef __ASSEMBLY__
#include <asm/barrier.h>
#include <asm/errno.h>
#include <asm/unistd.h>
#include <asm/vdso/cp15.h>
......
@@ -8,7 +8,7 @@
#ifndef __ASM_KERNEL_PGTABLE_H
#define __ASM_KERNEL_PGTABLE_H
#include <asm/pgtable-hwdef.h>
#include <asm/sparsemem.h>
/*
......
@@ -10,11 +10,8 @@
#ifndef __ASM_MEMORY_H
#define __ASM_MEMORY_H
#include <linux/compiler.h>
#include <linux/const.h>
#include <linux/sizes.h>
#include <linux/types.h>
#include <asm/bug.h>
#include <asm/page-def.h>
/*
@@ -157,11 +154,15 @@
#endif
#ifndef __ASSEMBLY__
extern u64 vabits_actual;
#define PAGE_END (_PAGE_END(vabits_actual))
#include <linux/bitops.h>
#include <linux/compiler.h>
#include <linux/mmdebug.h>
#include <linux/types.h>
#include <asm/bug.h>
extern u64 vabits_actual;
#define PAGE_END (_PAGE_END(vabits_actual))
extern s64 physvirt_offset;
extern s64 memstart_addr;
......
@@ -19,6 +19,7 @@
#include <linux/string.h>
#include <asm/cpufeature.h>
#include <asm/mmu.h>
#include <asm/ptrace.h>
#include <asm/memory.h>
#include <asm/extable.h>
......
@@ -7,6 +7,7 @@
#ifndef __ASSEMBLY__
#include <asm/barrier.h>
#include <asm/unistd.h>
#include <asm/errno.h>
......
@@ -7,6 +7,7 @@
#ifndef __ASSEMBLY__
#include <asm/barrier.h>
#include <asm/unistd.h>
#define VDSO_HAS_CLOCK_GETRES 1
......
@@ -15,6 +15,7 @@
#include <asm/assembler.h>
#include <asm/asm-offsets.h>
#include <asm/asm_pointer_auth.h>
#include <asm/bug.h>
#include <asm/cpufeature.h>
#include <asm/errno.h>
#include <asm/esr.h>
......
@@ -10,7 +10,6 @@
#include <asm-generic/vmlinux.lds.h>
#include <asm/cache.h>
#include <asm/kernel-pgtable.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/page.h>
......
@@ -6,6 +6,7 @@
#include <linux/linkage.h>
#include <asm/alternative.h>
#include <asm/assembler.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_mmu.h>
......
@@ -4,6 +4,7 @@
#ifndef __ASSEMBLY__
#include <asm/barrier.h>
#include <asm/unistd.h>
#include <asm/csr.h>
#include <uapi/linux/time.h>
......
@@ -2092,11 +2092,6 @@ static int get_indirect(struct vhost_virtqueue *vq,
return ret;
}
iov_iter_init(&from, READ, vq->indirect, ret, len);
/* We will use the result as an address to read from, so most
 * architectures only need a compiler barrier here. */
read_barrier_depends();
count = len / sizeof desc;
/* Buffers are chained via a 16 bit next field, so
 * we can have at most 2^16 of these. */
......
@@ -45,6 +45,7 @@ mandatory-y += pci.h
mandatory-y += percpu.h
mandatory-y += pgalloc.h
mandatory-y += preempt.h
mandatory-y += rwonce.h
mandatory-y += sections.h
mandatory-y += serial.h
mandatory-y += shmparam.h
......
@@ -13,7 +13,7 @@
#ifndef __ASSEMBLY__
#include <asm/rwonce.h>
#ifndef nop
#define nop() asm volatile ("nop")
@@ -46,10 +46,6 @@
#define dma_wmb() wmb()
#endif
#ifndef read_barrier_depends
#define read_barrier_depends() do { } while (0)
#endif
#ifndef __smp_mb
#define __smp_mb() mb()
#endif
@@ -62,10 +58,6 @@
#define __smp_wmb() wmb()
#endif
#ifndef __smp_read_barrier_depends
#define __smp_read_barrier_depends() read_barrier_depends()
#endif
#ifdef CONFIG_SMP
#ifndef smp_mb
@@ -80,10 +72,6 @@
#define smp_wmb() __smp_wmb()
#endif
#ifndef smp_read_barrier_depends
#define smp_read_barrier_depends() __smp_read_barrier_depends()
#endif
#else /* !CONFIG_SMP */
#ifndef smp_mb
@@ -98,10 +86,6 @@
#define smp_wmb() barrier()
#endif
#ifndef smp_read_barrier_depends
#define smp_read_barrier_depends() do { } while (0)
#endif
#endif /* CONFIG_SMP */
#ifndef __smp_store_mb
@@ -196,7 +180,6 @@ do { \
#define virt_mb() __smp_mb()
#define virt_rmb() __smp_rmb()
#define virt_wmb() __smp_wmb()
#define virt_read_barrier_depends() __smp_read_barrier_depends()
#define virt_store_mb(var, value) __smp_store_mb(var, value)
#define virt_mb__before_atomic() __smp_mb__before_atomic()
#define virt_mb__after_atomic() __smp_mb__after_atomic()
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Prevent the compiler from merging or refetching reads or writes. The
* compiler is also forbidden from reordering successive instances of
* READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
* particular ordering. One way to make the compiler aware of ordering is to
* put the two invocations of READ_ONCE or WRITE_ONCE in different C
* statements.
*
* These two macros will also work on aggregate data types like structs or
* unions.
*
* Their two major use cases are: (1) Mediating communication between
* process-level code and irq/NMI handlers, all running on the same CPU,
* and (2) Ensuring that the compiler does not fold, spindle, or otherwise
* mutilate accesses that either do not require ordering or that interact
* with an explicit memory barrier or atomic instruction that provides the
* required ordering.
*/
#ifndef __ASM_GENERIC_RWONCE_H
#define __ASM_GENERIC_RWONCE_H
#ifndef __ASSEMBLY__
#include <linux/compiler_types.h>
#include <linux/kasan-checks.h>
#include <linux/kcsan-checks.h>
/*
* Yes, this permits 64-bit accesses on 32-bit architectures. These will
* actually be atomic in some cases (namely Armv7 + LPAE), but for others we
* rely on the access being split into 2x32-bit accesses for a 32-bit quantity
* (e.g. a virtual address) and a strong prevailing wind.
*/
#define compiletime_assert_rwonce_type(t) \
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
"Unsupported access size for {READ,WRITE}_ONCE().")
/*
* Use __READ_ONCE() instead of READ_ONCE() if you do not require any
* atomicity. Note that this may result in tears!
*/
#ifndef __READ_ONCE
#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
#endif
#define READ_ONCE(x) \
({ \
compiletime_assert_rwonce_type(x); \
__READ_ONCE(x); \
})
#define __WRITE_ONCE(x, val) \
do { \
*(volatile typeof(x) *)&(x) = (val); \
} while (0)
#define WRITE_ONCE(x, val) \
do { \
compiletime_assert_rwonce_type(x); \
__WRITE_ONCE(x, val); \
} while (0)
static __no_sanitize_or_inline
unsigned long __read_once_word_nocheck(const void *addr)
{
return __READ_ONCE(*(unsigned long *)addr);
}
/*
* Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need to load a
* word from memory atomically but without telling KASAN/KCSAN. This is
* usually used by unwinding code when walking the stack of a running process.
*/
#define READ_ONCE_NOCHECK(x) \
({ \
compiletime_assert(sizeof(x) == sizeof(unsigned long), \
"Unsupported access size for READ_ONCE_NOCHECK()."); \
(typeof(x))__read_once_word_nocheck(&(x)); \
})
static __no_kasan_or_inline
unsigned long read_word_at_a_time(const void *addr)
{
kasan_check_read(addr, 1);
return *(unsigned long *)addr;
}
#endif /* __ASSEMBLY__ */
#endif /* __ASM_GENERIC_RWONCE_H */
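A small usage sketch for the use case named in the header comment above (process-level code communicating with an interrupt handler on the same CPU); the flag name and both functions are hypothetical, not part of this commit:

static int data_ready;

/* interrupt context */
static irqreturn_t my_irq_handler(int irq, void *dev_id)
{
	WRITE_ONCE(data_ready, 1);	/* single, non-torn store the compiler cannot elide */
	return IRQ_HANDLED;
}

/* process context */
static void wait_for_data(void)
{
	while (!READ_ONCE(data_ready))	/* forces a fresh load on every iteration */
		cpu_relax();
}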
@@ -230,28 +230,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
# define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
#endif
/*
* Prevent the compiler from merging or refetching reads or writes. The
* compiler is also forbidden from reordering successive instances of
* READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
* particular ordering. One way to make the compiler aware of ordering is to
* put the two invocations of READ_ONCE or WRITE_ONCE in different C
* statements.
*
* These two macros will also work on aggregate data types like structs or
* unions.
*
* Their two major use cases are: (1) Mediating communication between
* process-level code and irq/NMI handlers, all running on the same CPU,
* and (2) Ensuring that the compiler does not fold, spindle, or otherwise
* mutilate accesses that either do not require ordering or that interact
* with an explicit memory barrier or atomic instruction that provides the
* required ordering.
*/
#include <asm/barrier.h>
#include <linux/kasan-checks.h>
#include <linux/kcsan-checks.h>
/**
 * data_race - mark an expression as containing intentional data races
 *
@@ -272,65 +250,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
__v; \
})
/*
* Use __READ_ONCE() instead of READ_ONCE() if you do not require any
* atomicity or dependency ordering guarantees. Note that this may result
* in tears!
*/
#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
#define __READ_ONCE_SCALAR(x) \
({ \
__unqual_scalar_typeof(x) __x = __READ_ONCE(x); \
smp_read_barrier_depends(); \
(typeof(x))__x; \
})
#define READ_ONCE(x) \
({ \
compiletime_assert_rwonce_type(x); \
__READ_ONCE_SCALAR(x); \
})
#define __WRITE_ONCE(x, val) \
do { \
*(volatile typeof(x) *)&(x) = (val); \
} while (0)
#define WRITE_ONCE(x, val) \
do { \
compiletime_assert_rwonce_type(x); \
__WRITE_ONCE(x, val); \
} while (0)
static __no_sanitize_or_inline
unsigned long __read_once_word_nocheck(const void *addr)
{
return __READ_ONCE(*(unsigned long *)addr);
}
/*
* Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need to load a
* word from memory atomically but without telling KASAN/KCSAN. This is
* usually used by unwinding code when walking the stack of a running process.
*/
#define READ_ONCE_NOCHECK(x) \
({ \
unsigned long __x; \
compiletime_assert(sizeof(x) == sizeof(__x), \
"Unsupported access size for READ_ONCE_NOCHECK()."); \
__x = __read_once_word_nocheck(&(x)); \
smp_read_barrier_depends(); \
(typeof(x))__x; \
})
static __no_kasan_or_inline
unsigned long read_word_at_a_time(const void *addr)
{
kasan_check_read(addr, 1);
return *(unsigned long *)addr;
}
#endif /* __KERNEL__ */
/*
@@ -354,57 +273,6 @@ static inline void *offset_to_ptr(const int *off)
#endif /* __ASSEMBLY__ */
/* Compile time object size, -1 for unknown */
#ifndef __compiletime_object_size
# define __compiletime_object_size(obj) -1
#endif
#ifndef __compiletime_warning
# define __compiletime_warning(message)
#endif
#ifndef __compiletime_error
# define __compiletime_error(message)
#endif
#ifdef __OPTIMIZE__
# define __compiletime_assert(condition, msg, prefix, suffix) \
do { \
extern void prefix ## suffix(void) __compiletime_error(msg); \
if (!(condition)) \
prefix ## suffix(); \
} while (0)
#else
# define __compiletime_assert(condition, msg, prefix, suffix) do { } while (0)
#endif
#define _compiletime_assert(condition, msg, prefix, suffix) \
__compiletime_assert(condition, msg, prefix, suffix)
/**
* compiletime_assert - break build and emit msg if condition is false
* @condition: a compile-time constant condition to check
* @msg: a message to emit if condition is false
*
* In tradition of POSIX assert, this macro will break the build if the
* supplied condition is *false*, emitting the supplied error message if the
* compiler has support to do so.
*/
#define compiletime_assert(condition, msg) \
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
#define compiletime_assert_atomic_type(t) \
compiletime_assert(__native_word(t), \
"Need native word sized stores/loads for atomicity.")
/*
* Yes, this permits 64-bit accesses on 32-bit architectures. These will
* actually be atomic in some cases (namely Armv7 + LPAE), but for others we
* rely on the access being split into 2x32-bit accesses for a 32-bit quantity
* (e.g. a virtual address) and a strong prevailing wind.
*/
#define compiletime_assert_rwonce_type(t) \
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
"Unsupported access size for {READ,WRITE}_ONCE().")
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
@@ -414,4 +282,6 @@ static inline void *offset_to_ptr(const int *off)
 */
#define prevent_tail_call_optimization() mb()
#include <asm/rwonce.h>
#endif /* __LINUX_COMPILER_H */
@@ -300,6 +300,47 @@ struct ftrace_likely_data {
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
/* Compile time object size, -1 for unknown */
#ifndef __compiletime_object_size
# define __compiletime_object_size(obj) -1
#endif
#ifndef __compiletime_warning
# define __compiletime_warning(message)
#endif
#ifndef __compiletime_error
# define __compiletime_error(message)
#endif
#ifdef __OPTIMIZE__
# define __compiletime_assert(condition, msg, prefix, suffix) \
do { \
extern void prefix ## suffix(void) __compiletime_error(msg); \
if (!(condition)) \
prefix ## suffix(); \
} while (0)
#else
# define __compiletime_assert(condition, msg, prefix, suffix) do { } while (0)
#endif
#define _compiletime_assert(condition, msg, prefix, suffix) \
__compiletime_assert(condition, msg, prefix, suffix)
/**
* compiletime_assert - break build and emit msg if condition is false
* @condition: a compile-time constant condition to check
* @msg: a message to emit if condition is false
*
* In tradition of POSIX assert, this macro will break the build if the
* supplied condition is *false*, emitting the supplied error message if the
* compiler has support to do so.
*/
#define compiletime_assert(condition, msg) \
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
#define compiletime_assert_atomic_type(t) \
compiletime_assert(__native_word(t), \
"Need native word sized stores/loads for atomicity.")
/* Helpers for emitting diagnostics in pragmas. */
#ifndef __diag
#define __diag(string)
......
@@ -5,6 +5,8 @@
#ifndef _LINUX_NOSPEC_H
#define _LINUX_NOSPEC_H
#include <linux/compiler.h>
#include <asm/barrier.h>
struct task_struct;
......
@@ -155,7 +155,7 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
 * between contaminating the pointer value, meaning that
 * READ_ONCE() is required when fetching it.
 *
 * The dependency ordering from the READ_ONCE() pairs
 * with smp_store_release() in __percpu_ref_switch_to_percpu().
 */
percpu_ptr = READ_ONCE(ref->percpu_count_ptr);
......
@@ -107,7 +107,7 @@ static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
return -ENOSPC;
/* Make sure the pointer we are storing points to a valid data. */
/* Pairs with the dependency ordering in __ptr_ring_consume. */
smp_wmb();
WRITE_ONCE(r->queue[r->producer++], ptr);
......
@@ -437,7 +437,7 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
 * of a chain of data-dependent loads, meaning most CPUs (alpha
 * being the notable exception) will already guarantee loads are
 * seen in-order. See the alpha page table accessors for the
 * smp_rmb() barriers in page table walking code.
 */
smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
......
@@ -5903,8 +5903,7 @@ sub process {
my $barriers = qr{
mb|
rmb|
wmb
read_barrier_depends
}x;
my $barrier_stems = qr{
mb__before_atomic|
@@ -5953,12 +5952,6 @@ sub process {
}
}
# check for smp_read_barrier_depends and read_barrier_depends
if (!$file && $line =~ /\b(smp_|)read_barrier_depends\s*\(/) {
WARN("READ_BARRIER_DEPENDS",
"$1read_barrier_depends should only be used in READ_ONCE or DEC Alpha code\n" . $herecurr);
}
# check of hardware specific defines
if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) {
CHK("ARCH_DEFINES",
......
@@ -9,7 +9,8 @@ MAKE = make
INSTALL ?= install
CFLAGS += -Wall -O2
CFLAGS += -D__EXPORTED_HEADERS__ -I$(srctree)/tools/include/uapi \
	  -I$(srctree)/tools/include
# This will work when bpf is built in tools env. where srctree
# isn't set and when invoked from selftests build, where srctree
......
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
/*
* Linux Socket Filter Data Structures
*/
#ifndef __LINUX_FILTER_H__
#define __LINUX_FILTER_H__
#include <linux/types.h>
#include <linux/bpf_common.h>
/*
* Current version of the filter code architecture.
*/
#define BPF_MAJOR_VERSION 1
#define BPF_MINOR_VERSION 1
/*
* Try and keep these values and structures similar to BSD, especially
* the BPF code definitions which need to match so you can share filters
*/
struct sock_filter { /* Filter block */
__u16 code; /* Actual filter code */
__u8 jt; /* Jump true */
__u8 jf; /* Jump false */
__u32 k; /* Generic multiuse field */
};
struct sock_fprog { /* Required for SO_ATTACH_FILTER. */
unsigned short len; /* Number of filter blocks */
struct sock_filter *filter;
};
/* ret - BPF_K and BPF_X also apply */
#define BPF_RVAL(code) ((code) & 0x18)
#define BPF_A 0x10
/* misc */
#define BPF_MISCOP(code) ((code) & 0xf8)
#define BPF_TAX 0x00
#define BPF_TXA 0x80
/*
* Macros for filter block array initializers.
*/
#ifndef BPF_STMT
#define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
#endif
#ifndef BPF_JUMP
#define BPF_JUMP(code, k, jt, jf) { (unsigned short)(code), jt, jf, k }
#endif
/*
* Number of scratch memory words for: BPF_ST and BPF_STX
*/
#define BPF_MEMWORDS 16
/* RATIONALE. Negative offsets are invalid in BPF.
We use them to reference ancillary data.
Unlike introduction new instructions, it does not break
existing compilers/optimizers.
*/
#define SKF_AD_OFF (-0x1000)
#define SKF_AD_PROTOCOL 0
#define SKF_AD_PKTTYPE 4
#define SKF_AD_IFINDEX 8
#define SKF_AD_NLATTR 12
#define SKF_AD_NLATTR_NEST 16
#define SKF_AD_MARK 20
#define SKF_AD_QUEUE 24
#define SKF_AD_HATYPE 28
#define SKF_AD_RXHASH 32
#define SKF_AD_CPU 36
#define SKF_AD_ALU_XOR_X 40
#define SKF_AD_VLAN_TAG 44
#define SKF_AD_VLAN_TAG_PRESENT 48
#define SKF_AD_PAY_OFFSET 52
#define SKF_AD_RANDOM 56
#define SKF_AD_VLAN_TPID 60
#define SKF_AD_MAX 64
#define SKF_NET_OFF (-0x100000)
#define SKF_LL_OFF (-0x200000)
#define BPF_NET_OFF SKF_NET_OFF
#define BPF_LL_OFF SKF_LL_OFF
#endif /* __LINUX_FILTER_H__ */
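As a usage illustration of the BPF_STMT()/BPF_JUMP() initializers in this header, here is a classic BPF program that accepts IPv4 frames and drops everything else (standard classic-BPF idiom assuming an Ethernet device; the BPF_LD/BPF_JMP/BPF_RET constants come from the included linux/bpf_common.h; not part of this commit):

struct sock_filter ipv4_only[] = {
	/* A = halfword at absolute offset 12 (the Ethernet protocol field) */
	BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
	/* if (A == 0x0800, i.e. ETH_P_IP) fall through, else skip one instruction */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),
	/* accept: return up to 64KiB of the packet */
	BPF_STMT(BPF_RET | BPF_K, 0xffff),
	/* drop */
	BPF_STMT(BPF_RET | BPF_K, 0),
};

struct sock_fprog ipv4_only_prog = {
	.len	= 4,
	.filter	= ipv4_only,
};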
@@ -1122,12 +1122,10 @@ maintain at least the appearance of FIFO order.
In practice, this difficulty is solved by inserting a special fence
between P1's two loads when the kernel is compiled for the Alpha
architecture.  In fact, as of version 4.15, the kernel automatically
adds this fence after every READ_ONCE() and atomic load on Alpha.  The
effect of the fence is to cause the CPU not to execute any po-later
instructions until after the local cache has finished processing all
the stores it has already received.  Thus, if the code was changed to:
P1()
{
@@ -1146,14 +1144,14 @@ READ_ONCE() or another synchronization primitive rather than accessed
directly.
The LKMM requires that smp_rmb(), acquire fences, and strong fences
share this property: They do not allow the CPU to execute any po-later
instructions (or po-later loads in the case of smp_rmb()) until all
outstanding stores have been processed by the local cache.  In the
case of a strong fence, the CPU first has to wait for all of its
po-earlier stores to propagate to every other CPU in the system; then
it has to wait for the local cache to process all the stores received
as of that time -- not just the stores received when the strong fence
began.
And of course, none of this matters for any architecture other than
Alpha.
......
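The property described above is what the message-passing litmus tests in tools/memory-model/litmus-tests exercise; a sketch in that style, closely modeled on the in-tree MP+fencewmbonceonce+fencermbonceonce test (treat it as illustrative rather than a verbatim copy):

C MP+fencewmbonceonce+fencermbonceonce

(*
 * Result: Never
 *
 * smp_wmb() and smp_rmb() provide sufficient ordering for the
 * message-passing pattern: the reader cannot observe the flag set
 * while still seeing the old value of the buffer.
 *)

{}

P0(int *buf, int *flag)
{
	WRITE_ONCE(*buf, 1);
	smp_wmb();
	WRITE_ONCE(*flag, 1);
}

P1(int *buf, int *flag)
{
	int r0;
	int r1;

	r0 = READ_ONCE(*flag);
	smp_rmb();
	r1 = READ_ONCE(*buf);
}

exists (1:r0=1 /\ 1:r1=0)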