Commit 093ae8f9 authored by Borislav Petkov

x86/TSC: Use RDTSCP

Currently, the kernel uses

  [LM]FENCE; RDTSC

in the timekeeping code, to guarantee monotonicity of time where the
*FENCE is selected based on vendor.

Replace that sequence with RDTSCP which is faster or on-par and gives
the same guarantees.

A microbenchmark on Intel shows that the change is on-par.

On AMD, the change is either on-par with the current LFENCE-prefixed
RDTSC or slightly better with RDTSCP.

The comparison is done with the LFENCE-prefixed RDTSC (and not with the
MFENCE-prefixed one, as one would normally expect) because all modern
AMD families make LFENCE serializing and thus avoid the heavy MFENCE by
effectively enabling X86_FEATURE_LFENCE_RDTSC.

Co-developed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20181119184556.11479-1-bp@alien8.de
parent 71a93c26
@@ -217,6 +217,8 @@ static __always_inline unsigned long long rdtsc(void)
  */
 static __always_inline unsigned long long rdtsc_ordered(void)
 {
+	DECLARE_ARGS(val, low, high);
+
 	/*
 	 * The RDTSC instruction is not ordered relative to memory
 	 * access. The Intel SDM and the AMD APM are both vague on this
@@ -227,9 +229,19 @@ static __always_inline unsigned long long rdtsc_ordered(void)
 	 * ordering guarantees as reading from a global memory location
 	 * that some other imaginary CPU is updating continuously with a
 	 * time stamp.
+	 *
+	 * Thus, use the preferred barrier on the respective CPU, aiming for
+	 * RDTSCP as the default.
 	 */
-	barrier_nospec();
-	return rdtsc();
+	asm volatile(ALTERNATIVE_3("rdtsc",
+				   "mfence; rdtsc", X86_FEATURE_MFENCE_RDTSC,
+				   "lfence; rdtsc", X86_FEATURE_LFENCE_RDTSC,
+				   "rdtscp", X86_FEATURE_RDTSCP)
+			: EAX_EDX_RET(val, low, high)
+			/* RDTSCP clobbers ECX with MSR_TSC_AUX. */
+			:: "ecx");
+
+	return EAX_EDX_VAL(val, low, high);
 }
 
 static inline unsigned long long native_read_pmc(int counter)
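
For readers who want to try the two sequences outside the kernel, the following is a minimal user-space sketch, not the kernel implementation. It assumes an x86 CPU that supports RDTSCP and a GCC or Clang toolchain providing <x86intrin.h>. The helper names tsc_lfence_rdtsc(), tsc_rdtscp() and count_backward_steps() are hypothetical and exist only for illustration. The kernel selects the sequence at boot through ALTERNATIVE patching, which this sketch does not attempt to reproduce.

```c
#include <stdint.h>
#include <x86intrin.h>	/* __rdtsc(), __rdtscp(), _mm_lfence() */

/* Fence-prefixed read: LFENCE; RDTSC, the sequence being replaced. */
static inline uint64_t tsc_lfence_rdtsc(void)
{
	_mm_lfence();
	return __rdtsc();
}

/*
 * RDTSCP read. The instruction also writes the TSC_AUX MSR value to ECX,
 * which the intrinsic hands back through the pointer argument. This is
 * the reason the kernel asm lists "ecx" as a clobber.
 */
static inline uint64_t tsc_rdtscp(void)
{
	unsigned int aux;	/* TSC_AUX value, unused in this sketch */

	return __rdtscp(&aux);
}

/*
 * Hypothetical helper: call read_tsc() repeatedly and count how often a
 * later read returned a smaller value than the read before it.
 */
static unsigned long count_backward_steps(uint64_t (*read_tsc)(void),
					  unsigned long iterations)
{
	unsigned long backward = 0;
	uint64_t prev = read_tsc();

	for (unsigned long i = 0; i < iterations; i++) {
		uint64_t now = read_tsc();

		if (now < prev)
			backward++;
		prev = now;
	}

	return backward;
}
```

Calling count_backward_steps() with either reader on a single thread is expected to report no backward steps on a system with a stable, synchronized TSC. Timing the same loop for both readers gives a rough per-read cost comparison, though such numbers depend heavily on the CPU model and should not be read as a reproduction of the microbenchmark mentioned in the commit message.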