Commit 4f612ed3 authored by Marco Elver, committed by Linus Torvalds

kfence: default to dynamic branch instead of static keys mode

We have observed that on very large machines with newer CPUs, the static
key/branch switching delay is on the order of milliseconds.  This is due
to the required broadcast IPIs, which simply do not scale well to
hundreds of CPUs (cores).  If done too frequently, this can adversely
affect tail latencies of various workloads.

One workaround is to increase the sample interval to several seconds,
at the cost of decreased sampled allocation coverage, but the problem
still exists and could still increase tail latencies.

As already noted in the Kconfig help text, there are trade-offs: at
lower sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.

Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better.  Evaluating and observing the possible system-wide side-effects
of the static-key-switching induced broadcast IPIs, however, was a blind
spot (in particular on large machines with 100s of cores).

Therefore, a major downside of the static keys mode is, unfortunately,
that it is hard to predict performance on new system architectures and
topologies, and equally hard to draw conclusions about the performance
of new workloads from a limited set of benchmarks.

Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures.  As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.

For reference, on x86-64 the condition in kfence_alloc() generates
exactly 2 instructions in the kmem_cache_alloc() fast-path:

 | ...
 | cmpl   $0x0,0x1a8021c(%rip)  # ffffffff82d560d0 <kfence_allocation_gate>
 | je     ffffffff812d6003      <kmem_cache_alloc+0x243>
 | ...

which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.
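
To make the dynamic-branch gate concrete, the following is a minimal userspace sketch in C11, not the kernel's implementation. The names allocation_gate, should_sample(), claim_guarded_slot() and open_gate() are hypothetical stand-ins chosen for illustration; the only property taken from the text above is that the fast path compares a rarely modified gate variable against zero and branches.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for kfence_allocation_gate.
 * Non-zero: gate closed, allocations take the normal path.
 * Zero:     gate open, the next allocation may be sampled.
 */
static atomic_int allocation_gate = 1;

/*
 * Hypothetical slow path: several threads may observe an open gate at
 * once, so only the caller that raises the gate from 0 claims the slot.
 */
static bool claim_guarded_slot(void)
{
	return atomic_fetch_add(&allocation_gate, 1) == 0;
}

/*
 * Fast-path check: a plain load, compare and branch. Because the gate
 * is written rarely, the branch is taken the same way almost always.
 */
static inline bool should_sample(void)
{
	if (atomic_load_explicit(&allocation_gate, memory_order_relaxed) != 0)
		return false;	/* common case: gate closed */
	return claim_guarded_slot();
}

/* Assumed to be called by a timer when the sample interval expires. */
static void open_gate(void)
{
	atomic_store(&allocation_gate, 0);
}
```

In this sketch, opening the gate is a single store to shared memory, so no other CPU has to be interrupted. That is the trade-off against static keys, where each toggle patches code and requires broadcast IPIs.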

Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 07e8481d
@@ -231,10 +231,14 @@ Guarded allocations are set up based on the sample interval. After expiration
 of the sample interval, the next allocation through the main allocator (SLAB or
 SLUB) returns a guarded allocation from the KFENCE object pool (allocation
 sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and
-the next allocation is set up after the expiration of the interval. To "gate" a
-KFENCE allocation through the main allocator's fast-path without overhead,
-KFENCE relies on static branches via the static keys infrastructure. The static
-branch is toggled to redirect the allocation to KFENCE.
+the next allocation is set up after the expiration of the interval.
+
+When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated"
+through the main allocator's fast-path by relying on static branches via the
+static keys infrastructure. The static branch is toggled to redirect the
+allocation to KFENCE. Depending on sample interval, target workloads, and
+system architecture, this may perform better than the simple dynamic branch.
+Careful benchmarking is recommended.

 KFENCE objects each reside on a dedicated page, at either the left or right
 page boundaries selected at random. The pages to the left and right of the
@@ -25,17 +25,6 @@ menuconfig KFENCE

 if KFENCE

-config KFENCE_STATIC_KEYS
-	bool "Use static keys to set up allocations"
-	default y
-	depends on JUMP_LABEL # To ensure performance, require jump labels
-	help
-	  Use static keys (static branches) to set up KFENCE allocations. Using
-	  static keys is normally recommended, because it avoids a dynamic
-	  branch in the allocator's fast path. However, with very low sample
-	  intervals, or on systems that do not support jump labels, a dynamic
-	  branch may still be an acceptable performance trade-off.
-
 config KFENCE_SAMPLE_INTERVAL
 	int "Default sample interval in milliseconds"
 	default 100
@@ -56,6 +45,21 @@ config KFENCE_NUM_OBJECTS
 	  pages are required; with one containing the object and two adjacent
 	  ones used as guard pages.

+config KFENCE_STATIC_KEYS
+	bool "Use static keys to set up allocations" if EXPERT
+	depends on JUMP_LABEL
+	help
+	  Use static keys (static branches) to set up KFENCE allocations. This
+	  option is only recommended when using very large sample intervals, or
+	  performance has carefully been evaluated with this option.
+
+	  Using static keys comes with trade-offs that need to be carefully
+	  evaluated given target workloads and system architectures. Notably,
+	  enabling and disabling static keys invoke IPI broadcasts, the latency
+	  and impact of which is much harder to predict than a dynamic branch.
+
+	  Say N if you are unsure.
+
 config KFENCE_STRESS_TEST_FAULTS
 	int "Stress testing of fault handling and error reporting" if EXPERT
 	default 0