    kfence: default to dynamic branch instead of static keys mode · 4f612ed3
    Marco Elver authored
    We have observed that on very large machines with newer CPUs, the static
    key/branch switching delay is on the order of milliseconds.  This is due
    to the required broadcast IPIs, which simply do not scale well to
    hundreds of CPUs (cores).  If done too frequently, this can adversely
    affect tail latencies of various workloads.
    
    One workaround is to increase the sample interval to several seconds,
    at the cost of reduced sampled allocation coverage, but the problem
    still exists and could still increase tail latencies.
    
    As already noted in the Kconfig help text, there are trade-offs: at
    lower sample intervals the dynamic branch results in better performance;
    however, at very large sample intervals, the static keys mode can result
    in better performance -- careful benchmarking is recommended.
    
    Our initial benchmarking showed that with large enough sample intervals
    and workloads stressing the allocator, the static keys mode was slightly
    better.  Evaluating and observing the possible system-wide side-effects
    of the static-key-switching induced broadcast IPIs, however, was a blind
    spot (in particular on large machines with 100s of cores).
    
    Therefore, a major downside of the static keys mode is, unfortunately,
    that it is hard to predict performance on new system architectures and
    topologies, and equally hard to draw conclusions about the performance
    of new workloads from a limited set of benchmarks.
    
    Most distributions will simply select the defaults, while targeting a
    large variety of different workloads and system architectures.  As such,
    the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
    only recommended after careful evaluation.
    
    For reference, on x86-64 the condition in kfence_alloc() generates
    exactly 2 instructions in the kmem_cache_alloc() fast-path:
    
     | ...
     | cmpl   $0x0,0x1a8021c(%rip)  # ffffffff82d560d0 <kfence_allocation_gate>
     | je     ffffffff812d6003      <kmem_cache_alloc+0x243>
     | ...
    
    which, given kfence_allocation_gate is infrequently modified, should be
    well predicted by most CPUs.
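
The dynamic branch described above can be modelled in userspace as a
plain load-and-compare on a gate variable.  This is only a simplified
sketch, not the kernel implementation: the names allocation_gate,
sample_alloc(), sample_alloc_slow(), open_gate() and guarded_object are
hypothetical stand-ins, and C11 atomics are assumed in place of the
kernel's own atomic helpers.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Models kfence_allocation_gate: zero means the next allocation may be sampled. */
static atomic_int allocation_gate = 1;

/* Hypothetical stand-in for a guarded object handed out by the slow path. */
static char guarded_object[64];

/* Slow path: close the gate so only one allocation is sampled per interval. */
static void *sample_alloc_slow(void)
{
	int expected = 0;

	/* Only the caller that moves the gate from 0 to 1 gets the sample. */
	if (!atomic_compare_exchange_strong(&allocation_gate, &expected, 1))
		return NULL;
	return guarded_object;
}

/* Fast path: a single load and compare, with no code patching involved. */
static inline void *sample_alloc(void)
{
	if (atomic_load_explicit(&allocation_gate, memory_order_relaxed) != 0)
		return NULL;	/* common case: gate closed, use the normal allocator */
	return sample_alloc_slow();
}

/* Stand-in for the periodic timer that reopens the gate each sample interval. */
static void open_gate(void)
{
	atomic_store(&allocation_gate, 0);
}
```

Because the gate is written only when it is opened or closed, the check
in sample_alloc() almost always takes the same direction, which is what
makes it cheap for a branch predictor.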
    
    Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
    Signed-off-by: Marco Elver <elver@google.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Jann Horn <jannh@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>