    bpf: Support inlining bpf_kptr_xchg() helper · 7c05e7f3
    Hou Tao authored
    The motivation of inlining bpf_kptr_xchg() comes from the performance
    profiling of bpf memory allocator benchmark. The benchmark uses
    bpf_kptr_xchg() to stash the allocated objects and to pop the stashed
    objects for free. After inlining bpf_kptr_xchg(), the performance of
    object free on an 8-CPU VM increases by about 2%~10%. Inlining also has
    a downside: both the kasan and kcsan checks on the pointer will be
    unavailable.
    
    bpf_kptr_xchg() can be inlined by converting the call to
    bpf_kptr_xchg() into an atomic_xchg() instruction. But the conversion
    depends on two conditions:
    1) The JIT backend supports atomic_xchg() on pointer-sized words.
    2) For the specific arch, the implementation of xchg() is the same as
       atomic_xchg() on pointer-sized words.
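
    As a rough userspace illustration of the semantics being relied on
    (not the kernel implementation), the exchange can be modeled with
    C11 atomic_exchange() on a pointer-sized word. The names stash_slot,
    model_kptr_xchg(), stash_object() and pop_object() are hypothetical
    and exist only for this sketch:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a map value holding one stashed pointer. */
struct stash_slot {
    _Atomic uintptr_t ptr; /* pointer-sized word */
};

/* Userspace model of the exchange: store new_ptr, return the old pointer. */
static void *model_kptr_xchg(struct stash_slot *slot, void *new_ptr)
{
    uintptr_t old = atomic_exchange(&slot->ptr, (uintptr_t)new_ptr);

    return (void *)old;
}

/* Stash an object; returns whatever was stashed before (may be NULL). */
static void *stash_object(struct stash_slot *slot, void *obj)
{
    return model_kptr_xchg(slot, obj);
}

/* Pop the stashed object, leaving the slot empty. */
static void *pop_object(struct stash_slot *slot)
{
    return model_kptr_xchg(slot, NULL);
}
```

    Stashing and popping are the same single exchange with different
    arguments, which is why replacing the helper call with one atomic
    instruction is enough when the two conditions above hold.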
    
    It seems most 64-bit JIT backends satisfy these two conditions. But
    as a precaution, define a weak function bpf_jit_supports_ptr_xchg()
    to state whether such a conversion is safe, and only support inlining
    on 64-bit hosts.
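
    The weak-function precaution can be sketched in plain C as follows.
    This is a userspace model using the GCC/Clang weak attribute, not
    the kernel code: only the name bpf_jit_supports_ptr_xchg() comes
    from this commit, while inline_allowed() and may_inline_kptr_xchg()
    are hypothetical helpers, and the real check may involve further
    conditions not shown here:

```c
#include <stdbool.h>

/*
 * Weak default: a backend that does not override this symbol is treated
 * as unsafe. A JIT backend that meets both conditions would provide a
 * strong definition returning true, which the linker prefers.
 */
__attribute__((weak)) bool bpf_jit_supports_ptr_xchg(void)
{
    return false;
}

/* Hypothetical pure form of the gate: 64-bit host and backend opt-in. */
static bool inline_allowed(unsigned int bits_per_long, bool backend_ok)
{
    return bits_per_long == 64 && backend_ok;
}

/* Hypothetical gate as a caller would use it. */
static bool may_inline_kptr_xchg(unsigned int bits_per_long)
{
    return inline_allowed(bits_per_long, bpf_jit_supports_ptr_xchg());
}
```

    With only the weak default linked in, may_inline_kptr_xchg() stays
    false on every host, so a backend has to opt in explicitly.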
    
    x86-64 supports the BPF_XCHG atomic operation, and both xchg() and
    atomic_xchg() use arch_xchg() to implement the exchange, so enable
    inlining of bpf_kptr_xchg() on x86-64 first.

    Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Hou Tao <houtao1@huawei.com>
    Link: https://lore.kernel.org/r/20240105104819.3916743-2-houtao@huaweicloud.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
core.c 76.3 KB