Merge branch 'bpf-inline-bpf_kptr_xchg'
Hou Tao says: ==================== The motivation of inlining bpf_kptr_xchg() comes from the performance profiling of bpf memory allocator benchmark [1]. The benchmark uses bpf_kptr_xchg() to stash the allocated objects and to pop the stashed objects for free. After inling bpf_kptr_xchg(), the performance for object free on 8-CPUs VM increases about 2%~10%. However the performance gain comes with costs: both the kasan and kcsan checks on the pointer will be unavailable. Initially the inline is implemented in do_jit() for x86-64 directly, but I think it will more portable to implement the inline in verifier. Patch #1 supports inlining bpf_kptr_xchg() helper and enables it on x86-4. Patch #2 factors out a helper for newly-added test in patch #3. Patch #3 tests whether the inlining of bpf_kptr_xchg() is expected. Please see individual patches for more details. And comments are always welcome. Change Log: v3: * rebased on bpf-next tree * patch 1 & 2: Add Rvb-by and Ack-by tags from Eduard * patch 3: use inline assembly and naked function instead of c code (suggested by Eduard) v2: https://lore.kernel.org/bpf/20231223104042.1432300-1-houtao@huaweicloud.com/ * rebased on bpf-next tree * drop patch #1 in v1 due to discussion in [2] * patch #1: add the motivation in the commit message, merge patch #1 and #3 into the new patch in v2. (Daniel) * patch #2/#3: newly-added patch to test the inlining of bpf_kptr_xchg() (Eduard) v1: https://lore.kernel.org/bpf/95b8c2cd-44d5-5fe1-60b5-7e8218779566@huaweicloud.com/ [1]: https://lore.kernel.org/bpf/20231221141501.3588586-1-houtao@huaweicloud.com/ [2]: https://lore.kernel.org/bpf/fd94efb9-4a56-c982-dc2e-c66be5202cb7@huaweicloud.com/ ==================== Link: https://lore.kernel.org/r/20240105104819.3916743-1-houtao@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
Showing
Please register or sign in to comment