    kprobes: kretprobe scalability improvement · 4bbd9345
    wuqiang.matt authored
    kretprobe uses freelist to manage its return instances, but freelist,
    a LIFO queue built on a singly linked list, scales badly and reduces
    the overall throughput of kretprobed routines, especially under high
    contention.
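
    To make the contention point concrete, here is a minimal userspace
    sketch of a LIFO built on a singly linked list, written with C11
    atomics. It is not the kernel's freelist code: struct node,
    struct lifo and the lifo_* helpers are hypothetical names chosen for
    illustration. The naive pop is also not ABA-safe, so a real
    implementation needs extra protection before nodes can be reused
    concurrently.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the kernel's freelist API. */
struct node {
    struct node *next;
    int id;
};

struct lifo {
    /* The one shared word that every push and pop must update. */
    _Atomic(struct node *) head;
};

void lifo_init(struct lifo *l)
{
    atomic_init(&l->head, NULL);
}

/* Push: link the node in front of the current head, retrying on conflict. */
void lifo_push(struct lifo *l, struct node *n)
{
    struct node *old = atomic_load_explicit(&l->head, memory_order_relaxed);

    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(&l->head, &old, n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

/*
 * Pop: detach the current head, retrying on conflict.
 * Assumption: nodes are never freed or reused while another thread may
 * still be reading them; without that guarantee this pop is ABA-prone.
 */
struct node *lifo_pop(struct lifo *l)
{
    struct node *old = atomic_load_explicit(&l->head, memory_order_acquire);

    while (old &&
           !atomic_compare_exchange_weak_explicit(&l->head, &old, old->next,
                                                  memory_order_acquire,
                                                  memory_order_acquire))
        ;
    return old;
}
```

    Every push and pop retries a compare-and-swap on the same head
    pointer, so all CPUs compete for a single cache line. That matches
    the freelist numbers below, which drop as threads are added.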
    
    Here's a typical throughput test of sys_prctl (counts in 10 seconds,
    measured with perf stat -a -I 10000 -e syscalls:sys_enter_prctl):
    
    OS: Debian 10 X86_64, Linux 6.5rc7 with freelist
    HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
    
             1T       2T       4T       8T      16T      24T
       24150045 29317964 15446741 12494489 18287272 17708768
            32T      48T      64T      72T      96T     128T
       16200682 13737658 11645677 11269858 10470118  9931051
    
    This patch replaces freelist with objpool. objpool is a
    high-performance queue that brings near-linear scalability
    to kretprobed routines. In the kretprobe throughput tests, objpool
    reaches up to 159x the throughput of the original freelist (at
    128T). Here are the results:
    
                      1T         2T         4T         8T        16T
    native:     41186213   82336866  164250978  328662645  658810299
    freelist:   24150045   29317964   15446741   12494489   18287272
    objpool:    23926730   48010314   96125218  191782984  385091769
                     32T        48T        64T        96T       128T
    native:   1330338351 1969957941 2512291791 2615754135 2671040914
    freelist:   16200682   13737658   11645677   10470118    9931051
    objpool:   764481096 1147149781 1456220214 1502109662 1579015050
    
    Tests on a 96-core ARM64 machine show similar results, with the
    biggest ratio reaching 448x:
    
    OS: Debian 10 AARCH64, Linux 6.5rc7
    HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
    
                      1T         2T         4T         8T        16T
    native:     30066096   63569843  126194076  257447289  505800181
    freelist:   16152090   11064397   11124068    7215768    5663013
    objpool:    13997541   28032100   55726624  110099926  221498787
                     24T        32T        48T        64T        96T
    native:    763305277 1015925192 1521075123 2033009392 3021013752
    freelist:    5015810    4602893    3766792    3382478    2945292
    objpool:   328192025  439439564  668534502  887401381 1319972072
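
    The quoted ratios (159x and 448x) can be checked from the tables
    above by dividing objpool throughput by freelist throughput at the
    highest thread count. The sketch below does that arithmetic; the
    function and constant names are hypothetical, and the figures are
    copied from the tables.

```c
#include <stdio.h>

/* Throughput ratio of objpool over freelist for one thread count. */
double speedup(double objpool, double freelist)
{
    return objpool / freelist;
}

/* Counts at the highest thread count, copied from the tables above. */
const double x86_objpool_128t  = 1579015050.0;
const double x86_freelist_128t =    9931051.0;
const double arm_objpool_96t   = 1319972072.0;
const double arm_freelist_96t  =    2945292.0;

/* Print both ratios rounded to the nearest integer. */
void print_speedups(void)
{
    printf("x86_64, 128T: %.0fx\n",
           speedup(x86_objpool_128t, x86_freelist_128t));
    printf("arm64,   96T: %.0fx\n",
           speedup(arm_objpool_96t, arm_freelist_96t));
}
```

    Calling print_speedups() from a main() reports the two ratios. Both
    peaks occur at the highest thread count, where freelist throughput
    is lowest and objpool throughput is highest.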
    
    Link: https://lore.kernel.org/all/20231017135654.82270-4-wuqiang.matt@bytedance.com/
    Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>