• Anton Protopopov's avatar
    bpf: optimize hashmap lookups when key_size is divisible by 4 · 5b85575a
    Anton Protopopov authored
    The BPF hashmap uses the jhash() hash function. There is an optimized version
    of this hash function which may be used if hash size is a multiple of 4. Apply
    this optimization to the hashmap in a similar way as it is done in the bloom
    filter map.
    
    On practice the optimization is only noticeable for smaller key sizes, which,
    however, is sufficient for many applications. An example is listed in the
    following table of measurements (a hashmap of 65536 elements was used):
    
        --------------------------------------------------------------------
        | key_size | fullness | lookups /sec | lookups (opt) /sec |   gain |
        --------------------------------------------------------------------
        |        4 |      25% |      42.990M |            46.000M |   7.0% |
        |        4 |      50% |      37.910M |            39.094M |   3.1% |
        |        4 |      75% |      34.486M |            36.124M |   4.7% |
        |        4 |     100% |      31.760M |            32.719M |   3.0% |
        --------------------------------------------------------------------
        |        8 |      25% |      43.855M |            49.626M |  13.2% |
        |        8 |      50% |      38.328M |            42.152M |  10.0% |
        |        8 |      75% |      34.483M |            38.088M |  10.5% |
        |        8 |     100% |      31.306M |            34.686M |  10.8% |
        --------------------------------------------------------------------
        |       12 |      25% |      38.398M |            43.770M |  14.0% |
        |       12 |      50% |      33.336M |            37.712M |  13.1% |
        |       12 |      75% |      29.917M |            34.440M |  15.1% |
        |       12 |     100% |      27.322M |            30.480M |  11.6% |
        --------------------------------------------------------------------
        |       16 |      25% |      41.491M |            41.921M |   1.0% |
        |       16 |      50% |      36.206M |            36.474M |   0.7% |
        |       16 |      75% |      32.529M |            33.027M |   1.5% |
        |       16 |     100% |      29.581M |            30.325M |   2.5% |
        --------------------------------------------------------------------
        |       20 |      25% |      34.240M |            36.787M |   7.4% |
        |       20 |      50% |      30.328M |            32.663M |   7.7% |
        |       20 |      75% |      27.536M |            29.354M |   6.6% |
        |       20 |     100% |      24.847M |            26.505M |   6.7% |
        --------------------------------------------------------------------
        |       24 |      25% |      36.329M |            40.608M |  11.8% |
        |       24 |      50% |      31.444M |            35.059M |  11.5% |
        |       24 |      75% |      28.426M |            31.452M |  10.6% |
        |       24 |     100% |      26.278M |            28.741M |   9.4% |
        --------------------------------------------------------------------
        |       28 |      25% |      31.540M |            31.944M |   1.3% |
        |       28 |      50% |      27.739M |            28.063M |   1.2% |
        |       28 |      75% |      24.993M |            25.814M |   3.3% |
        |       28 |     100% |      23.513M |            23.500M |  -0.1% |
        --------------------------------------------------------------------
        |       32 |      25% |      32.116M |            33.953M |   5.7% |
        |       32 |      50% |      28.879M |            29.859M |   3.4% |
        |       32 |      75% |      26.227M |            26.948M |   2.7% |
        |       32 |     100% |      23.829M |            24.613M |   3.3% |
        --------------------------------------------------------------------
        |       64 |      25% |      22.535M |            22.554M |   0.1% |
        |       64 |      50% |      20.471M |            20.675M |   1.0% |
        |       64 |      75% |      19.077M |            19.146M |   0.4% |
        |       64 |     100% |      17.710M |            18.131M |   2.4% |
        --------------------------------------------------------------------
    
    The following script was used to gather the results (SMT & frequency off):
    
        cd tools/testing/selftests/bpf
        for key_size in 4 8 12 16 20 24 28 32 64; do
                for nr_entries in `seq 16384 16384 65536`; do
                        fullness=$(printf '%3s' $((nr_entries*100/65536)))
                        echo -n "key_size=$key_size: $fullness% full: "
                        sudo ./bench -d2 -a bpf-hashmap-lookup --key_size=$key_size --nr_entries=$nr_entries --max_entries=65536 --nr_loops=2000000 --map_flags=0x40 | grep cpu
                done
                echo
        done
    Signed-off-by: default avatarAnton Protopopov <aspsk@isovalent.com>
    Link: https://lore.kernel.org/r/20230401200602.3275-1-aspsk@isovalent.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    5b85575a
hashtab.c 67.8 KB