1. 12 May, 2024 5 commits
    • bpf, arm64: inline bpf_get_smp_processor_id() helper · 75fe4c0b
      Puranjay Mohan authored
      Inline calls to bpf_get_smp_processor_id() helper in the JIT by emitting
      a read from struct thread_info. The SP_EL0 system register holds the
      pointer to the task_struct and thread_info is the first member of this
      struct. We can read the cpu number from the thread_info.
      
      Here is how the ARM64 JITed assembly changes after this commit:
      
                                            ARM64 JIT
                                           ===========
      
                    BEFORE                                    AFTER
                   --------                                  -------
      
      int cpu = bpf_get_smp_processor_id();        int cpu = bpf_get_smp_processor_id();
      
      mov     x10, #0xfffffffffffff4d0             mrs     x10, sp_el0
      movk    x10, #0x802b, lsl #16                ldr     w7, [x10, #24]
      movk    x10, #0x8000, lsl #32
      blr     x10
      add     x7, x0, #0x0
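
The two-instruction sequence works because thread_info sits at offset 0 of task_struct, so the task pointer doubles as a thread_info pointer. A minimal user-space sketch of that layout trick follows; the struct names and fields are hypothetical simplifications, not the kernel's real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts; the real kernel structs differ. */
struct thread_info_sketch {
	unsigned long flags;
	uint64_t      placeholder_a;
	uint64_t      placeholder_b;
	uint32_t      cpu;
};

struct task_struct_sketch {
	struct thread_info_sketch thread_info; /* must stay the first member */
	int other_state;
};

/* The inlined load is only valid while thread_info is at offset 0. */
static_assert(offsetof(struct task_struct_sketch, thread_info) == 0,
	      "thread_info must be the first member");

/* Mimics "ldr w7, [x10, #off]": a 32-bit load at a fixed offset from the
 * task pointer (which the JIT obtains with "mrs x10, sp_el0"). */
static uint32_t read_cpu_from_task(const struct task_struct_sketch *task)
{
	const unsigned char *base = (const unsigned char *)task;
	uint32_t cpu;

	assert(task != NULL);
	memcpy(&cpu, base + offsetof(struct thread_info_sketch, cpu), sizeof(cpu));
	return cpu;
}
```

In this sketch the offset is computed with offsetof() rather than hard-coded, since the value depends on the struct layout.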
      
                     Performance improvement using benchmark[1]
      
      ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
      
      +---------------+-------------------+-------------------+--------------+
      |      Name     |      Before       |        After      |   % change   |
      |---------------+-------------------+-------------------+--------------|
      | glob-arr-inc  | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s |   + 10.74%   |
      | arr-inc       | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s |   + 5.37%    |
      | hash-inc      | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s |   + 2.08%    |
      +---------------+-------------------+-------------------+--------------+
      
      [1] https://github.com/anakryiko/linux/commit/8dec900975ef

      Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20240502151854.9810-5-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs · 7a4c3222
      Puranjay Mohan authored
      Support an instruction for resolving absolute addresses of per-CPU
      data from their per-CPU offsets. This instruction is internal-only and
      users are not allowed to use it directly. For now, it will only be
      used for internal inlining optimizations between the BPF verifier and
      the BPF JITs.
      
      Since commit 71586276 ("arm64: percpu: implement optimised pcpu
      access using tpidr_el1"), the per-cpu offset for the CPU is stored in
      the tpidr_el1/2 register of that CPU.
      
      To support this BPF instruction in the ARM64 JIT, the following ARM64
      instructions are emitted:
      
      mov dst, src		// Move src to dst, if src != dst
      mrs tmp, tpidr_el1/2	// Move per-cpu offset of the current cpu in tmp.
      add dst, dst, tmp	// Add the per cpu offset to the dst.
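
The three emitted instructions amount to "absolute address = per-CPU address + this CPU's offset". The sketch below simulates that in user space with one counter copy per CPU; the array, NR_CPUS_SKETCH, and the function names are hypothetical, and the offset is passed as a parameter instead of being read from tpidr_el1/2.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4

/* One private copy of a counter per simulated CPU. */
static long counter_copies[NR_CPUS_SKETCH];

/* Simulated per-CPU offset: distance in bytes from copy 0 to this CPU's copy. */
static uintptr_t sketch_cpu_offset(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	return (uintptr_t)&counter_copies[cpu] - (uintptr_t)&counter_copies[0];
}

/* Mirrors the emitted sequence, one statement per instruction. */
static long *sketch_resolve_percpu(long *src, uintptr_t this_cpu_off)
{
	uintptr_t dst = (uintptr_t)src;  /* mov dst, src           */
	uintptr_t tmp = this_cpu_off;    /* mrs tmp, tpidr_el1/2   */

	dst += tmp;                      /* add dst, dst, tmp      */
	return (long *)dst;
}
```

Writing through the returned pointer touches only the selected CPU's copy, which is the property per-CPU data relies on.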
      
      To measure the performance improvement provided by this change, the
      benchmark in [1] was used:
      
      Before:
      glob-arr-inc   :   23.597 ± 0.012M/s
      arr-inc        :   23.173 ± 0.019M/s
      hash-inc       :   12.186 ± 0.028M/s
      
      After:
      glob-arr-inc   :   23.819 ± 0.034M/s
      arr-inc        :   23.285 ± 0.017M/s
      hash-inc       :   12.419 ± 0.011M/s
      
      [1] https://github.com/anakryiko/linux/commit/8dec900975ef

      Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20240502151854.9810-4-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • riscv, bpf: inline bpf_get_smp_processor_id() · 2ddec2c8
      Puranjay Mohan authored
      Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit.
      
      RISCV saves the pointer to the CPU's task_struct in the TP (thread
      pointer) register. This makes it trivial to get the CPU's processor id.
      As thread_info is the first member of task_struct, we can read the
      processor id from TP + offsetof(struct thread_info, cpu).
      
                RISCV64 JIT output for `call bpf_get_smp_processor_id`
      	  ======================================================
      
                      Before                           After
                     --------                         -------
      
               auipc   t1,0x848c                  ld    a5,32(tp)
               jalr    604(t1)
               mv      a5,a0
      
      Benchmark using [1] on Qemu.
      
      ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
      
      +---------------+------------------+------------------+--------------+
      |      Name     |     Before       |       After      |   % change   |
      |---------------+------------------+------------------+--------------|
      | glob-arr-inc  | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s |   + 24.04%   |
      | arr-inc       | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s |   + 23.56%   |
      | hash-inc      | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s |   + 32.18%   |
      +---------------+------------------+------------------+--------------+
      
      NOTE: This benchmark includes changes from this patch and the previous
            patch that implemented the per-cpu insn.
      
      [1] https://github.com/anakryiko/linux/commit/8dec900975ef

      Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
      Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Björn Töpel <bjorn@kernel.org>
      Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs · 19c56d4e
      Puranjay Mohan authored
      Support an instruction for resolving absolute addresses of per-CPU
      data from their per-CPU offsets. This instruction is internal-only and
      users are not allowed to use it directly. For now, it will only be
      used for internal inlining optimizations between the BPF verifier and
      the BPF JITs.
      
      RISC-V uses generic per-cpu implementation where the offsets for CPUs
      are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
      the address of the task_struct in TP register. The first element in
      task_struct is struct thread_info, and we can get the cpu number by
      reading from the TP register + offsetof(struct thread_info, cpu).
      
      Once we have the cpu number in a register we read the offset for that
      cpu from address: &__per_cpu_offset + (cpu_number << 3). Then we add
      this offset to the destination register.
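
The address arithmetic above can be sketched in plain C. The table contents and names below are hypothetical stand-ins for the kernel's __per_cpu_offset[] array; the point is only that each entry is 8 bytes wide, so the entry for a CPU lives at base + (cpu << 3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical stand-in for the kernel's __per_cpu_offset[] table. */
static const uint64_t per_cpu_offset_sketch[NR_CPUS_SKETCH] = {
	0x0000, 0x1000, 0x2000, 0x3000
};

/* Load the 8-byte offset entry for a CPU using explicit byte arithmetic. */
static uint64_t sketch_load_cpu_offset(uint32_t cpu)
{
	const unsigned char *table = (const unsigned char *)per_cpu_offset_sketch;
	uint64_t off;

	assert(cpu < NR_CPUS_SKETCH);
	/* The shift binds before the add: base + (cpu << 3). */
	memcpy(&off, table + ((size_t)cpu << 3), sizeof(off));
	return off;
}

/* Add the current CPU's offset to the value in the destination register. */
static uint64_t sketch_resolve_generic(uint64_t dst, uint32_t cpu)
{
	return dst + sketch_load_cpu_offset(cpu);
}
```

The explicit parentheses matter in C, where `+` binds tighter than `<<`.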
      
      To measure the improvement from this change, the benchmark in [1] was
      used on Qemu:
      
      Before:
      glob-arr-inc   :    1.127 ± 0.013M/s
      arr-inc        :    1.121 ± 0.004M/s
      hash-inc       :    0.681 ± 0.052M/s
      
      After:
      glob-arr-inc   :    1.138 ± 0.011M/s
      arr-inc        :    1.366 ± 0.006M/s
      hash-inc       :    0.676 ± 0.001M/s
      
      [1] https://github.com/anakryiko/linux/commit/8dec900975ef

      Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
      Acked-by: Björn Töpel <bjorn@kernel.org>
      Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • ARC: Add eBPF JIT support · f122668d
      Shahab Vahedi authored
      This will add eBPF JIT support to the 32-bit ARCv2 processors. The
      implementation is qualified by running the BPF tests on a Synopsys HSDK
      board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU.
      
      The test_bpf.ko reports 2-10 fold improvements in execution time of its
      tests. For instance:
      
      test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS
      test_bpf: #33 tcpdump port 22 jited:1 120  224  260 PASS
      
      test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS
      test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1  23 PASS
      
      test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS
      test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS
      
      Deployment and structure
      ------------------------
      The related code is added to "arch/arc/net":
      
      - bpf_jit.h       -- The interface that a back-end translator must provide
      - bpf_jit_core.c  -- Knows how to handle the input eBPF byte stream
      - bpf_jit_arcv2.c -- The back-end code that knows the translation logic
      
      The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance
      to the whole process. Normally, the translation is done in one pass,
      namely the "normal pass". In case some relocations are not known during
      this pass, some data (arc_jit_data) is allocated for the next pass to
      come. This possible next (and last) pass is called the "extra pass".
      
      1. Normal pass       # The necessary pass
           1a. Dry run       # Get the whole JIT length, epilogue offset, etc.
           1b. Emit phase    # Allocate memory and start emitting instructions
      2. Extra pass        # Only needed if there are relocations to be fixed
           2a. Patch relocations
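
The dry-run-then-emit structure of the normal pass can be illustrated with a toy translator. Everything below is a hypothetical sketch (toy_insn, toy_translate, and toy_jit are invented names, not functions from arch/arc/net), and the extra pass for relocations is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy "instruction": its translation is out_len copies of fill. */
struct toy_insn {
	uint8_t out_len;
	uint8_t fill;
};

/* Translate the program. With buf == NULL this is a dry run that only
 * counts bytes; otherwise bytes are written to buf. Returns total length. */
static size_t toy_translate(const struct toy_insn *prog, size_t n, uint8_t *buf)
{
	size_t len = 0;

	for (size_t i = 0; i < n; i++) {
		for (uint8_t b = 0; b < prog[i].out_len; b++) {
			if (buf)
				buf[len] = prog[i].fill;
			len++;
		}
	}
	return len;
}

/* Normal pass: size the output with a dry run, then allocate and emit. */
static uint8_t *toy_jit(const struct toy_insn *prog, size_t n, size_t *out_len)
{
	size_t len = toy_translate(prog, n, NULL);    /* 1a. dry run    */
	uint8_t *buf = malloc(len ? len : 1);         /* 1b. allocate   */

	if (!buf)
		return NULL;
	size_t emitted = toy_translate(prog, n, buf); /* 1b. emit phase */
	assert(emitted == len);
	*out_len = len;
	return buf;
}
```

Sharing one translation routine between both phases keeps the computed length and the emitted length from drifting apart.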
      
      Support status
      --------------
      The JIT compiler supports BPF instructions up to "cpu=v4". However, it
      does not yet provide support for:
      
      - Tail calls
      - Atomic operations
      - 64-bit division/remainder
      - BPF_PROBE_MEM* (exception table)
      
      The result of "test_bpf" test suite on an HSDK board is:
      
      hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf
      
        test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]
      
      All the failing test cases are ones that were not JIT'ed. By
      category, they are:
      
        .-----------.------------.-------------.
        | test type |   opcodes  | # of cases  |
        |-----------+------------+-------------|
        | atomic    | 0xC3, 0xDB |         149 |
        | div64     | 0x37, 0x3F |          22 |
        | mod64     | 0x97, 0x9F |          15 |
        `-----------^------------+-------------|
                                 | (total) 186 |
                                 `-------------'
      
      Setup: build config
      -------------------
      The following configs must be set to have a working JIT test:
      
        CONFIG_BPF_JIT=y
        CONFIG_BPF_JIT_ALWAYS_ON=y
        CONFIG_TEST_BPF=m
      
      The following options are not necessary for the tests module,
      but are good to have:
      
        CONFIG_DEBUG_INFO=y             # prerequisite for below
        CONFIG_DEBUG_INFO_BTF=y         # so bpftool can generate vmlinux.h
      
        CONFIG_FTRACE=y                 #
        CONFIG_BPF_SYSCALL=y            # all these options lead to
        CONFIG_KPROBE_EVENTS=y          # having CONFIG_BPF_EVENTS=y
        CONFIG_PERF_EVENTS=y            #
      
      Some BPF programs provide data through /sys/kernel/debug:

        CONFIG_DEBUG_FS=y

      arc# mount -t debugfs debugfs /sys/kernel/debug
      
      Setup: elfutils
      ---------------
      The libdw.{so,a} library that is used by pahole for processing
      the final binary must come from elfutils 0.189 or newer. Support
      for ARCv2 [1] was added in that version.
      
      [1]
      https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7
      
      Setup: pahole
      -------------
      The line below in linux/scripts/Makefile.btf must be commented out:
      
      pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats
      
      Or else, the build will fail:
      
      $ make V=1
        ...
        BTF     .btf.vmlinux.bin.o
      pahole -J --btf_gen_floats                    \
             -j --lang_exclude=rust                 \
             --skip_encoding_btf_inconsistent_proto \
             --btf_gen_optimized .tmp_vmlinux.btf
      Complex, interval and imaginary float types are not supported
      Encountered error while encoding BTF.
        ...
        BTFIDS  vmlinux
      ./tools/bpf/resolve_btfids/resolve_btfids vmlinux
      libbpf: failed to find '.BTF' ELF section in vmlinux
      FAILED: load BTF from vmlinux: No data available
      
      This is due to the fact that the ARC toolchains generate
      "complex float" DIE entries in libgcc and at the moment, pahole
      can't handle such entries.
      
      Running the tests
      -----------------
      host$ scp /bld/linux/lib/test_bpf.ko arc:
      arc # sysctl net.core.bpf_jit_enable=1
      arc # insmod test_bpf.ko test_suite=test_bpf
            ...
            test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS
            test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]
      
      Acknowledgments
      ---------------
      - Claudiu Zissulescu for his unwavering support
      - Yuriy Kolerov for testing and troubleshooting
      - Vladimir Isaev for the pahole workaround
      - Sergey Matyukevich for paving the road by adding the interpreter support

      Signed-off-by: Shahab Vahedi <shahab@synopsys.com>
      Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  2. 09 May, 2024 19 commits
  3. 08 May, 2024 5 commits
    • bpf: Avoid uninitialized value in BPF_CORE_READ_BITFIELD · 00936709
      Jose E. Marchesi authored
      [Changes from V1:
       - Use a default branch in the switch statement to initialize `val'.]
      
      GCC warns that `val' may be used uninitialized in the
      BPF_CORE_READ_BITFIELD macro, defined in bpf_core_read.h as:
      
      	[...]
      	unsigned long long val;						      \
      	[...]								      \
      	switch (__CORE_RELO(s, field, BYTE_SIZE)) {			      \
      	case 1: val = *(const unsigned char *)p; break;			      \
      	case 2: val = *(const unsigned short *)p; break;		      \
      	case 4: val = *(const unsigned int *)p; break;			      \
      	case 8: val = *(const unsigned long long *)p; break;		      \
              }       							      \
      	[...]
      	val;								      \
      	}								      \
      
      This patch adds a default entry to the switch statement that sets
      `val' to zero. This avoids the warning and keeps random values from
      being used in case __builtin_preserve_field_info returns an
      unexpected value for BPF_FIELD_BYTE_SIZE.
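
The effect of the default branch can be shown with a standalone function. This is a sketch only: read_sized is a hypothetical name, and the byte size is a plain parameter here rather than the result of __CORE_RELO().

```c
#include <assert.h>
#include <stddef.h>

/* Read an unsigned value of the given width from p. Any width other than
 * 1, 2, 4 or 8 falls through to the default branch and yields 0 instead
 * of leaving val uninitialized. */
static unsigned long long read_sized(const void *p, unsigned int byte_size)
{
	unsigned long long val;

	assert(p != NULL);
	switch (byte_size) {
	case 1: val = *(const unsigned char *)p; break;
	case 2: val = *(const unsigned short *)p; break;
	case 4: val = *(const unsigned int *)p; break;
	case 8: val = *(const unsigned long long *)p; break;
	default: val = 0; break;
	}
	return val;
}
```

With the default branch in place, every path through the switch assigns val, so the compiler has nothing left to warn about.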
      
      Tested in bpf-next master.
      No regressions.

      Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240508101313.16662-1-jose.marchesi@oracle.com
    • bpf: guard BPF_NO_PRESERVE_ACCESS_INDEX in skb_pkt_end.c · 911edc69
      Jose E. Marchesi authored
      This little patch is a follow-up to:
      https://lore.kernel.org/bpf/20240507095011.15867-1-jose.marchesi@oracle.com/T/#u
      
      The temporary workaround of passing -DBPF_NO_PRESERVE_ACCESS_INDEX
      when building with GCC triggers a redefinition preprocessor error when
      building progs/skb_pkt_end.c.  This patch adds a guard to avoid
      redefinition.

      Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
      Cc: david.faust@oracle.com
      Cc: cupertino.miranda@oracle.com
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Cc: Yonghong Song <yonghong.song@linux.dev>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/r/20240508110332.17332-1-jose.marchesi@oracle.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: avoid UB in usages of the __imm_insn macro · 1209a523
      Jose E. Marchesi authored
      [Changes from V2:
       - no-strict-aliasing is only applied when building with GCC.
       - cpumask_failure.c is excluded, as it doesn't use __imm_insn.]
      
      The __imm_insn macro is defined in bpf_misc.h as:
      
        #define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))
      
      This may lead to type-punning and strict aliasing rules violations in
      its typical usage where the address of a struct bpf_insn is passed as
      expr, like in:
      
        __imm_insn(st_mem,
                   BPF_ST_MEM(BPF_W, BPF_REG_1, offsetof(struct __sk_buff, mark), 42))
      
      Where:
      
        #define BPF_ST_MEM(SIZE, DST, OFF, IMM)				\
      	((struct bpf_insn) {					\
      		.code  = BPF_ST | BPF_SIZE(SIZE) | BPF_MEM,	\
      		.dst_reg = DST,					\
      		.src_reg = 0,					\
      		.off   = OFF,					\
      		.imm   = IMM })
      
      In all the actual instances of this in the BPF selftests the value is
      fed to a volatile asm statement as soon as it gets read from memory,
      and thus it is unlikely that breaking the strict aliasing rules leads
      to misguided optimizations.
      
      However, GCC detects the potential problem (indirectly) by issuing a
      warning stating that a temporary <Uxxxxxx> is used uninitialized,
      where the temporary corresponds to the memory read by *(long *).
      
      This patch adds -fno-strict-aliasing to the compilation flags of the
      particular selftests that do type punning via __imm_insn, only for
      GCC.
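
For contrast with the *(long *)& cast, the sketch below reads the bits of an 8-byte instruction-like struct through memcpy, which is well defined without -fno-strict-aliasing. This is not what the patch does, and it is not a drop-in replacement for __imm_insn, whose "i" constraint needs a compile-time constant; struct insn_sketch is a hypothetical stand-in for struct bpf_insn.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte stand-in for struct bpf_insn. */
struct insn_sketch {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

static_assert(sizeof(struct insn_sketch) == sizeof(uint64_t),
	      "sketch assumes an 8-byte instruction with no padding");

/* Copy the object representation into an integer. memcpy is allowed to
 * access any object's bytes, so no aliasing rule is violated. */
static uint64_t insn_bits(struct insn_sketch insn)
{
	uint64_t bits;

	memcpy(&bits, &insn, sizeof(bits));
	return bits;
}
```

The numeric value of the result depends on the host's byte order, so only round-trips and comparisons are portable.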
      
      Tested in master bpf-next.
      No regressions.

      Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
      Cc: david.faust@oracle.com
      Cc: cupertino.miranda@oracle.com
      Cc: Yonghong Song <yonghong.song@linux.dev>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/r/20240508103551.14955-1-jose.marchesi@oracle.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: avoid uninitialized warnings in verifier_global_subprogs.c · cd3fc3b9
      Jose E. Marchesi authored
      [Changes from V1:
      - The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized.
      - This warning is only supported in GCC.]
      
      The BPF selftest verifier_global_subprogs.c contains code that
      purposely performs out-of-bounds memory accesses, to check whether
      the kernel verifier is able to catch them.  For example:
      
        __noinline int global_unsupp(const int *mem)
        {
      	if (!mem)
      		return 0;
      	return mem[100]; /* BOOM */
        }
      
      With -O1 and higher and no inlining, GCC notices this fact and emits a
      "maybe uninitialized" warning.  This is by design.  Note that the
      emission of these warnings is highly dependent on the precise
      optimizations that are performed.
      
      This patch adds a compiler pragma to verifier_global_subprogs.c to
      ignore these warnings.
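
A minimal sketch of a GCC-only diagnostic pragma is shown below. The exact form used in the selftest is not quoted in this message, so the guard and the push/pop scoping here are assumptions, and read_at is a hypothetical, in-bounds example function.

```c
#include <assert.h>
#include <stddef.h>

/* -Wmaybe-uninitialized is a GCC warning, so the pragma is limited to
 * GCC proper; clang also defines __GNUC__ and is excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif

/* Return mem[idx], or 0 when idx is outside the len-element array. */
static int read_at(const int *mem, size_t len, size_t idx)
{
	assert(mem != NULL || len == 0);
	if (idx >= len)
		return 0;
	return mem[idx];
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic pop
#endif
```

Using push and pop restores the previous warning state after the affected code, so the rest of the file keeps the warning enabled.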
      
      Tested in bpf-next master.
      No regressions.

      Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
      Cc: david.faust@oracle.com
      Cc: cupertino.miranda@oracle.com
      Cc: Yonghong Song <yonghong.song@linux.dev>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/r/20240507184756.1772-1-jose.marchesi@oracle.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf, arm64: Add support for lse atomics in bpf_arena · e612b5c1
      Puranjay Mohan authored
      When LSE atomics are available, BPF atomic instructions are implemented
      as single ARM64 atomic instructions, so it is easy to enable them in
      bpf_arena using the currently available exception handling setup.
      
      LL_SC atomics use loops and therefore would need more work to enable in
      bpf_arena.
      
      Enable LSE atomics based instructions in bpf_arena and use the
      bpf_jit_supports_insn() callback to reject atomics in bpf_arena if LSE
      atomics are not available.
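
The rejection rule described above can be sketched as a pure function. The function name, its parameters, and the SK_ constants are hypothetical; the real callback works on a struct bpf_insn and queries the CPU capability itself. The opcode values follow the BPF instruction encoding for 32-bit and 64-bit atomic stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode fields from the BPF instruction encoding (SK_ prefix is ours). */
#define SK_BPF_STX    0x03
#define SK_BPF_W      0x00
#define SK_BPF_DW     0x18
#define SK_BPF_ATOMIC 0xc0

static_assert((SK_BPF_STX | SK_BPF_ATOMIC | SK_BPF_W) == 0xc3,
	      "32-bit atomic opcode");
static_assert((SK_BPF_STX | SK_BPF_ATOMIC | SK_BPF_DW) == 0xdb,
	      "64-bit atomic opcode");

/* Accept everything outside the arena; inside the arena, accept atomic
 * instructions only when LSE atomics are available. */
static bool sketch_supports_insn(uint8_t code, bool in_arena, bool have_lse)
{
	if (!in_arena)
		return true;

	switch (code) {
	case SK_BPF_STX | SK_BPF_ATOMIC | SK_BPF_W:
	case SK_BPF_STX | SK_BPF_ATOMIC | SK_BPF_DW:
		return have_lse;
	default:
		return true;
	}
}
```

Passing the LSE capability as a parameter keeps the sketch testable in user space.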
      
      All atomics and arena_atomics selftests are passing:
      
        [root@ip-172-31-2-216 bpf]# ./test_progs -a atomics,arena_atomics
        #3/1     arena_atomics/add:OK
        #3/2     arena_atomics/sub:OK
        #3/3     arena_atomics/and:OK
        #3/4     arena_atomics/or:OK
        #3/5     arena_atomics/xor:OK
        #3/6     arena_atomics/cmpxchg:OK
        #3/7     arena_atomics/xchg:OK
        #3       arena_atomics:OK
        #10/1    atomics/add:OK
        #10/2    atomics/sub:OK
        #10/3    atomics/and:OK
        #10/4    atomics/or:OK
        #10/5    atomics/xor:OK
        #10/6    atomics/cmpxchg:OK
        #10/7    atomics/xchg:OK
        #10      atomics:OK
        Summary: 2/14 PASSED, 0 SKIPPED, 0 FAILED

      Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
      Link: https://lore.kernel.org/r/20240426161116.441-1-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  4. 07 May, 2024 11 commits