Commits · b9d32677ff1c2fe6c9c97c23341c6d69254d89cd · Kirill Smelkov / linux

05 Sep, 2024 24 commits

Merge branch 'local-vmtest-enhancement-and-rv64-enabled' · b9d32677

Alexei Starovoitov authored Sep 05, 2024

Pu Lehui says:

====================
Local vmtest enhancement and RV64 enabled

Patch 1-3 fix some problem about bpf selftests. Patch 4 add local rootfs
image support for vmtest. Patch 5 enable cross-platform testing for
vmtest. Patch 6-10 enable vmtest on RV64.

We can now perform cross platform testing for riscv64 bpf using the
following command:

PLATFORM=riscv64 CROSS_COMPILE=riscv64-linux-gnu- \
  tools/testing/selftests/bpf/vmtest.sh \
  -l <path of local rootfs image> -- \
  ./test_progs -d \
      \"$(cat tools/testing/selftests/bpf/DENYLIST.riscv64 \
          | cut -d'#' -f1 \
          | sed -e 's/^[[:space:]]*//' \
                -e 's/[[:space:]]*$//' \
          | tr -s '\n' ',' \
      )\"

For better regression, we rely on commit [0]. And since the work of riscv
ftrace to remove stop_machine atomic replacement is in progress, we also
need to revert commit [1] [2].

The test platform is x86_64 architecture, and the versions of relevant
components are as follows:
    QEMU: 8.2.0
    CLANG: 17.0.6 (align to BPF CI)
    ROOTFS: ubuntu noble (generated by [3])

Link: https://lore.kernel.org/all/20240831071520.1630360-1-pulehui@huaweicloud.com/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3308172276db [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7caa9765465f [2]
Link: https://github.com/libbpf/ci/blob/main/rootfs/mkrootfs_debian.sh [3]

v3:
- Use llvm static linking when detecting that feature-llvm is enabled
- Add Acked-by by Eduard

v2: https://lore.kernel.org/all/20240904141951.1139090-1-pulehui@huaweicloud.com/
- Drop patch about relaxing Zbb insns restrictions.
- Add local rootfs image support
- Add description about running vmtest on RV64
- Fix some problem about bpf selftests

v1: https://lore.kernel.org/all/20240328124916.293173-1-pulehui@huaweicloud.com/
====================
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240905081401.1894789-1-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

b9d32677

selftests/bpf: Add description for running vmtest on RV64 · 95b1c5d1

Pu Lehui authored Sep 05, 2024

Add description in tools/testing/selftests/bpf/README.rst
for running vmtest on RV64.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-11-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

95b1c5d1

selftests/bpf: Add riscv64 configurations to local vmtest · b2bc9d50

Pu Lehui authored Sep 05, 2024

Add riscv64 configurations to local vmtest.

We can now perform cross platform testing for riscv64 bpf using the
following command:

PLATFORM=riscv64 CROSS_COMPILE=riscv64-linux-gnu- vmtest.sh \
    -l ./libbpf-vmtest-rootfs-2024.08.30-noble-riscv64.tar.zst -- \
    ./test_progs -d \
        \"$(cat tools/testing/selftests/bpf/DENYLIST.riscv64 \
            | cut -d'#' -f1 \
            | sed -e 's/^[[:space:]]*//' \
                  -e 's/[[:space:]]*$//' \
            | tr -s '\n' ','\
        )\"
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-10-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

b2bc9d50

selftests/bpf: Add DENYLIST.riscv64 · c402cb85

Pu Lehui authored Sep 05, 2024

This patch adds DENYLIST.riscv64 file for riscv64. It will help BPF CI
and local vmtest to mask failing and unsupported test cases.

We can use the following command to use deny list in local vmtest as
previously mentioned by Manu.

PLATFORM=riscv64 CROSS_COMPILE=riscv64-linux-gnu- vmtest.sh \
    -l ./libbpf-vmtest-rootfs-2024.08.30-noble-riscv64.tar.zst -- \
    ./test_progs -d \
        \"$(cat tools/testing/selftests/bpf/DENYLIST.riscv64 \
            | cut -d'#' -f1 \
            | sed -e 's/^[[:space:]]*//' \
                  -e 's/[[:space:]]*$//' \
            | tr -s '\n' ','\
        )\"
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-9-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c402cb85

selftests/bpf: Add config.riscv64 · 897b3680

Pu Lehui authored Sep 05, 2024

Add config.riscv64 for both BPF CI and local vmtest.
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-8-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

897b3680

selftests/bpf: Enable cross platform testing for vmtest · d95d5651

Pu Lehui authored Sep 05, 2024

Add support cross platform testing for vmtest. The variable $ARCH in the
current script is platform semantics, not kernel semantics. Rename it to
$PLATFORM so that we can easily use $ARCH in cross-compilation. And drop
`set -u` unbound variable check as we will use CROSS_COMPILE env
variable. For now, Using PLATFORM= and CROSS_COMPILE= options will
enable cross platform testing:

  PLATFORM=<platform> CROSS_COMPILE=<toolchain> vmtest.sh -- ./test_progs
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-7-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

d95d5651

selftests/bpf: Support local rootfs image for vmtest · 2294073d

Pu Lehui authored Sep 05, 2024

Support vmtest to use local rootfs image generated by [0] that is
consistent with BPF CI. Now we can specify the local rootfs image
through the `-l` parameter like as follows:

vmtest.sh -l ./libbpf-vmtest-rootfs-2024.08.22-noble-amd64.tar.zst -- ./test_progs

Meanwhile, some descriptions have been flushed.

Link: https://github.com/libbpf/ci/blob/main/rootfs/mkrootfs_debian.sh [0]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-6-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

2294073d

selftests/bpf: Limit URLS parsing logic to actual scope in vmtest · 0c3fc330

Pu Lehui authored Sep 05, 2024

The URLS array is only valid in the download_rootfs function and does
not need to be parsed globally in advance. At the same time, the logic
of loading rootfs is refactored to prepare vmtest for supporting local
rootfs.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-5-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

0c3fc330

selftests/bpf: Prefer static linking for LLVM libraries · 67ab80a0

Eduard Zingerman authored Sep 05, 2024

It is not always convenient to have LLVM libraries installed inside CI
rootfs images, thus request static libraries from llvm-config.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240905081401.1894789-4-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

67ab80a0

selftests/bpf: Rename fallback in bpf_dctcp to avoid naming conflict · a48a4388

Pu Lehui authored Sep 05, 2024

Recently, when compiling bpf selftests on RV64, the following
compilation failure occurred:

progs/bpf_dctcp.c:29:21: error: redefinition of 'fallback' as different kind of symbol
   29 | volatile const char fallback[TCP_CA_NAME_MAX];
      |                     ^
/workspace/tools/testing/selftests/bpf/tools/include/vmlinux.h:86812:15: note: previous definition is here
 86812 | typedef u32 (*fallback)(u32, const unsigned char *, size_t);

The reason is that the `fallback` symbol has been defined in
arch/riscv/lib/crc32.c, which will cause symbol conflicts when vmlinux.h
is included in bpf_dctcp. Let we rename `fallback` string to
`fallback_cc` in bpf_dctcp to fix this compilation failure.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-3-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

a48a4388

selftests/bpf: Adapt OUTPUT appending logic to lower versions of Make · dc3a8804

Pu Lehui authored Sep 05, 2024

The $(let ...) function is only supported by GNU Make version 4.4 [0]
and above, otherwise the following exception file or directory will be
generated:

	tools/testing/selftests/bpfFEATURE-DUMP.selftests
	tools/testing/selftests/bpffeature/

Considering that the GNU Make version of most Linux distributions is
lower than 4.4, let us adapt the corresponding logic to it.

Link: https://lists.gnu.org/archive/html/info-gnu/2022-10/msg00008.html [0]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240905081401.1894789-2-pulehui@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

dc3a8804

libbpf: fix some typos in libbpf · bd4d67f8

Lin Yikai authored Sep 05, 2024

Hi, fix some spelling errors in libbpf, the details are as follows:

-in the code comments:
	termintaing->terminating
	architecutre->architecture
	requring->requiring
	recored->recoded
	sanitise->sanities
	allowd->allowed
	abover->above
	see bpf_udst_arg()->see bpf_usdt_arg()
Signed-off-by: Lin Yikai <yikai.lin@vivo.com>
Link: https://lore.kernel.org/r/20240905110354.3274546-3-yikai.lin@vivo.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

bd4d67f8

bpftool: fix some typos in bpftool · a86857d2

Lin Yikai authored Sep 05, 2024

Hi, fix some spelling errors in bpftool, the details are as follows:

-in file "bpftool-gen.rst"
	libppf->libbpf
-in the code comments:
	ouptut->output
Signed-off-by: Lin Yikai <yikai.lin@vivo.com>
Link: https://lore.kernel.org/r/20240905110354.3274546-2-yikai.lin@vivo.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

a86857d2

selftests/bpf: fix some typos in selftests · 5db0ba67

Lin Yikai authored Sep 05, 2024

Hi, fix some spelling errors in selftest, the details are as follows:

-in the codes:
	test_bpf_sk_stoarge_map_iter_fd(void)
		->test_bpf_sk_storage_map_iter_fd(void)
	load BTF from btf_data.o->load BTF from btf_data.bpf.o

-in the code comments:
	preample->preamble
	multi-contollers->multi-controllers
	errono->errno
	unsighed/unsinged->unsigned
	egree->egress
	shoud->should
	regsiter->register
	assummed->assumed
	conditiona->conditional
	rougly->roughly
	timetamp->timestamp
	ingores->ignores
	null-termainted->null-terminated
	slepable->sleepable
	implemenation->implementation
	veriables->variables
	timetamps->timestamps
	substitue a costant->substitute a constant
	secton->section
	unreferened->unreferenced
	verifer->verifier
	libppf->libbpf
...
Signed-off-by: Lin Yikai <yikai.lin@vivo.com>
Link: https://lore.kernel.org/r/20240905110354.3274546-1-yikai.lin@vivo.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

5db0ba67

Merge branch 'selftests-bpf-add-uprobe-multi-pid-filter-test' · 552895af

Andrii Nakryiko authored Sep 05, 2024

Jiri Olsa says:

====================
selftests/bpf: Add uprobe multi pid filter test

hi,
sending fix for uprobe multi pid filtering together with tests. The first
version included tests for standard uprobes, but as we still do not have
fix for that, sending just uprobe multi changes.

thanks,
jirka

v2 changes:
  - focused on uprobe multi only, removed perf event uprobe specific parts
  - added fix and test for CLONE_VM process filter
---
====================

Link: https://lore.kernel.org/r/20240905115124.1503998-1-jolsa@kernel.orgSigned-off-by: Andrii Nakryiko <andrii@kernel.org>

552895af

selftests/bpf: Add uprobe multi pid filter test for clone-ed processes · d2520bdb

Jiri Olsa authored Sep 05, 2024

The idea is to run same test as for test_pid_filter_process, but instead
of standard fork-ed process we create the process with clone(CLONE_VM..)
to make sure the thread leader process filter works properly in this case.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240905115124.1503998-5-jolsa@kernel.org

d2520bdb

selftests/bpf: Add uprobe multi pid filter test for fork-ed processes · 8df43e85

Jiri Olsa authored Sep 05, 2024

The idea is to create and monitor 3 uprobes, each trigered in separate
process and make sure the bpf program gets executed just for the proper
PID specified via pid filter.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240905115124.1503998-4-jolsa@kernel.org

8df43e85

selftests/bpf: Add child argument to spawn_child function · 0b0bb453

Jiri Olsa authored Sep 05, 2024

Adding child argument to spawn_child function to allow
to create multiple children in following change.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240905115124.1503998-3-jolsa@kernel.org

0b0bb453

bpf: Fix uprobe multi pid filter check · 900f362e

Jiri Olsa authored Sep 05, 2024

Uprobe multi link does its own process (thread leader) filtering before
running the bpf program by comparing task's vm pointers.

But as Oleg pointed out there can be processes sharing the vm (CLONE_VM),
so we can't just compare task->vm pointers, but instead we need to use
same_thread_group call.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/bpf/20240905115124.1503998-2-jolsa@kernel.org

900f362e

Merge branch 'fix-accessing-first-syscall-argument-on-rv64' · aa01d13e

Andrii Nakryiko authored Sep 04, 2024

Pu Lehui says:

====================
Fix accessing first syscall argument on RV64

On RV64, as Ilya mentioned before [0], the first syscall parameter should be
accessed through orig_a0 (see arch/riscv64/include/asm/syscall.h),
otherwise it will cause selftests like bpf_syscall_macro, vmlinux,
test_lsm, etc. to fail on RV64.

Link: https://lore.kernel.org/bpf/20220209021745.2215452-1-iii@linux.ibm.com [0]

v3:
- Fix test case error.

v2: https://lore.kernel.org/all/20240831023646.1558629-1-pulehui@huaweicloud.com/
- Access first syscall argument with CO-RE direct read. (Andrii)

v1: https://lore.kernel.org/all/20240829133453.882259-1-pulehui@huaweicloud.com/
====================

Link: https://lore.kernel.org/r/20240831041934.1629216-1-pulehui@huaweicloud.comSigned-off-by: Andrii Nakryiko <andrii@kernel.org>

aa01d13e

libbpf: Fix accessing first syscall argument on RV64 · 99857422

Pu Lehui authored Aug 31, 2024

On RV64, as Ilya mentioned before [0], the first syscall parameter should be
accessed through orig_a0 (see arch/riscv64/include/asm/syscall.h),
otherwise it will cause selftests like bpf_syscall_macro, vmlinux,
test_lsm, etc. to fail on RV64. Let's fix it by using the struct pt_regs
style CO-RE direct access.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-1-iii@linux.ibm.com [0]
Link: https://lore.kernel.org/bpf/20240831041934.1629216-5-pulehui@huaweicloud.com

99857422

selftests/bpf: Enable test_bpf_syscall_macro: Syscall_arg1 on s390 and arm64 · 4a4c4c0d

Pu Lehui authored Aug 31, 2024

Considering that CO-RE direct read access to the first system call
argument is already available on s390 and arm64, let's enable
test_bpf_syscall_macro:syscall_arg1 on these architectures.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240831041934.1629216-4-pulehui@huaweicloud.com

4a4c4c0d

libbpf: Access first syscall argument with CO-RE direct read on arm64 · 9ab94078

Pu Lehui authored Aug 31, 2024

Currently PT_REGS_PARM1 SYSCALL(x) is consistent with PT_REGS_PARM1_CORE
SYSCALL(x), which will introduce the overhead of BPF_CORE_READ(), taking
into account the read pt_regs comes directly from the context, let's use
CO-RE direct read to access the first system call argument.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/bpf/20240831041934.1629216-3-pulehui@huaweicloud.com

9ab94078

libbpf: Access first syscall argument with CO-RE direct read on s390 · e4db2a82

Pu Lehui authored Aug 31, 2024

Currently PT_REGS_PARM1 SYSCALL(x) is consistent with PT_REGS_PARM1_CORE
SYSCALL(x), which will introduce the overhead of BPF_CORE_READ(), taking
into account the read pt_regs comes directly from the context, let's use
CO-RE direct read to access the first system call argument.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240831041934.1629216-2-pulehui@huaweicloud.com

e4db2a82

04 Sep, 2024 9 commits

selftests/bpf: Add a selftest for x86 jit convergence issues · eff5b5ff

Yonghong Song authored Sep 04, 2024

The core part of the selftest, i.e., the je <-> jmp cycle, mimics the
original sched-ext bpf program. The test will fail without the
previous patch.

I tried to create some cases for other potential cycles
(je <-> je, jmp <-> je and jmp <-> jmp) with similar pattern
to the test in this patch, but failed. So this patch
only contains one test for je <-> jmp cycle.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240904221256.37389-1-yonghong.song@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>

eff5b5ff

bpf, x64: Fix a jit convergence issue · c8831bdb

Yonghong Song authored Sep 04, 2024

Daniel Hodges reported a jit error when playing with a sched-ext program.
The error message is:
  unexpected jmp_cond padding: -4 bytes

But further investigation shows the error is actual due to failed
convergence. The following are some analysis:

  ...
  pass4, final_proglen=4391:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    289:    48 85 ff                test   rdi,rdi
    28c:    74 17                   je     0x2a5
    28e:    e9 7f ff ff ff          jmp    0x212
    293:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
and insn at 0x28e is 5-byte jmp insn with offset -129.

  pass5, final_proglen=4392:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    48 85 ff                test   rdi,rdi
    290:    74 1a                   je     0x2ac
    292:    eb 84                   jmp    0x218
    294:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 6-byte cond jump insn now since its offset
becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
time, insn at 0x292 is a 2-byte insn since its offset is -124.

pass6 will repeat the same code as in pass4. pass7 will repeat the same
code as in pass5, and so on. This will prevent eventual convergence.

Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
insn looks like:

    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    24d:    48 85 d2                test   rdx,rdx

The similar code in pass14:
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    249:    48 85 d2                test   rdx,rdx
    24c:    74 21                   je     0x26f
    24e:    48 01 f7                add    rdi,rsi
    ...

Before generating the following insn,
  250:    74 21                   je     0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
  #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
  nops = INSN_SZ_DIFF - 2

In this specific case,
  addrs[i] = 0x24e // from pass14
  addrs[i-1] = 0x24d // from pass15
  prog - temp = 3 // from 'test rdx,rdx' in pass15
so
  nops = -4
and this triggers the failure.

To fix the issue, we need to break cycles of je <-> jmp. For example,
in the above case, we have
  211:    74 7d                   je     0x290
the offset is 0x7d. If 2-byte je insn is generated only if
the offset is less than 0x7d (<= 0x7c), the cycle can be
break and we can achieve the convergence.

I did some study on other cases like je <-> je, jmp <-> je and
jmp <-> jmp which may cause cycles. Those cases are not from actual
reproducible cases since it is pretty hard to construct a test case
for them. the results show that the offset <= 0x7b (0x7b = 123) should
be enough to cover all cases. This patch added a new helper to generate 8-bit
cond/uncond jmp insns only if the offset range is [-128, 123].
Reported-by: Daniel Hodges <hodgesd@meta.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c8831bdb

selftests: bpf: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE · 23457b37

Feng Yang authored Sep 03, 2024

The ARRAY_SIZE macro is more compact and more formal in linux source.
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240903072559.292607-1-yangfeng59949@163.com

23457b37

Merge branch 'bpf-follow-up-on-gen_epilogue' · 6fee7a7e

Alexei Starovoitov authored Sep 04, 2024

Martin KaFai Lau says:

====================
bpf: Follow up on gen_epilogue

From: Martin KaFai Lau <martin.lau@kernel.org>

The set addresses some follow ups on the earlier gen_epilogue
patch set.
====================
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240904180847.56947-1-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>

6fee7a7e

bpf: Fix indentation issue in epilogue_idx · 00750788

Martin KaFai Lau authored Sep 04, 2024

There is a report on new indentation issue in epilogue_idx.
This patch fixed it.

Fixes: 169c3176 ("bpf: Add gen_epilogue to bpf_verifier_ops")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408311622.4GzlzN33-lkp@intel.com/Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240904180847.56947-3-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>

00750788

bpf: Remove the insn_buf array stack usage from the inline_bpf_loop() · 940ce73b

Martin KaFai Lau authored Sep 04, 2024

This patch removes the insn_buf array stack usage from the
inline_bpf_loop(). Instead, the env->insn_buf is used. The
usage in inline_bpf_loop() needs more than 16 insn, so the
INSN_BUF_SIZE needs to be increased from 16 to 32.
The compiler stack size warning on the verifier is gone
after this change.

Cc: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240904180847.56947-2-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>

940ce73b

samples/bpf: Remove sample tracex2 · 46f4ea04

Rong Tao authored Aug 31, 2024

In commit ba8de796 ("net: introduce sk_skb_reason_drop function")
kfree_skb_reason() becomes an inline function and cannot be traced.

samples/bpf is abandonware by now, and we should slowly but surely
convert whatever makes sense into BPF selftests under
tools/testing/selftests/bpf and just get rid of the rest.

Link: https://github.com/torvalds/linux/commit/ba8de796baf4bdc03530774fb284fe3c97875566Signed-off-by: Rong Tao <rongtao@cestc.cn>
Link: https://lore.kernel.org/r/tencent_30ADAC88CB2915CA57E9512D4460035BA107@qq.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

46f4ea04

selftests/bpf: Fix procmap_query()'s params mismatch and compilation warning · 02baa0a2

Yuan Chen authored Sep 03, 2024

When the PROCMAP_QUERY is not defined, a compilation error occurs due to the
mismatch of the procmap_query()'s params, procmap_query() only be called in
the file where the function is defined, modify the params so they can match.

We get a warning when build samples/bpf:
    trace_helpers.c:252:5: warning: no previous prototype for ‘procmap_query’ [-Wmissing-prototypes]
      252 | int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags)
          |     ^~~~~~~~~~~~~
As this function is only used in the file, mark it as 'static'.

Fixes: 4e9e0760 ("selftests/bpf: make use of PROCMAP_QUERY ioctl if available")
Signed-off-by: Yuan Chen <chenyuan@kylinos.cn>
Link: https://lore.kernel.org/r/20240903012839.3178-1-chenyuan_fl@163.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

02baa0a2

bpf, arm64: Jit BPF_CALL to direct call when possible · ddbe9ec5

Xu Kuohai authored Sep 03, 2024

Currently, BPF_CALL is always jited to indirect call. When target is
within the range of direct call, BPF_CALL can be jited to direct call.

For example, the following BPF_CALL

call __htab_map_lookup_elem

is always jited to indirect call:

mov x10, #0xffffffffffff18f4
movk x10, #0x821, lsl #16
movk x10, #0x8000, lsl #32
blr x10

When the address of target __htab_map_lookup_elem is within the range of
direct call, the BPF_CALL can be jited to:

bl 0xfffffffffd33bc98

This patch does such jit optimization by emitting arm64 direct calls for
BPF_CALL when possible, indirect calls otherwise.

Without this patch, the jit works as follows.

1. First pass
A. Determine jited position and size for each bpf instruction.
B. Computed the jited image size.

2. Allocate jited image with size computed in step 1.

3. Second pass
A. Adjust jump offset for jump instructions
B. Write the final image.

This works because, for a given bpf prog, regardless of where the jited
image is allocated, the jited result for each instruction is fixed. The
second pass differs from the first only in adjusting the jump offsets,
like changing "jmp imm1" to "jmp imm2", while the position and size of
the "jmp" instruction remain unchanged.

Now considering whether to jit BPF_CALL to arm64 direct or indirect call
instruction. The choice depends solely on the jump offset: direct call
if the jump offset is within 128MB, indirect call otherwise.

For a given BPF_CALL, the target address is known, so the jump offset is
decided by the jited address of the BPF_CALL instruction. In other words,
for a given bpf prog, the jited result for each BPF_CALL is determined
by its jited address.

The jited address for a BPF_CALL is the jited image address plus the
total jited size of all preceding instructions. For a given bpf prog,
there are clearly no BPF_CALL instructions before the first BPF_CALL
instruction. Since the jited result for all other instructions other
than BPF_CALL are fixed, the total jited size preceding the first
BPF_CALL is also fixed. Therefore, once the jited image is allocated,
the jited address for the first BPF_CALL is fixed.

Now that the jited result for the first BPF_CALL is fixed, the jited
results for all instructions preceding the second BPF_CALL are fixed.
So the jited address and result for the second BPF_CALL are also fixed.

Similarly, we can conclude that the jited addresses and results for all
subsequent BPF_CALL instructions are fixed.

This means that, for a given bpf prog, once the jited image is allocated,
the jited address and result for all instructions, including all BPF_CALL
instructions, are fixed.

Based on the observation, with this patch, the jit works as follows.

1. First pass
Estimate the maximum jited image size. In this pass, all BPF_CALLs
are jited to arm64 indirect calls since the jump offsets are unknown
because the jited image is not allocated.

2. Allocate jited image with size estimated in step 1.

3. Second pass
A. Determine the jited result for each BPF_CALL.
B. Determine jited address and size for each bpf instruction.

4. Third pass
A. Adjust jump offset for jump instructions.
B. Write the final image.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20240903094407.601107-1-xukuohai@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

ddbe9ec5

02 Sep, 2024 2 commits

bpftool: Fix handling enum64 in btf dump sorting · b0222d1d

Mykyta Yatsenko authored Sep 02, 2024

Wrong function is used to access the first enum64 element. Substituting btf_enum(t)
with btf_enum64(t) for BTF_KIND_ENUM64.

Fixes: 94133cf2 ("bpftool: Introduce btf c dump sorting")
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240902171721.105253-1-mykyta.yatsenko5@gmail.com

b0222d1d

bpftool: Add missing blank lines in bpftool-net doc example · 5d2784e2

Quentin Monnet authored Sep 01, 2024

In bpftool-net documentation, two blank lines are missing in a
recently added example, causing docutils to complain:

    $ cd tools/bpf/bpftool
    $ make doc
      DESCEND Documentation
      GEN     bpftool-btf.8
      GEN     bpftool-cgroup.8
      GEN     bpftool-feature.8
      GEN     bpftool-gen.8
      GEN     bpftool-iter.8
      GEN     bpftool-link.8
      GEN     bpftool-map.8
      GEN     bpftool-net.8
    <stdin>:189: (INFO/1) Possible incomplete section title.
    Treating the overline as ordinary text because it's so short.
    <stdin>:192: (INFO/1) Blank line missing before literal block (after the "::")? Interpreted as a definition list item.
    <stdin>:199: (INFO/1) Possible incomplete section title.
    Treating the overline as ordinary text because it's so short.
    <stdin>:201: (INFO/1) Blank line missing before literal block (after the "::")? Interpreted as a definition list item.
      GEN     bpftool-perf.8
      GEN     bpftool-prog.8
      GEN     bpftool.8
      GEN     bpftool-struct_ops.8

Add the missing blank lines.

Fixes: 0d7c0612 ("bpftool: Add document for net attach/detach on tcx subcommand")
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240901210742.25758-1-qmo@kernel.org

5d2784e2

30 Aug, 2024 5 commits

selftests/bpf: Do not update vmlinux.h unnecessarily · 2ad6d23f

Ihor Solodrai authored Aug 28, 2024

%.bpf.o objects depend on vmlinux.h, which makes them transitively
dependent on unnecessary libbpf headers. However vmlinux.h doesn't
actually change as often.

When generating vmlinux.h, compare it to a previous version and update
it only if there are changes.

Example of build time improvement (after first clean build):
  $ touch ../../../lib/bpf/bpf.h
  $ time make -j8
Before: real  1m37.592s
After:  real  0m27.310s

Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20240828174608.377204-2-ihor.solodrai@pm.me

2ad6d23f

selftests/bpf: Specify libbpf headers required for %.bpf.o progs · 38960ac8

Ihor Solodrai authored Aug 28, 2024

Test %.bpf.o objects actually depend only on some libbpf headers.
Define a list of required headers and use it as TRUNNER_BPF_OBJS
dependency.

bpf_*.h list was determined by:

    $ grep -rh 'include <bpf/bpf_' progs | sort -u
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link:
Link: https://lore.kernel.org/bpf/20240828174608.377204-1-ihor.solodrai@pm.me

https://lore.kernel.org/bpf/CAEf4BzYQ-j2i_xjs94Nn=8+FVfkWt51mLZyiYKiz9oA4Z=pCeA@mail.gmail.com/

38960ac8

selftests/bpf: Check if distilled base inherits source endianness · 181b0d1a

Eduard Zingerman authored Aug 30, 2024

Create a BTF with endianness different from host, make a distilled
base/split BTF pair from it, dump as raw bytes, import again and
verify that endianness is preserved.
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240830173406.1581007-1-eddyz87@gmail.com

181b0d1a

libbpf: Ensure new BTF objects inherit input endianness · da18bfa5

Tony Ambardar authored Aug 30, 2024

New split BTF needs to preserve base's endianness. Similarly, when
creating a distilled BTF, we need to preserve original endianness.

Fix by updating libbpf's btf__distill_base() and btf_new_empty() to retain
the byte order of any source BTF objects when creating new ones.

Fixes: ba451366 ("libbpf: Implement basic split BTF support")
Fixes: 58e185a0 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF")
Reported-by: Song Liu <song@kernel.org>
Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/6358db36c5f68b07873a0a5be2d062b1af5ea5f8.camel@gmail.com/
Link: https://lore.kernel.org/bpf/20240830095150.278881-1-tony.ambardar@gmail.com

da18bfa5

bpf: Use sockfd_put() helper · 65ef66d9

Jinjie Ruan authored Aug 30, 2024

Replace fput() with sockfd_put() in bpf_fd_reuseport_array_update_elem().
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/r/20240830020756.607877-1-ruanjinjie@huawei.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

65ef66d9