  1. 08 Feb, 2024 3 commits
  2. 07 Feb, 2024 3 commits
    • Merge branch 'tools-resolve_btfids-fix-cross-compilation-to-non-host-endianness' · abae1ac5
      Andrii Nakryiko authored
      Viktor Malik says:
      
      ====================
      tools/resolve_btfids: fix cross-compilation to non-host endianness
      
      The .BTF_ids section is pre-filled with zeroed BTF ID entries during the
      build and afterwards patched by resolve_btfids with correct values.
      Since resolve_btfids always writes in host-native endianness, it relies
      on libelf to do the translation when the target ELF is cross-compiled to
      a different endianness (this was introduced in commit 61e8aeda
      ("bpf: Fix libelf endian handling in resolv_btfids")).
      
      Unfortunately, the translation will corrupt the flags fields of SET8
      entries because these were written during vmlinux compilation and are in
      the correct endianness already. This will lead to numerous selftests
      failures such as:
      
          $ sudo ./test_verifier 502 502
          #502/p sleepable fentry accept FAIL
          Failed to load prog 'Invalid argument'!
          bpf_fentry_test1 is not sleepable
          verification time 34 usec
          stack depth 0
          processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
          Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
      
      Since it's not possible to instruct libelf to translate just certain
      values, let's manually bswap the flags (both global and entry flags) in
      resolve_btfids when needed, so that libelf then translates everything
      correctly.
      
      The first patch of the series refactors resolve_btfids by using types
      from btf_ids.h instead of accessing the BTF ID data using magic offsets.
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      ---
      Changes in v4:
      - remove unnecessary vars and pointer casts (suggested by Daniel Xu)
      
      Changes in v3:
      - add byte swap of global 'flags' field in btf_id_set8 (suggested by
        Jiri Olsa)
      - cleaner refactoring of sets_patch (suggested by Jiri Olsa)
      - add compile-time assertion that IDs are at the beginning of pairs
        struct in btf_id_set8 (suggested by Daniel Borkmann)
      
      Changes in v2:
      - use type defs from btf_ids.h (suggested by Andrii Nakryiko)
      ====================
      
      Link: https://lore.kernel.org/r/cover.1707223196.git.vmalik@redhat.com
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    • tools/resolve_btfids: Fix cross-compilation to non-host endianness · 903fad43
      Viktor Malik authored
      The .BTF_ids section is pre-filled with zeroed BTF ID entries during the
      build and afterwards patched by resolve_btfids with correct values.
      Since resolve_btfids always writes in host-native endianness, it relies
      on libelf to do the translation when the target ELF is cross-compiled to
      a different endianness (this was introduced in commit 61e8aeda
      ("bpf: Fix libelf endian handling in resolv_btfids")).
      
      Unfortunately, the translation will corrupt the flags fields of SET8
      entries because these were written during vmlinux compilation and are in
      the correct endianness already. This will lead to numerous selftests
      failures such as:
      
          $ sudo ./test_verifier 502 502
          #502/p sleepable fentry accept FAIL
          Failed to load prog 'Invalid argument'!
          bpf_fentry_test1 is not sleepable
          verification time 34 usec
          stack depth 0
          processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
          Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
      
      Since it's not possible to instruct libelf to translate just certain
      values, let's manually bswap the flags (both global and entry flags) in
      resolve_btfids when needed, so that libelf then translates everything
      correctly.
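
      A minimal user-space sketch of the idea (assuming the btf_id_set8 layout
      from btf_ids.h; the 'target_is_cross_endian' parameter is illustrative
      and stands in for the ELF encoding check done in resolve_btfids):

          #include <byteswap.h>
          #include <stdbool.h>
          #include <stdint.h>

          /* Layout of an 8-byte BTF ID set, mirroring btf_ids.h. */
          struct btf_id_set8 {
                  uint32_t cnt;
                  uint32_t flags;
                  struct {
                          uint32_t id;
                          uint32_t flags;
                  } pairs[];
          };

          /*
           * Swap only the flags fields to host order before handing the section
           * to libelf. The IDs were just patched in host order, so the later
           * whole-section translation by libelf then leaves everything in the
           * target endianness.
           */
          static void set8_bswap_flags(struct btf_id_set8 *set8, bool target_is_cross_endian)
          {
                  uint32_t i;

                  if (!target_is_cross_endian)
                          return;

                  set8->flags = bswap_32(set8->flags);
                  for (i = 0; i < set8->cnt; i++)
                          set8->pairs[i].flags = bswap_32(set8->pairs[i].flags);
          }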
      
      Fixes: ef2c6f37 ("tools/resolve_btfids: Add support for 8-byte BTF sets")
      Signed-off-by: Viktor Malik <vmalik@redhat.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/7b6bff690919555574ce0f13d2a5996cacf7bf69.1707223196.git.vmalik@redhat.com
    • tools/resolve_btfids: Refactor set sorting with types from btf_ids.h · 9707ac4f
      Viktor Malik authored
      Instead of using magic offsets to access BTF ID set data, leverage types
      from btf_ids.h (btf_id_set and btf_id_set8) which define the actual
      layout of the data. Thanks to this change, set sorting should also
      continue working if the layout changes.
      
      This requires syncing the definition of 'struct btf_id_set8' from
      include/linux/btf_ids.h to tools/include/linux/btf_ids.h. We don't sync
      the rest of the file at the moment, because that would also require
      syncing multiple dependent headers and we don't need any other
      definitions from btf_ids.h.
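
      For reference, a sketch of the synced types and the offset-free sort (the
      comparator is illustrative; it relies on 'id' being the first member of
      each pair, which the series also asserts at compile time):

          #include <stdint.h>
          #include <stdlib.h>

          /* Mirrors include/linux/btf_ids.h. */
          struct btf_id_set {
                  uint32_t cnt;
                  uint32_t ids[];
          };

          struct btf_id_set8 {
                  uint32_t cnt;
                  uint32_t flags;
                  struct {
                          uint32_t id;
                          uint32_t flags;
                  } pairs[];
          };

          /* Compare pairs by their leading 'id' member. */
          static int cmp_id(const void *pa, const void *pb)
          {
                  const uint32_t *a = pa, *b = pb;

                  return *a < *b ? -1 : *a > *b;
          }

          /* Sorting needs no magic offsets: the struct describes the layout. */
          static void sort_set8(struct btf_id_set8 *set8)
          {
                  qsort(set8->pairs, set8->cnt, sizeof(set8->pairs[0]), cmp_id);
          }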
      Signed-off-by: Viktor Malik <vmalik@redhat.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Daniel Xu <dxu@dxuuu.xyz>
      Link: https://lore.kernel.org/bpf/ff7f062ddf6a00815fda3087957c4ce667f50532.1707223196.git.vmalik@redhat.com
  3. 06 Feb, 2024 14 commits
  4. 05 Feb, 2024 5 commits
  5. 03 Feb, 2024 4 commits
  6. 02 Feb, 2024 8 commits
    • selftests/bpf: trace_helpers.c: do not use poisoned type · a68b50f4
      Shung-Hsi Yu authored
      After commit c698eaeb ("selftests/bpf: trace_helpers.c: Optimize
      kallsyms cache") trace_helpers.c now includes libbpf_internal.h, and
      thus can no longer use the u32 type (among others) since those types
      are poisoned in libbpf_internal.h. Replace u32 with __u32 to fix the
      following error when building trace_helpers.c on powerpc:
      
        error: attempt to use poisoned "u32"
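
      For illustration, the mechanism behind the error and the fix (a sketch,
      not the actual trace_helpers.c change):

          #include <linux/types.h>        /* provides the __u32 UAPI type */

          /* libbpf_internal.h poisons the kernel-internal integer type names: */
          #pragma GCC poison u32

          /* u32 count;     would now fail: attempt to use poisoned "u32" */
          __u32 count;      /* the double-underscore UAPI spelling still works */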
      
      Fixes: c698eaeb ("selftests/bpf: trace_helpers.c: Optimize kallsyms cache")
      Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20240202095559.12900-1-shung-hsi.yu@suse.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • Merge branch 'improvements-for-tracking-scalars-in-the-bpf-verifier' · 6fb3f727
      Andrii Nakryiko authored
      Maxim Mikityanskiy says:
      
      ====================
      Improvements for tracking scalars in the BPF verifier
      
      From: Maxim Mikityanskiy <maxim@isovalent.com>
      
      The goal of this series is to extend the verifier's capabilities of
      tracking scalars when they are spilled to stack, especially when the
      spill or fill is narrowing. It also contains a fix by Eduard for
      infinite loop detection and a state pruning optimization by Eduard that
      compensates for a verification complexity regression introduced by
      tracking unbounded scalars. These improvements reduce the surface of
      false rejections that I saw while working on the Cilium codebase.
      
      Patches 1-9 of the original series were previously applied in v2.
      
      Patches 1-2 (Maxim): Support the case when boundary checks are first
      performed after the register was spilled to the stack.
      
      Patches 3-4 (Maxim): Support narrowing fills.
      
      Patches 5-6 (Eduard): Optimization for state pruning in stacksafe() to
      mitigate the verification complexity regression.
      
      veristat -e file,prog,states -f '!states_diff<50' -f '!states_pct<10' -f '!states_a<10' -f '!states_b<10' -C ...
      
       * Without patch 5:
      
      File                  Program   States (A)  States (B)  States    (DIFF)
      --------------------  --------  ----------  ----------  ----------------
      pyperf100.bpf.o       on_event        4878        6528   +1650 (+33.83%)
      pyperf180.bpf.o       on_event        6936       11032   +4096 (+59.05%)
      pyperf600.bpf.o       on_event       22271       39455  +17184 (+77.16%)
      pyperf600_iter.bpf.o  on_event         400         490     +90 (+22.50%)
      strobemeta.bpf.o      on_event        4895       14028  +9133 (+186.58%)
      
       * With patch 5:
      
      File                     Program        States (A)  States (B)  States   (DIFF)
      -----------------------  -------------  ----------  ----------  ---------------
      bpf_xdp.o                tail_lb_ipv4         2770        2224   -546 (-19.71%)
      pyperf100.bpf.o          on_event             4878        5848   +970 (+19.89%)
      pyperf180.bpf.o          on_event             6936        8868  +1932 (+27.85%)
      pyperf600.bpf.o          on_event            22271       29656  +7385 (+33.16%)
      pyperf600_iter.bpf.o     on_event              400         450    +50 (+12.50%)
      xdp_synproxy_kern.bpf.o  syncookie_tc          280         226    -54 (-19.29%)
      xdp_synproxy_kern.bpf.o  syncookie_xdp         302         228    -74 (-24.50%)
      
      v2 changes:
      
      Fixed comments in patch 1, moved endianness checks to header files in
      patch 12 where possible, added Eduard's ACKs.
      
      v3 changes:
      
      Maxim: Removed __is_scalar_unbounded altogether, addressed Andrii's
      comments.
      
      Eduard: Patch #5 (#14 in v2) changed significantly:
      - Logical changes:
        - Handling of STACK_{MISC,ZERO} mix turned out to be incorrect:
          a mix of MISC and ZERO in old state is not equivalent to e.g.
          just MISC in current state, because the verifier could have deduced
          zero scalars from ZERO slots in old state for some loads.
        - There is no reason to limit the change only to cases when the
          old or current stack slot is a spill of an unbounded scalar;
          it is valid to compare any 64-bit scalar spill with a fake
          register impersonating MISC.
        - STACK_ZERO vs spilled zero case was dropped,
          after recent changes for zero handling by Andrii and Yonghong
          it is hard (impossible?) to conjure all ZERO slots for an spi.
          => the case does not make any difference in veristat results.
      - Use global static variable for unbound_reg (Andrii)
      - Code shuffling to remove duplication in stacksafe() (Andrii)
      ====================
      
      Link: https://lore.kernel.org/r/20240127175237.526726-1-maxtram95@gmail.com
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    • selftests/bpf: States pruning checks for scalar vs STACK_MISC · 73a28d9d
      Eduard Zingerman authored
      Check that stacksafe() compares spilled scalars with STACK_MISC.
      The following combinations are explored:
      - old spill of imprecise scalar is equivalent to cur STACK_{MISC,INVALID}
        (plus error in unpriv mode);
      - old spill of precise scalar is not equivalent to cur STACK_MISC;
      - old STACK_MISC is equivalent to cur scalar;
      - old STACK_MISC is not equivalent to cur non-scalar.
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240127175237.526726-7-maxtram95@gmail.com
    • bpf: Handle scalar spill vs all MISC in stacksafe() · 6efbde20
      Eduard Zingerman authored
      When check_stack_read_fixed_off() reads a value from an spi
      all stack slots of which are set to STACK_{MISC,INVALID},
      the destination register is set to unbound SCALAR_VALUE.
      
      Exploit this fact by allowing stacksafe() to use a fake
      unbound scalar register to compare 'mmmm mmmm' stack value
      in old state vs spilled 64-bit scalar in current state
      and vice versa.
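
      A toy user-space model of that comparison (illustrative only, not the
      verifier code; precision and ID tracking are ignored):

          #include <stdbool.h>
          #include <stdint.h>

          /* Toy model of the unsigned bounds the verifier tracks for a scalar. */
          struct scalar_reg {
                  uint64_t umin, umax;
          };

          /* Fake register standing in for a read from an all-MISC/INVALID slot. */
          static const struct scalar_reg unbound_reg = { .umin = 0, .umax = UINT64_MAX };

          /* The old (already verified) range must subsume the current one. */
          static bool range_within(const struct scalar_reg *old_reg,
                                   const struct scalar_reg *cur_reg)
          {
                  return old_reg->umin <= cur_reg->umin && old_reg->umax >= cur_reg->umax;
          }

          /*
           * If one side is an all-MISC stack slot and the other is a spilled
           * 64-bit scalar, compare against the fake unbound register instead of
           * declaring the states different.
           */
          static bool scalar_vs_misc_safe(const struct scalar_reg *old_spill, bool old_all_misc,
                                          const struct scalar_reg *cur_spill, bool cur_all_misc)
          {
                  const struct scalar_reg *o = old_all_misc ? &unbound_reg : old_spill;
                  const struct scalar_reg *c = cur_all_misc ? &unbound_reg : cur_spill;

                  return range_within(o, c);
          }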
      
      Veristat results after this patch show some gains:
      
      ./veristat -C -e file,prog,states -f 'states_pct>10'  not-opt after
      File                     Program                States   (DIFF)
      -----------------------  ---------------------  ---------------
      bpf_overlay.o            tail_rev_nodeport_lb4    -45 (-15.85%)
      bpf_xdp.o                tail_lb_ipv4            -541 (-19.57%)
      pyperf100.bpf.o          on_event                -680 (-10.42%)
      pyperf180.bpf.o          on_event               -2164 (-19.62%)
      pyperf600.bpf.o          on_event               -9799 (-24.84%)
      strobemeta.bpf.o         on_event               -9157 (-65.28%)
      xdp_synproxy_kern.bpf.o  syncookie_tc             -54 (-19.29%)
      xdp_synproxy_kern.bpf.o  syncookie_xdp            -74 (-24.50%)
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240127175237.526726-6-maxtram95@gmail.com
    • selftests/bpf: Add test cases for narrowing fill · 067313a8
      Maxim Mikityanskiy authored
      The previous commit allowed preserving boundaries and tracking IDs of
      scalars on narrowing fills. Add test cases for that pattern.
      Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Eduard Zingerman <eddyz87@gmail.com>
      Link: https://lore.kernel.org/bpf/20240127175237.526726-5-maxtram95@gmail.com
    • bpf: Preserve boundaries and track scalars on narrowing fill · c1e6148c
      Maxim Mikityanskiy authored
      When the width of a fill is smaller than the width of the preceding
      spill, the information about scalar boundaries can still be preserved,
      as long as it's coerced to the right width (done by coerce_reg_to_size).
      Even further, if the actual value fits into the fill width, the ID can
      be preserved as well for further tracking of equal scalars.
      
      Implement the above improvements, which makes narrowing fills behave the
      same as narrowing spills and MOVs between registers.
      
      Two tests are adjusted to accommodate endianness differences and to take
      into account that it's now allowed to do a narrowing fill from the
      least significant bits.
      
      reg_bounds_sync is added to coerce_reg_to_size to correctly adjust
      umin/umax boundaries after the var_off truncation. For example, a 64-bit
      value 0xXXXXXXXX00000000, when read as a 32-bit value, gets umin = 0,
      umax = 0xFFFFFFFF, var_off = (0x0; 0xffffffff00000000), which needs to
      be synced down to umax = 0, otherwise reg_bounds_sanity_check doesn't
      pass.
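
      A toy model of the truncate-then-sync step (illustrative and heavily
      simplified; not the actual coerce_reg_to_size/reg_bounds_sync code):

          #include <stdint.h>

          /* Toy tnum: 'value' holds the known bits, 'mask' marks unknown bits. */
          struct toy_tnum {
                  uint64_t value;
                  uint64_t mask;
          };

          struct toy_scalar {
                  struct toy_tnum var_off;
                  uint64_t umin, umax;
          };

          /* Truncate tracked knowledge to 'size' bytes, then sync umin/umax. */
          static void coerce_and_sync(struct toy_scalar *reg, int size)
          {
                  uint64_t m = size == 8 ? ~0ULL : (1ULL << (size * 8)) - 1;
                  uint64_t max_from_var_off;

                  reg->var_off.value &= m;
                  reg->var_off.mask &= m;
                  if (reg->umin > m)
                          reg->umin = 0;
                  if (reg->umax > m)
                          reg->umax = m;

                  /* The value can never exceed "known bits | unknown bits"... */
                  max_from_var_off = reg->var_off.value | reg->var_off.mask;
                  if (reg->umax > max_from_var_off)
                          reg->umax = max_from_var_off;
                  /* ...and can never be below the known bits alone. */
                  if (reg->umin < reg->var_off.value)
                          reg->umin = reg->var_off.value;
          }

      In this toy model, the 0xXXXXXXXX00000000 example above ends with
      umin = umax = 0 after the sync, mirroring the umax = 0 result described
      in the message.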
      Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240127175237.526726-4-maxtram95@gmail.com
    • selftests/bpf: Test tracking spilled unbounded scalars · 6be503ce
      Maxim Mikityanskiy authored
      The previous commit added tracking for unbounded scalars on spill. Add
      the test case to check the new functionality.
      Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Eduard Zingerman <eddyz87@gmail.com>
      Link: https://lore.kernel.org/bpf/20240127175237.526726-3-maxtram95@gmail.com
    • bpf: Track spilled unbounded scalars · e67ddd9b
      Maxim Mikityanskiy authored
      Support the pattern where an unbounded scalar is spilled to the stack,
      then boundary checks are performed on the src register, after which the
      stack frame slot is refilled into a register.
      
      Before this commit, the verifier didn't treat the src register and the
      stack slot as related if the src register was an unbounded scalar. The
      register state wasn't copied, the id wasn't preserved, and the stack
      slot was marked as STACK_MISC. Subsequent boundary checks on the src
      register wouldn't result in updating the boundaries of the spilled
      variable on the stack.
      
      After this commit, the verifier will preserve the bond between src and
      dst even if src is unbounded, which permits doing boundary checks on src
      and refilling dst later, still remembering its boundaries. Such a pattern
      is sometimes generated by clang when compiling complex long functions.
      
      One test is adjusted to reflect that now unbounded scalars are tracked.
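
      A toy sketch of the state-tracking change (illustrative, not the actual
      spill path in the verifier):

          #include <stdbool.h>
          #include <stdint.h>

          /* Toy model of a tracked scalar and of one 8-byte stack slot. */
          struct toy_scalar {
                  uint64_t umin, umax;
                  uint32_t id;    /* links registers/slots holding the same value */
          };

          struct toy_stack_slot {
                  bool spilled;   /* false stands for STACK_MISC in this model */
                  struct toy_scalar reg;
          };

          /*
           * Before this change an unbounded source degraded the slot to MISC and
           * its id was lost. Copying the full register state instead keeps the
           * id, so a later bounds check on the source also narrows the spilled
           * copy, and a refill gets the narrowed bounds back.
           */
          static void spill_scalar(struct toy_stack_slot *slot, const struct toy_scalar *src)
          {
                  slot->spilled = true;
                  slot->reg = *src;
          }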
      Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Eduard Zingerman <eddyz87@gmail.com>
      Link: https://lore.kernel.org/bpf/20240127175237.526726-2-maxtram95@gmail.com
  7. 01 Feb, 2024 3 commits