- 30 Aug, 2021 1 commit
-
-
Sandipan Das authored
Stepping down as I haven't had a chance to look into the powerpc BPF JIT compilers for a while. Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210827111905.396145-1-sandipan@linux.ibm.com
-
- 27 Aug, 2021 1 commit
-
-
Chengfeng Ye authored
This lock is not released if the program return at the patched branch. Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210827074140.118671-1-cyeaa@connect.ust.hk
-
- 26 Aug, 2021 8 commits
-
-
Kumar Kartikeya Dwivedi authored
While at it, also improve help output when CPU number is greater than possible. Fixes: e531a220 ("samples: bpf: Convert xdp_redirect_cpu to XDP samples helper") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210826120910.454081-1-memxor@gmail.com
-
Yucong Sun authored
This patch adds similar retry logic to more places where read() is used, to reduce flakyness in slow CI environment. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825184745.2680830-1-fallentree@fb.com
-
Daniel Xu authored
This commit fixes linker errors along the lines of: s390-linux-ld: task_iter.c:(.init.text+0xa4): undefined reference to `btf_task_struct_ids'` Fix by defining btf_task_struct_ids unconditionally in kernel/bpf/btf.c since there exists code that unconditionally uses btf_task_struct_ids. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/05d94748d9f4b3eecedc4fddd6875418a396e23c.1629942444.git.dxu@dxuuu.xyz
-
Alexei Starovoitov authored
Martin KaFai says: ==================== This set allows the bpf-tcp-cc to call bpf_setsockopt. One use case is to allow a bpf-tcp-cc switching to another cc during init(). For example, when the tcp flow is not ecn ready, the bpf_dctcp can switch to another cc by calling setsockopt(TCP_CONGESTION). bpf_getsockopt() is also added to have a symmetrical API, so less usage surprise. v2: - Not allow switching to kernel's tcp_cdg because it is the only kernel tcp-cc that stores a pointer to icsk_ca_priv. Please see the commit log in patch 1 for details. Test is added in patch 4 to check switching to tcp_cdg. - Refactor the logic finding the offset of a func ptr in the "struct tcp_congestion_ops" to prog_ops_moff() in patch 1. - bpf_setsockopt() has been disabled in release() since v1 (please see commit log in patch 1 for reason). bpf_getsockopt() is also disabled together in release() in v2 to avoid usage surprise because both of them are usually expected to be available together. bpf-tcp-cc can already use PTR_TO_BTF_ID to read from tcp_sock. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch makes the bpf_dctcp test to fallback to cubic by using setsockopt(TCP_CONGESTION) when the tcp flow is not ecn ready. It also checks setsockopt() is not available to release(). The settimeo() from the network_helpers.h is used, so the local one is removed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210824173026.3979130-1-kafai@fb.com
-
Martin KaFai Lau authored
The next test requires to setsockopt(TCP_CONGESTION) before connect(), so a new arg is needed for the connect_to_fd() to specify the cc's name. This patch adds a new "struct network_helper_opts" for the future option needs. It starts with the "cc" and "timeout_ms" option. A new helper connect_to_fd_opts() is added to take the new "const struct network_helper_opts *opts" as an arg. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210824173019.3977910-1-kafai@fb.com
-
Martin KaFai Lau authored
Add sk_state define to bpf_tcp_helpers.h. Rename the existing global variable "sk_state" in the kfunc_call test to "sk_state_res". Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210824173013.3977316-1-kafai@fb.com
-
Martin KaFai Lau authored
This patch allows the bpf-tcp-cc to call bpf_setsockopt. One use case is to allow a bpf-tcp-cc switching to another cc during init(). For example, when the tcp flow is not ecn ready, the bpf_dctcp can switch to another cc by calling setsockopt(TCP_CONGESTION). During setsockopt(TCP_CONGESTION), the new tcp-cc's init() will be called and this could cause a recursion but it is stopped by the current trampoline's logic (in the prog->active counter). While retiring a bpf-tcp-cc (e.g. in tcp_v[46]_destroy_sock()), the tcp stack calls bpf-tcp-cc's release(). To avoid the retiring bpf-tcp-cc making further changes to the sk, bpf_setsockopt is not available to the bpf-tcp-cc's release(). This will avoid release() making setsockopt() call that will potentially allocate new resources. Although the bpf-tcp-cc already has a more powerful way to read tcp_sock from the PTR_TO_BTF_ID, it is usually expected that bpf_getsockopt and bpf_setsockopt are available together. Thus, bpf_getsockopt() is also added to all tcp_congestion_ops except release(). When the old bpf-tcp-cc is calling setsockopt(TCP_CONGESTION) to switch to a new cc, the old bpf-tcp-cc will be released by bpf_struct_ops_put(). Thus, this patch also puts the bpf_struct_ops_map after a rcu grace period because the trampoline's image cannot be freed while the old bpf-tcp-cc is still running. bpf-tcp-cc can only access icsk_ca_priv as SCALAR. All kernel's tcp-cc is also accessing the icsk_ca_priv as SCALAR. The size of icsk_ca_priv has already been raised a few times to avoid extra kmalloc and memory referencing. The only exception is the kernel's tcp_cdg.c that stores a kmalloc()-ed pointer in icsk_ca_priv. To avoid the old bpf-tcp-cc accidentally overriding this tcp_cdg's pointer value stored in icsk_ca_priv after switching and without over-complicating the bpf's verifier for this one exception in tcp_cdg, this patch does not allow switching to tcp_cdg. If there is a need, bpf_tcp_cdg can be implemented and then use the bpf_sk_storage as the extended storage. bpf_sk_setsockopt proto has only been recently added and used in bpf-sockopt and bpf-iter-tcp, so impose the tcp_cdg limitation in the same proto instead of adding a new proto specifically for bpf-tcp-cc. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210824173007.3976921-1-kafai@fb.com
-
- 25 Aug, 2021 23 commits
-
-
Alexei Starovoitov authored
Magnus Karlsson says: ==================== This patch set mainly contains various simplifications to the xsk selftests. The only exception is the introduction of packet streams that describes what the Tx process should send and what the Rx process should receive. If it receives anything else, the test fails. This mechanism can be used to produce tests were all packets are not received by the Rx thread or modified in some way. An example of this is if an XDP program does XDP_PASS on some of the packets. This patch set will be followed by another patch set that implements a new structure that will facilitate adding new tests. A couple of new tests will also be included in that patch set. v2 -> v3: * Reworked patch 12 so that it now has functions for creating and destroying ifobjects. Simplifies the code. [Maciej] * The packet stream now allocates the supplied buffer array length, instead of the default one. [Maciej] * pkt_stream_get_pkt() now returns NULL when indexing a non-existing packet. [Maciej] * pkt_validate() is now is_pkt_valid(). [Maciej] * Slowed down packet sending speed even more in patch 11 so that slow systems do not silenty drop packets in skb mode. v1 -> v2: * Dropped the patch with per process limit changes as it is not needed [Yonghong] * Improved the commit message of patch 1 [Yonghong] * Fixed a spelling error in patch 9 Thanks: Magnus ==================== Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Magnus Karlsson authored
Preface all options with opt_ and make them booleans. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-17-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Make enums lower case as that is the standard. Also drop the unnecessary TEST_MODE_UNCONFIGURED mode. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-16-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Generate packets from a specification instead of something hard coded. The idea is that a test generates one or more packet specifications and provides it/them to both Tx and Rx. The Tx thread will generate from this specification and Rx will validate that it receives what is in the specification. The specification can be the same on both ends, meaning that everything that was sent should be received, or different which means that Rx will only receive part of the sent packets. Currently, the packet specification is the same for both Rx and Tx and the same for each test. This will change in later work as features and tests are added. The data path functions are also renamed to better reflect what actions they are performing after introducing this feature. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-15-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Generate the packet directly in the umem instead of in a temporary buffer that is copied out. Simplifies the code and improves performance. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-14-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Simpify the cleanup of ifobjects right before the program exits by introducing functions for creating and destroying these objects. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-13-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Decrease sending speed to avoid potentially overflowing some buffers in the skb case that leads to dropped packets we cannot control (and thus the tests may generate false negatives). Decrease batch size and introduce a usleep in the transmit thread to not overflow the receiver. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-12-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Validate the tx stats on the Tx thread instead of the Rx thread. Depending on your settings, you might not be allowed to query the statistics of a socket you do not own, so better to do this on the correct thread to start with. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-11-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Simplify packet validation in the xsk selftests by performing it at once for every packet. The current code performed this per batch and did this on copied packet data. Make it simpler and faster by validating it at once and on the umem packet data thus skipping the copy and the memory allocation for the temprary buffer. The optional packet dump feature is also simplified in the same manner. Memory allocation and copying is removed and the dump is performed directly on the umem data. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-10-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Rename worker_* functions that are not thread entry points to something else. This was confusing. Now only thread entry points are worker_something. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-9-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Disassociate the number of packets sent with the number of buffers in the umem. This so we can loop over the umem to test more things. Set the size of the umem to be a multiple of 2M. A requirement for huge pages that are needed in unaligned mode. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-8-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Get rid of the end-of-test packet and just count the number of packets received and quit when the expected number as been received. Simplifies the code. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-7-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Simplify the retry code and make it more efficient by waiting first, instead of trying immediately which always fails due to the asynchronous nature of xsk socket close. Also decrease the wait time to significantly lower the run-time of the test suite. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-6-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Return the correct error codes so they can be printed correctly. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-5-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Remove unused variables and typedefs. The *_npkts variables are incremented but never used. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-4-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Remove the number of tx packet option as this should be decided by the test itself. Also change the number of packets to be sent to 4096 speeding up the execution. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-3-magnus.karlsson@gmail.com
-
Magnus Karlsson authored
Remove color mode since it does not add any value and having less code means less maintenance which is a good thing. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-2-magnus.karlsson@gmail.com
-
Alexei Starovoitov authored
Daniel Xu says: ==================== The motivation behind this helper is to access userspace pt_regs in a kprobe handler. uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a kprobe handler. The final case (kernelspace pt_regs in uprobe) is pretty rare (usermode helper) so I think that can be solved later if necessary. More concretely, this helper is useful in doing BPF-based DWARF stack unwinding. Currently the kernel can only do framepointer based stack unwinds for userspace code. This is because the DWARF state machines are too fragile to be computed in kernelspace [0]. The idea behind DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace stack (while in prog context) and send it up to userspace for unwinding (probably with libunwind) [1]. This would effectively enable profiling applications with -fomit-frame-pointer using kprobes and uprobes. [0]: https://lkml.org/lkml/2012/2/10/356 [1]: https://github.com/danobi/bpf-dwarf-walk Changes from v1: - Conwolidate BTF_ID decls for task_struct - Enable bpf_get_current_task_btf() for all prog types - Enable bpf_task_pt_regs() for all prog types - Use ASSERT_* macros instead of CHECK ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
This test retrieves the uprobe's pt_regs in two different ways and compares the contents in an arch-agnostic way. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/5581eb8800f6625ec8813fe21e9dce1fbdef4937.1629772842.git.dxu@dxuuu.xyz
-
Daniel Xu authored
The motivation behind this helper is to access userspace pt_regs in a kprobe handler. uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a kprobe handler. The final case (kernelspace pt_regs in uprobe) is pretty rare (usermode helper) so I think that can be solved later if necessary. More concretely, this helper is useful in doing BPF-based DWARF stack unwinding. Currently the kernel can only do framepointer based stack unwinds for userspace code. This is because the DWARF state machines are too fragile to be computed in kernelspace [0]. The idea behind DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace stack (while in prog context) and send it up to userspace for unwinding (probably with libunwind) [1]. This would effectively enable profiling applications with -fomit-frame-pointer using kprobes and uprobes. [0]: https://lkml.org/lkml/2012/2/10/356 [1]: https://github.com/danobi/bpf-dwarf-walkSigned-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/e2718ced2d51ef4268590ab8562962438ab82815.1629772842.git.dxu@dxuuu.xyz
-
Daniel Xu authored
bpf_get_current_task() is already supported so it's natural to also include the _btf() variant for btf-powered helpers. This is required for non-tracing progs to use bpf_task_pt_regs() in the next commit. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/f99870ed5f834c9803d73b3476f8272b1bb987c0.1629772842.git.dxu@dxuuu.xyz
-
Daniel Xu authored
No need to have it defined 5 times. Once is enough. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/6dcefa5bed26fe1226f26683f36819bb53ec19a2.1629772842.git.dxu@dxuuu.xyz
-
Daniel Xu authored
Same as BTF_ID_LIST_SINGLE macro except defines a global ID. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/a867a97517df42fd3953eeb5454402b57e74538f.1629772842.git.dxu@dxuuu.xyz
-
- 24 Aug, 2021 7 commits
-
-
Alexei Starovoitov authored
Kumar Kartikeya says: ==================== This set revamps XDP samples related to redirection to show better output and implement missing features consolidating all their differences and giving them a consistent look and feel, by implementing common features and command line options. Some of the TODO items like reporting redirect error numbers (ENETDOWN, EINVAL, ENOSPC, etc.) have also been implemented. Some of the features are: * Received packet statistics * xdp_redirect/xdp_redirect_map tracepoint statistics * xdp_redirect_err/xdp_redirect_map_err tracepoint statistics (with support for showing exact errno) * xdp_cpumap_enqueue/xdp_cpumap_kthread tracepoint statistics * xdp_devmap_xmit tracepoint statistics * xdp_exception tracepoint statistics * Per ifindex pair devmap_xmit stats shown dynamically (for xdp_monitor) to decompose the total. * Use of BPF skeleton and BPF static linking to share BPF programs. * Use of vmlinux.h and tp_btf for raw_tracepoint support. * Removal of redundant -N/--native-mode option (enforced by default now) * ... and massive cleanups all over the place. All tracepoints also use raw_tp now, and tracepoints like xdp_redirect are only enabled when requested explicitly to capture successful redirection statistics. The set of programs converted as part of this series are: * xdp_redirect_cpu * xdp_redirect_map_multi * xdp_redirect_map * xdp_redirect * xdp_monitor Explanation of the output: There is now a concise output mode by default that shows primarily four fields: rx/s Number of packets received per second redir/s Number of packets successfully redirected per second err,drop/s Aggregated count of errors per second (including dropped packets) xmit/s Number of packets transmitted on the output device per second Some examples: ; sudo ./xdp_redirect_map veth0 veth1 -s Redirecting from veth0 (ifindex 15; driver veth) to veth1 (ifindex 14; driver veth) veth0->veth1 0 rx/s 0 redir/s 0 err,drop/s 0 xmit/s veth0->veth1 9,998,660 rx/s 9,998,658 redir/s 0 err,drop/s 9,998,654 xmit/s ... There is also a verbose mode, that can also be enabled by default using -v (--verbose). The output mode can be switched dynamically at runtime using Ctrl + \ (SIGQUIT). To make the concise output more useful, the errors that occur are expanded inline (as if verbose mode was enabled) to let the user pin down the source of the problem without having to clutter output (or possibly miss it) or always use verbose mode. For instance, let's consider a case where the output device link state is set to down while redirection is happening: [...] veth0->veth1 24,503,376 rx/s 0 err,drop/s 24,503,372 xmit/s veth0->veth1 25,044,775 rx/s 0 err,drop/s 25,044,783 xmit/s veth0->veth1 25,263,046 rx/s 4 err,drop/s 25,263,028 xmit/s redirect_err 4 error/s ENETDOWN 4 error/s [...] The same holds for xdp_exception actions. An example of how a complete xdp_redirect_map session would look: ; sudo ./xdp_redirect_map veth0 veth1 Redirecting from veth0 (ifindex 5; driver veth) to veth1 (ifindex 4; driver veth) veth0->veth1 7,411,506 rx/s 0 err,drop/s 7,411,470 xmit/s veth0->veth1 8,931,770 rx/s 0 err,drop/s 8,931,771 xmit/s ^\ veth0->veth1 8,787,295 rx/s 0 err,drop/s 8,787,325 xmit/s receive total 8,787,295 pkt/s 0 drop/s 0 error/s cpu:7 8,787,295 pkt/s 0 drop/s 0 error/s redirect_err 0 error/s xdp_exception 0 hit/s xmit veth0->veth1 8,787,325 xmit/s 0 drop/s 0 drv_err/s 2.00 bulk-avg cpu:7 8,787,325 xmit/s 0 drop/s 0 drv_err/s 2.00 bulk-avg veth0->veth1 8,842,610 rx/s 0 err,drop/s 8,842,606 xmit/s receive total 8,842,610 pkt/s 0 drop/s 0 error/s cpu:7 8,842,610 pkt/s 0 drop/s 0 error/s redirect_err 0 error/s xdp_exception 0 hit/s xmit veth0->veth1 8,842,606 xmit/s 0 drop/s 0 drv_err/s 2.00 bulk-avg cpu:7 8,842,606 xmit/s 0 drop/s 0 drv_err/s 2.00 bulk-avg ^C Packets received : 33,973,181 Average packets/s : 4,246,648 Packets transmitted : 33,973,172 Average transmit/s : 4,246,647 The xdp_redirect tracepoint (for success stats) needs to be enabled explicitly using --stats/-s. Documentation for entire output and options is provided when user specifies --help/-h with a sample. Changelog: ---------- v3 -> v4: v3: https://lore.kernel.org/bpf/20210728165552.435050-1-memxor@gmail.com * Address all feedback from Daniel * Use READ_ONCE/WRITE_ONCE from linux/compiler.h (cannot directly include due to conflicts with vmlinux.h) * Fix MAX_CPUS hardcoding by switching to mmapable array maps, that are resized based on the value of libbpf_num_possible_cpus * s/ELEMENTS_OF/ARRAY_SIZE/g * Use tools/include/linux/hashtable.h * Coding style fixes * Remove hyperlinks for tracepoints * Split into smaller reviewable changes * Restore support for specifying custom xdp_redirect_cpu cpumap prog with some enhancements, including built-in programs for common actions (pass, drop, redirect). By default, cpumap prog is now disabled. * Misc bug fixes all over the place The printing stuff is a lot more basic without hyperlink support, hence it has not been exported into a more general facility. v2 -> v3 v2: https://lore.kernel.org/bpf/20210721212833.701342-1-memxor@gmail.com * Address all feedback from Andrii * Replace usage of libbpf hashmap (internal API) with custom one * Rename ATOMIC_* macros to NO_TEAR_* to better reflect their use * Use size_t as a portable word sized data type * Set libbpf_set_strict_mode * Invert conditions in BPF programs to exit early and reduce nesting * Use canonical SEC("xdp") naming for all XDP BPF progams * Add missing help description for cpumap enqueue and kthread tracepoints * Move private struct declarations from xdp_sample_user.h to .c file * Improve help output for cpumap enqueue and cpumap kthread tracepoints * Fix a bug where keys array for BPF_MAP_LOOKUP_BATCH is overallocated * Fix some conditions for printing stats (earlier only checked pps, now pps, drop, err and print if any is greater than zero) * Fix alloc_stats_record to properly return and cleanup allocated memory on allocation failure instead of calling exit(3) * Bump bpf_map_lookup_batch count to 32 to reduce lookup time with multiple devices in map * Fix a bug where devmap_xmit_multi stats are not printed when previous record is missing (i.e. when the first time stats are printed), by simply using a dummy record that is zeroed out * Also print per-CPU counts for devmap_xmit_multi which we collect already * Change mac_map to be BPF_MAP_TYPE_HASH instead of array to prevent resizing to a large size when max_ifindex is high, in xdp_redirect_map_multi * Fix instance of strerror(errno) in sample_install_xdp to use saved errno * Provide a usage function from samples helper * Provide a fix where incorrect stats are shown for parallel sessions of xdp_redirect_* samples by introducing matching support for input device(s), output device(s) and cpumap map id for enqueue and kthread stats. Only xdp_monitor doesn't filter stats, all others do. RFC (v1) -> v2 RFC (v1): https://lore.kernel.org/bpf/20210528235250.2635167-1-memxor@gmail.com * Address all feedback from Andrii * Use BPF static linking * Use vmlinux.h * Use BPF_PROG macro * Use global variables instead of maps * Use of tp_btf for raw_tracepoint progs * Switch to timerfd for polling * Use libbpf hashmap for maintaing device sets for per ifindex pair devmap_xmit stats * Fix Makefile to specify object dependencies properly * Use in-tree bpftool * ... misc fixes and cleanups all over the place ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Kumar Kartikeya Dwivedi authored
Use the libbpf skeleton facility and other utilities provided by XDP samples helper. Also adapt to change of type of mac address map, so that no resizing is required. Add a new flag for sample mask that skips priting the from_device->to_device heading for each line, as xdp_redirect_map_multi may have two devices but the flow of data may be bidirectional, so the output would be confusing. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-23-memxor@gmail.com
-
Kumar Kartikeya Dwivedi authored
One of the notable changes is using a BPF_MAP_TYPE_HASH instead of array map to store mac addresses of devices, as the resizing behavior was based on max_ifindex, which unecessarily maximized the capacity of map beyond what was needed. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-22-memxor@gmail.com
-
Kumar Kartikeya Dwivedi authored
Use the libbpf skeleton facility and other utilities provided by XDP samples helper. Since get_mac_addr is already provided by XDP samples helper, we drop it. Also convert to XDP samples helper similar to prior samples to minimize duplication of code. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-21-memxor@gmail.com
-
Kumar Kartikeya Dwivedi authored
Also update it to use consistent SEC("xdp") and SEC("xdp_devmap") naming, and use global variable instead of BPF map for copying the mac address. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-20-memxor@gmail.com
-
Kumar Kartikeya Dwivedi authored
Use the libbpf skeleton facility and other utilities provided by XDP samples helper. Similar to xdp_monitor, xdp_redirect_cpu was quite featureful except a few minor omissions (e.g. redirect errno reporting). All of these have been moved to XDP samples helper, hence drop the unneeded code and convert to usage of helpers provided by it. One of the important changes here is dropping of mprog-disable option, as we make that the default. Also, we support built-in programs for some common actions on the packet when it reaches kthread (pass, drop, redirect to device). If the user still needs to install a custom program, they can still supply a BPF object, however the program should be suitably tagged with SEC("xdp_cpumap") annotation so that the expected attach type is correct when updating our cpumap map element. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-19-memxor@gmail.com
-
Kumar Kartikeya Dwivedi authored
Similar to xdp_monitor_kern, a lot of these BPF programs have been reimplemented properly consolidating missing features from other XDP samples. Hence, drop the unneeded code and rename to .bpf.c suffix. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210821002010.845777-18-memxor@gmail.com
-