- 21 Aug, 2023 40 commits
-
-
Jiri Olsa authored
Running api and link tests also with pid filter and checking the probe gets executed only for specific pid. Spawning extra process to trigger attached uprobes and checking we get correct counts from executed programs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-28-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding test for cookies setup/retrieval in uprobe_link uprobes and making sure bpf_get_attach_cookie works properly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-27-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding test that attaches 50k usdt probes in usdt_multi binary. After the attach is done we run the binary and make sure we get proper amount of hits. With current uprobes: # perf stat --null ./test_progs -n 254/6 #254/6 uprobe_multi_test/bench_usdt:OK #254 uprobe_multi_test:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Performance counter stats for './test_progs -n 254/6': 1353.659680562 seconds time elapsed With uprobe_multi link: # perf stat --null ./test_progs -n 254/6 #254/6 uprobe_multi_test/bench_usdt:OK #254 uprobe_multi_test:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Performance counter stats for './test_progs -n 254/6': 0.322046364 seconds time elapsed Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-26-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding code in uprobe_multi test binary that defines 50k usdts and will serve as attach point for uprobe_multi usdt bench test in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-25-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding test that attaches 50k uprobes in uprobe_multi binary. After the attach is done we run the binary and make sure we get proper amount of hits. The resulting attach/detach times on my setup: test_bench_attach_uprobe:PASS:uprobe_multi__open 0 nsec test_bench_attach_uprobe:PASS:uprobe_multi__attach 0 nsec test_bench_attach_uprobe:PASS:uprobes_count 0 nsec test_bench_attach_uprobe: attached in 0.346s test_bench_attach_uprobe: detached in 0.419s #262/5 uprobe_multi_test/bench_uprobe:OK Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-24-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding uprobe_multi test program that defines 50k uprobe_multi_func_* functions and will serve as attach point for uprobe_multi bench test in following patch. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-23-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding uprobe_multi test for bpf_link_create attach function. Testing attachment using the struct bpf_link_create_opts. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-22-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding uprobe_multi test for bpf_program__attach_uprobe_multi attach function. Testing attachment using glob patterns and via bpf_uprobe_multi_opts paths/syms fields. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-21-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding uprobe_multi test for skeleton load/attach functions, to test skeleton auto attach for uprobe_multi link. Test that bpf_get_func_ip works properly for uprobe_multi attachment. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-20-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
We'd like to have single copy of get_time_ns used b bench and test_progs, but we can't just include bench.h, because of conflicting 'struct env' objects. Moving get_time_ns to testing_helpers.h which is being included by both bench and test_progs objects. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-19-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding support for usdt_manager_attach_usdt to use uprobe_multi link to attach to usdt probes. The uprobe_multi support is detected before the usdt program is loaded and its expected_attach_type is set accordingly. If uprobe_multi support is detected the usdt_manager_attach_usdt gathers uprobes info and calls bpf_program__attach_uprobe to create all needed uprobes. If uprobe_multi support is not detected the old behaviour stays. Also adding usdt.s program section for sleepable usdt probes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding uprobe-multi link detection. It will be used later in bpf_program__attach_usdt function to check and use uprobe_multi link over standard uprobe links. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding support for several uprobe_multi program sections to allow auto attach of multi_uprobe programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding bpf_program__attach_uprobe_multi function that allows to attach multiple uprobes with uprobe_multi link. The user can specify uprobes with direct arguments: binary_path/func_pattern/pid or with struct bpf_uprobe_multi_opts opts argument fields: const char **syms; const unsigned long *offsets; const unsigned long *ref_ctr_offsets; const __u64 *cookies; User can specify 2 mutually exclusive set of inputs: 1) use only path/func_pattern/pid arguments 2) use path/pid with allowed combinations of: syms/offsets/ref_ctr_offsets/cookies/cnt - syms and offsets are mutually exclusive - ref_ctr_offsets and cookies are optional Any other usage results in error. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding new uprobe_multi struct to bpf_link_create_opts object to pass multiple uprobe data to link_create attr uapi. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding elf_resolve_pattern_offsets function that looks up offsets for symbols specified by pattern argument. The 'pattern' argument allows wildcards (*?' supported). Offsets are returned in allocated array together with its size and needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding elf_resolve_syms_offsets function that looks up offsets for symbols specified in syms array argument. Offsets are returned in allocated array with the 'cnt' size, that needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding elf symbol iterator object (and some functions) that follow open-coded iterator pattern and some functions to ease up iterating elf object symbols. The idea is to iterate single symbol section with: struct elf_sym_iter iter; struct elf_sym *sym; if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM)) goto error; while ((sym = elf_sym_iter_next(&iter))) { ... } I considered opening the elf inside the iterator and iterate all symbol sections, but then it gets more complicated wrt user checks for when the next section is processed. Plus side is the we don't need 'exit' function, because caller/user is in charge of that. The returned iterated symbol object from elf_sym_iter_next function is placed inside the struct elf_sym_iter, so no extra allocation or argument is needed. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding elf_open/elf_close functions and using it in elf_find_func_offset_from_file function. It will be used in following changes to save some common code. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding new elf object that will contain elf related functions. There's no functional change. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding new uprobe_multi attach type and link names, so the functions can resolve the new values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding support for bpf_get_func_ip helper being called from ebpf program attached by uprobe_multi link. It returns the ip of the uprobe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-7-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding support to specify pid for uprobe_multi link and the uprobes are created only for task with given pid value. Using the consumer.filter filter callback for that, so the task gets filtered during the uprobe installation. We still need to check the task during runtime in the uprobe handler, because the handler could get executed if there's another system wide consumer on the same uprobe (thanks Oleg for the insight). Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-6-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding support to specify cookies array for uprobe_multi link. The cookies array share indexes and length with other uprobe_multi arrays (offsets/ref_ctr_offsets). The cookies[i] value defines cookie for i-the uprobe and will be returned by bpf_get_attach_cookie helper when called from ebpf program hooked to that specific uprobe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-5-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding new multi uprobe link that allows to attach bpf program to multiple uprobes. Uprobes to attach are specified via new link_create uprobe_multi union: struct { __aligned_u64 path; __aligned_u64 offsets; __aligned_u64 ref_ctr_offsets; __u32 cnt; __u32 flags; } uprobe_multi; Uprobes are defined for single binary specified in path and multiple calling sites specified in offsets array with optional reference counters specified in ref_ctr_offsets array. All specified arrays have length of 'cnt'. The 'flags' supports single bit for now that marks the uprobe as return probe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-4-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Add extra attach_type checks from link_create under bpf_prog_attach_check_attach_type. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-3-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Switching BPF_F_KPROBE_MULTI_RETURN macro to anonymous enum, so it'd show up in vmlinux.h. There's not functional change compared to having this as macro. Acked-by: Yafang Shao <laoar.shao@gmail.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-2-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Daniel T. Lee says: ==================== samples/bpf: make BPF programs more libbpf aware The existing tracing programs have been developed for a considerable period of time and, as a result, do not properly incorporate the features of the current libbpf, such as CO-RE. This is evident in frequent usage of functions like PT_REGS* and the persistence of "hack" methods using underscore-style bpf_probe_read_kernel from the past. These programs are far behind the current level of libbpf and can potentially confuse users. The kernel has undergone significant changes, and some of these changes have broken these programs, but on the other hand, more robust APIs have been developed for increased stableness. To list some of the kernel changes that this patch set is focusing on, - symbol mismatch occurs due to compiler optimization [1] - inline of blk_account_io* breaks BPF kprobe program [2] - new tracepoints for the block_io_start/done are introduced [3] - map lookup probes can't be triggered (bpf_disable_instrumentation)[4] - BPF_KSYSCALL has been introduced to simplify argument fetching [5] - convert to vmlinux.h and use tp argument structure within it - make tracing programs to be more CO-RE centric In this regard, this patch set aims not only to integrate the latest features of libbpf into BPF programs but also to reduce confusion and clarify the BPF programs. This will help with the potential confusion among users and make the programs more intutitive. [1]: https://github.com/iovisor/bcc/issues/1754 [2]: https://github.com/iovisor/bcc/issues/4261 [3]: commit 5a80bd07 ("block: introduce block_io_start/block_io_done tracepoints") [4]: commit 7c4cd051 ("bpf: Fix syscall's stackmap lookup potential deadlock") [5]: commit 6f5d467d ("libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL") ==================== Link: https://lore.kernel.org/r/20230818090119.477441-1-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
With the introduction of kprobe.multi, it is now possible to attach multiple kprobes to a single BPF program without the need for multiple definitions. Additionally, this method supports wildcard-based matching, allowing for further simplification of BPF programs. In here, an asterisk (*) wildcard is used to map to all symbols relevant to spin_{lock|unlock}. Furthermore, since kprobe.multi handles symbol matching, this commit eliminates the need for the previous logic of reading the ksym table to verify the existence of symbols. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-10-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
This commit refactors the syscall tracing programs by adopting the BPF_KSYSCALL macro. This change aims to enhance the clarity and simplicity of the BPF programs by reducing the complexity of argument parsing from pt_regs. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-9-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
In the commit 7c4cd051 ("bpf: Fix syscall's stackmap lookup potential deadlock"), a potential deadlock issue was addressed, which resulted in *_map_lookup_elem not triggering BPF programs. (prior to lookup, bpf_disable_instrumentation() is used) To resolve the broken map lookup probe using "htab_map_lookup_elem", this commit introduces an alternative approach. Instead, it utilize "bpf_map_copy_value" and apply a filter specifically for the hash table with map_type. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Fixes: 7c4cd051 ("bpf: Fix syscall's stackmap lookup potential deadlock") Link: https://lore.kernel.org/r/20230818090119.477441-8-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
Recently, a new tracepoint for the block layer, specifically the block_io_start/done tracepoints, was introduced in commit 5a80bd07 ("block: introduce block_io_start/block_io_done tracepoints"). Previously, the kprobe entry used for this purpose was quite unstable and inherently broke relevant probes [1]. Now that a stable tracepoint is available, this commit replaces the bio latency check with it. One of the changes made during this replacement is the key used for the hash table. Since 'struct request' cannot be used as a hash key, the approach taken follows that which was implemented in bcc/biolatency [2]. (uses dev:sector for the key) [1]: https://github.com/iovisor/bcc/issues/4261 [2]: https://github.com/iovisor/bcc/pull/4691 Fixes: 450b7879 ("block: move blk_account_io_{start,done} to blk-mq.c") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-7-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
The existing tracing programs have been developed for a considerable period of time and, as a result, do not properly incorporate the features of the current libbpf, such as CO-RE. This is evident in frequent usage of functions like PT_REGS* and the persistence of "hack" methods using underscore-style bpf_probe_read_kernel from the past. These programs are far behind the current level of libbpf and can potentially confuse users. Therefore, this commit aims to convert the outdated BPF programs to be more CO-RE centric. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-6-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
Currently, multiple kprobe programs are suffering from symbol mismatch due to compiler optimization. These optimizations might induce additional suffix to the symbol name such as '.isra' or '.constprop'. # egrep ' finish_task_switch| __netif_receive_skb_core' /proc/kallsyms ffffffff81135e50 t finish_task_switch.isra.0 ffffffff81dd36d0 t __netif_receive_skb_core.constprop.0 ffffffff8205cc0e t finish_task_switch.isra.0.cold ffffffff820b1aba t __netif_receive_skb_core.constprop.0.cold To avoid this, this commit replaces the original kprobe section to kprobe.multi in order to match symbol with wildcard characters. Here, asterisk is used for avoiding symbol mismatch. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-5-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
Currently, BPF programs typically have a suffix of .bpf.c. However, some programs still utilize a mixture of _kern.c suffix alongside the naming convention. In order to achieve consistency in the naming of these programs, this commit unifies the inconsistency in the naming convention of BPF kernel programs. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-4-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
This commit replaces separate headers with a single vmlinux.h to tracing programs. Thanks to that, we no longer need to define the argument structure for tracing programs directly. For example, argument for the sched_switch tracpepoint (sched_switch_args) can be replaced with the vmlinux.h provided trace_event_raw_sched_switch. Additional defines have been added to the BPF program either directly or through the inclusion of net_shared.h. Defined values are PERF_MAX_STACK_DEPTH, IFNAMSIZ constants and __stringify() macro. This change enables the BPF program to access internal structures with BTF generated "vmlinux.h" header. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-3-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
Currently, compiling the bpf programs will result the warning with the ignored attribute as follows. This commit fixes the warning by adding cf-protection option. In file included from ./arch/x86/include/asm/linkage.h:6: ./arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes] extern __noendbr u64 ibt_save(bool disable); ^ ./arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr' ^ Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-2-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Hou Tao says: ==================== Remove unnecessary synchronizations in cpumap From: Hou Tao <houtao1@huawei.com> Hi, This is the formal patchset to remove unnecessary synchronizations in cpu-map after address comments and collect Rvb tags from Toke Høiland-Jørgensen (Big thanks to Toke). Patch #1 removes the unnecessary rcu_barrier() when freeing bpf_cpu_map_entry and replaces it by queue_rcu_work(). Patch #2 removes the unnecessary call_rcu() and queue_work() when destroying cpu-map and does the freeing directly. Test the patchset by using xdp_redirect_cpu and virtio-net. Both xdp-mode and skb-mode have been exercised and no issues were reported. As ususal, comments and suggestions are always welcome. Change Log: v1: * address comments from Toke Høiland-Jørgensen * add Rvb tags from Toke Høiland-Jørgensen * update outdated comment in cpu_map_delete_elem() RFC: https://lore.kernel.org/bpf/20230728023030.1906124-1-houtao@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20230816045959.358059-1-houtao@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Hou Tao authored
After synchronous_rcu(), both the dettached XDP program and xdp_do_flush() are completed, and the only user of bpf_cpu_map_entry will be cpu_map_kthread_run(), so instead of calling __cpu_map_entry_replace() to stop kthread and cleanup entry after a RCU grace period, do these things directly. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230816045959.358059-3-houtao@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Hou Tao authored
As for now __cpu_map_entry_replace() uses call_rcu() to wait for the inflight xdp program to exit the RCU read critical section, and then launch kworker cpu_map_kthread_stop() to call kthread_stop() to flush all pending xdp frames or skbs. But it is unnecessary to use rcu_barrier() in cpu_map_kthread_stop() to wait for the completion of __cpu_map_entry_free(), because rcu_barrier() will wait for all pending RCU callbacks and cpu_map_kthread_stop() only needs to wait for the completion of a specific __cpu_map_entry_free(). So use queue_rcu_work() to replace call_rcu(), schedule_work() and rcu_barrier(). queue_rcu_work() will queue a __cpu_map_entry_free() kworker after a RCU grace period. Because __cpu_map_entry_free() is running in a kworker context, so it is OK to do all of these freeing procedures include kthread_stop() in it. After the update, there is no need to do reference-counting for bpf_cpu_map_entry, because bpf_cpu_map_entry is freed directly in __cpu_map_entry_free(), so just remove it. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230816045959.358059-2-houtao@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-