- 27 Apr, 2021 13 commits
-
-
Alexei Starovoitov authored
Florent Revest says: ==================== BPF's formatted output helpers are currently implemented with snprintf-like functions which use variadic arguments. The types of all arguments need to be known at compilation time. BPF_CAST_FMT_ARG casts all arguments to the size they should be (known at runtime), but the C type promotion rules cast them back to u64s. On 32 bit architectures, this can cause misaligned va_lists and generate mangled output. This series refactors these helpers to avoid variadic arguments. It uses a "binary printf" instead, where arguments are passed in a buffer constructed at runtime. --- Changes in v2: - Reworded the second patch's description to better describe how arguments get mangled on 32 bit architectures ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Florent Revest authored
BPF has three formatted output helpers: bpf_trace_printk, bpf_seq_printf and bpf_snprintf. Their signatures specify that all arguments are provided from the BPF world as u64s (in an array or as registers). All of these helpers are currently implemented by calling functions such as snprintf() whose signatures take a variable number of arguments, then placed in a va_list by the compiler to call vsnprintf(). "d9c9e4db bpf: Factorize bpf_trace_printk and bpf_seq_printf" introduced a bpf_printf_prepare function that fills an array of u64 sanitized arguments with an array of "modifiers" which indicate what the "real" size of each argument should be (given by the format specifier). The BPF_CAST_FMT_ARG macro consumes these arrays and casts each argument to its real size. However, the C promotion rules implicitely cast them all back to u64s. Therefore, the arguments given to snprintf are u64s and the va_list constructed by the compiler will use 64 bits for each argument. On 64 bit machines, this happens to work well because 32 bit arguments in va_lists need to occupy 64 bits anyway, but on 32 bit architectures this breaks the layout of the va_list expected by the called function and mangles values. In "88a5c690 bpf: fix bpf_trace_printk on 32 bit archs", this problem had been solved for bpf_trace_printk only with a "horrid workaround" that emitted multiple calls to trace_printk where each call had different argument types and generated different va_list layouts. One of the call would be dynamically chosen at runtime. This was ok with the 3 arguments that bpf_trace_printk takes but bpf_seq_printf and bpf_snprintf accept up to 12 arguments. Because this approach scales code exponentially, it is not a viable option anymore. Because the promotion rules are part of the language and because the construction of a va_list is an arch-specific ABI, it's best to just avoid variadic arguments and va_lists altogether. Thankfully the kernel's snprintf() has an alternative in the form of bstr_printf() that accepts arguments in a "binary buffer representation". These binary buffers are currently created by vbin_printf and used in the tracing subsystem to split the cost of printing into two parts: a fast one that only dereferences and remembers values, and a slower one, called later, that does the pretty-printing. This patch refactors bpf_printf_prepare to construct binary buffers of arguments consumable by bstr_printf() instead of arrays of arguments and modifiers. This gets rid of BPF_CAST_FMT_ARG and greatly simplifies the bpf_printf_prepare usage but there are a few gotchas that change how bpf_printf_prepare needs to do things. Currently, bpf_printf_prepare uses a per cpu temporary buffer as a generic storage for strings and IP addresses. With this refactoring, the temporary buffers now holds all the arguments in a structured binary format. To comply with the format expected by bstr_printf, certain format specifiers also need to be pre-formatted: %pB and %pi6/%pi4/%pI4/%pI6. Because vsnprintf subroutines for these specifiers are hard to expose, we pre-format these arguments with calls to snprintf(). Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210427174313.860948-3-revest@chromium.org
-
Florent Revest authored
Similarly to seq_buf_bprintf in lib/seq_buf.c, this function writes a printf formatted string with arguments provided in a "binary representation" built by functions such as vbin_printf. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210427174313.860948-2-revest@chromium.org
-
Hengqi Chen authored
Add a missing colon so that the code block followed can be rendered properly. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210424021208.832116-1-hengqi.chen@gmail.com
-
Lorenzo Bianconi authored
Rely on netif_receive_skb_list routine to send skbs converted from xdp_frames in cpu_map_kthread_run in order to improve i-cache usage. The proposed patch has been tested running xdp_redirect_cpu bpf sample available in the kernel tree that is used to redirect UDP frames from ixgbe driver to a cpumap entry and then to the networking stack. UDP frames are generated using pktgen. Packets are discarded by the UDP layer. $ xdp_redirect_cpu --cpu <cpu> --progname xdp_cpu_map0 --dev <eth> bpf-next: ~2.35Mpps bpf-next + cpumap skb-list: ~2.72Mpps Rename drops counter in kmem_alloc_drops since now it reports just kmem_cache_alloc_bulk failures Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/c729f83e5d7482d9329e0f165bdbe5adcefd1510.1619169700.git.lorenzo@kernel.org
-
Daniel Borkmann authored
Similarly as b0270958 ("bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds."), we also need to fix the propagation of 32 bit unsigned bounds from 64 bit counterparts. That is, really only set the u32_{min,max}_value when /both/ {umin,umax}_value safely fit in 32 bit space. For example, the register with a umin_value == 1 does /not/ imply that u32_min_value is also equal to 1, since umax_value could be much larger than 32 bit subregister can hold, and thus u32_min_value is in the interval [0,1] instead. Before fix, invalid tracking result of R2_w=inv1: [...] 5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0 5: (35) if r2 >= 0x1 goto pc+1 [...] // goto path 7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0 7: (b6) if w2 <= 0x1 goto pc+1 [...] // goto path 9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smin_value=-9223372036854775807,smax_value=9223372032559808513,umin_value=1,umax_value=18446744069414584321,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_max_value=1) R10=fp0 9: (bc) w2 = w2 10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv1 R10=fp0 [...] After fix, correct tracking result of R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)): [...] 5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0 5: (35) if r2 >= 0x1 goto pc+1 [...] // goto path 7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0 7: (b6) if w2 <= 0x1 goto pc+1 [...] // goto path 9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smax_value=9223372032559808513,umax_value=18446744069414584321,var_off=(0x0; 0xffffffff00000001),s32_min_value=0,s32_max_value=1,u32_max_value=1) R10=fp0 9: (bc) w2 = w2 10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)) R10=fp0 [...] Thus, same issue as in b0270958 holds for unsigned subregister tracking. Also, align __reg64_bound_u32() similarly to __reg64_bound_s32() as done in b0270958 to make them uniform again. Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: Manfred Paul (@_manfp) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
-
Florent Revest authored
bpf_trace_printk uses a shared static buffer to hold strings before they are printed. A recent refactoring moved the locking of that buffer after it gets filled by mistake. Fixes: d9c9e4db ("bpf: Factorize bpf_trace_printk and bpf_seq_printf") Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210427112958.773132-1-revest@chromium.org
-
Alexei Starovoitov authored
Andrii Nakryiko says: ==================== Lorenz Bauer noticed that core_reloc selftest has two inverted CHECK() conditions, allowing failing tests to pass unnoticed. Fixing that opened up few long-standing (field existence and direct memory bitfields) and one recent failures (BTF_KIND_FLOAT relos). This patch set fixes core_reloc selftest to capture such failures reliably in the future. It also fixes all the newly failing tests. See individual patches for details. This patch set also completes a set of ASSERT_xxx() macros, so now there should be a very little reason to use verbose and error-prone generic CHECK() macro. v1->v2: - updated bpf_core_fields_are_compat() comment to mention FLOAT (Lorenz). Cc: Lorenz Bauer <lmb@cloudflare.com> ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Andrii Nakryiko authored
Fix failed tests checks in core_reloc test runner, which allowed failing tests to pass quietly. Also add extra check to make sure that expected to fail test cases with invalid names are caught as test failure anyway, as this is not an expected failure mode. Also fix mislabeled probed vs direct bitfield test cases. Fixes: 124a892d ("selftests/bpf: Test TYPE_EXISTS and TYPE_SIZE CO-RE relocations") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-6-andrii@kernel.org
-
Andrii Nakryiko authored
Negative field existence cases for have a broken assumption that FIELD_EXISTS CO-RE relo will fail for fields that match the name but have incompatible type signature. That's not how CO-RE relocations generally behave. Types and fields that match by name but not by expected type are treated as non-matching candidates and are skipped. Error later is reported if no matching candidate was found. That's what happens for most relocations, but existence relocations (FIELD_EXISTS and TYPE_EXISTS) are more permissive and they are designed to return 0 or 1, depending if a match is found. This allows to handle name-conflicting but incompatible types in BPF code easily. Combined with ___flavor suffixes, it's possible to handle pretty much any structural type changes in kernel within the compiled once BPF source code. So, long story short, negative field existence test cases are invalid in their assumptions, so this patch reworks them into a single consolidated positive case that doesn't match any of the fields. Fixes: c7566a69 ("selftests/bpf: Add field existence CO-RE relocs tests") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-5-andrii@kernel.org
-
Andrii Nakryiko authored
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable bitfields. Missing breaks in a switch caused 8-byte reads always. This can confuse libbpf because it does strict checks that memory load size corresponds to the original size of the field, which in this case quite often would be wrong. After fixing that, we run into another problem, which quite subtle, so worth documenting here. The issue is in Clang optimization and CO-RE relocation interactions. Without that asm volatile construct (also known as barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and will apply BYTE_OFFSET 4 times for each switch case arm. This will result in the same error from libbpf about mismatch of memory load size and original field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *), *(u32 *), and *(u64 *) memory loads, three of which will fail. Using barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to calculate p, after which value of p is used without relocation in each of switch case arms, doing appropiately-sized memory load. Here's the list of relevant relocations and pieces of generated BPF code before and after this patch for test_core_reloc_bitfields_direct selftests. BEFORE ===== #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 157: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 159: 7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1 160: b7 02 00 00 04 00 00 00 r2 = 4 ; BYTE_SIZE relocation here ^^^ 161: 66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63> 162: 16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65> 163: 16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66> 164: 05 00 12 00 00 00 00 00 goto +18 <LBB0_69> 0000000000000528 <LBB0_66>: 165: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 167: 69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 168: 05 00 0e 00 00 00 00 00 goto +14 <LBB0_69> 0000000000000548 <LBB0_63>: 169: 16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67> 170: 16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68> 171: 05 00 0b 00 00 00 00 00 goto +11 <LBB0_69> 0000000000000560 <LBB0_68>: 172: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 174: 79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 175: 05 00 07 00 00 00 00 00 goto +7 <LBB0_69> 0000000000000580 <LBB0_65>: 176: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 178: 71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 179: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 00000000000005a0 <LBB0_67>: 180: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 182: 61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8) ; BYTE_OFFSET relo here w/ RIGHT size ^^^^^^^^^^^^^^^^ 00000000000005b8 <LBB0_69>: 183: 67 01 00 00 20 00 00 00 r1 <<= 32 184: b7 02 00 00 00 00 00 00 r2 = 0 185: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 186: c7 01 00 00 20 00 00 00 r1 s>>= 32 187: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000005e0 <LBB0_71>: 188: 77 01 00 00 20 00 00 00 r1 >>= 32 AFTER ===== #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 129: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 131: 7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1 132: b7 01 00 00 08 00 00 00 r1 = 8 ; BYTE_OFFSET relo here ^^^ ; no size check for non-memory dereferencing instructions 133: 0f 12 00 00 00 00 00 00 r2 += r1 134: b7 03 00 00 04 00 00 00 r3 = 4 ; BYTE_SIZE relocation here ^^^ 135: 66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63> 136: 16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65> 137: 16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66> 138: 05 00 0a 00 00 00 00 00 goto +10 <LBB0_69> 0000000000000458 <LBB0_66>: 139: 69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 140: 05 00 08 00 00 00 00 00 goto +8 <LBB0_69> 0000000000000468 <LBB0_63>: 141: 16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67> 142: 16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68> 143: 05 00 05 00 00 00 00 00 goto +5 <LBB0_69> 0000000000000480 <LBB0_68>: 144: 79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 145: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 0000000000000490 <LBB0_65>: 146: 71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 147: 05 00 01 00 00 00 00 00 goto +1 <LBB0_69> 00000000000004a0 <LBB0_67>: 148: 61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 00000000000004a8 <LBB0_69>: 149: 67 01 00 00 20 00 00 00 r1 <<= 32 150: b7 02 00 00 00 00 00 00 r2 = 0 151: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 152: c7 01 00 00 20 00 00 00 r1 s>>= 32 153: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000004d0 <LBB0_71>: 154: 77 01 00 00 20 00 00 00 r1 >>= 323 Fixes: ee26dade ("libbpf: Add support for relocatable bitfields") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org
-
Andrii Nakryiko authored
Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check. Without this, relocations against float/double fields will fail. Also adjust one error message to emit instruction index instead of less convenient instruction byte offset. Fixes: 22541a9e ("libbpf: Add BTF_KIND_FLOAT support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org
-
Andrii Nakryiko authored
Add ASSERT_TRUE/ASSERT_FALSE for conditions calculated with custom logic to true/false. Also add remaining arithmetical assertions: - ASSERT_LE -- less than or equal; - ASSERT_GT -- greater than; - ASSERT_GE -- greater than or equal. This should cover most scenarios where people fall back to error-prone CHECK()s. Also extend ASSERT_ERR() to print out errno, in addition to direct error. Also convert few CHECK() instances to ensure new ASSERT_xxx() variants work as expected. Subsequent patch will also use ASSERT_TRUE/ASSERT_FALSE more extensively. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-2-andrii@kernel.org
-
- 26 Apr, 2021 27 commits
-
-
Alexei Starovoitov authored
Jiri Olsa says: ==================== hi, while adding test for pinning the module while there's trampoline attach to it, I noticed that we don't allow link detach and following re-attach for trampolines. Adding that for tracing and lsm programs. You need to have patch [1] from bpf tree for test module attach test to pass. v5 changes: - fixed missing hlist_del_init change - fixed several ASSERT calls - added extra patch for missing ';' - added ASSERT macros to lsm test - added acks thanks, jirka [1] https://lore.kernel.org/bpf/20210326105900.151466-1-jolsa@kernel.org/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Replacing CHECK with ASSERT macros. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-8-jolsa@kernel.org
-
Jiri Olsa authored
Adding test to verify that once we attach module's trampoline, the module can't be unloaded. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-7-jolsa@kernel.org
-
Jiri Olsa authored
Adding the test to re-attach (detach/attach again) lsm programs, plus check that already linked program can't be attached again. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-6-jolsa@kernel.org
-
Jiri Olsa authored
Adding the test to re-attach (detach/attach again) tracing fexit programs, plus check that already linked program can't be attached again. Also switching to ASSERT* macros. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-5-jolsa@kernel.org
-
Jiri Olsa authored
Adding the test to re-attach (detach/attach again) tracing fentry programs, plus check that already linked program can't be attached again. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-4-jolsa@kernel.org
-
Jiri Olsa authored
Currently we don't allow re-attaching of trampolines. Once it's detached, it can't be re-attach even when the program is still loaded. Adding the possibility to re-attach the loaded tracing and lsm programs. Fixing missing unlock with proper cleanup goto jump reported by Julia. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20210414195147.1624932-2-jolsa@kernel.org
-
David S. Miller authored
Michael Chan says: ==================== bnxt_en: Updates for net-next. This series includes these main enhancements: 1. Link related changes - add NRZ/PAM4 link signal mode to the link up message if known - rely on firmware to bring down the link during ifdown 2. SRIOV related changes - allow VF promiscuous mode if the VF is trusted - allow ndo operations to configure VF when the PF is ifdown - fix the scenario of the VF taking back control of it's MAC address - add Hyper-V VF device IDs 3. Support the option to transmit without FCS/CRC. 4. Implement .ndo_features_check() to disable offload when the UDP encap. packets are not supported. v2: Patch10: Reverse the check for supported UDP ports to be more straight forward. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
For UDP encapsultions, we only support the offloaded Vxlan port and Geneve port. All other ports included FOU and GUE are not supported so we need to turn off TSO and checksum features. v2: Reverse the check for supported UDP ports to be more straight forward. Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
If firmware is capable, set the IFF_SUPP_NOFCS flag to support the sockets option to transmit packets without FCS. This is mainly used for testing. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Support VF device IDs used by the Hyper-V hypervisor. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
When the PF is no longer enforcing an assigned MAC address on a VF, the VF needs to call bnxt_approve_mac() to tell the PF what MAC address it is now using. Otherwise it gets out of sync and the PF won't know what MAC address the VF wants to use. Ultimately the VF will fail when it tries to setup the L2 MAC filter for the vnic. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Move it before bnxt_update_vf_mac(). In the next patch, we need to call bnxt_approve_mac() from bnxt_update_mac() under some conditions. This will avoid forward declaration. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edwin Peer authored
It is perfectly legal for the stack to query and configure VFs via PF NDOs while the NIC is administratively down. Remove the unnecessary check for the PF to be in open state. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edwin Peer authored
Firmware previously only allowed promiscuous mode for VFs associated with a default VLAN. It is now possible to enable promiscuous mode for a VF having no VLAN configured provided that it is trusted. In such cases the VF will see all packets received by the PF, irrespective of destination MAC or VLAN. Note, it is necessary to query firmware at the time of bnxt_promisc_ok() instead of in bnxt_hwrm_func_qcfg() because the trusted status might be altered by the PF after the VF has been configured. This check must now also be deferred because the firmware call sleeps. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
In the current code, the driver will not shutdown the link during IFDOWN if there are still VFs sharing the port. Newer firmware will manage the link down decision when the port is shared by VFs, so we can just call firmware to shutdown the port unconditionally and let firmware make the final decision. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Copy the phy related feature flags from the firmware call HWRM_PORT_PHY_QCAPS to this new field. We can also remove the flags field in the bnxt_test_info structure. It's cleaner to have all PHY related flags in one location, directly copied from the firmware. To keep the BNXT_PHY_CFG_ABLE() macro logic the same, we need to make a slight adjustment to check that it is a PF. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edwin Peer authored
Firmware reports link signalling mode for certain speeds. In these cases, print the signalling modes in kernel log link up messages. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jethro Beekman authored
The default behavior for source MACVLAN is to duplicate packets to appropriate type source devices, and then do the normal destination MACVLAN flow. This patch adds an option to skip destination MACVLAN processing if any matching source MACVLAN device has the option set. This allows setting up a "catch all" device for source MACVLAN: create one or more devices with type source nodst, and one device with e.g. type vepa, and incoming traffic will be received on exactly one device. v2: netdev wants non-standard line length Signed-off-by: Jethro Beekman <kernel@jbeekman.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5-updates-2021-04-21 devlink external port attribute for SF (Sub-Function) port flavour This adds the support to instantiate Sub-Functions on external hosts E.g when Eswitch manager is enabled on the ARM SmarNic SoC CPU, users are now able to spawn new Sub-Functions on the Host server CPU. Parav Pandit Says: ================== This series introduces and uses external attribute for the SF port to indicate that a SF port belongs to an external controller. This is needed to generate unique phys_port_name when PF and SF numbers are overlapping between local and external controllers. For example two controllers 0 and 1, both of these controller have a SF. having PF number 0, SF number 77. Here, phys_port_name has duplicate entry which doesn't have controller number in it. Hence, add controller number optionally when a SF port is for an external controller. This extension is similar to existing PF and VF eswitch ports of the external controller. When a SF is for external controller an example view of external SF port and config sequence: On eswitch system: $ devlink dev eswitch set pci/0033:01:00.0 mode switchdev $ devlink port show pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false function: hw_addr 00:00:00:00:00:00 $ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1 pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached phys_port_name construction: $ cat /sys/class/net/eth1/phys_port_name c1pf0sf77 Patch summary: First 3 patches prepares the eswitch to handle vports in more generic way using xarray to lookup vport from its unique vport number. Patch-1 returns maximum eswitch ports only when eswitch is enabled Patch-2 prepares eswitch to return eswitch max ports from a struct Patch-3 uses xarray for vport and representor lookup Patch-4 considers SF for an additioanl range of SF vports Patch-5 relies on SF hw table to check SF support Patch-6 extends SF devlink port attribute for external flag Patch-7 stores the per controller SF allocation attributes Patch-8 uses SF function id for filtering events Patch-9 uses helper for allocation and free Patch-10 splits hw table into per controller table and generic one Patch-11 extends sf table for additional range ================== ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Walleij authored
This adds device tree probing to the IXP4xx ethernet driver. Add a platform data bool to tell us whether to register an MDIO bus for the device or not, as well as the corresponding NPE. We need to drop the memory region request as part of this since the OF core will request the memory for the device. Cc: Zoltan HERPAI <wigyori@uid0.hu> Cc: Raylynn Knight <rayknight@me.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Walleij authored
This driver was using a really dated way of obtaining the phy by printing a string and using it with phy_connect(). Switch to using more reasonable modern interfaces. Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Walleij authored
This adds device tree bindings for the IXP4xx ethernet controller with optional MDIO bridge. Cc: Zoltan HERPAI <wigyori@uid0.hu> Cc: Raylynn Knight <rayknight@me.com> Cc: devicetree@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hayes Wang authored
Remove DELL_TB_RX_AGG_BUG and LENOVO_MACPASSTHRU flags of rtl8152_flags. They are only set when initializing and wouldn't be change. It is enough to record them with variables. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dexuan Cui authored
Currently the netvsc/VF binding logic only checks the PCI serial number. The Microsoft Azure Network Adapter (MANA) supports multiple net_device interfaces (each such interface is called a "vPort", and has its unique MAC address) which are backed by the same VF PCI device, so the binding logic should check both the MAC address and the PCI serial number. The change should not break any other existing VF drivers, because Hyper-V NIC SR-IOV implementation requires the netvsc network interface and the VF network interface have the same MAC address. Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Co-developed-by: Shachar Raindel <shacharr@microsoft.com> Signed-off-by: Shachar Raindel <shacharr@microsoft.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiapeng Chong authored
Variable result is being assigned a value from a calculation however the variable is never read, so this redundant variable can be removed. Cleans up the following clang-analyzer warning: drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1488:2: warning: Value stored to 'pos' is never read [clang-analyzer-deadcode.DeadStores]. drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:876:3: warning: Value stored to 'pos' is never read [clang-analyzer-deadcode.DeadStores]. drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:36:3: warning: Value stored to 'start' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-04-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 69 non-merge commits during the last 22 day(s) which contain a total of 69 files changed, 3141 insertions(+), 866 deletions(-). The main changes are: 1) Add BPF static linker support for extern resolution of global, from Andrii. 2) Refine retval for bpf_get_task_stack helper, from Dave. 3) Add a bpf_snprintf helper, from Florent. 4) A bunch of miscellaneous improvements from many developers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-