Commit 49df0019 authored by Alexei Starovoitov

Merge branch 'enable-bpf-programs-to-declare-arrays-of-kptr-bpf_rb_root-and-bpf_list_head'

Kui-Feng Lee says:

====================
Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head.

Some types, such as kptr, bpf_rb_root, and bpf_list_head, are treated
in a special way. Previously, such a type could not be the type of a
field in a struct that is used as the type of a global variable, nor
the type of a field in a struct that is used as a field in the value
type of a map. It could not even be the type of array elements. These
types could therefore only be used as the type of a global variable or
of a direct field in the value type of a map.

This patch set enables the use of these special types in arrays and
struct fields. It recursively examines the types of global variables
and the value types of maps, descending into arrays and struct types,
to identify the special types and generate field information for them.

For example,

  ...
  struct task_struct __kptr *ptr[3];
  ...

This creates three instances of "struct btf_field" in the "btf_record"
of the data section.

 [...,
  btf_field(offset=0x100, type=BPF_KPTR_REF),
  btf_field(offset=0x108, type=BPF_KPTR_REF),
  btf_field(offset=0x110, type=BPF_KPTR_REF),
  ...
 ]

It creates one record for each of the three elements. The three records
are identical except for their offsets.

Another example is

  ...
  struct A {
    ...
    struct task_struct __kptr *task;
    struct bpf_rb_root root;
    ...
  };

  struct A foo[2];

This creates four records.

 [...,
  btf_field(offset=0x7100, type=BPF_KPTR_REF),
  btf_field(offset=0x7108, type=BPF_RB_ROOT),
  btf_field(offset=0x7200, type=BPF_KPTR_REF),
  btf_field(offset=0x7208, type=BPF_RB_ROOT),
  ...
 ]

Assuming that the size of an element/struct A is 0x100 and "foo"
starts at 0x7000, it includes two kptr records at 0x7100 and 0x7200,
and two rbtree root records at 0x7108 and 0x7208.

All of this field information is flattened for struct types and
repeated for arrays.
---
Changes from v6:

 - Return BPF_KPTR_REF from btf_get_field_type() only if var_type is a
   struct type.

 - Pass btf and type to btf_get_field_type().

Changes from v5:

 - Ensure field->offset values of kptrs are advanced correctly from
   one nested struct or array to another.

Changes from v4:

 - Return -E2BIG for i == MAX_RESOLVE_DEPTH.

Changes from v3:

 - Refactor the common code of btf_find_struct_field() and
   btf_find_datasec_var().

 - Limit the number of nesting levels examined in struct types.

Changes from v2:

 - Support fields in nested struct type.

 - Remove nelems and duplicate field information with offset
   adjustments for arrays.

Changes from v1:

 - Move the check of element alignment out of btf_field_cmp() to
   btf_record_find().

 - Move the previous patch 4 "bpf: check_map_kptr_access() compute
   the offset from the reg state" to be patch 7.

 - Reject BPF_RB_NODE and BPF_LIST_NODE with nelems > 1.

 - Rephrase the commit log of the patch "bpf: check_map_access() with
   the knowledge of arrays" to clarify the alignment on elements.

v6: https://lore.kernel.org/all/20240520204018.884515-1-thinker.li@gmail.com/
v5: https://lore.kernel.org/all/20240510011312.1488046-1-thinker.li@gmail.com/
v4: https://lore.kernel.org/all/20240508063218.2806447-1-thinker.li@gmail.com/
v3: https://lore.kernel.org/all/20240501204729.484085-1-thinker.li@gmail.com/
v2: https://lore.kernel.org/all/20240412210814.603377-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/bpf/20240410004150.2917641-1-thinker.li@gmail.com/

Kui-Feng Lee (9):
  bpf: Remove unnecessary checks on the offset of btf_field.
  bpf: Remove unnecessary call to btf_field_type_size().
  bpf: refactor btf_find_struct_field() and btf_find_datasec_var().
  bpf: create repeated fields for arrays.
  bpf: look into the types of the fields of a struct type recursively.
  bpf: limit the number of levels of a nested struct type.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested
    struct types.
  selftests/bpf: Test global bpf_list_head arrays.

 kernel/bpf/btf.c                              | 310 ++++++++++++------
 kernel/bpf/verifier.c                         |   4 +-
 .../selftests/bpf/prog_tests/cpumask.c        |   5 +
 .../selftests/bpf/prog_tests/linked_list.c    |  12 +
 .../testing/selftests/bpf/prog_tests/rbtree.c |  47 +++
 .../selftests/bpf/progs/cpumask_success.c     | 171 ++++++++++
 .../testing/selftests/bpf/progs/linked_list.c |  42 +++
 tools/testing/selftests/bpf/progs/rbtree.c    |  77 +++++
 8 files changed, 558 insertions(+), 110 deletions(-)
====================

Link: https://lore.kernel.org/r/20240523174202.461236-1-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
parents 49784c79 43d50ffb
@@ -5448,7 +5448,7 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
 		 * this program. To check that [x1, x2) overlaps with [y1, y2),
 		 * it is sufficient to check x1 < y2 && y1 < x2.
 		 */
-		if (reg->smin_value + off < p + btf_field_type_size(field->type) &&
+		if (reg->smin_value + off < p + field->size &&
 		    p < reg->umax_value + off + size) {
 			switch (field->type) {
 			case BPF_KPTR_UNREF:
@@ -11640,7 +11640,7 @@ __process_kf_arg_ptr_to_graph_node(struct bpf_verifier_env *env,
 
 	node_off = reg->off + reg->var_off.value;
 	field = reg_find_field_offset(reg, node_off, node_field_type);
-	if (!field || field->offset != node_off) {
+	if (!field) {
 		verbose(env, "%s not found at offset=%u\n", node_type_name, node_off);
 		return -EINVAL;
 	}
......
@@ -18,6 +18,11 @@ static const char * const cpumask_success_testcases[] = {
 	"test_insert_leave",
 	"test_insert_remove_release",
 	"test_global_mask_rcu",
+	"test_global_mask_array_one_rcu",
+	"test_global_mask_array_rcu",
+	"test_global_mask_array_l2_rcu",
+	"test_global_mask_nested_rcu",
+	"test_global_mask_nested_deep_rcu",
 	"test_cpumask_weight",
 };
 
......
@@ -183,6 +183,18 @@ static void test_linked_list_success(int mode, bool leave_in_map)
 	if (!leave_in_map)
 		clear_fields(skel->maps.bss_A);
 
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.global_list_push_pop_nested), &opts);
+	ASSERT_OK(ret, "global_list_push_pop_nested");
+	ASSERT_OK(opts.retval, "global_list_push_pop_nested retval");
+	if (!leave_in_map)
+		clear_fields(skel->maps.bss_A);
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.global_list_array_push_pop), &opts);
+	ASSERT_OK(ret, "global_list_array_push_pop");
+	ASSERT_OK(opts.retval, "global_list_array_push_pop retval");
+	if (!leave_in_map)
+		clear_fields(skel->maps.bss_A);
+
 	if (mode == PUSH_POP)
 		goto end;
 
......
@@ -31,6 +31,28 @@ static void test_rbtree_add_nodes(void)
 	rbtree__destroy(skel);
 }
 
+static void test_rbtree_add_nodes_nested(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 1,
+	);
+	struct rbtree *skel;
+	int ret;
+
+	skel = rbtree__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load"))
+		return;
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_add_nodes_nested), &opts);
+	ASSERT_OK(ret, "rbtree_add_nodes_nested run");
+	ASSERT_OK(opts.retval, "rbtree_add_nodes_nested retval");
+	ASSERT_EQ(skel->data->less_callback_ran, 1, "rbtree_add_nodes_nested less_callback_ran");
+
+	rbtree__destroy(skel);
+}
+
 static void test_rbtree_add_and_remove(void)
 {
 	LIBBPF_OPTS(bpf_test_run_opts, opts,
@@ -53,6 +75,27 @@ static void test_rbtree_add_and_remove(void)
 	rbtree__destroy(skel);
 }
 
+static void test_rbtree_add_and_remove_array(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 1,
+	);
+	struct rbtree *skel;
+	int ret;
+
+	skel = rbtree__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load"))
+		return;
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_add_and_remove_array), &opts);
+	ASSERT_OK(ret, "rbtree_add_and_remove_array");
+	ASSERT_OK(opts.retval, "rbtree_add_and_remove_array retval");
+
+	rbtree__destroy(skel);
+}
+
 static void test_rbtree_first_and_remove(void)
 {
 	LIBBPF_OPTS(bpf_test_run_opts, opts,
@@ -104,8 +147,12 @@ void test_rbtree_success(void)
 {
 	if (test__start_subtest("rbtree_add_nodes"))
 		test_rbtree_add_nodes();
+	if (test__start_subtest("rbtree_add_nodes_nested"))
+		test_rbtree_add_nodes_nested();
 	if (test__start_subtest("rbtree_add_and_remove"))
 		test_rbtree_add_and_remove();
+	if (test__start_subtest("rbtree_add_and_remove_array"))
+		test_rbtree_add_and_remove_array();
 	if (test__start_subtest("rbtree_first_and_remove"))
 		test_rbtree_first_and_remove();
 	if (test__start_subtest("rbtree_api_release_aliasing"))
......
@@ -12,6 +12,31 @@ char _license[] SEC("license") = "GPL";
 
 int pid, nr_cpus;
 
+struct kptr_nested {
+	struct bpf_cpumask __kptr * mask;
+};
+
+struct kptr_nested_pair {
+	struct bpf_cpumask __kptr * mask_1;
+	struct bpf_cpumask __kptr * mask_2;
+};
+
+struct kptr_nested_mid {
+	int dummy;
+	struct kptr_nested m;
+};
+
+struct kptr_nested_deep {
+	struct kptr_nested_mid ptrs[2];
+	struct kptr_nested_pair ptr_pairs[3];
+};
+
+private(MASK) static struct bpf_cpumask __kptr * global_mask_array[2];
+private(MASK) static struct bpf_cpumask __kptr * global_mask_array_l2[2][1];
+private(MASK) static struct bpf_cpumask __kptr * global_mask_array_one[1];
+private(MASK) static struct kptr_nested global_mask_nested[2];
+private(MASK_DEEP) static struct kptr_nested_deep global_mask_nested_deep;
+
 static bool is_test_task(void)
 {
 	int cur_pid = bpf_get_current_pid_tgid() >> 32;
@@ -460,6 +485,152 @@ int BPF_PROG(test_global_mask_rcu, struct task_struct *task, u64 clone_flags)
 	return 0;
 }
 
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_global_mask_array_one_rcu, struct task_struct *task, u64 clone_flags)
+{
+	struct bpf_cpumask *local, *prev;
+
+	if (!is_test_task())
+		return 0;
+
+	/* Kptr arrays with one element are special cased, being treated
+	 * just like a single pointer.
+	 */
+
+	local = create_cpumask();
+	if (!local)
+		return 0;
+
+	prev = bpf_kptr_xchg(&global_mask_array_one[0], local);
+	if (prev) {
+		bpf_cpumask_release(prev);
+		err = 3;
+		return 0;
+	}
+
+	bpf_rcu_read_lock();
+	local = global_mask_array_one[0];
+	if (!local) {
+		err = 4;
+		bpf_rcu_read_unlock();
+		return 0;
+	}
+
+	bpf_rcu_read_unlock();
+
+	return 0;
+}
+
+static int _global_mask_array_rcu(struct bpf_cpumask **mask0,
+				  struct bpf_cpumask **mask1)
+{
+	struct bpf_cpumask *local;
+
+	if (!is_test_task())
+		return 0;
+
+	/* Check if two kptrs in the array work and independently */
+
+	local = create_cpumask();
+	if (!local)
+		return 0;
+
+	bpf_rcu_read_lock();
+
+	local = bpf_kptr_xchg(mask0, local);
+	if (local) {
+		err = 1;
+		goto err_exit;
+	}
+
+	/* [<mask 0>, NULL] */
+	if (!*mask0 || *mask1) {
+		err = 2;
+		goto err_exit;
+	}
+
+	local = create_cpumask();
+	if (!local) {
+		err = 9;
+		goto err_exit;
+	}
+
+	local = bpf_kptr_xchg(mask1, local);
+	if (local) {
+		err = 10;
+		goto err_exit;
+	}
+
+	/* [<mask 0>, <mask 1>] */
+	if (!*mask0 || !*mask1 || *mask0 == *mask1) {
+		err = 11;
+		goto err_exit;
+	}
+
+err_exit:
+	if (local)
+		bpf_cpumask_release(local);
+	bpf_rcu_read_unlock();
+	return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_global_mask_array_rcu, struct task_struct *task, u64 clone_flags)
+{
+	return _global_mask_array_rcu(&global_mask_array[0], &global_mask_array[1]);
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_global_mask_array_l2_rcu, struct task_struct *task, u64 clone_flags)
+{
+	return _global_mask_array_rcu(&global_mask_array_l2[0][0], &global_mask_array_l2[1][0]);
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_global_mask_nested_rcu, struct task_struct *task, u64 clone_flags)
+{
+	return _global_mask_array_rcu(&global_mask_nested[0].mask, &global_mask_nested[1].mask);
+}
+
+/* Ensure that the field->offset has been correctly advanced from one
+ * nested struct or array sub-tree to another. In the case of
+ * kptr_nested_deep, it comprises two sub-trees: ktpr_1 and kptr_2. By
+ * calling bpf_kptr_xchg() on every single kptr in both nested sub-trees,
+ * the verifier should reject the program if the field->offset of any kptr
+ * is incorrect.
+ *
+ * For instance, if we have 10 kptrs in a nested struct and a program that
+ * accesses each kptr individually with bpf_kptr_xchg(), the compiler
+ * should emit instructions to access 10 different offsets if it works
+ * correctly. If the field->offset values of any pair of them are
+ * incorrectly the same, the number of unique offsets in btf_record for
+ * this nested struct should be less than 10. The verifier should fail to
+ * discover some of the offsets emitted by the compiler.
+ *
+ * Even if the field->offset values of kptrs are not duplicated, the
+ * verifier should fail to find a btf_field for the instruction accessing a
+ * kptr if the corresponding field->offset is pointing to a random
+ * incorrect offset.
+ */
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_global_mask_nested_deep_rcu, struct task_struct *task, u64 clone_flags)
+{
+	int r, i;
+
+	r = _global_mask_array_rcu(&global_mask_nested_deep.ptrs[0].m.mask,
+				   &global_mask_nested_deep.ptrs[1].m.mask);
+	if (r)
+		return r;
+
+	for (i = 0; i < 3; i++) {
+		r = _global_mask_array_rcu(&global_mask_nested_deep.ptr_pairs[i].mask_1,
+					   &global_mask_nested_deep.ptr_pairs[i].mask_2);
+		if (r)
+			return r;
+	}
+	return 0;
+}
+
 SEC("tp_btf/task_newtask")
 int BPF_PROG(test_cpumask_weight, struct task_struct *task, u64 clone_flags)
 {
......
@@ -11,6 +11,22 @@
 
 #include "linked_list.h"
 
+struct head_nested_inner {
+	struct bpf_spin_lock lock;
+	struct bpf_list_head head __contains(foo, node2);
+};
+
+struct head_nested {
+	int dummy;
+	struct head_nested_inner inner;
+};
+
+private(C) struct bpf_spin_lock glock_c;
+private(C) struct bpf_list_head ghead_array[2] __contains(foo, node2);
+private(C) struct bpf_list_head ghead_array_one[1] __contains(foo, node2);
+
+private(D) struct head_nested ghead_nested;
+
 static __always_inline
 int list_push_pop(struct bpf_spin_lock *lock, struct bpf_list_head *head, bool leave_in_map)
 {
@@ -309,6 +325,32 @@ int global_list_push_pop(void *ctx)
 	return test_list_push_pop(&glock, &ghead);
 }
 
+SEC("tc")
+int global_list_push_pop_nested(void *ctx)
+{
+	return test_list_push_pop(&ghead_nested.inner.lock, &ghead_nested.inner.head);
+}
+
+SEC("tc")
+int global_list_array_push_pop(void *ctx)
+{
+	int r;
+
+	r = test_list_push_pop(&glock_c, &ghead_array[0]);
+	if (r)
+		return r;
+
+	r = test_list_push_pop(&glock_c, &ghead_array[1]);
+	if (r)
+		return r;
+
+	/* Arrays with only one element is a special case, being treated
+	 * just like a bpf_list_head variable by the verifier, not an
+	 * array.
+	 */
+	return test_list_push_pop(&glock_c, &ghead_array_one[0]);
+}
+
 SEC("tc")
 int map_list_push_pop_multiple(void *ctx)
 {
......
@@ -13,6 +13,15 @@ struct node_data {
 	struct bpf_rb_node node;
 };
 
+struct root_nested_inner {
+	struct bpf_spin_lock glock;
+	struct bpf_rb_root root __contains(node_data, node);
+};
+
+struct root_nested {
+	struct root_nested_inner inner;
+};
+
 long less_callback_ran = -1;
 long removed_key = -1;
 long first_data[2] = {-1, -1};
@@ -20,6 +29,9 @@ long first_data[2] = {-1, -1};
 #define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
 private(A) struct bpf_spin_lock glock;
 private(A) struct bpf_rb_root groot __contains(node_data, node);
+private(A) struct bpf_rb_root groot_array[2] __contains(node_data, node);
+private(A) struct bpf_rb_root groot_array_one[1] __contains(node_data, node);
+private(B) struct root_nested groot_nested;
 
 static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
 {
@@ -71,6 +83,12 @@ long rbtree_add_nodes(void *ctx)
 	return __add_three(&groot, &glock);
 }
 
+SEC("tc")
+long rbtree_add_nodes_nested(void *ctx)
+{
+	return __add_three(&groot_nested.inner.root, &groot_nested.inner.glock);
+}
+
 SEC("tc")
 long rbtree_add_and_remove(void *ctx)
 {
@@ -109,6 +127,65 @@ long rbtree_add_and_remove(void *ctx)
 	return 1;
 }
 
+SEC("tc")
+long rbtree_add_and_remove_array(void *ctx)
+{
+	struct bpf_rb_node *res1 = NULL, *res2 = NULL, *res3 = NULL;
+	struct node_data *nodes[3][2] = {{NULL, NULL}, {NULL, NULL}, {NULL, NULL}};
+	struct node_data *n;
+	long k1 = -1, k2 = -1, k3 = -1;
+	int i, j;
+
+	for (i = 0; i < 3; i++) {
+		for (j = 0; j < 2; j++) {
+			nodes[i][j] = bpf_obj_new(typeof(*nodes[i][j]));
+			if (!nodes[i][j])
+				goto err_out;
+			nodes[i][j]->key = i * 2 + j;
+		}
+	}
+
+	bpf_spin_lock(&glock);
+	for (i = 0; i < 2; i++)
+		for (j = 0; j < 2; j++)
+			bpf_rbtree_add(&groot_array[i], &nodes[i][j]->node, less);
+	for (j = 0; j < 2; j++)
+		bpf_rbtree_add(&groot_array_one[0], &nodes[2][j]->node, less);
+	res1 = bpf_rbtree_remove(&groot_array[0], &nodes[0][0]->node);
+	res2 = bpf_rbtree_remove(&groot_array[1], &nodes[1][0]->node);
+	res3 = bpf_rbtree_remove(&groot_array_one[0], &nodes[2][0]->node);
+	bpf_spin_unlock(&glock);
+
+	if (res1) {
+		n = container_of(res1, struct node_data, node);
+		k1 = n->key;
+		bpf_obj_drop(n);
+	}
+	if (res2) {
+		n = container_of(res2, struct node_data, node);
+		k2 = n->key;
+		bpf_obj_drop(n);
+	}
+	if (res3) {
+		n = container_of(res3, struct node_data, node);
+		k3 = n->key;
+		bpf_obj_drop(n);
+	}
+	if (k1 != 0 || k2 != 2 || k3 != 4)
+		return 2;
+
+	return 0;
+
+err_out:
+	for (i = 0; i < 3; i++) {
+		for (j = 0; j < 2; j++) {
+			if (nodes[i][j])
+				bpf_obj_drop(nodes[i][j]);
+		}
+	}
+	return 1;
+}
+
 SEC("tc")
 long rbtree_first_and_remove(void *ctx)
 {
......