Merge branch 'allow-bpf_refcount_acquire-of-mapval-obtained-via-direct-ld'
Dave Marchevsky says: ==================== Allow bpf_refcount_acquire of mapval obtained via direct LD Consider this BPF program: struct cgv_node { int d; struct bpf_refcount r; struct bpf_rb_node rb; }; struct val_stash { struct cgv_node __kptr *v; }; struct { __uint(type, BPF_MAP_TYPE_ARRAY); __type(key, int); __type(value, struct val_stash); __uint(max_entries, 10); } array_map SEC(".maps"); long bpf_program(void *ctx) { struct val_stash *mapval; struct cgv_node *p; int idx = 0; mapval = bpf_map_lookup_elem(&array_map, &idx); if (!mapval || !mapval->v) { /* omitted */ } p = bpf_refcount_acquire(mapval->v); /* Verification FAILs here */ /* Add p to some tree */ return 0; } Verification fails on the refcount_acquire: 160: (79) r1 = *(u64 *)(r8 +8) ; R1_w=untrusted_ptr_or_null_cgv_node(id=11,off=0,imm=0) R8_w=map_value(id=10,off=0,ks=8,vs=16,imm=0) refs=6 161: (b7) r2 = 0 ; R2_w=0 refs=6 162: (85) call bpf_refcount_acquire_impl#117824 arg#0 is neither owning or non-owning ref The above verifier dump is actually from sched_ext's scx_flatcg [0], which is the motivating usecase for this series' changes. Specifically, scx_flatcg stashes a rb_node type w/ cgroup-specific info (struct cgv_node) in a map when the cgroup is created, then later puts that cgroup's node in a rbtree in .enqueue . Making struct cgv_node refcounted would simplify the code a bit by virtue of allowing us to remove the kptr_xchg's, but "later puts that cgroups node in a rbtree" is not possible without a refcount_acquire, which suffers from the above verification failure. If we get rid of PTR_UNTRUSTED flag, and add MEM_ALLOC | NON_OWN_REF, mapval->v would be a non-owning ref and verification would succeed. Due to the most recent set of refcount changes [1], which modified bpf_obj_drop behavior to not reuse refcounted graph node's underlying memory until after RCU grace period, this is safe to do. Once mapval->v has the correct flags it _is_ a non-owning reference and verification of the motivating example will succeed. [0]: https://github.com/sched-ext/sched_ext/blob/52911e1040a0f94b9c426dddcc00be5364a7ae9f/tools/sched_ext/scx_flatcg.bpf.c#L275 [1]: https://lore.kernel.org/bpf/20230821193311.3290257-1-davemarchevsky@fb.com/ Summary of patches: * Patch 1 fixes an issue with bpf_refcount_acquire verification letting MAYBE_NULL ptrs through * Patch 2 tests Patch 1's fix * Patch 3 broadens the use of "free only after RCU GP" to all user-allocated types * Patch 4 is a small nonfunctional refactoring * Patch 5 changes verifier to mark direct LD of stashed graph node kptr as non-owning ref * Patch 6 tests Patch 5's verifier changes Changelog: v1 -> v2: https://lore.kernel.org/bpf/20231025214007.2920506-1-davemarchevsky@fb.com/ Series title changed to "Allow bpf_refcount_acquire of mapval obtained via direct LD". V1's title was mistakenly truncated. * Patch 5 ("bpf: Mark direct ld of stashed bpf_{rb,list}_node as non-owning ref") * Direct LD of percpu kptr should not have MEM_ALLOC flag (Yonghong) * Patch 6 ("selftests/bpf: Test bpf_refcount_acquire of node obtained via direct ld") * Test read from stashed value as well ==================== Link: https://lore.kernel.org/r/20231107085639.3016113-1-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
Showing
Please register or sign in to comment