Commit c272e259 authored by Alexei Starovoitov

Merge branch 'bpf: refine kernel.unprivileged_bpf_disabled behaviour'

Alan Maguire says:

====================

Unprivileged BPF disabled (kernel.unprivileged_bpf_disabled >= 1)
is the default in most cases now; when set, the BPF system call is
blocked for users without CAP_BPF/CAP_SYS_ADMIN.  In some cases,
however, it makes sense to split activities between capability-requiring
ones - such as program load/attach - and those that might not require
capabilities, such as reading perf/ringbuf events or reading and
updating BPF map configuration.  One example of this sort of approach is
a service that loads a BPF program paired with a user-space program that
interacts with it.
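
For example, the capability-requiring half of such a split might be a
small loader that creates the objects and pins them to bpffs for an
unprivileged consumer.  A minimal libbpf sketch of that loader follows;
the object file name, map name and pin paths are illustrative, not taken
from this series:

/* Privileged loader sketch: runs with CAP_BPF (plus CAP_PERFMON/
 * CAP_NET_ADMIN as the program type requires), creates the BPF objects
 * and pins them so an unprivileged process can reach them via bpffs.
 * All names below are hypothetical.
 */
#include <bpf/libbpf.h>

int main(void)
{
        struct bpf_object *obj;
        struct bpf_map *map;
        int err;

        obj = bpf_object__open_file("myprog.bpf.o", NULL);
        if (!obj)                       /* libbpf 1.0: NULL + errno on error */
                return 1;

        err = bpf_object__load(obj);    /* BPF_PROG_LOAD/BPF_MAP_CREATE happen here */
        if (err)
                goto out;

        map = bpf_object__find_map_by_name(obj, "shared_map");
        if (!map || bpf_map__pin(map, "/sys/fs/bpf/shared_map")) {
                err = -1;
                goto out;
        }

        err = bpf_object__pin_programs(obj, "/sys/fs/bpf/myprog");
out:
        bpf_object__close(obj);
        return err ? 1 : 0;
}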

Here - rather than blocking all BPF syscall commands - unprivileged
BPF disabled blocks only the key object-creating commands (prog load,
map create).  Discussion has alluded to this idea in the past [1],
and Alexei mentioned it was also discussed at LSF/MM/BPF this year.
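
Concretely, once the privileged side has created and pinned the objects,
an unprivileged process can still issue fd-based commands such as
BPF_OBJ_GET on a bpffs pin, BPF_MAP_LOOKUP_ELEM and BPF_MAP_UPDATE_ELEM.
A hedged sketch of that consumer side (the pin path matches the
hypothetical loader sketch above; bpffs file permissions still have to
allow the open):

/* Unprivileged consumer sketch: no CAP_BPF needed for these commands
 * once the map exists and is reachable through bpffs.  The pin path is
 * hypothetical.
 */
#include <stdio.h>
#include <bpf/bpf.h>

int main(void)
{
        __u32 key = 0, val;
        int map_fd;

        map_fd = bpf_obj_get("/sys/fs/bpf/shared_map");        /* BPF_OBJ_GET */
        if (map_fd < 0) {
                perror("bpf_obj_get");
                return 1;
        }

        if (bpf_map_lookup_elem(map_fd, &key, &val)) {          /* BPF_MAP_LOOKUP_ELEM */
                perror("lookup");
                return 1;
        }
        printf("value: %u\n", val);

        val++;
        return bpf_map_update_elem(map_fd, &key, &val, 0) ? 1 : 0; /* BPF_MAP_UPDATE_ELEM */
}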

Changes since v3 [2]:
- added acks to patch 1
- CI was failing on Ubuntu; I suspect the issue was an old capability.h
  file that defined CAP_LAST_CAP as < CAP_BPF, so the logic that disables
  all caps did not actually disable CAP_BPF.  Use CAP_BPF as the basis
  for the "all caps" bitmap instead, as we explicitly define it in
  cap_helpers.h if not already found in capability.h (see the sketch
  after this list)
- made global variables arguments to subtests instead (Andrii, patch 2)
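
For reference, a sketch of that CAP_BPF-anchored "all caps" bitmap and
of dropping/restoring effective capabilities around a subtest.  The
ALL_CAPS name and the cap_disable_effective()/cap_enable_effective()
helpers are assumptions modelled on the selftests' cap_helpers.h, not
quoted from this series:

/* Assumed helper interface from the selftests' cap_helpers.h, which
 * also defines CAP_BPF itself when an old capability.h lacks it, so the
 * bitmap below does not depend on the distro's CAP_LAST_CAP value.
 */
#include <linux/capability.h>
#include "cap_helpers.h"

/* every capability bit from 0 up to and including CAP_BPF */
#define ALL_CAPS ((2ULL << CAP_BPF) - 1)

static int run_subtest_unprivileged(void)
{
        __u64 saved_caps = 0;
        int err;

        /* drop all effective capabilities for the duration of the subtest */
        err = cap_disable_effective(ALL_CAPS, &saved_caps);
        if (err)
                return err;

        /* ... exercise BPF syscall commands with no effective caps ... */

        /* restore whatever was effective before the subtest */
        return cap_enable_effective(saved_caps, NULL);
}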

Changes since v2 [3]:

- added acks from Yonghong
- fixed clang compilation issue in selftest with bpf_prog_query()
  (Alexei, patch 2)
- disable all capabilities for test (Yonghong, patch 2)
- add assertions that size of perf/ringbuf data matches expectations
  (Yonghong, patch 2)
- add map array size definition, remove unneeded whitespace (Yonghong, patch 2)

Changes since RFC [4]:

- widened scope of commands unprivileged BPF disabled allows
  (Alexei, patch 1)
- removed restrictions on map types for lookup, update, delete
  (Alexei, patch 1)
- removed kernel CONFIG parameter controlling unprivileged bpf disabled
  change (Alexei, patch 1)
- widened test scope to cover most BPF syscall commands, with positive
  and negative subtests

[1] https://lore.kernel.org/bpf/CAADnVQLTBhCTAx1a_nev7CgMZxv1Bb7ecz1AFRin8tHmjPREJA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/1652880861-27373-1-git-send-email-alan.maguire@oracle.com/T/
[3] https://lore.kernel.org/bpf/1652788780-25520-1-git-send-email-alan.maguire@oracle.com/T/#t
[4] https://lore.kernel.org/bpf/20220511163604.5kuczj6jx3ec5qv6@MBP-98dd607d3435.dhcp.thefacebook.com/T/#mae65f35a193279e718f37686da636094d69b96ee
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
parents 97949767 90a039fd
@@ -4863,9 +4863,21 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 {
         union bpf_attr attr;
+        bool capable;
         int err;
 
-        if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
+        capable = bpf_capable() || !sysctl_unprivileged_bpf_disabled;
+
+        /* Intent here is for unprivileged_bpf_disabled to block key object
+         * creation commands for unprivileged users; other actions depend
+         * of fd availability and access to bpffs, so are dependent on
+         * object creation success.  Capabilities are later verified for
+         * operations such as load and map create, so even with unprivileged
+         * BPF disabled, capability checks are still carried out for these
+         * and other operations.
+         */
+        if (!capable &&
+            (cmd == BPF_MAP_CREATE || cmd == BPF_PROG_LOAD))
                 return -EPERM;
 
         err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
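
Seen from user space, the check above means an unprivileged caller now
gets -EPERM up front only for BPF_MAP_CREATE and BPF_PROG_LOAD; other
commands fall through to their own permission checks.  A raw-syscall
sketch (illustrative only, error handling trimmed):

/* With kernel.unprivileged_bpf_disabled set and no CAP_BPF, only the
 * object-creating commands are refused immediately.
 */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static long sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
{
        return syscall(__NR_bpf, cmd, attr, size);
}

int main(void)
{
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.map_type = BPF_MAP_TYPE_ARRAY;
        attr.key_size = sizeof(__u32);
        attr.value_size = sizeof(__u32);
        attr.max_entries = 1;

        /* expect EPERM here for an unprivileged caller when the sysctl is set */
        if (sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr)) < 0)
                printf("BPF_MAP_CREATE: %s\n", strerror(errno));

        /* fd-based commands (BPF_MAP_LOOKUP_ELEM, BPF_OBJ_GET_INFO_BY_FD, ...)
         * are no longer rejected by this check and proceed to their own
         * capability/permission checks.
         */
        return 0;
}
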
This diff is collapsed.
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022, Oracle and/or its affiliates. */

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

#include "bpf_misc.h"

__u32 perfbuf_val = 0;
__u32 ringbuf_val = 0;

int test_pid;

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u32);
} array SEC(".maps");

struct {
        __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u32);
} percpu_array SEC(".maps");

struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u32);
} hash SEC(".maps");

struct {
        __uint(type, BPF_MAP_TYPE_PERCPU_HASH);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u32);
} percpu_hash SEC(".maps");

struct {
        __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
        __type(key, __u32);
        __type(value, __u32);
} perfbuf SEC(".maps");

struct {
        __uint(type, BPF_MAP_TYPE_RINGBUF);
        __uint(max_entries, 1 << 12);
} ringbuf SEC(".maps");

struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
} prog_array SEC(".maps");

SEC("fentry/" SYS_PREFIX "sys_nanosleep")
int sys_nanosleep_enter(void *ctx)
{
        int cur_pid;

        cur_pid = bpf_get_current_pid_tgid() >> 32;

        if (cur_pid != test_pid)
                return 0;

        bpf_perf_event_output(ctx, &perfbuf, BPF_F_CURRENT_CPU, &perfbuf_val,
                              sizeof(perfbuf_val));
        bpf_ringbuf_output(&ringbuf, &ringbuf_val, sizeof(ringbuf_val), 0);
        return 0;
}

SEC("perf_event")
int handle_perf_event(void *ctx)
{
        return 0;
}

char _license[] SEC("license") = "GPL";
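
The companion user-space test (its diff is collapsed above) presumably
drops capabilities and then consumes these perf/ringbuf events and map
values without privileges, given the fds.  A hedged libbpf sketch of
such a ring buffer consumer; the function and callback names are
hypothetical and fd acquisition is elided:

/* Unprivileged ring buffer consumer sketch: once ringbuf_fd is in hand
 * (inherited from a privileged loader or opened from a bpffs pin), no
 * capability is needed to mmap and poll it.  Names are illustrative,
 * not taken from the collapsed selftest diff.
 */
#include <stdio.h>
#include <bpf/libbpf.h>

static int handle_sample(void *ctx, void *data, size_t size)
{
        if (size == sizeof(__u32))      /* matches ringbuf_val in the BPF program */
                printf("ringbuf_val = %u\n", *(__u32 *)data);
        return 0;
}

int consume_ringbuf(int ringbuf_fd)
{
        struct ring_buffer *rb;
        int err;

        rb = ring_buffer__new(ringbuf_fd, handle_sample, NULL, NULL);
        if (!rb)
                return -1;

        err = ring_buffer__poll(rb, 100 /* timeout, ms */);
        ring_buffer__free(rb);
        return err < 0 ? err : 0;
}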