1. 21 Jan, 2022 29 commits
  2. 20 Jan, 2022 5 commits
  3. 19 Jan, 2022 6 commits
    • Merge branch 'bpf: allow cgroup progs to export custom retval to userspace' · 4e950747
      Alexei Starovoitov authored
      YiFei Zhu says:
      
      ====================
      
      Right now, most cgroup hooks are best used for permission checks. They
      can only reject a syscall with -EPERM, so when an eBPF cgroup hook
      rejects a syscall, the cause of the rejection is ambiguous to userspace.
      Additionally, if a syscall is implemented in eBPF, all permission
      checks and the implementation have to happen within the same filter,
      because programs executed later in the series of progs are unaware of
      the return values returned by the previous progs.
      
      This patch series adds two helpers, bpf_get_retval and bpf_set_retval,
      that allow hooks to get/set the return value of the syscall to userspace.
      This also allows later progs to retrieve the retval set by previous progs.
      
      For legacy programs that reject a syscall without setting the retval,
      backward compatibility is preserved: if a prog rejects without itself
      or a prior prog setting retval to an -err, the kernel sets the retval
      to -EPERM.
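
The fallback rule above can be modeled in plain userspace C. This is a simplified sketch, not the kernel code: `struct run_ctx`, `run_prog_array`, and the three example progs are hypothetical names, the model assumes every prog in the array runs even after an earlier reject, and the -err check assumes the kernel's usual error range of -1 to -4095.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-run state that holds the retval. */
struct run_ctx {
    int retval;
};

/* A modeled prog returns 1 to allow and 0 to reject, like a cgroup hook. */
typedef int (*prog_fn)(struct run_ctx *ctx);

/* Assumption: an -err is any value in [-4095, -1]. */
static int is_err(int v)
{
    return v < 0 && v >= -4095;
}

/* Run every prog in order; a reject with no -err retval becomes -EPERM. */
static int run_prog_array(const prog_fn *progs, size_t n)
{
    struct run_ctx ctx = { .retval = 0 };

    assert(progs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int allow = progs[i](&ctx);

        if (!allow && !is_err(ctx.retval))
            ctx.retval = -EPERM;
    }
    return ctx.retval;
}

/* Allows the syscall and leaves retval untouched. */
static int prog_allow(struct run_ctx *ctx)
{
    (void)ctx;
    return 1;
}

/* Legacy style: rejects without setting any retval. */
static int prog_legacy_reject(struct run_ctx *ctx)
{
    (void)ctx;
    return 0;
}

/* New style: picks the error first, then rejects. */
static int prog_reject_einval(struct run_ctx *ctx)
{
    ctx->retval = -EINVAL;
    return 0;
}
```

In this model a lone legacy reject yields -EPERM, while a reject that first stores -EINVAL keeps -EINVAL even if a legacy prog rejects afterwards.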
      
      For getsockopt hooks, which have ctx->retval, this variable mirrors
      the value accessed by the helpers.
      
      Additionally, the following user-visible behavior for getsockopt
      hooks has changed:
        - If a prior filter rejected the syscall, it will be visible
          in ctx->retval.
        - Attempting to change the retval arbitrarily is now allowed and
          will not cause an -EFAULT.
        - If the kernel rejects a getsockopt syscall before running the hooks,
          the error will be visible in ctx->retval. Returning 0 from the
          prog will not overwrite the error to -EPERM unless there is an
          explicit call to bpf_set_retval(-EPERM).
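
The getsockopt changes listed above can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the kernel implementation: `struct sockopt_ctx`, `run_getsockopt_hooks`, and the example progs are hypothetical, the `observed` field exists only so the example can record what a prog saw, and the value 16 is an arbitrary stand-in for a non-error result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a getsockopt hook context. */
struct sockopt_ctx {
    int retval;    /* seeded with the kernel's own getsockopt result */
    int observed;  /* illustration only: retval as seen by prog_record */
};

/* A modeled prog returns 1 to allow and 0 to reject. */
typedef int (*sockopt_prog_fn)(struct sockopt_ctx *ctx);

/* Assumption: an -err is any value in [-4095, -1]. */
static int is_err(int v)
{
    return v < 0 && v >= -4095;
}

/* Seed retval with the kernel result, then run every hook in order. */
static int run_getsockopt_hooks(struct sockopt_ctx *ctx,
                                const sockopt_prog_fn *progs, size_t n,
                                int kernel_ret)
{
    assert(ctx != NULL && (progs != NULL || n == 0));
    ctx->retval = kernel_ret;
    for (size_t i = 0; i < n; i++) {
        /* Only a reject with no -err already stored is forced to -EPERM. */
        if (!progs[i](ctx) && !is_err(ctx->retval))
            ctx->retval = -EPERM;
    }
    return ctx->retval;
}

/* Rejects without touching retval. */
static int prog_reject_plain(struct sockopt_ctx *ctx)
{
    (void)ctx;
    return 0;
}

/* Rejects and explicitly asks for -EPERM. */
static int prog_reject_eperm(struct sockopt_ctx *ctx)
{
    ctx->retval = -EPERM;
    return 0;
}

/* Records what an earlier hook (or the kernel) left in retval. */
static int prog_record(struct sockopt_ctx *ctx)
{
    ctx->observed = ctx->retval;
    return 1;
}

/* Sets an arbitrary non-error retval and allows the syscall. */
static int prog_set_value(struct sockopt_ctx *ctx)
{
    ctx->retval = 16;
    return 1;
}
```

With a kernel result of -EINVAL, a plain reject leaves -EINVAL in place in this model, and only the explicit variant turns it into -EPERM.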
      
      Tests have been added in this series to test the behavior of the helpers
      with cgroup setsockopt and getsockopt hooks.
      
      Patch 1 changes the API of macros to prepare for the next patch and
        should be a no-op.
      Patch 2 moves ctx->retval to a struct pointed to by the current
        task_struct.
      Patch 3 implements the helpers.
      Patch 4 tests the behaviors of the helpers.
      Patch 5 updates a test after the test broke due to the visible changes.
      
      v1 -> v2:
        - errno -> retval
        - split one helper to get & set helpers
        - allow retval to be set arbitrarily in the general case
        - made the helper retval and context retval mirror each other
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Update sockopt_sk test to the use bpf_set_retval · 1080ef5c
      YiFei Zhu authored
      The tests would break without this patch, because at one point the test calls
        getsockopt(fd, SOL_TCP, TCP_ZEROCOPY_RECEIVE, &buf, &optlen)
      This getsockopt receives the kernel-set -EINVAL. Prior to this patch
      series, the eBPF getsockopt hook's -EPERM would override the kernel's
      -EINVAL. After this patch series, the automatic -EPERM from returning 0
      will not; the eBPF prog has to explicitly call bpf_set_retval(-EPERM)
      if that is wanted.
      
      I also removed the explicit mentions of EPERM in the comments in the
      prog.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/4f20b77cb46812dbc2bdcd7e3fa87c7573bde55e.1639619851.git.zhuyifei@google.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Test bpf_{get,set}_retval behavior with cgroup/sockopt · b8bff6f8
      YiFei Zhu authored
      The tests check how different ways of interacting with the helpers
      (getting retval, setting EUNATCH, EISCONN, and legacy reject
      returning 0 without setting retval) produce different results in
      both the setsockopt syscall and the retval returned by the helper.
      A few more tests verify the interaction between the retval of the
      helper and the retval in the getsockopt context.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/43ec60d679ae3f4f6fd2460559c28b63cb93cd12.1639619851.git.zhuyifei@google.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall return value · b44123b4
      YiFei Zhu authored
      The helpers continue to use int for retval because all the hooks
      are int-returning rather than long-returning. The return value of
      bpf_set_retval is int for future-proofing, in case in the future
      there may be errors trying to set the retval.
      
      After the previous patch, if a program rejects a syscall by
      returning 0, an -EPERM is generated regardless of whether the
      retval is already set to an -err. This patch changes that so the
      -EPERM is forced only if retval is not an -err. This is because we
      want to support, for example, invoking bpf_set_retval(-EINVAL) and
      returning 0, and having the syscall return value be -EINVAL, not
      -EPERM.
      
      For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
      that, if the return value is NET_XMIT_DROP, the packet is silently
      dropped. We preserve this behavior for backward compatibility
      reasons, so even if an errno is set, the errno is not returned to
      the caller. However, a non-err retval cannot be propagated, so
      setting one is not allowed and we return -EFAULT in that case.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/b4013fd5d16bed0b01977c1fafdeae12e1de61fb.1639619851.git.zhuyifei@google.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Move getsockopt retval to struct bpf_cg_run_ctx · c4dcfdd4
      YiFei Zhu authored
      The retval value is moved to struct bpf_cg_run_ctx for ease of access
      in different prog types with different context struct layouts. The
      helper implementation (to be added in a later patch in the series) can
      simply perform a container_of from current->bpf_ctx to retrieve
      bpf_cg_run_ctx.
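
The container_of step described above can be sketched in userspace C. The struct layouts here are simplified stand-ins rather than the kernel definitions: `struct task` is a hypothetical replacement for the current task_struct, `model_get_retval` and `model_set_retval` are hypothetical names, and the `placeholder` member exists only because ISO C does not allow empty structs.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for the generic run context. */
struct bpf_run_ctx {
    int placeholder;
};

/* Simplified stand-in: the generic context is embedded as a member. */
struct bpf_cg_run_ctx {
    struct bpf_run_ctx run_ctx;
    int retval;
};

/* Hypothetical stand-in for the task that carries the bpf_ctx pointer. */
struct task {
    struct bpf_run_ctx *bpf_ctx;
};

/* Recover the enclosing cgroup context from the embedded member pointer. */
static struct bpf_cg_run_ctx *cg_ctx_of(const struct task *t)
{
    assert(t != NULL && t->bpf_ctx != NULL);
    return container_of(t->bpf_ctx, struct bpf_cg_run_ctx, run_ctx);
}

/* Model of a getter: read retval through the recovered context. */
static int model_get_retval(const struct task *t)
{
    return cg_ctx_of(t)->retval;
}

/* Model of a setter: store retval and report success with 0. */
static int model_set_retval(const struct task *t, int retval)
{
    cg_ctx_of(t)->retval = retval;
    return 0;
}
```

Because the task only stores a pointer to the embedded member, the same lookup works no matter which prog type's context struct is in use, which is the ease of access the commit message refers to.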
      
      Unfortunately, there is no easy way to access the current task_struct
      via the verifier BPF bytecode rewrite, aside from possibly calling a
      helper, so a pointer to current task is added to struct bpf_sockopt_kern
      so that the rewritten BPF bytecode can access struct bpf_cg_run_ctx with
      an indirection.
      
      For backward compatibility, if a getsockopt program rejects a syscall
      by returning 0, an -EPERM will be generated, by having the
      BPF_PROG_RUN_ARRAY_CG family macros automatically set the retval to
      -EPERM. Unlike prior to this patch, this -EPERM will be visible in
      ctx->retval for any other hooks down the line in the prog array.
      
      Additionally, the restriction that getsockopt filters can only set
      the retval to 0 is removed, considering that certain getsockopt
      implementations may return optlen. Filters are now able to set the
      value arbitrarily.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/73b0325f5c29912ccea7ea57ec1ed4d388fc1d37.1639619851.git.zhuyifei@google.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Make BPF_PROG_RUN_ARRAY return -err instead of allow boolean · f10d0596
      YiFei Zhu authored
      Right now BPF_PROG_RUN_ARRAY and related macros return 1 or 0
      for whether the prog array allows or rejects whatever is being
      hooked. The caller of these macros then returns -EPERM or continues
      processing based on the macro's return value. Unfortunately this is
      inflexible, since -EPERM is the only err that can be returned.
      
      This patch should be a no-op; it prepares for the next patch. The
      returning of the -EPERM is moved to inside the macros, so the outer
      functions are directly returning what the macros returned if they
      are non-zero.
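
The refactoring described above can be sketched as a before/after pair in userspace C. All four functions are hypothetical models, not the kernel macros or their callers; `any_reject` stands for "some prog in the array rejected".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Old convention (modeled): the macro reports allow (1) or reject (0). */
static int run_array_bool(bool any_reject)
{
    return any_reject ? 0 : 1;
}

/* Old caller: translates the boolean into -EPERM itself. */
static int hook_old(bool any_reject)
{
    int allowed = run_array_bool(any_reject);

    return allowed ? 0 : -EPERM;
}

/* New convention (modeled): the macro returns 0 or -EPERM directly. */
static int run_array_err(bool any_reject)
{
    return any_reject ? -EPERM : 0;
}

/* New caller: passes the macro's result straight through. */
static int hook_new(bool any_reject)
{
    return run_array_err(any_reject);
}
```

Both callers return the same value for every input in this model, which matches the claim that the patch is a no-op. The difference is that the error code now originates inside the macro, where a later patch can replace it.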
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/788abcdca55886d1f43274c918eaa9f792a9f33b.1639619851.git.zhuyifei@google.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>