1. 29 Sep, 2020 37 commits
    • Toke Høiland-Jørgensen's avatar
      bpf: Support attaching freplace programs to multiple attach points · 4a1e7c0c
      Toke Høiland-Jørgensen authored
      This enables support for attaching freplace programs to multiple attach
      points. It does this by amending the UAPI for bpf_link_Create with a target
      btf ID that can be used to supply the new attachment point along with the
      target program fd. The target must be compatible with the target that was
      supplied at program load time.
      
      The implementation reuses the checks that were factored out of
      check_attach_btf_id() to ensure compatibility between the BTF types of the
      old and new attachment. If these match, a new bpf_tracing_link will be
      created for the new attach target, allowing multiple attachments to
      co-exist simultaneously.
      
      The code could theoretically support multiple-attach of other types of
      tracing programs as well, but since I don't have a use case for any of
      those, there is no API support for doing so.
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/160138355169.48470.17165680973640685368.stgit@toke.dk
      4a1e7c0c
    • Toke Høiland-Jørgensen's avatar
      bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach · 3aac1ead
      Toke Høiland-Jørgensen authored
      In preparation for allowing multiple attachments of freplace programs, move
      the references to the target program and trampoline into the
      bpf_tracing_link structure when that is created. To do this atomically,
      introduce a new mutex in prog->aux to protect writing to the two pointers
      to target prog and trampoline, and rename the members to make it clear that
      they are related.
      
      With this change, it is no longer possible to attach the same tracing
      program multiple times (detaching in-between), since the reference from the
      tracing program to the target disappears on the first attach. However,
      since the next patch will let the caller supply an attach target, that will
      also make it possible to attach to the same place multiple times.
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/160138355059.48470.2503076992210324984.stgit@toke.dk
      3aac1ead
    • Alexei Starovoitov's avatar
      Merge branch 'libbpf: support loading/storing any BTF' · 85e3f318
      Alexei Starovoitov authored
      Andrii Nakryiko says:
      
      ====================
      Add support for loading and storing BTF in either little- or big-endian
      integer encodings, regardless of host endianness. This allows users of libbpf
      to not care about endianness when they don't want to and transparently
      open/load BTF of any endianness. libbpf will preserve original endianness and
      will convert output raw data as necessary back to original endianness, if
      necessary. This allows tools like pahole to be ignorant to such issues during
      cross-compilation.
      
      While working with BTF data in memory, the endianness is always native to the
      host. Convetion can happen only during btf__get_raw_data() call, and only in
      a raw data copy.
      
      Additionally, it's possible to force output BTF endianness through new
      btf__set_endianness() API. This which allows to create flexible tools doing
      arbitrary conversions of BTF endianness, just by relying on libbpf.
      
      Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
      Cc: Tony Ambardar <tony.ambardar@gmail.com>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Cc: Luka Perkov <luka.perkov@sartura.hr>
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      85e3f318
    • Andrii Nakryiko's avatar
      selftests/bpf: Test BTF's handling of endianness · ed9cf248
      Andrii Nakryiko authored
      Add selftests juggling endianness back and forth to validate BTF's handling of
      endianness convertions internally.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200929043046.1324350-4-andriin@fb.com
      ed9cf248
    • Andrii Nakryiko's avatar
      libbpf: Support BTF loading and raw data output in both endianness · 3289959b
      Andrii Nakryiko authored
      Teach BTF to recognized wrong endianness and transparently convert it
      internally to host endianness. Original endianness of BTF will be preserved
      and used during btf__get_raw_data() to convert resulting raw data to the same
      endianness and a source raw_data. This means that little-endian host can parse
      big-endian BTF with no issues, all the type data will be presented to the
      client application in native endianness, but when it's time for emitting BTF
      to persist it in a file (e.g., after BTF deduplication), original non-native
      endianness will be preserved and stored.
      
      It's possible to query original endianness of BTF data with new
      btf__endianness() API. It's also possible to override desired output
      endianness with btf__set_endianness(), so that if application needs to load,
      say, big-endian BTF and store it as little-endian BTF, it's possible to
      manually override this. If btf__set_endianness() was used to change
      endianness, btf__endianness() will reflect overridden endianness.
      
      Given there are no known use cases for supporting cross-endianness for
      .BTF.ext, loading .BTF.ext in non-native endianness is not supported.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200929043046.1324350-3-andriin@fb.com
      3289959b
    • Andrii Nakryiko's avatar
      selftests/bpf: Move and extend ASSERT_xxx() testing macros · 22ba3635
      Andrii Nakryiko authored
      Move existing ASSERT_xxx() macros out of btf_write selftest into test_progs.h
      to use across all selftests. Also expand a set of macros for typical cases.
      
      Now there are the following macros:
        - ASSERT_EQ() -- check for equality of two integers;
        - ASSERT_STREQ() -- check for equality of two C strings;
        - ASSERT_OK() -- check for successful (zero) return result;
        - ASSERT_ERR() -- check for unsuccessful (non-zero) return result;
        - ASSERT_NULL() -- check for NULL pointer;
        - ASSERT_OK_PTR() -- check for a valid pointer;
        - ASSERT_ERR_PTR() -- check for NULL or negative error encoded in a pointer.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200929043046.1324350-2-andriin@fb.com
      22ba3635
    • Toke Høiland-Jørgensen's avatar
      selftests: Make sure all 'skel' variables are declared static · f970cbcd
      Toke Høiland-Jørgensen authored
      If programs in prog_tests using skeletons declare the 'skel' variable as
      global but not static, that will lead to linker errors on the final link of
      the prog_tests binary due to duplicate symbols. Fix a few instances of this.
      
      Fixes: b18c1f0a ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
      Fixes: 9a856cae ("bpf: selftest: Add test_btf_skc_cls_ingress")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200929123026.46751-1-toke@redhat.com
      f970cbcd
    • Ciara Loftus's avatar
      xsk: Fix a documentation mistake in xsk_queue.h · f1fc8ece
      Ciara Loftus authored
      After 'peeking' the ring, the consumer, not the producer, reads the data.
      Fix this mistake in the comments.
      
      Fixes: 15d8c916 ("xsk: Add function naming comments and reorder functions")
      Signed-off-by: default avatarCiara Loftus <ciara.loftus@intel.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarMagnus Karlsson <magnus.karlsson@intel.com>
      Link: https://lore.kernel.org/bpf/20200928082344.17110-1-ciara.loftus@intel.com
      f1fc8ece
    • Toke Høiland-Jørgensen's avatar
      selftests/bpf_iter: Don't fail test due to missing __builtin_btf_type_id · d2197c7f
      Toke Høiland-Jørgensen authored
      The new test for task iteration in bpf_iter checks (in do_btf_read()) if it
      should be skipped due to missing __builtin_btf_type_id. However, this
      'skip' verdict is not propagated to the caller, so the parent test will
      still fail. Fix this by also skipping the rest of the parent test if the
      skip condition was reached.
      
      Fixes: b72091bd ("selftests/bpf: Add test for bpf_seq_printf_btf helper")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Reviewed-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Link: https://lore.kernel.org/bpf/20200929123004.46694-1-toke@redhat.com
      d2197c7f
    • Toke Høiland-Jørgensen's avatar
      bpf/preload: Make sure Makefile cleans up after itself, and add .gitignore · 9d9aae53
      Toke Høiland-Jørgensen authored
      The Makefile in bpf/preload builds a local copy of libbpf, but does not
      properly clean up after itself. This can lead to subsequent compilation
      failures, since the feature detection cache is kept around which can lead
      subsequent detection to fail.
      
      Fix this by properly setting clean-files, and while we're at it, also add a
      .gitignore for the directory to ignore the build artifacts.
      
      Fixes: d71fa5c9 ("bpf: Add kernel module with user mode driver that populates bpffs.")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200927193005.8459-1-toke@redhat.com
      9d9aae53
    • Alexei Starovoitov's avatar
      Merge branch 'selftests/bpf: BTF-based kernel data display' · 3aae4a38
      Alexei Starovoitov authored
      Alan Maguire says:
      
      ====================
      Resolve issues in bpf selftests introduced with BTF-based kernel data
      display selftests; these are
      
      - a warning introduced in snprintf_btf.c; and
      - compilation failures with old kernels vmlinux.h
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      3aae4a38
    • Alan Maguire's avatar
      selftests/bpf: Ensure snprintf_btf/bpf_iter tests compatibility with old vmlinux.h · cfe77683
      Alan Maguire authored
      Andrii reports that bpf selftests relying on "struct btf_ptr" and BTF_F_*
      values will not build as vmlinux.h for older kernels will not include
      "struct btf_ptr" or the BTF_F_* enum values.  Undefine and redefine
      them to work around this.
      
      Fixes: b72091bd ("selftests/bpf: Add test for bpf_seq_printf_btf helper")
      Fixes: 076a95f5 ("selftests/bpf: Add bpf_snprintf_btf helper tests")
      Reported-by: default avatarAndrii Nakryiko <andrii.nakryiko@gmail.com>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/1601379151-21449-3-git-send-email-alan.maguire@oracle.com
      cfe77683
    • Alan Maguire's avatar
      selftests/bpf: Fix unused-result warning in snprintf_btf.c · 96c48058
      Alan Maguire authored
      Daniel reports:
      
      +    system("ping -c 1 127.0.0.1 > /dev/null");
      
      This generates the following new warning when compiling BPF selftests:
      
        [...]
        EXT-OBJ  [test_progs] cgroup_helpers.o
        EXT-OBJ  [test_progs] trace_helpers.o
        EXT-OBJ  [test_progs] network_helpers.o
        EXT-OBJ  [test_progs] testing_helpers.o
        TEST-OBJ [test_progs] snprintf_btf.test.o
      /root/bpf-next/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c: In function ‘test_snprintf_btf’:
      /root/bpf-next/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c:30:2: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result]
        system("ping -c 1 127.0.0.1 > /dev/null");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        [...]
      
      Fixes: 076a95f5 ("selftests/bpf: Add bpf_snprintf_btf helper tests")
      Reported-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601379151-21449-2-git-send-email-alan.maguire@oracle.com
      96c48058
    • John Fastabend's avatar
      bpf, selftests: Fix cast to smaller integer type 'int' warning in raw_tp · 00e8c44a
      John Fastabend authored
      Fix warning in bpf selftests,
      
      progs/test_raw_tp_test_run.c:18:10: warning: cast to smaller integer type 'int' from 'struct task_struct *' [-Wpointer-to-int-cast]
      
      Change int type cast to long to fix. Discovered with gcc-9 and llvm-11+
      where llvm was recent main branch.
      
      Fixes: 09d8ad16 ("selftests/bpf: Add raw_tp_test_run")
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/160134424745.11199.13841922833336698133.stgit@john-Precision-5820-Tower
      00e8c44a
    • Alexei Starovoitov's avatar
      Merge branch 'libbpf: BTF writer APIs' · bc600908
      Alexei Starovoitov authored
      Andrii Nakryiko says:
      
      ====================
      This patch set introduces a new set of BTF APIs to libbpf that allow to
      conveniently produce BTF types and strings. These APIs will allow libbpf to do
      more intrusive modifications of program's BTF (by rewriting it, at least as of
      right now), which is necessary for the upcoming libbpf static linking. But
      they are complete and generic, so can be adopted by anyone who has a need to
      produce BTF type information.
      
      One such example outside of libbpf is pahole, which was actually converted to
      these APIs (locally, pending landing of these changes in libbpf) completely
      and shows reduction in amount of custom pahole code necessary and brings nice
      savings in memory usage (about 370MB reduction at peak for my kernel
      configuration) and even BTF deduplication times (one second reduction,
      23.7s -> 22.7s). Memory savings are due to avoiding pahole's own copy of
      "uncompressed" raw BTF data. Time reduction comes from faster string
      search and deduplication by relying on hashmap instead of BST used by pahole's
      own code. Consequently, these APIs are already tested on real-world
      complicated kernel BTF, but there is also pretty extensive selftest doing
      extra validations.
      
      Selftests in patch #3 add a set of generic ASSERT_{EQ,STREQ,ERR,OK} macros
      that are useful for writing shorter and less repretitive selftests. I decided
      to keep them local to that selftest for now, but if they prove to be useful in
      more contexts we should move them to test_progs.h. And few more (e.g.,
      inequality tests) macros are probably necessary to have a more complete set.
      
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      
      v2->v3:
        - resending original patches #7-9 as patches #1-3 due to merge conflict;
      
      v1->v2:
        - fixed comments (John);
        - renamed btf__append_xxx() into btf__add_xxx() (Alexei);
        - added btf__find_str() in addition to btf__add_str();
        - btf__new_empty() now sets kernel FD to -1 initially.
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      bc600908
    • Andrii Nakryiko's avatar
    • Andrii Nakryiko's avatar
      libbpf: Add btf__str_by_offset() as a more generic variant of name_by_offset · f86ed050
      Andrii Nakryiko authored
      BTF strings are used not just for names, they can be arbitrary strings used
      for CO-RE relocations, line/func infos, etc. Thus "name_by_offset" terminology
      is too specific and might be misleading. Instead, introduce
      btf__str_by_offset() API which uses generic string terminology.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200929020533.711288-3-andriin@fb.com
      f86ed050
    • Andrii Nakryiko's avatar
      libbpf: Add BTF writing APIs · 4a3b33f8
      Andrii Nakryiko authored
      Add APIs for appending new BTF types at the end of BTF object.
      
      Each BTF kind has either one API of the form btf__add_<kind>(). For types
      that have variable amount of additional items (struct/union, enum, func_proto,
      datasec), additional API is provided to emit each such item. E.g., for
      emitting a struct, one would use the following sequence of API calls:
      
      btf__add_struct(...);
      btf__add_field(...);
      ...
      btf__add_field(...);
      
      Each btf__add_field() will ensure that the last BTF type is of STRUCT or
      UNION kind and will automatically increment that type's vlen field.
      
      All the strings are provided as C strings (const char *), not a string offset.
      This significantly improves usability of BTF writer APIs. All such strings
      will be automatically appended to string section or existing string will be
      re-used, if such string was already added previously.
      
      Each API attempts to do all the reasonable validations, like enforcing
      non-empty names for entities with required names, proper value bounds, various
      bit offset restrictions, etc.
      
      Type ID validation is minimal because it's possible to emit a type that refers
      to type that will be emitted later, so libbpf has no way to enforce such
      cases. User must be careful to properly emit all the necessary types and
      specify type IDs that will be valid in the finally generated BTF.
      
      Each of btf__add_<kind>() APIs return new type ID on success or negative
      value on error. APIs like btf__add_field() that emit additional items
      return zero on success and negative value on error.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200929020533.711288-2-andriin@fb.com
      4a3b33f8
    • Alexei Starovoitov's avatar
      Merge branch 'bpf: add helpers to support BTF-based kernel' · 98b972d2
      Alexei Starovoitov authored
      Alan Maguire says:
      
      ====================
      This series attempts to provide a simple way for BPF programs (and in
      future other consumers) to utilize BPF Type Format (BTF) information
      to display kernel data structures in-kernel.  The use case this
      functionality is applied to here is to support a snprintf()-like
      helper to copy a BTF representation of kernel data to a string,
      and a BPF seq file helper to display BTF data for an iterator.
      
      There is already support in kernel/bpf/btf.c for "show" functionality;
      the changes here generalize that support from seq-file specific
      verifier display to the more generic case and add another specific
      use case; rather than seq_printf()ing the show data, it is copied
      to a supplied string using a snprintf()-like function.  Other future
      consumers of the show functionality could include a bpf_printk_btf()
      function which printk()ed the data instead.  Oops messaging in
      particular would be an interesting application for such functionality.
      
      The above potential use case hints at a potential reply to
      a reasonable objection that such typed display should be
      solved by tracing programs, where the in-kernel tracing records
      data and the userspace program prints it out.  While this
      is certainly the recommended approach for most cases, I
      believe having an in-kernel mechanism would be valuable
      also.  Critically in BPF programs it greatly simplifies
      debugging and tracing of such data to invoking a simple
      helper.
      
      One challenge raised in an earlier iteration of this work -
      where the BTF printing was implemented as a printk() format
      specifier - was that the amount of data printed per
      printk() was large, and other format specifiers were far
      simpler.  Here we sidestep that concern by printing
      components of the BTF representation as we go for the
      seq file case, and in the string case the snprintf()-like
      operation is intended to be a basis for perf event or
      ringbuf output.  The reasons for avoiding bpf_trace_printk
      are that
      
      1. bpf_trace_printk() strings are restricted in size and
      cannot display anything beyond trivial data structures; and
      2. bpf_trace_printk() is for debugging purposes only.
      
      As Alexei suggested, a bpf_trace_puts() helper could solve
      this in the future but it still would be limited by the
      1000 byte limit for traced strings.
      
      Default output for an sk_buff looks like this (zeroed fields
      are omitted):
      
      (struct sk_buff){
       .transport_header = (__u16)65535,
       .mac_header = (__u16)65535,
       .end = (sk_buff_data_t)192,
       .head = (unsigned char *)0x000000007524fd8b,
       .data = (unsigned char *)0x000000007524fd8b,
       .truesize = (unsigned int)768,
       .users = (refcount_t){
        .refs = (atomic_t){
         .counter = (int)1,
        },
       },
      }
      
      Flags can modify aspects of output format; see patch 3
      for more details.
      
      Changes since v6:
      
      - Updated safe data size to 32, object name size to 80.
        This increases the number of safe copies done, but performance is
        not a key goal here. WRT name size the largest type name length
        in bpf-next according to "pahole -s" is 64 bytes, so that still gives
        room for additional type qualifiers, parens etc within the name limit
        (Alexei, patch 2)
      - Remove inlines and converted as many #defines to functions as was
        possible.  In a few cases - btf_show_type_value[s]() specifically -
        I left these as macros as btf_show_type_value[s]() prepends and
        appends format strings to the format specifier (in order to include
        indentation, delimiters etc so a macro makes that simpler (Alexei,
        patch 2)
      - Handle btf_resolve_size() error in btf_show_obj_safe() (Alexei, patch 2)
      - Removed clang loop unroll in BTF snprintf test (Alexei)
      - switched to using bpf_core_type_id_kernel(type) as suggested by Andrii,
        and Alexei noted that __builtin_btf_type_id(,1) should be used (patch 4)
      - Added skip logic if __builtin_btf_type_id is not available (patches 4,8)
      - Bumped limits on bpf iters to support printing larger structures (Alexei,
        patch 5)
      - Updated overflow bpf_iter tests to reflect new iter max size (patch 6)
      - Updated seq helper to use type id only (Alexei, patch 7)
      - Updated BTF task iter test to use task struct instead of struct fs_struct
        since new limits allow a task_struct to be displayed (patch 8)
      - Fixed E2BIG handling in iter task (Alexei, patch 8)
      
      Changes since v5:
      
      - Moved btf print prepare into patch 3, type show seq
        with flags into patch 2 (Alexei, patches 2,3)
      - Fixed build bot warnings around static declarations
        and printf attributes
      - Renamed functions to snprintf_btf/seq_printf_btf
        (Alexei, patches 3-6)
      
      Changes since v4:
      
      - Changed approach from a BPF trace event-centric design to one
        utilizing a snprintf()-like helper and an iter helper (Alexei,
        patches 3,5)
      - Added tests to verify BTF output (patch 4)
      - Added support to tests for verifying BTF type_id-based display
        as well as type name via __builtin_btf_type_id (Andrii, patch 4).
      - Augmented task iter tests to cover the BTF-based seq helper.
        Because a task_struct's BTF-based representation would overflow
        the PAGE_SIZE limit on iterator data, the "struct fs_struct"
        (task->fs) is displayed for each task instead (Alexei, patch 6).
      
      Changes since v3:
      
      - Moved to RFC since the approach is different (and bpf-next is
        closed)
      - Rather than using a printk() format specifier as the means
        of invoking BTF-enabled display, a dedicated BPF helper is
        used.  This solves the issue of printk() having to output
        large amounts of data using a complex mechanism such as
        BTF traversal, but still provides a way for the display of
        such data to be achieved via BPF programs.  Future work could
        include a bpf_printk_btf() function to invoke display via
        printk() where the elements of a data structure are printk()ed
       one at a time.  Thanks to Petr Mladek, Andy Shevchenko and
        Rasmus Villemoes who took time to look at the earlier printk()
        format-specifier-focused version of this and provided feedback
        clarifying the problems with that approach.
      - Added trace id to the bpf_trace_printk events as a means of
        separating output from standard bpf_trace_printk() events,
        ensuring it can be easily parsed by the reader.
      - Added bpf_trace_btf() helper tests which do simple verification
        of the various display options.
      
      Changes since v2:
      
      - Alexei and Yonghong suggested it would be good to use
        probe_kernel_read() on to-be-shown data to ensure safety
        during operation.  Safe copy via probe_kernel_read() to a
        buffer object in "struct btf_show" is used to support
        this.  A few different approaches were explored
        including dynamic allocation and per-cpu buffers. The
        downside of dynamic allocation is that it would be done
        during BPF program execution for bpf_trace_printk()s using
        %pT format specifiers. The problem with per-cpu buffers
        is we'd have to manage preemption and since the display
        of an object occurs over an extended period and in printk
        context where we'd rather not change preemption status,
        it seemed tricky to manage buffer safety while considering
        preemption.  The approach of utilizing stack buffer space
        via the "struct btf_show" seemed like the simplest approach.
        The stack size of the associated functions which have a
        "struct btf_show" on their stack to support show operation
        (btf_type_snprintf_show() and btf_type_seq_show()) stays
        under 500 bytes. The compromise here is the safe buffer we
        use is small - 256 bytes - and as a result multiple
        probe_kernel_read()s are needed for larger objects. Most
        objects of interest are smaller than this (e.g.
        "struct sk_buff" is 224 bytes), and while task_struct is a
        notable exception at ~8K, performance is not the priority for
        BTF-based display. (Alexei and Yonghong, patch 2).
      - safe buffer use is the default behaviour (and is mandatory
        for BPF) but unsafe display - meaning no safe copy is done
        and we operate on the object itself - is supported via a
        'u' option.
      - pointers are prefixed with 0x for clarity (Alexei, patch 2)
      - added additional comments and explanations around BTF show
        code, especially around determining whether objects such
        zeroed. Also tried to comment safe object scheme used. (Yonghong,
        patch 2)
      - added late_initcall() to initialize vmlinux BTF so that it would
        not have to be initialized during printk operation (Alexei,
        patch 5)
      - removed CONFIG_BTF_PRINTF config option as it is not needed;
        CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and
        determining behaviour of type-based printk can be done via
        retrieval of BTF data; if it's not there BTF was unavailable
        or broken (Alexei, patches 4,6)
      - fix bpf_trace_printk test to use vmlinux.h and globals via
        skeleton infrastructure, removing need for perf events
        (Andrii, patch 8)
      
      Changes since v1:
      
      - changed format to be more drgn-like, rendering indented type info
        along with type names by default (Alexei)
      - zeroed values are omitted (Arnaldo) by default unless the '0'
        modifier is specified (Alexei)
      - added an option to print pointer values without obfuscation.
        The reason to do this is the sysctls controlling pointer display
        are likely to be irrelevant in many if not most tracing contexts.
        Some questions on this in the outstanding questions section below...
      - reworked printk format specifer so that we no longer rely on format
        %pT<type> but instead use a struct * which contains type information
        (Rasmus). This simplifies the printk parsing, makes use more dynamic
        and also allows specification by BTF id as well as name.
      - removed incorrect patch which tried to fix dereferencing of resolved
        BTF info for vmlinux; instead we skip modifiers for the relevant
        case (array element type determination) (Alexei).
      - fixed issues with negative snprintf format length (Rasmus)
      - added test cases for various data structure formats; base types,
        typedefs, structs, etc.
      - tests now iterate through all typedef, enum, struct and unions
        defined for vmlinux BTF and render a version of the target dummy
        value which is either all zeros or all 0xff values; the idea is this
        exercises the "skip if zero" and "print everything" cases.
      - added support in BPF for using the %pT format specifier in
        bpf_trace_printk()
      - added BPF tests which ensure %pT format specifier use works (Alexei).
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      98b972d2
    • Alan Maguire's avatar
      selftests/bpf: Add test for bpf_seq_printf_btf helper · b72091bd
      Alan Maguire authored
      Add a test verifying iterating over tasks and displaying BTF
      representation of task_struct succeeds.
      Suggested-by: default avatarAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-9-git-send-email-alan.maguire@oracle.com
      b72091bd
    • Alan Maguire's avatar
      bpf: Add bpf_seq_printf_btf helper · eb411377
      Alan Maguire authored
      A helper is added to allow seq file writing of kernel data
      structures using vmlinux BTF.  Its signature is
      
      long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr,
                              u32 btf_ptr_size, u64 flags);
      
      Flags and struct btf_ptr definitions/use are identical to the
      bpf_snprintf_btf helper, and the helper returns 0 on success
      or a negative error value.
      Suggested-by: default avatarAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
      eb411377
    • Alan Maguire's avatar
      selftests/bpf: Fix overflow tests to reflect iter size increase · eb58bbf2
      Alan Maguire authored
      bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming
      page size need to be bumped also.
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-7-git-send-email-alan.maguire@oracle.com
      eb58bbf2
    • Alan Maguire's avatar
      bpf: Bump iter seq size to support BTF representation of large data structures · af653209
      Alan Maguire authored
      BPF iter size is limited to PAGE_SIZE; if we wish to display BTF-based
      representations of larger kernel data structures such as task_struct,
      this will be insufficient.
      Suggested-by: default avatarAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-6-git-send-email-alan.maguire@oracle.com
      af653209
    • Alan Maguire's avatar
      selftests/bpf: Add bpf_snprintf_btf helper tests · 076a95f5
      Alan Maguire authored
      Tests verifying snprintf()ing of various data structures,
      flags combinations using a tp_btf program. Tests are skipped
      if __builtin_btf_type_id is not available to retrieve BTF
      type ids.
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-5-git-send-email-alan.maguire@oracle.com
      076a95f5
    • Alan Maguire's avatar
      bpf: Add bpf_snprintf_btf helper · c4d0bfb4
      Alan Maguire authored
      A helper is added to support tracing kernel type information in BPF
      using the BPF Type Format (BTF).  Its signature is
      
      long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
      		      u32 btf_ptr_size, u64 flags);
      
      struct btf_ptr * specifies
      
      - a pointer to the data to be traced
      - the BTF id of the type of data pointed to
      - a flags field is provided for future use; these flags
        are not to be confused with the BTF_F_* flags
        below that control how the btf_ptr is displayed; the
        flags member of the struct btf_ptr may be used to
        disambiguate types in kernel versus module BTF, etc;
        the main distinction is the flags relate to the type
        and information needed in identifying it; not how it
        is displayed.
      
      For example a BPF program with a struct sk_buff *skb
      could do the following:
      
      	static struct btf_ptr b = { };
      
      	b.ptr = skb;
      	b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
      	bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0);
      
      Default output looks like this:
      
      (struct sk_buff){
       .transport_header = (__u16)65535,
       .mac_header = (__u16)65535,
       .end = (sk_buff_data_t)192,
       .head = (unsigned char *)0x000000007524fd8b,
       .data = (unsigned char *)0x000000007524fd8b,
       .truesize = (unsigned int)768,
       .users = (refcount_t){
        .refs = (atomic_t){
         .counter = (int)1,
        },
       },
      }
      
      Flags modifying display are as follows:
      
      - BTF_F_COMPACT:	no formatting around type information
      - BTF_F_NONAME:		no struct/union member names/types
      - BTF_F_PTR_RAW:	show raw (unobfuscated) pointer values;
      			equivalent to %px.
      - BTF_F_ZERO:		show zero-valued struct/union members;
      			they are not displayed by default
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
      c4d0bfb4
    • Alan Maguire's avatar
      bpf: Move to generic BTF show support, apply it to seq files/strings · 31d0bc81
      Alan Maguire authored
      generalize the "seq_show" seq file support in btf.c to support
      a generic show callback of which we support two instances; the
      current seq file show, and a show with snprintf() behaviour which
      instead writes the type data to a supplied string.
      
      Both classes of show function call btf_type_show() with different
      targets; the seq file or the string to be written.  In the string
      case we need to track additional data - length left in string to write
      and length to return that we would have written (a la snprintf).
      
      By default show will display type information, field members and
      their types and values etc, and the information is indented
      based upon structure depth. Zeroed fields are omitted.
      
      Show however supports flags which modify its behaviour:
      
      BTF_SHOW_COMPACT - suppress newline/indent.
      BTF_SHOW_NONAME - suppress show of type and member names.
      BTF_SHOW_PTR_RAW - do not obfuscate pointer values.
      BTF_SHOW_UNSAFE - do not copy data to safe buffer before display.
      BTF_SHOW_ZERO - show zeroed values (by default they are not shown).
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-3-git-send-email-alan.maguire@oracle.com
      31d0bc81
    • Alan Maguire's avatar
      76654e67
    • Andrii Nakryiko's avatar
      libbpf: Add btf__new_empty() to create an empty BTF object · a871b043
      Andrii Nakryiko authored
      Add an ability to create an empty BTF object from scratch. This is going to be
      used by pahole for BTF encoding. And also by selftest for convenient creation
      of BTF objects.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com
      a871b043
    • Andrii Nakryiko's avatar
      libbpf: Allow modification of BTF and add btf__add_str API · 919d2b1d
      Andrii Nakryiko authored
      Allow internal BTF representation to switch from default read-only mode, in
      which raw BTF data is a single non-modifiable block of memory with BTF header,
      types, and strings layed out sequentially and contiguously in memory, into
      a writable representation with types and strings data split out into separate
      memory regions, that can be dynamically expanded.
      
      Such writable internal representation is transparent to users of libbpf APIs,
      but allows to append new types and strings at the end of BTF, which is
      a typical use case when generating BTF programmatically. All the basic
      guarantees of BTF types and strings layout is preserved, i.e., user can get
      `struct btf_type *` pointer and read it directly. Such btf_type pointers might
      be invalidated if BTF is modified, so some care is required in such mixed
      read/write scenarios.
      
      Switch from read-only to writable configuration happens automatically the
      first time when user attempts to modify BTF by either adding a new type or new
      string. It is still possible to get raw BTF data, which is a single piece of
      memory that can be persisted in ELF section or into a file as raw BTF. Such
      raw data memory is also still owned by BTF and will be freed either when BTF
      object is freed or if another modification to BTF happens, as any modification
      invalidates BTF raw representation.
      
      This patch adds the first two BTF manipulation APIs: btf__add_str(), which
      allows to add arbitrary strings to BTF string section, and btf__find_str()
      which allows to find existing string offset, but not add it if it's missing.
      All the added strings are automatically deduplicated. This is achieved by
      maintaining an additional string lookup index for all unique strings. Such
      index is built when BTF is switched to modifiable mode. If at that time BTF
      strings section contained duplicate strings, they are not de-duplicated. This
      is done specifically to not modify the existing content of BTF (types, their
      string offsets, etc), which can cause confusion and is especially important
      property if there is struct btf_ext associated with struct btf. By following
      this "imperfect deduplication" process, btf_ext is kept consitent and correct.
      If deduplication of strings is necessary, it can be forced by doing BTF
      deduplication, at which point all the strings will be eagerly deduplicated and
      all string offsets both in struct btf and struct btf_ext will be updated.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com
      919d2b1d
    • Andrii Nakryiko's avatar
      libbpf: Extract generic string hashing function for reuse · 7d9c71e1
      Andrii Nakryiko authored
      Calculating a hash of zero-terminated string is a common need when using
      hashmap, so extract it for reuse.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-5-andriin@fb.com
      7d9c71e1
    • Andrii Nakryiko's avatar
      libbpf: Generalize common logic for managing dynamically-sized arrays · 192f5a1f
      Andrii Nakryiko authored
      Managing dynamically-sized array is a common, but not trivial functionality,
      which significant amount of logic and code to implement properly. So instead
      of re-implementing it all the time, extract it into a helper function ans
      reuse.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-4-andriin@fb.com
      192f5a1f
    • Andrii Nakryiko's avatar
      libbpf: Remove assumption of single contiguous memory for BTF data · b8604247
      Andrii Nakryiko authored
      Refactor internals of struct btf to remove assumptions that BTF header, type
      data, and string data are layed out contiguously in a memory in a single
      memory allocation. Now we have three separate pointers pointing to the start
      of each respective are: header, types, strings. In the next patches, these
      pointers will be re-assigned to point to independently allocated memory areas,
      if BTF needs to be modified.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-3-andriin@fb.com
      b8604247
    • Andrii Nakryiko's avatar
      libbpf: Refactor internals of BTF type index · 740e69c3
      Andrii Nakryiko authored
      Refactor implementation of internal BTF type index to not use direct pointers.
      Instead it uses offset relative to the start of types data section. This
      allows for types data to be reallocatable, enabling implementation of
      modifiable BTF.
      
      As now getting type by ID has an extra indirection step, convert all internal
      type lookups to a new helper btf_type_id(), that returns non-const pointer to
      a type by its ID.
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/20200926011357.2366158-2-andriin@fb.com
      740e69c3
    • Toke Høiland-Jørgensen's avatar
      selftests: Remove fmod_ret from test_overhead · b000def2
      Toke Høiland-Jørgensen authored
      The test_overhead prog_test included an fmod_ret program that attached to
      __set_task_comm() in the kernel. However, this function was never listed as
      allowed for return modification, so this only worked because of the
      verifier skipping tests when a trampoline already existed for the attach
      point. Now that the verifier checks have been fixed, remove fmod_ret from
      the test so it works again.
      
      Fixes: 4eaf0b5c ("selftest/bpf: Fmod_ret prog and implement test_overhead as part of bench")
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      b000def2
    • Toke Høiland-Jørgensen's avatar
      bpf: verifier: refactor check_attach_btf_id() · f7b12b6f
      Toke Høiland-Jørgensen authored
      The check_attach_btf_id() function really does three things:
      
      1. It performs a bunch of checks on the program to ensure that the
         attachment is valid.
      
      2. It stores a bunch of state about the attachment being requested in
         the verifier environment and struct bpf_prog objects.
      
      3. It allocates a trampoline for the attachment.
      
      This patch splits out (1.) and (3.) into separate functions which will
      perform the checks, but return the computed values instead of directly
      modifying the environment. This is done in preparation for reusing the
      checks when the actual attachment is happening, which will allow tracing
      programs to have multiple (compatible) attachments.
      
      This also fixes a bug where a bunch of checks were skipped if a trampoline
      already existed for the tracing target.
      
      Fixes: 6ba43b76 ("bpf: Attachment verification for BPF_MODIFY_RETURN")
      Fixes: 1e6c62a8 ("bpf: Introduce sleepable BPF programs")
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      f7b12b6f
    • Toke Høiland-Jørgensen's avatar
      bpf: change logging calls from verbose() to bpf_log() and use log pointer · efc68158
      Toke Høiland-Jørgensen authored
      In preparation for moving code around, change a bunch of references to
      env->log (and the verbose() logging helper) to use bpf_log() and a direct
      pointer to struct bpf_verifier_log. While we're touching the function
      signature, mark the 'prog' argument to bpf_check_type_match() as const.
      
      Also enhance the bpf_verifier_log_needed() check to handle NULL pointers
      for the log struct so we can re-use the code with logging disabled.
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      efc68158
    • Toke Høiland-Jørgensen's avatar
      bpf: disallow attaching modify_return tracing functions to other BPF programs · 1af9270e
      Toke Høiland-Jørgensen authored
      From the checks and commit messages for modify_return, it seems it was
      never the intention that it should be possible to attach a tracing program
      with expected_attach_type == BPF_MODIFY_RETURN to another BPF program.
      However, check_attach_modify_return() will only look at the function name,
      so if the target function starts with "security_", the attach will be
      allowed even for bpf2bpf attachment.
      
      Fix this oversight by also blocking the modification if a target program is
      supplied.
      
      Fixes: 18644cec ("bpf: Fix use-after-free in fmod_ret check")
      Fixes: 6ba43b76 ("bpf: Attachment verification for BPF_MODIFY_RETURN")
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      1af9270e
  2. 28 Sep, 2020 3 commits