1. 16 Mar, 2018 2 commits
  2. 15 Mar, 2018 3 commits
    • Merge branch 'bpf-stackmap-build-id' · 68de5ef4
      Daniel Borkmann authored
      Song Liu says:
      
      ====================
      This work follows up on the discussion at Plumbers'17 about improving
      addr->sym resolution of user stack traces. The following links have more
      information about the discussion:
      
      http://www.linuxplumbersconf.org/2017/ocw/proposals/4764
      https://lwn.net/Articles/734453/     Section "Stack traces and kprobes"
      
      Currently, bpf stackmap stores an address for each entry in the call trace.
      To map these addresses to user space files, it is necessary to maintain
      the mapping from these virtual addresses to symbols in the binary. Usually,
      the user space profiler (such as perf) has to scan /proc/pid/maps at the
      beginning of profiling, and monitor mmap2() calls afterwards. Given the
      cost of maintaining the address map, this solution is not practical for
      system-wide profiling that is always on.
      
      This patch tries to address this with a variation of stackmap. Instead
      of storing addresses, the variation stores ELF file build_id + offset.
      After profiling, a user space tool will look up these functions with
      build_id (to find the binary or shared library) and the offset.
      
      I also updated bcc/cc library for the stackmap (no python/lua support yet).
      You can find the work at:
      
        https://github.com/liu-song-6/bcc/commits/bpf_get_stackid_v02
      
      Changes v5 -> v6:
      
      1. When a kernel stack is added to a stackmap with build_id, use the
         fallback mechanism to store the ip (status == BPF_STACK_BUILD_ID_IP).
      
      Changes v4 -> v5:
      
      1. Only allow build_id lookup in non-nmi context. Added comment and
         commit message to highlight this limitation.
      2. Minor fix reported by kbuild test robot.
      
      Changes v3 -> v4:
      
      1. Add a fallback for when build_id lookup fails. In this case, status is
         set to BPF_STACK_BUILD_ID_IP, and the ip of this entry is saved.
      2. Handle cases where the vma is only part of the file (vma->vm_pgoff != 0).
         Thanks to Teng for helping me identify this issue!
      3. Address feedback on previous versions.
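
The vm_pgoff case in item 2 can be illustrated with a small user-space sketch of the offset arithmetic. It is not the kernel code: the struct, the function name, and the 4 KiB page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size of 4 KiB (shift of 12); real systems may differ. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the few vma fields the calculation needs. */
struct sketch_vma {
    uint64_t vm_start;  /* first virtual address of the mapping */
    uint64_t vm_end;    /* one past the last virtual address */
    uint64_t vm_pgoff;  /* where the mapping starts in the file, in pages */
};

/* Translate an instruction pointer inside the mapping to a file offset. */
static uint64_t sketch_file_offset(const struct sketch_vma *vma, uint64_t ip)
{
    assert(vma != NULL && ip >= vma->vm_start && ip < vma->vm_end);
    /* A mapping that starts partway into the file has vm_pgoff != 0. */
    return (vma->vm_pgoff << SKETCH_PAGE_SHIFT) + (ip - vma->vm_start);
}
```

If vm_pgoff were ignored, every address in a mapping that starts partway into the file would resolve to an offset that is too small by vm_pgoff pages.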
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID · 81f77fd0
      Song Liu authored
      test_stacktrace_build_id() is added. It triggers the urandom_read
      tracepoint by running "dd" and "urandom_read", and gathers stack traces.
      Then it reads the stack traces from the stackmap.
      
      urandom_read is a statically linked binary that reads from /dev/urandom.
      test_stacktrace_build_id() calls readelf to read the build ID of
      urandom_read and compares it with the build ID from the stackmap.
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: extend stackmap to save binary_build_id+offset instead of address · 615755a7
      Song Liu authored
      Currently, bpf stackmap stores an address for each entry in the call trace.
      To map these addresses to user space files, it is necessary to maintain
      the mapping from these virtual addresses to symbols in the binary. Usually,
      the user space profiler (such as perf) has to scan /proc/pid/maps at the
      beginning of profiling, and monitor mmap2() calls afterwards. Given the
      cost of maintaining the address map, this solution is not practical for
      system-wide profiling that is always on.
      
      This patch tries to solve this problem with a variation of stackmap. This
      variation is enabled by flag BPF_F_STACK_BUILD_ID. Instead of storing
      addresses, the variation stores ELF file build_id + offset.
      
      Build ID is a 20-byte unique identifier for ELF files. The following
      command shows the Build ID of /bin/bash:
      
        [user@]$ readelf -n /bin/bash
        ...
          Build ID: XXXXXXXXXX
        ...
      
      With BPF_F_STACK_BUILD_ID, bpf_get_stackid() tries to parse Build ID
      for each entry in the call trace, and translate it into the following
      struct:
      
        struct bpf_stack_build_id_offset {
                __s32           status;
                unsigned char   build_id[BPF_BUILD_ID_SIZE];
                union {
                        __u64   offset;
                        __u64   ip;
                };
        };
      
      The search for build_id is limited to the first page of the file, and this
      page must be in the page cache. Otherwise, we fall back to storing the ip
      for this entry (the ip field in struct bpf_stack_build_id_offset). This
      requires the build_id to be stored in the first page. A quick survey of
      binaries and dynamic libraries on a few different systems shows that
      almost all of them have the build_id in the first page.
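
For readers unfamiliar with how a build ID is stored, the following is a minimal user-space sketch of scanning the contents of an ELF note segment for a GNU build ID note. It is not the kernel implementation: all names are hypothetical, the buffer is assumed to already hold the note data in native byte order, and only 20-byte IDs are accepted, to match the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUILD_ID_SIZE 20   /* the 20-byte Build ID described above */
#define SKETCH_NT_GNU_BUILD_ID 3  /* ELF note type of a GNU build ID */

/* ELF note header: three 32-bit words, followed by name and desc. */
struct sketch_nhdr {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

/* Round up to the 4-byte alignment used for the name and desc fields. */
static size_t sketch_align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/*
 * Scan a buffer holding note data. Returns 0 and fills out[] when a
 * 20-byte GNU build ID note is found, -1 otherwise.
 */
static int sketch_find_build_id(const unsigned char *notes, size_t len,
                                unsigned char out[SKETCH_BUILD_ID_SIZE])
{
    size_t pos = 0;

    assert(notes != NULL && out != NULL);
    while (len - pos >= sizeof(struct sketch_nhdr)) {
        struct sketch_nhdr nhdr;
        size_t name_off, desc_off, next;

        memcpy(&nhdr, notes + pos, sizeof(nhdr)); /* avoid unaligned reads */
        name_off = pos + sizeof(nhdr);
        desc_off = name_off + sketch_align4(nhdr.n_namesz);
        next = desc_off + sketch_align4(nhdr.n_descsz);
        if (next > len)
            return -1; /* truncated or malformed note */

        if (nhdr.n_type == SKETCH_NT_GNU_BUILD_ID &&
            nhdr.n_namesz == sizeof("GNU") &&
            nhdr.n_descsz == SKETCH_BUILD_ID_SIZE &&
            memcmp(notes + name_off, "GNU", sizeof("GNU")) == 0) {
            memcpy(out, notes + desc_off, SKETCH_BUILD_ID_SIZE);
            return 0;
        }
        pos = next; /* move on to the next note */
    }
    return -1;
}
```

Bounding the scan to a buffer of known length mirrors the first-page restriction: if the note lies outside the buffer, the lookup simply fails and the caller needs a fallback.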
      
      Build_id is only meaningful for user stacks. If a kernel stack is added to
      a stackmap with BPF_F_STACK_BUILD_ID, it will automatically fall back to
      storing only the ip (status == BPF_STACK_BUILD_ID_IP). Similarly, if
      build_id lookup fails for some reason, it will also fall back to storing
      the ip.
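
A user-space consumer has to check status before deciding which union member to read. The sketch below shows one way to do that. The struct is a local mirror of the layout quoted above, the numeric status values are assumed to match the kernel's uapi header (empty = 0, valid = 1, ip = 2), and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_BUILD_ID_SIZE 20

/* Assumed to mirror the kernel's status values; verify against the uapi header. */
enum sketch_status {
    SKETCH_BUILD_ID_EMPTY = 0, /* unused slot in the trace */
    SKETCH_BUILD_ID_VALID = 1, /* build_id and offset are set */
    SKETCH_BUILD_ID_IP = 2,    /* fallback: only ip is set */
};

/* Local mirror of the entry layout shown in the commit message. */
struct sketch_entry {
    int32_t status;
    unsigned char build_id[SKETCH_BUILD_ID_SIZE];
    union {
        uint64_t offset;
        uint64_t ip;
    };
};

/* Render one entry as text; returns what snprintf returns. */
static int sketch_describe(const struct sketch_entry *e, char *buf, size_t len)
{
    char hex[2 * SKETCH_BUILD_ID_SIZE + 1];
    int i;

    assert(e != NULL && buf != NULL);
    switch (e->status) {
    case SKETCH_BUILD_ID_VALID:
        for (i = 0; i < SKETCH_BUILD_ID_SIZE; i++)
            snprintf(hex + 2 * i, 3, "%02x", e->build_id[i]);
        return snprintf(buf, len, "%s+0x%llx", hex,
                        (unsigned long long)e->offset);
    case SKETCH_BUILD_ID_IP:
        return snprintf(buf, len, "ip 0x%llx", (unsigned long long)e->ip);
    default:
        return snprintf(buf, len, "(empty)");
    }
}
```

Entries in the fallback state still carry a raw address, so a tool can resolve them with a conventional address map, or report them unresolved.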
      
      User space can access struct bpf_stack_build_id_offset with the bpf
      syscall command BPF_MAP_LOOKUP_ELEM. It is necessary for user space to
      maintain a mapping from build id to binary files. This mostly static
      mapping is much easier to maintain than per-process address maps.
      
      Note: Stackmap with build_id only works in non-nmi context at this time.
      This is because we need to take mm->mmap_sem for find_vma(). If this
      changes, we would like to allow build_id lookup in nmi context.
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  3. 09 Mar, 2018 9 commits
  4. 08 Mar, 2018 3 commits
    • Merge branch 'bpf-perf-sample-addr' · 12ef9bda
      Daniel Borkmann authored
      Teng Qin says:
      
      ====================
      These patches add support that allows bpf programs attached to perf events to
      read the address values recorded with the perf events. These values are
      requested by specifying sample_type with PERF_SAMPLE_ADDR when calling
      perf_event_open().
      
      The main motivation for these changes is to support building memory or lock
      access profiling and tracing tools. For example on Intel CPUs, the recorded
      address values for supported memory or lock access perf events would be
      the access or lock target addresses from PEBS buffer. Such information would
      be very valuable for building tools that help understand memory access or
      lock acquisition patterns.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • samples/bpf: add example to test reading address · 12fe1225
      Teng Qin authored
      This commit adds an additional test to the trace_event example. It
      attaches the bpf program to the MEM_UOPS_RETIRED.LOCK_LOADS event with
      PERF_SAMPLE_ADDR requested, and prints the lock address value read by
      the bpf program to trace_pipe.
      Signed-off-by: Teng Qin <qinteng@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: add support to read sample address in bpf program · 95da0cdb
      Teng Qin authored
      This commit adds a new field "addr" to bpf_perf_event_data, which can be
      read and used by bpf programs attached to perf events. The value of the
      field is copied from bpf_perf_event_data_kern.addr and contains the
      address value recorded when sample_type includes PERF_SAMPLE_ADDR in the
      call to perf_event_open().
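
On the user-space side, requesting the address amounts to setting PERF_SAMPLE_ADDR in the attribute passed to perf_event_open(). The sketch below only fills in the attribute. The helper name is hypothetical, the raw event code is supplied by the caller rather than assumed, and it relies on the Linux header linux/perf_event.h being available.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <string.h>

/*
 * Prepare a frequency-based raw event that asks the kernel to record
 * the sampled address. Hypothetical helper, not part of any library.
 */
static void sketch_init_addr_attr(struct perf_event_attr *attr,
                                  unsigned long long raw_config,
                                  unsigned long long freq)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_RAW;    /* CPU-specific raw event code */
    attr->config = raw_config;     /* caller-supplied event encoding */
    attr->freq = 1;                /* interpret sample_freq as a frequency */
    attr->sample_freq = freq;
    attr->sample_type = PERF_SAMPLE_ADDR; /* record the address value */
    /*
     * Events that report addresses through PEBS may also need
     * attr->precise_ip set; that choice is left to the caller.
     */
}
```

The filled attribute would then be passed to perf_event_open() and a bpf program attached to the resulting file descriptor, where it can read the recorded value from the addr field.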
      Signed-off-by: Teng Qin <qinteng@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  5. 07 Mar, 2018 23 commits