    bpf: extend stackmap to save binary_build_id+offset instead of address · 615755a7
    Song Liu authored
    Currently, the bpf stackmap stores an address for each entry in the call
    trace. To map these addresses to user space files, it is necessary to
    maintain the mapping from these virtual addresses to symbols in the
    binary. Usually, the user space profiler (such as perf) has to scan
    /proc/pid/maps at the beginning of profiling, and monitor mmap2() calls
    afterwards. Given the cost of maintaining the address map, this solution
    is not practical for system wide profiling that is always on.
    
    This patch tries to solve this problem with a variation of stackmap. This
    variation is enabled by flag BPF_F_STACK_BUILD_ID. Instead of storing
    addresses, the variation stores ELF file build_id + offset.
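    
    As a rough illustration of what "offset" means here, the sketch below
    translates an instruction pointer into an offset within the mapped file.
    struct mapping, ip_to_file_offset() and SKETCH_PAGE_SHIFT are
    hypothetical names used only for this example, and 4 KiB pages are
    assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical view of one file-backed mapping (cf. a /proc/pid/maps line). */
struct mapping {
	uint64_t vm_start; /* first virtual address of the mapping */
	uint64_t vm_end;   /* one past the last virtual address */
	uint64_t pgoff;    /* file offset of vm_start, counted in pages */
};

/* Translate an ip that lies inside the mapping into a file offset. */
uint64_t ip_to_file_offset(const struct mapping *m, uint64_t ip)
{
	return (m->pgoff << SKETCH_PAGE_SHIFT) + (ip - m->vm_start);
}
```

    Because the result depends only on the file and not on where it was
    mapped, the same build_id + offset pair identifies the same instruction
    in every process that maps the file.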
    
    Build ID is a 20-byte unique identifier for ELF files. The following
    command shows the Build ID of /bin/bash:
    
      [user@]$ readelf -n /bin/bash
      ...
        Build ID: XXXXXXXXXX
      ...
    
    With BPF_F_STACK_BUILD_ID, bpf_get_stackid() tries to find the Build ID
    for each entry in the call trace, and translates the entry into the
    following struct:
    
      struct bpf_stack_build_id_offset {
              __s32           status;
              unsigned char   build_id[BPF_BUILD_ID_SIZE];
              union {
                      __u64   offset;
                      __u64   ip;
              };
      };
    
    The search for build_id is limited to the first page of the file, and
    this page should be in the page cache. Otherwise, we fall back to storing
    ip for this entry (ip field in struct bpf_stack_build_id_offset). This
    requires the build_id to be stored in the first page. A quick survey of
    binary and dynamic library files on a few different systems shows that
    almost all of them have build_id in the first page.
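    
    The first-page search can be pictured with the user space sketch below.
    It is not the kernel implementation: it handles only 64-bit ELF files,
    assumes host byte order, and build_id_from_first_page(), scan_notes()
    and SKETCH_BUILD_ID_SIZE are hypothetical names. It walks the program
    headers, and for each PT_NOTE segment that lies fully inside the buffer
    it looks for a "GNU" note of type NT_GNU_BUILD_ID.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUILD_ID_SIZE 20 /* assumed to match BPF_BUILD_ID_SIZE */

/* Scan the notes in [start, start + size) for a GNU build ID. */
static int scan_notes(const unsigned char *page, size_t page_len,
		      uint64_t start, uint64_t size,
		      unsigned char out[SKETCH_BUILD_ID_SIZE])
{
	if (start > page_len || size > page_len - start)
		return -1; /* segment is not fully inside the first page */

	uint64_t off = start, end = start + size;
	while (end - off >= sizeof(Elf64_Nhdr)) {
		Elf64_Nhdr nh;
		memcpy(&nh, page + off, sizeof(nh));
		uint64_t name_off = off + sizeof(nh);
		/* name and descriptor are each padded to 4 bytes */
		uint64_t name_sz = ((uint64_t)nh.n_namesz + 3) & ~(uint64_t)3;
		uint64_t desc_sz = ((uint64_t)nh.n_descsz + 3) & ~(uint64_t)3;
		if (name_sz > end - name_off ||
		    desc_sz > end - name_off - name_sz)
			return -1; /* truncated or malformed note */
		uint64_t desc_off = name_off + name_sz;

		if (nh.n_type == NT_GNU_BUILD_ID && nh.n_namesz == 4 &&
		    memcmp(page + name_off, "GNU", 4) == 0 &&
		    nh.n_descsz > 0 && nh.n_descsz <= SKETCH_BUILD_ID_SIZE) {
			memset(out, 0, SKETCH_BUILD_ID_SIZE);
			memcpy(out, page + desc_off, nh.n_descsz);
			return 0;
		}
		off = desc_off + desc_sz;
	}
	return -1;
}

/* Return 0 and fill out[] if a build ID is found in the first page. */
int build_id_from_first_page(const unsigned char *page, size_t page_len,
			     unsigned char out[SKETCH_BUILD_ID_SIZE])
{
	Elf64_Ehdr eh;
	if (page_len < sizeof(eh))
		return -1;
	memcpy(&eh, page, sizeof(eh));
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_phentsize != sizeof(Elf64_Phdr) || eh.e_phoff > page_len)
		return -1;

	for (unsigned int i = 0; i < eh.e_phnum; i++) {
		uint64_t ph_off = eh.e_phoff + (uint64_t)i * sizeof(Elf64_Phdr);
		if (ph_off > page_len || sizeof(Elf64_Phdr) > page_len - ph_off)
			return -1; /* program header outside the first page */
		Elf64_Phdr ph;
		memcpy(&ph, page + ph_off, sizeof(ph));
		if (ph.p_type == PT_NOTE &&
		    scan_notes(page, page_len, ph.p_offset, ph.p_filesz, out) == 0)
			return 0;
	}
	return -1;
}
```

    A caller would read the first 4096 bytes of a file into a buffer and
    pass it in; a return value of -1 corresponds to the case where the
    stackmap entry keeps the raw ip.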
    
    Build_id is only meaningful for user stacks. If a kernel stack is added
    to a stackmap with BPF_F_STACK_BUILD_ID, it will automatically fall back
    to storing only ip (status == BPF_STACK_BUILD_ID_IP). Similarly, if the
    build_id lookup fails for some reason, it will also fall back to storing
    ip.
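    
    A consumer therefore has to check status before reading the union. The
    sketch below shows one way to do that. struct stack_entry is a local
    copy of the layout described above, format_entry() is a hypothetical
    helper, and the numeric status values are assumptions (only the name
    BPF_STACK_BUILD_ID_IP appears in this message); real code should take
    the definitions from the uapi header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_BUILD_ID_SIZE 20 /* assumed to match BPF_BUILD_ID_SIZE */

/* Assumed status values, defined locally for illustration only. */
enum {
	SKETCH_STATUS_EMPTY = 0, /* unused slot */
	SKETCH_STATUS_VALID = 1, /* build_id and offset are valid */
	SKETCH_STATUS_IP = 2,    /* lookup failed or kernel stack: ip only */
};

/* Local copy of the entry layout described in this message. */
struct stack_entry {
	int32_t status;
	unsigned char build_id[SKETCH_BUILD_ID_SIZE];
	union {
		uint64_t offset;
		uint64_t ip;
	};
};

/* Render one entry as "<hex build id>+0x<offset>" or "ip 0x<address>". */
int format_entry(const struct stack_entry *e, char *buf, size_t len)
{
	if (e->status == SKETCH_STATUS_VALID) {
		char hex[2 * SKETCH_BUILD_ID_SIZE + 1];
		for (int i = 0; i < SKETCH_BUILD_ID_SIZE; i++)
			snprintf(hex + 2 * i, 3, "%02x",
				 (unsigned int)e->build_id[i]);
		return snprintf(buf, len, "%s+0x%llx", hex,
				(unsigned long long)e->offset);
	}
	if (e->status == SKETCH_STATUS_IP)
		return snprintf(buf, len, "ip 0x%llx",
				(unsigned long long)e->ip);
	return snprintf(buf, len, "(empty)");
}
```

    Entries in the fallback state still carry a usable address, so a
    profiler can resolve kernel frames through its usual symbol tables.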
    
    User space can access struct bpf_stack_build_id_offset with the bpf
    syscall command BPF_MAP_LOOKUP_ELEM. It is necessary for user space to
    maintain a mapping from build id to binary files. This mostly static
    mapping is much easier to maintain than per-process address maps.
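    
    A minimal sketch of such a lookup through the raw bpf syscall follows.
    stackmap_lookup() is a hypothetical wrapper; it assumes map_fd refers to
    a stackmap created with BPF_F_STACK_BUILD_ID and that value points to a
    buffer large enough for the map's value size (one entry per stack
    frame).

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Fetch the trace stored under stack_id into value.
 * Returns 0 on success, -1 on failure with errno set.
 */
int stackmap_lookup(int map_fd, uint32_t stack_id, void *value)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_fd = map_fd;
	attr.key = (uint64_t)(uintptr_t)&stack_id;
	attr.value = (uint64_t)(uintptr_t)value;

	return (int)syscall(__NR_bpf, BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}
```

    The stack_id passed in would be the value returned by bpf_get_stackid()
    and handed to user space by the BPF program.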
    
    Note: Stackmap with build_id only works in non-nmi context at this time.
    This is because we need to take mm->mmap_sem for find_vma(). If this
    changes, we would like to allow build_id lookup in nmi context.
    
    Signed-off-by: Song Liu <songliubraving@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
stackmap.c 13.5 KB