1. 10 Jun, 2019 8 commits
    • perf cs-etm: Add handling of itrace start events · a465f3c3
      Mathieu Poirier authored
      Add handling of ITRACE events in order to add the tid/pid of the
      executing process to the perf tools machine infrastructure.  This
      information is later retrieved when a contextID packet is found in the
      trace stream.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190524173508.29044-5-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Configure SWITCH_EVENTS in CPU-wide mode · e5993c42
      Mathieu Poirier authored
      Ask the perf core to generate an event when processes are swapped in/out
      of context.  That way proper action can be taken by the decoding code
      when faced with such an event.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190524173508.29044-4-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Configure timestamp generation in CPU-wide mode · 1c839a5a
      Mathieu Poirier authored
      When operating in CPU-wide mode, tracers need to generate timestamps in
      order to correlate the code being traced on one CPU with what is executed
      on other CPUs.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190524173508.29044-3-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Configure contextID tracing in CPU-wide mode · 3399ad9a
      Mathieu Poirier authored
      When operating in CPU-wide mode, being notified of contextID changes is
      required so that the decoding mechanism is aware of the process context
      switch.
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190524173508.29044-2-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Remove superfluous nthreads system_wide setup in alloc_fd() · 10981c80
      Jiri Olsa authored
      It's already set up in perf_evsel__open(), the only caller of this
      method, right before calling perf_evsel__alloc_fd(), so there is no
      need to do it again.
      
      Also it's better to have it out of the function before we move it to
      libperf.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-1k8lhyjxfk7o8v4g3r7eyjc9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Add support to collect callchains from kernel or user space only · 53651b28
      yuzhoujian authored
      One can record callchains from just the kernel or just user space with
      these new options.
      
      They can be used together with the "--all-kernel" option.
      
      These two options are used just like print_stack(sys) or
      print_ustack(usr) in systemtap.
      
      Shown below is the usage of these new options combined with the
      "--all-kernel" option:
      
      1. Configure all used events to run in kernel space and just collect
         kernel callchains.
      
        $ perf record -a -g --all-kernel --kernel-callchains
      
      2. Configure all used events to run in kernel space and just collect
         user callchains.
      
        $ perf record -a -g --all-kernel --user-callchains
      
      Committer notes:
      
      Improved documentation to state that asking for kernel callchains really
      is asking for excluding user callchains, and vice versa.
      
      Further mentioned that using both won't get both, but nothing, as both
      will be excluded.
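      
      The semantics in the committer notes can be sketched as follows.  This
      is a minimal illustration, not the patch itself: the two structs are
      hypothetical stand-ins for the record options and for per-side
      "exclude" flags on the event attribute.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the two new command line options. */
struct callchain_opts {
	bool kernel_callchains;	/* --kernel-callchains */
	bool user_callchains;	/* --user-callchains */
};

/* Hypothetical mirror of per-side exclude flags on an event attribute. */
struct callchain_attr {
	bool exclude_callchain_kernel;
	bool exclude_callchain_user;
};

/* Asking for one side is expressed as excluding the other side. */
struct callchain_attr callchain_excludes(struct callchain_opts o)
{
	struct callchain_attr a = { false, false };

	if (o.kernel_callchains)
		a.exclude_callchain_user = true;
	if (o.user_callchains)
		a.exclude_callchain_kernel = true;
	return a;
}

/* Passing both options excludes both sides, so nothing is collected. */
void example_callchain_excludes(void)
{
	struct callchain_opts both = { true, true };
	struct callchain_attr a = callchain_excludes(both);

	assert(a.exclude_callchain_kernel && a.exclude_callchain_user);
}
```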
      Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf config: Bail out when a handler returns failure for a key-value pair · 22d46219
      Arnaldo Carvalho de Melo authored
      So perf_config() uses:
      
        int ret = 0;
      
        perf_config_set__for_each_entry(config_set, section, item) {
                ...
                ret = fn();
                if (ret < 0)
                        break;
        }
      
        return ret;
      
      Expecting that the break will immediately go to the function exit to
      return that error value (ret).
      
      The problem is that perf_config_set__for_each_entry() expands into two
      nested for() loops, one traversing the sections in a config and the
      second the items in each of those sections, so we have to change that
      'break' to a goto label right before that final 'return ret'.
      
      With that, for instance, 'perf trace' now correctly bails out when an
      event that is requested to be added via its 'trace.add_events'
      ~/.perfconfig entry gets rejected by the kernel BPF verifier:
      
        # perf trace ls
        event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o'
                             \___ Kernel verifier blocks program loading
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
        Error: wrong config key-value pair trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        #
      
      While before it would continue and explode later, when trying to find
      maps that would have been in place had that augmented_raw_syscalls.o
      precompiled BPF proggie been accepted by the, humm, bast... rigorous
      kernel BPF verifier 8-)
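      
      The difference between the 'break' and the goto can be reproduced in
      isolation.  The sketch below is an assumption-laden miniature: the
      for_each_entry() macro and handler() are hypothetical stand-ins that
      only share the "two nested for() loops" shape described above.

```c
#include <assert.h>

/* Hypothetical stand-in for perf_config_set__for_each_entry(): like the
 * macro described above, it expands into two nested for() loops. */
#define for_each_entry(s, i, nsec, nitem) \
	for ((s) = 0; (s) < (nsec); (s)++) \
		for ((i) = 0; (i) < (nitem); (i)++)

/* Hypothetical handler: fails only for section 0, item 1. */
static int handler(int section, int item)
{
	return (section == 0 && item == 1) ? -1 : 0;
}

/* 'break' leaves only the inner loop, so the walk goes on with the next
 * section and a later successful call overwrites the error in ret. */
int walk_with_break(int *calls)
{
	int s, i, ret = 0;

	*calls = 0;
	for_each_entry(s, i, 2, 3) {
		(*calls)++;
		ret = handler(s, i);
		if (ret < 0)
			break;
	}
	return ret;
}

/* Jumping to a label right before the return leaves both loops. */
int walk_with_goto(int *calls)
{
	int s, i, ret = 0;

	*calls = 0;
	for_each_entry(s, i, 2, 3) {
		(*calls)++;
		ret = handler(s, i);
		if (ret < 0)
			goto out;
	}
out:
	return ret;
}

void example_nested_break(void)
{
	int calls;

	assert(walk_with_break(&calls) == 0 && calls == 5);  /* error lost */
	assert(walk_with_goto(&calls) == -1 && calls == 2);  /* error kept */
}
```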
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Yonghong Song <yhs@fb.com>
      Fixes: 8a0a9c7e ("perf config: Introduce new init() and exit()")
      Link: https://lkml.kernel.org/n/tip-qvqxfk9d0rn1l7lcntwiezrr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Exit when failing to build eBPF program · 012749ca
      Leo Yan authored
      On my Juno board with ARM64 CPUs, the perf trace command reports the
      eBPF program build failure, but the command does not exit and continues
      to run.  If we define an eBPF event in the config file, the event will
      be parsed with the flow below:
      
        perf_config()
          `> trace__config()
               `> parse_events_option()
                    `> parse_events__scanner()
                         `-> parse_events_parse()
                               `> parse_events_load_bpf()
                                    `> llvm__compile_bpf()
      
      Though the low level functions return error values when they detect an
      eBPF build failure, parse_events_option() returns 1 for this case and
      trace__config() passes 1 to perf_config(); perf_config() doesn't treat
      the returned value 1 as a failure and continues to parse other
      configurations.  Thus the perf command continues to run even without
      having enabled the eBPF event successfully.
      
      This patch changes the error handling in trace__config(): when it
      detects a failure it returns -1 rather than directly passing on the
      error value (1); perf_config() will then bail out and perf will exit in
      this case.
      
      Committer notes:
      
      Simplified the patch to just check directly the return of
      parse_events_option() and, if it is non-zero, change err from its
      initial zero value to -1.
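      
      A minimal sketch of why the sign of the return value matters, assuming
      a caller that only treats negative values as failure.  All function
      names here are hypothetical stand-ins, not the perf functions
      themselves.

```c
#include <assert.h>

/* Hypothetical stand-in for the option parser: returns 1 on failure. */
static int parse_option(int should_fail)
{
	return should_fail ? 1 : 0;
}

/* Before: the parser's positive error value is passed through as is. */
int config_handler_old(int should_fail)
{
	return parse_option(should_fail);
}

/* After: any non-zero result changes err from its initial 0 to -1. */
int config_handler_new(int should_fail)
{
	int err = 0;

	if (parse_option(should_fail))
		err = -1;
	return err;
}

/* Caller that, like the config loop described above, bails out only on
 * negative values.  Returns -1 if it bailed out, 0 if it kept going. */
int run_config(int (*fn)(int), int should_fail)
{
	int ret = fn(should_fail);

	return ret < 0 ? -1 : 0;
}

void example_config_return(void)
{
	assert(run_config(config_handler_old, 1) == 0);   /* failure ignored */
	assert(run_config(config_handler_new, 1) == -1);  /* failure seen */
}
```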
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Fixes: ac96287c ("perf trace: Allow specifying a set of events to add in perfconfig")
      Link: https://lkml.kernel.org/n/tip-x4i63f5kscykfok0hqim3zma@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 05 Jun, 2019 32 commits
    • perf trace: Associate more argument names with the filename beautifier · dea87bfb
      Arnaldo Carvalho de Melo authored
      For instance, the rename* family uses "oldname", "newname", so check if
      "name" is at the end and treat it as a filename.
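      
      The suffix rule described above could look roughly like this.  It is a
      sketch only: the function name is hypothetical, and the real code may
      well match on other argument names too.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when the argument name ends in "name",
 * e.g. "oldname", "newname" or "filename". */
bool arg_name_looks_like_filename(const char *arg)
{
	static const char suffix[] = "name";
	size_t len = strlen(arg);
	size_t slen = sizeof(suffix) - 1;

	return len >= slen && strcmp(arg + len - slen, suffix) == 0;
}

void example_arg_name(void)
{
	assert(arg_name_looks_like_filename("oldname"));
	assert(arg_name_looks_like_filename("newname"));
	assert(!arg_name_looks_like_filename("flags"));
}
```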
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-wjy7j4bk06g7atzwoz1mid24@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Consume the augmented_raw_syscalls payload · 8195168e
      Arnaldo Carvalho de Melo authored
      To support the SCA_FILENAME beautifier in more than one syscall arg, as
      needed for syscalls such as the rename* family, we need to, after
      processing one such arg, bump the augmented pointers so that the next
      augmented arg doesn't reuse data from the previous augmented arguments.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-4e4cmzyjxb3wkonfo1x9a27y@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jvmti: Address gcc string overflow warning for strncpy() · 279ab04d
      Jiri Olsa authored
      We are getting a false positive gcc warning when we compile with gcc9 (9.1.1):
      
           CC       jvmti/libjvmti.o
         In file included from /usr/include/string.h:494,
                          from jvmti/libjvmti.c:5:
         In function ‘strncpy’,
             inlined from ‘copy_class_filename.constprop’ at jvmti/libjvmti.c:166:3:
         /usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
           106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
               |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         jvmti/libjvmti.c: In function ‘copy_class_filename.constprop’:
         jvmti/libjvmti.c:165:26: note: length computed here
           165 |   size_t file_name_len = strlen(file_name);
               |                          ^~~~~~~~~~~~~~~~~
         cc1: all warnings being treated as errors
      
      As per Arnaldo's suggestion use strlcpy(), which does the same thing and keeps
      gcc silent.
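      
      For readers unfamiliar with strlcpy() semantics, here is a small
      sketch.  sketch_strlcpy() is a hypothetical local equivalent written
      for illustration (strlcpy() is not available in every libc); it is not
      the implementation used by the perf tools.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical local equivalent of strlcpy(): copies at most size - 1
 * bytes, always NUL-terminates when size > 0, and returns strlen(src)
 * so the caller can detect truncation. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len >= size ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

void example_strlcpy(void)
{
	char buf[8];
	size_t ret = sketch_strlcpy(buf, "java/lang/String", sizeof(buf));

	assert(ret == 16);                   /* full source length */
	assert(strcmp(buf, "java/la") == 0); /* truncated, terminated */
}
```

      The bound passed in is the destination size rather than a value
      derived from the source length, which is the pattern the warning
      above complains about.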
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20190531131321.GB1281@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf augmented_raw_syscalls: Move reading filename to the loop · 602bce09
      Arnaldo Carvalho de Melo authored
      Almost there, next step is to copy more than one filename payload.
      
      To read syscall arg structs and the like, we'll probably need just a
      variation of this that decides whether to use probe_read_str() or plain
      probe_read() for fixed-size data such as structs.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-uf6u0pld6xe4xuo16f04owlz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf augmented_raw_syscalls: Change helper to consider just the augmented_filename part · deaf4da4
      Arnaldo Carvalho de Melo authored
      So that we can use it for multiple args, taking baby steps so as not to
      step on the verifier's toes.
      
      In the process make sure we handle -EFAULT from bpf_probe_read_str(), as
      this really is needed now that we'll handle more than one augmented
      argument, i.e. if there is a failure, then the argument that fails will
      have:
      
        (size = 0, err = -EFAULT, value = [] )
      
      followed by the next one, let's say one that worked for a second pathname:
      
        (size = 4, err = 0, value = "/tmp" )
      
      So we can skip the first while telling the user about the problem and
      then process the second.
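      
      The "skip the failed one, then process the next" walk can be sketched
      as below.  The record layout (a small header followed by 'size' bytes
      of value) and every name in this block are assumptions made for the
      illustration; the actual payload format is not spelled out here.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record header; 'size' bytes of value follow it. */
struct aug_hdr {
	int size;
	int err;
};

/* Append one record at p and return the next write position. */
static char *put_arg(char *p, int err, const char *value)
{
	struct aug_hdr h = { .size = err ? 0 : (int)strlen(value), .err = err };

	memcpy(p, &h, sizeof(h));
	memcpy(p + sizeof(h), value, (size_t)h.size);
	return p + sizeof(h) + h.size;
}

/* Walk all records in [p, end): count failed ones, print good ones, and
 * bump the pointer past each record so the next one uses its own data.
 * Returns the number of good records. */
int consume_args(const char *p, const char *end, int *failed)
{
	int ok = 0;

	*failed = 0;
	while ((size_t)(end - p) >= sizeof(struct aug_hdr)) {
		struct aug_hdr h;

		memcpy(&h, p, sizeof(h));
		p += sizeof(h);
		if (h.size < 0 || h.size > end - p)
			break;			/* malformed record */
		if (h.err) {
			(*failed)++;		/* report it and move on */
		} else {
			printf("%.*s\n", h.size, p);
			ok++;
		}
		p += h.size;
	}
	return ok;
}

/* Builds the two records from the text above: a failed one, then "/tmp". */
int example_consume(int *failed)
{
	char buf[64], *w = buf;

	w = put_arg(w, -EFAULT, "");
	w = put_arg(w, 0, "/tmp");
	return consume_args(buf, w, failed);
}
```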
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-deyvqi39um6gp6hux6jovos8@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf augmented_raw_syscalls: Move the probe_read_str to a separate function · 0c95a7ff
      Arnaldo Carvalho de Melo authored
      One more step into copying multiple filenames to support syscalls like
      rename*.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-xdqtjexdyp81oomm1rkzeifl@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf augmented_raw_syscalls: Tell which args are filenames and how many bytes to copy · 4cae8675
      Arnaldo Carvalho de Melo authored
      Since we know what args are strings from reading the syscall
      descriptions in tracefs and also already mark such args to be beautified
      using the syscall_arg__scnprintf_filename() helper, all we need is to
      fill in this info in the 'syscalls' BPF map we were using to state which
      syscalls the user is interested in, i.e. the syscall filter.
      
      Right now just set that with PATH_MAX and unroll the syscall arg in the
      BPF program, as the verifier isn't liking something clang generates when
      unrolling the loop.
      
      This also makes the augmented_raw_syscalls.c program support all arches:
      since we removed that set of defines with the hard coded syscall
      numbers, everything should be set automatically for all arches, with
      the syscall id mapping done correctly.
      
      Doing baby steps here, i.e. just the first string arg for a syscall is
      printed; syscalls with more than one, say, the various rename* syscalls,
      need further work, but let's first get something that the BPF verifier
      accepts before increasing the complexity.
      
      To test it, something like:
      
       # perf trace -e string -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
      
      With:
      
        # cat ~/.perfconfig
        [llvm]
      	dump-obj = true
      	clang-opt = -g
        [trace]
      	#add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
      	show_zeros = yes
      	show_duration = no
      	no_inherit = yes
      	show_timestamp = no
      	show_arg_names = no
      	args_alignment = 40
      	show_prefix = yes
        #
      
      That commented add_events line is needed for developing this
      augmented_raw_syscalls.c BPF program, as if we add it via the
      'add_events' mechanism so as to shorten the 'perf trace' command lines,
      then we end up not setting up the -v option which precludes us having
      access to the bpf verifier log :-\
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/n/tip-dn863ya0cbsqycxuy0olvbt1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated · 80b3fb64
      Adrian Hunter authored
      The user probably wants to replace the find text, so select the find
      text when the find bar is activated.
      
      That is fairly standard behaviour for search text entry.
      
      Entering text will replace the current text, but using edit keys
      (arrows, home, end etc) cancels the selection and enables editing.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-23-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree · b3b66079
      Adrian Hunter authored
      Enhance the call tree to display IPC information if it is available.
      
      Committer testing:
      
      [acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db
      
      Reports -> Call Tree, then expand a few trees, then select with the
      mouse and press control+C (copy):
      
      Call Path                   Object        Call Time Time  Time(%) Insn  Insn   Cyc   Cyc   IPC Branch Branch
      ▼ simple-retpolin                                   (ns)          Cnt   Cnt(%) Cnt   Cnt(%)     Count Count(%)
        ▼ 23003:23003
          ▼ _start                ld-2.28.so    112195670 218295 100.0 127746 100.0 207320 100.0 0.62 13046 100.0
             unknown             unknown       112195987   3202   1.5      0   0.0      0   0.0    0     1   0.0
             _dl_start           ld-2.28.so    112199189 188471  86.3 123394  96.6 180007  86.8 0.69 12529  96.0
            ▼ _dl_init            ld-2.28.so    112387660  13406   6.1   3207   2.5  14868   7.2 0.22   327   2.5
               call_init.part.0  ld-2.28.so    112387773    117   0.9     70   2.2    639   4.3 0.11     3   0.9
               call_init.part.0  ld-2.28.so    112387890  13129  97.9   3103  96.8  14100  94.8 0.22   315  96.3
               call_init.part.0  ld-2.28.so    112401020      0   0.0      0   0.0      0   0.0    0     2   0.6
            ▼ _start              simple-retpol 112401066  12899   5.9   1142   0.9  11561   5.6 0.10   184   1.4
               unknown           unknown       112401388    846   6.6      0   0.0      0   0.0    0     1   0.5
              ▼ __libc_start_main libc-2.28.so  112402344  11621  90.1   1129  98.9  10350  89.5 0.11   181  98.4
                 __cxa_atexit    libc-2.28.so  112402360   2302  19.8    101   8.9   1817  17.6 0.06    13   7.2
                 __libc_csu_init simple-retpol 112404673    121   1.0     43   3.8    340   3.3 0.13     8   4.4
                 _setjmp         libc-2.28.so  112404794     74   0.6     46   4.1    206   2.0 0.22     4   2.2
                ▼ main            simple-retpol 112404892     44   0.4     23   2.0    126   1.2 0.18    12   6.6
                  ▼ foo           simple-retpol 112404892     19  43.2     12  52.2     55  43.7 0.22     5  41.7
                      bar         simple-retpol 112404896     12  63.2      3  25.0     34  61.8 0.09     1  20.0
                  ▼ foo           simple-retpol 112404911     25  56.8     11  47.8     71  56.3 0.15     5  41.7
                     bar         simple-retpol 112404924     10  40.0      3  27.3     27  38.0 0.11     1  20.0
                 exit            libc-2.28.so  112404936   9029  77.7    878  77.8   7765  75.0 0.11   139  76.8
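      
      The IPC column above is consistent with the instruction count divided
      by the cycle count, shown as 0 when no cycles were counted.  A minimal
      sketch of that arithmetic, with a hypothetical function name (the
      viewer itself is a Python script; this is not its code):

```c
#include <assert.h>

/* Instructions per cycle; 0 when no cycles were counted. */
double ipc(unsigned long long insn_cnt, unsigned long long cyc_cnt)
{
	return cyc_cnt ? (double)insn_cnt / (double)cyc_cnt : 0.0;
}

/* Checks the _start row above: 127746 insns / 207320 cycles is
 * roughly 0.62. */
void example_ipc(void)
{
	double v = ipc(127746, 207320);

	assert(v > 0.615 && v < 0.625);
	assert(ipc(0, 0) == 0.0);
}
```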
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-22-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph · 38a846d4
      Adrian Hunter authored
      Enhance the call graph to display IPC information if it is available.
      
      Committer testing:
      
      [acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db
      
      Reports -> Context Sensitive Callgraph, then expand a few trees, then
      select with the mouse and press control+C:
      
      Call Path                     Object          Count Time(ns) Time(%) Insn Insn   Cyc   Cyc    IPC Branch Branch
      ▼ simple-retpolin                                                    Cnt  Cnt(%) Cnt   Cnt(%)     Cnt    Cnt(%)
        ▼ 23003:23003
          ▼ _start                  ld-2.28.so         1 218295   100.0  127746 100.0 207320 100.0 0.62 13046  100.0
             unknown               unknown            1   3202     1.5       0   0.0      0   0.0    0     1    0.0
             _dl_start             ld-2.28.so         1 188471    86.3  123394  96.6 180007  86.8 0.69 12529   96.0
             _dl_init              ld-2.28.so         1  13406     6.1    3207   2.5  14868   7.2 0.22   327    2.5
            ▼ _start                simple-retpoline   1  12899     5.9    1142   0.9  11561   5.6 0.10   184    1.4
               unknown             unknown            1    846     6.6       0   0.0      0   0.0    0     1    0.5
              ▼ __libc_start_main   libc-2.28.so       1  11621    90.1    1129  98.9  10350  89.5 0.11   181   98.4
                 __cxa_atexit      libc-2.28.so       1   2302    19.8     101   8.9   1817  17.6 0.06    13    7.2
                 __libc_csu_init   simple-retpoline   1    121     1.0      43   3.8    340   3.3 0.13     8    4.4
                ▼ _setjmp           libc-2.28.so       1     74     0.6      46   4.1    206   2.0 0.22     4    2.2
                  ▼ __sigsetjmp     libc-2.28.so       1     74   100.0      46 100.0    206 100.0 0.22     3   75.0
                     __sigjmp_save libc-2.28.so       1      0     0.0       0   0.0      0   0.0    0     1   33.3
                ▼ main              simple-retpoline   1     44     0.4      23   2.0    126   1.2 0.18    12    6.6
                  ▼ foo             simple-retpoline   2     44   100.0      23 100.0    126 100.0 0.18    10   83.3
                      bar           simple-retpoline   2     22    50.0       6  26.1     61  48.4 0.10     2   20.0
                 exit              libc-2.28.so       1   9029    77.7     878  77.8   7765  75.0 0.11   139   76.8
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-21-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams · 4a0979d4
      Adrian Hunter authored
      Add a parameter to call graph and call tree, to determine whether IPC
      information is available.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-20-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports · 530e22fd
      Adrian Hunter authored
      Enhance the "All branches" and "Selected branches" reports to display IPC
      information if it is available.
      
      Committer testing:
      
      So, testing this I noticed that it all starts with the left arrow in every
      line, that should mean there is some tree there, i.e. look at all those 
      symbols:
      
      Reports -> All Branches:
      
      Time              CPU Command         PID   TID   Branch Type  In Tx  Insn Cnt  Cyc Cnt  IPC  Branch
       187836112195670 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110
      +_start (ld-2.28.so)
       187836112195987 7   simple-retpolin 23003 23003 trace end    No     0         883      0    7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown
      +(unknown)
       187836112199189 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110
      +_start (ld-2.28.so)
       187836112199189 7   simple-retpolin 23003 23003 call         No     0         0        0    7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50
      +_dl_start (ld-2.28.so)
       187836112199544 7   simple-retpolin 23003 23003 trace end    No     17        996      0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0
      +unknown (unknown)
       187836112200939 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff73
      +_dl_start+0x23 (ld-2.28.so)
       187836112201229 7   simple-retpolin 23003 23003 trace end    No     1         816      0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0
      +unknown (unknown)
       187836112203500 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff7a
      +_dl_start+0x2a (ld-2.28.so)
      
      But if you click on it, that  disappears and a new click doesn't make
      it reappear, looks buggy, minor oddity, reported to Adrian.
      
      Reports -> Selected Branches, then ask for branches in the ld-2.28.so
      DSO:
      
      Time               CPU  Command          PID    TID    Branch Type        In Tx  Insn Cnt  Cyc Cnt  IPC   Branch
       187836112195987  7    simple-retpolin  23003  23003  trace end          No     0         883      0     7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown)
       187836112199189  7    simple-retpolin  23003  23003  trace begin        No     0         0        0                0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
       187836112199189  7    simple-retpolin  23003  23003  call               No     0         0        0     7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so)
       187836112199544  7    simple-retpolin  23003  23003  trace end          No     17        996      0.02  7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown)
       187836112200939  7    simple-retpolin  23003  23003  trace begin        No     0         0        0                0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so)
       187836112201229  7    simple-retpolin  23003  23003  trace end          No     1         816      0.00  7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown)
       187836112203500  7    simple-retpolin  23003  23003  trace begin        No     0         0        0                0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so)
       187836112203528  7    simple-retpolin  23003  23003  unconditional jump No     0         0        0     7f6f33d4ffe7 _dl_start+0x97 (ld-2.28.so) -> 7f6f33d5000b _dl_start+0xbb (ld-2.28.so)
       187836112203528  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
       187836112203528  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
       187836112203539  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d50025 _dl_start+0xd5 (ld-2.28.so) -> 7f6f33d50210 _dl_start+0x2c0 (ld-2.28.so)
       187836112203539  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5021a _dl_start+0x2ca (ld-2.28.so) -> 7f6f33d50360 _dl_start+0x410 (ld-2.28.so)
       187836112203539  7    simple-retpolin  23003  23003  unconditional jump No     0         0        0     7f6f33d50377 _dl_start+0x427 (ld-2.28.so) -> 7f6f33d4ffff _dl_start+0xaf (ld-2.28.so)
       187836112203539  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
       187836112203562  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-19-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      530e22fd
    • Adrian Hunter's avatar
      perf scripts python: export-to-postgresql.py: Export IPC information · ec7f448e
      Adrian Hunter authored
      Export cycle and instruction counts on samples and calls tables.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-18-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ec7f448e
    • Adrian Hunter's avatar
      perf scripts python: export-to-sqlite.py: Export IPC information · 64adadb3
      Adrian Hunter authored
      Export cycle and instruction counts on samples and calls tables.
      
      Committer testing:
      
      First run some workload, collecting intel_pt with the 'cyc' term just
      for userspace:
      
        [root@quaco adrian.hunter]# perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.035 MB simple-retpoline.perf.data ]
        [root@quaco adrian.hunter]#
      
      Then use the export-to-sqlite.py script to check that the changes in
      this cset don't break it and that the changes in the db schema are the
      ones expected:
      
        [root@quaco adrian.hunter]# perf script -i simple-retpoline.perf.data --itrace=be -s ~acme/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
        2019-05-31 11:50:46.942710 Creating database ...
        2019-05-31 11:50:46.949663 Writing records...
        2019-05-31 11:50:47.224033 Adding indexes
        2019-05-31 11:50:47.231599 Done
        [root@quaco adrian.hunter]#
      
      Now let's use the db:
      
        [root@quaco adrian.hunter]# sqlite3 simple-retpoline.db
        SQLite version 3.26.0 2018-12-01 12:34:55
        Enter ".help" for usage hints.
        sqlite> .schema samples
        CREATE TABLE samples (id integer NOT NULL PRIMARY KEY,evsel_id bigint,machine_id bigint,thread_id bigint,comm_id bigint,dso_id bigint,symbol_id bigint,sym_offset bigint,ip bigint,time bigint,cpu integer,to_dso_id bigint,to_symbol_id bigint,to_sym_offset bigint,to_ip bigint,branch_type integer,in_tx boolean,call_path_id bigint,insn_count bigint,cyc_count bigint);
        sqlite>
      
      Cool, the 'insn_count' and 'cyc_count' columns are there, now let's see
      if we can use them in a query:
      
        sqlite> select insn_count,cyc_count from samples where cyc_count > 1500 and insn_count < 10;
        6|1507
        sqlite> select insn_count,cyc_count from samples where cyc_count > 1500;
        118|2210
        140|1516
        3783|1861
        132|1521
        6|1507
        sqlite>
      
      Seems to work :-)
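
      The same check can be scripted. A minimal sketch, assuming Python's
      standard sqlite3 module and an in-memory stand-in for
      simple-retpoline.db that holds only the two new columns; the rows are
      the five returned by the query above, and slow_samples() is a
      hypothetical helper, not part of the export script:

```python
import sqlite3


def slow_samples(conn, min_cycles):
    """Return (insn_count, cyc_count, ipc) for samples above min_cycles."""
    rows = conn.execute(
        "SELECT insn_count, cyc_count FROM samples "
        "WHERE cyc_count > ? ORDER BY id",
        (min_cycles,),
    ).fetchall()
    return [(insn, cyc, insn / cyc) for insn, cyc in rows]


# Stand-in for the exported database: only the columns used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (id integer NOT NULL PRIMARY KEY, "
    "insn_count bigint, cyc_count bigint)"
)
conn.executemany(
    "INSERT INTO samples (insn_count, cyc_count) VALUES (?, ?)",
    [(118, 2210), (140, 1516), (3783, 1861), (132, 1521), (6, 1507)],
)

for insn, cyc, ipc in slow_samples(conn, 1500):
    print(f"{insn:5d} insns / {cyc:5d} cycles = IPC {ipc:.2f}")
```

      Against a real export, sqlite3.connect() would be given the path of
      the database file instead of ":memory:".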
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-17-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      64adadb3
    • Adrian Hunter's avatar
      perf db-export: Export IPC information · 52a2ab6f
      Adrian Hunter authored
      Export cycle and instruction counts on samples and call-returns.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-16-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      52a2ab6f
    • Adrian Hunter's avatar
      perf db-export: Add brief documentation · 1159face
      Adrian Hunter authored
      Add brief documentation to explain how the database export maintains
      backward and forward compatibility.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-15-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1159face
    • Adrian Hunter's avatar
      perf thread-stack: Accumulate IPC information · 003ccdc7
      Adrian Hunter authored
      Cycle and instruction counts are added to the stack. The IPC of a
      function, including all functions it calls, is also recorded.
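
      One way to picture this is to snapshot running totals on call and
      subtract them on return. A minimal sketch under that assumption; the
      CallStack class and its method names are hypothetical and are not the
      thread-stack code:

```python
class CallStack:
    """Track inclusive instruction and cycle counts per call (illustrative)."""

    def __init__(self):
        self.insn_count = 0  # running total of instructions
        self.cyc_count = 0   # running total of cycles
        self.frames = []     # (name, insn_count at entry, cyc_count at entry)
        self.returns = []    # (name, insns, cycles, ipc) per completed call

    def add_counts(self, insns, cycles):
        self.insn_count += insns
        self.cyc_count += cycles

    def call(self, name):
        # Snapshot the totals so the callee's inclusive cost can be derived.
        self.frames.append((name, self.insn_count, self.cyc_count))

    def ret(self):
        name, insn0, cyc0 = self.frames.pop()
        insns = self.insn_count - insn0
        cycles = self.cyc_count - cyc0
        ipc = insns / cycles if cycles else 0.0
        self.returns.append((name, insns, cycles, ipc))


stack = CallStack()
stack.call("main")
stack.add_counts(10, 20)
stack.call("helper")
stack.add_counts(30, 40)
stack.ret()               # helper: its own 30 instructions, 40 cycles
stack.add_counts(10, 20)
stack.ret()               # main: includes the counts of helper
print(stack.returns)
```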
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-14-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      003ccdc7
    • Adrian Hunter's avatar
      perf intel-pt: Document IPC usage · 5db47f43
      Adrian Hunter authored
      Add brief documentation about instructions-per-cycle (IPC) information
      derived from Intel PT.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-13-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5db47f43
    • Adrian Hunter's avatar
      perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets · 3f055167
      Adrian Hunter authored
      When CYC packets are not available, it is still possible to count cycles
      using TSC/TMA/MTC timestamps.
      
      As the timestamp increments in TSC ticks, convert to CPU cycles using
      the current core-to-bus ratio.
      
      Do not accumulate cycles when control flow packet generation is not
      enabled, nor when time has been "lost", typically due to mwait, which is
      indicated by a TSC/TMA packet that is not part of PSB+.
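
      The conversion and the two exclusions can be sketched as follows. This
      is an illustration, not the decoder's code: it assumes the TSC ticks
      at the maximum non-turbo ratio times the bus clock while the core runs
      at the core-to-bus ratio (CBR) times the bus clock, and all function
      and parameter names are hypothetical:

```python
def tsc_ticks_to_cycles(tsc_delta, cbr, max_non_turbo_ratio):
    """Scale a TSC tick delta to core cycles (assumed relation, see above)."""
    return tsc_delta * cbr // max_non_turbo_ratio


def accumulate_cycles(total, tsc_delta, cbr, max_non_turbo_ratio,
                      pge_enabled, time_lost):
    """Add cycles only while control flow tracing is on and no time was lost."""
    if not pge_enabled or time_lost:
        return total
    return total + tsc_ticks_to_cycles(tsc_delta, cbr, max_non_turbo_ratio)


total = 0
# Core at half the TSC rate: 1000 TSC ticks count as 500 cycles.
total = accumulate_cycles(total, 1000, cbr=20, max_non_turbo_ratio=40,
                          pge_enabled=True, time_lost=False)
# Time "lost" (e.g. mwait): nothing is accumulated.
total = accumulate_cycles(total, 5000, cbr=20, max_non_turbo_ratio=40,
                          pge_enabled=True, time_lost=True)
# Control flow packet generation disabled: nothing is accumulated.
total = accumulate_cycles(total, 700, cbr=20, max_non_turbo_ratio=40,
                          pge_enabled=False, time_lost=False)
print(total)
```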
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-12-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3f055167
    • Adrian Hunter's avatar
      perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip · f3c98c4b
      Adrian Hunter authored
      To make it easier to add new code for different TIP cases, separate each
      case.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-11-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f3c98c4b
    • Adrian Hunter's avatar
      perf intel-pt: Record when decoding PSB+ packets · 9bc668e3
      Adrian Hunter authored
      In preparation for using MTC packets to count cycles, record whether
      decoding is between PSB and PSBEND packets.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-10-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9bc668e3
    • Adrian Hunter's avatar
      perf script: Add output of IPC ratio · 68fb45bf
      Adrian Hunter authored
      Add field 'ipc' to display instructions-per-cycle.
      
      Example:
      
       perf record -e intel_pt/cyc/u ls
       perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
      
       ls  2670177.697113434:  7f0dfdbcd090 _start+0x0      mov %rsp, %rdi   IPC: 0.00 (1/877)
       ls  2670177.697113434:  7f0dfdbcd093 _start+0x3      callq  0x7f0dfdbce030
       ls  2670177.697113434:  7f0dfdbce030 _dl_start+0x0   pushq  %rbp
       ls  2670177.697113434:  7f0dfdbce031 _dl_start+0x1   mov %rsp, %rbp
       ls  2670177.697113434:  7f0dfdbce034 _dl_start+0x4   pushq  %r15
       ls  2670177.697113434:  7f0dfdbce036 _dl_start+0x6   pushq  %r14
       ls  2670177.697113434:  7f0dfdbce038 _dl_start+0x8   pushq  %r13
       ls  2670177.697113434:  7f0dfdbce03a _dl_start+0xa   pushq  %r12
       ls  2670177.697113434:  7f0dfdbce03c _dl_start+0xc   mov %rdi, %r12
       ls  2670177.697113434:  7f0dfdbce03f _dl_start+0xf   pushq  %rbx
       ls  2670177.697113434:  7f0dfdbce040 _dl_start+0x10  sub $0x38, %rsp
       ls  2670177.697113434:  7f0dfdbce044 _dl_start+0x14  rdtsc
       ls  2670177.697113434:  7f0dfdbce046 _dl_start+0x16  mov %eax, %eax
       ls  2670177.697113434:  7f0dfdbce048 _dl_start+0x18  shl $0x20, %rdx
       ls  2670177.697113434:  7f0dfdbce04c _dl_start+0x1c  or %rax, %rdx
       ls  2670177.697114471:  7f0dfdbce04f _dl_start+0x1f  movq  0x27e22(%rip), %rax        IPC: 0.00 (15/1685)
       ls  2670177.697116177:  7f0dfdbce056 _dl_start+0x26  movq  %rdx, 0x27683(%rip)        IPC: 0.00 (1/881)
      
      Note, the IPC values are low due to page faults at the beginning of
      execution. The additional cycles are due to the time to enter the
      kernel, not the actual kernel page fault handler.
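
      The "IPC: ratio (instructions/cycles)" annotation can be reproduced
      with a short sketch. It assumes the ratio is truncated to two decimal
      places using integer arithmetic, which is consistent with 15/1685
      being shown as 0.00 above; format_ipc() is a hypothetical helper, not
      a perf function:

```python
def format_ipc(insn_cnt, cyc_cnt):
    """Format an IPC annotation; empty when there is no cycle count."""
    if cyc_cnt == 0:
        return ""
    # Integer arithmetic: hundredths of an instruction per cycle, truncated.
    ipc = insn_cnt * 100 // cyc_cnt
    return f"IPC: {ipc // 100}.{ipc % 100:02d} ({insn_cnt}/{cyc_cnt})"


for insns, cycles in [(1, 877), (15, 1685), (1, 881)]:
    print(format_ipc(insns, cycles))
```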
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      68fb45bf
    • Adrian Hunter's avatar
      perf intel-pt: Add support for samples to contain IPC ratio · 5b1dc0fd
      Adrian Hunter authored
      Copy the incremental instruction count and cycle count onto 'instructions'
      and 'branches' samples.
      
      Because Intel PT does not update the cycle count on every branch or
      instruction, the incremental values will often be zero.
      
      When there are values, they will be the number of instructions and
      number of cycles since the last update, and thus represent the average
      IPC since the last IPC value.
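
      A minimal sketch of this behaviour, assuming a hypothetical stream of
      (instructions executed, cycle update or None) steps, one per sample.
      The names are illustrative and not taken from the decoder; the step
      values mirror the _start/_dl_start listing in the 'perf script: Add
      output of IPC ratio' commit message:

```python
def incremental_counts(steps):
    """Yield (insn_cnt, cyc_cnt) per sample; both zero unless cycles updated."""
    pending_insns = 0
    for insns, cyc_update in steps:
        pending_insns += insns
        if cyc_update is None:
            # No cycle update at this sample: nothing to report yet.
            yield (0, 0)
        else:
            # Report everything accumulated since the previous update.
            yield (pending_insns, cyc_update)
            pending_insns = 0


# One instruction with an update, 14 without, then two more updates.
steps = [(1, 877)] + [(1, None)] * 14 + [(1, 1685), (1, 881)]
samples = list(incremental_counts(steps))
print([s for s in samples if s != (0, 0)])
```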
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-8-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5b1dc0fd
    • Adrian Hunter's avatar
      perf tools: Add IPC information to perf_sample · 61d276f4
      Adrian Hunter authored
      Add counts of instructions and cycles, in order to represent
      instructions-per-cycle (IPC).
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61d276f4
    • Adrian Hunter's avatar
      perf intel-pt: Accumulate cycle count from CYC packets · 7b4b4f83
      Adrian Hunter authored
      In preparation for providing instructions-per-cycle (IPC) information,
      accumulate cycle count from CYC packets.
      
      Although CYC packets are optional (requires config term 'cyc' to enable
      cycle-accurate mode when recording), the simplest way to count cycles is
      with CYC packets.
      
      The first complication is that cycles must be counted only when also
      counting instructions.
      
      That is, only while control flow packet generation is enabled, i.e.
      between TIP.PGE and TIP.PGD packets.
      
      Also, sampling the cycle count follows the same rules as sampling the
      timestamp, that is, not before the instruction to which the decoder is
      walking is reached.
      
      In addition, the cycle count is not accurate for any but the first
      branch of a TNT packet.
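
      The first rule can be sketched as below. This is a simplification
      under stated assumptions: packets are modelled as hypothetical
      (kind, payload) tuples, and the sampling-time rule and the TNT
      inaccuracy described above are not modelled:

```python
def count_cycles(packets):
    """Sum CYC payloads seen while control flow packet generation is enabled."""
    enabled = False
    total = 0
    for kind, payload in packets:
        if kind == "TIP.PGE":
            enabled = True
        elif kind == "TIP.PGD":
            enabled = False
        elif kind == "CYC" and enabled:
            total += payload
    return total


packets = [
    ("CYC", 50),        # before TIP.PGE: ignored
    ("TIP.PGE", None),
    ("CYC", 877),
    ("TNT", None),
    ("CYC", 120),
    ("TIP.PGD", None),
    ("CYC", 300),       # after TIP.PGD: ignored
]
print(count_cycles(packets))
```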
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7b4b4f83
    • Adrian Hunter's avatar
      perf intel-pt: Factor out intel_pt_update_sample_time · 948e9dc8
      Adrian Hunter authored
      To eliminate some duplication and make the code more understandable,
      factor out intel_pt_update_sample_time.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-5-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      948e9dc8
    • Alexey Budankov's avatar
      perf record: Allow mixing --user-regs with --call-graph=dwarf · d194d8fc
      Alexey Budankov authored
      When DWARF stacks were requested and the user also specified a
      register set using the --user-regs option, the full register context
      was still being captured on samples:
      
        $ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- stack_test2.g.O3
      
        188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac
        ... FP chain: nr:0
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x53b
        .... BX    0x7ffedbdd3cc0
        .... CX    0xffffffff
        .... DX    0x33d3a
        .... SI    0x7f09b74c38d0
        .... DI    0x0
        .... BP    0x401260
        .... SP    0x7ffedbdd3cc0
        .... IP    0x401236
        .... FLAGS 0x20a
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x7f09b74c3800
        .... R9    0x7f09b74c2da0
        .... R10   0xfffffffffffff3ce
        .... R11   0x246
        .... R12   0x401070
        .... R13   0x7ffedbdd5db0
        .... R14   0x0
        .... R15   0x0
        ... ustack: size 1024, offset 0xe0
         . data_src: 0x5080021
         ... thread: stack_test2.g.O:23828
         ...... dso: /root/abudanko/stacks/stack_test2.g.O3
      
      I.e. the --user-regs=IP,SP,BP was being ignored, being overridden by the
      needs of --call-graph=dwarf.
      
      After applying this patch, the sample data contains the user-specified
      registers, while making sure that at least the minimal set of
      registers needed for DWARF unwinding (DWARF_MINIMAL_REGS) is
      requested.
      
      The user is warned that DWARF unwinding may not work if extra registers
      end up being needed.
      
        -g call-graph dwarf,K                         full_regs
        --user-regs=user_regs                         user_regs
        -g call-graph dwarf,K --user-regs=user_regs   user_regs + DWARF_MINIMAL_REGS
      
        $ perf record -g --call-graph dwarf,1024 --user-regs=BP -- ls
        WARNING: The use of --call-graph=dwarf may require all the user registers, specifying a subset with --user-regs may render DWARF unwinding unreliable, so the minimal registers set (IP, SP) is explicitly forced.
        arch   COPYING	Documentation  include	Kbuild	 lbuild    MAINTAINERS	modules.builtin		 Module.symvers  perf.data.old	scripts   System.map  virt
        block  CREDITS	drivers        init	Kconfig  lib	   Makefile	modules.builtin.modinfo  net		 README		security  tools       vmlinux
        certs  crypto	fs	       ipc	kernel	 LICENSES  mm		modules.order		 perf.data	 samples	sound	  usr	      vmlinux.o
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]
      
        188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c
        ... FP chain: nr:0
        ... user regs: mask 0x1c0 ABI 64-bit
        .... BP    0x401260
        .... SP    0x7ffd3d85cc20
        .... IP    0x401236
        ... ustack: size 1024, offset 0x58
         . data_src: 0x5080021
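
      The 0x1c0 mask in this dump is the union of the requested BP with the
      forced IP and SP. A minimal sketch of that arithmetic, assuming the
      x86 perf register numbering (AX=0 up to R15=23); sample_regs_mask() is
      a hypothetical helper, and DWARF_MINIMAL_REGS is taken to be IP and SP
      as stated in the warning above:

```python
# Bit positions follow the x86 perf register enumeration.
X86_REGS = [
    "AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP", "IP", "FLAGS",
    "CS", "SS", "DS", "ES", "FS", "GS",
    "R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15",
]
REG_BIT = {name: 1 << bit for bit, name in enumerate(X86_REGS)}

# Minimal set forced for DWARF unwinding, per the warning text (IP, SP).
DWARF_MINIMAL_REGS = REG_BIT["IP"] | REG_BIT["SP"]


def sample_regs_mask(user_regs):
    """Combine the user's register list with the DWARF minimal set."""
    mask = 0
    for name in user_regs:
        mask |= REG_BIT[name.upper()]
    return mask | DWARF_MINIMAL_REGS


print(hex(sample_regs_mask(["BP"])))  # 0x1c0
```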
      
      Committer notes:
      
      Detected build failures on arches where PERF_REGS_ is not available,
      such as debian:experimental-x-{mips,mips64,mipsel}, fedora 24 and 30 for
      ARC uClibc and glibc. Reported to Alexey, who provided a patch moving
      DWARF_MINIMAL_REGS from evsel.c to util/perf_regs.h, where it is
      guarded by a HAVE_PERF_REGS_SUPPORT ifdef.
      
      Committer testing:
      
        # perf record --user-regs=bp,ax -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.955 MB perf.data (1773 samples) ]
        # perf script -F+uregs | grep AX: | head -5
           perf 1719 [000] 181.272398:    1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
           perf 1719 [000] 181.272402:    1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
           perf 1719 [000] 181.272403:    8 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
           perf 1719 [000] 181.272405:  181 cycles: ffffffffba06a7c6 native_write_msr+0x6 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
           perf 1719 [000] 181.272406: 4405 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffef828fb00
        # perf record --call-graph=dwarf --user-regs=bp,ax -a sleep 1
        WARNING: The use of --call-graph=dwarf may require all the user registers, specifying a subset with --user-regs may render DWARF unwinding unreliable, so the minimal registers set (IP, SP) is explicitly forced.
        [ perf record: Woken up 55 times to write data ]
        [ perf record: Captured and wrote 24.184 MB perf.data (2841 samples) ]
        [root@quaco ~]# perf script --hide-call-graph -F+uregs | grep AX: | head -5
           perf 1729 [000] 211.268006:    1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
           perf 1729 [000] 211.268014:    1 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
           perf 1729 [000] 211.268017:    5 cycles: ffffffffba06a7c4 native_write_msr+0x4 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
           perf 1729 [000] 211.268020:   48 cycles: ffffffffba06a7c6 native_write_msr+0x6 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
           perf 1729 [000] 211.268024:  490 cycles: ffffffffba00e471 intel_bts_enable_local+0x21 (/lib/modules/5.2.0-rc1+/build/vmlinux) ABI:2 AX:0xffffffffffffffda BP:0x7ffc8679abb0 SP:0x7ffc8679ab78 IP:0x7fa75223a0db
        #
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/e7fd37b1-af22-0d94-a0dc-5895e803bbfe@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d194d8fc
    • Leo Yan's avatar
      perf symbols: Remove unused variable 'err' · e5f177a5
      Leo Yan authored
      Variable 'err' is defined but never used in function symsrc__init();
      remove it and directly return -1 at the end of the function.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190530093801.20510-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e5f177a5
    • Arnaldo Carvalho de Melo's avatar
      perf data: Document directory format header: HEADER_DIR_FORMAT · 0da6ae94
      Arnaldo Carvalho de Melo authored
      We forgot to update the perf.data file format document for the
      HEADER_DIR_FORMAT header; do it now, based on comments in the patch
      that introduced it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Chong Jiang <chongjiang@chromium.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Fixes: 258031c0 ("perf header: Add DIR_FORMAT feature to describe directory data")
      Link: https://lkml.kernel.org/n/tip-jbrzb7ijb5al33gi8br6f9rr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0da6ae94
    • Arnaldo Carvalho de Melo's avatar
      perf data: Document clockid header: HEADER_CLOCKID · a9de7cfc
      Arnaldo Carvalho de Melo authored
      We forgot to update the perf.data file format document for the
      HEADER_CLOCKID header; do it now, based on comments in the patch that
      introduced it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Chong Jiang <chongjiang@chromium.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Que <sque@chromium.org>
      Fixes: cf790516 ("perf record: Encode -k clockid frequency into Perf trace")
      Link: https://lkml.kernel.org/n/tip-slhnjp06027j3ae17qqetzxj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a9de7cfc
    • Arnaldo Carvalho de Melo's avatar
      perf data: Document memory topology header: HEADER_MEM_TOPOLOGY · 835fbf12
      Arnaldo Carvalho de Melo authored
      We forgot to update the perf.data file format document for the
      HEADER_MEM_TOPOLOGY header; do it now, based on comments in the patch
      that introduced it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Chong Jiang <chongjiang@chromium.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Simon Que <sque@chromium.org>
      Fixes: e2091ced ("perf tools: Add MEM_TOPOLOGY feature to perf data file")
      Link: https://lkml.kernel.org/n/tip-h5lcm1nbe9ztxwm61gmadd56@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      835fbf12
    • Song Liu's avatar
      perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF · 8e21be4f
      Song Liu authored
      This patch adds a description of HEADER_BPF_PROG_INFO and
      HEADER_BPF_BTF to perf.data-file-format.txt.
      Requested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 606f972b ("perf bpf: Save bpf_prog_info information as headers to perf.data")
      Link: http://lkml.kernel.org/r/20190521064406.2498925-1-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8e21be4f