1. 30 Jul, 2020 21 commits
  2. 29 Jul, 2020 2 commits
    • perf tools: No need to cache the PMUs in ARM SPE auxtrace init routine · 3e43d79d
      Wei Li authored
      - auxtrace_record__init() is called only once, so there is no point in
        using a static variable to cache the results of
        find_all_arm_spe_pmus(). Make it local and free the results after use.
      
      - Even though SPE is micro-architecture dependent, so far it only
        supports "statistical-profiling-extension-v1", and there is no way to
        use events from multiple SPE PMUs in a single perf command.
      
      So remove the useless check code to make this clear.
      Signed-off-by: Wei Li <liwei391@huawei.com>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200724071111.35593-3-liwei391@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix record failure when mixed with ARM SPE event · 31e81e0b
      Wei Li authored
      When recording with cache-misses and an arm_spe_x event, I found that
      the recording fails silently, without showing any error info, if I put
      cache-misses after the 'arm_spe_x' event.
      
        [root@localhost 0620]# perf record -e cache-misses \
      				-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.067 MB perf.data ]
        [root@localhost 0620]#
        [root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
      				     -e  cache-misses sleep 1
        [root@localhost 0620]#
      
      The current code can only work if the only event to be traced is an
      'arm_spe_x', or if it is the last event to be specified. Otherwise the
      last event type will be checked against all the arm_spe_pmus[i]->types,
      none will match and an out of bound 'i' index will be used in
      arm_spe_recording_init().
      
      Multiple concurrent arm_spe_x events are not currently supported. That
      is checked in arm_spe_recording_options(), which shows the relevant
      info. So check for and record the first 'arm_spe_pmu' found to fix this
      issue here.
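
      The idea of recording the first match can be illustrated with a minimal
      sketch. This is not the perf source: 'struct pmu', 'struct event' and
      find_first_matching_pmu() are hypothetical, simplified stand-ins. The
      point is only that the match is kept when it is found, so no loop index
      is read after a search that matched nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's PMU and event types. */
struct pmu { unsigned int type; };
struct event { unsigned int type; };

/*
 * Return the first PMU whose type matches one of the requested events, or
 * NULL if none matches. The match is recorded at the moment it is found,
 * so no loop index is read after the search has finished.
 */
const struct pmu *find_first_matching_pmu(const struct event *events, size_t nr_events,
					  const struct pmu *pmus, size_t nr_pmus)
{
	assert(events != NULL && pmus != NULL);

	for (size_t e = 0; e < nr_events; e++) {
		for (size_t i = 0; i < nr_pmus; i++) {
			if (events[e].type == pmus[i].type)
				return &pmus[i];
		}
	}
	return NULL;
}
```

      With this shape the result does not depend on whether the matching
      event is first or last in the list, and a list with no matching event
      yields NULL instead of an out-of-bounds access.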
      
      Fixes: ffd3d18c ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
      Signed-off-by: Wei Li <liwei391@huawei.com>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 28 Jul, 2020 1 commit
    • perf bench: Add basic syscall benchmark · c2a08203
      Davidlohr Bueso authored
      The usefulness of having a standard way of testing syscall performance
      has come up from time to time[0]. Furthermore, some of our testing
      machinery (such as 'mmtests') already makes use of a simplified version
      of the microbenchmark. This patch takes the same idea and measures
      syscall throughput via getppid(2) in a form compatible with 'perf bench',
      without any of the additional template code from Ingo's version (based
      on numa.c). The code is identical to what mmtests uses.
      
      [0] https://lore.kernel.org/lkml/20160201074156.GA27156@gmail.com/
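
      A minimal sketch of the measurement idea, assuming a plain timing loop
      around getppid(2). It is not the benchmark's source: the function names
      time_getppid_calls() and usecs_per_op() are hypothetical, and the loop
      count is left as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <unistd.h>

/* Call getppid(2) 'loops' times; return elapsed wall-clock microseconds. */
int64_t time_getppid_calls(unsigned long loops)
{
	struct timeval start, stop;

	gettimeofday(&start, NULL);
	for (unsigned long i = 0; i < loops; i++)
		(void)getppid();
	gettimeofday(&stop, NULL);

	return (int64_t)(stop.tv_sec - start.tv_sec) * 1000000
	       + (int64_t)(stop.tv_usec - start.tv_usec);
}

/* Average cost of one call in microseconds, as in a "usecs/op" figure. */
double usecs_per_op(int64_t total_usecs, unsigned long loops)
{
	assert(loops > 0);
	return (double)total_usecs / (double)loops;
}
```

      getppid(2) is a convenient target because it does very little work in
      the kernel, so the figure mostly reflects syscall entry and exit cost.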
      
      Committer notes:
      
      Add missing stdlib.h and unistd.h to get the prototypes for exit() and
      getppid().
      
      Committer testing:
      
        $ perf bench
        Usage:
        	perf bench [<common options>] <collection> <benchmark> [<options>]
      
                # List of all available benchmark collections:
      
                 sched: Scheduler and IPC benchmarks
               syscall: System call benchmarks
                   mem: Memory access benchmarks
                  numa: NUMA scheduling and MM benchmarks
                 futex: Futex stressing benchmarks
                 epoll: Epoll stressing benchmarks
             internals: Perf-internals benchmarks
                   all: All benchmarks
      
        $
        $ perf bench syscall
      
                # List of available benchmarks for collection 'syscall':
      
                 basic: Benchmark for basic getppid(2) calls
                   all: Run all syscall benchmarks
      
        $ perf bench syscall basic
        # Running 'syscall/basic' benchmark:
        # Executed 10000000 getppid() calls
             Total time: 3.679 [sec]
      
               0.367957 usecs/op
                2717708 ops/sec
        $ perf bench syscall all
        # Running syscall/basic benchmark...
        # Executed 10000000 getppid() calls
             Total time: 3.644 [sec]
      
               0.364456 usecs/op
                2743815 ops/sec
      
        $
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20190308181747.l36zqz2avtivrr3c@linux-r8p5
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 22 Jul, 2020 8 commits
  5. 21 Jul, 2020 3 commits
  6. 17 Jul, 2020 5 commits
    • perf metric: Add 'struct expr_id_data' to keep expr value · 070b3b5a
      Jiri Olsa authored
      Add 'struct expr_id_data' to keep an expr value instead of just a simple
      double pointer, so we can store more data for ID in the following
      changes.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200712132634.138901-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metric: Rename expr__add_id() to expr__add_val() · 2c46f542
      Jiri Olsa authored
      Rename expr__add_id() to expr__add_val() so we can use expr__add_id() to
      actually add just the id without any value in following changes.
      
      There's no functional change.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200712132634.138901-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Warn if the target function is a GNU indirect function · 3de2bf9d
      Masami Hiramatsu authored
      Warn if the probe target function is a GNU indirect function (GNU_IFUNC)
      because it may not be what the user wants to probe.
      
      A GNU indirect function ( https://sourceware.org/glibc/wiki/GNU_IFUNC )
      is a dynamic symbol that is resolved at runtime. An IFUNC function is a
      selector which is invoked from the ELF loader, but in the symbol table
      the function that the IFUNC will select has the same symbol address as
      the IFUNC itself. This can confuse users trying to probe such functions.
      
      For example, memcpy is an IFUNC.
      
        probe_libc:memcpy    (on __new_memcpy_ifunc@x86_64/multiarch/memcpy.c in /usr/lib64/libc-2.30.so)
      
      The probe is put on the IFUNC itself:
      
        perf  1742 [000] 26201.715632: probe_libc:memcpy: (7fdaa53824c0)
                    7fdaa53824c0 __new_memcpy_ifunc+0x0 (inlined)
                    7fdaa5d4a980 elf_machine_rela+0x6c0 (inlined)
                    7fdaa5d4a980 elf_dynamic_do_Rela+0x6c0 (inlined)
                    7fdaa5d4a980 _dl_relocate_object+0x6c0 (/usr/lib64/ld-2.30.so)
                    7fdaa5d42155 dl_main+0x1cc5 (/usr/lib64/ld-2.30.so)
                    7fdaa5d5831a _dl_sysdep_start+0x54a (/usr/lib64/ld-2.30.so)
                    7fdaa5d3ffeb _dl_start_final+0x25b (inlined)
                    7fdaa5d3ffeb _dl_start+0x25b (/usr/lib64/ld-2.30.so)
                    7fdaa5d3f117 .annobin_rtld.c+0x7 (inlined)
      
      And the event is invoked from the ELF loader instead of the target
      program's main code.
      
      Moreover, at this moment we cannot probe the function which will be
      selected by the IFUNC, because it is determined at runtime, whereas the
      uprobe is prepared before the target binary runs.
      
      Thus, I decided to warn the user when 'perf probe' detects that the probe
      point is on a GNU IFUNC symbol. Someone who wants to probe an IFUNC
      symbol to debug the IFUNC function can ignore this warning.
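
      As an illustration of what detecting such a symbol can look like, here
      is a minimal sketch that tests the type field of an ELF64 symbol table
      entry. It assumes the Linux <elf.h> header, which provides Elf64_Sym,
      ELF64_ST_TYPE() and STT_GNU_IFUNC. The helper name sym_is_gnu_ifunc() is
      hypothetical, and this is not necessarily how 'perf probe' performs the
      check.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the ELF64 symbol table entry is a GNU indirect function. */
bool sym_is_gnu_ifunc(const Elf64_Sym *sym)
{
	assert(sym != NULL);
	/* The low four bits of st_info hold the symbol type. */
	return ELF64_ST_TYPE(sym->st_info) == STT_GNU_IFUNC;
}
```

      This is the same symbol type that 'readelf -sW' prints as IFUNC in its
      Type column.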
      
      Committer notes:
      
      I.e., this warning will be emitted if the probe point is an IFUNC:
      
        "Warning: The probe function (%s) is a GNU indirect function.\n"
        "Consider identifying the final function used at run time and set the probe directly on that.\n"
      
      Complete set of steps:
      
        # readelf -sW /lib64/libc-2.29.so  | grep IFUNC | tail
         22196: 0000000000109a80   183 IFUNC   GLOBAL DEFAULT   14 __memcpy_chk
         22214: 00000000000b7d90   191 IFUNC   GLOBAL DEFAULT   14 __gettimeofday
         22336: 000000000008b690    60 IFUNC   GLOBAL DEFAULT   14 memchr
         22350: 000000000008b9b0    89 IFUNC   GLOBAL DEFAULT   14 __stpcpy
         22420: 000000000008bb10    76 IFUNC   GLOBAL DEFAULT   14 __strcasecmp_l
         22582: 000000000008a970    60 IFUNC   GLOBAL DEFAULT   14 strlen
         22585: 00000000000a54d0    92 IFUNC   WEAK   DEFAULT   14 wmemset
         22600: 000000000010b030    92 IFUNC   GLOBAL DEFAULT   14 __wmemset_chk
         22618: 000000000008b8a0   183 IFUNC   GLOBAL DEFAULT   14 __mempcpy
         22675: 000000000008ba70    76 IFUNC   WEAK   DEFAULT   14 strcasecmp
        #
        # perf probe -x /lib64/libc-2.29.so strlen
        Warning: The probe function (strlen) is a GNU indirect function.
        Consider identifying the final function used at run time and set the probe directly on that.
        Added new event:
          probe_libc:strlen    (on strlen in /usr/lib64/libc-2.29.so)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_libc:strlen -aR sleep 1
      
        #
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Link: http://lore.kernel.org/lkml/159438669349.62703.5978345670436126948.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix memory leakage when the probe point is not found · 12d572e7
      Masami Hiramatsu authored
      Fix the memory leakage in debuginfo__find_trace_events() when the probe
      point is not found in the debuginfo. If there is no probe point found in
      the debuginfo, debuginfo__find_probes() will NOT return -ENOENT, but 0.
      
      Thus the caller of debuginfo__find_probes() must check the tf.ntevs and
      release the allocated memory for the array of struct probe_trace_event.
      
      The current code releases the memory only if debuginfo__find_probes()
      hits an error, but does not check tf.ntevs. As a result, the memory
      allocated for *tevs is not released if tf.ntevs == 0.
      
      This fixes the memory leakage by checking tf.ntevs == 0 in addition to
      ret < 0.
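
      The pattern can be sketched as follows. This is a simplified
      illustration, not the perf source: 'struct trace_event', the 'find_fn'
      callback type, the two example finders and find_trace_events() are all
      hypothetical stand-ins for the real structures and functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct probe_trace_event. */
struct trace_event { int id; };

/*
 * Hypothetical finder callback: returns a negative errno value on error and
 * 0 otherwise, storing the number of events found in *nfound. Returning 0
 * with *nfound == 0 means "no error, but nothing found".
 */
typedef int (*find_fn)(struct trace_event *buf, int max, int *nfound);

/* Example finders used to exercise both paths. */
int find_nothing(struct trace_event *buf, int max, int *nfound)
{
	(void)buf;
	(void)max;
	*nfound = 0;
	return 0;
}

int find_one(struct trace_event *buf, int max, int *nfound)
{
	(void)max;
	buf[0].id = 1;
	*nfound = 1;
	return 0;
}

/*
 * Allocate the result array, run the finder, and release the array both on
 * error and when nothing was found. Returns the number of events found, or
 * a negative errno value; *out is NULL unless at least one event was found.
 */
int find_trace_events(find_fn find, int max, struct trace_event **out)
{
	int nfound = 0;
	int ret;

	assert(max > 0 && out != NULL);

	*out = calloc((size_t)max, sizeof(**out));
	if (*out == NULL)
		return -ENOMEM;

	ret = find(*out, max, &nfound);
	if (ret < 0 || nfound == 0) {	/* also free when nothing was found */
		free(*out);
		*out = NULL;
	}
	return ret < 0 ? ret : nfound;
}
```

      Checking only 'ret < 0' would leave the array allocated in the
      find_nothing() case, which is the kind of leak described above.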
      
      Fixes: ff741783 ("perf probe: Introduce debuginfo to encapsulate dwarf information")
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/159438668346.62703.10887420400718492503.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix wrong variable warning when the probe point is not found · 11fd3eb8
      Masami Hiramatsu authored
      Fix a wrong "variable not found" warning when the probe point is not
      found in the debuginfo.
      
      Since debuginfo__find_probes() can return 0 even if it does not find the
      given probe point in the debuginfo, fill_empty_trace_arg() can be called
      with tf.ntevs == 0 and emit a wrong warning. To fix this, reject
      ntevs == 0 in fill_empty_trace_arg().
      
      E.g. without this patch:
      
        # perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
        Failed to find the location of the '%di' variable at this address.
         Perhaps it has been optimized out.
         Use -V with the --range option to show '%di' location range.
        Added new events:
          probe_libc:memcpy    (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
          probe_libc:memcpy    (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_libc:memcpy -aR sleep 1
      
      With this patch:
      
        # perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
        Added new events:
          probe_libc:memcpy    (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
          probe_libc:memcpy    (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_libc:memcpy -aR sleep 1
      
      Fixes: cb402730 ("perf probe: Trace a magic number if variable is not found")
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Tested-by: Andi Kleen <ak@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/159438667364.62703.2200642186798763202.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>