1. 08 Mar, 2018 3 commits
    • perf pmu: Auto-merge PMU events created by prefix or glob match · c199c11d
      Agustin Vega-Frias authored
      Auto-merge for these events was disabled when auto-merging of non-alias
      events was disabled in commit 63ce8449 (perf stat: Only auto-merge events
      that are PMU aliases).
      
      Non-merging of legacy events is preserved:
      
          $ perf stat -ag -e cache-misses,cache-misses sleep 1
      
           Performance counter stats for 'system wide':
      
                      86,323      cache-misses
                      86,323      cache-misses
      
                 1.002623307 seconds time elapsed
      
      But prefix or glob matching auto-merges the events created:
      
          $ perf stat -a -e l3cache/read-miss/ sleep 1
      
           Performance counter stats for 'system wide':
      
                         328      l3cache/read-miss/
      
                 1.002627008 seconds time elapsed
      
          $ perf stat -a -e l3cache_0_[01]/read-miss/ sleep 1
      
           Performance counter stats for 'system wide':
      
                         172      l3cache/read-miss/
      
                 1.002627008 seconds time elapsed
      
      As with events created with aliases, auto-merging can be suppressed with
      the --no-merge option:
      
          $ perf stat -a -e l3cache/read-miss/ --no-merge sleep 1
      
           Performance counter stats for 'system wide':
      
                          67      l3cache/read-miss/
                          67      l3cache/read-miss/
                          63      l3cache/read-miss/
                          60      l3cache/read-miss/
      
                 1.002622192 seconds time elapsed

      Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Change-Id: I0a47eed54c05e1982ca964d743b37f50f60c508c
      Link: http://lkml.kernel.org/r/1520345084-42646-4-git-send-email-agustinv@codeaurora.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
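
The merging behaviour described in this commit can be modelled roughly as follows. This is a minimal sketch, not perf's implementation: the function name `merge_counts` and the `expanded` flag (true for events created from a single spec by alias, prefix or glob expansion) are assumptions made for illustration. The counts are taken from the --no-merge transcript above, so the merged total differs from the separate merged run shown earlier.

```python
def merge_counts(events, no_merge=False):
    """Collapse per-PMU counts into one row per event name.

    events: list of (name, count, expanded) tuples, where `expanded`
    (a hypothetical flag) marks events created by alias, prefix or glob
    expansion. Legacy events are never merged, even if names repeat.
    """
    rows = []
    first_row = {}  # event name -> index of its merged row
    for name, count, expanded in events:
        mergeable = expanded and not no_merge
        if mergeable and name in first_row:
            i = first_row[name]
            rows[i] = (name, rows[i][1] + count)
        else:
            if mergeable:
                first_row[name] = len(rows)
            rows.append((name, count))
    return rows


l3 = [("l3cache/read-miss/", c, True) for c in (67, 67, 63, 60)]
legacy = [("cache-misses", 86323, False), ("cache-misses", 86323, False)]

print(merge_counts(l3))                 # one merged row
print(merge_counts(l3, no_merge=True))  # four rows, as with --no-merge
print(merge_counts(legacy))             # legacy events stay separate
```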
    • perf pmu: Display pmu name when printing unmerged events in stat · 8c5421c0
      Agustin Vega-Frias authored
      To simplify the creation of events across multiple instances of the
      same type of PMU, stat supports two methods for creating multiple
      events from a single event specification:
      
      1. A prefix or glob can be used in the PMU name.
      2. An alias can be used. Aliases are listed by perf list immediately
         after the Kernel PMU events.
      
      When the --no-merge option is passed and these events are displayed
      individually, the PMU name is lost and it is not possible to see which
      count corresponds to which PMU:
      
          $ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          67      l3cache/read-miss/
                          67      l3cache/read-miss/
                          63      l3cache/read-miss/
                          60      l3cache/read-miss/
      
                 0.001675706 seconds time elapsed
      
          $ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          12      l3cache_read_miss
                          17      l3cache_read_miss
                          10      l3cache_read_miss
                           8      l3cache_read_miss
      
                 0.001661305 seconds time elapsed
      
      This change adds the original pmu name to the event. For dynamic pmu
      events the pmu name is restored in the event name:
      
          $ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          63      l3cache_0_3/read-miss/
                          74      l3cache_0_1/read-miss/
                          64      l3cache_0_2/read-miss/
                          74      l3cache_0_0/read-miss/
      
                 0.001675706 seconds time elapsed
      
      For alias events the name is added after the event name:
      
          $ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          10      l3cache_read_miss [l3cache_0_3]
                          12      l3cache_read_miss [l3cache_0_1]
                          10      l3cache_read_miss [l3cache_0_2]
                          17      l3cache_read_miss [l3cache_0_0]
      
                 0.001661305 seconds time elapsed

      Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Change-Id: I8056b9eda74bda33e95065056167ad96e97cb1fb
      Link: http://lkml.kernel.org/r/1520345084-42646-3-git-send-email-agustinv@codeaurora.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
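
The two naming rules from this commit can be illustrated with a small sketch. The helper `display_name` and its parameters are hypothetical and only model the output format shown in the transcripts above; they do not correspond to functions in the perf sources.

```python
def display_name(event_name, pmu_name, is_alias):
    """Return the name printed for an unmerged event.

    Alias events get the PMU name appended in brackets. Dynamic PMU
    events ('pmu_spec/config/') get the PMU spec replaced by the name
    of the PMU instance the event was actually created on.
    """
    if is_alias:
        return f"{event_name} [{pmu_name}]"
    # Split off the user-supplied PMU spec before the first '/'.
    _spec, sep, rest = event_name.partition("/")
    return f"{pmu_name}{sep}{rest}"


print(display_name("l3cache/read-miss/", "l3cache_0_3", is_alias=False))
print(display_name("l3cache_read_miss", "l3cache_0_3", is_alias=True))
```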
    • perf pmu: Support wildcards on pmu name in dynamic pmu events · b2b9d3a3
      Agustin Vega-Frias authored
      Since v4.12, the event parsing code for dynamic PMU events has
      supported prefix-based matching of multiple PMUs when creating dynamic
      events. E.g., in a system with the following dynamic PMUs:
      
          mypmu_0
          mypmu_1
          mypmu_2
          mypmu_4
      
      passing mypmu/<config>/ as an event spec results in the creation of
      the event in all of the PMUs. This change extends the matching by using
      fnmatch, so glob-like expressions can be used to create events in
      multiple PMUs. E.g., in the system described above, a user who only
      wants to create the event in mypmu_0 and mypmu_1 can pass
      mypmu_[01]/<config>/.

      Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Timur Tabi <timur@codeaurora.org>
      Change-Id: Icb25653fc5d5239c20f3bffdfdf4ab4c9c9bb20b
      Link: http://lkml.kernel.org/r/1520454947-16977-1-git-send-email-agustinv@codeaurora.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
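
The PMU name matching described in this commit can be sketched with Python's fnmatch module standing in for the C fnmatch() call. The helper `matching_pmus` is hypothetical, and the sketch assumes that a spec is matched as a glob with an implicit trailing '*', which is one way to keep plain prefixes working alongside glob expressions.

```python
from fnmatch import fnmatchcase


def matching_pmus(spec, pmu_names):
    """Return the PMU names selected by the PMU part of an event spec.

    Assumption: the spec is treated as a glob with an implicit trailing
    '*', so a plain prefix such as 'mypmu' still matches every instance.
    """
    pattern = spec + "*"
    return [name for name in pmu_names if fnmatchcase(name, pattern)]


pmus = ["mypmu_0", "mypmu_1", "mypmu_2", "mypmu_4"]
print(matching_pmus("mypmu", pmus))       # prefix: every mypmu_* instance
print(matching_pmus("mypmu_[01]", pmus))  # glob: only mypmu_0 and mypmu_1
```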
  2. 07 Mar, 2018 20 commits
  3. 06 Mar, 2018 5 commits
    • perf tools: Fix trigger class trigger_on() · de19e5c3
      Adrian Hunter authored
      trigger_on() is meant to mark the trigger as available but not yet
      ready; however, it was putting the trigger into the ready state. perf
      can then segfault if the signal arrives before trigger_ready() is
      called, e.g. (USR2 signal delivery not shown):
      
        $ perf record -e intel_pt//u -S sleep 1
        perf: Segmentation fault
        Obtained 16 stack frames.
        /home/ahunter/bin/perf(sighandler_dump_stack+0x40) [0x4ec550]
        /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
        /home/ahunter/bin/perf(perf_evsel__disable+0x26) [0x4b9dd6]
        /home/ahunter/bin/perf() [0x43a45b]
        /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
        /lib/x86_64-linux-gnu/libc.so.6(__xstat64+0x15) [0x7fa7641d2cc5]
        /home/ahunter/bin/perf() [0x4ec6c9]
        /home/ahunter/bin/perf() [0x4ec73b]
        /home/ahunter/bin/perf() [0x4ec73b]
        /home/ahunter/bin/perf() [0x4ec73b]
        /home/ahunter/bin/perf() [0x4eca15]
        /home/ahunter/bin/perf(machine__create_kernel_maps+0x257) [0x4f0b77]
        /home/ahunter/bin/perf(perf_session__new+0xc0) [0x4f86f0]
        /home/ahunter/bin/perf(cmd_record+0x722) [0x43c132]
        /home/ahunter/bin/perf() [0x4a11ae]
        /home/ahunter/bin/perf(main+0x5d4) [0x427fb4]
      
      Note, for testing purposes, this is hard to hit unless you add some sleep()
      in builtin-record.c before record__open().

      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: stable@vger.kernel.org
      Fixes: 3dcc4436 ("perf tools: Introduce trigger class")
      Link: http://lkml.kernel.org/r/1519807144-30694-1-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
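
The intended state transitions can be modelled with a small state machine. This is a simplified sketch, not the perf trigger class: the `State` names, the `Trigger` class and the `on_signal` handler are assumptions chosen to show why a signal must be ignored while the trigger is only "on".

```python
from enum import Enum


class State(Enum):
    OFF = "off"
    ON = "on"        # available, but not ready to fire
    READY = "ready"  # set-up finished, a signal may act on it
    HIT = "hit"


class Trigger:
    def __init__(self):
        self.state = State.OFF

    def on(self):
        # Behaviour after the fix: only mark the trigger as available.
        # The bug described above corresponds to setting READY here.
        self.state = State.ON

    def ready(self):
        if self.state is not State.OFF:
            self.state = State.READY


def on_signal(trigger):
    """Hypothetical signal handler: act only once the trigger is ready."""
    if trigger.state is not State.READY:
        return False  # too early, resources may not exist yet
    trigger.state = State.HIT
    return True


t = Trigger()
t.on()
print(on_signal(t))  # signal before ready(): ignored
t.ready()
print(on_signal(t))  # signal after ready(): handled
```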
    • perf auxtrace: Prevent decoding when --no-itrace · 2e2967f4
      Adrian Hunter authored
      Prevent auxtrace_queues__process_index() from queuing AUX area data for
      decoding when the --no-itrace option has been used.

      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1520327598-1317-3-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Fix CVS output format for non-supported counters · 40c21898
      Ilya Pronin authored
      When printing stats in CSV mode, 'perf stat' appends extra separators
      when a counter is not supported:
      
      <not supported>,,L1-dcache-store-misses,mesos/bd442f34-2b4a-47df-b966-9b281f9f56fc,0,100.00,,,,
      
      This causes a failure when parsing fields. The number of separators
      should be the same on every line, whether or not the counter is
      supported.

      Signed-off-by: Ilya Pronin <ipronin@twitter.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20180306064353.31930-1-xiyou.wangcong@gmail.com
      Fixes: 92a61f64 ("perf stat: Implement CSV metrics output")
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
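
A consumer of the CSV output could check the invariant stated in this commit (the same number of separators on every line) along these lines. The helper names and the sample lines are made up for illustration and are not real perf output; the check also assumes that no field contains the separator itself.

```python
def separator_counts(lines, sep=","):
    """Return the set of distinct separator counts over non-empty lines."""
    return {line.count(sep) for line in lines if line}


def is_consistent(lines, sep=","):
    """True when every non-empty line has the same number of separators."""
    return len(separator_counts(lines, sep)) <= 1


# Made-up lines: the second line of `bad` carries two extra separators.
good = ["123,,event-a,cgroup-x,0,100.00,,",
        "<not supported>,,event-b,cgroup-x,0,100.00,,"]
bad = ["123,,event-a,cgroup-x,0,100.00,,",
       "<not supported>,,event-b,cgroup-x,0,100.00,,,,"]

print(is_consistent(good))
print(is_consistent(bad))
```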
    • Merge tag 'perf-core-for-mingo-4.17-20180305' of... · 55b4ce61
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.17-20180305' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      - Be more robust when drawing arrows in the annotation TUI, avoiding a
        segfault when jump instructions target addresses in functions other
        than the one currently being annotated. The full fix will come in
        the following days, when jumping to other functions will work as it
        does for call instructions (Arnaldo Carvalho de Melo)
      
      - Allow asking for the maximum allowed sample rate in 'top' and
        'record', i.e. 'perf record -F max' will read the
        kernel.perf_event_max_sample_rate sysctl and use it (Arnaldo Carvalho de Melo)
      
      - When the user specifies a freq above kernel.perf_event_max_sample_rate,
        throttle it down to that max freq and warn the user about it. Also add
        --strict-freq, which selects the previous behaviour of not starting the
        session when the desired freq can't be used (Arnaldo Carvalho de Melo)
      
      - Find 'call' instruction target symbol at parsing time, used so far in
        the TUI, part of the infrastructure changes that will end up allowing
        for jumps to navigate to other functions, just like 'call'
        instructions. (Arnaldo Carvalho de Melo)
      
      - Use xyarray dimensions to iterate fds in 'perf stat' (Andi Kleen)
      
      - Ignore threads for which the current user hasn't permissions when
        enabling system-wide --per-thread (Jin Yao)
      
      - Fix some backtrace perf test cases to use 'perf record' + 'perf script'
        instead, till 'perf trace' starts using ordered_events or equivalent
        to avoid symbol resolving artifacts due to reordering of
        PERF_RECORD_MMAP events (Jiri Olsa)
      
      - Fix crash in 'perf record' pipe mode, it needs to allocate the ID
        array even for a single event, unlike non-pipe mode (Jiri Olsa)
      
      - Make annoying fallback message on older kernels with newer 'perf top'
        binaries trying to use overwrite mode and that not being present
        in the older kernels (Kan Liang)
      
      - Switch last users of old APIs to the newer perf_mmap__read_event()
        one, then discard those old mmap read forward APIs (Kan Liang)
      
      - Fix the usage on the 'perf kallsyms' man page (Sangwon Hong)
      
      - Simplify cgroup arguments when tracking multiple events (weiping zhang)

      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
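
The sample-rate handling summarised in the pull request above ('-F max', throttling and --strict-freq) can be sketched as follows. The function names and the `strict` parameter are assumptions for illustration and are not taken from the perf sources; the path used is the /proc/sys location of the kernel.perf_event_max_sample_rate sysctl.

```python
import warnings


def read_max_sample_rate(path="/proc/sys/kernel/perf_event_max_sample_rate"):
    """Read the sysctl value from /proc (Linux only, not called below)."""
    with open(path) as f:
        return int(f.read().strip())


def resolve_freq(requested, max_rate, strict=False):
    """Return the sampling frequency to use.

    requested: an int, or the string 'max' to ask for the maximum.
    strict: models --strict-freq, refusing instead of throttling.
    """
    if requested == "max":
        return max_rate
    if requested > max_rate:
        if strict:
            raise ValueError(f"freq {requested} exceeds maximum {max_rate}")
        warnings.warn(f"freq {requested} throttled to maximum {max_rate}")
        return max_rate
    return requested


print(resolve_freq("max", 100000))
print(resolve_freq(4000, 100000))
print(resolve_freq(250000, 100000))  # warns and throttles
```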
    • Ingo Molnar
      8af31363
  4. 05 Mar, 2018 12 commits