1. 02 Jul, 2019 17 commits
    • perf diff: Print the basic block cycles diff · b10c78c5
      Jin Yao authored
       $ perf record -b ./div
       $ perf record -b ./div
      
      Following is the default perf diff output
      
       $ perf diff
      
       # Event 'cycles'
       #
       # Baseline  Delta Abs  Shared Object     Symbol
       # ........  .........  ................  ..................................
       #
           48.75%     +0.33%  div               [.] main
            8.21%     -0.20%  div               [.] compute_flag
           19.02%     -0.12%  libc-2.23.so      [.] __random_r
           16.17%     -0.09%  libc-2.23.so      [.] __random
            2.27%     -0.03%  div               [.] rand@plt
                      +0.02%  [i915]            [k] gen8_irq_handler
            5.52%     +0.02%  libc-2.23.so      [.] rand
      
      This patch creates a new computation selection 'cycles'.
      
       $ perf diff -c cycles
      
       # Event 'cycles'
       #
       # Baseline       [Program Block Range] Cycles Diff Shared Object Symbol
       # ........ ....................................... .........................................
       #
           48.75%             [div.c:42 -> div.c:45]  147 div           [.] main
           48.75%             [div.c:31 -> div.c:40]    4 div           [.] main
           48.75%             [div.c:40 -> div.c:40]    0 div           [.] main
           48.75%             [div.c:42 -> div.c:42]    0 div           [.] main
           48.75%             [div.c:42 -> div.c:44]    0 div           [.] main
           19.02% [random_r.c:357 -> random_r.c:360]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:373]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:376]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:380]    0 libc-2.23.so  [.] __random_r
           19.02% [random_r.c:357 -> random_r.c:392]    0 libc-2.23.so  [.] __random_r
           16.17%     [random.c:288 -> random.c:291]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:288 -> random.c:291]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:288 -> random.c:295]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:288 -> random.c:297]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:291 -> random.c:291]    0 libc-2.23.so  [.] __random
           16.17%     [random.c:293 -> random.c:293]    0 libc-2.23.so  [.] __random
            8.21%             [div.c:22 -> div.c:22]  148 div           [.] compute_flag
            8.21%             [div.c:22 -> div.c:25]    0 div           [.] compute_flag
            8.21%             [div.c:27 -> div.c:28]    0 div           [.] compute_flag
            5.52%           [rand.c:26 -> rand.c:27]    0 libc-2.23.so  [.] rand
            5.52%           [rand.c:26 -> rand.c:28]    0 libc-2.23.so  [.] rand
            2.27%         [rand@plt+0 -> rand@plt+0]    0 div           [.] rand@plt
            0.01% [entry_64.S:694 -> entry_64.S:694]   16 [vmlinux]     [k] native_irq_return_iret
            0.00%       [fair.c:7676 -> fair.c:7665]  162 [vmlinux]     [k] update_blocked_averages
      
      "[Program Block Range]" indicates the range of program basic block
      (start -> end). If the source line can be found, it is printed;
      otherwise the symbol+offset is printed instead.
      
       v4:
       ---
       Use source lines or symbol+offset to indicate the basic block. It should
       be easier to understand.
      
       v3:
       ---
       Cast 'struct hist_entry' to 'struct block_hist' in hist_entry__block_fprintf.
       Use symbol_conf.report_block to check if executing hist_entry__block_fprintf.
      
       v2:
       ---
       Keep standard perf diff format and display the 'Baseline' and
       'Shared Object'.
      
      The output is sorted by "Baseline" and the basic blocks in the same
      function are sorted by cycles diff.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-7-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
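The ordering and the block-range labels described in this commit message can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the C code in perf: the helper names (`describe_addr`, `format_block_range`, `sort_rows`) are hypothetical, the rows are copied from the listing above, and sorting by the magnitude of the cycles diff is an assumption.

```python
# Each row: (baseline_percent, symbol, block_range, cycles_diff).
# Values are copied from the 'perf diff -c cycles' listing in the commit message.
rows = [
    (8.21, "compute_flag", "[div.c:22 -> div.c:25]", 0),
    (48.75, "main", "[div.c:31 -> div.c:40]", 4),
    (8.21, "compute_flag", "[div.c:22 -> div.c:22]", 148),
    (48.75, "main", "[div.c:42 -> div.c:45]", 147),
    (48.75, "main", "[div.c:40 -> div.c:40]", 0),
]


def describe_addr(srcline, symbol, offset):
    """Prefer 'file:line'; fall back to 'symbol+offset' when no source line is known."""
    return srcline if srcline else f"{symbol}+{offset}"


def format_block_range(start, end):
    """start/end are (srcline_or_None, symbol, offset) tuples."""
    return f"[{describe_addr(*start)} -> {describe_addr(*end)}]"


def sort_rows(rows):
    # Primary key: Baseline, descending.
    # Secondary key: magnitude of the cycles diff, descending (assumed).
    return sorted(rows, key=lambda r: (-r[0], -abs(r[3])))


for baseline, symbol, block_range, diff in sort_rows(rows):
    print(f"{baseline:6.2f}%  {block_range:>26}  {diff:4d}  {symbol}")
```

Rows of the function with the larger Baseline come first, and within one function the block with the largest cycles diff leads.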
    • perf diff: Link same basic blocks among different data · f3810817
      Jin Yao authored
      The target is to compare the performance difference (cycles diff) for
      the same basic blocks in different data files.
      
      The same basic block means same function, same start address and same
      end address. This patch finds the same basic blocks in different data
      files, links them together and re-sorts them by the cycles diff.
      
       v3:
       ---
       The block data is maintained by the new structure 'block_hist',
       so this patch is updated accordingly.
      
       v2:
       ---
       Since the basic block hists are now per symbol,
       the patch only links the basic block hists for the same
       symbol in different data files.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-6-git-send-email-yao.jin@linux.intel.com
      [ sym->name is an array, not a pointer, so no need to check it for NULL, fixes the build in some distros ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
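The linking rule in this commit (same function, same start address, same end address) amounts to a keyed lookup. Below is a minimal Python sketch under stated assumptions: the function name `link_blocks`, the sample addresses and cycle counts, and the sign convention (new minus old) are all hypothetical and not taken from the perf sources.

```python
def link_blocks(base, other):
    """Pair up identical basic blocks from two data files.

    base/other map (symbol, start_addr, end_addr) -> cycles.
    Returns [(key, cycles_diff), ...] sorted by the magnitude of the diff.
    """
    linked = []
    for key, base_cycles in base.items():
        # Same function, same start address and same end address.
        if key in other:
            linked.append((key, other[key] - base_cycles))
    linked.sort(key=lambda item: abs(item[1]), reverse=True)
    return linked


# Made-up sample data for two recordings of the same program.
old = {
    ("main", 0x10, 0x2F): 100,
    ("main", 0x30, 0x4A): 20,
    ("compute_flag", 0x00, 0x12): 50,  # only in the old file: not linked
}
new = {
    ("main", 0x30, 0x4A): 24,
    ("main", 0x10, 0x2F): 247,
    ("rand", 0x00, 0x08): 5,           # only in the new file: not linked
}

result = link_blocks(old, new)
```

Blocks that appear in only one data file have no partner and are left out of the linked list in this sketch.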
    • perf diff: Use hists to manage basic blocks per symbol · 99150a1f
      Jin Yao authored
      The hist__account_cycles() can account cycles per basic block. The basic
      block information is saved in the cycles_hist structure.
      
      This patch processes each symbol, gets the basic blocks from cycles_hist
      and adds the basic block entries to a new hists (in 'struct block_hist').
      A hists is used because we need to compare, sort and print the basic
      blocks later.
      
       v6:
       ---
       Since 'ops' argument is removed from hists__add_entry_block,
       update the code accordingly. No functional change.
      
       v5:
       ---
       Since we still carry block_info in 'struct hist_entry',
       we don't need to use our own new/free ops for hist entries.
       The block_info is released in hist_entry__delete.
      
       v3:
       ---
       1. In v2, we put the block data in 'struct hist_entry', but
       that is not a good design. In v3, we create a new
       'struct block_hist' and cast the 'struct hist_entry' to
       'struct block_hist' in some places, which avoids adding
       new fields to 'struct hist_entry'.
      
       2. abs() -> labs(), in block_cycles_diff_cmp().
      
       v2:
       ---
       v1 adds the basic block entries to per data-file hists
       but v2 adds the basic block entries to per symbol hists.
       This keeps the current perf-diff format. The result is
       shown in the next patches.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-5-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf diff: Check if all data files with branch stacks · 30d81553
      Jin Yao authored
      We will expand perf diff to support diffing the cycles of individual
      program blocks, which requires all data files to have branch stacks.
      
      This patch checks HEADER_BRANCH_STACK in the header, and only sets the
      flag has_br_stack when HEADER_BRANCH_STACK is set in all data files.
      
       v2:
       ---
       Move check_file_brstack() from __cmd_diff() to cmd_diff(),
       because a later patch will check the flag 'has_br_stack' before
       ui_init().
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
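The has_br_stack rule reduces to an "all files" check. The Python sketch below illustrates that logic only; representing each file header as a set of feature names, and treating an empty file list as "no branch stacks", are assumptions made for the example.

```python
HEADER_BRANCH_STACK = "HEADER_BRANCH_STACK"


def all_have_branch_stack(file_features):
    """file_features holds one set of header feature names per data file.

    The flag is only true when every data file carries the feature.
    An empty list yields False (an assumption for this sketch).
    """
    return bool(file_features) and all(
        HEADER_BRANCH_STACK in features for features in file_features
    )


# Both files were recorded with 'perf record -b'.
both = [{HEADER_BRANCH_STACK, "HEADER_BUILD_ID"}, {HEADER_BRANCH_STACK}]
# The second file lacks branch stacks, so block-level diff is not possible.
mixed = [{HEADER_BRANCH_STACK}, {"HEADER_BUILD_ID"}]

has_br_stack = all_have_branch_stack(both)
```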
    • perf hists: Add block_info in hist_entry · fe96245c
      Jin Yao authored
      The block_info contains the program basic block information, i.e.,
      the start address and the end address of the basic block and
      how many cycles it takes.
      
      We need to compare, sort and even print out the basic blocks in some
      order, e.g. sorted by cycles.
      
      For this purpose, we add a block_info field to hist_entry. In order not
      to impact the current interface, we create a new function,
      hists__add_entry_block.
      
       v6:
       ---
       Remove the 'ops' argument in hists__add_entry_block
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbol: Create block_info structure · 0cec2447
      Jin Yao authored
      'perf diff' currently can only diff symbols (functions).
      
      We should expand it to diff the cycles of individual program blocks as
      reported by timed LBR. This would allow changes in specific code to be
      identified accurately.
      
      We need a new structure to maintain the basic block information, such as
      the symbol (function), the start/end address of the block, and cycles.
      This patch creates this structure along with some ops.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1561713784-30533-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
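As a rough picture of what such a structure carries, here is a Python sketch with the four pieces of data the message lists (symbol, start address, end address, cycles) plus two small helper operations. The class and function names are hypothetical and do not mirror the C definition in perf; the sample values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockInfo:
    """Illustrative stand-in for a basic-block record (not the perf struct)."""
    symbol: str      # function the block belongs to
    start: int       # start address of the block
    end: int         # end address of the block
    cycles: int = 0  # cycles attributed to the block

    def key(self):
        # Identity of a block: function plus address range, cycles excluded.
        return (self.symbol, self.start, self.end)


def same_block(a, b):
    """Two records describe the same block when their keys match."""
    return a.key() == b.key()


def sort_by_cycles(blocks):
    """Most expensive blocks first."""
    return sorted(blocks, key=lambda b: b.cycles, reverse=True)


a = BlockInfo("main", 0x10, 0x2F, cycles=100)
b = BlockInfo("main", 0x10, 0x2F, cycles=247)
c = BlockInfo("main", 0x30, 0x4A, cycles=150)
```

Keeping cycles out of the key lets the same block be recognised across two recordings even though its cost differs.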
    • objtool: Fix build by linking against tools/lib/ctype.o sources · 0c69b931
      Jiri Olsa authored
      Fix the objtool build: the patch that added the isspace() call
      introduced a dependency on _ctype.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: André Goddard Rosa <andre.goddard@gmail.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 7bd330de ("tools lib: Adopt skip_spaces() from the kernel sources")
      Link: http://lkml.kernel.org/r/20190702121240.GB12694@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jevents: Use nonlocal include statements in pmu-events.c · 06c642c0
      Luke Mujica authored
      Change pmu-events.c to not use local include statements. The code that
      creates the include statements for pmu-events.c is in jevents.c.
      
      pmu-events.c is a generated file, and for build systems that put
      generated files in a separate directory, include statements with local
      pathing cannot find non-generated files.
      Signed-off-by: Luke Mujica <lukemujica@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Numfor Mbiziwo-Tiapo <nums@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lkml.kernel.org/n/tip-prgnwmaoo1pv9zz4vnv1bjaj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Add csky support · aa23aa55
      Mao Han authored
      This patch adds basic arch initialization and instruction association
      support for the csky CPU architecture.
      
      E.g.:
      
        $ perf annotate --stdio2
        Samples: 161  of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.):
        40250000, [percent: local period]
        test_4() /usr/lib/perf-test/callchain_test
        Percent
      
                    Disassembly of section .text:
      
                    00008420 <test_4>:
                  test_4():
                      subi  sp, sp, 4
                      st.w  r8, (sp, 0x0)
                      mov   r8, sp
                      subi  sp, sp, 8
                      subi  r3, r8, 4
                      movi  r2, 0
                      st.w  r2, (r3, 0x0)
                    ↓ br    2e
        100.00  14:   subi  r3, r8, 4
                      ld.w  r2, (r3, 0x0)
                      subi  r3, r8, 8
                      st.w  r2, (r3, 0x0)
                      subi  r3, r8, 4
                      ld.w  r3, (r3, 0x0)
                      addi  r2, r3, 1
                      subi  r3, r8, 4
                      st.w  r2, (r3, 0x0)
                2e:   subi  r3, r8, 4
                      ld.w  r2, (r3, 0x0)
                      lrw   r3, 0x98967f    // 8598 <main+0x28>
                      cmplt r3, r2
                    ↑ bf    14
                      mov   r0, r0
                      mov   r0, r0
                      mov   sp, r8
                      ld.w  r8, (sp, 0x0)
                      addi  sp, sp, 4
                    ← rts
      Signed-off-by: Mao Han <han_mao@c-sky.com>
      Acked-by: Guo Ren <guoren@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-csky@vger.kernel.org
      Link: http://lkml.kernel.org/r/d874d7782d9acdad5d98f2f5c4a6fb26fbe41c5d.1561531557.git.han_mao@c-sky.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Fix metrics with --no-merge · e3a94273
      Andi Kleen authored
      Since 8c5421c0 ("perf pmu: Display pmu name when printing
      unmerged events in stat"), using --no-merge adds the PMU name to the
      evsel name.
      
      This breaks the metric value lookup because the parser doesn't know
      about this.
      
      Remove the extra suffixes for the metric evaluation.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Fixes: 8c5421c0 ("perf pmu: Display pmu name when printing unmerged events in stat")
      Link: http://lkml.kernel.org/r/20190624193711.35241-5-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
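The idea of this fix, dropping the PMU decoration before the name is used as a metric lookup key, can be sketched in Python as follows. The exact decoration is an assumption here: the sketch supposes the PMU name is appended as ` [pmu]`, and both the helper name `metric_lookup_name` and the sample event and PMU names are hypothetical.

```python
import re

# Assumed decoration: the PMU name appended in square brackets,
# e.g. "event_name [pmu_name]". This format is an assumption of the sketch.
_PMU_SUFFIX = re.compile(r" \[[^\]]*\]$")


def metric_lookup_name(evsel_name):
    """Return the event name with a trailing ' [pmu]' decoration removed."""
    return _PMU_SUFFIX.sub("", evsel_name)


# Hypothetical names, used only to exercise the helper.
decorated = "unc_clockticks [uncore_0]"
plain = "inst_retired.any"

lookup_key = metric_lookup_name(decorated)
```

Names without the decoration pass through unchanged, so the same helper can be applied whether or not --no-merge was given.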
    • perf stat: Fix group lookup for metric group · 2f87f33f
      Andi Kleen authored
      The metric group code tries to find a group it added earlier in the
      evlist. Fix the lookup to handle groups that partially overlap
      correctly. When a substring match fails and we reset the match, we have
      to compare the first element again.
      
      I also renamed the find_evsel function to find_evsel_group to make its
      purpose clearer.
      
      With the earlier changes this fixes:
      
      Before:
      
        % perf stat -M UPI,IPC sleep 1
        ...
               1,032,922      uops_retired.retire_slots #      1.1 UPI
               1,896,096      inst_retired.any
               1,896,096      inst_retired.any
               1,177,254      cpu_clk_unhalted.thread
      
      After:
      
        % perf stat -M UPI,IPC sleep 1
        ...
              1,013,193      uops_retired.retire_slots #      1.1 UPI
                 932,033      inst_retired.any
                 932,033      inst_retired.any          #      0.9 IPC
               1,091,245      cpu_clk_unhalted.thread
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Fixes: b18f3e36 ("perf stat: Support JSON metrics in perf stat")
      Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
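The "compare the first element again" rule is easy to get wrong, so here is a small Python sketch of a sequential group search with and without that re-check. It is a simplified model, not the perf function: the name `find_group`, the `recheck_first` switch, and the assumption that the event list is ordered as in the example output above are all illustrative.

```python
def find_group(evlist, ids, recheck_first=True):
    """Return the index where `ids` occurs as a consecutive run in `evlist`.

    Returns None when no such run exists. With recheck_first=False the
    function shows the failure mode: after a failed partial match the
    current event is skipped instead of being compared with ids[0].
    """
    if not ids:
        return None
    matched = 0   # number of ids matched so far
    start = None  # index of the first event of the candidate run
    for pos, name in enumerate(evlist):
        if name == ids[matched]:
            if matched == 0:
                start = pos
            matched += 1
            if matched == len(ids):
                return start
        else:
            # Mismatch: discard the partial match.
            matched = 0
            start = None
            # The current event may itself begin a new run, so compare it
            # with the first id again.
            if recheck_first and name == ids[0]:
                start = pos
                matched = 1
    return None


# Event order as in the example output of 'perf stat -M UPI,IPC'.
evlist = [
    "uops_retired.retire_slots",
    "inst_retired.any",
    "inst_retired.any",
    "cpu_clk_unhalted.thread",
]
ipc_ids = ["inst_retired.any", "cpu_clk_unhalted.thread"]

fixed = find_group(evlist, ipc_ids)
broken = find_group(evlist, ipc_ids, recheck_first=False)
```

Without the re-check, the second `inst_retired.any` is consumed by the failed partial match and the IPC group is never found, which matches the missing IPC line in the "Before" output.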
    • perf stat: Don't merge events in the same PMU · 6c5f4e5c
      Andi Kleen authored
      Event merging is mainly to collapse similar events in lots of different
      duplicated PMUs.
      
      It can break metric displaying. It's possible for two metrics to have
      the same event, and when the two events happen in a row the second
      wouldn't be displayed.  This would also not show the second metric.
      
      To avoid this, don't merge events in the same PMU. This makes sense: if
      we have multiple events in the same PMU there is likely some reason for
      it (e.g. using multiple groups), and we had better not merge them.
      
      While in theory it would be possible to construct metrics that have
      events with the same name in different PMUs, no current metrics have
      this problem.
      
      This is the fix for perf stat -M UPI,IPC (another bug fix is also needed
      for it to work completely).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Fixes: 430daf2d ("perf stat: Collapse identically named events")
      Link: http://lkml.kernel.org/r/20190624193711.35241-3-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
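The merge rule stated here, collapse identically named events only when they come from different PMUs, can be modelled in a few lines of Python. This is a sketch of the rule and not the perf implementation; the function name `collapse`, the tuple layout and the event and PMU names are made up for the example.

```python
def collapse(events):
    """Merge identically named events, but never two from the same PMU.

    events: list of (name, pmu, count) tuples in display order.
    Returns a list of dicts with the summed count and the PMUs merged in.
    """
    merged = []
    for name, pmu, count in events:
        for entry in merged:
            # Same name and a PMU not yet part of this entry: collapse.
            if entry["name"] == name and pmu not in entry["pmus"]:
                entry["count"] += count
                entry["pmus"].add(pmu)
                break
        else:
            # No suitable entry: keep the event as its own line.
            merged.append({"name": name, "pmus": {pmu}, "count": count})
    return merged


# Hypothetical input: one uncore event duplicated across two PMU instances,
# and one core event used twice (e.g. by two different metrics).
events = [
    ("unc_clockticks", "uncore_0", 10),
    ("unc_clockticks", "uncore_1", 12),
    ("inst_retired.any", "cpu", 5),
    ("inst_retired.any", "cpu", 5),
]

lines = collapse(events)
```

The duplicated uncore event collapses into one line, while the two `inst_retired.any` events on the same PMU stay separate so that each metric still has an event to attach to.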
    • perf stat: Make metric event lookup more robust · 145c407c
      Andi Kleen authored
      After setting up metric groups through the event parser, the metricgroup
      code looks them up again in the event list.
      
      Make sure we only look up events that haven't been used by some other
      metric. The data structures currently cannot handle more than one metric
      per event. This avoids problems with multiple events partially
      overlapping.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Link: http://lkml.kernel.org/r/20190624193711.35241-2-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
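"Only look up events that haven't been used by some other metric" can be pictured as a lookup that tracks which list positions are already taken. The Python sketch below illustrates that bookkeeping only; `assign_events` is a hypothetical name, and the metric-to-event mapping is inferred from the UPI and IPC example rather than read from the metric definitions.

```python
def assign_events(evlist, metrics):
    """Give each metric its own events from the event list.

    evlist:  event names in list order.
    metrics: dict of metric name -> list of event names it needs.
    Returns dict of metric name -> list of evlist positions.
    """
    used = set()
    result = {}
    for metric, ids in metrics.items():
        picked = []
        for wanted in ids:
            for pos, name in enumerate(evlist):
                # Skip events already claimed by another metric.
                if name == wanted and pos not in used:
                    used.add(pos)
                    picked.append(pos)
                    break
        result[metric] = picked
    return result


evlist = [
    "uops_retired.retire_slots",
    "inst_retired.any",
    "inst_retired.any",
    "cpu_clk_unhalted.thread",
]
# Assumed event sets for the two metrics in the example.
metrics = {
    "UPI": ["uops_retired.retire_slots", "inst_retired.any"],
    "IPC": ["inst_retired.any", "cpu_clk_unhalted.thread"],
}

assignment = assign_events(evlist, metrics)
```

Because the first `inst_retired.any` is already claimed by UPI, IPC is attached to the second one, so each event ends up with at most one metric.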
    • tools lib: Move argv_{split,free} from tools/perf/util/ · 9c10548c
      Arnaldo Carvalho de Melo authored
      This came from the kernel lib/argv_split.c, so move it to
      tools/lib/argv_split.c, to get it closer to the kernel structure.
      
      We need to audit the usage of argv_split() to figure out if it is really
      necessary to have one allocation per argv[] entry. Looking at one of
      its users, I guess that is not the case, and we are probably even leaking
      those allocations by not using argv_free() judiciously; that is for later.
      
      With this we further remove stuff from tools/perf/util/, reducing the
      perf specific codebase and encouraging other tools/ code to use these
      routines so as to keep the style and constructs used with the kernel.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-j479s1ive9h75w5lfg16jroz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Drop strxfrchar(), use strreplace() equivalent from kernel · af0de0c5
      Arnaldo Carvalho de Melo authored
      No change in behaviour intended, just reducing the codebase and using
      something available in tools/lib/.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-oyi6zif3810nwi4uu85odnhv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib: Adopt strreplace() from the kernel · 2a60689a
      Arnaldo Carvalho de Melo authored
      We'll use it to further reduce the size of tools/perf/util/string.c,
      replacing the strxfrchar() equivalent function we have there.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-x3r61ikjrso1buygxwke8id3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
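For readers unfamiliar with the helper, the behaviour of a strreplace()-style routine (replace every occurrence of one character with another, in place) can be modelled in Python on a bytearray. This is a behavioural sketch, not the C source; returning the end position mirrors the kernel helper of that period returning a pointer to the terminating NUL, which is stated here as an assumption.

```python
def strreplace(buf, old, new):
    """Replace every byte equal to `old` with `new`, in place.

    buf: bytearray, modified in place (stands in for a C char buffer).
    old, new: single byte values, e.g. ord("/").
    Returns the index one past the last byte, i.e. where the terminating
    NUL would sit in C (assumed return convention).
    """
    for i, ch in enumerate(buf):
        if ch == old:
            buf[i] = new
    return len(buf)


path = bytearray(b"a/b/c")
end = strreplace(path, ord("/"), ord("_"))
```

Since only single characters are swapped, the buffer length never changes and no allocation is needed.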
    • perf tools: Ditch rtrim(), use strim() from tools/lib · 13c230ab
      Arnaldo Carvalho de Melo authored
      Cleaning up a bit more tools/perf/util/ by using things we got from the
      kernel and have in tools/lib/
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-7hluuoveryoicvkclshzjf1k@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
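The difference between the two helpers can be shown with a small Python model. It assumes that rtrim() only removes trailing whitespace (as its name suggests) and that strim(), as in the kernel, removes both leading and trailing whitespace; the whitespace set below is the C isspace() set in the default locale.

```python
# Characters treated as whitespace by C isspace() in the default locale.
_SPACE = " \t\n\v\f\r"


def rtrim(s):
    """Model of a trailing-only trim."""
    return s.rstrip(_SPACE)


def strim(s):
    """Model of a trim that drops leading and trailing whitespace."""
    return s.strip(_SPACE)


line = "  cycles \n"
right_only = rtrim(line)
both_sides = strim(line)
```

The two give the same result whenever the input has no leading whitespace, which is the case a caller has to check when swapping one helper for the other.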
  2. 26 Jun, 2019 13 commits
  3. 25 Jun, 2019 10 commits