1. 24 Mar, 2020 8 commits
    • Vijay Thakkar's avatar
      perf vendor events amd: Restrict model detection for zen1 based processors · c5f18e9e
      Vijay Thakkar authored
      This patch changes the previous blanket detection of AMD Family 17h
      processors to be more specific to Zen1 core based products only by
      replacing model detection regex pattern [[:xdigit:]]+ with
      ([12][0-9A-F]|[0-9A-F]), restricting to models 0 though 2f only.
      
      This change is required to allow for the addition of separate PMU events
      for Zen2 core based models in the following patches as those belong to
      family 17h but have different PMCs. Current PMU events directory has
      also been renamed to "amdzen1" from "amdfam17h" to reflect this
      specificity.
      
      Note that although this change does not break PMU counters for existing
      zen1 based systems, it does disable the current set of counters for zen2
      based systems. Counters for zen2 have been added in the following
      patches in this patchset.
      Signed-off-by: default avatarVijay Thakkar <vijaythakkar@me.com>
      Acked-by: default avatarKim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jon Grimm <jon.grimm@amd.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c5f18e9e
    • Kajol Jain's avatar
      perf metricgroup: Fix printing event names of metric group with multiple... · 58fc90fd
      Kajol Jain authored
      perf metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events
      
      Commit f01642e4 ("perf metricgroup: Support multiple events for
      metricgroup") introduced support for multiple events in a metric group.
      But with the current upstream, metric events names are not printed
      properly incase we try to run multiple metric groups with overlapping
      event.
      
      With current upstream version, incase of overlapping metric events issue
      is, we always start our comparision logic from start.  So, the events
      which already matched with some metric group also take part in
      comparision logic. Because of that when we have overlapping events, we
      end up matching current metric group event with already matched one.
      
      For example, in skylake machine we have metric event CoreIPC and
      Instructions. Both of them need 'inst_retired.any' event value.  As
      events in Instructions is subset of events in CoreIPC, they endup in
      pointing to same 'inst_retired.any' value.
      
      In skylake platform:
      
      command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
           1,254,992,790      inst_retired.any          # 1254992790.0
                                                          Instructions
                                                        #      1.3 CoreIPC
             977,172,805      cycles
           1,254,992,756      inst_retired.any
      
             1.000802596 seconds time elapsed
      
      command:# sudo ./perf stat -M UPI,IPC sleep 1
      
         Performance counter stats for 'sleep 1':
                 948,650      uops_retired.retire_slots
                 866,182      inst_retired.any          #      0.7 IPC
                 866,182      inst_retired.any
               1,175,671      cpu_clk_unhalted.thread
      
      Patch fixes the issue by adding a new bool pointer 'evlist_used' to keep
      track of events which already matched with some group by setting it
      true.  So, we skip all used events in list when we start comparision
      logic.  Patch also make some changes in comparision logic, incase we get
      a match miss, we discard the whole match and start again with first
      event id in metric event.
      
      With this patch:
      
      In skylake platform:
      
      command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
               3,348,415      inst_retired.any          #      0.3 CoreIPC
              11,779,026      cycles
               3,348,381      inst_retired.any          # 3348381.0
                                                          Instructions
      
             1.001649056 seconds time elapsed
      
      command:# ./perf stat -M UPI,IPC sleep 1
      
       Performance counter stats for 'sleep 1':
      
               1,023,148      uops_retired.retire_slots #      1.1 UPI
                 924,976      inst_retired.any
                 924,976      inst_retired.any          #      0.6 IPC
               1,489,414      cpu_clk_unhalted.thread
      
             1.003064672 seconds time elapsed
      Signed-off-by: default avatarKajol Jain <kjain@linux.ibm.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200221101121.28920-1-kjain@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      58fc90fd
    • Jin Yao's avatar
      perf stat: Align the output for interval aggregation mode · d13e9e41
      Jin Yao authored
      There is a slight misalignment in -A -I output.
      
      For example:
      
       # perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
      
       #           time CPU                    counts unit events
            1.000440863 CPU0               1,068,388      cpu/event=cpu-cycles/
            1.000440863 CPU1                 875,954      cpu/event=cpu-cycles/
            1.000440863 CPU2               3,072,538      cpu/event=cpu-cycles/
            1.000440863 CPU3               4,026,870      cpu/event=cpu-cycles/
            1.000440863 CPU4               5,919,630      cpu/event=cpu-cycles/
            1.000440863 CPU5               2,714,260      cpu/event=cpu-cycles/
            1.000440863 CPU6               2,219,240      cpu/event=cpu-cycles/
            1.000440863 CPU7               1,299,232      cpu/event=cpu-cycles/
      
      The value of counts is not aligned with the column "counts" and
      the event name is not aligned with the column "events".
      
      With this patch, the output is,
      
       # perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
      
       #           time CPU                    counts unit events
            1.000423009 CPU0                  997,421      cpu/event=cpu-cycles/
            1.000423009 CPU1                1,422,042      cpu/event=cpu-cycles/
            1.000423009 CPU2                  484,651      cpu/event=cpu-cycles/
            1.000423009 CPU3                  525,791      cpu/event=cpu-cycles/
            1.000423009 CPU4                1,370,100      cpu/event=cpu-cycles/
            1.000423009 CPU5                  442,072      cpu/event=cpu-cycles/
            1.000423009 CPU6                  205,643      cpu/event=cpu-cycles/
            1.000423009 CPU7                1,302,250      cpu/event=cpu-cycles/
      
      Now output is aligned.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200218071614.25736-1-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d13e9e41
    • Jin Yao's avatar
      perf report/top TUI: Support hotkeys to let user select any event for sorting · dbddf174
      Jin Yao authored
      When performing "perf report --group", it shows the event group information
      together. In previous patch, we have supported a new option "--group-sort-idx"
      to sort the output by the event at the index n in event group.
      
      It would be nice if we can use a hotkey in browser to select a event
      to sort.
      
      For example,
      
        # perf report --group
      
       Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
                              Overhead  Command    Shared Object            Symbol
        92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
         3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
         1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
         1.56%   0.01%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494ce
         1.56%   0.00%   0.00%   0.00%  mgen       [kernel.kallsyms]        [k] task_tick_fair
         0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
         0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
         0.00%   0.03%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] g_main_context_check
         0.00%   0.03%   0.00%   0.00%  swapper    [kernel.kallsyms]        [k] apic_timer_interrupt
         0.00%   0.03%   0.00%   0.00%  swapper    [kernel.kallsyms]        [k] check_preempt_curr
      
      When user press hotkey '3' (event index, starting from 0), it indicates
      to sort output by the forth event in group.
      
        Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
                              Overhead  Command    Shared Object            Symbol
        92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
         0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
         3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
         0.00%   0.00%   0.00%   0.06%  swapper    [kernel.kallsyms]        [k] hrtimer_start_range_ns
         1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
         0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] update_curr
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] apic_timer_interrupt
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] native_apic_msr_eoi_write
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] __update_load_avg_se
      
       v6:
       ---
       Jiri provided a good improvement to eliminate unneeded refresh.
       This improvement is added to v6.
      
       v2:
       ---
       1. Report warning at helpline when index is invalid.
       2. Report warning at helpline when it's not group event.
       3. Use "case '0' ... '9'" to refine the code
       4. Split K_RELOAD implementation to another patch.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200220013616.19916-4-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      dbddf174
    • Jin Yao's avatar
      perf report: Support a new key to reload the browser · 5e3b810a
      Jin Yao authored
      Sometimes we may need to reload the browser to update the output since
      some options are changed.
      
      This patch creates a new key K_RELOAD. Once the __cmd_report() returns
      K_RELOAD, it would repeat the whole process, such as, read samples from
      data file, sort the data and display in the browser.
      
       v5:
       ---
       1. Fix the 'make NO_SLANG=1' error. Define K_RELOAD in util/hist.h.
       2. Skip setup_sorting() in repeat path if last key is K_RELOAD.
      
       v4:
       ---
       Need to quit in perf_evsel_menu__run if key is K_RELOAD.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200220013616.19916-3-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5e3b810a
    • Jin Yao's avatar
      perf report: Allow specifying event to be used as sort key in --group output · 429a5f9d
      Jin Yao authored
      When performing "perf report --group", it shows the event group
      information together. By default, the output is sorted by the first
      event in group.
      
      It would be nice for user to select any event for sorting. This patch
      introduces a new option "--group-sort-idx" to sort the output by the
      event at the index n in event group.
      
      For example,
      
      Before:
      
        # perf report --group --stdio
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
        # Event count (approx.): 6451235635
        #
        #                         Overhead  Command    Shared Object            Symbol
        # ................................  .........  .......................  ...................................
        #
            92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
             3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
             1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
             1.56%   0.01%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494ce
             1.56%   0.00%   0.00%   0.00%  mgen       [kernel.kallsyms]        [k] task_tick_fair
             0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
             0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
             0.00%   0.03%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] g_main_context_check
             0.00%   0.03%   0.00%   0.00%  swapper    [kernel.kallsyms]        [k] apic_timer_interrupt
             ...
      
      After:
      
        # perf report --group --stdio --group-sort-idx 3
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
        # Event count (approx.): 6451235635
        #
        #                         Overhead  Command    Shared Object            Symbol
        # ................................  .........  .......................  ...................................
        #
            92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
             0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
             3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
             0.00%   0.00%   0.00%   0.06%  swapper    [kernel.kallsyms]        [k] hrtimer_start_range_ns
             1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
             0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] update_curr
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] apic_timer_interrupt
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] native_apic_msr_eoi_write
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] __update_load_avg_se
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] scheduler_tick
      
      Now the output is sorted by the fourth event in group.
      
       v7:
       ---
       Rebase to latest perf/core, no other change.
      
       v4:
       ---
       1. Update Documentation/perf-report.txt to mention
          '--group-sort-idx' support multiple groups with different
          amount of events and it should be used on grouped events.
      
       2. Update __hpp__group_sort_idx(), just return when the
          idx is out of limit.
      
       3. Return failure on symbol_conf.group_sort_idx && !session->evlist->nr_groups.
          So now we don't need to use together with --group.
      
       v3:
       ---
       Refine the code in __hpp__group_sort_idx().
      
       Before:
         for (i = 1; i < nr_members; i++) {
              if (i == idx) {
                      ret = field_cmp(fields_a[i], fields_b[i]);
                      if (ret)
                              goto out;
              }
         }
      
       After:
         if (idx >= 1 && idx < nr_members) {
              ret = field_cmp(fields_a[idx], fields_b[idx]);
              if (ret)
                      goto out;
         }
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200220013616.19916-2-yao.jin@linux.intel.com
      [ Renamed pair_fields_alloc() to hist_entry__new_pair() and combined decl + assignment of vars ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      429a5f9d
    • Jin Yao's avatar
      perf report/top TUI: Support hotkey 'a' for annotation of unresolved addresses · ec0479a6
      Jin Yao authored
      In previous patch, we have supported the annotation functionality even
      without symbols.
      
      For this patch, it supports the hotkey 'a' on address in report view.
      Note that, for branch mode, we only support the annotation for "branch
      to" address.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200227043939.4403-4-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ec0479a6
    • Jin Yao's avatar
      perf report: Support interactive annotation of code without symbols · 7b0a0dcb
      Jin Yao authored
      For perf report on stripped binaries it is currently impossible to do
      annotation. The annotation state is all tied to symbols, but there are
      either no symbols, or symbols are not covering all the code.
      
      We should support the annotation functionality even without symbols.
      
      This patch fakes a symbol and the symbol name is the string of address.
      After that, we just follow current annotation working flow.
      
      For example,
      
      1. perf report
      
        Overhead  Command  Shared Object     Symbol
          20.67%  div      libc-2.27.so      [.] __random_r
          17.29%  div      libc-2.27.so      [.] __random
          10.59%  div      div               [.] 0x0000000000000628
           9.25%  div      div               [.] 0x0000000000000612
           6.11%  div      div               [.] 0x0000000000000645
      
      2. Select the line of "10.59%  div      div               [.] 0x0000000000000628" and ENTER.
      
        Annotate 0x0000000000000628
        Zoom into div thread
        Zoom into div DSO (use the 'k' hotkey to zoom directly into the kernel)
        Browse map details
        Run scripts for samples of symbol [0x0000000000000628]
        Run scripts for all samples
        Switch to another data file in PWD
        Exit
      
      3. Select the "Annotate 0x0000000000000628" and ENTER.
      
      Percent│
             │
             │
             │     Disassembly of section .text:
             │
             │     0000000000000628 <.text+0x68>:
             │       divsd %xmm4,%xmm0
             │       divsd %xmm3,%xmm1
             │       movsd (%rsp),%xmm2
             │       addsd %xmm1,%xmm0
             │       addsd %xmm2,%xmm0
             │       movsd %xmm0,(%rsp)
      
      Now we can see the dump of object starting from 0x628.
      
       v5:
       ---
       Remove the hotkey 'a' implementation from this patch. It
       will be moved to a separate patch.
      
       v4:
       ---
       1. Support the hotkey 'a'. When we press 'a' on address,
          now it supports the annotation.
      
       2. Change the patch title from
          "Support interactive annotation of code without symbols" to
          "perf report: Support interactive annotation of code without symbols"
      
       v3:
       ---
       Keep just the ANNOTATION_DUMMY_LEN, and remove the
       opts->annotate_dummy_len since it's the "maybe in future
       we will provide" feature.
      
       v2:
       ---
       Fix a crash issue when annotating an address in "unknown" object.
      
       The steps to reproduce this issue:
      
       perf record -e cycles:u ls
       perf report
      
          75.29%  ls       ld-2.27.so        [.] do_lookup_x
          23.64%  ls       ld-2.27.so        [.] __GI___tunables_init
           1.04%  ls       [unknown]         [k] 0xffffffff85c01210
           0.03%  ls       ld-2.27.so        [.] _start
      
       When annotating 0xffffffff85c01210, the crash happens.
      
       v2 adds checking for ms->map in add_annotate_opt(). If the object is
       "unknown", ms->map is NULL.
      
      Committer notes:
      
      Renamed new_annotate_sym() to symbol__new_unresolved().
      
      Use PRIx64 to fix this issue in some 32-bit arches:
      
        ui/browsers/hists.c: In function 'symbol__new_unresolved':
        ui/browsers/hists.c:2474:38: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
          snprintf(name, sizeof(name), "%-#.*lx", BITS_PER_LONG / 4, addr);
                                        ~~~~~~^                      ~~~~
                                        %-#.*llx
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: default avatarRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200227043939.4403-3-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7b0a0dcb
  2. 23 Mar, 2020 3 commits
    • Jin Yao's avatar
      perf report: Print al_addr when symbol is not found · 443bc639
      Jin Yao authored
      For branch mode, if the symbol is not found, it prints
      the address.
      
      For example, 0x0000555eee0365a0 in below output.
      
        Overhead  Command  Source Shared Object  Source Symbol                            Target Symbol
          17.55%  div      libc-2.27.so          [.] __random                             [.] __random
           6.11%  div      div                   [.] 0x0000555eee0365a0                   [.] rand
           6.10%  div      libc-2.27.so          [.] rand                                 [.] 0x0000555eee036769
           5.80%  div      libc-2.27.so          [.] __random_r                           [.] __random
           5.72%  div      libc-2.27.so          [.] __random                             [.] __random_r
           5.62%  div      libc-2.27.so          [.] __random_r                           [.] __random_r
           5.38%  div      libc-2.27.so          [.] __random                             [.] rand
           4.56%  div      libc-2.27.so          [.] __random                             [.] __random
           4.49%  div      div                   [.] 0x0000555eee036779                   [.] 0x0000555eee0365ff
           4.25%  div      div                   [.] 0x0000555eee0365fa                   [.] 0x0000555eee036760
      
      But it's not very easy to understand what the instructions
      are in the binary. So this patch uses the al_addr instead.
      
      With this patch, the output is
      
        Overhead  Command  Source Shared Object  Source Symbol                            Target Symbol
          17.55%  div      libc-2.27.so          [.] __random                             [.] __random
           6.11%  div      div                   [.] 0x00000000000005a0                   [.] rand
           6.10%  div      libc-2.27.so          [.] rand                                 [.] 0x0000000000000769
           5.80%  div      libc-2.27.so          [.] __random_r                           [.] __random
           5.72%  div      libc-2.27.so          [.] __random                             [.] __random_r
           5.62%  div      libc-2.27.so          [.] __random_r                           [.] __random_r
           5.38%  div      libc-2.27.so          [.] __random                             [.] rand
           4.56%  div      libc-2.27.so          [.] __random                             [.] __random
           4.49%  div      div                   [.] 0x0000000000000779                   [.] 0x00000000000005ff
           4.25%  div      div                   [.] 0x00000000000005fa                   [.] 0x0000000000000760
      
      Now we can use objdump to dump the object starting from 0x5a0.
      
      For example,
      objdump -d --start-address 0x5a0 div
      
      00000000000005a0 <rand@plt>:
       5a0:   ff 25 2a 0a 20 00       jmpq   *0x200a2a(%rip)        # 200fd0 <__cxa_finalize@plt+0x200a20>
       5a6:   68 02 00 00 00          pushq  $0x2
       5ab:   e9 c0 ff ff ff          jmpq   570 <srand@plt-0x10>
       ...
      
      Committer testing:
      
        [root@seventh ~]# perf record -a -b sleep 1
        [root@seventh ~]# perf report --header-only | grep cpudesc
        # cpudesc : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
        [root@seventh ~]# perf evlist -v
        cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
        [root@seventh ~]#
      
      Before:
      
        [root@seventh ~]# perf report --stdio --dso libsystemd-shared-241.so | head -20
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 2K of event 'cycles'
        # Event count (approx.): 2240
        #
        # Overhead  Command          Source Shared Object      Source Symbol           Target Symbol           Basic Block Cycles
        # ........  ...............  ........................  ......................  ......................  ..................
        #
             0.13%  systemd-journal  libc-2.29.so              [.] cfree@GLIBC_2.2.5   [.] _int_free           1
             0.09%  systemd          libsystemd-shared-241.so  [.] 0x00007fe406465c82  [.] 0x00007fe406465d80  1
             0.09%  systemd          libsystemd-shared-241.so  [.] 0x00007fe406465ded  [.] 0x00007fe406465c30  1
             0.09%  systemd          libsystemd-shared-241.so  [.] 0x00007fe406465e4e  [.] 0x00007fe406465de0  1
             0.09%  systemd-journal  systemd-journald          [.] free@plt            [.] cfree@GLIBC_2.2.5   1
             0.09%  systemd-journal  libc-2.29.so              [.] _int_free           [.] _int_free           18
             0.09%  systemd-journal  libc-2.29.so              [.] _int_free           [.] _int_free           2
             0.04%  systemd          libsystemd-shared-241.so  [.] bus_resolve@plt     [.] bus_resolve         204
             0.04%  systemd          libsystemd-shared-241.so  [.] getpid_cached@plt   [.] getpid_cached       7
        [root@seventh ~]#
      
      After:
      
        [root@seventh ~]# perf report --stdio --dso libsystemd-shared-241.so | head -20
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 2K of event 'cycles'
        # Event count (approx.): 2240
        #
        # Overhead  Command          Source Shared Object      Source Symbol           Target Symbol           Basic Block Cycles
        # ........  ...............  ........................  ......................  ......................  ..................
        #
             0.13%  systemd-journal  libc-2.29.so              [.] cfree@GLIBC_2.2.5   [.] _int_free           1
             0.09%  systemd          libsystemd-shared-241.so  [.] 0x00000000000f7c82  [.] 0x00000000000f7d80  1
             0.09%  systemd          libsystemd-shared-241.so  [.] 0x00000000000f7ded  [.] 0x00000000000f7c30  1
             0.09%  systemd          libsystemd-shared-241.so  [.] 0x00000000000f7e4e  [.] 0x00000000000f7de0  1
             0.09%  systemd-journal  systemd-journald          [.] free@plt            [.] cfree@GLIBC_2.2.5   1
             0.09%  systemd-journal  libc-2.29.so              [.] _int_free           [.] _int_free           18
             0.09%  systemd-journal  libc-2.29.so              [.] _int_free           [.] _int_free           2
             0.04%  systemd          libsystemd-shared-241.so  [.] bus_resolve@plt     [.] bus_resolve         204
             0.04%  systemd          libsystemd-shared-241.so  [.] getpid_cached@plt   [.] getpid_cached       7
        [root@seventh ~]#
      
      Lets use -v to get full paths and then try objdump on the unresolved address:
      
        [root@seventh ~]# perf report -v --stdio --dso libsystemd-shared-241.so |& grep libsystemd-shared-241.so | tail -1
           0.04% systemd-journal /usr/lib/systemd/libsystemd-shared-241.so 0x80c1a B [.] 0x0000000000080c1a 0x80a95 B [.] 0x0000000000080a95 61
        [root@seventh ~]#
      
        [root@seventh ~]# objdump -d --start-address 0x00000000000f7d80 /usr/lib/systemd/libsystemd-shared-241.so | head -20
      
        /usr/lib/systemd/libsystemd-shared-241.so:     file format elf64-x86-64
      
        Disassembly of section .text:
      
        00000000000f7d80 <proc_cmdline_parse_given@@SD_SHARED+0x330>:
           f7d80:	41 39 11             	cmp    %edx,(%r9)
           f7d83:	0f 84 ff fe ff ff    	je     f7c88 <proc_cmdline_parse_given@@SD_SHARED+0x238>
           f7d89:	4c 8d 05 97 09 0c 00 	lea    0xc0997(%rip),%r8        # 1b8727 <utf8_skip_data@@SD_SHARED+0x3147>
           f7d90:	b9 49 00 00 00       	mov    $0x49,%ecx
           f7d95:	48 8d 15 c9 f5 0b 00 	lea    0xbf5c9(%rip),%rdx        # 1b7365 <utf8_skip_data@@SD_SHARED+0x1d85>
           f7d9c:	31 ff                	xor    %edi,%edi
           f7d9e:	48 8d 35 9b ff 0b 00 	lea    0xbff9b(%rip),%rsi        # 1b7d40 <utf8_skip_data@@SD_SHARED+0x2760>
           f7da5:	e8 a6 d6 f4 ff       	callq  45450 <log_assert_failed_realm@plt>
           f7daa:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
           f7db0:	41 56                	push   %r14
           f7db2:	41 55                	push   %r13
           f7db4:	41 54                	push   %r12
           f7db6:	55                   	push   %rbp
        [root@seventh ~]#
      
      If we tried the the reported address before this patch:
      
        [root@seventh ~]# objdump -d --start-address 0x00007fe406465d80 /usr/lib/systemd/libsystemd-shared-241.so | head -20
      
        /usr/lib/systemd/libsystemd-shared-241.so:     file format elf64-x86-64
      
        [root@seventh ~]#
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: default avatarRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200227043939.4403-2-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      443bc639
    • Leo Yan's avatar
      perf symbols: Consolidate symbol fixup issue · 7eec00a7
      Leo Yan authored
      After copying Arm64's perf archive with object files and perf.data file
      to x86 laptop, the x86's perf kernel symbol resolution fails.  It
      outputs 'unknown' for all symbols parsing.
      
      This issue is root caused by the function elf__needs_adjust_symbols(),
      x86 perf tool uses one weak version, Arm64 (and powerpc) has rewritten
      their own version.  elf__needs_adjust_symbols() decides if need to parse
      symbols with the relative offset address; but x86 building uses the weak
      function which misses to check for the elf type 'ET_DYN', so that it
      cannot parse symbols in Arm DSOs due to the wrong result from
      elf__needs_adjust_symbols().
      
      The DSO parsing should not depend on any specific architecture perf
      building; e.g. x86 perf tool can parse Arm and Arm64 DSOs, vice versa.
      And confirmed by Naveen N. Rao that powerpc64 kernels are not being
      built as ET_DYN anymore and change to ET_EXEC.
      
      This patch removes the arch specific functions for Arm64 and powerpc and
      changes elf__needs_adjust_symbols() as a common function.
      
      In the common elf__needs_adjust_symbols(), it checks an extra condition
      'ET_DYN' for elf header type.  With this fixing, the Arm64 DSO can be
      parsed properly with x86's perf tool.
      
      Before:
      
        # perf script
        main 3258 1 branches:                0 [unknown] ([unknown]) => ffff800010c4665c [unknown] ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c46670 [unknown] ([kernel.kallsyms]) => ffff800010c4eaec [unknown] ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4eaec [unknown] ([kernel.kallsyms]) => ffff800010c4eb00 [unknown] ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4eb08 [unknown] ([kernel.kallsyms]) => ffff800010c4e780 [unknown] ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4e7a0 [unknown] ([kernel.kallsyms]) => ffff800010c4eeac [unknown] ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4eebc [unknown] ([kernel.kallsyms]) => ffff800010c4ed80 [unknown] ([kernel.kallsyms])
      
      After:
      
        # perf script
        main 3258 1 branches:                0 [unknown] ([unknown]) => ffff800010c4665c coresight_timeout+0x54 ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c46670 coresight_timeout+0x68 ([kernel.kallsyms]) => ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) => ffff800010c4eb00 etm4_enable_hw+0x3e0 ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4eb08 etm4_enable_hw+0x3e8 ([kernel.kallsyms]) => ffff800010c4e780 etm4_enable_hw+0x60 ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4e7a0 etm4_enable_hw+0x80 ([kernel.kallsyms]) => ffff800010c4eeac etm4_enable+0x2d4 ([kernel.kallsyms])
        main 3258 1 branches: ffff800010c4eebc etm4_enable+0x2e4 ([kernel.kallsyms]) => ffff800010c4ed80 etm4_enable+0x1a8 ([kernel.kallsyms])
      
      v3: Changed to check for ET_DYN across all architectures.
      
      v2: Fixed Arm64 and powerpc native building.
      Reported-by: default avatarMike Leach <mike.leach@linaro.org>
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Reviewed-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Link: http://lore.kernel.org/lkml/20200306015759.10084-1-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7eec00a7
    • Ian Rogers's avatar
      perf parse-events: Fix 3 use after frees found with clang ASAN · d4953f7e
      Ian Rogers authored
      Reproducible with a clang asan build and then running perf test in
      particular 'Parse event definition strings'.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: clang-built-linux@googlegroups.com
      Link: http://lore.kernel.org/lkml/20200314170356.62914-1-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d4953f7e
  3. 20 Mar, 2020 5 commits
  4. 19 Mar, 2020 4 commits
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-5.7-20200317' of... · d1c9f7d1
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.7-20200317' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      perf record:
      
        Alexey Budankov:
      
        - Fix binding of AIO user space buffers to nodes
      
      maps:
      
        Dominik b. Czarnota:
      
        - Fix off by one in strncpy() size argument.
      
        Arnaldo Carvalho de Melo:
      
        - Use strstarts() to look for Android libraries.
      
        Ian Rogers:
      
        - Give synthetic mmap events an inode generation.
      
      man pages:
      
        Ian Rogers:
      
        - Set man page date to last git commit.
      
      perf test:
      
        Ian Rogers:
      
        - Print if shell directory isn't present.
      
      perf report:
      
        Jin Yao:
      
        - Fix no branch type statistics report issue.
      
      perf expr:
      
        Jiri Olsa:
      
        - Fix copy/paste mistake
      
      vendor events:
      
        Kan Liang:
      
        - Support metric constraints.
      
      vendor events intel:
      
        Kan Liang:
      
        - Add NO_NMI_WATCHDOG metric constraint.
      
      vendor events s390:
      
        Thomas Richter:
      
       - Add new deflate counters for IBM z15.
      
      ARM cs-etm:
      
        Leo Yan:
      
        - Last branch improvements.
      
      intel-pt:
      
        Adrian Hunter:
      
        - Update intel-pt.txt file with new location of the documentation.
      
        - Add Intel PT man page references.
      
        - Rename intel-pt.txt and put it in man page format.
      
      perl scripting:
      
        Michael Petlan:
      
       - Add common_callchain to fix argument order.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      
      Conflicts:
      	tools/perf/util/map.c
      d1c9f7d1
    • Ingo Molnar's avatar
      409e1a31
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-5.7-20200310' of... · fdca7c14
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.7-20200310' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      perf stat:
      
        Jin Yao:
      
        - Show percore counts in per CPU output.
      
      perf report:
      
        Jin Yao:
      
        - Allow selecting which block info columns to report and its order.
      
        - Support color ops to print block percents in color.
      
        - Fix wrong block address comparison in block_info__cmp().
      
      perf annotate:
      
        Ravi Bangoria:
      
        - Get rid of annotation->nr_jumps, unused.
      
      expr:
      
        Jiri Olsa:
      
        - Move expr lexer to flex.
      
      llvm:
      
        Arnaldo Carvalho de Melo:
      
        - Add debug hint message about missing kernel-devel package.
      
      core:
      
        Kan Liang:
      
        - Initial patches to support the recently added PERF_SAMPLE_BRANCH_HW_INDEX
          kernel feature.
      
        - Add check for unexpected use of reserved membrs in event attr, so that in
          the future older perf tools will complain instead of silently try to process
          unknown features.
      
      libapi:
      
        Namhyung Kim:
      
        - Adopt cgroupsfs_find_mountpoint() from tools/perf/util/.
      
      libperf:
      
        Michael Petlan:
      
        - Add counting example.
      
      libtraceevent:
      
         Steven Rostedt (VMware):
      
        - Remove extra '\n' in print_event_time().
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fdca7c14
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-5.6-20200309' of... · db5d85ce
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-5.6-20200309' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      perf probe:
      
        Masami Hiramatsu:
      
        - Fix deletion of multiple probe events.
      
        - Fix userspace libraries handling by not depending on dwfl_module_addrsym().
      
      Event parsing:
      
        Ian Rogers:
      
        - Fix reading of invalid memory in event parsing.
      
      python binding:
      
        Ilie Halip:
      
        - Fix clang detection when using CC=clang-version.
      
      build:
      
        Masami Hiramatsu:
      
        - Fix O= use with relative paths.
      
      Android:
      
        Dominik b. Czarnota:
      
        - Fix off by one in strncpy() size argument when handling Android
          libraries.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      db5d85ce
  5. 17 Mar, 2020 7 commits
    • Jiri Olsa's avatar
      perf expr: Fix copy/paste mistake · 59a08b4b
      Jiri Olsa authored
      Copy/paste leftover from recent refactor.
      
      Fixes: 26226a97 ("perf expr: Move expr lexer to flex")
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200315155609.603948-1-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      59a08b4b
    • Jin Yao's avatar
      perf report: Fix no branch type statistics report issue · c3b10649
      Jin Yao authored
      Previously we could get the report of branch type statistics.
      
      For example:
      
        # perf record -j any,save_type ...
        # t perf report --stdio
      
        #
        # Branch Statistics:
        #
        COND_FWD:  40.6%
        COND_BWD:   4.1%
        CROSS_4K:  24.7%
        CROSS_2M:  12.3%
            COND:  44.7%
          UNCOND:   0.0%
             IND:   6.1%
            CALL:  24.5%
             RET:  24.7%
      
      But now for the recent perf, it can't report the branch type statistics.
      
      It's a regression issue caused by commit 40c39e30 ("perf report: Fix
      a no annotate browser displayed issue"), which only counts the branch
      type statistics for browser mode.
      
      This patch moves the branch_type_count() outside of ui__has_annotation()
      checking, then branch type statistics can work for stdio mode.
      
      Fixes: 40c39e30 ("perf report: Fix a no annotate browser displayed issue")
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200313134607.12873-1-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c3b10649
    • Ian Rogers's avatar
      perf tools: Give synthetic mmap events an inode generation · 3b7a15b0
      Ian Rogers authored
      When mmap2 events are synthesized the ino_generation field isn't being
      set leading to uninitialized memory being compared.
      
      Caught with clang's -fsanitize=memory:
      
      ==124733==WARNING: MemorySanitizer: use-of-uninitialized-value
          #0 0x55a96a6a65cc in __dso_id__cmp tools/perf/util/dsos.c:23:6
          #1 0x55a96a6a81d5 in dso_id__cmp tools/perf/util/dsos.c:38:9
          #2 0x55a96a6a717f in __dso__cmp_long_name tools/perf/util/dsos.c:74:15
          #3 0x55a96a6a6c4c in __dsos__findnew_link_by_longname_id tools/perf/util/dsos.c:106:12
          #4 0x55a96a6a851e in __dsos__findnew_by_longname_id tools/perf/util/dsos.c:178:9
          #5 0x55a96a6a7798 in __dsos__find_id tools/perf/util/dsos.c:191:9
          #6 0x55a96a6a7b57 in __dsos__findnew_id tools/perf/util/dsos.c:251:20
          #7 0x55a96a6a7a57 in dsos__findnew_id tools/perf/util/dsos.c:259:17
          #8 0x55a96a7776ae in machine__findnew_dso_id tools/perf/util/machine.c:2709:9
          #9 0x55a96a77dfcf in map__new tools/perf/util/map.c:193:10
          #10 0x55a96a77240a in machine__process_mmap2_event tools/perf/util/machine.c:1670:8
          #11 0x55a96a7741a3 in machine__process_event tools/perf/util/machine.c:1882:9
          #12 0x55a96a6aee39 in perf_event__process tools/perf/util/event.c:454:9
          #13 0x55a96a87d633 in perf_tool__process_synth_event tools/perf/util/synthetic-events.c:63:9
          #14 0x55a96a87f131 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:403:7
          #15 0x55a96a8815d6 in __event__synthesize_thread tools/perf/util/synthetic-events.c:548:9
          #16 0x55a96a882bff in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:681:3
          #17 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
          #18 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
          #19 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
          #20 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
          #21 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
          #22 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
          #23 0x55a96a52966e in __cmd_test tools/perf/tests/builtin-test.c:603:4
          #24 0x55a96a52855d in cmd_test tools/perf/tests/builtin-test.c:747:9
          #25 0x55a96a2844d4 in run_builtin tools/perf/perf.c:312:11
          #26 0x55a96a282bd0 in handle_internal_command tools/perf/perf.c:364:8
          #27 0x55a96a284097 in run_argv tools/perf/perf.c:408:2
          #28 0x55a96a282223 in main tools/perf/perf.c:538:3
      
        Uninitialized value was stored to memory at
          #1 0x55a96a6a18f7 in dso__new_id tools/perf/util/dso.c:1230:14
          #2 0x55a96a6a78ee in __dsos__addnew_id tools/perf/util/dsos.c:233:20
          #3 0x55a96a6a7bcc in __dsos__findnew_id tools/perf/util/dsos.c:252:21
          #4 0x55a96a6a7a57 in dsos__findnew_id tools/perf/util/dsos.c:259:17
          #5 0x55a96a7776ae in machine__findnew_dso_id tools/perf/util/machine.c:2709:9
          #6 0x55a96a77dfcf in map__new tools/perf/util/map.c:193:10
          #7 0x55a96a77240a in machine__process_mmap2_event tools/perf/util/machine.c:1670:8
          #8 0x55a96a7741a3 in machine__process_event tools/perf/util/machine.c:1882:9
          #9 0x55a96a6aee39 in perf_event__process tools/perf/util/event.c:454:9
          #10 0x55a96a87d633 in perf_tool__process_synth_event tools/perf/util/synthetic-events.c:63:9
          #11 0x55a96a87f131 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:403:7
          #12 0x55a96a8815d6 in __event__synthesize_thread tools/perf/util/synthetic-events.c:548:9
          #13 0x55a96a882bff in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:681:3
          #14 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
          #15 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
          #16 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
          #17 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
          #18 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
          #19 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
      
        Uninitialized value was stored to memory at
          #0 0x55a96a7725af in machine__process_mmap2_event tools/perf/util/machine.c:1646:25
          #1 0x55a96a7741a3 in machine__process_event tools/perf/util/machine.c:1882:9
          #2 0x55a96a6aee39 in perf_event__process tools/perf/util/event.c:454:9
          #3 0x55a96a87d633 in perf_tool__process_synth_event tools/perf/util/synthetic-events.c:63:9
          #4 0x55a96a87f131 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:403:7
          #5 0x55a96a8815d6 in __event__synthesize_thread tools/perf/util/synthetic-events.c:548:9
          #6 0x55a96a882bff in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:681:3
          #7 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
          #8 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
          #9 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
          #10 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
          #11 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
          #12 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
          #13 0x55a96a52966e in __cmd_test tools/perf/tests/builtin-test.c:603:4
          #14 0x55a96a52855d in cmd_test tools/perf/tests/builtin-test.c:747:9
          #15 0x55a96a2844d4 in run_builtin tools/perf/perf.c:312:11
          #16 0x55a96a282bd0 in handle_internal_command tools/perf/perf.c:364:8
          #17 0x55a96a284097 in run_argv tools/perf/perf.c:408:2
          #18 0x55a96a282223 in main tools/perf/perf.c:538:3
      
        Uninitialized value was created by a heap allocation
          #0 0x55a96a22f60d in malloc llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:925:3
          #1 0x55a96a882948 in __perf_event__synthesize_threads tools/perf/util/synthetic-events.c:655:15
          #2 0x55a96a881ec2 in perf_event__synthesize_threads tools/perf/util/synthetic-events.c:750:9
          #3 0x55a96a562b26 in synth_all tools/perf/tests/mmap-thread-lookup.c:136:9
          #4 0x55a96a5623b1 in mmap_events tools/perf/tests/mmap-thread-lookup.c:174:8
          #5 0x55a96a561fa0 in test__mmap_thread_lookup tools/perf/tests/mmap-thread-lookup.c:230:2
          #6 0x55a96a52c182 in run_test tools/perf/tests/builtin-test.c:378:9
          #7 0x55a96a52afc1 in test_and_print tools/perf/tests/builtin-test.c:408:9
          #8 0x55a96a52966e in __cmd_test tools/perf/tests/builtin-test.c:603:4
          #9 0x55a96a52855d in cmd_test tools/perf/tests/builtin-test.c:747:9
          #10 0x55a96a2844d4 in run_builtin tools/perf/perf.c:312:11
          #11 0x55a96a282bd0 in handle_internal_command tools/perf/perf.c:364:8
          #12 0x55a96a284097 in run_argv tools/perf/perf.c:408:2
          #13 0x55a96a282223 in main tools/perf/perf.c:538:3
      
      SUMMARY: MemorySanitizer: use-of-uninitialized-value tools/perf/util/dsos.c:23:6 in __dso_id__cmp
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: clang-built-linux@googlegroups.com
      Link: http://lore.kernel.org/lkml/20200313053129.131264-1-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3b7a15b0
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · ac309e77
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - string buffer formatting fixes in picolcd and sensor drivers, from
         Takashi Iwai
      
       - two new device IDs from Chen-Tsung Hsieh and Tony Fischetti
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: add ALWAYS_POLL quirk to lenovo pixart mouse
        HID: google: add moonball USB id
        HID: hid-sensor-custom: Use scnprintf() for avoiding potential buffer overflow
        HID: hid-picolcd_fb: Use scnprintf() for avoiding potential buffer overflow
      ac309e77
    • Kim Phillips's avatar
      perf/amd/uncore: Add support for Family 19h L3 PMU · e48667b8
      Kim Phillips authored
      Family 19h introduces change in slice, core and thread specification in
      its L3 Performance Event Select (ChL3PmcCfg) h/w register. The change is
      incompatible with Family 17h's version of the register.
      
      Introduce a new path in l3_thread_slice_mask() to do things differently
      for Family 19h vs. Family 17h, otherwise the new hardware doesn't get
      programmed correctly.
      
      Instead of a linear core--thread bitmask, Family 19h takes an encoded
      core number, and a separate thread mask. There are new bits that are set
      for all cores and all slices, of which only the latter is used, since
      the driver counts events for all slices on behalf of the specified CPU.
      
      Also update amd_uncore_init() to base its L2/NB vs. L3/Data Fabric mode
      decision based on Family 17h or above, not just 17h and 18h: the Family
      19h Data Fabric PMC is compatible with the Family 17h DF PMC.
      
       [ bp: Touchups. ]
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200313231024.17601-3-kim.phillips@amd.com
      e48667b8
    • Kim Phillips's avatar
      perf/amd/uncore: Make L3 thread mask code more readable · 9689dbbe
      Kim Phillips authored
      Convert the l3_thread_slice_mask() function to use the more readable
      topology_* helper functions, more intuitive variable names like shift
      and thread_mask, and BIT_ULL().
      
      No functional changes.
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200313231024.17601-2-kim.phillips@amd.com
      9689dbbe
    • Kim Phillips's avatar
      perf/amd/uncore: Prepare L3 thread mask code for Family 19h · 4dcc3df8
      Kim Phillips authored
      In order to better accommodate the upcoming Family 19h, given
      the 80-char line limit, move the existing code into a new
      l3_thread_slice_mask() function.
      
      No functional changes.
      
       [ bp: Touchups. ]
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200313231024.17601-1-kim.phillips@amd.com
      4dcc3df8
  6. 16 Mar, 2020 3 commits
  7. 15 Mar, 2020 10 commits
    • Linus Torvalds's avatar
      Linux 5.6-rc6 · fb33c651
      Linus Torvalds authored
      fb33c651
    • Linus Torvalds's avatar
      Merge tag 'irq-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a42a7bb6
      Linus Torvalds authored
      Pull irq fix from Thomas Gleixner:
       "A single commit to handle an erratum in Cavium ThunderX to prevent
        access to GIC registers which are broken in the implementation"
      
      * tag 'irq-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3: Workaround Cavium erratum 38539 when reading GICD_TYPER2
      a42a7bb6
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 34d5a4b3
      Linus Torvalds authored
      Pull futex fix from Thomas Gleixner:
       "Fix for yet another subtle futex issue.
      
        The futex code used ihold() to prevent inodes from vanishing, but
        ihold() does not guarantee inode persistence. Replace the inode
        pointer with a per boot, machine wide, unique inode identifier.
      
        The second commit fixes the breakage of the hash mechanism which
        causes a 100% performance regression"
      
      * tag 'locking-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        futex: Unbreak futex hashing
        futex: Fix inode life-time issue
      34d5a4b3
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ec181b7f
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Two fixes for x86:
      
         - Map EFI runtime service data as encrypted when SEV is enabled.
      
           Otherwise e.g. SMBIOS data cannot be properly decoded by dmidecode.
      
         - Remove the warning in the vector management code which triggered
           when a managed interrupt affinity changed outside of a CPU hotplug
           operation.
      
           The warning was correct until the recent core code change that
           introduced a CPU isolation feature which needs to migrate managed
           interrupts away from online CPUs under certain conditions to
           achieve the isolation"
      
      * tag 'x86-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/vector: Remove warning on managed interrupt migration
        x86/ioremap: Map EFI runtime services data as encrypted for SEV
      ec181b7f
    • Linus Torvalds's avatar
      Merge tag 'perf-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e99bc917
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "A pile of perf fixes:
      
        Kernel side:
      
         - AMD uncore driver: Replace the open coded sanity check with the
           core variant, which provides the correct error code and also leaves
           a hint in dmesg
      
        Tooling:
      
         - Fix the stdio input handling with glibc versions >= 2.28
      
         - Unbreak the futex-wake benchmark which was reduced to 0 test
           threads due to the conversion to cpumaps
      
         - Initialize sigaction structs before invoking sys_sigactio()
      
         - Plug the mapfile memory leak in perf jevents
      
         - Fix off by one relative directory includes
      
         - Fix an undefined string comparison in perf diff"
      
      * tag 'perf-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag
        tools: Fix off-by 1 relative directory includes
        perf jevents: Fix leak of mapfile memory
        perf bench: Clear struct sigaction before sigaction() syscall
        perf bench futex-wake: Restore thread count default to online CPU count
        perf top: Fix stdio interface input handling with glibc 2.28+
        perf diff: Fix undefined string comparision spotted by clang's -Wstring-compare
        perf symbols: Don't try to find a vmlinux file when looking for kernel modules
        perf bench: Share some global variables to fix build with gcc 10
        perf parse-events: Use asprintf() instead of strncpy() to read tracepoint files
        perf env: Do not return pointers to local variables
        perf tests bp_account: Make global variable static
      e99bc917
    • Linus Torvalds's avatar
      Merge tag 'timers-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ffe6da91
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single fix adding the missing time namespace adjustment in
        sys/sysinfo which caused sys/sysinfo to be inconsistent with
        /proc/uptime when read from a task inside a time namespace"
      
      * tag 'timers-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sys/sysinfo: Respect boottime inside time namespace
      ffe6da91
    • Linus Torvalds's avatar
      Merge tag 'ras-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 52ac3777
      Linus Torvalds authored
      Pull RAS fixes from Thomas Gleixner:
       "Two RAS related fixes:
      
         - Shut down the per CPU thermal throttling poll work properly when a
           CPU goes offline.
      
           The missing shutdown caused the poll work to be migrated to a
           unbound worker which triggered warnings about the usage of
           smp_processor_id() in preemptible context
      
         - Fix the PPIN feature initialization which missed to enable the
           functionality when PPIN_CTL was enabled but the MSR locked against
           updates"
      
      * tag 'ras-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Fix logic and comments around MSR_PPIN_CTL
        x86/mce/therm_throt: Undo thermal polling properly on CPU offline
      52ac3777
    • Linus Torvalds's avatar
      Merge tag 'efi-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b67775e1
      Linus Torvalds authored
      Pull EFI fixes from Thomas Gleixner:
       "Two EFI fixes:
      
         - Prevent a race and buffer overflow in the sysfs efivars interface
           which causes kernel memory corruption.
      
         - Add the missing NULL pointer checks in efivar_store_raw()"
      
      * tag 'efi-urgent-2020-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Add a sanity check to efivar_store_raw()
        efi: Fix a race and a buffer overflow while reading efivars via sysfs
      b67775e1
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · de28a65c
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
      
       - Intel VT-d fixes:
          - RCU list handling fixes
          - Replace WARN_TAINT with pr_warn + add_taint for reporting firmware
            issues
          - DebugFS fixes
          - Fix for hugepage handling in iova_to_phys implementation
          - Fix for handling VMD devices, which have a domain number which
            doesn't fit into 16 bits
          - Warning message fix
      
       - MSI allocation fix for iommu-dma code
      
       - Sign-extension fix for io page-table code
      
       - Fix for AMD-Vi to properly update the is-running bit when AVIC is
         used
      
      * tag 'iommu-fixes-v5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Populate debugfs if IOMMUs are detected
        iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE
        iommu/vt-d: Ignore devices with out-of-spec domain number
        iommu/vt-d: Fix the wrong printing in RHSA parsing
        iommu/vt-d: Fix debugfs register reads
        iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint
        iommu/vt-d: dmar_parse_one_rmrr: replace WARN_TAINT with pr_warn + add_taint
        iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
        iommu/vt-d: Silence RCU-list debugging warnings
        iommu/vt-d: Fix RCU-list bugs in intel_iommu_init()
        iommu/dma: Fix MSI reservation allocation
        iommu/io-pgtable-arm: Fix IOVA validation for 32-bit
        iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
        iommu/vt-d: Fix RCU list debugging warnings
      de28a65c
    • Thomas Gleixner's avatar
      Merge tag 'irqchip-fixes-5.6-2' of... · 92c22755
      Thomas Gleixner authored
      Merge tag 'irqchip-fixes-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
      
      Pull irqchip fixes from Marc Zyngier:
      
      - Add workaround for Cavium/Marvell ThunderX unimplemented GIC registers
      92c22755