1. 18 Feb, 2021 13 commits
    • Namhyung Kim's avatar
      perf test: Fix unaligned access in sample parsing test · c5c97cad
      Namhyung Kim authored
      The ubsan reported the following error.  It was because sample's raw
      data missed u32 padding at the end.  So it broke the alignment of the
      array after it.
      
      The raw data contains an u32 size prefix so the data size should have
      an u32 padding after 8-byte aligned data.
      
      27: Sample parsing  :util/synthetic-events.c:1539:4:
        runtime error: store to misaligned address 0x62100006b9bc for type
        '__u64' (aka 'unsigned long long'), which requires 8 byte alignment
      0x62100006b9bc: note: pointer points here
        00 00 00 00 ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
                    ^
          #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13
          #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8
          #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9
          #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9
          #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9
          #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4
          #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9
          #7 0x56153264e796 in run_builtin perf.c:312:11
          #8 0x56153264cf03 in handle_internal_command perf.c:364:8
          #9 0x56153264e47d in run_argv perf.c:408:2
          #10 0x56153264c9a9 in main perf.c:538:3
          #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc)
          #12 0x561532596828 in _start ...
      
      SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use
       util/synthetic-events.c:1539:4 in
      
      Fixes: 045f8cd8 ("perf tests: Add a sample parsing test")
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c5c97cad
    • Kan Liang's avatar
      perf tools: Support arch specific PERF_SAMPLE_WEIGHT_STRUCT processing · fbefe9c2
      Kan Liang authored
      For X86, the var2_w field of PERF_SAMPLE_WEIGHT_STRUCT stands for the
      instruction latency. Current perf forces the var2_w to the data->ins_lat
      in the generic code. It works well for now because X86 is the only
      architecture that supports the PERF_SAMPLE_WEIGHT_STRUCT, but it may
      bring problems once other architectures support the sample type.  For
      example, the var2_w may be used to capture something else on PowerPC.
      
      Create two architecture specific functions to parse and synthesize the
      weight related samples. Move the X86 specific codes to the X86 version
      functions. Other architectures can implement their own functions later
      separately.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/1612540912-6562-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fbefe9c2
    • Adrian Hunter's avatar
      perf intel-pt: Add PSB events · c840cbfe
      Adrian Hunter authored
      Emitting a PSB+ can cause a CPU a slight delay. When doing timing analysis
      of code with Intel PT, it is useful to know if a timing bubble was caused
      by Intel PT or not. Add reporting of PSB events via perf script. PSB
      events are printed with the existing itrace 'p' option which also prints
      power and frequency changes. The PSB event contains the trace offset at
      which the PSB occurs, to allow easy reference back to the PSB+ packets.
      
      The PSB event timestamp is always the timestamp from the PSB+ TSC
      packet, and the ip is always the address from the PSB+ FUP packet.
      
      The code changes are non-trivial because the decoder must walk to the
      PSB+ FUP address before outputting the PSB event.
      
      Example:
      
        $ perf record -e intel_pt/cyc,psb_period=0/u uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.046 MB perf.data ]
        $ perf script --itrace=p --ns
           perf 17981 [006] 25617.510820383:  psb:  psb offs: 0                               0 [unknown] ([unknown])
           perf 17981 [006] 25617.510820383:  cbr:  cbr: 42 freq: 4219 MHz (156%)             0 [unknown] ([unknown])
          uname 17981 [006] 25617.510889753:  psb:  psb offs: 0xb50                7f78c12a212e __GI___tunables_init+0xee (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.510899162:  psb:  psb offs: 0x12d0               7f78c128af1c dl_main+0x93c (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.510939242:  psb:  psb offs: 0x1a50               7f78c128eefc _dl_map_object_from_fd+0x13c (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.510981274:  psb:  psb offs: 0x21c8               7f78c1296307 _dl_relocate_object+0x927 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.510993034:  psb:  psb offs: 0x2948               7f78c12940e4 _dl_lookup_symbol_x+0x14 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.511003871:  psb:  psb offs: 0x30c8               7f78c12937b3 do_lookup_x+0x2f3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.511019854:  psb:  psb offs: 0x3850               7f78c1295eed _dl_relocate_object+0x50d (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.511029015:  psb:  psb offs: 0x4390               7f78c12a855a strcmp+0xf6a (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
          uname 17981 [006] 25617.511064876:  psb:  psb offs: 0x4b10                          0 [unknown] ([unknown])
          uname 17981 [006] 25617.511080762:  psb:  psb offs: 0x5290               7f78c11db53d _dl_addr+0x13d (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511086035:  psb:  psb offs: 0x5a08               7f78c11db538 _dl_addr+0x138 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511091381:  psb:  psb offs: 0x6190               7f78c11db534 _dl_addr+0x134 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511096681:  psb:  psb offs: 0x6910               7f78c11db4c3 _dl_addr+0xc3 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511119520:  psb:  psb offs: 0x7090               7f78c10ada5e _nl_intern_locale_data+0x12e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511126584:  psb:  psb offs: 0x7818               7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511132775:  psb:  psb offs: 0x8358               7f78c10c20c0 getenv+0xa0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511134598:  psb:  psb offs: 0x8ad0               7f78c10ada09 _nl_intern_locale_data+0xd9 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511135685:  psb:  psb offs: 0x9258               7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511138322:  psb:  psb offs: 0x99d0               7f78c11fffd9 __strncmp_avx2+0x39 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
          uname 17981 [006] 25617.511158907:  psb:  psb offs: 0xa150                          0 [unknown] ([unknown])
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210205175350.23817-5-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c840cbfe
    • Adrian Hunter's avatar
      perf intel-pt: Fix IPC with CYC threshold · 6af4b600
      Adrian Hunter authored
      The code assumed every CYC-eligible packet has a CYC packet, which is not
      the case when CYC thresholds are used. Fix by checking if a CYC packet is
      actually present in that case.
      
      Fixes: 5b1dc0fd ("perf intel-pt: Add support for samples to contain IPC ratio")
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210205175350.23817-4-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6af4b600
    • Adrian Hunter's avatar
      perf intel-pt: Fix premature IPC · 20aa3970
      Adrian Hunter authored
      The code assumed a change in cycle count means accurate IPC. That is not
      correct, for example when sampling both branches and instructions, or at
      a FUP packet (which is not CYC-eligible) address. Fix by using an explicit
      flag to indicate when IPC can be sampled.
      
      Fixes: 5b1dc0fd ("perf intel-pt: Add support for samples to contain IPC ratio")
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: linux-kernel@vger.kernel.org
      Link: https://lore.kernel.org/r/20210205175350.23817-3-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      20aa3970
    • Adrian Hunter's avatar
      perf intel-pt: Fix missing CYC processing in PSB · 03fb0f85
      Adrian Hunter authored
      Add missing CYC packet processing when walking through PSB+. This
      improves the accuracy of timestamps that follow PSB+, until the next
      MTC.
      
      Fixes: 3d498078 ("perf tools: Add new Intel PT packet definitions")
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/r/20210205175350.23817-2-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      03fb0f85
    • Dave Rigby's avatar
      perf unwind: Set userdata for all __report_module() paths · 4e148144
      Dave Rigby authored
      When locating the DWARF module for a given address, __find_debuginfo()
      requires a 'struct dso' passed via the userdata argument.
      
      However, this field is only set in __report_module() if the module is
      found in via dwfl_addrmodule(), not if it is found later via
      dwfl_report_elf().
      
      Set userdata irrespective of how the DWARF module was found, as long as
      we found a module.
      
      Fixes: bf53fc6b ("perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder")
      Signed-off-by: default avatarDave Rigby <d.rigby@me.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211801Acked-by: default avatarJan Kratochvil <jan.kratochvil@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/linux-perf-users/20210218165654.36604-1-d.rigby@me.com/Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4e148144
    • Yang Jihong's avatar
      perf record: Fix continue profiling after draining the buffer · e16c2ce7
      Yang Jihong authored
      Commit da231338 ("perf record: Use an eventfd to wakeup when
      done") uses eventfd() to solve a rare race where the setting and
      checking of 'done' which add done_fd to pollfd.  When draining buffer,
      revents of done_fd is 0 and evlist__filter_pollfd function returns a
      non-zero value.  As a result, perf record does not stop profiling.
      
      The following simple scenarios can trigger this condition:
      
        # sleep 10 &
        # perf record -p $!
      
      After the sleep process exits, perf record should stop profiling and exit.
      However, perf record keeps running.
      
      If pollfd revents contains only POLLERR or POLLHUP, perf record
      indicates that buffer is draining and need to stop profiling.  Use
      fdarray_flag__nonfilterable() to set done eventfd to nonfilterable
      objects, so that evlist__filter_pollfd() does not filter and check done
      eventfd.
      
      Fixes: da231338 ("perf record: Use an eventfd to wakeup when done")
      Signed-off-by: default avatarYang Jihong <yangjihong1@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: zhangjinhao2@huawei.com
      Link: http://lore.kernel.org/lkml/20210205065001.23252-1-yangjihong1@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e16c2ce7
    • Jiapeng Chong's avatar
      perf tools: Simplify the calculation of variables · 52bcc603
      Jiapeng Chong authored
      Fix the following coccicheck warnings:
      
      ./tools/perf/util/header.c:3809:18-20: WARNING !A || A && B is
      equivalent to !A || B.
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: default avatarJiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/1612497255-87189-1-git-send-email-jiapeng.chong@linux.alibaba.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      52bcc603
    • Joakim Zhang's avatar
      perf vendor events arm64: Add JSON metrics for imx8mp DDR Perf · 37b9c7bb
      Joakim Zhang authored
      Add JSON metrics for imx8mp DDR Perf.
      Signed-off-by: default avatarJoakim Zhang <qiangqing.zhang@nxp.com>
      Reviewed-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de> <s.hauer@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-imx@nxp.com
      Cc: kernel@pengutronix.de
      Link: https://lore.kernel.org/r/20210127105734.12198-5-qiangqing.zhang@nxp.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      37b9c7bb
    • Joakim Zhang's avatar
      perf vendor events arm64: Add JSON metrics for imx8mq DDR Perf · 3a35093a
      Joakim Zhang authored
      Add JSON metrics for imx8mq DDR Perf.
      Signed-off-by: default avatarJoakim Zhang <qiangqing.zhang@nxp.com>
      Reviewed-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de> <s.hauer@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-imx@nxp.com
      Cc: kernel@pengutronix.de
      Link: https://lore.kernel.org/r/20210127105734.12198-4-qiangqing.zhang@nxp.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3a35093a
    • Joakim Zhang's avatar
      perf vendor events arm64: Add JSON metrics for imx8mn DDR Perf · 842ed298
      Joakim Zhang authored
      Add JSON metrics for imx8mn DDR Perf.
      Signed-off-by: default avatarJoakim Zhang <qiangqing.zhang@nxp.com>
      Reviewed-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de> <s.hauer@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: kernel@pengutronix.de
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-imx@nxp.com
      Link: https://lore.kernel.org/r/20210127105734.12198-3-qiangqing.zhang@nxp.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      842ed298
    • Joakim Zhang's avatar
      perf vendor events arm64: Fix indentation of brackets in imx8mm metrics · 84b102f5
      Joakim Zhang authored
      Fix indentation of brackets in imx8mm metrics.
      Signed-off-by: default avatarJoakim Zhang <qiangqing.zhang@nxp.com>
      Reviewed-by: default avatarJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-imx@nxp.com
      Cc: kernel@pengutronix.de
      Link: https://lore.kernel.org/r/20210127105734.12198-2-qiangqing.zhang@nxp.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      84b102f5
  2. 17 Feb, 2021 7 commits
  3. 16 Feb, 2021 5 commits
    • Arnaldo Carvalho de Melo's avatar
      37b3fa0e
    • Arnaldo Carvalho de Melo's avatar
      Merge branch 'perf/urgent' into perf/core · c1bd8a2b
      Arnaldo Carvalho de Melo authored
      To get some fixes that didn't made into 5.11.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c1bd8a2b
    • Leo Yan's avatar
      perf arm-spe: Set sample's data source field · a89dbc9b
      Leo Yan authored
      The sample structure contains the field 'data_src' which is used to
      tell the data operation attributions, e.g. operation type is loading or
      storing, cache level, it's snooping or remote accessing, etc.  At the
      end, the 'data_src' will be parsed by perf mem/c2c tools to display
      human readable strings.
      
      This patch is to fill the 'data_src' field in the synthesized samples
      base on different types.  Currently perf tool can display statistics for
      L1/L2/L3 caches but it doesn't support the 'last level cache'.  To fit
      to current implementation, 'data_src' field uses L3 cache for last level
      cache.
      
      Before this commit, perf mem report looks like this:
        # Samples: 75K of event 'l1d-miss'
        # Total weight : 75951
        # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
        #
        # Overhead  Samples  Local Weight  Memory access  Symbol                  Shared Object  Data Symbol             Data Object  Snoop  TLB access
        # ........  .......  ............  .............  ......................  .............  ......................  ...........  .....  ..........
        #
            81.56%    61945  0             N/A            [.] 0x00000000000009d8  serial_c       [.] 0000000000000000    [unknown]    N/A    N/A
            18.44%    14003  0             N/A            [.] 0x0000000000000828  serial_c       [.] 0000000000000000    [unknown]    N/A    N/A
      
      Now on a system with Arm SPE, addresses and access types are displayed:
      
        # Samples: 75K of event 'l1d-miss'
        # Total weight : 75951
        # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
        #
        # Overhead  Samples  Local Weight  Memory access  Symbol                  Shared Object  Data Symbol             Data Object  Snoop  TLB access
        # ........  .......  ............  .............  ......................  .............  ......................  ...........  .....  ..........
        #
             0.43%      324  0             L1 miss        [.] 0x00000000000009d8  serial_c       [.] 0x0000ffff80794e00  anon         N/A    Walker hit
             0.42%      322  0             L1 miss        [.] 0x00000000000009d8  serial_c       [.] 0x0000ffff80794580  anon         N/A    Walker hit
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Reviewed-by: default avatarJames Clark <james.clark@arm.com>
      Tested-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-6-james.clark@arm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a89dbc9b
    • Leo Yan's avatar
      perf arm-spe: Synthesize memory event · e55ed342
      Leo Yan authored
      The memory event can deliver two benefits:
      
      - The first benefit is the memory event can give out global view for
        memory accessing, rather than organizing events with scatter mode
        (e.g. uses separate event for L1 cache, last level cache, etc) which
        which can only display a event for single memory type, memory events
        include all memory accessing so it can display the data accessing
        cross memory levels in the same view;
      
      - The second benefit is the sample generation might introduce a big
        overhead and need to wait for long time for Perf reporting, we can
        specify itrace option '--itrace=M' to filter out other events and only
        output memory events, this can significantly reduce the overhead
        caused by generating samples.
      
      This patch is to enable memory event for Arm SPE.
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Reviewed-by: default avatarJames Clark <james.clark@arm.com>
      Tested-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-5-james.clark@arm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e55ed342
    • Leo Yan's avatar
      perf arm-spe: Fill address info for samples · 54f7815e
      Leo Yan authored
      To properly handle memory and branch samples, this patch divides into
      two functions for generating samples: arm_spe__synth_mem_sample() is for
      synthesizing memory and TLB samples; arm_spe__synth_branch_sample() is
      to synthesize branch samples.
      
      Arm SPE backend decoder has passed virtual and physical address through
      packets, the address info is stored into the synthesize samples in the
      function arm_spe__synth_mem_sample().
      
      Committer notes:
      
      Fixed this:
      
        36    46.77 fedora:27                     : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
      
          util/arm-spe.c:269:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
                  struct perf_sample sample = { 0 };
                                                  ^
          util/arm-spe.c:288:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
                  struct perf_sample sample = { 0 };
      
      By using = { .ip = 0, };
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Reviewed-by: default avatarJames Clark <james.clark@arm.com>
      Tested-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wei Li <liwei391@huawei.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Link: https://lore.kernel.org/r/20210211133856.2137-4-james.clark@arm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      54f7815e
  4. 14 Feb, 2021 7 commits
  5. 13 Feb, 2021 8 commits