1. 03 Apr, 2024 11 commits
    • perf beauty: Fix AT_EACCESS undeclared build error for system with kernel versions lower than v5.8 · 089ef2f4
      Yang Jihong authored
      On Ubuntu 20.04 (where the kernel headers are version 5.4), building
      perf fails with:
      
          CC      trace/beauty/fs_at_flags.o
        trace/beauty/fs_at_flags.c: In function ‘faccessat2__scnprintf_flags’:
        trace/beauty/fs_at_flags.c:35:14: error: ‘AT_EACCESS’ undeclared (first use in this function); did you mean ‘DN_ACCESS’?
           35 |  if (flags & AT_EACCESS) {
              |              ^~~~~~~~~~
              |              DN_ACCESS
        trace/beauty/fs_at_flags.c:35:14: note: each undeclared identifier is reported only once for each function it appears in
      
      Commit 8a1ad441 ("tools headers: Remove now unused copies of
      uapi/{fcntl,openat2}.h and asm/fcntl.h") removed fcntl.h from the tools
      headers directory, but fs_at_flags.c still uses the 'AT_EACCESS' macro.
      
      This macro was introduced in kernel v5.8, so compilation fails on
      systems whose kernel headers are older than that.
      
      Fixes: 8a1ad441 ("tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h")
      Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240403122558.1438841-1-yangjihong@bytedance.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
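A common way to keep code building against older headers is a guarded fallback definition. The sketch below illustrates that pattern only; it is not necessarily what the patch above does. The value 0x200 matches the AT_EACCESS definition in the kernel UAPI and glibc headers, and `format_access_flags()` is a hypothetical stand-in for the real `faccessat2__scnprintf_flags()`.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/* Fallback for headers that predate AT_EACCESS (assumed value: 0x200). */
#ifndef AT_EACCESS
#define AT_EACCESS 0x200
#endif

/*
 * Hypothetical helper, not the perf function: write a readable form of
 * the faccessat2() flags into bf and return the number of characters
 * stored (excluding the terminating NUL).
 */
size_t format_access_flags(unsigned long flags, char *bf, size_t size)
{
	size_t printed = 0;
	int n;

	if (size == 0)
		return 0;
	bf[0] = '\0';

	if (flags & AT_EACCESS) {
		flags &= ~(unsigned long)AT_EACCESS;
		n = snprintf(bf, size, "EACCESS");
		if (n < 0)
			return 0;
		if ((size_t)n >= size)
			return size - 1;	/* output was truncated */
		printed = (size_t)n;
	}

	/* Print any remaining, unrecognized bits in hex. */
	if (flags) {
		n = snprintf(bf + printed, size - printed, "%s%#lx",
			     printed ? "|" : "", flags);
		if (n < 0)
			return printed;
		if ((size_t)n >= size - printed)
			return size - 1;	/* output was truncated */
		printed += (size_t)n;
	}

	return printed;
}
```

With the guard in place the same source compiles whether or not the installed headers already provide the macro.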
    • perf annotate: Add symbol name when using capstone · 92dfc594
      Namhyung Kim authored
      This keeps the existing behavior of the objdump output: symbol
      information for global variables needs to be shown, as below:
      
         Percent |      Source code & Disassembly of elf for cycles:P (1 samples, percent: local period)
        ------------------------------------------------------------------------------------------------
                 : 0                0xffffffff81338f70 <vm_normal_page>:
            0.00 :   ffffffff81338f70:       endbr64
            0.00 :   ffffffff81338f74:       callq   0xffffffff81083a40
            0.00 :   ffffffff81338f79:       movq    %rdi, %r8
            0.00 :   ffffffff81338f7c:       movq    %rdx, %rdi
            0.00 :   ffffffff81338f7f:       callq   *0x17021c3(%rip)   # ffffffff82a3b148 <pv_ops+0x1e8>
            0.00 :   ffffffff81338f85:       movq    0xffbf3c(%rip), %rdx       # ffffffff82334ec8 <physical_mask>
            0.00 :   ffffffff81338f8c:       testq   %rax, %rax                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            0.00 :   ffffffff81338f8f:       je      0xffffffff81338fd0                         here
            0.00 :   ffffffff81338f91:       movq    %rax, %rcx
            0.00 :   ffffffff81338f94:       andl    $1, %ecx
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240329215812.537846-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
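The trailing `# address <symbol+offset>` comments in the listing above can be reproduced by hand. On x86-64 a RIP-relative operand is resolved against the address of the next instruction, and the symbol offset is the distance from the symbol's start. The sketch below shows that arithmetic only; the function names are hypothetical and perf's own implementation may be organized differently. The `pv_ops` start address used in the test is derived from the listing (0xffffffff82a3b148 minus 0x1e8).

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Target of a RIP-relative operand: the address of the next instruction
 * (instruction address plus its length) plus the signed displacement.
 */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned int insn_len,
			     int32_t disp)
{
	return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}

/*
 * Format an objdump-style comment such as
 * "# ffffffff82a3b148 <pv_ops+0x1e8>" for a target inside a symbol
 * starting at sym_start. The "+0x..." part is omitted at offset 0.
 */
int format_symbol_comment(char *bf, size_t size, uint64_t target,
			  const char *sym_name, uint64_t sym_start)
{
	uint64_t off = target - sym_start;

	if (off == 0)
		return snprintf(bf, size, "# %" PRIx64 " <%s>",
				target, sym_name);

	return snprintf(bf, size, "# %" PRIx64 " <%s+%#" PRIx64 ">",
			target, sym_name, off);
}
```

For the `callq *0x17021c3(%rip)` at ffffffff81338f7f, the next instruction starts at ffffffff81338f85, so the instruction is 6 bytes long and the target works out to ffffffff82a3b148, matching the listing.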
    • perf annotate: Use libcapstone to disassemble · 6d17edc1
      Namhyung Kim authored
      perf can now use the capstone library to disassemble instructions.
      Use it (if available) in perf annotate to speed things up.  Currently
      only the x86 architecture is supported.  With this change I see a ~3x
      speedup in data type profiling.
      
      Note that capstone cannot provide source file and line number info.
      For now, users who need that should use the external objdump by
      specifying the --objdump option explicitly.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Split out util/disasm.c · 98f69a57
      Namhyung Kim authored
      util/annotate.c contains both disassembly and sample annotation
      code.  Factor out the disasm part so that it can be handled more
      easily.
      
      No functional changes intended.
      
      Committer notes:
      
      Add the missing includes of env.h, util.h, bpf-event.h and bpf-util.h
      to disasm.c, to fix things like:
      
        util/disasm.c: In function ‘symbol__disassemble_bpf’:
        util/disasm.c:1203:9: error: implicit declaration of function ‘perf_exe’ [-Werror=implicit-function-declaration]
         1203 |         perf_exe(tpath, sizeof(tpath));
              |         ^~~~~~~~
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240329215812.537846-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Add and use ins__is_nop() · 10adbf77
      Namhyung Kim authored
      Likewise, add ins__is_nop() to check if the current instruction is NOP.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240329215812.537846-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Use ins__is_xxx() if possible · ad399baa
      Namhyung Kim authored
      This prepares for separating out the disasm-related code.  Use the
      public ins API instead of checking the internal data structure.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240329215812.537846-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
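The idea behind the ins__is_xxx() helpers can be shown with a small model: callers ask a predicate instead of comparing the instruction's ops pointer themselves, so the ops tables can later move to another file without touching the callers. The structures below are simplified stand-ins written for this sketch and are not perf's real definitions; `count_non_nop()` is a hypothetical caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for perf's internal types (assumed for this sketch). */
struct ins_ops {
	const char *kind;
};

struct ins {
	const char *name;
	const struct ins_ops *ops;
};

/* Internal tables that callers should not need to know about. */
const struct ins_ops nop_ops  = { "nop" };
const struct ins_ops jump_ops = { "jump" };

/* Public predicates: the only place that inspects ins->ops. */
bool ins__is_nop(const struct ins *ins)
{
	return ins->ops == &nop_ops;
}

bool ins__is_jump(const struct ins *ins)
{
	return ins->ops == &jump_ops;
}

/* Hypothetical caller that relies on the predicate only. */
unsigned int count_non_nop(const struct ins *insns, unsigned int nr)
{
	unsigned int i, count = 0;

	for (i = 0; i < nr; i++) {
		if (!ins__is_nop(&insns[i]))
			count++;
	}
	return count;
}
```

Because `count_non_nop()` never mentions `nop_ops`, the table could be made private to the disassembly code without changing it.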
    • perf evsel: Use evsel__name_is() helper · 09d2056e
      Yang Jihong authored
      Code cleanup: replace strcmp(evsel__name(evsel), {NAME}) with the
      evsel__name_is() helper.
      
      No functional change.
      
      Committer notes:
      
      Fix this build error:
      
                trace.syscalls.events.bpf_output = evlist__last(trace.evlist);
        -       assert(evsel__name_is(trace.syscalls.events.bpf_output), "__augmented_syscalls__");
        +       assert(evsel__name_is(trace.syscalls.events.bpf_output, "__augmented_syscalls__"));
      
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240401062724.1006010-3-yangjihong@bytedance.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
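A helper of this kind is typically a thin wrapper around strcmp(). The sketch below uses a minimal, hypothetical `struct evsel` with only a name field, so it is a model of the pattern rather than perf's actual code. It also shows why the build error in the committer notes arises: the name to compare must be the second argument of the helper, not a second argument of assert().

```c
#include <stdbool.h>
#include <string.h>

/* Minimal stand-in for perf's struct evsel (assumed for this sketch). */
struct evsel {
	const char *name;
};

/* Return the event name, or a placeholder if none is set. */
const char *evsel__name(const struct evsel *evsel)
{
	return evsel->name ? evsel->name : "unknown";
}

/* True when the event's name equals 'name'. */
bool evsel__name_is(const struct evsel *evsel, const char *name)
{
	return !strcmp(evsel__name(evsel), name);
}
```

A call site then reads `if (evsel__name_is(evsel, "__augmented_syscalls__"))` instead of spelling out the negated strcmp() each time.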
    • perf sched timehist: Fix -g/--call-graph option failure · 6e4b3987
      Yang Jihong authored
      When 'perf sched' enables call-graph recording, the sample_type of the
      dummy event does not have PERF_SAMPLE_CALLCHAIN, so timehist_check_attr()
      finds an evsel without a callchain and sets show_callchain to 0.
      
      Currently 'perf sched timehist' only saves the callchain when processing
      the 'sched:sched_switch' event, so timehist_check_attr() only needs to
      determine whether that event has PERF_SAMPLE_CALLCHAIN.
      
      Before:
      
        # perf sched record -g true
        [ perf record: Woken up 0 times to write data ]
        [ perf record: Captured and wrote 4.153 MB perf.data (7536 samples) ]
        # perf sched timehist
        Samples do not have callchains.
                   time    cpu  task name                       wait time  sch delay   run time
                                [tid/pid]                          (msec)     (msec)     (msec)
        --------------- ------  ------------------------------  ---------  ---------  ---------
          147851.826019 [0000]  perf[285035]                        0.000      0.000      0.000
          147851.826029 [0000]  migration/0[15]                     0.000      0.003      0.009
          147851.826063 [0001]  perf[285035]                        0.000      0.000      0.000
          147851.826069 [0001]  migration/1[21]                     0.000      0.003      0.006
        <SNIP>
      
      After:
      
        # perf sched record -g true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 2.572 MB perf.data (822 samples) ]
        # perf sched timehist
               time cpu task name        waittime  sch delay  runtime
                          [tid/pid]        (msec)  (msec)    (msec)
        ----------- --- ---------------  --------  --------  -----
        4193.035164 [0] perf[277062]        0.000     0.000   0.000 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- preempt_schedule_common <- __cond_resched <- __wait_for_common <- wait_for_completion
        4193.035174 [0] migration/0[15]     0.000     0.003   0.009 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- smpboot_thread_fn <- kthread <- ret_from_fork
        4193.035207 [1] perf[277062]        0.000     0.000   0.000 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- preempt_schedule_common <- __cond_resched <- __wait_for_common <- wait_for_completion
        4193.035214 [1] migration/1[21]     0.000     0.003   0.007 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- smpboot_thread_fn <- kthread <- ret_from_fork
        <SNIP>
      
      Fixes: 9c95e4ef ("perf evlist: Add evlist__findnew_tracking_event() helper")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240401062724.1006010-2-yangjihong@bytedance.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
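The logic described in this commit can be modeled as follows: instead of requiring every event to carry callchains, only the sched:sched_switch event is checked, so a dummy event without PERF_SAMPLE_CALLCHAIN no longer disables callchain output. This is a simplified sketch, not the body of timehist_check_attr(); `struct event_desc` and the function name are hypothetical, and `SAMPLE_CALLCHAIN` mirrors the value of PERF_SAMPLE_CALLCHAIN (bit 5) so the example does not depend on kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same bit as PERF_SAMPLE_CALLCHAIN in the perf_event UAPI (1 << 5). */
#define SAMPLE_CALLCHAIN (1ULL << 5)

/* Hypothetical, reduced view of an event: name and sample_type bits. */
struct event_desc {
	const char *name;
	uint64_t sample_type;
};

/*
 * Return true when the sched:sched_switch event records callchains.
 * Other events, such as the dummy tracking event, are ignored.
 */
bool sched_switch_has_callchain(const struct event_desc *evs, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (strcmp(evs[i].name, "sched:sched_switch") != 0)
			continue;
		if (evs[i].sample_type & SAMPLE_CALLCHAIN)
			return true;
	}
	return false;
}
```

In this model a session with a callchain-enabled sched:sched_switch event and a plain dummy event still reports that callchains are available.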
    • perf annotate: Honor output options with --data-type · bdeaf6ff
      Namhyung Kim authored
      Data type profiling output should be in sync with the normal output,
      so make it display a percentage for each field.  Also use the coloring
      scheme so that users can easily identify fields with big overhead.
      
      Users can use --show-total-period or --show-nr-samples to change the
      output style, as in the normal perf annotate output.
      
      Before:
      
        $ perf annotate --data-type
        Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples):
        ============================================================================
            samples     offset       size  field
                 34          0       9792  struct task_struct    {
                  2          0         24      struct thread_info       thread_info {
                  0          0          8          long unsigned int    flags;
                  1          8          8          long unsigned int    syscall_work;
                  0         16          4          u32  status;
                  1         20          4          u32  cpu;
                                               };
      
      After:
      
        $ perf annotate --data-type
        Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples):
        ============================================================================
         Percent     offset       size  field
          100.00          0       9792  struct task_struct       {
            3.55          0         24      struct thread_info  thread_info {
            0.00          0          8          long unsigned int       flags;
            1.63          8          8          long unsigned int       syscall_work;
            0.00         16          4          u32     status;
            1.91         20          4          u32     cpu;
                                            };
      
      Committer testing:
      
      First collect a suitable perf.data file for use with 'perf annotate --data-type':
      
        root@number:~# perf mem record -a sleep 1s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 11.047 MB perf.data (3466 samples) ]
        root@number:~#
      
      Then, before:
      
        root@number:~# perf annotate --data-type
        Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples):
        ============================================================================
            samples     offset       size  field
                  6          0         40  union         {
                  6          0         40      struct __pthread_mutex_s __data {
                  2          0          4          int  __lock;
                  0          4          4          unsigned int __count;
                  0          8          4          int  __owner;
                  1         12          4          unsigned int __nusers;
                  2         16          4          int  __kind;
                  1         20          2          short int    __spins;
                  0         22          2          short int    __elision;
                  0         24         16          __pthread_list_t     __list {
                  0         24          8              struct __pthread_internal_list*  __prev;
                  0         32          8              struct __pthread_internal_list*  __next;
                                                   };
                                               };
                  0          0          0      char*    __size;
                  2          0          8      long int __align;
                                           };
        <SNIP>
      
      And after:
      
        Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples):
        ============================================================================
         Percent     offset       size  field
          100.00          0         40  union    {
          100.00          0         40      struct __pthread_mutex_s    __data {
           31.27          0          4          int     __lock;
            0.00          4          4          unsigned int    __count;
            0.00          8          4          int     __owner;
            7.67         12          4          unsigned int    __nusers;
           53.10         16          4          int     __kind;
            7.96         20          2          short int       __spins;
            0.00         22          2          short int       __elision;
            0.00         24         16          __pthread_list_t        __list {
            0.00         24          8              struct __pthread_internal_list*     __prev;
            0.00         32          8              struct __pthread_internal_list*     __next;
                                                };
                                            };
            0.00          0          0      char*       __size;
           31.27          0          8      long int    __align;
                                        };
        <SNIP>
      
      The lines with percentages >= 7.67 have their percentages colored red.
      
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240322224313.423181-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
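The percentage column can be thought of as each field's share of the type's total. In the example above the shares do not equal the sample-count ratios (2 of 34 samples would be 5.88%, yet thread_info shows 3.55%), which suggests they are weighted by sample period; that reading is an assumption. The sketch below shows the arithmetic with hypothetical names, and the color thresholds (red at 5.0 and above, green above 0.5) are assumed values chosen to be consistent with the red lines in the committer's test, not taken from the patch.

```c
#include <stdint.h>

/* Share of the total period attributed to one field, in percent. */
double field_percent(uint64_t field_period, uint64_t total_period)
{
	if (total_period == 0)
		return 0.0;	/* avoid dividing by zero for empty types */

	return 100.0 * (double)field_period / (double)total_period;
}

/*
 * Pick a color name for a percentage. The thresholds are assumptions
 * made for this sketch.
 */
const char *percent_color(double percent)
{
	if (percent >= 5.0)
		return "red";
	if (percent > 0.5)
		return "green";
	return "normal";
}
```

Under these assumed thresholds the 7.67 and 7.96 lines in the example come out red and the 0.00 lines stay uncolored.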
    • perf annotate: Get rid of duplicate --group option item · 374af9f1
      Namhyung Kim authored
      The options array in cmd_annotate() has duplicate --group options.
      Only one is needed, so get rid of the other.
      
        $ perf annotate -h 2>&1 | grep group
              --group           Show event group information together
              --group           Show event group information together
      
      Fixes: 7ebaf489 ("perf annotate: Support '--group' option")
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240322224313.423181-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
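A duplicate like this is easy to miss in a long options table. As an illustration only, the sketch below scans a table for repeated long option names; `struct option_desc` and `count_duplicate_options()` are hypothetical and unrelated to perf's real option parsing code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced option entry: long name and help text. */
struct option_desc {
	const char *long_name;
	const char *help;
};

/*
 * Count entries whose long name already appeared earlier in the table.
 * Quadratic, which is fine for the few dozen options a command has.
 */
size_t count_duplicate_options(const struct option_desc *opts, size_t nr)
{
	size_t i, j, dups = 0;

	for (i = 1; i < nr; i++) {
		for (j = 0; j < i; j++) {
			if (strcmp(opts[i].long_name, opts[j].long_name) == 0) {
				dups++;
				break;
			}
		}
	}
	return dups;
}
```

A check along these lines could run in a unit test so that a repeated entry is reported before it shows up twice in the help output.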
    • perf tools: Add Kan Liang to MAINTAINERS as a reviewer · f7a0674e
      Arnaldo Carvalho de Melo authored
      Kan has been reviewing patches regularly, add him as a perf tools
      reviewer so that people CC him on new patches.
      
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: "Liang, Kan" <kan.liang@linux.intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 21 Mar, 2024 29 commits
    • perf beauty: Move uapi/linux/vhost.h copy out of the directory used to build perf · 4962e194
      Arnaldo Carvalho de Melo authored
      It is only used to generate string tables, not to build perf, so move it
      to the tools/perf/trace/beauty/include/ hierarchy, that is used just for
      scraping.
      
      This is something that should have happened already, as it did with
      the linux/socket.h scraper; do it now, as Ian suggested while doing an
      audit/refactor session on the headers used by perf.
      
      No other tools/ living code uses it, just <linux/vhost.h> coming from
      either 'make install_headers' or from the system /usr/include/
      directory.
      
      Suggested-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf dso: Reorder members to save space in 'struct dso' · b3ad832d
      Ian Rogers authored
      Save 40 bytes and move from 8 to 7 cache lines. Make member dwfl
      dependent on being a powerpc build. Squeeze bits of int/enum types
      when appropriate. Remove holes/padding by reordering variables.
      
      Before:
      
        struct dso {
                struct mutex               lock;                 /*     0    40 */
                struct list_head           node;                 /*    40    16 */
                struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
                /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
                struct rb_root *           root;                 /*    80     8 */
                struct rb_root_cached      symbols;              /*    88    16 */
                struct symbol * *          symbol_names;         /*   104     8 */
                size_t                     symbol_names_len;     /*   112     8 */
                struct rb_root_cached      inlined_nodes;        /*   120    16 */
                /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
                struct rb_root_cached      srclines;             /*   136    16 */
                struct {
                        u64                addr;                 /*   152     8 */
                        struct symbol *    symbol;               /*   160     8 */
                } last_find_result;                              /*   152    16 */
                void *                     a2l;                  /*   168     8 */
                char *                     symsrc_filename;      /*   176     8 */
                unsigned int               a2l_fails;            /*   184     4 */
                enum dso_space_type        kernel;               /*   188     4 */
                /* --- cacheline 3 boundary (192 bytes) --- */
                _Bool                      is_kmod;              /*   192     1 */
      
                /* XXX 3 bytes hole, try to pack */
      
                enum dso_swap_type         needs_swap;           /*   196     4 */
                enum dso_binary_type       symtab_type;          /*   200     4 */
                enum dso_binary_type       binary_type;          /*   204     4 */
                enum dso_load_errno        load_errno;           /*   208     4 */
                u8                         adjust_symbols:1;     /*   212: 0  1 */
                u8                         has_build_id:1;       /*   212: 1  1 */
                u8                         header_build_id:1;    /*   212: 2  1 */
                u8                         has_srcline:1;        /*   212: 3  1 */
                u8                         hit:1;                /*   212: 4  1 */
                u8                         annotate_warned:1;    /*   212: 5  1 */
                u8                         auxtrace_warned:1;    /*   212: 6  1 */
                u8                         short_name_allocated:1; /*   212: 7  1 */
                u8                         long_name_allocated:1; /*   213: 0  1 */
                u8                         is_64_bit:1;          /*   213: 1  1 */
      
                /* XXX 6 bits hole, try to pack */
      
                _Bool                      sorted_by_name;       /*   214     1 */
                _Bool                      loaded;               /*   215     1 */
                u8                         rel;                  /*   216     1 */
      
                /* XXX 7 bytes hole, try to pack */
      
                struct build_id            bid;                  /*   224    32 */
                /* --- cacheline 4 boundary (256 bytes) --- */
                u64                        text_offset;          /*   256     8 */
                u64                        text_end;             /*   264     8 */
                const char  *              short_name;           /*   272     8 */
                const char  *              long_name;            /*   280     8 */
                u16                        long_name_len;        /*   288     2 */
                u16                        short_name_len;       /*   290     2 */
      
                /* XXX 4 bytes hole, try to pack */
      
                void *                     dwfl;                 /*   296     8 */
                struct auxtrace_cache *    auxtrace_cache;       /*   304     8 */
                int                        comp;                 /*   312     4 */
      
                /* XXX 4 bytes hole, try to pack */
      
                /* --- cacheline 5 boundary (320 bytes) --- */
                struct {
                        struct rb_root     cache;                /*   320     8 */
                        int                fd;                   /*   328     4 */
                        int                status;               /*   332     4 */
                        u32                status_seen;          /*   336     4 */
      
                        /* XXX 4 bytes hole, try to pack */
      
                        u64                file_size;            /*   344     8 */
                        struct list_head   open_entry;           /*   352    16 */
                        u64                elf_base_addr;        /*   368     8 */
                        u64                debug_frame_offset;   /*   376     8 */
                        /* --- cacheline 6 boundary (384 bytes) --- */
                        u64                eh_frame_hdr_addr;    /*   384     8 */
                        u64                eh_frame_hdr_offset;  /*   392     8 */
                } data;                                          /*   320    80 */
                struct {
                        u32                id;                   /*   400     4 */
                        u32                sub_id;               /*   404     4 */
                        struct perf_env *  env;                  /*   408     8 */
                } bpf_prog;                                      /*   400    16 */
                union {
                        void *             priv;                 /*   416     8 */
                        u64                db_id;                /*   416     8 */
                };                                               /*   416     8 */
                struct nsinfo *            nsinfo;               /*   424     8 */
                struct dso_id              id;                   /*   432    24 */
                /* --- cacheline 7 boundary (448 bytes) was 8 bytes ago --- */
                refcount_t                 refcnt;               /*   456     4 */
                char                       name[];               /*   460     0 */
      
                /* size: 464, cachelines: 8, members: 49 */
                /* sum members: 440, holes: 4, sum holes: 18 */
                /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 6 bits */
                /* padding: 4 */
                /* forced alignments: 1 */
                /* last cacheline: 16 bytes */
        } __attribute__((__aligned__(8)));
      
      After:
      
        struct dso {
                struct mutex               lock;                 /*     0    40 */
                struct list_head           node;                 /*    40    16 */
                struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
                /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
                struct rb_root *           root;                 /*    80     8 */
                struct rb_root_cached      symbols;              /*    88    16 */
                struct symbol * *          symbol_names;         /*   104     8 */
                size_t                     symbol_names_len;     /*   112     8 */
                struct rb_root_cached      inlined_nodes;        /*   120    16 */
                /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
                struct rb_root_cached      srclines;             /*   136    16 */
                struct {
                        u64                addr;                 /*   152     8 */
                        struct symbol *    symbol;               /*   160     8 */
                } last_find_result;                              /*   152    16 */
                struct build_id            bid;                  /*   168    32 */
                /* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */
                u64                        text_offset;          /*   200     8 */
                u64                        text_end;             /*   208     8 */
                const char  *              short_name;           /*   216     8 */
                const char  *              long_name;            /*   224     8 */
                void *                     a2l;                  /*   232     8 */
                char *                     symsrc_filename;      /*   240     8 */
                struct nsinfo *            nsinfo;               /*   248     8 */
                /* --- cacheline 4 boundary (256 bytes) --- */
                struct auxtrace_cache *    auxtrace_cache;       /*   256     8 */
                union {
                        void *             priv;                 /*   264     8 */
                        u64                db_id;                /*   264     8 */
                };                                               /*   264     8 */
                struct {
                        struct perf_env *  env;                  /*   272     8 */
                        u32                id;                   /*   280     4 */
                        u32                sub_id;               /*   284     4 */
                } bpf_prog;                                      /*   272    16 */
                struct {
                        struct rb_root     cache;                /*   288     8 */
                        struct list_head   open_entry;           /*   296    16 */
                        u64                file_size;            /*   312     8 */
                        /* --- cacheline 5 boundary (320 bytes) --- */
                        u64                elf_base_addr;        /*   320     8 */
                        u64                debug_frame_offset;   /*   328     8 */
                        u64                eh_frame_hdr_addr;    /*   336     8 */
                        u64                eh_frame_hdr_offset;  /*   344     8 */
                        int                fd;                   /*   352     4 */
                        int                status;               /*   356     4 */
                        u32                status_seen;          /*   360     4 */
                } data;                                          /*   288    80 */
      
                /* XXX last struct has 4 bytes of padding */
      
                struct dso_id              id;                   /*   368    24 */
                /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
                unsigned int               a2l_fails;            /*   392     4 */
                int                        comp;                 /*   396     4 */
                refcount_t                 refcnt;               /*   400     4 */
                enum dso_load_errno        load_errno;           /*   404     4 */
                u16                        long_name_len;        /*   408     2 */
                u16                        short_name_len;       /*   410     2 */
                enum dso_binary_type       symtab_type:8;        /*   412: 0  4 */
                enum dso_binary_type       binary_type:8;        /*   412: 8  4 */
                enum dso_space_type        kernel:2;             /*   412:16  4 */
                enum dso_swap_type         needs_swap:2;         /*   412:18  4 */
      
                /* Bitfield combined with next fields */
      
                _Bool                      is_kmod:1;            /*   414: 4  1 */
                u8                         adjust_symbols:1;     /*   414: 5  1 */
                u8                         has_build_id:1;       /*   414: 6  1 */
                u8                         header_build_id:1;    /*   414: 7  1 */
                u8                         has_srcline:1;        /*   415: 0  1 */
                u8                         hit:1;                /*   415: 1  1 */
                u8                         annotate_warned:1;    /*   415: 2  1 */
                u8                         auxtrace_warned:1;    /*   415: 3  1 */
                u8                         short_name_allocated:1; /*   415: 4  1 */
                u8                         long_name_allocated:1; /*   415: 5  1 */
                u8                         is_64_bit:1;          /*   415: 6  1 */
      
                /* XXX 1 bit hole, try to pack */
      
                _Bool                      sorted_by_name;       /*   416     1 */
                _Bool                      loaded;               /*   417     1 */
                u8                         rel;                  /*   418     1 */
                char                       name[];               /*   419     0 */
      
                /* size: 424, cachelines: 7, members: 48 */
                /* sum members: 415 */
                /* sum bitfield members: 31 bits, bit holes: 1, sum bit holes: 1 bits */
                /* padding: 5 */
                /* paddings: 1, sum paddings: 4 */
                /* forced alignments: 1 */
                /* last cacheline: 40 bytes */
        } __attribute__((__aligned__(8)));
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Chengen Du <chengen.du@canonical.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Markus Elfring <Markus.Elfring@web.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20240321160300.1635121-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b3ad832d
    • perf lock contention: Trim backtrace by skipping traceiter functions · 2a5049b7
      Anne Macedo authored
      The 'perf lock contention' program currently shows the caller of the locks
      as __traceiter_contention_begin+0x??. This caller can be ignored, as it
      comes from the traceiter itself. Instead, the real callers of the locks
      should be shown.
      
      Adjusting the --stack-skip parameter makes the actual callers of the
      locks show up. Rather than relying on that, ignore the
      __traceiter_contention_begin and __traceiter_contention_end symbols so
      that the actual callers show up by default.
      
      Before this patch is applied:
      
      sudo perf lock con -a -b -- sleep 3
       contended   total wait     max wait     avg wait         type   caller
      
               8      2.33 s       2.28 s     291.18 ms     rwlock:W   __traceiter_contention_begin+0x44
               4      2.33 s       2.28 s     582.35 ms     rwlock:W   __traceiter_contention_begin+0x44
               7    140.30 ms     46.77 ms     20.04 ms     rwlock:W   __traceiter_contention_begin+0x44
               2     63.35 ms     33.76 ms     31.68 ms        mutex   trace_contention_begin+0x84
               2     46.74 ms     46.73 ms     23.37 ms     rwlock:W   __traceiter_contention_begin+0x44
               1     13.54 us     13.54 us     13.54 us        mutex   trace_contention_begin+0x84
               1      3.67 us      3.67 us      3.67 us      rwsem:R   __traceiter_contention_begin+0x44
      
      Before this patch is applied - using --stack-skip 5
      
      sudo perf lock con --stack-skip 5 -a -b -- sleep 3
       contended   total wait     max wait     avg wait         type   caller
      
               2      2.24 s       2.24 s       1.12 s      rwlock:W   do_epoll_wait+0x5a0
               4      1.65 s     824.21 ms    412.08 ms     rwlock:W   do_exit+0x338
               2    824.35 ms    824.29 ms    412.17 ms     spinlock   get_signal+0x108
               2    824.14 ms    824.14 ms    412.07 ms     rwlock:W   release_task+0x68
               1     25.22 ms     25.22 ms     25.22 ms        mutex   cgroup_kn_lock_live+0x58
               1     24.71 us     24.71 us     24.71 us     spinlock   do_exit+0x44
               1     22.04 us     22.04 us     22.04 us      rwsem:R   lock_mm_and_find_vma+0xb0
      
      After this patch is applied:
      
      sudo ./perf lock con -a -b -- sleep 3
       contended   total wait     max wait     avg wait         type   caller
      
               4      4.13 s       2.07 s       1.03 s      rwlock:W   release_task+0x68
               2      2.07 s       2.07 s       1.03 s      rwlock:R   mm_update_next_owner+0x50
               2      2.07 s       2.07 s       1.03 s      rwlock:W   do_exit+0x338
               1     41.56 ms     41.56 ms     41.56 ms        mutex   cgroup_kn_lock_live+0x58
               2     36.12 us     18.83 us     18.06 us     rwlock:W   do_exit+0x338
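The trimming described in this commit can be illustrated with a minimal sketch. It assumes a backtrace already resolved to symbol names; the helpers `is_traceiter_symbol()` and `first_real_caller()` are hypothetical names for illustration and are not taken from the perf sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true for the traceiter frames that carry no caller information. */
static bool is_traceiter_symbol(const char *name)
{
	static const char *const skipped[] = {
		"__traceiter_contention_begin",
		"__traceiter_contention_end",
	};

	for (size_t i = 0; i < sizeof(skipped) / sizeof(skipped[0]); i++) {
		if (strcmp(name, skipped[i]) == 0)
			return true;
	}
	return false;
}

/* Return the first frame that is not a traceiter symbol, or NULL if there is none. */
static const char *first_real_caller(const char *const *frames, size_t nr_frames)
{
	for (size_t i = 0; i < nr_frames; i++) {
		if (!is_traceiter_symbol(frames[i]))
			return frames[i];
	}
	return NULL;
}
```

Given the frames `__traceiter_contention_begin` followed by `do_exit`, this sketch reports `do_exit` as the caller, without needing a hand-tuned --stack-skip value.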
      Signed-off-by: Anne Macedo <retpolanne@posteo.net>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240319143629.3422590-1-retpolanne@posteo.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Remove info metrics erroneously in TopdownL1 · af34a16d
      Ian Rogers authored
      The bug affected server metrics only. It doesn't impact the default
      metrics, only runs where the TopdownL1 metric group is specified. Passes
      on the fix in:
      
        https://github.com/intel/perfmon/commit/b09f0a3953234ec592b4a872b87764c78da05d8b
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-13-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update snowridgex to 1.22 · 7bce27f8
      Ian Rogers authored
      Update events from 1.21 to 1.22 as released in:
      
        https://github.com/intel/perfmon/commit/ba4f96039f96231b51e3eb69d5a21e2b00f6de5b
      
      Updates various descriptions and removes the event
      UNC_IIO_NUM_REQ_FROM_CPU.IRP.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-12-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update skylake to v58 · 70e7028c
      Ian Rogers authored
      Update events from:
      
        https://github.com/intel/perfmon/commit/f2e5136e062a91ae554dc40530132e66f9271848
      
      This change didn't increase the version number from v58.
      
      Updates various descriptions.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-11-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update skylakex to 1.33 · d70cc755
      Ian Rogers authored
      Update events from 1.32 to 1.33 as released in:
      
        https://github.com/intel/perfmon/commit/3fe7390dd18496c35ec3a9cf17de0473fd5485cb
      
      Various description updates. Adds the event
      OFFCORE_RESPONSE.ALL_READS.L3_HIT.HIT_OTHER_CORE_FWD.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-10-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update sierraforest to 1.02 · bf270b15
      Ian Rogers authored
      Update events from 1.01 to 1.02 as released in:
      
        https://github.com/intel/perfmon/commit/451dd41ae627b56433ad4065bf3632789eb70834
      
      Various description updates. Adds topdown events
      TOPDOWN_BAD_SPECULATION.ALL_P, TOPDOWN_BE_BOUND.ALL_P,
      TOPDOWN_FE_BOUND.ALL_P and TOPDOWN_RETIRING.ALL_P.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-9-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update sapphirerapids to 1.20 · 2edee9e6
      Ian Rogers authored
      Update events from 1.17 to 1.20 as released in:
      
        https://github.com/intel/perfmon/commit/6f674057745acf0125395638ca6be36458a59bda
      
      Various description updates. Adds uncore events
      UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_LOCAL,
      UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_REMOTE,
      UNC_CHA_TOR_INSERTS.IO_ITOM_LOCAL, UNC_CHA_TOR_INSERTS.IO_ITOM_REMOTE,
      UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL,
      UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE,
      UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_LOCAL,
      UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_REMOTE,
      UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_LOCAL,
      UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_REMOTE,
      UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_LOCAL,
      UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_REMOTE and removes core events
      AMX_OPS_RETIRED.BF16 and AMX_OPS_RETIRED.INT8.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update meteorlake to 1.08 · 84d0e8c6
      Ian Rogers authored
      Update events from 1.07 to 1.08 as released in:
      
        https://github.com/intel/perfmon/commit/f0f8f3e163d9eb84e6ce8e2108a22cb43b2527e5
      
      Various description updates. Adds topdown, offcore and uncore events
      OCR.DEMAND_DATA_RD.L3_HIT, OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_NO_FWD,
      OCR.DEMAND_RFO.L3_HIT, OCR.DEMAND_DATA_RD.L3_MISS,
      OCR.DEMAND_RFO.L3_MISS, OCR.DEMAND_DATA_RD.ANY_RESPONSE,
      OCR.DEMAND_DATA_RD.DRAM, OCR.DEMAND_RFO.ANY_RESPONSE,
      OCR.DEMAND_RFO.DRAM, TOPDOWN_BAD_SPECULATION.ALL_P,
      TOPDOWN_BE_BOUND.ALL_P, TOPDOWN_FE_BOUND.ALL_P,
      TOPDOWN_RETIRING.ALL_P, UNC_ARB_DAT_OCCUPANCY.RD and
      UNC_HAC_ARB_COH_TRK_REQUESTS.ALL.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update lunarlake to 1.01 · 3670ffbd
      Ian Rogers authored
      Update events from 1.00 to 1.01 as released in:
      
        https://github.com/intel/perfmon/commit/56ab8d837ac566d51a4d8748b6b4b817a22c9b84
      
      Various encoding and description updates. Adds the events
      CPU_CLK_UNHALTED.CORE, CPU_CLK_UNHALTED.CORE_P,
      CPU_CLK_UNHALTED.REF_TSC_P, CPU_CLK_UNHALTED.THREAD,
      MISC_RETIRED.LBR_INSERTS, TOPDOWN_BAD_SPECULATION.ALL_P,
      TOPDOWN_BE_BOUND.ALL_P, TOPDOWN_FE_BOUND.ALL_P,
      TOPDOWN_RETIRING.ALL_P.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update icelakex to 1.24 · 5157c204
      Ian Rogers authored
      Update events from 1.23 to 1.24 as released in:
      
        https://github.com/intel/perfmon/commit/d883888ae60882028e387b6fe1ebf683beb693fa
      
      Fixes spelling and descriptions. Adds the uncore events
      UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL and
      UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE, while removing
      UNC_IIO_NUM_REQ_FROM_CPU.IRP.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update grandridge to 1.02 · a02dc01c
      Ian Rogers authored
      Update events from 1.01 to 1.02 as released in:
      
        https://github.com/intel/perfmon/commit/b2a81e803add1ba0af68a442c975683d226d868c
      
      Fixes spelling and descriptions. Adds topdown events and the uncore
      cache events
      UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD_OPT,
      UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT,
      UNC_CHA_TOR_OCCUPANCY.IA_DRD_OPT.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update emeraldrapids to 1.06 · 36f353a1
      Ian Rogers authored
      Update events from 1.03 to 1.06 as released in:
      
        https://github.com/intel/perfmon/commit/21a8be3ea7918749141db4036fb65a2343cd865d
      
      Fixes spelling and descriptions. Adds cache miss latency events
      UNC_CHA_TOR_(INSERTS|OCCUPANCY).IO_(PCIRDCUR|ITOM|ITOMCACHENEAR)_(LOCAL|REMOTE).
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events intel: Update cascadelakex to 1.21 · 4376424a
      Ian Rogers authored
      Update events from 1.20 to 1.21 as released in:
      
        https://github.com/intel/perfmon/commit/fcfdba3be8f3be81ad6b509fdebf953ead92dc2c
      
      Largely fixes spelling and descriptions.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Link: https://lore.kernel.org/r/20240321060016.1464787-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Add missing libgen.h header needed for using basename() · 58103715
      Arnaldo Carvalho de Melo authored
      This prototype is obtained indirectly, by luck, from some other header
      in probe-event.c on most systems, but the build recently broke on
      alpine:edge:
      
         8    13.39 alpine:edge                   : FAIL gcc version 13.2.1 20240309 (Alpine 13.2.1_git20240309)
          util/probe-event.c: In function 'convert_exec_to_group':
          util/probe-event.c:225:16: error: implicit declaration of function 'basename' [-Werror=implicit-function-declaration]
            225 |         ptr1 = basename(exec_copy);
                |                ^~~~~~~~
          util/probe-event.c:225:14: error: assignment to 'char *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
            225 |         ptr1 = basename(exec_copy);
                |              ^
          cc1: all warnings being treated as errors
          make[3]: *** [/git/perf-6.8.0/tools/build/Makefile.build:158: util] Error 2
      
      Fix it by adding the libgen.h header where basename() is prototyped.
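
The fix can be pictured with a small standalone sketch that includes libgen.h explicitly before calling basename(). The helper name `last_component()` is hypothetical and is not part of probe-event.c; the copy is made because the POSIX basename() is allowed to modify its argument.

```c
#include <libgen.h>	/* declares the POSIX basename() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write the last component of 'path' into 'out'. */
static int last_component(const char *path, char *out, size_t outsz)
{
	size_t len = strlen(path) + 1;
	char *copy = malloc(len);	/* basename() may modify its argument */
	int ret;

	if (!copy)
		return -1;
	memcpy(copy, path, len);

	ret = snprintf(out, outsz, "%s", basename(copy));
	free(copy);

	/* Report failure if the name did not fit into 'out'. */
	return (ret < 0 || (size_t)ret >= outsz) ? -1 : 0;
}
```

With the header included, the compiler sees a proper prototype, so neither the implicit-declaration error nor the int-to-pointer conversion error shown above should be triggered.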
      
      Fixes: fb7345bb ("perf probe: Support basic dwarf-based operations on uprobe events")
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Fix 'newfstatat'/'fstatat' argument pretty printing · 0831638e
      Arnaldo Carvalho de Melo authored
      There were two needless entries, one for 'newfstatat' and another for
      'fstatat'. Keep just one and pretty print its 'flags' argument using the
      fs_at_flags scnprintf that is also used by other FS syscalls such as
      'stat'. Now:
      
        root@number:~# perf trace -e newfstatat --max-events=5
             0.000 ( 0.010 ms): abrt-dump-jour/1400 newfstatat(dfd: 7, filename: "", statbuf: 0x7fff0d127000, flag: EMPTY_PATH) = 0
             0.020 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 9, filename: "", statbuf: 0x55752507b0e8, flag: EMPTY_PATH) = 0
             0.039 ( 0.004 ms): abrt-dump-jour/1400 newfstatat(dfd: 19, filename: "", statbuf: 0x557525061378, flag: EMPTY_PATH) = 0
             0.047 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 20, filename: "", statbuf: 0x5575250b8cc8, flag: EMPTY_PATH) = 0
             0.053 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 22, filename: "", statbuf: 0x5575250535d8, flag: EMPTY_PATH) = 0
        root@number:~#
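
For context, a call of the following shape is one way a program could end up producing trace lines like the ones above: an empty filename combined with AT_EMPTY_PATH makes fstatat() operate on the descriptor itself. This is a sketch under assumptions: the helper name `stat_open_fd()` is hypothetical, the fallback value 0x1000 is the EMPTY_PATH value listed in the fs_at_flags table elsewhere in this log, and whether the C library issues the 'newfstatat' syscall depends on the architecture and libc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc exposes AT_EMPTY_PATH under this macro */
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000	/* fallback for headers that do not define it */
#endif

/* Hypothetical helper: stat an already open descriptor through an empty path. */
static int stat_open_fd(int fd, struct stat *st)
{
	return fstatat(fd, "", st, AT_EMPTY_PATH);
}
```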
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-6-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Beautify the 'flags' arg of unlinkat · 4d923282
      Arnaldo Carvalho de Melo authored
      Reuse the fs_at_flags array introduced for the 'stat' syscall.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-5-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf beauty: Introduce faccessat2 flags scnprintf routine · b8171a84
      Arnaldo Carvalho de Melo authored
      The faccessat and faccessat2 syscalls now have beautifiers for their
      arguments.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-4-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf beauty: Introduce scrape script for the 'statx' syscall 'mask' argument · f122b3d6
      Arnaldo Carvalho de Melo authored
      It was using the first variation on producing a string representation
      for a binary flag, one that used the system's stat.h and preprocessor
      tricks that had to be updated every time a new flag was introduced.
      
      Use the more recent scrape script + strarray +
      strarray__scnprintf_flags() combo.
      
        $ tools/perf/trace/beauty/statx_mask.sh
        static const char *statx_mask[] = {
        	[ilog2(0x00000001) + 1] = "TYPE",
        	[ilog2(0x00000002) + 1] = "MODE",
        	[ilog2(0x00000004) + 1] = "NLINK",
        	[ilog2(0x00000008) + 1] = "UID",
        	[ilog2(0x00000010) + 1] = "GID",
        	[ilog2(0x00000020) + 1] = "ATIME",
        	[ilog2(0x00000040) + 1] = "MTIME",
        	[ilog2(0x00000080) + 1] = "CTIME",
        	[ilog2(0x00000100) + 1] = "INO",
        	[ilog2(0x00000200) + 1] = "SIZE",
        	[ilog2(0x00000400) + 1] = "BLOCKS",
        	[ilog2(0x00000800) + 1] = "BTIME",
        	[ilog2(0x00001000) + 1] = "MNT_ID",
        	[ilog2(0x00002000) + 1] = "DIOALIGN",
        	[ilog2(0x00004000) + 1] = "MNT_ID_UNIQUE",
        };
        $
      
      Now we need a copy of uapi/linux/stat.h from tools/include/ in the
      scrape-only directory tools/perf/trace/beauty/include.
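
To show how a table indexed by ilog2(bit) + 1 can be turned into a readable string, here is a minimal standalone sketch. It is not perf's strarray__scnprintf_flags(): the names `statx_mask_demo` and `demo_scnprintf_flags()` are hypothetical, only the first four entries of the generated table are reproduced, and the indices are written out by hand instead of using ilog2().

```c
#include <stddef.h>
#include <stdio.h>

/* Subset of the generated table; the index is ilog2(bit) + 1, as in the script output. */
static const char *const statx_mask_demo[] = {
	[1] = "TYPE",	/* ilog2(0x00000001) + 1 */
	[2] = "MODE",	/* ilog2(0x00000002) + 1 */
	[3] = "NLINK",	/* ilog2(0x00000004) + 1 */
	[4] = "UID",	/* ilog2(0x00000008) + 1 */
};

/* Hypothetical printer: writes the names of the set bits, joined by '|', into bf. */
static int demo_scnprintf_flags(char *bf, size_t size, unsigned long flags)
{
	const size_t nr = sizeof(statx_mask_demo) / sizeof(statx_mask_demo[0]);
	size_t printed = 0;

	if (size == 0)
		return 0;
	bf[0] = '\0';

	for (size_t i = 1; i < nr; i++) {
		unsigned long bit = 1UL << (i - 1);
		int n;

		if (!(flags & bit) || !statx_mask_demo[i])
			continue;

		n = snprintf(bf + printed, size - printed, "%s%s",
			     printed ? "|" : "", statx_mask_demo[i]);
		if (n < 0 || (size_t)n >= size - printed) {
			bf[printed] = '\0';	/* drop the partial name on truncation */
			return (int)printed;
		}
		printed += (size_t)n;
		flags &= ~bit;
	}

	if (flags) {	/* bits without a name are shown in hex rather than dropped */
		int n = snprintf(bf + printed, size - printed, "%s%#lx",
				 printed ? "|" : "", flags);

		if (n < 0 || (size_t)n >= size - printed)
			bf[printed] = '\0';
		else
			printed += (size_t)n;
	}
	return (int)printed;
}
```

Because the table is plain data, supporting a new flag in a scheme like this only requires regenerating the array, which is the maintenance benefit the commit message points to.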
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-3-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      perf beauty: Introduce scrape script for various fs syscalls 'flags' arguments · 3d6cfbaf
      Arnaldo Carvalho de Melo authored
      The 'flags' beautifier for these fs syscalls was using the first
      variation on producing a string representation for a binary flag, one
      that used the system's fcntl.h and preprocessor tricks that had to be
      updated every time a new flag was introduced.
      
      Use the more recent scrape script + strarray + strarray__scnprintf_flags() combo.
      
        $ tools/perf/trace/beauty/fs_at_flags.sh
        static const char *fs_at_flags[] = {
        	[ilog2(0x100) + 1] = "SYMLINK_NOFOLLOW",
        	[ilog2(0x200) + 1] = "REMOVEDIR",
        	[ilog2(0x400) + 1] = "SYMLINK_FOLLOW",
        	[ilog2(0x800) + 1] = "NO_AUTOMOUNT",
        	[ilog2(0x1000) + 1] = "EMPTY_PATH",
        	[ilog2(0x0000) + 1] = "STATX_SYNC_AS_STAT",
        	[ilog2(0x2000) + 1] = "STATX_FORCE_SYNC",
        	[ilog2(0x4000) + 1] = "STATX_DONT_SYNC",
        	[ilog2(0x8000) + 1] = "RECURSIVE",
        	[ilog2(0x80000000) + 1] = "GETATTR_NOSEC",
        };
        $
      
      Now we need a copy of uapi/linux/fcntl.h from tools/include/ in the
      scrape only directory tools/perf/trace/beauty/include and will use that
      fs_at_flags array for other fs syscalls.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-2-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Run tests in parallel by default · 4cef0e7a
      Ian Rogers authored
      Switch from running tests sequentially to running in parallel by
      default. Change the opt-in '-p' or '--parallel' flag to '-S' or
      '--sequential'.
      
      On an 8 core tigerlake an address sanitizer run time changes from:
      
        326.54user 622.73system 6:59.91elapsed 226%CPU
      
      to:
      
        973.02user 583.98system 3:01.17elapsed 859%CPU
      
      So over twice as fast, saving 4 minutes.
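
      The arithmetic behind that summary can be checked with a few lines of
      C. This is only a worked example on the two time(1) lines quoted above;
      the helper names are hypothetical.

```c
/* Convert a time(1)-style "minutes:seconds" elapsed value to seconds. */
static double elapsed_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* How many times faster the second run finished. */
static double speedup(double before_s, double after_s)
{
	return before_s / after_s;
}

/* Average number of busy CPUs: total CPU time over wall-clock time. */
static double cpu_utilisation(double user_s, double sys_s, double elapsed_s)
{
	return (user_s + sys_s) / elapsed_s;
}
```

      Plugging in the figures: 6:59.91 is 419.91s and 3:01.17 is 181.17s, so
      the parallel run is about 2.32 times faster and saves 238.74s, just
      under 4 minutes. Total CPU time rises from 949.27s to 1557.00s, which
      matches the reported 226% and 859% CPU.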
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240301174711.2646944-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf help: Lower levenshtein penality for deleting character · 7aea01ea
      Ian Rogers authored
      The Levenshtein penalty for deleting a character was far higher than
      that for substituting or inserting a character. Lower the penalty to
      match that of inserting a character.
      
      Before:
      
        $ perf recccord
        perf: 'recccord' is not a perf-command. See 'perf --help'.
        $
      
      After:
      
        $ perf recccord
        perf: 'recccord' is not a perf-command. See 'perf --help'.
      
        Did you mean this?
                record
        $
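
      To see why the deletion penalty matters for a typo such as 'recccord',
      here is a minimal sketch of a weighted edit distance. It is not perf's
      levenshtein() function: weighted_distance() is a hypothetical name, the
      costs used in the note below are illustrative rather than the values
      perf uses, and swapped adjacent characters are not handled.

```c
#include <stdlib.h>
#include <string.h>

static int min3(int a, int b, int c)
{
	int m = a < b ? a : b;

	return m < c ? m : c;
}

/*
 * Cost of turning 'from' into 'to' with separate penalties for
 * substituting, inserting and deleting a character.
 * Returns -1 if memory cannot be allocated.
 */
static int weighted_distance(const char *from, const char *to,
			     int sub_cost, int ins_cost, int del_cost)
{
	size_t n = strlen(from), m = strlen(to), i, j;
	int *prev = malloc((m + 1) * sizeof(*prev));
	int *cur = malloc((m + 1) * sizeof(*cur));
	int result;

	if (!prev || !cur) {
		free(prev);
		free(cur);
		return -1;
	}

	/* Empty prefix of 'from': build 'to' purely by insertions. */
	for (j = 0; j <= m; j++)
		prev[j] = (int)j * ins_cost;

	for (i = 1; i <= n; i++) {
		int *tmp;

		/* Empty prefix of 'to': drop the first i characters. */
		cur[0] = (int)i * del_cost;
		for (j = 1; j <= m; j++) {
			int same = from[i - 1] == to[j - 1];
			int sub = prev[j - 1] + (same ? 0 : sub_cost);
			int ins = cur[j - 1] + ins_cost;
			int del = prev[j] + del_cost;

			cur[j] = min3(sub, ins, del);
		}
		tmp = prev;
		prev = cur;
		cur = tmp;
	}

	result = prev[m];
	free(prev);
	free(cur);
	return result;
}
```

      Going from 'recccord' to 'record' needs two deletions. With assumed
      costs of substitution 2 and insertion 1, a deletion cost of 3 gives a
      distance of 6, while a deletion cost of 1 gives 2, which is far more
      likely to fall under whatever similarity cut-off a caller applies.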
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240301201306.2680986-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Suggest inbuilt commands for unknown command · f664d515
      Ian Rogers authored
      The existing unknown command code looks for perf scripts like
      perf-archive.sh and perf-iostat.sh, however, inbuilt commands aren't
      suggested. Add the inbuilt commands so they may be suggested too.
      
      Before:
      
        $ perf reccord
        perf: 'reccord' is not a perf-command. See 'perf --help'.
        $
      
      After:
      
        $ perf reccord
        perf: 'reccord' is not a perf-command. See 'perf --help'.
      
        Did you mean this?
                record
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240301201306.2680986-1-irogers@google.com
      [ Added some fixes from Ian to problems I noticed while testing ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Read child test 10 times a second rather than 1 · 5f2f051a
      Ian Rogers authored
      Make the perf test output smoother by timing out the poll of the child
      process after 100ms rather than 1s.
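
      The effect of the shorter timeout can be sketched with plain poll(2).
      This is an assumption-laden illustration, not the perf test harness
      code: wait_for_output() is a hypothetical helper, and the caller is
      assumed to loop, refreshing its display on every timeout.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data on fd.
 * Returns 1 if the fd is readable or the writer hung up,
 * 0 if the timeout expired, and -1 on error.
 */
static int wait_for_output(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;	/* nothing yet: caller can update progress */
	return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

      Calling wait_for_output(fd, 100) instead of wait_for_output(fd, 1000)
      lets such a loop wake up ten times a second when the child is quiet.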
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Use a single fd for the child process out/err · e120f709
      Ian Rogers authored
      Switch from dumping err then out, to a single file descriptor for both
      of them. This allows the err and output to be correctly interleaved in
      verbose output.
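
      The general technique can be sketched as follows, assuming a plain
      fork() plus pipe() setup rather than whatever helper the perf tests
      actually use. run_child_merged(), say() and demo_child() are
      hypothetical names. Because stdout and stderr share one pipe, the
      parent reads the child's writes in the order they were issued.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child whose stdout and stderr both point at one pipe, then
 * collect everything it wrote into buf (NUL-terminated).
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t run_child_merged(void (*child_fn)(void), char *buf, size_t size)
{
	size_t total = 0;
	ssize_t n;
	int fds[2];
	pid_t pid;

	if (size == 0 || pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: route both streams to the pipe's write end. */
		close(fds[0]);
		dup2(fds[1], STDOUT_FILENO);
		dup2(fds[1], STDERR_FILENO);
		close(fds[1]);
		child_fn();
		_exit(0);
	}

	/* Parent: read until the child closes its end or buf is full. */
	close(fds[1]);
	while (total < size - 1 &&
	       (n = read(fds[0], buf + total, size - 1 - total)) > 0)
		total += (size_t)n;
	buf[total] = '\0';
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return (ssize_t)total;
}

/* Write a string with unbuffered write(2); give up on a short write. */
static void say(int fd, const char *s)
{
	size_t len = strlen(s);

	if (write(fd, s, len) != (ssize_t)len)
		_exit(1);
}

/* Example child that alternates between the two streams. */
static void demo_child(void)
{
	say(STDOUT_FILENO, "out 1\n");
	say(STDERR_FILENO, "err 1\n");
	say(STDOUT_FILENO, "out 2\n");
}
```

      With two separate descriptors dumped one after the other, the same
      child would show both "out" lines grouped together and the "err" line
      apart from them; with the shared pipe the three lines keep their
      original order.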
      
      Fixes: b482f5f8 ("perf tests: Add option to run tests in parallel")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Stat output per thread of just the parent process · f68c981b
      Ian Rogers authored
      Per-thread mode requires either system-wide (-a), a pid (-p) or a tid
      (-t).
      
      The stat output tests were using system-wide mode but this is racy when
      threads are starting and exiting - something that happens a lot when
      running the tests in parallel (perf test -p).
      
      Avoid the race conditions by using pid mode with the pid of the parent
      process.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Delete session after stopping sideband thread · 88ce0106
      Ian Rogers authored
      The session has a header in it which contains a perf env with
      bpf_progs. The bpf_progs are accessed by the sideband thread and so
      the sideband thread must be stopped before the session is deleted, to
      avoid a use after free.  This error was detected by AddressSanitizer
      in the following:
      
        ==2054673==ERROR: AddressSanitizer: heap-use-after-free on address 0x61d000161e00 at pc 0x55769289de54 bp 0x7f9df36d4ab0 sp 0x7f9df36d4aa8
        READ of size 8 at 0x61d000161e00 thread T1
            #0 0x55769289de53 in __perf_env__insert_bpf_prog_info util/env.c:42
            #1 0x55769289dbb1 in perf_env__insert_bpf_prog_info util/env.c:29
            #2 0x557692bbae29 in perf_env__add_bpf_info util/bpf-event.c:483
            #3 0x557692bbb01a in bpf_event__sb_cb util/bpf-event.c:512
            #4 0x5576928b75f4 in perf_evlist__poll_thread util/sideband_evlist.c:68
            #5 0x7f9df96a63eb in start_thread nptl/pthread_create.c:444
            #6 0x7f9df9726a4b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      
        0x61d000161e00 is located 384 bytes inside of 2136-byte region [0x61d000161c80,0x61d0001624d8)
        freed by thread T0 here:
            #0 0x7f9dfa6d7288 in __interceptor_free libsanitizer/asan/asan_malloc_linux.cpp:52
            #1 0x557692978d50 in perf_session__delete util/session.c:319
            #2 0x557692673959 in __cmd_record tools/perf/builtin-record.c:2884
            #3 0x55769267a9f0 in cmd_record tools/perf/builtin-record.c:4259
            #4 0x55769286710c in run_builtin tools/perf/perf.c:349
            #5 0x557692867678 in handle_internal_command tools/perf/perf.c:402
            #6 0x557692867a40 in run_argv tools/perf/perf.c:446
            #7 0x557692867fae in main tools/perf/perf.c:562
            #8 0x7f9df96456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      
      Fixes: 657ee553 ("perf evlist: Introduce side band thread")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add/use PMU reverse lookup from config to name · 67ee8e71
      Ian Rogers authored
      Add perf_pmu__name_from_config that does a reverse lookup from a
      config number to an alias name. The lookup is expensive as the config
      is computed for every alias by filling in a perf_event_attr, but this
      is only done when verbose output is enabled. The lookup also only
      considers config, and not config1, config2 or config3.
      
      An example of the output:
      
        $ perf stat -vv -e data_read true
        ...
        perf_event_attr:
          type                             24 (uncore_imc_free_running_0)
          size                             136
          config                           0x20ff (data_read)
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ...
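
      The idea of a reverse lookup from config to name can be sketched as a
      linear scan over a table of aliases. This is a simplification, not
      perf_pmu__name_from_config(): struct demo_alias, demo_aliases and
      demo_name_from_config() are hypothetical, and each config is stored
      ready-made here, whereas the commit describes computing it per alias by
      filling in a perf_event_attr. As in the commit, only config is
      compared.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alias record; not perf's own alias structure. */
struct demo_alias {
	const char *name;
	uint64_t config;
};

static const struct demo_alias demo_aliases[] = {
	{ "data_read",  0x20ff },	/* value from the output above */
	{ "data_write", 0x30ff },	/* illustrative value */
};

/* Return the first alias whose config matches, or NULL if none does. */
static const char *demo_name_from_config(const struct demo_alias *aliases,
					 size_t nr, uint64_t config)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (aliases[i].config == config)
			return aliases[i].name;
	}
	return NULL;
}
```

      A scan like this costs one comparison per alias, so reserving it for
      verbose output keeps it off the common path.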
      
      Committer notes:
      
      Fix the python binding build by adding dummies for not strictly
      needed perf_pmu__name_from_config() and perf_pmus__find_by_type().
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>