1. 01 Aug, 2024 1 commit
    • perf arm-spe: Extract evsel setting up · ccd6fcda
      Leo Yan authored
      The evsel for Arm SPE PMU needs to be set up. Extract the setting up
      into a function arm_spe_setup_evsel().
      Signed-off-by: Leo Yan <leo.yan@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: coresight@lists.linaro.org
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: linux-perf-users@vger.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ccd6fcda
  2. 31 Jul, 2024 39 commits
    • perf test: make metric validation test return early when there is no metric supported on the test system · 4ed0f392
      Weilin Wang authored
      
      Add a check so that the metric validation test returns early when 'perf list
      metric' does not output any metric. This happens when NO_JEVENTS=1 is set or on
      a system where no metric is supported.
      Signed-off-by: Weilin Wang <weilin.wang@intel.com>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Samantha Alt <samantha.alt@intel.com>
      Link: https://lore.kernel.org/lkml/20240522204254.1841420-1-weilin.wang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4ed0f392
    • perf ftrace profile: Add -s/--sort option · 74ae366c
      Namhyung Kim authored
      The -s/--sort option sorts the output by the given column.
      
        $ sudo perf ftrace profile -s max sync | head
        # Total (us)   Avg (us)   Max (us)      Count   Function
            6301.811   6301.811   6301.811          1   __do_sys_sync
            6301.328   6301.328   6301.328          1   ksys_sync
            5320.300   1773.433   2858.819          3   iterate_supers
            2755.875     17.012   2610.633        162   sync_fs_one_sb
            2728.351    682.088   2610.413          4   ext4_sync_fs [ext4]
            2603.654   2603.654   2603.654          1   jbd2_log_wait_commit [jbd2]
            4750.615    593.827   2597.427          8   schedule
            2164.986     26.728   2115.673         81   sync_inodes_one_sb
            2143.842     26.467   2115.438         81   sync_inodes_sb
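
      As a rough illustration of what sorting by a column means here, the sketch below orders a few rows copied from the sample output by their Max value, largest first (the order the output above appears to use). The key names other than "max" and the tuple layout are assumptions made for this example, not perf's actual option values or data structures.

```python
# Rows copied from the sample output above: (total_us, avg_us, max_us, count, function).
rows = [
    (5320.300, 1773.433, 2858.819, 3, "iterate_supers"),
    (6301.811, 6301.811, 6301.811, 1, "__do_sys_sync"),
    (4750.615, 593.827, 2597.427, 8, "schedule"),
    (2755.875, 17.012, 2610.633, 162, "sync_fs_one_sb"),
]

# Hypothetical mapping from sort-key name to tuple index; only "max" is
# confirmed by the commit message.
COLUMNS = {"total": 0, "avg": 1, "max": 2, "count": 3}


def sort_rows(rows, key):
    """Return rows ordered by the given column, largest value first."""
    return sorted(rows, key=lambda r: r[COLUMNS[key]], reverse=True)


by_max = sort_rows(rows, "max")
for total, avg, mx, count, func in by_max:
    print(f"{total:12.3f} {avg:10.3f} {mx:10.3f} {count:10d}   {func}")
```

      Note that 'schedule' has a larger total than 'sync_fs_one_sb' but a smaller max, so it ends up lower when sorting by max.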
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/lkml/20240729004127.238611-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      74ae366c
    • perf ftrace: Add 'profile' command · 0f223813
      Namhyung Kim authored
      The 'perf ftrace profile' command collects function execution profiles
      using the function-graph tracer so that users can easily see the total,
      average and max execution time as well as the number of invocations.
      
      The following is a profile for the perf_event_open syscall.
      
        $ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
          perf stat -e cycles -C1 true 2> /dev/null | head
        # Total (us)   Avg (us)   Max (us)      Count   Function
              65.611     65.611     65.611          1   __x64_sys_perf_event_open
              30.527     30.527     30.527          1   anon_inode_getfile
              30.260     30.260     30.260          1   __anon_inode_getfile
              29.700     29.700     29.700          1   alloc_file_pseudo
              17.578     17.578     17.578          1   d_alloc_pseudo
              17.382     17.382     17.382          1   __d_alloc
              16.738     16.738     16.738          1   kmem_cache_alloc_lru
              15.686     15.686     15.686          1   perf_event_alloc
              14.012      7.006     11.264          2   obj_cgroup_charge
        #
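
      The four columns can be thought of as a per-function aggregation over individual call durations. The sketch below is only a model of that aggregation, not perf's implementation; the per-call durations are hypothetical values chosen so that they reproduce two rows of the table above.

```python
from collections import defaultdict


def profile(calls):
    """Aggregate (function, duration_us) pairs into total/avg/max/count."""
    stats = defaultdict(lambda: {"total": 0.0, "max": 0.0, "count": 0})
    for func, duration in calls:
        s = stats[func]
        s["total"] += duration
        s["max"] = max(s["max"], duration)
        s["count"] += 1
    for s in stats.values():
        s["avg"] = s["total"] / s["count"]
    return dict(stats)


# Hypothetical per-call durations (in us) consistent with the rows for
# perf_event_alloc and obj_cgroup_charge shown above.
calls = [
    ("perf_event_alloc", 15.686),
    ("obj_cgroup_charge", 11.264),
    ("obj_cgroup_charge", 2.748),
]
result = profile(calls)
for func, s in result.items():
    print(f"{s['total']:12.3f} {s['avg']:10.3f} {s['max']:10.3f} {s['count']:10d}   {func}")
```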
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/lkml/20240729004127.238611-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0f223813
    • perf ftrace: Factor out check_ftrace_capable() · 608585f4
      Namhyung Kim authored
      The check is common to the ftrace commands, so move it out into its own function.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/lkml/20240729004127.238611-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      608585f4
    • perf ftrace: Add 'tail' option to --graph-opts · c7780089
      Namhyung Kim authored
      The 'graph-tail' option prints the function name as a comment after the closing brace.
      This is useful when a large function is mixed with other functions
      (possibly from different CPUs).
      
      For example,
      
        $ sudo perf ftrace -- perf stat true
        ...
         1)               |    get_unused_fd_flags() {
         1)               |      alloc_fd() {
         1)   0.178 us    |        _raw_spin_lock();
         1)   0.187 us    |        expand_files();
         1)   0.169 us    |        _raw_spin_unlock();
         1)   1.211 us    |      }
         1)   1.503 us    |    }
      
        $ sudo perf ftrace --graph-opts tail -- perf stat true
        ...
         1)               |    get_unused_fd_flags() {
         1)               |      alloc_fd() {
         1)   0.099 us    |        _raw_spin_lock();
         1)   0.083 us    |        expand_files();
         1)   0.081 us    |        _raw_spin_unlock();
         1)   0.601 us    |      } /* alloc_fd */
         1)   0.751 us    |    } /* get_unused_fd_flags */
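
      The pairing of each closing brace with its function can be pictured as a simple stack. The sketch below post-processes text lines in the format shown above; it is only an illustration of the idea (the real annotation is produced by the tracer, not by text rewriting) and it assumes all lines come from a single CPU, so one stack is enough.

```python
import re


def add_tail_comments(lines):
    """Append '/* name */' to each closing brace, matched to its opening call."""
    stack = []
    out = []
    for line in lines:
        opened = re.search(r"(\w+)\(\) \{$", line)
        if opened:
            # Remember the function whose body starts on this line.
            stack.append(opened.group(1))
        elif line.rstrip().endswith("}") and stack:
            # Close the most recently opened function.
            line = f"{line} /* {stack.pop()} */"
        out.append(line)
    return out


trace = [
    " 1)               |    get_unused_fd_flags() {",
    " 1)               |      alloc_fd() {",
    " 1)   0.099 us    |        _raw_spin_lock();",
    " 1)   0.601 us    |      }",
    " 1)   0.751 us    |    }",
]
annotated = add_tail_comments(trace)
print("\n".join(annotated))
```

      With output from several CPUs interleaved, a sketch like this would need one stack per CPU, which is the situation where the trailing comment is most helpful to a reader.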
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/lkml/20240729004127.238611-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c7780089
    • perf test pmu: Remove unused test_pmus · 156e8dcf
      Dr. David Alan Gilbert authored
      Commit aa1551f2 ("perf test pmu: Refactor format test and exposed
      test APIs") added the 'test_pmus' list, but didn't use it.
      (It seems to put them on the other_pmus list?)
      
      Remove it.
      
      Fixes: aa1551f2 ("perf test pmu: Refactor format test and exposed test APIs")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Link: https://lore.kernel.org/lkml/20240727175919.1041468-1-linux@treblig.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      156e8dcf
    • perf tools: Enable evsel__is_aux_event() to work for S390_CPUMSF · feab89bf
      Adrian Hunter authored
      evsel__is_aux_event() identifies AUX area tracing selected events.
      
      S390_CPUMSF uses a raw event type (PERF_TYPE_RAW, see
      s390_cpumsf_evsel_is_auxtrace()), not a PMU type value that could be checked
      in evsel__is_aux_event(). However, it sets needs_auxtrace_mmap (see
      auxtrace_record__init()), so check that first.
      
      Currently, the features that use evsel__is_aux_event() are used only by
      Intel PT, but that may change in the future.
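
      The check order described above can be modelled as follows. This is a simplified Python model, not the C code in perf: the class and field names mirror needs_auxtrace_mmap from this message and the pmu->auxtrace flag mentioned in the ARM/ARM64 commit of the same series, and the fallback to that flag is an assumption about how the two pieces fit together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pmu:
    name: str
    auxtrace: bool = False  # models pmu->auxtrace


@dataclass
class Evsel:
    name: str
    needs_auxtrace_mmap: bool = False
    pmu: Optional[Pmu] = None


def is_aux_event(evsel: Evsel) -> bool:
    """Check needs_auxtrace_mmap first, then fall back to the PMU flag."""
    if evsel.needs_auxtrace_mmap:
        return True
    return evsel.pmu is not None and evsel.pmu.auxtrace


# Hypothetical events for illustration only.
s390_raw = Evsel("cpumsf raw event", needs_auxtrace_mmap=True)
intel_pt = Evsel("intel_pt//", pmu=Pmu("intel_pt", auxtrace=True))
cycles = Evsel("cycles", pmu=Pmu("cpu"))

for ev in (s390_raw, intel_pt, cycles):
    print(ev.name, is_aux_event(ev))
```

      The first event has no PMU to inspect at all, which is why a check based only on the PMU would miss it.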
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Leo Yan <leo.yan@arm.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20240715160712.127117-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      feab89bf
    • perf tools: Enable evsel__is_aux_event() to work for ARM/ARM64 · c91928a8
      Adrian Hunter authored
      Set pmu->auxtrace on ARM/ARM64 AUX area PMUs. evsel__is_aux_event() needs
      the setting to identify AUX area tracing selected events.
      
      Currently, the features that use evsel__is_aux_event() are used only by
      Intel PT, but that may change in the future.
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Leo Yan <leo.yan@arm.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20240715160712.127117-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c91928a8
    • perf scripts python cs-etm: Restore first sample log in verbose mode · ae8e4f40
      James Clark authored
      The linked commit moved the early return on the first sample to before
      the verbose log, so move the log earlier too. Now the first sample is
      also logged and not skipped.
      
      Fixes: 2d98dbb4 ("perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample")
      Reviewed-by: Leo Yan <leo.yan@arm.com>
      Signed-off-by: James Clark <james.clark@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Benjamin Gray <bgray@linux.ibm.com>
      Cc: coresight@lists.linaro.org
      Cc: gankulkarni@os.amperecomputing.com
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ruidong Tian <tianruidong@linux.alibaba.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Link: https://lore.kernel.org/r/20240723132858.12747-1-james.clark@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ae8e4f40
    • perf cs-etm: Output 0 instead of 0xdeadbeef when exception packets are flushed · 41947446
      James Clark authored
      Normally exception packets don't directly output a branch sample, but
      if they're the last record in a buffer then they will. Because they
      don't have addresses set we'll see the placeholder value
      CS_ETM_INVAL_ADDR (0xdeadbeef) in the output.
      
      Since commit 6035b680 ("perf cs-etm: Support dummy address value for
      CS_ETM_TRACE_ON packet") we've used 0 as an externally visible "not set"
      address value. For consistency reasons and to not make exceptions look
      like an error, change them to use 0 too.
      
      This is particularly visible when doing userspace only tracing because
      trace is disabled when jumping to the kernel, causing the flush and then
      forcing the last exception packet to be emitted as a branch. With kernel
      trace included, there is no flush so exception packets don't generate
      samples until the next range packet and they'll pick up the correct
      address.
      
      Before:
      
        $ perf record -e cs_etm//u -- stress -i 1 -t 1
        $ perf script -F comm,ip,addr,flags
      
        stress   syscall                    ffffb7eedbc0 => deadbeefdeadbeef
        stress   syscall                    ffffb7f14a14 => deadbeefdeadbeef
        stress   syscall                    ffffb7eedbc0 => deadbeefdeadbeef
      
      After:
      
        stress   syscall                    ffffb7eedbc0 =>                0
        stress   syscall                    ffffb7f14a14 =>                0
        stress   syscall                    ffffb7eedbc0 =>                0
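
      The visible effect of the change can be sketched as a mapping from the internal placeholder to 0 at output time. The sketch below is only a model: the constant uses the 64-bit pattern seen in the "Before" output, and the function names and column widths are assumptions for illustration, not perf's code.

```python
# Placeholder as it appears in the "Before" output; the exact definition
# of the constant in perf is an assumption here.
CS_ETM_INVAL_ADDR = 0xDEADBEEFDEADBEEF


def visible_addr(addr: int) -> int:
    """Map the internal placeholder to 0, the externally visible 'not set' value."""
    return 0 if addr == CS_ETM_INVAL_ADDR else addr


def format_branch(comm: str, flags: str, ip: int, addr: int) -> str:
    """Format one branch sample roughly like the output above."""
    return f"{comm:<8} {flags:<26} {ip:x} => {visible_addr(addr):16x}"


print(format_branch("stress", "syscall", 0xFFFFB7EEDBC0, CS_ETM_INVAL_ADDR))
print(format_branch("stress", "syscall", 0xFFFFB7EEDBC0, 0xFFFFB7F14A14))
```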
      Reviewed-by: Mike Leach <mike.leach@linaro.org>
      Signed-off-by: James Clark <james.clark@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@arm.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: gankulkarni@os.amperecomputing.com
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20240722152756.59453-2-james.clark@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      41947446
    • perf inject: Convert comma to semicolon · 496cae1b
      Chen Ni authored
      Replace a comma between expression statements by a semicolon.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240716075347.969041-1-nichen@iscas.ac.cn
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      496cae1b
    • perf daemon: Convert comma to semicolon · e60fc19e
      Chen Ni authored
      Replace a comma between expression statements by a semicolon.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240716074340.968909-1-nichen@iscas.ac.cn
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e60fc19e
    • perf annotate: Convert comma to semicolon · 050f2a03
      Chen Ni authored
      Replace a comma between expression statements by a semicolon.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240716073405.968801-1-nichen@iscas.ac.cn
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      050f2a03
    • perf vendor events power10: Update JSON/events · 42d37fc0
      Kajol Jain authored
      Update JSON/events for power10 platform with additional events.
      
      Also move PM_VECTOR_LD_CMPL event from others.json to frontend.json
      file.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: hbathini@linux.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20240723052154.96202-1-kjain@linux.ibm.com
      [ Remove alternative to ' char that made the build break in some distros with a unicode parsing python error ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      42d37fc0
    • perf annotate: Set instruction name to be used with insn-stat when using raw instruction · 2c9db747
      Athira Rajeev authored
      Since the "ins.name" is not set while using raw instruction,
      'perf annotate' with insn-stat gives wrong data:
      
      Result from "./perf annotate --data-type --insn-stat":
      
        Annotate Instruction stats
        total 615, ok 419 (68.1%), bad 196 (31.9%)
      
          Name      :  Good   Bad
          -----------------------------------------------------------
                    :   419   196
      
      This patch sets "dl->ins.name" in arch specific function
      "check_ppc_insn" while initialising "struct disasm_line".
      
      Also update "ins_find" function to pass "struct disasm_line" as a
      parameter so as to set its name field in arch specific call.
      
      With the patch changes:
      
        Annotate Instruction stats
        total 609, ok 446 (73.2%), bad 163 (26.8%)
      
        Name/opcode         :  Good   Bad
        -----------------------------------------------------------
        58                  :   323    80
        32                  :    49    43
        34                  :    33    11
        OP_31_XOP_LDX       :     8    20
        40                  :    23     0
        OP_31_XOP_LWARX     :     5     1
        OP_31_XOP_LWZX      :     2     3
        OP_31_XOP_LDARX     :     3     0
        33                  :     0     2
        OP_31_XOP_LBZX      :     0     1
        OP_31_XOP_LWAX      :     0     1
        OP_31_XOP_LHZX      :     0     1
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-16-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2c9db747
    • perf annotate: Add support to use libcapstone in powerpc · c5d60de1
      Athira Rajeev authored
      perf annotate uses the capstone library (if available) to disassemble
      instructions, which speeds it up.

      Currently this is only supported on the x86 architecture.

      This patch includes changes to enable it on powerpc.

      For now, this method is used only for data type sort keys, and only the
      binary code (raw instruction) is read. This is because the powerpc approach
      to understanding instructions and register fields uses the raw instruction.

      "cs_disasm" is currently not enabled. When attempting cs_disasm, some
      instructions were not identified (e.g. extswsli, maddld) and it had to
      fall back to objdump.

      Hence enabling "cs_disasm" is left as a TODO for powerpc in a code
      comment.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-15-atrajeev@linux.vnet.ibm.com
      [ Use dso__nsinfo(dso) as required to match EXTRA_CFLAGS=-DREFCNT_CHECKING=1 build expectations ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5d60de1
    • perf annotate: Use capstone_init and remove open_capstone_handle from disasm.c · f1e9347c
      Athira Rajeev authored
      capstone_init is made available for all archs to use and updated to
      enable support for CS_ARCH_PPC as well. The patch removes
      open_capstone_handle and uses capstone_init in all the places.
      
      Committer notes:
      
      Avoid including capstone/capstone.h from print_insn.h to not break the
      build in builtin-script.c due to the namespace clash with libbpf:
      
        /usr/include/capstone/bpf.h:94:14: error: 'bpf_insn' defined as wrong kind of tag
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-14-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f1e9347c
    • perf annotate: Make capstone_init non-static so that it can be used during symbol disassemble · 1fe86bc2
      Athira Rajeev authored
      symbol__disassemble_capstone in util/disasm.c calls function
      open_capstone_handle to open/init the capstone.
      
      We already have a capstone_init function in "util/print_insn.c". But
      capstone_init is defined as a static function in util/print_insn.c.
      
      Change this and also add the function in print_insn.h
      
      The open_capstone_handle checks the disassembler_style option from
      annotation_options to decide whether to set CS_OPT_SYNTAX_ATT.
      
      Add that logic in capstone_init also and by default set it to true.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-13-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1fe86bc2
    • perf annotate: Update instruction tracking for powerpc · 88444952
      Athira Rajeev authored
      Add instruction tracking function "update_insn_state_powerpc" for
      powerpc. Example sequence in powerpc:
      
        ld      r10,264(r3)
        mr      r31,r3
        <<after some sequence>
        ld      r9,312(r31)
      
      Consider that the sample is pointing to: "ld r9,312(r31)".
      
      Here the memory reference is hit at "312(r31)" where 312 is the offset
      and r31 is the source register.
      
      Previous instruction sequence shows that register state of r3 is moved
      to r31.
      
      So to identify the data type for r31 access, the previous instruction
      ("mr") needs to be tracked and the state type entry has to be updated.
      
      Current instruction tracking support in perf tools infrastructure is
      specific to x86. Patch adds this support for powerpc as well.
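
      The idea of tracking register state through the "mr" instruction can be sketched as below. This is a toy model under stated assumptions, not the update_insn_state_powerpc implementation: it handles only "mr" and "ld", and the initial type name "struct foo *" for r3 is hypothetical.

```python
import re


def track_types(insns, initial):
    """Propagate known register types through 'mr' (move register) instructions."""
    state = dict(initial)  # register name -> data type name
    for insn in insns:
        op, _, operands = insn.partition(" ")
        args = [a.strip() for a in operands.split(",")]
        if op == "mr":
            dst, src = args
            if src in state:
                state[dst] = state[src]  # dst now holds the same pointer
            else:
                state.pop(dst, None)     # unknown source invalidates dst
        elif op == "ld":
            # A load overwrites its destination, so any previous type is stale.
            state.pop(args[0], None)
    return state


def resolve_access(insn, state):
    """Return (type, offset) for 'ld rX,off(rY)', or None if rY is untracked."""
    m = re.match(r"ld\s+\w+,(-?\d+)\((\w+)\)", insn)
    if not m or m.group(2) not in state:
        return None
    return state[m.group(2)], int(m.group(1))


# Hypothetical starting point: r3 is known to hold a 'struct foo *'.
state = track_types(["ld      r10,264(r3)", "mr      r31,r3"], {"r3": "struct foo *"})
access = resolve_access("ld      r9,312(r31)", state)
print(access)
```

      Without processing the "mr", r31 would have no known type and the access at offset 312 could not be attributed to a data type.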
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-12-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      88444952
    • perf annotate: Add more instructions for instruction tracking · 539bfea3
      Athira Rajeev authored
      Add a few more instructions and use the opcode as the search key
      to find whether an instruction is supported by the architecture.
      
      The added ones are: addi, addic, addic., addis, subfic and mulli
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-11-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      539bfea3
    • perf annotate: Add some of the arithmetic instructions to support instruction tracking in powerpc · cd0b6f67
      Athira Rajeev authored
      Data-type profiling has the concept of instruction tracking.
      
      Example sequence in powerpc:
      
      	ld      r10,264(r3)
      	mr      r31,r3
      	<after some sequence>
      	ld      r9,312(r31)
      
      or differently
      
      	lwz	r10,264(r3)
      	add	r31, r3, RB
      	lwz	r9, 0(r31)
      
      If a sample is hit at "lwz r9, 0(r31)", the data type of r31 depends
      on the previous instruction sequence. To track the previous
      instructions, this patch identifies some of the arithmetic
      instructions that have primary opcode 31.
      
      Since some memory instructions also have opcode 31, use bits
      22:30 to filter the arithmetic instructions here.
      
      There are also instructions with just two operands, like "addme" and "addze".
      
      This patch adds new instruction ops, "arithmetic_ops", to handle this.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-10-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cd0b6f67
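The filtering on bits 22:30 described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the perf implementation: the helper names are hypothetical, the extended opcode values (subf 40, addze 202, addme 234, add 266) are taken from the Power ISA, and the list is deliberately incomplete.

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary opcode: the top 6 bits of the word (IBM bits 0:5). */
static inline uint32_t ppc_primary_op(uint32_t insn)
{
	return insn >> 26;
}

/* XO-form extended opcode: IBM bits 22:30, the 9 bits just above Rc. */
static inline uint32_t ppc_xo_op(uint32_t insn)
{
	return (insn >> 1) & 0x1ff;
}

/* Illustrative subset of XO-form extended opcodes (hypothetical enum names). */
enum { XO_SUBF = 40, XO_ADDZE = 202, XO_ADDME = 234, XO_ADD = 266 };

/* True if insn has primary opcode 31 and one of the listed arithmetic XO values. */
static bool ppc_is_tracked_arith(uint32_t insn)
{
	if (ppc_primary_op(insn) != 31)
		return false;

	switch (ppc_xo_op(insn)) {
	case XO_SUBF:
	case XO_ADDZE:
	case XO_ADDME:
	case XO_ADD:
		return true;
	default:
		return false;
	}
}
```

For instance, the word 0x7C642A14 encodes "add r3,r4,r5" and is accepted, while 0x7C64282A ("ldx r3,r4,r5") shares opcode 31 but is rejected by the extended opcode check.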
    • Athira Rajeev's avatar
      perf annotate: Add support to identify memory instructions of opcode 31 in powerpc · ace7d681
      Athira Rajeev authored
      There are memory instructions in powerpc with opcode 31.
      For example, "ldx RT,RA,RB" has the following X-form encoding:
      
        ______________________________________
        | 31 |  RT  |  RA |  RB |   21     |/|
        --------------------------------------
        0    6     11    16    21         30 31
      
      The opcode for "ldx" is 31. Other memory instructions, such as ldux,
      stbx, lwzx and lhaux, also have opcode 31. But not all instructions
      with opcode 31 are memory instructions. An example is the add
      instruction: "add RT,RA,RB"
      
      The value in bits 21-30 [ 21 for ldx ] is different for these
      instructions. The patch uses this value to assign instruction ops for these
      cases. The naming convention and values to identify these are picked from
      the defines in "arch/powerpc/include/asm/ppc-opcode.h"
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-9-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ace7d681
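As a rough sketch of the idea in this commit message, the 10-bit field in bits 21-30 can be used to tell opcode-31 memory instructions apart from other opcode-31 instructions. The function and enum names below are hypothetical; the extended opcode values for ldx (21, as stated above), lwzx, ldux, stbx and lhaux are taken from the Power ISA, and the real patch uses the defines from "arch/powerpc/include/asm/ppc-opcode.h" instead.

```c
#include <stdint.h>

enum ppc_op31_class { OP31_NOT_31, OP31_MEMORY, OP31_OTHER };

/* X-form extended opcode: IBM bits 21:30, the 10 bits above the last bit. */
static uint32_t ppc_x_op(uint32_t insn)
{
	return (insn >> 1) & 0x3ff;
}

/* Classify a word by primary opcode and, for opcode 31, by extended opcode. */
static enum ppc_op31_class classify_op31(uint32_t insn)
{
	if ((insn >> 26) != 31)
		return OP31_NOT_31;

	switch (ppc_x_op(insn)) {
	case 21:	/* ldx */
	case 23:	/* lwzx */
	case 53:	/* ldux */
	case 215:	/* stbx */
	case 375:	/* lhaux */
		return OP31_MEMORY;
	default:
		return OP31_OTHER;	/* e.g. add, extended opcode 266 */
	}
}
```

The switch lists only the five instructions named in the message; a complete table would cover the remaining indexed loads and stores as well.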
    • Athira Rajeev's avatar
      perf annotate: Add parse function for memory instructions in powerpc · 1acdad68
      Athira Rajeev authored
      Use the raw instruction code and macros to identify memory instructions,
      extract register fields and also offset.
      
      The implementation addresses the D-form, X-form, DS-form instructions.
      Two main functions are added.
      
      New parse function "load_store__parse" as instruction ops parser for
      memory instructions.
      
      Unlike other parsers (like mov__parse), this one fills in only the
      "multi_regs" field for source/target and the newly added "mem_ref" field.
      No other fields are set because there is no need to parse the
      disassembled code: arch specific macros extract the offset and
      registers, which is easier and more precise.
      
      In powerpc, all instructions with a primary opcode from 32 to 63
      are memory instructions. Update "ins__find" function to have "raw_insn"
      also as a parameter.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-8-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1acdad68
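To illustrate extracting registers and offset from the raw word rather than from disassembled text, here is a minimal D-form sketch. The struct and function names are hypothetical and are not the ones used by "load_store__parse"; the primary opcode range check simply mirrors the 32 to 63 rule stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

struct ppc_mem_operand {
	int rt;		/* target (load) or source (store) register */
	int ra;		/* base register */
	int offset;	/* signed 16-bit displacement */
};

/* Mirrors the rule above: primary opcodes 32..63 are treated as memory ops. */
static bool ppc_is_mem_primary_op(uint32_t insn)
{
	uint32_t op = insn >> 26;

	return op >= 32 && op <= 63;
}

/* D-form layout: opcode(6) | RT(5) | RA(5) | D(16, sign-extended). */
static struct ppc_mem_operand ppc_decode_d_form(uint32_t insn)
{
	struct ppc_mem_operand m = {
		.rt = (insn >> 21) & 0x1f,
		.ra = (insn >> 16) & 0x1f,
		.offset = (int16_t)(insn & 0xffff),
	};

	return m;
}
```

Decoding 0x81290008 this way yields rt 9, ra 9 and offset 8, which corresponds to "lwz r9,8(r9)"; a negative displacement such as the one in 0x9421FFE0 ("stwu r1,-32(r1)") is recovered through the sign extension.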
    • Athira Rajeev's avatar
      perf annotate: Update parameters for reg extract functions to use raw instruction on powerpc · 1b4406d2
      Athira Rajeev authored
      Use the raw instruction code and macros to identify memory instructions,
      extract register fields and also offset.
      
      The implementation addresses the D-form, X-form, DS-form instructions.
      
      Adds "mem_ref" field to check whether source/target has memory
      reference.
      
      Add function "get_powerpc_regs" which will set these fields: reg1, reg2,
      offset depending on whether it is a source or target op.
      
      Update the "parse" callback for "struct ins_ops" to also pass "struct
      disasm_line" as an argument. This is needed in parse functions where the opcode
      is used to determine whether to set multi_regs and other fields.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-7-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1b4406d2
    • Athira Rajeev's avatar
      perf annotate: Add support to capture and parse raw instruction in powerpc... · 0b971e6b
      Athira Rajeev authored
      perf annotate: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility
      
      Add support to capture and parse raw instruction in powerpc.
      Currently, the perf tool infrastructure uses two ways to disassemble
      and understand the instruction. One is objdump and other option is
      via libcapstone.
      
      Currently, the perf tool infrastructure passes the "--no-show-raw-insn" option
      to "objdump" when disassembling. Example from powerpc with this option
      for an instruction address is:
      
      Snippet from:
      
        objdump  --start-address=<address> --stop-address=<address>  -d --no-show-raw-insn -C <vmlinux>
      
        c0000000010224b4:	lwz     r10,0(r9)
      
      This line "lwz r10,0(r9)" is parsed to extract instruction name,
      registers names and offset. Also to find whether there is a memory
      reference in the operands, "memory_ref_char" field of objdump is used.
      For x86, "(" is used as memory_ref_char to tackle instructions of the
      form "mov  (%rax), %rcx".
      
      In the case of powerpc, instructions using "(" are not the only memory
      instructions. For example, the above instruction can also be of extended form (X
      form) "lwzx r10,0,r19". In order to easily identify the instruction category
      and extract the source/target registers, the patch adds support to use the raw
      instruction for powerpc. The approach used is to read the raw instruction
      directly from the DSO file using the "dso__data_read_offset" utility which
      is already implemented in the perf infrastructure in "util/dso.c".
      
      Example:
      
      38 01 81 e8     ld      r4,312(r1)
      
      Here "38 01 81 e8" is the raw instruction representation. In powerpc,
      this translates to instruction form: "ld RT,DS(RA)" and binary code
      as:
      
         | 58 |  RT  |  RA |      DS       | |
         -------------------------------------
         0    6     11    16              30 31
      
      Function "symbol__disassemble_dso" is updated to read raw instruction
      directly from DSO using dso__data_read_offset utility. In case of
      above example, this captures:
      line:    38 01 81 e8
      
      The above works well when 'perf report' is invoked with only the sort keys
      for data type, i.e. type and typeoff, because no instruction level
      annotation is needed if only data type information is requested.
      
      For annotating a sample, the "sym" sort key is also needed along with
      the type and typeoff sort keys. And by default, invoking just "perf
      report" uses the sort key "sym", which displays the symbol information.
      
      With the powerpc approach of first reading the raw instruction from the
      DSO, "perf annotate" and "perf report" + a key break, since
      no instruction level disassembly is done.
      
      Snippet of result from 'perf report':
      
        Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
        do_work  /usr/bin/pmlogger [Percent: local period]
        Percent│        ea230010
               │        3a550010
               │        3a600000
      
               │        38f60001
               │        39490008
               │        42400438
         51.44 │        81290008
               │        7d485378
      
      Here, the raw instruction is displayed in the output instead of the human
      readable annotated form.
      
      One way to get the appropriate data is to specify "--objdump path", by
      which code annotation will be done. But that changes the default behaviour.
      To fix this breakage, check if the "sym" sort key is set. If so,
      fall back and use the libcapstone/objdump way of disassembling the sample.
      
      With the changes and "perf report"
      
      Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
      do_work  /usr/bin/pmlogger [Percent: local period]
      Percent│        ld        r17,16(r3)
             │        addi      r18,r21,16
             │        li        r19,0
      
             │ 8b0:   rldicl    r10,r10,63,33
             │        addi      r10,r10,1
             │        mtctr     r10
             │      ↓ b         8e4
             │ 8c0:   addi      r7,r22,1
             │        addi      r10,r9,8
             │      ↓ bdz       d00
       51.44 │        lwz       r9,8(r9)
             │        mr        r8,r10
             │        cmpw      r20,r9
      
      Committer notes:
      
      Just add the extern for 'sort_order' in disasm.c so that we don't end up
      breaking the build due to this type collision with capstone and libbpf:
      
        In file included from /usr/include/capstone/capstone.h:325,
                         from /git/perf-6.10.0/tools/perf/util/print_insn.h:23,
                         from builtin-script.c:38:
        /usr/include/capstone/bpf.h:94:14: error: 'bpf_insn' defined as wrong kind of tag
           94 | typedef enum bpf_insn {
      
      I reported this to the bpf mailing list, see one of the links below.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-6-atrajeev@linux.vnet.ibm.com
      Link: https://lore.kernel.org/bpf/ZqOltPk9VQGgJZAA@x1/T/#u
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b971e6b
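The DS-form layout shown in this commit message can be decoded with a few shifts and masks. The sketch below is only an illustration under stated assumptions: the names are hypothetical, only primary opcode 58 is handled (as in the diagram), and the raw bytes "38 01 81 e8" are assumed to have already been combined into the 32-bit word 0xe8810138.

```c
#include <stdint.h>

struct ppc_ds_operand {
	int rt;		/* target register */
	int ra;		/* base register */
	int offset;	/* DS field shifted left by 2, sign-extended */
};

/* Decode a DS-form word; returns 0 on success, -1 if the primary opcode is not 58. */
static int ppc_decode_ds_form(uint32_t insn, struct ppc_ds_operand *out)
{
	if ((insn >> 26) != 58)
		return -1;

	out->rt = (insn >> 21) & 0x1f;
	out->ra = (insn >> 16) & 0x1f;
	/* DS sits in IBM bits 16:29; the last two bits are not part of the offset. */
	out->offset = (int16_t)(insn & 0xfffc);
	return 0;
}
```

Applied to 0xe8810138, this gives rt 4, ra 1 and offset 312, matching "ld r4,312(r1)" from the example.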
    • Athira Rajeev's avatar
      perf annotate: Add disasm_line__parse() to parse raw instruction for powerpc · 06dd4c5a
      Athira Rajeev authored
      Currently, the perf tool infrastructure uses the disasm_line__parse
      function to parse disassembled line.
      
      Example snippet from objdump:
      
        objdump  --start-address=<address> --stop-address=<address>  -d --no-show-raw-insn -C <vmlinux>
      
        c0000000010224b4:	lwz     r10,0(r9)
      
      This line "lwz r10,0(r9)" is parsed to extract instruction name,
      registers names and offset.
      
      In powerpc, the approach for data type profiling uses raw instruction
      instead of result from objdump to identify the instruction category and
      extract the source/target registers.
      
      Example: 38 01 81 e8     ld      r4,312(r1)
      
      Here "38 01 81 e8" is the raw instruction representation. Add function
      "disasm_line__parse_powerpc" to handle parsing of raw instruction.
      Also update "struct disasm_line" to save the binary code.
      With the change, function captures:
      
      line -> "38 01 81 e8     ld      r4,312(r1)"
      raw instruction "38 01 81 e8"
      
      The raw instruction is used later to extract the reg/offset fields. Macros
      are added to extract opcode and register fields. "struct disasm_line"
      is updated to carry a union of "bytes" and a 32-bit "raw_insn" to hold the
      raw code.
      
      Function "disasm_line__parse_powerpc" fills the raw instruction hex value
      and can use macros to get the opcode. There are no changes in existing code
      paths, which parse the disassembled code.  The size of the raw instruction
      depends on the architecture.
      
      In the case of powerpc, parsing the disasm line needs to handle
      reading binary code directly from the DSO as well as parsing the objdump
      result. Hence the logic is added in a separate function instead of
      updating "disasm_line__parse".  Architectures using the instruction
      name and the present approach are not altered. Since this approach targets
      powerpc, the macro implementation is added for powerpc as of now.
      
      Since disasm_line__parse is used in other cases (perf annotate) and
      not only data type profiling, the powerpc callback includes changes to
      work with binary code as well as the mnemonic representation.
      
      Also, if the DSO read fails and libcapstone is not supported, the
      approach falls back to using objdump. Hence the patch has
      changes to ensure the objdump option also works well.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-5-atrajeev@linux.vnet.ibm.com
      [ Add check for strndup() result ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      06dd4c5a
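One way to picture the "line" to "raw instruction" step from this commit message is a small parser that reads the four leading hex bytes and combines them into a 32-bit value. This is a hedged sketch, not "disasm_line__parse_powerpc" itself: the function name is hypothetical, the bytes are assumed to be in little-endian order (as in the "38 01 81 e8" example), and a mnemonic-only line is only rejected on a best-effort basis.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse four leading hex bytes such as "38 01 81 e8" and combine them as a
 * little-endian 32-bit word. Returns 0 on success, -1 if the line does not
 * start with four hex bytes (for example a plain objdump mnemonic line).
 */
static int parse_raw_insn_le(const char *line, uint32_t *raw_insn)
{
	unsigned int b[4];

	if (sscanf(line, "%2x %2x %2x %2x", &b[0], &b[1], &b[2], &b[3]) != 4)
		return -1;

	*raw_insn = b[0] | (b[1] << 8) | (b[2] << 16) | ((uint32_t)b[3] << 24);
	return 0;
}
```

With the example line, the result is 0xe8810138, whose top six bits give primary opcode 58. A production parser would need a stricter check, since a mnemonic and operands made up entirely of hex digits could be misread as bytes.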
    • Athira Rajeev's avatar
      perf annotate: Update TYPE_STATE_MAX_REGS to include max of regs in powerpc · b1d8d968
      Athira Rajeev authored
      TYPE_STATE_MAX_REGS is arch-dependent. Currently this is defined to be
      16.
      
      While checking if reg is valid using has_reg_type, max value is checked
      using TYPE_STATE_MAX_REGS value.
      
      Define this conditionally for powerpc.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-4-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b1d8d968
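A conditional definition of this kind might look like the sketch below. The value 16 comes from the commit message; the value 32 for powerpc is an assumption based on its 32 general purpose registers, and the struct and helper names are hypothetical stand-ins for the real type state and "has_reg_type".

```c
#include <stdbool.h>

/* 16 is the value stated above; 32 for powerpc is an assumption (32 GPRs). */
#if defined(__powerpc__)
#define TYPE_STATE_MAX_REGS	32
#else
#define TYPE_STATE_MAX_REGS	16
#endif

struct type_state_reg_sketch {
	bool ok;	/* true once a type is known for this register */
};

struct type_state_sketch {
	struct type_state_reg_sketch regs[TYPE_STATE_MAX_REGS];
};

/* Bounds check in the spirit of has_reg_type(): reject out-of-range registers. */
static bool reg_in_range_and_typed(const struct type_state_sketch *state, int reg)
{
	return reg >= 0 && reg < TYPE_STATE_MAX_REGS && state->regs[reg].ok;
}
```

Sizing the array with the same macro that guards the index keeps the bounds check and the storage in step when the limit differs per architecture.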
    • Athira Rajeev's avatar
      perf annotate: Add "update_insn_state" callback function to handle arch... · 782959ac
      Athira Rajeev authored
      perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking
      
      Add "update_insn_state" callback to "struct arch" to handle instruction
      tracking. Currently updating instruction state is handled by static
      function "update_insn_state_x86" which is defined in "annotate-data.c".
      
      Make this a callback for the specific arch and move it to the arch specific
      file "arch/x86/annotate/instructions.c". This will help to add helper
      functions for other platforms in the file
      "arch/<platform>/annotate/instructions.c" and make changes/updates
      easier.
      
      Define callback "update_insn_state" as part of "struct arch", also make
      some of the debug functions non-static so that it can be referenced from
      other places.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      782959ac
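The callback arrangement described in this commit message follows a common C pattern: a function pointer in the per-arch structure, called only when the arch provides it. The sketch below uses hypothetical, simplified types; the real "struct arch" and the real "update_insn_state" signature in perf carry more state than shown here.

```c
#include <stddef.h>

struct type_state_sketch {
	int updates;	/* counts how often the hook ran, for illustration only */
};

struct insn_sketch {
	const char *name;
};

struct arch_sketch {
	const char *name;
	/* Optional hook; NULL means no instruction tracking for this arch. */
	void (*update_insn_state)(struct type_state_sketch *state,
				  const struct insn_sketch *insn);
};

/* Stand-in for an arch specific implementation such as the x86 one. */
static void update_insn_state_x86_sketch(struct type_state_sketch *state,
					  const struct insn_sketch *insn)
{
	(void)insn;
	state->updates++;
}

/* Generic code dispatches through the hook instead of calling x86 code directly. */
static void track_insn(const struct arch_sketch *arch,
		       struct type_state_sketch *state,
		       const struct insn_sketch *insn)
{
	if (arch->update_insn_state)
		arch->update_insn_state(state, insn);
}
```

With this shape, adding another platform only requires supplying its own hook; the generic caller stays unchanged.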
    • Athira Rajeev's avatar
      perf annotate: Move the data structures related to register type to header file · 1d303dee
      Athira Rajeev authored
      Data type profiling uses instruction tracking by checking each
      instruction and updating the register type state in some data
      structures.
      
      This is useful to find the data type in cases when the register state
      gets transferred from one reg to another.
      
      Example, in x86, "mov" instruction and in powerpc, "mr" instruction.
      
      Currently these structures are defined in annotate-data.c and
      instruction tracking is implemented only for x86.
      
      Move these data structures to "annotate-data.h" header file so that
      other arch implementations can use it in arch specific files as well.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Akanksha J N <akanksha@linux.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Segher Boessenkool <segher@kernel.crashing.org>
      Link: https://lore.kernel.org/lkml/20240718084358.72242-2-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1d303dee
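The register-to-register transfer mentioned in this commit message ("mov" on x86, "mr" on powerpc) amounts to copying the tracked type state from the source register to the destination. A minimal sketch, with hypothetical names and a plain integer standing in for the real type information:

```c
#include <stdbool.h>

#define MAX_REGS_SKETCH 32

struct reg_type_sketch {
	bool ok;	/* true if a type is known for this register */
	int type_id;	/* stand-in for the real type information */
};

/* Model "mr dst,src": dst inherits whatever is known about src. */
static void track_reg_move(struct reg_type_sketch regs[MAX_REGS_SKETCH],
			   int dst, int src)
{
	if (dst < 0 || dst >= MAX_REGS_SKETCH ||
	    src < 0 || src >= MAX_REGS_SKETCH)
		return;

	regs[dst] = regs[src];
}
```

After "mr r31,r3", a later access through r31 can then be resolved with the type that was known for r3.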
    • Ian Rogers's avatar
      perf test: Avoid python leak sanitizer test failures · e293f4b1
      Ian Rogers authored
      Leak sanitizer will report memory leaks from python and the leak
      sanitizer output causes tests to fail. For example:
      
        ```
        $ perf test 98 -v
         98: perf script tests:
        --- start ---
        test child forked, pid 1272962
        DB test
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.046 MB /tmp/perf-test-script.x0EktdCel8/perf.data (8 samples) ]
        call_path_table((1, 0, 0, 0)
        call_path_table((2, 1, 0, 140339508617447)
        call_path_table((3, 2, 2, 0)
        call_path_table((4, 3, 3, 0)
        call_path_table((5, 4, 4, 0)
        call_path_table((6, 5, 5, 0)
        call_path_table((7, 6, 6, 0)
        call_path_table((8, 7, 7, 0)
        call_path_table((9, 8, 8, 0)
        call_path_table((10, 9, 9, 0)
        call_path_table((11, 10, 10, 0)
        call_path_table((12, 11, 11, 0)
        call_path_table((13, 12, 1, 0)
        sample_table((1, 1, 1, 1, 1, 1, 1, 8, -2058824120, 588306954119000, -1, 0, 0, 0, 0, 1, 0, 0, 128933429281, 0, 0, 13, 0, 0, 0, -1, -1))
        sample_table((2, 1, 1, 1, 1, 1, 1, 8, -2058824120, 588306954137053, -1, 0, 0, 0, 0, 1, 0, 0, 128933429281, 0, 0, 13, 0, 0, 0, -1, -1))
        sample_table((3, 1, 1, 1, 1, 1, 1, 8, -2058824120, 588306954140089, -1, 0, 0, 0, 0, 9, 0, 0, 128933429281, 0, 0, 13, 0, 0, 0, -1, -1))
        sample_table((4, 1, 1, 1, 1, 1, 1, 8, -2058824120, 588306954142376, -1, 0, 0, 0, 0, 155, 0, 0, 128933429281, 0, 0, 13, 0, 0, 0, -1, -1))
        sample_table((5, 1, 1, 1, 1, 1, 1, 8, -2058824120, 588306954144045, -1, 0, 0, 0, 0, 2493, 0, 0, 128933429281, 0, 0, 13, 0, 0, 0, -1, -1))
        sample_table((6, 1, 1, 1, 1, 1, 12, 77, -2046828595, 588306954145722, -1, 0, 0, 0, 0, 47555, 0, 0, 128933429281, 0, 0, 13, 0, 0, 0, -1, -1))
        call_path_table((14, 9, 14, 0)
        call_path_table((15, 14, 15, 0)
        call_path_table((16, 15, 0, -1040969624)
        call_path_table((17, 16, 16, 0)
        call_path_table((18, 17, 17, 0)
        call_path_table((19, 18, 18, 0)
        call_path_table((20, 19, 19, 0)
        call_path_table((21, 20, 13, 0)
        sample_table((7, 1, 1, 1, 2, 1, 13, 46, -2053700898, 588306954157436, -1, 0, 0, 0, 0, 964078, 0, 0, 128933429281, 0, 0, 21, 0, 0, 0, -1, -1))
        call_path_table((22, 1, 21, 0)
        call_path_table((23, 22, 22, 0)
        call_path_table((24, 23, 23, 0)
        call_path_table((25, 24, 24, 0)
        call_path_table((26, 25, 25, 0)
        call_path_table((27, 26, 26, 0)
        call_path_table((28, 27, 27, 0)
        call_path_table((29, 28, 28, 0)
        call_path_table((30, 29, 29, 0)
        call_path_table((31, 30, 30, 0)
        call_path_table((32, 31, 31, 0)
        call_path_table((33, 32, 32, 0)
        call_path_table((34, 33, 33, 0)
        call_path_table((35, 34, 20, 0)
        sample_table((8, 1, 1, 1, 2, 1, 20, 49, -2046878127, 588306954378624, -1, 0, 0, 0, 0, 2534317, 0, 0, 128933429281, 0, 0, 35, 0, 0, 0, -1, -1))
      
        =================================================================
        ==1272975==ERROR: LeakSanitizer: detected memory leaks
      
        Direct leak of 13628 byte(s) in 6 object(s) allocated from:
            #0 0x56354f60c092 in malloc (/tmp/perf/perf+0x29c092)
            #1 0x7ff25c7d02e7 in _PyObject_Malloc /build/python3.11/../Objects/obmalloc.c:2003:11
            #2 0x7ff25c7d02e7 in _PyObject_Malloc /build/python3.11/../Objects/obmalloc.c:1996:1
      
        SUMMARY: AddressSanitizer: 13628 byte(s) leaked in 6 allocation(s).
        --- Cleaning up ---
        ---- end(-1) ----
         98: perf script tests                                               : FAILED!
        ```
      
      Disable leak sanitizer when running specific perf+python tests to
      avoid this. This causes the tests to pass when run with leak
      sanitizer.
      Reviewed-by: Aditya Gupta <adityag@linux.ibm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e293f4b1
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Remove arg_fmt->is_enum, we can get that from the BTF type · c3d74713
      Arnaldo Carvalho de Melo authored
      This is to pave the way for other BTF types, i.e. we try to find BTF
      type then use things like btf_is_enum(btf_type) that we cached to find
      the right strtoul and scnprintf routines.
      
      For now only enum is supported, all the other types simply return zero
      for scnprintf which makes it have the same behaviour as when BTF isn't
      available, i.e. fallback to no pretty printing. Ditto for strtoul.
      
        root@x1:~# perf test -v enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~# perf test -v enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~# perf test -v enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~# perf test -v enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~# perf test -v enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~#
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Howard Chu <howardchu95@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Howard Chu <howardchu95@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240624181345.124764-9-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c3d74713
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Introduce trace__btf_scnprintf() · 62284329
      Arnaldo Carvalho de Melo authored
      To have a central place that will look at the BTF type and call the
      right scnprintf routine or return zero, meaning BTF pretty printing
      isn't available or not implemented for a specific type.
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Howard Chu <howardchu95@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240624181345.124764-8-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      62284329
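The "central place" idea from this commit message can be sketched as a dispatcher that inspects the type kind, calls a kind-specific printer, and returns zero when none applies. Everything below is a hypothetical stand-in: the real trace__btf_scnprintf() works on BTF types via libbpf, whereas this sketch uses a small hand-written type description so that it stays self-contained.

```c
#include <stddef.h>
#include <stdio.h>

enum kind_sketch { KIND_INT, KIND_ENUM };

struct enum_entry_sketch {
	const char *name;
	long val;
};

struct type_sketch {
	enum kind_sketch kind;
	const struct enum_entry_sketch *entries;
	size_t nr_entries;
};

/* Print the enumerator name matching val; return 0 if there is no match. */
static size_t enum_scnprintf_sketch(const struct type_sketch *t, char *bf,
				    size_t size, long val)
{
	for (size_t i = 0; i < t->nr_entries; i++) {
		if (t->entries[i].val == val) {
			int n = snprintf(bf, size, "%s", t->entries[i].name);

			if (n < 0)
				return 0;
			/* Like scnprintf: report what was actually written. */
			return (size_t)n >= size ? size - 1 : (size_t)n;
		}
	}
	return 0;
}

/* Central dispatch: zero means "no pretty printing available for this type". */
static size_t btf_scnprintf_sketch(const struct type_sketch *t, char *bf,
				   size_t size, long val)
{
	if (t == NULL || size == 0)
		return 0;

	switch (t->kind) {
	case KIND_ENUM:
		return enum_scnprintf_sketch(t, bf, size, val);
	default:
		return 0;
	}
}
```

A caller that gets zero back can fall through to its default formatting, which matches the "same behaviour as when BTF isn't available" fallback described for unsupported types.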
    • Howard Chu's avatar
      perf test trace_btf_enum: Add regression test for the BTF augmentation of enums in 'perf trace' · d66763fe
      Howard Chu authored
      Trace landlock_add_rule syscall to see if the output is desirable.
      
      Trace the non-syscall tracepoint 'timer:hrtimer_init' and
      'timer:hrtimer_start', see if the 'mode' argument is augmented,
      the 'mode' enum argument has the prefix of 'HRTIMER_MODE_'
      in its name.
      
      Committer testing:
      
        root@x1:~# perf test enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~# perf test -v enum
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~# perf trace -e landlock_add_rule perf test -v enum
             0.000 ( 0.010 ms): perf/749827 landlock_add_rule(ruleset_fd: 11, rule_type: LANDLOCK_RULE_PATH_BENEATH, rule_attr: 0x7ffd324171d4, flags: 45) = -1 EINVAL (Invalid argument)
             0.012 ( 0.002 ms): perf/749827 landlock_add_rule(ruleset_fd: 11, rule_type: LANDLOCK_RULE_NET_PORT, rule_attr: 0x7ffd324171e0, flags: 45) = -1 EINVAL (Invalid argument)
           457.821 ( 0.007 ms): perf/749830 landlock_add_rule(ruleset_fd: 11, rule_type: LANDLOCK_RULE_PATH_BENEATH, rule_attr: 0x7ffd4acd31e4, flags: 45) = -1 EINVAL (Invalid argument)
           457.832 ( 0.003 ms): perf/749830 landlock_add_rule(ruleset_fd: 11, rule_type: LANDLOCK_RULE_NET_PORT, rule_attr: 0x7ffd4acd31f0, flags: 45) = -1 EINVAL (Invalid argument)
        124: perf trace enum augmentation tests                              : Ok
        root@x1:~#
      Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240619082042.4173621-6-howardchu95@gmail.com
      Link: https://lore.kernel.org/r/20240624181345.124764-7-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d66763fe
    • Howard Chu's avatar
      perf test: Add landlock workload · 3656e566
      Howard Chu authored
      We'll use it to add a regression test for the BTF augmentation of enum
      arguments for tracepoints in 'perf trace':
      
        root@x1:~# perf trace -e landlock_add_rule perf test -w landlock
             0.000 ( 0.009 ms): perf/747160 landlock_add_rule(ruleset_fd: 11, rule_type: LANDLOCK_RULE_PATH_BENEATH, rule_attr: 0x7ffd8e258594, flags: 45) = -1 EINVAL (Invalid argument)
             0.011 ( 0.002 ms): perf/747160 landlock_add_rule(ruleset_fd: 11, rule_type: LANDLOCK_RULE_NET_PORT, rule_attr: 0x7ffd8e2585a0, flags: 45) = -1 EINVAL (Invalid argument)
        root@x1:~#
      
      Committer notes:
      
      It was agreed in the discussion (see Link below) to shorten the name of
      the workload from 'landlock_add_rule' to 'landlock', and I moved it to a
      separate patch.
      
      Also, to address a build failure reported by Namhyung, I stopped
      including linux/landlock.h and instead added the defines, enums and
      types it uses, so that this builds on older systems. All we want is to
      emit the syscall and intercept it.
      Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/CAH0uvohaypdTV6Z7O5QSK+va_qnhZ6BP6oSJ89s1c1E0CjgxDA@mail.gmail.com
      Link: https://lore.kernel.org/r/20240624181345.124764-1-howardchu95@gmail.com
      Link: https://lore.kernel.org/r/20240624181345.124764-6-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3656e566
    • perf trace: Filter enum arguments with enum names · 95586588
      Howard Chu authored
      Before:
      
      perf $ ./perf trace -e timer:hrtimer_start --filter='mode!=HRTIMER_MODE_ABS_PINNED_HARD' --max-events=1
      No resolver (strtoul) for "mode" in "timer:hrtimer_start", can't set filter "(mode!=HRTIMER_MODE_ABS_PINNED_HARD) && (common_pid != 281988)"
      
      After:
      
      perf $ ./perf trace -e timer:hrtimer_start --filter='mode!=HRTIMER_MODE_ABS_PINNED_HARD' --max-events=1
           0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff9498a6ca5f18, function: 0xffffffffa77a5be0, expires: 12351248764875, softexpires: 12351248764875, mode: HRTIMER_MODE_ABS)
      
      && and ||:
      
      perf $ ./perf trace -e timer:hrtimer_start --filter='mode != HRTIMER_MODE_ABS_PINNED_HARD && mode != HRTIMER_MODE_ABS' --max-events=1
           0.000 Hyprland/534 timer:hrtimer_start(hrtimer: 0xffff9497801a84d0, function: 0xffffffffc04cdbe0, expires: 12639434638458, softexpires: 12639433638458, mode: HRTIMER_MODE_REL)
      
      perf $ ./perf trace -e timer:hrtimer_start --filter='mode == HRTIMER_MODE_REL || mode == HRTIMER_MODE_PINNED' --max-events=1
           0.000 ldlck-test/60639 timer:hrtimer_start(hrtimer: 0xffffb16404ee7bf8, function: 0xffffffffa7790420, expires: 12772614418016, softexpires: 12772614368016, mode: HRTIMER_MODE_REL)
      
      Switching it up, using both an enum name and an integer value (--filter='mode == HRTIMER_MODE_ABS_PINNED_HARD || mode == 0'):
      
      perf $ ./perf trace -e timer:hrtimer_start --filter='mode == HRTIMER_MODE_ABS_PINNED_HARD || mode == 0' --max-events=3
           0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff9498a6ca5f18, function: 0xffffffffa77a5be0, expires: 12601748739825, softexpires: 12601748739825, mode: HRTIMER_MODE_ABS_PINNED_HARD)
           0.036 :0/0 timer:hrtimer_start(hrtimer: 0xffff9498a6ca5f18, function: 0xffffffffa77a5be0, expires: 12518758748124, softexpires: 12518758748124, mode: HRTIMER_MODE_ABS_PINNED_HARD)
           0.172 tmux: server/41881 timer:hrtimer_start(hrtimer: 0xffffb164081e7838, function: 0xffffffffa7790420, expires: 12518768255836, softexpires: 12518768205836, mode: HRTIMER_MODE_ABS)
      
      P.S.
      perf $ pahole hrtimer_mode
      enum hrtimer_mode {
              HRTIMER_MODE_ABS             = 0,
              HRTIMER_MODE_REL             = 1,
              HRTIMER_MODE_PINNED          = 2,
              HRTIMER_MODE_SOFT            = 4,
              HRTIMER_MODE_HARD            = 8,
              HRTIMER_MODE_ABS_PINNED      = 2,
              HRTIMER_MODE_REL_PINNED      = 3,
              HRTIMER_MODE_ABS_SOFT        = 4,
              HRTIMER_MODE_REL_SOFT        = 5,
              HRTIMER_MODE_ABS_PINNED_SOFT = 6,
              HRTIMER_MODE_REL_PINNED_SOFT = 7,
              HRTIMER_MODE_ABS_HARD        = 8,
              HRTIMER_MODE_REL_HARD        = 9,
              HRTIMER_MODE_ABS_PINNED_HARD = 10,
              HRTIMER_MODE_REL_PINNED_HARD = 11,
      };
      
      Committer testing:
      
        root@x1:~# perf trace -e timer:hrtimer_start --filter='mode != HRTIMER_MODE_ABS' --max-events=2
             0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff2a5050, function: 0xffffffff9e22ddd0, expires: 241502326000000, softexpires: 241502326000000, mode: HRTIMER_MODE_ABS_PINNED_HARD)
        18446744073709.488 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff425050, function: 0xffffffff9e22ddd0, expires: 241501814000000, softexpires: 241501814000000, mode: HRTIMER_MODE_ABS_PINNED_HARD)
        root@x1:~# perf trace -e timer:hrtimer_start --filter='mode != HRTIMER_MODE_ABS && mode != HRTIMER_MODE_ABS_PINNED_HARD' --max-events=2
             0.000 podman/510644 timer:hrtimer_start(hrtimer: 0xffffa2024f5f7dd0, function: 0xffffffff9e2170c0, expires: 241530497418194, softexpires: 241530497368194, mode: HRTIMER_MODE_REL)
            40.251 gnome-shell/2484 timer:hrtimer_start(hrtimer: 0xffff8d48bda17650, function: 0xffffffffc0661550, expires: 241550528619247, softexpires: 241550527619247, mode: HRTIMER_MODE_REL)
        root@x1:~# perf trace -v -e timer:hrtimer_start --filter='mode != HRTIMER_MODE_ABS && mode != HRTIMER_MODE_ABS_PINNED_HARD && mode != HRTIMER_MODE_REL' --max-events=2
        Using CPUID GenuineIntel-6-BA-3
        vmlinux BTF loaded
        <SNIP>
        0
        0xa
        0x1
        New filter for timer:hrtimer_start: (mode != 0 && mode != 0xa && mode != 0x1) && (common_pid != 524049 && common_pid != 4041)
        mmap size 528384B
        ^Croot@x1:~#
      Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/lkml/ZnCcliuecJABD5FN@x1
      Link: https://lore.kernel.org/r/20240624181345.124764-5-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      95586588
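
The filter translation shown in the verbose output of the entry above (enum names replaced by their integer values before the filter is set) can be sketched as follows. This is only an illustration in Python, not the code in perf: the `resolve_enum_names()` and `fmt_value()` helpers and the regular expression are hypothetical, and the enumerator table is copied from the pahole output in that commit message.

```python
import re

# Enumerators of `enum hrtimer_mode`, copied from the pahole output above.
HRTIMER_MODE = {
    "HRTIMER_MODE_ABS": 0,
    "HRTIMER_MODE_REL": 1,
    "HRTIMER_MODE_PINNED": 2,
    "HRTIMER_MODE_SOFT": 4,
    "HRTIMER_MODE_HARD": 8,
    "HRTIMER_MODE_ABS_PINNED": 2,
    "HRTIMER_MODE_REL_PINNED": 3,
    "HRTIMER_MODE_ABS_SOFT": 4,
    "HRTIMER_MODE_REL_SOFT": 5,
    "HRTIMER_MODE_ABS_PINNED_SOFT": 6,
    "HRTIMER_MODE_REL_PINNED_SOFT": 7,
    "HRTIMER_MODE_ABS_HARD": 8,
    "HRTIMER_MODE_REL_HARD": 9,
    "HRTIMER_MODE_ABS_PINNED_HARD": 10,
    "HRTIMER_MODE_REL_PINNED_HARD": 11,
}

# Hypothetical pattern: a comparison operator followed by an identifier.
# Numeric right-hand sides (e.g. "mode == 0") do not match and stay as-is.
_RHS = re.compile(r"(==|!=)\s*([A-Za-z_][A-Za-z0-9_]*)")


def fmt_value(value):
    """Format like the verbose output above: 0 stays "0", others are hex."""
    return "0" if value == 0 else hex(value)


def resolve_enum_names(filter_expr, enumerators):
    """Replace enumerator names on the right of == / != with their values."""
    def substitute(match):
        op, name = match.group(1), match.group(2)
        if name not in enumerators:
            # Comparable to the "No resolver" failure: the name is unknown.
            raise ValueError(f"no enumerator named {name!r}")
        return f"{op} {fmt_value(enumerators[name])}"

    return _RHS.sub(substitute, filter_expr)


if __name__ == "__main__":
    expr = ("mode != HRTIMER_MODE_ABS && mode != HRTIMER_MODE_ABS_PINNED_HARD"
            " && mode != HRTIMER_MODE_REL")
    print(resolve_enum_names(expr, HRTIMER_MODE))
```

For the three-term filter from the committer testing, the sketch produces the same `mode != 0 && mode != 0xa && mode != 0x1` expression that appears in the "New filter" line; the `common_pid` clauses perf appends are not modelled here.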
    • perf trace: Augment non-syscall tracepoints with enum arguments with BTF · 607bbdb4
      Howard Chu authored
      Before:
      
      perf $ ./perf trace -e timer:hrtimer_start --max-events=1
           0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466c25f18, function: 0xffffffff89da5be0, expires: 377432432256753, softexpires: 377432432256753, mode: 10)
      
      After:
      
      perf $ ./perf trace -e timer:hrtimer_start --max-events=1
           0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff9498a6ca5f18, function: 0xffffffffa77a5be0, expires: 4382442895089, softexpires: 4382442895089, mode: HRTIMER_MODE_ABS_PINNED_HARD)
      
      in which HRTIMER_MODE_ABS_PINNED_HARD is:
      
      perf $ pahole hrtimer_mode
      enum hrtimer_mode {
              HRTIMER_MODE_ABS             = 0,
              HRTIMER_MODE_REL             = 1,
              HRTIMER_MODE_PINNED          = 2,
              HRTIMER_MODE_SOFT            = 4,
              HRTIMER_MODE_HARD            = 8,
              HRTIMER_MODE_ABS_PINNED      = 2,
              HRTIMER_MODE_REL_PINNED      = 3,
              HRTIMER_MODE_ABS_SOFT        = 4,
              HRTIMER_MODE_REL_SOFT        = 5,
              HRTIMER_MODE_ABS_PINNED_SOFT = 6,
              HRTIMER_MODE_REL_PINNED_SOFT = 7,
              HRTIMER_MODE_ABS_HARD        = 8,
              HRTIMER_MODE_REL_HARD        = 9,
              HRTIMER_MODE_ABS_PINNED_HARD = 10,
              HRTIMER_MODE_REL_PINNED_HARD = 11,
      };
      
      This can also be tested with:
      
      ./perf trace -e pagemap:mm_lru_insertion,timer:hrtimer_start,timer:hrtimer_init,skb:kfree_skb --max-events=10
      
      (Chose these 4 events because they happen quite frequently.)
      
      However, some enum arguments may not be present in the vmlinux BTF. To
      see which enum arguments are supported, use:
      
      vmlinux_dir $ bpftool btf dump file /sys/kernel/btf/vmlinux > vmlinux
      
      vmlinux_dir $  while read l; do grep "ENUM '$l'" vmlinux; done < <(grep field:enum /sys/kernel/tracing/events/*/*/format | awk '{print $3}' | sort | uniq) | awk '{print $3}' | sed "s/'\(.*\)'/\1/g"
      dev_pm_qos_req_type
      error_detector
      hrtimer_mode
      i2c_slave_event
      ieee80211_bss_type
      lru_list
      migrate_mode
      nl80211_auth_type
      nl80211_band
      nl80211_iftype
      numa_vmaskip_reason
      pm_qos_req_action
      pwm_polarity
      skb_drop_reason
      thermal_trip_type
      xen_lazy_mode
      xen_mc_extend_args
      xen_mc_flush_reason
      zone_type
      
      And what tracepoints have these enum types as their arguments:
      
      vmlinux_dir $ while read l; do grep "ENUM '$l'" vmlinux; done < <(grep field:enum /sys/kernel/tracing/events/*/*/format | awk '{print $3}' | sort | uniq) | awk '{print $3}' | sed "s/'\(.*\)'/\1/g" > good_enums
      
      vmlinux_dir $ cat good_enums
      dev_pm_qos_req_type
      error_detector
      hrtimer_mode
      i2c_slave_event
      ieee80211_bss_type
      lru_list
      migrate_mode
      nl80211_auth_type
      nl80211_band
      nl80211_iftype
      numa_vmaskip_reason
      pm_qos_req_action
      pwm_polarity
      skb_drop_reason
      thermal_trip_type
      xen_lazy_mode
      xen_mc_extend_args
      xen_mc_flush_reason
      zone_type
      
      vmlinux_dir $ grep -f good_enums -l /sys/kernel/tracing/events/*/*/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_chandef_dfs_required/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_ch_switch_notify/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_ch_switch_started_notify/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_get_bss/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_ibss_joined/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_inform_bss_frame/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_radar_event/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_ready_on_channel_expired/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_ready_on_channel/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_reg_can_beacon/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_return_bss/format
      /sys/kernel/tracing/events/cfg80211/cfg80211_tx_mgmt_expired/format
      /sys/kernel/tracing/events/cfg80211/rdev_add_virtual_intf/format
      /sys/kernel/tracing/events/cfg80211/rdev_auth/format
      /sys/kernel/tracing/events/cfg80211/rdev_change_virtual_intf/format
      /sys/kernel/tracing/events/cfg80211/rdev_channel_switch/format
      /sys/kernel/tracing/events/cfg80211/rdev_connect/format
      /sys/kernel/tracing/events/cfg80211/rdev_inform_bss/format
      /sys/kernel/tracing/events/cfg80211/rdev_libertas_set_mesh_channel/format
      /sys/kernel/tracing/events/cfg80211/rdev_mgmt_tx/format
      /sys/kernel/tracing/events/cfg80211/rdev_remain_on_channel/format
      /sys/kernel/tracing/events/cfg80211/rdev_return_chandef/format
      /sys/kernel/tracing/events/cfg80211/rdev_return_int_survey_info/format
      /sys/kernel/tracing/events/cfg80211/rdev_set_ap_chanwidth/format
      /sys/kernel/tracing/events/cfg80211/rdev_set_monitor_channel/format
      /sys/kernel/tracing/events/cfg80211/rdev_set_radar_background/format
      /sys/kernel/tracing/events/cfg80211/rdev_start_ap/format
      /sys/kernel/tracing/events/cfg80211/rdev_start_radar_detection/format
      /sys/kernel/tracing/events/cfg80211/rdev_tdls_channel_switch/format
      /sys/kernel/tracing/events/compaction/mm_compaction_defer_compaction/format
      /sys/kernel/tracing/events/compaction/mm_compaction_deferred/format
      /sys/kernel/tracing/events/compaction/mm_compaction_defer_reset/format
      /sys/kernel/tracing/events/compaction/mm_compaction_finished/format
      /sys/kernel/tracing/events/compaction/mm_compaction_kcompactd_wake/format
      /sys/kernel/tracing/events/compaction/mm_compaction_suitable/format
      /sys/kernel/tracing/events/compaction/mm_compaction_wakeup_kcompactd/format
      /sys/kernel/tracing/events/error_report/error_report_end/format
      /sys/kernel/tracing/events/i2c_slave/i2c_slave/format
      /sys/kernel/tracing/events/migrate/mm_migrate_pages/format
      /sys/kernel/tracing/events/migrate/mm_migrate_pages_start/format
      /sys/kernel/tracing/events/pagemap/mm_lru_insertion/format
      /sys/kernel/tracing/events/power/dev_pm_qos_add_request/format
      /sys/kernel/tracing/events/power/dev_pm_qos_remove_request/format
      /sys/kernel/tracing/events/power/dev_pm_qos_update_request/format
      /sys/kernel/tracing/events/power/pm_qos_update_flags/format
      /sys/kernel/tracing/events/power/pm_qos_update_target/format
      /sys/kernel/tracing/events/pwm/pwm_apply/format
      /sys/kernel/tracing/events/pwm/pwm_get/format
      /sys/kernel/tracing/events/sched/sched_skip_vma_numa/format
      /sys/kernel/tracing/events/skb/kfree_skb/format
      /sys/kernel/tracing/events/thermal/thermal_zone_trip/format
      /sys/kernel/tracing/events/timer/hrtimer_init/format
      /sys/kernel/tracing/events/timer/hrtimer_start/format
      /sys/kernel/tracing/events/xen/xen_mc_batch/format
      /sys/kernel/tracing/events/xen/xen_mc_extend_args/format
      /sys/kernel/tracing/events/xen/xen_mc_flush_reason/format
      /sys/kernel/tracing/events/xen/xen_mc_issue/format
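
The first stage of the shell pipelines above (collecting the enum type names that appear as `field:enum` entries in tracepoint format files) can also be sketched in Python. This is a minimal illustration, not something from the patch: the helper names are hypothetical, and the sample format text is made up for the example rather than copied from a real system.

```python
import re

# Matches "field:enum <type> <name>" and captures the enum type name,
# the same token the awk '{print $3}' stage above extracts.
_ENUM_FIELD = re.compile(r"field:enum\s+(\w+)\s+\w+")


def enum_types_in_format(format_text):
    """Return the sorted, de-duplicated enum type names used by a format."""
    return sorted(set(_ENUM_FIELD.findall(format_text)))


def supported_enums(format_texts, btf_enum_names):
    """Keep only the enum types that are also present in the BTF dump."""
    found = set()
    for text in format_texts:
        found.update(enum_types_in_format(text))
    return sorted(found & set(btf_enum_names))


# Illustrative format text (assumed layout, offsets are not real).
SAMPLE_FORMAT = (
    "\tfield:void * hrtimer;\toffset:8;\tsize:8;\tsigned:0;\n"
    "\tfield:enum hrtimer_mode mode;\toffset:40;\tsize:4;\tsigned:0;\n"
)

if __name__ == "__main__":
    print(enum_types_in_format(SAMPLE_FORMAT))
    print(supported_enums([SAMPLE_FORMAT], ["hrtimer_mode", "zone_type"]))
```

On a real system the format texts would be read from the `/sys/kernel/tracing/events/*/*/format` files and the BTF enum names from the `bpftool btf dump` output, as in the commands above.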
      
      Committer testing:
      
        root@x1:~# perf trace -e timer:hrtimer_start --max-events=2
             0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 241152380000000, softexpires: 241152380000000, mode: HRTIMER_MODE_ABS)
             0.028 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 241153654000000, softexpires: 241153654000000, mode: HRTIMER_MODE_ABS_PINNED_HARD)
        root@x1:~#
      Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/lkml/20240615032743.112750-1-howardchu95@gmail.com
      Link: https://lore.kernel.org/r/20240624181345.124764-4-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      607bbdb4
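
The value-to-name augmentation described in the entry above (printing `mode: HRTIMER_MODE_ABS_PINNED_HARD` instead of `mode: 10`) can be sketched like this. It is a hedged illustration, not perf's implementation: `enum_name()` and `format_arg()` are hypothetical helpers, and the choice to return the first enumerator with a matching value, in declaration order, is an assumption. It is at least consistent with the outputs shown, where 0 prints as HRTIMER_MODE_ABS.

```python
# (name, value) pairs in declaration order, copied from the pahole output
# in the commit message above. Several enumerators share a value.
HRTIMER_MODE = [
    ("HRTIMER_MODE_ABS", 0),
    ("HRTIMER_MODE_REL", 1),
    ("HRTIMER_MODE_PINNED", 2),
    ("HRTIMER_MODE_SOFT", 4),
    ("HRTIMER_MODE_HARD", 8),
    ("HRTIMER_MODE_ABS_PINNED", 2),
    ("HRTIMER_MODE_REL_PINNED", 3),
    ("HRTIMER_MODE_ABS_SOFT", 4),
    ("HRTIMER_MODE_REL_SOFT", 5),
    ("HRTIMER_MODE_ABS_PINNED_SOFT", 6),
    ("HRTIMER_MODE_REL_PINNED_SOFT", 7),
    ("HRTIMER_MODE_ABS_HARD", 8),
    ("HRTIMER_MODE_REL_HARD", 9),
    ("HRTIMER_MODE_ABS_PINNED_HARD", 10),
    ("HRTIMER_MODE_REL_PINNED_HARD", 11),
]


def enum_name(enumerators, value):
    """Return the first enumerator name with this value, or None."""
    # Assumption: first match in declaration order wins for aliased values.
    for name, val in enumerators:
        if val == value:
            return name
    return None


def format_arg(arg_name, value, enumerators):
    """Render one argument, falling back to the raw integer if unnamed."""
    name = enum_name(enumerators, value)
    return f"{arg_name}: {name if name is not None else value}"


if __name__ == "__main__":
    print(format_arg("mode", 10, HRTIMER_MODE))
    print(format_arg("mode", 12, HRTIMER_MODE))
```

The fallback to the raw integer corresponds to the "Before" output, which is also what one would expect for enum types missing from the vmlinux BTF.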
    • perf trace: BTF-based enum pretty printing for syscall args · 45a0c928
      Howard Chu authored
      In this patch, BTF is used to turn an enum value into the corresponding
      name. Only one system call takes an enum value as an argument:
      `landlock_add_rule()`.
      
      The vmlinux BTF is loaded lazily, when the user decides to trace the
      `landlock_add_rule` syscall. But if one decides to run `perf trace`
      without any arguments, `landlock_add_rule` is among the traced syscalls,
      so the vmlinux BTF will be loaded by default.
      
      The laziest behaviour would be to load the vmlinux BTF when a
      `landlock_add_rule` syscall hits. But I think some samples could be
      lost when loading the vmlinux BTF at run time, since it can delay the
      handling of other samples. I might need your precious opinions on
      this...
      
      before:
      
      ```
      perf $ ./perf trace -e landlock_add_rule
           0.000 ( 0.008 ms): ldlck-test/438194 landlock_add_rule(rule_type: 2) = -1 EBADFD (File descriptor in bad state)
           0.010 ( 0.001 ms): ldlck-test/438194 landlock_add_rule(rule_type: 1) = -1 EBADFD (File descriptor in bad state)
      ```
      
      after:
      
      ```
      perf $ ./perf trace -e landlock_add_rule
           0.000 ( 0.029 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_NET_PORT)     = -1 EBADFD (File descriptor in bad state)
           0.036 ( 0.004 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_PATH_BENEATH) = -1 EBADFD (File descriptor in bad state)
      ```
      
      Committer notes:
      
      Made it build with NO_LIBBPF=1, simplified btf_enum_fprintf(), see [1]
      for the discussion.
      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Günther Noack <gnoack@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mickaël Salaün <mic@digikod.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/lkml/20240613022757.3589783-1-howardchu95@gmail.com
      Link: https://lore.kernel.org/lkml/ZnXAhFflUl_LV1QY@x1 # [1]
      Link: https://lore.kernel.org/r/20240624181345.124764-3-howardchu95@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      45a0c928
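
The lazy-loading trade-off discussed in the entry above can be illustrated with a small load-on-first-use wrapper. This is a generic sketch in Python, not how perf is structured: `LazyBTF` and `fake_load_vmlinux_btf()` are hypothetical names, and the loader returns a stand-in table holding only the two `landlock_rule_type` values that appear in the before/after output.

```python
class LazyBTF:
    """Load an expensive resource on first use, then reuse the result."""

    def __init__(self, loader):
        self._loader = loader
        self._btf = None
        self.loads = 0  # how many times the loader actually ran

    def get(self):
        # The cost of loading is paid by whichever caller asks first.
        if self._btf is None:
            self._btf = self._loader()
            self.loads += 1
        return self._btf


def fake_load_vmlinux_btf():
    """Stand-in for parsing the vmlinux BTF (hypothetical, tiny table)."""
    return {
        "landlock_rule_type": {
            1: "LANDLOCK_RULE_PATH_BENEATH",
            2: "LANDLOCK_RULE_NET_PORT",
        }
    }


if __name__ == "__main__":
    btf = LazyBTF(fake_load_vmlinux_btf)
    # Nothing is loaded until an enum argument actually needs resolving.
    print(btf.loads)
    print(btf.get()["landlock_rule_type"][2])
    print(btf.get()["landlock_rule_type"][1])
    print(btf.loads)
```

Calling `get()` at setup time, when the traced syscall set is known, corresponds to the behaviour the patch chose. Deferring the first `get()` to the first matching event is the "laziest" variant, where the load delay falls in the sample-handling path.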
    • Merge tag 'for-6.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · e4fc196f
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - fix regression in extent map rework when handling insertion of
         overlapping compressed extent
      
       - fix unexpected file length when appending to a file using direct io
         and buffer not faulted in
      
       - in zoned mode, fix accounting of unusable space when flipping
         read-only block group back to read-write
      
       - fix page locking when COWing an inline range, assertion failure found
         by syzbot
      
       - fix calculation of space info in debugging print
      
       - tree-checker, add validation of data reference item
      
       - fix a few -Wmaybe-uninitialized build warnings
      
      * tag 'for-6.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: initialize location to fix -Wmaybe-uninitialized in btrfs_lookup_dentry()
        btrfs: fix corruption after buffer fault in during direct IO append write
        btrfs: zoned: fix zone_unusable accounting on making block group read-write again
        btrfs: do not subtract delalloc from avail bytes
        btrfs: make cow_file_range_inline() honor locked_page on error
        btrfs: fix corrupt read due to bad offset of a compressed extent map
        btrfs: tree-checker: validate dref root and objectid
      e4fc196f
    • Merge tag 'perf-tools-fixes-for-v6.11-2024-07-30' of... · e254e0c5
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.11-2024-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Namhyung Kim:
       "Some more build fixes and a random crash fix:
      
         - Fix cross-build by setting pkg-config env according to the arch
      
         - Fix static build for missing library dependencies
      
         - Fix Segfault when callchain has no symbols"
      
      * tag 'perf-tools-fixes-for-v6.11-2024-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
        perf docs: Document cross compilation
        perf: build: Link lib 'zstd' for static build
        perf: build: Link lib 'lzma' for static build
        perf: build: Only link libebl.a for old libdw
        perf: build: Set Python configuration for cross compilation
        perf: build: Setup PKG_CONFIG_LIBDIR for cross compilation
        perf tool: fix dereferencing NULL al->maps
      e254e0c5