  1. 15 Mar, 2023 24 commits
    • perf kvm: Polish sorting key · c695d48a
      Leo Yan authored
      Since histograms support sorting, the tool no longer needs to maintain
      the mapping between the sorting keys and the corresponding comparison
      callbacks, so this patch removes the structure kvm_event_key.
      
      The sorting key still needs to be validated, so this patch keeps the
      keys in an array and renames the function select_key() to is_valid_key(),
      which validates the sorting key passed by the user.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
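A minimal sketch of the key validation described in the "Polish sorting key" entry, assuming a hypothetical list of key names and a simplified signature. It is not the code from the patch and only illustrates checking a user-supplied key against an array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key names; the real list lives in the perf kvm sources. */
static const char *const kvm_sort_keys[] = { "sample", "time" };

/* Return true if 'key' matches one of the accepted sorting keys. */
static bool is_valid_key(const char *key)
{
	size_t i;

	if (key == NULL)
		return false;

	for (i = 0; i < sizeof(kvm_sort_keys) / sizeof(kvm_sort_keys[0]); i++) {
		if (strcmp(key, kvm_sort_keys[i]) == 0)
			return true;
	}
	return false;
}
```

With an array, adding a key means adding one string, and no comparison callback has to be registered alongside it.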
    • perf kvm: Use histograms list to replace cached list · f57a6414
      Leo Yan authored
      The perf kvm tool defines its own cached list, which is managed with an
      RB tree; histograms also provide an RB tree to manage data entries.
      Now that histograms have been introduced in the tool, the self-defined
      list is no longer necessary and the histograms list can be used
      directly to manage KVM events.
      
      This patch switches to the histograms list to track KVM events and
      invokes the common function hists__output_resort_cb() to sort the
      result, which also gives us the flexibility to add more sorting keys
      easily.
      
      With the histograms list in place, the cached list is redundant, so
      remove the relevant code for it.
      
      Committer notes:
      
      kvm_hists__reinit() is only used by functions enclosed in:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So do it with this new function as well.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Add dimensions for KVM event statistics · 41f1138e
      Leo Yan authored
      To support KVM event statistics, this patch first registers histograms
      columns and sorting fields.  Every column or field has its own format
      structure; the format structure is dereferenced to access the
      dimension, and the dimension provides the comparison callback for
      sorting the result.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
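The chain "format structure, then dimension, then comparison callback" from the "Add dimensions" entry can be sketched roughly as below. All structure and function names here are simplified stand-ins chosen for illustration, not the perf definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the perf types; names and fields are assumptions. */
struct fmt {
	const char *name;
};

struct kvm_dimension {
	const char *name;
	int64_t (*cmp)(uint64_t left, uint64_t right);
};

struct kvm_fmt {
	struct fmt fmt;			/* embedded generic format */
	struct kvm_dimension *dim;	/* dimension behind this column */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int64_t cmp_count(uint64_t left, uint64_t right)
{
	return (int64_t)(left - right);
}

/* Format -> enclosing kvm_fmt -> dimension -> comparison callback. */
static int64_t fmt_cmp(struct fmt *fmt, uint64_t left, uint64_t right)
{
	struct kvm_fmt *kvm_fmt = container_of(fmt, struct kvm_fmt, fmt);

	return kvm_fmt->dim->cmp(left, right);
}
```

The generic sorting code only ever sees a `struct fmt` pointer; the embedding lets each column recover its own dimension without a separate lookup table.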
    • perf hist: Add 'kvm_info' field in histograms entry · ebf39d29
      Leo Yan authored
      __hists__add_entry() creates a temporary entry and compares it with the
      existing histograms entries; if an existing entry equals the temporary
      entry, it skips the allocation to avoid duplication.
      
      The problem with supporting KVM events in histograms is that an entry
      doesn't contain any info that identifies the KVM event and can be used
      for comparing entries.
      
      This patch adds a 'kvm_info' field to the histograms entry, which
      contains the KVM event's key; this identifier will be used for
      comparing histograms entries in a later change.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
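A rough sketch of why an identifying key is needed when entries are added, as described in the 'kvm_info' entry. It uses a plain linked list instead of the RB tree used by histograms, and every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins; field names are assumptions, not the perf structures. */
struct kvm_info {
	uint64_t key;
};

struct entry {
	struct kvm_info info;
	uint64_t count;
	struct entry *next;
};

/* Return the entry whose key matches, allocating one only if none exists. */
static struct entry *find_or_add(struct entry **head, uint64_t key)
{
	struct entry *e;

	for (e = *head; e != NULL; e = e->next) {
		if (e->info.key == key)
			return e;	/* same key: reuse the existing entry */
	}

	e = calloc(1, sizeof(*e));
	if (e == NULL)
		return NULL;
	e->info.key = key;
	e->next = *head;
	*head = e;
	return e;
}
```

Without a key in the entry, two samples of the same KVM event could not be recognised as equal and each would allocate its own entry.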
    • perf kvm: Parse address location for samples · 001b08f4
      Leo Yan authored
      Parse the address location for samples and save it into the structure
      'perf_kvm_stat'; it is to be used by the histograms entry.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Pass argument 'sample' to kvm_alloc_init_event() · 730651f7
      Leo Yan authored
      This patch adds an argument 'sample' to kvm_alloc_init_event(), and its
      callers are updated as well to pass down the 'sample' pointer.
      
      This is a preparation change to allow a later patch to create histograms
      entries for KVM events; there are no functional changes.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Introduce histograms data structures · 2d08124b
      Leo Yan authored
      This is a preparation to support histograms in the perf kvm tool.  As a
      first step, this patch defines histograms data structures and
      initializes them.
      
      Committer notes:
      
      Those are only used by functions enclosed in:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So do this for these new functions and struct as well.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Use macro to replace variable 'decode_str_len' · 2d31e0bf
      Leo Yan authored
      The variable 'decode_str_len' defines the string length for the KVM
      event name, and every arch defines its own value.
      
      This introduces complexity, as the variable definitions are spread
      across multiple source files under the arch folder.  This patch
      refactors the code to use a macro KVM_EVENT_NAME_LEN to define the
      event name length and thus removes the definitions in the arch files.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Use subtraction for comparison metrics · dd787ae4
      Leo Yan authored
      Currently the metrics comparison uses the greater-than operator (>),
      which returns a boolean value (0 or 1).
      
      This patch changes it to use subtraction as the comparison result, which
      can be used by histograms sorting.  Since the subtraction result is of
      u64 type, we change key_cmp_fun's return type to int64_t to avoid
      overflow.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
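The difference between the two comparison styles in the "Use subtraction" entry can be illustrated as follows. The function names are made up for this sketch; only the idea (boolean result versus signed difference) comes from the commit message.

```c
#include <stdint.h>

/* Old style: only reports whether left is greater (0 or 1). */
static int cmp_greater(uint64_t left, uint64_t right)
{
	return left > right;
}

/* New style: the sign of the result orders the two values. */
static int64_t cmp_subtract(uint64_t left, uint64_t right)
{
	return (int64_t)(left - right);
}
```

A sort routine needs three outcomes (less, equal, greater). The boolean form cannot tell "equal" from "less", while the signed 64-bit difference can, and a plain `int` return could drop the sign of a large difference.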
    • perf kvm: Move up metrics helpers · f098376d
      Leo Yan authored
      This patch moves up the helper functions for the event's metrics so that
      code added later can call them.
      
      There are no functional changes, apart from renaming the functions from
      compare_kvm_event_{metric}() to cmp_event_{metric}().
      
      Committer notes:
      
      Those helper functions are only used if this is true:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So keep them enclosed with that.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Add pointer to 'perf_kvm_stat' in kvm event · a7d451a8
      Leo Yan authored
      Sometimes handling KVM events needs to be based on global variables,
      e.g. when reading event counts we need to know the target vcpu ID; the
      global variables are stored in the structure perf_kvm_stat.
      
      This patch adds a 'perf_kvm_stat' pointer to the KVM event structure;
      it is to be used by later refactoring.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Refactor overall statistics · 9c3aa1f4
      Leo Yan authored
      Currently the tool computes the overall statistics when sorting the
      results.  This patch refactors the code to compute the overall
      statistics during event processing, so the function
      update_total_count() is no longer needed; an extra benefit is that the
      statistics code is decoupled from the sorting code.
      
      This patch is not expected to introduce any functional changes.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
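One way to picture the refactoring in the "Refactor overall statistics" entry is to keep the totals up to date whenever an event is updated, instead of walking all events later. The structures and the function below are hypothetical and much smaller than the real ones.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down structures for illustration. */
struct kvm_totals {
	uint64_t count;
	uint64_t time;
};

struct kvm_event {
	uint64_t count;
	uint64_t time;
};

/* Update the per-event and the overall statistics in one step. */
static void update_event(struct kvm_totals *total, struct kvm_event *ev,
			 uint64_t time_diff)
{
	ev->count++;
	ev->time += time_diff;
	total->count++;
	total->time += time_diff;
}
```

Because the totals are already complete when sorting starts, the sorting path needs no knowledge of how they are computed.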
    • perf record: Update documentation for BPF filters · c46bf3bd
      Namhyung Kim authored
      Add more description and examples.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf filter: Show warning for missing sample flags · 4310551b
      Namhyung Kim authored
      For a BPF filter to work properly, users need to provide appropriate
      options to enable the sample types.  Otherwise the BPF program would
      see an invalid value (i.e. always 0) and the filter won't work well.
      
      Show a warning message if sample types are missing like below.
      
        $ sudo ./perf record -e cycles --filter 'addr < 100' true
        Error: cycles event does not have PERF_SAMPLE_ADDR
         Hint: please add -d option to perf record.
        failed to set filter "BPF" on event cycles with 22 (Invalid argument)
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
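The check behind the warning in the "missing sample flags" entry might look roughly like this. The helper name, its parameters and the local `SAMPLE_ADDR` macro are assumptions for illustration; only the message wording is taken from the example output in that entry.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SAMPLE_ADDR (1ULL << 3)	/* same bit as PERF_SAMPLE_ADDR in the UAPI header */

/* Return true if the event already collects the data the filter term needs. */
static bool check_sample_flag(const char *event, uint64_t sample_type,
			      uint64_t needed, const char *flag_name,
			      const char *hint)
{
	if (sample_type & needed)
		return true;

	fprintf(stderr, "Error: %s event does not have %s\n", event, flag_name);
	if (hint != NULL)
		fprintf(stderr, " Hint: please add %s option to perf record.\n", hint);
	return false;
}
```

Rejecting the filter up front is friendlier than letting the BPF program silently compare against 0 for every sample.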
    • perf bpf filter: Add logical OR operator · 46996dd7
      Namhyung Kim authored
      It supports two or more expressions connected as a group, and the group
      result is considered true when one of them returns true.  The new group
      operators (GROUP_BEGIN and GROUP_END) are added to set up and check the
      condition.  As it doesn't allow nested groups, the condition is saved
      in local variables.
      
      For example, the following is to get samples only if the data source
      memory level is L2 cache or the weight value is greater than 30.
      
        $ sudo ./perf record -adW -e cpu/mem-loads/pp \
        > --filter 'mem_lvl == l2 || weight > 30' -- sleep 1
      
        $ sudo ./perf script -F data_src,weight
           10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                47
           11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A      57
           10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                56
           10650100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L2 miss|LCK No|BLK  N/A                    144
           10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                16
           10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                20
           11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A     189
           1026a100142 |OP LOAD|LVL L1 or L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes|BLK  N/A              193
           10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                18
           ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
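The group handling from the "logical OR operator" entry can be sketched in plain user-space C as below. This is not the BPF program from the patch: the entry layout, the operator enum and the numeric encoding of sample fields are all assumptions. It only shows how a non-nested GROUP_BEGIN/GROUP_END pair can be evaluated with two local variables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum filter_op { OP_EQ, OP_GT, OP_GROUP_BEGIN, OP_GROUP_END };

struct filter_entry {
	enum filter_op op;
	size_t field;	/* index into the sample's values; unused for group markers */
	uint64_t value;
};

static bool entry_matches(const struct filter_entry *e, const uint64_t *sample)
{
	return e->op == OP_EQ ? sample[e->field] == e->value
			      : sample[e->field] > e->value;
}

/* Entries are ANDed; entries between GROUP_BEGIN and GROUP_END are ORed. */
static bool sample_passes(const struct filter_entry *entries, size_t n,
			  const uint64_t *sample)
{
	bool in_group = false;
	bool group_result = false;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct filter_entry *e = &entries[i];

		if (e->op == OP_GROUP_BEGIN) {
			in_group = true;
			group_result = false;
		} else if (e->op == OP_GROUP_END) {
			if (!group_result)
				return false;
			in_group = false;
		} else if (in_group) {
			group_result = group_result || entry_matches(e, sample);
		} else if (!entry_matches(e, sample)) {
			return false;
		}
	}
	return true;
}
```

Because groups cannot nest, a single `in_group` flag and a single `group_result` are enough, and no stack is required.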
    • perf bpf filter: Add data_src sample data support · ff612055
      Namhyung Kim authored
      The data_src has many entries to express memory behaviors.  Add each
      term separately so that users can combine them for their purpose.
      
      I didn't add prefix for the constants for simplicity as they are mostly
      distinguishable but I had to use l1_miss and l2_hit for mem_dtlb since
      mem_lvl has different values for the same names.  Note that I decided
      mem_lvl to be used as an alias of mem_lvlnum as it's deprecated now.
      According to the comment in the UAPI header, users should use the mix of
      mem_lvlnum, mem_remote and mem_snoop.  Also the SNOOPX bits are
      concatenated to mem_snoop for simplicity.
      
      The following terms are used for data_src and the corresponding perf
      sample data fields:
      
       * mem_op: { load, store, pfetch, exec }
       * mem_lvl: { l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem }
       * mem_snoop: { none, hit, miss, hitm, fwd, peer }
       * mem_remote: { remote }
       * mem_lock: { locked }
       * mem_dtlb: { l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault }
       * mem_blk: { by_data, by_addr }
       * mem_hops: { hops0, hops1, hops2, hops3 }
      
      We can now use a filter expression like below:
      
        'mem_op == load, mem_lvl <= l2, mem_dtlb == l1_hit'
        'mem_dtlb == l2_miss, mem_hops > hops1'
        'mem_lvl == ram, mem_remote == 1'
      
      Note that 'na' is shared among the terms as it has the same value except
      for mem_lvl.  I don't have a good idea to handle that for now.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
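To make the data_src terms in the entry above more concrete, here is a small sketch that pulls three raw bit fields out of a 64-bit data_src value. The shifts and widths follow the little-endian layout of `union perf_mem_data_src` in the UAPI header as far as I know it; the helper names are made up, and `raw_mem_lvl()` reads the deprecated mem_lvl bits, not the mem_lvlnum field that the filter term aliases.

```c
#include <stdint.h>

/* Bit positions assumed from the layout of union perf_mem_data_src. */
#define MEM_OP_SHIFT	0	/* mem_op: 5 bits */
#define MEM_LVL_SHIFT	5	/* mem_lvl: 14 bits */
#define MEM_DTLB_SHIFT	26	/* mem_dtlb: 7 bits */

/* Extract 'bits' bits starting at 'shift' from the raw value. */
static uint64_t data_src_field(uint64_t data_src, unsigned int shift,
			       unsigned int bits)
{
	return (data_src >> shift) & ((1ULL << bits) - 1);
}

static uint64_t raw_mem_op(uint64_t data_src)
{
	return data_src_field(data_src, MEM_OP_SHIFT, 5);
}

static uint64_t raw_mem_lvl(uint64_t data_src)
{
	return data_src_field(data_src, MEM_LVL_SHIFT, 14);
}

static uint64_t raw_mem_dtlb(uint64_t data_src)
{
	return data_src_field(data_src, MEM_DTLB_SHIFT, 7);
}
```

For the value 0x10668100842, which appears in the perf script output earlier in this log, these helpers give 0x02 for the operation, 0x42 for the level bits and 0x1a for the TLB bits.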
    • perf bpf filter: Add more weight sample data support · 409bcd80
      Namhyung Kim authored
      The weight data consists of a couple of fields with the
      PERF_SAMPLE_WEIGHT_STRUCT.  Add weight{1,2,3} term to select them
      separately.  Also add their aliases like 'ins_lat', 'p_stage_cyc' and
      'retire_lat'.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
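A sketch of how the three weight terms from the entry above could be read out of one 64-bit weight value. It assumes the little-endian layout of PERF_SAMPLE_WEIGHT_STRUCT (a 32-bit field followed by two 16-bit fields); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* The three parts of a structured weight, named after the filter terms. */
struct weight_parts {
	uint32_t weight1;	/* low 32 bits */
	uint16_t weight2;	/* bits 32..47 */
	uint16_t weight3;	/* bits 48..63 */
};

/* Split the full 64-bit weight into its three parts. */
static struct weight_parts split_weight(uint64_t full)
{
	struct weight_parts p;

	p.weight1 = (uint32_t)(full & 0xffffffffULL);
	p.weight2 = (uint16_t)((full >> 32) & 0xffff);
	p.weight3 = (uint16_t)((full >> 48) & 0xffff);
	return p;
}
```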
    • perf bpf filter: Add 'pid' sample data support · 33581847
      Namhyung Kim authored
      The pid is special because it's saved in PERF_SAMPLE_TID together with
      the tid.  So it needs to differentiate tid and pid using the 'part'
      field in the perf bpf filter entry struct.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
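The role of the 'part' field in the 'pid' entry above can be illustrated with a tiny selector. The structure is a stand-in for the pid/tid pair carried by PERF_SAMPLE_TID, and the convention that a non-zero part means pid is an assumption made for this sketch.

```c
#include <stdint.h>

/* Both values arrive together under the single PERF_SAMPLE_TID flag. */
struct tid_entry {
	uint32_t pid;
	uint32_t tid;
};

/* 'part' picks one half of the entry (assumed: non-zero selects pid). */
static uint32_t tid_entry_value(const struct tid_entry *e, int part)
{
	return part ? e->pid : e->tid;
}
```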
    • perf record: Record dropped sample count · 27c6f245
      Namhyung Kim authored
      When BPF filters are used, an event might drop some samples.  It'd be
      nice if it can report how many samples it lost.  As the LOST_SAMPLES
      event can carry similar information, let's use it for BPF filters.
      
      To indicate it's from BPF filters, add a new misc flag for that and
      do not display cpu load warnings.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Add BPF event filter support · d180aa56
      Namhyung Kim authored
      Use --filter option to set BPF filter for generic events other than the
      tracepoints or Intel PT.  The BPF program will check the sample data and
      filter according to the expression.
      
      For example, the below is the typical perf record for frequency mode.
      The sample period started from 1 and increased gradually.
      
        $ sudo ./perf record -e cycles true
        $ sudo ./perf script
             perf-exec 2272336 546683.916875:          1 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916892:          1 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916899:          3 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916905:         17 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916911:        100 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916917:        589 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916924:       3470 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
             perf-exec 2272336 546683.916930:      20465 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
                  true 2272336 546683.916940:     119873 cycles:  ffffffff8283afdd perf_iterate_ctx+0x2d ([kernel.kallsyms])
                  true 2272336 546683.917003:     461349 cycles:  ffffffff82892517 vma_interval_tree_insert+0x37 ([kernel.kallsyms])
                  true 2272336 546683.917237:     635778 cycles:  ffffffff82a11400 security_mmap_file+0x20 ([kernel.kallsyms])
      
      When you add a BPF filter to get samples having periods greater than 1000,
      the output would look like below:
      
        $ sudo ./perf record -e cycles --filter 'period > 1000' true
        $ sudo ./perf script
             perf-exec 2273949 546850.708501:       5029 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
             perf-exec 2273949 546850.708508:      32409 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
             perf-exec 2273949 546850.708526:     143369 cycles:  ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
             perf-exec 2273949 546850.708600:     372650 cycles:  ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
             perf-exec 2273949 546850.708791:     482953 cycles:  ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
                  true 2273949 546850.709036:     501985 cycles:  ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
                  true 2273949 546850.709292:     503065 cycles:      7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
      
      Committer notes:
      
      Add stubs for perf_bpf_filter__prepare() and perf_bpf_filter__destroy()
      to tools/perf/util/python.c to keep it building.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf filter: Implement event sample filtering · 56ec9457
      Namhyung Kim authored
      The BPF program will be attached to a perf_event and be triggered when
      it overflows.  It'd iterate the filters map and compare the sample
      value according to the expression.  If any of them fails, the sample
      would be dropped.
      
      Also it needs to have the corresponding sample data for the expression
      so it compares data->sample_flags with the given value.  To access the
      sample data, it uses the bpf_cast_to_kern_ctx() kfunc which was added
      in v6.2 kernel.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      56ec9457
    • Namhyung Kim's avatar
      perf bpf filter: Introduce basic BPF filter expression · 990a71e9
      Namhyung Kim authored
      This implements a tiny parser for the filter expressions used for BPF.
      Each expression will be converted to struct perf_bpf_filter_expr and
      be passed to a BPF map.
      
      For now, I'd like to start with the very basic comparisons like EQ or
      GT.  The LHS should be a term for sample data and the RHS is a number.
      The expressions are connected by a comma.  For example,
      
          period > 10000
          ip < 0x1000000000000, cpu == 3
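
      A hand-written parser for this grammar might look like the sketch below.
      It is not the parser from the patch. The demo_* names and the
      struct demo_expr layout are hypothetical stand-ins (the real
      struct perf_bpf_filter_expr is not shown here), and the term list is
      limited to the three terms that appear in the examples above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum demo_op { DEMO_EQ, DEMO_NEQ, DEMO_GT, DEMO_GE, DEMO_LT, DEMO_LE };
enum demo_term { DEMO_PERIOD, DEMO_IP, DEMO_CPU };

/* Hypothetical parsed form of one "term op number" expression. */
struct demo_expr {
	enum demo_term term;
	enum demo_op op;
	uint64_t val;
};

static const char *skip_ws(const char *s)
{
	while (isspace((unsigned char)*s))
		s++;
	return s;
}

/* Match a known term name as a whole word; advance *sp past it. */
static int parse_term(const char **sp, enum demo_term *term)
{
	static const struct { const char *name; enum demo_term term; } terms[] = {
		{ "period", DEMO_PERIOD }, { "ip", DEMO_IP }, { "cpu", DEMO_CPU },
	};

	for (size_t i = 0; i < sizeof(terms) / sizeof(terms[0]); i++) {
		size_t len = strlen(terms[i].name);

		if (!strncmp(*sp, terms[i].name, len) &&
		    !isalnum((unsigned char)(*sp)[len])) {
			*term = terms[i].term;
			*sp += len;
			return 0;
		}
	}
	return -1;
}

/* Two-character operators are listed first so ">=" is not read as ">". */
static int parse_op(const char **sp, enum demo_op *op)
{
	static const struct { const char *str; enum demo_op op; } ops[] = {
		{ "==", DEMO_EQ }, { "!=", DEMO_NEQ }, { ">=", DEMO_GE },
		{ "<=", DEMO_LE }, { ">", DEMO_GT }, { "<", DEMO_LT },
	};

	for (size_t i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
		size_t len = strlen(ops[i].str);

		if (!strncmp(*sp, ops[i].str, len)) {
			*op = ops[i].op;
			*sp += len;
			return 0;
		}
	}
	return -1;
}

/* Returns the number of expressions parsed, or -1 on a syntax error. */
int demo_parse(const char *str, struct demo_expr *out, size_t max)
{
	const char *s = str;
	size_t n = 0;

	assert(str != NULL && out != NULL);

	for (;;) {
		char *end;

		if (n == max)
			return -1;
		s = skip_ws(s);
		if (parse_term(&s, &out[n].term))
			return -1;
		s = skip_ws(s);
		if (parse_op(&s, &out[n].op))
			return -1;
		s = skip_ws(s);
		if (!isdigit((unsigned char)*s))
			return -1;
		out[n].val = strtoull(s, &end, 0);	/* base 0: decimal or 0x hex */
		n++;
		s = skip_ws(end);
		if (*s == '\0')
			return (int)n;
		if (*s != ',')
			return -1;
		s++;	/* a comma connects the next expression */
	}
}
```

      Passing "ip < 0x1000000000000, cpu == 3" to demo_parse() fills two
      entries, and an incomplete input such as "period >" is rejected.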
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      990a71e9
    • liuwenyu's avatar
      perf top: Fix rare segfault in thread__comm_len() · 6e57f69f
      liuwenyu authored
      In thread__comm_len(), strlen() is called outside of the
      thread->comm_lock critical section, which may cause a
      use-after-free (UAF) if comm__free() is called concurrently
      by the process_thread.
      
      The backtrace from the core file is as follows:
      
          (gdb) bt
          #0  __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
          #1  0x000055ad15d31de5 in thread__comm_len (thread=0x7f627d20e300) at util/thread.c:320
          #2  0x000055ad15d4fade in hists__calc_col_len (h=0x7f627d295940, hists=0x55ad1772bfe0)
              at util/hist.c:103
          #3  hists__calc_col_len (hists=0x55ad1772bfe0, h=0x7f627d295940) at util/hist.c:79
          #4  0x000055ad15d52c8c in output_resort (hists=hists@entry=0x55ad1772bfe0, prog=0x0,
              use_callchain=false, cb=cb@entry=0x0, cb_arg=0x0) at util/hist.c:1926
          #5  0x000055ad15d530a4 in evsel__output_resort_cb (evsel=evsel@entry=0x55ad1772bde0,
              prog=prog@entry=0x0, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at util/hist.c:1945
          #6  0x000055ad15d53110 in evsel__output_resort (evsel=evsel@entry=0x55ad1772bde0,
              prog=prog@entry=0x0) at util/hist.c:1950
          #7  0x000055ad15c6ae9a in perf_top__resort_hists (t=t@entry=0x7ffcd9cbf4f0) at builtin-top.c:311
          #8  0x000055ad15c6cc6d in perf_top__print_sym_table (top=0x7ffcd9cbf4f0) at builtin-top.c:346
          #9  display_thread (arg=0x7ffcd9cbf4f0) at builtin-top.c:700
          #10 0x00007f6282fab4fa in start_thread (arg=<optimized out>) at pthread_create.c:443
          #11 0x00007f628302e200 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      
      The reason is that strlen() gets a pointer to memory that has already been freed.
      
      The string pointer is stored in struct comm_str, which corresponds
      to an rb_tree node; when the node is erased, the memory of the string is also freed.
      
      thread__comm_len() obtains the pointer inside the thread->comm_lock critical section
      but passes it to strlen() outside of that critical section, and the perf
      process_thread may call comm__free() concurrently, which causes this segfault.
      
      The process is as follows:
      
      display_thread                                  process_thread
      --------------                                  --------------
      
      thread__comm_len
        -> thread__comm_str
             # held the comm read lock
          -> __thread__comm_str(thread)
             # release the comm read lock
                                                      thread__delete
                                                           # held the comm write lock
                                                        -> comm__free
                                                          -> comm_str__put(comm->comm_str)
                                                            -> zfree(&cs->str)
                                                           # release the comm write lock
            # The memory of the string pointed
              to by comm has been freed.
          -> thread->comm_len = strlen(comm);
      
      This patch expands the thread->comm_lock critical section in thread__comm_len()
      so that strlen() is called safely.
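
      The shape of the fix can be illustrated with a small userspace sketch.
      It assumes a POSIX pthread_rwlock_t in place of perf's own lock type, and
      struct demo_thread and the demo_* functions are hypothetical, not the
      code in util/thread.c. The point is only that strlen() runs while the
      read lock is still held, so a writer cannot free the string underneath it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a thread object with a lock-protected comm. */
struct demo_thread {
	pthread_rwlock_t comm_lock;
	char *comm;	/* may be freed by a writer holding comm_lock */
};

/* Returns 0 on success, non-zero on failure. */
int demo_thread_init(struct demo_thread *t, const char *comm)
{
	assert(t != NULL && comm != NULL);

	t->comm = strdup(comm);
	if (!t->comm)
		return -1;
	return pthread_rwlock_init(&t->comm_lock, NULL);
}

/* Reader: the pointer is fetched and measured inside one critical section. */
size_t demo_comm_len(struct demo_thread *t)
{
	size_t len = 0;

	pthread_rwlock_rdlock(&t->comm_lock);
	if (t->comm)
		len = strlen(t->comm);	/* writers are excluded here */
	pthread_rwlock_unlock(&t->comm_lock);

	return len;
}

/* Writer: frees the string under the write lock. */
void demo_comm_free(struct demo_thread *t)
{
	pthread_rwlock_wrlock(&t->comm_lock);
	free(t->comm);
	t->comm = NULL;
	pthread_rwlock_unlock(&t->comm_lock);
}
```

      The buggy ordering would copy t->comm to a local variable, drop the read
      lock, and only then call strlen() on the copy, which leaves a window for
      demo_comm_free() to run in between.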
      Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Feilong Lin <linfeilong@huawei.com>
      Cc: Hewenliang <hewenliang4@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Yunfeng Ye <yeyunfeng@huawei.com>
      Link: https://lore.kernel.org/r/322bfb49-840b-f3b6-9ef1-f9ec3435b07e@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6e57f69f
    • Adrian Hunter's avatar
      perf script: Fix Python support when no libtraceevent · 80c3a7d9
      Adrian Hunter authored
      Python scripting can be used without libtraceevent. In particular,
      scripting for Intel PT does not use tracepoints, and so does not need
      libtraceevent support.
      
      Alter the build and employ conditional compilation to allow Python
      scripting without libtraceevent.
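
      The conditional-compilation pattern can be sketched as follows. This is
      a generic illustration rather than the actual change: it assumes the
      build defines a HAVE_LIBTRACEEVENT macro when the library is available,
      and the demo_* functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#ifdef HAVE_LIBTRACEEVENT
/* Built with the library: tracepoint events can be decoded. */
static bool demo_handle_tracepoint(const char *name)
{
	printf("decoding tracepoint %s\n", name);
	return true;
}
#else
/* Built without the library: tracepoints are reported as unsupported. */
static bool demo_handle_tracepoint(const char *name)
{
	(void)name;
	return false;
}
#endif

/*
 * Events that are not tracepoints (for example hardware trace data) take
 * a path that never touches the optional library, so they are handled in
 * both build configurations.
 */
bool demo_process_event(bool is_tracepoint, const char *name)
{
	assert(name != NULL);

	if (is_tracepoint)
		return demo_handle_tracepoint(name);
	return true;
}
```

      Only the tracepoint path depends on the macro, so the rest of the
      scripting code stays available when the library is absent.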
      
      Example:
      
       Before:
      
          $ ldd `which perf` | grep -i python
          $ ldd `which perf` | grep -i libtraceevent
          $ perf record -e intel_pt//u uname
          Linux
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.031 MB perf.data ]
          $ perf script intel-pt-events.py |& head -3
            Error: Couldn't find script `intel-pt-events.py'
      
           See perf script -l for available scripts.
      
       After:
      
          $ ldd `which perf` | grep -i python
                  libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000)
          $ ldd `which perf` | grep -i libtraceevent
          $ perf script intel-pt-events.py | head
          Intel PT Branch Trace, Power Events, Event Trace and PTWRITE
               Switch In    8021/8021  [000]     11234.097713404     0/0
                 perf-exec  8021/8021  [000]     11234.098041726       psb                        offset: 0x0                0 [unknown] ([unknown])
                 perf-exec  8021/8021  [000]     11234.098041726       cbr                         45  freq: 4505 MHz  (161%)                0 [unknown] ([unknown])
                     uname  8021/8021  [000]     11234.098082170  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
                     uname  8021/8021  [000]     11234.098082379  branches:uH  tr end                    7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])
                     uname  8021/8021  [000]     11234.098083629  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
                     uname  8021/8021  [000]     11234.098083629  branches:uH  call                      7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
                     uname  8021/8021  [000]     11234.098083837  branches:uH  tr end                    7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])  IPC: 0.01 (9/938)
                     uname  8021/8021  [000]     11234.098084670  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
      
      Fixes: 378ef0f5 ("perf build: Use libtraceevent from the system")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20230315084321.14563-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      80c3a7d9
  2. 14 Mar, 2023 16 commits