    perf stat: Introduce skippable evsels · 1b114824
    Ian Rogers authored
    'perf stat' with no arguments uses default events and metrics. These
    events may fail to open even with kernel and hypervisor counting
    excluded. When they fail, the permissions error appears even though
    the events were selected implicitly rather than requested by the user.
    This is particularly a problem with the automatic selection of the
    TopdownL1 metric group on certain architectures like Skylake:
    
      $ perf stat true
      Error:
      Access to performance monitoring and observability operations is limited.
      Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
      access to performance monitoring and observability operations for processes
      without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
      More information can be found at 'Perf events and tool security' document:
      https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
      perf_event_paranoid setting is 2:
        -1: Allow use of (almost) all events by all users
            Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
      >= 0: Disallow raw and ftrace function tracepoint access
      >= 1: Disallow CPU event access
      >= 2: Disallow kernel profiling
      To make the adjusted perf_event_paranoid setting permanent preserve it
      in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
      $
    
    This patch adds skippable evsels: when one fails to open, 'perf stat'
    does not terminate and the event appears as "<not supported>" in the
    output. The TopdownL1 events, from the metric group, are marked as
    skippable. This turns the failure above into:
    
      $ perf stat perf bench internals synthesize
      Computing performance of single threaded perf event synthesis by
      synthesizing events on the perf process itself:
        Average synthesis took: 49.287 usec (+- 0.083 usec)
        Average num. events: 3.000 (+- 0.000)
        Average time per event 16.429 usec
        Average data synthesis took: 49.641 usec (+- 0.085 usec)
        Average num. events: 11.000 (+- 0.000)
        Average time per event 4.513 usec
    
       Performance counter stats for 'perf bench internals synthesize':
    
                1,222.38 msec task-clock:u                     #    0.993 CPUs utilized
                       0      context-switches:u               #    0.000 /sec
                       0      cpu-migrations:u                 #    0.000 /sec
                     162      page-faults:u                    #  132.529 /sec
             774,445,184      cycles:u                         #    0.634 GHz                         (49.61%)
           1,640,969,811      instructions:u                   #    2.12  insn per cycle              (59.67%)
             302,052,148      branches:u                       #  247.102 M/sec                       (59.69%)
               1,807,718      branch-misses:u                  #    0.60% of all branches             (59.68%)
               5,218,927      CPU_CLK_UNHALTED.REF_XCLK:u      #    4.269 M/sec
                                                        #     17.3 %  tma_frontend_bound
                                                        #     56.4 %  tma_retiring
                                                        #      nan %  tma_backend_bound
                                                        #      nan %  tma_bad_speculation      (60.01%)
             536,580,469      IDQ_UOPS_NOT_DELIVERED.CORE:u    #  438.965 M/sec                       (60.33%)
         <not supported>      INT_MISC.RECOVERY_CYCLES_ANY:u
               5,223,936      CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u #    4.274 M/sec                       (40.31%)
             774,127,250      CPU_CLK_UNHALTED.THREAD:u        #  633.297 M/sec                       (50.34%)
           1,746,579,518      UOPS_RETIRED.RETIRE_SLOTS:u      #    1.429 G/sec                       (50.12%)
           1,940,625,702      UOPS_ISSUED.ANY:u                #    1.588 G/sec                       (49.70%)
    
             1.231055525 seconds time elapsed
    
             0.258327000 seconds user
             0.965749000 seconds sys
      $
    
    The event INT_MISC.RECOVERY_CYCLES_ANY:u is skipped as it can't be
    opened with perf_event_paranoid set to 2 on Skylake. With a lower
    perf_event_paranoid value, or as root, all events/metrics are computed.
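
The behaviour described above can be sketched in C as follows. This is a
minimal illustration under stated assumptions, not the code from this patch:
struct sketch_evsel, sketch_open(), sketch_open_all() and
sketch_format_count() are hypothetical names, and the open is simulated with
a stored errno value rather than a real perf_event_open() call.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an evsel; not perf's real struct. */
struct sketch_evsel {
	const char *name;
	bool skippable;  /* a failed open must not abort the run */
	bool supported;  /* cleared when the open failed */
	int open_errno;  /* simulated open result: 0 or an errno value */
};

enum sketch_open_result { SKETCH_OPEN_OK, SKETCH_OPEN_SKIP, SKETCH_OPEN_FATAL };

/* Classify the (simulated) open result for a single event. */
static enum sketch_open_result sketch_open(struct sketch_evsel *evsel)
{
	if (evsel->open_errno == 0) {
		evsel->supported = true;
		return SKETCH_OPEN_OK;
	}
	evsel->supported = false;
	return evsel->skippable ? SKETCH_OPEN_SKIP : SKETCH_OPEN_FATAL;
}

/* Open every event; only a failing non-skippable event ends the run. */
static int sketch_open_all(struct sketch_evsel *evsels, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (sketch_open(&evsels[i]) == SKETCH_OPEN_FATAL) {
			fprintf(stderr, "Error: failed to open %s: %s\n",
				evsels[i].name, strerror(evsels[i].open_errno));
			return -1;
		}
	}
	return 0;
}

/* Format the count column: skipped events print a placeholder instead. */
static void sketch_format_count(const struct sketch_evsel *evsel,
				unsigned long long count,
				char *buf, size_t len)
{
	if (!evsel->supported)
		snprintf(buf, len, "%s", "<not supported>");
	else
		snprintf(buf, len, "%llu", count);
}
```

In this sketch, an implicitly selected event would be created with
skippable set to true, while an event named explicitly by the user would
keep it false, so an explicit request that cannot be opened still fails.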
    Signed-off-by: Ian Rogers <irogers@google.com>
    Tested-by: Kan Liang <kan.liang@linux.intel.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ahmad Yasin <ahmad.yasin@intel.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Cc: Caleb Biggers <caleb.biggers@intel.com>
    Cc: Edward Baker <edward.baker@intel.com>
    Cc: Florian Fischer <florian.fischer@muhq.space>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: James Clark <james.clark@arm.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: John Garry <john.g.garry@oracle.com>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Kang Minchul <tegongkang@gmail.com>
    Cc: Leo Yan <leo.yan@linaro.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Perry Taylor <perry.taylor@intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ravi Bangoria <ravi.bangoria@amd.com>
    Cc: Rob Herring <robh@kernel.org>
    Cc: Samantha Alt <samantha.alt@intel.com>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
    Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
    Cc: Thomas Richter <tmricht@linux.ibm.com>
    Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
    Cc: Weilin Wang <weilin.wang@intel.com>
    Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
    Cc: Yang Jihong <yangjihong1@huawei.com>
    Link: https://lore.kernel.org/r/20230502223851.2234828-3-irogers@google.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>