• Andi Kleen's avatar
    perf vendor events: Add JSON metrics for Broadwell · cf979623
    Andi Kleen authored
    Add JSON metrics for Broadwell.
    
    Commiter testing:
    
      # uname -a
      Linux jouet 4.13.0-rc7+ #3 SMP Sat Sep 2 09:04:44 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
      # grep "model name" /proc/cpuinfo  | head -1
      model name	: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
      # perf list metricgroup
    
      List of pre-defined events (to be used in -e):
    
      Metric Groups:
    
      DSB
      FLOPS
      Frontend
      Frontend_Bandwidth
      Memory_BW
      Memory_Bound
      Memory_Lat
      Pipeline
      Ports_Utilization
      Power
      SMT
      Summary
      TLB
      TopDownL1
      Unknown_Branches
      # perf stat -M Power --metric-only -a sleep 1
    
       Performance counter stats for 'system wide':
    
      Turbo_Utilization  C3_Core_Residency  C6_Core_Residency  C7_Core_Residency  C2_Pkg_Residency  C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
           1.1               0.0                 0.0               0.0                0.0               0.0               0.0               0.0
    
             1.003502904 seconds time elapsed
    
      #
      # perf stat -M Memory_BW --metric-only -a sleep 1
    
       Performance counter stats for 'system wide':
    
      MLP
           1.7
    
             1.001364525 seconds time elapsed
    
      #
      # perf stat -M TLB --metric-only -a sleep 1
    
       Performance counter stats for 'system wide':
    
      Page_Walks_Utilization
           0.1
    
             1.005962198 seconds time elapsed
    
      #
      # perf stat -M Summary --metric-only -a sleep 1
    
       Performance counter stats for 'system wide':
    
      Instructions   CPI          CLKS          CPU_Utilization   GFLOPs  SMT_2T_Utilization  Kernel_Utilization
      7281856697.0       0.0    11150898087.0     1.0              0.0    1.0                 0.7
    
             1.012134025 seconds time elapsed
    
      #
    
    Running in verbose mode shows which counters and expressions are being
    used:
    
      # perf stat -v -M Summary --metric-only -a sleep 1
      Using CPUID GenuineIntel-6-3D
      metric expr 1 / inst_retired.any / cycles for CPI
      found event inst_retired.any
      found event cycles
      metric expr cpu_clk_unhalted.thread for CLKS
      found event cpu_clk_unhalted.thread
      metric expr inst_retired.any for Instructions
      found event inst_retired.any
      metric expr cpu_clk_unhalted.ref_tsc / msr@tsc@ for CPU_Utilization
      found event cpu_clk_unhalted.ref_tsc
      found event msr/tsc/
      metric expr ( 1*( fp_arith_inst_retired.scalar_single + fp_arith_inst_retired.scalar_double ) + 2* fp_arith_inst_retired.128b_packed_double + 4*( fp_arith_inst_retired.128b_packed_single + fp_arith_inst_retired.256b_packed_double ) + 8* fp_arith_inst_retired.256b_packed_single ) / 1000000000 / duration_time for GFLOPs
      found event fp_arith_inst_retired.scalar_single
      found event fp_arith_inst_retired.scalar_double
      found event fp_arith_inst_retired.128b_packed_double
      found event fp_arith_inst_retired.128b_packed_single
      found event fp_arith_inst_retired.256b_packed_double
      found event fp_arith_inst_retired.256b_packed_single
      found event duration_time
      metric expr 1 - cpu_clk_thread_unhalted.one_thread_active / ( cpu_clk_thread_unhalted.ref_xclk_any / 2 ) if #smt_on else 0 for SMT_2T_Utilization
      found event cpu_clk_thread_unhalted.one_thread_active
      found event cpu_clk_thread_unhalted.ref_xclk_any
      metric expr cpu_clk_unhalted.ref_tsc:u / cpu_clk_unhalted.ref_tsc for Kernel_Utilization
      found event cpu_clk_unhalted.ref_tsc:u
      found event cpu_clk_unhalted.ref_tsc
      adding {inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_arith_inst_retired.scalar_single,fp_arith_inst_retired.scalar_double,fp_arith_inst_retired.128b_packed_double,fp_arith_inst_retired.128b_packed_single,fp_arith_inst_retired.256b_packed_double,fp_arith_inst_retired.256b_packed_single,duration_time}:W,{cpu_clk_thread_unhalted.one_thread_active,cpu_clk_thread_unhalted.ref_xclk_any}:W,{cpu_clk_unhalted.ref_tsc:u,cpu_clk_unhalted.ref_tsc}:W
      inst_retired.any -> cpu/event=0xc0/
      cpu_clk_unhalted.thread -> cpu/event=0x3c/
      inst_retired.any -> cpu/event=0xc0/
      cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
      fp_arith_inst_retired.scalar_single -> cpu/umask=0x2,period=2000003,event=0xc7/
      fp_arith_inst_retired.scalar_double -> cpu/umask=0x1,period=2000003,event=0xc7/
      fp_arith_inst_retired.128b_packed_double -> cpu/umask=0x4,period=2000003,event=0xc7/
      fp_arith_inst_retired.128b_packed_single -> cpu/umask=0x8,period=2000003,event=0xc7/
      fp_arith_inst_retired.256b_packed_double -> cpu/umask=0x10,period=2000003,event=0xc7/
      fp_arith_inst_retired.256b_packed_single -> cpu/umask=0x20,period=2000003,event=0xc7/
      cpu_clk_thread_unhalted.one_thread_active -> cpu/umask=0x2,period=2000003,event=0x3c/
      cpu_clk_thread_unhalted.ref_xclk_any -> cpu/umask=0x1,any=1,period=2000003,event=0x3c/
      cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
      cpu_clk_unhalted.ref_tsc -> cpu/umask=0x3,period=2000003,event=0/
      Weak group for fp_arith_inst_retired.scalar_single/7 failed
      Weak group for cpu_clk_unhalted.ref_tsc:u/2 failed
      inst_retired.any: 8704146437 4026374016 619883741
      cycles: 11180800018 4026374016 619883741
      cpu_clk_unhalted.thread: 11140030295 4026323772 931621933
      inst_retired.any: 8643115117 4026260510 1243595906
      cpu_clk_unhalted.ref_tsc: 10201638510 4026184297 1247351077
      msr/tsc/: 10378022785 4026184297 1247351077
      fp_arith_inst_retired.scalar_single: 134697 4026102728 1559210545
      fp_arith_inst_retired.scalar_double: 274339 4026007348 1870014984
      fp_arith_inst_retired.128b_packed_double: 1639 4025886054 1866736918
      fp_arith_inst_retired.128b_packed_single: 0 4025776614 2175106569
      fp_arith_inst_retired.256b_packed_double: 0 4025681734 1235551129
      fp_arith_inst_retired.256b_packed_single: 0 4025582962 1232398454
      duration_time: 0 4025552913 4025552913
      cpu_clk_thread_unhalted.one_thread_active: 10505 4025474649 923893076
      cpu_clk_thread_unhalted.ref_xclk_any: 394992110 4025474649 923893076
      cpu_clk_unhalted.ref_tsc:u: 5341421014 4025360315 1231634198
      cpu_clk_unhalted.ref_tsc: 10258278508 4025252611 307909362
    
       Performance counter stats for 'system wide':
    
      Instructions         CPI                  CLKS                 CPU_Utilization      GFLOPs               SMT_2T_Utilization   Kernel_Utilization
      8704146437.0             0.0            11140030295.0            1.0                 0.0                 1.0                 0.5
    
             1.006783654 seconds time elapsed
    
      #
    Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
    Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Link: http://lkml.kernel.org/r/20170905195235.GW2482@two.firstfloor.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    cf979623
bdw-metrics.json 7.34 KB