    perf sched: Don't read all tracepoint variables in advance · 9ec3f4e4
    Arnaldo Carvalho de Melo authored
    Read each field only in the code that actually consumes it, so that we
    avoid needless lookups:
    
      [root@sandy ~]# perf sched record sleep 30s
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]
    
    Before:
    
      [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
    
       Performance counter stats for 'perf sched lat' (10 runs):
    
              103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                      12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                       0 cpu-migrations            #    0.000 K/sec
                   7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
             345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
             106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
              62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
             628,246,586 instructions              #    1.82  insns per cycle
                                                   #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
             134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
               1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]
    
             0.104333272 seconds time elapsed                                          ( +-  0.33% )
    
    After:
    
      [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null
    
       Performance counter stats for 'perf sched lat' (10 runs):
    
             98.848272 task-clock                #    0.993 CPUs utilized            ( +-  0.48% )
                    11 context-switches          #    0.112 K/sec                    ( +-  2.83% )
                     0 cpu-migrations            #    0.003 K/sec                    ( +- 50.92% )
                 7,604 page-faults               #    0.077 M/sec                    ( +-  0.00% )
           332,216,085 cycles                    #    3.361 GHz                      ( +-  0.14% ) [82.87%]
           100,623,710 stalled-cycles-frontend   #   30.29% frontend cycles idle     ( +-  0.53% ) [82.95%]
            58,788,692 stalled-cycles-backend    #   17.70% backend  cycles idle     ( +-  0.59% ) [67.15%]
           609,402,433 instructions              #    1.83  insns per cycle
                                                 #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.76%]
           131,277,138 branches                  # 1328.067 M/sec                    ( +-  0.06% ) [83.77%]
             1,117,871 branch-misses             #    0.85% of all branches          ( +-  0.32% ) [83.51%]
    
           0.099580430 seconds time elapsed                                          ( +-  0.48% )
    
      [root@sandy ~]#
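
The idea behind the change can be illustrated with a minimal, self-contained sketch. It is not the code from builtin-sched.c: `struct sample`, `sample_intval()`, the handler names and the field values are all hypothetical stand-ins, assuming only that each tracepoint field is found through a by-name lookup. The eager handler decodes every field up front, while the lazy handler looks up only the two fields it consumes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a raw tracepoint record: fields found by name. */
struct field {
	const char *name;
	long value;
};

struct sample {
	const struct field *fields;
	size_t nr_fields;
};

static unsigned long nr_lookups; /* number of by-name lookups performed */

/* Linear search by field name; returns -1 if the field is absent. */
static long sample_intval(const struct sample *s, const char *name)
{
	nr_lookups++;
	for (size_t i = 0; i < s->nr_fields; i++) {
		if (strcmp(s->fields[i].name, name) == 0)
			return s->fields[i].value;
	}
	return -1;
}

/* Hypothetical pre-decoded event, filled in before any consumer runs. */
struct switch_event {
	long prev_pid, prev_prio, prev_state, next_pid, next_prio;
};

/* Eager style: five lookups, although only two values are used. */
static int eager_pids_differ(const struct sample *s)
{
	struct switch_event ev = {
		.prev_pid   = sample_intval(s, "prev_pid"),
		.prev_prio  = sample_intval(s, "prev_prio"),
		.prev_state = sample_intval(s, "prev_state"),
		.next_pid   = sample_intval(s, "next_pid"),
		.next_prio  = sample_intval(s, "next_prio"),
	};
	return ev.prev_pid != ev.next_pid;
}

/* Lazy style: the consumer looks up just the fields it needs. */
static int lazy_pids_differ(const struct sample *s)
{
	return sample_intval(s, "prev_pid") != sample_intval(s, "next_pid");
}

/* Runs both handlers on one made-up sample and returns the lookups saved. */
static unsigned long lookups_saved(void)
{
	static const struct field f[] = {
		{ "prev_pid", 100 }, { "prev_prio", 120 }, { "prev_state", 1 },
		{ "next_pid", 200 }, { "next_prio", 120 },
	};
	const struct sample s = { f, sizeof(f) / sizeof(f[0]) };
	unsigned long eager, lazy;
	int a, b;

	nr_lookups = 0;
	a = eager_pids_differ(&s);
	eager = nr_lookups;

	nr_lookups = 0;
	b = lazy_pids_differ(&s);
	lazy = nr_lookups;

	assert(a == b); /* both styles must agree on the result */
	return eager - lazy;
}
```

Calling `lookups_saved()` from a `main()` compares the two styles on a single sample. In a real trace the saving is multiplied by the number of events processed, which is presumably where the lower instruction count in the measurements above comes from.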
    
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Namhyung Kim <namhyung@gmail.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: http://lkml.kernel.org/n/tip-kracdpw8wqlr0xjh75uk8g11@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
builtin-sched.c 44.4 KB