  1. 11 Dec, 2012 2 commits
  2. 10 Dec, 2012 1 commit
  3. 09 Dec, 2012 2 commits
  4. 14 Nov, 2012 3 commits
  5. 06 Oct, 2012 1 commit
  6. 03 Oct, 2012 2 commits
  7. 26 Sep, 2012 4 commits
  8. 06 Sep, 2012 1 commit
  9. 15 Aug, 2012 2 commits
  10. 14 Aug, 2012 1 commit
    • perf tools: Enable grouping logic for parsed events · 6a4bb04c
      Jiri Olsa authored
      This patch adds functionality that creates event groups based on
      the way events are specified on the command line. It extends the
      '{}' group syntax introduced in an earlier patch.
      
      The current '--group/-g' option behaviour remains intact. If you
      specify it for the record/stat/top commands, all the specified
      events become members of a single group with the first event as
      the group leader.

      With the new '{}' group syntax you can create a group like:
        # perf record -e '{cycles,faults}' ls
      
      resulting in a single event group containing the 'cycles' and
      'faults' events, with the cycles event as group leader.

      All groups are created per thread and per CPU. Thus recording an
      event group for 2 threads on a server with 4 CPUs will create
      8 separate groups.
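
The thread and CPU arithmetic above can be illustrated with a minimal sketch. It assumes one group instance per (thread, cpu) pair; the function name and the thread IDs are hypothetical and are not taken from the perf sources.

```python
from itertools import product

def group_instances(threads, cpus):
    """Return one (thread, cpu) pair per group instance (assumed model)."""
    return list(product(threads, cpus))

# 2 threads on a machine with 4 CPUs, as in the message above.
instances = group_instances(threads=[101, 102], cpus=[0, 1, 2, 3])
print(len(instances))  # 8
```
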
      
      Examples (first event in brackets is group leader):
      
        # 1 group (cpu-clock,task-clock)
        perf record --group -e cpu-clock,task-clock ls
        perf record -e '{cpu-clock,task-clock}' ls
      
        # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
        perf record -e '{cpu-clock,task-clock},{minor-faults,major-faults}' ls
      
        # 1 group (cpu-clock,task-clock,minor-faults,major-faults)
        perf record --group -e cpu-clock,task-clock -e minor-faults,major-faults ls
        perf record -e '{cpu-clock,task-clock,minor-faults,major-faults}' ls
      
        # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
        perf record -e '{cpu-clock,task-clock}' -e '{minor-faults,major-faults}' \
         -e instructions ls

        # 1 group
        # (cpu-clock,task-clock,minor-faults,major-faults,instructions)
        perf record --group -e cpu-clock,task-clock \
         -e minor-faults,major-faults -e instructions ls
        perf record -e '{cpu-clock,task-clock,minor-faults,major-faults,instructions}' ls
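
To make the grouping in the examples above concrete, here is a rough sketch of how a single '-e' argument could be split into groups. It is not the perf parser: `parse_groups` is a hypothetical helper, it ignores modifiers and multiple '-e' options, and it does no error handling.

```python
def parse_groups(spec):
    """Split a spec such as '{a,b},{c,d},e' into lists of event names.

    Each '{...}' yields one group whose first event is the leader. An
    event outside braces is returned as a single-member list.
    """
    groups, current, token, in_group = [], [], "", False
    for ch in spec + ",":          # trailing comma flushes the last token
        if ch == "{":
            in_group, current = True, []
        elif ch == "}":
            current.append(token)
            groups.append(current)
            token, in_group = "", False
        elif ch == ",":
            if in_group:
                current.append(token)
            elif token:
                groups.append([token])
            token = ""
        else:
            token += ch
    return groups

for group in parse_groups("{cpu-clock,task-clock},{minor-faults,major-faults}"):
    print("leader:", group[0], "members:", group)
```
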
      
      A standard event modifier can be applied to a group. It spans all
      events in the group and updates each event's modifier settings,
      for example:

        # perf record -e '{faults:k,cache-references}:p'

      resulting in the ':kp' modifier being used for the 'faults' event
      and the ':p' modifier being used for the 'cache-references' event.
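
The modifier propagation described above can be modelled as follows. This is only an illustration on strings: the real tool updates per-event attribute bits, `apply_group_modifier` is a hypothetical name, and the sketch does not handle tracepoint names that themselves contain ':'.

```python
def apply_group_modifier(events, group_mod):
    """Append the group modifier to each event's own modifier string."""
    result = []
    for ev in events:
        # Split 'name:mod' into name and modifier; mod is '' when absent.
        name, _, mod = ev.partition(":")
        result.append((name, mod + group_mod))
    return result

print(apply_group_modifier(["faults:k", "cache-references"], "p"))
# [('faults', 'kp'), ('cache-references', 'p')]
```
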
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/n/tip-ho42u0wcr8mn1otkalqi13qp@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 02 Aug, 2012 2 commits
  12. 25 Jul, 2012 1 commit
    • perf tools: Fix trace events storms due to weight demux · 0983cc0d
      Frederic Weisbecker authored
      Trace events have a period (weight) of 1 by default. This can be
      overridden in the event definition by using the __perf_count() macro.

      For example, sched_stat_runtime is weighted with the runtime of
      the task that fired the event.

      By default, perf handles such a weighted event by dividing it into
      individual events carrying a weight of 1. For example, if
      sched_stat_runtime is fired and the task has run 5000000 nsecs, perf
      divides it into 5000000 events in the buffer.

      This behaviour makes weighted events unusable because they quickly
      fill up the buffers and we lose most events.
      
      Commit 5d81e5cf ("events: Don't divide events if it has field
      period") solves this problem by sending only one event when the
      PERF_SAMPLE_PERIOD flag is set. The weight is carried in the sample
      itself, so we don't need to demultiplex it anymore.
      
      This patch provides the last missing piece to use this feature by
      setting PERF_SAMPLE_PERIOD from perf tools when we deal with trace
      events.
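
The difference between the two behaviours can be sketched with a toy model. Both function names are hypothetical, buffer entries are modelled as (name, weight) tuples, and a smaller period than the 5000000 in the message is used to keep the example cheap.

```python
def demux(name, period):
    """Old behaviour: one buffer entry of weight 1 per unit of period."""
    return [(name, 1)] * period

def single_sample(name, period):
    """With PERF_SAMPLE_PERIOD set: one entry that carries the weight."""
    return [(name, period)]

old = demux("sched_stat_runtime", 5000)
new = single_sample("sched_stat_runtime", 5000)

# Both forms account for the same total weight, but the buffer cost differs.
assert sum(w for _, w in old) == sum(w for _, w in new)
print(len(old), len(new))  # 5000 1
```
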
      
      Before:
      	$ ./perf record -e sched:* -a sleep 1
      	[ perf record: Woken up 3 times to write data ]
      	[ perf record: Captured and wrote 1.619 MB perf.data (~70749 samples) ]
      	Warning:
      	Processed 16909 events and lost 1 chunks!
      
      	Check IO/CPU overload!
      
      	$ ./perf script
      	perf  1894 [003]   824.898327: sched_migrate_task: comm=perf pid=1898 prio=120 orig_cpu=2 dest_cpu=0
      	perf  1894 [003]   824.898335: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	perf  1894 [003]   824.898336: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	perf  1894 [003]   824.898337: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	perf  1894 [003]   824.898338: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	perf  1894 [003]   824.898339: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	perf  1894 [003]   824.898340: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	perf  1894 [003]   824.898341: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
      	[...]
      
      After:
      	$ ./perf record -e sched:* -a sleep 1
      	[ perf record: Woken up 1 times to write data ]
      	[ perf record: Captured and wrote 0.074 MB perf.data (~3228 samples) ]
      
      	$ ./perf script
      
      	perf  1461 [000]   554.286957: sched_migrate_task: comm=perf pid=1465 prio=120 orig_cpu=3 dest_cpu=1
      	perf  1461 [000]   554.286964: sched_stat_sleep: comm=perf pid=1465 delay=133047190 [ns]
      	perf  1461 [000]   554.286967: sched_wakeup: comm=perf pid=1465 prio=120 success=1 target_cpu=001
      	swapper     0 [001]   554.286976: sched_stat_wait: comm=perf pid=1465 delay=0 [ns]
      	swapper     0 [001]   554.286983: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf
      	[...]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1342631456-7233-1-git-send-email-fweisbec@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 27 Jun, 2012 1 commit
    • perf tools: Stop using a global trace events description list · da378962
      Arnaldo Carvalho de Melo authored
      The pevent is per perf.data file, so I made it stop being static
      and become a perf_session member. Tools processing perf.data files
      use perf_session, and _there_ we read the trace events description
      into session->pevent. Every user is changed to stop using the single
      global pevent variable and use the per-session one.

      Note that it _doesn't_ fall back to trace__event_id, as we're not
      interested at all in what is present in
      /sys/kernel/debug/tracing/events on the workstation doing the
      analysis, just in what is in the perf.data file.

      This patch also introduces perf_session__set_tracepoints_handlers,
      which is the perf.data/session way to associate handlers with
      tracepoint events by resolving their IDs using the event descriptions
      stored in a perf.data file. Make 'perf sched' use it.
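
The per-session resolution described above could look roughly like the sketch below. `Session`, its methods, and the event IDs are hypothetical stand-ins, not the perf_session API. Skipping events that are absent from the file is an assumption of this sketch.

```python
class Session:
    """Hypothetical stand-in for a perf_session tied to one perf.data file."""

    def __init__(self, event_ids):
        # Event descriptions as read from this file: name -> numeric ID.
        self.event_ids = dict(event_ids)
        self.handlers = {}

    def set_tracepoint_handlers(self, assocs):
        """Bind handlers by resolving names against this session only."""
        for name, handler in assocs:
            if name not in self.event_ids:
                continue  # assumption: events missing from the file are skipped
            self.handlers[self.event_ids[name]] = handler

    def dispatch(self, event_id, sample):
        handler = self.handlers.get(event_id)
        return handler(sample) if handler else None

# The same tracepoint may have different IDs in different files.
a = Session({"sched:sched_switch": 42})
b = Session({"sched:sched_switch": 77})
for s in (a, b):
    s.set_tracepoint_handlers([("sched:sched_switch", lambda smp: "switch:" + smp)])
print(a.dispatch(42, "x"), b.dispatch(42, "x"))  # switch:x None
```

Keeping the name-to-ID table on the session means no lookup ever consults the machine running the analysis.
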
      Reported-by: Dmitry Antipov <dmitry.antipov@linaro.org>
      Tested-by: Dmitry Antipov <dmitry.antipov@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linaro-dev@lists.linaro.org
      Cc: patches@linaro.org
      Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 31 May, 2012 1 commit
  15. 30 May, 2012 1 commit
    • perf stat: Initialize default events wrt exclude_{guest,host} · 79695e1b
      Arnaldo Carvalho de Melo authored
      When no event is specified the tools use perf_evlist__add_default(), which
      calls event_attr_init to initialize the KVM exclusion bits.
      
      When the change was made to the tools so that by default guest samples would be
      excluded, the changes were made just to the parsing routines and to
      perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far
      just by perf stat to add multiple events, according to the level of detail
      specified.
      
      Recently the tools were changed to reconstruct the event name from all the
      details in perf_event_attr, not just from .type and .config, but taking into
      account all the feature bits (.exclude_{guest,host,user,kernel,etc},
      .precise_ip, etc).
      
      That is when we noticed that the default for perf stat wasn't the one for the
      rest of the tools, i.e. the .exclude_guest bit wasn't being set.
      
      I.e. the default path, which doesn't call event_attr_init, was showing the
      :HG modifier:
      
        $ perf stat usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  0.942119 task-clock                #    0.454 CPUs utilized
                         1 context-switches          #    0.001 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       126 page-faults               #    0.134 M/sec
                   693,193 cycles:HG                 #    0.736 GHz                     [40.11%]
                   407,461 stalled-cycles-frontend:HG #   58.78% frontend cycles idle    [72.29%]
                   365,403 stalled-cycles-backend:HG #   52.71% backend  cycles idle
                   465,982 instructions:HG           #    0.67  insns per cycle
                                                     #    0.87  stalled cycles per insn
                    89,760 branches:HG               #   95.275 M/sec
                     6,178 branch-misses:HG          #    6.88% of all branches
      
               0.002077228 seconds time elapsed
      
      Whereas if one explicitly specifies the same events, the parsing code is
      called and thus event_attr_init is called:
      
        $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  1.040349 task-clock                #    0.500 CPUs utilized
                         2 context-switches          #    0.002 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       127 page-faults               #    0.122 M/sec
                   587,966 cycles                    #    0.565 GHz                     [13.18%]
                   459,167 stalled-cycles-frontend   #   78.09% frontend cycles idle
                   390,249 stalled-cycles-backend    #   66.37% backend  cycles idle
                   504,006 instructions              #    0.86  insns per cycle
                                                     #    0.91  stalled cycles per insn
                    96,455 branches                  #   92.714 M/sec
                     6,522 branch-misses             #    6.76% of all branches         [96.12%]
      
               0.002078681 seconds time elapsed
      
      Fix it by introducing a perf_evlist__add_default_attrs method that will call
      event_attr_init on all the perf_event_attr entries before adding the events.
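
The shape of the fix can be sketched as follows. The attrs are modelled as plain dicts, and the `perf_host`/`perf_guest` defaults are assumptions of this sketch rather than something stated in the message. `event_attr_init` reuses the helper name mentioned earlier in this message, and `add_default_attrs` is a hypothetical simplification.

```python
def event_attr_init(attr, perf_host=True, perf_guest=False):
    """Set the KVM exclusion bits (defaults here are assumed)."""
    if not perf_host:
        attr["exclude_host"] = True
    if not perf_guest:
        attr["exclude_guest"] = True

def add_default_attrs(evlist, attrs):
    """Initialise every attr before adding it, so none skips the defaults."""
    for attr in attrs:
        event_attr_init(attr)
        evlist.append(attr)

evlist = []
add_default_attrs(evlist, [{"config": "cycles"}, {"config": "instructions"}])
print(all(a.get("exclude_guest") for a in evlist))  # True
```
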
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 17 May, 2012 1 commit
  17. 16 May, 2012 2 commits
  18. 07 May, 2012 2 commits
  19. 02 May, 2012 3 commits
  20. 16 Mar, 2012 1 commit
  21. 05 Mar, 2012 1 commit
  22. 29 Feb, 2012 1 commit
  23. 14 Feb, 2012 1 commit
  24. 01 Feb, 2012 1 commit
  25. 24 Jan, 2012 1 commit
  26. 06 Jan, 2012 1 commit