1. 25 Nov, 2009 1 commit
  2. 24 Nov, 2009 20 commits
    • tracing: Convert some sched trace events to DEFINE_EVENT and _PRINT · 75ec29ab
      Steven Rostedt authored
      Converting some of the scheduler trace events to use the
      TRACE_EVENT_TEMPLATE, DEFINE_EVENT and DEFINE_EVENT_PRINT helped to
      save some space:
      
      $ size kernel/sched.o-*
         text	   data	    bss	    dec	    hex	filename
        79299	   6776	   2520	  88595	  15a13	kernel/sched.o-notrace
       101941	  11896	   2584	 116421	  1c6c5	kernel/sched.o-templ
       104779	  11896	   2584	 119259	  1d1db	kernel/sched.o-trace
      
      sched.o-notrace is without any tracepoints compiled
      sched.o-templ is with this patch
      sched.o-trace is the tracepoints before this patch
      
      The trace events converted to DEFINE_EVENT:
      
      sched_wakeup, sched_wakeup_new, sched_process_free, sched_process_exit,
      and sched_stat_wait.
      
      The trace events converted to DEFINE_EVENT_PRINT:
      
      sched_stat_sleep and sched_stat_iowait.
      
      Note, since the TRACE_EVENT_TEMPLATE always uses a print, the
      sched_stat_wait print format is defined in the template and this
      template is used by sched_stat_sleep and sched_stat_iowait. But the
      latter two override the print format.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Create new DEFINE_EVENT_PRINT · e5bc9721
      Steven Rostedt authored
      After creating the TRACE_EVENT_TEMPLATE I started to look at other
      trace points to see what duplication was made. I noticed that there
      are several trace points where they are almost identical except for
      the name and the output format. Since TRACE_EVENT_TEMPLATE was successful
      in bringing down the size of trace events, I added a DEFINE_EVENT_PRINT.
      
      DEFINE_EVENT_PRINT is used just like DEFINE_EVENT is. That is, the
      DEFINE_EVENT_PRINT also uses a TRACE_EVENT_TEMPLATE, but it allows the
      developer to overwrite the print format. If there are two or more
      TRACE_EVENTS that are identical except for the name and print, then
      they can be converted to use a TRACE_EVENT_TEMPLATE. Since the
      TRACE_EVENT_TEMPLATE already does the print output, the first trace event
      would have its print format held in the TRACE_EVENT_TEMPLATE and
      be defined with a DEFINE_EVENT. The rest will use the DEFINE_EVENT_PRINT
      and override the print format.
      
      I converted the sched trace points to use both DEFINE_EVENT and
      DEFINE_EVENT_PRINT: five were converted to DEFINE_EVENT and two
      to DEFINE_EVENT_PRINT.
      
      I was able to get the following:
      
      $ size kernel/sched.o-*
         text	   data	    bss	    dec	    hex	filename
        79299	   6776	   2520	  88595	  15a13	kernel/sched.o-notrace
       101941	  11896	   2584	 116421	  1c6c5	kernel/sched.o-templ
       104779	  11896	   2584	 119259	  1d1db	kernel/sched.o-trace
      
      sched.o-notrace is the scheduler compiled with no trace points.
      sched.o-templ is with the use of DEFINE_EVENT and DEFINE_EVENT_PRINT
      sched.o-trace is the current trace events.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Create new TRACE_EVENT_TEMPLATE · ff038f5c
      Steven Rostedt authored
      There are some places in the kernel that define several tracepoints and
      they are all identical besides the name. The code to enable, disable and
      record is created for every trace point even if most of the code is
      identical.
      
      This patch adds TRACE_EVENT_TEMPLATE that lets the developer create
      a template TRACE_EVENT and create trace points with DEFINE_EVENT, which
      is based off of a given template. Each trace point used by this
      will share most of the code, and bring down the size of the kernel
      when there are several duplicate events.
      
      Usage is:
      
      TRACE_EVENT_TEMPLATE(name, proto, args, tstruct, assign, print);
      
      Which would be the same as defining a normal TRACE_EVENT.
      
      To create the trace events that the trace points will use:
      
      DEFINE_EVENT(template, name, proto, args);

      The template is the name of the TRACE_EVENT_TEMPLATE to use. The name
      is the name of the trace point. The parameters proto and args must be
      the same as the proto and args of the template. If they are not the
      same, a compile error will result. I tried hard to remove this
      duplication, but the C preprocessor is not powerful enough (or my CPP
      magic experience points are not at a high enough level) to avoid it.
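
The sharing idea can be sketched in plain user-space C. This is only an analogy: the DEMO_* macros, the snprintf-based record step and the demo_wakeup names are assumptions made for illustration, not the kernel's TRACE_EVENT_TEMPLATE or DEFINE_EVENT implementation. The template macro emits one shared record function, and each defined event becomes a thin wrapper that repeats proto and args, much as described above.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers: let a comma-separated list travel as one macro argument. */
#define DEMO_PROTO(...) __VA_ARGS__
#define DEMO_ARGS(...)  __VA_ARGS__

/* "Template": emits ONE shared record function for a whole class of events. */
#define DEMO_EVENT_TEMPLATE(tmpl, proto, args, fmt)                     \
	int demo_record_##tmpl(char *buf, size_t len,                   \
			       const char *event, proto)                \
	{                                                               \
		int n = snprintf(buf, len, "%s: ", event);              \
		if (n < 0 || (size_t)n >= len)                          \
			return -1;                                      \
		n = snprintf(buf + n, len - (size_t)n, fmt, args);      \
		return n < 0 ? -1 : 0;                                  \
	}

/* "Event": a thin wrapper that only adds its own name to the shared code. */
#define DEMO_DEFINE_EVENT(tmpl, name, proto, args)                      \
	int demo_trace_##name(char *buf, size_t len, proto)             \
	{                                                               \
		return demo_record_##tmpl(buf, len, #name, args);       \
	}

DEMO_EVENT_TEMPLATE(demo_wakeup,
	DEMO_PROTO(int pid, int prio),
	DEMO_ARGS(pid, prio),
	"pid=%d prio=%d")

/* Two events that differ only in name reuse the same record function. */
DEMO_DEFINE_EVENT(demo_wakeup, sched_wakeup,
	DEMO_PROTO(int pid, int prio), DEMO_ARGS(pid, prio))
DEMO_DEFINE_EVENT(demo_wakeup, sched_wakeup_new,
	DEMO_PROTO(int pid, int prio), DEMO_ARGS(pid, prio))
```

Calling demo_trace_sched_wakeup(buf, sizeof(buf), 42, 120) fills buf through the shared function. Adding a third event costs one small wrapper instead of a full copy of the record code.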
      
      A lot of trace events are coming in with new XFS development. Most of
      the trace points are identical except for the name. The following shows
      the advantage of having TRACE_EVENT_TEMPLATE:
      
      $ size fs/xfs/xfs.o.*
          text          data     bss     dec     hex filename
        452114          2788    3520  458422   6feb6 fs/xfs/xfs.o.old
        638482         38116    3744  680342   a6196 fs/xfs/xfs.o.template
        996954         38116    4480 1039550   fdcbe fs/xfs/xfs.o.trace
      
      xfs.o.old is without any tracepoints.
      xfs.o.template uses the new TRACE_EVENT_TEMPLATE.
      xfs.o.trace uses the current TRACE_EVENT macros.
      Requested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • perf_events: Fix bad software/trace event recursion counting · fe612672
      Frederic Weisbecker authored
      Commit 4ed7c92d (perf_events: Undo some recursion damage) introduced
      bad reference counting of the recursion context: putting the context
      behaves like getting it, which drops every software/trace event
      after the first one in a context.
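
A minimal sketch of the failure mode, assuming the recursion context is a plain counter and that the faulty put increments it the way a get does. The demo_* names are hypothetical and this is not the kernel code.

```c
#include <stddef.h>

/* Take the context; refuse (NULL) if it is already held. */
int *demo_get_recursion_context(int *recursion)
{
	if (*recursion)
		return NULL;
	(*recursion)++;
	return recursion;
}

/* Correct release: undo exactly what the get did. */
void demo_put_recursion_context(int *recursion)
{
	(*recursion)--;
}

/* Faulty release (assumed shape of the bug): counts up like a get. */
void demo_put_recursion_context_buggy(int *recursion)
{
	(*recursion)++;
}
```

With the faulty put the counter never returns to zero, so every later get in the same context fails and the event is dropped.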
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1259091502-5171-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_events, x86: Fix validate_event bug · 1261a02a
      Stephane Eranian authored
      validate_event() was failing on valid event combinations. The
      function assumed that a return value of 0 from x86_schedule_event()
      meant an error. But x86_schedule_event() returns the counter index,
      and 0 is a perfectly valid value. An error is indicated only by a
      negative return value.
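
The return-value mix-up can be sketched as follows. The demo_* functions and the bitmask of used counters are assumptions for illustration only; the real x86_schedule_event() takes different arguments.

```c
#include <errno.h>

/* Hypothetical scheduler: returns a free counter index (>= 0) or a negative errno. */
int demo_schedule_event(unsigned int used_mask, int num_counters)
{
	int idx;

	for (idx = 0; idx < num_counters; idx++)
		if (!(used_mask & (1u << idx)))
			return idx;
	return -EAGAIN;
}

/* Buggy check: treats index 0 as failure and any non-zero value as success. */
int demo_validate_event_buggy(unsigned int used_mask, int num_counters)
{
	if (!demo_schedule_event(used_mask, num_counters))
		return -ENOSPC;
	return 0;
}

/* Fixed check: only a negative return value is an error. */
int demo_validate_event(unsigned int used_mask, int num_counters)
{
	if (demo_schedule_event(used_mask, num_counters) < 0)
		return -ENOSPC;
	return 0;
}
```

With all counters free the scheduler hands out index 0, which the buggy check rejects. With all counters busy the scheduler returns a negative value, which the buggy check accepts.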
      
      Furthermore, validate_event() was also failing for event groups
      because the event->pmu was not set until after
      hw_perf_event_init().
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: paulus@samba.org
      Cc: perfmon2-devel@lists.sourceforge.net
      Cc: eranian@gmail.com
      LKML-Reference: <4b0bdf36.1818d00a.07cc.25ae@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      --
       arch/x86/kernel/cpu/perf_event.c |    4 ++--
       1 file changed, 2 insertions(+), 2 deletions(-)
    • perf symbols: Rename find_symbol routines to find_function · fcf1203a
      Arnaldo Carvalho de Melo authored
      Paving the way for supporting variables in addition to function
      symbols.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259074912-5924-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Remove unused wrapper routines · 727dad10
      Arnaldo Carvalho de Melo authored
      And also make xrealloc and xmalloc weak symbols so that we don't
      have this problem:
      
       /usr/lib/gcc/x86_64-redhat-linux/4.4.1/../../../../lib64/libiberty.a(xmalloc.o):
       In function `xrealloc':
       (.text+0xc0): multiple definition of `xrealloc'
       libperf.a(wrapper.o):/home/acme_unencrypted/git/linux-2.6-tip/tools/perf/util/wrapper.c:67:
       first defined here
       collect2: ld returned 1 exit status
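
A minimal sketch of a weak definition, assuming GCC or Clang (the weak attribute is a compiler extension). The function bodies are simplified stand-ins, not the code in util/wrapper.c. A weak definition is used when no other definition exists and gives way to a strong one, such as the copy in libiberty.a, instead of causing a multiple-definition error.

```c
#include <stdio.h>
#include <stdlib.h>

/* Weak: the linker silently prefers a strong xmalloc if another object provides one. */
__attribute__((weak)) void *xmalloc(size_t size)
{
	void *ret = malloc(size ? size : 1);

	if (!ret) {
		fprintf(stderr, "Out of memory, malloc failed\n");
		exit(1);
	}
	return ret;
}

/* Same treatment for xrealloc, the symbol named in the link error above. */
__attribute__((weak)) void *xrealloc(void *ptr, size_t size)
{
	void *ret = realloc(ptr, size ? size : 1);

	if (!ret) {
		fprintf(stderr, "Out of memory, realloc failed\n");
		exit(1);
	}
	return ret;
}
```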
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259071517-3242-4-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Introduce zalloc() for the common calloc(1, N) case · 36479484
      Arnaldo Carvalho de Melo authored
      This way we type fewer characters and it looks more like its
      kernel counterpart, kzalloc.
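
A minimal sketch of such a helper, assuming it is nothing more than the calloc(1, N) wrapper that the commit title describes:

```c
#include <stdlib.h>

/* Allocate size bytes, zero-filled; shorthand for calloc(1, size). */
static inline void *zalloc(size_t size)
{
	return calloc(1, size);
}
```

A caller then writes `struct foo *f = zalloc(sizeof(*f));` instead of spelling out calloc(1, sizeof(*f)), where struct foo is a placeholder type.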
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259071517-3242-3-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf symbols: Simplify symbol machinery setup · b32d133a
      Arnaldo Carvalho de Melo authored
      And also express its configuration toggles via a struct.
      
      Now all one has to do is to call symbol__init(NULL) if the
      defaults are OK, or pass a struct symbol_conf pointer with the
      desired configuration.
      
      If a tool uses kernel_maps__find_symbol() to look at the kernel
      and modules mappings for a symbol but didn't call symbol__init()
      first, that will generate a one time warning too, alerting the
      subcommand developer that symbol__init() must be called.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259071517-3242-2-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf top: Always show the DSO column, even if its all the same · 7cc017ed
      Arnaldo Carvalho de Melo authored
      Ingo found it confusing, and I agree. For 'perf report' it's OK
      because the output is static, but for a tool that keeps refreshing,
      the eventual switch from a column to a summary at the top may
      seem confusing.
      Suggested-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259071517-3242-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Use common process_event functions for annotate and report · e74328d3
      John Kacur authored
      Prevent bit-rot in perf-annotate by using common functions where
      possible. Here we create process_events.[ch] to hold the common
      functions.
      Signed-off-by: John Kacur <jkacur@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: acme@redhat.com
      LKML-Reference: <1259073301-11506-3-git-send-email-jkacur@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Add perf.data to .gitignore · c9c7ccaf
      John Kacur authored
      Signed-off-by: John Kacur <jkacur@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: acme@redhat.com
      LKML-Reference: <1259073301-11506-2-git-send-email-jkacur@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'perf/bench' into perf/core · 1263d736
      Ingo Molnar authored
      Merge reason: Looks mergeable - ready it for the merge window.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_events: Fix bogus copy_to_user() in perf_event_read_group() · 184d3da8
      Stephane Eranian authored
      When using an event group, the value and id for non-leader events
      were wrong due to an invalid offset into the outgoing buffer.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: paulus@samba.org
      Cc: perfmon2-devel@lists.sourceforge.net
      LKML-Reference: <4b0b71e1.0508d00a.075e.ffff84a3@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf kmem: Add help file · b23d5767
      Li Zefan authored
      Add Documentation/perf-kmem.txt
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6EAF.80802@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf kmem: Measure kmalloc/kfree CPU ping-pong call-sites · 079d3f65
      Li Zefan authored
      Show statistics for allocations and frees on different cpus:
      
      ------------------------------------------------------------------------------------------------------
      Callsite                           | Total_alloc/Per | Total_req/Per   | Hit   | Ping-pong | Frag
      ------------------------------------------------------------------------------------------------------
       perf_event_alloc.clone.0+0         |      7504/682   |      7128/648   |     11 |        0 |  5.011%
       alloc_buffer_head+16               |       288/57    |       280/56    |      5 |        0 |  2.778%
       radix_tree_preload+51              |       296/296   |       288/288   |      1 |        0 |  2.703%
       tracepoint_add_probe+32e           |       157/31    |       154/30    |      5 |        0 |  1.911%
       do_maps_open+0                     |       796/12    |       792/12    |     66 |        0 |  0.503%
       sock_alloc_send_pskb+16e           |     23780/495   |     23744/494   |     48 |       38 |  0.151%
       anon_vma_prepare+9a                |      3744/44    |      3740/44    |     85 |        0 |  0.107%
       d_alloc+21                         |     64948/164   |     64944/164   |    396 |        0 |  0.006%
       proc_alloc_inode+23                |    262292/676   |    262288/676   |    388 |        0 |  0.002%
       create_object+28                   |    459600/200   |    459600/200   |   2298 |       71 |  0.000%
       journal_start+67                   |     14440/40    |     14440/40    |    361 |        0 |  0.000%
       get_empty_filp+df                  |     53504/256   |     53504/256   |    209 |        0 |  0.000%
       getname+2a                         |    823296/4096  |    823296/4096  |    201 |        0 |  0.000%
       seq_read+2b0                       |    544768/4096  |    544768/4096  |    133 |        0 |  0.000%
       seq_open+6d                        |     17024/128   |     17024/128   |    133 |        0 |  0.000%
       mmap_region+2e6                    |     11704/88    |     11704/88    |    133 |        0 |  0.000%
       single_open+0                      |      1072/16    |      1072/16    |     67 |        0 |  0.000%
       __alloc_skb+2e                     |     12544/256   |     12544/256   |     49 |       38 |  0.000%
       __sigqueue_alloc+4a                |      1296/144   |      1296/144   |      9 |        8 |  0.000%
       tracepoint_add_probe+6f            |        80/16    |        80/16    |      5 |        0 |  0.000%
      ------------------------------------------------------------------------------------------------------
      ...
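
The Frag column in the rows above is consistent with the share of allocated bytes that was never requested. For perf_event_alloc.clone.0+0, 7504 bytes allocated against 7128 requested gives (7504 - 7128) / 7504, about 5.011%. A minimal sketch of that calculation follows; the formula is inferred from the table and the function name is hypothetical.

```c
/* Fragmentation in percent: portion of the allocated bytes that was not requested. */
double demo_fragmentation(unsigned long bytes_req, unsigned long bytes_alloc)
{
	if (bytes_alloc == 0)
		return 0.0;
	return 100.0 - (100.0 * bytes_req / bytes_alloc);
}
```

Call sites whose requested and allocated totals match, such as getname+2a, come out at 0%.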
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E9F.6020309@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf kmem: Collect cross node allocation statistics · 7d0d3945
      Li Zefan authored
      Show cross node memory allocations:
      
       # ./perf kmem
      
       SUMMARY
       =======
       ...
       Cross node allocations: 0/3633
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E87.10906@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf kmem: Default to sort by fragmentation · 29b3e152
      Li Zefan authored
      Make the output sort by fragmentation by default.
      
      Also make the usage of "--sort" option consistent with other
      perf tools. That is, we support multi keys: "--sort
      key1[,key2]...".
      
       # ./perf kmem --stat caller
       ------------------------------------------------------------------------------
       Callsite                    |Total_alloc/Per | Total_req/Per | Hit  | Frag
       ------------------------------------------------------------------------------
       __netdev_alloc_skb+23       |    5048/1682   |    4564/1521  |     3|   9.588%
       perf_event_alloc.clone.0+0  |    7504/682    |    7128/648   |    11|   5.011%
       tracepoint_add_probe+32e    |     157/31     |     154/30    |     5|   1.911%
       alloc_buffer_head+16        |     456/57     |     448/56    |     8|   1.754%
       radix_tree_preload+51       |     584/292    |     576/288   |     2|   1.370%
       ...
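
Multi-key sorting of the "key1[,key2]..." form can be sketched as a chain of comparators, where a later key only breaks ties left by the earlier ones. Everything here is an assumption for illustration: the demo_* names, the two key names "hit" and "bytes", and the descending order. It is not the code in builtin-kmem.c.

```c
#include <stdlib.h>
#include <string.h>

struct demo_stat {
	unsigned long bytes_req;
	unsigned long bytes_alloc;
	unsigned long hit;
};

typedef int (*demo_cmp_fn)(const struct demo_stat *, const struct demo_stat *);

static int demo_cmp_hit(const struct demo_stat *a, const struct demo_stat *b)
{
	return (a->hit > b->hit) - (a->hit < b->hit);
}

static int demo_cmp_bytes(const struct demo_stat *a, const struct demo_stat *b)
{
	return (a->bytes_alloc > b->bytes_alloc) - (a->bytes_alloc < b->bytes_alloc);
}

#define DEMO_MAX_KEYS 4
static demo_cmp_fn demo_keys[DEMO_MAX_KEYS];
static int demo_nr_keys;

/* Parse "key1[,key2]..." into the comparator chain; -1 on an unknown key. */
int demo_setup_sort(const char *arg)
{
	char copy[64];
	char *tok;

	demo_nr_keys = 0;
	if (strlen(arg) >= sizeof(copy))
		return -1;
	strcpy(copy, arg);

	for (tok = strtok(copy, ","); tok; tok = strtok(NULL, ",")) {
		if (demo_nr_keys == DEMO_MAX_KEYS)
			return -1;
		if (!strcmp(tok, "hit"))
			demo_keys[demo_nr_keys++] = demo_cmp_hit;
		else if (!strcmp(tok, "bytes"))
			demo_keys[demo_nr_keys++] = demo_cmp_bytes;
		else
			return -1;
	}
	return demo_nr_keys ? 0 : -1;
}

/* qsort() callback: descending order, first key decides, later keys break ties. */
int demo_compare(const void *pa, const void *pb)
{
	const struct demo_stat *a = pa, *b = pb;
	int i;

	for (i = 0; i < demo_nr_keys; i++) {
		int r = demo_keys[i](a, b);

		if (r)
			return -r;
	}
	return 0;
}
```

After demo_setup_sort("hit,bytes"), passing demo_compare to qsort() orders entries by hit count first and by allocated bytes among entries with equal hit counts.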
      
      TODO:
      - Extract duplicate code in builtin-kmem.c and builtin-sched.c
        into util/sort.c.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E72.7010200@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf kmem: Add new option to show raw ip · 7707b6b6
      Li Zefan authored
      Add option "--raw-ip" to show raw ip instead of symbols:
      
       # ./perf kmem --stat caller --raw-ip
       ------------------------------------------------------------------------------
       Callsite                    |Total_alloc/Per | Total_req/Per | Hit  | Frag
       ------------------------------------------------------------------------------
       0xc05301aa                  |  733184/4096   |  733184/4096  |   179|   0.000%
       0xc0542ba0                  |  483328/4096   |  483328/4096  |   118|   0.000%
       ...
      
      Also show symbols with format sym+offset instead of sym/offset.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      LKML-Reference: <4B0B6E5C.4080900@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Fix compilation on powerpc · ee3d2504
      Paul Mackerras authored
      Currently, perf fails to compile on powerpc with this error:
      
           CC util/header.o
       In file included from util/../perf.h:17,
                        from util/header.c:9:
       util/../../../arch/powerpc/include/asm/unistd.h:360:27: error: linux/linkage.h: No such file or directory
       make: *** [util/header.o] Error 1
      
      The reason is that we still have a #define __KERNEL__ in effect
      at the point where <asm/unistd.h> gets included, which means we
      get extra stuff that we don't need or want.
      
      This fixes the problem by undefining __KERNEL__ once we have
      included the file for which we need __KERNEL__ defined.
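
The define/include/undef ordering can be sketched in a single file. DEMO_KERNEL stands in for __KERNEL__, and the two #ifdef blocks stand in for the two included headers; all names here are hypothetical.

```c
/* Defined only for the benefit of the first "header". */
#define DEMO_KERNEL 1

/* Stand-in for the header that needs the macro defined. */
#ifdef DEMO_KERNEL
#define DEMO_FIRST_HEADER_SAW_KERNEL 1
#else
#define DEMO_FIRST_HEADER_SAW_KERNEL 0
#endif

/* The fix: drop the macro so that it does not leak into later includes. */
#undef DEMO_KERNEL

/* Stand-in for <asm/unistd.h>, which must not see the macro. */
#ifdef DEMO_KERNEL
#define DEMO_SECOND_HEADER_SAW_KERNEL 1
#else
#define DEMO_SECOND_HEADER_SAW_KERNEL 0
#endif

int demo_first_header_saw_kernel(void)
{
	return DEMO_FIRST_HEADER_SAW_KERNEL;
}

int demo_second_header_saw_kernel(void)
{
	return DEMO_SECOND_HEADER_SAW_KERNEL;
}
```

Without the #undef, the second block would also take its kernel-only branch, which is the situation the compile error above reflects.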
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <19211.24287.453183.78836@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 23 Nov, 2009 19 commits
    • hw-breakpoints: Fix misordered ifdef · fa7c27ee
      Frederic Weisbecker authored
      Fix a misplaced ifdef. We need the perf event headers in the
      off-case as well, to avoid the following build error:
      
       include/linux/hw_breakpoint.h:94: error: expected declaration specifiers or '...' before 'perf_callback_t'
       include/linux/hw_breakpoint.h:102: error: expected declaration specifiers or '...' before 'perf_callback_t'
       include/linux/hw_breakpoint.h:109: error: expected declaration specifiers or '...' before 'perf_callback_t'
       include/linux/hw_breakpoint.h:116: error: expected declaration specifiers or '...' before 'perf_callback_t'
      Reported-by: Kisskb-bot by Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1259011812-8093-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf kmem: Resolve symbols · 1b145ae5
      Arnaldo Carvalho de Melo authored
      E.g.:
      
        [root@doppio linux-2.6-tip]# perf kmem record sleep 3s
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.804 MB perf.data (~35105 samples) ]
      
        [root@doppio linux-2.6-tip]# perf kmem --stat caller | head -10
        ------------------------------------------------------------------------------
        Callsite                    |Total_alloc/Per | Total_req/Per | Hit  | Frag
        ------------------------------------------------------------------------------
        getname/40                  | 1519616/4096   | 1519616/4096  |   371|   0.000%
        seq_read/a2                 |  987136/4096   |  987136/4096  |   241|   0.000%
        __netdev_alloc_skb/43       |  260368/1049   |  259968/1048  |   248|   0.154%
        __alloc_skb/5a              |   77312/256    |   77312/256   |   302|   0.000%
        proc_alloc_inode/33         |   76480/632    |   76472/632   |   121|   0.010%
        get_empty_filp/8d           |   70272/192    |   70272/192   |   366|   0.000%
        split_vma/8e                |   42064/176    |   42064/176   |   239|   0.000%
        [root@doppio linux-2.6-tip]#
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1259005869-13487-2-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Move graph_line and graph_dotted_line from top · 2890284b
      Arnaldo Carvalho de Melo authored
      So that they can be used in other tools.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259005869-13487-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf symbols: Look for vmlinux in more places · cc612d81
      Arnaldo Carvalho de Melo authored
      Now that we can check the buildid to see if it really matches,
      this can be done safely:
      
        vmlinux
        /boot/vmlinux
        /boot/vmlinux-<uts.release>
        /lib/modules/<uts.release>/build/vmlinux
        /usr/lib/debug/lib/modules/%s/vmlinux
      
      More can be added - if you know about distros that put the
      vmlinux somewhere else please let us know.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Add support for breakpoint events in perf tools · 1b290d67
      Frederic Weisbecker authored
      Add breakpoint event support with this new synopsis:

        mem:addr[:access]

      where addr is a raw address value in the kernel and access is
      given as [r][w][x].
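
Parsing this syntax can be sketched as follows. The function name, the DEMO_BP_* flag values and the choice to report an empty access mask when the access part is omitted are assumptions of this sketch, not perf's parser.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BP_R 1u
#define DEMO_BP_W 2u
#define DEMO_BP_X 4u

/* Parse "mem:addr[:access]"; returns 0 on success, -1 on malformed input. */
int demo_parse_mem_event(const char *str, unsigned long long *addr,
			 unsigned int *access)
{
	char *end;

	if (strncmp(str, "mem:", 4) != 0)
		return -1;
	str += 4;

	errno = 0;
	*addr = strtoull(str, &end, 0);	/* base 0 accepts the 0x prefix */
	if (end == str || errno)
		return -1;

	*access = 0;
	if (*end == '\0')
		return 0;		/* access part omitted */
	if (*end != ':' || end[1] == '\0')
		return -1;

	for (end++; *end; end++) {
		switch (*end) {
		case 'r':
			*access |= DEMO_BP_R;
			break;
		case 'w':
			*access |= DEMO_BP_W;
			break;
		case 'x':
			*access |= DEMO_BP_X;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

For the tasklist_lock example below, "mem:0xffffffff8189c000:rw" yields the address 0xffffffff8189c000 with the read and write flags set.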
      
      Example to profile tasklist_lock:
      
      	$ grep tasklist_lock /proc/kallsyms
      	ffffffff8189c000 D tasklist_lock
      
      	$ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
      	$ perf report
      
      	# Samples: 62
      	#
      	# Overhead          Command  Shared Object  Symbol
      	# ........  ...............  .............  ......
      	#
      	    29.03%          swapper  [kernel]       [k] _raw_read_trylock
      	    29.03%          swapper  [kernel]       [k] _raw_read_unlock
      	    19.35%             init  [kernel]       [k] _raw_read_trylock
      	    19.35%             init  [kernel]       [k] _raw_read_unlock
      	     1.61%         events/0  [kernel]       [k] _raw_read_trylock
      	     1.61%         events/0  [kernel]       [k] _raw_read_unlock
      
      Coming soon:
      
       - Support for symbols in the event definition.
      
       - Default period to 1 for breakpoint events because these are
         not high frequency events. The same thing is needed for trace
         events.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1258987355-8751-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Add kernel side syscall events support for breakpoints · f5ffe02e
      Frederic Weisbecker authored
      Add the remaining necessary bits to support breakpoints created
      through perf syscall.
      
      We don't use the software counter interface as:
      
      - We don't need to check against recursion, this is already done
        in hardware breakpoints arch level.
      
      - We already know the perf event we are dealing with when the
        event is to be committed.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1258987355-8751-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • hw-breakpoints: Check the breakpoint params from perf tools · fdf6bc95
      Frederic Weisbecker authored
      Perf tools create perf events as disabled in the beginning.
      Breakpoints are then treated like ptrace temporary
      breakpoints, only meant to reserve a breakpoint slot until we
      get all the necessary information from the user.
      
      In this case, we don't check the address that is breakpointed as
      it is NULL in the ptrace case.
      
      But perf tools don't have the same purpose, events are created
      disabled to wait for all events to be created before enabling
      all of them. We want to check the breakpoint parameters in this
      case.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1258987355-8751-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • hw-breakpoints: Include only linux/perf_event.h from kernel part of bp headers · e6db4876
      Frederic Weisbecker authored
      As userspace only needs the breakpoints enum types from the
      breakpoints headers.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <1258987355-8751-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e6db4876
    • K.Prasad's avatar
      hw-breakpoint: Attribute authorship of hw-breakpoint related files · ba6909b7
      K.Prasad authored
      Attribute authorship to developers of hw-breakpoint related
      files.
      Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091123154713.GA5593@in.ibm.com>
      [ v2: moved it to latest -tip ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ba6909b7
    • Peter Zijlstra's avatar
      perf_events: Restore sanity to scaling land · acd1d7c1
      Peter Zijlstra authored
      It is quite possible to call update_event_times() on a context
      that isn't actually running and thereby confuse the thing.
      
      perf stat was reporting !100% scale values for software counters
      (2e2af50b perf_events: Disable events when we detach them,
      solved the worst of that, but there was still some left).
      
      The thing that happens is that because we are not self-reaping
      (we have a caring parent) there is a time between the last
      schedule (out) and having do_exit() called which will detach the
      events.
      
      This period would be accounted as enabled,!running because the
      event->state==INACTIVE, even though !event->ctx->is_active.
      
      Similar issues could have been observed by calling read() on an
      event while the attached task was not scheduled in.
      
      Solve this by teaching update_event_times() about
      ctx->is_active.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1258984836.4531.480.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      acd1d7c1
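One plausible reading of the fix in acd1d7c1, as a simplified model rather than the kernel's code: when the context is not active, the enabled time should stop at the moment the event last stopped instead of following the context clock. The types `ctx_model` and `event_model` and the function `enabled_time()` are hypothetical; only the `is_active` idea comes from the commit message.

```c
#include <stdint.h>

/* Simplified model; these are not the kernel's data structures. */
struct ctx_model {
    int is_active;    /* nonzero while the context is scheduled in */
    uint64_t time;    /* context clock */
};

struct event_model {
    uint64_t tstamp_enabled;  /* context time when the event was enabled */
    uint64_t tstamp_stopped;  /* context time when the event last stopped */
};

/*
 * Enabled time of an event. For an inactive context the interval ends at
 * tstamp_stopped, so time spent after the last schedule-out is not counted.
 */
uint64_t enabled_time(const struct ctx_model *ctx, const struct event_model *ev)
{
    uint64_t end = ctx->is_active ? ctx->time : ev->tstamp_stopped;

    return end - ev->tstamp_enabled;
}
```

With an event enabled at 0 and stopped at 100, an inactive context whose clock reads 150 yields 100 in this model, while an active one yields 150.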
    • Peter Zijlstra's avatar
      perf_events: Undo some recursion damage · 4ed7c92d
      Peter Zijlstra authored
      Make perf_swevent_get_recursion_context return a context number
      and disable preemption.
      
      The context number could be used to remove the IRQ disable from
      the trace bit and to index the per-cpu buffer.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20091123103819.993226816@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4ed7c92d
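A rough userspace model of the idea in 4ed7c92d: the "get" call returns a context number that the caller can later use to pick a buffer. The enum, the arrays and the function names are illustrative assumptions. In the kernel the current context is detected rather than passed in, the state is per-cpu, and preemption is disabled; none of that is modelled here.

```c
/* Illustrative execution contexts; the numbering is an assumption. */
enum exec_ctx { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI, CTX_MAX };

static int recursion[CTX_MAX];        /* one recursion guard per context */
static char buffers[CTX_MAX][64];     /* one scratch buffer per context */

/* Returns the context number, or -1 if this context is already in use. */
int get_recursion_context(enum exec_ctx cur)
{
    if (recursion[cur])
        return -1;
    recursion[cur] = 1;
    return (int)cur;
}

/* Releases the guard taken by get_recursion_context(). */
void put_recursion_context(int rctx)
{
    recursion[rctx] = 0;
}

/* The returned number doubles as an index into the per-context buffers. */
char *buffer_for(int rctx)
{
    return buffers[rctx];
}
```

Because each context gets its own slot, an interrupt that arrives while task context holds its guard can still obtain a different number and a different buffer.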
    • Peter Zijlstra's avatar
      perf_events: Fix __perf_event_exit_task() vs. update_event_times() locking · f67218c3
      Peter Zijlstra authored
      Move the update_event_times() call in __perf_event_exit_task()
      into list_del_event() because that holds the proper lock
      (ctx->lock) and seems a more natural place to do the last time
      update.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091123103819.842455480@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f67218c3
    • Peter Zijlstra's avatar
      perf_events: Update the context time on exit · 5e942bb3
      Peter Zijlstra authored
      It appeared we did call update_event_times() on exit, but we
      failed to update the context time, which renders the former
      moot.
      
      Locking is a bit iffy: we call update_event_times() under
      ctx->mutex instead of ctx->lock. The next patch fixes this.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091123103819.764207355@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      5e942bb3
    • Peter Zijlstra's avatar
      perf_events: Disable events when we detach them · 2e2af50b
      Peter Zijlstra authored
      If we leave the event in STATE_INACTIVE, any read of the event
      after the detach will increase the running count but not the
      enabled count and cause funny scaling artefacts.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091123103819.689055515@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2e2af50b
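To see why mismatched enabled and running times produce odd scaling, here is a small sketch that assumes the usual extrapolation of a raw count by enabled/running. The function name `scale_count()` is hypothetical, and the zero-running case is handled in an arbitrary way for the sake of the example.

```c
#include <stdint.h>

/* Extrapolate a raw count from the running window to the enabled window. */
uint64_t scale_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
    if (running == 0)
        return 0;   /* never ran: nothing to extrapolate from */

    /* May overflow for very large inputs; acceptable for a sketch. */
    return raw * enabled / running;
}
```

When both times agree the count is unchanged. If the running time drifts ahead of the enabled time, as the message describes, a raw count of 1000 with enabled 100 and running 125 is reported as 800.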
    • Peter Zijlstra's avatar
      perf_events: Fix style nits · 6c2bfcbe
      Peter Zijlstra authored
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091123103819.613427378@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6c2bfcbe
    • Peter Zijlstra's avatar
      perf_events: Undo copy/paste damage · a66a3052
      Peter Zijlstra authored
      We had two almost identical functions, avoid the duplication.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20091123103819.537537928@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a66a3052
    • Ingo Molnar's avatar
      perf_events: Optimize the swcounter hotpath · a4234bfc
      Ingo Molnar authored
      The structure init creates a big zero-fill (the rep stos below),
      which shows up big time in perf annotate output:
      
                :      ffffffff810a859d <__perf_sw_event>:
           1.68 :      ffffffff810a859d:       55                      push   %rbp
           1.69 :      ffffffff810a859e:       41 89 fa                mov    %edi,%r10d
           0.01 :      ffffffff810a85a1:       49 89 c9                mov    %rcx,%r9
           0.00 :      ffffffff810a85a4:       31 c0                   xor    %eax,%eax
           1.71 :      ffffffff810a85a6:       b9 16 00 00 00          mov    $0x16,%ecx
           0.00 :      ffffffff810a85ab:       48 89 e5                mov    %rsp,%rbp
           0.00 :      ffffffff810a85ae:       48 83 ec 60             sub    $0x60,%rsp
           1.52 :      ffffffff810a85b2:       48 8d 7d a0             lea    -0x60(%rbp),%rdi
          85.20 :      ffffffff810a85b6:       f3 ab                   rep stos %eax,%es:(%rdi)
      
      None of the callees depends on the structure being pre-initialized,
      so only initialize ->addr. This gets rid of the zero-fill overhead.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a4234bfc
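The difference a4234bfc describes can be illustrated with a stand-in structure. `struct sample_data`, `consume()` and the two wrapper functions are hypothetical; the kernel's real structure and callees differ. A designated initializer zero-fills every member that is not named, whereas a plain declaration followed by one assignment writes only `addr`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sample structure; the real layout differs. */
struct sample_data {
    uint64_t addr;
    uint64_t other[10];   /* members the callees fill in themselves */
};

/* Stand-in for a callee: writes what it needs and reads only addr. */
static uint64_t consume(struct sample_data *data)
{
    data->other[0] = 1;
    return data->addr;
}

/* Before: the initializer makes the compiler zero the whole structure. */
uint64_t sw_event_zeroed(uint64_t addr)
{
    struct sample_data data = { .addr = addr };

    return consume(&data);
}

/* After: only addr is written; the rest stays untouched until a callee sets it. */
uint64_t sw_event_minimal(uint64_t addr)
{
    struct sample_data data;

    data.addr = addr;
    return consume(&data);
}
```

Both variants return the same value. The second is only safe because no callee reads a member before writing it, which is the property the commit message relies on.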
    • Ingo Molnar's avatar
      perf events: Do not generate function trace entries in perf code · 6e3d8330
      Ingo Molnar authored
      Decreases perf overhead when function tracing is enabled,
      by about 50%.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6e3d8330
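The message for 6e3d8330 does not say how tracing was suppressed, so the following is only one way to get a similar effect in plain C with GCC or Clang: marking a function with the `no_instrument_function` attribute so that compiler-inserted profiling hooks skip it. The macro name `notrace` follows the spelling the kernel has used for this attribute on GCC; the two functions are hypothetical.

```c
/* Shorthand for the compiler attribute; the macro name is illustrative here. */
#define notrace __attribute__((no_instrument_function))

/* Hypothetical hot-path helper: excluded from compiler-inserted tracing hooks. */
notrace unsigned long hot_path_add(unsigned long a, unsigned long b)
{
    return a + b;
}

/* Ordinary function: still instrumented when the build enables instrumentation. */
unsigned long cold_path_add(unsigned long a, unsigned long b)
{
    return a + b;
}
```

The attribute changes only whether the hooks are emitted; both functions compute the same result. An alternative with the same intent is to build the whole object file without the instrumentation flag.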
    • Simon Kaempflein's avatar
      perf record, x86: Print more intelligent error message when sampling fails · bfd45118
      Simon Kaempflein authored
      Print a more accurate error message when "perf record" fails on
      x86 because there is no APIC support.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bfd45118