1. 07 Jun, 2009 5 commits
    • perf_counter tools: Handle kernels with !CONFIG_PERF_COUNTER · 30c806a0
      Ingo Molnar authored
      If perf is run on a !CONFIG_PERF_COUNTER kernel right now it
      bails out with no messages or with confusing messages.
      
      Standardize this case some more and explain the situation.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf record: Fall back to cpu-clock-ticks if no PMU · 3da297a6
      Ingo Molnar authored
      On architectures/CPUs without PMU support but with perfcounters
      enabled 'perf record' currently fails because it cannot create a
      cycle based hw-perfcounter.
      
      Fall back to the cpu-clock-tick sw-perfcounter in this case, which
      is hrtimer based and will always work (as long as perfcounters
      are enabled).
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
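The fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `sketch_event` enum, the `open_counter_fn` callback, and `open_without_pmu()` are hypothetical names introduced here so the logic can be shown without a real counter syscall.

```c
/* Hypothetical event identifiers, for illustration only. */
enum sketch_event {
	SKETCH_HW_CPU_CYCLES,	/* cycle-based hardware counter, needs a PMU */
	SKETCH_SW_CPU_CLOCK,	/* hrtimer-based software counter */
};

/* Hypothetical opener: returns a descriptor >= 0, or -1 on failure. */
typedef int (*open_counter_fn)(enum sketch_event ev);

/*
 * Try the hardware cycle counter first. If it cannot be created,
 * retry with the software cpu-clock counter. The event that was
 * actually opened is stored in *used; -1 is returned if both fail.
 */
int open_sampling_counter(open_counter_fn open_counter,
			  enum sketch_event *used)
{
	int fd = open_counter(SKETCH_HW_CPU_CYCLES);

	if (fd >= 0) {
		*used = SKETCH_HW_CPU_CYCLES;
		return fd;
	}

	fd = open_counter(SKETCH_SW_CPU_CLOCK);
	if (fd >= 0)
		*used = SKETCH_SW_CPU_CLOCK;

	return fd;
}

/* Stand-in for a machine without a PMU: only the software event opens. */
int open_without_pmu(enum sketch_event ev)
{
	return ev == SKETCH_SW_CPU_CLOCK ? 3 : -1;
}
```

Passing the opener as a callback is only a device for keeping the sketch self-contained; a real tool would presumably call the counter-creation syscall directly.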
    • perf top: Fall back to cpu-clock-tick hrtimer sampling if no cycle counter available · 716c69fe
      Ingo Molnar authored
      On architectures/CPUs without PMU support but with perfcounters
      enabled 'perf top' currently fails because it cannot create a
      cycle based hw-perfcounter.
      
      Fall back to the cpu-clock-tick sw-perfcounter in this case, which
      is hrtimer based and will always work (as long as perfcounters
      are enabled).
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf stat: Continue even on counter creation error · 743ee1f8
      Ingo Molnar authored
      Before:
      
       $ perf stat ~/hackbench 5
      
       error: syscall returned with -1 (No such device)
      
      After:
      
       $ perf stat ~/hackbench 5
       Time: 1.640
      
       Performance counter stats for '/home/mingo/hackbench 5':
      
          6524.570382  task-clock-ticks     #       3.838 CPU utilization factor
                35704  context-switches     #       0.005 M/sec
                  191  CPU-migrations       #       0.000 M/sec
                 8958  page-faults          #       0.001 M/sec
        <not counted>  cycles
        <not counted>  instructions
        <not counted>  cache-references
        <not counted>  cache-misses
      
       Wall-clock time elapsed:  1699.999995 msecs
      
      Also add -v (--verbose) option to allow the printing of failed
      counter opens.
      
      Also, don't print 'inf' if the wall time is zero (due to jiffies
      granularity); skip printing the CPU utilization factor instead.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
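The two behaviours described above (reporting failed counters instead of aborting, and skipping the utilization factor when the wall time is zero) can be sketched as below. The function names and the exact column width are assumptions made for this illustration; they are not taken from the commit.

```c
#include <stdio.h>

/*
 * Format one line of counter output into buf. A counter that could not
 * be opened (fd < 0) is reported as "<not counted>" rather than being
 * treated as a fatal error. Returns what snprintf() returns.
 */
int format_counter(char *buf, size_t len, int fd,
		   unsigned long long count, const char *name)
{
	if (fd < 0)
		return snprintf(buf, len, "%14s  %s", "<not counted>", name);

	return snprintf(buf, len, "%14llu  %s", count, name);
}

/*
 * Compute the CPU utilization factor as task-clock time divided by
 * wall-clock time. Returns 0 without touching *factor when the wall
 * time is zero, so the caller can skip the printout instead of
 * printing 'inf'. Returns 1 when *factor was filled in.
 */
int cpu_utilization(double task_clock_msecs, double walltime_msecs,
		    double *factor)
{
	if (walltime_msecs <= 0.0)
		return 0;

	*factor = task_clock_msecs / walltime_msecs;
	return 1;
}
```

With the numbers from the sample output above, 6524.570382 msecs of task clock over 1699.999995 msecs of wall time gives roughly 3.838, which matches the printed CPU utilization factor.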
    • perf top: Wait for a minimal set of events before reading first snapshot · 2f01190a
      Frederic Weisbecker authored
      The first snapshot read often occurs before any events have
      been read from the mapped perfcounter files.
      
      Just wait until we have at least one event before starting the
      snapshot; otherwise the delay before the first set of entries is
      displayed may be long when the refresh rate is low.
      
      Note: we could also use a semaphore to wait until the
      "print_entries" number of events is reached, but this
      value is tunable and we can't ensure we will even reach it.
      We could also rely on a default minimum set of entries for the
      first refresh, say 15, but again, the minimal sample is
      tunable, and we could end up displaying nothing until we have a
      minimal default set of events, which can take some time with
      high sample filters.
      
      Hence this simple solution which partially covers the default
      case.
      
      [ Impact: fix display artifacts in perf top ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1244322643-6447-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 06 Jun, 2009 16 commits
    • perf annotate: Fix command line help text · 23b87116
      Ingo Molnar authored
      Arjan noticed this bug in the perf annotate help output:
      
          -s, --symbol <file>   symbol to annotate
      
      that should be <symbol> instead.
      Reported-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Initialize a stack variable before use · e9fbc9dc
      Arjan van de Ven authored
      The "perf report" utility crashed in some circumstances
      because the "sym" stack variable was not initialized before use
      (as also proven by valgrind).
      
      With this fix the crash goes away and valgrind no longer complains.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf annotate: Automatically pick up vmlinux in the local directory · 39273ee9
      Ingo Molnar authored
      Right now kernel debug info does not get resolved by default, because
      we don't know where to look for the vmlinux.
      
      The -k option can be used for that - but if no option is given, pick
      up vmlinux files in the current directory - in case a kernel hacker
      runs profiling from the source directory that the kernel was built in.
      
      The real solution would be to embed the location (and perhaps the
      date/timestamp) of the vmlinux file in /proc/kallsyms, so that
      tools can pick it up automatically.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Fix error condition in parse_aliases() · 8953645f
      Ingo Molnar authored
      gcc warned about this bug:
      
      util/parse-events.c: In function ‘parse_generic_hw_symbols’:
      util/parse-events.c:175: warning: comparison is always false due to limited range of data type
      util/parse-events.c:182: warning: comparison is always false due to limited range of data type
      util/parse-events.c:190: warning: comparison is always false due to limited range of data type
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
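The class of bug behind this gcc warning can be illustrated with a small sketch. The commit text does not show the affected code, so everything below is an assumption: `find_cache()`, the alias table, and both checker functions are hypothetical. The point is only that a -1 error value stored in a narrow unsigned type can never compare equal to -1 again.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical alias table, for illustration only. */
static const char *const cache_names[] = { "l1d", "l1i", "l2" };

/* Returns the index of name in the table, or -1 if it is unknown. */
int find_cache(const char *name)
{
	for (int i = 0; i < 3; i++)
		if (strcmp(cache_names[i], name) == 0)
			return i;
	return -1;
}

/*
 * Buggy variant: the result is narrowed to an 8-bit unsigned value, so
 * -1 becomes 255. In the comparison the value is promoted back to int
 * as 255, which never equals -1: the error path is unreachable.
 */
int is_unknown_buggy(const char *name)
{
	uint8_t idx = find_cache(name);

	return idx == -1;
}

/* Fixed variant: keep the result in a type that can represent -1. */
int is_unknown_fixed(const char *name)
{
	int idx = find_cache(name);

	return idx == -1;
}
```

Calling both variants with an unknown name such as "l3" shows the difference: the buggy check reports the name as known, while the fixed check reports it as unknown.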
    • perf_counter tools: Warning fixes on 32-bit · 7d37a0cb
      Arjan van de Ven authored
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/ · 86470930
      Ingo Molnar authored
      Several people have suggested that 'perf' has become a full-fledged
      tool that should be moved out of Documentation/. Move it to the
      (new) tools/ directory.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'linus' into perfcounters/core · 75b50322
      Ingo Molnar authored
      Merge reason: Pick up the latest fixes before the -v8 perfcounters
      	      release.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Add 'perf annotate' feature · 0b73da3f
      Ingo Molnar authored
      Add new perf sub-command to display annotated source code:
      
       $ perf annotate decode_tree_entry
      
      ------------------------------------------------
       Percent |	Source code & Disassembly of /home/mingo/git/git
      ------------------------------------------------
               :
               :	/home/mingo/git/git:     file format elf64-x86-64
               :
               :
               :	Disassembly of section .text:
               :
               :	00000000004a0da0 <decode_tree_entry>:
               :		*modep = mode;
               :		return str;
               :	}
               :
               :	static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size)
               :	{
          3.82 :	  4a0da0:	41 54                	push   %r12
               :		const char *path;
               :		unsigned int mode, len;
               :
               :		if (size < 24 || buf[size - 21])
          0.17 :	  4a0da2:	48 83 fa 17          	cmp    $0x17,%rdx
               :		*modep = mode;
               :		return str;
               :	}
               :
               :	static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size)
               :	{
          0.00 :	  4a0da6:	49 89 fc             	mov    %rdi,%r12
          0.00 :	  4a0da9:	55                   	push   %rbp
          3.37 :	  4a0daa:	53                   	push   %rbx
               :		const char *path;
               :		unsigned int mode, len;
               :
               :		if (size < 24 || buf[size - 21])
          0.08 :	  4a0dab:	76 73                	jbe    4a0e20 <decode_tree_entry+0x80>
          0.00 :	  4a0dad:	80 7c 16 eb 00       	cmpb   $0x0,-0x15(%rsi,%rdx,1)
          3.48 :	  4a0db2:	75 6c                	jne    4a0e20 <decode_tree_entry+0x80>
               :	static const char *get_mode(const char *str, unsigned int *modep)
               :	{
               :		unsigned char c;
               :		unsigned int mode = 0;
               :
               :		if (*str == ' ')
          1.94 :	  4a0db4:	0f b6 06             	movzbl (%rsi),%eax
          0.39 :	  4a0db7:	3c 20                	cmp    $0x20,%al
          0.00 :	  4a0db9:	74 65                	je     4a0e20 <decode_tree_entry+0x80>
               :			return NULL;
               :
               :		while ((c = *str++) != ' ') {
          0.06 :	  4a0dbb:	89 c2                	mov    %eax,%edx
               :			if (c < '0' || c > '7')
          1.99 :	  4a0dbd:	31 ed                	xor    %ebp,%ebp
               :		unsigned int mode = 0;
               :
               :		if (*str == ' ')
               :			return NULL;
               :
               :		while ((c = *str++) != ' ') {
          1.74 :	  4a0dbf:	48 8d 5e 01          	lea    0x1(%rsi),%rbx
               :			if (c < '0' || c > '7')
          0.00 :	  4a0dc3:	8d 42 d0             	lea    -0x30(%rdx),%eax
          0.17 :	  4a0dc6:	3c 07                	cmp    $0x7,%al
          0.00 :	  4a0dc8:	76 0d                	jbe    4a0dd7 <decode_tree_entry+0x37>
          0.00 :	  4a0dca:	eb 54                	jmp    4a0e20 <decode_tree_entry+0x80>
          0.00 :	  4a0dcc:	0f 1f 40 00          	nopl   0x0(%rax)
         16.57 :	  4a0dd0:	8d 42 d0             	lea    -0x30(%rdx),%eax
          0.14 :	  4a0dd3:	3c 07                	cmp    $0x7,%al
          0.00 :	  4a0dd5:	77 49                	ja     4a0e20 <decode_tree_entry+0x80>
               :				return NULL;
               :			mode = (mode << 3) + (c - '0');
          3.12 :	  4a0dd7:	0f b6 c2             	movzbl %dl,%eax
               :		unsigned int mode = 0;
               :
               :		if (*str == ' ')
               :			return NULL;
               :
               :		while ((c = *str++) != ' ') {
          0.00 :	  4a0dda:	0f b6 13             	movzbl (%rbx),%edx
         16.74 :	  4a0ddd:	48 83 c3 01          	add    $0x1,%rbx
               :			if (c < '0' || c > '7')
               :				return NULL;
               :			mode = (mode << 3) + (c - '0');
      
      The first column is the percentage of samples that arrived on that
      particular line - relative to the total cost of the function.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
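The percentage column explained in this commit message is a per-function ratio. A minimal sketch of that arithmetic, assuming per-line and per-function sample counts are already available (the function name and parameters are hypothetical):

```c
/*
 * Share of a function's samples that landed on one line, in percent.
 * A function with no samples yields 0.0 rather than a division by zero.
 */
double line_percent(unsigned long line_hits, unsigned long function_hits)
{
	if (function_hits == 0)
		return 0.0;

	return 100.0 * (double)line_hits / (double)function_hits;
}
```

Because the denominator is the function's own total, the percentages describe where time goes inside that function, not how hot the function is in the overall profile.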
    • perf_counter tools: Prepare for 'perf annotate' · 8035e428
      Ingo Molnar authored
      Prepare for the 'perf annotate' implementation by splitting off
      builtin-annotate.c from builtin-report.c.
      
      ( We keep this commit separate to ease the later librarization
        of the facilities that perf-report and perf-annotate share. )
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Tidy up manpage details · 6e6b754f
      Ingo Molnar authored
      Also fix a misalignment in usage string printing.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Uniform help printouts · 502fc5c7
      Ingo Molnar authored
      Also add perf list to command-list.txt.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Add help for perf list · 386b05e3
      Thomas Gleixner authored
      Also update other areas of the help texts.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Fix cache-event printout · 8faf3b54
      Ingo Molnar authored
      Also standardize the cache printout (so that it can be pasted back
      into the command) and sort out the aliases.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Add 'perf list' to list available events · 86847b62
      Thomas Gleixner authored
      perf list: List all the available event types which can be used in
      -e (--event) options.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Implement generalized cache event types · 8326f44d
      Ingo Molnar authored
      Extend generic event enumeration with the PERF_TYPE_HW_CACHE
      method.
      
      This is a 3-dimensional space:
      
             { L1-D, L1-I, L2, ITLB, DTLB, BPU } x
             { load, store, prefetch } x
             { accesses, misses }
      
      User-space passes in the 3 coordinates and the kernel provides
      a counter. (if the hardware supports that type and if the
      combination makes sense.)
      
      Combinations that make no sense produce a -EINVAL.
      Combinations that are not supported by the hardware produce -ENOTSUP.
      
      Extend the tools to deal with this, and rewrite the event symbol
      parsing code with various popular aliases for the units and
      access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
      both valid aliases.
      
      ( x86 is supported for now, with the Nehalem event table filled in,
        and with Core2 and Atom having placeholder tables. )
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
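The three coordinates described above can be packed into a single config value. The sketch below uses the byte-wise layout that later perf_event_open documentation describes (unit in bits 0-7, operation in bits 8-15, result in bits 16-23); whether this exact commit used that layout is an assumption. The enum names and their numbering are hypothetical and simply follow the order of the list in the commit message.

```c
#include <stdint.h>

/* Hypothetical enums following the order given in the commit message. */
enum cache_unit   { UNIT_L1D, UNIT_L1I, UNIT_L2, UNIT_ITLB, UNIT_DTLB, UNIT_BPU };
enum cache_op     { OP_LOAD, OP_STORE, OP_PREFETCH };
enum cache_result { RESULT_ACCESS, RESULT_MISS };

/* Pack the three coordinates into one 64-bit config value. */
uint64_t cache_config(enum cache_unit unit, enum cache_op op,
		      enum cache_result result)
{
	return (uint64_t)unit |
	       ((uint64_t)op << 8) |
	       ((uint64_t)result << 16);
}

/* Split a config value back into its three coordinates. */
void cache_config_decode(uint64_t config, enum cache_unit *unit,
			 enum cache_op *op, enum cache_result *result)
{
	*unit   = (enum cache_unit)(config & 0xff);
	*op     = (enum cache_op)((config >> 8) & 0xff);
	*result = (enum cache_result)((config >> 16) & 0xff);
}
```

In this sketch "L1-D load misses" becomes unit 0, operation 0, result 1. Validating a combination (the -EINVAL and -ENOTSUP cases above) would need a per-CPU table, which is left out here.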
    • perf_counter: Separate out attr->type from attr->config · a21ca2ca
      Ingo Molnar authored
      Counter type is a frequently used value and we do a lot of
      bit juggling by encoding and decoding it from attr->config.
      
      Clean this up by creating a separate attr->type field.
      
      Also clean up the various similarly complex user-space bits
      all around counter attribute management.
      
      The net improvement is significant, and it will be easier
      to add a new major type (which is what triggered this cleanup).
      
      (This changes the ABI, all tools are adapted.)
      (PowerPC build-tested.)
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
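The cleanup described in this commit can be illustrated by contrasting a packed encoding with a separate field. The bit layout below (type in the top 8 bits of config) is a hypothetical stand-in for the old encoding, and `struct sketch_attr` is not the real attribute structure; both exist only to show why a dedicated type field removes the bit juggling.

```c
#include <stdint.h>

/* Hypothetical packed layout: type in the top 8 bits, event id below. */
#define PACKED_TYPE_SHIFT	56
#define PACKED_EVENT_MASK	((UINT64_C(1) << PACKED_TYPE_SHIFT) - 1)

/* Old style: every producer has to shift and mask to build config. */
uint64_t packed_make(uint32_t type, uint64_t event)
{
	return ((uint64_t)type << PACKED_TYPE_SHIFT) |
	       (event & PACKED_EVENT_MASK);
}

/* Old style: every consumer has to decode the type again. */
uint32_t packed_type(uint64_t config)
{
	return (uint32_t)(config >> PACKED_TYPE_SHIFT);
}

/* New style: the type is a plain field next to config. */
struct sketch_attr {
	uint32_t type;
	uint64_t config;
};
```

With the separate field, checking the counter type is a direct read of `attr.type`, and config keeps its full width for the event itself.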
  3. 05 Jun, 2009 16 commits
  4. 04 Jun, 2009 3 commits