1. 05 Oct, 2009 4 commits
  2. 04 Oct, 2009 1 commit
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Remove show_mask bitmask · ec218fc4
      Arnaldo Carvalho de Melo authored
      As it was not being exposed via any command line and with --dsos/--comms
      we can do this and even more, like asking for just kernel + some module:
      
      [root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\]
      --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15
       # Samples: 619669
       #
       # Overhead          Command  Shared Object  Symbol
       # ........  ...............  .............  ......
       #
            7.12%          swapper  [kernel]       [k] read_hpet
            6.86%             init  [kernel]       [k] read_hpet
            6.22%             init  [kernel]       [k] mwait_idle_with_hints
            5.34%          swapper  [kernel]       [k] mwait_idle_with_hints
            3.01%          firefox  [kernel]       [.] vread_hpet
            2.14%             Xorg  [drm]          [k] drm_clflush_pages
            2.09%           pidgin  [kernel]       [.] vread_hpet
            1.58%     npviewer.bin  [kernel]       [.] vread_hpet
            1.37%          swapper  [kernel]       [k] hpet_next_event
            1.23%             Xorg  [kernel]       [k] read_hpet
      [root@doppio linux-2.6-tip]#
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20091003233048.GA30535@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ec218fc4
  3. 03 Oct, 2009 1 commit
  4. 02 Oct, 2009 1 commit
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Rewrite and improve support for kernel modules · 439d473b
      Arnaldo Carvalho de Melo authored
      Representing modules as struct map entries, backed by a DSO, etc,
      using /proc/modules to find where the module is loaded.
      
      DSOs now can have a short and long name, so that in verbose mode we
      can show exactly which .ko or vmlinux image was used.
      
      As kernel modules now are a DSO separate from the kernel, we can
      ask for just the hits for a particular set of kernel modules, just
      like we can do with shared libraries:
      
      [root@doppio linux-2.6-tip]# perf report -n --vmlinux
      /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15
          84.58%      13266             Xorg  [k] drm_clflush_pages
           4.02%        630             Xorg  [k] trace_kmalloc.clone.0
           3.95%        619             Xorg  [k] drm_ioctl
           2.07%        324             Xorg  [k] drm_addbufs
           1.68%        263             Xorg  [k] drm_gem_close_ioctl
           0.77%        120             Xorg  [k] drm_setmaster_ioctl
           0.70%        110             Xorg  [k] drm_lastclose
           0.68%        106             Xorg  [k] drm_open
           0.54%         85             Xorg  [k] drm_mm_search_free
      [root@doppio linux-2.6-tip]#
      
      Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko
      would have the same effect. Allowing specifying just 'drm.ko' is left
      for another patch.
      
      Processing kallsyms so that per kernel module struct map are
      instantiated was also left for another patch. That will allow
      removing the module name from each of its symbols.
      
      struct symbol was reduced by removing the ->module backpointer and
      moving it (well now the map) to struct symbol_entry in perf top,
      that is its only user right now.
      
      The total linecount went down by ~500 lines.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Avi Kivity <avi@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      439d473b
  5. 01 Oct, 2009 1 commit
  6. 30 Sep, 2009 4 commits
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Remove dead code · cad30714
      Arnaldo Carvalho de Melo authored
      Several variables are not used at all, cut'n'paste leftovers.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      LKML-Reference: <20090928200818.GF3361@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cad30714
    • Arnaldo Carvalho de Melo's avatar
      perf sched: Remove dead code · a80deb62
      Arnaldo Carvalho de Melo authored
      Several variables are not used at all, cut'n'paste leftovers.
      
      Also check if the sample_type is RAW earlier, to avoid needless
      searches.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a80deb62
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Use rb_tree for maps · 1b46cddf
      Arnaldo Carvalho de Melo authored
      Threads can have many and kernel modules will be represented as a
      tree of maps as well.
      
      Ah, and for a perf.data with 146607 samples:
      
      Before:
      
      [root@doppio ~]# perf stat -r 5 perf report > /dev/null
      
       Performance counter stats for 'perf report' (5 runs):
      
           699.823680  task-clock-msecs         #      0.991 CPUs    ( +-   0.454% )
                   74  context-switches         #      0.000 M/sec   ( +-   1.709% )
                    2  CPU-migrations           #      0.000 M/sec   ( +-  17.008% )
                23114  page-faults              #      0.033 M/sec   ( +-   0.000% )
           1381257019  cycles                   #   1973.721 M/sec   ( +-   0.290% )
           1456894438  instructions             #      1.055 IPC     ( +-   0.007% )
             18779818  cache-references         #     26.835 M/sec   ( +-   0.380% )
               641799  cache-misses             #      0.917 M/sec   ( +-   1.200% )
      
          0.705972729  seconds time elapsed   ( +-   0.501% )
      
      [root@doppio ~]#
      
      After
      
       Performance counter stats for 'perf report' (5 runs):
      
           691.261451  task-clock-msecs         #      0.993 CPUs    ( +-   0.307% )
                   72  context-switches         #      0.000 M/sec   ( +-   0.829% )
                    6  CPU-migrations           #      0.000 M/sec   ( +-  18.409% )
                23127  page-faults              #      0.033 M/sec   ( +-   0.000% )
           1366395876  cycles                   #   1976.670 M/sec   ( +-   0.153% )
           1443136016  instructions             #      1.056 IPC     ( +-   0.012% )
             17956402  cache-references         #     25.976 M/sec   ( +-   0.325% )
               661924  cache-misses             #      0.958 M/sec   ( +-   1.335% )
      
          0.696127275  seconds time elapsed   ( +-   0.377% )
      
      I.e. we see some speedup too.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      LKML-Reference: <20090928174846.GA3361@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1b46cddf
    • John Kacur's avatar
      perf tools: Put common histogram functions in their own file · 3d1d07ec
      John Kacur authored
      Move histogram related functions into their own files (hist.c and
      hist.h) and make use of them in builtin-annotate.c and
      builtin-report.c.
      Signed-off-by: default avatarJohn Kacur <jkacur@redhat.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3d1d07ec
  7. 24 Sep, 2009 7 commits
  8. 23 Sep, 2009 2 commits
    • Mike Galbraith's avatar
      perf tools: Fix module symbol loading bug · 508c4d08
      Mike Galbraith authored
      Avi Kivity reported 'perf annotate' failures with modules, the
      requested function was not annotated.
      
      If there are no modules currently loaded, or the last module
      scanned is not loaded, dso__load_modules() steps on the value from
      dso__load_vmlinux(), so we happily load the kallsyms symbols on top
      of what we've already loaded.
      
      Fix that such that the total count of symbols loaded is returned.
      Should module symbol load fail after parsing of vmlinux, is's a
      hard failure, so do not silently fall-back to kallsyms.
      Reported-by: default avatarAvi Kivity <avi@redhat.com>
      Signed-off-by: default avatarMike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: rostedt@goodmis.org
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <1253697658.11461.36.camel@marge.simson.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      508c4d08
    • Peter Zijlstra's avatar
      perf_event, x86: Fix 'perf sched record' crashing the machine · 7d428966
      Peter Zijlstra authored
      Chris Malley reported that 'perf sched record' sometimes
      crashes his box with:
      
      [  389.272175] BUG: unable to handle kernel paging request at ffffb300
      [  389.272294] IP: [<c011b0bd>] default_send_IPI_self+0x1d/0x50
      [  389.272366] *pde = 0073f067 *pte = 00000000
      [  389.274708] Call Trace:
      [  389.274752]  [<c010e3b4>] ?  set_perf_event_pending+0x14/0x20
      [  389.274801]  [<c01b9751>] ?  perf_output_unlock+0x121/0x1a0
      [  389.274848]  [<c01b981a>] ? perf_output_end+0x4a/0x70
      [  389.274893]  [<c01ba690>] ?  __perf_event_overflow+0x240/0x2f0
      [  389.274942]  [<c030963e>] ? atomic64_cmpxchg+0x1e/0x30
      [  389.274988]  [<c01ba8f4>] ?  perf_swevent_ctx_event+0x1b4/0x1c0
      [  389.275035]  [<c01ba773>] ?  perf_swevent_ctx_event+0x33/0x1c0
      [  389.275081]  [<c01ba9a7>] ? do_perf_sw_event+0xa7/0x160
      [  389.275127]  [<c01baae2>] ? perf_tp_event+0x82/0xa0
      [  389.275174]  [<c012e9c6>] ?  ftrace_profile_sched_stat_runtime+0xe6/0x120
      [  389.275224]  [<c012e8e0>] ?  ftrace_profile_sched_stat_runtime+0x0/0x120
      [  389.275273]  [<c013c85a>] ? update_curr+0x18a/0x230
      [  389.275318]  [<c013cdc5>] ?  put_prev_task_fair+0x155/0x160
      [  389.275366]  [<c01618b5>] ? sched_clock_cpu+0xd5/0x110
      [  389.275413]  [<c04e7525>] ? _spin_lock_irq+0x45/0x50
      [  389.275458]  [<c04e424e>] ? schedule+0x20e/0xb10
      
      The problem is that the box has no lapic enabled:
      
        [    0.042445] Local APIC not detected. Using dummy APIC emulation.
      
      The below seems like the best fix. We disabled all lapic bits, except
      the self-IPI-resend logic.
      Reported-by: default avatarChris Malley <mail@chrismalley.co.uk>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <7863dc4c0909221409v7893bfd3o4b590d5951a233ba@mail.gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7d428966
  9. 22 Sep, 2009 3 commits
    • Anton Blanchard's avatar
      perf_event: Update PERF_EVENT_FORK header definition · a6f10a2f
      Anton Blanchard authored
      PERF_EVENT_FORK always outputs the time field, so update the header
      to reflect this.
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090922123424.GD19453@kryten>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a6f10a2f
    • Ingo Molnar's avatar
      perf stat: Fix zero total printouts · c7f7fea3
      Ingo Molnar authored
      Before:
      
                 0  sched:sched_switch #        nan M/sec
      
      After:
      
                 0  sched:sched_switch #      0.000 M/sec
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c7f7fea3
    • Paul Mackerras's avatar
      perf_event, powerpc: Fix compilation after big perf_counter rename · a8f90e90
      Paul Mackerras authored
      This fixes two places in the powerpc perf_event (perf_counter) code
      where 'list_entry' needs to be changed to 'group_entry', but were
      missed in commit 65abc865 ("perf_counter: Rename list_entry ->
      group_entry, counter_list -> group_list").
      
      This also changes 'event' back to 'counter' in a couple of
      contexts:
      
      * Field and function names that deal with the limited-function
        counters: it's really the hardware counters whose function is
        limited, not the events that they count.  Hence:
      
        MAX_LIMITED_HWEVENTS -> MAX_LIMITED_HWCOUNTERS
        limited_event -> limited_counter
        freeze/thaw_limited_events -> freeze/thaw_limited_counters
      
      * The machine-specific PMU description struct (struct power_pmu): this
        renames 'n_event' back to 'n_counter' since it really describes how
        many hardware counters the machine has.  (Renaming this back avoids
        a compile error in each of the machine-specific PMU back-ends where
        they initialize their power_pmu struct.)
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@ozlabs.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <19128.4280.813369.589704@cargo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a8f90e90
  10. 21 Sep, 2009 16 commits