1. 31 Oct, 2012 5 commits
    • perf tools: Handle --version string generation on machines without git · 688b2c2f
      Arnaldo Carvalho de Melo authored
      If git is installed we'll have a 'perf --version' output of this form:
      
      $ make -j8 -C tools/perf/ O=/home/acme/git/build/perf install
      $ perf --version
      perf version 3.7.rc3.g3afad6
      
      Now on a machine without git installed:
      
      $ mv  /home/acme/bin/git /home/acme/bin/git.OFF
      $ make -j8 -C tools/perf/ O=/home/acme/git/build/perf install
      $ perf --version
      perf version 3.7.0-rc2
      
      That is, no error message due to git not being installed will appear on the
      screen and instead the version string in the top level Makefile will be
      used.
      Requested-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-am6yp6phvxyjmyndxogpunjv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
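The fallback described in the commit above can be sketched in POSIX shell. This is only an illustration, not the code in util/PERF-VERSION-GEN: fallback_version is a hypothetical helper, and the git describe options shown are an assumption about how a tag would be looked up.

```shell
#!/bin/sh
# Sketch only: prefer a git-derived tag, fall back silently to a default.
# fallback_version is a hypothetical name, not part of the perf tree.
fallback_version() {
    default_version=$1
    # "command -v" is the POSIX way to test whether a program is in PATH.
    # Errors from git are discarded so that nothing is printed when git is
    # missing or when the tree is not a git checkout.
    if command -v git >/dev/null 2>&1 &&
       tag=$(git describe --abbrev=0 --match 'v[0-9]*' 2>/dev/null) &&
       [ -n "$tag" ]
    then
        printf '%s\n' "$tag"
    else
        printf '%s\n' "$default_version"
    fi
}
```

Here the default would be the version string from the top level Makefile, for example 3.7.0-rc2 in the output shown in the commit message.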
    • perf tools: Further speed up the perf build · 0e2af956
      Ingo Molnar authored
      There's another source of overhead in the perf version string generator:
      
         git update-index -q --refresh
      
      ... which will iterate the whole checked out tree. This can be pretty
      slow on NFS volumes, but takes some time even with local SSD disks and a
      fully cached kernel tree:
      
       $ perf stat --null --repeat 3 --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
      
       Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):
      
             0.306999221 seconds time elapsed                                          ( +-  0.56% )
      
      So remove the .dirty differentiator as well - it adds little information
      because locally patched git trees are common, but seldom are the perf
      tools modified.
      
      So a lot of version strings are reported as 'dirty' while in fact they
      are pristine perf builds. For example 99% of my perf builds are not
      patched but the kernel tree is slightly patched, which adds the .dirty
      tag.
      
      Eliminating that tag speeds up version generation by another order of
      magnitude:
      
       $ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
       PERF_VERSION = 3.7.rc3.g4b0bd3
       PERF_VERSION = 3.7.rc3.g4b0bd3
       PERF_VERSION = 3.7.rc3.g4b0bd3
      
       Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):
      
             0.021270923 seconds time elapsed                                          ( +-  1.94% )
      
      (Also clean up some of the comments around this code.)
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20121030085441.GC8245@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
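To make the removed step concrete, here is a minimal sketch of how a .dirty tag could be derived. It assumes the list of modified files comes from running git update-index -q --refresh followed by git diff-index --name-only HEAD -- (the second command is an assumption, the commit message only names the first). mark_dirty is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: append ".dirty" when the list of changed files is non-empty.
# mark_dirty is a hypothetical name, not part of the perf tree.
#
# In a real checkout the second argument could be produced with:
#   git update-index -q --refresh
#   changed=$(git diff-index --name-only HEAD --)
# which walks the whole checked out tree, the cost the commit removes.
mark_dirty() {
    version=$1
    changed_files=$2
    if [ -n "$changed_files" ]; then
        printf '%s.dirty\n' "$version"
    else
        printf '%s\n' "$version"
    fi
}
```

Because the check covers the whole kernel tree, a change to any file outside tools/perf is enough to tag an otherwise pristine perf build as dirty.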
    • perf tools: Speed up the perf build time by simplifying the perf --version string generation · acddedfb
      Ingo Molnar authored
      Building perf is pretty slow on trees that have a lot of commits
      relative to the nearest Git tag. This slowness manifests itself during
      version string generation:
      
       $ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
       PERF_VERSION = 3.7.rc3.1458.g5399b3b
       PERF_VERSION = 3.7.rc3.1458.g5399b3b
       PERF_VERSION = 3.7.rc3.1458.g5399b3b
      
       Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):
      
             2.857503976 seconds time elapsed                                          ( +-  0.22% )
      
      The build can be even slower than that when done over NFS volumes.
      
      The reason for the slowness is that util/PERF-VERSION-GEN uses "git
      describe" to generate the string, which has to count the "number of
      commits distance" from the nearest tag - the ".1458." count in the
      output above. For that Git had to extract and decompress 1458 Git
      objects, which takes time and bandwidth.
      
      But this "number of commits" value is mostly irrelevant in practice. We
      either want to know an approximate tag name, or we want to know the
      precise sha1.
      
      So this patch simplifies the version string to:
      
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
      
      which speeds up the version string generation script by an order of
      magnitude:
      
       $ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
       PERF_VERSION = 3.7.rc3.g5399b3b.dirty
      
       Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):
      
             0.307633559 seconds time elapsed                                          ( +-  0.84% )
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20121030084600.GB8245@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
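The simplified string in the commit above is the nearest tag plus an abbreviated commit id, with no commit count in between. A minimal sketch of that composition follows. compose_version is a hypothetical helper, and the git commands in the comment are an assumption about how the two inputs could be obtained without walking the history.

```shell
#!/bin/sh
# Sketch only: build "3.7.rc3.g5399b3b" from a tag and a commit id.
# compose_version is a hypothetical name, not part of the perf tree.
#
# Possible inputs in a real checkout (assumed, not taken from the patch):
#   tag=$(git describe --abbrev=0 --match 'v[0-9]*')   # nearest tag only
#   cid=$(git log -1 --abbrev=4 --pretty=format:%h)    # abbreviated sha1
compose_version() {
    tag=$1
    cid=$2
    # Strip the leading "v", turn "-" into ".", then append ".g<commit id>".
    base=$(printf '%s' "${tag#v}" | tr '-' '.')
    printf '%s.g%s\n' "$base" "$cid"
}
```

For example, compose_version v3.7-rc3 5399b3b prints 3.7.rc3.g5399b3b, the form shown in the measurements above without the .dirty suffix.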
    • perf tools: Add info about cross compiling for Android ARM · cd69ef88
      Joonsoo Kim authored
      Without defining ARCH=arm, building perf for Android ARM will fail,
      because it needs architecture specific files.
      
      So add the relevant information to the Android documentation.
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Irina Tirdea <irina.tirdea@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1351518066-4791-1-git-send-email-js1304@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Warn about missing libelf · d30ff295
      Namhyung Kim authored
      When perf detects no libelf during the build, it uses an internal
      minimal ELF parser instead of libelf.  But as that parser only supports
      minimal functionality, the 'probe' builtin command is also disabled.

      Currently the user is not warned about this.  Fix it.
      
      $ sudo apt-get remove libelf-dev
      $ make
          CHK -fstack-protector-all
          CHK -Wstack-protector
          CHK -Wvolatile-register-var
          CHK bionic
          CHK libelf
          CHK glibc
      Makefile:491: No libelf found, disables 'probe' tool, please install elfutils-libelf-devel/libelf-dev
          CHK libunwind
          CHK libaudit
      
      $ make NO_LIBELF=1
          CHK -fstack-protector-all
          CHK -Wstack-protector
          CHK -Wvolatile-register-var
          CHK bionic
          CHK libaudit
      Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-8ww8zc4hhpxabfskxs3u5ede@git.kernel.org
      [ committer note: The package needed is elfutils-libelf-devel, not elfutils-devel ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 30 Oct, 2012 2 commits
    • perf/x86: Fix sparse warnings · 95d18aa2
      Peter Huewe authored
      FYI, there are new sparse warnings:
      
       arch/x86/kernel/cpu/perf_event.c:1356:18: sparse: symbol 'events_attr' was not declared. Should it be static?
      
      This patch makes it static and also adds the static keyword to
      fix arch/x86/kernel/cpu/perf_event.c:1344:9: warning: symbol
      'events_sysfs_show' was not declared.
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Cc: fengguang.wu@intel.com
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/n/tip-lerdpXlnruh0yvWs2owwuizl@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'perf-core-for-mingo' of... · 8748dd9b
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements, fixes and code move from Arnaldo Carvalho de Melo:
      
       * Initialize the 'page_size' variable in the python binding. This was
         sent for perf/urgent by mistake and Ingo removed it when merging,
         which fixed the problem for perf/urgent. But when perf/urgent was
         merged with perf/core, where that initialization is needed, the
         python binding mmap call started to fail. Fix it by initializing
         page_size again.
      
       * Add a browser for 'perf script' and make it available from the report
         and annotate browsers. It does filtering to find the scripts that
         handle events found in the perf.data file used. From Feng Tang
      
       * Move some functions from symbol.c to more appropriate files, creating
         dso.[ch] in the process, no code changes. From Jiri Olsa
      
       * Fix mmap error output message for when perf_mmap fails and returns
         !-EPERM, where the default for mmap_pages, INT_MAX, was causing a
         !power of 2 error message, fix from Jiri Olsa.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 29 Oct, 2012 13 commits
  4. 28 Oct, 2012 1 commit
  5. 26 Oct, 2012 11 commits
  6. 25 Oct, 2012 4 commits
    • perf trace: Use sched:sched_stat_runtime to provide a thread summary · 1302d88e
      Arnaldo Carvalho de Melo authored
      [root@sandy ~]# perf trace --sched --duration 0.100 --pid `pidof firefox`
      <SNIP>
       17079.847 ( 0.009 ms): 17643 poll(ufds: 140037623086496, nfds: 11, timeout_msecs: 0) = 0 Timeout
       17079.892 ( 0.010 ms): 17643 read(fd: 4, buf: 140038178943092, count: 4096         ) = -1 EAGAIN Resource temporarily unavailable
       17079.921 ( 0.013 ms): 17643 poll(ufds: 140037623086496, nfds: 11, timeout_msecs: 0) = 0 Timeout
       17079.949 ( 0.009 ms): 17643 read(fd: 4, buf: 140038178943092, count: 4096         ) = -1 EAGAIN Resource temporarily unavailable
      ^C
       _____________________________________________________________________
       __)    Summary of events    (__
      
                    [ task - pid ]     [ events ] [ ratio ]  [ runtime ]
       _____________________________________________________________________
      
                   firefox - 17643 :      18013   [ 72.2% ]    359.110 ms
                   firefox - 17663 :         41   [  0.2% ]     21.439 ms
                   firefox - 17664 :       6840   [ 27.4% ]    133.642 ms
                   firefox - 17667 :         46   [  0.2% ]      0.682 ms
      [root@sandy ~]#
      
      This is equivalent to the 'perf trace summary' subcommand in the tmp.perf/trace2
      branch.
      
      Another example, setting a huge duration filter to get just a system
      wide summary:
      
      [root@sandy ~]# perf trace --duration 10000.0 --sched
      ^C
       _____________________________________________________________________
       __)    Summary of events    (__
      
                    [ task - pid ]     [ events ] [ ratio ]  [ runtime ]
       _____________________________________________________________________
      
                 scsi_eh_1 - 258   :         15   [  0.0% ]      0.133 ms
              kworker/0:1H - 322   :         13   [  0.0% ]      0.032 ms
               jbd2/dm-0-8 - 384   :          4   [  0.0% ]      0.115 ms
               flush-253:0 - 470   :          1   [  0.0% ]      0.027 ms
                   firefox - 950   :       4783   [  0.1% ]     24.863 ms
                   firefox - 992   :       1883   [  0.1% ]      6.808 ms
                   firefox - 995   :         35   [  0.0% ]      0.111 ms
               ksoftirqd/6 - 4362  :          2   [  0.0% ]      0.005 ms
               ksoftirqd/7 - 4365  :          1   [  0.0% ]      0.007 ms
                      Xorg - 4671  :        148   [  0.0% ]      0.912 ms
           gnome-settings- - 4846  :         14   [  0.0% ]      0.086 ms
           seahorse-daemon - 4847  :         14   [  0.0% ]      0.092 ms
               gnome-panel - 4875  :         46   [  0.0% ]      0.159 ms
           gnome-power-man - 4918  :         16   [  0.0% ]      0.065 ms
           gvfs-afc-volume - 4992  :         77   [  0.0% ]      0.136 ms
           gnome-screensav - 5114  :         24   [  0.0% ]      0.128 ms
                     xchat - 8082  :        466   [  0.0% ]      2.019 ms
                  synergyc - 8369  :        941   [  0.0% ]      3.291 ms
                  synergyc - 8371  :         85   [  0.0% ]      1.817 ms
               jbd2/dm-4-8 - 9352  :          4   [  0.0% ]      0.109 ms
                   rpcbind - 9786  :          3   [  0.0% ]      0.017 ms
              rtkit-daemon - 12802 :         10   [  0.0% ]      0.038 ms
              rtkit-daemon - 12803 :          8   [  0.0% ]      0.000 ms
             udisks-daemon - 13020 :         27   [  0.0% ]      0.240 ms
               kworker/7:0 - 14651 :        669   [  0.0% ]      2.616 ms
               kworker/5:1 - 16220 :          2   [  0.0% ]      0.069 ms
               kworker/4:0 - 19776 :         13   [  0.0% ]      0.176 ms
                   openvpn - 20131 :        133   [  0.0% ]      0.762 ms
           plugin-containe - 20508 :      60658   [  1.7% ]    131.153 ms
              npviewer.bin - 20520 :      72208   [  2.0% ]    138.945 ms
              npviewer.bin - 20542 :         35   [  0.0% ]      0.074 ms
              npviewer.bin - 20543 :         30   [  0.0% ]      0.074 ms
              npviewer.bin - 20547 :         35   [  0.0% ]      0.092 ms
              npviewer.bin - 20552 :         35   [  0.0% ]      0.093 ms
                      sshd - 20645 :         32   [  0.0% ]      0.071 ms
              npviewer.bin - 21053 :         35   [  0.0% ]      0.074 ms
              npviewer.bin - 21054 :         35   [  0.0% ]      0.097 ms
               kworker/0:2 - 21169 :        149   [  0.0% ]      1.143 ms
               kworker/3:0 - 22171 :        113   [  0.0% ]     96.892 ms
               flush-253:4 - 22410 :          1   [  0.0% ]      0.028 ms
               kworker/6:0 - 24581 :         25   [  0.0% ]      0.275 ms
               kworker/1:0 - 25572 :          4   [  0.0% ]      0.103 ms
               kworker/2:1 - 26299 :        138   [  0.0% ]      1.440 ms
               kworker/0:0 - 26325 :          1   [  0.0% ]      0.003 ms
                      perf - 26330 :    3506967   [ 96.1% ]   6648.310 ms
      [root@sandy ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/n/tip-mzuli0srnxyi1o029py6537x@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
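The [ ratio ] column in the firefox summary above is consistent with each thread's share of the total event count: the four threads add up to 24940 events, and 18013 of those is 72.2%. A minimal sketch of that arithmetic, assuming this reading of the column is right. event_ratio is a hypothetical helper, not code from perf.

```shell
#!/bin/sh
# Sketch only: one thread's events as a percentage of all events.
# event_ratio is a hypothetical name, not part of the perf tree.
event_ratio() {
    thread_events=$1
    total_events=$2
    # awk does the floating point division; one decimal place matches the
    # precision of the summary table.
    awk -v n="$thread_events" -v total="$total_events" \
        'BEGIN { printf "%.1f%%\n", 100 * n / total }'
}
```

With the numbers from the first example, event_ratio 6840 24940 gives the 27.4% reported for pid 17664.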
    • perf trace: Count number of events for each thread and globally · efd5745e
      Arnaldo Carvalho de Melo authored
      The nr_events in trace__run was local, but we will need it in other
      trace methods, move it to struct trace.
      
      We'll also need the number of events per thread, so introduce a
      nr_events method for that in struct thread_trace.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ksutaz0mtejnf7e6az3ca1td@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Don't stop synthesizing threads when one vanishes · ba361c92
      Arnaldo Carvalho de Melo authored
      The perf_event__synthesize_threads routine synthesizes all the existing
      threads in the system, because we don't have any kernel facilities to
      ask for PERF_RECORD_{FORK,MMAP,COMM} for existing threads.
      
      It was returning an error as soon as one thread couldn't be synthesized,
      which is a bit extreme when, for instance, a forkish workload is
      running, like a kernel compile.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-i7oas1eodpoer2bx38fwyasv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
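The change in behaviour can be illustrated outside perf with a small shell loop. This is not the C code of perf_event__synthesize_threads, only an analogy: list_thread_names is a hypothetical helper, and the layout it reads (one directory per pid containing a comm file, as under /proc on Linux) is an assumption made for the example.

```shell
#!/bin/sh
# Sketch only: walk a list of pids and keep going when one has vanished.
# list_thread_names is a hypothetical name, not part of the perf tree.
# Usage: list_thread_names PROC_ROOT PID...
list_thread_names() {
    proc_root=$1
    shift
    skipped=0
    for pid in "$@"; do
        # A task can exit between being enumerated and being read, which
        # is common under fork-heavy workloads. Skip it instead of
        # aborting the whole walk.
        if ! name=$(cat "$proc_root/$pid/comm" 2>/dev/null); then
            skipped=$((skipped + 1))
            continue
        fi
        printf '%s %s\n' "$pid" "$name"
    done
    printf 'skipped %s\n' "$skipped"
}
```

Returning an error on the first unreadable entry would correspond to the old behaviour described above. Counting the skipped entries keeps the failure visible without discarding the threads that were read successfully.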
    • Merge tag 'perf-core-for-mingo' of... · 6ca2a9c6
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
       * Align the 'Ok'/'FAILED!' test results in 'perf test'.
      
       * Support interrupted syscalls in 'trace'
      
       * Add an event duration column and filter in 'trace'.
      
       * There are references to the man pages in some tools, so try to build
         Documentation when installing, warning the user if that is not possible,
         from Borislav Petkov.
      
       * Give user better message if precise is not supported, from David Ahern.
      
       * Try to find cross-built objdump path by using the session environment
         information in the perf.data file header, from Irina Tirdea, original
         patch and idea by Namhyung Kim.
      
       * Displays more output on features check for make V=1, so that one can figure
         out what is happening by looking at gcc output, etc. From Jiri Olsa.
      
       * Account the nr_entries in rblist properly, fix by Suzuki K. Poulose.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 24 Oct, 2012 4 commits