1. 05 Jun, 2010 7 commits
    • Arun Sharma's avatar
      perf report: Implement --sort cpu · f60f3593
      Arun Sharma authored
      In a shared multi-core environment, users want to analyze why their
      program was slow. In particular, if the code ran slower only on certain
      CPUs due to interference from other programs or kernel threads, the user
      should be able to notice that.
      
      Sample usage:
      
      perf record -f -a -- sleep 3
      perf report --sort cpu,comm
      
      Workload:
      
      program is running on 16 CPUs
      Experiencing interference from an antagonist only on 4 CPUs.
      
        Samples: 106218177676 cycles
      
        Overhead  CPU          Command
        ........  ...  ...............
      
           6.25%  2            program
           6.24%  6            program
           6.24%  11           program
           6.24%  5            program
           6.24%  9            program
           6.24%  10           program
           6.23%  15           program
           6.23%  7            program
           6.23%  3            program
           6.23%  14           program
           6.22%  1            program
           6.20%  13           program
           3.17%  12           program
           3.15%  8            program
           3.14%  0            program
           3.13%  4            program
           3.11%  4         antagonist
           3.11%  0         antagonist
           3.10%  8         antagonist
           3.07%  12        antagonist
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100505181612.GA5091@sharma-home.net>
      Signed-off-by: default avatarArun Sharma <aruns@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f60f3593
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo authored
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20
    • Stephane Eranian's avatar
      perf annotate: Ask objdump to demangle symbols · 45d8e802
      Stephane Eranian authored
      Perf report is demangling symbols but not annotate.
      
      The former uses internal demangling via libbdf or libiberty. The latter
      executes objdump which by default does not demangle symbols.
      
      This patch adds the -C option to the objdump cmdline to enable symbol
      demangling.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      45d8e802
    • Stephane Eranian's avatar
      perf buildid: add perfconfig option to specify buildid cache dir · 45de34bb
      Stephane Eranian authored
      This patch adds the ability to specify an alternate directory to store the
      buildid cache (buildids, copy of binaries). By default, it is hardcoded to
      $HOME/.debug. This directory contains immutable data. The layout of the
      directory is such that no conflicts in filenames are possible. A modification
      in a file, yields a different buildid and thus a different location in the
      subdir hierarchy.
      
      You may want to put the buildid cache elsewhere because of disk space
      limitation or simply to share the cache between users. It is also useful for
      remote collect vs. local analysis of profiles.
      
      This patch adds a new config option to the perfconfig file.  Under the tag
      'buildid', there is a dir option. For instance, if you have:
      
      $ cat /etc/perfconfig
      [buildid]
      dir = /var/cache/perf-buildid
      
      All buildids and binaries are be saved in the directory specified. The perf
      record, buildid-list, buildid-cache, report, annotate, and archive commands
      will it to pull information out.
      
      The option can be set in the system-wide perfconfig file or in the
      $HOME/.perfconfig file.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4c055fb7.df0ce30a.5f0d.ffffae52@mx.google.com>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      45de34bb
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Make target to generate self contained source tarball · 8e5564e6
      Arnaldo Carvalho de Melo authored
      Useful for when people want to try some version of the perf tools and don't
      wants to download the kernel tarball.
      
      Here is a session using this new target:
      
        [root@emilia linux-2.6-tip]# make help | grep -i perf
          perf-tar-src-pkg    - Build perf-2.6.35-rc1.tar source tarball
          perf-targz-src-pkg  - Build perf-2.6.35-rc1.tar.gz source tarball
          perf-tarbz2-src-pkg - Build perf-2.6.35-rc1.tar.bz2 source tarball
        [root@emilia linux-2.6-tip]# make perf-tarbz2-src-pkg
          TAR
        [root@emilia linux-2.6-tip]# ls -la perf-2.6.35-rc1.tar.bz2
        -rw-r--r-- 1 root root 295731 May 31 11:18 perf-2.6.35-rc1.tar.bz2
        [root@emilia linux-2.6-tip]# tar xf perf-2.6.35-rc1.tar.bz2
        [root@emilia linux-2.6-tip]# cd perf-2.6.35-rc1
        [root@emilia perf-2.6.35-rc1]# ls
        arch  HEAD  include  lib  tools
        [root@emilia perf-2.6.35-rc1]# cd tools/perf
        [root@emilia perf]# make -j9 2>&1 | tail
            CC arch/x86/util/dwarf-regs.o
            CC util/probe-finder.o
            CC util/newt.o
            CC util/scripting-engines/trace-event-perl.o
            CC scripts/perl/Perf-Trace-Util/Context.o
            CC perf.o
            CC builtin-help.o
            AR libperf.a
            LINK perf
        rm .perf.dev.null
        [root@emilia perf]# ./perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.262 MB perf.data (~11457 samples) ]
        [root@emilia perf]# ./perf report | head -12
        # Events: 6K cycles
        #
        # Overhead          Command       Shared Object  Symbol
        # ........  ...............  ..................  ......
        #
             4.73%             perf  [kernel.kallsyms]   [k] format_decode
             4.49%             perf  libc-2.12.so        [.] _IO_file_underflow_internal
             4.38%             init  [kernel.kallsyms]   [k] mwait_idle
             3.29%             perf  [kernel.kallsyms]   [k] vsnprintf
             2.38%             init  [kernel.kallsyms]   [k] sched_clock_local
             2.35%             init  [kernel.kallsyms]   [k] apic_timer_interrupt
             1.86%     sirq-timer/5  [kernel.kallsyms]   [k] find_busiest_group
        [root@emilia perf]#
      Acked-by: default avatarMichal Marek <mmarek@suse.cz>
      Acked-by: default avatarSam Ravnborg <sam@ravnborg.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100528185357.GA28009@ghostprotocols.net>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8e5564e6
    • Stephane Eranian's avatar
      perf tools: Add the ability to specify list of cpus to monitor · c45c6ea2
      Stephane Eranian authored
      This patch adds a -C option to stat, record, top to designate a list of CPUs to
      monitor. CPUs can be specified as a comma-separated list or ranges, no space
      allowed.
      
      Examples:
      $ perf record -a -C0-1,4-7 sleep 1
      $ perf top -C0-4
      $ perf stat -a -C1,2,3,4 sleep 1
      
      With perf record in per-thread mode with inherit mode on, samples are collected
      only when the thread runs on the designated CPUs.
      
      The -C option does not turn on system-wide mode automatically.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c45c6ea2
    • Stephane Eranian's avatar
      perf report: Make -D print sampled CPU · 761844b9
      Stephane Eranian authored
      It is useful to know on which CPU a sample was captured on.
      The information is captured with perf record -R but it was
      not printed out by perf report -D. This patch adds this.
      
      When -R is not used, cpu is set to -1to indicate that
      the CPU is unknown (it is not captured).
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      761844b9
  2. 04 Jun, 2010 2 commits
  3. 03 Jun, 2010 1 commit
    • Peter Zijlstra's avatar
      perf: Fix crash in swevents · c6df8d5a
      Peter Zijlstra authored
      Frederic reported that because swevents handling doesn't disable IRQs
      anymore, we can get a recursion of perf_adjust_period(), once from
      overflow handling and once from the tick.
      
      If both call ->disable, we get a double hlist_del_rcu() and trigger
      a LIST_POISON2 dereference.
      
      Since we don't actually need to stop/start a swevent to re-programm
      the hardware (lack of hardware to program), simply nop out these
      callbacks for the swevent pmu.
      Reported-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1275557609.27810.35218.camel@twins>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c6df8d5a
  4. 02 Jun, 2010 1 commit
  5. 01 Jun, 2010 3 commits
    • Arnaldo Carvalho de Melo's avatar
      perf buildid-list: Fix --with-hits event processing · b5c874f1
      Arnaldo Carvalho de Melo authored
      When we use plain 'perf buildid-list' we use only what is in the buildid
      table in the perf.data header. And those have absolute pathnames because
      at 'perf record' time we used __perf_session__process_events and that
      doesn't sets up the path shortening code in map__new() that happens if
      symbol_conf.full_paths is false, the default.
      
      On the other hand, when we use 'perf buildid-list --with-hits' we
      process all the events using perf_session__process_events, adding
      entries to the global DSO list _after_ removing the current directory
      from the DSO name, for presentation purposes.
      
      Because of that we end up having two entries in the DSO list when
      recording events for binaries using relative pathnames.
      
      Fix it minimally by setting symbol_conf.full_paths to true when marking
      the DSOs with hits in 'perf buildid-list --with-hits', as used by 'perf
      archive'
      
      Right fix longer term is to shorten the path only at presentation time.
      Will be done for 2.6.36.
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Tested-by: default avatarStephane Eranian <eranian@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100601183837.GC4093@ghostprotocols.net>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b5c874f1
    • Pierre Tardy's avatar
      perf scripts python: Give field dict to unhandled callback · c0251485
      Pierre Tardy authored
      trace_unhandled() callback does not allow to access event fields, this patch
      resolves the problem.
      
      It can also been used as a more pythonic and flexible way for script writters
      to demux event types
      
      This will for example greatly simplify pytimechart event demux.
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>,
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <1275340329-2397-1-git-send-email-tardyp@gmail.com>
      Signed-off-by: default avatarPierre Tardy <tardyp@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c0251485
    • Konstantin Stepanyuk's avatar
      perf hist: fix objdump output parsing · 75d9ef17
      Konstantin Stepanyuk authored
      hist_entry__annotate() runs objdump with -S option so the output may contain
      lines of any format. If a line starts with a colon strtoull() returns 0 and
      calculated offset will be negative. This causes perf annotate segfaults.
      
      Make sure that strtoull() has parsed at least one digit.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarKonstantin Stepanyuk <konstantin.stepanyuk@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      75d9ef17
  6. 31 May, 2010 11 commits
    • Borislav Petkov's avatar
      perf-record: Check correct pid when forking · 2fb750e8
      Borislav Petkov authored
      When forking the child to be traced, we should check the correct
      return value from fork() and not a local variable which is otherwise
      unused.
      Signed-off-by: default avatarBorislav Petkov <bp@alien8.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <20100531211818.GA30175@liondog.tnic>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      2fb750e8
    • Frederic Weisbecker's avatar
      perf: Do the comm inheritance per thread in event__process_task · dd833d71
      Frederic Weisbecker authored
      event__process_task() doesn't propagate the comm copy on clone,
      but only on process fork. So we loose all the tid:comm resolution
      for tasks that aren't a main process thread.
      
      Progragate the per thread granularity to event__process_task for
      pid resolution.
      
      This fixes various unresolved pids in perf sched, especially when
      we trace multithread processes. The problem is quickly reproducible
      with the messaging benchmark using the multithread mode "-t" :
      
      	perf sched record perf bench sched messaging -t
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      dd833d71
    • Frederic Weisbecker's avatar
      perf: Use event__process_task from perf sched · af64865b
      Frederic Weisbecker authored
      perf sched uses event__process_comm(), which means it can resolve
      comms from:
      
      - tasks that have exec'ed (kernel comm events)
      - tasks that were running when perf record started the actual
        recording (synthetized comm events)
      
      But perf sched can't resolve the pids of tasks that were created
      after the recording started.
      
      To solve this, we need to inherit the comms on fork events using
      event__process_task().
      
      This fixes various unresolved pids in perf sched, easily visible
      with:
      	perf sched record perf bench sched messaging
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      af64865b
    • Frederic Weisbecker's avatar
      perf: Process comm events by tid · 13eb04fd
      Frederic Weisbecker authored
      When we synthetize the existing running tasks though procfs,
      we walk through every threads of a process, queuing one comm
      events per tid.
      
      But then on report time, event__process_comm() only creates and
      sets the comm on a per process granularity. This is the right
      thing for comm events that came from the kernel, as they are
      only created on exec. Sub-threads then inherit their comm
      from fork events. But that doesn't work with our synthetized
      comm events taken from procfs informations as the per thread
      granularity is done on comm events directly there.
      
      Hence we need event__process_comm() to work with the tid rather
      than the pid. It won't change anything for comm events coming
      from the kernel but this will fix the synthetized ones.
      
      Before:
      
      	$ ./perf report -D | grep COMM | grep firefox
      
      	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5297
      
      After:
      	$ ./perf report -D | grep COMM | grep firefox
      
      	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5299
      	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5300
      	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5308
      	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5309
      	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5312
      
      This fixes various unresolved pid on perf sched.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      13eb04fd
    • Randy Dunlap's avatar
      blktrace: Fix new kernel-doc warnings · 546cf44a
      Randy Dunlap authored
      Fix blktrace.c kernel-doc warnings:
       Warning(kernel/trace/blktrace.c:858): No description found for parameter 'ignore'
       Warning(kernel/trace/blktrace.c:890): No description found for parameter 'ignore'
      Signed-off-by: default avatarRandy Dunlap <randy.dunlap@oracle.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100529114507.c466fc1e.randy.dunlap@oracle.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      546cf44a
    • Frederic Weisbecker's avatar
      perf_events: Fix unincremented buffer base on partial copy · 74048f89
      Frederic Weisbecker authored
      If a sample size crosses to the next page boundary, the copy
      will be made in more than one step. However we forget to advance
      the source offset for the next copy, leading to unexpected double
      copies that completely mess up the traces.
      
      This fixes various kinds of bad traces that have irrelevant
      data inside, as an example:
      
      	geany-4979  [001]  5758.077775: sched_switch: prev_comm=! prev_pid=121
      		prev_prio=0 prev_state=S|D|Z|X|x ==> next_comm= next_pid=7497072
      		next_prio=0
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1274988898-5639-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      74048f89
    • Stephane Eranian's avatar
      perf_events: Fix event scheduling issues introduced by transactional API · 90151c35
      Stephane Eranian authored
      The transactional API patch between the generic and model-specific
      code introduced several important bugs with event scheduling, at
      least on X86. If you had pinned events, e.g., watchdog,  and were
      over-committing the PMU, you would get bogus counts. The bug was
      showing up on Intel CPU because events would move around more
      often that on AMD. But the problem also existed on AMD, though
      harder to expose.
      
      The issues were:
      
       - group_sched_in() was missing a cancel_txn() in the error path
      
       - cpuc->n_added was not properly maintained, leading to missing
         actions in hw_perf_enable(), i.e., n_running being 0. You cannot
         update n_added until you know the transaction has succeeded. In
         case of failed transaction n_added was not adjusted back.
      
       - in case of failed transactions, event_sched_out() was called
         and eventually invoked x86_disable_event() to touch the HW reg.
         But with transactions, on X86, event_sched_in() does not touch
         HW registers, it simply collects events into a list. Thus, you
         could end up calling x86_disable_event() on a counter which
         did not correspond to the current event when idx != -1.
      
      The patch modifies the generic and X86 code to avoid all those problems.
      
      First, we keep track of the number of events added last. In case the
      transaction fails, we substract them from n_added. This approach is
      necessary (as opposed to delaying updates to n_added) because not all
      event updates use the transaction API, e.g., single events.
      
      Second, we encapsulate the event_sched_in() and event_sched_out() in
      group_sched_in() inside the transaction. That makes the operations
      symmetrical and you can also detect that you are inside a transaction
      and skip the HW reg access by checking cpuc->group_flag.
      
      With this patch, you can now overcommit the PMU even with pinned
      system-wide events present and still get valid counts.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1274796225.5882.1389.camel@twins>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      90151c35
    • Peter Zijlstra's avatar
      perf_events, trace: Fix perf_trace_destroy(), mutex went missing · 2e97942f
      Peter Zijlstra authored
      Steve spotted I forgot to do the destroy under event_mutex.
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1274451913.1674.1707.camel@laptop>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      2e97942f
    • Peter Zijlstra's avatar
      perf_events, trace: Fix probe unregister race · 3771f077
      Peter Zijlstra authored
      tracepoint_probe_unregister() does not synchronize against the probe
      callbacks, so do that explicitly. This properly serializes the callbacks
      and the free of the data used therein.
      
      Also, use this_cpu_ptr() where possible.
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1274438476.1674.1702.camel@laptop>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3771f077
    • Peter Zijlstra's avatar
      perf_events: Fix races in group composition · 8a49542c
      Peter Zijlstra authored
      Group siblings don't pin each-other or the parent, so when we destroy
      events we must make sure to clean up all cross referencing pointers.
      
      In particular, for destruction of a group leader we must be able to
      find all its siblings and remove their reference to it.
      
      This means that detaching an event from its context must not detach it
      from the group, otherwise we can end up failing to clear all pointers.
      
      Solve this by clearly separating the attachment to a context and
      attachment to a group, and keep the group composed until we destroy
      the events.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8a49542c
    • Peter Zijlstra's avatar
      perf_events: Fix races and clean up perf_event and perf_mmap_data interaction · ac9721f3
      Peter Zijlstra authored
      In order to move toward separate buffer objects, rework the whole
      perf_mmap_data construct to be a more self-sufficient entity, one
      with its own lifetime rules.
      
      This greatly sanitizes the whole output redirection code, which
      was riddled with bugs and races.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ac9721f3
  7. 30 May, 2010 15 commits