- 06 May, 2015 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix 'perf probe -a' segfault if passed with '' (Wang Nan) - Fix report -T/--threads option (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 05 May, 2015 1 commit
-
-
Wang Nan authored
Since parse_perf_probe_point() deals with a user passed argument, we should not assume it to be a valid string. Without this patch, if pass '' to perf probe, a segfault raises: $ perf probe -a '' Segmentation fault This patch checks argument of parse_perf_probe_point() before string processing. After this patch: $ perf probe -a '' usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...] or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...] ... Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1430210769-94177-1-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 01 May, 2015 2 commits
-
-
Namhyung Kim authored
The commit 512ae1bd ("perf tools: Consolidate management of default sort orders") changed default value of the 'sort_order' variable to NULL indicating that users don't set any sort keys on the command line. However it missed to update a check in perf_evlist__tty_browse_hists() so that 'perf report -T' cannot show the per-thread values after the normal output. This patch fixes it to work again. Note that the -T option only works on --stdio and neither --sort nor --parent option was given. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430309328-28317-1-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ingo Molnar authored
Merge tag 'perf-urgent-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf tooling fixes from Arnaldo Carvalho de Melo: . Fix a segfault in 'perf top' when kernel map is restricted (Wang Nan) . Fix hung wakeup tasks after requeueing in 'perf bench futex' (Davidlohr Bueso) . Fix bug in perf probe global variables handling, missing curly braces on an if body (He Kuang) . 'perf bench numa' fixes (command line help/handling, etc) (Petr Holasek) . fix the 'perf kmem' build on RHEL6/OL6 (David Ahern) . fix the libtraceevent build on 32-bit arch (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 27 Apr, 2015 5 commits
-
-
Petr Holasek authored
This patch fixes the race in the beginning of benchmark run when some threads hasn't got assigned curr_cpu yet so they don't occur in nodes-of-process stats and benchmark concludes that all remaining threads are converged already. The race can be reproduced with small amount of threads and some bigger amount of shared process memory, e.g. one process, two threads and 5GB of process memory. Signed-off-by: Petr Holasek <pholasek@redhat.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1429198699-25039-4-git-send-email-pholasek@redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Petr Holasek authored
Corrected description and fixed function of --quiet argument. Signed-off-by: Petr Holasek <pholasek@redhat.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1429198699-25039-2-git-send-email-pholasek@redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Davidlohr Bueso authored
The futex-requeue benchmark can hang because of missing wakeups once the benchmark is done, ie: [Run 1]: Requeued 1024 of 1024 threads in 0.3290 ms perf: couldn't wakeup all tasks (135/1024) This bug, while perhaps suggesting missing wakeups in kernel futex code, is merely a consequence of the crappy FUTEX_CMP_REQUEUE man page, incorrectly mentioning that the number of requeued tasks is in fact returned, not the wakeups. This patch acknowledges this and updates the corresponding futex_wake code around it. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1429894848.10273.44.camel@stgolabs.netSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
There are missing curly braces which causes find_variable() return wrong value when probing with global variables. This problem can be reproduced as following: $ perf probe -v --add='generic_perform_write global_variable_for_test' ... Try to find probe point from debuginfo. Probe point found: generic_perform_write+0 Searching 'global_variable_for_test' variable in context. An error occurred in debuginfo analysis (-2). Error: Failed to add events. Reason: No such file or directory (Code: -2) After this patch: $ perf probe -v --add='generic_perform_write global_variable_for_test' ... Converting variable global_variable_for_test into trace event. global_variable_for_test type is int. Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Added new event: Writing event: p:probe/generic_perform_write _stext+1237464 global_variable_for_test=@global_variable_for_test+0:s32 probe:generic_perform_write (on generic_perform_write with global_variable_for_test) You can now use it in all perf tools, such as: perf record -e probe:generic_perform_write -aR sleep 1 Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1429949338-18678-1-git-send-email-hekuang@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
Perf top raise a warning if a kernel sample is collected but kernel map is restricted. The warning message needs to dereference al.map->dso... However, previous perf_event__preprocess_sample() doesn't always guarantee al.map != NULL, for example, when kernel map is restricted. This patch validates al.map before dereferencing, avoid the segfault. Before this patch: $ cat /proc/sys/kernel/kptr_restrict 1 $ perf top -p 120183 perf: Segmentation fault -------- backtrace -------- /path/to/perf[0x509868] /lib64/libc.so.6(+0x3545f)[0x7f9a1540045f] /path/to/perf[0x448820] /path/to/perf(cmd_top+0xe3c)[0x44a5dc] /path/to/perf[0x4766a2] /path/to/perf(main+0x5f5)[0x42e545] /lib64/libc.so.6(__libc_start_main+0xf4)[0x7f9a153ecbd4] /path/to/perf[0x42e674] And gdb call trace: Program received signal SIGSEGV, Segmentation fault. perf_event__process_sample (machine=0xa44030, sample=0x7fffffffa4c0, evsel=0xa43b00, event=0x7ffff41c3000, tool=0x7fffffffa8a0) at builtin-top.c:736 736 !RB_EMPTY_ROOT(&al.map->dso->symbols[MAP__FUNCTION]) ? (gdb) bt #0 perf_event__process_sample (machine=0xa44030, sample=0x7fffffffa4c0, evsel=0xa43b00, event=0x7ffff41c3000, tool=0x7fffffffa8a0) at builtin-top.c:736 #1 perf_top__mmap_read_idx (top=top@entry=0x7fffffffa8a0, idx=idx@entry=0) at builtin-top.c:855 #2 0x000000000044a5dd in perf_top__mmap_read (top=0x7fffffffa8a0) at builtin-top.c:872 #3 __cmd_top (top=0x7fffffffa8a0) at builtin-top.c:997 #4 cmd_top (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-top.c:1267 #5 0x00000000004766a3 in run_builtin (p=p@entry=0x8a6ce8 <commands+264>, argc=argc@entry=3, argv=argv@entry=0x7fffffffdf70) at perf.c:371 #6 0x000000000042e546 in handle_internal_command (argv=0x7fffffffdf70, argc=3) at perf.c:430 #7 run_argv (argv=0x7fffffffdcf0, argcp=0x7fffffffdcfc) at perf.c:474 #8 main (argc=3, argv=0x7fffffffdf70) at perf.c:589 (gdb) Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1429946703-80807-1-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 Apr, 2015 2 commits
-
-
Namhyung Kim authored
In my i386 build, it failed like this: CC event-parse.o event-parse.c: In function 'print_str_arg': event-parse.c:3868:5: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'uint64_t' [-Wformat] Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Javi Merino <javi.merino@arm.com> Link: http://lkml.kernel.org/r/20150424020218.GF1905@sejongSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
David Ahern authored
0d68bc92 breaks compiles on RHEL6/OL6: cc1: warnings being treated as errors builtin-kmem.c: In function ‘search_page_alloc_stat’: builtin-kmem.c:322: error: declaration of ‘stat’ shadows a global declaration node = &parent->rb_left; /usr/include/sys/stat.h:455: error: shadowed declaration is here builtin-kmem.c: In function ‘perf_evsel__process_page_alloc_event’: builtin-kmem.c:378: error: declaration of ‘stat’ shadows a global declaration /usr/include/sys/stat.h:455: error: shadowed declaration is here builtin-kmem.c: In function ‘perf_evsel__process_page_free_event’: builtin-kmem.c:431: error: declaration of ‘stat’ shadows a global declaration /usr/include/sys/stat.h:455: error: shadowed declaration is here Rename local variable to pstat to avoid the name conflict. Signed-off-by: David Ahern <david.ahern@oracle.com> Link: http://lkml.kernel.org/r/1429033773-31383-1-git-send-email-david.ahern@oracle.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 Apr, 2015 4 commits
-
-
Bobby Powers authored
Some toolchains (like Hardened Gentoo) define _FORTIFY_SOURCE in the built-in, default args. This causes perf builds to fail with: <command-line>:0:0: error: "_FORTIFY_SOURCE" redefined [-Werror] <built-in>: note: this is the location of the previous definition cc1: all warnings being treated as errors To avoid this, undefine _FORTIFY_SOURCE before (possibly re-)defining it in tools/lib/api. v2 applies cleanly on top of already pulled kbuild changes for 4.1-rc1. Signed-off-by: Bobby Powers <bobbypowers@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Dirk Gouders <dirk@gouders.net> Cc: Michal Marek <mmarek@suse.cz> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linux-kbuild@vger.kernel.org Link: http://lkml.kernel.org/r/1429658381-3039-1-git-send-email-bobbypowers@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Will Deacon authored
Building the perf tool for 32-bit ARM results in the following build error due to a combination of an incorrect conversion specifier and compiling with -Werror: builtin-kmem.c: In function ‘print_page_summary’: builtin-kmem.c:644:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=] nr_alloc_freed, (total_alloc_freed_bytes) / 1024); ^ builtin-kmem.c:647:9: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘u64’ [-Werror=format=] (total_page_alloc_bytes - total_alloc_freed_bytes) / 1024); ^ cc1: all warnings being treated as errors This patch fixes the problem by consistently using PRIu64 for printing out u64 values. Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1429796437-1790-1-git-send-email-will.deacon@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We were not checking in the inner event processing loop if the forked workload had finished, which, on a busy system, may make it take a long time trying to drain events, entering a seemingly neverending loop, waiting for the system to get idle enough to make it drain the buffers. Fix it by disabling the events when 'done' is true, in the inner loop, to start draining what is in the buffers. Now: [root@ssdandy ~]# time trace --filter-pids 14003 -a sleep 1 | tail 996.748 ( 0.002 ms): sh/30296 rt_sigprocmask(how: SETMASK, nset: 0x7ffc83418160, sigsetsize: 8) = 0 996.751 ( 0.002 ms): sh/30296 rt_sigprocmask(how: BLOCK, nset: 0x7ffc834181f0, oset: 0x7ffc83418270, sigsetsize: 8) = 0 996.755 ( 0.002 ms): sh/30296 rt_sigaction(sig: INT, act: 0x7ffc83417f50, oact: 0x7ffc83417ff0, sigsetsize: 8) = 0 1004.543 ( 0.362 ms): tail/30198 ... [continued]: read()) = 4096 1004.548 ( 7.791 ms): sh/30296 wait4(upid: -1, stat_addr: 0x7ffc834181a0) ... 1004.975 ( 0.427 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096 1005.390 ( 0.410 ms): tail/30198 read(buf: 0x765410, count: 8192) = 4096 1005.743 ( 0.348 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096 1006.197 ( 0.449 ms): tail/30198 read(buf: 0x765410, count: 8192) = 4096 1006.492 ( 0.290 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096 real 0m1.219s user 0m0.704s sys 0m0.331s [root@ssdandy ~]# Reported-by: Michael Petlan <mpetlan@redhat.com> Suggested-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-p6kpn1b26qcbe47pufpw0tex@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
commit f7aa222f Author: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Tue Feb 3 13:25:39 2015 -0300 perf trace: No need to enable evsels for workload started from perf The assumption was that whenever a workload is specified, the attr.enable_on_exec evsel flag would be set, but that is not happening when perf_record_opts.system_wide is set, for instance That resulted in both perf_evlist__enable() and attr.enable_on_exec being not called/set, which made the events to remain disabled while the workload runs, producing no output. Fix it, by calling perf_evlist__enable() in the 'trace' tool when forking and not targetting a workload started from trace v2: Test against !target__none(), as suggested by Namhyung Kim, that is what is used in perf_evsel__config() when deciding if the attr.enable_on_exec flag to be set. More work is needed to cover other cases such as opts->initial_delay. Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-27z7169pvfxgj8upic636syv@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 Apr, 2015 3 commits
-
-
Sonny Rao authored
This keeps all the related PCI IDs together in the driver where they are used. Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1429644791-25724-1-git-send-email-sonnyrao@chromium.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Sonny Rao authored
perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs This uncore is the same as the Haswell desktop part but uses a different PCI ID. Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1429569247-16697-1-git-send-email-sonnyrao@chromium.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
The core_pmu does not define cpu_* callbacks, which handles allocation of 'struct cpu_hw_events::shared_regs' data, initialization of debug store and PMU_FL_EXCL_CNTRS counters. While this probably won't happen on bare metal, virtual CPU can define x86_pmu.extra_regs together with PMU version 1 and thus be using core_pmu -> using shared_regs data without it being allocated. That could could leave to following panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8152cd4f>] _spin_lock_irqsave+0x1f/0x40 SNIP [<ffffffff81024bd9>] __intel_shared_reg_get_constraints+0x69/0x1e0 [<ffffffff81024deb>] intel_get_event_constraints+0x9b/0x180 [<ffffffff8101e815>] x86_schedule_events+0x75/0x1d0 [<ffffffff810586dc>] ? check_preempt_curr+0x7c/0x90 [<ffffffff810649fe>] ? try_to_wake_up+0x24e/0x3e0 [<ffffffff81064ba2>] ? default_wake_function+0x12/0x20 [<ffffffff8109eb16>] ? autoremove_wake_function+0x16/0x40 [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90 [<ffffffff811a9517>] ? __d_lookup+0xa7/0x150 [<ffffffff8119db5f>] ? do_lookup+0x9f/0x230 [<ffffffff811a993a>] ? dput+0x9a/0x150 [<ffffffff8119c8f5>] ? path_to_nameidata+0x25/0x60 [<ffffffff8119e90a>] ? __link_path_walk+0x7da/0x1000 [<ffffffff8101d8f9>] ? x86_pmu_add+0xb9/0x170 [<ffffffff8101d7a7>] x86_pmu_commit_txn+0x67/0xc0 [<ffffffff811b07b0>] ? mntput_no_expire+0x30/0x110 [<ffffffff8119c731>] ? path_put+0x31/0x40 [<ffffffff8107c297>] ? current_fs_time+0x27/0x30 [<ffffffff8117d170>] ? mem_cgroup_get_reclaim_stat_from_page+0x20/0x70 [<ffffffff8111b7aa>] group_sched_in+0x13a/0x170 [<ffffffff81014a29>] ? sched_clock+0x9/0x10 [<ffffffff8111bac8>] ctx_sched_in+0x2e8/0x330 [<ffffffff8111bb7b>] perf_event_sched_in+0x6b/0xb0 [<ffffffff8111bc36>] perf_event_context_sched_in+0x76/0xc0 [<ffffffff8111eb3b>] perf_event_comm+0x1bb/0x2e0 [<ffffffff81195ee9>] set_task_comm+0x69/0x80 [<ffffffff81195fe1>] setup_new_exec+0xe1/0x2e0 [<ffffffff811ea68e>] load_elf_binary+0x3ce/0x1ab0 Adding cpu_(prepare|starting|dying) for core_pmu to have shared_regs data allocated for core_pmu. AFAICS there's no harm to initialize debug store and PMU_FL_EXCL_CNTRS either for core_pmu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20150421152623.GC13169@krava.redhat.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 18 Apr, 2015 1 commit
-
-
Ingo Molnar authored
Dan Carpenter reported that pt_event_add() has buggy error handling logic: it returns 0 instead of -EBUSY when it fails to start a newly added event. Furthermore, the control flow in this function is messy, with cleanup labels mixed with direct returns. Fix the bug and clean up the code by converting it to a straight fast path for the regular non-failing case, plus a clear sequence of cascading goto labels to do all cleanup. NOTE: I materially changed the existing clean up logic in the pt_event_start() failure case to use the direct perf_aux_output_end() path, not pt_event_del(), because perf_aux_output_end() is enough here. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150416103830.GB7847@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 17 Apr, 2015 4 commits
-
-
Kan Liang authored
Same as Haswell, Broadwell also support the LBR callstack. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1427962377-40955-1-git-send-email-kan.liang@intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jacob Pan authored
RAPL energy hardware unit can vary within a single CPU package, e.g. HSW server DRAM has a fixed energy unit of 15.3 uJ (2^-16) whereas the unit on other domains can be enumerated from power unit MSR. There might be other variations in the future, this patch adds per cpu model quirk to allow special handling of certain cpus. hw_unit is also removed from per cpu data since it is not per cpu and the sampling rate for energy counter is typically not high. Without this patch, DRAM domain on HSW servers will be counted 4x higher than the real energy counter. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1427405325-780-1-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Ingo reported that cycles:pp didn't work for him on some machines. It turns out that in this commit: af4bdcf6 perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events Andi forgot to explicitly allow that event when he disabled event flags for PEBS on those uarchs. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: af4bdcf6 ("perf/x86/intel: Disallow flags for most Core2/Atom/Nehalem/Westmere events") Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Somehow we ended up with overlapping flags when merging the RDPMC control flag - this is bad, fix it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 14 Apr, 2015 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New features: - Analyze page allocator events in 'perf kmem' (Namhyung Kim) User visible changes: - Fix retprobe 'perf probe' handling when failing to find needed debuginfo (He Kuang) - lazy_line probe fixes in 'perf probe' (Naohiro Aota, He Kuang) Infrastructure changes: - Record pfn instead of pointer to struct page in tracepoints (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 13 Apr, 2015 5 commits
-
-
He Kuang authored
The first argument passed to find_probe_point_lazy() should be CU die, which will be passed to die_walk_lines() when lazy_line matches. Currently, when we probe with lazy_line pattern to file without function name, NULL pointer is passed and causes a segment fault. Can be reproduced as following: $ perf probe -k vmlinux --add='fs/super.c;s->s_count=1;' [ 1958.984658] perf[1020]: segfault at 10 ip 00007fc6e10d8c71 sp 00007ffcbfaaf900 error 4 in libdw-0.161.so[7fc6e10ce000+34000] Segmentation fault After this patch: $ perf probe -k vmlinux --add='fs/super.c;s->s_count=1;' Added new event: probe:_stext (on @fs/super.c) You can now use it in all perf tools, such as: perf record -e probe:_stext -aR sleep 1 Signed-off-by: He Kuang <hekuang@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1428925290-5623-3-git-send-email-hekuang@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Naohiro Aota authored
If we use lazy matching, it failed to open a souce file if perf command is invoked outside of compilation directory: $ perf probe -a '__schedule;clear_*' Failed to open kernel/sched/core.c: No such file or directory Error: Failed to add events. (-2) OTOH, other commands like "probe -L" can solve the souce directory by themselves. Let's make it possible for lazy matching too! Signed-off-by: Naohiro Aota <naota@elisp.net> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1426223923-1493-1-git-send-email-naota@elisp.netSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
When perf probe searched in a debuginfo file and failed, it tried with an alternative, in function get_alternative_probe_event(): memcpy(tmp, &pev->point, sizeof(*tmp)); memset(&pev->point, 0, sizeof(pev->point)); In this case, it drops the retprobe flag and forgets to set it back in find_alternative_probe_point(), so the problem occurs. Can be reproduced as following: $ perf probe -v -k vmlinux --add='sys_write%return' ... Added new event: Writing event: p:probe/sys_write _stext+1584952 probe:sys_write (on sys_write%return) $ cat /sys/kernel/debug/tracing/kprobe_events p:probe/sys_write _stext+1584952 After this patch: $ perf probe -v -k vmlinux --add='sys_write%return' Added new event: Writing event: r:probe/sys_write SyS_write+0 probe:sys_write (on sys_write%return) $ cat /sys/kernel/debug/tracing/kprobe_events r:probe/sys_write SyS_write Signed-off-by: He Kuang <hekuang@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1428925290-5623-1-git-send-email-hekuang@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The perf kmem command records and analyze kernel memory allocation only for SLAB objects. This patch implement a simple page allocator analyzer using kmem:mm_page_alloc and kmem:mm_page_free events. It adds two new options of --slab and --page. The --slab option is for analyzing SLAB allocator and that's what perf kmem currently does. The new --page option enables page allocator events and analyze kernel memory usage in page unit. Currently, 'stat --alloc' subcommand is implemented only. If none of these --slab nor --page is specified, --slab is implied. First run 'perf kmem record' to generate a suitable perf.data file: # perf kmem record --page sleep 5 Then run 'perf kmem stat' to postprocess the perf.data file: # perf kmem stat --page --alloc --line 10 ------------------------------------------------------------------------------- PFN | Total alloc (KB) | Hits | Order | Mig.type | GFP flags ------------------------------------------------------------------------------- 4045014 | 16 | 1 | 2 | RECLAIM | 00285250 4143980 | 16 | 1 | 2 | RECLAIM | 00285250 3938658 | 16 | 1 | 2 | RECLAIM | 00285250 4045400 | 16 | 1 | 2 | RECLAIM | 00285250 3568708 | 16 | 1 | 2 | RECLAIM | 00285250 3729824 | 16 | 1 | 2 | RECLAIM | 00285250 3657210 | 16 | 1 | 2 | RECLAIM | 00285250 4120750 | 16 | 1 | 2 | RECLAIM | 00285250 3678850 | 16 | 1 | 2 | RECLAIM | 00285250 3693874 | 16 | 1 | 2 | RECLAIM | 00285250 ... | ... | ... | ... | ... | ... ------------------------------------------------------------------------------- SUMMARY (page allocator) ======================== Total allocation requests : 44,260 [ 177,256 KB ] Total free requests : 117 [ 468 KB ] Total alloc+freed requests : 49 [ 196 KB ] Total alloc-only requests : 44,211 [ 177,060 KB ] Total free-only requests : 68 [ 272 KB ] Total allocation failures : 0 [ 0 KB ] Order Unmovable Reclaimable Movable Reserved CMA/Isolated ----- ------------ ------------ ------------ ------------ ------------ 0 32 . 44,210 . . 1 . . . . . 2 . 18 . . . 3 . . . . . 4 . . . . . 5 . . . . . 6 . . . . . 7 . . . . . 8 . . . . . 9 . . . . . 10 . . . . . Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1428298576-9785-4-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The struct page is opaque for userspace tools, so it'd be better to save pfn in order to identify page frames. The textual output of $debugfs/tracing/trace file remains unchanged and only raw (binary) data format is changed - but thanks to libtraceevent, userspace tools which deal with the raw data (like perf and trace-cmd) can parse the format easily. So impact on the userspace will also be minimal. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Based-on-patch-by: Joonsoo Kim <js1304@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1428298576-9785-3-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 Apr, 2015 1 commit
-
-
Ingo Molnar authored
Dan Carpenter pointed out that the control flow in pt_pmu_hw_init() is a bit messy: for example the kfree(de_attrs) is entirely superfluous. Another problem is the inconsistent mixing of label based and direct return error handling. Add modern, label based error handling instead and clean up the code a bit as well. Note that we'll still do a kfree(NULL) in the normal case - this does not matter as this is an init path and kfree() returns early if it sees a NULL. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20150409090805.GG17605@mwandaSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 11 Apr, 2015 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New user visible features: - Support multiple probes on different binaries on the same command line (Masami Hiramatsu) User visible changes: - Fix synthesizing fork_event.ppid for non-main thread (David Ahern) - Fix cross-endian analysis (David Ahern) - Fix segfault in 'perf buildid-list' when show DSOs with hits (He Kuang) Infrastructure changes: - Fix type for references to data_head/tail (David Ahern) - Fix error path to do closedir() when synthesizing threads (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 10 Apr, 2015 7 commits
-
-
David Ahern authored
The data_head and data_tail fields are defined as __u64 in linux/perf_event.h, but perf userspace uses int and unsigned int. Convert all references to u64 for consistency. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1428420037-26599-1-git-send-email-dsahern@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
To avoid probing in unintended binary, the orphaned -x option must be checked and warned. Without this patch, following command sets up the probe in the kernel. ----- # perf probe -a strcpy -x ./perf Added new event: probe:strcpy (on strcpy) You can now use it in all perf tools, such as: perf record -e probe:strcpy -aR sleep 1 ----- But in this case, it seems that the user may want to probe in the perf binary. With this patch, perf-probe correctly handles the orphaned -x. ----- # perf probe -a strcpy -x ./perf Error: -x/-m must follow the probe definitions. ... ----- Reported-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150401102541.17137.75477.stgit@localhost.localdomainSigned-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Support multiple probes on different binaries with just one command. In the result, this example sets up the probes on icmp_rcv in kernel, on main and set_target in perf, and on pcspkr_event in pcspker.ko driver. ----- # perf probe -a icmp_rcv -x ./perf -a main -a set_target \ -m /lib/modules/4.0.0-rc5+/kernel/drivers/input/misc/pcspkr.ko \ -a pcspkr_event Added new event: probe:icmp_rcv (on icmp_rcv) You can now use it in all perf tools, such as: perf record -e probe:icmp_rcv -aR sleep 1 Added new event: probe_perf:main (on main in /home/mhiramat/ksrc/linux-3/tools/perf/perf) You can now use it in all perf tools, such as: perf record -e probe_perf:main -aR sleep 1 Added new event: probe_perf:set_target (on set_target in /home/mhiramat/ksrc/linux-3/tools/perf/perf) You can now use it in all perf tools, such as: perf record -e probe_perf:set_target -aR sleep 1 Added new event: probe:pcspkr_event (on pcspkr_event in pcspkr) You can now use it in all perf tools, such as: perf record -e probe:pcspkr_event -aR sleep 1 ----- Reported-by: Arnaldo Carvalho de Melo <acme@infradead.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150401102539.17137.46454.stgit@localhost.localdomainSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
commit: f3b623b8 ("perf tools: Reference count struct thread") appends every thread->node to dead_threads in machine__remove_thread() and list_del_init() this node in thread__put(). perf_event__exit_del_thread() releases thread wihout using machine__remove_thread(), and causes a NULL pointer crash when list_del_init(&thread->node) is called. Fix this by using machine_remove_thread() instead of using thread__put() directly. This problem can be reproduced as following: $ perf record ls $ perf buildid-list --with-hits [ 3874.195070] perf[1018]: segfault at 0 ip 00000000004b0b15 sp 00007ffc35b44780 error 6 in perf[400000+166000] Segmentation fault After this patch: $ perf record ls $ perf buildid-list --with-hits bc23e7c3281e542650ba4324421d6acf78f4c23e /proc/kcore 643324cb0e969f30c56d660f167f84a150845511 [vdso] 0000000000000000000000000000000000000000 /bin/busybox ... Signed-off-by: He Kuang <hekuang@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1428658500-6483-1-git-send-email-hekuang@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
David Ahern authored
Trying to analyze a big endian data file on little endian system fails with the error: 0xa9b40 [0x70]: failed to process type: 9 The problem is that header parsing is not done correctly because the file attributes are not swapped. Make it so. With this patch able to analyze a sparc64 data file on x86_64. Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1428610546-178789-1-git-send-email-david.ahern@oracle.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
When traversing /proc to synthesize the PERF_RECORD_FORK et al events we were bailing out on errors without calling closedir(), fix it. Reported-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vxtp593rfztgbi8noy0m967p@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
David Ahern authored
Commit ca6c41c5 sets the ppid based on what is read from the /proc/pid/status file when synthesizing fork events. This is correct thing to do for new processes but not threads of a process. Fix ppid for threads to be the main thread when synthesizing fork events (ie., assume main thread spawned all sub-threads in a process). Reported-by: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Signed-off-by: David Ahern <david.ahern@oracle.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1428598107-178999-1-git-send-email-david.ahern@oracle.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 Apr, 2015 2 commits
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra) - Improve 'perf sched replay' on high CPU core count machines (Yunlong Song) - Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT events (Arnaldo Carvalho de Melo) - Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa) - Respect -i option 'in perf kmem' (Jiri Olsa) Infrastructure changes: - Honor operator priority in libtraceevent (Namhyung Kim) - Merge all perf_event_attr print functions (Peter Zijlstra) - Check kmaps access to make code more robust (Wang Nan) - Fix inverted logic in perf_mmap__empty() (He Kuang) - Fix ARM 32 'perf probe' building error (Wang Nan) - Fix perf_event_attr tests (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Adding 'I' event modifier to have complete set of modifiers for perf_event_attr:exclude_* bits. Any event specified with 'I' modifier will have the perf_event_attr:exclude_idle bit set. $ perf record -e cycles:I -vv ls 2>&1 | grep exclude_idle exclude_hv 0 exclude_idle 1 Adding automated tests. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: William Cohen <wcohen@redhat.com> Link: http://lkml.kernel.org/r/1428441919-23099-2-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-