- 30 May, 2016 2 commits
-
-
Arnaldo Carvalho de Melo authored
In addition to controlling the system-wide maximum depth via /proc/sys/kernel/perf_event_max_stack, we can now ask for a different depth per event, using perf_event_attr.sample_max_stack. This uses a u16 hole at the end of perf_event_attr. When perf_event_attr.sample_type has PERF_SAMPLE_CALLCHAIN set, a sample_max_stack of zero means use perf_event_max_stack; otherwise the value is bounds checked under callchain_mutex. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
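The zero-means-default rule described in this commit can be sketched as follows. This is a minimal illustration, not the kernel's code: resolve_max_stack() is a hypothetical helper, 'sysctl_max' stands in for the value of kernel.perf_event_max_stack, and returning -EOVERFLOW for an out-of-bounds request is an assumption made for the sketch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the callchain depth for one event.
 * 'sample_max_stack' mirrors the u16 field in perf_event_attr,
 * 'sysctl_max' stands in for kernel.perf_event_max_stack.
 */
int resolve_max_stack(uint16_t sample_max_stack, int sysctl_max)
{
	if (sample_max_stack == 0)
		return sysctl_max;      /* zero: use the system-wide limit */
	if (sample_max_stack > sysctl_max)
		return -EOVERFLOW;      /* assumed error code for a failed bounds check */
	return sample_max_stack;        /* per-event depth is within bounds */
}
```

With a system-wide limit of 127, a request of 0 resolves to 127, a request of 16 resolves to 16, and a request of 200 is rejected.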
-
Andi Kleen authored
Move the get_main_thread function from db-export.c to thread.c so that it can be used elsewhere. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1464051145-19968-2-git-send-email-andi@firstfloor.org [ Removed leftover bits from db-export.h ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 May, 2016 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-urgent-for-mingo-20160527' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

  - Fix kptr_restrict=2 related 'perf record' segfault (Wang Nan)
  - Fix CTF/libbabeltrace handling of Chinese COMM strings (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 27 May, 2016 3 commits
-
-
Wang Nan authored
We observed some crazy apps on Android that set their comm to an unprintable string. For example:

  # cat /proc/10607/task/*/comm
  tencent.qqmusic
  ...
  Binder_2
  日志输出线    <-- Chinese for 'log output thread'
  WifiManager
  ...

'perf data convert' fails to convert a perf.data file with such a string to CTF format. For example:

  # cat << EOF > ./badguy.c
  #include <sys/prctl.h>
  #include <unistd.h>
  int main(int argc, char *argv[])
  {
          prctl(PR_SET_NAME, "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf");
          while(1)
                  sleep(1);
          return 0;
  }
  EOF
  # gcc ./badguy.c
  # perf record -e sched:* ./a.out
  # perf data convert --to-ctf ./bad.ctf
  CTF stream 4 flush failed
  [ perf data convert: Converted 'perf.data' into CTF data './bad.ctf' ]
  [ perf data convert: Converted and wrote 0.008 MB (78 samples) ]
  # babeltrace ./bad.ctf/
  [error] Packet size (18446744073709551615 bits) is larger than remaining file size (262144 bits).
  [error] Stream index creation error.
  [error] Open file stream error.
  [warning] [Context] Cannot open_trace of format ctf at path ./bad.ctf.
  [warning] [Context] cannot open trace "./bad.ctf" from ./bad.ctf/ for reading.
  [error] Cannot open any trace for reading.
  [error] opening trace "./bad.ctf/" for reading.
  [error] none of the specified trace paths could be opened.

This patch converts unprintable characters to their hexadecimal representation. After applying this patch the above test works correctly:

  # ~/perf data convert --to-ctf ./good.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './good.ctf' ]
  [ perf data convert: Converted and wrote 0.008 MB (78 samples) ]
  # babeltrace ./good.ctf
  ..
  [23:14:35.491665268] (+0.000001100) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5123, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 }
  [23:14:35.491666230] (+0.000000962) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5122, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 }
  ..

Committer note:

To build perf with libbabeltrace, use:

  $ mkdir -p /tmp/build/perf
  $ make LIBBABELTRACE=1 LIBBABELTRACE_DIR=/usr/local O=/tmp/build/perf -C tools/perf install-bin

Or equivalent (no O=, fixup LIBBABELTRACE_DIR, etc).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464348951-179595-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
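The conversion this patch describes, turning each unprintable byte into a "\xNN" sequence, can be sketched as below. This is an illustration only and not the code from the patch: escape_unprintable() and its buffer-size convention are assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy 'in' to 'out', replacing every non-printable
 * byte with a four-character "\xNN" sequence. 'outsz' is the size of 'out'
 * including the terminating NUL. Returns 0 on success, -1 if 'out' is too
 * small.
 */
int escape_unprintable(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	for (; *in; in++) {
		/* isprint() needs an unsigned char value, not a possibly negative char */
		unsigned char c = (unsigned char)*in;
		size_t need = isprint(c) ? 1 : 4;

		if (o + need + 1 > outsz)
			return -1;
		if (isprint(c))
			out[o++] = (char)c;
		else
			o += (size_t)snprintf(out + o, outsz - o, "\\x%02x", (unsigned int)c);
	}
	if (o + 1 > outsz)
		return -1;
	out[o] = '\0';
	return 0;
}
```

In the default "C" locale, every byte of a UTF-8 encoded Chinese comm is above 0x7f and therefore reported as non-printable, which gives output of the same shape as the comm field in the babeltrace listing above.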
-
Wang Nan authored
Before this patch, a simple 'perf record' could fail if kptr_restrict is set to 1 (for normal user) or 2 (for root):

  # perf record ls
  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
  check /proc/sys/kernel/kptr_restrict.

  Samples in kernel functions may not be resolved if a suitable vmlinux
  file is not found in the buildid cache or in the vmlinux path.

  Samples in kernel modules won't be resolved at all.

  If some relocation was applied (e.g. kexec) symbols may be misresolved
  even with a suitable vmlinux or kallsyms file.

  Segmentation fault (core dumped)

This patch skips perf_event__synthesize_kernel_mmap() when kptr is not available.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 45e90056 ("perf machine: Do not bail out if not managing to read ref reloc symbol")
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
If kptr_restrict is set to 2, even root is not allowed to see pointers. This patch checks kptr_restrict even if euid == 0. For root, report error if kptr_restrict is 2. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1464081688-167940-1-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
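The rule stated here can be written as a small predicate. This is a simplified sketch, not the perf source: kernel_pointers_hidden() is a hypothetical name, and the sketch treats "root" as a plain boolean rather than checking uids or capabilities.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate. kptr_restrict levels as described above:
 *   0 = kernel addresses visible
 *   1 = hidden from normal users
 *   2 = hidden from everyone, including root
 */
bool kernel_pointers_hidden(int kptr_restrict, bool is_root)
{
	if (is_root)
		return kptr_restrict == 2;  /* only level 2 restricts root */
	return kptr_restrict != 0;          /* any non-zero level restricts normal users */
}
```

Before this patch the check was skipped entirely for euid == 0, which corresponds to always returning false in the is_root branch.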
-
- 25 May, 2016 1 commit
-
-
Vincent Stehlé authored
On the rapl cleanup path, kfree() is mistakenly given the address of the pointer to the structure to free (rapl_pmus->pmus + i). Pass the pointer itself instead (rapl_pmus->pmus[i]). Fixes: 9de8d686 "perf/x86/intel/rapl: Convert it to a per package facility" Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1464101629-14905-1-git-send-email-vincent.stehle@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
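The difference between the two expressions can be shown with a userspace analogue. The sketch below is not the rapl code: struct pmu, struct pmu_set and both functions are hypothetical, and malloc()/free() stand in for the kernel allocators.

```c
#include <stdlib.h>

struct pmu {
	int id;
};

/* Container with a flexible array of pointers, similar in shape to rapl_pmus */
struct pmu_set {
	int nr;
	struct pmu *pmus[];
};

struct pmu_set *pmu_set_new(int nr)
{
	struct pmu_set *s = malloc(sizeof(*s) + (size_t)nr * sizeof(s->pmus[0]));

	if (!s)
		return NULL;
	s->nr = nr;
	for (int i = 0; i < nr; i++)
		s->pmus[i] = calloc(1, sizeof(struct pmu));
	return s;
}

void pmu_set_free(struct pmu_set *s)
{
	for (int i = 0; i < s->nr; i++) {
		/*
		 * Correct: s->pmus[i] is the pointer stored in slot i.
		 * The buggy form, free(s->pmus + i), would pass the address
		 * of slot i itself, which lies inside the 's' allocation.
		 */
		free(s->pmus[i]);
	}
	free(s);
}
```

Because s->pmus + i points into the container rather than at a separately allocated object, handing it to the allocator corrupts the heap and leaks the real per-slot allocation.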
-
- 24 May, 2016 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-20160523' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements from Arnaldo Carvalho de Melo:

User visible changes:

  - Add "srcline_from" and "srcline_to" branch sort keys to 'perf top' and 'perf report' (Andi Kleen)

Infrastructure changes:

  - Make 'perf trace' auto-attach fd->name and ptr->name beautifiers based on the name of syscall arguments, this way new syscalls that have 'const char * (path,pathname,filename)' will use the fd->name beautifier (vfs_getname perf probe, if in place) and the 'fd->name' (vfs_getname or via /proc/PID/fd/) (Arnaldo Carvalho de Melo)
  - Infrastructure to read from a ring buffer in backward write mode (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 23 May, 2016 7 commits
-
-
Wang Nan authored
Introduce rb_find_range() to find start and end position from a backward ring buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463987628-163563-5-git-send-email-wangnan0@huawei.comSigned-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
record__mmap_read() writes data from the ring buffer into perf.data. 'head' is maintained by the kernel and points to the last written record. 'old' is maintained by perf and points to the record read in the previous round. record__mmap_read() saves the data from 'old' to 'head' to perf.data.

The names of these variables are not very intuitive. In addition, when dealing with a backward writing ring buffer, the md->prev pointer should point to 'head' instead of the last byte it got.

Add 'start' and 'end' pointers to make the code clear, and set md->prev to 'head' instead of the moved 'old' pointer. This patch doesn't change behavior since:

  buf = &data[old & md->mask];
  size = head - old;
  old += size;     <--- Here, old == head

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
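The masked indexing used in this commit message (data[old & md->mask]) can be illustrated with a standalone copy routine. This is a sketch under assumptions, not record__mmap_read() itself: ring_copy() is a hypothetical function, the buffer size is assumed to be a power of two, and the positions are free-running counters as in the snippet above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the bytes in [start, end) out of a
 * power-of-two ring buffer into 'out'. 'mask' is the buffer size minus
 * one. Returns the number of bytes copied, or 0 when the requested range
 * is larger than the buffer.
 */
size_t ring_copy(const unsigned char *data, uint64_t mask,
		 uint64_t start, uint64_t end, unsigned char *out)
{
	uint64_t size = end - start;
	uint64_t buf_size = mask + 1;
	uint64_t off = start & mask;        /* position of 'start' inside the buffer */
	uint64_t first;

	if (size > buf_size)
		return 0;                   /* range cannot fit in the buffer */

	/* First chunk runs from 'off' to the end of the buffer (or less) */
	first = buf_size - off;
	if (first > size)
		first = size;
	memcpy(out, data + off, (size_t)first);

	/* Second chunk, if any, wraps around to the start of the buffer */
	memcpy(out + first, data, (size_t)(size - first));
	return (size_t)size;
}
```

For an 8-byte buffer holding "ABCDEFGH", copying the range [6, 11) reads offsets 6, 7, 0, 1 and 2 in that order.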
-
Wang Nan authored
When record__mmap_read() requires more data than the size of the ring buffer, drop that data to avoid accessing invalid memory. This can happen when reading from an overwritable ring buffer, which should be avoided. However, check for it anyway for robustness. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463987628-163563-3-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
perf_evlist__toggle_{pause,resume}() are introduced to pause/resume events in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl. Following commits use them to ensure overwrite ring buffer is paused before reading. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463987628-163563-2-git-send-email-wangnan0@huawei.comSigned-off-by: He Kuang <hekuang@huawei.com> [ Return -1, like all other ioctl() usage in evlist.c, rename 'pause' arg to avoid breaking the build on ubuntu 12.04 and other old systems ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Auto-attach the ptr->name beautifier to syscall args "filename", "path" and "pathname" if they are of type "const char *". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jxii4qmcgoppftv0zdvml9d7@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Noticed when the 'setsockopt' 'fd' arg wasn't being formatted via the SCA_FD beautifier, so just remove the setting of "fd" args to SCA_FD and do it when reading the syscall info, like we do for args of type "pid_t", i.e. "fd" as the name should be enough to decide to use the SCA_FD beautifier. For odd cases we can just do it explicitly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-0qissgetiuqmqyj4b6ancmpn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
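The name- and type-based selection described in this commit and in the ptr->name commit above it can be sketched as a single lookup. This is an illustration only: the enum values and pick_beautifier() are hypothetical and do not reproduce the identifiers used in 'perf trace'.

```c
#include <string.h>

/* Hypothetical beautifier kinds, for illustration only */
enum beautifier {
	BEAUT_NONE,
	BEAUT_FD,        /* print fd as its file name */
	BEAUT_PTR_NAME,  /* print a 'const char *' path argument as a string */
	BEAUT_PID,       /* print a pid_t with extra process info */
};

/* Choose a beautifier from the argument's declared type and name */
enum beautifier pick_beautifier(const char *type, const char *name)
{
	if (strcmp(type, "const char *") == 0 &&
	    (strcmp(name, "filename") == 0 ||
	     strcmp(name, "path") == 0 ||
	     strcmp(name, "pathname") == 0))
		return BEAUT_PTR_NAME;
	if (strcmp(type, "pid_t") == 0)
		return BEAUT_PID;       /* decided by type alone */
	if (strcmp(name, "fd") == 0)
		return BEAUT_FD;        /* decided by name alone */
	return BEAUT_NONE;
}
```

Doing this once, when the syscall argument descriptions are read, means a newly added syscall with an "fd" argument is formatted without a per-syscall table entry, which is the point of the setsockopt observation.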
-
Andi Kleen authored
Add "srcline_from" and "srcline_to" branch sort keys that show the source lines of a branch. That makes it much easier to track down where particular branches happen in the program, for example to examine branch mispredictions, or to associate it with cycle counts:

  % perf record -b -e cycles:p ./tcall
  % perf report --sort srcline_from,srcline_to,mispredict
  ...
    15.10%  tcall.c:18  tcall.c:10  N
    14.83%  tcall.c:11  tcall.c:5   N
    14.12%  tcall.c:7   tcall.c:12  N
    14.04%  tcall.c:12  tcall.c:5   N
    12.42%  tcall.c:17  tcall.c:18  N
    12.39%  tcall.c:7   tcall.c:13  N
    12.27%  tcall.c:13  tcall.c:17  N
  ...

  % perf report --sort srcline_from,srcline_to,cycles
  ...
    17.12%  tcall.c:18  tcall.c:11  1
    17.01%  tcall.c:12  tcall.c:6   1
    16.98%  tcall.c:11  tcall.c:6   1
    15.91%  tcall.c:17  tcall.c:18  1
     6.38%  tcall.c:7   tcall.c:17  7
     4.80%  tcall.c:7   tcall.c:12  8
     4.21%  tcall.c:7   tcall.c:17  8
     2.67%  tcall.c:7   tcall.c:12  7
     2.62%  tcall.c:7   tcall.c:12  10
     2.10%  tcall.c:7   tcall.c:17  9
     1.58%  tcall.c:7   tcall.c:12  6
     1.44%  tcall.c:7   tcall.c:12  5
     1.38%  tcall.c:7   tcall.c:12  9
     1.06%  tcall.c:7   tcall.c:17  13
     1.05%  tcall.c:7   tcall.c:12  4
     1.01%  tcall.c:7   tcall.c:17  6

Open issues:

  - Some kernel symbols get misresolved.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 May, 2016 15 commits
-
-
Wang Nan authored
Add a fd field into struct perf_mmap so that perf can track the mmap fd. This feature will be used for toggling overwrite ring buffers. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463762315-155689-3-git-send-email-wangnan0@huawei.comSigned-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
Add 'overwrite' attribute to evsel to mark whether this event is overwritable. The following commits will support syntax like: # perf record -e cycles/overwrite/ ... An overwritable evsel requires kernel support for the perf_event_attr.write_backward ring buffer feature. Add it to perf_missing_feature. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1463762315-155689-2-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-20160520' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - We should not use the current value of the kernel.perf_event_max_stack as the default value for --max-stack in tools that can process perf.data files, they will only match if that sysctl wasn't changed from its default value at the time the perf.data file was recorded, fix it. This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report' produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)
  - Provide a better warning when running 'perf trace' on a system where the kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record', noticed on ubuntu 16.04 where this is the default kptr_restrict setting. (Arnaldo Carvalho de Melo)
  - Fix ordering of instructions in the annotation code, noticed when annotating ARM binaries, now that table is auto-ordered at first use, to avoid more such problems (Chris Ryder)
  - Set buildid dir under symfs when --symfs is provided (He Kuang)
  - Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)
  - Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being traced (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
He Kuang authored
This patch moves the reference of buildid dir to 'symfs/.debug' and skips the local buildid dir when '--symfs' is given, so that every single file opened by perf is relative to symfs directory now. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1463658462-85131-2-git-send-email-hekuang@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
When --min-stack or --max-stack is passed but --no-syscalls is also in effect, there is no point in automatically setting '--call-graph dwarf'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pq922i7h9wef0pho1dqpttvn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Chris Ryder authored
Currently the list of instructions recognised by perf annotate has to be explicitly written in sorted order. This makes it easy to make mistakes when adding new instructions. Sort the list of instructions on first access. Signed-off-by: Chris Ryder <chris.ryder@arm.com> Acked-by: Pawel Moll <pawel.moll@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/4268febaf32f47f322c166fb2fe98cfec7041e11.1463676839.git.chris.ryder@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
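Sorting a lookup table on first access, as this commit describes, can be sketched with qsort() and bsearch(). This is a minimal illustration and not the perf annotate code: struct ins, the table contents and ins_find() are hypothetical, and the table is deliberately left unsorted with "blt" ahead of "bls".

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct ins {
	const char *name;
};

/* Deliberately unsorted: "blt" comes before "bls" */
static struct ins instructions[] = {
	{ "jmp" }, { "blt" }, { "bls" }, { "call" }, { "add" },
};

static bool instructions_sorted;

static int ins_cmp(const void *a, const void *b)
{
	const struct ins *ia = a;
	const struct ins *ib = b;

	return strcmp(ia->name, ib->name);
}

/* Look up an instruction by name, sorting the table on the first call */
const struct ins *ins_find(const char *name)
{
	const size_t nmemb = sizeof(instructions) / sizeof(instructions[0]);
	struct ins key = { name };

	if (!instructions_sorted) {
		qsort(instructions, nmemb, sizeof(instructions[0]), ins_cmp);
		instructions_sorted = true;
	}
	return bsearch(&key, instructions, nmemb, sizeof(instructions[0]), ins_cmp);
}
```

bsearch() only gives correct results on sorted input, so without the one-time sort a misplaced entry can be missed, which matches the blt/bls symptom described in the related fix. The flag is not thread-safe; that is acceptable for a sketch but would need care in concurrent code.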
-
Chris Ryder authored
The ARM blt and bls instructions are not correctly identified when parsing assembly because the list of recognised instructions must be sorted by name. Swap the ordering of blt and bls. Signed-off-by: Chris Ryder <chris.ryder@arm.com> Acked-by: Pawel Moll <pawel.moll@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/560e196b7c79b7ff853caae13d8719a31479cb1a.1463676839.git.chris.ryder@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We cannot limit processing stacks from the current value of the sysctl, as we may be processing perf.data files, possibly from other machines. Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, which can be overridden using --max-stack or equivalent. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Fixes: 4cb93446 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack") Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
thread__resolve_callchain_sample can be used for handling perf.data files that could have been recorded with a larger max_stack sysctl setting than the one set on the system used for analysis. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-2995bt2g5yq2m05vga4kip6m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
This doesn't return, so there is no raw_syscalls:sys_exit for it, add the ending ')', without any return value, since it is void. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vh2mii0g4qlveuc4joufbipu@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
It's now there, no need to have it too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-y18oeou494uy11im7u9to0dx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Hook into the libtraceevent plugin kernel symbol resolver to warn the user that kernel symbol resolution can't happen with kptr_restrict=1. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9gc412xx1gl0lvqj1d1xwlyb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
This means the user can't access /proc/kallsyms, for instance, because /proc/sys/kernel/kptr_restrict is set to 1. Instead leave the ref_reloc_sym as NULL and code using it will cope. This allows 'perf trace' to work on such systems for !root; the only issue would be when trying to resolve kernel symbols, which happens, for instance, in some libtraceevent plugins. A warning for that case will be provided in the next patch in this series. Noticed in Ubuntu 16.04, which comes with kptr_restrict=1. Reported-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-knpu3z4iyp2dxpdfm798fac4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Colin Ian King authored
Remove an extraneous space to fix up indentation. Trivial and no functional change Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1463503215-18339-1-git-send-email-colin.king@canonical.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-20160516' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Honour the kernel.perf_event_max_stack knob more precisely by not counting PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
  - Fix indentation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)
  - Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
  - Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
  - Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
  - Store vdso buildid unconditionally, as it appears in callchains and we're not checking those when creating the build-id table, so we end up not being able to resolve VDSO symbols when doing analysis on a different machine than the one where recording was done, possibly of a different arch even (arm -> x86_64) (He Kuang)

Infrastructure changes:

  - Generalize max_stack sysctl handler, will be used for configuring multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)

Cleanups:

  - Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using open coded strings (Masami Hiramatsu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 19 May, 2016 1 commit
-
-
Peter Zijlstra authored
The move of the x86 perf implementation forgot to update the MAINTAINERS F(ile) pattern. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: fa9cbf32 ("perf/x86: Move perf_event.c ............... => x86/events/core.c") Link: http://lkml.kernel.org/r/20160519103019.GJ3206@twins.programming.kicks-ass.netSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 18 May, 2016 1 commit
-
-
Jiri Olsa authored
When booting with nr_cpus=1, uncore_pci_probe tries to init the PCI/uncore also for the other packages and fails with warning when they are not found. The warning is bogus because it's correct to fail here for packages which are not initialized. Remove it and return silently. Fixes: cf6d445f "perf/x86/uncore: Track packages, not per CPU data" Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 17 May, 2016 8 commits
-
-
Arnaldo Carvalho de Melo authored
The perf_sample->ip_callchain->nr value includes all the entries in the ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc}, while what the user expects is that what is in the kernel.perf_event_max_stack sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be honoured in terms of IP addresses in the stack trace. So match the kernel support and validate chain->nr taking into account both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
The perf_sample->ip_callchain->nr value includes all the entries in the ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc}, while what the user expects is that what is in the kernel.perf_event_max_stack sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be honoured in terms of IP addresses in the stack trace. So allocate a bunch of extra entries for contexts, and do the accounting via perf_callchain_entry_ctx struct members. A new sysctl, kernel.perf_event_max_contexts_per_stack is also introduced for investigating possible bugs in the callchain implementation by some arch. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
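The distinction between real addresses and context markers in the ip[] array can be sketched as a counting routine. This is an illustration under assumptions: count_addresses() is a hypothetical helper, and the PERF_CONTEXT_* values are defined locally, copied from the uapi <linux/perf_event.h> header, so that the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the marker values from the uapi perf_event.h header */
#define PERF_CONTEXT_KERNEL ((uint64_t)-128)
#define PERF_CONTEXT_USER   ((uint64_t)-512)
#define PERF_CONTEXT_MAX    ((uint64_t)-4095)

/*
 * Hypothetical helper: walk a callchain of 'nr' entries and return how
 * many are real addresses. The number of context markers is stored in
 * '*contexts' when that pointer is not NULL.
 */
size_t count_addresses(const uint64_t *ip, size_t nr, size_t *contexts)
{
	size_t addrs = 0, ctx = 0;

	for (size_t i = 0; i < nr; i++) {
		if (ip[i] >= PERF_CONTEXT_MAX)
			ctx++;          /* marker such as PERF_CONTEXT_KERNEL */
		else
			addrs++;        /* real instruction pointer */
	}
	if (contexts)
		*contexts = ctx;
	return addrs;
}
```

A chain holding a kernel marker, two kernel addresses, a user marker and one user address has nr == 5 but only three addresses, which is the number the max_stack knobs are meant to limit.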
-
Arnaldo Carvalho de Melo authored
We need to have different helpers to account for how many contexts we have in the sample and for real addresses, so do it now as a prep patch, to ease review. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We will use it to count how many addresses are in the entry->ip[] array, excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really return the number of entries specified by the user via the relevant sysctl, kernel.perf_event_max_stack, or via the per event perf_event_attr.sample_max_stack knob. This way we keep the perf_sample->ip_callchain->nr meaning, that is the number of entries, be it real addresses or PERF_CONTEXT_ entries, while honouring the max_stack knobs, i.e. the end result will be max_stack entries if we have at least that many entries in a given stack trace. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
This makes perf_callchain_{user,kernel}() receive the max stack as context for the perf_callchain_entry, instead of accessing the global sysctl_perf_event_max_stack. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
So that it can be used for other stack related knobs, such as the upcoming one to tweak the max number of contexts per stack sample. In all those cases we can only change the value if there are no perf sessions collecting stacks, so they need to grab that mutex, etc. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
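The rule stated here (a stack knob may only change while nobody is collecting stacks, and the check happens under a mutex) can be sketched with one shared setter used for several knobs. This is a user-space analogy using pthreads, not the kernel handler: `set_stack_knob()`, `callchain_user_get()`, `callchain_user_put()` and the variable names are hypothetical, the initial values are illustrative, and returning `-EBUSY` is an assumption about how a refusal would be reported.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical shared state guarded by one mutex. */
static pthread_mutex_t callchain_mutex = PTHREAD_MUTEX_INITIALIZER;
static int nr_callchain_users;           /* sessions currently collecting stacks */
static int max_stack = 127;              /* illustrative value */
static int max_contexts_per_stack = 8;   /* illustrative value */

/* A session starts collecting stacks. */
static void callchain_user_get(void)
{
    pthread_mutex_lock(&callchain_mutex);
    nr_callchain_users++;
    pthread_mutex_unlock(&callchain_mutex);
}

/* A session stops collecting stacks. */
static void callchain_user_put(void)
{
    pthread_mutex_lock(&callchain_mutex);
    nr_callchain_users--;
    pthread_mutex_unlock(&callchain_mutex);
}

/*
 * Generic setter shared by every stack related knob: the update is
 * refused while any session is still collecting stacks.
 */
static int set_stack_knob(int *knob, int new_value)
{
    int ret = 0;

    pthread_mutex_lock(&callchain_mutex);
    if (nr_callchain_users > 0)
        ret = -EBUSY;       /* buffers are sized from the old value */
    else
        *knob = new_value;
    pthread_mutex_unlock(&callchain_mutex);
    return ret;
}
```

Because the setter takes a pointer to the knob, the same locking and busy check serve both `max_stack` and `max_contexts_per_stack`, which is the reuse this commit is preparing for.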
-
Masami Hiramatsu authored
Instead of using a raw string, use DSO__NAME_KALLSYMS and DSO__NAME_KCORE macros for kallsyms and kcore. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Currently 'perf stat' always counts task-clock event by default. But it's somewhat confusing for system-wide targets (especially with 'sleep N' as the 'sleep' task just sleeps and doesn't use cputime). Changing to cpu-clock event instead for that case makes more sense IMHO.

Before:

  # perf stat -a sleep 0.1

   Performance counter stats for 'system wide':

       403.038603      task-clock (msec)         #    4.001 CPUs utilized
              150      context-switches          #    0.372 K/sec
                7      cpu-migrations            #    0.017 K/sec
               71      page-faults               #    0.176 K/sec
       23,705,169      cycles                    #    0.059 GHz
       15,888,166      instructions              #    0.67  insn per cycle
        3,326,078      branches                  #    8.253 M/sec
           87,643      branch-misses             #    2.64% of all branches

       0.100737009 seconds time elapsed

After:

  # perf stat -a sleep 0.1

   Performance counter stats for 'system wide':

       404.271182      cpu-clock (msec)          #    4.000 CPUs utilized
              143      context-switches          #    0.354 K/sec
               13      cpu-migrations            #    0.032 K/sec
               73      page-faults               #    0.181 K/sec
       22,119,220      cycles                    #    0.055 GHz
       13,622,065      instructions              #    0.62  insn per cycle
        2,918,769      branches                  #    7.220 M/sec
           85,033      branch-misses             #    2.91% of all branches

       0.101073089 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1463119263-5569-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
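The derived columns in that output can be reproduced with simple arithmetic. The sketch below is inferred from the numbers shown, not taken from the perf sources: `default_clock_event()`, `cpus_utilized()` and `rate_k_per_sec()` are hypothetical helpers, and the formulas (clock time divided by elapsed time, and event count divided by clock time) are assumptions that happen to match the figures in the commit message.

```c
#include <stdbool.h>

/* Pick the default clock event by target, as the commit describes. */
static const char *default_clock_event(bool system_wide)
{
    return system_wide ? "cpu-clock" : "task-clock";
}

/* "CPUs utilized": clock time summed over CPUs, divided by wall time. */
static double cpus_utilized(double clock_msec, double elapsed_sec)
{
    return clock_msec / (elapsed_sec * 1000.0);
}

/* "K/sec": events per millisecond of clock time equals K events per second. */
static double rate_k_per_sec(double count, double clock_msec)
{
    return count / clock_msec;
}
```

For the "Before" run, 403.038603 msec over 0.100737009 s gives roughly 4.001 CPUs utilized, and 150 context switches over 403.038603 msec gives roughly 0.372 K/sec, in line with the printed columns.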
-