- 26 Nov, 2015 14 commits
-
-
He Kuang authored
Add bpf_map_update_elem() helper function which calls the sys_bpf syscall to update elements in bpf maps. Upcoming patches will use it to adjust data in map through the perf command line. Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448372181-151723-4-git-send-email-wangnan0@huawei.comSigned-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding WIEGHT bit_name call to display sample_type properly. $ perf evlist -v cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|ID|CPU|DATA_SRC|WEIGHT ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1448465815-27404-2-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Clear sample_(type|period) for counting, as it only confuses debug output with unwanted sampling details: Before: $ sudo perf stat -e 'raw_syscalls:sys_enter' -vv ls ------------------------------------------------------------ perf_event_attr: type 2 size 112 config 0x11 { sample_period, sample_freq } 1 sample_type TIME|CPU|PERIOD|RAW read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ... After: $ sudo perf stat -e 'raw_syscalls:sys_enter' -vv ls ------------------------------------------------------------ perf_event_attr: type 2 size 112 config 0x11 read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1448465815-27404-1-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ekaterina Tumanova authored
Currently when debuginfo is separated to vmlinux.debug, it's contents get ignored. Let's change that and add it to the vmlinux_path list. Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1448469166-61363-3-git-send-email-tumanova@linux.vnet.ibm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ekaterina Tumanova authored
Refactor vmlinux_path__init() to ease subsequent additions of new vmlinux locations. Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1448469166-61363-2-git-send-email-tumanova@linux.vnet.ibm.com [ Rename vmlinux_path__update() to vmlinux_path__add() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding OUTPUT path prefix for fixdep target so we use it properly in out of tree builds. If the fixdep already existed in the tree, the out of tree build would see it already exist and did not build the out of tree version, as reported by Arnaldo: [acme@zoo linux]$ make O=/tmp/build/perf -C tools/perf make: Entering directory '/home/git/linux/tools/perf' BUILD: Doing 'make -j4' parallel build make[2]: Nothing to be done for 'fixdep'. make: Leaving directory '/home/git/linux/tools/perf' Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20151126185055.GC19410@krava.brq.redhat.com [ Fixed conflict with 5725dd8f ("tools build: Clean CFLAGS and LDFLAGS for fixdep") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Passing perf_script struct into process_event function, so we could process configuration data for event printing. It will be used in following patch to get event name string width. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20151126175521.GA18979@krava.brq.redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Yannick Brosseau authored
When parsing /proc/xxx/maps, the sscanf in perf_event__synthesize_mmap_events truncate the map name at the space in "/anon_hugepage (deleted)". is_anon_memory() then only receives the string "/anon_hugepage" and does not detect it. We change is_anon_memory() to only compare the first part of the string, effectively ignoring if " (deleted)" is there. Signed-off-by: Yannick Brosseau <scientist@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Joshua Zhu <zhu.wen-jie@hp.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/1448538152-2898-1-git-send-email-scientist@fb.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
Something unexpected may happen if copy statically linked perf to a production environment: # ./perf probe -m ./mymodule.ko my_func [mymodule] with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. # ./perf buildid-cache -a ./mymodule.ko # ./perf probe -m ./mymodule.ko my_func Added new event: probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko) You can now use it in all perf tools, such as: perf record -e probe:my_func -aR sleep 1 Where: # ldd ./perf not a dynamic executable # strace -e open ./perf probe -m ./mymodule.ko my_func ... open("/home/wangnan/kmodule/mymodule.ko", O_RDONLY) = 3 open("/home/wangnan/kmodule/../lib64/elfutils/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) ... open("/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/usr/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/usr/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/wangnan/.debug/.build-id/32/6ab42550ef3d24944f53c817533728367effeb", O_RDONLY) = -1 ENOENT (No such file or directory) open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory) In the above example, probe fails before we put the module into buildid-cache. However, user would expect it success in both case because perf is able to find probe points actually. The reason is because perf won't utilize module's full path if it failed to open debuginfo. In: convert_to_probe_trace_events -> find_probe_trace_events_from_map -> get_target_map -> kernel_get_module_map -> machine__findnew_module_map -> map_groups__find_by_name map_groups__find_by_name() is able to find the map of that module, but this information is found from /proc/module before it knows the real path of the offline module. Therefore, the map->dso->long_name is set to something like '[mymodule]', which prevent dso__load() find the real path of the module file. In another aspect, if dso__load() can get the offline module through buildid cache, it can read symble table from that ko. Even if debuginfo is not available, 'perf probe' can success if the '.symtab' can be found. This patch improves machine__findnew_module_map(): when dso->long_name is leading with '[' (doesn't find path of module when parsing /proc/modules), fixes it by dso__set_long_name(), so following dso__load() is possible to find the symbol table. This patch won't interfere with buildid matching. Here is the test result: # ./perf probe -m ./mymodule.ko my_func Added new event: probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko) You can now use it in all perf tools, such as: perf record -e probe:my_func -aR sleep 1 # ./perf probe -d '*' Removed event: probe:my_func # mv ./mymodule.{ko,.bak} # mv ./moduleb.ko mymodule.ko # ./perf probe -m ./mymodule.ko my_func /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. # ./perf probe -v -m ./mymodule.ko my_func probe-definition(0): my_func symbol:my_func file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Could not open debuginfo. Try to use symbols. symsrc__init: build id mismatch for /home/wangnan/kmodule/mymodule.ko. /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. Reason: No such file or directory (Code: -2) Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448510397-187965-1-git-send-email-wangnan0@huawei.com [ Renamed adjust_dso_long_name() do dso__adjust_kmod_long_name() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Ingo reported following build failure: $ make clean install ... CC plugin_kmem.o fixdep: error opening depfile: ./.plugin_hrtimer.o.d: No such file or directory /home/mingo/tip/tools/build/Makefile.build:77: recipe for target 'plugin_hrtimer.o' failed make[3]: *** [plugin_hrtimer.o] Error 2 Makefile:189: recipe for target 'plugin_hrtimer-in.o' failed make[2]: *** [plugin_hrtimer-in.o] Error 2 Makefile.perf:414: recipe for target 'libtraceevent_plugins' failed make[1]: *** [libtraceevent_plugins] Error 2 make[1]: *** Waiting for unfinished jobs.... Currently we have the install-traceevent-plugins target being dependent on $(LIBTRACEEVENT), which will actualy not build any plugin. So the install-traceevent-plugins target itself will try to build plugins, but.. Plugins built is also triggered by perf build itself via libtraceevent_plugins target. This might cause a race having one make thread removing temp files from another and result in above error. Fixing this by having proper plugins build dependency before installing plugins. Reported-and-Tested-by: : Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1448546044-28973-3-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
The default script handler (the one that displays samples on screen) is implemented scripting_ops instance with process_event callback. This way we can't pass any script config into display function, because we don't want perl or python handlers to be depended on perf script internals. Removing the default_scripting_ops and calling process event function directly. This way it's possible to pass perf_script struct and process configuration data in following commit. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1448546125-29245-1-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The callchain rbtree is rebuilt periodically, so it needs to reinitialize the root everytime. Otherwise it can be stuck in the rbtree insertion with stale pointers. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1448521700-32062-1-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
If user requested to hide unresolved entries, skip unresolved callchains as well as hist entries. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1448521700-32062-3-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Fix to free temporal Dwarf_Frame correctly in 'perf probe', fixing a regression introduced in perf/core that prevented, at least, adding an uprobe collecting function parameter values (Masami Hiramatsu) - Fix output of %llu for 64 bit values read on 32 bit machines in libtraceevent (Steven Rostedt) Developer visible: - Clean CFLAGS and LDFLAGS for fixdep in tools/build (Wang Nan) - Don't do a feature check when cleaning tools/lib/bpf (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 25 Nov, 2015 3 commits
-
-
Wang Nan authored
Before this patch libbpf always do feature check even when cleaning. For example: $ cd kernel/tools/lib/bpf $ make Auto-detecting system features: ... libelf: [ on ] ... bpf: [ on ] CC libbpf.o CC bpf.o LD libbpf-in.o LINK libbpf.a LINK libbpf.so $ make clean CLEAN libbpf CLEAN core-gen $ make clean Auto-detecting system features: ... libelf: [ on ] ... bpf: [ on ] CLEAN libbpf CLEAN core-gen $ Although the first 'make clean' doesn't show feature check result, it still does the check. No output because check result is similar to FEATURE-DUMP.libbpf. This patch uses same method as perf to turn off feature checking when 'make clean'. Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448372181-151723-3-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
Sometimes passing variables to tools/build is dangerous. For example, on my platform there is a gcc problem (gcc 4.8.1): It passes the stackprotector-all feature check: $ gcc -fstack-protector-all -c ./test.c $ echo $? 0 But requires LDFLAGS support if separate compiling and linking: $ gcc -fstack-protector-all -c ./test.c $ gcc ./test.o ./test.o: In function `main': test.c:(.text+0xb): undefined reference to `__stack_chk_guard' test.c:(.text+0x21): undefined reference to `__stack_chk_guard' collect2: error: ld returned 1 exit status $ gcc -fstack-protector-all ./test.o $ echo $? 0 $ gcc ./test.o -lssp $ echo $? 0 $ In this environment building perf throws an error: $ make BUILD: Doing 'make -j24' parallel build config/Makefile:344: No libunwind found. Please install libunwind-dev[el] >= 1.1 and/or set LIBUNWIND_DIR config/Makefile:403: No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev config/Makefile:418: slang not found, disables TUI support. Please install slang-devel or libslang-dev config/Makefile:432: GTK2 not found, disables GTK2 support. Please install gtk2-devel or libgtk2.0-dev config/Makefile:564: No bfd.h/libbfd found, please install binutils-dev[el]/zlib-static/libiberty-dev to gain symbol demangling config/Makefile:606: No numa.h found, disables 'perf bench numa mem' benchmark, please install numactl-devel/libnuma-devel/libnuma-dev CC fixdep.o LD fixdep-in.o LINK fixdep fixdep-in.o: In function `parse_dep_file': /kernel/tools/build/fixdep.c:47: undefined reference to `__stack_chk_guard' /kernel/tools/build/fixdep.c:117: undefined reference to `__stack_chk_guard' fixdep-in.o: In function `main': /kernel-hydrogen/tools/build/fixdep.c:156: undefined reference to `__stack_chk_guard' /kernel/tools/build/fixdep.c:168: undefined reference to `__stack_chk_guard' collect2: error: ld returned 1 exit status make[2]: *** [fixdep] Error 1 make[1]: *** [fixdep] Error 2 make: *** [all] Error 2 This is because the CFLAGS used in building perf pollutes the CFLAGS used for fixdep, passing -fstack-protector-all to buiold fixdep which is obviously not required. Since fixdep is a small host side tool, we should keep its CFLAGS/LDFLAGS simple and clean. This patch clears the CFLAGS and LDFLAGS passed when building fixdep, so such gcc problem won't block the perf build process. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1448372181-151723-2-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
The commit 05c8d802 ("perf probe: Fix to free temporal Dwarf_Frame") tried to fix the memory leak of Dwarf_Frame, but it released the frame at wrong point. Since the dwarf_frame_cfa(frame, &pf->fb_ops, &nops) can return an address inside the frame data structure to pf->fb_ops, we can not release the frame before using pf->fb_ops. This reverts the commit and releases the frame afterwards (right before returning from call_probe_finder) correctly. Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 05c8d802 ("perf probe: Fix to free temporal Dwarf_Frame") LPU-Reference: 20151125103432.1473.31009.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 Nov, 2015 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Allow callchain order (caller, callee) to the libdw and libunwind based DWARF unwinders (Jiri Olsa) - Add missing parent_val initialization in the callchain code, fixing a SEGFAULT when using callchains with 'perf top' (Jiri Olsa) - Add initial 'perf config' command, for now just with a --list command to the contents of the configuration file in use and a basic man page describing its format, commands for doing edits and detailed documentation are being reviewed and proof-read. (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 23 Nov, 2015 21 commits
-
-
Steven Rostedt authored
When a long value is read on 32 bit machines for 64 bit output, the parsing needs to change "%lu" into "%llu", as the value is read natively. Unfortunately, if "%llu" is already there, the code will add another "l" to it and fail to parse it properly. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20151116172516.4b79b109@gandalf.local.homeSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding missing parent_val callchain_node initialization. It's causing segfault in perf top: $ sudo perf top -g perf: Segmentation fault -------- backtrace -------- free_callchain_node(+0x29) in perf [0x4a4b3e] free_callchain(+0x29) in perf [0x4a5a83] hist_entry__delete(+0x126) in perf [0x4c6649] hists__delete_entry(+0x6e) in perf [0x4c66dc] hists__decay_entries(+0x7d) in perf [0x4c6776] perf_top__sort_new_samples(+0x7c) in perf [0x436a78] hist_browser__run(+0xf2) in perf [0x507760] perf_evsel__hists_browse(+0x1da) in perf [0x507c8d] perf_evlist__tui_browse_hists(+0x3e) in perf [0x5088cf] display_thread_tui(+0x7f) in perf [0x437953] start_thread(+0xc5) in libpthread-2.21.so [0x7f7068fbb555] __clone(+0x6d) in libc-2.21.so [0x7f7066fc3b9d] [0x0] Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 4b3a3212 ("perf hists browser: Support flat callchains") Link: http://lkml.kernel.org/r/20151121102355.GA17313@krava.localSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Taeung Song authored
Add perf-config document to describe the perf configuration and a 'list’ subcommand. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/63AD9B57-7B8C-46F8-8F18-0FFEB9A6A1BC@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Taeung Song authored
The perf configuration file contains many variables to change various aspects of each of its tools, including output, disk usage, etc. But looking at the state of configuration is difficult and there's no documentation about config variables except for the variables in perfconfig.example exist. So this patch adds a 'perf-config' command with a '--list' option. perf config [options] display current perf config variables. # perf config -l | --list Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1447768424-17327-1-git-send-email-treeze.taeung@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
As reported by Milian, currently for DWARF unwind (both libdw and libunwind) we display callchain in callee order only. Adding the support to follow callchain order setup to libdw DWARF unwinder, so we could get following output for report: $ perf record --call-graph dwarf ls ... $ perf report --no-children --stdio 21.12% ls libc-2.21.so [.] __strcoll_l | ---__strcoll_l mpsort_with_tmp mpsort_with_tmp mpsort_with_tmp sort_files main __libc_start_main _start $ perf report --stdio --no-children -g caller 21.12% ls libc-2.21.so [.] __strcoll_l | ---_start __libc_start_main main sort_files mpsort_with_tmp mpsort_with_tmp mpsort_with_tmp __strcoll_l Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jan Kratochvil <jkratoch@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20151119130119.GA26617@krava.brq.redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding callchain order setup for DWARF unwinder test. The test now runs unwinder for both callee and caller orders. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1447772739-18471-4-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
As reported by Milian, currently for DWARF unwind (both libdw and libunwind) we display callchain in callee order only. Adding the support to follow callchain order setup to libunwind DWARF unwinder, so we could get following output for report: $ perf record --call-graph dwarf ls ... $ perf report --no-children --stdio 39.26% ls libc-2.21.so [.] __strcoll_l | ---__strcoll_l mpsort_with_tmp mpsort_with_tmp sort_files main __libc_start_main _start 0 $ perf report -g caller --no-children --stdio ... 39.26% ls libc-2.21.so [.] __strcoll_l | ---0 _start __libc_start_main main sort_files mpsort_with_tmp mpsort_with_tmp __strcoll_l Based-on-patch-by: Milian Wolff <milian.wolff@kdab.com> Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20151118075247.GA5416@krava.brq.redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Moving initial entry call into get_entries function so all entries processing is on one place. It will be useful for next change that adds ordering logic. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1447772739-18471-2-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andi Kleen authored
The earlier constraint fix for Broadwell CYCLE_ACTIVITY.* forced umask 8 to counter 2. For this it used UEVENT, to match the complete umask. The event list for Broadwell has an additional STALLS_L1D_PENDIND event that uses umask 8, but also sets other bits in the umask. The earlier strict umask match didn't handle this case. Add a new UBIT_EVENT constraint macro that only matches the specified bits in the umask. Then use that macro to handle CYCLE_ACTIVITY.* on Broadwell. The documented event also uses cmask, but there's no need to let the event scheduler know about the cmask, as the scheduling restriction is only tied to the umask. Reported-by: Grant Ayers <ayers@cs.stanford.edu> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1447719667-9998-1-git-send-email-andi@firstfloor.org [ Filled in the missing email address of Grant Ayers - hopefully I got the right one. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Takao Indoh authored
This patch stops Intel PT logging and saves its registers in memory before kdump is started. This feature is needed to prevent Intel PT from overwriting its log buffer after panic, and saved registers are needed to find the last position where Intel PT wrote data. After the crash dump is captured by kdump, users can retrieve the log buffer from the vmcore and use it to investigate bad kernel behavior. Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin<alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: H.Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/r/1446614553-6072-3-git-send-email-indou.takao@jp.fujitsu.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Takao Indoh authored
This patch add a function for external components to stop Intel PT. Basically this function is used when kernel panic occurs. When it is called, the intel_pt driver disables Intel PT and saves its registers using pt_event_stop(), which is also used by pmu.stop handler. This function stops Intel PT on the CPU where it is working, therefore users of it need to call it for each CPU to stop all logging. Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin<alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: H.Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/r/1446614553-6072-2-git-send-email-indou.takao@jp.fujitsu.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles is not free anymore, as it has moved to a separate MSR. For callstack mode we don't need any of this information; so we can avoid the unnecessary MSR read. Add flags to the perf interface where perf record can request not collecting this information. Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual for branch_sample_types to be negative (disable), not positive (enable), but since the legacy ABI reported the flags we need some form of explicit disabling to avoid breaking the ABI. After we have the flags the x86 perf code can keep track if any users need the flags. If noone needs it the information is not collected. This cuts down the cost of LBR callstack on Skylake significantly. Profiling a kernel build with LBR call stack the average run time of the PMI handler drops by 43%. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
Change the perf user stack walking to use the new __copy_from_user_nmi(), and split each access into word sized transfer sizes. This allows to inline the complete access and optimize it all into a single load. The main advantage is that this avoids the overhead of double page faults. When normal copy_from_user() fails it reexecutes the copy to compute an accurate number of non copied bytes. This leads to executing the expensive page fault twice. While walking stacks having a fault at some point is relatively common (typically when some part of the program isn't compiled with frame pointers), so this is a large overhead. With the optimized copies we avoid this problem because they only do all accesses once. And of course they're much faster too when the access does not fault because they're just single instructions instead of complex function calls. While profiling a kernel build with -g, the patch brings down the average time of the PMI handler from 966ns to 552ns (-43%). Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1445551641-13379-2-git-send-email-andi@firstfloor.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
Add a inlined __ variant of copy_from_user_nmi. The inlined variant allows the user to: - batch the access_ok() check for multiple accesses - avoid having a pagefault_disable/enable() on every access if the caller already ensures disabled page faults due to its context. - get all the optimizations in copy_*_user() for small constant sized transfers It is just a define to __copy_from_user_inatomic(). Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1445551641-13379-1-git-send-email-andi@firstfloor.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Allows BPF scriptlets specify arguments to be fetched using DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan) - Allow attaching BPF scriptlets to module symbols (Wang Nan) - Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan) - BPF programs now can specify 'perf probe' tunables via its section name, separating key=val values using semicolons (Wang Nan) Testing some of these new BPF features: Use case: get callchains when receiving SSL packets, filter then in the kernel, at arbitrary place. # cat ssl.bpf.c #define SEC(NAME) __attribute__((section(NAME), used)) struct pt_regs; SEC("func=__inet_lookup_established hnum") int func(struct pt_regs *ctx, int err, unsigned short port) { return err == 0 && port == 443; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # # perf record -a -g -e ssl.bpf.c ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ] # perf script | head -30 swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux) 856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux) 2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux) 2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux) 96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux) 969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux) 2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux) 95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux) 1163ffa start_kernel ([kernel.vmlinux].init.text) 11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text) 1163623 x86_64_start_kernel ([kernel.vmlinux].init.text) qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux) 8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux) 430a br_handle_frame_finish ([bridge]) 48bc br_handle_frame ([bridge]) 855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) # Use 'perf probe' various options to list functions, see what variables can be collected at any given point, experiment first collecting without a filter, then filter, use it together with 'perf trace', 'perf top', with or without callchains, if it explodes, please tell us! - Introduce a new callchain mode: "folded", that will list per line representations of all callchains for a give histogram entry, facilitating 'perf report' output processing by other tools, such as Brendan Gregg's flamegraph tools (Namhyung Kim) E.g: # perf report | grep -v ^# | head 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry | ---cpu_startup_entry | |--12.07%--start_secondary | --6.30%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel # Becomes, in "folded" mode: # perf report -g folded | grep -v ^# | head -5 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry 12.07% cpu_startup_entry;start_secondary 6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle 11.23% call_cpuidle;cpu_startup_entry;start_secondary 5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter 11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state # The user can also select one of "count", "period" or "percent" as the first column. Infrastructure changes: - Fix multiple leaks found with Valgrind and a refcount debugger (Masami Hiramatsu) - Add further 'perf test' entries for BPF and LLVM (Wang Nan) - Improve 'perf test' to suport subtests, so that the series of tests performed in the LLVM and BPF main tests appear in the default 'perf test' output (Wang Nan) - Move memdup() from tools/perf to tools/lib/string.c (Arnaldo Carvalho de Melo) - Adopt strtobool() from the kernel into tools/lib/ (Wang Nan) - Fix selftests_install tools/ Makefile rule (Kevin Hilman) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
There were still a number of references to my old Red Hat email address in the kernel source. Remove these while keeping the Red Hat copyright notices intact. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
This fixes a bug I added in the following commit: 90405aa0 ("perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode") The bug could lead to lost LBR call stacks. When restoring the LBR state we need to use the TOS of the previous context, not the current context. To do that we need to save/restore the TOS. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1445366797-30894-1-git-send-email-andi@firstfloor.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
While still valid, I'm trying to phase out this email address. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Stephane Eranian authored
This patch reinforces the lockdep checks performed by perf_cgroup_from_tsk() by passing the perf_event_context whenever possible. It is okay to not hold the RCU read lock when we know we hold the ctx->lock. This patch makes sure this property holds. In some functions, such as perf_cgroup_sched_in(), we do not pass the context because we are sure we are holding the RCU read lock. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: edumazet@google.com Link: http://lkml.kernel.org/r/1447322404-10920-3-git-send-email-eranian@google.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Stephane Eranian authored
The RCU checker detected RCU violation in the cgroup switching routines perf_cgroup_sched_in() and perf_cgroup_sched_out(). We were dereferencing cgroup from task without holding the RCU lock. Fix this by holding the RCU read lock. We move the locking from perf_cgroup_switch() to avoid double locking. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: edumazet@google.com Link: http://lkml.kernel.org/r/1447322404-10920-2-git-send-email-eranian@google.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Linus Torvalds authored
-
- 22 Nov, 2015 1 commit
-
-
Linus Torvalds authored
Merge slub bulk allocator updates from Andrew Morton: "This missed the merge window because I was waiting for some repairs to come in. Nothing actually uses the bulk allocator yet and the changes to other code paths are pretty small. And the net guys are waiting for this so they can start merging the client code" More comments from Jesper Dangaard Brouer: "The kmem_cache_alloc_bulk() call, in mm/slub.c, were included in previous kernel. The present version contains a bug. Vladimir Davydov noticed it contained a bug, when kernel is compiled with CONFIG_MEMCG_KMEM (see commit 03ec0ed5: "slub: fix kmem cgroup bug in kmem_cache_alloc_bulk"). Plus the mem cgroup counterpart in kmem_cache_free_bulk() were missing (see commit 03374518 "slub: add missing kmem cgroup support to kmem_cache_free_bulk"). I don't consider the fix stable-material because there are no in-tree users of the API. But with known bugs (for memcg) I cannot start using the API in the net-tree" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: slab/slub: adjust kmem_cache_alloc_bulk API slub: add missing kmem cgroup support to kmem_cache_free_bulk slub: fix kmem cgroup bug in kmem_cache_alloc_bulk slub: optimize bulk slowpath free by detached freelist slub: support for bulk free with SLUB freelists
-