- 16 Mar, 2018 25 commits
-
-
Ingo Molnar authored
When recently using 'perf report --stat' it was not clear to me from the output whether a particular statistics field (LOST_SAMPLES) was not present, or just zero: fomalhaut:~> perf report --stat Aggregated stats: TOTAL events: 495984 MMAP events: 85 COMM events: 3389 EXIT events: 1605 THROTTLE events: 2 UNTHROTTLE events: 2 FORK events: 3377 SAMPLE events: 472629 MMAP2 events: 14753 FINISHED_ROUND events: 139 THREAD_MAP events: 1 CPU_MAP events: 1 TIME_CONV events: 1 I had to check the output several times to ascertain that I'm not misreading the output, that the field didn't change and that I didn't misremember the name. In fact I had to look into the perf source to make sure that zero fields are indeed not shown. With the patch applied: fomalhaut:~> perf report --stat Aggregated stats: TOTAL events: 495984 MMAP events: 85 LOST events: 0 COMM events: 3389 EXIT events: 1605 THROTTLE events: 2 UNTHROTTLE events: 2 FORK events: 3377 READ events: 0 SAMPLE events: 472629 MMAP2 events: 14753 AUX events: 0 ITRACE_START events: 0 LOST_SAMPLES events: 0 SWITCH events: 0 SWITCH_CPU_WIDE events: 0 NAMESPACES events: 0 ATTR events: 0 EVENT_TYPE events: 0 TRACING_DATA events: 0 BUILD_ID events: 0 FINISHED_ROUND events: 139 ID_INDEX events: 0 AUXTRACE_INFO events: 0 AUXTRACE events: 0 AUXTRACE_ERROR events: 0 THREAD_MAP events: 1 CPU_MAP events: 1 STAT_CONFIG events: 0 STAT events: 0 STAT_ROUND events: 0 EVENT_UPDATE events: 0 TIME_CONV events: 1 FEATURE events: 0 It's pretty clear at a glance that LOST_SAMPLES is present but zero. The original output can still be gotten via: fomalhaut:~> perf report --stat | grep -vw 0 Aggregated stats: TOTAL events: 495984 MMAP events: 85 COMM events: 3389 EXIT events: 1605 THROTTLE events: 2 UNTHROTTLE events: 2 FORK events: 3377 SAMPLE events: 472629 MMAP2 events: 14753 FINISHED_ROUND events: 139 THREAD_MAP events: 1 CPU_MAP events: 1 TIME_CONV events: 1 So I don't think there's any real loss in functionality. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20180307152430.7e5h7e657b7bgd7q@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Thomas Richter authored
Executing command 'perf stat -T -- ls' dumps core on x86 and s390. Here is the call back chain (done on x86): # gdb ./perf .... (gdb) r stat -T -- ls ... Program received signal SIGSEGV, Segmentation fault. 0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6 (gdb) where #0 0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6 #1 0x00007ffff56ae484 in asprintf () from /lib64/libc.so.6 #2 0x00000000004f1982 in __parse_events_add_pmu (parse_state=0x7fffffffd580, list=0xbfb970, name=0xbf3ef0 "cpu", head_config=0xbfb930, auto_merge_stats=false) at util/parse-events.c:1233 #3 0x00000000004f1c8e in parse_events_add_pmu (parse_state=0x7fffffffd580, list=0xbfb970, name=0xbf3ef0 "cpu", head_config=0xbfb930) at util/parse-events.c:1288 #4 0x0000000000537ce3 in parse_events_parse (_parse_state=0x7fffffffd580, scanner=0xbf4210) at util/parse-events.y:234 #5 0x00000000004f2c7a in parse_events__scanner (str=0x6b66c0 "task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}", parse_state=0x7fffffffd580, start_token=258) at util/parse-events.c:1673 #6 0x00000000004f2e23 in parse_events (evlist=0xbe9990, str=0x6b66c0 "task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}", err=0x0) at util/parse-events.c:1713 #7 0x000000000044e137 in add_default_attributes () at builtin-stat.c:2281 #8 0x000000000044f7b5 in cmd_stat (argc=1, argv=0x7fffffffe3b0) at builtin-stat.c:2828 #9 0x00000000004c8b0f in run_builtin (p=0xab01a0 <commands+288>, argc=4, argv=0x7fffffffe3b0) at perf.c:297 #10 0x00000000004c8d7c in handle_internal_command (argc=4, argv=0x7fffffffe3b0) at perf.c:349 #11 0x00000000004c8ece in run_argv (argcp=0x7fffffffe20c, argv=0x7fffffffe200) at perf.c:393 #12 0x00000000004c929c in main (argc=4, argv=0x7fffffffe3b0) at perf.c:537 (gdb) It turns out that a NULL pointer is referenced. Here are the function calls: ... cmd_stat() +---> add_default_attributes() +---> parse_events(evsel_list, transaction_attrs, NULL); 3rd parameter set to NULL Function parse_events(xx, xx, struct parse_events_error *err) dives into a bison generated scanner and creates parser state information for it first: struct parse_events_state parse_state = { .list = LIST_HEAD_INIT(parse_state.list), .idx = evlist->nr_entries, .error = err, <--- NULL POINTER !!! .evlist = evlist, }; Now various functions inside the bison scanner are called to end up in __parse_events_add_pmu(struct parse_events_state *parse_state, ..) with first parameter being a pointer to above structure definition. Now the PMU event name is not found (because being executed in a VM) and this function tries to create an error message with asprintf(&parse_state->error.str, ....) which references a NULL pointer and dumps core. Fix this by providing a pointer to the necessary error information instead of NULL. Technically only the else part is needed to avoid the core dump, just lets be safe... Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180308145735.64717-1-tmricht@linux.vnet.ibm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
This patch adds the HiSilicon hip08 JSON file. This platform follows the ARMv8 recommended IMPLEMENTATION DEFINED events, where applicable. Signed-off-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-12-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
This patch fixes the ARM Cortex-A53 json to use event definition from the ARMv8 recommended events. In addition to this change, other changes were made: - remove stray ',' - remove mirrored events in memory.json and bus.json - fixed indentation to be consistent with other ARM JSONs Signed-off-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-11-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
This patch fixes the Cavium ThunderX2 JSON to use event definitions from the ARMv8 recommended events. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-10-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
Add JSON for ARMv8 IMPLEMENTATION DEFINED recommended events. The JSON is copied from ARMv8 architecture reference manual, available here: https://static.docs.arm.com/ddi0487/ca/DDI0487C_a_armv8_arm.pdf Originally-from: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-9-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
For some architectures (like arm), there are architecture- defined events. Sometimes these events may be "recommended" according to the architecture standard, in that the implementer is free ignore the "recommendation" and create its custom event. This patch adds support for parsing standard events from arch-defined JSONs, and fixing up vendor events when they have implemented these events as standard. Support is also ensured that the vendor may implement their own custom events. A new step is added to the pmu events parsing to fix up the vendor events with the arch-standard events. The arch-defined JSONs must be placed in the arch root folder for preprocessing prior to tree JSON processing. In the vendor JSON, to specify that the arch event is supported, the keyword "ArchStdEvent" should be used, like this: [ { "ArchStdEvent": "L1D_CACHE_WR", }, ] Matching is based on the "EventName" field in the architecture JSON. No other JSON objects are strictly required. However, for other objects added, these take precedence over architecture defined standard events, thus supporting separate events which have the same event code. Signed-off-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-8-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
Since jevents now supports vendor subdirectory, relocate the Cortex-A53 JSONs to arm subdirectory. Signed-off-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-7-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
Since jevents now supports vendor subdirectory, relocate the ThunderX2 JSON to Cavium subdirectory. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-6-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
For some architectures (like arm), it is required to support a vendor subdirectory and not locate all the JSONs for a specific vendor in the same folder. This is because all the events for the same vendor will be placed in the same pmu events table, which may cause conflict. This conflict would be in the instance that a vendor's custom implemented events do have the same meaning on different platforms, so events in the pmu table would conflict. In addition, per list command may show events which are not even supported for a given platform. This patch adds support for a arch/vendor/platform directory hierarchy, while maintaining backwards-compatibility for existing arch/platform structure. In this, each platform would always have its own pmu events table. In generated file pmu_events.c, each platform table name is in the format pme{_vendor}_platform, like this: struct pmu_events_map pmu_events_map[] = { { .cpuid = "0x00000000420f5160", .version = "v1", .type = "core", .table = pme_cavium_thunderx2 }, { .cpuid = 0, .version = 0, .type = 0, .table = 0, }, }; Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-5-git-send-email-john.garry@huawei.com Link: http://lkml.kernel.org/r/1521047452-28565-1-git-send-email-john.garry@huawei.com [ Add missing limits.h include, fixing the build on at least all Alpine Linux versions tested (3.4 to 3.7 + edge), ] [ Applied a patch to fix reading ./.. directories in XFS, see second Link tag ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
Currently a topic subdirectory is supported in the pmu-events dir, in the following sample structure: /arch/platform/subtopic/mysubtopic.json Upto 256 levels of topic subdirectories are supported. So this means that JSONs may be located in a topic dir as well as the platform dir. This topic subdirectory causes problems if we want to add support for a vendor dir in the pmu-events structure (in the form arch/platform/vendor), in that we cannot differentiate between a vendor dir and a topic dir. Since the topic dir feature is not used, drop it so it does not block adding vendor subdirectory support. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-4-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
When EXPECT macro fails an assertion, the error code is not properly set after the first loop of tokens in function json_events(). This is because err is set to the return value from func function pointer call, which must be 0 to continue to loop, yet it is not reset for for each loop. I assume that this was not the intention, so change the code so err is set appropriately in EXPECT macro itself. In addition to this, the indention in EXPECT macro is tidied. The current indention alludes that the 2 statements following the if statement are in the body, which is not true. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-3-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
John Garry authored
Currently jevents supports multiple mapfiles, but this is only in the form where mapfile basename starts with 'mapfile.csv' At the moment, no architectures actually use multiple mapfiles, so drop the support for now. This patch also solves a nuisance where, when the mapfile is edited and the text editor may create a backup, jevents may use the backup, as shown: jevents: Many mapfiles? Using pmu-events/arch/arm64/mapfile.csv~, ignoring pmu-events/arch/arm64/mapfile.csv Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Link: http://lkml.kernel.org/r/1520506716-197429-2-git-send-email-john.garry@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Kim Phillips authored
Based on prior work: https://lkml.org/lkml/2014/5/6/395 and on how other arches add libdw unwind support. Includes support for running the unwind test, e.g., on a system with only elfutils' libdw 0.170, the test now runs, and successfully: $ ./perf test unwind 56: Test dwarf unwind : Ok Originally-by: Jean Pihet <jean.pihet@linaro.org> Reported-by: Christian Hansen <chansen3@cisco.com> Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180308211030.4ee4a0d6ff6dc5cda1b567d4@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding the 'PA cnt' column grouped under data cacheline address. It shows how many times the physical addresses changed for the hist entry. It does not show the number of different physical addresses for entry, because we don't store those. We only track the number of times we got different address than we currently hold, which is not expensive and gives similar info. $ perf c2c report --stdio # ----------- Cacheline ---------- Total Tot ----- LLC Load Hitm ----- # Index Address Node PA cnt records Hitm Total Lcl Rmt # ..... .................. .... ...... ....... ....... ....... ....... ....... # 0 0xffff9ad56dca0a80 0 9 10 7.69% 2 2 0 1 0xffff9ad56dce0a80 0 9 9 7.69% 2 2 0 2 0xffff9ad37659ad80 0 1 2 3.85% 1 1 0 ... # ----- HITM ----- -- Store Refs -- --------- Data address --------- # Num Rmt Lcl L1 Hit L1 Miss Offset Node PA cnt Pid # ..... ....... ....... ....... ....... .................. .... ...... ....... # ------------------------------------------------------------- 0 0 2 3 0 0xffff9ad56dca0a80 ------------------------------------------------------------- 0.00% 0.00% 33.33% 0.00% 0x0 0 1 2510 0.00% 0.00% 33.33% 0.00% 0x4 0 1 2476 0.00% 0.00% 33.33% 0.00% 0x20 0 1 0 0.00% 100.00% 0.00% 0.00% 0x38 0 1 0 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-10-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Forcing the NUMA node output to be grouped with the "Cacheline" column in both "Shared Data Cache Line Table" and "Shared Cache Line Distribution Pareto" tables. Before: # Total Tot ----- LLC Load Hitm ----- # Index Cacheline Node records Hitm Total Lcl Rmt # ..... .................. .... ....... ....... ....... ....... ....... # 0 0x7f0830100000 0 84 10.53% 8 8 0 1 0xffff922a93154200 0 3 2.63% 2 2 0 2 0xffff922a93154500 0 4 2.63% 2 2 0 After: # ------- Cacheline ------ Total Tot ----- LLC Load Hitm ----- # Index Address Node records Hitm Total Lcl Rmt # ..... .................. .... ....... ....... ....... ....... ....... # 0 0x7f0830100000 0 84 10.53% 8 8 0 1 0xffff922a93154200 0 3 2.63% 2 2 0 2 0xffff922a93154500 0 4 2.63% 2 2 0 Before: # ----- HITM ----- -- Store Refs -- Data address # Num Rmt Lcl L1 Hit L1 Miss Offset Node Pid # ..... ....... ....... ....... ....... .................. .... ....... # ------------------------------------------------------------- 0 0 8 32 2 0x7f0830100000 ------------------------------------------------------------- 0.00% 75.00% 21.88% 0.00% 0x18 0 1791 0.00% 12.50% 37.50% 0.00% 0x18 0 1791 0.00% 0.00% 34.38% 0.00% 0x18 0 1791 After: # ----- HITM ----- -- Store Refs -- ----- Data address ----- # Num Rmt Lcl L1 Hit L1 Miss Offset Node Pid # ..... ....... ....... ....... ....... .................. .... ....... # ------------------------------------------------------------- 0 0 8 32 2 0x7f0830100000 ------------------------------------------------------------- 0.00% 75.00% 21.88% 0.00% 0x18 0 1791 0.00% 12.50% 37.50% 0.00% 0x18 0 1791 0.00% 0.00% 34.38% 0.00% 0x18 0 1791 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-9-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding the NUMA node info for the data cacheline. Adding the new column to both "Shared Data Cache Line Table" and "Shared Cache Line Distribution Pareto". Note the new 'Node' column next to the 'Cacheline'. $ perf c2c report --stdio ================================================= Shared Data Cache Line Table ================================================= # # Total Tot ----- LLC Load Hitm ----- # Index Cacheline Node records Hitm Total Lcl Rmt # ..... .................. .... ....... ....... ....... ....... ....... # 0 0x7f0830100000 0 84 10.53% 8 8 0 1 0xffff922a93154200 0 3 2.63% 2 2 0 2 0xffff922a93154500 0 4 2.63% 2 2 0 ... Note the new 'Node' column next to the 'Offset'. ================================================= Shared Cache Line Distribution Pareto ================================================= # # ----- HITM ----- -- Store Refs -- Data address # Num Rmt Lcl L1 Hit L1 Miss Offset Node Pid # ..... ....... ....... ....... ....... .................. .... ....... # ------------------------------------------------------------- 0 0 8 32 2 0x7f0830100000 ------------------------------------------------------------- 0.00% 75.00% 21.88% 0.00% 0x18 0 1791 0.00% 12.50% 37.50% 0.00% 0x18 0 1791 0.00% 0.00% 34.38% 0.00% 0x18 0 1791 Using the mem2node object to get the NUMA node data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-8-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
There's no need to calculate column widths for entries that are not going to be displayed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-7-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
We are going to calculate tje column width based on the struct c2c_hist_entry data, so making calc_width to work with struct c2c_hist_entry. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-6-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
We are going to display NUMA node information in following patches. For this we need to have physical address data in the sample. Adding --phys-data as a default option for perf c2c record. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-5-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding mem2node object automated test. The test prepares few artificial nodes - memory maps and verifies the mem2node object returns proper node values to given addresses. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-4-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding mem2node object to allow the easy lookup of the node for the physical address. It has following interface: int mem2node__init(struct mem2node *map, struct perf_env *env); void mem2node__exit(struct mem2node *map); int mem2node__node(struct mem2node *map, u64 addr); The mem2node__toolsinit initialize object from the perf data file MEM_TOPOLOGY feature data. Following calls to mem2node__node will return node number for given physical address. The mem2node__exit function frees the object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-3-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Forgot to free env's memory nodes, adding needed code to perf_env__exit. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180309101442.9224-2-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Mark Rutland authored
When perf_group_dettach() is called on a group leader, it updates each sibling's group_leader field to point to that sibling, effectively upgrading each siblnig to a group leader. After perf_group_detach has completed, the caller may free the leader event. We only remove siblings from the group leader's sibling_list when the leader has a non-empty group_node. This was fine prior to commit: 8343aae6 ("perf/core: Remove perf_event::group_entry") ... as the sibling's sibling_list would be empty. However, now that we use the sibling_list field as both the list head and the list entry, this leaves each sibling with a non-empty sibling list, including the stale leader event. If perf_group_detach() is subsequently called on a sibling, it will appear to be a group leader, and we'll walk the sibling_list, potentially dereferencing these stale events. In 0day testing, this has been observed to result in kernel panics. Let's avoid this by always removing siblings from the sibling list when we promote them to leaders. Fixes: 8343aae6 ("perf/core: Remove perf_event::group_entry") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: vincent.weaver@maine.edu Cc: Peter Zijlstra <peterz@infradead.org> Cc: torvalds@linux-foundation.org Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: valery.cherepennikov@intel.com Cc: linux-tip-commits@vger.kernel.org Cc: eranian@google.com Cc: acme@redhat.com Cc: alexander.shishkin@linux.intel.com Cc: davidcc@google.com Cc: kan.liang@intel.com Cc: Dmitry.Prohorov@intel.com Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lkml.kernel.org/r/20180316131741.3svgr64yibc6vsid@lakrids.cambridge.arm.com
-
Peter Zijlstra authored
Mark noticed that the change to sibling_list changed some iteration semantics; because previously we used group_list as list entry, sibling events would always have an empty sibling_list. But because we now use sibling_list for both list head and list entry, siblings will report as having siblings. Fix this with a custom for_each_sibling_event() iterator. Fixes: 8343aae6 ("perf/core: Remove perf_event::group_entry") Reported-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: vincent.weaver@maine.edu Cc: alexander.shishkin@linux.intel.com Cc: torvalds@linux-foundation.org Cc: alexey.budankov@linux.intel.com Cc: valery.cherepennikov@intel.com Cc: eranian@google.com Cc: acme@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: davidcc@google.com Cc: kan.liang@intel.com Cc: Dmitry.Prohorov@intel.com Cc: jolsa@redhat.com Link: https://lkml.kernel.org/r/20180315170129.GX4043@hirez.programming.kicks-ass.net
-
- 13 Mar, 2018 8 commits
-
-
Milind Chabbi authored
Problem and motivation: Once a breakpoint perf event (PERF_TYPE_BREAKPOINT) is created, there is no flexibility to change the breakpoint type (bp_type), breakpoint address (bp_addr), or breakpoint length (bp_len). The only option is to close the perf event and configure a new breakpoint event. This inflexibility has a significant performance overhead. For example, sampling-based, lightweight performance profilers (and also concurrency bug detection tools), monitor different addresses for a short duration using PERF_TYPE_BREAKPOINT and change the address (bp_addr) to another address or change the kind of breakpoint (bp_type) from "write" to a "read" or vice-versa or change the length (bp_len) of the address being monitored. The cost of these modifications is prohibitive since it involves unmapping the circular buffer associated with the perf event, closing the perf event, opening another perf event and mmaping another circular buffer. Solution: The new ioctl flag for perf events, PERF_EVENT_IOC_MODIFY_ATTRIBUTES, introduced in this patch takes a pointer to a struct perf_event_attr as an argument to update an old breakpoint event with new address, type, and size. This facility allows retaining a previous mmaped perf events ring buffer and avoids having to close and reopen another perf event. This patch supports only changing PERF_TYPE_BREAKPOINT event type; future implementations can extend this feature. The patch replicates some of its functionality of modify_user_hw_breakpoint() in kernel/events/hw_breakpoint.c. modify_user_hw_breakpoint cannot be called directly since perf_event_ctx_lock() is already held in _perf_ioctl(). Evidence: Experiments show that the baseline (not able to modify an already created breakpoint) costs an order of magnitude (~10x) more than the suggested optimization (having the ability to dynamically modifying a configured breakpoint via ioctl). When the breakpoints typically do not trap, the speedup due to the suggested optimization is ~10x; even when the breakpoints always trap, the speedup is ~4x due to the suggested optimization. Testing: tests posted at https://github.com/linux-contrib/perf_event_modify_bp demonstrate the performance significance of this patch. Tests also check the functional correctness of the patch. Signed-off-by: Milind Chabbi <chabbi.milind@gmail.com> [ Using modify_user_hw_breakpoint_check function. ] [ Reformated PERF_EVENT_IOC_*, so the values are all in one column. ] Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-8-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Adding test that: - detects the number of watch/break-points, skip test if any is missing - detects PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl, skip test if it's missing - detects if watchpoints and breakpoints share same slots - create all possible watchpoints on cpu 0 - change one of it to breakpoint - in case wp and bp do not share slots, we create another watchpoint to ensure the slot accounting is correct Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-9-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Move the sample_max_stack check and setup into perf_copy_attr(), so we have all perf_event_attr initial setup in one place and can easily compare attrs in the new ioctl introduced in following change. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-7-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
And rename it to modify_user_hw_breakpoint_check(). We are about to use modify_user_hw_breakpoint_check() for user space breakpoints modification, we must be very strict to check only the fields we can change have changed. As Peter explained: "Suppose someone does: attr = malloc(sizeof(*attr)); // uninitialized memory attr->type = BP; attr->bp_addr = new_addr; attr->bp_type = bp_type; attr->bp_len = bp_len; ioctl(fd, PERF_IOC_MOD_ATTR, &attr); And feeds absolute shite for the rest of the fields. Then we later want to extend IOC_MOD_ATTR to allow changing attr::sample_type but we can't, because that would break the above application." I'm making this check optional because we already export modify_user_hw_breakpoint() and with this check we could break existing users. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-6-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Moving out all the functionality without the events disabling/enabling calls, because we want to call another disabling/enabling functions in following change. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-5-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Add the modify_bp_slot() function to keep slot numbers correct when changing the breakpoint type. Using existing __release_bp_slot()/__reserve_bp_slot() call sequence to update the slot counts. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-4-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Passing bp_type argument to __reserve_bp_slot() and __release_bp_slot() functions, so we can pass another bp_type than the one defined in bp->attr.bp_type. This will be handy in following change that fixes breakpoint slot counts during its modification. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-3-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jiri Olsa authored
Pass bp_type directly as a find_slot_idx() argument, so we don't need to have whole event to get the breakpoint slot type. It will be used in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Milind Chabbi <chabbi.milind@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <onestero@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180312134548.31532-2-jolsa@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 12 Mar, 2018 7 commits
-
-
leilei.lin authored
There's two problems when installing cgroup events on CPUs: firstly list_update_cgroup_event() only tries to set cpuctx->cgrp for the first event, if that mismatches on @cgrp we'll not try again for later additions. Secondly, when we install a cgroup event into an active context, only issue an event reprogram when the event matches the current cgroup context. This avoids a pointless event reprogramming. Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com> [ Improved the changelog and comments. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: brendan.d.gregg@gmail.com Cc: eranian@gmail.com Cc: linux-kernel@vger.kernel.org Cc: yang_oliver@hotmail.com Link: http://lkml.kernel.org/r/20180306093637.28247-1-linxiulei@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
The event schedule order (as per perf_event_sched_in()) is: - cpu pinned - task pinned - cpu flexible - task flexible But perf_rotate_context() will unschedule cpu-flexible even if it doesn't need a rotation. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Similar to how first programming cpu=-1 and then cpu=# is wrong, so is rotating both. It was especially wrong when we were still programming the PMU in this same order, because in that scenario we might never actually end up running cpu=# events at all. Cure this by using the active_list to pick the rotation event; since at programming we already select the left-most event. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
The last argument is, and always must be, the same. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
When an event group contains more events than can be scheduled on the hardware, iterating the full event group for ctx_sched_out is a waste of time. Keep track of the events that got programmed on the hardware, such that we can iterate this smaller list in order to schedule them out. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Now that all the grouping is done with RB trees, we no longer need group_entry and can replace the whole thing with sibling_list. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Scheduling in events with cpu=-1 before events with cpu=# changes semantics and is undesirable in that it would priorize these events. Given that groups->index is across all groups we actually have an inter-group ordering, meaning we can merge-sort two groups, which is just what we need to preserve semantics. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Dmitri Prokhorov <Dmitry.Prohorov@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valery Cherepennikov <valery.cherepennikov@intel.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
-