1. 06 Oct, 2015 6 commits
    • Jiri Olsa's avatar
      perf tools: Use hpp_dimension__add_output to register hpp columns · 1178bfd4
      Jiri Olsa authored
      The perf_hpp__init currently does not respect sorting dimensions and the
      setup_sorting function could endup queueing same format twice. That
      screwed up the perf_hpp__list and got stuck in loop within
      perf_hpp__setup_output_field function.
      
        $ perf report -F +overhead
      
        0x00000000004c1355 in perf_hpp__is_sort_entry (format=format@entry=0x880440 <perf_hpp.format>) at util/sort.c:1506
        1506    {
      
           #0  0x00000000004c1355 in perf_hpp__is_sort_entry (format=format@entry=0x880440 <perf_hpp.format>) at util/sort.c:1506
           #1  0x00000000004c139d in perf_hpp__same_sort_entry (a=a@entry=0x880440 <perf_hpp.format>, b=b@entry=0x2bb2fe0) at util/sort.c:1380
           #2  0x00000000004f8d3c in perf_hpp__setup_output_field () at ui/hist.c:554
           #3  0x00000000004c1d1e in setup_sorting () at util/sort.c:1984
           #4  0x000000000042efbf in cmd_report (argc=0, argv=0x7ffea5a0e790, prefix=<optimized out>) at builtin-report.c:874
           #5  0x0000000000476f13 in run_builtin (p=p@entry=0x875628 <commands+168>, argc=argc@entry=3, argv=argv@entry=0x7ffea5a0e790) at perf.c:385
           #6  0x000000000047710b in handle_internal_command (argc=3, argv=0x7ffea5a0e790) at perf.c:445
           #7  0x0000000000477176 in run_argv (argcp=argcp@entry=0x7ffea5a0e5fc, argv=argv@entry=0x7ffea5a0e5f0) at perf.c:489
           #8  0x00000000004773e7 in main (argc=3, argv=0x7ffea5a0e790) at perf.c:606
      
      Using hpp_dimension__add_output function to register the output column.
      It will also mark the dimension as taken and omit above stuck.
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444134312-29136-4-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1178bfd4
    • Jiri Olsa's avatar
      perf tools: Introduce hpp_dimension__add_output function · beeaaeb3
      Jiri Olsa authored
      This function will allow to register output column from ui code and
      respect taken sort/output dimensions.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444134312-29136-3-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      beeaaeb3
    • Jiri Olsa's avatar
      perf tools: Get rid of superfluos call to reset_dimensions · 0974d2c9
      Jiri Olsa authored
      There's no need to call reset_dimensions within __setup_output_field
      function. It's already called in its caller setup_sorting right before
      perf_hpp__init, which will be changed in following patch to respect
      taken dimension.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444134312-29136-2-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0974d2c9
    • Taku Izumi's avatar
      perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore · 712df65c
      Taku Izumi authored
      In multi-segment system, uncore devices may belong to buses whose segment
      number is other than 0:
      
        ....
        0000:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:7f:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:bf:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03
        ...
      
      In that case, relation of bus number and physical id may be broken
      because "uncore_pcibus_to_physid" doesn't take account of PCI segment.
      For example, bus 0000:ff and 0001:ff uses the same entry of
      "uncore_pcibus_to_physid" array.
      
      This patch fixes this problem by introducing the segment-aware pci2phy_map instead.
      Signed-off-by: default avatarTaku Izumi <izumi.taku@jp.fujitsu.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1443096621-4119-1-git-send-email-izumi.taku@jp.fujitsu.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      712df65c
    • Kan Liang's avatar
      perf/x86: Add Intel cstate PMUs support · 7ce1346a
      Kan Liang authored
      This patch adds new PMUs to support cstate related free running
      (read-only) counters. These counters may be used simultaneously by other
      tools, such as turbostat. However, it still make sense to implement them
      in perf. Because we can conveniently collect them together with other
      events, and allow to use them from tools without special MSR access
      code.
      
      These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
      According to counters' scope and category, two PMUs are registered with
      the perf_event core subsystem.
      
       - 'cstate_core': The counter is available for each physical core. The
                        counters include CORE_C*_RESIDENCY.
      
       - 'cstate_pkg':  The counter is available for each physical package. The
                        counters include PKG_C*_RESIDENCY.
      
      The events are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
        /sys/devices/cstate_core/events/c*-residency
        /sys/devices/cstate_pkg/events/c*-residency
      
      These events only support system-wide mode counting.
      The /sys/devices/cstate_*/cpumask file can be used by tools to figure
      out which CPUs to monitor by default.
      
      The PMU type (attr->type) is dynamically allocated and is available from
      /sys/devices/core_misc/type and /sys/device/cstate_*/type.
      
      Sampling is not supported.
      
      Here is an example.
      
       - To caculate the fraction of time when the core is running in C6 state
         CORE_C6_time% = CORE_C6_RESIDENCY / TSC
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5
      
         11838820015,,cstate_core/c6-residency/,5175919658,100.00
         11877130740,,msr/tsc/,5175922010,100.00
      
       For sleep, 99.7% of time we ran in C6 state.
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop
      
         1253316,,cstate_core/c6-residency/,4360969154,100.00
         10012635248,,msr/tsc/,4360972366,100.00
      
       For busyloop, 0.01% of time we ran in C6 state.
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7ce1346a
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · 1c748dc2
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Switch the default callchain output mode to 'graph,0.5,caller', to make it
          look like the default for other tools, reducing the learning curve for
          people used to 'caller' based viewing. (Arnaldo Carvalho de Melo)
      
        - Implement column based horizontal scrolling in the hists browser (top, report),
          making it possible to use the TUI for things like 'perf mem report' where
          there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo)
      
        - Support sorting by symbol_iaddr with perf.data files produced by
          'perf mem record'. (Don Zickus)
      
        - Display DATA_SRC sample type bit, i.e. when running 'perf evlist -v' the
          "DATA_SRC" wasn't appearing when set, fix it to look like: (Jiri Olsa)
      
            cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC
      
        - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.:
      
           $ perf record -e cycles:P usleep 1
      
          Is now similar to:
      
           $ perf record usleep 1
      
          Useful, for instance, when specifying multiple events. (Jiri Olsa)
      
        - Make 'perf -v' and 'perf -h' work. (Jiri Olsa)
      
        - Fail properly when pattern matching fails to find a tracepoint, i.e.
          '-e non:existent' was being correctly handled, with a proper error message
          about that not being a valid event, but '-e non:existent*' wasn't,
          fix it. (Jiri Olsa)
      
      Infrastructure changes:
      
        - Separate arch specific entries in 'perf test' and add an 'Intel CQM' one
          to be fun on x86 only. (Matt Fleming)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1c748dc2
  2. 05 Oct, 2015 16 commits
  3. 03 Oct, 2015 1 commit
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · e3b0ac1b
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
       - Do event name substring search as last resort in 'perf list'.
         (Arnaldo Carvalho de Melo)
      
         E.g.:
      
          # perf list clock
      
          List of pre-defined events (to be used in -e):
      
           cpu-clock                                          [Software event]
           task-clock                                         [Software event]
      
           uncore_cbox_0/clockticks/                          [Kernel PMU event]
           uncore_cbox_1/clockticks/                          [Kernel PMU event]
      
           kvm:kvm_pvclock_update                             [Tracepoint event]
           kvm:kvm_update_master_clock                        [Tracepoint event]
           power:clock_disable                                [Tracepoint event]
           power:clock_enable                                 [Tracepoint event]
           power:clock_set_rate                               [Tracepoint event]
           syscalls:sys_enter_clock_adjtime                   [Tracepoint event]
           syscalls:sys_enter_clock_getres                    [Tracepoint event]
           syscalls:sys_enter_clock_gettime                   [Tracepoint event]
           syscalls:sys_enter_clock_nanosleep                 [Tracepoint event]
           syscalls:sys_enter_clock_settime                   [Tracepoint event]
           syscalls:sys_exit_clock_adjtime                    [Tracepoint event]
           syscalls:sys_exit_clock_getres                     [Tracepoint event]
           syscalls:sys_exit_clock_gettime                    [Tracepoint event]
           syscalls:sys_exit_clock_nanosleep                  [Tracepoint event]
           syscalls:sys_exit_clock_settime                    [Tracepoint event]
      
       - Reduce min 'perf stat --interval-print/-I' to 10ms. (Kan Liang)
      
         perf stat --interval in action:
      
         # perf stat -e cycles -I 50 -a usleep $((200 * 1000))
         print interval < 100ms. The overhead percentage could be high in some cases. Please proceed with caution.
         #   time                    counts unit events
            0.050233636         48,240,396      cycles
            0.100557098         35,492,594      cycles
            0.150804687         39,295,112      cycles
            0.201032269         33,101,961      cycles
            0.201980732            786,379      cycles
        #
      
       - Allow for max_stack greater than PERF_MAX_STACK_DEPTH, as when
         synthesizing callchains from Intel PT data. (Adrian Hunter)
      
       - Allow probing on kmodules without DWARF. (Masami Hiramatsu)
      
       - Fix a segfault when processing a perf.data file with callchains using
         "perf report --call-graph none". (Namhyung Kim)
      
       - Fix unresolved COMMs in 'perf top' when -s comm is used. (Namhyung Kim)
      
       - Register idle thread in 'perf top'. (Namhyung Kim)
      
       - Change 'record.samples' type to unsigned long long, fixing output of
         number of samples in 32-bit architectures. (Yang Shi)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e3b0ac1b
  4. 02 Oct, 2015 4 commits
    • Kan Liang's avatar
      perf stat: Reduce min --interval-print to 10ms · 19afd104
      Kan Liang authored
      The --interval-print parameter was limited to 100ms. However, for
      example, 10ms is required to do sophisticated bandwidth analysis using
      uncore events.
      
      The test shows that the overhead of the system-wide uncore monitoring
      with 10ms interval is only ~2%. So this patch reduces the minimal
      interval-print allowd to 10ms.
      
      But 10ms may not work well for all cases. For example, when the
      cpus/threads number is very large, for system-wide core event monitoring
      the overhead could be high.
      
      To handle this issue, a warning will be displayed when the
      interval-print is set between 10ms to 100ms. So users can make a
      decision according to their specific cases.
      
       # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1
      
       print interval < 100ms. The overhead percentage could be high in some
       cases. Please proceed with caution.
       #           time             counts unit events
            0.010200451               0.10 MiB  uncore_imc_1/cas_count_read/
            0.020475117               0.02 MiB  uncore_imc_1/cas_count_read/
            0.030692800               0.01 MiB  uncore_imc_1/cas_count_read/
            0.040948161               0.02 MiB  uncore_imc_1/cas_count_read/
            0.051159564               0.00 MiB  uncore_imc_1/cas_count_read/
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
      [ Added warning about overhead when using sub 100ms intervals to the man page ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      19afd104
    • Yang Shi's avatar
      perf record: Change 'record.samples' type to unsigned long long · 9f065194
      Yang Shi authored
      When run "perf record -e", the number of samples showed up is wrong on some
      32 bit systems, i.e. powerpc and arm.
      
      For example, run the below commands on 32 bit powerpc:
      
        perf probe -x /lib/libc.so.6 malloc
        perf record -e probe_libc:malloc -a ls perf.data
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ]
      
      Actually, "perf script" just shows 21 samples. The number of samples is also
      absurd since samples is long type, but it is printed as PRIu64.
      
      Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64.
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Cc: linaro-kernel@lists.linaro.org
      Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org
      [ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9f065194
    • Masami Hiramatsu's avatar
      perf probe: Allow probing on kmodules without dwarf · 1a8ac29c
      Masami Hiramatsu authored
      Allow probing on kernel modules when 'perf' is built without debuginfo
      support.
      
      Currently perf-probe --module requires linking with libdw, but this
      doesn't make sense.
      
      E.g.
        ----
        # make NO_DWARF=1
        # ./perf probe -m pcspkr pcspkr_event%return
          Error: unknown switch `m'
        ----
      
      With this patch
        ----
        # ./perf probe -m pcspkr pcspkr_event%return
        Added new event:
          probe:pcspkr_event   (on pcspkr_event%return in pcspkr)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:pcspkr_event -aR sleep 1
        ----
      Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20151002125832.18617.78721.stgit@localhost.localdomainSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1a8ac29c
    • Arnaldo Carvalho de Melo's avatar
      perf list: Honour 'event_glob' whem printing selectable PMUs · fa52ceab
      Arnaldo Carvalho de Melo authored
      Some PMUs, like the 'intel_bts' one can be used as an event name, i.e.:
      
      	$ perf record -e intel_bts:// usleep 1
      
      Is a valid event name.
      
      But the code printing such PMUs was not honouring the 'event_glob'
      parameter, so the following line was always appearing:
      
        $ intel_bts//                                        [Kernel PMU event]
      
      Fix it:
      
        $ [acme@felicio linux]$ perf list data
      
        List of pre-defined events (to be used in -e):
      
          uncore_imc/data_reads/                             [Kernel PMU event]
          uncore_imc/data_writes/                            [Kernel PMU event]
      
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ajb71858n7q7ao77b8pyy74w@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fa52ceab
  5. 01 Oct, 2015 8 commits
  6. 30 Sep, 2015 5 commits
    • Arnaldo Carvalho de Melo's avatar
      perf tools: By default use the most precise "cycles" hw counter available · 7f8d1ade
      Arnaldo Carvalho de Melo authored
      If the user doesn't specify any event, try the most precise "cycles"
      available, i.e. start by "cycles:ppp" and go on removing "p" till it
      works.
      
      E.g.
      
        $ perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data (11 samples) ]
        $ perf evlist
        cycles:pp
        $ perf evlist -v
        cycles:pp: size: 112, { sample_period, sample_freq }: 4000, sample_type:
        IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
        enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1,
        exclude_guest: 1, mmap2: 1, comm_exec: 1
        $ grep 'model name' /proc/cpuinfo | head -1
        model name	: Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
        $
      
      When 'cycles' appears explicitely is specified this will not be tried,
      i.e. the user has full control of the level of precision to be used:
      
        $ perf record -e cycles usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.016 MB perf.data (9 samples) ]
        $ perf evlist
        cycles
        $ perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
        IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
        enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
        1, comm_exec: 1
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://www.youtube.com/watch?v=nXaxk27zwlk
      Link: http://lkml.kernel.org/n/tip-b1ywebmt22pi78vjxau01wth@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7f8d1ade
    • Arnaldo Carvalho de Melo's avatar
      perf list: Remove blank lines, headers when piping output · dfc431cb
      Arnaldo Carvalho de Melo authored
      So that one can, for instance, use it with wc -l:
      
        # perf list *:*write* | wc -l
        60
      
      Or to look for the "bio" tracepoints, without 'perf list' headers:
      
        # perf list *:*bio* | head
          block:block_bio_backmerge                          [Tracepoint event]
          block:block_bio_bounce                             [Tracepoint event]
          block:block_bio_complete                           [Tracepoint event]
          block:block_bio_frontmerge                         [Tracepoint event]
          block:block_bio_queue                              [Tracepoint event]
          block:block_bio_remap                              [Tracepoint event]
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ts7sc0x8u4io4cifzkup4j44@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      dfc431cb
    • Masami Hiramatsu's avatar
      perf probe: Improve error message when %return is on inlined function · 6cca13bd
      Masami Hiramatsu authored
      perf probe shows more precisely message when it finds given
      %return target function is inlined.
      
      Without this fix:
        ----
        # ./perf probe -V getname_flags%return
        Return probe must be on the head of a real function.
        Debuginfo analysis failed.
          Error: Failed to show vars.
        ----
      
      With this fix:
        ----
        # ./perf probe -V getname_flags%return
        Failed to find "getname_flags%return",
         because getname_flags is an inlined function and has no return point.
        Debuginfo analysis failed.
          Error: Failed to show vars.
        ----
      Suggested-by: default avatarArnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164137.3733.55055.stgit@localhost.localdomainSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6cca13bd
    • Masami Hiramatsu's avatar
      perf probe: Fix a segfault bug in debuginfo_cache · 20f49859
      Masami Hiramatsu authored
      perf probe --list will get a segfault if the first kprobe event is on a
      module and the second or latter one is on the kernel.
      
      e.g.
        ----
        # ./perf probe -q -m pcspkr pcspkr_event
        # ./perf probe -q vfs_read
        # ./perf probe -l
        Segmentation fault (core dumped)
        ----
      
      This is because the debuginfo_cache fails to handle NULL module name,
      which causes segfault on strcmp. (Note that strcmp("something", NULL)
      always causes segfault)
      
      To fix this debuginfo_cache__open always translates the NULL module name
      to "kernel" (this is correct, because NULL module name means opening the
      debuginfo for the kernel)
      
        ----
        # ./perf probe -l
          probe:pcspkr_event   (on pcspkr_event@drivers/input/misc/pcspkr.c
          in pcspkr)
          probe:vfs_read       (on vfs_read@ksrc/linux-3/fs/read_write.c)
        ----
      Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164135.3733.23993.stgit@localhost.localdomainSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      20f49859
    • Masami Hiramatsu's avatar
      perf probe: Show correct source lines of probes on kmodules · 9b239a12
      Masami Hiramatsu authored
      Perf probe always failed to find appropriate line numbers because of
      failing to find .text start address offset from debuginfo.
      
      e.g.
        ----
        # ./perf probe -m pcspkr pcspkr_event:5
        Added new events:
          probe:pcspkr_event   (on pcspkr_event:5 in pcspkr)
          probe:pcspkr_event_1 (on pcspkr_event:5 in pcspkr)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:pcspkr_event_1 -aR sleep 1
      
        # ./perf probe -l
        Failed to find debug information for address ffffffffa031f006
        Failed to find debug information for address ffffffffa031f016
          probe:pcspkr_event   (on pcspkr_event+6 in pcspkr)
          probe:pcspkr_event_1 (on pcspkr_event+22 in pcspkr)
        ----
      
      This fixes the above issue as below.
      1. Get the relative address of the symbol in .text by using
         map->start.
      2. Adjust the address by adding the offset of .text section
         in the kernel module binary.
      
      With this fix, perf probe -l shows lines correctly.
        ----
        # ./perf probe -l
          probe:pcspkr_event   (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
          probe:pcspkr_event_1 (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
        ----
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164132.3733.24643.stgit@localhost.localdomainSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9b239a12