1. 06 Oct, 2015 2 commits
    • perf/x86: Add Intel cstate PMUs support · 7ce1346a
      Kan Liang authored
      This patch adds new PMUs to support cstate related free running
      (read-only) counters. These counters may be used simultaneously by other
      tools, such as turbostat. However, it still makes sense to implement them
      in perf, because we can conveniently collect them together with other
      events, and tools can use them without special MSR access code.
      
      These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
      According to counters' scope and category, two PMUs are registered with
      the perf_event core subsystem.
      
       - 'cstate_core': The counter is available for each physical core. The
                        counters include CORE_C*_RESIDENCY.
      
       - 'cstate_pkg':  The counter is available for each physical package. The
                        counters include PKG_C*_RESIDENCY.
      
      The events are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
        /sys/devices/cstate_core/events/c*-residency
        /sys/devices/cstate_pkg/events/c*-residency
      
      These events only support system-wide mode counting.
      The /sys/devices/cstate_*/cpumask file can be used by tools to figure
      out which CPUs to monitor by default.
      
      The PMU type (attr->type) is dynamically allocated and is available from
      /sys/devices/core_misc/type and /sys/devices/cstate_*/type.
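A tool that wants to open these events needs the PMU type and the default CPU set from sysfs. The following is a minimal sketch of how that lookup might be done. The helper names `parse_cpulist` and `read_pmu` are hypothetical, and the sketch assumes the cpumask file holds a CPU list such as `0-3,8`.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a CPU list such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty file or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_pmu(name, sysfs_root="/sys/devices"):
    """Return (attr_type, default_cpus) for a PMU directory, e.g. cstate_core.

    sysfs_root is a parameter only so the function can be exercised
    against a scratch directory; it is not part of any perf interface.
    """
    pmu_dir = Path(sysfs_root) / name
    pmu_type = int((pmu_dir / "type").read_text().strip())
    cpus = parse_cpulist((pmu_dir / "cpumask").read_text())
    return pmu_type, cpus
```

On a machine that has the PMU, `read_pmu("cstate_core")` would return the dynamically allocated type number and the CPUs to monitor by default.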
      
      Sampling is not supported.
      
      Here is an example.
      
       - To calculate the fraction of time the core spends in the C6 state:
         CORE_C6_time% = CORE_C6_RESIDENCY / TSC
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5
      
         11838820015,,cstate_core/c6-residency/,5175919658,100.00
         11877130740,,msr/tsc/,5175922010,100.00
      
       For sleep, 99.7% of time we ran in C6 state.
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop
      
         1253316,,cstate_core/c6-residency/,4360969154,100.00
         10012635248,,msr/tsc/,4360972366,100.00
      
       For busyloop, 0.01% of time we ran in C6 state.
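The percentages above can be reproduced from the `perf stat -x,` output. The sketch below assumes the field order shown in the example (count first, event name third). The function names are hypothetical.

```python
def parse_perf_stat_csv(output, sep=","):
    """Map event name -> count from `perf stat -x,` style lines."""
    counts = {}
    for line in output.splitlines():
        fields = line.strip().split(sep)
        # Skip blank lines and rows whose first field is not a plain count.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        counts[fields[2]] = int(fields[0])
    return counts


def c6_fraction(counts):
    """CORE_C6_time fraction = CORE_C6_RESIDENCY / TSC."""
    return counts["cstate_core/c6-residency/"] / counts["msr/tsc/"]


sleep_output = """
11838820015,,cstate_core/c6-residency/,5175919658,100.00
11877130740,,msr/tsc/,5175922010,100.00
"""
print(f"{c6_fraction(parse_perf_stat_csv(sleep_output)):.1%}")
```

Feeding the busyloop output through the same functions gives a fraction of roughly 0.000125, i.e. about 0.01%.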
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'perf-core-for-mingo' of... · 1c748dc2
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Switch the default callchain output mode to 'graph,0.5,caller', to make it
          look like the default for other tools, reducing the learning curve for
          people used to 'caller' based viewing. (Arnaldo Carvalho de Melo)
      
        - Implement column based horizontal scrolling in the hists browser (top, report),
          making it possible to use the TUI for things like 'perf mem report' where
          there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo)
      
        - Support sorting by symbol_iaddr with perf.data files produced by
          'perf mem record'. (Don Zickus)
      
        - Display DATA_SRC sample type bit, i.e. when running 'perf evlist -v' the
          "DATA_SRC" wasn't appearing when set, fix it to look like: (Jiri Olsa)
      
            cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC
      
        - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.:
      
           $ perf record -e cycles:P usleep 1
      
          Is now similar to:
      
           $ perf record usleep 1
      
          Useful, for instance, when specifying multiple events. (Jiri Olsa)
      
        - Make 'perf -v' and 'perf -h' work. (Jiri Olsa)
      
        - Fail properly when pattern matching fails to find a tracepoint, i.e.
          '-e non:existent' was being correctly handled, with a proper error message
          about that not being a valid event, but '-e non:existent*' wasn't,
          fix it. (Jiri Olsa)
      
      Infrastructure changes:
      
        - Separate arch specific entries in 'perf test' and add an 'Intel CQM' one
          to be run on x86 only. (Matt Fleming)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 05 Oct, 2015 16 commits
  3. 03 Oct, 2015 1 commit
    • Merge tag 'perf-core-for-mingo' of... · e3b0ac1b
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
       - Do event name substring search as last resort in 'perf list'.
         (Arnaldo Carvalho de Melo)
      
         E.g.:
      
          # perf list clock
      
          List of pre-defined events (to be used in -e):
      
           cpu-clock                                          [Software event]
           task-clock                                         [Software event]
      
           uncore_cbox_0/clockticks/                          [Kernel PMU event]
           uncore_cbox_1/clockticks/                          [Kernel PMU event]
      
           kvm:kvm_pvclock_update                             [Tracepoint event]
           kvm:kvm_update_master_clock                        [Tracepoint event]
           power:clock_disable                                [Tracepoint event]
           power:clock_enable                                 [Tracepoint event]
           power:clock_set_rate                               [Tracepoint event]
           syscalls:sys_enter_clock_adjtime                   [Tracepoint event]
           syscalls:sys_enter_clock_getres                    [Tracepoint event]
           syscalls:sys_enter_clock_gettime                   [Tracepoint event]
           syscalls:sys_enter_clock_nanosleep                 [Tracepoint event]
           syscalls:sys_enter_clock_settime                   [Tracepoint event]
           syscalls:sys_exit_clock_adjtime                    [Tracepoint event]
           syscalls:sys_exit_clock_getres                     [Tracepoint event]
           syscalls:sys_exit_clock_gettime                    [Tracepoint event]
           syscalls:sys_exit_clock_nanosleep                  [Tracepoint event]
           syscalls:sys_exit_clock_settime                    [Tracepoint event]
      
       - Reduce min 'perf stat --interval-print/-I' to 10ms. (Kan Liang)
      
         perf stat --interval in action:
      
         # perf stat -e cycles -I 50 -a usleep $((200 * 1000))
         print interval < 100ms. The overhead percentage could be high in some cases. Please proceed with caution.
         #   time                    counts unit events
            0.050233636         48,240,396      cycles
            0.100557098         35,492,594      cycles
            0.150804687         39,295,112      cycles
            0.201032269         33,101,961      cycles
            0.201980732            786,379      cycles
        #
      
       - Allow for max_stack greater than PERF_MAX_STACK_DEPTH, as when
         synthesizing callchains from Intel PT data. (Adrian Hunter)
      
       - Allow probing on kmodules without DWARF. (Masami Hiramatsu)
      
       - Fix a segfault when processing a perf.data file with callchains using
         "perf report --call-graph none". (Namhyung Kim)
      
       - Fix unresolved COMMs in 'perf top' when -s comm is used. (Namhyung Kim)
      
       - Register idle thread in 'perf top'. (Namhyung Kim)
      
       - Change 'record.samples' type to unsigned long long, fixing output of
         number of samples in 32-bit architectures. (Yang Shi)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 02 Oct, 2015 4 commits
    • perf stat: Reduce min --interval-print to 10ms · 19afd104
      Kan Liang authored
      The --interval-print parameter was limited to 100ms. However, for
      example, 10ms is required to do sophisticated bandwidth analysis using
      uncore events.
      
      Testing shows that the overhead of system-wide uncore monitoring
      with a 10ms interval is only ~2%. So this patch reduces the minimum
      interval-print allowed to 10ms.
      
      But 10ms may not work well for all cases. For example, when the
      cpus/threads number is very large, for system-wide core event monitoring
      the overhead could be high.
      
      To handle this issue, a warning is displayed when the
      interval-print is set between 10ms and 100ms, so users can make a
      decision according to their specific cases.
      
       # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1
      
       print interval < 100ms. The overhead percentage could be high in some
       cases. Please proceed with caution.
       #           time             counts unit events
            0.010200451               0.10 MiB  uncore_imc_1/cas_count_read/
            0.020475117               0.02 MiB  uncore_imc_1/cas_count_read/
            0.030692800               0.01 MiB  uncore_imc_1/cas_count_read/
            0.040948161               0.02 MiB  uncore_imc_1/cas_count_read/
            0.051159564               0.00 MiB  uncore_imc_1/cas_count_read/
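The interval policy described above can be summarized in a few lines. This is an illustrative Python sketch, not the perf source (which is C). The constant and function names are hypothetical, while the warning text is copied from the example output.

```python
MIN_INTERVAL_MS = 10    # new lower bound introduced by this patch
WARN_BELOW_MS = 100     # previous lower bound, now only a warning threshold

OVERHEAD_WARNING = (
    "print interval < 100ms. The overhead percentage could be high in "
    "some cases. Please proceed with caution."
)


def check_interval(interval_ms):
    """Validate a requested print interval.

    Returns a warning string for intervals in [10, 100) ms, None for
    intervals of 100 ms or more, and raises ValueError below 10 ms.
    """
    if interval_ms < MIN_INTERVAL_MS:
        raise ValueError(f"print interval must be >= {MIN_INTERVAL_MS}ms")
    if interval_ms < WARN_BELOW_MS:
        return OVERHEAD_WARNING
    return None
```

With these thresholds, `check_interval(10)` and `check_interval(50)` return the warning, and `check_interval(100)` returns `None`.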
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
      [ Added warning about overhead when using sub 100ms intervals to the man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Change 'record.samples' type to unsigned long long · 9f065194
      Yang Shi authored
      When running "perf record -e", the number of samples shown is wrong on
      some 32-bit systems, e.g. powerpc and arm.
      
      For example, run the below commands on 32 bit powerpc:
      
        perf probe -x /lib/libc.so.6 malloc
        perf record -e probe_libc:malloc -a ls perf.data
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ]
      
      In fact, "perf script" shows just 21 samples. The reported number is
      absurd because 'samples' is of type long, but it is printed as PRIu64.
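The size mismatch can be modelled in a few lines. This is only a simplified sketch of the general failure mode: a 64-bit conversion consumes eight bytes where a 32-bit long supplied four, so the remaining bytes are unrelated data. The function name and the neighbouring word are hypothetical, and the sketch does not claim to reproduce the exact value shown above or any particular calling convention.

```python
import struct


def misread_as_u64(sample_count, neighbour_word):
    """Interpret a 32-bit count plus an adjacent 32-bit word as one
    big-endian 64-bit integer, as a mismatched conversion might."""
    raw = struct.pack(">II", neighbour_word, sample_count)
    return struct.unpack(">Q", raw)[0]


# With a zero neighbour the count survives; any other neighbour inflates it.
print(misread_as_u64(21, 0))
print(misread_as_u64(21, 0xDEADBEEF))
```

Making the variable `unsigned long long`, as the patch does, means the value and the format specifier agree on 64 bits.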
      
      Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64.
      Signed-off-by: Yang Shi <yang.shi@linaro.org>
      Cc: linaro-kernel@lists.linaro.org
      Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org
      [ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Allow probing on kmodules without dwarf · 1a8ac29c
      Masami Hiramatsu authored
      Allow probing on kernel modules when 'perf' is built without debuginfo
      support.
      
      Currently perf-probe --module requires linking with libdw, but this
      doesn't make sense.
      
      E.g.
        ----
        # make NO_DWARF=1
        # ./perf probe -m pcspkr pcspkr_event%return
          Error: unknown switch `m'
        ----
      
      With this patch
        ----
        # ./perf probe -m pcspkr pcspkr_event%return
        Added new event:
          probe:pcspkr_event   (on pcspkr_event%return in pcspkr)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:pcspkr_event -aR sleep 1
        ----
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20151002125832.18617.78721.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Honour 'event_glob' when printing selectable PMUs · fa52ceab
      Arnaldo Carvalho de Melo authored
      Some PMUs, like the 'intel_bts' one can be used as an event name, i.e.:
      
      	$ perf record -e intel_bts// usleep 1
      
      Is a valid event name.
      
      But the code printing such PMUs was not honouring the 'event_glob'
      parameter, so the following line was always appearing:
      
        $ intel_bts//                                        [Kernel PMU event]
      
      Fix it:
      
        $ [acme@felicio linux]$ perf list data
      
        List of pre-defined events (to be used in -e):
      
          uncore_imc/data_reads/                             [Kernel PMU event]
          uncore_imc/data_writes/                            [Kernel PMU event]
      
        $
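The intended behaviour can be sketched as a glob filter applied uniformly to every printed name, including selectable PMUs such as `intel_bts//`. This is an illustrative Python sketch, not the perf code (which is C). `list_events` is a hypothetical name, and passing `*data*` assumes the search term is matched as a substring.

```python
from fnmatch import fnmatchcase


def list_events(names, event_glob=None):
    """Return the event names to print, honouring an optional glob."""
    if event_glob is None:
        return list(names)  # no filter: print everything
    return [name for name in names if fnmatchcase(name, event_glob)]


events = [
    "intel_bts//",
    "uncore_imc/data_reads/",
    "uncore_imc/data_writes/",
]
for name in list_events(events, "*data*"):
    print(name)
```

With the filter applied, `intel_bts//` no longer appears for a search term it does not match.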
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ajb71858n7q7ao77b8pyy74w@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 01 Oct, 2015 8 commits
  6. 30 Sep, 2015 9 commits