1. 05 Mar, 2018 14 commits
  2. 27 Feb, 2018 1 commit
    • perf stat: Ignore error thread when enabling system-wide --per-thread · ab6c79b8
      Jin Yao authored
      If we execute 'perf stat --per-thread' as a non-root user (even with
      kernel.perf_event_paranoid = -1 set), it reports the error:
      
        jinyao@skl:~$ perf stat --per-thread
        Error:
        You may not have permission to collect system-wide stats.
      
        Consider tweaking /proc/sys/kernel/perf_event_paranoid,
        which controls use of the performance events system by
        unprivileged users (without CAP_SYS_ADMIN).
      
        The current value is 2:
      
          -1: Allow use of (almost) all events by all users
              Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
        >= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
              Disallow raw tracepoint access by users without CAP_SYS_ADMIN
        >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
        >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
      
        To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
      
                kernel.perf_event_paranoid = -1
      
      Perhaps the ptrace rules don't allow tracing some processes. In any case,
      the global --per-thread mode should ignore such errors and continue
      working on the other threads.

      This patch records the index of the failing thread in perf_evsel__open()
      and removes that thread before retrying.
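
The retry strategy described above can be sketched as follows. This is a hypothetical Python model, not the perf C code: `open_counter` stands in for the per-thread open call, and `open_all_threads` and `fake_open` are names invented for illustration.

```python
def open_all_threads(tids, open_counter):
    """Open a counter for every thread, dropping threads that fail.

    open_counter(tid) is a hypothetical stand-in for the per-thread
    open call; it raises PermissionError when the thread may not be
    monitored.
    """
    remaining = list(tids)
    opened = {}
    while True:
        failed_index = None
        for index, tid in enumerate(remaining):
            if tid in opened:
                continue  # already opened in an earlier pass
            try:
                opened[tid] = open_counter(tid)
            except PermissionError:
                failed_index = index  # record the index of the error thread
                break
        if failed_index is None:
            return remaining, opened
        # Remove the offending thread, then retry with the rest.
        del remaining[failed_index]


def fake_open(tid):
    # Hypothetical policy for the demo: odd thread ids may not be traced.
    if tid % 2:
        raise PermissionError(tid)
    return "fd-%d" % tid


threads, fds = open_all_threads([2, 3, 4, 7, 8], fake_open)
print(threads)
```

The point of the design is that a single unreadable thread no longer aborts the whole system-wide run; only the threads that can be opened are kept.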
      
      For example (run as non-root, with kernel.perf_event_paranoid left unset):
      
        jinyao@skl:~$ perf stat --per-thread
        ^C
         Performance counter stats for 'system wide':
      
               vmstat-3458    6.171984   cpu-clock:u (msec) #  0.000 CPUs utilized
                 perf-3670    0.515599   cpu-clock:u (msec) #  0.000 CPUs utilized
               vmstat-3458   1,163,643   cycles:u           #  0.189 GHz
                 perf-3670      40,881   cycles:u           #  0.079 GHz
               vmstat-3458   1,410,238   instructions:u     #  1.21  insn per cycle
                 perf-3670       3,536   instructions:u     #  0.09  insn per cycle
               vmstat-3458     288,937   branches:u         # 46.814 M/sec
                 perf-3670         936   branches:u         #  1.815 M/sec
               vmstat-3458      15,195   branch-misses:u    #  5.26% of all branches
                 perf-3670          76   branch-misses:u    #  8.12% of all branches
      
              12.651675247 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 26 Feb, 2018 1 commit
  4. 22 Feb, 2018 1 commit
  5. 21 Feb, 2018 3 commits
  6. 19 Feb, 2018 5 commits
    • perf tools: Add Python 3 support · 66dfdff0
      Jaroslav Škarvada authored
      Added Python 3 support while keeping Python 2.7 compatibility.
      
      Committer notes:
      
      This doesn't make it auto-detect python 3; one has to explicitly ask
      it to build with the python 3 devel files. Here are the instructions
      provided by Jaroslav:
      
       ---
        $ cp -a tools/perf tools/python3-perf
        $ make V=1 prefix=/usr -C tools/perf PYTHON=/usr/bin/python2 all
        $ make V=1 prefix=/usr -C tools/python3-perf PYTHON=/usr/bin/python3 all
        $ make V=1 prefix=/usr -C tools/python3-perf PYTHON=/usr/bin/python3 DESTDIR=%{buildroot} install-python_ext
        $ make V=1 prefix=/usr -C tools/perf PYTHON=/usr/bin/python2 DESTDIR=%{buildroot} install-python_ext
       ---
      
      We need to make this automatic, just like the existing tests that check
      whether the python2 devel files are in place: allow the build with python3
      if available, fall back to python2, and disable it only if neither is
      available.
      
      So, using the PYTHON variable to build it using O= we get:
      
      Before this patch:
      
        $ rpm -q python3 python3-devel
        python3-3.6.4-7.fc27.x86_64
        python3-devel-3.6.4-7.fc27.x86_64
        $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf PYTHON=/usr/bin/python3 -C tools/perf install-bin
        make: Entering directory '/home/acme/git/linux/tools/perf'
        <SNIP>
        Makefile.config:670: Python 3 is not yet supported; please set
        Makefile.config:671: PYTHON and/or PYTHON_CONFIG appropriately.
        Makefile.config:672: If you also have Python 2 installed, then
        Makefile.config:673: try something like:
        Makefile.config:674:
        Makefile.config:675:   make PYTHON=python2
        Makefile.config:676:
        Makefile.config:677: Otherwise, disable Python support entirely:
        Makefile.config:678:
        Makefile.config:679:   make NO_LIBPYTHON=1
        Makefile.config:680:
        Makefile.config:681: *** .  Stop.
        make[1]: *** [Makefile.perf:212: sub-make] Error 2
        make: *** [Makefile:110: install-bin] Error 2
        make: Leaving directory '/home/acme/git/linux/tools/perf'
        $
      
      After:
      
        $ make O=/tmp/build/perf PYTHON=python3 -C tools/perf install-bin
        $ ldd ~/bin/perf | grep python
      	libpython3.6m.so.1.0 => /lib64/libpython3.6m.so.1.0 (0x00007f58a31e8000)
        $ rpm -qf /lib64/libpython3.6m.so.1.0
        python3-libs-3.6.4-7.fc27.x86_64
        $
      
      Now verify that when using the binding the right ELF file is loaded,
      using perf trace:
      
        $ perf trace -e open* perf test python
           0.051 ( 0.016 ms): perf/3927 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
      <SNIP>
        18: 'import perf' in python                               :
           8.849 ( 0.013 ms): sh/3929 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
      <SNIP>
          25.572 ( 0.008 ms): python3/3931 openat(dfd: CWD, filename: /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so, flags: CLOEXEC) = 3
      <SNIP>
       Ok
      <SNIP>
        $
      
      And using tools/perf/python/twatch.py, to show PERF_RECORD_ metaevents:
      
        $ python3 tools/perf/python/twatch.py
        cpu: 3, pid: 16060, tid: 16060 { type: fork, pid: 5207, ppid: 16060, tid: 5207, ptid: 16060, time: 10798513015459}
        cpu: 3, pid: 16060, tid: 16060 { type: fork, pid: 5208, ppid: 16060, tid: 5208, ptid: 16060, time: 10798513562503}
        cpu: 0, pid: 5208, tid: 5208 { type: comm, pid: 5208, tid: 5208, comm: grep }
        cpu: 2, pid: 5207, tid: 5207 { type: comm, pid: 5207, tid: 5207, comm: ps }
        cpu: 2, pid: 5207, tid: 5207 { type: exit, pid: 5207, ppid: 5207, tid: 5207, ptid: 5207, time: 10798551337484}
        cpu: 3, pid: 5208, tid: 5208 { type: exit, pid: 5208, ppid: 5208, tid: 5208, ptid: 5208, time: 10798551292153}
        cpu: 3, pid: 601, tid: 601 { type: fork, pid: 5209, ppid: 601, tid: 5209, ptid: 601, time: 10801779977324}
        ^CTraceback (most recent call last):
          File "tools/perf/python/twatch.py", line 68, in <module>
            main()
          File "tools/perf/python/twatch.py", line 40, in main
            evlist.poll(timeout = -1)
        KeyboardInterrupt
        $
      
        # ps ax|grep twatch
       5197 pts/8    S+     0:00 python3 tools/perf/python/twatch.py
        # ls -la /proc/5197/smaps
        -r--r--r--. 1 acme acme 0 Feb 19 13:14 /proc/5197/smaps
        # grep python /proc/5197/smaps
        558111307000-558111309000 r-xp 00000000 fd:00 3151710  /usr/bin/python3.6
        558111508000-558111509000 r--p 00001000 fd:00 3151710  /usr/bin/python3.6
        558111509000-55811150a000 rw-p 00002000 fd:00 3151710  /usr/bin/python3.6
        7ffad6fc1000-7ffad7008000 r-xp 00000000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
        7ffad7008000-7ffad7207000 ---p 00047000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
        7ffad7207000-7ffad7208000 r--p 00046000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
        7ffad7208000-7ffad7215000 rw-p 00047000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
        7ffadea77000-7ffaded3d000 r-xp 00000000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
        7ffaded3d000-7ffadef3c000 ---p 002c6000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
        7ffadef3c000-7ffadef42000 r--p 002c5000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
        7ffadef42000-7ffadefa5000 rw-p 002cb000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
        #
      
      And with this patch, but building normally, without specifying the
      PYTHON=python3 part, which will make it use python2 if its devel files are
      available, like in this test:
      
        $ make O=/tmp/build/perf -C tools/perf install-bin
        $ ldd ~/bin/perf | grep python
      	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f6a44410000)
        $ ldd /tmp/build/perf/python_ext_build/lib/perf.so  | grep python
      	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007fed28a2c000)
        $
      
        [acme@jouet perf]$ tools/perf/python/twatch.py
        cpu: 0, pid: 2817, tid: 2817 { type: fork, pid: 2817, ppid: 2817, tid: 8910, ptid: 2817, time: 11126454335306}
        cpu: 0, pid: 2817, tid: 2817 { type: comm, pid: 2817, tid: 8910, comm: worker }
        $ ps ax | grep twatch.py
         8909 pts/8    S+     0:00 /usr/bin/python tools/perf/python/twatch.py
        $ grep python /proc/8909/smaps
        5579de658000-5579de659000 r-xp 00000000 fd:00 3156044  /usr/bin/python2.7
        5579de858000-5579de859000 r--p 00000000 fd:00 3156044  /usr/bin/python2.7
        5579de859000-5579de85a000 rw-p 00001000 fd:00 3156044  /usr/bin/python2.7
        7f0de01f7000-7f0de023e000 r-xp 00000000 00:2d 230695   /tmp/build/perf/python/perf.so
        7f0de023e000-7f0de043d000 ---p 00047000 00:2d 230695   /tmp/build/perf/python/perf.so
        7f0de043d000-7f0de043e000 r--p 00046000 00:2d 230695   /tmp/build/perf/python/perf.so
        7f0de043e000-7f0de044b000 rw-p 00047000 00:2d 230695   /tmp/build/perf/python/perf.so
        7f0de6f0f000-7f0de6f13000 r-xp 00000000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
        7f0de6f13000-7f0de7113000 ---p 00004000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
        7f0de7113000-7f0de7114000 r--p 00004000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
        7f0de7114000-7f0de7115000 rw-p 00005000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
        7f0de7e73000-7f0de8052000 r-xp 00000000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
        7f0de8052000-7f0de8251000 ---p 001df000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
        7f0de8251000-7f0de8255000 r--p 001de000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
        7f0de8255000-7f0de8291000 rw-p 001e2000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
        $
      Signed-off-by: Jaroslav Škarvada <jskarvad@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      LPU-Reference: 20180119205641.24242-1-jskarvad@redhat.com
      Link: https://lkml.kernel.org/n/tip-8d7dt9kqp83vsz25hagug8fu@git.kernel.org
      [ Removed explicit check for python version, allowing it to really build with python3 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf python: Make twatch.py work with both python2 and python3 · d2ed5d2b
      Arnaldo Carvalho de Melo authored
      Will be used to test patches that allow building perf with python3, so
      that we make sure that we can build with both versions.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jaroslav Škarvada <jskarvad@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-c2ynv0ozr3eifzsyit6qgh3h@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Append an EOL when write tracing files · 63cd02d8
      Changbin Du authored
      Before this change, the '--graph-funcs', '--nograph-funcs' and
      '--trace-funcs' options didn't work as expected when <func> doesn't
      exist, because the kernel side hid possible errors.
      
        $ sudo ./perf ftrace -a --graph-depth 1 --graph-funcs abcdefg
         0)   0.140 us    |  rcu_all_qs();
         3)   0.304 us    |  mutex_unlock();
         0)   0.153 us    |  find_vma();
         3)   0.088 us    |  __fsnotify_parent();
         0)   6.145 us    |  handle_mm_fault();
         3)   0.089 us    |  fsnotify();
         3)   0.161 us    |  __sb_end_write();
         3)   0.710 us    |  SyS_close();
         3)   7.848 us    |  exit_to_usermode_loop();
      
      In the example above, I specified the function filter 'abcdefg' but all
      functions are enabled. The expected result is for all functions to be
      filtered, since there is no such function ('abcdefg').

      The original fix was to make the kernel support '\0' as end of string:
      https://lkml.org/lkml/2018/1/16/116

      But that fix is not compatible with old kernels. Namhyung Kim then
      suggested adding a space after the function name.

      This patch appends a '\n' when writing a tracing file. After this fix,
      perf reports the correct error state. Also make it print an error if
      reset_tracing_files() fails.
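
The idea of terminating every value with an EOL and reporting a failed write can be sketched like this. It is a hypothetical Python model, not the perf implementation: `write_tracing_file` and the temporary paths are assumptions, and a regular temporary file only shows the bytes written; it does not reproduce the kernel rejecting an unknown function name.

```python
import os
import tempfile


def write_tracing_file(path, value):
    """Write value to a tracing control file, terminated by an EOL.

    Returns True on success and False if the write was rejected.
    """
    data = value if value.endswith("\n") else value + "\n"
    try:
        with open(path, "w") as tracing_file:
            tracing_file.write(data)
    except OSError:
        # The caller can now print an error instead of silently continuing.
        return False
    return True


# Demo against a scratch directory (hypothetical stand-in for tracefs).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "set_graph_function")
ok = write_tracing_file(demo_path, "SyS_open")
with open(demo_path) as check:
    written = check.read()

# A path that cannot be opened models a rejected write.
missing_ok = write_tracing_file(
    os.path.join(demo_dir, "no_such_dir", "filter"), "abcdefg")
```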
      
      Committer testing:
      
      Now it prints:
      
        # perf ftrace -a --graph-depth 1 --graph-funcs abcdefg
        failed to set tracing filters
        #
      
      And for an existing function:
      
        # perf ftrace -a --graph-depth 1 --graph-funcs SyS_open
         3)               |  SyS_open() {
         3) ! 494.899 us  |  }
         0) + 23.910 us   |  SyS_open();
         1) + 17.115 us   |  SyS_open();
         1) + 13.900 us   |  SyS_open();
         ------------------------------------------
         3)  qemu-sy-2817  =>  pickup-1290
         ------------------------------------------
      
         3) + 20.021 us   |  SyS_open();
        #
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1519007609-14551-1-git-send-email-changbin.du@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Fix paranoid check in machine__set_kernel_mmap() · 1d12cec6
      Namhyung Kim authored
      machine__set_kernel_mmap() sets up the addresses of the kernel map
      using external info.  It has a check for the case where the address comes
      from an incorrect input, which should have both a start and an end address
      of 0 (i.e. machine__process_kernel_mmap_event).

      But an end address of 0 is also used for valid input, so change the check
      to test both the start and end addresses.
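
The changed condition can be illustrated with a small sketch. This is a hypothetical Python model, not the C function: what happens when the check fires (here, extending the end to the largest 64-bit address) is an assumption made for illustration, as are the names `set_kernel_mmap` and `UINT64_MAX`.

```python
UINT64_MAX = 2**64 - 1


def set_kernel_mmap(start, end):
    """Return the (start, end) range to store for the kernel map."""
    # Only an input with BOTH addresses zero is treated as bogus.
    # Assumption: a bogus range is widened to cover the whole address space.
    if start == 0 and end == 0:
        end = UINT64_MAX
    # An end of 0 with a non-zero start is valid input and is kept as-is.
    return start, end


bogus = set_kernel_mmap(0, 0)
valid = set_kernel_mmap(0xffffffff81000000, 0)
```

Checking only `end == 0` would have misclassified the second call as bogus input.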
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20180219101936.GD1583@sejong
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf s390: Fix reading cpuid model information · 47812e00
      Thomas Richter authored
      Commit eca0fa28 ("perf record: Provide detailed information on s390
      CPU") fixed a build error on Ubuntu. However, the fix uses the wrong
      size to print the model information.
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Fixes: eca0fa28 ("perf record: Provide detailed information on s390 CPU")
      Link: http://lkml.kernel.org/r/20180219102444.96900-1-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 17 Feb, 2018 2 commits
    • Merge tag 'perf-core-for-mingo-4.17-20180216' of... · 11737ca9
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.17-20180216' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      - Fix wrong jump arrow in systems with branch records with cycles,
        i.e. Intel's >= Skylake (Jin Yao)
      
      - Fix 'perf record --per-thread' problem introduced when
        implementing 'perf stat --per-thread' (Jin Yao)
      
      - Use arch__compare_symbol_names() to fix 'perf test vmlinux',
        that was using strcmp(symbol names) while the dso routines
        doing symbol lookups used the arch overridable one, making
        this test fail in architectures that overrode that function
        with something other than strcmp() (Jiri Olsa)
      
      - Add 'perf script --show-round-event' to display
        PERF_RECORD_FINISHED_ROUND entries (Jiri Olsa)
      
      - Fix dwarf unwind for stripped binaries in 'perf test' (Jiri Olsa)
      
      - Use ordered_events for 'perf report --tasks', otherwise we may get
        artifacts when PERF_RECORD_FORK gets processed before PERF_RECORD_COMM
        (when they got recorded in different CPUs) (Jiri Olsa)
      
      - Add support to display group output for non group events, i.e.
        now when one uses 'perf report --group' on a perf.data file
        recorded without explicitly grouping events with {} (e.g.
        "perf record -e '{cycles,instructions}'"), one gets the same
        output that grouping would produce, i.e. all those non-grouped
        events shown in multiple columns at the same time (Jiri Olsa)
      
      - Skip non-address kallsyms entries, e.g. '(null)' for !root (Jiri Olsa)
      
      - Kernel maps fixes wrt perf.data(report) versus live system (top)
        (Jiri Olsa)
      
      - Fix memory corruption when using 'perf record -j call -g -a <application>'
        followed by 'perf report --branch-history' (Jiri Olsa)
      
      - ARM CoreSight fixes (Mathieu Poirier)
      
      - Add inject capability for CoreSight Traces (Robert Walker)
      
      - Update documentation for use of 'perf' + ARM CoreSight (Robert Walker)
      
      - Man pages fixes (Sangwon Hong, Jaecheol Shin)
      
      - Fix some 'perf test' cases on s/390 and x86_64 (some backtraces
        changed with a glibc update) (Thomas Richter)
      
      - Add detailed CPUID info in the 'perf.data' headers for s/390 to
        then use it in 'perf annotate' (Thomas Richter)
      
      - Add '--interval-count N' to 'perf stat', to use with -I, i.e.
        'perf stat -I 1000 --interval-count 2' will show stats every
         1000ms, two times (yuzhoujian)
      
      - Add 'perf stat --timeout Nms', that will run for that many
        milliseconds and then stop, printing the counters (yuzhoujian)
      
      - Fix description for 'perf report --mem-mode' (Andi Kleen)
      
      - Use a wildcard to remove the vfs_getname probe in the
        'perf test' shell based test cases (Arnaldo Carvalho de Melo)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Ingo Molnar's avatar
      7057bb97
  8. 16 Feb, 2018 13 commits
    • perf tests shell lib: Use a wildcard to remove the vfs_getname probe · 21316ac6
      Arnaldo Carvalho de Melo authored
      In some situations the vfs_getname is being added both as requested and
      with a _1 suffix (inlines?):
      
        probe:vfs_getname_1  (on getname_flags:63@acme/git/linux/fs/namei.c with pathname)
      
      As a result the cleanup misses that one, since it removes just
      'probe:vfs_getname', and the second test that uses this probe point
      fails because it finds the leftover from the first test. Use a
      wildcard to remove both.
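
The difference between the exact name and the wildcard can be illustrated with shell-style pattern matching. This is a hypothetical Python sketch, not the test library's code: `probes_to_delete` and the probe list are made up for the demonstration.

```python
from fnmatch import fnmatch


def probes_to_delete(existing, pattern):
    """Return the probe names that a delete request with pattern would hit."""
    return [name for name in existing if fnmatch(name, pattern)]


# Hypothetical state after the first test: the probe exists twice,
# once as requested and once with a _1 suffix.
existing = ["probe:vfs_getname", "probe:vfs_getname_1", "probe_libc:inet_pton"]

exact = probes_to_delete(existing, "probe:vfs_getname")
wild = probes_to_delete(existing, "probe:vfs_getname*")
print(exact, wild)
```

The exact name leaves the `_1` variant behind, while the wildcard removes both and still leaves unrelated probes alone.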
      
      Before:
      
        # perf test 60 61 62 63
        60: Use vfs_getname probe to get syscall args filenames   : FAILED!
        61: probe libc's inet_pton & backtrace it with ping       : Ok
        62: Check open filename arg using perf trace + vfs_getname: FAILED!
        63: Add vfs_getname probe to get syscall args filenames   : Ok
      
      After:
      
        # perf test 60 61 62 63
        60: Use vfs_getname probe to get syscall args filenames   : Ok
        61: probe libc's inet_pton & backtrace it with ping       : Ok
        62: Check open filename arg using perf trace + vfs_getname: Ok
        63: Add vfs_getname probe to get syscall args filenames   : Ok
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-2k5kutwr4ds36adiakyb4yvy@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Fix test case inet_pton to accept inlines. · 0f19a038
      Thomas Richter authored
      Using Fedora 27 and the latest Linux kernel, the test case
      trace+probe_libc_inet_pton.sh fails again on s390.  This time it is the
      inlining of functions that does not match.  After an update of
      glibc (from 2.26-16 to 2.26-24) the output is different.
      
      The expected output is:
      
                   __inet_pton (/usr/lib64/libc-2.26.so)
                   gaih_inet (inlined)
                   ....
      
      The actual output is:
      
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
             0.000 probe_libc:inet_pton:(3ffb2140448))
                   __inet_pton (inlined)
                   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                   ...
      
      Fix this by being less strict on 'inlined' versus library name and
      accepting both.
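
A relaxed match of this kind can be sketched with a regular expression that accepts either form after the symbol name. The pattern below is a hypothetical Python illustration, not the expression used in the shell test; `LOCATION`, `FRAME` and `frame_ok` are invented names.

```python
import re

# Hypothetical pattern: the trailing "(...)" may hold either the
# "inlined" marker or a path to a versioned libc.
LOCATION = r"\((?:inlined|/\S*libc-[0-9.]+\.so)\)"
FRAME = re.compile(r"^\s*(?P<symbol>[\w.]+)\s+" + LOCATION + r"$")


def frame_ok(line):
    """Return True if a backtrace line has an accepted location suffix."""
    return FRAME.match(line) is not None


old_output = [
    "           __inet_pton (/usr/lib64/libc-2.26.so)",
    "           gaih_inet (inlined)",
]
new_output = [
    "           __inet_pton (inlined)",
    "           gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)",
]
results = [frame_ok(line) for line in old_output + new_output]
```

Both the older and the newer glibc output pass, while a frame attributed to some other library would still be rejected.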
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180214070303.55757-1-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Fix test case 23 for s390 z/VM or KVM guests · b3be39c5
      Thomas Richter authored
      On s390 perf can be executed on an LPAR with support for hardware events
      (i.e. cycles) or on a z/VM or KVM guest where no hardware events are
      supported. In the latter environment, use the software event named
      cpu-clock for this test case.
      
      Use the cpuid infrastructure functions to determine the cpuid on s390
      which contains an indication of the cpu counter facility availability.
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180213151419.80737-4-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpuid: Introduce a platform specific cpuid compare function · 4cb7d3ec
      Thomas Richter authored
      The function get_cpuid_str() is called by perf_pmu__getcpuid() and on
      s390 returns a complete description of the CPU and its capabilities,
      which is a comma separated list.
      
      To map the CPU type to the value defined in
      pmu-events/arch/s390/mapfile.csv, introduce an architecture specific
      cpuid compare function named strcmp_cpuid_str().
      
      The currently used regex algorithm is defined as the weak default and
      will be used if no platform specific one is defined. This matches the
      current behavior.
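
The "default unless a platform provides its own" arrangement can be modelled as a lookup with a fallback. This Python sketch is hypothetical throughout: the function names, the `OVERRIDES` table, the rule that the default regex must cover the whole cpuid, and the s390 rule of comparing only the type field are all assumptions for illustration, not the C implementation.

```python
import re


def default_cpuid_match(mapcpuid, cpuid):
    """Default: treat the mapfile value as a regex covering the whole cpuid."""
    return re.fullmatch(mapcpuid, cpuid) is not None


def s390_cpuid_match(mapcpuid, cpuid):
    """Hypothetical override: compare only the machine type field."""
    fields = cpuid.split(",")
    return len(fields) > 1 and fields[1] == mapcpuid


# Platform specific comparers, keyed by a hypothetical architecture name.
OVERRIDES = {"s390": s390_cpuid_match}


def cpuid_match(arch, mapcpuid, cpuid):
    """Use the platform comparer if one is registered, else the default."""
    return OVERRIDES.get(arch, default_cpuid_match)(mapcpuid, cpuid)


# The x86-style strings below are made-up examples.
x86_hit = cpuid_match("x86", "GenuineIntel-6-(4E|5E)", "GenuineIntel-6-5E")
s390_hit = cpuid_match("s390", "3906", "IBM,3906,704,M03,3.5,002f")
s390_miss = cpuid_match("s390", "2964", "IBM,3906,704,M03,3.5,002f")
```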
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180213151419.80737-3-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Scan cpuid for s390 and save machine type · c59124fa
      Thomas Richter authored
      Scan the cpuid string and extract the type number for later use.
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180213151419.80737-2-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Provide detailed information on s390 CPU · eca0fa28
      Thomas Richter authored
      When perf record ... is set up to record data, the s390 cpu information
      was a fixed string "IBM/S390".
      
      Replace this string with one containing more information about the
      machine. The information included in the cpuid is a comma separated
      list:
      
         manufacturer,type,model-capacity,model[,version,authorization]

      with

      - manufacturer: up to 16 byte name of the manufacturer (IBM).
      - type: a four digit number referring to the machine
        generation.
      - model-capacity: up to 16 characters describing number
        of cpus etc.
      - model: up to 16 characters describing model.
      - version: the CPU-MF counter facility version number,
        available on LPARs only, omitted on z/VM guests.
      - authorization: the CPU-MF counter facility authorization level,
        available on LPARs only, omitted on z/VM guests.
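
The field layout listed above can be made concrete with a small parser. This is a hypothetical Python sketch (`S390Cpuid` and `parse_cpuid` are invented names, not part of perf); the six-field string is the one from the "After" output in this commit message, and the four-field string is a made-up example of a guest without the counter facility fields.

```python
from typing import NamedTuple, Optional


class S390Cpuid(NamedTuple):
    manufacturer: str
    type: str
    model_capacity: str
    model: str
    version: Optional[str] = None        # LPARs only
    authorization: Optional[str] = None  # LPARs only


def parse_cpuid(cpuid):
    """Split a cpuid string into its four mandatory and two optional fields."""
    fields = cpuid.split(",")
    if len(fields) not in (4, 6):
        raise ValueError("unexpected number of cpuid fields: %d" % len(fields))
    return S390Cpuid(*fields)


lpar = parse_cpuid("IBM,3906,704,M03,3.5,002f")
guest = parse_cpuid("IBM,3906,704,M03")  # hypothetical guest value
print(lpar.type, lpar.version, guest.version)
```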
      
      Before:
      
        [root@s8360047 perf]# ./perf record -- sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ]
        [root@s8360047 perf]# ./perf report --header | fgrep cpuid
         # cpuid : IBM/S390
        [root@s8360047 perf]#
      
      After:
      
        [root@s35lp76 perf]# ./perf report --header|fgrep cpuid
         # cpuid : IBM,3906,704,M03,3.5,002f
        [root@s35lp76 perf]#
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180213151419.80737-1-tmricht@linux.vnet.ibm.com
      [ Use scnprintf instead of strncat to fix build errors on gcc GNU C99 5.4.0 20160609 -march=zEC12 -m64 -mzarch -ggdb3 -O6 -std=gnu99 -fPIC -fno-omit-frame-pointer -funwind-tables -fstack-protector-all ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace powerpc: Use generated syscall table · 4281da23
      Ravi Bangoria authored
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      
      It also enables users to specify wildcards, for example, perf trace -e
      'open*', just like was already possible on x86 and s390.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129083417.31240-4-ravi.bangoria@linux.vnet.ibm.com
      [ Do it for ppc32 as well ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf powerpc: Generate system call table from asm/unistd.h · 8e2ff72a
      Ravi Bangoria authored
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129083417.31240-3-ravi.bangoria@linux.vnet.ibm.com
      [ Made it generate syscall_32.c as well to fix the build on 32-bit ppc ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools include powerpc: Grab a copy of arch/powerpc/include/uapi/asm/unistd.h · 1350fb7d
      Ravi Bangoria authored
      Will be used for generating the syscall id/string translation table.
      
      Committer notes:
      
      Update it right away to catch up with these csets, applied since Ravi first
      submitted this patch:

        3350eb2e powerpc: sys_pkey_mprotect() system call
        9499ec1b powerpc: sys_pkey_alloc() and sys_pkey_free() system calls

      So 'perf trace' on ppc now knows about the pkey_ syscalls.
      Signed-off-by: default avatarRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/20180129083417.31240-2-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1350fb7d
    • Jiri Olsa's avatar
      perf report: Fix memory corruption in --branch-history mode · e3ebaa46
      Jiri Olsa authored
      Jin Yao reported memory corruption in perf report when
      branch info is used for the stack trace:
      
        > Following command lines will cause perf crash.
      
        > perf record -j call -g -a <application>
        > perf report --branch-history
        >
        > *** Error in `perf': double free or corruption (!prev): 0x00000000104aa040 ***
        > ======= Backtrace: =========
        > /lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
        > /lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
        > /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
        > perf[0x51b914]
        > perf(hist_entry_iter__add+0x1e5)[0x51f305]
        > perf[0x43cf01]
        > perf[0x4fa3bf]
        > perf[0x4fa923]
        > perf[0x4fd396]
        > perf[0x4f9614]
        > perf(perf_session__process_events+0x89e)[0x4fc38e]
        > perf(cmd_report+0x15d2)[0x43f202]
        > perf[0x4a059f]
        > perf(main+0x631)[0x427b71]
        > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
        > perf(_start+0x29)[0x427d89]
      
      For the cumulative output, we allocate the he_cache array based on the
      --max-stack option value and populate it with data from 'callchain_cursor'.
      
      The --max-stack option value currently does not limit the number of
      callchain_cursor nodes, so the cumulative iter code allocates a smaller array
      than is actually needed, which causes the corruption above.
      
      I think the --max-stack limit does not apply here anyway, because we add
      callchain data as normal hist entries, while --max-stack controls the depth
      limit of a single entry's callchain.
      
      Use callchain_cursor.nr as the he_cache array count to fix this. Also
      remove struct hist_entry_iter::max_stack, because there is no longer any use
      for it.
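
The sizing change described above can be sketched as follows. The types `cursor_sketch` and `entry_sketch` and the function `he_cache_alloc_sketch` are hypothetical stand-ins, not perf's real definitions, and the commit message does not say whether the real code reserves extra slots beyond the node count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the callchain cursor; only the field used here. */
struct cursor_sketch {
	size_t nr;		/* number of callchain nodes actually recorded */
};

struct entry_sketch;		/* opaque placeholder for a hist entry */

/*
 * Size the cache from the cursor's node count rather than from a
 * user-supplied --max-stack value, so that every node has a slot.
 * Stores the slot count in *count. Returns NULL on allocation
 * failure; the caller releases the array with free().
 */
static struct entry_sketch **he_cache_alloc_sketch(const struct cursor_sketch *cursor,
						   size_t *count)
{
	assert(cursor != NULL && count != NULL);

	*count = cursor->nr;
	/* calloc(0, ...) may return NULL, so request at least one slot. */
	return calloc(*count ? *count : 1, sizeof(struct entry_sketch *));
}
```

Because the array length follows the data that will be written into it, a --max-stack value smaller than the number of cursor nodes can no longer produce an undersized array in this sketch.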
      
      We need more fixes to ensure that the branch stack code properly follows the
      logic of --max-stack, which is not the case at the moment.
      Original-patch-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180216123619.GA9945@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e3ebaa46
    • Jin Yao's avatar
      perf report: Fix wrong jump arrow · b40982e8
      Jin Yao authored
      When we use the interactive annotate view of perf report, we can see
      that the position of the jump arrow is not correct. For example:
      
      1. perf record -b ...
      2. perf report
      3. In interactive mode, select Annotate 'function'
      
      Percent│ IPC Cycle
             │                                if (flag)
        1.37 │0.4┌──   1      ↓ je     82
             │   │                                    x += x / y + y / x;
        0.00 │0.4│  1310        movsd  (%rsp),%xmm0
        0.00 │0.4│   565        movsd  0x8(%rsp),%xmm4
             │0.4│              movsd  0x8(%rsp),%xmm1
             │0.4│              movsd  (%rsp),%xmm3
             │0.4│              divsd  %xmm4,%xmm0
        0.00 │0.4│   579        divsd  %xmm3,%xmm1
             │0.4│              movsd  (%rsp),%xmm2
             │0.4│              addsd  %xmm1,%xmm0
             │0.4│              addsd  %xmm2,%xmm0
        0.00 │0.4│              movsd  %xmm0,(%rsp)
             │   │                    volatile double x = 1212121212, y = 121212;
             │   │
             │   │                    s_randseed = time(0);
             │   │                    srand(s_randseed);
             │   │
             │   │                    for (i = 0; i < 2000000000; i++) {
        1.37 │0.4└─→      82:   sub    $0x1,%ebx
       28.21 │0.48    17      ↑ jne    38
      
      The jump arrow in the above example is misplaced. Its position should
      account for the width of the IPC and Cycle columns.
      
      With this patch, the result is:
      
      Percent│ IPC Cycle
             │                                if (flag)
        1.37 │0.48     1     ┌──je     82
             │               │                        x += x / y + y / x;
        0.00 │0.48  1310     │  movsd  (%rsp),%xmm0
        0.00 │0.48   565     │  movsd  0x8(%rsp),%xmm4
             │0.48           │  movsd  0x8(%rsp),%xmm1
             │0.48           │  movsd  (%rsp),%xmm3
             │0.48           │  divsd  %xmm4,%xmm0
        0.00 │0.48   579     │  divsd  %xmm3,%xmm1
             │0.48           │  movsd  (%rsp),%xmm2
             │0.48           │  addsd  %xmm1,%xmm0
             │0.48           │  addsd  %xmm2,%xmm0
        0.00 │0.48           │  movsd  %xmm0,(%rsp)
             │               │        volatile double x = 1212121212, y = 121212;
             │               │
             │               │        s_randseed = time(0);
             │               │        srand(s_randseed);
             │               │
             │               │        for (i = 0; i < 2000000000; i++) {
        1.37 │0.48        82:└─→sub    $0x1,%ebx
       28.21 │0.48    17      ↑ jne    38
      
      Committer notes:
      
      Please note that the cycle counts in each branch record entry are only
      available from LBRv5 (according to Jiri) onwards, i.e. >= Skylake, so to
      see the Cycles and IPC columns, and to be able to test this patch, one
      needs capable hardware.
      
      While applying this I first tested it on a Broadwell class machine and
      couldn't get those columns. I will add code to the annotate browser to
      warn the user about that, i.e. you have branch records, but no cycles;
      use more recent hardware to get the cycles and IPC columns.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b40982e8
    • Andi Kleen's avatar
      perf report: Fix description for --mem-mode · fc2f5237
      Andi Kleen authored
      The "mem-loads" event only works when PEBS is enabled, so add the "/p"
      ("precise") suffix to the examples.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      LPU-Reference: 20180209163909.9240-1-andi@firstfloor.org
      Link: https://lkml.kernel.org/n/tip-v0gcd4u9tktrvjjsp6y7ouv4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fc2f5237
    • Robert Walker's avatar
      coresight: Update documentation for perf usage · 6673016f
      Robert Walker authored
      Add notes on using perf to collect and analyze CoreSight trace.
      Signed-off-by: Robert Walker <robert.walker@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1518607481-4059-4-git-send-email-robert.walker@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6673016f