1. 07 Nov, 2019 19 commits
    • Jin Yao's avatar
      perf report: Sort by sampled cycles percent per block for stdio · 6f7164fa
      Jin Yao authored
      It would be useful to support sorting for all blocks by the sampled
      cycles percent per block. This is useful to concentrate on the globally
      hottest blocks.
      
      This patch implements a new option "--total-cycles" which sorts all
      blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the percent:
      
       percent = block sampled cycles aggregation / total sampled cycles
      
      Note that, this patch only supports "--stdio" mode.
      
      For example,
      
        # perf record -b ./div
        # perf report --total-cycles --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 2M of event 'cycles'
        # Event count (approx.): 2753248
        #
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                             [Program Block Range]      Shared Object
        # ...............  ..............  ...........  ..........  ................................................  .................
        #
                   26.04%            2.8M        0.40%          18                            [div.c:42 -> div.c:39]                div
                   15.17%            1.2M        0.16%           7                [random_r.c:357 -> random_r.c:380]       libc-2.27.so
                    5.11%          402.0K        0.04%           2                            [div.c:27 -> div.c:28]                div
                    4.87%          381.6K        0.04%           2                    [random.c:288 -> random.c:291]       libc-2.27.so
                    4.53%          381.0K        0.04%           2                            [div.c:40 -> div.c:40]                div
                    3.85%          300.9K        0.02%           1                            [div.c:22 -> div.c:25]                div
                    3.08%          241.1K        0.02%           1                          [rand.c:26 -> rand.c:27]       libc-2.27.so
                    3.06%          240.0K        0.02%           1                    [random.c:291 -> random.c:291]       libc-2.27.so
                    2.78%          215.7K        0.02%           1                    [random.c:298 -> random.c:298]       libc-2.27.so
                    2.52%          198.3K        0.02%           1                    [random.c:293 -> random.c:293]       libc-2.27.so
                    2.36%          184.8K        0.02%           1                          [rand.c:28 -> rand.c:28]       libc-2.27.so
                    2.33%          180.5K        0.02%           1                    [random.c:295 -> random.c:295]       libc-2.27.so
                    2.28%          176.7K        0.02%           1                    [random.c:295 -> random.c:295]       libc-2.27.so
                    2.20%          168.8K        0.02%           1                        [rand@plt+0 -> rand@plt+0]                div
                    1.98%          158.2K        0.02%           1                [random_r.c:388 -> random_r.c:388]       libc-2.27.so
                    1.57%          123.3K        0.02%           1                            [div.c:42 -> div.c:44]                div
                    1.44%          116.0K        0.42%          19                [random_r.c:357 -> random_r.c:394]       libc-2.27.so
                    0.25%          182.5K        0.02%           1                [random_r.c:388 -> random_r.c:391]       libc-2.27.so
                    0.00%              48        1.07%          48        [x86_pmu_enable+284 -> x86_pmu_enable+298]  [kernel.kallsyms]
                    0.00%              74        1.64%          74             [vm_mmap_pgoff+0 -> vm_mmap_pgoff+92]  [kernel.kallsyms]
                    0.00%              73        1.62%          73                         [vm_mmap+0 -> vm_mmap+48]  [kernel.kallsyms]
                    0.00%              63        0.69%          31                       [up_write+0 -> up_write+34]  [kernel.kallsyms]
                    0.00%              13        0.29%          13      [setup_arg_pages+396 -> setup_arg_pages+413]  [kernel.kallsyms]
                    0.00%               3        0.07%           3      [setup_arg_pages+418 -> setup_arg_pages+450]  [kernel.kallsyms]
                    0.00%             616        6.84%         308   [security_mmap_file+0 -> security_mmap_file+72]  [kernel.kallsyms]
                    0.00%              23        0.51%          23  [security_mmap_file+77 -> security_mmap_file+87]  [kernel.kallsyms]
                    0.00%               4        0.02%           1                  [sched_clock+0 -> sched_clock+4]  [kernel.kallsyms]
                    0.00%               4        0.02%           1                 [sched_clock+9 -> sched_clock+12]  [kernel.kallsyms]
                    0.00%               1        0.02%           1                [rcu_nmi_exit+0 -> rcu_nmi_exit+9]  [kernel.kallsyms]
      
      Committer testing:
      
      This should provide material for hours of endless joy, both from looking
      for suspicious things in the implementation of this patch, such as the
      top one:
      
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
                    2.17%            1.7M        0.08%         607   [compiler.h:199 -> common.c:221]              [kernel.vmlinux]
      
      As well from things that look legit:
      
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
                    0.16%          123.0K        0.60%        4.7K   [nospec-branch.h:265 -> nospec-branch.h:278]  [kernel.vmlinux]
      
      :-)
      
      Very short system wide taken branches session:
      
        # perf record -h -b
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -b, --branch-any      sample any taken branches
      
        #
        # perf record -b
        ^C[ perf record: Woken up 595 times to write data ]
        [ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]
      
        #
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
        #
        # perf report --total-cycles --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 6M of event 'cycles'
        # Event count (approx.): 6299936
        #
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
        # ...............  ..............  ...........  ..........  ......................................................................  ....................
        #
                    2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
                    1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
                    0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -> entry_64.S:662]      [kernel.vmlinux]
                    0.56%          541.8K        0.09%         672                                        [compiler.h:199 -> common.c:300]      [kernel.vmlinux]
                    0.39%          293.2K        0.01%         104                                    [list_debug.c:43 -> list_debug.c:61]      [kernel.vmlinux]
                    0.36%          278.6K        0.03%         272                                    [entry_64.S:1289 -> entry_64.S:1308]      [kernel.vmlinux]
                    0.30%          260.8K        0.07%         564                              [clear_page_64.S:47 -> clear_page_64.S:50]      [kernel.vmlinux]
                    0.28%          215.3K        0.05%         369                                            [traps.c:623 -> traps.c:628]      [kernel.vmlinux]
                    0.23%          178.1K        0.04%         278                                      [entry_64.S:271 -> entry_64.S:275]      [kernel.vmlinux]
                    0.20%          152.6K        0.09%         706                                      [paravirt.c:177 -> paravirt.c:179]      [kernel.vmlinux]
                    0.20%          155.8K        0.05%         373                                      [entry_64.S:153 -> entry_64.S:175]      [kernel.vmlinux]
                    0.18%          136.6K        0.03%         222                                                [msr.h:105 -> msr.h:166]      [kernel.vmlinux]
                    0.16%          123.0K        0.60%        4.7K                            [nospec-branch.h:265 -> nospec-branch.h:278]      [kernel.vmlinux]
                    0.16%          118.3K        0.01%          44                                      [entry_64.S:632 -> entry_64.S:657]      [kernel.vmlinux]
                    0.14%          104.5K        0.00%          28                                          [rwsem.c:1541 -> rwsem.c:1544]      [kernel.vmlinux]
                    0.13%           99.2K        0.01%          53                                      [spinlock.c:150 -> spinlock.c:152]      [kernel.vmlinux]
                    0.13%           95.5K        0.00%          35                                              [swap.c:456 -> swap.c:471]      [kernel.vmlinux]
                    0.12%           96.2K        0.05%         407                              [copy_user_64.S:175 -> copy_user_64.S:209]      [kernel.vmlinux]
                    0.11%           85.9K        0.00%          31                                        [swap.c:400 -> page-flags.h:188]      [kernel.vmlinux]
                    0.10%           73.0K        0.01%          52                                          [paravirt.h:763 -> list.h:131]      [kernel.vmlinux]
                    0.07%           56.2K        0.03%         214                                      [filemap.c:1524 -> filemap.c:1557]      [kernel.vmlinux]
                    0.07%           54.2K        0.02%         145                                        [memory.c:1032 -> memory.c:1049]      [kernel.vmlinux]
                    0.07%           50.3K        0.00%          39                                            [mmzone.c:49 -> mmzone.c:69]      [kernel.vmlinux]
                    0.06%           48.3K        0.01%          40                                   [paravirt.h:768 -> page_alloc.c:3304]      [kernel.vmlinux]
                    0.06%           46.7K        0.02%         155                                        [memory.c:1032 -> memory.c:1056]      [kernel.vmlinux]
                    0.06%           46.9K        0.01%         103                                              [swap.c:867 -> swap.c:902]      [kernel.vmlinux]
                    0.06%           47.8K        0.00%          34                                    [entry_64.S:1201 -> entry_64.S:1202]      [kernel.vmlinux]
      
       -----------------------------------------------------------
      
       v7:
       ---
       Use use_browser in report__browse_block_hists for supporting
       stdio and potential tui mode.
      
       v6:
       ---
       Create report__browse_block_hists in block-info.c (codes are
       moved from builtin-report.c). It's called from
       perf_evlist__tty_browse_hists.
      
       v5:
       ---
       1. Move all block functions to block-info.c
      
       2. Move the code of setting ms in block hist_entry to
          other patch.
      
       v4:
       ---
       1. Use new option '--total-cycles' to replace
          '-s total_cycles' in v3.
      
       2. Move block info collection out of block info
          printing.
      
       v3:
       ---
       1. Use common function block_info__process_sym to
          process the blocks per symbol.
      
       2. Remove the nasty hack for skipping calculation
          of column length
      
       3. Some minor cleanup
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-6-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6f7164fa
    • Jin Yao's avatar
      perf hist: Support block formats with compare/sort/display · b65a7d37
      Jin Yao authored
      This patch provides helper routines to support new columns for block
      info output.
      
      The new columns are:
      
        Sampled Cycles%
        Sampled Cycles
        Avg Cycles%
        Avg Cycles
        [Program Block Range]
        Shared Object
      
       v5:
       ---
       1. Move more block related functions from builtin-report.c to
          block-info.c
      
       2. Set ms (map+sym) in block hist_entry. Because this info
          is needed for reporting the block range (i.e. source line)
      
      Committer notes:
      
      Remove unused set_fmt() function, some build were not completing with:
      
        util/block-info.c:396:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
        static inline void set_fmt(struct block_fmt *block_fmt,
                           ^
        1 error generated.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-5-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b65a7d37
    • Jin Yao's avatar
      perf hist: Count the total cycles of all samples · 7841f40a
      Jin Yao authored
      We can get the per sample cycles by hist__account_cycles(). It's also
      useful to know the total cycles of all samples in order to get the
      cycles coverage for a single program block in further. For example:
      
        coverage = per block sampled cycles / total sampled cycles
      
      This patch creates a new argument 'total_cycles' in hist__account_cycles(),
      which will be added with the cycles of each sample.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-4-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7841f40a
    • Jin Yao's avatar
      perf block: Cleanup and refactor block info functions · 60414418
      Jin Yao authored
      We have already implemented some block-info related functions.
      Now it's time to do some cleanup, refactoring and move the
      functions and structures to new block-info.h/block-info.c.
      
       v4:
       ---
       Move code for skipping column length calculation to patch:
       'perf diff: Don't use hack to skip column length calculation'
      
       v3:
       ---
       1. Rename the patch title
       2. Rename from block.h/block.c to block-info.h/block-info.c
       3. Move more common part to block-info, such as
          block_info__process_sym.
       4. Remove the nasty hack for skipping calculation of column
          length
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-3-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      60414418
    • Jin Yao's avatar
      perf diff: Don't use hack to skip column length calculation · 0bdf181f
      Jin Yao authored
      Previously we use a nasty hack to skip the hists__calc_col_len for block
      since this function is not very suitable for block column length
      calculation.
      
      This patch removes the hack code and add a check at the entry of
      hists__calc_col_len to skip for block case.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-2-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0bdf181f
    • Leo Yan's avatar
      perf tests: Fix out of bounds memory access · af8490eb
      Leo Yan authored
      The test case 'Read backward ring buffer' failed on 32-bit architectures
      which were found by LKFT perf testing.  The test failed on arm32 x15
      device, qemu_arm32, qemu_i386, and found intermittent failure on i386;
      the failure log is as below:
      
        50: Read backward ring buffer                  :
        --- start ---
        test child forked, pid 510
        Using CPUID GenuineIntel-6-9E-9
        mmap size 1052672B
        mmap size 8192B
        Finished reading overwrite ring buffer: rewind
        free(): invalid next size (fast)
        test child interrupted
        ---- end ----
        Read backward ring buffer: FAILED!
      
      The log hints there have issue for memory usage, thus free() reports
      error 'invalid next size' and directly exit for the case.  Finally, this
      issue is root caused as out of bounds memory access for the data array
      'evsel->id'.
      
      The backward ring buffer test invokes do_test() twice.  'evsel->id' is
      allocated at the first call with the flow:
      
        test__backward_ring_buffer()
          `-> do_test()
      	  `-> evlist__mmap()
      	        `-> evlist__mmap_ex()
      	              `-> perf_evsel__alloc_id()
      
      So 'evsel->id' is allocated with one item, and it will be used in
      function perf_evlist__id_add():
      
         evsel->id[0] = id
         evsel->ids   = 1
      
      At the second call for do_test(), it skips to initialize 'evsel->id'
      and reuses the array which is allocated in the first call.  But
      'evsel->ids' contains the stale value.  Thus:
      
         evsel->id[1] = id    -> out of bound access
         evsel->ids   = 2
      
      To fix this issue, we will use evlist__open() and evlist__close() pair
      functions to prepare and cleanup context for evlist; so 'evsel->id' and
      'evsel->ids' can be initialized properly when invoke do_test() and avoid
      the out of bounds memory access.
      
      Fixes: ee74701e ("perf tests: Add test to check backward ring buffer")
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: stable@vger.kernel.org # v4.10+
      Link: http://lore.kernel.org/lkml/20191107020244.2427-1-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      af8490eb
    • Jiwei Sun's avatar
      perf record: Add support for limit perf output file size · 6d575816
      Jiwei Sun authored
      The patch adds a new option to limit the output file size, then based on
      it, we can create a wrapper of the perf command that uses the option to
      avoid exhausting the disk space by the unconscious user.
      
      In order to make the perf.data parsable, we just limit the sample data
      size, since the perf.data consists of many headers and sample data and
      other data, the actual size of the recorded file will bigger than the
      setting value.
      
      Testing it:
      
        # ./perf record -a -g --max-size=10M
        Couldn't synthesize bpf events.
        [ perf record: perf size limit reached (10249 KB), stopping session ]
        [ perf record: Woken up 32 times to write data ]
        [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ]
      
        # ls -lh perf.data
        -rw------- 1 root root 11M Oct 22 14:32 perf.data
      
        # ./perf record -a -g --max-size=10K
        [ perf record: perf size limit reached (10 KB), stopping session ]
        Couldn't synthesize bpf events.
        [ perf record: Woken up 0 times to write data ]
        [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ]
      
        # ls -l perf.data
        -rw------- 1 root root 1626952 Oct 22 14:36 perf.data
      
      Committer notes:
      
      Fixed the build in multiple distros by using PRIu64 to print u64 struct
      members, fixing this:
      
        builtin-record.c: In function 'record__write':
        builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
             rec->bytes_written >> 10);
             ^
          CC       /tmp/build/pe
      Signed-off-by: default avatarJiwei Sun <jiwei.sun@windriver.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Danter <richard.danter@windriver.com>
      Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6d575816
    • Masami Hiramatsu's avatar
      perf probe: Skip overlapped location on searching variables · dee36a2a
      Masami Hiramatsu authored
      Since debuginfo__find_probes() callback function can be called with  the
      location which already passed, the callback function must filter out
      such overlapped locations.
      
      add_probe_trace_event() has already done it by commit 1a375ae7
      ("perf probe: Skip same probe address for a given line"), but
      add_available_vars() doesn't. Thus perf probe -v shows same address
      repeatedly as below:
      
        # perf probe -V vfs_read:18
        Available variables at vfs_read:18
                @<vfs_read+217>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
                @<vfs_read+217>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
                @<vfs_read+226>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
      
      With this fix, perf probe -V shows it correctly:
      
        # perf probe -V vfs_read:18
        Available variables at vfs_read:18
                @<vfs_read+217>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
                @<vfs_read+226>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
      
      Fixes: cf6eb489 ("perf probe: Show accessible local variables")
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241938927.32002.4026859017790562751.stgit@devnote2Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      dee36a2a
    • Masami Hiramatsu's avatar
      perf probe: Fix to show calling lines of inlined functions · 86c0bf85
      Masami Hiramatsu authored
      Fix to show calling lines of inlined functions (where an inline function
      is called).
      
      die_walk_lines() filtered out the lines inside inlined functions based
      on the address. However this also filtered out the lines which call
      those inlined functions from the target function.
      
      To solve this issue, check the call_file and call_line attributes and do
      not filter out if it matches to the line information.
      
      Without this fix, perf probe -L doesn't show some lines correctly.
      (don't see the lines after 17)
      
        # perf probe -L vfs_read
        <vfs_read@/home/mhiramat/ksrc/linux/fs/read_write.c:0>
              0  ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
              1  {
              2         ssize_t ret;
      
              4         if (!(file->f_mode & FMODE_READ))
                                return -EBADF;
              6         if (!(file->f_mode & FMODE_CAN_READ))
                                return -EINVAL;
              8         if (unlikely(!access_ok(buf, count)))
                                return -EFAULT;
      
             11         ret = rw_verify_area(READ, file, pos, count);
             12         if (!ret) {
             13                 if (count > MAX_RW_COUNT)
                                        count =  MAX_RW_COUNT;
             15                 ret = __vfs_read(file, buf, count, pos);
             16                 if (ret > 0) {
                                        fsnotify_access(file);
                                        add_rchar(current, ret);
                                }
      
      With this fix:
      
        # perf probe -L vfs_read
        <vfs_read@/home/mhiramat/ksrc/linux/fs/read_write.c:0>
              0  ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
              1  {
              2         ssize_t ret;
      
              4         if (!(file->f_mode & FMODE_READ))
                                return -EBADF;
              6         if (!(file->f_mode & FMODE_CAN_READ))
                                return -EINVAL;
              8         if (unlikely(!access_ok(buf, count)))
                                return -EFAULT;
      
             11         ret = rw_verify_area(READ, file, pos, count);
             12         if (!ret) {
             13                 if (count > MAX_RW_COUNT)
                                        count =  MAX_RW_COUNT;
             15                 ret = __vfs_read(file, buf, count, pos);
             16                 if (ret > 0) {
             17                         fsnotify_access(file);
             18                         add_rchar(current, ret);
                                }
             20                 inc_syscr(current);
                        }
      
      Fixes: 4cc9cec6 ("perf probe: Introduce lines walker interface")
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241937995.32002.17899884017011512577.stgit@devnote2Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      86c0bf85
    • Masami Hiramatsu's avatar
      perf probe: Filter out instances except for inlined subroutine and subprogram · da6cb952
      Masami Hiramatsu authored
      Filter out instances except for inlined_subroutine and subprogram DIE in
      die_walk_instances() and die_is_func_instance().
      
      This fixes an issue that perf probe sets some probes on calling address
      instead of a target function itself.
      
      When perf probe walks on instances of an abstruct origin (a kind of
      function prototype of inlined function), die_walk_instances() can also
      pass a GNU_call_site (a GNU extension for call site) to callback. Since
      it is not an inlined instance of target function, we have to filter out
      when searching a probe point.
      
      Without this patch, perf probe sets probes on call site address too.This
      can happen on some function which is marked "inlined", but has actual
      symbol. (I'm not sure why GCC mark it "inlined"):
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+2500017
        p:probe/vfs_read_1 _text+2499468
        p:probe/vfs_read_2 _text+2499563
        p:probe/vfs_read_3 _text+2498876
        p:probe/vfs_read_4 _text+2498512
        p:probe/vfs_read_5 _text+2498627
      
      With this patch:
      
      Slightly different results, similar tho:
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+2498512
      
      Committer testing:
      
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
      
      Before:
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+3131557
        p:probe/vfs_read_1 _text+3130975
        p:probe/vfs_read_2 _text+3131047
        p:probe/vfs_read_3 _text+3130380
        p:probe/vfs_read_4 _text+3130000
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
        #
      
      After:
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+3130000
        #
      
      Fixes: db0d2c64 ("perf probe: Search concrete out-of-line instances")
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241937063.32002.11024544873990816590.stgit@devnote2Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      da6cb952
    • Masami Hiramatsu's avatar
      perf probe: Skip end-of-sequence and non statement lines · f4d99bdf
      Masami Hiramatsu authored
      Skip end-of-sequence and non-statement lines while walking through lines
      list.
      
      The "end-of-sequence" line information means:
      
       "the current address is that of the first byte after the
        end of a sequence of target machine instructions."
       (DWARF version 4 spec 6.2.2)
      
      This actually means out of scope and we can not probe on it.
      
      On the other hand, the statement lines (is_stmt) means:
      
       "the current instruction is a recommended breakpoint location.
        A recommended breakpoint location is intended to “represent”
        a line, a statement and/or a semantically distinct subpart
        of a statement."
      
       (DWARF version 4 spec 6.2.2)
      
      So, non-statement line info also should be skipped.
      
      These can reduce unneeded probe points and also avoid an error.
      
      E.g. without this patch:
      
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new events:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_3 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_4 (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask_4 -aR sleep 1
      
        #
      
      This puts 5 probes on one line, but acutally it's not inlined function.
      This is because there are many non statement instructions at the
      function prologue.
      
      With this patch:
      
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new event:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
      
        #
      
      Now perf-probe skips unneeded addresses.
      
      Committer testing:
      
      Slightly different results, but similar:
      
      Before:
      
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
        #
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new events:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask_2 -aR sleep 1
      
        #
      
      After:
      
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new event:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
      
        # perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
        #
      
      Fixes: 4cc9cec6 ("perf probe: Introduce lines walker interface")
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241936090.32002.12156347518596111660.stgit@devnote2Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f4d99bdf
    • Masami Hiramatsu's avatar
      perf probe: Return a better scope DIE if there is no best scope · c701636a
      Masami Hiramatsu authored
      Make find_best_scope() returns innermost DIE at given address if there
      is no best matched scope DIE. Since Gcc sometimes generates intuitively
      strange line info which is out of inlined function address range, we
      need this fixup.
      
      Without this, sometimes perf probe failed to probe on a line inside an
      inlined function:
      
        # perf probe -D ksys_open:3
        Failed to find scope of probe point.
          Error: Failed to add events.
      
      With this fix, 'perf probe' can probe it:
      
        # perf probe -D ksys_open:3
        p:probe/ksys_open _text+25707308
        p:probe/ksys_open_1 _text+25710596
        p:probe/ksys_open_2 _text+25711114
        p:probe/ksys_open_3 _text+25711343
        p:probe/ksys_open_4 _text+25714058
        p:probe/ksys_open_5 _text+2819653
        p:probe/ksys_open_6 _text+2819701
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157291300887.19771.14936015360963292236.stgit@devnote2Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c701636a
    • Ian Rogers's avatar
      perf annotate: Fix heap overflow · 5c65b1c0
      Ian Rogers authored
      Fix expand_tabs that copies the source lines '\0' and then appends
      another '\0' at a potentially out of bounds address.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20191026035644.217548-1-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5c65b1c0
    • Arnaldo Carvalho de Melo's avatar
      perf machine: Add kernel_dso() method · 93730f85
      Arnaldo Carvalho de Melo authored
      To reduce boilerplate in some places.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-9s1bgoxxhlnu037e1nqx0tw3@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      93730f85
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Remove needless checks for map->groups->machine · b0c76fc4
      Arnaldo Carvalho de Melo authored
      Its sufficient to check if map->groups is NULL before using it to get
      ->machine value.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-utiepyiv8b1tf8f79ok9d6j8@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b0c76fc4
    • Ian Rogers's avatar
      perf parse: Add a deep delete for parse event terms · 1dc92556
      Ian Rogers authored
      Add a parse_events_term deep delete function so that owned strings and
      arrays are freed.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-10-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1dc92556
    • Ian Rogers's avatar
      perf parse: If pmu configuration fails free terms · 38f2c422
      Ian Rogers authored
      Avoid a memory leak when the configuration fails.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-9-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      38f2c422
    • Ian Rogers's avatar
      perf parse: Before yyabort-ing free components · cabbf268
      Ian Rogers authored
      Yyabort doesn't destruct inputs and so this must be done manually before
      using yyabort.
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-8-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cabbf268
    • Ian Rogers's avatar
      perf parse: Add destructors for parse event terms · f2a8ecd8
      Ian Rogers authored
      If parsing fails then destructors are ran to clean the up the stack.
      Rename the head union member to make the term and evlist use cases more
      distinct, this simplifies matching the correct destructor.
      
      Committer notes:
      
      Jiri: "Nice did not know about this.. looks like it's been in bison for some time, right?"
      
      Ian:  "Looks like it wasn't in Bison 1 but in Bison 2, we're at Bison 3 and
             Bison 2 is > 14 years old:
             https://web.archive.org/web/20050924004158/http://www.gnu.org/software/bison/manual/html_mono/bison.html#Destructor-Decl"
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-7-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f2a8ecd8
  2. 06 Nov, 2019 21 commits