1. 14 Apr, 2016 3 commits
  2. 13 Apr, 2016 18 commits
  3. 12 Apr, 2016 14 commits
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Print unresolved symbol names as addresses · 00768a2b
      Arnaldo Carvalho de Melo authored
      Instead of having "[unknown]" as the name used for unresolved symbols,
      use the address in the callchain, in hexadecimal form:
      
        28.801 ( 0.007 ms): qemu-system-x8/10065 ppoll(ufds: 0x55c98b39e400, nfds: 72, tsp: 0x7fffe4e4fe60, sigsetsize: 8) = 0 Timeout
                                           ppoll+0x91 (/usr/lib64/libc-2.22.so)
                                           [0x337309] (/usr/bin/qemu-system-x86_64)
                                           [0x336ab4] (/usr/bin/qemu-system-x86_64)
                                           main+0x1724 (/usr/bin/qemu-system-x86_64)
                                           __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
                                           [0xc59a9] (/usr/bin/qemu-system-x86_64)
        35.265 (14.805 ms): gnome-shell/2287  ... [continued]: poll()) = 1
                                           [0xf6fdd] (/usr/lib64/libc-2.22.so)
                                           g_main_context_iterate.isra.29+0x17c (/usr/lib64/libglib-2.0.so.0.4600.2)
                                           g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
                                           meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
                                           main+0x3f7 (/usr/bin/gnome-shell)
                                           __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
                                           [0x2909] (/usr/bin/gnome-shell)
      Suggested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      00768a2b
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Allow unresolved symbol names to be printed as addresses · fd4be130
      Arnaldo Carvalho de Melo authored
      The fprintf_sym() and fprintf_callchain() methods now allow users to
      change the existing behaviour of showing "[unknown]" as the name of
      unresolved symbols to instead show "[0x123456]", i.e. its address.
      
      The current patch doesn't change tools to use this facility, the results
      from 'perf trace' and 'perf script' cotinue like:
      
      70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
                                         [unknown] (/usr/lib64/libc-2.22.so)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         start_thread+0xca (/usr/lib64/libpthread-2.22.so)
                                         __clone+0x6d (/usr/lib64/libc-2.22.so)
      
      The next patch will make 'perf trace' use the new formatting.
      Suggested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fd4be130
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Make "--call-graph" affect just "raw_syscalls:sys_exit" · fde54b78
      Arnaldo Carvalho de Melo authored
      We don't need the callchains at the syscall enter tracepoint, just when
      finishing it at syscall exit, so reduce the overhead by asking for
      callchains just at syscall exit.
      Suggested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fde54b78
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Rename config_callgraph() to config_callchain() and make it public · 01e0d50c
      Arnaldo Carvalho de Melo authored
      The rename is for consistency with the parameter name.
      
      Make it public for fine grained control of which evsels should have
      callchains enabled, like, for instance, will be done in the next
      changesets in 'perf trace', to enable callchains just on the
      "raw_syscalls:sys_exit" tracepoint.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      01e0d50c
    • Arnaldo Carvalho de Melo's avatar
      perf evlist: Add (reset,set)_sample_bit methods · 22c8a376
      Arnaldo Carvalho de Melo authored
      For fiddling with sample_type fields in all evsels in an evlist.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      22c8a376
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Do not use globals in config() · e68ae9cf
      Arnaldo Carvalho de Melo authored
      Instead receive a callchain_param pointer to configure callchain
      aspects, not doing so if NULL is passed.
      
      This will allow fine grained control over which evsels in an evlist
      gets callchains enabled.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e68ae9cf
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Exclude the kernel part of the callchain leading to a syscall · 44621819
      Arnaldo Carvalho de Melo authored
      The kernel parts are not that useful:
      
        # trace -m 512 -e nanosleep --call dwarf  usleep 1
           0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0
                                             syscall_slow_exit_work ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             return_from_SYSCALL_64 ([kernel.kallsyms])
                                             __nanosleep (/usr/lib64/libc-2.22.so)
                                             usleep (/usr/lib64/libc-2.22.so)
                                             main (/usr/bin/usleep)
                                             __libc_start_main (/usr/lib64/libc-2.22.so)
                                             _start (/usr/bin/usleep)
        #
      
      So lets just use perf_event_attr.exclude_callchain_kernel to avoid
      collecting it in the ring buffer:
      
        # trace -m 512 -e nanosleep --call dwarf  usleep 1
           0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0
                                             __nanosleep (/usr/lib64/libc-2.22.so)
                                             usleep (/usr/lib64/libc-2.22.so)
                                             main (/usr/bin/usleep)
                                             __libc_start_main (/usr/lib64/libc-2.22.so)
                                             _start (/usr/bin/usleep)
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      44621819
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Introduce fprintf_callchain() method out of fprintf_sym() · ea453965
      Arnaldo Carvalho de Melo authored
      In 'perf trace' we're just interested in printing callchains, and we
      don't want to use the symbol_conf.use_callchain, so move the callchain
      part to a new method.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ea453965
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Rename print_ip() to fprintf_sym() · ff0c1078
      Arnaldo Carvalho de Melo authored
      As it receives a FILE, and its more than just the IP, which can even be
      requested not to be printed.
      
      For consistency with other similar methods in tools/perf/, name it as
      perf_evsel__fprintf_sym() and make it return the number of bytes
      printed, just like 'fprintf(3)'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ff0c1078
    • Milian Wolff's avatar
      perf trace: Add support for printing call chains on sys_exit events. · 566a0885
      Milian Wolff authored
      Now, one can print the call chain for every encountered sys_exit event,
      e.g.:
      
          $ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep
          1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0
                                                   syscall_slow_exit_work ([kernel.kallsyms])
                                                   syscall_return_slowpath ([kernel.kallsyms])
                                                   int_ret_from_sys_call ([kernel.kallsyms])
                                                   __nanosleep (/usr/lib/libc-2.23.so)
                                                   [unknown] (/usr/lib/libQt5Core.so.5.6.0)
                                                   QThread::sleep (/usr/lib/libQt5Core.so.5.6.0)
                                                   main (path/to/ex_sleep)
                                                   __libc_start_main (/usr/lib/libc-2.23.so)
                                                   _start (path/to/ex_sleep)
      
      Note that it is advised to increase the number of mmap pages to prevent
      event losses when using this new feature. Often, adding `-m 10M` to the
      `perf trace` invocation is enough.
      
      This feature is also available in strace when built with libunwind via
      `strace -k`. Performance wise, this solution is much better:
      
          $ time find path/to/linux &> /dev/null
      
          real    0m0.051s
          user    0m0.013s
          sys     0m0.037s
      
          $ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null
      
          real    0m2.624s
          user    0m1.203s
          sys     0m1.333s
      
          $ time strace -k find path/to/linux  &> /dev/null
      
          real    0m35.398s
          user    0m10.403s
          sys     0m23.173s
      
      Note that it is currently not possible to configure the print output.
      Adding such a feature, similar to what is available in `perf script` via
      its `--fields` knob can be added later on.
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com
      [ Split from a larger patch, do not print the IP, left align,
        remove dup call symbol__init(), added man page entry ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      566a0885
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Allow passing a left alignment when printing a symbol · db3617f3
      Arnaldo Carvalho de Melo authored
      For callchains, etc where we want it to align just below the syscall
      name, for instance, in 'perf trace'
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      db3617f3
    • Milian Wolff's avatar
      perf evsel: Allow specifying a file to output in perf_evsel__print_ip · 6186de9a
      Milian Wolff authored
      As this function will be used in 'perf trace'.
      
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org
      [ Split from a larger patch ]
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      6186de9a
    • Wang Nan's avatar
      perf bpf: Automatically create bpf-output event __bpf_stdout__ · 72c08098
      Wang Nan authored
      This patch removes the need to set a bpf-output event in cmdline.  By
      referencing a map named '__bpf_stdout__', perf automatically creates an
      event for it.
      
      For example:
      
        # perf record -e ./test_bpf_trace.c usleep 100000
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]
        # perf script
                 usleep  4639 [000] 261895.307826:        0            __bpf_stdout__:  ffffffff810eb9a1 ...
             BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                         0008: 42 50 46 20 65 76 65 6e  BPF even
                         0010: 74 21 00 00              t!..
             BPF string: "Raise a BPF event!"
      
                 usleep  4639 [000] 261895.407883:        0            __bpf_stdout__:  ffffffff8105d609 ...
             BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                         0008: 42 50 46 20 65 76 65 6e  BPF even
                         0010: 74 21 00 00              t!..
             BPF string: "Raise a BPF event!"
      
        perf record -e ./test_bpf_trace.c usleep 100000
      
        equals to:
      
        perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \
                    -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \
                    usleep 100000
      
      Where test_bpf_trace.c is:
      
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
               unsigned int type;
               unsigned int key_size;
               unsigned int value_size;
               unsigned int max_entries;
        };
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
               (void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
               (void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
               (void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
               (void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") __bpf_stdout__ = {
               .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
               .key_size = sizeof(int),
               .value_size = sizeof(u32),
               .max_entries = __NR_CPUS__,
        };
      
        static inline int __attribute__((always_inline))
        func(void *ctx, int type)
        {
      	char output_str[] = "Raise a BPF event!";
      	char err_str[] = "BAD %d\n";
      	int err;
      
              err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(),
      			        &output_str, sizeof(output_str));
      	if (err)
      		trace_printk(err_str, sizeof(err_str), err);
              return 1;
        }
        SEC("func_begin=sys_nanosleep")
        int func_begin(void *ctx) {return func(ctx, 1);}
        SEC("func_end=sys_nanosleep%return")
        int func_end(void *ctx) { return func(ctx, 2);}
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      
      Committer note:
      
      Testing with 'perf trace':
      
        # trace -e nanosleep --ev test_bpf_stdout.c usleep 1
           0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ...
           0.007 (         ): __bpf_stdout__:Raise a BPF event!..)
           0.008 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
           0.069 (         ): __bpf_stdout__:Raise a BPF event!..)
           0.070 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
           0.072 ( 0.072 ms): usleep/729  ... [continued]: nanosleep()) = 0
        #
      Suggested-and-Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      72c08098
    • Wang Nan's avatar
      perf bpf: Clone bpf stdout events in multiple bpf scripts · d7888573
      Wang Nan authored
      This patch allows cloning bpf-output event configuration among multiple
      bpf scripts. If there exist a map named '__bpf_output__' and not
      configured using 'map:__bpf_output__.event=', this patch clones the
      configuration of another '__bpf_stdout__' map. For example, following
      command:
      
        # perf trace --ev bpf-output/no-inherit,name=evt/ \
                     --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
                     --ev ./test_bpf_trace2.c usleep 100000
      
      equals to:
      
        # perf trace --ev bpf-output/no-inherit,name=evt/ \
                     --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/  \
                     --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \
                     usleep 100000
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Suggested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d7888573
  4. 11 Apr, 2016 4 commits
  5. 10 Apr, 2016 1 commit
    • Linus Torvalds's avatar
      Revert "ext4: allow readdir()'s of large empty directories to be interrupted" · 9f2394c9
      Linus Torvalds authored
      This reverts commit 1028b55b.
      
      It's broken: it makes ext4 return an error at an invalid point, causing
      the readdir wrappers to write the the position of the last successful
      directory entry into the position field, which means that the next
      readdir will now return that last successful entry _again_.
      
      You can only return fatal errors (that terminate the readdir directory
      walk) from within the filesystem readdir functions, the "normal" errors
      (that happen when the readdir buffer fills up, for example) happen in
      the iterorator where we know the position of the actual failing entry.
      
      I do have a very different patch that does the "signal_pending()"
      handling inside the iterator function where it is allowable, but while
      that one passes all the sanity checks, I screwed up something like four
      times while emailing it out, so I'm not going to commit it today.
      
      So my track record is not good enough, and the stars will have to align
      better before that one gets committed.  And it would be good to get some
      review too, of course, since celestial alignments are always an iffy
      debugging model.
      
      IOW, let's just revert the commit that caused the problem for now.
      Reported-by: default avatarGreg Thelen <gthelen@google.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9f2394c9