• Arnaldo Carvalho de Melo's avatar
    perf trace: Deref sys_enter pointer args with contents from probe:vfs_getname · f994592d
    Arnaldo Carvalho de Melo authored
    To work like strace and dereference syscall pointer args we need to
    insert probes (or tracepoints) right after we copy those bytes from
    userspace.
    
    Since we're formatting the syscall args at raw_syscalls:sys_enter time,
    we need to have a formatter that just stores the position where, later,
    when we get the probe:vfs_getname, we can insert the pointer contents.
    
    Now, if a probe:vfs_getname with this format is in place:
    
     # perf probe -l
      probe:vfs_getname (on getname_flags:72@/home/git/linux/fs/namei.c with pathname)
    
    That was, in this case, put in place with:
    
     # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
     Added new event:
      probe:vfs_getname    (on getname_flags:72 with pathname=filename:string)
    
     You can now use it in all perf tools, such as:
    
    	perf record -e probe:vfs_getname -aR sleep 1
     #
    
    Then 'perf trace' will notice that and do the pointer -> contents
    expansion:
    
     # trace -e open touch /tmp/bla
      0.165 (0.010 ms): touch/17752 open(filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
      0.195 (0.011 ms): touch/17752 open(filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
      0.512 (0.012 ms): touch/17752 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
      0.582 (0.012 ms): touch/17752 open(filename: /tmp/bla, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: 438) = 3
     #
    
    Roughly equivalent to strace's output:
    
     # strace -rT -e open touch /tmp/bla
      0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000039>
      0.000317 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000102>
      0.001461 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 <0.000072>
      0.000405 open("/tmp/bla", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3 <0.000055>
      0.000641 +++ exited with 0 +++
     #
    
    Now we need to either look for at all syscalls that are marked as
    pointers and have some well known names ("filename", "pathname", etc)
    and set the arg formatter to the one used for the "open" syscall in this
    patch.
    
    This implementation works for syscalls with just a string being copied
    from userspace, for matching syscalls with more than one string being
    copied via the same probe/trace point (vfs_getname) we need to extend
    the vfs_getname probe spec to include the pointer too, but there are
    some problems with that in 'perf probe' or the kernel kprobes code, need
    to investigate before considering supporting multiple strings per
    syscall.
    
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Milian Wolff <mail@milianw.de>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Stephane Eranian <eranian@google.com>
    Link: http://lkml.kernel.org/n/tip-xvuwx6nuj8cf389kf9s2ue2s@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    f994592d
builtin-trace.c 81 KB