    function_graph: Support recording and printing the return value of function · a1be9ccc
    Donglin Peng authored
    Analyzing system call failures with the function_graph tracer can be
    time-consuming, particularly when trying to locate the kernel function
    that first returns an error in the trace logs. This change simplifies
    the process by recording each function's return value in the 'retval'
    member of 'ftrace_graph_ret' and printing it in the trace output.
    
    This patch introduces two new trace options: funcgraph-retval and
    funcgraph-retval-hex. The former controls whether the return value is
    displayed, while the latter controls the display format.
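
    Whether the option is present on a running kernel can be checked
    before tracing. A minimal sketch, assuming tracefs is mounted at the
    path used in the example commands and that the option file is absent
    when the kernel lacks return-value support; has_retval_option is a
    hypothetical helper name, not something this patch provides:

```shell
# has_retval_option DIR: succeed if the funcgraph-retval option file exists
# under the tracing directory DIR. The helper name is hypothetical.
has_retval_option() {
    [ -e "$1/options/funcgraph-retval" ]
}

# Usage (path assumed; reading tracefs usually requires root):
#   has_retval_option /sys/kernel/debug/tracing && echo "retval available"
```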
    
    Note that a return value is printed even for functions whose return
    type is void. In that case the printed value is meaningless and can
    simply be ignored.
    
    This patch only establishes the fundamental infrastructure. Subsequent
    patches will make this feature available on some commonly used processor
    architectures.
    
    Here is an example:
    
    I attempted to attach the demo process to a cpu cgroup, but it failed:
    
    echo `pidof demo` > /sys/fs/cgroup/cpu/test/tasks
    -bash: echo: write error: Invalid argument
    
    The strace logs indicate that the write system call returned -EINVAL (-22):
    ...
    write(1, "273\n", 4)                    = -1 EINVAL (Invalid argument)
    ...
    
    To capture trace logs during a write system call, use the following
    commands:
    
    cd /sys/kernel/debug/tracing/
    echo 0 > tracing_on
    echo > trace
    echo *sys_write > set_graph_function
    echo *spin* > set_graph_notrace
    echo *rcu* >> set_graph_notrace
    echo *alloc* >> set_graph_notrace
    echo preempt* >> set_graph_notrace
    echo kfree* >> set_graph_notrace
    echo $$ > set_ftrace_pid
    echo function_graph > current_tracer
    echo 1 > options/funcgraph-retval
    echo 0 > options/funcgraph-retval-hex
    echo 1 > tracing_on
    echo `pidof demo` > /sys/fs/cgroup/cpu/test/tasks
    echo 0 > tracing_on
    cat trace > ~/trace.log
    
    To locate the root cause, search for error code -22 directly in the file
    trace.log and identify the first function that returned -22. Once you
    have identified this function, examine its code to determine the root
    cause.
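
    The search described above can be sketched as a small shell helper.
    This is an illustration only, not part of the patch: first_retval is
    a hypothetical name, and the sketch assumes funcgraph-retval-hex is 0
    so that the error code appears in the log in decimal as "= -22 */".

```shell
# first_retval VALUE FILE: print the first line of FILE (with its line
# number) whose recorded return value equals VALUE. Hypothetical helper.
first_retval() {
    # -F treats the pattern as a fixed string, so "*/" is matched literally;
    # "--" guards against patterns that start with a dash.
    grep -n -F -- "= $1 */" "$2" | head -n 1
}

# Usage:
#   first_retval -22 ~/trace.log
```

    Because function_graph prints a function's closing line only after
    all of its callees have returned, the first match is the innermost
    function on the failing path that returned the error.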
    
    For example, in the trace log below, cpu_cgroup_can_attach
    returned -22 first, so we can focus our analysis on this function to
    identify the root cause.
    
    ...
    
     1)          | cgroup_migrate() {
     1) 0.651 us |   cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
     1)          |   cgroup_migrate_execute() {
     1)          |     cpu_cgroup_can_attach() {
     1)          |       cgroup_taskset_first() {
     1) 0.732 us |         cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
     1) 1.232 us |       } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
     1) 0.380 us |       sched_rt_can_attach(); /* = 0x0 */
     1) 2.335 us |     } /* cpu_cgroup_can_attach = -22 */
     1) 4.369 us |   } /* cgroup_migrate_execute = -22 */
     1) 7.143 us | } /* cgroup_migrate = -22 */
    
    ...
    
    Link: https://lkml.kernel.org/r/1fc502712c981e0e6742185ba242992170ac9da8.1680954589.git.pengdonglin@sangfor.com.cn
    Tested-by: Florian Kauer <florian.kauer@linutronix.de>
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>