• Rasmus Villemoes's avatar
    tracing: Eliminate const char[] auto variables · 0f5e5a3a
    Rasmus Villemoes authored
    Automatic const char[] variables cause unnecessary code
    generation. For example, the this_mod variable leads to
    
        3f04:       48 b8 5f 5f 74 68 69 73 5f 6d   movabs $0x6d5f736968745f5f,%rax # __this_m
        3f0e:       4c 8d 44 24 02                  lea    0x2(%rsp),%r8
        3f13:       48 8d 7c 24 10                  lea    0x10(%rsp),%rdi
        3f18:       48 89 44 24 02                  mov    %rax,0x2(%rsp)
        3f1d:       4c 89 e9                        mov    %r13,%rcx
        3f20:       b8 65 00 00 00                  mov    $0x65,%eax # e
        3f25:       48 c7 c2 00 00 00 00            mov    $0x0,%rdx
                            3f28: R_X86_64_32S      .rodata.str1.1+0x18d
        3f2c:       be 48 00 00 00                  mov    $0x48,%esi
        3f31:       c7 44 24 0a 6f 64 75 6c         movl   $0x6c75646f,0xa(%rsp) # odul
        3f39:       66 89 44 24 0e                  mov    %ax,0xe(%rsp)
    
    i.e., the string gets built on the stack at runtime. Similar code can be
    found for the other instances I'm replacing here. Putting the string
    in .rodata reduces the combined .text+.rodata size and saves time and
    stack space at runtime.
    
    The simplest fix, and what I've done for the this_mod case, is to just
    make the variable static.
    
    However, for the "<faulted>" case where the same string is used twice,
    that prevents the linker from merging those two literals, so instead use
    a macro - that also keeps the two instances automatically in
    sync (instead of only the compile-time strlen expression).
    
    Finally, for the two runs of spaces, it turns out that the "build
    these strings on the stack" is not the worst part of what gcc does -
    it turns print_func_help_header_irq() into "if (tgid) { /*
    print_event_info + five seq_printf calls */ } else { /* print
    event_info + another five seq_printf */}". Taking inspiration from a
    suggestion from Al Viro, use %.*s to make snprintf either stop after
    the first two spaces or print the whole string. As a bonus, the
    seq_printfs now fit on single lines (at least, they are not longer
    than the existing ones in the function just above), making it easier
    to see that the ascii art lines up.
    
    x86-64 defconfig + CONFIG_FUNCTION_TRACER:
    
    $ scripts/stackdelta /tmp/stackusage.{0,1}
    ./kernel/trace/ftrace.c ftrace_mod_callback     152     136     -16
    ./kernel/trace/trace.c  trace_default_header    56      32      -24
    ./kernel/trace/trace.c  tracing_mark_raw_write  96      72      -24
    ./kernel/trace/trace.c  tracing_mark_write      104     80      -24
    
    bloat-o-meter
    
    add/remove: 1/0 grow/shrink: 0/4 up/down: 14/-375 (-361)
    Function                                     old     new   delta
    this_mod                                       -      14     +14
    ftrace_mod_callback                          577     542     -35
    tracing_mark_raw_write                       444     374     -70
    tracing_mark_write                           616     540     -76
    trace_default_header                         600     406    -194
    
    Link: http://lkml.kernel.org/r/20190320081757.6037-1-linux@rasmusvillemoes.dkSigned-off-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
    Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
    0f5e5a3a
trace.c 221 KB