    tracing: Merge irqflags + preempt counter. · 36590c50
    Sebastian Andrzej Siewior authored
    The state of the interrupts (irqflags) and the preemption counter are
    both passed down to tracing_generic_entry_update(). Only one bit of
    irqflags is actually required: the on/off state. The complete 32 bits
    of the preemption counter aren't needed either. Only whether any of the
    upper bits (softirq, hardirq and NMI) are set, plus the preemption
    depth, is needed.
    
    The irqflags and the preemption counter could be evaluated early and the
    information stored in an integer `trace_ctx'.
    tracing_generic_entry_update() would use the upper bits as the
    TRACE_FLAG_* and the lower 8 bits as the disabled-preemption depth
    (considering that one must be subtracted from the counter in one
    special case).
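
    The packing described above can be sketched in plain C. This is a
    minimal userspace illustration, not the kernel code: the sketch_*
    names, the flag values and the 16-bit shift are assumptions made for
    the example. Only the split into "upper bits = flags, lower 8 bits =
    depth" comes from the description above.

```c
#include <stdint.h>

/* Hypothetical flag values; the kernel's TRACE_FLAG_* values may differ. */
enum {
	SKETCH_FLAG_IRQS_OFF = 1u << 0,
	SKETCH_FLAG_HARDIRQ  = 1u << 1,
	SKETCH_FLAG_SOFTIRQ  = 1u << 2,
	SKETCH_FLAG_NMI      = 1u << 3,
};

#define SKETCH_FLAG_SHIFT 16	/* assumed position of the flag bits */
#define SKETCH_DEPTH_MASK 0xffu	/* lower 8 bits: preemption depth */

/* Combine the flags and the preemption depth into one integer. */
static inline unsigned int sketch_pack_ctx(unsigned int flags,
					   unsigned int depth)
{
	return (flags << SKETCH_FLAG_SHIFT) | (depth & SKETCH_DEPTH_MASK);
}

/* Recover the flag bits from the packed value. */
static inline unsigned int sketch_ctx_flags(unsigned int ctx)
{
	return ctx >> SKETCH_FLAG_SHIFT;
}

/* Recover the preemption depth from the packed value. */
static inline unsigned int sketch_ctx_depth(unsigned int ctx)
{
	return ctx & SKETCH_DEPTH_MASK;
}

/*
 * The special case: report a depth that is one lower. Since the depth
 * lives in the low bits, subtracting one from the packed value is
 * enough. The guard against a zero depth is an assumption of this
 * sketch.
 */
static inline unsigned int sketch_ctx_dec(unsigned int ctx)
{
	return sketch_ctx_depth(ctx) ? ctx - 1 : ctx;
}
```

    For instance, packing SKETCH_FLAG_IRQS_OFF | SKETCH_FLAG_SOFTIRQ with
    a depth of 3 and then calling sketch_ctx_dec() leaves the flags
    untouched and yields a depth of 2.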
    
    The actual preemption value is not used except for the tracing record.
    The `irqflags' variable is mostly used only for the tracing record. An
    exception is, for instance, wakeup_tracer_call() or
    probe_wakeup_sched_switch(), which explicitly disable interrupts and use
    `irqflags' both to save (and restore) the IRQ state and to record the
    state.
    
    Struct trace_event_buffer also has the `pc' and `flags' members, which
    can be replaced with `trace_ctx' since their actual value is not used
    outside of trace recording.
    
    This will reduce tracing_generic_entry_update() to simply assigning values
    to struct trace_entry. The evaluation of the TRACE_FLAG_* bits is moved
    to _tracing_gen_ctx_flags() which replaces preempt_count() and
    local_save_flags() invocations.
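
    A rough userspace sketch of such an evaluation step follows. The
    demo_* names, the flag values and the layout of the raw counter
    (8 bits of preemption depth, then softirq, hardirq and NMI fields)
    are assumptions for illustration. The real helper reads the CPU
    state itself instead of taking it as arguments.

```c
#include <stdbool.h>

/* Assumed layout of the raw preemption counter, for illustration only. */
#define DEMO_PREEMPT_MASK 0x000000ffu
#define DEMO_SOFTIRQ_MASK 0x0000ff00u
#define DEMO_HARDIRQ_MASK 0x000f0000u
#define DEMO_NMI_MASK     0x00f00000u

/* Hypothetical flag bits standing in for TRACE_FLAG_*. */
#define DEMO_FLAG_IRQS_OFF 0x01u
#define DEMO_FLAG_HARDIRQ  0x02u
#define DEMO_FLAG_SOFTIRQ  0x04u
#define DEMO_FLAG_NMI      0x08u

/*
 * Evaluate the IRQ state and the raw counter once and return a single
 * packed value: flags in the upper half, preemption depth in the
 * lower 8 bits.
 */
static inline unsigned int demo_gen_ctx(bool irqs_off, unsigned int raw_count)
{
	unsigned int flags = 0;

	if (irqs_off)
		flags |= DEMO_FLAG_IRQS_OFF;
	if (raw_count & DEMO_NMI_MASK)
		flags |= DEMO_FLAG_NMI;
	if (raw_count & DEMO_HARDIRQ_MASK)
		flags |= DEMO_FLAG_HARDIRQ;
	if (raw_count & DEMO_SOFTIRQ_MASK)
		flags |= DEMO_FLAG_SOFTIRQ;

	return (flags << 16) | (raw_count & DEMO_PREEMPT_MASK);
}
```

    With interrupts off, a hardirq bit set and a depth of 3 (raw counter
    0x00010003), this sketch returns 0x00030003.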
    
    As an example, ftrace_syscall_enter() may invoke:
    - trace_buffer_lock_reserve() -> … -> tracing_generic_entry_update()
    - event_trigger_unlock_commit()
      -> ftrace_trace_stack() -> … -> tracing_generic_entry_update()
      -> ftrace_trace_userstack() -> … -> tracing_generic_entry_update()
    
    In this case the TRACE_FLAG_* bits were evaluated three times. By using
    the `trace_ctx' they are evaluated once and assigned three times.
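
    The "evaluate once, assign three times" pattern can be illustrated
    with a small userspace model. All names here are hypothetical, and
    the fixed state returned by sketch_gen_ctx() is made up. The counter
    only exists to show how often the evaluation runs.

```c
#include <stddef.h>

/* Simplified stand-in for the per-record header. */
struct sketch_entry {
	unsigned char flags;
	unsigned char preempt_count;
};

/* Counts how often the context was evaluated. */
static int sketch_eval_count;

/* Stand-in for reading the IRQ state and the preemption counter. */
static unsigned int sketch_gen_ctx(void)
{
	sketch_eval_count++;
	/* Made-up state: flag bit 0 set, preemption depth 2. */
	return (1u << 16) | 2u;
}

/* With a precomputed context, the update is a plain assignment. */
static void sketch_entry_update(struct sketch_entry *e, unsigned int ctx)
{
	e->flags = (unsigned char)(ctx >> 16);
	e->preempt_count = (unsigned char)(ctx & 0xff);
}

/* One event producing three records: event, stack and user stack. */
static void sketch_record_event(struct sketch_entry entries[3])
{
	unsigned int ctx = sketch_gen_ctx();	/* evaluated once */

	for (size_t i = 0; i < 3; i++)		/* assigned three times */
		sketch_entry_update(&entries[i], ctx);
}
```

    After one call to sketch_record_event(), sketch_eval_count is 1 and
    all three entries carry the same flags and depth.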
    
    A build with all tracers enabled on x86-64 with and without the patch:
    
        text     data      bss      dec      hex    filename
    21970669 17084168  7639260 46694097  2c87ed1 vmlinux.old
    21970293 17084168  7639260 46693721  2c87d59 vmlinux.new
    
    text shrank by 376 bytes, data remained constant.
    
    Link: https://lkml.kernel.org/r/20210125194511.3924915-2-bigeasy@linutronix.de
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>