• Vineet Gupta's avatar
    ARC: dw2 unwind: Remove falllback linear search thru FDE entries · 2e22502c
    Vineet Gupta authored
    Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"
    
    | perf record -g -c 15000 -e cycles /sbin/hackbench
    |
    | INFO: rcu_preempt self-detected stall on CPU
    | 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
    | Task dump for CPU 1:
    
    in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
    search (which iterates thru each of ~11K entries) thus takes 2 orders of
    magnitude longer (~3 million cycles vs. 2000). Routines written in hand
    assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
    yet) fail the unwinder binary lookup, hit linear search, failing
    nevertheless in the end.
    
    However the linear search is pointless as binary lookup tables are created
    from it in first place. It is impossible to have binary lookup fail while
    succeed the linear search. It is pure waste of cycles thus removed by
    this patch.
    
    This manifested as RCU stalls / NMI watchdog splat when running
    hackbench under perf with callgraph profiling. The triggering condition
    was perf counter overflowing in routine lacking dwarf info (like memset)
    leading to patheic 3 million cycle unwinder slow path and by the time it
    returned new interrupts were already pending (Timer, IPI) and taken
    rightaway. The original memset didn't make forward progress, system kept
    accruing more interrupts and more unwinder delayes in a vicious feedback
    loop, ultimately triggering the NMI diagnostic.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
    2e22502c
unwind.c 32.1 KB