Commit 860b6497 authored by Brendan Gregg's avatar Brendan Gregg

softirq and hardirq

parent 971e1c25
......@@ -69,9 +69,11 @@ Tools:
- tools/[biosnoop](tools/biosnoop): Trace block device I/O with PID and latency. [Examples](tools/biosnoop_example.txt).
- tools/[funccount](tools/funccount): Count kernel function calls. [Examples](tools/funccount_example.txt).
- tools/[funclatency](tools/funclatency): Time kernel functions and show their latency distribution. [Examples](tools/funclatency_example.txt).
- tools/[hardirqs](tools/hardirqs): Measure hard IRQ (hard interrupt) event time. [Examples](tools/hardirqs_example.txt).
- tools/[killsnoop](tools/killsnoop): Trace signals issued by the kill() syscall. [Examples](tools/killsnoop_example.txt).
- tools/[opensnoop](tools/opensnoop): Trace open() syscalls. [Examples](tools/opensnoop_example.txt).
- tools/[pidpersec](tools/pidpersec): Count new processes (via fork). [Examples](tools/pidpersec_example.txt).
- tools/[softirqs](tools/softirqs): Measure soft IRQ (soft interrupt) event time. [Examples](tools/softirqs_example.txt).
- tools/[syncsnoop](tools/syncsnoop): Trace sync() syscall. [Examples](tools/syncsnoop_example.txt).
- tools/[tcpaccept](tools/tcpaccept): Trace TCP passive connections (accept()). [Examples](tools/tcpaccept_example.txt).
- tools/[tcpconnect](tools/tcpconnect): Trace TCP active connections (connect()). [Examples](tools/tcpconnect_example.txt).
......
.TH hardirqs 8 "2015-10-20" "USER COMMANDS"
.SH NAME
hardirqs \- Measure hard IRQ (hard interrupt) event time. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B hardirqs [\-h] [\-T] [\-N] [\-d] [interval] [count]
.SH DESCRIPTION
This summarizes the time spent servicing hard IRQs (hard interrupts), and can
show this time as either totals or histogram distributions. A system-wide
summary of this time is shown by the %irq column of mpstat(1), and event
counts (but not times) are shown by /proc/interrupts.
WARNING: This currently uses dynamic tracing of hard interrupts. You should
understand what this means before use. Try in a test environment. Future
versions should switch to tracepoints.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-h
Print usage message.
.TP
\-T
Include timestamps on output.
.TP
\-N
Output in nanoseconds
.TP
\-d
Show IRQ time distribution as histograms
.SH EXAMPLES
.TP
Sum hard IRQ event time until Ctrl-C:
#
.B hardirqs
.TP
Show hard IRQ event time as histograms:
#
.B hardirqs \-d
.TP
Print 1 second summaries, 10 times:
#
.B hardirqs 1 10
.TP
1 second summaries, printed in nanoseconds, with timestamps:
#
.B hardirqs \-NT 1
.SH FIELDS
.TP
HARDIRQ
The irq action name for this hard IRQ.
.TP
TOTAL_usecs
Total time spent in this hard IRQ in microseconds.
.TP
TOTAL_nsecs
Total time spent in this hard IRQ in nanoseconds.
.TP
usecs
Range of microseconds for this bucket.
.TP
nsecs
Range of nanoseconds for this bucket.
.TP
count
Number of hard IRQs in this time range.
.TP
distribution
ASCII representation of the distribution (the count column).
.SH OVERHEAD
This traces kernel functions and maintains in-kernel counts, which
are asynchronously copied to user-space. While the rate of interrupts
be very high (>1M/sec), this is a relatively efficient way to trace these
events, and so the overhead is expected to be small for normal workloads, but
could become noticable for heavy workloads. Measure in a test environment
before use.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _examples.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
softirqs(8)
.TH softirqs 8 "2015-10-20" "USER COMMANDS"
.SH NAME
softirqs \- Measure soft IRQ (soft interrupt) event time. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B softirqs [\-h] [\-T] [\-N] [\-d] [interval] [count]
.SH DESCRIPTION
This summarizes the time spent servicing soft IRQs (soft interrupts), and can
show this time as either totals or histogram distributions. A system-wide
summary of this time is shown by the %soft column of mpstat(1), and soft IRQ
event counts (but not times) are available in /proc/softirqs.
WARNING: This currently uses dynamic tracing of various soft interrupt
functions, and can easily not work with different kernel versions. Check and
adjust the code as necessary. Also try in a test environment and ensure this
tool is safe before use. Future versions should switch to tracepoints.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-h
Print usage message.
.TP
\-T
Include timestamps on output.
.TP
\-N
Output in nanoseconds
.TP
\-d
Show IRQ time distribution as histograms
.SH EXAMPLES
.TP
Sum soft IRQ event time until Ctrl-C:
#
.B softirqs
.TP
Show soft IRQ event time as histograms:
#
.B softirqs \-d
.TP
Print 1 second summaries, 10 times:
#
.B softirqs 1 10
.TP
1 second summaries, printed in nanoseconds, with timestamps:
#
.B softirqs \-NT 1
.SH FIELDS
.TP
SOFTIRQ
The kernel function name that performs the soft IRQ action.
.TP
TOTAL_usecs
Total time spent in this soft IRQ function in microseconds.
.TP
TOTAL_nsecs
Total time spent in this soft IRQ function in nanoseconds.
.TP
usecs
Range of microseconds for this bucket.
.TP
nsecs
Range of nanoseconds for this bucket.
.TP
count
Number of soft IRQs in this time range.
.TP
distribution
ASCII representation of the distribution (the count column).
.SH OVERHEAD
This traces kernel functions and maintains in-kernel counts, which
are asynchronously copied to user-space. While the rate of interrupts
be very high (>1M/sec), this is a relatively efficient way to trace these
events, and so the overhead is expected to be small for normal workloads, but
could become noticable for heavy workloads. Measure in a test environment
before use.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _examples.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
hardirqs(8)
#!/usr/bin/python
#
# hardirqs Summarize hard IRQ (interrupt) event time.
# For Linux, uses BCC, eBPF.
#
# USAGE: hardirqs [-h] [-T] [-Q] [-m] [-D] [interval] [count]
#
# Thanks Amer Ather for help understanding irq behavior.
#
# Copyright (c) 2015 Brendan Gregg.
# Licensed under the Apache License, Version 2.0 (the "License")
#
# 19-Oct-2015 Brendan Gregg Created this.
from __future__ import print_function
from bcc import BPF
from time import sleep, strftime
import argparse
### arguments
examples = """examples:
./hardirqs # sum hard irq event time
./hardirqs -d # show hard irq event time as histograms
./hardirqs 1 10 # print 1 second summaries, 10 times
./hardirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
"""
parser = argparse.ArgumentParser(
description="Summarize hard irq event time as histograms",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=examples)
parser.add_argument("-T", "--timestamp", action="store_true",
help="include timestamp on output")
parser.add_argument("-N", "--nanoseconds", action="store_true",
help="output in nanoseconds")
parser.add_argument("-d", "--dist", action="store_true",
help="show distributions as histograms")
parser.add_argument("interval", nargs="?", default=99999999,
help="output interval, in seconds")
parser.add_argument("count", nargs="?", default=99999999,
help="number of outputs")
args = parser.parse_args()
countdown = int(args.count)
if args.nanoseconds:
factor = 1
label = "nsecs"
else:
factor = 1000
label = "usecs"
debug = 0
### define BPF program
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <linux/irq.h>
#include <linux/irqdesc.h>
#include <linux/interrupt.h>
typedef struct irq_key {
char name[32];
u64 slot;
} irq_key_t;
BPF_HASH(start, u32);
BPF_HASH(irqdesc, u32, struct irq_desc *);
BPF_HISTOGRAM(dist, irq_key_t);
// time IRQ
int trace_start(struct pt_regs *ctx, struct irq_desc *desc)
{
u32 pid = bpf_get_current_pid_tgid();
u64 ts = bpf_ktime_get_ns();
start.update(&pid, &ts);
irqdesc.update(&pid, &desc);
return 0;
}
int trace_completion(struct pt_regs *ctx)
{
u64 *tsp, delta;
struct irq_desc **descp;
u32 pid = bpf_get_current_pid_tgid();
// fetch timestamp and calculate delta
tsp = start.lookup(&pid);
descp = irqdesc.lookup(&pid);
if (tsp == 0 || descp == 0) {
return 0; // missed start
}
// Note: descp is a value from map, so '&' can be done without
// probe_read, but the next level irqaction * needs a probe read.
// Do these steps first after reading the map, otherwise some of these
// pointers may get pushed onto the stack and verifier will fail.
struct irqaction *action = 0;
bpf_probe_read(&action, sizeof(action), &(*descp)->action);
const char **namep = &action->name;
char *name = 0;
bpf_probe_read(&name, sizeof(name), namep);
delta = bpf_ktime_get_ns() - *tsp;
// store as sum or histogram
STORE
start.delete(&pid);
irqdesc.delete(&pid);
return 0;
}
"""
### code substitutions
if args.dist:
bpf_text = bpf_text.replace('STORE',
'irq_key_t key = {.slot = bpf_log2l(delta)};' +
'bpf_probe_read(&key.name, sizeof(key.name), name);' +
'dist.increment(key);')
else:
bpf_text = bpf_text.replace('STORE',
'irq_key_t key = {.slot = 0 /* ignore */};' +
'bpf_probe_read(&key.name, sizeof(key.name), name);' +
'u64 zero = 0, *vp = dist.lookup_or_init(&key, &zero);' +
'(*vp) += delta;')
if debug:
print(bpf_text)
### load BPF program
b = BPF(text=bpf_text)
# these should really use irq:irq_handler_entry/exit tracepoints:
b.attach_kprobe(event="handle_irq_event_percpu", fn_name="trace_start")
b.attach_kretprobe(event="handle_irq_event_percpu", fn_name="trace_completion")
print("Tracing hard irq event time... Hit Ctrl-C to end.")
### output
exiting = 0 if args.interval else 1
dist = b.get_table("dist")
while (1):
try:
sleep(int(args.interval))
except KeyboardInterrupt:
exiting=1
print()
if args.timestamp:
print("%-8s\n" % strftime("%H:%M:%S"), end="")
if args.dist:
dist.print_log2_hist(label, "hardirq")
else:
print("%-26s %11s" % ("HARDIRQ", "TOTAL_" + label))
for k, v in sorted(dist.items(), key=lambda dist: dist[1].value):
print("%-26s %11d" % (k.name, v.value / factor))
dist.clear()
countdown -= 1
if exiting or countdown == 0:
exit()
Demonstrations of hardirqs, the Linux eBPF/bcc version.
This program traces hard interrupts (irqs), and stores timing statistics
in-kernel for efficiency. For example:
# ./hardirqs
Tracing hard irq event time... Hit Ctrl-C to end.
^C
HARDIRQ TOTAL_usecs
callfuncsingle0 2
callfuncsingle5 5
callfuncsingle6 5
callfuncsingle7 21
blkif 66
timer7 84
resched5 94
resched0 97
resched3 102
resched7 111
resched6 255
timer3 362
resched4 367
timer5 474
timer1 529
timer6 679
timer2 746
timer4 943
resched1 1048
timer0 1558
resched2 1750
eth0 11441
The HARDIRQ column prints the interrupt action name. While tracing, the eth0
hard irq action ran for 11441 microseconds (11 milliseconds) in total.
Many other interrupts are visible in the output: this is an 8 CPU system, and
some of these interrupts have a separate action per-CPU (eg, "timer",
"resched").
An interval can be provided, and also optionally a count. Eg, printing output
every 1 second, and including timestamps (-T):
# ./hardirqs -T 1 3
Tracing hard irq event time... Hit Ctrl-C to end.
22:16:14
HARDIRQ TOTAL_usecs
callfuncsingle0 2
callfuncsingle7 5
callfuncsingle3 5
callfuncsingle2 5
callfuncsingle6 6
callfuncsingle1 11
resched0 32
blkif 51
resched5 71
resched7 71
resched4 72
resched6 82
timer7 172
resched1 187
resched2 236
timer3 252
resched3 282
timer1 320
timer2 374
timer6 396
timer5 427
timer4 470
timer0 1430
eth0 7498
22:16:15
HARDIRQ TOTAL_usecs
callfuncsingle7 6
callfuncsingle5 11
callfuncsingle4 13
timer2 17
callfuncsingle6 18
resched0 21
blkif 33
resched3 40
resched5 60
resched4 69
resched6 70
resched7 74
timer7 86
resched2 91
timer3 134
resched1 293
timer5 354
timer1 433
timer6 497
timer4 1112
timer0 1768
eth0 6972
22:16:16
HARDIRQ TOTAL_usecs
callfuncsingle7 5
callfuncsingle3 5
callfuncsingle2 6
timer3 10
resched0 18
callfuncsingle4 22
resched5 27
resched6 44
blkif 45
resched7 65
resched4 69
timer4 77
resched2 97
timer7 98
resched3 103
timer2 169
resched1 226
timer5 525
timer1 691
timer6 697
timer0 1415
eth0 7152
This can be useful for quantifying where CPU cycles are spent among the hard
interrupts (summarized as the %irq column from mpstat(1)). The output above
shows that most time was spent processing for eth0 (network interface), which
was around 7 milliseconds per second (total across all CPUs).
Note that the time spent among the "timer" interrupts was low, and usually less
than one microsecond per second. Here's the hardirq per-second output when the
perf tool is performing a 999 Hertz CPU profile ("perf record -F 999 -a ..."):
22:13:59
HARDIRQ TOTAL_usecs
callfuncsingle7 5
callfuncsingle5 5
callfuncsingle3 6
callfuncsingle4 7
callfuncsingle6 19
blkif 66
resched0 66
resched2 82
resched7 87
resched3 96
resched4 118
resched5 120
resched6 130
resched1 230
timer3 946
timer1 1981
timer7 2618
timer5 3063
timer6 3141
timer4 3511
timer2 3554
timer0 5044
eth0 16015
This sheds some light into the CPU overhead of the perf profiler, which cost
around 3 milliseconds per second. Note that I'm usually profiling at a much
lower rate, 99 Hertz, which looks like this:
22:22:12
HARDIRQ TOTAL_usecs
callfuncsingle3 5
callfuncsingle6 5
callfuncsingle5 22
blkif 46
resched6 47
resched5 57
resched4 66
resched7 78
resched2 97
resched0 214
timer2 326
timer0 498
timer5 536
timer6 576
timer1 600
timer4 982
resched1 1315
timer7 1364
timer3 1825
resched3 5708
eth0 9743
Much lower (and remember to compare this to the baseline). Note that perf has
other overheads (non-irq CPU cycles, file system storage).
The distribution of interrupt run time can be printed as a histogram with the -d
option. Eg:
# ./hardirqs -d
Tracing hard irq event time... Hit Ctrl-C to end.
^C
hardirq = 'callfuncsingle1'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 1 |****************************************|
hardirq = 'callfuncsingle0'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 1 |****************************************|
hardirq = 'callfuncsingle3'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 3 |****************************************|
hardirq = 'callfuncsingle2'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 2 |****************************************|
hardirq = 'callfuncsingle5'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 5 |****************************************|
hardirq = 'callfuncsingle4'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 6 |****************************************|
hardirq = 'callfuncsingle7'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 4 |****************************************|
hardirq = 'callfuncsingle6'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 4 |****************************************|
hardirq = 'eth0'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 5102 |********* |
1024 -> 2047 : 20617 |****************************************|
2048 -> 4095 : 4832 |********* |
4096 -> 8191 : 12 | |
hardirq = 'timer7'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 9 |*** |
2048 -> 4095 : 70 |***************************** |
4096 -> 8191 : 94 |****************************************|
hardirq = 'timer6'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 1 | |
2048 -> 4095 : 86 |*********** |
4096 -> 8191 : 295 |****************************************|
8192 -> 16383 : 28 |*** |
hardirq = 'timer5'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 1 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 137 |****************************************|
4096 -> 8191 : 123 |*********************************** |
8192 -> 16383 : 8 |** |
hardirq = 'timer4'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 46 |********* |
4096 -> 8191 : 198 |****************************************|
8192 -> 16383 : 49 |********* |
hardirq = 'timer3'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 4 | |
2048 -> 4095 : 210 |****************************************|
4096 -> 8191 : 186 |*********************************** |
hardirq = 'timer2'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 245 |****************************************|
4096 -> 8191 : 227 |************************************* |
8192 -> 16383 : 6 | |
hardirq = 'timer1'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 6 |* |
2048 -> 4095 : 112 |************************ |
4096 -> 8191 : 181 |****************************************|
8192 -> 16383 : 7 |* |
hardirq = 'timer0'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 887 |****************************************|
8192 -> 16383 : 92 |**** |
hardirq = 'blkif'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 9 |****************************************|
8192 -> 16383 : 7 |******************************* |
16384 -> 32767 : 2 |******** |
hardirq = 'resched4'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 104 |****************************************|
2048 -> 4095 : 80 |****************************** |
hardirq = 'resched5'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 27 |***** |
1024 -> 2047 : 216 |****************************************|
2048 -> 4095 : 27 |***** |
4096 -> 8191 : 1 | |
hardirq = 'resched6'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 480 |******************* |
1024 -> 2047 : 1003 |****************************************|
2048 -> 4095 : 64 |** |
hardirq = 'resched7'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 46 |********* |
1024 -> 2047 : 190 |****************************************|
2048 -> 4095 : 42 |******** |
hardirq = 'resched0'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 11 |**** |
1024 -> 2047 : 100 |****************************************|
2048 -> 4095 : 23 |********* |
hardirq = 'resched1'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 96 |******** |
1024 -> 2047 : 462 |****************************************|
2048 -> 4095 : 36 |*** |
hardirq = 'resched2'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 120 |************************** |
1024 -> 2047 : 183 |****************************************|
2048 -> 4095 : 41 |******** |
hardirq = 'resched3'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 789 |****************************************|
2048 -> 4095 : 39 |* |
USAGE message:
# ./hardirqs -h
usage: hardirqs [-h] [-T] [-N] [-d] [interval] [count]
Summarize hard irq event time as histograms
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --timestamp include timestamp on output
-N, --nanoseconds output in nanoseconds
-d, --dist show distributions as histograms
examples:
./hardirqs # sum hard irq event time
./hardirqs -d # show hard irq event time as histograms
./hardirqs 1 10 # print 1 second summaries, 10 times
./hardirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
#!/usr/bin/python
#
# softirqs Summarize soft IRQ (interrupt) event time.
# For Linux, uses BCC, eBPF.
#
# USAGE: softirqs [-h] [-T] [-N] [-d] [interval] [count]
#
# Copyright (c) 2015 Brendan Gregg.
# Licensed under the Apache License, Version 2.0 (the "License")
#
# 20-Oct-2015 Brendan Gregg Created this.
from __future__ import print_function
from bcc import BPF
from time import sleep, strftime
import argparse
### arguments
examples = """examples:
./softirqs # sum soft irq event time
./softirqs -d # show soft irq event time as histograms
./softirqs 1 10 # print 1 second summaries, 10 times
./softirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
"""
parser = argparse.ArgumentParser(
description="Summarize soft irq event time as histograms",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=examples)
parser.add_argument("-T", "--timestamp", action="store_true",
help="include timestamp on output")
parser.add_argument("-N", "--nanoseconds", action="store_true",
help="output in nanoseconds")
parser.add_argument("-d", "--dist", action="store_true",
help="show distributions as histograms")
parser.add_argument("interval", nargs="?", default=99999999,
help="output interval, in seconds")
parser.add_argument("count", nargs="?", default=99999999,
help="number of outputs")
args = parser.parse_args()
countdown = int(args.count)
if args.nanoseconds:
factor = 1
label = "nsecs"
else:
factor = 1000
label = "usecs"
debug = 0
### define BPF program
bpf_text = """
#include <uapi/linux/ptrace.h>
typedef struct irq_key {
u64 ip;
u64 slot;
} irq_key_t;
BPF_HASH(start, u32);
BPF_HASH(iptr, u32);
BPF_HISTOGRAM(dist, irq_key_t);
// time IRQ
int trace_start(struct pt_regs *ctx)
{
u32 pid = bpf_get_current_pid_tgid();
u64 ip = ctx->ip, ts = bpf_ktime_get_ns();
start.update(&pid, &ts);
iptr.update(&pid, &ip);
return 0;
}
int trace_completion(struct pt_regs *ctx)
{
u64 *tsp, delta, ip, *ipp;
u32 pid = bpf_get_current_pid_tgid();
// fetch timestamp and calculate delta
tsp = start.lookup(&pid);
ipp = iptr.lookup(&pid);
if (tsp == 0 || ipp == 0) {
return 0; // missed start
}
delta = bpf_ktime_get_ns() - *tsp;
ip = *ipp;
// store as sum or histogram
STORE
start.delete(&pid);
iptr.delete(&pid);
return 0;
}
"""
### code substitutions
if args.dist:
bpf_text = bpf_text.replace('STORE',
'irq_key_t key = {.ip = ip, .slot = bpf_log2l(delta)};' +
'dist.increment(key);')
else:
bpf_text = bpf_text.replace('STORE',
'irq_key_t key = {.ip = ip, .slot = 0 /* ignore */};' +
'u64 zero = 0, *vp = dist.lookup_or_init(&key, &zero);' +
'(*vp) += delta;')
if debug:
print(bpf_text)
### load BPF program
b = BPF(text=bpf_text)
# this should really use irq:softirq_entry/exit tracepoints; for now the
# soft irq functions are individually traced (search your kernel for
# open_softirq() calls, and adjust the following list as needed).
for softirqfunc in ("blk_iopoll_softirq", "blk_done_softirq",
"rcu_process_callbacks", "run_rebalance_domains", "tasklet_action",
"tasklet_hi_action", "run_timer_softirq", "net_tx_action",
"net_rx_action"):
b.attach_kprobe(event=softirqfunc, fn_name="trace_start")
b.attach_kretprobe(event=softirqfunc, fn_name="trace_completion")
print("Tracing soft irq event time... Hit Ctrl-C to end.")
### output
exiting = 0 if args.interval else 1
dist = b.get_table("dist")
while (1):
try:
sleep(int(args.interval))
except KeyboardInterrupt:
exiting=1
print()
if args.timestamp:
print("%-8s\n" % strftime("%H:%M:%S"), end="")
if args.dist:
dist.print_log2_hist(label, "softirq", section_print_fn=b.ksym)
else:
print("%-26s %11s" % ("SOFTIRQ", "TOTAL_" + label))
for k, v in sorted(dist.items(), key=lambda dist: dist[1].value):
print("%-26s %11d" % (b.ksym(k.ip), v.value / factor))
dist.clear()
countdown -= 1
if exiting or countdown == 0:
exit()
Demonstrations of softirqs, the Linux eBPF/bcc version.
This program traces soft interrupts (irqs), and stores timing statistics
in-kernel for efficiency. For example:
# ./softirqs
Tracing soft irq event time... Hit Ctrl-C to end.
^C
SOFTIRQ TOTAL_usecs
rcu_process_callbacks 974
run_rebalance_domains 1809
run_timer_softirq 2615
net_tx_action 14605
tasklet_action 38692
net_rx_action 88188
The SOFTIRQ column prints the interrupt action function name. While tracing,
the net_rx_action() soft interrupt ran for 20199 microseconds (20 milliseconds)
in total.
This tool works by dynamic tracing the individual softirq functions, and will
need to be adjusted to match kernel/module changes. Future versions should
use the softirq tracepoints instead.
An interval can be provided, and also optionally a count. Eg, printing output
every 1 second, and including timestamps (-T):
# ./softirqs -T 1 3
Tracing soft irq event time... Hit Ctrl-C to end.
22:29:16
SOFTIRQ TOTAL_usecs
rcu_process_callbacks 456
run_rebalance_domains 1005
run_timer_softirq 1196
net_tx_action 2796
tasklet_action 5534
net_rx_action 15075
22:29:17
SOFTIRQ TOTAL_usecs
rcu_process_callbacks 456
run_rebalance_domains 839
run_timer_softirq 1142
net_tx_action 1912
tasklet_action 4428
net_rx_action 14652
22:29:18
SOFTIRQ TOTAL_usecs
rcu_process_callbacks 502
run_rebalance_domains 840
run_timer_softirq 1192
net_tx_action 2341
tasklet_action 5496
net_rx_action 15656
This can be useful for quantifying where CPU cycles are spent among the soft
interrupts (summarized as the %softirq column from mpstat(1), and shown as
event counts in /proc/softirqs). The output above shows that most time was spent
processing net_rx_action(), which was around 15 milleconds per second (total
time across all CPUs).
The distribution of interrupt run time can be printed as a histogram with the -d
option. Eg:
# ./softirqs -d
Tracing soft irq event time... Hit Ctrl-C to end.
^C
softirq = net_tx_action
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 440 | |
512 -> 1023 : 27613 |****************************************|
1024 -> 2047 : 5728 |******** |
2048 -> 4095 : 439 | |
4096 -> 8191 : 53 | |
8192 -> 16383 : 2 | |
softirq = net_rx_action
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 6 | |
1024 -> 2047 : 35 | |
2048 -> 4095 : 3562 |**************** |
4096 -> 8191 : 7023 |******************************** |
8192 -> 16383 : 8770 |****************************************|
16384 -> 32767 : 1780 |******** |
32768 -> 65535 : 216 | |
65536 -> 131071 : 4 | |
softirq = tasklet_action
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 95 | |
512 -> 1023 : 12521 |****************************************|
1024 -> 2047 : 1068 |*** |
2048 -> 4095 : 1077 |*** |
4096 -> 8191 : 12349 |*************************************** |
8192 -> 16383 : 464 |* |
16384 -> 32767 : 1 | |
softirq = rcu_process_callbacks
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 708 |****************************************|
1024 -> 2047 : 495 |*************************** |
2048 -> 4095 : 98 |***** |
4096 -> 8191 : 62 |*** |
8192 -> 16383 : 4 | |
softirq = run_timer_softirq
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 2 | |
512 -> 1023 : 366 |********* |
1024 -> 2047 : 1525 |****************************************|
2048 -> 4095 : 629 |**************** |
4096 -> 8191 : 87 |** |
8192 -> 16383 : 1 | |
softirq = run_rebalance_domains
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 3 | |
512 -> 1023 : 18 |* |
1024 -> 2047 : 80 |******** |
2048 -> 4095 : 374 |****************************************|
4096 -> 8191 : 257 |*************************** |
8192 -> 16383 : 50 |***** |
16384 -> 32767 : 24 |** |
USAGE message:
# ./softirqs -h
usage: softirqs [-h] [-T] [-N] [-d] [interval] [count]
Summarize soft irq event time as histograms
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --timestamp include timestamp on output
-N, --nanoseconds output in nanoseconds
-d, --dist show distributions as histograms
examples:
./softirqs # sum soft irq event time
./softirqs -d # show soft irq event time as histograms
./softirqs 1 10 # print 1 second summaries, 10 times
./softirqs -NT 1 # 1s summaries, nanoseconds, and timestamps
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment