regexp: optimize for provably too short inputs
For many patterns we can compute the minimum input length at compile time. If the input is shorter, we can return early and get a huge speedup. As pointed out by Damian Gryski, Perl's regex engine contains a number of these kinds of fail-fast optimizations: https://perldoc.perl.org/perlreguts.html#Peep-hole-Optimisation-and-Analysis Benchmarks: (including new ones for compile time) name old time/op new time/op delta Compile/Onepass-8 4.39µs ± 1% 4.40µs ± 0% +0.34% (p=0.029 n=9+8) Compile/Medium-8 9.80µs ± 0% 9.91µs ± 0% +1.17% (p=0.000 n=10+10) Compile/Hard-8 72.7µs ± 0% 73.5µs ± 0% +1.10% (p=0.000 n=9+10) name old time/op new time/op delta Match/Easy0/16-8 52.6ns ± 5% 4.9ns ± 0% -90.68% (p=0.000 n=10+9) Match/Easy0/32-8 64.1ns ±10% 61.4ns ± 1% ~ (p=0.188 n=10+9) Match/Easy0/1K-8 280ns ± 1% 277ns ± 2% -0.97% (p=0.004 n=10+10) Match/Easy0/32K-8 4.61µs ± 1% 4.55µs ± 1% -1.49% (p=0.000 n=9+10) Match/Easy0/1M-8 229µs ± 0% 226µs ± 1% -1.29% (p=0.000 n=8+10) Match/Easy0/32M-8 7.50ms ± 1% 7.47ms ± 1% ~ (p=0.165 n=10+10) Match/Easy0i/16-8 533ns ± 1% 5ns ± 2% -99.07% (p=0.000 n=10+10) Match/Easy0i/32-8 950ns ± 0% 950ns ± 1% ~ (p=0.920 n=10+9) Match/Easy0i/1K-8 27.5µs ± 1% 27.5µs ± 0% ~ (p=0.739 n=10+10) Match/Easy0i/32K-8 1.13ms ± 0% 1.13ms ± 1% ~ (p=0.079 n=9+10) Match/Easy0i/1M-8 36.7ms ± 2% 36.1ms ± 0% -1.64% (p=0.000 n=10+9) Match/Easy0i/32M-8 1.17s ± 0% 1.16s ± 1% -0.80% (p=0.004 n=8+9) Match/Easy1/16-8 55.5ns ± 6% 4.9ns ± 1% -91.19% (p=0.000 n=10+9) Match/Easy1/32-8 58.3ns ± 8% 56.6ns ± 1% ~ (p=0.449 n=10+8) Match/Easy1/1K-8 750ns ± 0% 748ns ± 1% ~ (p=0.072 n=8+10) Match/Easy1/32K-8 31.8µs ± 0% 31.6µs ± 1% -0.50% (p=0.035 n=10+9) Match/Easy1/1M-8 1.10ms ± 1% 1.09ms ± 0% -0.95% (p=0.000 n=10+9) Match/Easy1/32M-8 35.5ms ± 0% 35.2ms ± 1% -1.05% (p=0.000 n=9+10) Match/Medium/16-8 442ns ± 2% 5ns ± 1% -98.89% (p=0.000 n=10+10) Match/Medium/32-8 875ns ± 0% 878ns ± 1% ~ (p=0.071 n=9+10) Match/Medium/1K-8 26.1µs ± 0% 25.9µs ± 0% -0.64% (p=0.000 n=10+10) Match/Medium/32K-8 1.09ms ± 1% 1.08ms ± 0% -0.84% (p=0.000 n=10+9) Match/Medium/1M-8 34.9ms ± 0% 34.6ms ± 1% -0.98% (p=0.000 n=9+10) Match/Medium/32M-8 1.12s ± 1% 1.11s ± 1% -0.98% (p=0.000 n=10+9) Match/Hard/16-8 721ns ± 1% 5ns ± 0% -99.32% (p=0.000 n=10+9) Match/Hard/32-8 1.32µs ± 1% 1.31µs ± 0% -0.71% (p=0.000 n=9+9) Match/Hard/1K-8 39.8µs ± 1% 39.7µs ± 1% ~ (p=0.165 n=10+10) Match/Hard/32K-8 1.57ms ± 0% 1.56ms ± 0% -0.70% (p=0.000 n=10+9) Match/Hard/1M-8 50.4ms ± 1% 50.1ms ± 1% -0.57% (p=0.007 n=10+10) Match/Hard/32M-8 1.62s ± 1% 1.60s ± 0% -0.98% (p=0.000 n=10+10) Match/Hard1/16-8 3.88µs ± 1% 3.86µs ± 0% ~ (p=0.118 n=10+10) Match/Hard1/32-8 7.44µs ± 1% 7.46µs ± 1% ~ (p=0.109 n=10+10) Match/Hard1/1K-8 232µs ± 1% 229µs ± 1% -1.31% (p=0.000 n=10+9) Match/Hard1/32K-8 7.41ms ± 2% 7.41ms ± 0% ~ (p=0.237 n=10+8) Match/Hard1/1M-8 238ms ± 1% 238ms ± 0% ~ (p=0.481 n=10+10) Match/Hard1/32M-8 7.69s ± 1% 7.61s ± 0% -1.00% (p=0.000 n=10+10) Fixes #31329 Change-Id: I04640e8c59178ec8b3106e13ace9b109b6bdbc25 Reviewed-on: https://go-review.googlesource.com/c/go/+/171023Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Rob Pike <r@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
Showing
Please register or sign in to comment