- 25 Apr, 2018 22 commits
-
-
Josh Bleecher Snyder authored
The previous change sped up the pure computation form of LeadingZeros8. This places it somewhat close to the table lookup form. Depending on something that varies from toolchain to toolchain (alignment, perhaps?), the slowdown from ditching the table lookup is either 20% or 5%.

This benchmark is the best case scenario for the table lookup: It is in the L1 cache already.

I think we're close enough that we can switch to the computational version, and trust that the memory effects and binary size savings will be worth it.

Code:

    func f8(x uint8) { z = bits.LeadingZeros8(x) }

Before:

    "".f8 STEXT nosplit size=34 args=0x8 locals=0x0
    0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
    0x0005 00005 (x.go:7) MOVBLZX AL, AX
    0x0008 00008 (x.go:7) LEAQ math/bits.len8tab(SB), CX
    0x000f 00015 (x.go:7) MOVBLZX (CX)(AX*1), AX
    0x0013 00019 (x.go:7) ADDQ $-8, AX
    0x0017 00023 (x.go:7) NEGQ AX
    0x001a 00026 (x.go:7) MOVQ AX, "".z(SB)
    0x0021 00033 (x.go:7) RET

After:

    "".f8 STEXT nosplit size=30 args=0x8 locals=0x0
    0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
    0x0005 00005 (x.go:7) MOVBLZX AL, AX
    0x0008 00008 (x.go:7) LEAL 1(AX)(AX*1), AX
    0x000c 00012 (x.go:7) BSRL AX, AX
    0x000f 00015 (x.go:7) ADDQ $-8, AX
    0x0013 00019 (x.go:7) NEGQ AX
    0x0016 00022 (x.go:7) MOVQ AX, "".z(SB)
    0x001d 00029 (x.go:7) RET

Change-Id: Icc7db50a7820fb9a3da8a816d6b6940d7f8e193e
Reviewed-on: https://go-review.googlesource.com/108942
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
-
Josh Bleecher Snyder authored
Introduce Len8 and Len16 ops and provide optimized lowerings for them. amd64 only for this CL, although it wouldn't surprise me if other architectures also admit of optimized lowerings.

Also use and optimize the Len32 lowering, along the same lines.

Leave Len8 unused for the moment; a subsequent CL will enable it.

For 16 and 32 bits, this leads to a speed-up.

    name              old time/op  new time/op  delta
    LeadingZeros16-8  1.42ns ± 5%  1.23ns ± 5%  -13.42%  (p=0.000 n=20+20)
    LeadingZeros32-8  1.25ns ± 5%  1.03ns ± 5%  -17.63%  (p=0.000 n=20+16)

Code:

    func f16(x uint16) { z = bits.LeadingZeros16(x) }
    func f32(x uint32) { z = bits.LeadingZeros32(x) }

Before:

    "".f16 STEXT nosplit size=38 args=0x8 locals=0x0
    0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
    0x0005 00005 (x.go:8) MOVWLZX AX, AX
    0x0008 00008 (x.go:8) BSRQ AX, AX
    0x000c 00012 (x.go:8) MOVQ $-1, CX
    0x0013 00019 (x.go:8) CMOVQEQ CX, AX
    0x0017 00023 (x.go:8) ADDQ $-15, AX
    0x001b 00027 (x.go:8) NEGQ AX
    0x001e 00030 (x.go:8) MOVQ AX, "".z(SB)
    0x0025 00037 (x.go:8) RET

    "".f32 STEXT nosplit size=34 args=0x8 locals=0x0
    0x0000 00000 (x.go:9) TEXT "".f32(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:9) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:9) MOVL "".x+8(SP), AX
    0x0004 00004 (x.go:9) BSRQ AX, AX
    0x0008 00008 (x.go:9) MOVQ $-1, CX
    0x000f 00015 (x.go:9) CMOVQEQ CX, AX
    0x0013 00019 (x.go:9) ADDQ $-31, AX
    0x0017 00023 (x.go:9) NEGQ AX
    0x001a 00026 (x.go:9) MOVQ AX, "".z(SB)
    0x0021 00033 (x.go:9) RET

After:

    "".f16 STEXT nosplit size=30 args=0x8 locals=0x0
    0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
    0x0005 00005 (x.go:8) MOVWLZX AX, AX
    0x0008 00008 (x.go:8) LEAL 1(AX)(AX*1), AX
    0x000c 00012 (x.go:8) BSRL AX, AX
    0x000f 00015 (x.go:8) ADDQ $-16, AX
    0x0013 00019 (x.go:8) NEGQ AX
    0x0016 00022 (x.go:8) MOVQ AX, "".z(SB)
    0x001d 00029 (x.go:8) RET

    "".f32 STEXT nosplit size=28 args=0x8 locals=0x0
    0x0000 00000 (x.go:9) TEXT "".f32(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:9) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:9) MOVL "".x+8(SP), AX
    0x0004 00004 (x.go:9) LEAQ 1(AX)(AX*1), AX
    0x0009 00009 (x.go:9) BSRQ AX, AX
    0x000d 00013 (x.go:9) ADDQ $-32, AX
    0x0011 00017 (x.go:9) NEGQ AX
    0x0014 00020 (x.go:9) MOVQ AX, "".z(SB)
    0x001b 00027 (x.go:9) RET

Change-Id: I6c93c173752a7bfdeab8be30777ae05a736e1f4b
Reviewed-on: https://go-review.googlesource.com/108941
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
-
Josh Bleecher Snyder authored
Introduce Ctz8 and Ctz16 ops and provide optimized lowerings for them. amd64 only for this CL, although it wouldn't surprise me if other architectures also admit of optimized lowerings.

    name               old time/op  new time/op  delta
    TrailingZeros8-8   1.33ns ± 6%  0.84ns ± 3%  -36.90%  (p=0.000 n=20+20)
    TrailingZeros16-8  1.26ns ± 5%  0.84ns ± 5%  -33.50%  (p=0.000 n=20+18)

Code:

    func f8(x uint8) { z = bits.TrailingZeros8(x) }
    func f16(x uint16) { z = bits.TrailingZeros16(x) }

Before:

    "".f8 STEXT nosplit size=34 args=0x8 locals=0x0
    0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
    0x0005 00005 (x.go:7) MOVBLZX AL, AX
    0x0008 00008 (x.go:7) BTSQ $8, AX
    0x000d 00013 (x.go:7) BSFQ AX, AX
    0x0011 00017 (x.go:7) MOVL $64, CX
    0x0016 00022 (x.go:7) CMOVQEQ CX, AX
    0x001a 00026 (x.go:7) MOVQ AX, "".z(SB)
    0x0021 00033 (x.go:7) RET

    "".f16 STEXT nosplit size=34 args=0x8 locals=0x0
    0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
    0x0005 00005 (x.go:8) MOVWLZX AX, AX
    0x0008 00008 (x.go:8) BTSQ $16, AX
    0x000d 00013 (x.go:8) BSFQ AX, AX
    0x0011 00017 (x.go:8) MOVL $64, CX
    0x0016 00022 (x.go:8) CMOVQEQ CX, AX
    0x001a 00026 (x.go:8) MOVQ AX, "".z(SB)
    0x0021 00033 (x.go:8) RET

After:

    "".f8 STEXT nosplit size=20 args=0x8 locals=0x0
    0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
    0x0005 00005 (x.go:7) BTSL $8, AX
    0x0009 00009 (x.go:7) BSFL AX, AX
    0x000c 00012 (x.go:7) MOVQ AX, "".z(SB)
    0x0013 00019 (x.go:7) RET

    "".f16 STEXT nosplit size=20 args=0x8 locals=0x0
    0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
    0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
    0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
    0x0005 00005 (x.go:8) BTSL $16, AX
    0x0009 00009 (x.go:8) BSFL AX, AX
    0x000c 00012 (x.go:8) MOVQ AX, "".z(SB)
    0x0013 00019 (x.go:8) RET

Change-Id: I0551e357348de2b724737d569afd6ac9f5c3aa11
Reviewed-on: https://go-review.googlesource.com/108940
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
-
Russ Cox authored
It's going to grow. Change-Id: I4f5d3cce6e03250508d1ae0981a6d82a4192ae31 Reviewed-on: https://go-review.googlesource.com/107915 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
-
Russ Cox authored
This gives an easy way to query properties of all the deps of a set of packages, in a single go list invocation. Go list has already done the hard work of loading these packages, so exposing them is more efficient than requiring a second invocation. This will be helpful for tools asking cmd/go about build information. Change-Id: I90798e386246b24aad92dd13cb9e3788c7d30e91 Reviewed-on: https://go-review.googlesource.com/107776 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
-
Ian Lance Taylor authored
The test has been flaky, probably due to EAGAIN, but let's find out for sure. Updates #25078 Change-Id: I5a5b14bfc52cb43f25f07ca7d207b61ae9d4f944 Reviewed-on: https://go-review.googlesource.com/109359 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
-
Russ Cox authored
Found by pending CL to make cmd/vet auto-detect printf wrappers. Change-Id: I6b5ba8f9c301dd2d7086c152cf2e54a68b012208 Reviewed-on: https://go-review.googlesource.com/109345 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Russ Cox authored
Found by pending CL to make cmd/vet auto-detect printf wrappers. Change-Id: I2ad06647b7b41cf68859820a60eeac2e689ca2e6 Reviewed-on: https://go-review.googlesource.com/109344 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Russ Cox authored
Found by pending CL to make cmd/vet auto-detect printf wrappers. Change-Id: I1928a5bcd7885cdd950ce81b7d0ba07fbad3bf88 Reviewed-on: https://go-review.googlesource.com/109343 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Russ Cox authored
If X depends on Y and X was installed but Y is only present in the cache (as happens when you "go install X") then we should report X as up-to-date, not as stale. This applies whether X is a package or a main binary. Fixes #24558. Fixes #23818. Change-Id: I26a0b375b1f7f7ac909cc0db68e92f4e04529208 Reviewed-on: https://go-review.googlesource.com/107957 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
-
Matthew Dempsky authored
Passes toolstash-check. Change-Id: Idc00f15e369cad62cb8f7a09fd0ef09abd3fcdef Reviewed-on: https://go-review.googlesource.com/109356 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Robert Griesemer authored
This was already partially fixed by commit 99843e22 (https://go-review.googlesource.com/c/go/+/96376), but we missed a couple of places where we also need to propagate the scope. Fixes #25008. Change-Id: I041fa74d1f6d3b5a8edb922efa126ff1dacd7900 Reviewed-on: https://go-review.googlesource.com/109139 Reviewed-by: Alan Donovan <adonovan@google.com>
-
Russ Cox authored
Don't chase import cycles forever preparing list JSON. Fixes #24086. Change-Id: Ia1139d0c8d813d068c367a8baee59d240a545617 Reviewed-on: https://go-review.googlesource.com/108016 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
-
Ilya Tocar authored
Currently branchelim is too aggressive in converting branches to conditional movs. On most x86 CPUs the resulting cmov* are more expensive than most simple instructions, because they have a latency of 2 instead of 1. So by teaching branchelim to limit the number of CondSelects and to consider the possible need to recalculate flags, we can achieve huge speed-ups (fix big regressions).

In package strings:

    ToUpper/#00-6  10.9ns ± 1%  11.8ns ± 1%  +8.29%  (p=0.000 n=10+10)
    ToUpper/ONLYUPPER-6  27.9ns ± 0%  27.8ns ± 1%  ~  (p=0.106 n=9+10)
    ToUpper/abc-6  90.3ns ± 2%  90.3ns ± 1%  ~  (p=0.956 n=10+10)
    ToUpper/AbC123-6  110ns ± 1%  113ns ± 2%  +3.19%  (p=0.000 n=10+10)
    ToUpper/azAZ09_-6  109ns ± 2%  110ns ± 1%  ~  (p=0.174 n=10+10)
    ToUpper/longStrinGwitHmixofsmaLLandcAps-6  228ns ± 1%  233ns ± 2%  +2.11%  (p=0.000 n=10+10)
    ToUpper/longɐstringɐwithɐnonasciiⱯchars-6  907ns ± 1%  709ns ± 2%  -21.85%  (p=0.000 n=10+10)
    ToUpper/ɐɐɐɐɐ-6  793ns ± 2%  562ns ± 2%  -29.06%  (p=0.000 n=10+10)

In fmt:

    SprintfQuoteString-6  272ns ± 2%  195ns ± 4%  -28.39%  (p=0.000 n=10+10)

And in archive/zip:

    CompressedZipGarbage-6  4.00ms ± 0%  4.03ms ± 0%  +0.71%  (p=0.000 n=10+10)
    Zip64Test-6  27.5ms ± 1%  24.2ms ± 0%  -12.01%  (p=0.000 n=10+10)
    Zip64TestSizes/4096-6  10.4µs ±12%  10.7µs ± 8%  ~  (p=0.068 n=10+8)
    Zip64TestSizes/1048576-6  79.0µs ± 3%  70.2µs ± 2%  -11.14%  (p=0.000 n=10+10)
    Zip64TestSizes/67108864-6  4.64ms ± 1%  4.11ms ± 1%  -11.43%  (p=0.000 n=10+10)

As far as I can tell, no cases with significant gain from cmov have regressed.

On go1 it looks like most changes are unrelated, but I've verified that TimeFormat really switched from cmov to branch in a hot spot. Full results below:

    name  old time/op  new time/op  delta
    BinaryTree17-6  4.42s ± 1%  4.44s ± 1%  ~  (p=0.075 n=10+10)
    Fannkuch11-6  4.23s ± 0%  4.18s ± 0%  -1.16%  (p=0.000 n=8+10)
    FmtFprintfEmpty-6  67.5ns ± 2%  67.5ns ± 0%  ~  (p=0.950 n=10+7)
    FmtFprintfString-6  117ns ± 2%  119ns ± 1%  +1.07%  (p=0.045 n=9+10)
    FmtFprintfInt-6  122ns ± 0%  123ns ± 2%  ~  (p=0.825 n=8+10)
    FmtFprintfIntInt-6  188ns ± 1%  187ns ± 1%  -0.85%  (p=0.001 n=10+10)
    FmtFprintfPrefixedInt-6  223ns ± 1%  226ns ± 1%  +1.40%  (p=0.000 n=9+10)
    FmtFprintfFloat-6  380ns ± 1%  379ns ± 0%  ~  (p=0.350 n=9+7)
    FmtManyArgs-6  784ns ± 0%  790ns ± 1%  +0.81%  (p=0.000 n=10+9)
    GobDecode-6  10.7ms ± 1%  10.8ms ± 0%  +0.68%  (p=0.000 n=10+10)
    GobEncode-6  8.95ms ± 0%  8.94ms ± 0%  ~  (p=0.436 n=10+10)
    Gzip-6  378ms ± 0%  378ms ± 0%  ~  (p=0.696 n=8+10)
    Gunzip-6  60.5ms ± 0%  60.9ms ± 0%  +0.73%  (p=0.000 n=8+10)
    HTTPClientServer-6  109µs ± 3%  111µs ± 2%  +2.53%  (p=0.000 n=10+10)
    JSONEncode-6  20.2ms ± 0%  20.2ms ± 0%  ~  (p=0.382 n=8+8)
    JSONDecode-6  85.9ms ± 1%  84.5ms ± 0%  -1.59%  (p=0.000 n=10+8)
    Mandelbrot200-6  6.89ms ± 0%  6.85ms ± 1%  -0.49%  (p=0.000 n=9+10)
    GoParse-6  5.49ms ± 0%  5.40ms ± 0%  -1.63%  (p=0.000 n=8+10)
    RegexpMatchEasy0_32-6  126ns ± 1%  129ns ± 1%  +2.14%  (p=0.000 n=10+10)
    RegexpMatchEasy0_1K-6  320ns ± 1%  317ns ± 2%  ~  (p=0.089 n=10+10)
    RegexpMatchEasy1_32-6  119ns ± 2%  121ns ± 4%  ~  (p=0.591 n=10+10)
    RegexpMatchEasy1_1K-6  544ns ± 1%  541ns ± 0%  -0.67%  (p=0.020 n=8+8)
    RegexpMatchMedium_32-6  184ns ± 1%  184ns ± 1%  ~  (p=0.360 n=10+10)
    RegexpMatchMedium_1K-6  57.7µs ± 2%  58.3µs ± 1%  +1.12%  (p=0.022 n=10+10)
    RegexpMatchHard_32-6  2.72µs ± 5%  2.70µs ± 0%  ~  (p=0.166 n=10+8)
    RegexpMatchHard_1K-6  80.2µs ± 0%  81.0µs ± 0%  +1.01%  (p=0.000 n=8+10)
    Revcomp-6  607ms ± 0%  601ms ± 2%  -1.00%  (p=0.006 n=8+10)
    Template-6  93.1ms ± 1%  92.6ms ± 0%  -0.56%  (p=0.000 n=9+9)
    TimeParse-6  472ns ± 0%  470ns ± 0%  -0.28%  (p=0.001 n=9+9)
    TimeFormat-6  546ns ± 0%  511ns ± 0%  -6.41%  (p=0.001 n=7+7)
    [Geo mean]  76.4µs  76.3µs  -0.12%

    name  old speed  new speed  delta
    GobDecode-6  71.5MB/s ± 1%  71.1MB/s ± 0%  -0.68%  (p=0.000 n=10+10)
    GobEncode-6  85.8MB/s ± 0%  85.8MB/s ± 0%  ~  (p=0.425 n=10+10)
    Gzip-6  51.3MB/s ± 0%  51.3MB/s ± 0%  ~  (p=0.680 n=8+10)
    Gunzip-6  321MB/s ± 0%  318MB/s ± 0%  -0.73%  (p=0.000 n=8+10)
    JSONEncode-6  95.9MB/s ± 0%  96.0MB/s ± 0%  ~  (p=0.367 n=8+8)
    JSONDecode-6  22.6MB/s ± 1%  22.9MB/s ± 0%  +1.62%  (p=0.000 n=10+8)
    GoParse-6  10.6MB/s ± 0%  10.7MB/s ± 0%  +1.64%  (p=0.000 n=8+10)
    RegexpMatchEasy0_32-6  252MB/s ± 1%  247MB/s ± 1%  -2.22%  (p=0.000 n=10+10)
    RegexpMatchEasy0_1K-6  3.19GB/s ± 1%  3.22GB/s ± 2%  ~  (p=0.105 n=10+10)
    RegexpMatchEasy1_32-6  267MB/s ± 2%  264MB/s ± 4%  ~  (p=0.481 n=10+10)
    RegexpMatchEasy1_1K-6  1.88GB/s ± 1%  1.89GB/s ± 0%  +0.62%  (p=0.038 n=8+8)
    RegexpMatchMedium_32-6  5.41MB/s ± 2%  5.43MB/s ± 0%  ~  (p=0.339 n=10+8)
    RegexpMatchMedium_1K-6  17.8MB/s ± 1%  17.6MB/s ± 1%  -1.12%  (p=0.029 n=10+10)
    RegexpMatchHard_32-6  11.8MB/s ± 5%  11.8MB/s ± 0%  ~  (p=0.163 n=10+8)
    RegexpMatchHard_1K-6  12.8MB/s ± 0%  12.6MB/s ± 0%  -1.06%  (p=0.000 n=7+10)
    Revcomp-6  419MB/s ± 0%  423MB/s ± 2%  +1.02%  (p=0.006 n=8+10)
    Template-6  20.9MB/s ± 0%  21.0MB/s ± 0%  +0.53%  (p=0.000 n=8+8)
    [Geo mean]  77.0MB/s  77.0MB/s  +0.05%

diff --git a/src/cmd/compile/internal/ssa/branchelim.go b/src/cmd/compile/internal/ssa/branchelim.go

Change-Id: Ibdffa9ea9b4c72668617ce3202ec4a83a1cd59be
Reviewed-on: https://go-review.googlesource.com/107936
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
-
Russ Cox authored
Change-Id: I831775db5de92d211495acc012fc4366c7c84851 Reviewed-on: https://go-review.googlesource.com/109335 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
-
Daniel Martí authored
Now that vet always has type information, there's no reason to use string handling on type names to gather information about them, such as whether or not they are a local type. The semantics remain the same - the only difference should be that the implementation is less fragile and simpler. Change-Id: I71386b4196922e4c9f2653d90abc382efbf01b3c Reviewed-on: https://go-review.googlesource.com/95915 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alan Donovan <adonovan@google.com>
-
Martin Möhrmann authored
The file name suffix arm64 already limits the file to be built only on arm64. Change-Id: I33db713041b6dec9eb00889bac3b54c727e90743 Reviewed-on: https://go-review.googlesource.com/108986 Run-TryBot: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Russ Cox authored
The test wants to check that copies of Cond are detected at runtime. Make a copy that isn't detected by vet at compile time. Change-Id: I933ab1003585f75ba96723563107f1ba8126cb72 Reviewed-on: https://go-review.googlesource.com/108557 Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Russ Cox authored
The existing text makes it seem like there's no way to use GitHub over HTTPS. There is. Explain that. Also, the existing text suggests explicit checkout into $GOPATH, which is not going to work in the new module world. Drop that alternative. Also, the existing text uses pushInsteadOf instead of insteadOf, which would have the effect of being able to push to a private repo but not clone it in the first place. That seems not helpful, so suggest insteadOf instead. Fixes #18927. Change-Id: Ic358b66f88064b53067d174a2a1591ac8bf96c88 Reviewed-on: https://go-review.googlesource.com/107775 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Russ Cox authored
Previously, 's' was only written to, never read, which is disallowed by the spec. cmd/compile has a bug where it doesn't notice this when a closure is involved, but go/types does notice, which was making "go vet" fail. This CL moves the variable into the closure and also makes sure to use it. Change-Id: I2d83fb6b5c1c9018df03533e966cbdf455f83bf9 Reviewed-on: https://go-review.googlesource.com/108556 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Ian Lance Taylor authored
We were using absolute paths in the #line directives in the export header file. This makes the header file change if you move GOPATH. The absolute paths aren't helpful for the final user, which is some C program elsewhere. Fixes #24945 Change-Id: I2da32c9b477df578bd5087435a03fe97abe462e3 Reviewed-on: https://go-review.googlesource.com/108315 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Matthew Dempsky authored
Change-Id: Ifd51636c9254de51b8a21371d7507a9481bcca0a Reviewed-on: https://go-review.googlesource.com/109142 Reviewed-by: Robert Griesemer <gri@golang.org>
-
- 24 Apr, 2018 18 commits
-
-
Keith Randall authored
Get rid of a bunch of stuff we've already done. Change-Id: Ibae4be7535ddb58590a072a2390c5f3e948c2fd7 Reviewed-on: https://go-review.googlesource.com/109136 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Matthew Dempsky authored
This reduces the API surface of Type slightly (for #25056), but also makes it more consistent with the reflect and go/types APIs. Passes toolstash-check. Change-Id: Ief9a8eb461ae6e88895f347e2a1b7b8a62423222 Reviewed-on: https://go-review.googlesource.com/109138 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Matthew Dempsky authored
This was an artifact from when we had a separate ssa.Type interface to break circular dependency between packages ssa and gc. It's no longer needed now that package ssa directly uses package types. Change-Id: I6a93e5d79082815f7f0eb89507381969cc6cb403 Reviewed-on: https://go-review.googlesource.com/109137 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Hana Kim authored
A task may have other user annotation events after the task ends. So far, task.lastTimestamp returned the task end event's timestamp if the event was available. This change introduces task.endTimestamp for that and makes task.lastTimestamp return the timestamp of the last seen event if the task has ended. If the task has not ended, both return the last timestamp of the entire trace, assuming the task is still active. This fixes the task-oriented trace view mode so that it does not drop user annotation instances when they appear outside a task's lifespan. Adds a test. Change-Id: Iba1062914f224edd521b9ee55c6cd5e180e55359 Reviewed-on: https://go-review.googlesource.com/109175 Reviewed-by: Heschi Kreinick <heschi@google.com>
-
erifan01 authored
This CL adjusts the order of the branch instructions of the code to make it easier for the LIKELY branch to happen.

Benchmarks:

    name  old time/op  new time/op  delta

    pkg:strings goos:linux goarch:arm64
    IndexHard2-8  2.17ms ± 1%  1.23ms ± 0%  -43.34%  (p=0.008 n=5+5)
    CountHard2-8  2.13ms ± 1%  1.21ms ± 2%  -43.31%  (p=0.008 n=5+5)

    pkg:bytes goos:linux goarch:arm64
    IndexRune/4M-8  661µs ±22%  513µs ± 0%  -22.32%  (p=0.008 n=5+5)
    IndexEasy/4M-8  672µs ±23%  513µs ± 0%  -23.71%  (p=0.016 n=5+4)

Change-Id: Ib96f095edf77747edc8a971e79f5c1428e5808ce
Reviewed-on: https://go-review.googlesource.com/109015
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Heschi Kreinick authored
For ppc64, skip -linkmode=external per https://go-review.googlesource.com/c/go/+/106775#message-f95b9bd716e3d9ebb3f47a50492cde9f2972e859 For Solaris, apparently type.* isn't the same as runtime.types. I don't know why, but runtime.types is what goes into moduledata, and so it's definitely the more correct thing to use. Fixes: #24983 Change-Id: I6b465ac7b8f91ce55a63acbd7fe76e4a2dbb6f22 Reviewed-on: https://go-review.googlesource.com/108955 Run-TryBot: Heschi Kreinick <heschi@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Josh Bleecher Snyder authored
Before:

    live values at end of each block
    b1: v3 v2 v7 avoid=0
    b2: v3 v13 avoid=81
    b3: v19[AX] v3 avoid=81
    b6: avoid=0
    b7: avoid=0
    b5: avoid=0
    b4: v3 v18 avoid=81

After:

    live values at end of each block
    b1: v3 v2 v7
    b2: v3 v13 avoid=AX DI
    b3: v19[AX] v3 avoid=AX DI
    b6:
    b7:
    b5:
    b4: v3 v18 avoid=AX DI

Change-Id: Ibec5c76a16151832b8d49a21c640699fdc9a9d28
Reviewed-on: https://go-review.googlesource.com/109000
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Hana Kim authored
Also, avoid Region creation when tracing is disabled. An unfortunate side effect of this change is that we no longer trace pre-existing regions in tracing, but we can add the feature in the future when we find it useful and justifiable. Until then, let's avoid the overhead from this low-level API use as much as possible.

    goos: linux
    goarch: amd64
    pkg: runtime/trace

    // Trace disabled
    BenchmarkStartRegion-12  2000000000  0.66 ns/op  0 B/op  0 allocs/op
    BenchmarkNewTask-12  30000000  40.4 ns/op  56 B/op  2 allocs/op

    // Trace enabled, -trace=/dev/null
    BenchmarkStartRegion-12  5000000  287 ns/op  32 B/op  1 allocs/op
    BenchmarkNewTask-12  5000000  283 ns/op  56 B/op  2 allocs/op

Also, skip other tests if tracing is already enabled.

Change-Id: Id3028d60b5642fcab4b09a74fd7d79361a3861e5
Reviewed-on: https://go-review.googlesource.com/109115
Reviewed-by: Peter Weinberger <pjw@google.com>
-
Hana Kim authored
"Span" is a commonly used term in many distributed tracing systems (Dapper, OpenCensus, OpenTracing, ...). They use it to refer to a period of time, not necessarily tied to the execution of an underlying processor, thread, or goroutine, unlike the "Span" of the runtime/trace package.

Since distributed tracing and Go runtime execution tracing are already similar enough to cause confusion, this CL attempts to avoid using the same word if possible.

"Region" is used in a certain tracing system to refer to a code region, which is pretty close to what runtime/trace.Span currently refers to. So, replace that.

https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions

This CL also tweaks APIs a bit based on jbd and heschi's comments:

    NewContext -> NewTask
      and it now returns a Task object that exports End method.

    StartSpan -> StartRegion
      and it now returns a Region object that exports End method.

    Also, changed WithSpan to WithRegion and it now takes func() with no context.

Another thought is to get rid of WithRegion. It is a nice concept but in practice it seems problematic (a lot of code churn, and polluted stack traces). Already, the tracing concept is very low level, and we hope this API will be used with great care.

Recommended usage will be

    defer trace.StartRegion(ctx, "someRegion").End()

Left old APIs untouched in this CL. Once their usages are cleaned up, they will be removed in a separate CL.

Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a
Reviewed-on: https://go-review.googlesource.com/108296
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: JBD <jbd@google.com>
-
Ilya Tocar authored
We currently rewrite (TESTQ (MOVQconst [c] x)) into (TESTQconst [c] x) and (TESTQconst [-1] x) into (TESTQ x x). If x is a (MOVQconst [-1]), we get stuck in an endless rewrite loop. Don't perform the rewrite in such cases. Fixes #25006 Change-Id: I77f561ba2605fc104f1e5d5c57f32e9d67a2c000 Reviewed-on: https://go-review.googlesource.com/108879 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Hana (Hyang-Ah) Kim authored
Go's heap profile contains four kinds of samples (inuse_space, inuse_objects, alloc_space, and alloc_objects). The pprof tool by default chooses inuse_space (the bytes of live, in-use objects). When analyzing the current memory usage, the choice of inuse_space as the default may be useful, but in some cases users are more interested in analyzing the total allocation statistics throughout the program execution. For example, when we analyze the memory profile from a benchmark or program test run, we are more likely interested in the whole allocation history than in the live heap snapshot at the end of the test or benchmark. The pprof tool provides flags to control which sample type is used for analysis. However, it is one of the less-known features of pprof, and we believe it's better to choose the right type of samples as the default when producing the profile. This CL introduces a new type of profile, "allocs", which is the same as the "heap" profile but marks alloc_space as the default type, unlike heap profiles, which use inuse_space as the default type. The 'go test -memprofile=...' command is changed to use the new "allocs" profile type instead of the traditional "heap" profile. Fixes #24443 Change-Id: I012dd4b6dcacd45644d7345509936b8380b6fbd9 Reviewed-on: https://go-review.googlesource.com/102696 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Russ Cox <rsc@golang.org>
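A program can also request the "allocs" profile by name through runtime/pprof. The sketch below assumes a Go release that includes this change (the profile is registered under the name "allocs" from Go 1.11 onward); the helper name `writeAllocsProfile` is made up for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// writeAllocsProfile looks up the "allocs" profile and writes it in the
// compressed protobuf format (debug level 0) into an in-memory buffer.
// (Hypothetical helper, named for this sketch only.)
func writeAllocsProfile() (*bytes.Buffer, error) {
	p := pprof.Lookup("allocs")
	if p == nil {
		return nil, fmt.Errorf("allocs profile not available in this Go version")
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	buf, err := writeAllocsProfile()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("wrote %d bytes of allocs profile\n", buf.Len())
}
```

In a real program the buffer would usually be a file that is later opened with the pprof tool, which then defaults to the alloc_space sample type for this profile.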
-
quasilyte authored
Memory arguments for debug/control register moves are a minefield for the programmer: not useful, but can lead to errors. See the referenced issue for a detailed explanation. Fixes #24981 Change-Id: I918e81cd4a8b1dfcfc9023cdfc3de45abe29e749 Reviewed-on: https://go-review.googlesource.com/107075 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
-
isharipo authored
gc/ssa.go initializes SP and SB values with TUINTPTR type. Assign the same type in SSA tests and modify check.go to catch mismatching types for those ops. This makes SSA tests more consistent. Change-Id: I798440d57d00fb949d1a0cd796759c9b82a934bd Reviewed-on: https://go-review.googlesource.com/106658 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
ludweeg authored
Fixes go lint warning. Change-Id: I5a7485a4c8316b81e6aa50b95fe75e424f2fcedc Reviewed-on: https://go-review.googlesource.com/109055 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Andrei Tudor Călin authored
This change adds support for the splice system call on Linux, for the purpose of optimizing (*TCPConn).ReadFrom by reducing copies of data from and to userspace. It does so by creating a temporary pipe and splicing data from the source connection to the pipe, then from the pipe to the destination connection. The pipe serves as an in-kernel buffer for the data transfer.

No new API is added to package net, but a new Splice function is added to package internal/poll, because using splice requires help from the network poller. Users of the net package should benefit from the change transparently.

This change only enables the optimization if the Reader in ReadFrom is a TCP connection. Since splice is a more general interface, it could, in theory, also be enabled if the Reader were a unix socket, or the read half of a pipe.

However, benchmarks show that enabling it for unix sockets is most likely not a net performance gain. The tcp <- unix case is also fairly unlikely to be used very much by users of package net.

Enabling the optimization for pipes is also problematic from an implementation perspective, since package net cannot easily get at the *poll.FD of an *os.File. A possible solution to this would be to dup the pipe file descriptor, register the duped descriptor with the network poller, and work on that *poll.FD instead of the original. However, this seems too intrusive, so it has not been done. If there was a clean way to do it, it would probably be worth doing, since splicing from a pipe to a socket can be done directly.

Therefore, this patch only enables the optimization for what is likely the most common use case: tcp <- tcp.

The following benchmark compares the performance of the previous userspace genericReadFrom code path to the new optimized code path. The sub-benchmarks represent chunk sizes used by the writer on the other end of the Reader passed to ReadFrom.

    benchmark  old ns/op  new ns/op  delta
    BenchmarkTCPReadFrom/1024-4  4727  4954  +4.80%
    BenchmarkTCPReadFrom/2048-4  4389  4301  -2.01%
    BenchmarkTCPReadFrom/4096-4  4606  4534  -1.56%
    BenchmarkTCPReadFrom/8192-4  5219  4779  -8.43%
    BenchmarkTCPReadFrom/16384-4  8708  8008  -8.04%
    BenchmarkTCPReadFrom/32768-4  16349  14973  -8.42%
    BenchmarkTCPReadFrom/65536-4  35246  27406  -22.24%
    BenchmarkTCPReadFrom/131072-4  72920  52382  -28.17%
    BenchmarkTCPReadFrom/262144-4  149311  95094  -36.31%
    BenchmarkTCPReadFrom/524288-4  306704  181856  -40.71%
    BenchmarkTCPReadFrom/1048576-4  674174  357406  -46.99%

    benchmark  old MB/s  new MB/s  speedup
    BenchmarkTCPReadFrom/1024-4  216.62  206.69  0.95x
    BenchmarkTCPReadFrom/2048-4  466.61  476.08  1.02x
    BenchmarkTCPReadFrom/4096-4  889.09  903.31  1.02x
    BenchmarkTCPReadFrom/8192-4  1569.40  1714.06  1.09x
    BenchmarkTCPReadFrom/16384-4  1881.42  2045.84  1.09x
    BenchmarkTCPReadFrom/32768-4  2004.18  2188.41  1.09x
    BenchmarkTCPReadFrom/65536-4  1859.38  2391.25  1.29x
    BenchmarkTCPReadFrom/131072-4  1797.46  2502.21  1.39x
    BenchmarkTCPReadFrom/262144-4  1755.69  2756.68  1.57x
    BenchmarkTCPReadFrom/524288-4  1709.42  2882.98  1.69x
    BenchmarkTCPReadFrom/1048576-4  1555.35  2933.84  1.89x

Fixes #10948

Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Wèi Cōngruì authored
The caller of epollctl expects it to return a negative errno value, but it returns a positive errno value on mips, mips64 and ppc64. The change fixes this. Updates #23446 Change-Id: Ie6372eca6c23de21964caaaa433c9a45ef93531e Reviewed-on: https://go-review.googlesource.com/89235 Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Ian Lance Taylor authored
Ever since we added sleep to the runtime back in 2008, we've implemented it on GNU/Linux with the select (or pselect or pselect6) system call. But the Linux kernel has a nanosleep system call, which should be a tiny bit more efficient since it doesn't have to check to see whether there are any file descriptors. So use it. Change-Id: Icc3430baca46b082a4d33f97c6c47e25fa91cb9a Reviewed-on: https://go-review.googlesource.com/108538 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Matthew Dempsky authored
Change-Id: Id018eeb79afbe2c695a583b3845cfbc1aab08388 Reviewed-on: https://go-review.googlesource.com/106797 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
-