- 01 May, 2016 19 commits
-
-
Keith Randall authored
:= is the wrong thing here. The new variable masks the old variable so we allocate the slice afresh each time around the loop. Change-Id: I759c30e1bfa88f40decca6dd7d1e051e14ca0844 Reviewed-on: https://go-review.googlesource.com/22679 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Minux Ma <minux@golang.org>
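The shadowing problem described above can be reproduced in isolation. The following is a minimal sketch, not the code touched by the commit; `countAllocs` and its parameters are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// countAllocs reports how many times a buffer gets allocated across n loop
// iterations. With shadow=true the loop body uses ":=", which declares a new
// variable scoped to the inner block, so the outer buf never changes and the
// allocation repeats on every iteration.
func countAllocs(n int, shadow bool) int {
	var buf []byte
	allocs := 0
	for i := 0; i < n; i++ {
		if cap(buf) < 64 {
			if shadow {
				buf := make([]byte, 64) // new variable; the outer buf stays nil
				_ = buf
			} else {
				buf = make([]byte, 64) // assigns to the outer variable
			}
			allocs++
		}
	}
	return allocs
}

func main() {
	fmt.Println(countAllocs(5, true))  // 5
	fmt.Println(countAllocs(5, false)) // 1
}
```

Replacing `:=` with `=` is enough to reuse the slice, which matches the fix the commit message describes.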
-
Brad Fitzpatrick authored
Change-Id: I753e62879a56582a9511e3f34fdeac929202efbf Reviewed-on: https://go-review.googlesource.com/22680 Reviewed-by: Ralph Corderoy <ralph@inputplus.co.uk> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Brad Fitzpatrick authored
The Transport's automatic gzip uncompression lost information in the process (the compressed Content-Length, if known). Normally that's okay, but it's not okay for reverse proxies, which have to be able to generate a valid HTTP response from the Transport's provided *Response. Reverse proxies should normally be disabling compression anyway and just piping the compressed data through, not wasting CPU cycles decompressing it. So also document that on the new Uncompressed field. Then, using the new field, fix Response.Write to not inject a bogus "Connection: close" header when it doesn't see a transfer encoding or content-length. Updates #15366 (the http2 side remains, once this is submitted) Change-Id: I476f40aa14cfa7aa7b3bf99021bebba4639f9640 Reviewed-on: https://go-review.googlesource.com/22671 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
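To see what the Uncompressed field reports, the sketch below fetches a gzip-encoded body from a throwaway `httptest` server, once with the Transport's automatic decompression and once with `DisableCompression` set, as a reverse proxy might. The helper name `fetch` and the payload are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// fetch requests a gzip-encoded body from a local test server and reports
// whether the Transport decompressed it and which Content-Length it exposed.
func fetch(disableCompression bool) (uncompressed bool, contentLength int64, err error) {
	var gz bytes.Buffer
	zw := gzip.NewWriter(&gz)
	zw.Write([]byte("hello, proxy"))
	zw.Close()
	payload := gz.Bytes()

	// The handler always answers with gzip and a known compressed length.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Encoding", "gzip")
		w.Header().Set("Content-Length", strconv.Itoa(len(payload)))
		w.Write(payload)
	}))
	defer srv.Close()

	tr := &http.Transport{DisableCompression: disableCompression}
	defer tr.CloseIdleConnections()

	resp, err := (&http.Client{Transport: tr}).Get(srv.URL)
	if err != nil {
		return false, 0, err
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.Uncompressed, resp.ContentLength, nil
}

func main() {
	for _, disable := range []bool{false, true} {
		u, n, err := fetch(disable)
		fmt.Println(disable, u, n, err)
	}
}
```

With automatic decompression the compressed length is no longer available and ContentLength is -1; with compression disabled the body and its length pass through untouched.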
-
Brad Fitzpatrick authored
This adds a context key named LocalAddrContextKey (for now, see #15229) to let users access the net.Addr of the net.Listener that accepted the connection that sent an HTTP request. This is similar to ServerContextKey which provides access to the *Server. (A Server may have multiple Listeners) Fixes #6732 Change-Id: I74296307b68aaaab8df7ad4a143e11b5227b5e62 Reviewed-on: https://go-review.googlesource.com/22672 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
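A handler can read the address from the request context with a type assertion. This is a minimal sketch against a local `httptest` server; the function name `localAddrSeenByHandler` is hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// localAddrSeenByHandler starts a test server, issues one request, and
// returns the address the handler found under http.LocalAddrContextKey
// together with the address the server is listening on.
func localAddrSeenByHandler() (seen, listening string, err error) {
	got := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		addr, _ := r.Context().Value(http.LocalAddrContextKey).(net.Addr)
		if addr != nil {
			got <- addr.String()
		} else {
			got <- ""
		}
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return "", "", err
	}
	resp.Body.Close()
	return <-got, srv.Listener.Addr().String(), nil
}

func main() {
	seen, listening, err := localAddrSeenByHandler()
	fmt.Println(seen, listening, err)
}
```

This is mostly useful when one Server is fed by several Listeners and the handler needs to know which one a request arrived on.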
-
Brad Fitzpatrick authored
Don't keep idle HTTP client connections open forever. Add a new knob, Transport.IdleConnTimeout, and make the default be 90 seconds. I figure 90 seconds is more than a minute, and less than infinite, and I figure enough code has things waking up once a minute polling APIs. This also removes the Transport's idleCount field which was unused and redundant with the size of the idleLRU map (which was actually used). Change-Id: Ibb698a9a9a26f28e00a20fe7ed23f4afb20c2322 Reviewed-on: https://go-review.googlesource.com/22670 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Brad Fitzpatrick authored
And add a test. Updates #12580 Change-Id: Ia7eaba09b8e7fd0eddbcaefb948d01ab10af876e Reviewed-on: https://go-review.googlesource.com/22659 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Brad Fitzpatrick authored
Fixes #15150 Change-Id: I1a892d5b0516a37dac050d3bb448e0a2571db16e Reviewed-on: https://go-review.googlesource.com/22658 Reviewed-by: Andrew Gerrand <adg@golang.org>
-
Josh Bleecher Snyder authored
Before this CL:

$ go test -bench=CompressedZipGarbage -count=5 -run=NONE archive/zip
BenchmarkCompressedZipGarbage-8 50 20677087 ns/op 42973 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 100 20584764 ns/op 24294 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 50 20859221 ns/op 42973 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 100 20901176 ns/op 24294 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 50 21282409 ns/op 42973 B/op 47 allocs/op

The B/op number is effectively meaningless. There is a surprisingly large one-time cost that gets divided by the number of iterations that your machine can get through in a second. This CL discards the first run, which helps. It is not a panacea. Running with -benchtime=10s will allow the sync.Pool to be emptied, which brings the problem back. However, since there are more iterations to divide the cost through, it’s not quite as bad, and running with a high benchtime is rare. This CL changes the meaning of the B/op number, which is unfortunate, since it won’t have the same order of magnitude as previous Go versions. But it wasn’t really comparable before anyway, since it didn’t have any reliable meaning at all.

After this CL:

$ go test -bench=CompressedZipGarbage -count=5 -run=NONE archive/zip
BenchmarkCompressedZipGarbage-8 100 20881890 ns/op 5616 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 50 20622757 ns/op 5616 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 50 20628193 ns/op 5616 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 100 20756612 ns/op 5616 B/op 47 allocs/op
BenchmarkCompressedZipGarbage-8 100 20639774 ns/op 5616 B/op 47 allocs/op

Change-Id: Iedee04f39328974c7fa272a6113d423e7ffce50f Reviewed-on: https://go-review.googlesource.com/22585 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Brad Fitzpatrick authored
Change-Id: I53dd5affc3a1e1f741fe44c7ce691bb2cd432764 Reviewed-on: https://go-review.googlesource.com/22657 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Cherry Zhang authored
A new relocation, R_ADDRMIPSTLS, is added; it resolves to the 16-bit offset of a TLS address on mips64x. Change-Id: Ic60d0e1ba49ff1c433cead242f5884677ab227a5 Reviewed-on: https://go-review.googlesource.com/19804 Reviewed-by: Minux Ma <minux@golang.org>
-
Austin Clements authored
This updates some comments that became out of date when we moved the mark bit out of the heap bitmap and started using the high bit for the first word as a scan/dead bit. Change-Id: I4a572d16db6114cadff006825466c1f18359f2db Reviewed-on: https://go-review.googlesource.com/22662 Reviewed-by: Rick Hudson <rlh@golang.org>
-
Cherry Zhang authored
The MIPS N64 ABI passes arguments in registers R4-R11 and the return value in R2. R16-R23, R28, R30 and F24-F31 are callee-save. gcc PIC code expects to be called with an indirect call through R25. Change-Id: I24f582b4b58e1891ba9fd606509990f95cca8051 Reviewed-on: https://go-review.googlesource.com/19805 Reviewed-by: Minux Ma <minux@golang.org>
-
Cherry Zhang authored
Fixes #12560 Change-Id: Ic2004fc7b09f2dbbf83c41f8c6307757c0e1676d Reviewed-on: https://go-review.googlesource.com/19803 Reviewed-by: Minux Ma <minux@golang.org>
-
Frits van Bommel authored
Factor out the Aux/AuxInt handling in (*Value).LongString() and use it in (*Value).LongHTML() as well. This especially improves readability of auxFloat32, auxFloat64, and auxSymValAndOff values which would otherwise be printed as opaque integers. This change also makes LongString() slightly less verbose by eliding offsets that are zero (as is very often the case). Additionally, ensure the HTML is interpreted as UTF-8 so that non-ASCII characters (especially the "middle dots" in some symbols) show up correctly. Change-Id: Ie26221df876faa056d322b3e423af63f33cd109d Reviewed-on: https://go-review.googlesource.com/22641 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Frits van Bommel <fvbommel@gmail.com>
-
Cherry Zhang authored
The SB register (R28) is introduced for accessing external addresses with shorter instruction sequences. It is loaded at entry points. External data within 2G of SB can be accessed this way. cmd/internal/obj: relocation R_ADDRMIPS is split into two relocations, R_ADDRMIPS and R_ADDRMIPSU, handling the low 16 bits and the "upper" 16 bits of external addresses, respectively, since the instructions may not be adjacent. It might be better if relocation Variant could be used. cmd/link/internal/mips64: support new relocations. cmd/compile/internal/mips64: reserve SB register. runtime: initialize SB register at entry points. Change-Id: I5f34868f88c5a9698c042a8a1f12f76806c187b9 Reviewed-on: https://go-review.googlesource.com/19802 Reviewed-by: Minux Ma <minux@golang.org>
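The low/upper split can be illustrated with plain integer arithmetic. The sketch below follows the usual MIPS convention in which the low half is sign-extended when added, so the upper half carries a rounding adjustment; whether the linker change applies exactly this adjustment is an assumption here, and `splitOffset` and `joinOffset` are hypothetical names.

```go
package main

import "fmt"

// splitOffset splits a signed 32-bit offset into two 16-bit immediates.
// Because the low immediate is sign-extended when it is added back, the
// upper half is computed from off+0x8000 so the sum reconstructs off.
func splitOffset(off int32) (upper, lower uint16) {
	lower = uint16(off)
	upper = uint16((int64(off) + 1<<15) >> 16)
	return upper, lower
}

// joinOffset mimics what an upper-immediate load followed by a
// sign-extending 16-bit add would compute.
func joinOffset(upper, lower uint16) int32 {
	return int32(int16(upper))<<16 + int32(int16(lower))
}

func main() {
	for _, off := range []int32{0, 0x1234, 0x12348000, -1, 0x7fffffff} {
		u, l := splitOffset(off)
		fmt.Printf("%#x -> upper=%#x lower=%#x -> %#x\n", off, u, l, joinOffset(u, l))
	}
}
```

Keeping the two halves as separate relocations lets each instruction be patched on its own, which is what the commit message means by the instructions not needing to be adjacent.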
-
Cherry Zhang authored
Change-Id: I724ce0a48c1aeed14267c049fa415a6fa2fffbcf Reviewed-on: https://go-review.googlesource.com/19864 Reviewed-by: Minux Ma <minux@golang.org>
-
Cherry Zhang authored
Leave R28 for the SB register, which will be introduced in CL 19802. Change-Id: I1cf7a789695c5de664267ec8086bfb0b043ebc14 Reviewed-on: https://go-review.googlesource.com/19863 Reviewed-by: Minux Ma <minux@golang.org>
-
Cherry Zhang authored
On mips64, an address is 64 bits, not a WORD. Also, it is never used anywhere. Change-Id: Ic6bf6d6a21c8d2f1eb7bfe9efc5a29186ec2a8ef Reviewed-on: https://go-review.googlesource.com/19801 Reviewed-by: Minux Ma <minux@golang.org>
-
Brad Fitzpatrick authored
The HTTP client had a limit for the maximum number of idle connections per-host, but not a global limit. This CL adds a global idle connection limit too, Transport.MaxIdleConns. All idle conns are now also stored in a doubly-linked list. When there are too many, the oldest one is closed. Fixes #15461 Change-Id: I72abbc28d140c73cf50f278fa70088b45ae0deef Reviewed-on: https://go-review.googlesource.com/22655 Reviewed-by: Andrew Gerrand <adg@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
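The eviction policy described above (a doubly-linked list from which the oldest idle connection is closed once a global cap is exceeded) can be sketched with `container/list`. This is an illustration of the idea, not the Transport's internal code; `idlePool`, `putIdle`, and the string stand-ins for connections are all hypothetical.

```go
package main

import (
	"container/list"
	"fmt"
)

// idlePool keeps idle connections in most-recently-idle order and evicts
// the oldest one when the global cap is exceeded.
type idlePool struct {
	maxIdle int        // 0 means no limit
	lru     *list.List // front = newest idle conn, back = oldest
	closed  []string   // names of evicted conns, recorded for the demo
}

func newIdlePool(maxIdle int) *idlePool {
	return &idlePool{maxIdle: maxIdle, lru: list.New()}
}

// putIdle registers conn as idle and evicts the oldest entry if needed.
func (p *idlePool) putIdle(conn string) {
	p.lru.PushFront(conn)
	if p.maxIdle > 0 && p.lru.Len() > p.maxIdle {
		oldest := p.lru.Back()
		p.lru.Remove(oldest)
		p.closed = append(p.closed, oldest.Value.(string))
	}
}

func main() {
	p := newIdlePool(2)
	for _, c := range []string{"a", "b", "c"} {
		p.putIdle(c)
	}
	fmt.Println(p.closed, p.lru.Len())
}
```

From the caller's side the equivalent configuration is simply setting `MaxIdleConns` on an `http.Transport`.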
-
- 30 Apr, 2016 7 commits
-
-
Brad Fitzpatrick authored
Clarify that it includes the RFC 7230 "request-line". Fixes #15494 Change-Id: I9cc5dd5f2d85ebf903229539208cec4da5c38d04 Reviewed-on: https://go-review.googlesource.com/22656 Reviewed-by: Andrew Gerrand <adg@golang.org>
-
Kevin Burke authored
Previously, named byte types like json.RawMessage could get dirty database memory from a call to Scan. These types would activate a code path that didn't clone the byte data coming from the database before assigning it. Another thread could then overwrite the byte array in src, which has unexpected consequences. Originally reported by Jason Moiron; the patch and test are his suggestions. Fixes #13905. Change-Id: Iacfef61cbc9dd51c8fccef9b2b9d9544c77dd0e0 Reviewed-on: https://go-review.googlesource.com/22393 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
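The aliasing hazard is easy to reproduce without a database. In this sketch `RawMessage` is a local stand-in for a named byte type such as json.RawMessage, and `scanAliased` and `scanCloned` are hypothetical helpers contrasting the buggy and fixed behaviour; they are not the database/sql conversion code.

```go
package main

import "fmt"

// RawMessage is a local stand-in for a named byte slice type.
type RawMessage []byte

// scanAliased converts without copying, so dst shares the driver's buffer.
func scanAliased(dst *RawMessage, src []byte) {
	*dst = RawMessage(src)
}

// scanCloned copies the bytes first, so later writes to src are invisible.
func scanCloned(dst *RawMessage, src []byte) {
	*dst = append(RawMessage(nil), src...)
}

func main() {
	driverBuf := []byte(`{"id":1}`)
	var aliased, cloned RawMessage
	scanAliased(&aliased, driverBuf)
	scanCloned(&cloned, driverBuf)

	// The driver reuses its buffer for the next row.
	copy(driverBuf, `{"id":9}`)

	fmt.Println(string(aliased)) // {"id":9}
	fmt.Println(string(cloned))  // {"id":1}
}
```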
-
Austin Clements authored
With the switch to separate mark bitmaps, the scan/dead bit for the first word of each object is now unused. Reclaim this bit and use it as a scan/dead bit, just like words three and on. The second word is still used for checkmark. This dramatically simplifies heapBitsSetTypeNoScan and hasPointers, since they no longer need different cases for 1, 2, and 3+ word objects. They can instead just manipulate the heap bitmap for the first word and be done with it. In order to enable this, we change heapBitsSetType and runGCProg to always set the scan/dead bit to scan for the first word on every code path. Since these functions only apply to types that have pointers, there's no need to do this conditionally: it's *always* necessary to set the scan bit in the first word. We also change every place that scans an object and checks if there are more pointers. Rather than only checking morePointers if the word is >= 2, we now check morePointers if word != 1 (since that's the checkmark word). Looking forward, we should probably reclaim the checkmark bit, too, but that's going to be quite a bit more work. Tested by setting doubleCheck in heapBitsSetType and running all.bash on both linux/amd64 and linux/386, and by running GOGC=10 all.bash. This particularly improves the FmtFprintf* go1 benchmarks, since they do a large amount of noscan allocation. 
name old time/op new time/op delta
BinaryTree17-12 2.34s ± 1% 2.38s ± 1% +1.70% (p=0.000 n=17+19)
Fannkuch11-12 2.09s ± 0% 2.09s ± 1% ~ (p=0.276 n=17+16)
FmtFprintfEmpty-12 44.9ns ± 2% 44.8ns ± 2% ~ (p=0.340 n=19+18)
FmtFprintfString-12 127ns ± 0% 125ns ± 0% -1.57% (p=0.000 n=16+15)
FmtFprintfInt-12 128ns ± 0% 122ns ± 1% -4.45% (p=0.000 n=15+20)
FmtFprintfIntInt-12 207ns ± 1% 193ns ± 0% -6.55% (p=0.000 n=19+14)
FmtFprintfPrefixedInt-12 197ns ± 1% 191ns ± 0% -2.93% (p=0.000 n=17+18)
FmtFprintfFloat-12 263ns ± 0% 248ns ± 1% -5.88% (p=0.000 n=15+19)
FmtManyArgs-12 794ns ± 0% 779ns ± 1% -1.90% (p=0.000 n=18+18)
GobDecode-12 7.14ms ± 2% 7.11ms ± 1% ~ (p=0.072 n=20+20)
GobEncode-12 5.85ms ± 1% 5.82ms ± 1% -0.49% (p=0.000 n=20+20)
Gzip-12 218ms ± 1% 215ms ± 1% -1.22% (p=0.000 n=19+19)
Gunzip-12 36.8ms ± 0% 36.7ms ± 0% -0.18% (p=0.006 n=18+20)
HTTPClientServer-12 77.1µs ± 4% 77.1µs ± 3% ~ (p=0.945 n=19+20)
JSONEncode-12 15.6ms ± 1% 15.9ms ± 1% +1.68% (p=0.000 n=18+20)
JSONDecode-12 55.2ms ± 1% 53.6ms ± 1% -2.93% (p=0.000 n=17+19)
Mandelbrot200-12 4.05ms ± 1% 4.05ms ± 0% ~ (p=0.306 n=17+17)
GoParse-12 3.14ms ± 1% 3.10ms ± 1% -1.31% (p=0.000 n=19+18)
RegexpMatchEasy0_32-12 69.3ns ± 1% 70.0ns ± 0% +0.89% (p=0.000 n=19+17)
RegexpMatchEasy0_1K-12 237ns ± 1% 236ns ± 0% -0.62% (p=0.000 n=19+16)
RegexpMatchEasy1_32-12 69.5ns ± 1% 70.3ns ± 1% +1.14% (p=0.000 n=18+17)
RegexpMatchEasy1_1K-12 377ns ± 1% 366ns ± 1% -3.03% (p=0.000 n=15+19)
RegexpMatchMedium_32-12 107ns ± 1% 107ns ± 2% ~ (p=0.318 n=20+19)
RegexpMatchMedium_1K-12 33.8µs ± 3% 33.5µs ± 1% -1.04% (p=0.001 n=20+19)
RegexpMatchHard_32-12 1.68µs ± 1% 1.73µs ± 0% +2.50% (p=0.000 n=20+18)
RegexpMatchHard_1K-12 50.8µs ± 1% 52.0µs ± 1% +2.50% (p=0.000 n=19+18)
Revcomp-12 381ms ± 1% 385ms ± 1% +1.00% (p=0.000 n=17+18)
Template-12 64.9ms ± 3% 62.6ms ± 1% -3.55% (p=0.000 n=19+18)
TimeParse-12 324ns ± 0% 328ns ± 1% +1.25% (p=0.000 n=18+18)
TimeFormat-12 345ns ± 0% 334ns ± 0% -3.31% (p=0.000 n=15+17)
[Geo mean] 52.1µs 51.5µs -1.00%
Change-Id: I13e74da3193a7f80794c654f944d1f0d60817049 Reviewed-on: https://go-review.googlesource.com/22632 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
This makes this code better self-documenting and makes it easier to find these places in the future. Change-Id: I31dc5598ae67f937fb9ef26df92fd41d01e983c3 Reviewed-on: https://go-review.googlesource.com/22631 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
heapBits.bits is carefully written to produce good machine code. Use it in heapBits.morePointers and heapBits.isPointer to get good machine code there, too. Change-Id: I208c7d0d38697e7a22cad67f692162589b75f1e2 Reviewed-on: https://go-review.googlesource.com/22630 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Keith Randall authored
Fixes #15496 Change-Id: Ieb5be1caa4b1c23e23b20d56c1a0a619032a9f5d Reviewed-on: https://go-review.googlesource.com/22652 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Michael Munday authored
Fix issues introduced in 5f9a870b. Change-Id: Ia75945ef563956613bf88bbe57800a96455c265d Reviewed-on: https://go-review.googlesource.com/22661 Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
- 29 Apr, 2016 14 commits
-
-
Ian Lance Taylor authored
Change-Id: I4b34bcd5cde71ecfbb352b39c4231de6168cc7f3 Reviewed-on: https://go-review.googlesource.com/22651 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <munday@ca.ibm.com>
-
Matthew Dempsky authored
Change-Id: I99b2ca52824341d986090f5c78ab4f396594bcdf Reviewed-on: https://go-review.googlesource.com/22660 Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Ian Lance Taylor authored
Add support for the context function set by runtime.SetCgoTraceback. The context function was added in CL 17761, without support. This CL is the support. This CL has not been tested for real C code, as a working context function for C code requires unwind support that does not seem to exist. I wanted to get the CL out before the freeze. I apologize for the length of this CL. It's mostly plumbing, but unfortunately the plumbing is processor-specific. Change-Id: I8ce11a0de9b3dafcc29efd2649d776e93bff0e90 Reviewed-on: https://go-review.googlesource.com/22508 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Michael Munday authored
This commit adds the new 'ctrAble' interface to the crypto/cipher package. The role of ctrAble is the same as gcmAble but for CTR instead of GCM. It allows block ciphers to provide optimized CTR implementations. The primary benefit of adding CTR support to the s390x AES implementation is that it allows us to encrypt the counter values in bulk, giving the cipher message instruction a larger chunk of data to work on per invocation. The xorBytes assembly is necessary because xorBytes becomes a bottleneck when CTR is done in this way. Hopefully it will be possible to remove this once s390x has migrated to the ssa backend.

name old speed new speed delta
AESCTR1K 160MB/s ± 6% 867MB/s ± 0% +442.42% (p=0.000 n=9+10)

Change-Id: I1ae16b0ce0e2641d2bdc7d7eabc94dd35f6e9318 Reviewed-on: https://go-review.googlesource.com/22195 Reviewed-by: Adam Langley <agl@golang.org>
-
Michael Munday authored
This commit adds the cbcEncAble and cbcDecAble interfaces that can be implemented by block ciphers that support an optimized implementation of CBC. This is similar to what is done for GCM with the gcmAble interface. The cbcEncAble, cbcDecAble and gcmAble interfaces all now have tests to ensure they are detected correctly in the cipher package.

name old speed new speed delta
AESCBCEncrypt1K 152MB/s ± 1% 1362MB/s ± 0% +795.59% (p=0.000 n=10+9)
AESCBCDecrypt1K 143MB/s ± 1% 1362MB/s ± 0% +853.00% (p=0.000 n=10+9)

Change-Id: I715f686ab3686b189a3dac02f86001178fa60580 Reviewed-on: https://go-review.googlesource.com/22523 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Adam Langley <agl@golang.org>
-
Keith Randall authored
Fixes #15488 Change-Id: I054eb1e1c859de315e3cdbdef5428682bce693fd Reviewed-on: https://go-review.googlesource.com/22609 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
-
Rick Hudson authored
This commit moves the GC from free list allocation to bit mark allocation. Instead of using the bitmaps generated during the mark phases to generate free lists and then using the free lists for allocation, we allocate directly from the bitmaps. The change in the garbage benchmark:

name old time/op new time/op delta
XBenchGarbage-12 2.22ms ± 1% 2.13ms ± 1% -3.90% (p=0.000 n=18+18)

Change-Id: I17f57233336f0ca5ef5404c3be4ecb443ab622aa
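Allocating straight from a bitmap can be sketched in a few lines. The `span` type, its fields, and `nextFree` below are simplified, hypothetical stand-ins and do not mirror the runtime's actual data structures.

```go
package main

import (
	"fmt"
	"math/bits"
)

// span models a run of nelems equal-sized slots. A set bit in allocBits
// means the slot is in use (found live by the last mark, or handed out
// since then).
type span struct {
	allocBits []uint64
	nelems    int
	freeIndex int // every slot below this index is known to be in use
}

// nextFree finds the lowest free slot at or after freeIndex, marks it as
// in use, and returns its index. It reports false when the span is full.
func (s *span) nextFree() (int, bool) {
	for w := s.freeIndex / 64; w < len(s.allocBits); w++ {
		free := ^s.allocBits[w] // set bits now denote free slots
		if free == 0 {
			continue
		}
		idx := w*64 + bits.TrailingZeros64(free)
		if idx >= s.nelems {
			return 0, false
		}
		s.allocBits[w] |= 1 << uint(idx%64)
		s.freeIndex = idx + 1
		return idx, true
	}
	return 0, false
}

func main() {
	// Slots 0, 1 and 3 survived the last collection.
	s := &span{allocBits: []uint64{0b01011}, nelems: 5}
	for {
		idx, ok := s.nextFree()
		if !ok {
			break
		}
		fmt.Println("allocated slot", idx)
	}
}
```

No free list is ever built: the mark results are consumed in place, one word at a time.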
-
Rick Hudson authored
nextFreeFast is currently not inlined by the compiler due to its size and complexity. This CL simplifies nextFreeFast by letting the slow path (nextFree) handle the corner cases. Change-Id: Ia9c5d1a7912bcb4bec072f5fd240f0e0bafb20e4 Reviewed-on: https://go-review.googlesource.com/22598 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com>
-
David Chase authored
This is necessary to avoid disrupting the go1 suite and gives us a place to put other tests of basic compiler function and correctness. Change-Id: I36933819ff2bfe6a2121fff2be9a98efd2123d9a Reviewed-on: https://go-review.googlesource.com/22597 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
-
Keith Randall authored
Break really long lines. Add spacing to line up columns. In AMD64, put all the optimization rules after all the lowering rules. Change-Id: I45cc7368bf278416e67f89e74358db1bd4326a93 Reviewed-on: https://go-review.googlesource.com/22470 Reviewed-by: David Chase <drchase@google.com>
-
Austin Clements authored
sweep used to skip mcentral.freeSpan (and its locking) if it didn't find any new free objects. We lost that optimization when the freed-object counting changed in dad83f7 to count total free objects instead of newly freed objects. The previous commit brings back counting of newly freed objects, so we can easily revive this optimization by checking that count (like we used to) instead of the total free objects count. Change-Id: I43658707a1c61674d0366124d5976b00d98741a9 Reviewed-on: https://go-review.googlesource.com/22596 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>
-
Austin Clements authored
Commit 8dda1c4c changed the meaning of "nfree" in sweep from the number of newly freed objects to the total number of free objects in the span, but didn't update where sweep added nfree to c.local_nsmallfree. Hence, we're over-accounting the number of frees. This is causing TestArrayHash to fail with "too many allocs NNN - hash not balanced". Fix this by computing the number of newly freed objects and adding that to c.local_nsmallfree, so it behaves like it used to. Computing this requires a small tweak to mallocgc: apparently we've never set s.allocCount when allocating a large object; fix this by setting it to 1 so sweep doesn't get confused. Change-Id: I31902ffd310110da4ffd807c5c06f1117b872dc8 Reviewed-on: https://go-review.googlesource.com/22595 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>
-
Austin Clements authored
We broke tracing of freed objects in GODEBUG=allocfreetrace=1 mode when we removed the sweep over the mark bitmap. Fix it by re-introducing the sweep over the bitmap specifically if we're in allocfreetrace mode. This doesn't have to be even remotely efficient, since the overhead of allocfreetrace is huge anyway, so we can keep the code for this down to just a few lines. Change-Id: I9e176b3b04c73608a0ea3068d5d0cd30760ebd40 Reviewed-on: https://go-review.googlesource.com/22592 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
-
Austin Clements authored
Currently we always zero objects when we allocate them. We used to have an optimization that would not zero objects that had not been allocated since the whole span was last zeroed (either by getting it from the system or by getting it from the heap, which does a bulk zero), but this depended on the sweeper clobbering the first two words of each object. Hence, we lost this optimization when the bitmap sweeper went away. Re-introduce this optimization using a different mechanism. Each span already keeps a flag indicating that it just came from the OS or was just bulk zeroed by the mheap. We can simply use this flag to know when we don't need to zero an object. This is slightly less efficient than the old optimization: if a span gets allocated and partially used, then GC happens and the span gets returned to the mcentral, then the span gets re-acquired, the old optimization knew that it only had to re-zero the objects that had been reclaimed, whereas this optimization will re-zero everything. However, in this case, you're already paying for the garbage collection, and you've only wasted one zeroing of the span, so in practice there seems to be little difference. (If we did want to revive the full optimization, each span could keep track of a frontier beyond which all free slots are zeroed. I prototyped this and it didn't obviously do any better than the much simpler approach in this commit.) This significantly improves BinaryTree17, which is allocation-heavy (and runs first, so most pages are already zeroed), and slightly improves everything else.
name old time/op new time/op delta
XBenchGarbage-12 2.15ms ± 1% 2.14ms ± 1% -0.80% (p=0.000 n=17+17)

name old time/op new time/op delta
BinaryTree17-12 2.71s ± 1% 2.56s ± 1% -5.73% (p=0.000 n=18+19)
DivconstI64-12 1.70ns ± 1% 1.70ns ± 1% ~ (p=0.562 n=18+18)
DivconstU64-12 1.74ns ± 2% 1.74ns ± 1% ~ (p=0.394 n=20+20)
DivconstI32-12 1.74ns ± 0% 1.74ns ± 0% ~ (all samples are equal)
DivconstU32-12 1.66ns ± 1% 1.66ns ± 0% ~ (p=0.516 n=15+16)
DivconstI16-12 1.84ns ± 0% 1.84ns ± 0% ~ (all samples are equal)
DivconstU16-12 1.82ns ± 0% 1.82ns ± 0% ~ (all samples are equal)
DivconstI8-12 1.79ns ± 0% 1.79ns ± 0% ~ (all samples are equal)
DivconstU8-12 1.60ns ± 0% 1.60ns ± 1% ~ (p=0.603 n=17+19)
Fannkuch11-12 2.11s ± 1% 2.11s ± 0% ~ (p=0.333 n=16+19)
FmtFprintfEmpty-12 45.1ns ± 4% 45.4ns ± 5% ~ (p=0.111 n=20+20)
FmtFprintfString-12 134ns ± 0% 129ns ± 0% -3.45% (p=0.000 n=18+16)
FmtFprintfInt-12 131ns ± 1% 129ns ± 1% -1.54% (p=0.000 n=16+18)
FmtFprintfIntInt-12 205ns ± 2% 203ns ± 0% -0.56% (p=0.014 n=20+18)
FmtFprintfPrefixedInt-12 200ns ± 2% 197ns ± 1% -1.48% (p=0.000 n=20+18)
FmtFprintfFloat-12 256ns ± 1% 256ns ± 0% -0.21% (p=0.008 n=18+20)
FmtManyArgs-12 805ns ± 0% 804ns ± 0% -0.19% (p=0.001 n=18+18)
GobDecode-12 7.21ms ± 1% 7.14ms ± 1% -0.92% (p=0.000 n=19+20)
GobEncode-12 5.88ms ± 1% 5.88ms ± 1% ~ (p=0.641 n=18+19)
Gzip-12 218ms ± 1% 218ms ± 1% ~ (p=0.271 n=19+18)
Gunzip-12 37.1ms ± 0% 36.9ms ± 0% -0.29% (p=0.000 n=18+17)
HTTPClientServer-12 78.1µs ± 2% 77.4µs ± 2% ~ (p=0.070 n=19+19)
JSONEncode-12 15.5ms ± 1% 15.5ms ± 0% ~ (p=0.063 n=20+18)
JSONDecode-12 56.1ms ± 0% 55.4ms ± 1% -1.18% (p=0.000 n=19+18)
Mandelbrot200-12 4.05ms ± 0% 4.06ms ± 0% +0.29% (p=0.001 n=18+18)
GoParse-12 3.28ms ± 1% 3.21ms ± 1% -2.30% (p=0.000 n=20+20)
RegexpMatchEasy0_32-12 69.4ns ± 2% 69.3ns ± 1% ~ (p=0.205 n=18+16)
RegexpMatchEasy0_1K-12 239ns ± 0% 239ns ± 0% ~ (all samples are equal)
RegexpMatchEasy1_32-12 69.4ns ± 1% 69.4ns ± 1% ~ (p=0.620 n=15+18)
RegexpMatchEasy1_1K-12 370ns ± 1% 369ns ± 2% ~ (p=0.088 n=20+20)
RegexpMatchMedium_32-12 108ns ± 0% 108ns ± 0% ~ (all samples are equal)
RegexpMatchMedium_1K-12 33.6µs ± 3% 33.5µs ± 3% ~ (p=0.718 n=20+20)
RegexpMatchHard_32-12 1.68µs ± 1% 1.67µs ± 2% ~ (p=0.316 n=20+20)
RegexpMatchHard_1K-12 50.5µs ± 3% 50.4µs ± 3% ~ (p=0.659 n=20+20)
Revcomp-12 381ms ± 1% 381ms ± 1% ~ (p=0.916 n=19+18)
Template-12 66.5ms ± 1% 65.8ms ± 2% -1.08% (p=0.000 n=20+20)
TimeParse-12 317ns ± 0% 319ns ± 0% +0.48% (p=0.000 n=19+12)
TimeFormat-12 338ns ± 0% 338ns ± 0% ~ (p=0.124 n=19+18)
[Geo mean] 5.99µs 5.96µs -0.54%

Change-Id: I638ffd9d9f178835bbfa499bac20bd7224f1a907 Reviewed-on: https://go-review.googlesource.com/22591 Reviewed-by: Rick Hudson <rlh@golang.org>
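The per-span flag from this commit message can be modelled as below. This is a toy sketch under stated assumptions: the `span` type, its fields, and the `recycle` step are hypothetical simplifications, not the runtime's code.

```go
package main

import "fmt"

// span models a block of memory carved into fixed-size objects.
type span struct {
	mem      []byte
	elemSize int
	next     int  // offset of the next object to hand out
	needZero bool // false while the memory is known to be all zero
	zeroed   int  // number of objects explicitly zeroed (demo only)
}

// alloc hands out the next object, zeroing it only when the span may
// contain stale data.
func (s *span) alloc() []byte {
	obj := s.mem[s.next : s.next+s.elemSize]
	s.next += s.elemSize
	if s.needZero {
		for i := range obj {
			obj[i] = 0
		}
		s.zeroed++
	}
	return obj
}

// recycle models the span being swept and reused: from now on every
// object is re-zeroed, even slots that were never touched.
func (s *span) recycle() {
	s.next = 0
	s.needZero = true
}

func main() {
	s := &span{mem: make([]byte, 32), elemSize: 16}
	a := s.alloc() // fresh memory, no zeroing needed
	a[0] = 0xff
	s.recycle()
	b := s.alloc() // may hold stale data, so it is zeroed
	fmt.Println(b[0], s.zeroed)
}
```

The coarse flag is what makes the approach simple: it trades some redundant zeroing after a recycle for not having to track zeroed state per object.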
-