- 19 May, 2016 1 commit
-
-
Austin Clements authored
Currently ready always puts the readied goroutine in runnext. We're going to have to change this for some uses, so add a flag for whether or not to use runnext. For now we always pass true so this is a no-op change. For #15706. Change-Id: Iaa66d8355ccfe4bbe347570cc1b1878c70fa25df Reviewed-on: https://go-review.googlesource.com/23171 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Rick Hudson <rlh@golang.org>
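
As a rough illustration of the flag described above, here is a minimal, self-contained sketch. The types G, P, and the ready method below are invented stand-ins for this note, not the runtime's real definitions (the actual code operates on g, p, and runqput):

    package sched

    // G and P are simplified stand-ins for the runtime's goroutine and
    // per-processor structures; the names and fields are illustrative only.
    type G struct{ id int }

    type P struct {
        runnext *G   // single "run me next" slot
        runq    []*G // ordinary local run queue
    }

    // ready makes gp runnable. When next is true it takes the runnext slot
    // (kicking any previous occupant to the back of the queue); when next is
    // false it is simply appended, which is the behaviour the new flag enables.
    func (p *P) ready(gp *G, next bool) {
        if next {
            if old := p.runnext; old != nil {
                p.runq = append(p.runq, old)
            }
            p.runnext = gp
            return
        }
        p.runq = append(p.runq, gp)
    }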
-
- 16 May, 2016 1 commit
-
-
Austin Clements authored
Change-Id: I5ebf93b60213c0138754fc20888ae5ce60237b8c Reviewed-on: https://go-review.googlesource.com/23131 Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 30 Apr, 2016 2 commits
-
-
Austin Clements authored
With the switch to separate mark bitmaps, the scan/dead bit for the first word of each object is now unused. Reclaim this bit and use it as a scan/dead bit, just like words three and on. The second word is still used for checkmark. This dramatically simplifies heapBitsSetTypeNoScan and hasPointers, since they no longer need different cases for 1, 2, and 3+ word objects. They can instead just manipulate the heap bitmap for the first word and be done with it. In order to enable this, we change heapBitsSetType and runGCProg to always set the scan/dead bit to scan for the first word on every code path. Since these functions only apply to types that have pointers, there's no need to do this conditionally: it's *always* necessary to set the scan bit in the first word. We also change every place that scans an object and checks if there are more pointers. Rather than only checking morePointers if the word is >= 2, we now check morePointers if word != 1 (since that's the checkmark word). Looking forward, we should probably reclaim the checkmark bit, too, but that's going to be quite a bit more work.

Tested by setting doubleCheck in heapBitsSetType and running all.bash on both linux/amd64 and linux/386, and by running GOGC=10 all.bash.

This particularly improves the FmtFprintf* go1 benchmarks, since they do a large amount of noscan allocation.

name old time/op new time/op delta
BinaryTree17-12 2.34s ± 1% 2.38s ± 1% +1.70% (p=0.000 n=17+19)
Fannkuch11-12 2.09s ± 0% 2.09s ± 1% ~ (p=0.276 n=17+16)
FmtFprintfEmpty-12 44.9ns ± 2% 44.8ns ± 2% ~ (p=0.340 n=19+18)
FmtFprintfString-12 127ns ± 0% 125ns ± 0% -1.57% (p=0.000 n=16+15)
FmtFprintfInt-12 128ns ± 0% 122ns ± 1% -4.45% (p=0.000 n=15+20)
FmtFprintfIntInt-12 207ns ± 1% 193ns ± 0% -6.55% (p=0.000 n=19+14)
FmtFprintfPrefixedInt-12 197ns ± 1% 191ns ± 0% -2.93% (p=0.000 n=17+18)
FmtFprintfFloat-12 263ns ± 0% 248ns ± 1% -5.88% (p=0.000 n=15+19)
FmtManyArgs-12 794ns ± 0% 779ns ± 1% -1.90% (p=0.000 n=18+18)
GobDecode-12 7.14ms ± 2% 7.11ms ± 1% ~ (p=0.072 n=20+20)
GobEncode-12 5.85ms ± 1% 5.82ms ± 1% -0.49% (p=0.000 n=20+20)
Gzip-12 218ms ± 1% 215ms ± 1% -1.22% (p=0.000 n=19+19)
Gunzip-12 36.8ms ± 0% 36.7ms ± 0% -0.18% (p=0.006 n=18+20)
HTTPClientServer-12 77.1µs ± 4% 77.1µs ± 3% ~ (p=0.945 n=19+20)
JSONEncode-12 15.6ms ± 1% 15.9ms ± 1% +1.68% (p=0.000 n=18+20)
JSONDecode-12 55.2ms ± 1% 53.6ms ± 1% -2.93% (p=0.000 n=17+19)
Mandelbrot200-12 4.05ms ± 1% 4.05ms ± 0% ~ (p=0.306 n=17+17)
GoParse-12 3.14ms ± 1% 3.10ms ± 1% -1.31% (p=0.000 n=19+18)
RegexpMatchEasy0_32-12 69.3ns ± 1% 70.0ns ± 0% +0.89% (p=0.000 n=19+17)
RegexpMatchEasy0_1K-12 237ns ± 1% 236ns ± 0% -0.62% (p=0.000 n=19+16)
RegexpMatchEasy1_32-12 69.5ns ± 1% 70.3ns ± 1% +1.14% (p=0.000 n=18+17)
RegexpMatchEasy1_1K-12 377ns ± 1% 366ns ± 1% -3.03% (p=0.000 n=15+19)
RegexpMatchMedium_32-12 107ns ± 1% 107ns ± 2% ~ (p=0.318 n=20+19)
RegexpMatchMedium_1K-12 33.8µs ± 3% 33.5µs ± 1% -1.04% (p=0.001 n=20+19)
RegexpMatchHard_32-12 1.68µs ± 1% 1.73µs ± 0% +2.50% (p=0.000 n=20+18)
RegexpMatchHard_1K-12 50.8µs ± 1% 52.0µs ± 1% +2.50% (p=0.000 n=19+18)
Revcomp-12 381ms ± 1% 385ms ± 1% +1.00% (p=0.000 n=17+18)
Template-12 64.9ms ± 3% 62.6ms ± 1% -3.55% (p=0.000 n=19+18)
TimeParse-12 324ns ± 0% 328ns ± 1% +1.25% (p=0.000 n=18+18)
TimeFormat-12 345ns ± 0% 334ns ± 0% -3.31% (p=0.000 n=15+17)
[Geo mean] 52.1µs 51.5µs -1.00%

Change-Id: I13e74da3193a7f80794c654f944d1f0d60817049
Reviewed-on: https://go-review.googlesource.com/22632
Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
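
A rough sketch of the scan-loop condition this change describes. The helper functions here are assumptions made for illustration, not the runtime's real APIs: rather than testing the "more pointers" bit only for words >= 2, the test now applies to every word except word 1, which holds the checkmark bit.

    package gcsketch

    // scanObject is an illustrative loop, not the runtime's scanobject. It
    // walks an object's words and consults a per-word "more pointers" bit
    // supplied by the caller. After this change the bit is meaningful for
    // word 0 as well; only word 1 is skipped, because its bitmap bit is
    // reserved for checkmarking.
    func scanObject(words int, morePointers func(i int) bool, scanWord func(i int)) {
        for i := 0; i < words; i++ {
            if i != 1 && !morePointers(i) {
                break // the remainder of the object is pointer-free
            }
            scanWord(i)
        }
    }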
-
Austin Clements authored
This makes the code better self-documenting and makes it easier to find these places in the future. Change-Id: I31dc5598ae67f937fb9ef26df92fd41d01e983c3 Reviewed-on: https://go-review.googlesource.com/22631 Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 29 Apr, 2016 2 commits
-
-
Austin Clements authored
Currently we have lots of (s.start << _PageShift) and variants. We now have an s.base() function that returns this. It's faster and more readable, so use it. Change-Id: I888060a9dae15ea75ca8cc1c2b31c905e71b452b Reviewed-on: https://go-review.googlesource.com/22559 Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>
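
A minimal sketch of the idea, with invented names (the real field and accessor live on the runtime's mspan): compute the base once when the span is set up, then just read the stored value instead of repeating the shift at every call site.

    package gcsketch

    // mspan here is a pared-down, illustrative version of the runtime's span
    // structure; it is not the real type.
    type mspan struct {
        startAddr uintptr
    }

    // init records the base address once, replacing repeated
    // (s.start << _PageShift) arithmetic at every use.
    func (s *mspan) init(pageID uintptr, pageShift uint) {
        s.startAddr = pageID << pageShift
    }

    // base simply returns the precomputed address.
    func (s *mspan) base() uintptr {
        return s.startAddr
    }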
-
Rick Hudson authored
Two changes are included here that are dependent on each other. The first is that allocBits and gcmarkBits are changed to a *uint8, which points to the first byte of that span's mark and alloc bits. Several places were altered to perform pointer arithmetic to locate the byte corresponding to an object in the span. The actual bit corresponding to an object is indexed in the byte by using the lower three bits of the object's index. The second change avoids the redundant calculation of an object's index. The index is returned from heapBitsForObject and then used by the functions indexing allocBits and gcmarkBits. Finally, we no longer allocate the gc bits in the span structures. Instead we use an arena-based allocation scheme that allows for a more compact bitmap as well as recycling and bulk clearing of the mark bits. Change-Id: If4d04b2021c092ec39a4caef5937a8182c64dfef Reviewed-on: https://go-review.googlesource.com/20705 Reviewed-by:
Austin Clements <austin@google.com>
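
The byte/bit indexing described above can be pictured with a small sketch. The type and method names below are invented, and raw pointer arithmetic like this is only legitimate inside the runtime; it is shown purely as an illustration:

    package gcsketch

    import "unsafe"

    // spanBits models a span whose allocation bitmap is reached through a
    // *uint8 pointing at its first byte.
    type spanBits struct {
        allocBits *uint8
    }

    // isAllocated reports whether object objIndex has its allocation bit set.
    // The containing byte is found by pointer arithmetic (objIndex/8 bytes past
    // allocBits) and the bit within that byte by the low three bits of the index.
    func (s *spanBits) isAllocated(objIndex uintptr) bool {
        bytep := (*uint8)(unsafe.Pointer(uintptr(unsafe.Pointer(s.allocBits)) + objIndex/8))
        mask := uint8(1) << (objIndex & 7)
        return *bytep&mask != 0
    }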
-
- 27 Apr, 2016 4 commits
-
-
Rick Hudson authored
The complexity of the GC work buffers' put and tryGet prevented them from being inlined. This CL simplifies the fast path, thus enabling inlining. If the fast path does not succeed, the previous put and tryGet functions are called. Change-Id: I6da6495d0dadf42bd0377c110b502274cc01acf5 Reviewed-on: https://go-review.googlesource.com/20704 Reviewed-by:
Austin Clements <austin@google.com>
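
The shape of the change can be sketched as follows; all names are invented here (the real code lives in the runtime's gcWork). The point is that put and tryGet now have tiny bodies the compiler can inline, and only overflow/underflow falls back to the larger, non-inlined routines:

    package gcsketch

    // workCache is an illustrative per-P cache: a fixed buffer plus a cursor.
    type workCache struct {
        buf [256]uintptr
        n   int
    }

    func (w *workCache) put(p uintptr) {
        if w.n < len(w.buf) { // fast path: a bounds check and a store
            w.buf[w.n] = p
            w.n++
            return
        }
        w.putSlow(p) // previous, non-inlined path
    }

    func (w *workCache) tryGet() (uintptr, bool) {
        if w.n > 0 { // fast path
            w.n--
            return w.buf[w.n], true
        }
        return w.tryGetSlow() // previous, non-inlined path
    }

    // The slow paths stand in for the original put/tryGet, which exchange
    // buffers with the global work lists; their bodies are placeholders.
    func (w *workCache) putSlow(p uintptr)           {}
    func (w *workCache) tryGetSlow() (uintptr, bool) { return 0, false }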
-
Rick Hudson authored
Prior to this CL, the base of a span was calculated in various places using shifts or calls to base(). This CL now always calls base(), which has been optimized to calculate the base of the span when the span is initialized and store that value in the span structure. Change-Id: I661f2bfa21e3748a249cdf049ef9062db6e78100 Reviewed-on: https://go-review.googlesource.com/20703 Reviewed-by:
Austin Clements <austin@google.com>
-
Rick Hudson authored
Instead of building a freelist from the mark bits generated by the GC, this CL allocates directly from the mark bits. The approach moves the mark bits from the pointer/no pointer heap structures into their own per-span data structures. The mark/allocation vectors consist of a single mark bit per object. Two vectors are maintained, one for allocation and one for the GC's mark phase. During the GC cycle's sweep phase, the interpretation of the vectors is swapped. The mark vector becomes the allocation vector and the old allocation vector is cleared and becomes the mark vector that the next GC cycle will use. Marked entries in the allocation vector indicate that the object is not free. Each allocation vector maintains a boundary between areas of the span already allocated from and areas not yet allocated from. As objects are allocated, this boundary is moved until it reaches the end of the span. At this point further allocations will be done from another span. Since we no longer sweep a span inspecting each freed object, maintaining the pointer/scalar bits in the heap bitmap is now the responsibility of the routines doing the actual allocation. This CL is functionally complete and ready for performance tuning. Change-Id: I336e0fc21eef1066e0b68c7067cc71b9f3d50e04 Reviewed-on: https://go-review.googlesource.com/19470 Reviewed-by:
Austin Clements <austin@google.com>
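
A minimal sketch of the allocation-boundary and vector-swap ideas described above, with invented field and method names (the runtime's actual freeindex/allocBits machinery is more involved):

    package gcsketch

    // span models allocation directly from the allocation bit vector.
    type span struct {
        allocBits []uint8 // one bit per object; 1 means "not free"
        freeindex int     // objects below this index have been considered
        nelems    int
    }

    // nextFree returns the index of the next free object, or -1 when the
    // span is exhausted and allocation must move to another span.
    func (s *span) nextFree() int {
        for i := s.freeindex; i < s.nelems; i++ {
            if s.allocBits[i/8]&(1<<uint(i&7)) == 0 {
                s.freeindex = i + 1
                return i
            }
        }
        s.freeindex = s.nelems
        return -1
    }

    // sweep swaps the roles of the vectors: the GC's mark bits become the
    // allocation bits (marked == live == not free), and the old allocation
    // bits are cleared to serve as the next cycle's mark vector.
    func (s *span) sweep(gcmarkBits []uint8) (newMarkBits []uint8) {
        old := s.allocBits
        s.allocBits = gcmarkBits
        s.freeindex = 0
        for i := range old {
            old[i] = 0
        }
        return old
    }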
-
Austin Clements authored
Currently the runtime rescans globals during mark 2 and mark termination. This costs as much as 500µs/MB in STW time, which is enough to surpass the 10ms STW limit with only 20MB of globals. It's also basically unnecessary. The compiler already generates write barriers for global -> heap pointer updates and the regular write barrier doesn't check whether the slot is a global or in the heap. Some less common write barriers do cause problems. heapBitsBulkBarrier, which is used by typedmemmove and related functions, currently depends on having access to the pointer bitmap and as a result ignores writes to globals. Likewise, the reflect-related write barriers reflect_typedmemmovepartial and callwritebarrier ignore non-heap destinations; though it appears they can never be called with global pointers anyway.

This commit makes heapBitsBulkBarrier issue write barriers for writes to global pointers using the data and BSS pointer bitmaps, removes the inheap checks from the reflection write barriers, and eliminates the rescans during mark 2 and mark termination. It also adds a test that writes to globals have write barriers. Programs with large data+BSS segments (with pointers) aren't common, but for programs that do have large data+BSS segments, this significantly reduces pause time:

name \ 95%ile-time/markTerm old new delta
LargeBSS/bss:1GB/gomaxprocs:4 148200µs ± 6% 302µs ±52% -99.80% (p=0.008 n=5+5)

This very slightly improves the go1 benchmarks:

name old time/op new time/op delta
BinaryTree17-12 2.62s ± 3% 2.62s ± 4% ~ (p=0.904 n=20+20)
Fannkuch11-12 2.15s ± 1% 2.13s ± 0% -1.29% (p=0.000 n=18+20)
FmtFprintfEmpty-12 48.3ns ± 2% 47.6ns ± 1% -1.52% (p=0.000 n=20+16)
FmtFprintfString-12 152ns ± 0% 152ns ± 1% ~ (p=0.725 n=18+18)
FmtFprintfInt-12 150ns ± 1% 149ns ± 1% -1.14% (p=0.000 n=19+20)
FmtFprintfIntInt-12 250ns ± 0% 244ns ± 1% -2.12% (p=0.000 n=20+18)
FmtFprintfPrefixedInt-12 219ns ± 1% 217ns ± 1% -1.20% (p=0.000 n=19+20)
FmtFprintfFloat-12 280ns ± 0% 281ns ± 1% +0.47% (p=0.000 n=19+19)
FmtManyArgs-12 928ns ± 0% 923ns ± 1% -0.53% (p=0.000 n=19+18)
GobDecode-12 7.21ms ± 1% 7.24ms ± 2% ~ (p=0.091 n=19+19)
GobEncode-12 6.07ms ± 1% 6.05ms ± 1% -0.36% (p=0.002 n=20+17)
Gzip-12 265ms ± 1% 265ms ± 1% ~ (p=0.496 n=20+19)
Gunzip-12 39.6ms ± 1% 39.3ms ± 1% -0.85% (p=0.000 n=19+19)
HTTPClientServer-12 74.0µs ± 2% 73.8µs ± 1% ~ (p=0.569 n=20+19)
JSONEncode-12 15.4ms ± 1% 15.3ms ± 1% -0.25% (p=0.049 n=17+17)
JSONDecode-12 53.7ms ± 2% 53.0ms ± 1% -1.29% (p=0.000 n=18+17)
Mandelbrot200-12 3.97ms ± 1% 3.97ms ± 0% ~ (p=0.072 n=17+18)
GoParse-12 3.35ms ± 2% 3.36ms ± 1% +0.51% (p=0.005 n=18+20)
RegexpMatchEasy0_32-12 72.7ns ± 2% 72.2ns ± 1% -0.70% (p=0.005 n=19+19)
RegexpMatchEasy0_1K-12 246ns ± 1% 245ns ± 0% -0.60% (p=0.000 n=18+16)
RegexpMatchEasy1_32-12 72.8ns ± 1% 72.5ns ± 1% -0.37% (p=0.011 n=18+18)
RegexpMatchEasy1_1K-12 380ns ± 1% 385ns ± 1% +1.34% (p=0.000 n=20+19)
RegexpMatchMedium_32-12 115ns ± 2% 115ns ± 1% +0.44% (p=0.047 n=20+20)
RegexpMatchMedium_1K-12 35.4µs ± 1% 35.5µs ± 1% ~ (p=0.079 n=18+19)
RegexpMatchHard_32-12 1.83µs ± 0% 1.80µs ± 1% -1.76% (p=0.000 n=18+18)
RegexpMatchHard_1K-12 55.1µs ± 0% 54.3µs ± 1% -1.42% (p=0.000 n=18+19)
Revcomp-12 386ms ± 1% 381ms ± 1% -1.14% (p=0.000 n=18+18)
Template-12 61.5ms ± 2% 61.5ms ± 2% ~ (p=0.647 n=19+20)
TimeParse-12 338ns ± 0% 336ns ± 1% -0.72% (p=0.000 n=14+19)
TimeFormat-12 350ns ± 0% 357ns ± 0% +2.05% (p=0.000 n=19+18)
[Geo mean] 55.3µs 55.0µs -0.41%

Change-Id: I57e8720385a1b991aeebd111b6874354308e2a6b
Reviewed-on: https://go-review.googlesource.com/20829
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 26 Apr, 2016 4 commits
-
-
Austin Clements authored
Currently the stack re-scan during mark termination is O(# stacks) because we enqueue a root marking job for every goroutine. It takes ~34ns to process this root marking job for a valid (clean) stack, so at around 300k goroutines we exceed the 10ms pause goal. A non-trivial portion of this time is spent simply taking the cache miss to check the gcscanvalid flag, so simply optimizing the path that handles clean stacks can only improve this so much.

Fix this by keeping an explicit list of goroutines with dirty stacks that need to be rescanned. When a goroutine first transitions to running after a stack scan and marks its stack dirty, it adds itself to this list. We enqueue root marking jobs only for the goroutines in this list, so this improves stack re-scanning asymptotically by completely eliminating time spent on clean goroutines. This reduces mark termination time for 500k idle goroutines from 15ms to 238µs. Overall performance effect is negligible.

name \ 95%ile-time/markTerm old new delta
IdleGs/gs:500000/gomaxprocs:12 15000µs ± 0% 238µs ± 5% -98.41% (p=0.000 n=10+10)

name old time/op new time/op delta
XBenchGarbage-12 2.30ms ± 3% 2.29ms ± 1% -0.43% (p=0.049 n=17+18)

name old time/op new time/op delta
BinaryTree17-12 2.57s ± 3% 2.59s ± 2% ~ (p=0.141 n=19+20)
Fannkuch11-12 2.09s ± 0% 2.10s ± 1% +0.53% (p=0.000 n=19+19)
FmtFprintfEmpty-12 45.3ns ± 3% 45.2ns ± 2% ~ (p=0.845 n=20+20)
FmtFprintfString-12 129ns ± 0% 127ns ± 0% -1.55% (p=0.000 n=16+16)
FmtFprintfInt-12 123ns ± 0% 119ns ± 1% -3.24% (p=0.000 n=19+19)
FmtFprintfIntInt-12 195ns ± 1% 189ns ± 1% -3.11% (p=0.000 n=17+17)
FmtFprintfPrefixedInt-12 193ns ± 1% 187ns ± 1% -3.06% (p=0.000 n=19+19)
FmtFprintfFloat-12 254ns ± 0% 255ns ± 1% +0.35% (p=0.001 n=14+17)
FmtManyArgs-12 781ns ± 0% 770ns ± 0% -1.48% (p=0.000 n=16+19)
GobDecode-12 7.00ms ± 1% 6.98ms ± 1% ~ (p=0.563 n=19+19)
GobEncode-12 5.91ms ± 1% 5.92ms ± 0% ~ (p=0.118 n=19+18)
Gzip-12 219ms ± 1% 215ms ± 1% -1.81% (p=0.000 n=18+18)
Gunzip-12 37.2ms ± 0% 37.4ms ± 0% +0.45% (p=0.000 n=17+19)
HTTPClientServer-12 76.9µs ± 3% 77.5µs ± 2% +0.81% (p=0.030 n=20+19)
JSONEncode-12 15.0ms ± 0% 14.8ms ± 1% -0.88% (p=0.001 n=15+19)
JSONDecode-12 50.6ms ± 0% 53.2ms ± 2% +5.07% (p=0.000 n=17+19)
Mandelbrot200-12 4.05ms ± 0% 4.05ms ± 1% ~ (p=0.581 n=16+17)
GoParse-12 3.34ms ± 1% 3.30ms ± 1% -1.21% (p=0.000 n=15+20)
RegexpMatchEasy0_32-12 69.6ns ± 1% 69.8ns ± 2% ~ (p=0.566 n=19+19)
RegexpMatchEasy0_1K-12 238ns ± 1% 236ns ± 0% -0.91% (p=0.000 n=17+13)
RegexpMatchEasy1_32-12 69.8ns ± 1% 70.0ns ± 1% +0.23% (p=0.026 n=17+16)
RegexpMatchEasy1_1K-12 371ns ± 1% 363ns ± 1% -2.07% (p=0.000 n=19+19)
RegexpMatchMedium_32-12 107ns ± 2% 106ns ± 1% -0.51% (p=0.031 n=18+20)
RegexpMatchMedium_1K-12 33.0µs ± 0% 32.9µs ± 0% -0.30% (p=0.004 n=16+16)
RegexpMatchHard_32-12 1.70µs ± 0% 1.70µs ± 0% +0.45% (p=0.000 n=16+17)
RegexpMatchHard_1K-12 51.1µs ± 2% 51.4µs ± 1% +0.53% (p=0.000 n=17+19)
Revcomp-12 378ms ± 1% 385ms ± 1% +1.92% (p=0.000 n=19+18)
Template-12 64.3ms ± 2% 65.0ms ± 2% +1.09% (p=0.001 n=19+19)
TimeParse-12 315ns ± 1% 317ns ± 2% ~ (p=0.108 n=18+20)
TimeFormat-12 360ns ± 1% 337ns ± 0% -6.30% (p=0.000 n=18+13)
[Geo mean] 51.8µs 51.6µs -0.48%

Change-Id: Icf8994671476840e3998236e15407a505d4c760c
Reviewed-on: https://go-review.googlesource.com/20700
Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
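
The dirty-stack bookkeeping can be sketched as below. These types and names are invented; the real runtime threads g's through an intrusive list, but a mutex-guarded slice is enough to show the idea:

    package gcsketch

    import "sync"

    type goroutineState struct {
        onDirtyList bool
    }

    type dirtyStacks struct {
        mu   sync.Mutex
        list []*goroutineState
    }

    // noteDirty is called when a goroutine first runs after its stack was
    // scanned, making the stack dirty: it adds itself to the rescan list.
    func (d *dirtyStacks) noteDirty(gp *goroutineState) {
        d.mu.Lock()
        if !gp.onDirtyList {
            gp.onDirtyList = true
            d.list = append(d.list, gp)
        }
        d.mu.Unlock()
    }

    // takeRescanList hands back only the goroutines that need re-scanning,
    // so clean goroutines cost nothing at mark termination.
    func (d *dirtyStacks) takeRescanList() []*goroutineState {
        d.mu.Lock()
        defer d.mu.Unlock()
        jobs := d.list
        d.list = nil
        return jobs
    }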
-
Austin Clements authored
Currently we remove stack barriers during STW mark termination, which has a non-trivial per-goroutine cost and means that we have to touch even clean stacks during mark termination. However, there's no problem with leaving them in during the sweep phase. They just have to be out by the time we install new stack barriers immediately prior to scanning the stack such as during the mark phase of the next GC cycle or during mark termination in a STW GC. Hence, move the gcRemoveStackBarriers from STW mark termination to just before we install new stack barriers during concurrent mark. This removes the cost from STW. Furthermore, this combined with concurrent stack shrinking means that the mark termination scan of a clean stack is a complete no-op, which will make it possible to skip clean stacks entirely during mark termination. This has the downside that it will mess up anything outside of Go that tries to walk Go stacks all the time instead of just some of the time. This includes tools like GDB, perf, and VTune. We'll improve the situation shortly. Change-Id: Ia40baad8f8c16aeefac05425e00b0cf478137097 Reviewed-on: https://go-review.googlesource.com/20667Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
Currently we enqueue span root mark jobs during both concurrent mark and mark termination, but we make the job a no-op during mark termination. This is silly. Instead of queueing them up just to not do them, don't queue them up in the first place. Change-Id: Ie1d36de884abfb17dd0db6f0449a2b7c997affab Reviewed-on: https://go-review.googlesource.com/20666 Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
Currently we free cached stacks of dead Gs during STW stack root marking. We do this during STW because there's no way to take ownership of a particular dead G, so attempting to free a dead G's stack during concurrent stack root marking could race with reusing that G. However, we can do this concurrently if we take a completely different approach. One way to prevent reuse of a dead G is to remove it from the free G list. Hence, this adds a new fixed root marking task that simply removes all Gs from the list of dead Gs with cached stacks, frees their stacks, and then adds them to the list of dead Gs without cached stacks. This is also a necessary step toward rescanning only dirty stacks, since it eliminates another task from STW stack marking. Change-Id: Iefbad03078b284a2e7bf30fba397da4ca87fe095 Reviewed-on: https://go-review.googlesource.com/20665Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 21 Apr, 2016 2 commits
-
-
Austin Clements authored
Currently allocating black switches to the system stack (which is probably a historical accident) and atomically updates the global bytes marked stat. Since we're about to depend on this much more, optimize it a bit by putting it back on the regular stack and updating the per-P bytes marked stat, which gets lazily folded into the global bytes marked stat. Change-Id: Ibbe16e5382d3fd2256e4381f88af342bf7020b04 Reviewed-on: https://go-review.googlesource.com/22170Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
Currently we count black allocations toward the scannable heap size, but not toward the scan work we've done so far. This is clearly inconsistent (we have, in effect, scanned these allocations and since they're already black, we're not going to scan them again). Worse, it means we don't count black allocations toward the scannable heap size as of the *next* GC because this is based on the amount of scan work we did in this cycle. Fix this by counting black allocations as scan work. Currently the GC spends very little time in allocate-black mode, so this probably hasn't been a problem, but this will become important when we switch to always allocating black. Change-Id: If6ff693b070c385b65b6ecbbbbf76283a0f9d990 Reviewed-on: https://go-review.googlesource.com/22119Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 11 Apr, 2016 1 commit
-
-
Jeremy Jackins authored
After mdempsky's recent changes, these are the only references to "TheChar" left in the Go tree. Without the context, and without knowing the history, this is confusing. Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS and sys.GOARCH. Also change the heap dump format to include sys.GOARCH rather than TheChar, which is no longer a concept. Updates #15169 (changes heapdump format) Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca Reviewed-on: https://go-review.googlesource.com/21647Reviewed-by:
Matthew Dempsky <mdempsky@google.com>
-
- 16 Mar, 2016 2 commits
-
-
Austin Clements authored
Currently we shrink stacks during STW mark termination because it used to be unsafe to shrink them concurrently. For some programs, this significantly increases pause time: stack shrinking costs ~5ms/MB copied plus 2µs/shrink.

Now that we've made it safe to shrink a stack without the world being stopped, shrink them during the concurrent mark phase. This reduces the STW time in the program from issue #12967 by an order of magnitude and brings it from over the 10ms goal to well under:

name old 95%ile-markTerm-time new 95%ile-markTerm-time delta
Stackshrink-4 23.8ms ±60% 1.80ms ±39% -92.44% (p=0.008 n=5+5)

Fixes #12967.

This slows down the go1 and garbage benchmarks overall by < 0.5%.

name old time/op new time/op delta
XBenchGarbage-12 2.48ms ± 1% 2.49ms ± 1% +0.45% (p=0.005 n=25+21)

name old time/op new time/op delta
BinaryTree17-12 2.93s ± 2% 2.97s ± 2% +1.34% (p=0.002 n=19+20)
Fannkuch11-12 2.51s ± 1% 2.59s ± 0% +3.09% (p=0.000 n=18+18)
FmtFprintfEmpty-12 51.1ns ± 2% 51.5ns ± 1% ~ (p=0.280 n=20+17)
FmtFprintfString-12 175ns ± 1% 169ns ± 1% -3.01% (p=0.000 n=20+20)
FmtFprintfInt-12 160ns ± 1% 160ns ± 0% +0.53% (p=0.000 n=20+20)
FmtFprintfIntInt-12 265ns ± 0% 266ns ± 1% +0.59% (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12 237ns ± 1% 238ns ± 1% +0.44% (p=0.000 n=20+20)
FmtFprintfFloat-12 326ns ± 1% 341ns ± 1% +4.55% (p=0.000 n=20+19)
FmtManyArgs-12 1.01µs ± 0% 1.02µs ± 0% +0.43% (p=0.000 n=20+19)
GobDecode-12 8.41ms ± 1% 8.30ms ± 2% -1.22% (p=0.000 n=20+19)
GobEncode-12 6.66ms ± 1% 6.68ms ± 0% +0.30% (p=0.000 n=18+19)
Gzip-12 322ms ± 1% 322ms ± 1% ~ (p=1.000 n=20+20)
Gunzip-12 42.8ms ± 0% 42.9ms ± 0% ~ (p=0.174 n=20+20)
HTTPClientServer-12 69.7µs ± 1% 70.6µs ± 1% +1.20% (p=0.000 n=20+20)
JSONEncode-12 16.8ms ± 0% 16.8ms ± 1% ~ (p=0.154 n=19+19)
JSONDecode-12 65.1ms ± 0% 65.3ms ± 1% +0.34% (p=0.003 n=20+20)
Mandelbrot200-12 3.93ms ± 0% 3.92ms ± 0% ~ (p=0.396 n=19+20)
GoParse-12 3.66ms ± 1% 3.65ms ± 1% ~ (p=0.117 n=16+18)
RegexpMatchEasy0_32-12 85.0ns ± 2% 85.5ns ± 2% ~ (p=0.143 n=20+20)
RegexpMatchEasy0_1K-12 267ns ± 1% 267ns ± 1% ~ (p=0.867 n=20+17)
RegexpMatchEasy1_32-12 83.3ns ± 2% 83.8ns ± 1% ~ (p=0.068 n=20+20)
RegexpMatchEasy1_1K-12 432ns ± 1% 432ns ± 1% ~ (p=0.804 n=20+19)
RegexpMatchMedium_32-12 133ns ± 0% 133ns ± 0% ~ (p=1.000 n=20+20)
RegexpMatchMedium_1K-12 40.3µs ± 1% 40.4µs ± 1% ~ (p=0.319 n=20+19)
RegexpMatchHard_32-12 2.10µs ± 1% 2.10µs ± 1% ~ (p=0.723 n=20+18)
RegexpMatchHard_1K-12 63.0µs ± 0% 63.0µs ± 0% ~ (p=0.158 n=19+17)
Revcomp-12 461ms ± 1% 476ms ± 8% +3.29% (p=0.002 n=20+20)
Template-12 80.1ms ± 1% 79.3ms ± 1% -1.00% (p=0.000 n=20+20)
TimeParse-12 360ns ± 0% 360ns ± 0% ~ (p=0.802 n=18+19)
TimeFormat-12 374ns ± 1% 372ns ± 0% -0.77% (p=0.000 n=20+19)
[Geo mean] 61.8µs 62.0µs +0.40%

Change-Id: Ib60cd46b7a4987e07670eb271d22f6cee5802842
Reviewed-on: https://go-review.googlesource.com/20044
Reviewed-by:
Keith Randall <khr@golang.org>
-
Austin Clements authored
We're about to add another root marking job that needs to happen only during the first markroot pass (whether that's concurrent or STW), just like finalizer scanning. Rather than introducing another flag that has the same value as finalizersDone, just rename finalizersDone to markrootDone. Change-Id: I535356c6ea1f3734cb5b6add264cb7bf48de95e8 Reviewed-on: https://go-review.googlesource.com/20043Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 07 Mar, 2016 1 commit
-
-
Matthew Dempsky authored
Automated refactoring produced using github.com/mdempsky/unconvert. Change-Id: Iacf871a4f221ef17f48999a464ab2858b2bbaa90 Reviewed-on: https://go-review.googlesource.com/20071 Reviewed-by:
Austin Clements <austin@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 02 Mar, 2016 1 commit
-
-
Brad Fitzpatrick authored
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022Reviewed-by:
Rob Pike <r@golang.org> Reviewed-by:
Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 25 Feb, 2016 1 commit
-
-
Austin Clements authored
Currently markroot uses a gcWork on the stack and disposes of it immediately after marking one root. This used to be necessary because markroot was called from the depths of parfor, but now that we call it directly and have ready access to a gcWork at the call site, pass the gcWork in, use it directly in markroot, and share it across calls to markroot from the same P. Change-Id: Id7c3b811bfb944153760e01873c07c8d18909be1 Reviewed-on: https://go-review.googlesource.com/19635Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>
-
- 16 Jan, 2016 1 commit
-
-
Austin Clements authored
GC assists check gcBlackenEnabled under the assist queue lock to avoid going to sleep after gcWakeAllAssists has already woken all assists. However, currently we clear gcBlackenEnabled shortly *after* waking all assists, which opens a window where this exact race can happen. Fix this by clearing gcBlackenEnabled before waking blocked assists. However, it's unlikely this actually matters because the world is stopped between waking assists and clearing gcBlackenEnabled and there aren't any obvious allocations during this window, so I don't think an assist could actually slip in to this race window. Updates #13645. Change-Id: I7571f059530481dc781d8fd96a1a40aadebecb0d Reviewed-on: https://go-review.googlesource.com/18682 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 12 Jan, 2016 2 commits
-
-
Austin Clements authored
It would certainly be a mistake to invoke a write barrier while greying an object. Change-Id: I34445a15ab09655ea8a3628a507df56aea61e618 Reviewed-on: https://go-review.googlesource.com/18533 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by:
Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
It used to be the case that repeatedly getting one GC pointer and enqueuing one GC pointer could cause contention on the work buffers as each operation passed over the boundary of a work buffer. As of b6c0934a, we use a two buffer cache that prevents this sort of contention. Change-Id: I4f1111623f76df9c5493dd9124dec1e0bfaf53b7 Reviewed-on: https://go-review.googlesource.com/18532 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Rick Hudson <rlh@golang.org>
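
The two-buffer cache it refers to can be sketched as follows, with invented names (the runtime keeps two buffers, often called wbuf1 and wbuf2, inside gcWork). When the primary buffer is full on put, or empty on get, the two buffers are swapped first; only if both are full (or both empty) would the shared global lists be touched, so alternating put/get right at a buffer boundary no longer ping-pongs buffers through the global lists:

    package gcsketch

    type twoBufCache struct {
        wbuf1, wbuf2 []uintptr
    }

    func (c *twoBufCache) put(p uintptr) {
        if len(c.wbuf1) == cap(c.wbuf1) {
            c.wbuf1, c.wbuf2 = c.wbuf2, c.wbuf1
            // If both buffers are full, this is where a full buffer would be
            // handed to the global list; in this sketch append simply grows.
        }
        c.wbuf1 = append(c.wbuf1, p)
    }

    func (c *twoBufCache) tryGet() (uintptr, bool) {
        if len(c.wbuf1) == 0 {
            c.wbuf1, c.wbuf2 = c.wbuf2, c.wbuf1
            if len(c.wbuf1) == 0 {
                return 0, false // both empty: would fetch from the global list
            }
        }
        p := c.wbuf1[len(c.wbuf1)-1]
        c.wbuf1 = c.wbuf1[:len(c.wbuf1)-1]
        return p, true
    }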
-
- 11 Jan, 2016 1 commit
-
-
Rick Hudson authored
Currently, due to an oversight, we only balance work buffers in background and idle workers and not in assists. As a result, in assist-heavy workloads, assists are likely to tie up large work buffers in per-P caches, increasing the likelihood that the global list will be empty. This increases the likelihood that other GC workers will exit and assists will block, slowing down the system as a whole. Fix this by eagerly balancing work buffers as soon as the assists notice that the global buffers are empty. This makes it much more likely that work will be immediately available to other workers and assists. This change reduces the garbage benchmark time by 39% and fixes the regression seen at CL 15893 (golang.org/cl/15893).

Garbage benchmark times before and after this CL:
Before GOPERF-METRIC:time=4427020
After GOPERF-METRIC:time=2721645

Fixes #13827

Change-Id: I9cb531fb873bab4b69ce9c1617e30df6c49cdcfe
Reviewed-on: https://go-review.googlesource.com/18341
Reviewed-by:
Austin Clements <austin@google.com>
-
- 24 Nov, 2015 1 commit
-
-
Austin Clements authored
Write barriers in gcFlushBgCredit lead to very subtle bugs because it executes after the getfull barrier. I tracked some bugs of this form down before go:nowritebarrierrec was implemented. Ensure that they don't reappear by making gcFlushBgCredit go:nowritebarrierrec. Change-Id: Ia5ca2dc59e6268bce8d8b4c87055bd0f6e19bed2 Reviewed-on: https://go-review.googlesource.com/17052 Reviewed-by:
Russ Cox <rsc@golang.org>
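
For reference, the directive looks like this in runtime source. It is only permitted inside package runtime, so the snippet below is a shape rather than a standalone program, and the body is a placeholder, not the real implementation:

    //go:nowritebarrierrec
    func gcFlushBgCredit(scanWork int64) {
        // With the directive above, the compiler reports an error if this
        // function, or anything it calls (recursively), contains a write
        // that would require a write barrier.
    }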
-
- 19 Nov, 2015 1 commit
-
-
Austin Clements authored
A sigprof during stack barrier insertion or removal can crash if it detects an inconsistency between the stkbar array and the stack itself. Currently we protect against this when scanning another G's stack using stackLock, but we don't protect against it when unwinding stack barriers for a recover or a memmove to the stack. This commit cleans up and improves the stack locking code. It abstracts out the lock and unlock operations. It uses the lock consistently everywhere we perform stack operations, and pushes the lock/unlock down closer to where the stack barrier operations happen to make it more obvious what it's protecting. Finally, it modifies sigprof so that instead of spinning until it acquires the lock, it simply doesn't perform a traceback if it can't acquire it. This is necessary to prevent self-deadlock. Updates #11863, which introduced stackLock to fix some of these issues, but didn't go far enough. Updates #12528. Change-Id: I9d1fa88ae3744d31ba91500c96c6988ce1a3a349 Reviewed-on: https://go-review.googlesource.com/17036Reviewed-by:
Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 16 Nov, 2015 1 commit
-
-
Ian Lance Taylor authored
If you set GODEBUG=cgocheck=2 the runtime package will use the write barrier to detect cases where a Go program writes a Go pointer into non-Go memory. In conjunction with the existing cgo checks, and the not-yet-implemented cgo check for exported functions, this should reliably detect all cases (that do not import the unsafe package) in which a Go pointer is incorrectly shared with C code. This check is optional because it turns on the write barrier at all times, which is known to be expensive. Update #12416. Change-Id: I549d8b2956daa76eac853928e9280e615d6365f4 Reviewed-on: https://go-review.googlesource.com/16899Reviewed-by:
Russ Cox <rsc@golang.org>
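
A small cgo program of the kind this check is meant to catch: it stores a Go pointer into C-allocated memory. Running it with GODEBUG=cgocheck=2 (in Go versions around this change) should make the write barrier report the bad store; the exact diagnostic text varies by version:

    package main

    // #include <stdlib.h>
    import "C"

    import "unsafe"

    func main() {
        x := new(int)
        // A pointer-sized slot in C (non-Go) memory.
        slot := (*unsafe.Pointer)(C.malloc(C.size_t(unsafe.Sizeof(uintptr(0)))))
        *slot = unsafe.Pointer(x) // Go pointer written into non-Go memory
        C.free(unsafe.Pointer(slot))
    }

Run with: GODEBUG=cgocheck=2 go run main.go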
-
- 12 Nov, 2015 1 commit
-
-
Michael Matloob authored
runtime/internal/sys will hold system-, architecture- and config-specific constants. Updates #11647 Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54 Reviewed-on: https://go-review.googlesource.com/16817 Reviewed-by:
Russ Cox <rsc@golang.org>
-
- 11 Nov, 2015 2 commits
-
-
Austin Clements authored
The GC now handles the root marking jobs as part of general marking, so work.markfor is no longer used. Change-Id: I6c3b23fed27e4e7ea6430d6ca7ba25ae4d04ed14 Reviewed-on: https://go-review.googlesource.com/16811 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Brad Fitzpatrick <bradfitz@golang.org>
-
Austin Clements authored
This changes "mark worker (idle)" to "GC worker (idle)" so it's more clear to users that these goroutines are GC-related. It changes "GC assist" to "GC assist wait" to make it clear that the assist is blocked. Change-Id: Iafbc0903c84f9250ff6bee14baac6fcd4ed5ef76 Reviewed-on: https://go-review.googlesource.com/16511Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 10 Nov, 2015 1 commit
-
-
Michael Matloob authored
This change breaks out most of the atomics functions in the runtime into package runtime/internal/atomic. It adds some basic support in the toolchain for runtime packages, and also modifies linux/arm atomics to remove the dependency on the runtime's mutex. The mutexes have been replaced with spinlocks. all trybots are happy! In addition to the trybots, I've tested on the darwin/arm64 builder, on the darwin/arm builder, and on a ppc64le machine. Change-Id: I6698c8e3cf3834f55ce5824059f44d00dc8e3c2f Reviewed-on: https://go-review.googlesource.com/14204 Run-TryBot: Michael Matloob <matloob@golang.org> Reviewed-by:
Russ Cox <rsc@golang.org>
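
The mutex-to-spinlock swap mentioned for the linux/arm atomics can be pictured with the public sync/atomic package. This is only an approximation: the runtime uses its internal atomic package and does not call Gosched while spinning.

    package spin

    import (
        "runtime"
        "sync/atomic"
    )

    type spinlock struct{ state uint32 }

    func (l *spinlock) lock() {
        for !atomic.CompareAndSwapUint32(&l.state, 0, 1) {
            runtime.Gosched() // yield instead of burning CPU while waiting
        }
    }

    func (l *spinlock) unlock() {
        atomic.StoreUint32(&l.state, 0)
    }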
-
- 05 Nov, 2015 2 commits
-
-
Austin Clements authored
This moves all of the mark 1 to mark 2 transition and mark termination to the mark done transition function. This means these transitions are now handled on the goroutine that detected mark completion. This also means that the GC coordinator and the background completion barriers are no longer used and various workarounds to yield to the coordinator are no longer necessary. These will be removed in follow-up commits. One consequence of this is that mark workers now need to be preemptible when performing the mark done transition. This allows them to stop the world and to perform the final clean-up steps of GC after restarting the world. They are only made preemptible while performing this transition, so if the worker that findRunnableGCWorker would schedule isn't available, we didn't want to schedule it anyway. Fixes #11970. Change-Id: I9203a2d6287eeff62d589ec02ad9cb1e29ddb837 Reviewed-on: https://go-review.googlesource.com/16391 Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
Currently the code for completion of mark 1/mark 2 is duplicated in background workers and assists. Factor this in to a single function that will serve as the transition function for concurrent mark. Change-Id: I4d9f697a15da0d349db3b34d56f3a220dd41d41b Reviewed-on: https://go-review.googlesource.com/16359Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 04 Nov, 2015 1 commit
-
-
Austin Clements authored
Currently dedicated mark workers participate in the getfull barrier during concurrent mark. However, the getfull barrier wasn't designed for concurrent work and this causes no end of headaches. In the concurrent setting, participants come and go. This makes mark completion susceptible to live-lock: since dedicated workers are only periodically polling for completion, it's possible for the program to be in some transient worker each time one of the dedicated workers wakes up to check if it can exit the getfull barrier. It also complicates reasoning about the system because dedicated workers participate directly in the getfull barrier, but transient workers must instead use trygetfull because they have exit conditions that aren't captured by getfull (e.g., fractional workers exit when preempted). The complexity of implementing these exit conditions contributed to #11677. Furthermore, the getfull barrier is inefficient because we could be running user code instead of spinning on a P. In effect, we're dedicating 25% of the CPU to marking even if that means we have to spin to make that 25%. It also causes issues on Windows because we can't actually sleep for 100µs (#8687).

Fix this by making dedicated workers no longer participate in the getfull barrier. Instead, dedicated workers simply return to the scheduler when they fail to get more work, regardless of what other workers are doing, and the scheduler only starts new dedicated workers if there's work available. Everything that needs to be handled by this barrier is already handled by detection of mark completion. This makes the system much more symmetric because all workers and assists now use trygetfull during concurrent mark. It also loosens the 25% CPU target so that we can give some of that 25% back to user code if there isn't enough work to keep the mark worker busy. And it eliminates the problematic 100µs sleep on Windows during concurrent mark (though not during mark termination).

The downside of this is that if we hit a bottleneck in the heap graph that then expands back out, the system may shut down dedicated workers and take a while to start them back up. We'll address this in the next commit.

Updates #12041 and #8687.

No effect on the go1 benchmarks. This slows down the garbage benchmark by 9%, but we'll more than make it up in the next commit.

name old time/op new time/op delta
XBenchGarbage-12 5.80ms ± 2% 6.32ms ± 4% +9.03% (p=0.000 n=20+20)

Change-Id: I65100a9ba005a8b5cf97940798918672ea9dd09b
Reviewed-on: https://go-review.googlesource.com/16297
Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 03 Nov, 2015 2 commits
-
-
Austin Clements authored
Currently, assists are non-preemptible, which means a heavily assisting G can block other Gs from running. At the beginning of a GC cycle, it can also delay scang, which will spin until the assist is done. Since scanning is currently done sequentially, this can seriously extend the length of the scan phase. Fix this by making assists preemptible. Since the assist holds work buffers and runs on the system stack, this must be done cooperatively: we make gcDrainN return on preemption, and make the assist return from the system stack and voluntarily Gosched. This is prerequisite to enlarging the work buffers. Without this change, the delays and spinning in scang increase significantly. This has no effect on the go1 benchmarks. name old time/op new time/op delta XBenchGarbage-12 5.72ms ± 4% 5.37ms ± 5% -6.11% (p=0.000 n=20+20) Change-Id: I829e732a0f23b126da633516a1a9ec1a508fdbf1 Reviewed-on: https://go-review.googlesource.com/15894Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
GC assists must block until the assist can be satisfied (either through stealing credit or doing work) or the GC cycle ends. Currently, this is implemented as a retry loop with a 100 µs delay. This obviously isn't ideal, as it wastes CPU and delays mutator execution. It also has the somewhat peculiar downside that sleeping a G requires allocation, and this requires working around recursive allocation. Replace this timed delay with a proper scheduling queue. When an assist can't be satisfied immediately, it adds the allocating G to a queue and parks it. Any time background scan credit is flushed, it consults this queue, directly satisfies the debt of queued assists, and wakes up satisfied assists before flushing any remaining credit to the background credit pool. No effect on the go1 benchmarks. Slightly speeds up the garbage benchmark. name old time/op new time/op delta XBenchGarbage-12 5.81ms ± 1% 5.72ms ± 4% -1.65% (p=0.011 n=20+20) Updates #12041. Change-Id: I8ee3b6274dd097b12b10a8030796a958a4b0e7b7 Reviewed-on: https://go-review.googlesource.com/15890Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
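
The assist queue it describes can be sketched as below. The names are invented and a channel per assist stands in for the park/ready machinery; the real runtime parks goroutines on an intrusive list and wakes them through the scheduler:

    package gcsketch

    import "sync"

    type assist struct {
        debt int64
        done chan struct{}
    }

    type assistQueue struct {
        mu sync.Mutex
        q  []*assist
    }

    // park is called when an assist cannot satisfy its debt immediately:
    // it queues itself and blocks until background credit pays it off.
    func (aq *assistQueue) park(a *assist) {
        aq.mu.Lock()
        aq.q = append(aq.q, a)
        aq.mu.Unlock()
        <-a.done
    }

    // flushCredit is called when background scan credit arrives: queued
    // assists are satisfied first, and only leftover credit goes on to the
    // global background credit pool.
    func (aq *assistQueue) flushCredit(credit int64) (leftover int64) {
        aq.mu.Lock()
        defer aq.mu.Unlock()
        for len(aq.q) > 0 && credit > 0 {
            a := aq.q[0]
            pay := credit
            if a.debt < pay {
                pay = a.debt
            }
            a.debt -= pay
            credit -= pay
            if a.debt == 0 {
                aq.q = aq.q[1:]
                close(a.done) // wake the satisfied assist
            }
        }
        return credit
    }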
-
- 30 Oct, 2015 2 commits
-
-
Austin Clements authored
Currently the concurrent root scan is performed in its entirety by the GC coordinator before entering concurrent mark (which enables GC workers). This scan is done sequentially, which can prolong the scan phase, delay the mark phase, and means that the scan phase does not obey the 25% CPU goal. Furthermore, there's no need to complete the root scan before starting marking (in fact, we already allow GC assists to happen during the scan phase), so this acts as an unnecessary barrier between root scanning and marking. This change shifts the root scan work out of the GC coordinator and in to the GC workers. The coordinator simply sets up the scan state and enqueues the right number of root scan jobs. The GC workers then drain the root scan jobs prior to draining heap scan jobs. This parallelizes the root scan process, makes it obey the 25% CPU goal, and effectively eliminates root scanning as an isolated phase, allowing the system to smoothly transition from root scanning to heap marking. This also eliminates a major non-STW responsibility of the GC coordinator, which will make it easier to switch to a decentralized state machine. Finally, it puts us in a good position to perform root scanning in assists as well, which will help satisfy assists at the beginning of the GC cycle. This is mostly straightforward. One tricky aspect is that we have to deal with preemption deadlock: where two non-preemptible gorountines are trying to preempt each other to perform a stack scan. Given the context where this happens, the only instance of this is two background workers trying to scan each other. We avoid this by simply not scanning the stacks of background workers during the concurrent phase; this is safe because we'll scan them during mark termination (and their stacks are *very* small and should not contain any new pointers). This change also switches the root marking during mark termination to use the same gcDrain-based code path as concurrent mark. This shouldn't affect performance because STW root marking was already parallel and tasks switched to heap marking immediately when no more root marking tasks were available. However, it simplifies the code and unifies these code paths. This has negligible effect on the go1 benchmarks. It slightly slows down the garbage benchmark, possibly by making GC run slightly more frequently. 
name old time/op new time/op delta
XBenchGarbage-12 5.10ms ± 1% 5.24ms ± 1% +2.87% (p=0.000 n=18+18)

name old time/op new time/op delta
BinaryTree17-12 3.25s ± 3% 3.20s ± 5% -1.57% (p=0.013 n=20+20)
Fannkuch11-12 2.45s ± 1% 2.46s ± 1% +0.38% (p=0.019 n=20+18)
FmtFprintfEmpty-12 49.7ns ± 3% 49.9ns ± 4% ~ (p=0.851 n=19+20)
FmtFprintfString-12 170ns ± 2% 170ns ± 1% ~ (p=0.775 n=20+19)
FmtFprintfInt-12 161ns ± 1% 160ns ± 1% -0.78% (p=0.000 n=19+18)
FmtFprintfIntInt-12 267ns ± 1% 270ns ± 1% +1.04% (p=0.000 n=19+19)
FmtFprintfPrefixedInt-12 238ns ± 2% 238ns ± 1% ~ (p=0.133 n=18+19)
FmtFprintfFloat-12 311ns ± 1% 310ns ± 2% -0.35% (p=0.023 n=20+19)
FmtManyArgs-12 1.08µs ± 1% 1.06µs ± 1% -2.31% (p=0.000 n=20+20)
GobDecode-12 8.65ms ± 1% 8.63ms ± 1% ~ (p=0.377 n=18+20)
GobEncode-12 6.49ms ± 1% 6.52ms ± 1% +0.37% (p=0.015 n=20+20)
Gzip-12 319ms ± 3% 318ms ± 1% ~ (p=0.975 n=19+17)
Gunzip-12 41.9ms ± 1% 42.1ms ± 2% +0.65% (p=0.004 n=19+20)
HTTPClientServer-12 61.7µs ± 1% 62.6µs ± 1% +1.40% (p=0.000 n=18+20)
JSONEncode-12 16.8ms ± 1% 16.9ms ± 1% ~ (p=0.239 n=20+18)
JSONDecode-12 58.4ms ± 1% 60.7ms ± 1% +3.85% (p=0.000 n=19+20)
Mandelbrot200-12 3.86ms ± 0% 3.86ms ± 1% ~ (p=0.092 n=18+19)
GoParse-12 3.75ms ± 2% 3.75ms ± 2% ~ (p=0.708 n=19+20)
RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 2% +0.60% (p=0.010 n=17+20)
RegexpMatchEasy0_1K-12 341ns ± 1% 342ns ± 2% ~ (p=0.203 n=20+19)
RegexpMatchEasy1_32-12 82.5ns ± 2% 83.2ns ± 2% +0.83% (p=0.007 n=19+19)
RegexpMatchEasy1_1K-12 495ns ± 1% 495ns ± 2% ~ (p=0.970 n=19+18)
RegexpMatchMedium_32-12 130ns ± 2% 130ns ± 2% +0.59% (p=0.039 n=19+20)
RegexpMatchMedium_1K-12 39.2µs ± 1% 39.3µs ± 1% ~ (p=0.214 n=18+18)
RegexpMatchHard_32-12 2.03µs ± 2% 2.02µs ± 1% ~ (p=0.166 n=18+19)
RegexpMatchHard_1K-12 61.0µs ± 1% 60.9µs ± 1% ~ (p=0.169 n=20+18)
Revcomp-12 533ms ± 1% 535ms ± 1% ~ (p=0.071 n=19+17)
Template-12 68.1ms ± 2% 73.0ms ± 1% +7.26% (p=0.000 n=19+20)
TimeParse-12 355ns ± 2% 356ns ± 2% ~ (p=0.530 n=19+20)
TimeFormat-12 357ns ± 2% 347ns ± 1% -2.59% (p=0.000 n=20+19)
[Geo mean] 62.1µs 62.3µs +0.31%

name old speed new speed delta
GobDecode-12 88.7MB/s ± 1% 88.9MB/s ± 1% ~ (p=0.377 n=18+20)
GobEncode-12 118MB/s ± 1% 118MB/s ± 1% -0.37% (p=0.015 n=20+20)
Gzip-12 60.9MB/s ± 3% 60.9MB/s ± 1% ~ (p=0.944 n=19+17)
Gunzip-12 464MB/s ± 1% 461MB/s ± 2% -0.64% (p=0.004 n=19+20)
JSONEncode-12 115MB/s ± 1% 115MB/s ± 1% ~ (p=0.236 n=20+18)
JSONDecode-12 33.2MB/s ± 1% 32.0MB/s ± 1% -3.71% (p=0.000 n=19+20)
GoParse-12 15.5MB/s ± 2% 15.5MB/s ± 2% ~ (p=0.702 n=19+20)
RegexpMatchEasy0_32-12 320MB/s ± 1% 318MB/s ± 2% ~ (p=0.094 n=18+20)
RegexpMatchEasy0_1K-12 3.00GB/s ± 1% 2.99GB/s ± 1% ~ (p=0.194 n=20+19)
RegexpMatchEasy1_32-12 388MB/s ± 2% 385MB/s ± 2% -0.83% (p=0.008 n=19+19)
RegexpMatchEasy1_1K-12 2.07GB/s ± 1% 2.07GB/s ± 1% ~ (p=0.964 n=19+18)
RegexpMatchMedium_32-12 7.68MB/s ± 1% 7.64MB/s ± 2% -0.57% (p=0.020 n=19+20)
RegexpMatchMedium_1K-12 26.1MB/s ± 1% 26.1MB/s ± 1% ~ (p=0.211 n=18+18)
RegexpMatchHard_32-12 15.8MB/s ± 1% 15.8MB/s ± 1% ~ (p=0.180 n=18+19)
RegexpMatchHard_1K-12 16.8MB/s ± 1% 16.8MB/s ± 2% ~ (p=0.236 n=20+19)
Revcomp-12 477MB/s ± 1% 475MB/s ± 1% ~ (p=0.071 n=19+17)
Template-12 28.5MB/s ± 2% 26.6MB/s ± 1% -6.77% (p=0.000 n=19+20)
[Geo mean] 100MB/s 99.0MB/s -0.82%

Change-Id: I875bf6ceb306d1ee2f470cabf88aa6ede27c47a0
Reviewed-on: https://go-review.googlesource.com/16059
Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
We already have gcMarkWorkAvailable, but the check for GC mark work is open-coded in several places. Generalize gcMarkWorkAvailable slightly and replace these open-coded checks with calls to gcMarkWorkAvailable. In addition to cleaning up the code, this puts us in a better position to make this check slightly more complicated. Change-Id: I1b29883300ecd82a1bf6be193e9b4ee96582a860 Reviewed-on: https://go-review.googlesource.com/16058Reviewed-by:
Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-