An error occurred fetching the project authors.
- 04 Nov, 2019 1 commit
-
-
Michael Anthony Knyszek authored
This change renames the "round" function to the more appropriately named "alignUp" which rounds an integer up to the next multiple of a power of two. This change also adds the alignDown function, which is almost like alignUp but rounds down to the previous multiple of a power of two. With these two functions, we also go and replace manual rounding code with it where we can. Change-Id: Ie1487366280484dcb2662972b01b4f7135f72fec Reviewed-on: https://go-review.googlesource.com/c/go/+/190618Reviewed-by:
Austin Clements <austin@google.com> Reviewed-by:
Keith Randall <khr@golang.org>
-
- 10 Oct, 2019 1 commit
-
-
Brad Fitzpatrick authored
Part 1: CL 199499 (GOOS nacl) Part 2: CL 200077 (amd64p32 files, toolchain) Part 3: stuff that arguably should've been part of Part 2, but I forgot one of my grep patterns when splitting the original CL up into two parts. This one might also have interesting stuff to resurrect for any future x32 ABI support. Updates #30439 Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c Reviewed-on: https://go-review.googlesource.com/c/go/+/200318 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Ian Lance Taylor <iant@golang.org>
-
- 04 Sep, 2019 1 commit
-
-
Michael Anthony Knyszek authored
This change makes it so that worldsema isn't held across the mark phase. This means that various operations like ReadMemStats may now stop the world during the mark phase, reducing latency on such operations. Only three such operations are still no longer allowed to occur during marking: GOMAXPROCS, StartTrace, and StopTrace. For the former it's because any change to GOMAXPROCS impacts GC mark background worker scheduling and the details there are tricky. For the latter two it's because tracing needs to observe consistent GC start and GC end events, and if StartTrace or StopTrace may stop the world during marking, then it's possible for it to see a GC end event without a start or GC start event without an end, respectively. To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until marking is complete, the runtime now holds a new semaphore, gcsema, across the mark phase just like it used to with worldsema. Fixes #19812. Change-Id: I15d43ed184f711b3d104e8f267fb86e335f86bf9 Reviewed-on: https://go-review.googlesource.com/c/go/+/182657 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Keith Randall <khr@golang.org> Reviewed-by:
Cherry Zhang <cherryyz@google.com>
-
- 22 Aug, 2018 2 commits
-
-
Iskander Sharipov authored
Replaces legacy Go syntax for pointer struct member access with more modern auto-deref alternative. Found using https://go-critic.github.io/overview#underef-ref Change-Id: I71a3c424126c4ff5d89f9e4bacb6cc01c6fa2ddf Reviewed-on: https://go-review.googlesource.com/122895 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Brad Fitzpatrick <bradfitz@golang.org>
-
Iskander Sharipov authored
Replace `x[:]` where x is a slice with just `x`. Found using https://go-critic.github.io/overview.html#unslice-ref Change-Id: Ib0ee16e1d49b2a875b6b92a770049acc33208362 Reviewed-on: https://go-review.googlesource.com/123375 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Brad Fitzpatrick <bradfitz@golang.org>
-
- 03 May, 2018 1 commit
-
-
Josh Bleecher Snyder authored
Every time I poke at #14921, the g.waitreason string pointer writes show up. They're not particularly important performance-wise, but it'd be nice to clear the noise away. And it does open up a few extra bytes in the g struct for some future use. This is a re-roll of CL 99078, which was rolled back because of failures on s390x. Those failures were apparently due to an old version of gdb. Change-Id: Icc2c12f449b2934063fd61e272e06237625ed589 Reviewed-on: https://go-review.googlesource.com/111256 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Michael Munday <mike.munday@ibm.com>
-
- 24 Apr, 2018 1 commit
-
-
Hana Kim authored
"Span" is a commonly used term in many distributed tracing systems (Dapper, OpenCensus, OpenTracing, ...). They use it to refer to a period of time, not necessarily tied into execution of underlying processor, thread, or goroutine, unlike the "Span" of runtime/trace package. Since distributed tracing and go runtime execution tracing are already similar enough to cause confusion, this CL attempts to avoid using the same word if possible. "Region" is being used in a certain tracing system to refer to a code region which is pretty close to what runtime/trace.Span currently refers to. So, replace that. https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions This CL also tweaks APIs a bit based on jbd and heschi's comments: NewContext -> NewTask and it now returns a Task object that exports End method. StartSpan -> StartRegion and it now returns a Region object that exports End method. Also, changed WithSpan to WithRegion and it now takes func() with no context. Another thought is to get rid of WithRegion. It is a nice concept but in practice, it seems problematic (a lot of code churn, and polluting stack trace). Already, the tracing concept is very low level, and we hope this API to be used with great care. Recommended usage will be defer trace.StartRegion(ctx, "someRegion").End() Left old APIs untouched in this CL. Once the usage of them are cleaned up, they will be removed in a separate CL. Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a Reviewed-on: https://go-review.googlesource.com/108296 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
JBD <jbd@google.com>
-
- 13 Mar, 2018 1 commit
-
-
Josh Bleecher Snyder authored
This reverts commit 4eea887f. Reason for revert: broke s390x build Change-Id: Id6c2b6a7319273c4d21f613d4cdd38b00d49f847 Reviewed-on: https://go-review.googlesource.com/100375Reviewed-by:
Josh Bleecher Snyder <josharian@gmail.com>
-
- 12 Mar, 2018 1 commit
-
-
Josh Bleecher Snyder authored
Every time I poke at #14921, the g.waitreason string pointer writes show up. They're not particularly important performance-wise, but it'd be nice to clear the noise away. And it does open up a few extra bytes in the g struct for some future use. Change-Id: I7ffbd52fbc2a286931a2218038fda52ed6473cc9 Reviewed-on: https://go-review.googlesource.com/99078 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com>
-
- 15 Feb, 2018 2 commits
-
-
Hana Kim authored
This implements the annotation API proposed in golang.org/cl/63274. traceString is updated to protect the string map with trace.stringsLock because the assumption that traceString is called by a single goroutine (either at the beginning of tracing and at the end of tracing when dumping all the symbols and function names) is no longer true. traceString is used by the annotation apis (NewContext, StartSpan, Log) to register frequently appearing strings (task and span names, and log keys) after this change. NewContext -> one or two records (EvString, EvUserTaskCreate) end function -> one record (EvUserTaskEnd) StartSpan -> one or two records (EvString, EvUserSpan) span end function -> one or two records (EvString, EvUserSpan) Log -> one or two records (EvString, EvUserLog) EvUserLog record is of the typical record format written by traceEvent except that it is followed by bytes that represents the value string. In addition to runtime/trace change, this change includes corresponding changes in internal/trace to parse the new record types. Future work to improve efficiency: More efficient unique task id generation instead of atomic. (per-P counter). Instead of a centralized trace.stringsLock, consider using per-P string cache or something more efficient. R=go1.11 Change-Id: Iec9276c6c51e5be441ccd52dec270f1e3b153970 Reviewed-on: https://go-review.googlesource.com/71690Reviewed-by:
Austin Clements <austin@google.com>
-
Hana Kim authored
This CL presents the proposed user annotation API skeleton. This CL bumps up the trace version to 1.11. Design doc https://goo.gl/iqJfJ3 Implementation CLs are followed. The API introduces three basic building blocks. Log, Span, and Task. Log is for basic logging. When called, the message will be recorded to the trace along with timestamp, goroutine id, and stack info. trace.Log(ctx, messageType message) Span can be thought as an extension of log to record interesting time interval during a goroutine's execution. A span is local to a goroutine by definition. trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context) { /* do something very expensive */ }) Task is higher-level concept that aids tracing of complex operations that encompass multiple goroutines or are asynchronous. For example, an RPC request, a HTTP request, a file write, or a batch job can be traced with a Task. Note we chose to design the API around context.Context so it allows easier integration with other tracing tools, often designed around context.Context as well. Log and WithSpan APIs recognize the task information embedded in the context and record it in the trace as well. That allows the Go execution tracer to associate and group the spans and log messages based on the task information. In order to create a Task, ctx, end := trace.NewContext(ctx, "myTask") defer end() The Go execution tracer measures the time between the task created and the task ended for the task latency. More discussion history in golang.org/cl/59572. Update #16619 R=go1.11 Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd Reviewed-on: https://go-review.googlesource.com/63274Reviewed-by:
Austin Clements <austin@google.com>
-
- 31 Oct, 2017 1 commit
-
-
Hana (Hyang-Ah) Kim authored
Since Go1.8, different types of GC mark workers were annotated and the annotation strings were recorded during StartTrace. This change fixes two issues around the use of traceString from StartTrace here. 1) "failed to parse trace: no consistent ordering of events possible" This issue is a result of a missing 'batch' event entry. For efficient tracing, tracer maintains system allocated buffers and once a buffer is full, it is Flushed out for writing. Moreover, tracing assumes all the records in the same buffer (batch) are already ordered and implements more optimization in encoding and defers the completing order reconstruction till the trace parsing time. Thus, when a Flush happens and a new buffer is used, the new buffer should contain an event to indicate the start of a new batch. Before this CL, the batch entry was written only by traceEvent only when the buffer position is 0 and wasn't written when flush occurs during traceString. This CL fixes it by moving the batch entry write to the traceFlush. 2) crash during tracing due to invalid memory access, or during parsing due to duplicate string entries This issue is a result of memory allocation during traceString calls. Execution tracer traces some memory allocation activities. Before this CL, traceString took the buffer address (*traceBuf) and mutated the buffer. If memory tracing occurs in the meantime from the same P, the allocation tracing (traceEvent) will take the same buffer address through the pointer to the buffer address (**traceBuf), and mutate the buffer. As a result, one of the followings can happen: - the allocation record is overwritten by the following trace string record (data loss) - if buffer flush occurs during the allocation tracing, traceString will attempt to write the string record to the old buffer and eventually causes invalid memory access crash. - or flush on the same buffer can occur twice (once from the memory allocation, and once from the string record write), and in this case the trace can contain the same data twice and the parse will complain about duplicate string record entries. This CL fixes the second issue by making the traceString take **traceBuf (*traceBufPtr). Change-Id: I24f629758625b38e1916fbfc7d7be6ea210586af Reviewed-on: https://go-review.googlesource.com/50873 Run-TryBot: Austin Clements <austin@google.com> Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com>
-
- 27 Sep, 2017 2 commits
-
-
Austin Clements authored
allp now has length gomaxprocs, which means none of allp[i] are nil or in state _Pdead. This lets replace several different styles of loops over allp with normal range loops. for i := 0; i < gomaxprocs; i++ { ... } loops can simply range over allp. Likewise, range loops over allp[:gomaxprocs] can just range over allp. Loops that check for p == nil || p.state == _Pdead don't need to check this any more. Loops that check for p == nil don't have to check this *if* dead Ps don't affect them. I checked that all such loops are, in fact, unaffected by dead Ps. One loop was potentially affected, which this fixes by zeroing p.gcAssistTime in procresize. Updates #15131. Change-Id: Ifa1c2a86ed59892eca0610360a75bb613bc6dcee Reviewed-on: https://go-review.googlesource.com/45575 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Rick Hudson <rlh@golang.org> Reviewed-by:
Brad Fitzpatrick <bradfitz@golang.org>
-
Austin Clements authored
This makes it possible to eliminate the hard cap on GOMAXPROCS. Updates #15131. Change-Id: I4c422b340791621584c118a6be1b38e8a44f8b70 Reviewed-on: https://go-review.googlesource.com/45573 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 12 Sep, 2017 1 commit
-
-
Aliaksandr Valialkin authored
Use per-P timers, so each P may work with its own timers. This CL improves performance on multi-CPU systems in the following cases: - When serving high number of concurrent connections with read/write deadlines set (for instance, highly loaded net/http server). - When using high number of concurrent timers. These timers may be implicitly created via context.WithDeadline or context.WithTimeout. Production servers should usually set timeout on connections and external requests in order to prevent from resource leakage. See https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/ Below are relevant benchmark results for various GOMAXPROCS values on linux/amd64: context package: name old time/op new time/op delta WithTimeout/concurrency=40 4.92µs ± 0% 5.17µs ± 1% +5.07% (p=0.000 n=9+9) WithTimeout/concurrency=4000 6.03µs ± 1% 6.49µs ± 0% +7.63% (p=0.000 n=8+10) WithTimeout/concurrency=400000 8.58µs ± 7% 9.02µs ± 4% +5.02% (p=0.019 n=10+10) name old time/op new time/op delta WithTimeout/concurrency=40-2 3.70µs ± 1% 2.78µs ± 4% -24.90% (p=0.000 n=8+9) WithTimeout/concurrency=4000-2 4.49µs ± 4% 3.67µs ± 5% -18.26% (p=0.000 n=10+10) WithTimeout/concurrency=400000-2 6.16µs ±10% 5.15µs ±13% -16.30% (p=0.000 n=10+10) name old time/op new time/op delta WithTimeout/concurrency=40-4 3.58µs ± 1% 2.64µs ± 2% -26.13% (p=0.000 n=9+10) WithTimeout/concurrency=4000-4 4.17µs ± 0% 3.32µs ± 1% -20.36% (p=0.000 n=10+10) WithTimeout/concurrency=400000-4 5.57µs ± 9% 4.83µs ±10% -13.27% (p=0.001 n=10+10) time package: name old time/op new time/op delta AfterFunc 6.15ms ± 3% 6.07ms ± 2% ~ (p=0.133 n=10+9) AfterFunc-2 3.43ms ± 1% 3.56ms ± 1% +3.91% (p=0.000 n=10+9) AfterFunc-4 5.04ms ± 2% 2.36ms ± 0% -53.20% (p=0.000 n=10+9) After 6.54ms ± 2% 6.49ms ± 3% ~ (p=0.393 n=10+10) After-2 3.68ms ± 1% 3.87ms ± 0% +5.14% (p=0.000 n=9+9) After-4 6.66ms ± 1% 2.87ms ± 1% -56.89% (p=0.000 n=10+10) Stop 698µs ± 2% 689µs ± 1% -1.26% (p=0.011 n=10+10) Stop-2 729µs ± 2% 434µs ± 3% -40.49% (p=0.000 n=10+10) Stop-4 837µs ± 3% 333µs ± 2% -60.20% (p=0.000 n=10+10) SimultaneousAfterFunc 694µs ± 1% 692µs ± 7% ~ (p=0.481 n=10+10) SimultaneousAfterFunc-2 714µs ± 3% 569µs ± 2% -20.33% (p=0.000 n=10+10) SimultaneousAfterFunc-4 782µs ± 2% 386µs ± 2% -50.67% (p=0.000 n=10+10) StartStop 267µs ± 3% 274µs ± 0% +2.64% (p=0.000 n=8+9) StartStop-2 238µs ± 2% 140µs ± 3% -40.95% (p=0.000 n=10+8) StartStop-4 320µs ± 1% 125µs ± 1% -61.02% (p=0.000 n=9+9) Reset 75.0µs ± 1% 77.5µs ± 2% +3.38% (p=0.000 n=10+10) Reset-2 150µs ± 2% 40µs ± 5% -73.09% (p=0.000 n=10+9) Reset-4 226µs ± 1% 33µs ± 1% -85.42% (p=0.000 n=10+10) Sleep 857µs ± 6% 878µs ± 9% ~ (p=0.079 n=10+9) Sleep-2 617µs ± 4% 585µs ± 2% -5.21% (p=0.000 n=10+10) Sleep-4 689µs ± 3% 465µs ± 4% -32.53% (p=0.000 n=10+10) Ticker 55.9ms ± 2% 55.9ms ± 2% ~ (p=0.971 n=10+10) Ticker-2 28.7ms ± 2% 28.1ms ± 1% -2.06% (p=0.000 n=10+10) Ticker-4 14.6ms ± 0% 13.6ms ± 1% -6.80% (p=0.000 n=9+10) Fixes #15133 Change-Id: I6f4b09d2db8c5bec93146db6501b44dbfe5c0ac4 Reviewed-on: https://go-review.googlesource.com/34784Reviewed-by:
Austin Clements <austin@google.com>
-
- 29 Aug, 2017 1 commit
-
-
Austin Clements authored
Right now we only kind of sort of trace GC STW events. We emit events around mark termination, but those start well after stopping the world and end before starting it again, and we don't emit any events for sweep termination. Fix this by generalizing EvGCScanStart/EvGCScanDone. These were already re-purposed to indicate mark termination (despite the names). This commit renames them to EvGCSTWStart/EvGCSTWDone, adds an argument to indicate the STW reason, and shuffles the runtime to generate them right before stopping the world and right after starting the world, respectively. These events will make it possible to generate precise minimum mutator utilization (MMU) graphs and could be useful in detecting non-preemptible goroutines (e.g., #20792). Change-Id: If95783f370781d8ef66addd94886028103a7c26f Reviewed-on: https://go-review.googlesource.com/55411Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 19 Apr, 2017 2 commits
-
-
Austin Clements authored
This extends the GCSweepDone event with counts of swept and reclaimed bytes. These are useful for understanding the duration and effectiveness of sweep events. Change-Id: I3c97a4f0f3aad3adbd188adb264859775f54e2df Reviewed-on: https://go-review.googlesource.com/40811 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Hyang-Ah Hana Kim <hyangah@gmail.com>
-
Austin Clements authored
Currently, each individual span sweep emits a span to the trace. But sweeps are generally done in loops until some condition is satisfied, so this tracing is lower-level than anyone really wants any hides the fact that no other work is being accomplished between adjacent sweep events. This is also high overhead: enabling tracing significantly impacts sweep latency. Replace this with instead tracing around the sweep loops used for allocation. This is slightly tricky because sweep loops don't generally know if any sweeping will happen in them. Hence, we make the tracing lazy by recording in the P that we would like to start tracing the sweep *if* one happens, and then only closing the sweep event if we started it. This does mean we don't get tracing on every sweep path, which are legion. However, we get much more informative tracing on the paths that block allocation, which are the paths that matter. Change-Id: I73e14fbb250acb0c9d92e3648bddaa5e7d7e271c Reviewed-on: https://go-review.googlesource.com/40810 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by:
Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 14 Apr, 2017 1 commit
-
-
David Lazar authored
Now the runtime/trace tests pass with -l=4. This also gets rid of the frames cache for multiple reasons: 1) The frames cache was used to avoid repeated calls to funcname and funcline. Now these calls happen inside the CallersFrames iterator. 2) Maintaining a frames cache is harder: map[uintptr]traceFrame doesn't work since each PC can map to multiple traceFrames. 3) It's not clear that the cache is important. Change-Id: I2914ac0b3ba08e39b60149d99a98f9f532b35bbb Reviewed-on: https://go-review.googlesource.com/40591 Run-TryBot: David Lazar <lazard@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com>
-
- 16 Mar, 2017 1 commit
-
-
Daniel Martí authored
Found by github.com/mvdan/unparam. Change-Id: I20145440ff1bcd27fcf15a740354c52f313e536c Reviewed-on: https://go-review.googlesource.com/37894 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com>
-
- 06 Mar, 2017 1 commit
-
-
Austin Clements authored
Currently almost every function that deals with a *_func has to first look up the *moduledata for the module containing the function's entry point. This means we almost always do at least two identical module lookups whenever we deal with a *_func (one to get the *_func and another to get something from its module data) and sometimes several more. Fix this by making findfunc return a new funcInfo type that embeds *_func, but also includes the *moduledata, and making all of the functions that currently take a *_func instead take a funcInfo and use the already-found *moduledata. This transformation is trivial for the most part, since the *_func type is usually inferred. The annoying part is that we can no longer use nil to indicate failure, so this introduces a funcInfo.valid() method and replaces nil checks with calls to valid. Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f Reviewed-on: https://go-review.googlesource.com/37331 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Michael Hudson-Doyle <michael.hudson@canonical.com>
-
- 17 Feb, 2017 1 commit
-
-
Dmitry Vyukov authored
Add new starvation mode for Mutex. In starvation mode ownership is directly handed off from unlocking goroutine to the next waiter. New arriving goroutines don't compete for ownership. Unfair wait time is now limited to 1ms. Also fix a long standing bug that goroutines were requeued at the tail of the wait queue. That lead to even more unfair acquisition times with multiple waiters. Performance of normal mode is not considerably affected. Fixes #13086 On the provided in the issue lockskew program: done in 1.207853ms done in 1.177451ms done in 1.184168ms done in 1.198633ms done in 1.185797ms done in 1.182502ms done in 1.316485ms done in 1.211611ms done in 1.182418ms name old time/op new time/op delta MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10) Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10) MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10) MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10) MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10) MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10) MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10) Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10) RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10) RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10) RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10) RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10) Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9) MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10) Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10) MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10) MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9) MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9) MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10) MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10) RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10) RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9) RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10) RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10) Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10) MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10) Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10) MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9) MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10) MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10) MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10) MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10) RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10) RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9) RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9) RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8) Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10) MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal) Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10) MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10) MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10) MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10) MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10) MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10) RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10) RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10) RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10) RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal) Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10) MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10) Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10) MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9) MutexWork-2 40.8ns ± 3% 43.8ns ± 7% +7.38% (p=0.000 n=10+9) MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8) MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9) MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8) RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8) RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10) RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9) RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10) Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10) MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8) Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10) MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10) MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8) MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10) MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10) MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8) RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9) RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9) RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal) RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10) Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b Reviewed-on: https://go-review.googlesource.com/34310 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Russ Cox <rsc@golang.org>
-
- 14 Feb, 2017 1 commit
-
-
Austin Clements authored
Now that we don't rescan stacks, stack barriers are unnecessary. This removes all of the code and structures supporting them as well as tests that were specifically for stack barriers. Updates #17503. Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c Reviewed-on: https://go-review.googlesource.com/36620 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Brad Fitzpatrick <bradfitz@golang.org>
-
- 10 Feb, 2017 1 commit
-
-
Heschi Kreinick authored
During the mark phase of garbage collection, goroutines that allocate may be recruited to assist. This change creates trace events for mark assists and displays them similarly to sweep assists in the trace viewer. Mark assists are different than sweeps in that they can be preempted, so displaying them in the trace viewer is a little tricky -- we may need to synthesize multiple slices for one mark assist. This could have been done in the parser instead, but I thought it might be preferable to keep the parser as true to the event stream as possible. Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7 Reviewed-on: https://go-review.googlesource.com/36015 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com> Reviewed-by:
Hyang-Ah Hana Kim <hyangah@gmail.com>
-
- 28 Oct, 2016 3 commits
-
-
Austin Clements authored
Currently when a goroutine blocks on a GC assist, it emits a generic EvGoBlock event. Since assist blocking events and, in particular, the length of the blocked assist queue, are important for diagnosing GC behavior, this commit adds a new EvGoBlockGC event for blocking on a GC assist. The trace viewer uses this event to report a "waiting on GC" count in the "Goroutines" row. This makes sense because, unlike other blocked goroutines, these goroutines do have work to do, so being blocked on a GC assist is quite similar to being in the "runnable" state, which we also report in the trace viewer. Change-Id: Ic21a326992606b121ea3d3d00110d8d1fdc7a5ef Reviewed-on: https://go-review.googlesource.com/30704 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Dmitry Vyukov <dvyukov@google.com>
-
Austin Clements authored
Currently mark workers are shown in the trace as regular goroutines labeled "runtime.gcBgMarkWorker". That's somewhat unhelpful to an end user because of the opaque label and particularly unhelpful to runtime developers because it doesn't distinguish the different types of mark workers. Fix this by introducing a variant of the GoStart event called GoStartLabel that lets the runtime indicate a label for a goroutine execution span and using this to label mark worker executions as "GC (<mode>)" in the trace viewer. Since this bumps the trace version to 1.8, we also add test data for 1.7 traces. Change-Id: Id7b9c0536508430c661ffb9e40e436f3901ca121 Reviewed-on: https://go-review.googlesource.com/30702 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Dmitry Vyukov <dvyukov@google.com>
-
Peter Weinberger authored
runtime.SetMutexProfileFraction(n int) will capture 1/n-th of stack traces of goroutines holding contended mutexes if n > 0. From runtime/pprof, pprot.Lookup("mutex").WriteTo writes the accumulated stack traces to w (in essentially the same format that blocking profiling uses). Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a Reviewed-on: https://go-review.googlesource.com/29650Reviewed-by:
Austin Clements <austin@google.com>
-
- 21 Oct, 2016 1 commit
-
-
Austin Clements authored
trace's reader *g is going to cause write barriers in unfortunate places, so replace it with a guintptr. Change-Id: Ie8fb13bb89a78238f9d2a77ec77da703e96df8af Reviewed-on: https://go-review.googlesource.com/31469 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 15 Oct, 2016 1 commit
-
-
Austin Clements authored
This covers basically all sysAlloc'd, persistentalloc'd, and fixalloc'd types. Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b Reviewed-on: https://go-review.googlesource.com/30941Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 07 Oct, 2016 2 commits
-
-
Austin Clements authored
Currently these are labeled "MARK", which was accurate in the STW collector, but these really indicate mark termination now, since marking happens for the full duration of the concurrent GC. Re-label them as "MARK TERMINATION" to clarify this. Change-Id: Ie98bd961195acde49598b4fa3f9e7d90d757c0a6 Reviewed-on: https://go-review.googlesource.com/30018Reviewed-by:
Dmitry Vyukov <dvyukov@google.com>
-
Austin Clements authored
When GC is disabled, we set gcpercent to -1. However, we still use gcpercent to compute several values, such as next_gc and gc_trigger. These calculations are meaningless when gcpercent is -1 and result in meaningless values. This is okay in a sense because we also never use these values if gcpercent is -1, but they're confusing when exposed to the user, for example via MemStats or the execution trace. It's particularly unfortunate in the execution trace because it attempts to plot the underflowed value of next_gc, which scales all useful information in the heap row into oblivion. Fix this by making next_gc ^0 when gcpercent < 0. This has the advantage of being true in a way: next_gc is effectively infinite when gcpercent < 0. We can also detect this special value when updating the execution trace and report next_gc as 0 so it doesn't blow up the display of the heap line. Change-Id: I4f366e4451f8892a4908da7b2b6086bdc67ca9a9 Reviewed-on: https://go-review.googlesource.com/30016Reviewed-by:
Rick Hudson <rlh@golang.org>
-
- 02 Sep, 2016 1 commit
-
-
Dmitry Vyukov authored
We reset global buffer only if its pos != 0. We ought to do it always, but queue it only if pos != 0. This is a latent bug. Currently it does not fire because whenever we create a global buffer, we increment pos. Change-Id: I01e28ae88ce9a5412497c524391b8b7cb443ffd9 Reviewed-on: https://go-review.googlesource.com/25574 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com>
-
- 22 Aug, 2016 1 commit
-
-
Dmitry Vyukov authored
In StartTrace we emit EvGoCreate for all existing goroutines. This includes stack unwind to obtain current stack. Real Go programs can contain hundreds of thousands of blocked goroutines. For such programs StartTrace can take up to a second (few ms per goroutine). Obtain current stack ID once and use it for all EvGoCreate events. This speeds up StartTrace with 10K blocked goroutines from 20ms to 4 ms (win for StartTrace called from net/http/pprof hander will be bigger as stack is deeper). Change-Id: I9e5ff9468331a840f8fdcdd56c5018c2cfde61fc Reviewed-on: https://go-review.googlesource.com/25573 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Hyang-Ah Hana Kim <hyangah@gmail.com>
-
- 23 Apr, 2016 1 commit
-
-
Dmitry Vyukov authored
Currently tracer uses global sequencer and it introduces significant slowdown on parallel machines (up to 10x). Replace the global sequencer with per-goroutine sequencer. If we assign per-goroutine sequence numbers to only 3 types of events (start, unblock and syscall exit), it is enough to restore consistent partial ordering of all events. Even these events don't need sequence numbers all the time (if goroutine starts on the same P where it was unblocked, then start does not need sequence number). The burden of restoring the order is put on trace parser. Details of the algorithm are described in the comments. On http benchmark with GOMAXPROCS=48: no tracing: 5026 ns/op tracing: 27803 ns/op (+453%) with this change: 6369 ns/op (+26%, mostly for traceback) Also trace size is reduced by ~22%. Average event size before: 4.63 bytes/event, after: 3.62 bytes/event. Besides running trace tests, I've also tested with manually broken cputicks (random skew for each event, per-P skew and episodic random skew). In all cases broken timestamps were detected and no test failures. Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa Reviewed-on: https://go-review.googlesource.com/21512 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> Reviewed-by:
Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 22 Apr, 2016 1 commit
-
-
Dmitry Vyukov authored
Change-Id: I037101b1921fe151695d32e9874b50dd64982298 Reviewed-on: https://go-review.googlesource.com/22314Reviewed-by:
Austin Clements <austin@google.com>
-
- 11 Apr, 2016 1 commit
-
-
Dmitry Vyukov authored
1. Parse out version from trace header. 2. Restore handling of 1.5 traces. 3. Restore optional symbolization of traces. 4. Add some canned 1.5 traces for regression testing (http benchmark trace, runtime/trace stress traces, plus one with broken timestamps). Change-Id: Idb18a001d03ded8e13c2730eeeb37c5836e31256 Reviewed-on: https://go-review.googlesource.com/21803 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Austin Clements <austin@google.com>
-
- 08 Apr, 2016 1 commit
-
-
Dmitry Vyukov authored
This makes traces self-contained and simplifies trace workflow in modern cloud environments where it is simpler to reach a service via HTTP than to obtain the binary. Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d Reviewed-on: https://go-review.googlesource.com/21732 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Hyang-Ah Hana Kim <hyangah@gmail.com>
-
- 27 Jan, 2016 1 commit
-
-
Austin Clements authored
traceEvent records system call events after a G has already entered _Gsyscall, which means the garbage collector could be installing stack barriers in the G's stack during the traceEvent. If traceEvent attempts to capture the user stack during this, it may observe a inconsistent stack barriers and panic. Fix this by acquiring the stack lock around the stack walk in traceEvent. Fixes #14101. Change-Id: I15f0ab0c70c04c6e182221f65a6f761c5a896459 Reviewed-on: https://go-review.googlesource.com/18973 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Russ Cox <rsc@golang.org>
-
- 19 Nov, 2015 1 commit
-
-
Austin Clements authored
This replaces *traceAllocBlock with traceAllocBlockPtr. Updates #10600. Change-Id: I94a20d90f04cca7c457b29062427748e315e4857 Reviewed-on: https://go-review.googlesource.com/17004 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by:
Dmitry Vyukov <dvyukov@google.com>
-
- 12 Nov, 2015 1 commit
-
-
Michael Matloob authored
runtime/internal/sys will hold system-, architecture- and config- specific constants. Updates #11647 Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54 Reviewed-on: https://go-review.googlesource.com/16817Reviewed-by:
Russ Cox <rsc@golang.org>
-