  1. 04 Nov, 2019 1 commit
  2. 10 Oct, 2019 1 commit
    • all: remove nacl (part 3, more amd64p32) · 03ef105d
      Brad Fitzpatrick authored
      Part 1: CL 199499 (GOOS nacl)
      Part 2: CL 200077 (amd64p32 files, toolchain)
      Part 3: stuff that arguably should've been part of Part 2, but I forgot
              one of my grep patterns when splitting the original CL up into
              two parts.
      
      This one might also have interesting stuff to resurrect for any future
      x32 ABI support.
      
      Updates #30439
      
      Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
      Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Ian Lance Taylor <iant@golang.org>
  3. 04 Sep, 2019 1 commit
    • runtime: don't hold worldsema across mark phase · 7b294cdd
      Michael Anthony Knyszek authored
      This change makes it so that worldsema isn't held across the mark phase.
      This means that various operations like ReadMemStats may now stop the
      world during the mark phase, reducing latency on such operations.
      
       Only three such operations are still not allowed to occur during
       marking: GOMAXPROCS, StartTrace, and StopTrace.
      
      For the former it's because any change to GOMAXPROCS impacts GC mark
      background worker scheduling and the details there are tricky.
      
      For the latter two it's because tracing needs to observe consistent GC
      start and GC end events, and if StartTrace or StopTrace may stop the
      world during marking, then it's possible for it to see a GC end event
      without a start or GC start event without an end, respectively.
      
      To ensure that GOMAXPROCS and StartTrace/StopTrace cannot proceed until
      marking is complete, the runtime now holds a new semaphore, gcsema,
      across the mark phase just like it used to with worldsema.
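       
       The locking discipline can be pictured with a toy model (a hedged
       sketch only, not the runtime's actual code; sync.Mutex stands in for
       the runtime-internal semaphores, and the function names are
       illustrative): operations that merely need to stop the world take only
       worldsema, while operations that must not overlap a mark phase take
       gcsema first.
       
          package main
          
          import "sync"
          
          // Toy model of the split described above.
          var (
              worldsema sync.Mutex // held only while the world is stopped
              gcsema    sync.Mutex // held by the GC across the whole mark phase
          )
          
          // A ReadMemStats-like operation only needs the world stopped, so it
          // can run even while marking is in progress.
          func readStats() {
              worldsema.Lock()
              // ... stop the world, snapshot stats, start the world ...
              worldsema.Unlock()
          }
          
          // A GOMAXPROCS/StartTrace/StopTrace-like operation must not
          // interleave with marking, so it takes gcsema first and blocks
          // until marking completes.
          func setProcs() {
              gcsema.Lock()
              worldsema.Lock()
              // ... stop the world, apply the change, start the world ...
              worldsema.Unlock()
              gcsema.Unlock()
          }
          
          func main() {
              readStats()
              setProcs()
          }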
      
      Fixes #19812.
      
      Change-Id: I15d43ed184f711b3d104e8f267fb86e335f86bf9
      Reviewed-on: https://go-review.googlesource.com/c/go/+/182657
      Run-TryBot: Michael Knyszek <mknyszek@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Keith Randall <khr@golang.org>
       Reviewed-by: Cherry Zhang <cherryyz@google.com>
  4. 22 Aug, 2018 2 commits
  5. 03 May, 2018 1 commit
    • runtime: convert g.waitreason from string to uint8 · 4d7cf3fe
      Josh Bleecher Snyder authored
      Every time I poke at #14921, the g.waitreason string
      pointer writes show up.
      
      They're not particularly important performance-wise,
      but it'd be nice to clear the noise away.
      
      And it does open up a few extra bytes in the g struct
      for some future use.
      
      This is a re-roll of CL 99078, which was rolled
      back because of failures on s390x.
      Those failures were apparently due to an old version of gdb.
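       
       The technique is the usual one for shrinking a string field: a small
       integer type plus a fixed string table. A minimal self-contained
       sketch (illustrative only; the names below are not the runtime's
       actual declarations):
       
          package main
          
          import "fmt"
          
          // waitReason packs the former string into a single byte.
          type waitReason uint8
          
          const (
              waitReasonZero waitReason = iota
              waitReasonGCAssistMarking
              waitReasonChanReceive
          )
          
          var waitReasonStrings = [...]string{
              waitReasonZero:            "",
              waitReasonGCAssistMarking: "GC assist marking",
              waitReasonChanReceive:     "chan receive",
          }
          
          func (w waitReason) String() string {
              if int(w) < len(waitReasonStrings) {
                  return waitReasonStrings[w]
              }
              return "unknown wait reason"
          }
          
          func main() {
              fmt.Println(waitReasonChanReceive) // prints "chan receive"
          }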
      
      Change-Id: Icc2c12f449b2934063fd61e272e06237625ed589
      Reviewed-on: https://go-review.googlesource.com/111256
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Michael Munday <mike.munday@ibm.com>
  6. 24 Apr, 2018 1 commit
    • runtime/trace: rename "Span" to "Region" · c2d10243
      Hana Kim authored
      "Span" is a commonly used term in many distributed tracing systems
       (Dapper, OpenCensus, OpenTracing, ...). There it refers to a
       period of time, not necessarily tied to the execution of an underlying
       processor, thread, or goroutine, unlike the "Span" of the runtime/trace
       package.
      
      Since distributed tracing and go runtime execution tracing are
      already similar enough to cause confusion, this CL attempts to avoid
      using the same word if possible.
      
      "Region" is being used in a certain tracing system to refer to a code
       region, which is quite close to what runtime/trace.Span currently
       refers to, so adopt that term instead.
      https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions
      
      This CL also tweaks APIs a bit based on jbd and heschi's comments:
      
        NewContext -> NewTask
          and it now returns a Task object that exports End method.
      
        StartSpan -> StartRegion
          and it now returns a Region object that exports End method.
      
       Also, WithSpan is changed to WithRegion, and it now takes a func() with
       no context. Another thought is to get rid of WithRegion entirely: it is
       a nice concept, but in practice it seems problematic (a lot of code
       churn and stack-trace pollution). The tracing concept is already very
       low level, and we expect this API to be used with great care.
      
      Recommended usage will be
         defer trace.StartRegion(ctx, "someRegion").End()
      
       The old APIs are left untouched in this CL. Once their uses are cleaned
       up, they will be removed in a separate CL.
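       
       For reference, a small self-contained program using the renamed API
       (Task, StartRegion, Log) as it ships in runtime/trace; run it and
       inspect the output with "go tool trace trace.out":
       
          package main
          
          import (
              "context"
              "log"
              "os"
              "runtime/trace"
          )
          
          func main() {
              f, err := os.Create("trace.out")
              if err != nil {
                  log.Fatal(err)
              }
              defer f.Close()
              if err := trace.Start(f); err != nil {
                  log.Fatal(err)
              }
              defer trace.Stop()
          
              // A task groups work that may span goroutines.
              ctx, task := trace.NewTask(context.Background(), "myTask")
              defer task.End()
          
              // Recommended region usage from above.
              defer trace.StartRegion(ctx, "someRegion").End()
          
              trace.Log(ctx, "category", "some message tied to myTask")
          }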
      
      Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a
      Reviewed-on: https://go-review.googlesource.com/108296
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: JBD <jbd@google.com>
  7. 13 Mar, 2018 1 commit
  8. 12 Mar, 2018 1 commit
  9. 15 Feb, 2018 2 commits
    • runtime/trace: implement annotation API · 6977a3b2
      Hana Kim authored
      This implements the annotation API proposed in golang.org/cl/63274.
      
       traceString is updated to protect the string map with trace.stringsLock,
       because the assumption that traceString is called by a single goroutine
       (either at the beginning of tracing or at the end of tracing when
       dumping all the symbols and function names) is no longer true.
      
       After this change, traceString is used by the annotation APIs
       (NewContext, StartSpan, Log) to register frequently appearing strings
       (task and span names, and log keys).
      
      NewContext -> one or two records (EvString, EvUserTaskCreate)
      end function -> one record (EvUserTaskEnd)
      StartSpan -> one or two records (EvString, EvUserSpan)
      span end function -> one or two records (EvString, EvUserSpan)
      Log -> one or two records (EvString, EvUserLog)
      
       The EvUserLog record uses the typical record format written by traceEvent,
       except that it is followed by bytes that represent the value string.
      
      In addition to runtime/trace change, this change includes
      corresponding changes in internal/trace to parse the new record types.
      
       Future work to improve efficiency:
         More efficient unique task id generation instead of an atomic counter
         (e.g. a per-P counter).
         Instead of a centralized trace.stringsLock, consider using a per-P
         string cache or something more efficient.
      
      R=go1.11
      
      Change-Id: Iec9276c6c51e5be441ccd52dec270f1e3b153970
       Reviewed-on: https://go-review.googlesource.com/71690
       Reviewed-by: Austin Clements <austin@google.com>
    • runtime/trace: user annotation API · 32d1cd33
      Hana Kim authored
      This CL presents the proposed user annotation API skeleton.
      This CL bumps up the trace version to 1.11.
      
      Design doc https://goo.gl/iqJfJ3
      
       Implementation CLs follow.
      
       The API introduces three basic building blocks: Log, Span, and Task.
      
      Log is for basic logging. When called, the message will be recorded
      to the trace along with timestamp, goroutine id, and stack info.
      
         trace.Log(ctx, messageType message)
      
       Span can be thought of as an extension of Log to record an interesting
       time interval during a goroutine's execution. A span is local to a
       goroutine by definition.
      
         trace.WithSpan(ctx, "doVeryExpensiveOp", func(ctx context) {
            /* do something very expensive */
         })
      
       Task is a higher-level concept that aids tracing of complex operations
       that encompass multiple goroutines or are asynchronous.
       For example, an RPC request, an HTTP request, a file write, or a
      batch job can be traced with a Task.
      
      Note we chose to design the API around context.Context so it allows
      easier integration with other tracing tools, often designed around
      context.Context as well. Log and WithSpan APIs recognize the task
      information embedded in the context and record it in the trace as
      well. That allows the Go execution tracer to associate and group
      the spans and log messages based on the task information.
      
      In order to create a Task,
      
         ctx, end := trace.NewContext(ctx, "myTask")
         defer end()
      
       The Go execution tracer measures the time between task creation and
       task end and reports it as the task latency.
      
      More discussion history in golang.org/cl/59572.
      
      Update #16619
      
      R=go1.11
      
      Change-Id: I59a937048294dafd23a75cf1723c6db461b193cd
       Reviewed-on: https://go-review.googlesource.com/63274
       Reviewed-by: Austin Clements <austin@google.com>
  10. 31 Oct, 2017 1 commit
    • runtime/trace: fix corrupted trace during StartTrace · d58f4e9b
      Hana (Hyang-Ah) Kim authored
       Since Go 1.8, different types of GC mark workers have been annotated,
       and the annotation strings are recorded during StartTrace. This change
       fixes two issues around the use of traceString from StartTrace.
      
      1) "failed to parse trace: no consistent ordering of events possible"
      
       This issue is a result of a missing 'batch' event entry. For efficient
       tracing, the tracer maintains system-allocated buffers, and once a buffer
       is full, it is flushed out for writing. Moreover, tracing assumes all
       the records in the same buffer (batch) are already ordered, applies
       further optimizations in encoding, and defers the complete order
       reconstruction until trace parsing time. Thus, when a flush happens
       and a new buffer is used, the new buffer should contain an event to
       indicate the start of a new batch. Before this CL, the batch entry was
       written only by traceEvent, only when the buffer position is 0, and
       wasn't written when a flush occurred during traceString.
      
       This CL fixes it by moving the batch entry write into traceFlush.
      
      2) crash during tracing due to invalid memory access, or during parsing
      due to duplicate string entries
      
       This issue is a result of memory allocation during traceString calls.
       The execution tracer traces some memory allocation activity. Before this
       CL, traceString took the buffer address (*traceBuf) and mutated the buffer.
       If allocation tracing occurs in the meantime on the same P, traceEvent
       takes the same buffer through the pointer to the buffer address
       (**traceBuf) and mutates the buffer.
      
       As a result, one of the following can happen:
        - the allocation record is overwritten by the following trace string
          record (data loss);
        - if a buffer flush occurs during the allocation tracing, traceString
          will attempt to write the string record to the old buffer and
          eventually cause an invalid memory access crash;
        - or a flush of the same buffer can occur twice (once from the memory
          allocation, and once from the string record write), in which case
          the trace contains the same data twice and the parser complains
          about duplicate string record entries.
      
       This CL fixes the second issue by making traceString take a
       **traceBuf (*traceBufPtr).
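       
       The hazard reduces to a small self-contained toy (illustrative only,
       not the runtime's code; the types here are stand-ins): a callee that
       caches the old buffer pointer keeps writing to a buffer that has since
       been swapped out, while a callee that goes through a pointer to the
       buffer pointer always sees the current buffer.
       
          package main
          
          import "fmt"
          
          type buf struct{ data []byte }
          
          type bufPtr struct{ p *buf } // stand-in for the per-P buffer-pointer slot
          
          func flush(h *bufPtr) { h.p = &buf{} } // swap in a fresh buffer
          
          // Bad: caches the old *buf; a flush in between strands the write.
          func writeStale(b *buf, h *bufPtr, s string) {
              flush(h) // e.g. allocation tracing flushed the buffer meanwhile
              b.data = append(b.data, s...)
          }
          
          // Good: re-reads the holder, so the write lands in the live buffer.
          func writeFresh(h *bufPtr, s string) {
              flush(h)
              h.p.data = append(h.p.data, s...)
          }
          
          func main() {
              old := &buf{}
              h := &bufPtr{p: old}
              writeStale(old, h, "lost")
              fmt.Println(len(old.data), len(h.p.data)) // 4 0: the write missed the live buffer
              writeFresh(h, "kept")
              fmt.Println(len(h.p.data)) // 4: the write landed in the live buffer
          }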
      
      Change-Id: I24f629758625b38e1916fbfc7d7be6ea210586af
      Reviewed-on: https://go-review.googlesource.com/50873
      Run-TryBot: Austin Clements <austin@google.com>
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Austin Clements <austin@google.com>
  11. 27 Sep, 2017 2 commits
    • runtime: clean up loops over allp · e900e275
      Austin Clements authored
       allp now has length gomaxprocs, which means none of allp[i] are nil or
       in state _Pdead. This lets us replace several different styles of loops
       over allp with normal range loops.
      
      for i := 0; i < gomaxprocs; i++ { ... } loops can simply range over
      allp. Likewise, range loops over allp[:gomaxprocs] can just range over
      allp.
      
      Loops that check for p == nil || p.state == _Pdead don't need to check
      this any more.
      
      Loops that check for p == nil don't have to check this *if* dead Ps
      don't affect them. I checked that all such loops are, in fact,
      unaffected by dead Ps. One loop was potentially affected, which this
      fixes by zeroing p.gcAssistTime in procresize.
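       
       A toy before/after of the loop shapes involved (the types below are
       illustrative stand-ins, not the runtime's declarations):
       
          package main
          
          import "fmt"
          
          type p struct {
              id   int
              dead bool
          }
          
          func main() {
              gomaxprocs := 3
              allp := []*p{{id: 0}, {id: 1}, {id: 2}}
          
              // Before: bounded by gomaxprocs and forced to skip nil/dead Ps.
              for i := 0; i < gomaxprocs; i++ {
                  pp := allp[i]
                  if pp == nil || pp.dead {
                      continue
                  }
                  fmt.Println("old style:", pp.id)
              }
          
              // After: len(allp) == gomaxprocs with no nil or dead entries,
              // so a plain range loop suffices.
              for _, pp := range allp {
                  fmt.Println("range:", pp.id)
              }
          }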
      
      Updates #15131.
      
      Change-Id: Ifa1c2a86ed59892eca0610360a75bb613bc6dcee
      Reviewed-on: https://go-review.googlesource.com/45575
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Rick Hudson <rlh@golang.org>
       Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: dynamically allocate allp · 84d2c7ea
      Austin Clements authored
      This makes it possible to eliminate the hard cap on GOMAXPROCS.
      
      Updates #15131.
      
      Change-Id: I4c422b340791621584c118a6be1b38e8a44f8b70
      Reviewed-on: https://go-review.googlesource.com/45573
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Rick Hudson <rlh@golang.org>
  12. 12 Sep, 2017 1 commit
    • runtime: improve timers scalability on multi-CPU systems · 76f4fd8a
      Aliaksandr Valialkin authored
      Use per-P timers, so each P may work with its own timers.
      
      This CL improves performance on multi-CPU systems
      in the following cases:
      
       - When serving a high number of concurrent connections
         with read/write deadlines set (for instance, a highly loaded
         net/http server).
       
       - When using a high number of concurrent timers. These timers
         may be implicitly created via context.WithDeadline
         or context.WithTimeout.
       
       Production servers should usually set timeouts on connections
       and external requests in order to prevent resource leaks.
      See https://blog.cloudflare.com/the-complete-guide-to-golang-net-http-timeouts/
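       
       For concreteness, here is a typical source of such timers, using only
       the standard library (each context.WithTimeout call creates a runtime
       timer behind the scenes):
       
          package main
          
          import (
              "context"
              "fmt"
              "sync"
              "time"
          )
          
          func main() {
              var wg sync.WaitGroup
              for i := 0; i < 4; i++ {
                  wg.Add(1)
                  go func(i int) {
                      defer wg.Done()
                      // Each call registers a timer with the runtime.
                      ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
                      defer cancel()
                      <-ctx.Done()
                      fmt.Println("worker", i, "done:", ctx.Err())
                  }(i)
              }
              wg.Wait()
          }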
      
      Below are relevant benchmark results for various GOMAXPROCS values
      on linux/amd64:
      
      context package:
      
      name                                     old time/op  new time/op  delta
      WithTimeout/concurrency=40      4.92µs ± 0%  5.17µs ± 1%  +5.07%  (p=0.000 n=9+9)
      WithTimeout/concurrency=4000    6.03µs ± 1%  6.49µs ± 0%  +7.63%  (p=0.000 n=8+10)
      WithTimeout/concurrency=400000  8.58µs ± 7%  9.02µs ± 4%  +5.02%  (p=0.019 n=10+10)
      
      name                                     old time/op  new time/op  delta
      WithTimeout/concurrency=40-2      3.70µs ± 1%  2.78µs ± 4%  -24.90%  (p=0.000 n=8+9)
      WithTimeout/concurrency=4000-2    4.49µs ± 4%  3.67µs ± 5%  -18.26%  (p=0.000 n=10+10)
      WithTimeout/concurrency=400000-2  6.16µs ±10%  5.15µs ±13%  -16.30%  (p=0.000 n=10+10)
      
      name                                     old time/op  new time/op  delta
      WithTimeout/concurrency=40-4      3.58µs ± 1%  2.64µs ± 2%  -26.13%  (p=0.000 n=9+10)
      WithTimeout/concurrency=4000-4    4.17µs ± 0%  3.32µs ± 1%  -20.36%  (p=0.000 n=10+10)
      WithTimeout/concurrency=400000-4  5.57µs ± 9%  4.83µs ±10%  -13.27%  (p=0.001 n=10+10)
      
      time package:
      
      name                     old time/op  new time/op  delta
      AfterFunc                6.15ms ± 3%  6.07ms ± 2%     ~     (p=0.133 n=10+9)
      AfterFunc-2              3.43ms ± 1%  3.56ms ± 1%   +3.91%  (p=0.000 n=10+9)
      AfterFunc-4              5.04ms ± 2%  2.36ms ± 0%  -53.20%  (p=0.000 n=10+9)
      After                    6.54ms ± 2%  6.49ms ± 3%     ~     (p=0.393 n=10+10)
      After-2                  3.68ms ± 1%  3.87ms ± 0%   +5.14%  (p=0.000 n=9+9)
      After-4                  6.66ms ± 1%  2.87ms ± 1%  -56.89%  (p=0.000 n=10+10)
      Stop                      698µs ± 2%   689µs ± 1%   -1.26%  (p=0.011 n=10+10)
      Stop-2                    729µs ± 2%   434µs ± 3%  -40.49%  (p=0.000 n=10+10)
      Stop-4                    837µs ± 3%   333µs ± 2%  -60.20%  (p=0.000 n=10+10)
      SimultaneousAfterFunc     694µs ± 1%   692µs ± 7%     ~     (p=0.481 n=10+10)
      SimultaneousAfterFunc-2   714µs ± 3%   569µs ± 2%  -20.33%  (p=0.000 n=10+10)
      SimultaneousAfterFunc-4   782µs ± 2%   386µs ± 2%  -50.67%  (p=0.000 n=10+10)
      StartStop                 267µs ± 3%   274µs ± 0%   +2.64%  (p=0.000 n=8+9)
      StartStop-2               238µs ± 2%   140µs ± 3%  -40.95%  (p=0.000 n=10+8)
      StartStop-4               320µs ± 1%   125µs ± 1%  -61.02%  (p=0.000 n=9+9)
      Reset                    75.0µs ± 1%  77.5µs ± 2%   +3.38%  (p=0.000 n=10+10)
      Reset-2                   150µs ± 2%    40µs ± 5%  -73.09%  (p=0.000 n=10+9)
      Reset-4                   226µs ± 1%    33µs ± 1%  -85.42%  (p=0.000 n=10+10)
      Sleep                     857µs ± 6%   878µs ± 9%     ~     (p=0.079 n=10+9)
      Sleep-2                   617µs ± 4%   585µs ± 2%   -5.21%  (p=0.000 n=10+10)
      Sleep-4                   689µs ± 3%   465µs ± 4%  -32.53%  (p=0.000 n=10+10)
      Ticker                   55.9ms ± 2%  55.9ms ± 2%     ~     (p=0.971 n=10+10)
      Ticker-2                 28.7ms ± 2%  28.1ms ± 1%   -2.06%  (p=0.000 n=10+10)
      Ticker-4                 14.6ms ± 0%  13.6ms ± 1%   -6.80%  (p=0.000 n=9+10)
      
      Fixes #15133
      
      Change-Id: I6f4b09d2db8c5bec93146db6501b44dbfe5c0ac4
       Reviewed-on: https://go-review.googlesource.com/34784
       Reviewed-by: Austin Clements <austin@google.com>
  13. 29 Aug, 2017 1 commit
    • runtime,cmd/trace: trace GC STW events · b0392159
      Austin Clements authored
      Right now we only kind of sort of trace GC STW events. We emit events
      around mark termination, but those start well after stopping the world
      and end before starting it again, and we don't emit any events for
      sweep termination.
      
      Fix this by generalizing EvGCScanStart/EvGCScanDone. These were
      already re-purposed to indicate mark termination (despite the names).
      This commit renames them to EvGCSTWStart/EvGCSTWDone, adds an argument
      to indicate the STW reason, and shuffles the runtime to generate them
      right before stopping the world and right after starting the world,
      respectively.
      
      These events will make it possible to generate precise minimum mutator
      utilization (MMU) graphs and could be useful in detecting
      non-preemptible goroutines (e.g., #20792).
      
      Change-Id: If95783f370781d8ef66addd94886028103a7c26f
       Reviewed-on: https://go-review.googlesource.com/55411
       Reviewed-by: Rick Hudson <rlh@golang.org>
  14. 19 Apr, 2017 2 commits
    • runtime: record swept and reclaimed bytes in sweep trace · 22000f54
      Austin Clements authored
      This extends the GCSweepDone event with counts of swept and reclaimed
      bytes. These are useful for understanding the duration and
      effectiveness of sweep events.
      
      Change-Id: I3c97a4f0f3aad3adbd188adb264859775f54e2df
      Reviewed-on: https://go-review.googlesource.com/40811
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
    • runtime: make sweep trace events encompass entire sweep loop · 79c56add
      Austin Clements authored
      Currently, each individual span sweep emits a span to the trace. But
      sweeps are generally done in loops until some condition is satisfied,
       so this tracing is lower-level than anyone really wants and hides the
       fact that no other work is being accomplished between adjacent sweep
      events. This is also high overhead: enabling tracing significantly
      impacts sweep latency.
      
       Instead, replace this with tracing around the sweep loops used for
       allocation. This is slightly tricky because sweep loops don't
      generally know if any sweeping will happen in them. Hence, we make the
      tracing lazy by recording in the P that we would like to start tracing
      the sweep *if* one happens, and then only closing the sweep event if
      we started it.
      
      This does mean we don't get tracing on every sweep path, which are
      legion. However, we get much more informative tracing on the paths
      that block allocation, which are the paths that matter.
      
      Change-Id: I73e14fbb250acb0c9d92e3648bddaa5e7d7e271c
      Reviewed-on: https://go-review.googlesource.com/40810
      Run-TryBot: Austin Clements <austin@google.com>
       Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
       TryBot-Result: Gobot Gobot <gobot@golang.org>
  15. 14 Apr, 2017 1 commit
    • runtime/trace: iterate over frames instead of PCs · 3249cb0a
      David Lazar authored
      Now the runtime/trace tests pass with -l=4.
      
      This also gets rid of the frames cache for multiple reasons:
      
      1) The frames cache was used to avoid repeated calls to funcname and
      funcline. Now these calls happen inside the CallersFrames iterator.
      
      2) Maintaining a frames cache is harder: map[uintptr]traceFrame
      doesn't work since each PC can map to multiple traceFrames.
      
      3) It's not clear that the cache is important.
      
      Change-Id: I2914ac0b3ba08e39b60149d99a98f9f532b35bbb
      Reviewed-on: https://go-review.googlesource.com/40591
      Run-TryBot: David Lazar <lazard@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Austin Clements <austin@google.com>
  16. 16 Mar, 2017 1 commit
  17. 06 Mar, 2017 1 commit
    • runtime: avoid repeated findmoduledatap calls · 0efc8b21
      Austin Clements authored
      Currently almost every function that deals with a *_func has to first
      look up the *moduledata for the module containing the function's entry
      point. This means we almost always do at least two identical module
      lookups whenever we deal with a *_func (one to get the *_func and
      another to get something from its module data) and sometimes several
      more.
      
      Fix this by making findfunc return a new funcInfo type that embeds
      *_func, but also includes the *moduledata, and making all of the
      functions that currently take a *_func instead take a funcInfo and use
      the already-found *moduledata.
      
      This transformation is trivial for the most part, since the *_func
      type is usually inferred. The annoying part is that we can no longer
      use nil to indicate failure, so this introduces a funcInfo.valid()
      method and replaces nil checks with calls to valid.
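       
       The shape of the change, as a self-contained sketch (the field and
       function bodies below are illustrative stand-ins rather than the
       runtime's exact declarations):
       
          package main
          
          import "fmt"
          
          type _func struct{ entry uintptr }
          type moduledata struct{ name string }
          
          // funcInfo bundles the found function with its module data so
          // callers don't repeat the module lookup.
          type funcInfo struct {
              *_func
              datap *moduledata
          }
          
          // valid replaces the old nil check on *_func.
          func (f funcInfo) valid() bool { return f._func != nil }
          
          func findfunc(pc uintptr) funcInfo {
              md := &moduledata{name: "main-module"} // looked up once
              return funcInfo{_func: &_func{entry: pc}, datap: md}
          }
          
          func main() {
              f := findfunc(0x1000)
              if f.valid() {
                  fmt.Println(f.entry, f.datap.name)
              }
          }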
      
      Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f
      Reviewed-on: https://go-review.googlesource.com/37331
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
  18. 17 Feb, 2017 1 commit
    • sync: make Mutex more fair · 0556e262
      Dmitry Vyukov authored
       Add a new starvation mode for Mutex.
       In starvation mode, ownership is directly handed off from the
       unlocking goroutine to the next waiter. Newly arriving goroutines
       don't compete for ownership.
       Unfair wait time is now limited to 1ms.
       Also fix a long-standing bug where goroutines were requeued
       at the tail of the wait queue. That led to even more unfair
       acquisition times with multiple waiters.
       
       Performance of normal mode is not considerably affected.
      
      Fixes #13086
      
       On the lockskew program provided in the issue:
      
      done in 1.207853ms
      done in 1.177451ms
      done in 1.184168ms
      done in 1.198633ms
      done in 1.185797ms
      done in 1.182502ms
      done in 1.316485ms
      done in 1.211611ms
      done in 1.182418ms
      
      name                    old time/op  new time/op   delta
      MutexUncontended-48     0.65ns ± 0%   0.65ns ± 1%     ~           (p=0.087 n=10+10)
      Mutex-48                 112ns ± 1%    114ns ± 1%   +1.69%        (p=0.000 n=10+10)
      MutexSlack-48            113ns ± 0%     87ns ± 1%  -22.65%         (p=0.000 n=8+10)
      MutexWork-48             149ns ± 0%    145ns ± 0%   -2.48%         (p=0.000 n=9+10)
      MutexWorkSlack-48        149ns ± 0%    122ns ± 3%  -18.26%         (p=0.000 n=6+10)
      MutexNoSpin-48           103ns ± 4%    105ns ± 3%     ~           (p=0.089 n=10+10)
      MutexSpin-48             490ns ± 4%    515ns ± 6%   +5.08%        (p=0.006 n=10+10)
      Cond32-48               13.4µs ± 6%   13.1µs ± 5%   -2.75%        (p=0.023 n=10+10)
      RWMutexWrite100-48      53.2ns ± 3%   41.2ns ± 3%  -22.57%        (p=0.000 n=10+10)
      RWMutexWrite10-48       45.9ns ± 2%   43.9ns ± 2%   -4.38%        (p=0.000 n=10+10)
      RWMutexWorkWrite100-48   122ns ± 2%    134ns ± 1%   +9.92%        (p=0.000 n=10+10)
      RWMutexWorkWrite10-48    206ns ± 1%    188ns ± 1%   -8.52%         (p=0.000 n=8+10)
      Cond32-24               12.1µs ± 3%   12.4µs ± 3%   +1.98%         (p=0.043 n=10+9)
      MutexUncontended-24     0.74ns ± 1%   0.75ns ± 1%     ~           (p=0.650 n=10+10)
      Mutex-24                 122ns ± 2%    124ns ± 1%   +1.31%        (p=0.007 n=10+10)
      MutexSlack-24           96.9ns ± 2%  102.8ns ± 2%   +6.11%        (p=0.000 n=10+10)
      MutexWork-24             146ns ± 1%    135ns ± 2%   -7.70%         (p=0.000 n=10+9)
      MutexWorkSlack-24        135ns ± 1%    128ns ± 2%   -5.01%         (p=0.000 n=10+9)
      MutexNoSpin-24           114ns ± 3%    110ns ± 4%   -3.84%        (p=0.000 n=10+10)
      MutexSpin-24             482ns ± 4%    475ns ± 8%     ~           (p=0.286 n=10+10)
      RWMutexWrite100-24      43.0ns ± 3%   43.1ns ± 2%     ~           (p=0.956 n=10+10)
      RWMutexWrite10-24       43.4ns ± 1%   43.2ns ± 1%     ~            (p=0.085 n=10+9)
      RWMutexWorkWrite100-24   130ns ± 3%    131ns ± 3%     ~           (p=0.747 n=10+10)
      RWMutexWorkWrite10-24    191ns ± 1%    192ns ± 1%     ~           (p=0.210 n=10+10)
      Cond32-12               11.5µs ± 2%   11.7µs ± 2%   +1.98%        (p=0.002 n=10+10)
      MutexUncontended-12     1.48ns ± 0%   1.50ns ± 1%   +1.08%        (p=0.004 n=10+10)
      Mutex-12                 141ns ± 1%    143ns ± 1%   +1.63%        (p=0.000 n=10+10)
      MutexSlack-12            121ns ± 0%    119ns ± 0%   -1.65%          (p=0.001 n=8+9)
      MutexWork-12             141ns ± 2%    150ns ± 3%   +6.36%         (p=0.000 n=9+10)
      MutexWorkSlack-12        131ns ± 0%    138ns ± 0%   +5.73%         (p=0.000 n=9+10)
      MutexNoSpin-12          87.0ns ± 1%   83.7ns ± 1%   -3.80%        (p=0.000 n=10+10)
      MutexSpin-12             364ns ± 1%    377ns ± 1%   +3.77%        (p=0.000 n=10+10)
      RWMutexWrite100-12      42.8ns ± 1%   43.9ns ± 1%   +2.41%         (p=0.000 n=8+10)
      RWMutexWrite10-12       39.8ns ± 4%   39.3ns ± 1%     ~            (p=0.433 n=10+9)
      RWMutexWorkWrite100-12   131ns ± 1%    131ns ± 0%     ~            (p=0.591 n=10+9)
      RWMutexWorkWrite10-12    173ns ± 1%    174ns ± 0%     ~            (p=0.059 n=10+8)
      Cond32-6                10.9µs ± 2%   10.9µs ± 2%     ~           (p=0.739 n=10+10)
      MutexUncontended-6      2.97ns ± 0%   2.97ns ± 0%     ~     (all samples are equal)
      Mutex-6                  122ns ± 6%    122ns ± 2%     ~           (p=0.668 n=10+10)
      MutexSlack-6             149ns ± 3%    142ns ± 3%   -4.63%        (p=0.000 n=10+10)
      MutexWork-6              136ns ± 3%    140ns ± 5%     ~           (p=0.077 n=10+10)
      MutexWorkSlack-6         152ns ± 0%    138ns ± 2%   -9.21%         (p=0.000 n=6+10)
      MutexNoSpin-6            150ns ± 1%    152ns ± 0%   +1.50%         (p=0.000 n=8+10)
      MutexSpin-6              726ns ± 0%    730ns ± 1%     ~           (p=0.069 n=10+10)
      RWMutexWrite100-6       40.6ns ± 1%   40.9ns ± 1%   +0.91%         (p=0.001 n=8+10)
      RWMutexWrite10-6        37.1ns ± 0%   37.0ns ± 1%     ~            (p=0.386 n=9+10)
      RWMutexWorkWrite100-6    133ns ± 1%    134ns ± 1%   +1.01%         (p=0.005 n=9+10)
      RWMutexWorkWrite10-6     152ns ± 0%    152ns ± 0%     ~     (all samples are equal)
      Cond32-2                7.86µs ± 2%   7.95µs ± 2%   +1.10%        (p=0.023 n=10+10)
      MutexUncontended-2      8.10ns ± 0%   9.11ns ± 4%  +12.44%         (p=0.000 n=9+10)
      Mutex-2                 32.9ns ± 9%   38.4ns ± 6%  +16.58%        (p=0.000 n=10+10)
      MutexSlack-2            93.4ns ± 1%   98.5ns ± 2%   +5.39%         (p=0.000 n=10+9)
      MutexWork-2             40.8ns ± 3%   43.8ns ± 7%   +7.38%         (p=0.000 n=10+9)
      MutexWorkSlack-2        98.6ns ± 5%  108.2ns ± 2%   +9.80%         (p=0.000 n=10+8)
      MutexNoSpin-2            399ns ± 1%    398ns ± 2%     ~             (p=0.463 n=8+9)
      MutexSpin-2             1.99µs ± 3%   1.97µs ± 1%   -0.81%          (p=0.003 n=9+8)
      RWMutexWrite100-2       37.6ns ± 5%   46.0ns ± 4%  +22.17%         (p=0.000 n=10+8)
      RWMutexWrite10-2        50.1ns ± 6%   36.8ns ±12%  -26.46%         (p=0.000 n=9+10)
      RWMutexWorkWrite100-2    136ns ± 0%    134ns ± 2%   -1.80%          (p=0.001 n=7+9)
      RWMutexWorkWrite10-2     140ns ± 1%    138ns ± 1%   -1.50%        (p=0.000 n=10+10)
      Cond32                  5.93µs ± 1%   5.91µs ± 0%     ~            (p=0.411 n=9+10)
      MutexUncontended        15.9ns ± 0%   15.8ns ± 0%   -0.63%          (p=0.000 n=8+8)
      Mutex                   15.9ns ± 0%   15.8ns ± 0%   -0.44%        (p=0.003 n=10+10)
      MutexSlack              26.9ns ± 3%   26.7ns ± 2%     ~           (p=0.084 n=10+10)
      MutexWork               47.8ns ± 0%   47.9ns ± 0%   +0.21%          (p=0.014 n=9+8)
      MutexWorkSlack          54.9ns ± 3%   54.5ns ± 3%     ~           (p=0.254 n=10+10)
      MutexNoSpin              786ns ± 2%    765ns ± 1%   -2.66%        (p=0.000 n=10+10)
      MutexSpin               3.87µs ± 1%   3.83µs ± 0%   -0.85%          (p=0.005 n=9+8)
      RWMutexWrite100         21.2ns ± 2%   21.0ns ± 1%   -0.88%         (p=0.018 n=10+9)
      RWMutexWrite10          22.6ns ± 1%   22.6ns ± 0%     ~             (p=0.471 n=9+9)
      RWMutexWorkWrite100      132ns ± 0%    132ns ± 0%     ~     (all samples are equal)
      RWMutexWorkWrite10       124ns ± 0%    123ns ± 0%     ~           (p=0.656 n=10+10)
      
      Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b
      Reviewed-on: https://go-review.googlesource.com/34310
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Russ Cox <rsc@golang.org>
  19. 14 Feb, 2017 1 commit
  20. 10 Feb, 2017 1 commit
    • cmd/trace: Record mark assists in execution traces · 2a74b9e8
      Heschi Kreinick authored
      During the mark phase of garbage collection, goroutines that allocate
      may be recruited to assist. This change creates trace events for mark
      assists and displays them similarly to sweep assists in the trace
      viewer.
      
       Mark assists are different from sweeps in that they can be preempted, so
       displaying them in the trace viewer is a little tricky -- we may need to
      synthesize multiple slices for one mark assist. This could have been
      done in the parser instead, but I thought it might be preferable to keep
      the parser as true to the event stream as possible.
      
      Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7
      Reviewed-on: https://go-review.googlesource.com/36015
      Run-TryBot: Heschi Kreinick <heschi@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Austin Clements <austin@google.com>
       Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
  21. 28 Oct, 2016 3 commits
    • runtime, cmd/trace: track goroutines blocked on GC assists · 6da83c6f
      Austin Clements authored
      Currently when a goroutine blocks on a GC assist, it emits a generic
      EvGoBlock event. Since assist blocking events and, in particular, the
      length of the blocked assist queue, are important for diagnosing GC
      behavior, this commit adds a new EvGoBlockGC event for blocking on a
      GC assist. The trace viewer uses this event to report a "waiting on
      GC" count in the "Goroutines" row. This makes sense because, unlike
      other blocked goroutines, these goroutines do have work to do, so
      being blocked on a GC assist is quite similar to being in the
      "runnable" state, which we also report in the trace viewer.
      
      Change-Id: Ic21a326992606b121ea3d3d00110d8d1fdc7a5ef
      Reviewed-on: https://go-review.googlesource.com/30704
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
    • runtime, cmd/trace: annotate different mark worker types · 68348394
      Austin Clements authored
      Currently mark workers are shown in the trace as regular goroutines
      labeled "runtime.gcBgMarkWorker". That's somewhat unhelpful to an end
      user because of the opaque label and particularly unhelpful to runtime
      developers because it doesn't distinguish the different types of mark
      workers.
      
      Fix this by introducing a variant of the GoStart event called
      GoStartLabel that lets the runtime indicate a label for a goroutine
      execution span and using this to label mark worker executions as "GC
      (<mode>)" in the trace viewer.
      
      Since this bumps the trace version to 1.8, we also add test data for
      1.7 traces.
      
      Change-Id: Id7b9c0536508430c661ffb9e40e436f3901ca121
      Reviewed-on: https://go-review.googlesource.com/30702
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
    • runtime: Profile goroutines holding contended mutexes. · ca922b6d
      Peter Weinberger authored
       runtime.SetMutexProfileFraction(n int) will capture 1/n-th of the stack
       traces of goroutines holding contended mutexes if n > 0. From runtime/pprof,
       pprof.Lookup("mutex").WriteTo writes the accumulated
       stack traces to w (in essentially the same format that blocking
       profiling uses).
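       
       A minimal end-to-end use of the new profile, using only the public
       runtime and runtime/pprof APIs:
       
          package main
          
          import (
              "os"
              "runtime"
              "runtime/pprof"
              "sync"
              "time"
          )
          
          func main() {
              runtime.SetMutexProfileFraction(5) // sample ~1 in 5 contention events
              defer runtime.SetMutexProfileFraction(0)
          
              var mu sync.Mutex
              var wg sync.WaitGroup
              for i := 0; i < 8; i++ {
                  wg.Add(1)
                  go func() {
                      defer wg.Done()
                      mu.Lock()
                      time.Sleep(time.Millisecond) // force contention
                      mu.Unlock()
                  }()
              }
              wg.Wait()
          
              // Write the accumulated contention stacks in text form.
              pprof.Lookup("mutex").WriteTo(os.Stdout, 1)
          }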
      
      Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a
       Reviewed-on: https://go-review.googlesource.com/29650
       Reviewed-by: Austin Clements <austin@google.com>
  22. 21 Oct, 2016 1 commit
  23. 15 Oct, 2016 1 commit
  24. 07 Oct, 2016 2 commits
    • cmd/trace: label mark termination spans as such · 94589054
      Austin Clements authored
      Currently these are labeled "MARK", which was accurate in the STW
      collector, but these really indicate mark termination now, since
      marking happens for the full duration of the concurrent GC. Re-label
      them as "MARK TERMINATION" to clarify this.
      
      Change-Id: Ie98bd961195acde49598b4fa3f9e7d90d757c0a6
       Reviewed-on: https://go-review.googlesource.com/30018
       Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
    • runtime: make next_gc ^0 when GC is disabled · fa9b57bb
      Austin Clements authored
      When GC is disabled, we set gcpercent to -1. However, we still use
      gcpercent to compute several values, such as next_gc and gc_trigger.
      These calculations are meaningless when gcpercent is -1 and result in
      meaningless values. This is okay in a sense because we also never use
      these values if gcpercent is -1, but they're confusing when exposed to
      the user, for example via MemStats or the execution trace. It's
      particularly unfortunate in the execution trace because it attempts to
      plot the underflowed value of next_gc, which scales all useful
      information in the heap row into oblivion.
      
      Fix this by making next_gc ^0 when gcpercent < 0. This has the
      advantage of being true in a way: next_gc is effectively infinite when
      gcpercent < 0. We can also detect this special value when updating the
      execution trace and report next_gc as 0 so it doesn't blow up the
      display of the heap line.
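       
       The sentinel convention, as a tiny sketch (illustrative only, not the
       runtime's actual code):
       
          package main
          
          import "fmt"
          
          // ^0 marks next_gc as "effectively infinite" while GC is disabled.
          const gcDisabledNextGC = ^uint64(0)
          
          // nextGCForTrace reports 0 for the sentinel so the trace viewer's
          // heap row isn't scaled into oblivion.
          func nextGCForTrace(nextGC uint64) uint64 {
              if nextGC == gcDisabledNextGC {
                  return 0
              }
              return nextGC
          }
          
          func main() {
              fmt.Println(nextGCForTrace(4 << 20))          // normal trigger passes through
              fmt.Println(nextGCForTrace(gcDisabledNextGC)) // GC off: report 0
          }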
      
      Change-Id: I4f366e4451f8892a4908da7b2b6086bdc67ca9a9
       Reviewed-on: https://go-review.googlesource.com/30016
       Reviewed-by: Rick Hudson <rlh@golang.org>
  25. 02 Sep, 2016 1 commit
  26. 22 Aug, 2016 1 commit
    • runtime: speed up StartTrace with lots of blocked goroutines · 747a158e
      Dmitry Vyukov authored
       In StartTrace we emit EvGoCreate for all existing goroutines.
       This includes a stack unwind to obtain the current stack.
       Real Go programs can contain hundreds of thousands of blocked goroutines.
       For such programs StartTrace can take up to a second (a few ms per goroutine).
      
      Obtain current stack ID once and use it for all EvGoCreate events.
      
       This speeds up StartTrace with 10K blocked goroutines from 20ms to 4ms
       (the win for StartTrace called from a net/http/pprof handler will be
       bigger, as the stack is deeper).
      
      Change-Id: I9e5ff9468331a840f8fdcdd56c5018c2cfde61fc
      Reviewed-on: https://go-review.googlesource.com/25573
      Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
  27. 23 Apr, 2016 1 commit
    • runtime: use per-goroutine sequence numbers in tracer · a3703618
      Dmitry Vyukov authored
       Currently the tracer uses a global sequencer, which introduces a
       significant slowdown on parallel machines (up to 10x).
       Replace the global sequencer with per-goroutine sequencers.
      
       If we assign per-goroutine sequence numbers to only 3 types
       of events (start, unblock and syscall exit), that is enough to
       restore a consistent partial ordering of all events. Even these
       events don't need sequence numbers all the time (if a goroutine
       starts on the same P where it was unblocked, then the start does
       not need a sequence number).
       The burden of restoring the order is put on the trace parser.
       Details of the algorithm are described in the comments.
      
      On http benchmark with GOMAXPROCS=48:
      no tracing: 5026 ns/op
      tracing: 27803 ns/op (+453%)
      with this change: 6369 ns/op (+26%, mostly for traceback)
      
      Also trace size is reduced by ~22%. Average event size before: 4.63
      bytes/event, after: 3.62 bytes/event.
      
      Besides running trace tests, I've also tested with manually broken
      cputicks (random skew for each event, per-P skew and episodic random skew).
      In all cases broken timestamps were detected and no test failures.
      
      Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa
      Reviewed-on: https://go-review.googlesource.com/21512
      Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
       Reviewed-by: Austin Clements <austin@google.com>
       TryBot-Result: Gobot Gobot <gobot@golang.org>
  28. 22 Apr, 2016 1 commit
  29. 11 Apr, 2016 1 commit
  30. 08 Apr, 2016 1 commit
  31. 27 Jan, 2016 1 commit
    • runtime: acquire stack lock in traceEvent · 08594ac7
      Austin Clements authored
      traceEvent records system call events after a G has already entered
      _Gsyscall, which means the garbage collector could be installing stack
      barriers in the G's stack during the traceEvent. If traceEvent
       attempts to capture the user stack during this, it may observe
       inconsistent stack barriers and panic. Fix this by acquiring the stack
      lock around the stack walk in traceEvent.
      
      Fixes #14101.
      
      Change-Id: I15f0ab0c70c04c6e182221f65a6f761c5a896459
      Reviewed-on: https://go-review.googlesource.com/18973
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
       Reviewed-by: Russ Cox <rsc@golang.org>
  32. 19 Nov, 2015 1 commit
  33. 12 Nov, 2015 1 commit