Commits · 57afa76471ccb3fd9e92349825f90b6c354fc9b5 · Kirill Smelkov / go

27 Apr, 2015 9 commits

runtime: add ragged global barrier function · 57afa764

Austin Clements authored Mar 27, 2015

This adds forEachP, which performs a general-purpose ragged global
barrier. forEachP takes a callback and invokes it for every P at a GC
safe point.

Ps that are idle or in a syscall are considered to be at a continuous
safe point. forEachP ensures that these Ps do not change state by
forcing all syscall Ps into idle and holding the sched.lock.

To ensure that Ps do not enter syscall or idle without running the
safe-point function, this adds checks for a pending callback every
place there is currently a gcwaiting check.

We'll use forEachP to replace the STW around enabling the write
barrier and to replace the current asynchronous per-M wbuf cache with
a cooperatively managed per-P gcWork cache.

Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc
Reviewed-on: https://go-review.googlesource.com/8206
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>

57afa764

Revert "cmd/dist: consolidate runtime CPU tests" · 81c2233b

Josh Bleecher Snyder authored Apr 27, 2015

This reverts commit a9e50a6b.

Change-Id: I3c5e459f1030e36bc249910facdae12303a44151
Reviewed-on: https://go-review.googlesource.com/9394
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>

81c2233b

cmd/dist: consolidate runtime CPU tests · a9e50a6b

Josh Bleecher Snyder authored Apr 24, 2015

Instead of running:

go test -short runtime -cpu=1
go test -short runtime -cpu=2
go test -short runtime -cpu=4

Run just:

go test -short runtime -cpu=1,2,4

This is a return to the Go 1.4.2 behavior.

We lose incremental display of progress and
per-cpu timing information, but we don't have
to recompile and relink the runtime test,
which is slow.

This cuts about 10s off all.bash.

Updates #10571.

Change-Id: I6e8c7149780d47439f8bcfa888e6efc84290c60a
Reviewed-on: https://go-review.googlesource.com/9350
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

a9e50a6b

cmd/internal/ld: remove pointless allocs · 2692f483

Josh Bleecher Snyder authored Apr 24, 2015

Reduces allocs linking cmd/go and runtime.test
by ~13%. No functional changes.

The most easily addressed sources of allocations
after this are expandpkg, rdstring, and symbuf
string conversion.

These can be reduced by interning strings,
but that increases the overall memory footprint.

Change-Id: Ifedefc9f2a0403bcc75460d6b139e8408374e058
Reviewed-on: https://go-review.googlesource.com/9391
Reviewed-by: David Crawshaw <crawshaw@golang.org>

2692f483

encoding/xml: do not escape newlines · 4a3e000a

Roger Peppe authored Apr 24, 2015

There is no need to escape newlines in char data -
it makes the XML larger and harder to read.

Change-Id: I1c1fcee1bdffc705c7428f89ca90af8085d6fb73
Reviewed-on: https://go-review.googlesource.com/9310
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

4a3e000a

runtime: reset spinning in mspinning if work was ready()ed · b0b1a660

Austin Clements authored Apr 24, 2015

This fixes a bug where the runtime ready()s a goroutine while setting
up a new M that's initially marked as spinning, causing the scheduler
to later panic when it finds work in the run queue of a P associated
with a spinning M. Specifically, the sequence of events that can lead
to this is:

1) sysmon calls handoffp to hand off a P stolen from a syscall.

2) handoffp sees no pending work on the P, so it calls startm with
spinning set.

3) startm calls newm, which in turn calls allocm to allocate a new M.

4) allocm "borrows" the P we're handing off in order to do allocation
and performs this allocation.

5) This allocation may assist the garbage collector, and this assist
may detect the end of concurrent mark and ready() the main GC
goroutine to signal this.

6) This ready()ing puts the GC goroutine on the run queue of the
borrowed P.

7) newm starts the OS thread, which runs mstart and subsequently
mstart1, which marks the M spinning because startm was called with
spinning set.

8) mstart1 enters the scheduler, which panics because there's work on
the run queue, but the M is marked spinning.

To fix this, before marking the M spinning in step 7, add a check to
see if work has been added to the P's run queue. If this is the case,
undo the spinning instead.

Fixes #10573.

Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5
Reviewed-on: https://go-review.googlesource.com/9332
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>

b0b1a660

runtime: panic when idling a P with runnable Gs · 2a46f55b

Austin Clements authored Apr 24, 2015

This adds a check that we never put a P on the idle list when it has
work on its local run queue.

Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7
Reviewed-on: https://go-review.googlesource.com/9324
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

2a46f55b

runtime: tighten select permutation generation · fd5540e7

Josh Bleecher Snyder authored Dec 18, 2014

This is the optimization made to math/rand in CL 21030043.

Change-Id: I231b24fa77cac1fe74ba887db76313b5efaab3e8
Reviewed-on: https://go-review.googlesource.com/9269
Reviewed-by: Minux Ma <minux@golang.org>

fd5540e7

debug/dwarf: update class_string.go to add ClassReferenceSig using stringer. · 3787950a

John Dethridge authored Apr 22, 2015

Change-Id: I677a5ee273a4d285a8adff71ffcfeac34afc887f
Reviewed-on: https://go-review.googlesource.com/9235
Reviewed-by: Austin Clements <austin@google.com>

3787950a

26 Apr, 2015 11 commits

crypto/tls: call GetCertificate if Certificates is empty. · cba882ea

Adam Langley authored Apr 12, 2015

This change causes the GetCertificate callback to be called if
Certificates is empty. Previously this configuration would result in an
error.

This allows people to have servers that depend entirely on dynamic
certificate selection, even when the client doesn't send SNI.

Fixes #9208.

Change-Id: I2f5a5551215958b88b154c64a114590300dfc461
Reviewed-on: https://go-review.googlesource.com/8792
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>

cba882ea

crypto/tls: add OCSP response to ConnectionState · ac2bf8ad

Jonathan Rudenberg authored Apr 26, 2015

The OCSP response is currently only exposed via a method on Conn,
which makes it inaccessible when using wrappers like net/http. The
ConnectionState structure is typically available even when using
wrappers and contains many of the other handshake details, so this
change exposes the stapled OCSP response in that structure.

Change-Id: If8dab49292566912c615d816321b4353e711f71f
Reviewed-on: https://go-review.googlesource.com/9361
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>

ac2bf8ad

crypto/elliptic: don't unmarshal points that are off the curve · d86b8d34

David Leon Gil authored Jan 06, 2015

At present, Unmarshal does not check that the point it unmarshals
is actually *on* the curve. (It may be on the curve's twist.)

This can, as Daniel Bernstein has pointed out at great length,
lead to quite devastating attacks. And 3 out of the 4 curves
supported by crypto/elliptic have twists with cofactor != 1;
P-224, in particular, has a sufficiently large cofactor that it
is likely that conventional dlog attacks might be useful.

This closes #2445, filed by Watson Ladd.

To explain why this was (partially) rejected before being accepted:

In the general case, for curves with cofactor != 1, verifying subgroup
membership is required. (This is expensive and hard-to-implement.)
But, as recent discussion during the CFRG standardization process
has brought out, small-subgroup attacks are much less damaging than
a twist attack.

Change-Id: I284042eb9954ff9b7cde80b8b693b1d468c7e1e8
Reviewed-on: https://go-review.googlesource.com/2421
Reviewed-by: Adam Langley <agl@golang.org>

d86b8d34

crypto/x509: CertificateRequest signature verification · 54bb4b9f

Paul van Brouwershaven authored Mar 11, 2015

This implements a method for x509.CertificateRequest to prevent
certain attacks and to allow a CA/RA to properly check the validity
of the binding between an end entity and a key pair, to prove that
it has possession of (i.e., is able to use) the private key
corresponding to the public key for which a certificate is requested.

RFC 2986 section 3 states:

"A certification authority fulfills the request by authenticating the
requesting entity and verifying the entity's signature, and, if the
request is valid, constructing an X.509 certificate from the
distinguished name and public key, the issuer name, and the
certification authority's choice of serial number, validity period,
and signature algorithm."
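The check described above can be sketched as follows, assuming the method in question is `CheckSignature` on `x509.CertificateRequest`. The helpers `makeCSR` and `verifyCSR`, the ECDSA P-256 key, and the subject name are choices made for this example only.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
)

// makeCSR creates a DER-encoded request signed by a fresh ECDSA key.
func makeCSR() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.CertificateRequest{
		Subject: pkix.Name{CommonName: "example.test"},
	}
	return x509.CreateCertificateRequest(rand.Reader, tmpl, key)
}

// verifyCSR parses the request and checks that it was signed by the
// private key matching the public key it carries (proof of possession).
func verifyCSR(der []byte) error {
	csr, err := x509.ParseCertificateRequest(der)
	if err != nil {
		return err
	}
	return csr.CheckSignature()
}

func main() {
	der, err := makeCSR()
	if err != nil {
		fmt.Println("create:", err)
		return
	}
	fmt.Println("verify:", verifyCSR(der))
}
```

A CA or RA would run a check like `verifyCSR` before building a certificate from the request's name and public key.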

Change-Id: I37795c3b1dfdfdd455d870e499b63885eb9bda4f
Reviewed-on: https://go-review.googlesource.com/7371
Reviewed-by: Adam Langley <agl@golang.org>

54bb4b9f

crypto/tls: add support for session ticket key rotation · bff14175

Jonathan Rudenberg authored Apr 17, 2015

This change adds a new method to tls.Config, SetSessionTicketKeys, that
changes the key used to encrypt session tickets while the server is
running. Additional keys may be provided that will be used to maintain
continuity while rotating keys. If a ticket encrypted with an old key is
provided by the client, the server will resume the session and provide
the client with a ticket encrypted using the new key.

Fixes #9994

Change-Id: Idbc16b10ff39616109a51ed39a6fa208faad5b4e
Reviewed-on: https://go-review.googlesource.com/9072
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>
Reviewed-by: Adam Langley <agl@golang.org>

bff14175

cmd/pprof: handle empty profile gracefully · 14a4649f

Håvard Haugen authored Jan 09, 2015

The command "go tool pprof -top $GOROOT/bin/go /dev/null" now logs that
the profile is empty instead of panicking.

Fixes #9207

Change-Id: I3d55c179277cb19ad52c8f24f1aca85db53ee08d
Reviewed-on: https://go-review.googlesource.com/2571
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

14a4649f

crypto/tls: add support for Certificate Transparency · 02e69c4b

Jonathan Rudenberg authored Apr 16, 2015

This change adds support for serving and receiving Signed Certificate
Timestamps as described in RFC 6962.

The server is now capable of serving SCTs listed in the Certificate
structure. The client now asks for SCTs and, if any are received,
they are exposed in the ConnectionState structure.

Fixes #10201

Change-Id: Ib3adae98cb4f173bc85cec04d2bdd3aa0fec70bb
Reviewed-on: https://go-review.googlesource.com/8988
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>

02e69c4b

encoding/csv: Preallocate records slice · 2db58f8f

Justin Nuß authored Apr 13, 2015

Currently parseRecord will always start with a nil
slice and then resize the slice on append. For input
with a fixed number of fields per record we can preallocate
the slice to avoid having to resize the slice.

This change implements this optimization by using
FieldsPerRecord as capacity if it's > 0 and also adds a
benchmark to better show the differences.
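The preallocation pattern can be sketched in isolation as below. This is not the actual parseRecord code: `splitFields` is a hypothetical, simplified splitter with no quoting support, and its `fieldsPerRecord` parameter only mirrors the role of the Reader's FieldsPerRecord.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFields splits line on commas. When fieldsPerRecord > 0 the result
// slice is preallocated with that capacity so append never has to grow it
// for well-formed input.
func splitFields(line string, fieldsPerRecord int) []string {
	var fields []string
	if fieldsPerRecord > 0 {
		fields = make([]string, 0, fieldsPerRecord)
	}
	for {
		i := strings.IndexByte(line, ',')
		if i < 0 {
			return append(fields, line)
		}
		fields = append(fields, line[:i])
		line = line[i+1:]
	}
}

func main() {
	rec := splitFields("a,b,c", 3)
	fmt.Println(rec, len(rec), cap(rec))
}
```

Starting from a nil slice still works; it just reallocates as the record grows, which is the cost the benchmark numbers measure.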

benchmark         old ns/op     new ns/op     delta
BenchmarkRead     19741         17909         -9.28%

benchmark         old allocs     new allocs     delta
BenchmarkRead     59             41             -30.51%

benchmark         old bytes     new bytes     delta
BenchmarkRead     6276          5844          -6.88%

Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab
Reviewed-on: https://go-review.googlesource.com/8880
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

2db58f8f

runtime: signal forwarding for darwin/amd64 · a5b693b4

David Crawshaw authored Apr 24, 2015

Follows the linux signal forwarding semantics from
http://golang.org/cl/8712, sharing the implementation of sigfwdgo.
Forwarding for 386, arm, and arm64 will follow.

Change-Id: I6bf30d563d19da39b6aec6900c7fe12d82ed4f62
Reviewed-on: https://go-review.googlesource.com/9302
Reviewed-by: Ian Lance Taylor <iant@golang.org>

a5b693b4

cmd/internal/ld: R_TLS_LE is fine on Darwin too · c20ff36f

Michael Hudson-Doyle authored Apr 26, 2015

Sorry about this.

Fixes #10575

Change-Id: I2de23be68e7d822d182e5a0d6a00c607448d861e
Reviewed-on: https://go-review.googlesource.com/9341
Reviewed-by: Minux Ma <minux@golang.org>

c20ff36f

testing/quick: align tests with reflect.Kind. · b6a0450b

Matt T. Proud authored Apr 12, 2015

This commit is largely cosmetic in the sense that it is the remnants
of a change proposal I had prepared for testing/quick, until I
discovered that 3e9ed273 already implemented the feature I was looking
for: quick.Value() for reflect.Kind Array.  What you see is a merger
and manual cleanup; the cosmetic cleanups are as follows:

(1.) Keeping the TestCheckEqual and its associated input functions
in the same order as type kinds defined in reflect.Kind.  Since
3e9ed273 was committed, the test case began to diverge from the
constant's ordering.

(2.) The `Intptr` derivatives existed to exercise quick.Value with
reflect.Kind's `Ptr` constant.  All `Intptr` (unrelated to `uintptr`)
in the test have been migrated to ensure the parallelism of the
listings and to convey that `Intptr` is not special.

(3.) Correct a misspelling (transposition) of "alias", whereby it is
named as "Alais".

Change-Id: I441450db16b8bb1272c52b0abcda3794dcd0599d
Reviewed-on: https://go-review.googlesource.com/8804
Reviewed-by: Russ Cox <rsc@golang.org>

b6a0450b

25 Apr, 2015 1 commit

cmd/8l, cmd/internal/ld, cmd/internal/obj/x86: stop incorrectly using the term "initial exec" · 264858c4

Michael Hudson-Doyle authored Apr 23, 2015

The long comment block in obj6.go:progedit talked about the two code sequences
for accessing g as "local exec" and "initial exec", but really they are both forms
of local exec. This stuff is confusing enough without using the wrong words for
things, so rewrite it to talk about 2-instruction and 1-instruction sequences.
Unfortunately the confusion has made it into code, with the R_TLS_IE relocation
now doing double duty as meaning actual initial exec when externally linking and
boring old local exec when linking internally (half of this is my fault). So this
stops using R_TLS_IE in the local exec case. There is a chance this might break
plan9 or windows, but I don't think so. Next step is working out what the heck is
going on on ARM...

Change-Id: I09da4388210cf49dbc99fd25f5172bbe517cee57
Reviewed-on: https://go-review.googlesource.com/9273
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

264858c4

24 Apr, 2015 19 commits

runtime: Fix bug due to elided return. · ada8cdb9

Rick Hudson authored Apr 23, 2015

A previous change to mbitmap.go dropped a return on a
path that seems not to be exercised. This was a mistake that
this CL fixes.

Change-Id: I715ee4ef08f5bf8d9f53cee84e8fb31a237e2d43
Reviewed-on: https://go-review.googlesource.com/9295
Reviewed-by: Austin Clements <austin@google.com>

ada8cdb9

cmd/internal/ld: fix R_TLS handling now Xsym is not read from object file · ccc76dba

Michael Hudson-Doyle authored Apr 23, 2015

I think this should fix the arm build. A proper fix involves making the handling
of tlsg less fragile; I'll try that tomorrow.

Update #10557

Change-Id: I9b1b666737fb40aebb6f284748509afa8483cce5
Reviewed-on: https://go-review.googlesource.com/9272
Reviewed-by: Dave Cheney <dave@cheney.net>
Run-TryBot: Dave Cheney <dave@cheney.net>

ccc76dba

runtime: replace per-M workbuf cache with per-P gcWork cache · 1b4025f4

Austin Clements authored Apr 19, 2015

Currently, each M has a cache of the most recently used *workbuf. This
is used primarily by the write barrier so it doesn't have to access
the global workbuf lists on every write barrier. It's also used by
stack scanning because it's convenient.

This cache is important for write barrier performance, but this
particular approach has several downsides. It's faster than no cache,
but far from optimal (as the benchmarks below show). It's complex:
access to the cache is sprinkled through most of the workbuf list
operations and it requires special care to transform into and back out
of the gcWork cache that's actually used for scanning and marking. It
requires atomic exchanges to take ownership of the cached workbuf and
to return it to the M's cache even though it's almost always used by
only the current M. Since it's per-M, flushing these caches is O(# of
Ms), which may be high. And it has some significant subtleties: for
example, in general the cache shouldn't be used after the
harvestwbufs() in mark termination because it could hide work from
mark termination, but stack scanning can happen after this and *will*
use the cache (but it turns out this is okay because it will always be
followed by a getfull(), which drains the cache).

This change replaces this cache with a per-P gcWork object. This
gcWork cache can be used directly by scanning and marking (as long as
preemption is disabled, which is a general requirement of gcWork).
Since it's per-P, it doesn't require synchronization, which simplifies
things and means the only atomic operations in the write barrier are
occasionally fetching new work buffers and setting a mark bit if the
object isn't already marked. This cache can be flushed in O(# of Ps),
which is generally small. It follows a simple flushing rule: the cache
can be used during any phase, but during mark termination it must be
flushed before allowing preemption. This also makes the dispose during
mutator assist no longer necessary, which eliminates the vast majority
of gcWork dispose calls and reduces contention on the global workbuf
lists. And it's a lot faster on some benchmarks:

benchmark                          old ns/op       new ns/op       delta
BenchmarkBinaryTree17              11963668673     11206112763     -6.33%
BenchmarkFannkuch11                2643217136      2649182499      +0.23%
BenchmarkFmtFprintfEmpty           70.4            70.2            -0.28%
BenchmarkFmtFprintfString          364             307             -15.66%
BenchmarkFmtFprintfInt             317             282             -11.04%
BenchmarkFmtFprintfIntInt          512             483             -5.66%
BenchmarkFmtFprintfPrefixedInt     404             380             -5.94%
BenchmarkFmtFprintfFloat           521             479             -8.06%
BenchmarkFmtManyArgs               2164            1894            -12.48%
BenchmarkGobDecode                 30366146        22429593        -26.14%
BenchmarkGobEncode                 29867472        26663152        -10.73%
BenchmarkGzip                      391236616       396779490       +1.42%
BenchmarkGunzip                    96639491        96297024        -0.35%
BenchmarkHTTPClientServer          100110          70763           -29.31%
BenchmarkJSONEncode                51866051        52511382        +1.24%
BenchmarkJSONDecode                103813138       86094963        -17.07%
BenchmarkMandelbrot200             4121834         4120886         -0.02%
BenchmarkGoParse                   16472789        5879949         -64.31%
BenchmarkRegexpMatchEasy0_32       140             140             +0.00%
BenchmarkRegexpMatchEasy0_1K       394             394             +0.00%
BenchmarkRegexpMatchEasy1_32       120             120             +0.00%
BenchmarkRegexpMatchEasy1_1K       621             614             -1.13%
BenchmarkRegexpMatchMedium_32      209             202             -3.35%
BenchmarkRegexpMatchMedium_1K      54889           55175           +0.52%
BenchmarkRegexpMatchHard_32        2682            2675            -0.26%
BenchmarkRegexpMatchHard_1K        79383           79524           +0.18%
BenchmarkRevcomp                   584116718       584595320       +0.08%
BenchmarkTemplate                  125400565       109620196       -12.58%
BenchmarkTimeParse                 386             387             +0.26%
BenchmarkTimeFormat                580             447             -22.93%

(Best out of 10 runs. The delta of averages is similar.)

This also puts us in a good position to flush these caches when
nearing the end of concurrent marking, which will let us increase the
size of the work buffers while still controlling mark termination
pause time.

Change-Id: I2dd94c8517a19297a98ec280203cccaa58792522
Reviewed-on: https://go-review.googlesource.com/9178
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

1b4025f4

runtime: fix check for pending GC work · d1cae635

Austin Clements authored Apr 23, 2015

When findRunnable considers running a fractional mark worker, it first
checks if there's any work to be done; if there isn't there's no point
in running the worker because it will just reschedule immediately.
However, currently findRunnable just checks work.full and
work.partial, whereas getfull can *also* draw work from m.currentwbuf.
As a result, findRunnable may not start a worker even though there
actually is work.

This problem manifests itself in occasional failures of the
test/init1.go test. This test is unusual because it performs a large
amount of allocation without executing any write barriers, which means
there's nothing to force the pointers in currentwbuf out to the
work.partial/full lists where findRunnable can see them.

This change fixes this problem by making findRunnable also check for a
currentwbuf. This aligns findRunnable with trygetfull's notion of
whether or not there's work.

Change-Id: Ic76d22b7b5d040bc4f58a6b5975e9217650e66c4
Reviewed-on: https://go-review.googlesource.com/9299
Reviewed-by: Russ Cox <rsc@golang.org>

d1cae635

runtime: start dedicated mark workers even if there's no work · 26eac917

Austin Clements authored Apr 23, 2015

Currently, findRunnable only considers running a mark worker if
there's work in the work queue. In principle, this can delay the start
of the desired number of dedicated mark workers if there's no work
pending. This is unlikely to occur in practice, since there should be
work queued from the scan phase, but if it were to come up, a CPU hog
mutator could slow down or delay garbage collection.

This check makes sense for fractional mark workers, since they'll just
return to the scheduler immediately if there's no work, but we want
the scheduler to start all of the dedicated mark workers promptly,
even if there's currently no queued work. Hence, this change moves the
pending work check after the check for starting a dedicated worker.

Change-Id: I52b851cc9e41f508a0955b3f905ca80f109ea101
Reviewed-on: https://go-review.googlesource.com/9298
Reviewed-by: Rick Hudson <rlh@golang.org>

26eac917

runtime: fix some out-of-date comments · 711a1642

Austin Clements authored Apr 23, 2015

bgMarkCount no longer exists.

Change-Id: I3aa406fdccfca659814da311229afbae55af8304
Reviewed-on: https://go-review.googlesource.com/9297
Reviewed-by: Rick Hudson <rlh@golang.org>

711a1642

misc/cgo/testcshared: make test.bash resilient against noise. · e9a89b80

Hyang-Ah Hana Kim authored Apr 24, 2015

Instead of comparing against the entire output that may include
verbose warning messages, use the last line of the output and check
it includes the expected success message (PASS).

Change-Id: Iafd583ee5529a8aef5439b9f1f6ce0185e4b1331
Reviewed-on: https://go-review.googlesource.com/9304
Reviewed-by: David Crawshaw <crawshaw@golang.org>

e9a89b80

cmd/go: rename doc.go to alldocs.go in preparation for "go doc" · b3000b6f

Rob Pike authored Apr 24, 2015

Also rename and update mkdoc.sh to mkalldocs.sh

Change-Id: Ief3673c22d45624e173fc65ee279cea324da03b5
Reviewed-on: https://go-review.googlesource.com/9226
Reviewed-by: Russ Cox <rsc@golang.org>

b3000b6f

runtime: implement xadduintptr and update system mstats using it · 6ad33be2

Srdjan Petrovic authored Apr 16, 2015

The motivation is that sysAlloc/Free() currently aren't safe to be
called without a valid G, because arm's xadd64() uses locks that require
a valid G.

The solution here was proposed by Dmitry Vyukov: use xadduintptr()
instead of xadd64(), until arm can support xadd64 on all of its
architectures (not a trivial task for arm).

Change-Id: I250252079357ea2e4360e1235958b1c22051498f
Reviewed-on: https://go-review.googlesource.com/9002
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

6ad33be2

misc/cgo/testcshared: add a c-shared test for android/arm. · 85669799

Hyang-Ah Hana Kim authored Apr 23, 2015

- main3.c tests main.main is exported when compiled for GOOS=android.
- wait longer for main2.c (it's slow on android/arm)
- rearranged test.bash

Fixes #10070.

Change-Id: I6e5a98d1c5fae776afa54ecb5da633b59b269316
Reviewed-on: https://go-review.googlesource.com/9296
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>

85669799

cmd/internal/gc, cmd/internal/ld, cmd/internal/obj: teach compiler about local symbols · 029c7bbd

Michael Hudson-Doyle authored Apr 18, 2015

This lets us avoid loading string constants via the GOT and (together with
http://golang.org/cl/9102) results in the fannkuch benchmark having very similar
register usage with -dynlink as without.

Change-Id: Ic3892b399074982b76773c3e547cfbba5dabb6f9
Reviewed-on: https://go-review.googlesource.com/9103
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

029c7bbd

runtime: simplify process for starting GC goroutine · 0e6a6c51

Austin Clements authored Apr 22, 2015

Currently, when allocation reaches the GC trigger, the runtime uses
readyExecute to start the GC goroutine immediately rather than wait
for the scheduler to get around to the GC goroutine while the mutator
continues to grow the heap.

Now that the scheduler runs the most recently readied goroutine when a
goroutine yields its time slice, this rigmarole is no longer
necessary. The runtime can simply ready the GC goroutine and yield
from the readying goroutine.

Change-Id: I3b4ebadd2a72a923b1389f7598f82973dd5c8710
Reviewed-on: https://go-review.googlesource.com/9292
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>

0e6a6c51

runtime: use park/ready to wake up GC at end of concurrent mark · ce502b06

Austin Clements authored Apr 22, 2015

Currently, the main GC goroutine sleeps on a note during concurrent
mark and the first background mark worker or assist to finish marking
wakes up that note to let the main goroutine proceed into mark
termination. Unfortunately, the latency of this wakeup can be quite
high, since the GC goroutine will typically have lost its P while in
the futex sleep, meaning it will be placed on the global run queue and
will wait there until some P is kind enough to pick it up. This delay
gives the mutator more time to allocate and create floating garbage,
growing the heap unnecessarily. Worse, it's likely that background
marking has stopped at this point (unless GOMAXPROCS>4), so anything
that's allocated and published to the heap during this window will
have to be scanned during mark termination while the world is stopped.

This change replaces the note sleep/wakeup with a gopark/ready
scheme. This keeps the wakeup inside the Go scheduler and lets the
garbage collector take advantage of the new scheduler semantics that
run the ready()d goroutine immediately when the ready()ing goroutine
sleeps.

For the json benchmark from x/benchmarks with GOMAXPROCS=4, this
reduces the delay in waking up the GC goroutine and entering mark
termination once concurrent marking is done from ~100ms to typically
<100µs.

Change-Id: Ib11f8b581b8914f2d68e0094f121e49bac3bb384
Reviewed-on: https://go-review.googlesource.com/9291
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

ce502b06

runtime: use timer for GC control revise rather than timeout · 4e32718d

Austin Clements authored Apr 22, 2015

Currently, we use a note sleep with a timeout in a loop in func gc to
periodically revise the GC control variables. Replace this with a
fully blocking note sleep and use a periodic timer to trigger the
revise instead. This is a step toward replacing the note sleep in func
gc.

Change-Id: I2d562f6b9b2e5f0c28e9a54227e2c0f8a2603f63
Reviewed-on: https://go-review.googlesource.com/9290
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

4e32718d

runtime: yield time slice to most recently readied G · e870f06c

Austin Clements authored Apr 22, 2015

Currently, when the runtime ready()s a G, it adds it to the end of the
current P's run queue and continues running. If there are many other
things in the run queue, this can result in a significant delay before
the ready()d G actually runs and can hurt fairness when other Gs in
the run queue are CPU hogs. For example, if there are three Gs sharing
a P, one of which is a CPU hog that never voluntarily gives up the P
and the other two of which are doing small amounts of work and
communicating back and forth on an unbuffered channel, the two
communicating Gs will get very little CPU time.

Change this so that when G1 ready()s G2 and then blocks, the scheduler
immediately hands off the remainder of G1's time slice to G2. In the
above example, the two communicating Gs will now act as a unit and
together get half of the CPU time, while the CPU hog gets the other
half of the CPU time.

This fixes the problem demonstrated by the ping-pong benchmark added
in the previous commit:

benchmark                old ns/op     new ns/op     delta
BenchmarkPingPongHog     684287        825           -99.88%

On the x/benchmarks suite, this change improves the performance of
garbage by ~6% (for GOMAXPROCS=1 and 4), and json by 28% and 36% for
GOMAXPROCS=1 and 4. It has negligible effect on heap size.

This has no effect on the go1 benchmark suite since those benchmarks
are mostly single-threaded.

Change-Id: I858a08eaa78f702ea98a5fac99d28a4ac91d339f
Reviewed-on: https://go-review.googlesource.com/9289
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

e870f06c

runtime: benchmark for ping-pong in the presence of a CPU hog · da0e37fa

Austin Clements authored Apr 22, 2015

This benchmark demonstrates a current problem with the scheduler where
a set of frequently communicating goroutines get very little CPU time
in the presence of another goroutine that hogs that CPU, even if one
of those communicating goroutines is always runnable.
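The workload shape described here can be sketched as a standalone program. This is not the runtime's BenchmarkPingPongHog; `pingPong` is a hypothetical helper that only mirrors the idea of two goroutines exchanging values on unbuffered channels while a third spins. To approximate the "sharing a P" condition one would additionally restrict the program to a single P (for example with `runtime.GOMAXPROCS(1)`), which this sketch leaves at the default.

```go
package main

import (
	"fmt"
	"sync"
)

// pingPong bounces a counter between two goroutines over unbuffered
// channels while a third goroutine spins, and returns the final count.
func pingPong(rounds int) int {
	ping, pong := make(chan int), make(chan int)
	stop := make(chan struct{})

	var hog sync.WaitGroup
	hog.Add(1)
	go func() { // CPU hog: spins until told to stop
		defer hog.Done()
		for {
			select {
			case <-stop:
				return
			default:
			}
		}
	}()

	go func() { // ponger: sends each value back, incremented by one
		for v := range ping {
			pong <- v + 1
		}
	}()

	v := 0
	for i := 0; i < rounds; i++ {
		ping <- v // hand the value to the ponger
		v = <-pong
	}
	close(ping)
	close(stop)
	hog.Wait()
	return v
}

func main() {
	fmt.Println(pingPong(1000))
}
```

Timing the loop with and without the hog goroutine gives a rough feel for the per-switch cost that the benchmark reports.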

Currently it takes about 0.5 milliseconds to switch between
ping-ponging goroutines in the presence of a CPU hog:

BenchmarkPingPongHog	    2000	    684287 ns/op

Change-Id: I278848c84f778de32344921ae8a4a8056e4898b0
Reviewed-on: https://go-review.googlesource.com/9288
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

da0e37fa

runtime: factor checking if P run queue is empty · e5e52f4f

Austin Clements authored Apr 22, 2015

There are a variety of places where we check if a P's run queue is
empty. This test is about to get slightly more complicated, so factor
it out into a new function, runqempty. This function is inlinable, so
this has no effect on performance.

Change-Id: If4a0b01ffbd004937de90d8d686f6ded4aad2c6b
Reviewed-on: https://go-review.googlesource.com/9287
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

e5e52f4f

cmd/internal/gc: add and test write barrier debug output · 9406f68e

Russ Cox authored Apr 17, 2015

We can expand the test cases as we discover problems.
These are some basic tests plus all the things I got wrong
in some recent work.

Change-Id: Id875fcfaf74eb087ae42b441fe47a34c5b8ccb39
Reviewed-on: https://go-review.googlesource.com/9158
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

9406f68e

hash/crc32: clarify documentation · 80f575b7

Aamir Khan authored Apr 22, 2015

Explicitly specify that we represent the polynomial in reversed notation.

Fixes #8229

Change-Id: Idf094c01fd82f133cd0c1b50fa967d12c577bdb5
Reviewed-on: https://go-review.googlesource.com/9237
Reviewed-by: David Chase <drchase@google.com>

80f575b7