Commit 8f34b253 authored by Austin Clements's avatar Austin Clements

runtime: retry GC assist until debt is paid off

Currently, there are three ways to satisfy a GC assist: 1) the mutator
steals credit from background GC, 2) the mutator actually does GC
work, and 3) there is no more work available. 3 was never really
intended as a way to satisfy an assist, and it causes problems: there
are periods when it's expected that the GC won't have any work, such
as when transitioning from mark 1 to mark 2 and from mark 2 to mark
termination. During these periods, there's no back-pressure on rapidly
allocating mutators, which lets them race ahead of the heap goal.

For example, test/init1.go and the runtime/trace test both have small
reachable heaps and contain loops that rapidly allocate large garbage
byte slices. This bug lets these tests exceed the heap goal by several
orders of magnitude.

Fix this by forcing the assist (and hence the allocation) to block
until it can satisfy its debt via either 1 or 2, or the GC cycle
terminates.

This fixes one the causes of #11677. It's still possible to overshoot
the GC heap goal, but with this change the overshoot is almost exactly
by the amount of allocation that happens during the concurrent scan
phase, between when the heap passes the GC trigger and when the GC
enables assists.

Change-Id: I5ef4edcb0d2e13a1e432e66e8245f2bd9f8995be
Reviewed-on: https://go-review.googlesource.com/12671Reviewed-by: default avatarRuss Cox <rsc@golang.org>
parent 500c88d4
...@@ -177,6 +177,7 @@ func gcAssistAlloc(size uintptr, allowAssist bool) { ...@@ -177,6 +177,7 @@ func gcAssistAlloc(size uintptr, allowAssist bool) {
return return
} }
retry:
// Steal as much credit as we can from the background GC's // Steal as much credit as we can from the background GC's
// scan credit. This is racy and may drop the background // scan credit. This is racy and may drop the background
// credit below 0 if two mutators steal at the same time. This // credit below 0 if two mutators steal at the same time. This
...@@ -210,6 +211,9 @@ func gcAssistAlloc(size uintptr, allowAssist bool) { ...@@ -210,6 +211,9 @@ func gcAssistAlloc(size uintptr, allowAssist bool) {
// would be a performance hit. // would be a performance hit.
// Instead we recheck it here on the non-preemptable system // Instead we recheck it here on the non-preemptable system
// stack to determine if we should preform an assist. // stack to determine if we should preform an assist.
// GC is done, so ignore any remaining debt.
scanWork = 0
return return
} }
// Track time spent in this assist. Since we're on the // Track time spent in this assist. Since we're on the
...@@ -229,7 +233,9 @@ func gcAssistAlloc(size uintptr, allowAssist bool) { ...@@ -229,7 +233,9 @@ func gcAssistAlloc(size uintptr, allowAssist bool) {
startScanWork := gcw.scanWork startScanWork := gcw.scanWork
gcDrainN(gcw, scanWork) gcDrainN(gcw, scanWork)
// Record that we did this much scan work. // Record that we did this much scan work.
gp.gcscanwork += gcw.scanWork - startScanWork workDone := gcw.scanWork - startScanWork
gp.gcscanwork += workDone
scanWork -= workDone
// If we are near the end of the mark phase // If we are near the end of the mark phase
// dispose of the gcw. // dispose of the gcw.
if gcBlackenPromptly { if gcBlackenPromptly {
...@@ -271,6 +277,25 @@ func gcAssistAlloc(size uintptr, allowAssist bool) { ...@@ -271,6 +277,25 @@ func gcAssistAlloc(size uintptr, allowAssist bool) {
// We called complete() above, so we should yield to // We called complete() above, so we should yield to
// the now-runnable GC coordinator. // the now-runnable GC coordinator.
Gosched() Gosched()
// It's likely that this assist wasn't able to pay off
// its debt, but it's also likely that the Gosched let
// the GC finish this cycle and there's no point in
// waiting. If the GC finished, skip the delay below.
if atomicload(&gcBlackenEnabled) == 0 {
scanWork = 0
}
}
if scanWork > 0 {
// We were unable steal enough credit or perform
// enough work to pay off the assist debt. We need to
// do one of these before letting the mutator allocate
// more, so go around again after performing an
// interruptible sleep for 100 us (the same as the
// getfull barrier) to let other mutators run.
timeSleep(100 * 1000)
goto retry
} }
} }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment