Commit fc9ca85f authored by Austin Clements's avatar Austin Clements

runtime: make sweep proportional to spans bytes allocated

Proportional concurrent sweep is currently based on a ratio of spans
to be swept per bytes of object allocation. However, proportional
sweeping is performed during span allocation, not object allocation,
in order to minimize contention and overhead. Since objects are
allocated from spans after those spans are allocated, the system tends
to operate in debt, which means when the next GC cycle starts, there
is often sweep debt remaining, so GC has to finish the sweep, which
delays the start of the cycle and delays enabling mutator assists.

For example, it's quite likely that many Ps will simultaneously refill
their span caches immediately after a GC cycle (because GC flushes the
span caches), but at this point, there has been very little object
allocation since the end of GC, so very little sweeping is done. The
Ps then allocate objects from these cached spans, which drives up the
bytes of object allocation, but since these allocations are coming
from cached spans, nothing considers whether more sweeping has to
happen. If the sweep ratio is high enough (which can happen if the
next GC trigger is very close to the retained heap size), this can
easily represent a sweep debt of thousands of pages.

Fix this by making proportional sweep proportional to the number of
bytes of spans allocated, rather than the number of bytes of objects
allocated. Prior to allocating a span, both the small object path and
the large object path ensure credit for allocating that span, so the
system operates in the black, rather than in the red.

Combined with the previous commit, this should eliminate all sweeping
from GC start up. On the stress test in issue #11911, this reduces the
time spent sweeping during GC (and delaying start up) by several
orders of magnitude:

                mean    99%ile     max
    pre fix      1 ms    11 ms   144 ms
    post fix   270 ns   735 ns   916 ns

Updates #11911.

Change-Id: I89223712883954c9d6ec2a7a51ecb97172097df3
Reviewed-on: https://go-review.googlesource.com/13044Reviewed-by: default avatarRick Hudson <rlh@golang.org>
Reviewed-by: default avatarRuss Cox <rsc@golang.org>
parent e30c6d64
......@@ -736,6 +736,12 @@ func largeAlloc(size uintptr, flag uint32) *mspan {
if size&_PageMask != 0 {
npages++
}
// Deduct credit for this span allocation and sweep if
// necessary. mHeap_Alloc will also sweep npages, so this only
// pays the debt down to npage pages.
deductSweepCredit(npages*_PageSize, npages)
s := mHeap_Alloc(&mheap_, npages, 0, true, flag&_FlagNoZero == 0)
if s == nil {
throw("out of memory")
......
......@@ -29,23 +29,8 @@ func mCentral_Init(c *mcentral, sizeclass int32) {
// Allocate a span to use in an MCache.
func mCentral_CacheSpan(c *mcentral) *mspan {
// Perform proportional sweep work. We don't directly reuse
// the spans we're sweeping here for this allocation because
// these can hold any size class. We'll sweep one more span
// below and use that because it will have the right size
// class and be hot in our cache.
pagesOwed := int64(mheap_.sweepPagesPerByte * float64(memstats.heap_live-memstats.heap_marked))
if pagesOwed-int64(mheap_.pagesSwept) > 1 {
// Get the debt down to one page, which we're likely
// to take care of below (if we don't, that's fine;
// we'll pick up the slack later).
for pagesOwed-int64(atomicload64(&mheap_.pagesSwept)) > 1 {
if gosweepone() == ^uintptr(0) {
mheap_.sweepPagesPerByte = 0
break
}
}
}
// Deduct credit for this span allocation and sweep if necessary.
deductSweepCredit(uintptr(class_to_size[c.sizeclass]), 0)
lock(&c.lock)
sg := mheap_.sweepgen
......
......@@ -1563,6 +1563,7 @@ func gcSweep(mode int) {
lock(&mheap_.lock)
mheap_.sweepPagesPerByte = float64(pagesToSweep) / float64(heapDistance)
mheap_.pagesSwept = 0
mheap_.spanBytesAlloc = 0
unlock(&mheap_.lock)
// Background sweep.
......
......@@ -313,6 +313,38 @@ func mSpan_Sweep(s *mspan, preserve bool) bool {
return res
}
// deductSweepCredit deducts sweep credit for allocating a span of
// size spanBytes. This must be performed *before* the span is
// allocated to ensure the system has enough credit. If necessary, it
// performs sweeping to prevent going in to debt. If the caller will
// also sweep pages (e.g., for a large allocation), it can pass a
// non-zero callerSweepPages to leave that many pages unswept.
//
// deductSweepCredit is the core of the "proportional sweep" system.
// It uses statistics gathered by the garbage collector to perform
// enough sweeping so that all pages are swept during the concurrent
// sweep phase between GC cycles.
//
// mheap_ must NOT be locked.
func deductSweepCredit(spanBytes uintptr, callerSweepPages uintptr) {
if mheap_.sweepPagesPerByte == 0 {
// Proportional sweep is done or disabled.
return
}
// Account for this span allocation.
spanBytesAlloc := xadd64(&mheap_.spanBytesAlloc, int64(spanBytes))
// Fix debt if necessary.
pagesOwed := int64(mheap_.sweepPagesPerByte * float64(spanBytesAlloc))
for pagesOwed-int64(atomicload64(&mheap_.pagesSwept)) > int64(callerSweepPages) {
if gosweepone() == ^uintptr(0) {
mheap_.sweepPagesPerByte = 0
break
}
}
}
func dumpFreeList(s *mspan) {
printlock()
print("runtime: free list of span ", s, ":\n")
......
......@@ -29,6 +29,7 @@ type mheap struct {
spans_mapped uintptr
// Proportional sweep
spanBytesAlloc uint64 // bytes of spans allocated this cycle; updated atomically
pagesSwept uint64 // pages swept this cycle; updated atomically
sweepPagesPerByte float64 // proportional sweep ratio; written with lock, read without
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment