Commit 4e7bef84 authored by Michael Anthony Knyszek, committed by Michael Knyszek

runtime: mark newly-mapped memory as scavenged

On most platforms newly-mapped memory is untouched, meaning the pages
backing the region haven't been faulted in yet. However, we mark this
memory as unscavenged, which means the background scavenger
aggressively "returns" this memory to the OS if the heap is small.

The only platform where newly-mapped memory is actually unscavenged (and
counts toward the application's RSS) is Windows, since
(*mheap).sysAlloc commits the reservation. Instead of making a special
case for Windows, I change the requirements a bit for a sysReserve'd
region. It must now be both sysMap'd and sysUsed'd, with sysMap being a
no-op on Windows. Comments about memory allocation have been updated to
include a more up-to-date mental model of which states a region of memory
may be in (at a very low level) and how to transition between these
states.
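
To make the new protocol concrete, here is a minimal, runnable sketch of the
lifecycle it implies. The helper bodies are stand-ins (the real
sysReserve/sysMap/sysUsed are unexported runtime internals); only the call
order and the no-op notes reflect the change.

package main

import "fmt"

// sysReserve: None -> Reserved. Address space only; no physical backing.
func sysReserve(n uintptr) uintptr { fmt.Println("reserve", n, "bytes"); return 0x10000 }

// sysMap: Reserved -> Prepared. After this change, a no-op on Windows.
func sysMap(v, n uintptr) { fmt.Println("map", n, "bytes at", v) }

// sysUsed: Prepared -> Ready. The commit step: a no-op on most POSIX
// systems, but VirtualAlloc(..., MEM_COMMIT, ...) on Windows.
func sysUsed(v, n uintptr) { fmt.Println("use", n, "bytes at", v) }

func main() {
	const n = 1 << 20
	v := sysReserve(n) // own the address range
	sysMap(v, n)       // declare intent to use; may still be unbacked
	sysUsed(v, n)      // the region is now safe to access
}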

Now this means we can correctly mark newly-mapped heap memory as
scavenged on every platform, reducing the load on the background
scavenger early on in the application for small heaps. As a result,
heap-growth scavenging is no longer necessary, since any actual RSS
growth will be accounted for on the allocation codepath.

Finally, this change also cleans up grow a little bit to avoid
pretending that it's freeing an in-use span and just does the necessary
operations directly.

Fixes #32012.
Fixes #31966.
Updates #26473.

Change-Id: Ie06061eb638162e0560cdeb0b8993d94cfb4d290
Reviewed-on: https://go-review.googlesource.com/c/go/+/177097
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
parent cb5c82bc
@@ -332,37 +332,74 @@ var physPageSize uintptr
 // value is always safe (though potentially less efficient).
 var physHugePageSize uintptr
 
-// OS-defined helpers:
+// OS memory management abstraction layer
 //
-// sysAlloc obtains a large chunk of zeroed memory from the
-// operating system, typically on the order of a hundred kilobytes
-// or a megabyte.
-// NOTE: sysAlloc returns OS-aligned memory, but the heap allocator
-// may use larger alignment, so the caller must be careful to realign the
-// memory obtained by sysAlloc.
+// Regions of the address space managed by the runtime may be in one of four
+// states at any given time:
+// 1) None - Unreserved and unmapped, the default state of any region.
+// 2) Reserved - Owned by the runtime, but accessing it would cause a fault.
+//               Does not count against the process' memory footprint.
+// 3) Prepared - Reserved, intended not to be backed by physical memory (though
+//               an OS may implement this lazily). Can transition efficiently to
+//               Ready. Accessing memory in such a region is undefined (may
+//               fault, may give back unexpected zeroes, etc.).
+// 4) Ready - may be accessed safely.
 //
-// sysUnused notifies the operating system that the contents
-// of the memory region are no longer needed and can be reused
-// for other purposes.
-// sysUsed notifies the operating system that the contents
-// of the memory region are needed again.
+// This set of states is more than is strictly necessary to support all the
+// currently supported platforms. One could get by with just None, Reserved, and
+// Ready. However, the Prepared state gives us flexibility for performance
+// purposes. For example, on POSIX-y operating systems, Reserved is usually a
+// private anonymous mmap'd region with PROT_NONE set, and to transition
+// to Ready would require setting PROT_READ|PROT_WRITE. However the
+// underspecification of Prepared lets us use just MADV_FREE to transition from
+// Ready to Prepared. Thus with the Prepared state we can set the permission
+// bits just once early on, we can efficiently tell the OS that it's free to
+// take pages away from us when we don't strictly need them.
 //
-// sysFree returns it unconditionally; this is only used if
-// an out-of-memory error has been detected midway through
-// an allocation. It is okay if sysFree is a no-op.
+// For each OS there is a common set of helpers defined that transition
+// memory regions between these states. The helpers are as follows:
+//
+// sysAlloc transitions an OS-chosen region of memory from None to Ready.
+// More specifically, it obtains a large chunk of zeroed memory from the
+// operating system, typically on the order of a hundred kilobytes
+// or a megabyte. This memory is always immediately available for use.
+//
+// sysFree transitions a memory region from any state to None. Therefore, it
+// returns memory unconditionally. It is used if an out-of-memory error has been
+// detected midway through an allocation or to carve out an aligned section of
+// the address space. It is okay if sysFree is a no-op only if sysReserve always
+// returns a memory region aligned to the heap allocator's alignment
+// restrictions.
 //
-// sysReserve reserves address space without allocating memory.
+// sysReserve transitions a memory region from None to Reserved. It reserves
+// address space in such a way that it would cause a fatal fault upon access
+// (either via permissions or not committing the memory). Such a reservation is
+// thus never backed by physical memory.
 // If the pointer passed to it is non-nil, the caller wants the
 // reservation there, but sysReserve can still choose another
 // location if that one is unavailable.
 // NOTE: sysReserve returns OS-aligned memory, but the heap allocator
 // may use larger alignment, so the caller must be careful to realign the
-// memory obtained by sysAlloc.
+// memory obtained by sysReserve.
 //
-// sysMap maps previously reserved address space for use.
+// sysMap transitions a memory region from Reserved to Prepared. It ensures the
+// memory region can be efficiently transitioned to Ready.
 //
-// sysFault marks a (already sysAlloc'd) region to fault
-// if accessed. Used only for debugging the runtime.
+// sysUsed transitions a memory region from Prepared to Ready. It notifies the
+// operating system that the memory region is needed and ensures that the region
+// may be safely accessed. This is typically a no-op on systems that don't have
+// an explicit commit step and hard over-commit limits, but is critical on
+// Windows, for example.
+//
+// sysUnused transitions a memory region from Ready to Prepared. It notifies the
+// operating system that the physical pages backing this memory region are no
+// longer needed and can be reused for other purposes. The contents of a
+// sysUnused memory region are considered forfeit and the region must not be
+// accessed again until sysUsed is called.
+//
+// sysFault transitions a memory region from Ready or Prepared to Reserved. It
+// marks a region such that it will always fault if accessed. Used only for
+// debugging the runtime.
 
 func mallocinit() {
 	if class_to_size[_TinySizeClass] != _TinySize {
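
The comment block above defines a small state machine. The toy model below
restates its single-edge transitions in code; the names mirror the comment,
not any real runtime API.

package main

import "fmt"

type memState int

const (
	None memState = iota
	Reserved
	Prepared
	Ready
)

var stateNames = [...]string{"None", "Reserved", "Prepared", "Ready"}

// transitions records the (from, to) edge each helper induces. sysFree
// (any -> None) and sysFault (Ready or Prepared -> Reserved) are omitted
// because they have multiple source states.
var transitions = map[string][2]memState{
	"sysAlloc":   {None, Ready},
	"sysReserve": {None, Reserved},
	"sysMap":     {Reserved, Prepared},
	"sysUsed":    {Prepared, Ready},
	"sysUnused":  {Ready, Prepared},
}

func main() {
	for op, e := range transitions {
		fmt.Printf("%-10s %s -> %s\n", op, stateNames[e[0]], stateNames[e[1]])
	}
}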
@@ -539,6 +576,9 @@ func mallocinit() {
 // heapArenaBytes. sysAlloc returns nil on failure.
 // There is no corresponding free function.
 //
+// sysAlloc returns a memory region in the Prepared state. This region must
+// be transitioned to Ready before use.
+//
 // h must be locked.
 func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
 	n = round(n, heapArenaBytes)
@@ -580,7 +620,7 @@ func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
 		// TODO: This would be cleaner if sysReserve could be
 		// told to only return the requested address. In
 		// particular, this is already how Windows behaves, so
-		// it would simply things there.
+		// it would simplify things there.
 		if v != nil {
 			sysFree(v, n, nil)
 		}
@@ -637,7 +677,7 @@ func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
 		throw("misrounded allocation in sysAlloc")
 	}
 
-	// Back the reservation.
+	// Transition from Reserved to Prepared.
 	sysMap(v, size, &memstats.heap_sys)
 
 mapped:
@@ -1288,8 +1328,8 @@ func inPersistentAlloc(p uintptr) bool {
 }
 
 // linearAlloc is a simple linear allocator that pre-reserves a region
-// of memory and then maps that region as needed. The caller is
-// responsible for locking.
+// of memory and then maps that region into the Ready state as needed. The
+// caller is responsible for locking.
 type linearAlloc struct {
 	next   uintptr // next free byte
 	mapped uintptr // one byte past end of mapped space
@@ -1308,8 +1348,9 @@ func (l *linearAlloc) alloc(size, align uintptr, sysStat *uint64) unsafe.Pointer
 	}
 	l.next = p + size
 	if pEnd := round(l.next-1, physPageSize); pEnd > l.mapped {
-		// We need to map more of the reserved space.
+		// Transition from Reserved to Prepared to Ready.
 		sysMap(unsafe.Pointer(l.mapped), pEnd-l.mapped, sysStat)
+		sysUsed(unsafe.Pointer(l.mapped), pEnd-l.mapped)
 		l.mapped = pEnd
 	}
 	return unsafe.Pointer(p)
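
The same bump-allocation pattern, extracted as a standalone sketch with a
stubbed-out mapping step (the addresses and the 4 KiB page size are assumed
for illustration). The point of the diff above is that sysMap and sysUsed now
run back to back, so any page crossed by l.next is Ready before the caller
touches it.

package main

import "fmt"

const pageSize = 4096

func round(n, a uintptr) uintptr { return (n + a - 1) &^ (a - 1) }

type linear struct {
	end    uintptr // one byte past the reserved region
	next   uintptr // next free byte
	mapped uintptr // one byte past the end of Ready memory
}

func (l *linear) alloc(size, align uintptr) uintptr {
	p := round(l.next, align)
	if p+size > l.end {
		return 0 // reserved region exhausted
	}
	l.next = p + size
	if pEnd := round(l.next, pageSize); pEnd > l.mapped {
		// Stand-in for sysMap(...) followed by sysUsed(...) on the new
		// pages, i.e. Reserved -> Prepared -> Ready.
		fmt.Printf("map+use %#x..%#x\n", l.mapped, pEnd)
		l.mapped = pEnd
	}
	return p
}

func main() {
	l := &linear{end: 0x20000, next: 0x10000, mapped: 0x10000}
	fmt.Printf("%#x\n", l.alloc(100, 8))  // makes the first page Ready
	fmt.Printf("%#x\n", l.alloc(5000, 8)) // crosses into a second page
}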
@@ -60,24 +60,34 @@ func sysUnused(v unsafe.Pointer, n uintptr) {
 }
 
 func sysUsed(v unsafe.Pointer, n uintptr) {
-	r := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
-	if r != 0 {
+	p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
+	if p == uintptr(v) {
 		return
 	}
 
 	// Commit failed. See SysUnused.
-	for n > 0 {
-		small := n
+	// Hold on to n here so we can give back a better error message
+	// for certain cases.
+	k := n
+	for k > 0 {
+		small := k
 		for small >= 4096 && stdcall4(_VirtualAlloc, uintptr(v), small, _MEM_COMMIT, _PAGE_READWRITE) == 0 {
 			small /= 2
 			small &^= 4096 - 1
 		}
 		if small < 4096 {
-			print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", getlasterror(), "\n")
-			throw("runtime: failed to commit pages")
+			errno := getlasterror()
+			switch errno {
+			case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT:
+				print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n")
+				throw("out of memory")
+			default:
+				print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", errno, "\n")
+				throw("runtime: failed to commit pages")
+			}
 		}
 		v = add(v, small)
-		n -= small
+		k -= small
 	}
 }
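
The retry loop is easier to study in isolation. In this standalone sketch,
commit is a fake stand-in for VirtualAlloc(..., MEM_COMMIT, ...) that pretends
anything over eight pages trips a commit limit; like the code above, it keeps
the original n around purely for error reporting.

package main

import "fmt"

const pageSize = 4096

// commit simulates a hard commit limit of eight pages per call.
func commit(v, n uintptr) bool { return n <= 8*pageSize }

func commitRange(v, n uintptr) {
	k := n // walk k forward; n stays intact for the error message
	for k > 0 {
		small := k
		// Probe for the largest page-aligned prefix that commits,
		// halving on each failure.
		for small >= pageSize && !commit(v, small) {
			small /= 2
			small &^= pageSize - 1 // keep the probe page-aligned
		}
		if small < pageSize {
			panic(fmt.Sprintf("failed to commit %d bytes of %d-byte request", small, n))
		}
		v += small
		k -= small
	}
}

func main() {
	commitRange(0x10000, 20*pageSize) // succeeds in <=8-page chunks
	fmt.Println("committed")
}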
@@ -116,15 +126,4 @@ func sysReserve(v unsafe.Pointer, n uintptr) unsafe.Pointer {
 
 func sysMap(v unsafe.Pointer, n uintptr, sysStat *uint64) {
 	mSysStatInc(sysStat, n)
-	p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
-	if p != uintptr(v) {
-		errno := getlasterror()
-		print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n")
-		switch errno {
-		case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT:
-			throw("out of memory")
-		default:
-			throw("runtime: cannot map pages in arena address space")
-		}
-	}
 }
@@ -1246,20 +1246,22 @@ func (h *mheap) grow(npage uintptr) bool {
 		return false
 	}
 
-	// Scavenge some pages out of the free treap to make up for
-	// the virtual memory space we just allocated, but only if
-	// we need to.
-	h.scavengeIfNeededLocked(size)
-
 	// Create a fake "in use" span and free it, so that the
-	// right coalescing happens.
+	// right accounting and coalescing happens.
 	s := (*mspan)(h.spanalloc.alloc())
 	s.init(uintptr(v), size/pageSize)
 	h.setSpans(s.base(), s.npages, s)
-	atomic.Store(&s.sweepgen, h.sweepgen)
-	s.state = mSpanInUse
-	h.pagesInUse += uint64(s.npages)
-	h.freeSpanLocked(s, false, true)
+	s.state = mSpanFree
+	memstats.heap_idle += uint64(size)
+	// (*mheap).sysAlloc returns untouched/uncommitted memory.
+	s.scavenged = true
+	// s is always aligned to the heap arena size which is always > physPageSize,
+	// so its totally safe to just add directly to heap_released. Coalescing,
+	// if possible, will also always be correct in terms of accounting, because
+	// s.base() must be a physical page boundary.
+	memstats.heap_released += uint64(size)
+	h.coalesce(s)
+	h.free.insert(s)
 
 	return true
 }
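
The alignment claim in the new comment is easy to sanity-check. Assuming the
common linux/amd64 sizes (64 MiB heap arenas and 4 KiB physical pages; both
values are assumptions here, not part of the diff), an arena-aligned span base
is always a physical page boundary, so counting the whole region as released
never splits a page.

package main

import "fmt"

func main() {
	const heapArenaBytes = 64 << 20     // assumed arena size
	const physPageSize = 4 << 10        // assumed page size
	base := uintptr(3 * heapArenaBytes) // any arena-aligned address
	// true: 64 MiB is a multiple of 4 KiB, so no partial page is counted.
	fmt.Println(base%physPageSize == 0)
}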
@@ -1314,7 +1316,6 @@ func (h *mheap) freeManual(s *mspan, stat *uint64) {
 	unlock(&h.lock)
 }
 
-// s must be on the busy list or unlinked.
 func (h *mheap) freeSpanLocked(s *mspan, acctinuse, acctidle bool) {
 	switch s.state {
 	case mSpanManual: