Commit 4e7bef84 authored by Michael Anthony Knyszek, committed by Michael Knyszek

runtime: mark newly-mapped memory as scavenged

On most platforms newly-mapped memory is untouched, meaning the pages
backing the region haven't been faulted in yet. However, we mark this
memory as unscavenged, which means the background scavenger
aggressively "returns" this memory to the OS if the heap is small.

The only platform where newly-mapped memory is actually unscavenged (and
counts toward the application's RSS) is Windows, since
(*mheap).sysAlloc commits the reservation. Instead of making a special
case for Windows, I change the requirements a bit for a sysReserve'd
region. It must now be both sysMap'd and sysUsed'd, with sysMap being a
no-op on Windows. Comments about memory allocation have been updated to
include a more up-to-date mental model of which states a region of memory
may be in (at a very low level) and how to transition between these
states.
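
For reference, the mental model described by those updated comments
(see the malloc.go diff below) can be summarized in a small,
self-contained Go sketch. This is an illustration only, not runtime
code; the helper names mirror those in runtime/mem_*.go:

    package main

    import "fmt"

    type state string

    // The four states a runtime-managed region of address space may be in.
    const (
        None     state = "None"     // unreserved and unmapped
        Reserved state = "Reserved" // owned by the runtime, faults on access
        Prepared state = "Prepared" // mapped, but not expected to be backed
        Ready    state = "Ready"    // may be accessed safely
    )

    // Each helper moves a region between two states. sysFree (not listed)
    // moves a region from any state back to None, and sysFault can also
    // start from Prepared.
    var transitions = []struct {
        helper   string
        from, to state
    }{
        {"sysAlloc", None, Ready},      // OS-chosen region, immediately usable
        {"sysReserve", None, Reserved}, // address space only
        {"sysMap", Reserved, Prepared}, // a no-op on Windows after this change
        {"sysUsed", Prepared, Ready},   // the commit step; critical on Windows
        {"sysUnused", Ready, Prepared}, // contents are forfeit afterwards
        {"sysFault", Ready, Reserved},  // debugging only
    }

    func main() {
        for _, t := range transitions {
            fmt.Printf("%-10s %-8s -> %s\n", t.helper, t.from, t.to)
        }
    }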

Now this means we can correctly mark newly-mapped heap memory as
scavenged on every platform, reducing the load on the background
scavenger early on in the application for small heaps. As a result,
heap-growth scavenging is no longer necessary, since any actual RSS
growth will be accounted for on the allocation codepath.

Finally, this change also cleans up grow a little bit: rather than
pretending to free an in-use span, it now just does the necessary
operations directly.

Fixes #32012.
Fixes #31966.
Updates #26473.

Change-Id: Ie06061eb638162e0560cdeb0b8993d94cfb4d290
Reviewed-on: https://go-review.googlesource.com/c/go/+/177097
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
parent cb5c82bc
runtime/malloc.go
@@ -332,37 +332,74 @@ var physPageSize uintptr
// value is always safe (though potentially less efficient).
var physHugePageSize uintptr
// OS-defined helpers:
// OS memory management abstraction layer
//
// sysAlloc obtains a large chunk of zeroed memory from the
// operating system, typically on the order of a hundred kilobytes
// or a megabyte.
// NOTE: sysAlloc returns OS-aligned memory, but the heap allocator
// may use larger alignment, so the caller must be careful to realign the
// memory obtained by sysAlloc.
// Regions of the address space managed by the runtime may be in one of four
// states at any given time:
// 1) None - Unreserved and unmapped, the default state of any region.
// 2) Reserved - Owned by the runtime, but accessing it would cause a fault.
// Does not count against the process' memory footprint.
// 3) Prepared - Reserved, intended not to be backed by physical memory (though
// an OS may implement this lazily). Can transition efficiently to
// Ready. Accessing memory in such a region is undefined (may
// fault, may give back unexpected zeroes, etc.).
// 4) Ready - may be accessed safely.
//
// This set of states is more than is strictly necessary to support all the
// currently supported platforms. One could get by with just None, Reserved, and
// Ready. However, the Prepared state gives us flexibility for performance
// purposes. For example, on POSIX-y operating systems, Reserved is usually a
// private anonymous mmap'd region with PROT_NONE set, and to transition
// to Ready would require setting PROT_READ|PROT_WRITE. However the
// underspecification of Prepared lets us use just MADV_FREE to transition from
// Ready to Prepared. Thus with the Prepared state we can set the permission
// bits just once early on, and efficiently tell the OS that it's free to
// take pages away from us when we don't strictly need them.
//
// For each OS there is a common set of helpers defined that transition
// memory regions between these states. The helpers are as follows:
//
// sysUnused notifies the operating system that the contents
// of the memory region are no longer needed and can be reused
// for other purposes.
// sysUsed notifies the operating system that the contents
// of the memory region are needed again.
// sysAlloc transitions an OS-chosen region of memory from None to Ready.
// More specifically, it obtains a large chunk of zeroed memory from the
// operating system, typically on the order of a hundred kilobytes
// or a megabyte. This memory is always immediately available for use.
//
// sysFree returns it unconditionally; this is only used if
// an out-of-memory error has been detected midway through
// an allocation. It is okay if sysFree is a no-op.
// sysFree transitions a memory region from any state to None. Therefore, it
// returns memory unconditionally. It is used if an out-of-memory error has been
// detected midway through an allocation or to carve out an aligned section of
// the address space. sysFree may be a no-op only if sysReserve always
// returns a memory region aligned to the heap allocator's alignment
// restrictions.
//
// sysReserve reserves address space without allocating memory.
// sysReserve transitions a memory region from None to Reserved. It reserves
// address space in such a way that it would cause a fatal fault upon access
// (either via permissions or not committing the memory). Such a reservation is
// thus never backed by physical memory.
// If the pointer passed to it is non-nil, the caller wants the
// reservation there, but sysReserve can still choose another
// location if that one is unavailable.
// NOTE: sysReserve returns OS-aligned memory, but the heap allocator
// may use larger alignment, so the caller must be careful to realign the
// memory obtained by sysAlloc.
// memory obtained by sysReserve.
//
// sysMap transitions a memory region from Reserved to Prepared. It ensures the
// memory region can be efficiently transitioned to Ready.
//
// sysMap maps previously reserved address space for use.
// sysUsed transitions a memory region from Prepared to Ready. It notifies the
// operating system that the memory region is needed and ensures that the region
// may be safely accessed. This is typically a no-op on systems that don't have
// an explicit commit step and hard over-commit limits, but is critical on
// Windows, for example.
//
// sysFault marks a (already sysAlloc'd) region to fault
// if accessed. Used only for debugging the runtime.
// sysUnused transitions a memory region from Ready to Prepared. It notifies the
// operating system that the physical pages backing this memory region are no
// longer needed and can be reused for other purposes. The contents of a
// sysUnused memory region are considered forfeit and the region must not be
// accessed again until sysUsed is called.
//
// sysFault transitions a memory region from Ready or Prepared to Reserved. It
// marks a region such that it will always fault if accessed. Used only for
// debugging the runtime.
func mallocinit() {
if class_to_size[_TinySizeClass] != _TinySize {
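
To make the POSIX-y case described above concrete, here is a minimal
user-space sketch of the Reserved -> Prepared -> Ready -> Prepared
lifecycle. It assumes Linux (MADV_FREE requires kernel 4.5+) and the
golang.org/x/sys/unix package; the runtime itself issues raw system
calls rather than these wrappers:

    package main

    import (
        "fmt"

        "golang.org/x/sys/unix"
    )

    func main() {
        const size = 1 << 20

        // None -> Reserved: a PROT_NONE anonymous mapping. Accessing it
        // faults, and it does not count against the process footprint.
        mem, err := unix.Mmap(-1, 0, size, unix.PROT_NONE,
            unix.MAP_ANON|unix.MAP_PRIVATE)
        if err != nil {
            panic(err)
        }

        // Reserved -> Prepared: set the permission bits exactly once.
        // Pages are still only faulted in lazily, on first touch.
        if err := unix.Mprotect(mem, unix.PROT_READ|unix.PROT_WRITE); err != nil {
            panic(err)
        }

        // Prepared -> Ready is effectively free here: just use the memory.
        mem[0] = 42

        // Ready -> Prepared: MADV_FREE tells the kernel it may reclaim
        // the pages; the contents are forfeit from this point on.
        if err := unix.Madvise(mem, unix.MADV_FREE); err != nil {
            panic(err)
        }

        fmt.Println("lifecycle complete")
    }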
@@ -539,6 +576,9 @@ func mallocinit() {
// heapArenaBytes. sysAlloc returns nil on failure.
// There is no corresponding free function.
//
// sysAlloc returns a memory region in the Prepared state. This region must
// be transitioned to Ready before use.
//
// h must be locked.
func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
n = round(n, heapArenaBytes)
@@ -580,7 +620,7 @@ func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
// TODO: This would be cleaner if sysReserve could be
// told to only return the requested address. In
// particular, this is already how Windows behaves, so
// it would simply things there.
// it would simplify things there.
if v != nil {
sysFree(v, n, nil)
}
@@ -637,7 +677,7 @@ func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
throw("misrounded allocation in sysAlloc")
}
// Back the reservation.
// Transition from Reserved to Prepared.
sysMap(v, size, &memstats.heap_sys)
mapped:
@@ -1288,8 +1328,8 @@ func inPersistentAlloc(p uintptr) bool {
}
// linearAlloc is a simple linear allocator that pre-reserves a region
// of memory and then maps that region as needed. The caller is
// responsible for locking.
// of memory and then maps that region into the Ready state as needed. The
// caller is responsible for locking.
type linearAlloc struct {
next uintptr // next free byte
mapped uintptr // one byte past end of mapped space
@@ -1308,8 +1348,9 @@ func (l *linearAlloc) alloc(size, align uintptr, sysStat *uint64) unsafe.Pointer
}
l.next = p + size
if pEnd := round(l.next-1, physPageSize); pEnd > l.mapped {
// We need to map more of the reserved space.
// Transition from Reserved to Prepared to Ready.
sysMap(unsafe.Pointer(l.mapped), pEnd-l.mapped, sysStat)
sysUsed(unsafe.Pointer(l.mapped), pEnd-l.mapped)
l.mapped = pEnd
}
return unsafe.Pointer(p)
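
The pEnd computation in this hunk relies on the runtime's round helper,
which rounds n up to a multiple of a power-of-two alignment. A
standalone equivalent, sketched from round's documented behavior at
this commit:

    package main

    import "fmt"

    // round rounds n up to a multiple of a; a must be a power of two.
    func round(n, a uintptr) uintptr {
        return (n + a - 1) &^ (a - 1)
    }

    func main() {
        const physPageSize = 4096
        fmt.Println(round(1, physPageSize))    // 4096
        fmt.Println(round(4096, physPageSize)) // 4096 (already aligned)
        fmt.Println(round(4097, physPageSize)) // 8192
    }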
runtime/mem_windows.go
@@ -60,24 +60,34 @@ func sysUnused(v unsafe.Pointer, n uintptr) {
}
func sysUsed(v unsafe.Pointer, n uintptr) {
r := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
if r != 0 {
p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
if p == uintptr(v) {
return
}
// Commit failed. See SysUnused.
for n > 0 {
small := n
// Hold on to n here so we can give back a better error message
// for certain cases.
k := n
for k > 0 {
small := k
for small >= 4096 && stdcall4(_VirtualAlloc, uintptr(v), small, _MEM_COMMIT, _PAGE_READWRITE) == 0 {
small /= 2
small &^= 4096 - 1
}
if small < 4096 {
print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", getlasterror(), "\n")
throw("runtime: failed to commit pages")
errno := getlasterror()
switch errno {
case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT:
print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n")
throw("out of memory")
default:
print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", errno, "\n")
throw("runtime: failed to commit pages")
}
}
v = add(v, small)
n -= small
k -= small
}
}
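
The retry loop above probes for the largest committable prefix by
halving the request and rounding it down to a 4096-byte boundary. A
standalone trace of that shrink sequence, assuming (hypothetically)
that every VirtualAlloc call fails:

    package main

    import "fmt"

    func main() {
        // Mirrors the shrink step in sysUsed's retry loop: halve, then
        // round down to a page boundary, stopping below one 4 KiB page.
        small := uintptr(1<<20 + 512) // a hypothetical failing request size
        for small >= 4096 {
            fmt.Println(small)
            small /= 2
            small &^= 4096 - 1
        }
    }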
@@ -116,15 +126,4 @@ func sysReserve(v unsafe.Pointer, n uintptr) unsafe.Pointer {
func sysMap(v unsafe.Pointer, n uintptr, sysStat *uint64) {
mSysStatInc(sysStat, n)
p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
if p != uintptr(v) {
errno := getlasterror()
print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n")
switch errno {
case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT:
throw("out of memory")
default:
throw("runtime: cannot map pages in arena address space")
}
}
}
runtime/mheap.go
@@ -1246,20 +1246,22 @@ func (h *mheap) grow(npage uintptr) bool {
return false
}
// Scavenge some pages out of the free treap to make up for
// the virtual memory space we just allocated, but only if
// we need to.
h.scavengeIfNeededLocked(size)
// Create a fake "in use" span and free it, so that the
// right coalescing happens.
// right accounting and coalescing happens.
s := (*mspan)(h.spanalloc.alloc())
s.init(uintptr(v), size/pageSize)
h.setSpans(s.base(), s.npages, s)
atomic.Store(&s.sweepgen, h.sweepgen)
s.state = mSpanInUse
h.pagesInUse += uint64(s.npages)
h.freeSpanLocked(s, false, true)
s.state = mSpanFree
memstats.heap_idle += uint64(size)
// (*mheap).sysAlloc returns untouched/uncommitted memory.
s.scavenged = true
// s is always aligned to the heap arena size which is always > physPageSize,
// so it's totally safe to just add directly to heap_released. Coalescing,
// if possible, will also always be correct in terms of accounting, because
// s.base() must be a physical page boundary.
memstats.heap_released += uint64(size)
h.coalesce(s)
h.free.insert(s)
return true
}
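
A quick check of the alignment claim in the comment above: because both
sizes are powers of two and the arena size is the larger, arena
alignment implies physical page alignment. An illustrative sketch with
assumed constants (64 MiB arenas and 4 KiB pages, as on most 64-bit
platforms):

    package main

    import "fmt"

    func main() {
        const heapArenaBytes = 64 << 20 // assumed arena size
        const physPageSize = 4096       // assumed physical page size
        // Any arena-aligned base address is therefore page-aligned, so
        // adding the whole span's size to heap_released is accurate.
        base := uintptr(42) * heapArenaBytes
        fmt.Println(base%physPageSize == 0) // true
    }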
@@ -1314,7 +1316,6 @@ func (h *mheap) freeManual(s *mspan, stat *uint64) {
unlock(&h.lock)
}
// s must be on the busy list or unlinked.
func (h *mheap) freeSpanLocked(s *mspan, acctinuse, acctidle bool) {
switch s.state {
case mSpanManual: