syscall: use CLONE_VFORK and CLONE_VM
This greatly improves the latency of starting a child process when the Go process is using a lot of memory. Even though the kernel uses copy-on-write, preparation for that can take up to several 100ms under certain conditions. All other goroutines are suspended while starting a subprocess so this latency directly affects total throughput. With CLONE_VM the child process shares the same memory with the parent process. On its own this would lead to conflicting use of the same memory, so CLONE_VFORK is used to suspend the parent process until the child releases the memory when switching to to the new program binary via the exec syscall. When the parent process continues to run, one has to consider the changes to memory that the child process did, namely the return address of the syscall function needs to be restored from a register. A simple benchmark has shown a difference in latency of 16ms vs. 0.5ms at 10GB memory usage. However, much higher latencies of several 100ms have been observed in real world scenarios. For more information see comments on #5838. Fixes #5838 Change-Id: I6377d7bd8dcd00c85ca0c52b6683e70ce2174ba6 Reviewed-on: https://go-review.googlesource.com/37439Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
Showing
Please register or sign in to comment