1. 29 Sep, 2016 6 commits
  2. 28 Sep, 2016 8 commits
  3. 27 Sep, 2016 21 commits
  4. 26 Sep, 2016 5 commits
    • Austin Clements's avatar
      runtime: optimize defer code · f8b2314c
      Austin Clements authored
      This optimizes deferproc and deferreturn in various ways.
      
      The most important optimization is that it more carefully arranges to
      prevent preemption or stack growth. Currently we do this by switching
      to the system stack on every deferproc and every deferreturn. While we
      need to be on the system stack for the slow path of allocating and
      freeing defers, in the common case we can fit in the nosplit stack.
      Hence, this change pushes the system stack switch down into the slow
      paths and makes everything now exposed to the user stack nosplit. This
      also eliminates the need for various acquirem/releasem pairs, since we
      are now preventing preemption by preventing stack split checks.
      
      As another smaller optimization, we special case the common cases of
      zero-sized and pointer-sized defer frames to respectively skip the
      copy and perform the copy in line instead of calling memmove.
      
      This speeds up the runtime defer benchmark by 42%:
      
      name           old time/op  new time/op  delta
      Defer-4        75.1ns ± 1%  43.3ns ± 1%  -42.31%   (p=0.000 n=8+10)
      
      In reality, this speeds up defer by about 2.2X. The two benchmarks
      below compare a Lock/defer Unlock pair (DeferLock) with a Lock/Unlock
      pair (NoDeferLock). NoDeferLock establishes a baseline cost, so these
      two benchmarks together show that this change reduces the overhead of
      defer from 61.4ns to 27.9ns.
      
      name           old time/op  new time/op  delta
      DeferLock-4    77.4ns ± 1%  43.9ns ± 1%  -43.31%  (p=0.000 n=10+10)
      NoDeferLock-4  16.0ns ± 0%  15.9ns ± 0%   -0.39%    (p=0.000 n=9+8)
      
      This also shaves 34ns off cgo calls:
      
      name       old time/op  new time/op  delta
      CgoNoop-4   122ns ± 1%  88.3ns ± 1%  -27.72%  (p=0.000 n=8+9)
      
      Updates #14939, #16051.
      
      Change-Id: I2baa0dea378b7e4efebbee8fca919a97d5e15f38
      Reviewed-on: https://go-review.googlesource.com/29656Reviewed-by: default avatarKeith Randall <khr@golang.org>
      f8b2314c
    • Austin Clements's avatar
      runtime: implement getcallersp in Go · d211c2d3
      Austin Clements authored
      This makes it possible to inline getcallersp. getcallersp is on the
      hot path of defers, so this slightly speeds up defer:
      
      name           old time/op  new time/op  delta
      Defer-4        78.3ns ± 2%  75.1ns ± 1%  -4.00%   (p=0.000 n=9+8)
      
      Updates #14939.
      
      Change-Id: Icc1cc4cd2f0a81fc4c8344432d0b2e783accacdd
      Reviewed-on: https://go-review.googlesource.com/29655
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      Reviewed-by: default avatarDavid Crawshaw <crawshaw@golang.org>
      Reviewed-by: default avatarKeith Randall <khr@golang.org>
      d211c2d3
    • Austin Clements's avatar
      runtime: update malloc.go documentation · aaf4099a
      Austin Clements authored
      The big documentation comment at the top of malloc.go has gotten
      woefully out of date. Update it.
      
      Change-Id: Ibdb1bdcfdd707a6dc9db79d0633a36a28882301b
      Reviewed-on: https://go-review.googlesource.com/29731Reviewed-by: default avatarHyang-Ah Hana Kim <hyangah@gmail.com>
      Reviewed-by: default avatarRick Hudson <rlh@golang.org>
      aaf4099a
    • Austin Clements's avatar
      runtime: document MemStats · f67c9de6
      Austin Clements authored
      This documents all fields in MemStats and more clearly documents where
      mstats differs from MemStats.
      
      Fixes #15849.
      
      Change-Id: Ie09374bcdb3a5fdd2d25fe4bba836aaae92cb1dd
      Reviewed-on: https://go-review.googlesource.com/28972Reviewed-by: default avatarRob Pike <r@golang.org>
      Reviewed-by: default avatarHyang-Ah Hana Kim <hyangah@gmail.com>
      f67c9de6
    • Austin Clements's avatar
      runtime: eliminate memstats.heap_reachable · 2098e5d3
      Austin Clements authored
      We used to compute an estimate of the reachable heap size that was
      different from the marked heap size. This ultimately caused more
      problems than it solved, so we pulled it out, but memstats still has
      both heap_reachable and heap_marked, and there are some leftover TODOs
      about the problems with this estimate.
      
      Clean this up by eliminating heap_reachable in favor of heap_marked
      and deleting the stale TODOs.
      
      Change-Id: I713bc20a7c90683d2b43ff63c0b21a440269cc4d
      Reviewed-on: https://go-review.googlesource.com/29271
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarRick Hudson <rlh@golang.org>
      2098e5d3