1. 20 Nov, 2015 3 commits
    • [release-branch.go1.5] runtime: prevent sigprof during all stack barrier ops · 08ea8252
      Austin Clements authored
      A sigprof during stack barrier insertion or removal can crash if it
      detects an inconsistency between the stkbar array and the stack
      itself. Currently we protect against this when scanning another G's
      stack using stackLock, but we don't protect against it when unwinding
      stack barriers for a recover or a memmove to the stack.
      
      This commit cleans up and improves the stack locking code. It
      abstracts out the lock and unlock operations. It uses the lock
      consistently everywhere we perform stack operations, and pushes the
      lock/unlock down closer to where the stack barrier operations happen
      to make it more obvious what it's protecting. Finally, it modifies
      sigprof so that instead of spinning until it acquires the lock, it
      simply doesn't perform a traceback if it can't acquire it. This is
      necessary to prevent self-deadlock.
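
      A minimal sketch of the try-lock idea, assuming hypothetical names
      (gStack, tryLockStack and sigprofLike are illustrative, not the
      runtime's actual identifiers):

      package main

      import (
          "fmt"
          "sync/atomic"
      )

      // gStack stands in for the per-goroutine state that owns the lock.
      type gStack struct {
          stackLock uint32 // 0 = unlocked, 1 = locked
      }

      // tryLockStack takes the stack lock only if it is free; it never spins.
      func (g *gStack) tryLockStack() bool {
          return atomic.CompareAndSwapUint32(&g.stackLock, 0, 1)
      }

      func (g *gStack) unlockStack() {
          atomic.StoreUint32(&g.stackLock, 0)
      }

      // sigprofLike models the policy above: if a stack barrier operation
      // holds the lock, skip the traceback rather than risk self-deadlock.
      func sigprofLike(g *gStack) {
          if !g.tryLockStack() {
              fmt.Println("stack busy: sample recorded without a traceback")
              return
          }
          defer g.unlockStack()
          fmt.Println("stack quiescent: safe to walk for the profile")
      }

      func main() {
          g := &gStack{}
          sigprofLike(g)  // lock free: traceback happens
          g.stackLock = 1 // simulate a barrier operation in progress
          sigprofLike(g)  // lock held: traceback skipped
      }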
      
      Updates #11863, which introduced stackLock to fix some of these
      issues, but didn't go far enough.
      
      Updates #12528.
      
      Change-Id: I9d1fa88ae3744d31ba91500c96c6988ce1a3a349
      Reviewed-on: https://go-review.googlesource.com/17036
      Reviewed-by: Russ Cox <rsc@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-on: https://go-review.googlesource.com/17057
    • [release-branch.go1.5] runtime: fix new stack barrier check · 7ab4cba9
      Russ Cox authored
      During a crash showing goroutine stacks of all threads
      (with GOTRACEBACK=crash), it can be that f == nil.
      
      Only happens on Solaris; not sure why.
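
      A hedged illustration of the guard (findfunc and funcInfo below are
      simplified stand-ins for the runtime's symbol lookup, not its real API):

      package main

      import "fmt"

      // funcInfo is a stand-in for per-function metadata.
      type funcInfo struct{ name string }

      // findfunc is a hypothetical lookup that returns nil when there is no
      // symbol information for a PC, as can happen while dumping the stacks
      // of all threads during a crash.
      func findfunc(pc uintptr) *funcInfo {
          if pc == 0x1000 {
              return &funcInfo{name: "main.work"}
          }
          return nil
      }

      // printFrame tolerates f == nil instead of faulting, which is the
      // essence of the check added by this change.
      func printFrame(pc uintptr) {
          if f := findfunc(pc); f != nil {
              fmt.Printf("%s pc=%#x\n", f.name, pc)
              return
          }
          fmt.Printf("unknown pc %#x\n", pc)
      }

      func main() {
          printFrame(0x1000) // known function
          printFrame(0xdead) // unknown PC: printed, not crashed on
      }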
      
      Change-Id: Iee2c394a0cf19fa0a24f6befbc70776b9e42d25a
      Reviewed-on: https://go-review.googlesource.com/17110
      Reviewed-by: Austin Clements <austin@google.com>
      Reviewed-on: https://go-review.googlesource.com/17122
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: handle sigprof in stackBarrier · 2a6c7739
      Austin Clements authored
      Currently, if a profiling signal happens in the middle of
      stackBarrier, gentraceback may see inconsistencies between stkbar and
      the barriers on the stack and it will certainly get the wrong return
      PC for stackBarrier. In most cases, the return PC won't be a PC at all
      and this will immediately abort the traceback (which is considered
      okay for a sigprof), but if it happens to be a valid PC this may send
      gentraceback down a rabbit hole.
      
      Fix this by detecting when the gentraceback starts in stackBarrier and
      simulating the completion of the barrier to get the correct initial
      frame.
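
      A rough sketch of the idea, using simplified, hypothetical field and
      function names (the real logic lives in gentraceback and the g's
      stkbar bookkeeping):

      package main

      import "fmt"

      // stkbarEntry records one installed stack barrier: the stack slot it
      // overwrote and the original return PC it saved there.
      type stkbarEntry struct {
          savedLRPtr uintptr
          savedLRVal uintptr
      }

      type gSketch struct {
          stkbar    []stkbarEntry
          stkbarPos int // index of the barrier currently being unwound
      }

      // tracebackStartPC picks the PC to begin a traceback at. If the signal
      // landed inside the barrier trampoline, the on-stack return slot no
      // longer holds the caller's PC, so "complete" the barrier by using the
      // saved value instead.
      func tracebackStartPC(g *gSketch, pc, stackBarrierPC uintptr) uintptr {
          if pc == stackBarrierPC {
              return g.stkbar[g.stkbarPos].savedLRVal
          }
          return pc
      }

      func main() {
          const stackBarrierPC = 0xb00
          g := &gSketch{stkbar: []stkbarEntry{{savedLRPtr: 0xc000, savedLRVal: 0x4567}}}
          fmt.Printf("%#x\n", tracebackStartPC(g, 0x1234, stackBarrierPC))         // ordinary PC
          fmt.Printf("%#x\n", tracebackStartPC(g, stackBarrierPC, stackBarrierPC)) // caught inside the barrier
      }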
      
      Change-Id: Ib11f705ac9194925f63fe5dfbfc84013a38333e6
      Reviewed-on: https://go-review.googlesource.com/17035
      Reviewed-by: Russ Cox <rsc@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-on: https://go-review.googlesource.com/17056
  2. 17 Nov, 2015 14 commits
  3. 13 Nov, 2015 4 commits
    • [release-branch.go1.5] runtime: memmove/memclr pointers atomically · 0b5982f0
      Keith Randall authored
      Make sure that we're moving or zeroing pointers atomically.
      Anything that is a multiple of pointer size and at least
      pointer aligned might have pointers in it.  All the code looks
      ok except for the 1-pointer-sized moves.
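
      A hedged sketch of the invariant: a pointer-aligned, pointer-sized
      region may hold a pointer, so it must be moved with a single word
      store rather than byte by byte (movePointerSized is illustrative,
      not the runtime's memmove):

      package main

      import (
          "fmt"
          "unsafe"
      )

      // movePointerSized copies one pointer-sized word with a single
      // aligned store, so the garbage collector can never observe a
      // half-written pointer in the destination slot.
      func movePointerSized(dst, src unsafe.Pointer) {
          *(*uintptr)(dst) = *(*uintptr)(src)
      }

      func main() {
          x := 42
          src := unsafe.Pointer(&x)
          var dst unsafe.Pointer
          movePointerSized(unsafe.Pointer(&dst), unsafe.Pointer(&src))
          fmt.Println(*(*int)(dst)) // 42: dst now refers to x, copied in one store
      }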
      
      Fixes #13160
      Update #12552
      
      Change-Id: Ib97d9b918fa9f4cc5c56c67ed90255b7fdfb7b45
      Reviewed-on: https://go-review.googlesource.com/16668
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-on: https://go-review.googlesource.com/16910
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: adjust the arm64 memmove and memclr to operate... · d3a41356
      Michael Hudson-Doyle authored
      [release-branch.go1.5] runtime: adjust the arm64 memmove and memclr to operate by word as much as they can
      
      Not only is this an obvious optimization:
      
      benchmark                           old MB/s     new MB/s     speedup
      BenchmarkMemmove1-4                 35.35        29.65        0.84x
      BenchmarkMemmove2-4                 63.78        52.53        0.82x
      BenchmarkMemmove3-4                 89.72        73.96        0.82x
      BenchmarkMemmove4-4                 109.94       95.73        0.87x
      BenchmarkMemmove5-4                 127.60       112.80       0.88x
      BenchmarkMemmove6-4                 143.59       126.67       0.88x
      BenchmarkMemmove7-4                 157.90       138.92       0.88x
      BenchmarkMemmove8-4                 167.18       231.81       1.39x
      BenchmarkMemmove9-4                 175.23       252.07       1.44x
      BenchmarkMemmove10-4                165.68       261.10       1.58x
      BenchmarkMemmove11-4                174.43       263.31       1.51x
      BenchmarkMemmove12-4                180.76       267.56       1.48x
      BenchmarkMemmove13-4                189.06       284.93       1.51x
      BenchmarkMemmove14-4                186.31       284.72       1.53x
      BenchmarkMemmove15-4                195.75       281.62       1.44x
      BenchmarkMemmove16-4                202.96       439.23       2.16x
      BenchmarkMemmove32-4                264.77       775.77       2.93x
      BenchmarkMemmove64-4                306.81       1209.64      3.94x
      BenchmarkMemmove128-4               357.03       1515.41      4.24x
      BenchmarkMemmove256-4               380.77       2066.01      5.43x
      BenchmarkMemmove512-4               385.05       2556.45      6.64x
      BenchmarkMemmove1024-4              381.23       2804.10      7.36x
      BenchmarkMemmove2048-4              379.06       2814.83      7.43x
      BenchmarkMemmove4096-4              387.43       3064.96      7.91x
      BenchmarkMemmoveUnaligned1-4        28.91        25.40        0.88x
      BenchmarkMemmoveUnaligned2-4        56.13        47.56        0.85x
      BenchmarkMemmoveUnaligned3-4        74.32        69.31        0.93x
      BenchmarkMemmoveUnaligned4-4        97.02        83.58        0.86x
      BenchmarkMemmoveUnaligned5-4        110.17       103.62       0.94x
      BenchmarkMemmoveUnaligned6-4        124.95       113.26       0.91x
      BenchmarkMemmoveUnaligned7-4        142.37       130.82       0.92x
      BenchmarkMemmoveUnaligned8-4        151.20       205.64       1.36x
      BenchmarkMemmoveUnaligned9-4        166.97       215.42       1.29x
      BenchmarkMemmoveUnaligned10-4       148.49       221.22       1.49x
      BenchmarkMemmoveUnaligned11-4       159.47       239.57       1.50x
      BenchmarkMemmoveUnaligned12-4       163.52       247.32       1.51x
      BenchmarkMemmoveUnaligned13-4       167.55       256.54       1.53x
      BenchmarkMemmoveUnaligned14-4       175.12       251.03       1.43x
      BenchmarkMemmoveUnaligned15-4       192.10       267.13       1.39x
      BenchmarkMemmoveUnaligned16-4       190.76       378.87       1.99x
      BenchmarkMemmoveUnaligned32-4       259.02       562.98       2.17x
      BenchmarkMemmoveUnaligned64-4       317.72       842.44       2.65x
      BenchmarkMemmoveUnaligned128-4      355.43       1274.49      3.59x
      BenchmarkMemmoveUnaligned256-4      378.17       1815.74      4.80x
      BenchmarkMemmoveUnaligned512-4      362.15       2180.81      6.02x
      BenchmarkMemmoveUnaligned1024-4     376.07       2453.58      6.52x
      BenchmarkMemmoveUnaligned2048-4     381.66       2568.32      6.73x
      BenchmarkMemmoveUnaligned4096-4     398.51       2669.36      6.70x
      BenchmarkMemclr5-4                  113.83       107.93       0.95x
      BenchmarkMemclr16-4                 223.84       389.63       1.74x
      BenchmarkMemclr64-4                 421.99       1209.58      2.87x
      BenchmarkMemclr256-4                525.94       2411.58      4.59x
      BenchmarkMemclr4096-4               581.66       4372.20      7.52x
      BenchmarkMemclr65536-4              565.84       4747.48      8.39x
      BenchmarkGoMemclr5-4                194.63       160.31       0.82x
      BenchmarkGoMemclr16-4               295.30       630.07       2.13x
      BenchmarkGoMemclr64-4               480.24       1884.03      3.92x
      BenchmarkGoMemclr256-4              540.23       2926.49      5.42x
      
      but it turns out that it's necessary to avoid the GC seeing partially written
      pointers.
      
      It's of course possible to be more sophisticated (using ldp/stp to move 16
      bytes at a time in the core loop and unrolling the tail copying loops being
      the obvious ideas) but I wanted something simple and (reasonably) obviously
      correct.
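
      The actual change is arm64 assembly; the following is only a rough Go
      rendering of the loop shape, assuming non-overlapping buffers of equal
      length, to show why the word-at-a-time core loop matters for the GC:

      package main

      import (
          "fmt"
          "unsafe"
      )

      const wordSize = int(unsafe.Sizeof(uintptr(0)))

      // wordCopy copies src into dst using word stores for the aligned
      // middle and byte stores only at the edges. Pointer slots are always
      // word-aligned, so they land in the middle and are written whole;
      // the GC never sees a partially written pointer.
      func wordCopy(dst, src []byte) {
          n := len(dst)
          if len(src) < n {
              n = len(src)
          }
          i := 0
          // Leading bytes until the destination is word-aligned.
          for ; i < n && uintptr(unsafe.Pointer(&dst[i]))%uintptr(wordSize) != 0; i++ {
              dst[i] = src[i]
          }
          // Word-at-a-time core loop.
          for ; i+wordSize <= n; i += wordSize {
              *(*uintptr)(unsafe.Pointer(&dst[i])) = *(*uintptr)(unsafe.Pointer(&src[i]))
          }
          // Trailing bytes.
          for ; i < n; i++ {
              dst[i] = src[i]
          }
      }

      func main() {
          src := []byte("the quick brown fox jumps over the lazy dog")
          dst := make([]byte, len(src))
          wordCopy(dst, src)
          fmt.Println(string(dst))
      }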
      
      Fixes #12552
      
      Change-Id: Iaeaf8a812cd06f4747ba2f792de1ded738890735
      Reviewed-on: https://go-review.googlesource.com/14813
      Reviewed-by: Austin Clements <austin@google.com>
      Reviewed-on: https://go-review.googlesource.com/16909
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: use 4 byte writes in amd64p32 memmove/memclr · fc0f36b2
      Austin Clements authored
      Currently, amd64p32's memmove and memclr use 8 byte writes as much as
      possible and 1 byte writes for the tail of the object. However, if an
      object ends with a 4 byte pointer at an 8 byte aligned offset, this
      may copy/zero the pointer field one byte at a time, allowing the
      garbage collector to observe a partially copied pointer.
      
      Fix this by using 4 byte writes instead of 8 byte writes.
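
      A hypothetical layout that triggers the problem (an assumption for
      illustration only; on amd64p32 pointers are 4 bytes, and the old
      memmove used 8-byte bulk writes followed by a byte-wise tail):

      package main

      import "fmt"

      // node is 12 bytes on amd64p32: the 8-byte bulk loop covers a and b,
      // and the trailing 4-byte pointer at offset 8 falls into the
      // byte-by-byte tail, where the GC could observe it half-copied.
      // Switching the copy to 4-byte writes keeps every pointer slot whole.
      type node struct {
          a, b int32 // scalar head, 8 bytes
          next *node // pointer tail, 4 bytes on amd64p32
      }

      func main() {
          n := node{a: 1, b: 2}
          fmt.Println(n.a, n.b, n.next)
      }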
      
      Updates #12552.
      
      Change-Id: I13324fd05756fb25ae57e812e836f0a975b5595c
      Reviewed-on: https://go-review.googlesource.com/15370
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      Reviewed-on: https://go-review.googlesource.com/16908
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: adjust the ppc64x memmove and memclr to copy... · 9f59bc85
      Michael Hudson-Doyle authored
      [release-branch.go1.5] runtime: adjust the ppc64x memmove and memclr to copy by word as much as it can
      
      Issue #12552 can happen on ppc64 too, although much less frequently in my
      testing. I'm fairly sure this fixes it (2 out of 200 runs of oracle.test failed
      without this change and 0 of 200 failed with it). It's also a lot faster for
      large moves/clears:
      
      name           old speed      new speed       delta
      Memmove1-6      157MB/s ± 9%    144MB/s ± 0%    -8.20%         (p=0.004 n=10+9)
      Memmove2-6      281MB/s ± 1%    249MB/s ± 1%   -11.53%        (p=0.000 n=10+10)
      Memmove3-6      376MB/s ± 1%    328MB/s ± 1%   -12.64%        (p=0.000 n=10+10)
      Memmove4-6      475MB/s ± 4%    345MB/s ± 1%   -27.28%         (p=0.000 n=10+8)
      Memmove5-6      540MB/s ± 1%    393MB/s ± 0%   -27.21%        (p=0.000 n=10+10)
      Memmove6-6      609MB/s ± 0%    423MB/s ± 0%   -30.56%         (p=0.000 n=9+10)
      Memmove7-6      659MB/s ± 0%    468MB/s ± 0%   -28.99%         (p=0.000 n=8+10)
      Memmove8-6      705MB/s ± 0%   1295MB/s ± 1%   +83.73%          (p=0.000 n=9+9)
      Memmove9-6      740MB/s ± 1%   1241MB/s ± 1%   +67.61%         (p=0.000 n=10+8)
      Memmove10-6     780MB/s ± 0%   1162MB/s ± 1%   +48.95%         (p=0.000 n=10+9)
      Memmove11-6     811MB/s ± 0%   1180MB/s ± 0%   +45.58%          (p=0.000 n=8+9)
      Memmove12-6     820MB/s ± 1%   1073MB/s ± 1%   +30.83%         (p=0.000 n=10+9)
      Memmove13-6     849MB/s ± 0%   1068MB/s ± 1%   +25.87%        (p=0.000 n=10+10)
      Memmove14-6     877MB/s ± 0%    911MB/s ± 0%    +3.83%        (p=0.000 n=10+10)
      Memmove15-6     893MB/s ± 0%    922MB/s ± 0%    +3.25%         (p=0.000 n=10+9)
      Memmove16-6     897MB/s ± 1%   2418MB/s ± 1%  +169.67%         (p=0.000 n=10+9)
      Memmove32-6     908MB/s ± 0%   3927MB/s ± 2%  +332.64%         (p=0.000 n=10+8)
      Memmove64-6    1.11GB/s ± 0%   5.59GB/s ± 0%  +404.64%          (p=0.000 n=9+9)
      Memmove128-6   1.25GB/s ± 0%   6.71GB/s ± 2%  +437.49%         (p=0.000 n=9+10)
      Memmove256-6   1.33GB/s ± 0%   7.25GB/s ± 1%  +445.06%        (p=0.000 n=10+10)
      Memmove512-6   1.38GB/s ± 0%   8.87GB/s ± 0%  +544.43%        (p=0.000 n=10+10)
      Memmove1024-6  1.40GB/s ± 0%  10.00GB/s ± 0%  +613.80%        (p=0.000 n=10+10)
      Memmove2048-6  1.41GB/s ± 0%  10.65GB/s ± 0%  +652.95%         (p=0.000 n=9+10)
      Memmove4096-6  1.42GB/s ± 0%  11.01GB/s ± 0%  +675.37%         (p=0.000 n=8+10)
      Memclr5-6       269MB/s ± 1%    264MB/s ± 0%    -1.80%        (p=0.000 n=10+10)
      Memclr16-6      600MB/s ± 0%    887MB/s ± 1%   +47.83%        (p=0.000 n=10+10)
      Memclr64-6     1.06GB/s ± 0%   2.91GB/s ± 1%  +174.58%         (p=0.000 n=8+10)
      Memclr256-6    1.32GB/s ± 0%   6.58GB/s ± 0%  +399.86%         (p=0.000 n=9+10)
      Memclr4096-6   1.42GB/s ± 0%  10.90GB/s ± 0%  +668.03%         (p=0.000 n=8+10)
      Memclr65536-6  1.43GB/s ± 0%  11.37GB/s ± 0%  +697.83%          (p=0.000 n=9+8)
      GoMemclr5-6     359MB/s ± 0%    360MB/s ± 0%    +0.46%        (p=0.000 n=10+10)
      GoMemclr16-6    750MB/s ± 0%   1264MB/s ± 1%   +68.45%        (p=0.000 n=10+10)
      GoMemclr64-6   1.17GB/s ± 0%   3.78GB/s ± 1%  +223.58%         (p=0.000 n=10+9)
      GoMemclr256-6  1.35GB/s ± 0%   7.47GB/s ± 0%  +452.44%        (p=0.000 n=10+10)
      
      Update #12552
      
      Change-Id: I7192e9deb9684a843aed37f58a16a4e29970e893
      Reviewed-on: https://go-review.googlesource.com/14840
      Reviewed-by: Minux Ma <minux@golang.org>
      Reviewed-on: https://go-review.googlesource.com/16907
      Reviewed-by: Russ Cox <rsc@golang.org>
  4. 15 Oct, 2015 1 commit
  5. 12 Oct, 2015 1 commit
    • [release-branch.go1.5] runtime: fix recursive GC assist · c257dfb1
      Austin Clements authored
      If gcAssistAlloc is unable to steal or perform enough scan work, it
      calls timeSleep, which allocates. If this allocation requires
      obtaining a new span, it will in turn attempt to assist GC. Since
      there's likely still no way to satisfy the assist, it will sleep
      again, and so on, leading to potentially deep (not infinite, but also
      not bounded) recursion.
      
      Fix this by disallowing assists during the timeSleep.
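
      A hedged sketch of the shape of the fix, using hypothetical names
      (assistDisabled, gcAssistAllocSketch and mallocgcSketch are
      illustrative, not the actual runtime fields or functions):

      package main

      import (
          "fmt"
          "time"
      )

      // gAssist stands in for the per-goroutine state consulted on allocation.
      type gAssist struct {
          assistDisabled bool
      }

      // gcAssistAllocSketch could not satisfy its assist debt, so it sleeps,
      // but first disables assists so the allocation done inside the sleep
      // cannot re-enter this function and recurse.
      func gcAssistAllocSketch(g *gAssist) {
          g.assistDisabled = true
          time.Sleep(time.Millisecond) // stands in for timeSleep, which allocates
          g.assistDisabled = false
      }

      // mallocgcSketch shows where the flag is honored: an allocation that
      // would normally assist skips the assist while the flag is set.
      func mallocgcSketch(g *gAssist, needAssist bool) {
          if needAssist && !g.assistDisabled {
              gcAssistAllocSketch(g)
          }
          // ... perform the allocation ...
      }

      func main() {
          g := &gAssist{}
          mallocgcSketch(g, true)
          fmt.Println("allocation completed without recursive assists")
      }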
      
      This same problem was fixed on master by 65aa2da6. That commit built on
      several other changes and hence can't be directly cherry-picked. This
      commit implements the same idea.
      
      Fixes #12894.
      
      Change-Id: I152977eb1d0a3005c42ff3985d58778f054a86d4
      Reviewed-on: https://go-review.googlesource.com/15720
      Reviewed-by: Rick Hudson <rlh@golang.org>
  6. 03 Oct, 2015 1 commit
  7. 23 Sep, 2015 1 commit
  8. 09 Sep, 2015 2 commits
  9. 08 Sep, 2015 13 commits