1. 17 Nov, 2015 5 commits
  2. 13 Nov, 2015 4 commits
    • [release-branch.go1.5] runtime: memmove/memclr pointers atomically · 0b5982f0
      Keith Randall authored
      Make sure that we're moving or zeroing pointers atomically.
      Anything that is a multiple of pointer size and at least
      pointer aligned might have pointers in it.  All the code looks
      ok except for the 1-pointer-sized moves.
      
      Fixes #13160
      Update #12552
      
      Change-Id: Ib97d9b918fa9f4cc5c56c67ed90255b7fdfb7b45
      Reviewed-on: https://go-review.googlesource.com/16668
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-on: https://go-review.googlesource.com/16910
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: adjust the arm64 memmove and memclr to operate by word as much as they can · d3a41356
      Michael Hudson-Doyle authored
      
      Not only is this an obvious optimization:
      
      benchmark                           old MB/s     new MB/s     speedup
      BenchmarkMemmove1-4                 35.35        29.65        0.84x
      BenchmarkMemmove2-4                 63.78        52.53        0.82x
      BenchmarkMemmove3-4                 89.72        73.96        0.82x
      BenchmarkMemmove4-4                 109.94       95.73        0.87x
      BenchmarkMemmove5-4                 127.60       112.80       0.88x
      BenchmarkMemmove6-4                 143.59       126.67       0.88x
      BenchmarkMemmove7-4                 157.90       138.92       0.88x
      BenchmarkMemmove8-4                 167.18       231.81       1.39x
      BenchmarkMemmove9-4                 175.23       252.07       1.44x
      BenchmarkMemmove10-4                165.68       261.10       1.58x
      BenchmarkMemmove11-4                174.43       263.31       1.51x
      BenchmarkMemmove12-4                180.76       267.56       1.48x
      BenchmarkMemmove13-4                189.06       284.93       1.51x
      BenchmarkMemmove14-4                186.31       284.72       1.53x
      BenchmarkMemmove15-4                195.75       281.62       1.44x
      BenchmarkMemmove16-4                202.96       439.23       2.16x
      BenchmarkMemmove32-4                264.77       775.77       2.93x
      BenchmarkMemmove64-4                306.81       1209.64      3.94x
      BenchmarkMemmove128-4               357.03       1515.41      4.24x
      BenchmarkMemmove256-4               380.77       2066.01      5.43x
      BenchmarkMemmove512-4               385.05       2556.45      6.64x
      BenchmarkMemmove1024-4              381.23       2804.10      7.36x
      BenchmarkMemmove2048-4              379.06       2814.83      7.43x
      BenchmarkMemmove4096-4              387.43       3064.96      7.91x
      BenchmarkMemmoveUnaligned1-4        28.91        25.40        0.88x
      BenchmarkMemmoveUnaligned2-4        56.13        47.56        0.85x
      BenchmarkMemmoveUnaligned3-4        74.32        69.31        0.93x
      BenchmarkMemmoveUnaligned4-4        97.02        83.58        0.86x
      BenchmarkMemmoveUnaligned5-4        110.17       103.62       0.94x
      BenchmarkMemmoveUnaligned6-4        124.95       113.26       0.91x
      BenchmarkMemmoveUnaligned7-4        142.37       130.82       0.92x
      BenchmarkMemmoveUnaligned8-4        151.20       205.64       1.36x
      BenchmarkMemmoveUnaligned9-4        166.97       215.42       1.29x
      BenchmarkMemmoveUnaligned10-4       148.49       221.22       1.49x
      BenchmarkMemmoveUnaligned11-4       159.47       239.57       1.50x
      BenchmarkMemmoveUnaligned12-4       163.52       247.32       1.51x
      BenchmarkMemmoveUnaligned13-4       167.55       256.54       1.53x
      BenchmarkMemmoveUnaligned14-4       175.12       251.03       1.43x
      BenchmarkMemmoveUnaligned15-4       192.10       267.13       1.39x
      BenchmarkMemmoveUnaligned16-4       190.76       378.87       1.99x
      BenchmarkMemmoveUnaligned32-4       259.02       562.98       2.17x
      BenchmarkMemmoveUnaligned64-4       317.72       842.44       2.65x
      BenchmarkMemmoveUnaligned128-4      355.43       1274.49      3.59x
      BenchmarkMemmoveUnaligned256-4      378.17       1815.74      4.80x
      BenchmarkMemmoveUnaligned512-4      362.15       2180.81      6.02x
      BenchmarkMemmoveUnaligned1024-4     376.07       2453.58      6.52x
      BenchmarkMemmoveUnaligned2048-4     381.66       2568.32      6.73x
      BenchmarkMemmoveUnaligned4096-4     398.51       2669.36      6.70x
      BenchmarkMemclr5-4                  113.83       107.93       0.95x
      BenchmarkMemclr16-4                 223.84       389.63       1.74x
      BenchmarkMemclr64-4                 421.99       1209.58      2.87x
      BenchmarkMemclr256-4                525.94       2411.58      4.59x
      BenchmarkMemclr4096-4               581.66       4372.20      7.52x
      BenchmarkMemclr65536-4              565.84       4747.48      8.39x
      BenchmarkGoMemclr5-4                194.63       160.31       0.82x
      BenchmarkGoMemclr16-4               295.30       630.07       2.13x
      BenchmarkGoMemclr64-4               480.24       1884.03      3.92x
      BenchmarkGoMemclr256-4              540.23       2926.49      5.42x
      
      but it turns out that it's necessary to avoid the GC seeing partially written
      pointers.
      
      It's of course possible to be more sophisticated (using ldp/stp to move 16
      bytes at a time in the core loop and unrolling the tail copying loops being
      the obvious ideas) but I wanted something simple and (reasonably) obviously
      correct.
      
      Fixes #12552
      
      Change-Id: Iaeaf8a812cd06f4747ba2f792de1ded738890735
      Reviewed-on: https://go-review.googlesource.com/14813
      Reviewed-by: Austin Clements <austin@google.com>
      Reviewed-on: https://go-review.googlesource.com/16909
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: use 4 byte writes in amd64p32 memmove/memclr · fc0f36b2
      Austin Clements authored
      Currently, amd64p32's memmove and memclr use 8 byte writes as much as
      possible and 1 byte writes for the tail of the object. However, if an
      object ends with a 4 byte pointer at an 8 byte aligned offset, this
      may copy/zero the pointer field one byte at a time, allowing the
      garbage collector to observe a partially copied pointer.
      
      Fix this by using 4 byte writes instead of 8 byte writes.
      
      Updates #12552.
      
      Change-Id: I13324fd05756fb25ae57e812e836f0a975b5595c
      Reviewed-on: https://go-review.googlesource.com/15370
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      Reviewed-on: https://go-review.googlesource.com/16908
      Reviewed-by: Russ Cox <rsc@golang.org>
    • [release-branch.go1.5] runtime: adjust the ppc64x memmove and memclr to copy by word as much as it can · 9f59bc85
      Michael Hudson-Doyle authored
      
      Issue #12552 can happen on ppc64 too, although much less frequently in my
      testing. I'm fairly sure this fixes it (2 out of 200 runs of oracle.test failed
      without this change and 0 of 200 failed with it). It's also a lot faster for
      large moves/clears:
      
      name           old speed      new speed       delta
      Memmove1-6      157MB/s ± 9%    144MB/s ± 0%    -8.20%         (p=0.004 n=10+9)
      Memmove2-6      281MB/s ± 1%    249MB/s ± 1%   -11.53%        (p=0.000 n=10+10)
      Memmove3-6      376MB/s ± 1%    328MB/s ± 1%   -12.64%        (p=0.000 n=10+10)
      Memmove4-6      475MB/s ± 4%    345MB/s ± 1%   -27.28%         (p=0.000 n=10+8)
      Memmove5-6      540MB/s ± 1%    393MB/s ± 0%   -27.21%        (p=0.000 n=10+10)
      Memmove6-6      609MB/s ± 0%    423MB/s ± 0%   -30.56%         (p=0.000 n=9+10)
      Memmove7-6      659MB/s ± 0%    468MB/s ± 0%   -28.99%         (p=0.000 n=8+10)
      Memmove8-6      705MB/s ± 0%   1295MB/s ± 1%   +83.73%          (p=0.000 n=9+9)
      Memmove9-6      740MB/s ± 1%   1241MB/s ± 1%   +67.61%         (p=0.000 n=10+8)
      Memmove10-6     780MB/s ± 0%   1162MB/s ± 1%   +48.95%         (p=0.000 n=10+9)
      Memmove11-6     811MB/s ± 0%   1180MB/s ± 0%   +45.58%          (p=0.000 n=8+9)
      Memmove12-6     820MB/s ± 1%   1073MB/s ± 1%   +30.83%         (p=0.000 n=10+9)
      Memmove13-6     849MB/s ± 0%   1068MB/s ± 1%   +25.87%        (p=0.000 n=10+10)
      Memmove14-6     877MB/s ± 0%    911MB/s ± 0%    +3.83%        (p=0.000 n=10+10)
      Memmove15-6     893MB/s ± 0%    922MB/s ± 0%    +3.25%         (p=0.000 n=10+9)
      Memmove16-6     897MB/s ± 1%   2418MB/s ± 1%  +169.67%         (p=0.000 n=10+9)
      Memmove32-6     908MB/s ± 0%   3927MB/s ± 2%  +332.64%         (p=0.000 n=10+8)
      Memmove64-6    1.11GB/s ± 0%   5.59GB/s ± 0%  +404.64%          (p=0.000 n=9+9)
      Memmove128-6   1.25GB/s ± 0%   6.71GB/s ± 2%  +437.49%         (p=0.000 n=9+10)
      Memmove256-6   1.33GB/s ± 0%   7.25GB/s ± 1%  +445.06%        (p=0.000 n=10+10)
      Memmove512-6   1.38GB/s ± 0%   8.87GB/s ± 0%  +544.43%        (p=0.000 n=10+10)
      Memmove1024-6  1.40GB/s ± 0%  10.00GB/s ± 0%  +613.80%        (p=0.000 n=10+10)
      Memmove2048-6  1.41GB/s ± 0%  10.65GB/s ± 0%  +652.95%         (p=0.000 n=9+10)
      Memmove4096-6  1.42GB/s ± 0%  11.01GB/s ± 0%  +675.37%         (p=0.000 n=8+10)
      Memclr5-6       269MB/s ± 1%    264MB/s ± 0%    -1.80%        (p=0.000 n=10+10)
      Memclr16-6      600MB/s ± 0%    887MB/s ± 1%   +47.83%        (p=0.000 n=10+10)
      Memclr64-6     1.06GB/s ± 0%   2.91GB/s ± 1%  +174.58%         (p=0.000 n=8+10)
      Memclr256-6    1.32GB/s ± 0%   6.58GB/s ± 0%  +399.86%         (p=0.000 n=9+10)
      Memclr4096-6   1.42GB/s ± 0%  10.90GB/s ± 0%  +668.03%         (p=0.000 n=8+10)
      Memclr65536-6  1.43GB/s ± 0%  11.37GB/s ± 0%  +697.83%          (p=0.000 n=9+8)
      GoMemclr5-6     359MB/s ± 0%    360MB/s ± 0%    +0.46%        (p=0.000 n=10+10)
      GoMemclr16-6    750MB/s ± 0%   1264MB/s ± 1%   +68.45%        (p=0.000 n=10+10)
      GoMemclr64-6   1.17GB/s ± 0%   3.78GB/s ± 1%  +223.58%         (p=0.000 n=10+9)
      GoMemclr256-6  1.35GB/s ± 0%   7.47GB/s ± 0%  +452.44%        (p=0.000 n=10+10)
      
      Update #12552
      
      Change-Id: I7192e9deb9684a843aed37f58a16a4e29970e893
      Reviewed-on: https://go-review.googlesource.com/14840
      Reviewed-by: Minux Ma <minux@golang.org>
      Reviewed-on: https://go-review.googlesource.com/16907
      Reviewed-by: Russ Cox <rsc@golang.org>
  3. 15 Oct, 2015 1 commit
  4. 12 Oct, 2015 1 commit
    • [release-branch.go1.5] runtime: fix recursive GC assist · c257dfb1
      Austin Clements authored
      If gcAssistAlloc is unable to steal or perform enough scan work, it
      calls timeSleep, which allocates. If this allocation requires
      obtaining a new span, it will in turn attempt to assist GC. Since
      there's likely still no way to satisfy the assist, it will sleep
      again, and so on, leading to potentially deep (not infinite, but also
      not bounded) recursion.
      
      Fix this by disallowing assists during the timeSleep.
      
      This same problem was fixed on master by 65aa2da6. That commit built on
      several other changes and hence can't be directly cherry-picked. This
      commit implements the same idea.
      
      Fixes #12894.
      
      Change-Id: I152977eb1d0a3005c42ff3985d58778f054a86d4
      Reviewed-on: https://go-review.googlesource.com/15720
      Reviewed-by: Rick Hudson <rlh@golang.org>
  5. 03 Oct, 2015 1 commit
  6. 23 Sep, 2015 1 commit
  7. 09 Sep, 2015 2 commits
  8. 08 Sep, 2015 16 commits
  9. 06 Sep, 2015 1 commit
  10. 04 Sep, 2015 1 commit
  11. 03 Sep, 2015 3 commits
  12. 19 Aug, 2015 4 commits