1. 06 Mar, 2018 17 commits
    • Wei Xiao's avatar
      cmd/compile/internal/ssa: improve store combine optimization on arm64 · 05962561
      Wei Xiao authored
      Current implementation doesn't consider MOVDreg type operand and fail to combine
      it into larger store. This patch fixes the issue.
      
      Fixes #24242
      
      Change-Id: I7d68697f80e76f48c3528ece01a602bf513248ec
      Reviewed-on: https://go-review.googlesource.com/98397
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      05962561
    • Josh Bleecher Snyder's avatar
      encoding/binary: use an offset instead of slicing · b8543397
      Josh Bleecher Snyder authored
      While running make.bash, over 5% of all pointer writes
      come from encoding/binary doing struct reads.
      
      This change replaces slicing during such reads with an offset.
      This avoids updating the slice pointer with every
      struct field read or write.
      
      This has no impact when the write barrier is off.
      Running the benchmarks with GOGC=1, however,
      shows significant improvement:
      
      name          old time/op    new time/op    delta
      ReadStruct-8    13.2µs ± 6%    10.1µs ± 5%  -23.24%  (p=0.000 n=10+10)
      
      name          old speed      new speed      delta
      ReadStruct-8  5.69MB/s ± 6%  7.40MB/s ± 5%  +30.18%  (p=0.000 n=10+10)
      
      Change-Id: I22904263196bfeddc38abe8989428e263aee5253
      Reviewed-on: https://go-review.googlesource.com/98757
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      b8543397
    • Josh Bleecher Snyder's avatar
      runtime: skip pointless writes in freedefer · f7739c07
      Josh Bleecher Snyder authored
      Change-Id: I501a0e5c87ec88616c7dcdf1b723758b6df6c088
      Reviewed-on: https://go-review.googlesource.com/98758
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      f7739c07
    • Josh Bleecher Snyder's avatar
      debug/macho: use bytes.IndexByte instead of a loop · 4599419e
      Josh Bleecher Snyder authored
      Simpler, and no doubt faster.
      
      Change-Id: Idd401918da07a257de365087721e9ff061e6fd07
      Reviewed-on: https://go-review.googlesource.com/98759
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      4599419e
    • Balaram Makam's avatar
      cmd/compile/internal/ssa: inline small memmove for arm64 · 0e8b7110
      Balaram Makam authored
      This patch enables the optimization for arm64 target.
      
      Performance results on Amberwing for strconv benchmark:
      name             old time/op  new time/op  delta
      Quote             721ns ± 0%   617ns ± 0%  -14.40%  (p=0.016 n=5+4)
      QuoteRune         118ns ± 0%   117ns ± 0%   -0.85%  (p=0.008 n=5+5)
      AppendQuote       436ns ± 2%   321ns ± 0%  -26.31%  (p=0.008 n=5+5)
      AppendQuoteRune  34.7ns ± 0%  28.4ns ± 0%  -18.16%  (p=0.000 n=5+4)
      [Geo mean]        189ns        160ns       -15.41%
      
      Change-Id: I5714c474e7483d07ca338fbaf49beb4bbcc11c44
      Reviewed-on: https://go-review.googlesource.com/98735
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarGiovanni Bajo <rasky@develer.com>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      0e8b7110
    • Alberto Donizetti's avatar
      test/codegen: port math/bits.OnesCount tests to codegen · 18ae5eca
      Alberto Donizetti authored
      And remove them from ssa_test.
      
      Change-Id: I3efac5fea529bb0efa2dae32124530482ba5058e
      Reviewed-on: https://go-review.googlesource.com/98815Reviewed-by: default avatarKeith Randall <khr@golang.org>
      18ae5eca
    • Cherry Zhang's avatar
      cmd/internal/obj/arm64: gofmt · f6244454
      Cherry Zhang authored
      Change-Id: Ica778fef2d0245fbb14f595597e45c7cf6adef84
      Reviewed-on: https://go-review.googlesource.com/98895
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      f6244454
    • Elias Naur's avatar
      iostest.bash: don't build std library twice · f5f16d1e
      Elias Naur authored
      Instead, mirror androidtest.bash and build once, then run run.bash.
      
      Change-Id: I174ae30b2a429a62b20bb290a70cb07ed712b1e4
      Reviewed-on: https://go-review.googlesource.com/98915Reviewed-by: default avatarHyang-Ah Hana Kim <hyangah@gmail.com>
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      f5f16d1e
    • Elias Naur's avatar
      cmd/dist: default to GOARM=7 on android · ad87a67c
      Elias Naur authored
      Auto-detecting GOARM on Android makes as little sense as for nacl/arm
      and darwin/arm.
      
      Also update androidtest.sh to not require GOARM set.
      
      Change-Id: Id409ce1573d3c668d00fa4b7e3562ad7ece6fef5
      Reviewed-on: https://go-review.googlesource.com/98875Reviewed-by: default avatarHyang-Ah Hana Kim <hyangah@gmail.com>
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      ad87a67c
    • Cherry Zhang's avatar
      math/big: don't use R18 in ARM64 assembly · 084143d8
      Cherry Zhang authored
      R18 seems reserved on Apple platforms.
      
      May fix darwin/arm64 build.
      
      Change-Id: Ia2c1de550a64827c85a64affa53b94c62aacce8e
      Reviewed-on: https://go-review.googlesource.com/98896
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarElias Naur <elias.naur@gmail.com>
      084143d8
    • Tobias Klauser's avatar
      runtime: fix stack switch check in walltime/nanotime on linux/arm · 9745397e
      Tobias Klauser authored
      CL 98095 got the check wrong. We should be testing
      'getg() == getg().m.curg', not 'getg().m == getg().m.curg'.
      
      Change-Id: I32f6238b00409b67afa8efe732513d542aec5bc7
      Reviewed-on: https://go-review.googlesource.com/98855
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      9745397e
    • Alberto Donizetti's avatar
      test/codegen: port math/bits.TrailingZeros tests to codegen · 85dcc709
      Alberto Donizetti authored
      And remove them from ssa_test.
      
      Change-Id: Ib5de5c0d908f23915e0847eca338cacf2fa5325b
      Reviewed-on: https://go-review.googlesource.com/98795Reviewed-by: default avatarGiovanni Bajo <rasky@develer.com>
      85dcc709
    • as's avatar
      net/http: correct subtle transposition of offset and whence in test · df8c2b90
      as authored
      Change-Id: I788972bdf85c0225397c0e74901bf9c33c6d30c7
      GitHub-Last-Rev: 57737fe782bf7ad2d765c2efd80d75b3baca2c7b
      GitHub-Pull-Request: golang/go#24265
      Reviewed-on: https://go-review.googlesource.com/98761Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      df8c2b90
    • Meng Zhuo's avatar
      runtime, cmd/compile: use ldp for DUFFCOPY on ARM64 · 8916773a
      Meng Zhuo authored
      name         old time/op  new time/op  delta
      CopyFat8     2.15ns ± 1%  2.19ns ± 6%     ~     (p=0.171 n=8+9)
      CopyFat12    2.15ns ± 0%  2.17ns ± 2%     ~     (p=0.137 n=8+10)
      CopyFat16    2.17ns ± 3%  2.15ns ± 0%     ~     (p=0.211 n=10+10)
      CopyFat24    2.16ns ± 1%  2.15ns ± 0%     ~     (p=0.087 n=10+10)
      CopyFat32    11.5ns ± 0%  12.8ns ± 2%  +10.87%  (p=0.000 n=8+10)
      CopyFat64    20.2ns ± 2%  12.9ns ± 0%  -36.11%  (p=0.000 n=10+10)
      CopyFat128   37.2ns ± 0%  21.5ns ± 0%  -42.20%  (p=0.000 n=10+10)
      CopyFat256   71.6ns ± 0%  38.7ns ± 0%  -45.95%  (p=0.000 n=10+10)
      CopyFat512    140ns ± 0%    73ns ± 0%  -47.86%  (p=0.000 n=10+9)
      CopyFat520    142ns ± 0%    74ns ± 0%  -47.54%  (p=0.000 n=10+10)
      CopyFat1024   277ns ± 0%   141ns ± 0%  -49.10%  (p=0.000 n=10+10)
      
      Change-Id: If54bc571add5db674d5e081579c87e80153d0a5a
      Reviewed-on: https://go-review.googlesource.com/97395Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      8916773a
    • Rob Pike's avatar
      cmd/doc: make local dot-slash path names work · baf3eb16
      Rob Pike authored
      Before, an argument that started ./ or ../ was not treated as
      a package relative to the current directory. Thus
      
      	$ cd $GOROOT/src/text
      	$ go doc ./template
      
      could find html/template as $GOROOT/src/html/./template
      is a valid Go source directory.
      
      Fix this by catching such paths and making them absolute before
      processing.
      
      Fixes #23383.
      
      Change-Id: Ic2a92eaa3a6328f728635657f9de72ac3ee82afb
      Reviewed-on: https://go-review.googlesource.com/98396Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      baf3eb16
    • Fangming.Fang's avatar
      crypto/aes: optimize arm64 AES implementation · 917e7269
      Fangming.Fang authored
      This patch makes use of arm64 AES instructions to accelerate AES computation
      and only supports optimization on Linux for arm64
      
      name        old time/op    new time/op     delta
      Encrypt-32     255ns ± 0%       26ns ± 0%   -89.73%
      Decrypt-32     256ns ± 0%       26ns ± 0%   -89.77%
      Expand-32      990ns ± 5%      901ns ± 0%    -9.05%
      
      name        old speed      new speed       delta
      Encrypt-32  62.5MB/s ± 0%  610.4MB/s ± 0%  +876.39%
      Decrypt-32  62.3MB/s ± 0%  610.2MB/s ± 0%  +879.6%
      
      Fixes #18498
      
      Change-Id: If416e5a151785325527b32ff72f6da3812493ed0
      Reviewed-on: https://go-review.googlesource.com/64490
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      917e7269
    • erifan01's avatar
      math/big: optimize addVV and subVV on arm64 · c4f3fe95
      erifan01 authored
      The biggest hot spot of the existing implementation is "load" operations, which lead to poor performance.
      By unrolling the cycle 4x and 2x, and using "LDP", "STP" instructions, this CL can reduce the "load" cost and improve performance.
      
      Benchmarks:
      
      name                              old time/op    new time/op     delta
      AddVV/1-8                           21.5ns ± 0%     11.5ns ± 0%   -46.51%  (p=0.008 n=5+5)
      AddVV/2-8                           13.5ns ± 0%     12.0ns ± 0%   -11.11%  (p=0.008 n=5+5)
      AddVV/3-8                           15.5ns ± 0%     13.0ns ± 0%   -16.13%  (p=0.008 n=5+5)
      AddVV/4-8                           17.5ns ± 0%     13.5ns ± 0%   -22.86%  (p=0.008 n=5+5)
      AddVV/5-8                           19.5ns ± 0%     14.5ns ± 0%   -25.64%  (p=0.008 n=5+5)
      AddVV/10-8                          29.5ns ± 0%     18.0ns ± 0%   -38.98%  (p=0.008 n=5+5)
      AddVV/100-8                          217ns ± 0%       94ns ± 0%   -56.64%  (p=0.008 n=5+5)
      AddVV/1000-8                        2.02µs ± 0%     1.03µs ± 0%   -48.85%  (p=0.008 n=5+5)
      AddVV/10000-8                       20.5µs ± 0%     11.3µs ± 0%   -44.70%  (p=0.008 n=5+5)
      AddVV/100000-8                       247µs ± 3%      154µs ± 0%   -37.52%  (p=0.008 n=5+5)
      SubVV/1-8                           21.5ns ± 0%     11.5ns ± 0%      ~     (p=0.079 n=4+5)
      SubVV/2-8                           13.5ns ± 0%     12.0ns ± 0%   -11.11%  (p=0.008 n=5+5)
      SubVV/3-8                           15.5ns ± 0%     13.0ns ± 0%   -16.13%  (p=0.008 n=5+5)
      SubVV/4-8                           17.5ns ± 0%     13.5ns ± 0%   -22.86%  (p=0.008 n=5+5)
      SubVV/5-8                           19.5ns ± 0%     14.5ns ± 0%   -25.64%  (p=0.008 n=5+5)
      SubVV/10-8                          29.5ns ± 0%     18.0ns ± 0%   -38.98%  (p=0.008 n=5+5)
      SubVV/100-8                          217ns ± 0%       94ns ± 0%   -56.64%  (p=0.008 n=5+5)
      SubVV/1000-8                        2.02µs ± 0%     0.80µs ± 0%   -60.50%  (p=0.008 n=5+5)
      SubVV/10000-8                       20.5µs ± 0%     11.3µs ± 0%   -44.99%  (p=0.008 n=5+5)
      SubVV/100000-8                       221µs ±11%      223µs ±16%      ~     (p=0.690 n=5+5)
      AddVW/1-8                           9.32ns ± 0%     9.32ns ± 0%      ~     (all equal)
      AddVW/2-8                           19.7ns ± 1%     19.7ns ± 0%      ~     (p=0.381 n=5+4)
      AddVW/3-8                           11.5ns ± 0%     11.5ns ± 0%      ~     (all equal)
      AddVW/4-8                           13.0ns ± 0%     13.0ns ± 0%      ~     (all equal)
      AddVW/5-8                           14.5ns ± 0%     14.5ns ± 0%      ~     (all equal)
      AddVW/10-8                          22.0ns ± 0%     22.0ns ± 0%      ~     (all equal)
      AddVW/100-8                          167ns ± 0%      167ns ± 0%      ~     (all equal)
      AddVW/1000-8                        1.52µs ± 0%     1.52µs ± 0%    +0.40%  (p=0.008 n=5+5)
      AddVW/10000-8                       15.1µs ± 0%     15.1µs ± 0%      ~     (p=0.556 n=5+4)
      AddVW/100000-8                       152µs ± 1%      152µs ± 1%      ~     (p=0.690 n=5+5)
      AddMulVVW/1-8                       33.3ns ± 0%     32.7ns ± 1%    -1.86%  (p=0.008 n=5+5)
      AddMulVVW/2-8                       59.3ns ± 1%     56.9ns ± 1%    -4.15%  (p=0.008 n=5+5)
      AddMulVVW/3-8                       80.5ns ± 1%     85.4ns ± 3%    +6.19%  (p=0.008 n=5+5)
      AddMulVVW/4-8                        127ns ± 0%      111ns ± 1%   -13.19%  (p=0.008 n=5+5)
      AddMulVVW/5-8                        144ns ± 0%      149ns ± 0%    +3.47%  (p=0.016 n=4+5)
      AddMulVVW/10-8                       298ns ± 1%      283ns ± 0%    -4.77%  (p=0.008 n=5+5)
      AddMulVVW/100-8                     3.06µs ± 0%     2.99µs ± 0%    -2.21%  (p=0.008 n=5+5)
      AddMulVVW/1000-8                    31.3µs ± 0%     26.9µs ± 0%   -14.17%  (p=0.008 n=5+5)
      AddMulVVW/10000-8                    316µs ± 0%      305µs ± 0%    -3.51%  (p=0.008 n=5+5)
      AddMulVVW/100000-8                  3.17ms ± 0%     3.17ms ± 1%      ~     (p=0.690 n=5+5)
      DecimalConversion-8                  316µs ± 1%      313µs ± 2%      ~     (p=0.095 n=5+5)
      FloatString/100-8                   2.53µs ± 1%     2.56µs ± 2%      ~     (p=0.222 n=5+5)
      FloatString/1000-8                  58.4µs ± 0%     58.5µs ± 0%      ~     (p=0.206 n=5+5)
      FloatString/10000-8                 4.59ms ± 0%     4.58ms ± 0%    -0.31%  (p=0.008 n=5+5)
      FloatString/100000-8                 446ms ± 0%      444ms ± 0%    -0.31%  (p=0.008 n=5+5)
      FloatAdd/10-8                        184ns ± 0%      172ns ± 0%    -6.30%  (p=0.008 n=5+5)
      FloatAdd/100-8                       189ns ± 2%      191ns ± 4%      ~     (p=0.381 n=5+5)
      FloatAdd/1000-8                      371ns ± 0%      347ns ± 1%    -6.42%  (p=0.008 n=5+5)
      FloatAdd/10000-8                    1.87µs ± 0%     1.68µs ± 0%   -10.16%  (p=0.008 n=5+5)
      FloatAdd/100000-8                   17.1µs ± 0%     15.6µs ± 0%    -8.74%  (p=0.016 n=5+4)
      FloatSub/10-8                        152ns ± 0%      138ns ± 0%    -9.47%  (p=0.000 n=4+5)
      FloatSub/100-8                       148ns ± 0%      142ns ± 0%    -4.05%  (p=0.000 n=5+4)
      FloatSub/1000-8                      245ns ± 1%      217ns ± 0%   -11.28%  (p=0.000 n=5+4)
      FloatSub/10000-8                    1.07µs ± 0%     0.88µs ± 1%   -18.14%  (p=0.008 n=5+5)
      FloatSub/100000-8                   9.58µs ± 0%     7.96µs ± 0%   -16.84%  (p=0.008 n=5+5)
      ParseFloatSmallExp-8                28.8µs ± 1%     29.0µs ± 1%      ~     (p=0.095 n=5+5)
      ParseFloatLargeExp-8                 126µs ± 1%      126µs ± 1%      ~     (p=0.841 n=5+5)
      GCD10x10/WithoutXY-8                 277ns ± 2%      281ns ± 4%      ~     (p=0.746 n=5+5)
      GCD10x10/WithXY-8                   2.10µs ± 1%     2.12µs ± 3%      ~     (p=0.548 n=5+5)
      GCD10x100/WithoutXY-8                615ns ± 3%      607ns ± 2%      ~     (p=0.135 n=5+5)
      GCD10x100/WithXY-8                  3.50µs ± 2%     3.62µs ± 5%      ~     (p=0.151 n=5+5)
      GCD10x1000/WithoutXY-8              1.39µs ± 2%     1.39µs ± 3%      ~     (p=0.690 n=5+5)
      GCD10x1000/WithXY-8                 7.39µs ± 1%     7.34µs ± 2%      ~     (p=0.135 n=5+5)
      GCD10x10000/WithoutXY-8             8.66µs ± 1%     8.68µs ± 1%      ~     (p=0.421 n=5+5)
      GCD10x10000/WithXY-8                28.1µs ± 2%     27.0µs ± 2%    -3.81%  (p=0.008 n=5+5)
      GCD10x100000/WithoutXY-8            79.3µs ± 1%     79.3µs ± 1%      ~     (p=0.841 n=5+5)
      GCD10x100000/WithXY-8                238µs ± 0%      227µs ± 1%    -4.74%  (p=0.008 n=5+5)
      GCD100x100/WithoutXY-8              1.89µs ± 1%     1.88µs ± 2%      ~     (p=0.968 n=5+5)
      GCD100x100/WithXY-8                 26.7µs ± 1%     27.0µs ± 1%    +1.44%  (p=0.032 n=5+5)
      GCD100x1000/WithoutXY-8             4.48µs ± 1%     4.45µs ± 2%      ~     (p=0.341 n=5+5)
      GCD100x1000/WithXY-8                36.3µs ± 1%     35.1µs ± 1%    -3.27%  (p=0.008 n=5+5)
      GCD100x10000/WithoutXY-8            22.8µs ± 0%     22.7µs ± 1%      ~     (p=0.056 n=5+5)
      GCD100x10000/WithXY-8                145µs ± 1%      133µs ± 1%    -8.33%  (p=0.008 n=5+5)
      GCD100x100000/WithoutXY-8            198µs ± 0%      195µs ± 0%    -1.56%  (p=0.008 n=5+5)
      GCD100x100000/WithXY-8              1.11ms ± 0%     1.00ms ± 0%   -10.04%  (p=0.008 n=5+5)
      GCD1000x1000/WithoutXY-8            25.2µs ± 1%     24.8µs ± 1%    -1.63%  (p=0.016 n=5+5)
      GCD1000x1000/WithXY-8                513µs ± 0%      517µs ± 2%      ~     (p=0.421 n=5+5)
      GCD1000x10000/WithoutXY-8           57.0µs ± 0%     52.7µs ± 1%    -7.56%  (p=0.008 n=5+5)
      GCD1000x10000/WithXY-8              1.20ms ± 0%     1.10ms ± 0%    -8.70%  (p=0.008 n=5+5)
      GCD1000x100000/WithoutXY-8           358µs ± 0%      318µs ± 1%   -11.03%  (p=0.008 n=5+5)
      GCD1000x100000/WithXY-8             8.71ms ± 0%     7.65ms ± 0%   -12.19%  (p=0.008 n=5+5)
      GCD10000x10000/WithoutXY-8           690µs ± 0%      630µs ± 0%    -8.71%  (p=0.008 n=5+5)
      GCD10000x10000/WithXY-8             16.0ms ± 1%     14.9ms ± 0%    -6.85%  (p=0.008 n=5+5)
      GCD10000x100000/WithoutXY-8         2.09ms ± 0%     1.75ms ± 0%   -16.09%  (p=0.016 n=5+4)
      GCD10000x100000/WithXY-8            86.8ms ± 0%     76.3ms ± 0%   -12.09%  (p=0.008 n=5+5)
      GCD100000x100000/WithoutXY-8        51.1ms ± 0%     46.0ms ± 0%    -9.97%  (p=0.008 n=5+5)
      GCD100000x100000/WithXY-8            1.25s ± 0%      1.15s ± 0%    -7.92%  (p=0.008 n=5+5)
      Hilbert-8                           2.45ms ± 1%     2.49ms ± 1%    +1.99%  (p=0.008 n=5+5)
      Binomial-8                          4.98µs ± 3%     4.90µs ± 2%      ~     (p=0.421 n=5+5)
      QuoRem-8                            7.10µs ± 0%     6.21µs ± 0%   -12.55%  (p=0.016 n=5+4)
      Exp-8                                161ms ± 0%      161ms ± 0%      ~     (p=0.421 n=5+5)
      Exp2-8                               161ms ± 0%      161ms ± 0%      ~     (p=0.151 n=5+5)
      Bitset-8                            40.4ns ± 0%     40.3ns ± 0%      ~     (p=0.190 n=5+5)
      BitsetNeg-8                          163ns ± 3%      137ns ± 2%   -15.91%  (p=0.008 n=5+5)
      BitsetOrig-8                         377ns ± 1%      372ns ± 1%    -1.22%  (p=0.024 n=5+5)
      BitsetNegOrig-8                      631ns ± 1%      605ns ± 1%    -4.09%  (p=0.008 n=5+5)
      ModSqrt225_Tonelli-8                7.26ms ± 0%     7.26ms ± 0%      ~     (p=0.548 n=5+5)
      ModSqrt224_3Mod4-8                  2.24ms ± 0%     2.24ms ± 0%      ~     (p=1.000 n=5+5)
      ModSqrt5430_Tonelli-8                62.4s ± 0%      62.4s ± 0%      ~     (p=0.841 n=5+5)
      ModSqrt5430_3Mod4-8                  20.8s ± 0%      20.7s ± 0%      ~     (p=0.056 n=5+5)
      Sqrt-8                               101µs ± 0%       89µs ± 0%   -12.17%  (p=0.008 n=5+5)
      IntSqr/1-8                          32.5ns ± 1%     32.7ns ± 1%      ~     (p=0.056 n=5+5)
      IntSqr/2-8                           160ns ± 5%      158ns ± 0%      ~     (p=0.397 n=5+4)
      IntSqr/3-8                           298ns ± 4%      296ns ± 4%      ~     (p=0.667 n=5+5)
      IntSqr/5-8                           737ns ± 5%      761ns ± 3%    +3.34%  (p=0.016 n=5+5)
      IntSqr/8-8                          1.87µs ± 4%     1.90µs ± 3%      ~     (p=0.222 n=5+5)
      IntSqr/10-8                         2.96µs ± 4%     2.92µs ± 6%      ~     (p=0.310 n=5+5)
      IntSqr/20-8                         6.28µs ± 3%     6.21µs ± 2%      ~     (p=0.310 n=5+5)
      IntSqr/30-8                         14.0µs ± 2%     13.9µs ± 2%      ~     (p=0.548 n=5+5)
      IntSqr/50-8                         37.7µs ± 3%     38.3µs ± 2%      ~     (p=0.095 n=5+5)
      IntSqr/80-8                         95.9µs ± 2%     95.1µs ± 1%      ~     (p=0.310 n=5+5)
      IntSqr/100-8                         148µs ± 1%      148µs ± 1%      ~     (p=0.841 n=5+5)
      IntSqr/200-8                         586µs ± 1%      587µs ± 1%      ~     (p=1.000 n=5+5)
      IntSqr/300-8                        1.32ms ± 0%     1.31ms ± 1%    -0.73%  (p=0.032 n=5+5)
      IntSqr/500-8                        2.48ms ± 0%     2.45ms ± 0%    -1.15%  (p=0.008 n=5+5)
      IntSqr/800-8                        4.68ms ± 0%     4.62ms ± 0%    -1.23%  (p=0.008 n=5+5)
      IntSqr/1000-8                       7.57ms ± 0%     7.50ms ± 0%    -0.84%  (p=0.008 n=5+5)
      Mul-8                                311ms ± 0%      308ms ± 0%    -0.81%  (p=0.008 n=5+5)
      Exp3Power/0x10-8                     574ns ± 1%      578ns ± 2%      ~     (p=0.500 n=5+5)
      Exp3Power/0x40-8                     640ns ± 1%      646ns ± 0%      ~     (p=0.056 n=5+5)
      Exp3Power/0x100-8                   1.42µs ± 1%     1.42µs ± 1%      ~     (p=0.246 n=5+5)
      Exp3Power/0x400-8                   8.30µs ± 1%     8.29µs ± 1%      ~     (p=0.802 n=5+5)
      Exp3Power/0x1000-8                  60.0µs ± 0%     59.9µs ± 0%    -0.24%  (p=0.016 n=5+5)
      Exp3Power/0x4000-8                   817µs ± 0%      816µs ± 0%    -0.17%  (p=0.008 n=5+5)
      Exp3Power/0x10000-8                 7.80ms ± 1%     7.70ms ± 0%    -1.23%  (p=0.008 n=5+5)
      Exp3Power/0x40000-8                 73.4ms ± 0%     72.5ms ± 0%    -1.28%  (p=0.008 n=5+5)
      Exp3Power/0x100000-8                 665ms ± 0%      656ms ± 0%    -1.34%  (p=0.008 n=5+5)
      Exp3Power/0x400000-8                 5.99s ± 0%      5.90s ± 0%    -1.40%  (p=0.008 n=5+5)
      Fibo-8                               116ms ± 0%       50ms ± 0%   -57.09%  (p=0.008 n=5+5)
      NatSqr/1-8                           112ns ± 4%      112ns ± 2%      ~     (p=0.968 n=5+5)
      NatSqr/2-8                           251ns ± 2%      250ns ± 1%      ~     (p=0.571 n=5+5)
      NatSqr/3-8                           378ns ± 2%      379ns ± 2%      ~     (p=0.794 n=5+5)
      NatSqr/5-8                           829ns ± 3%      827ns ± 2%      ~     (p=1.000 n=5+5)
      NatSqr/8-8                          1.97µs ± 2%     1.95µs ± 2%      ~     (p=0.310 n=5+5)
      NatSqr/10-8                         3.02µs ± 2%     2.99µs ± 2%      ~     (p=0.421 n=5+5)
      NatSqr/20-8                         6.51µs ± 2%     6.49µs ± 1%      ~     (p=0.841 n=5+5)
      NatSqr/30-8                         14.1µs ± 2%     14.0µs ± 2%      ~     (p=0.841 n=5+5)
      NatSqr/50-8                         38.1µs ± 2%     38.3µs ± 3%      ~     (p=0.690 n=5+5)
      NatSqr/80-8                         95.5µs ± 2%     96.0µs ± 1%      ~     (p=0.421 n=5+5)
      NatSqr/100-8                         150µs ± 1%      148µs ± 2%      ~     (p=0.095 n=5+5)
      NatSqr/200-8                         588µs ± 1%      590µs ± 1%      ~     (p=0.421 n=5+5)
      NatSqr/300-8                        1.32ms ± 1%     1.31ms ± 1%      ~     (p=0.841 n=5+5)
      NatSqr/500-8                        2.50ms ± 0%     2.47ms ± 0%    -1.03%  (p=0.008 n=5+5)
      NatSqr/800-8                        4.70ms ± 0%     4.64ms ± 0%    -1.31%  (p=0.008 n=5+5)
      NatSqr/1000-8                       7.60ms ± 0%     7.52ms ± 0%    -1.01%  (p=0.008 n=5+5)
      ScanPi-8                             326µs ± 0%      326µs ± 0%      ~     (p=0.841 n=5+5)
      StringPiParallel-8                  70.3µs ± 5%     63.8µs ±10%      ~     (p=0.056 n=5+5)
      Scan/10/Base2-8                     1.09µs ± 0%     1.09µs ± 0%      ~     (p=0.317 n=5+5)
      Scan/100/Base2-8                    7.79µs ± 0%     7.78µs ± 0%      ~     (p=0.063 n=5+5)
      Scan/1000/Base2-8                   79.0µs ± 0%     78.9µs ± 0%    -0.18%  (p=0.008 n=5+5)
      Scan/10000/Base2-8                  1.22ms ± 0%     1.22ms ± 0%    -0.15%  (p=0.008 n=5+5)
      Scan/100000/Base2-8                 55.1ms ± 0%     55.2ms ± 0%    +0.20%  (p=0.008 n=5+5)
      Scan/10/Base8-8                      512ns ± 0%      512ns ± 1%      ~     (p=0.810 n=5+5)
      Scan/100/Base8-8                    2.89µs ± 0%     2.89µs ± 0%      ~     (p=0.810 n=5+5)
      Scan/1000/Base8-8                   31.0µs ± 0%     31.0µs ± 0%      ~     (p=0.151 n=5+5)
      Scan/10000/Base8-8                   740µs ± 0%      741µs ± 0%    +0.10%  (p=0.008 n=5+5)
      Scan/100000/Base8-8                 50.6ms ± 0%     50.6ms ± 0%    +0.08%  (p=0.008 n=5+5)
      Scan/10/Base10-8                     487ns ± 0%      487ns ± 0%      ~     (p=0.571 n=5+5)
      Scan/100/Base10-8                   2.67µs ± 0%     2.67µs ± 0%      ~     (p=0.810 n=5+5)
      Scan/1000/Base10-8                  28.7µs ± 0%     28.7µs ± 0%    +0.06%  (p=0.008 n=5+5)
      Scan/10000/Base10-8                  716µs ± 0%      717µs ± 0%      ~     (p=0.222 n=5+5)
      Scan/100000/Base10-8                50.3ms ± 0%     50.3ms ± 0%    +0.10%  (p=0.008 n=5+5)
      Scan/10/Base16-8                     438ns ± 0%      437ns ± 1%      ~     (p=0.786 n=5+5)
      Scan/100/Base16-8                   2.47µs ± 0%     2.47µs ± 0%    -0.19%  (p=0.048 n=5+5)
      Scan/1000/Base16-8                  27.2µs ± 0%     27.3µs ± 0%      ~     (p=0.087 n=5+5)
      Scan/10000/Base16-8                  722µs ± 0%      722µs ± 0%    +0.11%  (p=0.008 n=5+5)
      Scan/100000/Base16-8                52.6ms ± 0%     52.7ms ± 0%    +0.15%  (p=0.008 n=5+5)
      String/10/Base2-8                    247ns ± 2%      248ns ± 1%      ~     (p=0.437 n=5+5)
      String/100/Base2-8                  1.51µs ± 0%     1.51µs ± 0%    -0.37%  (p=0.024 n=5+5)
      String/1000/Base2-8                 13.6µs ± 1%     13.5µs ± 0%      ~     (p=0.095 n=5+5)
      String/10000/Base2-8                 135µs ± 0%      135µs ± 1%      ~     (p=0.841 n=5+5)
      String/100000/Base2-8               1.32ms ± 1%     1.32ms ± 1%      ~     (p=0.690 n=5+5)
      String/10/Base8-8                    169ns ± 1%      169ns ± 1%      ~     (p=1.000 n=5+5)
      String/100/Base8-8                   636ns ± 0%      634ns ± 1%      ~     (p=0.413 n=5+5)
      String/1000/Base8-8                 5.33µs ± 1%     5.32µs ± 0%      ~     (p=0.222 n=5+5)
      String/10000/Base8-8                50.9µs ± 1%     50.7µs ± 0%      ~     (p=0.151 n=5+5)
      String/100000/Base8-8                500µs ± 1%      497µs ± 0%      ~     (p=0.421 n=5+5)
      String/10/Base10-8                   516ns ± 1%      513ns ± 0%    -0.62%  (p=0.016 n=5+4)
      String/100/Base10-8                 1.97µs ± 0%     1.96µs ± 0%      ~     (p=0.667 n=4+5)
      String/1000/Base10-8                12.5µs ± 0%     11.5µs ± 0%    -7.92%  (p=0.008 n=5+5)
      String/10000/Base10-8               57.7µs ± 0%     52.5µs ± 0%    -8.93%  (p=0.008 n=5+5)
      String/100000/Base10-8              25.6ms ± 0%     21.6ms ± 0%   -15.94%  (p=0.008 n=5+5)
      String/10/Base16-8                   150ns ± 1%      149ns ± 0%      ~     (p=0.413 n=5+4)
      String/100/Base16-8                  514ns ± 1%      514ns ± 1%      ~     (p=0.849 n=5+5)
      String/1000/Base16-8                4.01µs ± 0%     4.01µs ± 0%      ~     (p=0.421 n=5+5)
      String/10000/Base16-8               37.8µs ± 1%     37.8µs ± 1%      ~     (p=0.841 n=5+5)
      String/100000/Base16-8               373µs ± 2%      373µs ± 0%      ~     (p=0.421 n=5+5)
      LeafSize/0-8                        6.63ms ± 0%     6.63ms ± 0%      ~     (p=0.730 n=4+5)
      LeafSize/1-8                        74.0µs ± 0%     67.7µs ± 1%    -8.53%  (p=0.008 n=5+5)
      LeafSize/2-8                        74.2µs ± 0%     68.3µs ± 1%    -7.99%  (p=0.008 n=5+5)
      LeafSize/3-8                         379µs ± 0%      309µs ± 0%   -18.52%  (p=0.008 n=5+5)
      LeafSize/4-8                        72.7µs ± 1%     66.7µs ± 0%    -8.37%  (p=0.008 n=5+5)
      LeafSize/5-8                         471µs ± 0%      384µs ± 0%   -18.55%  (p=0.008 n=5+5)
      LeafSize/6-8                         378µs ± 0%      308µs ± 0%   -18.59%  (p=0.008 n=5+5)
      LeafSize/7-8                         245µs ± 0%      204µs ± 1%   -16.75%  (p=0.008 n=5+5)
      LeafSize/8-8                        73.4µs ± 0%     66.9µs ± 1%    -8.79%  (p=0.008 n=5+5)
      LeafSize/9-8                         538µs ± 0%      437µs ± 0%   -18.75%  (p=0.008 n=5+5)
      LeafSize/10-8                        472µs ± 0%      396µs ± 1%   -16.01%  (p=0.008 n=5+5)
      LeafSize/11-8                        460µs ± 0%      374µs ± 0%   -18.58%  (p=0.008 n=5+5)
      LeafSize/12-8                        378µs ± 0%      308µs ± 0%   -18.38%  (p=0.008 n=5+5)
      LeafSize/13-8                        343µs ± 0%      284µs ± 0%   -17.30%  (p=0.008 n=5+5)
      LeafSize/14-8                        248µs ± 0%      206µs ± 0%   -16.94%  (p=0.008 n=5+5)
      LeafSize/15-8                        169µs ± 0%      144µs ± 0%   -14.69%  (p=0.008 n=5+5)
      LeafSize/16-8                       72.9µs ± 0%     66.8µs ± 1%    -8.27%  (p=0.008 n=5+5)
      LeafSize/32-8                       82.5µs ± 0%     76.7µs ± 0%    -7.04%  (p=0.008 n=5+5)
      LeafSize/64-8                        134µs ± 0%      129µs ± 0%    -3.80%  (p=0.008 n=5+5)
      ProbablyPrime/n=0-8                 44.2ms ± 0%     43.4ms ± 0%    -1.95%  (p=0.008 n=5+5)
      ProbablyPrime/n=1-8                 64.9ms ± 0%     64.0ms ± 0%    -1.27%  (p=0.008 n=5+5)
      ProbablyPrime/n=5-8                  147ms ± 0%      146ms ± 0%    -0.58%  (p=0.008 n=5+5)
      ProbablyPrime/n=10-8                 250ms ± 0%      249ms ± 0%    -0.35%  (p=0.008 n=5+5)
      ProbablyPrime/n=20-8                 456ms ± 0%      455ms ± 0%    -0.18%  (p=0.008 n=5+5)
      ProbablyPrime/Lucas-8               23.6ms ± 0%     22.7ms ± 0%    -3.74%  (p=0.008 n=5+5)
      ProbablyPrime/MillerRabinBase2-8    20.7ms ± 0%     20.6ms ± 0%      ~     (p=0.421 n=5+5)
      FloatSqrt/64-8                      2.25µs ± 1%     2.29µs ± 0%    +1.48%  (p=0.008 n=5+5)
      FloatSqrt/128-8                     4.86µs ± 1%     4.92µs ± 1%    +1.21%  (p=0.032 n=5+5)
      FloatSqrt/256-8                     13.6µs ± 0%     13.7µs ± 1%    +1.31%  (p=0.032 n=5+5)
      FloatSqrt/1000-8                    70.0µs ± 1%     70.1µs ± 0%      ~     (p=0.690 n=5+5)
      FloatSqrt/10000-8                   1.92ms ± 0%     1.90ms ± 0%    -0.59%  (p=0.008 n=5+5)
      FloatSqrt/100000-8                  55.3ms ± 0%     54.8ms ± 0%    -1.01%  (p=0.008 n=5+5)
      FloatSqrt/1000000-8                  4.56s ± 0%      4.50s ± 0%    -1.28%  (p=0.008 n=5+5)
      
      name                              old speed      new speed       delta
      AddVV/1-8                         2.97GB/s ± 0%   5.56GB/s ± 0%   +86.85%  (p=0.008 n=5+5)
      AddVV/2-8                         9.47GB/s ± 0%  10.66GB/s ± 0%   +12.50%  (p=0.008 n=5+5)
      AddVV/3-8                         12.4GB/s ± 0%   14.7GB/s ± 0%   +19.10%  (p=0.008 n=5+5)
      AddVV/4-8                         14.6GB/s ± 0%   18.9GB/s ± 0%   +29.63%  (p=0.016 n=4+5)
      AddVV/5-8                         16.4GB/s ± 0%   22.0GB/s ± 0%   +34.47%  (p=0.016 n=5+4)
      AddVV/10-8                        21.7GB/s ± 0%   35.5GB/s ± 0%   +63.89%  (p=0.008 n=5+5)
      AddVV/100-8                       29.4GB/s ± 0%   68.0GB/s ± 0%  +131.38%  (p=0.008 n=5+5)
      AddVV/1000-8                      31.7GB/s ± 0%   61.9GB/s ± 0%   +95.43%  (p=0.008 n=5+5)
      AddVV/10000-8                     31.2GB/s ± 0%   56.4GB/s ± 0%   +80.83%  (p=0.008 n=5+5)
      AddVV/100000-8                    25.9GB/s ± 3%   41.4GB/s ± 0%   +59.98%  (p=0.008 n=5+5)
      SubVV/1-8                         2.97GB/s ± 0%   5.56GB/s ± 0%   +86.97%  (p=0.016 n=4+5)
      SubVV/2-8                         9.47GB/s ± 0%  10.66GB/s ± 0%   +12.51%  (p=0.008 n=5+5)
      SubVV/3-8                         12.4GB/s ± 0%   14.8GB/s ± 0%   +19.23%  (p=0.016 n=4+5)
      SubVV/4-8                         14.6GB/s ± 0%   18.9GB/s ± 0%   +29.56%  (p=0.008 n=5+5)
      SubVV/5-8                         16.4GB/s ± 0%   22.0GB/s ± 0%   +34.47%  (p=0.016 n=4+5)
      SubVV/10-8                        21.7GB/s ± 0%   35.5GB/s ± 0%   +63.89%  (p=0.008 n=5+5)
      SubVV/100-8                       29.4GB/s ± 0%   68.0GB/s ± 0%  +131.38%  (p=0.008 n=5+5)
      SubVV/1000-8                      31.6GB/s ± 0%   80.1GB/s ± 0%  +153.08%  (p=0.008 n=5+5)
      SubVV/10000-8                     31.2GB/s ± 0%   56.7GB/s ± 0%   +81.79%  (p=0.008 n=5+5)
      SubVV/100000-8                    29.1GB/s ±10%   29.0GB/s ±18%      ~     (p=0.690 n=5+5)
      AddVW/1-8                          859MB/s ± 0%    859MB/s ± 0%    -0.01%  (p=0.008 n=5+5)
      AddVW/2-8                          811MB/s ± 1%    814MB/s ± 0%      ~     (p=0.413 n=5+4)
      AddVW/3-8                         2.08GB/s ± 0%   2.08GB/s ± 0%      ~     (p=0.206 n=5+5)
      AddVW/4-8                         2.46GB/s ± 0%   2.46GB/s ± 0%      ~     (p=0.056 n=5+5)
      AddVW/5-8                         2.75GB/s ± 0%   2.75GB/s ± 0%      ~     (p=0.508 n=5+5)
      AddVW/10-8                        3.63GB/s ± 0%   3.63GB/s ± 0%      ~     (p=0.214 n=5+5)
      AddVW/100-8                       4.79GB/s ± 0%   4.79GB/s ± 0%      ~     (p=0.500 n=5+5)
      AddVW/1000-8                      5.27GB/s ± 0%   5.25GB/s ± 0%    -0.43%  (p=0.008 n=5+5)
      AddVW/10000-8                     5.30GB/s ± 0%   5.30GB/s ± 0%      ~     (p=0.397 n=5+5)
      AddVW/100000-8                    5.27GB/s ± 1%   5.25GB/s ± 1%      ~     (p=0.690 n=5+5)
      AddMulVVW/1-8                     1.92GB/s ± 0%   1.96GB/s ± 1%    +1.95%  (p=0.008 n=5+5)
      AddMulVVW/2-8                     2.16GB/s ± 1%   2.25GB/s ± 1%    +4.32%  (p=0.008 n=5+5)
      AddMulVVW/3-8                     2.39GB/s ± 1%   2.25GB/s ± 3%    -5.79%  (p=0.008 n=5+5)
      AddMulVVW/4-8                     2.00GB/s ± 0%   2.31GB/s ± 1%   +15.31%  (p=0.008 n=5+5)
      AddMulVVW/5-8                     2.22GB/s ± 0%   2.14GB/s ± 0%    -3.86%  (p=0.008 n=5+5)
      AddMulVVW/10-8                    2.15GB/s ± 1%   2.25GB/s ± 0%    +5.03%  (p=0.008 n=5+5)
      AddMulVVW/100-8                   2.09GB/s ± 0%   2.14GB/s ± 0%    +2.25%  (p=0.008 n=5+5)
      AddMulVVW/1000-8                  2.04GB/s ± 0%   2.38GB/s ± 0%   +16.52%  (p=0.008 n=5+5)
      AddMulVVW/10000-8                 2.03GB/s ± 0%   2.10GB/s ± 0%    +3.64%  (p=0.008 n=5+5)
      AddMulVVW/100000-8                2.02GB/s ± 0%   2.02GB/s ± 1%      ~     (p=0.690 n=5+5)
      
      Change-Id: Ie482d67a7dbb5af6f5d81af2b3d9d14bd66336db
      Reviewed-on: https://go-review.googlesource.com/77831Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      c4f3fe95
  2. 05 Mar, 2018 10 commits
  3. 04 Mar, 2018 7 commits
  4. 03 Mar, 2018 6 commits
    • Giovanni Bajo's avatar
      test: port a nil-check interface test from asm_test · 8ce74b7d
      Giovanni Bajo authored
      Change-Id: I69c1688506d1aeca655047acf35d1bff966fc01e
      Reviewed-on: https://go-review.googlesource.com/98442
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarKeith Randall <khr@golang.org>
      8ce74b7d
    • Giovanni Bajo's avatar
      test: use the version of Go used to run run.go · ec0b8c05
      Giovanni Bajo authored
      Currently, the top-level testsuite always uses whatever version
      of Go is found in the PATH to execute all the tests. This
      forces the developers to tweak the PATH to run the testsuite.
      
      Change it to use the same version of Go used to run run.go.
      This allows developers to run the testsuite using the tip
      compiler by simply saying "../bin/go run run.go".
      
      I think this is a better solution compared to always forcing
      "../bin/go", because it allows developers to run the testsuite
      using different Go versions, for instance to check if a new
      test is fixed in tip compared to the installed compiler.
      
      Fixes #24217
      
      Change-Id: I41b299c753b6e77c41e28be9091b2b630efea9d2
      Reviewed-on: https://go-review.googlesource.com/98439
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarKeith Randall <khr@golang.org>
      ec0b8c05
    • Pascal S. de Kloe's avatar
      encoding/json: apply conventional error handling in decoder · 74a92b8e
      Pascal S. de Kloe authored
      name                            old time/op    new time/op    delta
      CodeEncoder-12                    1.89ms ± 1%    1.91ms ± 0%   +1.16%  (p=0.000 n=20+19)
      CodeMarshal-12                    2.09ms ± 1%    2.12ms ± 0%   +1.63%  (p=0.000 n=17+18)
      CodeDecoder-12                    8.43ms ± 1%    8.32ms ± 1%   -1.35%  (p=0.000 n=18+20)
      UnicodeDecoder-12                  399ns ± 0%     339ns ± 0%  -15.00%  (p=0.000 n=20+19)
      DecoderStream-12                   281ns ± 1%     231ns ± 0%  -17.91%  (p=0.000 n=20+16)
      CodeUnmarshal-12                  9.35ms ± 2%    9.15ms ± 2%   -2.11%  (p=0.000 n=20+20)
      CodeUnmarshalReuse-12             8.41ms ± 2%    8.29ms ± 2%   -1.34%  (p=0.000 n=20+20)
      UnmarshalString-12                81.2ns ± 2%    74.0ns ± 4%   -8.89%  (p=0.000 n=20+20)
      UnmarshalFloat64-12               71.1ns ± 2%    64.3ns ± 1%   -9.60%  (p=0.000 n=20+19)
      UnmarshalInt64-12                 60.6ns ± 2%    53.2ns ± 0%  -12.28%  (p=0.000 n=18+18)
      Issue10335-12                     96.9ns ± 0%    87.7ns ± 1%   -9.52%  (p=0.000 n=17+20)
      Unmapped-12                        247ns ± 4%     231ns ± 3%   -6.34%  (p=0.000 n=20+20)
      TypeFieldsCache/MissTypes1-12     11.1µs ± 0%    11.1µs ± 0%     ~     (p=0.376 n=19+20)
      TypeFieldsCache/MissTypes10-12    33.9µs ± 0%    33.8µs ± 0%   -0.32%  (p=0.000 n=18+9)
      
      name                            old speed      new speed      delta
      CodeEncoder-12                  1.03GB/s ± 1%  1.01GB/s ± 0%   -1.15%  (p=0.000 n=20+19)
      CodeMarshal-12                   930MB/s ± 1%   915MB/s ± 0%   -1.60%  (p=0.000 n=17+18)
      CodeDecoder-12                   230MB/s ± 1%   233MB/s ± 1%   +1.37%  (p=0.000 n=18+20)
      UnicodeDecoder-12               35.0MB/s ± 0%  41.2MB/s ± 0%  +17.60%  (p=0.000 n=20+19)
      CodeUnmarshal-12                 208MB/s ± 2%   212MB/s ± 2%   +2.16%  (p=0.000 n=20+20)
      
      name                            old alloc/op   new alloc/op   delta
      Issue10335-12                       184B ± 0%      184B ± 0%     ~     (all equal)
      Unmapped-12                         216B ± 0%      216B ± 0%     ~     (all equal)
      
      name                            old allocs/op  new allocs/op  delta
      Issue10335-12                       3.00 ± 0%      3.00 ± 0%     ~     (all equal)
      Unmapped-12                         4.00 ± 0%      4.00 ± 0%     ~     (all equal)
      
      Change-Id: I4b1a87a205da2ef9a572f86f85bc833653c61570
      Reviewed-on: https://go-review.googlesource.com/98440Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      74a92b8e
    • Tobias Klauser's avatar
      runtime: use vDSO for clock_gettime on linux/arm · 51b02711
      Tobias Klauser authored
      Use the __vdso_clock_gettime fast path via the vDSO on linux/arm to
      speed up nanotime and walltime. This results in the following
      performance improvement for time.Now on a RaspberryPi 3 (running
      32bit Raspbian, i.e. GOOS=linux/GOARCH=arm):
      
      name     old time/op  new time/op  delta
      TimeNow  0.99µs ± 0%  0.39µs ± 1%  -60.74%  (p=0.000 n=12+20)
      
      Change-Id: I3598278a6c88d7f6a6ce66c56b9d25f9dd2f4c9a
      Reviewed-on: https://go-review.googlesource.com/98095Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      51b02711
    • Tobias Klauser's avatar
      runtime: remove unused __vdso_time_sym · c69f60d0
      Tobias Klauser authored
      It's unused since https://golang.org/cl/99320043
      
      Change-Id: I74d69ff894aa2fb556f1c2083406c118c559d91b
      Reviewed-on: https://go-review.googlesource.com/98195
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      c69f60d0
    • Keith Randall's avatar
      internal/bytealg: move equal functions to bytealg · 1dfa380e
      Keith Randall authored
      Move bytes.Equal, runtime.memequal, and runtime.memequal_varlen
      to the bytealg package.
      
      Update #19792
      
      Change-Id: Ic4175e952936016ea0bda6c7c3dbb33afdc8e4ac
      Reviewed-on: https://go-review.googlesource.com/98355
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      1dfa380e