1. 13 Mar, 2018 5 commits
  2. 12 Mar, 2018 8 commits
    • Filippo Valsorda's avatar
      C: add Filippo Valsorda's @golang.org email · 2086f350
      Filippo Valsorda authored
      Change-Id: I3758a4350b600af304b2cff7ad59c7368a01ab5c
      Reviewed-on: https://go-review.googlesource.com/100215Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      2086f350
    • Josh Bleecher Snyder's avatar
      runtime: convert g.waitreason from string to uint8 · 4eea887f
      Josh Bleecher Snyder authored
      Every time I poke at #14921, the g.waitreason string
      pointer writes show up.
      
      They're not particularly important performance-wise,
      but it'd be nice to clear the noise away.
      
      And it does open up a few extra bytes in the g struct
      for some future use.
      
      Change-Id: I7ffbd52fbc2a286931a2218038fda52ed6473cc9
      Reviewed-on: https://go-review.googlesource.com/99078
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarAustin Clements <austin@google.com>
      4eea887f
    • Tobias Klauser's avatar
      runtime: simplify range expressions in tests · 025134b0
      Tobias Klauser authored
      Generated by running gofmt -s on the files in question.
      
      Change-Id: If6578b150e1bfced8657196d2af01f5d36879f93
      Reviewed-on: https://go-review.googlesource.com/100135
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      025134b0
    • Vladimir Kuzmin's avatar
      cmd/compile: avoid extra mapaccess in "m[k] op= r" · 73950831
      Vladimir Kuzmin authored
      Currently, order desugars map assignment operations like
      
          m[k] op= r
      
      into
      
          m[k] = m[k] op r
      
      which in turn is transformed during walk into:
      
          tmp := *mapaccess(m, k)
          tmp = tmp op r
          *mapassign(m, k) = tmp
      
      However, this is suboptimal, as we could instead produce just:
      
          *mapassign(m, k) op= r
      
      One complication though is if "r == 0", then "m[k] /= r" and "m[k] %=
      r" will panic, and they need to do so *before* calling mapassign,
      otherwise we may insert a new zero-value element into the map.
      
      It would be spec compliant to just emit the "r != 0" check before
      calling mapassign (see #23735), but currently these checks aren't
      generated until SSA construction. For now, it's simpler to continue
      desugaring /= and %= into two map indexing operations.
      
      Fixes #23661.
      
      Change-Id: I46e3739d9adef10e92b46fdd78b88d5aabe68952
      Reviewed-on: https://go-review.googlesource.com/91557
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarAustin Clements <austin@google.com>
      73950831
    • isharipo's avatar
      cmd/compile/internal/ssa: emit IMUL3{L/Q} for MUL{L/Q}const on x86 · 85a8d25d
      isharipo authored
      cmd/asm now supports three-operand form of IMUL,
      so instead of using IMUL with resultInArg0, emit IMUL3 instruction.
      
      This results in less redundant MOVs where SSA assigns
      different registers to input[0] and dst arguments.
      
      Note: these have exactly the same encoding when reg0=reg1:
            IMUL3x $const, reg0, reg1
            IMULx $const, reg
      Two-operand IMULx is like a crippled IMUL3x, with dst fixed to input[0].
      This is why we don't bother to generate IMULx for the case where
      dst is the same as input[0].
      
      Change-Id: I4becda475b3dffdd07b6fdf1c75bacc82af654e4
      Reviewed-on: https://go-review.googlesource.com/99656
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarGiovanni Bajo <rasky@develer.com>
      Reviewed-by: default avatarKeith Randall <khr@golang.org>
      85a8d25d
    • Giovanni Bajo's avatar
      cmd/compile: implement CMOV on amd64 · 080187f4
      Giovanni Bajo authored
      This builds upon the branchelim pass, activating it for amd64 and
      lowering CondSelect. Special care is made to FPU instructions for
      NaN handling.
      
      Benchmark results on Xeon E5630 (Westmere EP):
      
      name                      old time/op    new time/op    delta
      BinaryTree17-16              4.99s ± 9%     4.66s ± 2%     ~     (p=0.095 n=5+5)
      Fannkuch11-16                4.93s ± 3%     5.04s ± 2%     ~     (p=0.548 n=5+5)
      FmtFprintfEmpty-16          58.8ns ± 7%    61.4ns ±14%     ~     (p=0.579 n=5+5)
      FmtFprintfString-16          114ns ± 2%     114ns ± 4%     ~     (p=0.603 n=5+5)
      FmtFprintfInt-16             181ns ± 4%     125ns ± 3%  -30.90%  (p=0.008 n=5+5)
      FmtFprintfIntInt-16          263ns ± 2%     217ns ± 2%  -17.34%  (p=0.008 n=5+5)
      FmtFprintfPrefixedInt-16     230ns ± 1%     212ns ± 1%   -7.99%  (p=0.008 n=5+5)
      FmtFprintfFloat-16           411ns ± 3%     344ns ± 5%  -16.43%  (p=0.008 n=5+5)
      FmtManyArgs-16               828ns ± 4%     790ns ± 2%   -4.59%  (p=0.032 n=5+5)
      GobDecode-16                10.9ms ± 4%    10.8ms ± 5%     ~     (p=0.548 n=5+5)
      GobEncode-16                9.52ms ± 5%    9.46ms ± 2%     ~     (p=1.000 n=5+5)
      Gzip-16                      334ms ± 2%     337ms ± 2%     ~     (p=0.548 n=5+5)
      Gunzip-16                   64.4ms ± 1%    65.0ms ± 1%   +1.00%  (p=0.008 n=5+5)
      HTTPClientServer-16          156µs ± 3%     155µs ± 3%     ~     (p=0.690 n=5+5)
      JSONEncode-16               21.0ms ± 1%    21.8ms ± 0%   +3.76%  (p=0.016 n=5+4)
      JSONDecode-16               95.1ms ± 0%    95.7ms ± 1%     ~     (p=0.151 n=5+5)
      Mandelbrot200-16            6.38ms ± 1%    6.42ms ± 1%     ~     (p=0.095 n=5+5)
      GoParse-16                  5.47ms ± 2%    5.36ms ± 1%   -1.95%  (p=0.016 n=5+5)
      RegexpMatchEasy0_32-16       111ns ± 1%     111ns ± 1%     ~     (p=0.635 n=5+4)
      RegexpMatchEasy0_1K-16       408ns ± 1%     411ns ± 2%     ~     (p=0.087 n=5+5)
      RegexpMatchEasy1_32-16       103ns ± 1%     104ns ± 1%     ~     (p=0.484 n=5+5)
      RegexpMatchEasy1_1K-16       659ns ± 2%     652ns ± 1%     ~     (p=0.571 n=5+5)
      RegexpMatchMedium_32-16      176ns ± 2%     174ns ± 1%     ~     (p=0.476 n=5+5)
      RegexpMatchMedium_1K-16     58.6µs ± 4%    57.7µs ± 4%     ~     (p=0.548 n=5+5)
      RegexpMatchHard_32-16       3.07µs ± 3%    3.04µs ± 4%     ~     (p=0.421 n=5+5)
      RegexpMatchHard_1K-16       89.2µs ± 1%    87.9µs ± 2%   -1.52%  (p=0.032 n=5+5)
      Revcomp-16                   575ms ± 0%     587ms ± 2%   +2.12%  (p=0.032 n=4+5)
      Template-16                  110ms ± 1%     107ms ± 3%   -3.00%  (p=0.032 n=5+5)
      TimeParse-16                 463ns ± 0%     462ns ± 0%     ~     (p=0.810 n=5+4)
      TimeFormat-16                538ns ± 0%     535ns ± 0%   -0.63%  (p=0.024 n=5+5)
      
      name                      old speed      new speed      delta
      GobDecode-16              70.7MB/s ± 4%  71.4MB/s ± 5%     ~     (p=0.452 n=5+5)
      GobEncode-16              80.7MB/s ± 5%  81.2MB/s ± 2%     ~     (p=1.000 n=5+5)
      Gzip-16                   58.2MB/s ± 2%  57.7MB/s ± 2%     ~     (p=0.452 n=5+5)
      Gunzip-16                  302MB/s ± 1%   299MB/s ± 1%   -0.99%  (p=0.008 n=5+5)
      JSONEncode-16             92.4MB/s ± 1%  89.1MB/s ± 0%   -3.63%  (p=0.016 n=5+4)
      JSONDecode-16             20.4MB/s ± 0%  20.3MB/s ± 1%     ~     (p=0.135 n=5+5)
      GoParse-16                10.6MB/s ± 2%  10.8MB/s ± 1%   +2.00%  (p=0.016 n=5+5)
      RegexpMatchEasy0_32-16     286MB/s ± 1%   285MB/s ± 3%     ~     (p=1.000 n=5+5)
      RegexpMatchEasy0_1K-16    2.51GB/s ± 1%  2.49GB/s ± 2%     ~     (p=0.095 n=5+5)
      RegexpMatchEasy1_32-16     309MB/s ± 1%   307MB/s ± 1%     ~     (p=0.548 n=5+5)
      RegexpMatchEasy1_1K-16    1.55GB/s ± 2%  1.57GB/s ± 1%     ~     (p=0.690 n=5+5)
      RegexpMatchMedium_32-16   5.68MB/s ± 2%  5.73MB/s ± 1%     ~     (p=0.579 n=5+5)
      RegexpMatchMedium_1K-16   17.5MB/s ± 4%  17.8MB/s ± 4%     ~     (p=0.500 n=5+5)
      RegexpMatchHard_32-16     10.4MB/s ± 3%  10.5MB/s ± 4%     ~     (p=0.460 n=5+5)
      RegexpMatchHard_1K-16     11.5MB/s ± 1%  11.7MB/s ± 2%   +1.57%  (p=0.032 n=5+5)
      Revcomp-16                 442MB/s ± 0%   433MB/s ± 2%   -2.05%  (p=0.032 n=4+5)
      Template-16               17.7MB/s ± 1%  18.2MB/s ± 3%   +3.12%  (p=0.032 n=5+5)
      
      Change-Id: Ic7cb7374d07da031e771bdcbfdd832fd1b17159c
      Reviewed-on: https://go-review.googlesource.com/98695Reviewed-by: default avatarIlya Tocar <ilya.tocar@intel.com>
      080187f4
    • fanzha02's avatar
      cmd/asm: fix ARM64 vector register arrangement encoding bug · fdf5aaf5
      fanzha02 authored
      The current code assigns vector register arrangement a wrong value
      when the arrangement specifier is S2, which causes the incorrect
      assembly.
      
      The patch fixes the issue and adds the test cases.
      
      Fixes #24249
      
      Change-Id: I9736df1279494003d0b178da1af9cee9cd85ce21
      Reviewed-on: https://go-review.googlesource.com/98555Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      fdf5aaf5
    • erifan01's avatar
      math/big: optimize shlVU and shrVU on arm64 · 140bfe9c
      erifan01 authored
      This CL implements shlVU and shrVU with arm64 HW instructions "LDP" and "STP" to reduce load cost,
      it also removes unnecessary checks on the number of shifts for better performance.
      
      Benchmarks:
      
      name                              old time/op    new time/op    delta
      AddVV/1-8                           21.6ns ± 1%    21.6ns ± 1%      ~     (p=0.683 n=5+5)
      AddVV/2-8                           13.5ns ± 0%    13.5ns ± 0%      ~     (all equal)
      AddVV/3-8                           15.5ns ± 0%    15.5ns ± 0%      ~     (all equal)
      AddVV/4-8                           17.5ns ± 0%    17.5ns ± 0%      ~     (all equal)
      AddVV/5-8                           19.5ns ± 0%    19.5ns ± 0%      ~     (all equal)
      AddVV/10-8                          29.5ns ± 0%    29.5ns ± 0%      ~     (all equal)
      AddVV/100-8                          217ns ± 0%     217ns ± 0%      ~     (all equal)
      AddVV/1000-8                        2.02µs ± 0%    2.03µs ± 0%    +0.73%  (p=0.008 n=5+5)
      AddVV/10000-8                       20.5µs ± 0%    20.5µs ± 0%    -0.01%  (p=0.008 n=5+5)
      AddVV/100000-8                       246µs ± 5%     250µs ± 4%      ~     (p=0.548 n=5+5)
      AddVW/1-8                           9.26ns ± 0%    9.32ns ± 0%    +0.65%  (p=0.016 n=4+5)
      AddVW/2-8                           19.8ns ± 3%    19.8ns ± 0%      ~     (p=0.143 n=5+5)
      AddVW/3-8                           11.5ns ± 0%    11.5ns ± 0%      ~     (all equal)
      AddVW/4-8                           13.0ns ± 0%    13.0ns ± 0%      ~     (all equal)
      AddVW/5-8                           14.5ns ± 0%    14.5ns ± 0%      ~     (all equal)
      AddVW/10-8                          22.0ns ± 0%    22.0ns ± 0%      ~     (all equal)
      AddVW/100-8                          167ns ± 0%     166ns ± 0%    -0.60%  (p=0.000 n=5+4)
      AddVW/1000-8                        1.52µs ± 0%    1.52µs ± 0%      ~     (all equal)
      AddVW/10000-8                       15.1µs ± 0%    15.1µs ± 0%    +0.01%  (p=0.008 n=5+5)
      AddVW/100000-8                       163µs ± 4%     153µs ± 3%    -5.97%  (p=0.016 n=5+5)
      AddMulVVW/1-8                       32.4ns ± 1%    33.0ns ± 1%    +1.73%  (p=0.040 n=5+5)
      AddMulVVW/2-8                       56.4ns ± 2%    55.9ns ± 1%      ~     (p=0.135 n=5+5)
      AddMulVVW/3-8                       85.4ns ± 1%    85.1ns ± 0%      ~     (p=0.079 n=5+5)
      AddMulVVW/4-8                        129ns ± 1%     129ns ± 0%      ~     (p=0.397 n=5+5)
      AddMulVVW/5-8                        148ns ± 0%     148ns ± 0%      ~     (all equal)
      AddMulVVW/10-8                       270ns ± 0%     268ns ± 0%    -0.74%  (p=0.029 n=4+4)
      AddMulVVW/100-8                     2.75µs ± 0%    2.75µs ± 0%    -0.09%  (p=0.008 n=5+5)
      AddMulVVW/1000-8                    26.0µs ± 0%    26.0µs ± 0%    -0.06%  (p=0.024 n=5+5)
      AddMulVVW/10000-8                    312µs ± 0%     312µs ± 0%    -0.09%  (p=0.008 n=5+5)
      AddMulVVW/100000-8                  2.89ms ± 0%    2.89ms ± 0%    +0.14%  (p=0.016 n=5+5)
      DecimalConversion-8                  315µs ± 1%     312µs ± 0%      ~     (p=0.095 n=5+5)
      FloatString/100-8                   2.56µs ± 1%    2.52µs ± 1%    -1.31%  (p=0.016 n=5+5)
      FloatString/1000-8                  58.6µs ± 0%    58.2µs ± 0%    -0.75%  (p=0.008 n=5+5)
      FloatString/10000-8                 4.59ms ± 0%    4.59ms ± 0%      ~     (p=0.056 n=5+5)
      FloatString/100000-8                 446ms ± 0%     446ms ± 0%    -0.04%  (p=0.008 n=5+5)
      FloatAdd/10-8                        184ns ± 0%     178ns ± 0%    -3.48%  (p=0.008 n=5+5)
      FloatAdd/100-8                       189ns ± 3%     178ns ± 2%    -6.02%  (p=0.008 n=5+5)
      FloatAdd/1000-8                      371ns ± 0%     267ns ± 0%   -27.99%  (p=0.000 n=5+4)
      FloatAdd/10000-8                    1.87µs ± 0%    1.03µs ± 0%   -44.74%  (p=0.008 n=5+5)
      FloatAdd/100000-8                   17.1µs ± 0%     8.8µs ± 0%   -48.71%  (p=0.016 n=5+4)
      FloatSub/10-8                        148ns ± 0%     146ns ± 0%    -1.35%  (p=0.000 n=5+4)
      FloatSub/100-8                       148ns ± 0%     140ns ± 0%    -5.41%  (p=0.008 n=5+5)
      FloatSub/1000-8                      242ns ± 0%     191ns ± 0%   -21.24%  (p=0.008 n=5+5)
      FloatSub/10000-8                    1.07µs ± 0%    0.64µs ± 1%   -39.89%  (p=0.016 n=4+5)
      FloatSub/100000-8                   9.48µs ± 0%    5.32µs ± 0%   -43.87%  (p=0.008 n=5+5)
      ParseFloatSmallExp-8                29.3µs ± 3%    28.6µs ± 1%      ~     (p=0.310 n=5+5)
      ParseFloatLargeExp-8                 125µs ± 1%     123µs ± 0%    -1.99%  (p=0.008 n=5+5)
      GCD10x10/WithoutXY-8                 278ns ± 4%     289ns ± 5%    +3.96%  (p=0.040 n=5+5)
      GCD10x10/WithXY-8                   2.12µs ± 2%    2.15µs ± 2%      ~     (p=0.095 n=5+5)
      GCD10x100/WithoutXY-8                615ns ± 1%     629ns ± 5%      ~     (p=0.135 n=5+5)
      GCD10x100/WithXY-8                  3.42µs ± 1%    3.53µs ± 2%    +3.38%  (p=0.008 n=5+5)
      GCD10x1000/WithoutXY-8              1.39µs ± 1%    1.38µs ± 1%      ~     (p=0.460 n=5+5)
      GCD10x1000/WithXY-8                 7.47µs ± 2%    7.49µs ± 3%      ~     (p=1.000 n=5+5)
      GCD10x10000/WithoutXY-8             8.71µs ± 1%    8.71µs ± 0%      ~     (p=0.841 n=5+5)
      GCD10x10000/WithXY-8                28.4µs ± 2%    27.2µs ± 2%    -4.24%  (p=0.008 n=5+5)
      GCD10x100000/WithoutXY-8            78.9µs ± 1%    79.1µs ± 0%      ~     (p=0.222 n=5+5)
      GCD10x100000/WithXY-8                240µs ± 1%     228µs ± 1%    -4.98%  (p=0.008 n=5+5)
      GCD100x100/WithoutXY-8              1.87µs ± 2%    1.89µs ± 1%      ~     (p=0.095 n=5+5)
      GCD100x100/WithXY-8                 26.6µs ± 1%    26.3µs ± 0%    -1.14%  (p=0.032 n=5+5)
      GCD100x1000/WithoutXY-8             4.44µs ± 2%    4.47µs ± 2%      ~     (p=0.444 n=5+5)
      GCD100x1000/WithXY-8                36.7µs ± 1%    36.0µs ± 1%    -1.96%  (p=0.008 n=5+5)
      GCD100x10000/WithoutXY-8            22.8µs ± 1%    22.3µs ± 1%    -2.52%  (p=0.008 n=5+5)
      GCD100x10000/WithXY-8                145µs ± 1%     142µs ± 0%    -1.88%  (p=0.008 n=5+5)
      GCD100x100000/WithoutXY-8            198µs ± 1%     190µs ± 0%    -4.06%  (p=0.008 n=5+5)
      GCD100x100000/WithXY-8              1.11ms ± 0%    1.09ms ± 0%    -1.87%  (p=0.008 n=5+5)
      GCD1000x1000/WithoutXY-8            25.4µs ± 1%    25.0µs ± 1%    -1.34%  (p=0.008 n=5+5)
      GCD1000x1000/WithXY-8                515µs ± 1%     485µs ± 0%    -5.85%  (p=0.008 n=5+5)
      GCD1000x10000/WithoutXY-8           57.3µs ± 1%    56.2µs ± 1%    -1.95%  (p=0.008 n=5+5)
      GCD1000x10000/WithXY-8              1.21ms ± 0%    1.18ms ± 0%    -2.65%  (p=0.008 n=5+5)
      GCD1000x100000/WithoutXY-8           358µs ± 0%     352µs ± 1%    -1.71%  (p=0.008 n=5+5)
      GCD1000x100000/WithXY-8             8.72ms ± 0%    8.66ms ± 0%    -0.71%  (p=0.008 n=5+5)
      GCD10000x10000/WithoutXY-8           690µs ± 0%     687µs ± 1%      ~     (p=0.095 n=5+5)
      GCD10000x10000/WithXY-8             16.0ms ± 0%    12.5ms ± 0%   -22.01%  (p=0.008 n=5+5)
      GCD10000x100000/WithoutXY-8         2.09ms ± 0%    2.07ms ± 0%    -0.58%  (p=0.008 n=5+5)
      GCD10000x100000/WithXY-8            86.8ms ± 0%    83.4ms ± 0%    -3.95%  (p=0.008 n=5+5)
      GCD100000x100000/WithoutXY-8        51.2ms ± 0%    51.2ms ± 0%      ~     (p=0.548 n=5+5)
      GCD100000x100000/WithXY-8            1.25s ± 0%     0.89s ± 0%   -28.98%  (p=0.008 n=5+5)
      Hilbert-8                           2.46ms ± 2%    2.53ms ± 1%    +2.89%  (p=0.032 n=5+5)
      Binomial-8                          5.15µs ± 4%    4.92µs ± 1%    -4.43%  (p=0.032 n=5+5)
      QuoRem-8                            7.10µs ± 0%    7.05µs ± 0%    -0.59%  (p=0.008 n=5+5)
      Exp-8                                161ms ± 0%     161ms ± 0%    -0.24%  (p=0.008 n=5+5)
      Exp2-8                               161ms ± 0%     161ms ± 0%    -0.30%  (p=0.016 n=4+5)
      Bitset-8                            40.4ns ± 0%    40.3ns ± 0%      ~     (p=0.159 n=5+5)
      BitsetNeg-8                          158ns ± 4%     155ns ± 2%      ~     (p=0.183 n=5+5)
      BitsetOrig-8                         374ns ± 0%     383ns ± 1%    +2.35%  (p=0.008 n=5+5)
      BitsetNegOrig-8                      620ns ± 1%     663ns ± 2%    +7.00%  (p=0.008 n=5+5)
      ModSqrt225_Tonelli-8                7.26ms ± 0%    7.27ms ± 0%      ~     (p=0.841 n=5+5)
      ModSqrt224_3Mod4-8                  2.24ms ± 0%    2.24ms ± 0%      ~     (p=0.690 n=5+5)
      ModSqrt5430_Tonelli-8                62.3s ± 0%     62.4s ± 0%    +0.15%  (p=0.008 n=5+5)
      ModSqrt5430_3Mod4-8                  20.8s ± 0%     20.8s ± 0%      ~     (p=0.151 n=5+5)
      Sqrt-8                               101µs ± 0%      97µs ± 0%    -3.99%  (p=0.008 n=5+5)
      IntSqr/1-8                          32.7ns ± 1%    32.5ns ± 1%      ~     (p=0.325 n=5+5)
      IntSqr/2-8                           161ns ± 4%     160ns ± 4%      ~     (p=0.659 n=5+5)
      IntSqr/3-8                           296ns ± 7%     297ns ± 6%      ~     (p=0.841 n=5+5)
      IntSqr/5-8                           752ns ± 7%     755ns ± 6%      ~     (p=0.889 n=5+5)
      IntSqr/8-8                          1.91µs ± 3%    1.90µs ± 3%      ~     (p=0.746 n=5+5)
      IntSqr/10-8                         2.99µs ± 4%    3.00µs ± 4%      ~     (p=0.516 n=5+5)
      IntSqr/20-8                         6.29µs ± 2%    6.19µs ± 2%      ~     (p=0.151 n=5+5)
      IntSqr/30-8                         14.0µs ± 1%    13.8µs ± 2%      ~     (p=0.056 n=5+5)
      IntSqr/50-8                         38.1µs ± 3%    37.9µs ± 3%      ~     (p=0.548 n=5+5)
      IntSqr/80-8                         95.1µs ± 1%    94.7µs ± 1%      ~     (p=0.310 n=5+5)
      IntSqr/100-8                         148µs ± 1%     148µs ± 1%      ~     (p=0.548 n=5+5)
      IntSqr/200-8                         587µs ± 1%     587µs ± 1%      ~     (p=1.000 n=5+5)
      IntSqr/300-8                        1.31ms ± 1%    1.32ms ± 1%      ~     (p=0.151 n=5+5)
      IntSqr/500-8                        2.48ms ± 0%    2.49ms ± 0%      ~     (p=0.310 n=5+5)
      IntSqr/800-8                        4.68ms ± 0%    4.67ms ± 0%      ~     (p=0.548 n=5+5)
      IntSqr/1000-8                       7.57ms ± 0%    7.56ms ± 0%      ~     (p=0.421 n=5+5)
      Mul-8                                311ms ± 0%     311ms ± 0%      ~     (p=0.151 n=5+5)
      Exp3Power/0x10-8                     584ns ± 2%     573ns ± 1%      ~     (p=0.190 n=5+5)
      Exp3Power/0x40-8                     646ns ± 2%     649ns ± 1%      ~     (p=0.690 n=5+5)
      Exp3Power/0x100-8                   1.42µs ± 2%    1.45µs ± 1%    +2.03%  (p=0.032 n=5+5)
      Exp3Power/0x400-8                   8.28µs ± 1%    8.39µs ± 0%    +1.33%  (p=0.008 n=5+5)
      Exp3Power/0x1000-8                  60.1µs ± 0%    59.8µs ± 0%    -0.44%  (p=0.008 n=5+5)
      Exp3Power/0x4000-8                   818µs ± 0%     816µs ± 0%    -0.23%  (p=0.008 n=5+5)
      Exp3Power/0x10000-8                 7.79ms ± 0%    7.78ms ± 0%      ~     (p=0.690 n=5+5)
      Exp3Power/0x40000-8                 73.4ms ± 0%    73.3ms ± 0%      ~     (p=0.151 n=5+5)
      Exp3Power/0x100000-8                 665ms ± 0%     664ms ± 0%    -0.16%  (p=0.016 n=4+5)
      Exp3Power/0x400000-8                 5.99s ± 0%     5.97s ± 0%    -0.24%  (p=0.008 n=5+5)
      Fibo-8                               116ms ± 0%     117ms ± 0%    +0.42%  (p=0.008 n=5+5)
      NatSqr/1-8                           113ns ± 2%     112ns ± 1%      ~     (p=0.190 n=5+5)
      NatSqr/2-8                           249ns ± 2%     250ns ± 2%      ~     (p=0.365 n=5+5)
      NatSqr/3-8                           379ns ± 1%     381ns ± 2%      ~     (p=0.127 n=5+5)
      NatSqr/5-8                           838ns ± 3%     841ns ± 5%      ~     (p=0.754 n=5+5)
      NatSqr/8-8                          1.97µs ± 3%    1.97µs ± 4%      ~     (p=1.000 n=5+5)
      NatSqr/10-8                         3.04µs ± 4%    3.04µs ± 4%      ~     (p=1.000 n=5+5)
      NatSqr/20-8                         6.49µs ± 3%    6.50µs ± 2%      ~     (p=0.841 n=5+5)
      NatSqr/30-8                         14.3µs ± 2%    14.2µs ± 2%      ~     (p=0.548 n=5+5)
      NatSqr/50-8                         38.5µs ± 3%    38.3µs ± 3%      ~     (p=0.421 n=5+5)
      NatSqr/80-8                         96.3µs ± 1%    96.1µs ± 1%      ~     (p=0.421 n=5+5)
      NatSqr/100-8                         149µs ± 1%     148µs ± 1%      ~     (p=0.310 n=5+5)
      NatSqr/200-8                         591µs ± 1%     592µs ± 1%      ~     (p=0.690 n=5+5)
      NatSqr/300-8                        1.31ms ± 1%    1.32ms ± 0%      ~     (p=0.190 n=5+4)
      NatSqr/500-8                        2.49ms ± 0%    2.49ms ± 0%      ~     (p=0.095 n=5+5)
      NatSqr/800-8                        4.70ms ± 0%    4.69ms ± 0%      ~     (p=0.222 n=5+5)
      NatSqr/1000-8                       7.60ms ± 0%    7.58ms ± 0%      ~     (p=0.222 n=5+5)
      ScanPi-8                             326µs ± 0%     327µs ± 1%      ~     (p=0.222 n=5+5)
      StringPiParallel-8                  71.4µs ± 5%    67.7µs ± 4%      ~     (p=0.095 n=5+5)
      Scan/10/Base2-8                     1.09µs ± 0%    1.10µs ± 1%      ~     (p=0.810 n=5+5)
      Scan/100/Base2-8                    7.79µs ± 0%    7.83µs ± 0%    +0.53%  (p=0.008 n=5+5)
      Scan/1000/Base2-8                   78.9µs ± 0%    79.0µs ± 0%      ~     (p=0.151 n=5+5)
      Scan/10000/Base2-8                  1.22ms ± 0%    1.23ms ± 1%      ~     (p=0.690 n=5+5)
      Scan/100000/Base2-8                 55.1ms ± 0%    55.1ms ± 0%    +0.10%  (p=0.008 n=5+5)
      Scan/10/Base8-8                      512ns ± 1%     534ns ± 1%    +4.34%  (p=0.008 n=5+5)
      Scan/100/Base8-8                    2.90µs ± 1%    2.92µs ± 0%    +0.67%  (p=0.024 n=5+5)
      Scan/1000/Base8-8                   31.0µs ± 0%    31.1µs ± 0%    +0.27%  (p=0.008 n=5+5)
      Scan/10000/Base8-8                   741µs ± 0%     744µs ± 1%      ~     (p=0.310 n=5+5)
      Scan/100000/Base8-8                 50.5ms ± 0%    50.7ms ± 0%    +0.23%  (p=0.016 n=5+4)
      Scan/10/Base10-8                     485ns ± 0%     510ns ± 1%    +5.15%  (p=0.008 n=5+5)
      Scan/100/Base10-8                   2.68µs ± 0%    2.70µs ± 0%    +0.84%  (p=0.008 n=5+5)
      Scan/1000/Base10-8                  28.7µs ± 0%    28.8µs ± 0%    +0.34%  (p=0.008 n=5+5)
      Scan/10000/Base10-8                  717µs ± 0%     720µs ± 1%      ~     (p=0.238 n=5+5)
      Scan/100000/Base10-8                50.3ms ± 0%    50.3ms ± 0%    +0.02%  (p=0.016 n=4+5)
      Scan/10/Base16-8                     439ns ± 0%     461ns ± 1%    +5.06%  (p=0.008 n=5+5)
      Scan/100/Base16-8                   2.48µs ± 0%    2.49µs ± 0%    +0.59%  (p=0.024 n=5+5)
      Scan/1000/Base16-8                  27.2µs ± 0%    27.3µs ± 0%      ~     (p=0.063 n=5+5)
      Scan/10000/Base16-8                  722µs ± 0%     725µs ± 1%      ~     (p=0.421 n=5+5)
      Scan/100000/Base16-8                52.7ms ± 0%    52.7ms ± 0%      ~     (p=0.686 n=4+4)
      String/10/Base2-8                    248ns ± 1%     248ns ± 1%      ~     (p=0.802 n=5+5)
      String/100/Base2-8                  1.51µs ± 0%    1.51µs ± 0%    -0.54%  (p=0.024 n=5+5)
      String/1000/Base2-8                 13.6µs ± 0%    13.6µs ± 0%      ~     (p=0.548 n=5+5)
      String/10000/Base2-8                 135µs ± 1%     135µs ± 2%      ~     (p=0.421 n=5+5)
      String/100000/Base2-8               1.32ms ± 1%    1.33ms ± 1%      ~     (p=0.310 n=5+5)
      String/10/Base8-8                    169ns ± 0%     170ns ± 0%      ~     (p=0.079 n=5+5)
      String/100/Base8-8                   635ns ± 1%     633ns ± 1%      ~     (p=0.595 n=5+5)
      String/1000/Base8-8                 5.33µs ± 0%    5.30µs ± 0%      ~     (p=0.063 n=5+5)
      String/10000/Base8-8                50.7µs ± 1%    50.7µs ± 1%      ~     (p=1.000 n=5+5)
      String/100000/Base8-8                499µs ± 1%     500µs ± 1%      ~     (p=1.000 n=5+5)
      String/10/Base10-8                   517ns ± 1%     512ns ± 1%    -1.01%  (p=0.032 n=5+5)
      String/100/Base10-8                 1.97µs ± 0%    2.01µs ± 1%    +2.13%  (p=0.008 n=5+5)
      String/1000/Base10-8                12.6µs ± 1%    12.1µs ± 1%    -4.16%  (p=0.008 n=5+5)
      String/10000/Base10-8               57.9µs ± 1%    54.8µs ± 1%    -5.46%  (p=0.008 n=5+5)
      String/100000/Base10-8              25.6ms ± 0%    25.6ms ± 0%    -0.12%  (p=0.008 n=5+5)
      String/10/Base16-8                   149ns ± 0%     149ns ± 1%      ~     (p=1.000 n=5+5)
      String/100/Base16-8                  514ns ± 0%     514ns ± 1%      ~     (p=0.825 n=5+5)
      String/1000/Base16-8                4.01µs ± 0%    4.01µs ± 0%      ~     (p=0.595 n=5+5)
      String/10000/Base16-8               37.7µs ± 0%    37.8µs ± 1%      ~     (p=0.222 n=5+5)
      String/100000/Base16-8               373µs ± 1%     372µs ± 0%      ~     (p=1.000 n=5+5)
      LeafSize/0-8                        6.64ms ± 0%    6.66ms ± 0%    +0.32%  (p=0.008 n=5+5)
      LeafSize/1-8                        74.0µs ± 1%    71.2µs ± 1%    -3.75%  (p=0.008 n=5+5)
      LeafSize/2-8                        74.1µs ± 0%    70.7µs ± 1%    -4.53%  (p=0.008 n=5+5)
      LeafSize/3-8                         379µs ± 0%     374µs ± 0%    -1.25%  (p=0.008 n=5+5)
      LeafSize/4-8                        72.7µs ± 0%    69.2µs ± 0%    -4.79%  (p=0.008 n=5+5)
      LeafSize/5-8                         471µs ± 0%     466µs ± 0%    -1.05%  (p=0.008 n=5+5)
      LeafSize/6-8                         377µs ± 0%     373µs ± 0%    -1.16%  (p=0.008 n=5+5)
      LeafSize/7-8                         245µs ± 0%     241µs ± 0%    -1.65%  (p=0.008 n=5+5)
      LeafSize/8-8                        73.1µs ± 0%    69.4µs ± 0%    -5.10%  (p=0.008 n=5+5)
      LeafSize/9-8                         538µs ± 0%     532µs ± 0%    -1.01%  (p=0.008 n=5+5)
      LeafSize/10-8                        472µs ± 0%     467µs ± 0%    -1.07%  (p=0.008 n=5+5)
      LeafSize/11-8                        460µs ± 0%     454µs ± 0%    -1.22%  (p=0.008 n=5+5)
      LeafSize/12-8                        378µs ± 0%     373µs ± 0%    -1.34%  (p=0.008 n=5+5)
      LeafSize/13-8                        344µs ± 0%     338µs ± 0%    -1.61%  (p=0.008 n=5+5)
      LeafSize/14-8                        247µs ± 0%     243µs ± 0%    -1.62%  (p=0.008 n=5+5)
      LeafSize/15-8                        169µs ± 0%     165µs ± 0%    -2.71%  (p=0.008 n=5+5)
      LeafSize/16-8                       73.3µs ± 1%    69.5µs ± 0%    -5.11%  (p=0.008 n=5+5)
      LeafSize/32-8                       82.7µs ± 0%    79.2µs ± 0%    -4.24%  (p=0.008 n=5+5)
      LeafSize/64-8                        135µs ± 0%     132µs ± 0%    -2.20%  (p=0.008 n=5+5)
      ProbablyPrime/n=0-8                 44.2ms ± 0%    43.9ms ± 0%    -0.69%  (p=0.008 n=5+5)
      ProbablyPrime/n=1-8                 64.8ms ± 0%    64.4ms ± 0%    -0.60%  (p=0.008 n=5+5)
      ProbablyPrime/n=5-8                  147ms ± 0%     147ms ± 0%    -0.34%  (p=0.008 n=5+5)
      ProbablyPrime/n=10-8                 250ms ± 0%     249ms ± 0%    -0.29%  (p=0.008 n=5+5)
      ProbablyPrime/n=20-8                 456ms ± 0%     455ms ± 0%    -0.29%  (p=0.008 n=5+5)
      ProbablyPrime/Lucas-8               23.6ms ± 0%    23.2ms ± 0%    -1.44%  (p=0.008 n=5+5)
      ProbablyPrime/MillerRabinBase2-8    20.6ms ± 0%    20.6ms ± 0%    -0.31%  (p=0.008 n=5+5)
      FloatSqrt/64-8                      2.27µs ± 1%    2.11µs ± 1%    -7.02%  (p=0.008 n=5+5)
      FloatSqrt/128-8                     4.93µs ± 1%    4.40µs ± 1%   -10.73%  (p=0.008 n=5+5)
      FloatSqrt/256-8                     13.6µs ± 0%     6.6µs ± 1%   -51.40%  (p=0.008 n=5+5)
      FloatSqrt/1000-8                    69.8µs ± 0%    31.2µs ± 0%   -55.27%  (p=0.008 n=5+5)
      FloatSqrt/10000-8                   1.91ms ± 0%    0.59ms ± 0%   -69.17%  (p=0.008 n=5+5)
      FloatSqrt/100000-8                  55.4ms ± 0%    17.8ms ± 0%   -67.79%  (p=0.008 n=5+5)
      FloatSqrt/1000000-8                  4.56s ± 0%     1.52s ± 0%   -66.59%  (p=0.008 n=5+5)
      
      Change-Id: Icce52c69668f564490c69b908338b21a2288e116
      Reviewed-on: https://go-review.googlesource.com/79355Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      140bfe9c
  3. 11 Mar, 2018 1 commit
  4. 10 Mar, 2018 7 commits
  5. 09 Mar, 2018 19 commits