1. 21 Feb, 2017 22 commits
  2. 20 Feb, 2017 1 commit
  3. 19 Feb, 2017 7 commits
  4. 18 Feb, 2017 4 commits
  5. 17 Feb, 2017 6 commits
    • math/bits: added benchmarks for Leading/TrailingZeros · a4a3d63d
      Robert Griesemer authored
      BenchmarkLeadingZeros-8      	200000000	         8.80 ns/op
      BenchmarkLeadingZeros8-8     	200000000	         8.21 ns/op
      BenchmarkLeadingZeros16-8    	200000000	         7.49 ns/op
      BenchmarkLeadingZeros32-8    	200000000	         7.80 ns/op
      BenchmarkLeadingZeros64-8    	200000000	         8.67 ns/op
      
      BenchmarkTrailingZeros-8     	1000000000	         2.05 ns/op
      BenchmarkTrailingZeros8-8    	2000000000	         1.94 ns/op
      BenchmarkTrailingZeros16-8   	2000000000	         1.94 ns/op
      BenchmarkTrailingZeros32-8   	2000000000	         1.92 ns/op
      BenchmarkTrailingZeros64-8   	2000000000	         2.03 ns/op
      
      Change-Id: I45497bf2d6369ba6cfc88ded05aa735908af8908
      Reviewed-on: https://go-review.googlesource.com/37220
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      a4a3d63d
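The numbers above come from ordinary Go benchmarks. As a rough sketch of how such a measurement can be set up (this is not the benchmark code from the commit; `benchLeadingZeros64`, `benchTrailingZeros64`, and `sink` are hypothetical names chosen for illustration), a benchmark body can also be driven from a plain program with `testing.Benchmark`:

```go
package main

import (
	"fmt"
	"math/bits"
	"testing"
)

// sink receives each benchmark's accumulated result so the compiler
// cannot discard the calls being measured.
var sink int

// benchLeadingZeros64 is a hypothetical benchmark body, not the one
// from the commit. The input changes on every iteration.
func benchLeadingZeros64(b *testing.B) {
	s := 0
	for i := 0; i < b.N; i++ {
		s += bits.LeadingZeros64(uint64(i) + 1)
	}
	sink = s
}

// benchTrailingZeros64 does the same for TrailingZeros64.
func benchTrailingZeros64(b *testing.B) {
	s := 0
	for i := 0; i < b.N; i++ {
		s += bits.TrailingZeros64(uint64(i) + 1)
	}
	sink = s
}

func main() {
	testing.Init() // registers the testing flags when not run via "go test"
	fmt.Println("LeadingZeros64 ", testing.Benchmark(benchLeadingZeros64))
	fmt.Println("TrailingZeros64", testing.Benchmark(benchTrailingZeros64))
}
```

The absolute ns/op values depend on the machine and Go version, so they will not match the figures in the commit message.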
    • math/bits: faster Rotate functions, added respective benchmarks · 19028bdd
      Robert Griesemer authored
      Measured on a 2.3 GHz Intel Core i7 running macOS 10.12.3.
      
      benchmark                    old ns/op     new ns/op     delta
      BenchmarkRotateLeft-8        7.87          7.00          -11.05%
      BenchmarkRotateLeft8-8       8.41          4.52          -46.25%
      BenchmarkRotateLeft16-8      8.07          4.55          -43.62%
      BenchmarkRotateLeft32-8      8.36          4.73          -43.42%
      BenchmarkRotateLeft64-8      7.93          4.78          -39.72%
      
      BenchmarkRotateRight-8       8.23          6.72          -18.35%
      BenchmarkRotateRight8-8      8.76          4.39          -49.89%
      BenchmarkRotateRight16-8     9.07          4.44          -51.05%
      BenchmarkRotateRight32-8     8.85          4.46          -49.60%
      BenchmarkRotateRight64-8     8.11          4.43          -45.38%
      
      Change-Id: I79ea1e9e6fc65f95794a91f860a911efed3aa8a1
      Reviewed-on: https://go-review.googlesource.com/37219
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      19028bdd
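The commit message does not show the new rotate code. One common branch-free formulation, given here only as a sketch under the assumption that the speedup comes from avoiding a branch on the sign of the count, masks the count to the word size and combines two shifts. The name `rotateLeft32` is hypothetical.

```go
package main

import "fmt"

// rotateLeft32 rotates x left by k bits; a negative k rotates right.
// Masking the count keeps it in [0, 31], so no branch on the sign of k
// is needed. In Go, shifting a uint32 by 32 yields 0, so the s == 0
// case reduces to x | 0 == x.
func rotateLeft32(x uint32, k int) uint32 {
	const n = 32
	s := uint(k) & (n - 1)
	return x<<s | x>>(n-s)
}

func main() {
	fmt.Printf("%#x\n", rotateLeft32(0x80000001, 1))  // prints 0x3
	fmt.Printf("%#x\n", rotateLeft32(0x00000001, -1)) // prints 0x80000000
}
```

A rotate right by k is then simply a rotate left by -k, which may be why the left and right benchmarks end up so close to each other.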
    • math/bits: faster OnesCount, added respective benchmarks · a12edb8d
      Robert Griesemer authored
      Also: Changed Reverse/ReverseBytes implementations to use
      the same (smaller) masks as OnesCount.
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkOnesCount-8          37.0          6.26          -83.08%
      BenchmarkOnesCount8-8         7.24          1.99          -72.51%
      BenchmarkOnesCount16-8        11.3          2.47          -78.14%
      BenchmarkOnesCount32-8        18.4          3.02          -83.59%
      BenchmarkOnesCount64-8        40.0          3.78          -90.55%
      BenchmarkReverse-8            6.69          6.22          -7.03%
      BenchmarkReverse8-8           1.64          1.64          +0.00%
      BenchmarkReverse16-8          2.26          2.18          -3.54%
      BenchmarkReverse32-8          2.88          2.87          -0.35%
      BenchmarkReverse64-8          5.64          4.34          -23.05%
      BenchmarkReverseBytes-8       2.48          2.17          -12.50%
      BenchmarkReverseBytes16-8     0.63          0.95          +50.79%
      BenchmarkReverseBytes32-8     1.13          1.24          +9.73%
      BenchmarkReverseBytes64-8     2.50          2.16          -13.60%
      
      name              old time/op  new time/op  delta
      OnesCount-8       37.0ns ± 0%   6.3ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount8-8      7.24ns ± 0%  1.99ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount16-8     11.3ns ± 0%   2.5ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount32-8     18.4ns ± 0%   3.0ns ± 0%   ~             (p=1.000 n=1+1)
      OnesCount64-8     40.0ns ± 0%   3.8ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse-8         6.69ns ± 0%  6.22ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse8-8        1.64ns ± 0%  1.64ns ± 0%   ~     (all samples are equal)
      Reverse16-8       2.26ns ± 0%  2.18ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse32-8       2.88ns ± 0%  2.87ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse64-8       5.64ns ± 0%  4.34ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes-8    2.48ns ± 0%  2.17ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes16-8  0.63ns ± 0%  0.95ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes32-8  1.13ns ± 0%  1.24ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes64-8  2.50ns ± 0%  2.16ns ± 0%   ~             (p=1.000 n=1+1)
      
      Change-Id: I591b0ffc83fc3a42828256b6e5030f32c64f9497
      Reviewed-on: https://go-review.googlesource.com/37218
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      a12edb8d
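For readers unfamiliar with mask-based population counts, the following is a minimal sketch of the general technique, assuming it is the kind of approach the commit refers to. The function name `onesCount64` and the constant names are illustrative and are not taken from the change itself.

```go
package main

import "fmt"

// Masks selecting alternating bits, bit pairs, and nibbles.
const (
	m0 = 0x5555555555555555 // 01010101 ...
	m1 = 0x3333333333333333 // 00110011 ...
	m2 = 0x0f0f0f0f0f0f0f0f // 00001111 ...
)

// onesCount64 counts the set bits in x by summing adjacent fields of
// doubling width: 1-bit fields into 2-bit sums, those into 4-bit sums,
// and so on, until the total sits in the low byte.
func onesCount64(x uint64) int {
	x = x>>1&m0 + x&m0
	x = x>>2&m1 + x&m1
	x = (x>>4 + x) & m2 // each byte now holds a count in 0..8
	x += x >> 8
	x += x >> 16
	x += x >> 32
	return int(x) & (1<<7 - 1) // the total is at most 64, so 7 bits suffice
}

func main() {
	fmt.Println(onesCount64(0xff)) // prints 8
}
```

The work is a fixed number of shifts, masks, and adds regardless of the input, which fits the larger gains reported for the wider types.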
    • cmd/compile/internal/ssa: combine load + op on AMD64 · 21c71d77
      Ilya Tocar authored
      On AMD64, most operations can have one operand in memory.
      Combine a load and its dependent operation into one new operation
      where possible. I've seen no significant performance changes on go1,
      but this removes ~1.8kb of code from the go tool. In the math package
      I see, e.g.:
      
      Remainder-6            70.0ns ± 0%   64.6ns ± 0%   -7.76%  (p=0.000 n=9+1
      
      Change-Id: I88b8602b1d55da8ba548a34eb7da4b25d59a297e
      Reviewed-on: https://go-review.googlesource.com/36793
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      21c71d77
    • cmd/compile: fix 32-bit unsigned division on 64-bit machines · a9292b83
      Keith Randall authored
      The type of an intermediate multiply was wrong.  When that
      intermediate multiply was spilled, the top 32 bits were lost.
      
      Fixes #19153
      
      Change-Id: Ib29350a4351efa405935b7f7ee3c112668e64108
      Reviewed-on: https://go-review.googlesource.com/37212
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      a9292b83
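To see why the width of the intermediate multiply matters, consider division by a constant implemented as a multiply and a shift. The sketch below is only an illustration written in ordinary Go, not the compiler's code; the divisor 3, the constant `magic3`, and both function names are assumptions chosen for the example.

```go
package main

import "fmt"

// magic3 is ceil(2^33 / 3). Multiplying a uint32 by it in 64 bits and
// shifting right by 33 yields the quotient of division by 3.
const magic3 = 0xAAAAAAAB

// divBy3 keeps the full 64-bit product, so the result is correct.
func divBy3(x uint32) uint32 {
	return uint32((uint64(x) * magic3) >> 33)
}

// divBy3Truncated imitates the class of bug described above: the
// product is held in 32 bits, so its top 32 bits are lost before the
// shift and the quotient comes out as 0.
func divBy3Truncated(x uint32) uint32 {
	lo := x * magic3 // 32-bit multiply: high half discarded
	return uint32(uint64(lo) >> 33)
}

func main() {
	fmt.Println(divBy3(9), divBy3Truncated(9)) // prints 3 0
}
```

For x = 9 the full product is 0x600000003; the quotient lives entirely in the bits above bit 32, which is exactly the part a 32-bit intermediate throws away.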
    • math/bits: faster Reverse, ReverseBytes · 4498b683
      Robert Griesemer authored
      - moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
        This permits use of the same constant m twice (*) which may be
        better for machines that can't use large immediate constants
        directly with an AND instruction and have to load them explicitly.
        *) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)
      
      - simplified returns
        This improves the generated code because the compiler recognizes
        x>>k | x<<k as ROT when k is half the bitsize of x.
      
      The 8-bit versions of these functions can be significantly faster
      still if they are replaced with table lookups, as long as the table
      is in cache. If the table is not in cache, a table lookup is probably
      slower, hence the choice of an explicit register-only implementation
      for now.
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkReverse-8            8.50          6.86          -19.29%
      BenchmarkReverse8-8           2.17          1.74          -19.82%
      BenchmarkReverse16-8          2.89          2.34          -19.03%
      BenchmarkReverse32-8          3.55          2.95          -16.90%
      BenchmarkReverse64-8          6.81          5.57          -18.21%
      BenchmarkReverseBytes-8       3.49          2.48          -28.94%
      BenchmarkReverseBytes16-8     0.93          0.62          -33.33%
      BenchmarkReverseBytes32-8     1.55          1.13          -27.10%
      BenchmarkReverseBytes64-8     2.47          2.47          +0.00%
      
      name              old time/op  new time/op  delta
      Reverse-8         8.50ns ± 0%  6.86ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse8-8        2.17ns ± 0%  1.74ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse16-8       2.89ns ± 0%  2.34ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse32-8       3.55ns ± 0%  2.95ns ± 0%   ~             (p=1.000 n=1+1)
      Reverse64-8       6.81ns ± 0%  5.57ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes-8    3.49ns ± 0%  2.48ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes16-8  0.93ns ± 0%  0.62ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes32-8  1.55ns ± 0%  1.13ns ± 0%   ~             (p=1.000 n=1+1)
      ReverseBytes64-8  2.47ns ± 0%  2.47ns ± 0%   ~     (all samples are equal)
      
      Change-Id: I0064de8c7e0e568ca7885d6f7064344bef91a06d
      Reviewed-on: https://go-review.googlesource.com/37215
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      4498b683
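The two mask forms named in the commit message can be compared side by side on the 8-bit case. This is a minimal sketch built from the expressions quoted above; the function names and the specific mask values (0xaa, 0xcc, 0xf0) are assumptions for illustration, and the actual library code may differ.

```go
package main

import "fmt"

// reverse8Old uses the earlier form: x&m>>k | x&^m<<k.
// It needs both m and its complement at every step.
func reverse8Old(x uint8) uint8 {
	const m0, m1, m2 = 0xaa, 0xcc, 0xf0
	x = x&m0>>1 | x&^m0<<1
	x = x&m1>>2 | x&^m1<<2
	x = x&m2>>4 | x&^m2<<4
	return x
}

// reverse8 uses the newer form: x&m>>k | x<<k&m.
// The same mask m serves both halves, and the last step needs no mask
// at all because swapping the two nibbles is a plain rotate by 4.
func reverse8(x uint8) uint8 {
	const m0, m1 = 0xaa, 0xcc
	x = x&m0>>1 | x<<1&m0
	x = x&m1>>2 | x<<2&m1
	return x>>4 | x<<4
}

func main() {
	fmt.Printf("%#x %#x\n", reverse8Old(0x12), reverse8(0x12)) // prints 0x48 0x48
}
```

Both versions swap adjacent bits, then adjacent bit pairs, then the two nibbles; only the way each half is selected differs.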