1. 17 Mar, 2017 16 commits
    • cmd/compile: eliminate direct uses of gc.Thearch in backends · 3e2f980e
      Matthew Dempsky authored
      This CL changes the GOARCH.Init functions to take gc.Thearch as a
      parameter, which gc.Main supplies.
      
      Additionally, the x86 backend is refactored to decide within Init
      whether to use the 387 or SSE2 instruction generators, rather than for
      each individual SSA Value/Block.
      
      Passes toolstash-check -all.
      
      Change-Id: Ie6305a6cd6f6ab4e89ecbb3cbbaf5ffd57057a24
      Reviewed-on: https://go-review.googlesource.com/38301
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
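
The shape of that refactoring can be sketched as follows. This is only an illustration of the pattern, not the compiler's code: the `Arch` type, its `GenValue` field, the `initX86` function, and the `use387` flag are all hypothetical stand-ins for gc.Thearch and the GOARCH.Init functions.

```go
package main

import "fmt"

// Arch is a hypothetical stand-in for the per-architecture hook table
// (gc.Thearch in the commit). The field names are illustrative only.
type Arch struct {
	Name     string
	GenValue func(op string) string
}

// initX86 fills in the table it is handed instead of writing to a
// package-level global. The choice between the 387 and SSE2 generators
// is made once here rather than on every value.
func initX86(arch *Arch, use387 bool) {
	arch.Name = "386"
	if use387 {
		arch.GenValue = func(op string) string { return "387:" + op }
		return
	}
	arch.GenValue = func(op string) string { return "sse2:" + op }
}

func main() {
	var arch Arch // the caller owns the table and supplies it to Init
	initX86(&arch, false)
	fmt.Println(arch.Name, arch.GenValue("ADD"))
}
```

Passing the table in as a parameter keeps the backend free of references to the caller's global, and selecting the generator up front removes a branch from the per-value path.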
    • strconv: replace small int string table with constant string · aea44109
      Robert Griesemer authored
      This reduces memory use yet still provides the significant
      performance gain seen when using a fast path for small integers.
      
      Improvement of this CL compared to code without the fast path:
      
      name              old time/op  new time/op  delta
      FormatIntSmall-8  35.6ns ± 1%   4.5ns ± 1%  -87.30%  (p=0.008 n=5+5)
      AppendIntSmall-8  17.4ns ± 1%   9.4ns ± 3%  -45.70%  (p=0.008 n=5+5)
      
      For comparison, here's the improvement before this CL to code without
      fast path (1% better for FormatIntSmall):
      
      name              old time/op  new time/op  delta
      FormatIntSmall-8  35.6ns ± 1%   4.0ns ± 3%  -88.64%  (p=0.008 n=5+5)
      AppendIntSmall-8  17.4ns ± 1%   8.2ns ± 1%  -52.80%  (p=0.008 n=5+5)
      
      Thus, the code in this CL is slower for small integers on the fast
      path than the prior version, but this is relative to an already very fast
      version:
      
      name              old time/op  new time/op  delta
      FormatIntSmall-8  4.05ns ± 3%  4.52ns ± 1%  +11.81%  (p=0.008 n=5+5)
      AppendIntSmall-8  8.21ns ± 1%  9.45ns ± 3%  +15.05%  (p=0.008 n=5+5)
      
      Measured on 2.3 GHz Intel Core i7 running macOS Sierra 10.12.3.
      
      Overall, it's still ~88% faster than without fast path for small integers,
      so probably worth it as it removes 100 global string slices in favor of
      a single string.
      
      Credits: This is based on the original (but cleaned up) version of the
      code by Aliaksandr Valialkin (https://go-review.googlesource.com/c/37963/).
      
      Change-Id: Icda78679c8c14666d46257894e9fa3d7f35e58b8
      Reviewed-on: https://go-review.googlesource.com/38319
      Reviewed-by: Martin Möhrmann <moehrmann@google.com>
    • go/types: enforce Check path restrictions via panics · b744a11a
      Daniel Martí authored
      Its godoc says that path must not be empty or dot, while the existing
      implementation happily accepts both.
      
      Change-Id: I64766271c35152dc7adb21ff60eb05c52237e6b6
      Reviewed-on: https://go-review.googlesource.com/38262
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Run-TryBot: Robert Griesemer <gri@golang.org>
    • encoding/gob: make integers encoding faster · ed00cd94
      Alberto Donizetti authored
      name                old time/op  new time/op  delta
      EncodeInt32Slice-4  14.6µs ± 2%  12.2µs ± 1%  -16.65%  (p=0.000 n=19+18)
      
      Change-Id: I078a171f1633ff81d7e3f981dc9a398309ecb2c0
      Reviewed-on: https://go-review.googlesource.com/38269
      Reviewed-by: Rob Pike <r@golang.org>
    • cmd/compile: intrinsic for math/bits.Reverse on ARM64 · 42e97468
      Keith Randall authored
      I don't know that it exists for any other architectures.
      
      Update #18616
      
      Change-Id: Idfe5dee251764d32787915889ec0be4bebc5be24
      Reviewed-on: https://go-review.googlesource.com/38323
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/go: fix race libraries rebuilding by `go test -i` · be04da8f
      Alexander Menzhinsky authored
      `go test -i -race` adds the "sync/atomic" package to every package dependency tree,
      which makes buildIDs differ from those of packages installed with `go install -race`
      and causes the cache to be rebuilt.
      
      Fixes #19133
      Fixes #19151
      
      Change-Id: I0536c6fa41b0d20fe361b5d35b3c0937b146d07d
      Reviewed-on: https://go-review.googlesource.com/37598
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • mime: handling invalid mime media parameters · b9f6b22a
      Alexey Neganov authored
      Sometimes it's necessary to deal with emails that do not follow the specification; in particular, it's possible to download such emails via Gmail.
      When the existing implementation handles invalid mime media parameters, it returns nil values and an error, even though there is a valid media type that could be returned.
      Changing this behavior should not affect existing programs, but it will help to parse some emails.
      
      Fixes #19498
      
      Change-Id: Ieb2fdbddfd93857faee941d2aa49d59e286d57fd
      Reviewed-on: https://go-review.googlesource.com/38190
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
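
From the caller's side, the behavior described above might be used roughly as in the sketch below. It relies on `mime.ParseMediaType` and the `mime.ErrInvalidMediaParameter` sentinel, which exist in current Go releases but are not named in the commit message. The helper `lenientMediaType` and the sample header value are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
)

// lenientMediaType returns the media type even when the parameter list is
// malformed, as long as the media type itself could be parsed.
func lenientMediaType(header string) (string, error) {
	mediatype, _, err := mime.ParseMediaType(header)
	if errors.Is(err, mime.ErrInvalidMediaParameter) {
		// The parameters are unusable, but the media type is still valid.
		return mediatype, nil
	}
	return mediatype, err
}

func main() {
	// "charset" has no "=value" part, so the parameter list is malformed.
	mt, err := lenientMediaType("text/plain; charset")
	fmt.Println(mt, err)
}
```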
    • hash/crc32: improve performance for ppc64le · b6cd22c2
      Lynn Boger authored
      This change improves the performance of crc32 for ppc64le by using
      vpmsum and other vector instructions in the algorithm.
      
      The testcase was updated to test more sizes.
      
      Fixes #19570
      
      benchmark                                              old ns/op     new ns/op     delta
      BenchmarkCRC32/poly=IEEE/size=15/align=0-8             90.5          81.8          -9.61%
      BenchmarkCRC32/poly=IEEE/size=15/align=1-8             89.7          81.7          -8.92%
      BenchmarkCRC32/poly=IEEE/size=40/align=0-8             93.2          61.1          -34.44%
      BenchmarkCRC32/poly=IEEE/size=40/align=1-8             92.8          60.9          -34.38%
      BenchmarkCRC32/poly=IEEE/size=512/align=0-8            501           55.8          -88.86%
      BenchmarkCRC32/poly=IEEE/size=512/align=1-8            502           132           -73.71%
      BenchmarkCRC32/poly=IEEE/size=1kB/align=0-8            947           69.9          -92.62%
      BenchmarkCRC32/poly=IEEE/size=1kB/align=1-8            946           144           -84.78%
      BenchmarkCRC32/poly=IEEE/size=4kB/align=0-8            3602          186           -94.84%
      BenchmarkCRC32/poly=IEEE/size=4kB/align=1-8            3603          263           -92.70%
      BenchmarkCRC32/poly=IEEE/size=32kB/align=0-8           28404         1338          -95.29%
      BenchmarkCRC32/poly=IEEE/size=32kB/align=1-8           28856         1405          -95.13%
      BenchmarkCRC32/poly=Castagnoli/size=15/align=0-8       89.7          81.8          -8.81%
      BenchmarkCRC32/poly=Castagnoli/size=15/align=1-8       89.8          81.9          -8.80%
      BenchmarkCRC32/poly=Castagnoli/size=40/align=0-8       93.8          61.4          -34.54%
      BenchmarkCRC32/poly=Castagnoli/size=40/align=1-8       94.3          61.3          -34.99%
      BenchmarkCRC32/poly=Castagnoli/size=512/align=0-8      503           56.4          -88.79%
      BenchmarkCRC32/poly=Castagnoli/size=512/align=1-8      502           132           -73.71%
      BenchmarkCRC32/poly=Castagnoli/size=1kB/align=0-8      941           70.2          -92.54%
      BenchmarkCRC32/poly=Castagnoli/size=1kB/align=1-8      943           145           -84.62%
      BenchmarkCRC32/poly=Castagnoli/size=4kB/align=0-8      3588          186           -94.82%
      BenchmarkCRC32/poly=Castagnoli/size=4kB/align=1-8      3595          264           -92.66%
      BenchmarkCRC32/poly=Castagnoli/size=32kB/align=0-8     28266         1323          -95.32%
      BenchmarkCRC32/poly=Castagnoli/size=32kB/align=1-8     28344         1404          -95.05%
      
      Change-Id: Ic4d8274c66e0e87bfba5f609f508a3877aee6bb5
      Reviewed-on: https://go-review.googlesource.com/38184
      Reviewed-by: David Chase <drchase@google.com>
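
The optimized code sits behind the ordinary hash/crc32 API, so callers need no changes. Below is a small usage sketch covering the two polynomials benchmarked above. The helper name `checksums` is illustrative, and the check values in the comments are the standard ones for the ASCII input "123456789".

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// checksums computes the CRC-32 of data with the IEEE and Castagnoli
// polynomials. Any architecture-specific fast path is selected inside
// the package, not by the caller.
func checksums(data []byte) (ieee, castagnoli uint32) {
	ieee = crc32.ChecksumIEEE(data)
	castagnoli = crc32.Checksum(data, crc32.MakeTable(crc32.Castagnoli))
	return ieee, castagnoli
}

func main() {
	ieee, cast := checksums([]byte("123456789"))
	fmt.Printf("%#08x %#08x\n", ieee, cast) // 0xcbf43926 0xe3069283
}
```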
    • image/png: decode Gray8 transparent images. · 16663a85
      Nigel Tao authored
      Fixes #19553.
      
      Change-Id: I414cb3b1c2dab20f41a7f4e7aba49c534ff19942
      Reviewed-on: https://go-review.googlesource.com/38271
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: relocate code from config.go to func.go · 88e47187
      Josh Bleecher Snyder authored
      This is a follow-up to CL 38167.
      Pure code movement.
      
      Change-Id: I13e58f7eac6718c77076d89e13fc721a5205ec57
      Reviewed-on: https://go-review.googlesource.com/38322
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: rearrange fields between ssa.Func, ssa.Cache, and ssa.Config · a5e3cac8
      Josh Bleecher Snyder authored
      This makes ssa.Func, ssa.Cache, and ssa.Config fulfill
      the roles laid out for them in CL 38160.
      
      The only non-trivial change in this CL is how cached
      values and blocks get IDs. Prior to this CL, their IDs were
      assigned as part of resetting the cache, and only modified
      IDs were reset. This required knowing how many values and
      blocks were modified, which required a tight coupling between
      ssa.Func and ssa.Config. To eliminate that coupling,
      we now zero values and blocks during reset,
      and assign their IDs when they are used.
      Since unused values and blocks have ID == 0,
      we can efficiently find the last used value/block,
      to avoid zeroing everything.
      Bulk zeroing is efficient, but not efficient enough
      to obviate the need to avoid zeroing everything every time.
      As a happy side-effect, ssa.Func.Free is no longer necessary.
      
      DebugHashMatch and friends now belong in func.go.
      They have been left in place for clarity and review.
      I will move them in a subsequent CL.
      
      Passes toolstash -cmp. No compiler performance impact.
      No change in 'go test cmd/compile/internal/ssa' execution time.
      
      Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098
      Reviewed-on: https://go-review.googlesource.com/38167
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
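
The ID scheme described in that message can be sketched roughly as below. This is a simplified model under stated assumptions, not the ssa package's code: the `value`, `cache`, and `fn` types and the cache size are invented for illustration, and using a binary search to find the last used slot is one possible way to do the "efficient find" the message mentions.

```go
package main

import (
	"fmt"
	"sort"
)

// value is a stand-in for an SSA value; id == 0 means "never handed out".
type value struct {
	id int
	op string
}

// cache owns reusable storage and knows nothing about the function using it.
type cache struct {
	values [8]value
}

// used reports how many leading slots are in use. Slots are handed out in
// order, so the used ones form a prefix and a binary search finds its end.
func (c *cache) used() int {
	return sort.Search(len(c.values), func(i int) bool { return c.values[i].id == 0 })
}

// reset zeroes only the used prefix rather than the whole array.
func (c *cache) reset() {
	n := c.used()
	for i := range c.values[:n] {
		c.values[i] = value{}
	}
}

// fn is a stand-in for a function under compilation.
type fn struct {
	cache  *cache
	lastID int
}

// newValue takes the next cached slot and assigns its ID at use time.
func (f *fn) newValue(op string) *value {
	f.lastID++
	var v *value
	if f.lastID <= len(f.cache.values) {
		v = &f.cache.values[f.lastID-1]
	} else {
		v = new(value) // cache exhausted: fall back to the heap
	}
	v.id = f.lastID
	v.op = op
	return v
}

func main() {
	c := &cache{}
	f := &fn{cache: c}
	f.newValue("Add")
	f.newValue("Mul")
	fmt.Println("used before reset:", c.used())
	c.reset()
	fmt.Println("used after reset:", c.used())
}
```

Because the cache can work out on its own how much to clear, it no longer needs to be told how many values the function modified.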
    • cmd/compile: avoid calling unnecessary Sym format routine · ccaa8e3c
      Josh Bleecher Snyder authored
      Minor cleanup only.
      
      No reason to go through String() when it is
      just as easy to do a direct string comparison.
      
      Eliminates a surprising number of allocations.
      
      name       old alloc/op    new alloc/op    delta
      Template      40.9MB ± 0%     40.9MB ± 0%    ~     (p=0.190 n=10+10)
      Unicode       30.3MB ± 0%     30.3MB ± 0%    ~     (p=0.218 n=10+10)
      GoTypes        116MB ± 0%      116MB ± 0%  -0.09%  (p=0.000 n=10+10)
      SSA            871MB ± 0%      869MB ± 0%  -0.14%  (p=0.000 n=10+9)
      Flate         26.2MB ± 0%     26.2MB ± 0%  -0.15%  (p=0.002 n=10+10)
      GoParser      32.5MB ± 0%     32.5MB ± 0%    ~     (p=0.165 n=10+10)
      Reflect       80.5MB ± 0%     80.4MB ± 0%  -0.12%  (p=0.003 n=9+10)
      Tar           27.3MB ± 0%     27.3MB ± 0%  -0.13%  (p=0.008 n=10+9)
      XML           43.1MB ± 0%     43.1MB ± 0%    ~     (p=0.218 n=10+10)
      
      name       old allocs/op   new allocs/op   delta
      Template        402k ± 1%       400k ± 1%  -0.64%  (p=0.002 n=10+10)
      Unicode         322k ± 1%       321k ± 1%    ~     (p=0.075 n=10+10)
      GoTypes        1.19M ± 0%      1.18M ± 0%  -0.90%  (p=0.000 n=10+10)
      SSA            7.94M ± 0%      7.81M ± 0%  -1.66%  (p=0.000 n=10+9)
      Flate           246k ± 0%       242k ± 1%  -1.42%  (p=0.000 n=10+10)
      GoParser        325k ± 1%       323k ± 1%  -0.84%  (p=0.000 n=10+10)
      Reflect        1.02M ± 0%      1.01M ± 0%  -0.99%  (p=0.000 n=10+10)
      Tar             259k ± 0%       257k ± 1%  -0.72%  (p=0.009 n=10+10)
      XML             406k ± 1%       403k ± 1%  -0.69%  (p=0.001 n=10+10)
      
      Change-Id: Ia129a4cd272027d627e1f3b27e9f07f93e3aa27e
      Reviewed-on: https://go-review.googlesource.com/38230
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: move hasdefer to Func · 0cfb2313
      Josh Bleecher Snyder authored
      Passes toolstash -cmp.
      
      Updates #15756
      
      Change-Id: Ia071dbbd7f2ee0f8433d8c37af4f7b588016244e
      Reviewed-on: https://go-review.googlesource.com/38231
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/internal/obj/ppc64: remove stackbarrier function check · 604e4841
      Josh Bleecher Snyder authored
      Stack barriers were removed in CL 36620.
      
      Change-Id: If124d65a73a7b344a42be2a4b386a14d7a0a428b
      Reviewed-on: https://go-review.googlesource.com/38169
      Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
      Reviewed-by: David Chase <drchase@google.com>
    • go/types: better error for assignment count mismatches · faeda66c
      Robert Griesemer authored
      This matches the error message of cmd/compile (for assignments).
      
      Change-Id: I42a428f5d72f034e7b7e97b090a929e317e812af
      Reviewed-on: https://go-review.googlesource.com/38315
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Alan Donovan <adonovan@google.com>
    • cmd/compile: eliminate "assignment count mismatch" - not needed anymore · 3c7a8124
      Robert Griesemer authored
      See https://go-review.googlesource.com/#/c/38313/ for background.
      It turns out that only a few tests checked for this.
      
      The new error message is shorter and very clear.
      
      Change-Id: I8ab4ad59fb023c8b54806339adc23aefd7dc7b07
      Reviewed-on: https://go-review.googlesource.com/38314
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
  2. 16 Mar, 2017 15 commits
    • cmd/compile: further clarify assignment count mismatch error message · 73a44f04
      Jeremy Jackins authored
      This is an evolution of https://go-review.googlesource.com/33616, as discussed
      via email with Robert (gri):
      
      $ cat foobar.go
      package main
      
      func main() {
              a := "foo", "bar"
      }
      
      before:
      ./foobar.go:4:4: assignment count mismatch: want 1 values, got 2
      
      after:
      ./foobar.go:4:4: assignment count mismatch: cannot assign 2 values to 1 variables
      
      We could likely also eliminate the "assignment count mismatch" prefix now
      without losing any information, but that string is matched by a number of
      tests.
      
      Change-Id: Ie6fc8a7bbd0ebe841d53e66e5c2f49868decf761
      Reviewed-on: https://go-review.googlesource.com/38313
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: intrinsics for math/bits.{Len,LeadingZeros} · 495b1679
      Keith Randall authored
      name              old time/op  new time/op  delta
      LeadingZeros-4    2.00ns ± 0%  1.34ns ± 1%  -33.02%  (p=0.000 n=8+10)
      LeadingZeros16-4  1.62ns ± 0%  1.57ns ± 0%   -3.09%  (p=0.001 n=8+9)
      LeadingZeros32-4  2.14ns ± 0%  1.48ns ± 0%  -30.84%  (p=0.002 n=8+10)
      LeadingZeros64-4  2.06ns ± 1%  1.33ns ± 0%  -35.08%  (p=0.000 n=8+8)
      
      8-bit args are a special case - the Go code is really fast because
      it is just a single table lookup.  So I've disabled that for now.
      Intrinsics were actually slower:
      LeadingZeros8-4   1.22ns ± 3%  1.58ns ± 1%  +29.56%  (p=0.000 n=10+10)
      
      Update #18616
      
      Change-Id: Ia9c289b9ba59c583ea64060470315fd637e814cf
      Reviewed-on: https://go-review.googlesource.com/38311
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • doc: reorganize the contribution guidelines into a guide · 5f3e7aa7
      Steve Francia authored
      Updates #17802
      
      Change-Id: I65ea0f4cde973604c04051e7eb25d12e4facecd3
      Reviewed-on: https://go-review.googlesource.com/36626
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: Chris Broadfoot <cbro@golang.org>
    • strconv: optimize formatting for small decimal ints · bc8b9b23
      Aliaksandr Valialkin authored
      Avoid memory allocations by returning pre-calculated strings
      for decimal ints in the range 0..99.
      
      Benchmark results:
      
      name              old time/op    new time/op    delta
      FormatInt-4         2.45µs ± 1%    2.40µs ± 1%    -1.86%  (p=0.000 n=8+9)
      AppendInt-4         1.67µs ± 1%    1.65µs ± 0%    -0.92%  (p=0.000 n=10+10)
      FormatUint-4         676ns ± 3%     669ns ± 1%      ~     (p=0.146 n=10+10)
      AppendUint-4         467ns ± 2%     474ns ± 0%    +1.58%  (p=0.000 n=10+10)
      FormatIntSmall-4    29.6ns ± 2%     3.3ns ± 0%   -88.98%  (p=0.000 n=10+9)
      AppendIntSmall-4    16.0ns ± 1%     8.5ns ± 0%   -46.98%  (p=0.000 n=10+9)
      
      name              old alloc/op   new alloc/op   delta
      FormatInt-4           576B ± 0%      576B ± 0%      ~     (all equal)
      AppendInt-4          0.00B          0.00B           ~     (all equal)
      FormatUint-4          224B ± 0%      224B ± 0%      ~     (all equal)
      AppendUint-4         0.00B          0.00B           ~     (all equal)
      FormatIntSmall-4     2.00B ± 0%     0.00B       -100.00%  (p=0.000 n=10+10)
      AppendIntSmall-4     0.00B          0.00B           ~     (all equal)
      
      name              old allocs/op  new allocs/op  delta
      FormatInt-4           37.0 ± 0%      35.0 ± 0%    -5.41%  (p=0.000 n=10+10)
      AppendInt-4           0.00           0.00           ~     (all equal)
      FormatUint-4          6.00 ± 0%      6.00 ± 0%      ~     (all equal)
      AppendUint-4          0.00           0.00           ~     (all equal)
      FormatIntSmall-4      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
      AppendIntSmall-4      0.00           0.00           ~     (all equal)
      
      Fixes #19445
      
      Change-Id: Ib1f8922f2e0b13743c847ee9e703d1dab77f705c
      Reviewed-on: https://go-review.googlesource.com/37963
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: intrinsify math/bits.ReverseBytes · dd9892e3
      Keith Randall authored
      Update #18616
      
      Change-Id: I0c2d643cbbeb131b4c9b12194697afa4af48e1d2
      Reviewed-on: https://go-review.googlesource.com/38166
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/compile: fix MIPS Zero lower rule · 793e4ec3
      Cherry Zhang authored
      A copy-paste error in CL 38150. Fix build.
      
      Change-Id: Ib2afc83564ebe7dab934d45522803e1a191dea18
      Reviewed-on: https://go-review.googlesource.com/38292
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/syntax: track column position at function end · f37ee0f3
      Matthew Dempsky authored
      Fixes #19576.
      
      Change-Id: I11034fb08e989f6eb7d54bde873b92804223598d
      Reviewed-on: https://go-review.googlesource.com/38291
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: use type information in Aux for Store size · c8f38b33
      Cherry Zhang authored
      Remove size AuxInt in Store, and alignment in Move/Zero. We still
      pass size AuxInt to Move/Zero, as it is used for partial Move/Zero
      lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288).
      SizeAndAlign is gone.
      
      Passes "toolstash -cmp" on std.
      
      Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d
      Reviewed-on: https://go-review.googlesource.com/38150
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: add a test for writebarrier pass with single-block loop · d75925d6
      Cherry Zhang authored
      The old writebarrier implementation fails to handle a single-block
      loop where a memory Phi value depends on the write barrier store
      in the same block. The new implementation (CL 36834) doesn't have
      this problem. Add a test to ensure that.
      
      Fix #19067.
      
      Change-Id: Iab13c6817edc12be8a048d18699b4450fa7ed712
      Reviewed-on: https://go-review.googlesource.com/36940
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: clean up SSA-building code · 1b853006
      Cherry Zhang authored
      Now that the write barrier insertion is moved to SSA, the SSA
      building code can be simplified.
      
      Updates #17583.
      
      Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25
      Reviewed-on: https://go-review.googlesource.com/36840
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: move write barrier insertion to SSA · 9ebf3d51
      Cherry Zhang authored
      When the compiler inserts write barriers, the frontend makes
      conservative decisions at an early stage. This sometimes yields
      false positives because of the lack of information, for example,
      for writes on the stack. SSA's writebarrier pass identifies writes on
      the stack and eliminates write barriers for them.
      
      This CL moves write barrier insertion into SSA. The frontend no
      longer makes decisions about write barriers, and simply does
      normal assignments and emits normal Store ops when building SSA.
      The SSA writebarrier pass inserts write barriers for Stores when needed.
      There, it has better information about the store because Phi and
      Copy propagation are done at that time.
      
      This CL only changes StoreWB to Store in gc/ssa.go. A followup CL
      simplifies SSA building code.
      
      Updates #17583.
      
      Change-Id: I4592d9bc0067503befc169c50b4e6f4765673bec
      Reviewed-on: https://go-review.googlesource.com/36839
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: pass types on SSA Store/Move/Zero ops · 211c8c9f
      Cherry Zhang authored
      For SSA Store/Move/Zero ops, attach the type of the value being
      stored to the op as the Aux field. This type will be used for
      write barrier insertion (in a followup CL). Since SSA passes
      do not accurately propagate types of values (because of type
      casting), we can't simply use type of the store's arguments
      for write barrier insertion.
      
      Passes "toolstash -cmp" on std.
      
      Updates #17583.
      
      Change-Id: I051d5e5c482931640d1d7d879b2a6bb91f2e0056
      Reviewed-on: https://go-review.googlesource.com/36838
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • runtime: remove unused g parameter · 77b09b8b
      Daniel Martí authored
      Found by github.com/mvdan/unparam.
      
      Change-Id: I20145440ff1bcd27fcf15a740354c52f313e536c
      Reviewed-on: https://go-review.googlesource.com/37894
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • runtime: improve IndexByte for ppc64x · d60166d5
      Carlos Eduardo Seo authored
      This change adds a better implementation of IndexByte for ppc64x.
      
      Improvement for bytes·IndexByte:
      
      benchmark                             old ns/op     new ns/op     delta
      BenchmarkIndexByte/10-16              12.5          8.48          -32.16%
      BenchmarkIndexByte/32-16              34.4          9.85          -71.37%
      BenchmarkIndexByte/4K-16              3089          217           -92.98%
      BenchmarkIndexByte/4M-16              3154810       207051        -93.44%
      BenchmarkIndexByte/64M-16             50564811      5579093       -88.97%
      
      benchmark                             old MB/s     new MB/s     speedup
      BenchmarkIndexByte/10-16              800.41       1179.64      1.47x
      BenchmarkIndexByte/32-16              930.60       3249.10      3.49x
      BenchmarkIndexByte/4K-16              1325.71      18832.53     14.21x
      BenchmarkIndexByte/4M-16              1329.49      20257.29     15.24x
      BenchmarkIndexByte/64M-16             1327.19      12028.63     9.06x
      
      Improvement for strings·IndexByte:
      
      benchmark                             old ns/op     new ns/op     delta
      BenchmarkIndexByte-16                 25.9          7.69          -70.31%
      
      Fixes #19030
      
      Change-Id: Ifb82bbb3d643ec44b98eaa2d08a07f47e5c2fd11
      Reviewed-on: https://go-review.googlesource.com/37670
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
    • cmd/compile: intrinsics for math/bits.TrailingZerosX · d5dc4905
      Keith Randall authored
      Implement math/bits.TrailingZerosX using intrinsics.
      
      Generally reorganize the intrinsic spec a bit.
      The intrinsics data structure is now built at init time.
      This will make doing the other functions in math/bits easier.
      
      Update sys.CtzX to return int instead of uint{64,32} so it
      matches math/bits.TrailingZerosX.
      
      Improve the intrinsics a bit for amd64.  We don't need the CMOV
      for <64 bit versions.
      
      Update #18616
      
      Change-Id: Ic1c5339c943f961d830ae56f12674d7b29d4ff39
      Reviewed-on: https://go-review.googlesource.com/38155
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
  3. 15 Mar, 2017 9 commits
    • runtime: make complex division c99 compatible · 16200c73
      Martin Möhrmann authored
      - changes tests to check that the real and imaginary part of the go complex
        division result is equal to the result gcc produces for c99
      - changes complex division code to satisfy new complex division test
      - adds float functions isNan, isFinite, isInf, abs and copysign
        in the runtime package
      
      Fixes #14644.
      
      name                   old time/op  new time/op  delta
      Complex128DivNormal-4  21.8ns ± 6%  13.9ns ± 6%  -36.37%  (p=0.000 n=20+20)
      Complex128DivNisNaN-4  14.1ns ± 1%  15.0ns ± 1%   +5.86%  (p=0.000 n=20+19)
      Complex128DivDisNaN-4  12.5ns ± 1%  16.7ns ± 1%  +33.79%  (p=0.000 n=19+20)
      Complex128DivNisInf-4  10.1ns ± 1%  13.0ns ± 1%  +28.25%  (p=0.000 n=20+19)
      Complex128DivDisInf-4  11.0ns ± 1%  20.9ns ± 1%  +90.69%  (p=0.000 n=16+19)
      ComplexAlgMap-4        86.7ns ± 1%  86.8ns ± 2%     ~     (p=0.804 n=20+20)
      
      Change-Id: I261f3b4a81f6cc858bc7ff48f6fd1b39c300abf0
      Reviewed-on: https://go-review.googlesource.com/37441
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • runtime: print user stack on other threads during GOTRACEBACK=crash · 4b8f41da
      Austin Clements authored
      Currently, when printing tracebacks of other threads during
      GOTRACEBACK=crash, if the thread is on the system stack we print only
      the header for the user goroutine and fail to print its stack. This
      happens because we passed the g0 to traceback instead of curg. The g0
      never has anything set in its gobuf, so traceback doesn't print
      anything.
      
      Fix this by passing _g_.m.curg to traceback instead of the g0.
      
      Fixes #19494.
      
      Change-Id: Idfabf94d6a725e9cdf94a3923dead6455ef3b217
      Reviewed-on: https://go-review.googlesource.com/38012
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • runtime: make GOTRACEBACK=crash crash promptly in cgo binaries · f2e87158
      Austin Clements authored
      GOTRACEBACK=crash works by bouncing a SIGQUIT around the process
      sched.mcount times. However, sched.mcount includes the extra Ms
      allocated by oneNewExtraM for cgo callbacks. Hence, if there are any
      extra Ms that don't have real OS threads, we'll try to send SIGQUIT
      more times than there are threads to catch it. Since nothing will
      catch these extra signals, we'll fall back to blocking for five
      seconds before aborting the process.
      
      Avoid this five second delay by subtracting out the number of extra Ms
      when sending SIGQUITs.
      
      Of course, in a cgo binary, it's still possible for the SIGQUIT to go
      to a cgo thread and cause some other failure mode. This does not fix
      that.
      
      Change-Id: I4fbf3c52dd721812796c4c1dcb2ab4cb7026d965
      Reviewed-on: https://go-review.googlesource.com/38182
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      f2e87158
    • Josh Bleecher Snyder's avatar
      cmd/compile: check labels and gotos before building SSA · c03e75e5
      Josh Bleecher Snyder authored
      This CL introduces yet another compiler pass,
      which checks for correct control flow constructs
      prior to converting from AST to SSA form.
      
      It cannot be integrated with walk, since walk rewrites
      switch and select statements on the fly.
      
      To reduce code duplication, this CL also does some
      minor refactoring.
      
      With this pass in place, the AST to SSA converter
      can now stop generating SSA for any known-dead code.
      This minor savings pays for the minor cost of the new pass.
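
To give a flavor of what such a pre-SSA check involves, here is a minimal sketch built on the standard go/ast package rather than the compiler's internal AST. It covers only one rule (a goto must target a label declared in the same function) and, as a simplification, does not give function literals their own label scope. The helper name undefinedGotos is made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// undefinedGotos parses src and returns the labels that are the target of
// a goto but are not declared in the enclosing top-level function.
func undefinedGotos(src string) ([]string, error) {
	fset := token.NewFileSet()
	// SkipObjectResolution keeps the parser from doing its own label work.
	file, err := parser.ParseFile(fset, "x.go", src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		defined := map[string]bool{}
		var gotos []string
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			switch s := n.(type) {
			case *ast.LabeledStmt:
				defined[s.Label.Name] = true
			case *ast.BranchStmt:
				if s.Tok == token.GOTO && s.Label != nil {
					gotos = append(gotos, s.Label.Name)
				}
			}
			return true
		})
		for _, name := range gotos {
			if !defined[name] {
				missing = append(missing, name)
			}
		}
	}
	return missing, nil
}

const src = `package p

func ok() {
	goto done
done:
	return
}

func bad() {
	goto nowhere
}
`

func main() {
	missing, err := undefinedGotos(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(missing) // prints [nowhere]
}
```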
      
      Performance is almost a wash:
      
      name       old time/op     new time/op     delta
      Template       206ms ± 4%      205ms ± 4%   ~     (p=0.108 n=43+43)
      Unicode       84.0ms ± 4%     84.0ms ± 4%   ~     (p=0.979 n=43+43)
      GoTypes        550ms ± 3%      553ms ± 3%   ~     (p=0.065 n=40+41)
      Compiler       2.57s ± 4%      2.58s ± 2%   ~     (p=0.103 n=44+41)
      SSA            3.94s ± 3%      3.93s ± 2%   ~     (p=0.833 n=44+42)
      Flate          126ms ± 6%      125ms ± 4%   ~     (p=0.941 n=43+39)
      GoParser       147ms ± 4%      148ms ± 3%   ~     (p=0.164 n=42+39)
      Reflect        359ms ± 3%      357ms ± 5%   ~     (p=0.241 n=43+44)
      Tar            106ms ± 5%      106ms ± 7%   ~     (p=0.853 n=40+43)
      XML            202ms ± 3%      203ms ± 3%   ~     (p=0.488 n=42+41)
      
      name       old user-ns/op  new user-ns/op  delta
      Template        240M ± 4%       239M ± 4%   ~     (p=0.844 n=42+43)
      Unicode         107M ± 5%       107M ± 4%   ~     (p=0.332 n=40+43)
      GoTypes         735M ± 3%       731M ± 4%   ~     (p=0.141 n=43+44)
      Compiler       3.51G ± 3%      3.52G ± 3%   ~     (p=0.208 n=42+43)
      SSA            5.72G ± 4%      5.72G ± 3%   ~     (p=0.928 n=44+42)
      Flate           151M ± 7%       150M ± 8%   ~     (p=0.662 n=44+43)
      GoParser        181M ± 5%       181M ± 4%   ~     (p=0.379 n=41+44)
      Reflect         447M ± 4%       445M ± 4%   ~     (p=0.344 n=43+43)
      Tar             125M ± 7%       124M ± 6%   ~     (p=0.353 n=43+43)
      XML             248M ± 4%       250M ± 6%   ~     (p=0.158 n=44+44)
      
      name       old alloc/op    new alloc/op    delta
      Template      40.3MB ± 0%     40.2MB ± 0%  -0.27%  (p=0.000 n=10+10)
      Unicode       30.3MB ± 0%     30.2MB ± 0%  -0.10%  (p=0.015 n=10+10)
      GoTypes        114MB ± 0%      114MB ± 0%  -0.06%  (p=0.000 n=7+9)
      Compiler       480MB ± 0%      481MB ± 0%  +0.07%  (p=0.000 n=10+10)
      SSA            864MB ± 0%      862MB ± 0%  -0.25%  (p=0.000 n=9+10)
      Flate         25.9MB ± 0%     25.9MB ± 0%    ~     (p=0.123 n=10+10)
      GoParser      32.1MB ± 0%     32.1MB ± 0%    ~     (p=0.631 n=10+10)
      Reflect       79.9MB ± 0%     79.6MB ± 0%  -0.39%  (p=0.000 n=10+9)
      Tar           27.1MB ± 0%     27.0MB ± 0%  -0.18%  (p=0.003 n=10+10)
      XML           42.6MB ± 0%     42.6MB ± 0%    ~     (p=0.143 n=10+10)
      
      name       old allocs/op   new allocs/op   delta
      Template        401k ± 0%       401k ± 1%    ~     (p=0.353 n=10+10)
      Unicode         322k ± 0%       322k ± 0%    ~     (p=0.739 n=10+10)
      GoTypes        1.18M ± 0%      1.18M ± 0%  +0.25%  (p=0.001 n=7+8)
      Compiler       4.51M ± 0%      4.53M ± 0%  +0.37%  (p=0.000 n=10+10)
      SSA            7.91M ± 0%      7.93M ± 0%  +0.20%  (p=0.000 n=9+10)
      Flate           244k ± 0%       245k ± 0%    ~     (p=0.123 n=10+10)
      GoParser        323k ± 1%       324k ± 1%  +0.40%  (p=0.035 n=10+10)
      Reflect        1.01M ± 0%      1.02M ± 0%  +0.37%  (p=0.000 n=10+9)
      Tar             258k ± 1%       258k ± 1%    ~     (p=0.661 n=10+9)
      XML             403k ± 0%       405k ± 0%  +0.47%  (p=0.004 n=10+10)
      
      Updates #15756
      Updates #19250
      
      Change-Id: I647bfbb745c35630447eb79dfcaa994b490ce942
      Reviewed-on: https://go-review.googlesource.com/38159
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      c03e75e5
    • Josh Bleecher Snyder's avatar
      cmd/compile: ensure TESTQconst AuxInt is in range · 604455a4
      Josh Bleecher Snyder authored
      Fixes #19555
      
      Change-Id: I7aa0551a90f6bb630c0ba721f3525a8a9cf793fd
      Reviewed-on: https://go-review.googlesource.com/38164
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      604455a4
    • Bryan C. Mills's avatar
      archive/zip: parallelize benchmarks · d0a045da
      Bryan C. Mills authored
      Add subbenchmarks for BenchmarkZip64Test with different sizes to tease
      apart construction costs vs. steady-state throughput.
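
The general shape of a parallel benchmark with size-based sub-benchmarks might look like the sketch below. It is not the code from this CL: writeArchive, benchmarkZip, and the chosen sizes are assumptions for illustration. It is driven through testing.Benchmark so it runs as an ordinary program.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"testing"
)

// writeArchive builds an in-memory zip with n small files and returns
// the size of the resulting archive in bytes.
func writeArchive(n int) (int, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for i := 0; i < n; i++ {
		w, err := zw.Create(fmt.Sprintf("file%d.txt", i))
		if err != nil {
			return 0, err
		}
		if _, err := w.Write([]byte("hello, zip")); err != nil {
			return 0, err
		}
	}
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

// benchmarkZip runs one sub-benchmark per archive size; each one spreads
// its iterations across goroutines with RunParallel.
func benchmarkZip(b *testing.B) {
	for _, n := range []int{1, 16} {
		n := n
		b.Run(fmt.Sprintf("files=%d", n), func(b *testing.B) {
			b.ReportAllocs()
			b.RunParallel(func(pb *testing.PB) {
				for pb.Next() {
					if _, err := writeArchive(n); err != nil {
						b.Error(err)
						return
					}
				}
			})
		})
	}
}

func main() {
	fmt.Println(testing.Benchmark(benchmarkZip))
}
```

Running the sub-benchmarks at several sizes is what lets fixed setup cost be separated from per-file throughput.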
      
      Results remain comparable with the non-parallel version with -cpu=1:
      
      benchmark                           old ns/op     new ns/op     delta
      BenchmarkCompressedZipGarbage       26832835      27506953      +2.51%
      BenchmarkCompressedZipGarbage-6     27172377      4321534       -84.10%
      BenchmarkZip64Test                  196758732     197765510     +0.51%
      BenchmarkZip64Test-6                193850605     192625458     -0.63%
      
      benchmark                           old allocs     new allocs     delta
      BenchmarkCompressedZipGarbage       44             44             +0.00%
      BenchmarkCompressedZipGarbage-6     44             44             +0.00%
      
      benchmark                           old bytes     new bytes     delta
      BenchmarkCompressedZipGarbage       5592          5664          +1.29%
      BenchmarkCompressedZipGarbage-6     5592          21946         +292.45%
      
      Updates #18177
      
      Change-Id: Icfa359d9b1a8df5e085dacc07d2b9221b284764c
      Reviewed-on: https://go-review.googlesource.com/36719
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d0a045da
    • Cherry Zhang's avatar
      cmd/link: on PPC64, put plt stubs at beginning of Textp · 15b37655
      Cherry Zhang authored
      Put call stubs at the beginning (instead of the end). So the
      trampoline pass knows the addresses of the stubs, and it can
      insert trampolines when necessary.
      
      Fixes #19425.
      
      Change-Id: I1e06529ef837a6130df58917315610d45a6819ca
      Reviewed-on: https://go-review.googlesource.com/38131
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
      15b37655
    • Josh Bleecher Snyder's avatar
      cmd/compile: define roles for ssa.Func, ssa.Config, and ssa.Cache · 43afcb5c
      Josh Bleecher Snyder authored
      The line between ssa.Func and ssa.Config has blurred.
      Concurrent compilation in the backend will require more precision.
      This CL lays out an (aspirational) organization.
      The implementation will come in follow-up CLs,
      once the organization is settled.
      
      ssa.Config holds basic compiler configuration,
      mostly arch-specific information.
      It is configured once, early on, and is read-only,
      so it is safe for concurrent use.
      
      ssa.Func is a single-shot object used for
      compiling a single Func. It is not concurrency-safe
      and not re-usable.
      
      ssa.Cache is a multi-use object used to avoid
      expensive allocations during compilation.
      Each ssa.Func is given an ssa.Cache to use.
      ssa.Cache is not concurrency-safe.
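
The division of roles can be illustrated with a small sketch. The types below are simplified stand-ins that only borrow the names Config, Func, and Cache; their fields, the compile method, and compileAll are invented for illustration and do not reflect the real ssa package.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is set up once and never mutated afterwards, so any number of
// goroutines may read it.
type Config struct {
	arch    string
	ptrSize int
}

// Cache holds reusable scratch storage. It is not safe for concurrent
// use, so each worker owns exactly one.
type Cache struct {
	scratch []int
}

// Func is single-shot state for compiling one function.
type Func struct {
	name   string
	config *Config
	cache  *Cache
}

// compile pretends to do work, reusing the cache's backing array instead
// of allocating a fresh slice for every function.
func (f *Func) compile() string {
	f.cache.scratch = f.cache.scratch[:0]
	for i := 0; i < f.config.ptrSize; i++ {
		f.cache.scratch = append(f.cache.scratch, i)
	}
	return fmt.Sprintf("%s/%s:%d", f.config.arch, f.name, len(f.cache.scratch))
}

// compileAll shares cfg across workers, gives each worker its own Cache,
// and creates a fresh Func per function.
func compileAll(cfg *Config, names []string, workers int) []string {
	out := make([]string, len(names))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache := &Cache{} // one Cache per worker, never shared
			for i := range jobs {
				f := &Func{name: names[i], config: cfg, cache: cache}
				out[i] = f.compile() // each index is written by one goroutine
			}
		}()
	}
	for i := range names {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	cfg := &Config{arch: "amd64", ptrSize: 8}
	fmt.Println(compileAll(cfg, []string{"f", "g", "h"}, 2))
}
```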
      
      Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc
      Reviewed-on: https://go-review.googlesource.com/38160
      Reviewed-by: Keith Randall <khr@golang.org>
      43afcb5c
    • David Chase's avatar
      cmd/compile: put spills in better places · 886e9e60
      David Chase authored
      Previously we always issued a spill right after the op
      that was being spilled.  This CL pushes spills farther away
      from the generator, hopefully pushing them into unlikely branches.
      For example:
      
        x = ...
        if unlikely {
          call ...
        }
        ... use x ...
      
      Used to compile to
      
        x = ...
        spill x
        if unlikely {
          call ...
          restore x
        }
      
      It now compiles to
      
        x = ...
        if unlikely {
          spill x
          call ...
          restore x
        }
      
      This is particularly useful for code which appends, as the only
      call is an unlikely call to growslice.  It also helps for the
      spills needed around write barrier calls.
      
      The basic algorithm is to walk down the dominator tree following a
      path where the block still dominates all of the restores.  We're
      looking for a block that:
       1) dominates all restores
       2) has the value being spilled in a register
       3) has a loop depth no deeper than the value being spilled
      
      The walking-down code is iterative.  I was forced to limit it to
      searching 100 blocks so it doesn't become O(n^2).  Maybe one day
      we'll find a better way.
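
A simplified sketch of the walk described above, under stated assumptions: the block type, the inReg flag, the interval numbering used for dominance queries, and the function names are all hypothetical, and the real register allocator tracks far more state. Only the three conditions and the 100-block limit come from the text.

```go
package main

import "fmt"

// block is a hypothetical stand-in for an SSA basic block.
type block struct {
	name        string
	children    []*block // children in the dominator tree
	inReg       bool     // is the spilled value in a register here?
	loopDepth   int
	entry, exit int // DFS interval used for dominance queries
}

// number assigns DFS entry/exit numbers over the dominator tree.
func number(b *block, n *int) {
	b.entry = *n
	*n++
	for _, c := range b.children {
		number(c, n)
	}
	b.exit = *n
	*n++
}

// dominates reports whether a dominates b (a block dominates itself).
func dominates(a, b *block) bool {
	return a.entry <= b.entry && b.exit <= a.exit
}

func dominatesAll(b *block, restores []*block) bool {
	for _, r := range restores {
		if !dominates(b, r) {
			return false
		}
	}
	return true
}

const maxWalk = 100 // bound on the search, as in the description above

// spillBlock walks down the dominator tree from def and returns the
// deepest block seen that dominates all restores, has the value in a
// register, and is no deeper in loops than def.
func spillBlock(def *block, restores []*block) *block {
	if len(restores) == 0 {
		return def
	}
	best, cur := def, def
	for i := 0; i < maxWalk; i++ {
		var next *block
		for _, c := range cur.children {
			if dominatesAll(c, restores) {
				next = c
				break
			}
		}
		if next == nil {
			break
		}
		cur = next
		if cur.inReg && cur.loopDepth <= def.loopDepth {
			best = cur
		}
	}
	return best
}

func main() {
	unlikely := &block{name: "unlikely", inReg: true}
	merge := &block{name: "merge", inReg: true}
	entry := &block{name: "entry", inReg: true, children: []*block{unlikely, merge}}
	n := 0
	number(entry, &n)

	fmt.Println(spillBlock(entry, []*block{unlikely}).name)        // prints unlikely
	fmt.Println(spillBlock(entry, []*block{unlikely, merge}).name) // prints entry
}
```

When the only restore sits in the unlikely branch, the spill moves into that branch; when restores exist on both paths, no child dominates them all and the spill stays next to the definition.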
      
      I had to delete most of David's code which pushed spills out of loops.
      I suspect this CL subsumes most of the cases that his code handled.
      
      Generally positive performance improvements, but hard to tell for sure
      with all the noise.  (compilebench times are unchanged.)
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.91s ±15%     2.80s ±12%    ~     (p=0.063 n=10+10)
      Fannkuch11-12                3.47s ± 0%     3.30s ± 4%  -4.91%   (p=0.000 n=9+10)
      FmtFprintfEmpty-12          48.0ns ± 1%    47.4ns ± 1%  -1.32%    (p=0.002 n=9+9)
      FmtFprintfString-12         85.6ns ±11%    79.4ns ± 3%  -7.27%  (p=0.005 n=10+10)
      FmtFprintfInt-12            91.8ns ±10%    85.9ns ± 4%    ~      (p=0.203 n=10+9)
      FmtFprintfIntInt-12          135ns ±13%     127ns ± 1%  -5.72%   (p=0.025 n=10+9)
      FmtFprintfPrefixedInt-12     167ns ± 1%     168ns ± 2%    ~      (p=0.580 n=9+10)
      FmtFprintfFloat-12           249ns ±11%     230ns ± 1%  -7.32%  (p=0.000 n=10+10)
      FmtManyArgs-12               504ns ± 7%     506ns ± 1%    ~       (p=0.198 n=9+9)
      GobDecode-12                6.95ms ± 1%    7.04ms ± 1%  +1.37%  (p=0.001 n=10+10)
      GobEncode-12                6.32ms ±13%    6.04ms ± 1%    ~     (p=0.063 n=10+10)
      Gzip-12                      233ms ± 1%     235ms ± 0%  +1.01%   (p=0.000 n=10+9)
      Gunzip-12                   40.1ms ± 1%    39.6ms ± 0%  -1.12%   (p=0.000 n=10+8)
      HTTPClientServer-12          227µs ± 9%     221µs ± 5%    ~       (p=0.114 n=9+8)
      JSONEncode-12               16.1ms ± 2%    15.8ms ± 1%  -2.09%    (p=0.002 n=9+8)
      JSONDecode-12               61.8ms ±11%    57.9ms ± 1%  -6.30%   (p=0.000 n=10+9)
      Mandelbrot200-12            4.30ms ± 3%    4.28ms ± 1%    ~      (p=0.203 n=10+8)
      GoParse-12                  3.18ms ± 2%    3.18ms ± 2%    ~     (p=0.579 n=10+10)
      RegexpMatchEasy0_32-12      76.7ns ± 1%    77.5ns ± 1%  +0.92%    (p=0.002 n=9+8)
      RegexpMatchEasy0_1K-12       239ns ± 3%     239ns ± 1%    ~     (p=0.204 n=10+10)
      RegexpMatchEasy1_32-12      71.4ns ± 1%    70.6ns ± 0%  -1.15%   (p=0.000 n=10+9)
      RegexpMatchEasy1_1K-12       383ns ± 2%     390ns ±10%    ~       (p=0.181 n=8+9)
      RegexpMatchMedium_32-12      114ns ± 0%     113ns ± 1%  -0.88%    (p=0.000 n=9+8)
      RegexpMatchMedium_1K-12     36.3µs ± 1%    36.8µs ± 1%  +1.59%   (p=0.000 n=10+8)
      RegexpMatchHard_32-12       1.90µs ± 1%    1.90µs ± 1%    ~     (p=0.341 n=10+10)
      RegexpMatchHard_1K-12       59.4µs ±11%    57.8µs ± 1%    ~      (p=0.968 n=10+9)
      Revcomp-12                   461ms ± 1%     462ms ± 1%    ~       (p=1.000 n=9+9)
      Template-12                 67.5ms ± 1%    66.3ms ± 1%  -1.77%   (p=0.000 n=10+8)
      TimeParse-12                 314ns ± 3%     309ns ± 0%  -1.56%    (p=0.000 n=9+8)
      TimeFormat-12                340ns ± 2%     331ns ± 1%  -2.79%  (p=0.000 n=10+10)
      
      The go binary is 0.2% larger.  Not really sure why the size
      would change.
      
      Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c
      Reviewed-on: https://go-review.googlesource.com/34822
      Reviewed-by: David Chase <drchase@google.com>
      886e9e60