1. 16 Mar, 2017 13 commits
  2. 15 Mar, 2017 10 commits
    • runtime: make complex division c99 compatible · 16200c73
      Martin Möhrmann authored
      - changes tests to check that the real and imaginary part of the go complex
        division result is equal to the result gcc produces for c99
      - changes complex division code to satisfy new complex division test
      - adds float functions isNaN, isFinite, isInf, abs and copysign
        in the runtime package
      
      Fixes #14644.
      
      name                   old time/op  new time/op  delta
      Complex128DivNormal-4  21.8ns ± 6%  13.9ns ± 6%  -36.37%  (p=0.000 n=20+20)
      Complex128DivNisNaN-4  14.1ns ± 1%  15.0ns ± 1%   +5.86%  (p=0.000 n=20+19)
      Complex128DivDisNaN-4  12.5ns ± 1%  16.7ns ± 1%  +33.79%  (p=0.000 n=19+20)
      Complex128DivNisInf-4  10.1ns ± 1%  13.0ns ± 1%  +28.25%  (p=0.000 n=20+19)
      Complex128DivDisInf-4  11.0ns ± 1%  20.9ns ± 1%  +90.69%  (p=0.000 n=16+19)
      ComplexAlgMap-4        86.7ns ± 1%  86.8ns ± 2%     ~     (p=0.804 n=20+20)
      
      Change-Id: I261f3b4a81f6cc858bc7ff48f6fd1b39c300abf0
      Reviewed-on: https://go-review.googlesource.com/37441
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • runtime: print user stack on other threads during GOTRACEBACK=crash · 4b8f41da
      Austin Clements authored
      Currently, when printing tracebacks of other threads during
      GOTRACEBACK=crash, if the thread is on the system stack we print only
      the header for the user goroutine and fail to print its stack. This
      happens because we passed the g0 to traceback instead of curg. The g0
      never has anything set in its gobuf, so traceback doesn't print
      anything.
      
      Fix this by passing _g_.m.curg to traceback instead of the g0.
      
      Fixes #19494.
      
      Change-Id: Idfabf94d6a725e9cdf94a3923dead6455ef3b217
      Reviewed-on: https://go-review.googlesource.com/38012
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • runtime: make GOTRACEBACK=crash crash promptly in cgo binaries · f2e87158
      Austin Clements authored
      GOTRACEBACK=crash works by bouncing a SIGQUIT around the process
      sched.mcount times. However, sched.mcount includes the extra Ms
      allocated by oneNewExtraM for cgo callbacks. Hence, if there are any
      extra Ms that don't have real OS threads, we'll try to send SIGQUIT
      more times than there are threads to catch it. Since nothing will
      catch these extra signals, we'll fall back to blocking for five
      seconds before aborting the process.
      
      Avoid this five second delay by subtracting out the number of extra Ms
      when sending SIGQUITs.
      
      Of course, in a cgo binary, it's still possible for the SIGQUIT to go
      to a cgo thread and cause some other failure mode. This does not fix
      that.
      
      Change-Id: I4fbf3c52dd721812796c4c1dcb2ab4cb7026d965
      Reviewed-on: https://go-review.googlesource.com/38182
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile: check labels and gotos before building SSA · c03e75e5
      Josh Bleecher Snyder authored
      This CL introduces yet another compiler pass,
      which checks for correct control flow constructs
      prior to converting from AST to SSA form.
      
      It cannot be integrated with walk, since walk rewrites
      switch and select statements on the fly.
      
      To reduce code duplication, this CL also does some
      minor refactoring.
      
      With this pass in place, the AST to SSA converter
      can now stop generating SSA for any known-dead code.
      This minor savings pays for the minor cost of the new pass.
      
      Performance is almost a wash:
      
      name       old time/op     new time/op     delta
      Template       206ms ± 4%      205ms ± 4%   ~     (p=0.108 n=43+43)
      Unicode       84.0ms ± 4%     84.0ms ± 4%   ~     (p=0.979 n=43+43)
      GoTypes        550ms ± 3%      553ms ± 3%   ~     (p=0.065 n=40+41)
      Compiler       2.57s ± 4%      2.58s ± 2%   ~     (p=0.103 n=44+41)
      SSA            3.94s ± 3%      3.93s ± 2%   ~     (p=0.833 n=44+42)
      Flate          126ms ± 6%      125ms ± 4%   ~     (p=0.941 n=43+39)
      GoParser       147ms ± 4%      148ms ± 3%   ~     (p=0.164 n=42+39)
      Reflect        359ms ± 3%      357ms ± 5%   ~     (p=0.241 n=43+44)
      Tar            106ms ± 5%      106ms ± 7%   ~     (p=0.853 n=40+43)
      XML            202ms ± 3%      203ms ± 3%   ~     (p=0.488 n=42+41)
      
      name       old user-ns/op  new user-ns/op  delta
      Template        240M ± 4%       239M ± 4%   ~     (p=0.844 n=42+43)
      Unicode         107M ± 5%       107M ± 4%   ~     (p=0.332 n=40+43)
      GoTypes         735M ± 3%       731M ± 4%   ~     (p=0.141 n=43+44)
      Compiler       3.51G ± 3%      3.52G ± 3%   ~     (p=0.208 n=42+43)
      SSA            5.72G ± 4%      5.72G ± 3%   ~     (p=0.928 n=44+42)
      Flate           151M ± 7%       150M ± 8%   ~     (p=0.662 n=44+43)
      GoParser        181M ± 5%       181M ± 4%   ~     (p=0.379 n=41+44)
      Reflect         447M ± 4%       445M ± 4%   ~     (p=0.344 n=43+43)
      Tar             125M ± 7%       124M ± 6%   ~     (p=0.353 n=43+43)
      XML             248M ± 4%       250M ± 6%   ~     (p=0.158 n=44+44)
      
      name       old alloc/op    new alloc/op    delta
      Template      40.3MB ± 0%     40.2MB ± 0%  -0.27%  (p=0.000 n=10+10)
      Unicode       30.3MB ± 0%     30.2MB ± 0%  -0.10%  (p=0.015 n=10+10)
      GoTypes        114MB ± 0%      114MB ± 0%  -0.06%  (p=0.000 n=7+9)
      Compiler       480MB ± 0%      481MB ± 0%  +0.07%  (p=0.000 n=10+10)
      SSA            864MB ± 0%      862MB ± 0%  -0.25%  (p=0.000 n=9+10)
      Flate         25.9MB ± 0%     25.9MB ± 0%    ~     (p=0.123 n=10+10)
      GoParser      32.1MB ± 0%     32.1MB ± 0%    ~     (p=0.631 n=10+10)
      Reflect       79.9MB ± 0%     79.6MB ± 0%  -0.39%  (p=0.000 n=10+9)
      Tar           27.1MB ± 0%     27.0MB ± 0%  -0.18%  (p=0.003 n=10+10)
      XML           42.6MB ± 0%     42.6MB ± 0%    ~     (p=0.143 n=10+10)
      
      name       old allocs/op   new allocs/op   delta
      Template        401k ± 0%       401k ± 1%    ~     (p=0.353 n=10+10)
      Unicode         322k ± 0%       322k ± 0%    ~     (p=0.739 n=10+10)
      GoTypes        1.18M ± 0%      1.18M ± 0%  +0.25%  (p=0.001 n=7+8)
      Compiler       4.51M ± 0%      4.53M ± 0%  +0.37%  (p=0.000 n=10+10)
      SSA            7.91M ± 0%      7.93M ± 0%  +0.20%  (p=0.000 n=9+10)
      Flate           244k ± 0%       245k ± 0%    ~     (p=0.123 n=10+10)
      GoParser        323k ± 1%       324k ± 1%  +0.40%  (p=0.035 n=10+10)
      Reflect        1.01M ± 0%      1.02M ± 0%  +0.37%  (p=0.000 n=10+9)
      Tar             258k ± 1%       258k ± 1%    ~     (p=0.661 n=10+9)
      XML             403k ± 0%       405k ± 0%  +0.47%  (p=0.004 n=10+10)
      
      Updates #15756
      Updates #19250
      
      Change-Id: I647bfbb745c35630447eb79dfcaa994b490ce942
      Reviewed-on: https://go-review.googlesource.com/38159
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/compile: ensure TESTQconst AuxInt is in range · 604455a4
      Josh Bleecher Snyder authored
      Fixes #19555
      
      Change-Id: I7aa0551a90f6bb630c0ba721f3525a8a9cf793fd
      Reviewed-on: https://go-review.googlesource.com/38164
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • archive/zip: parallelize benchmarks · d0a045da
      Bryan C. Mills authored
      Add subbenchmarks for BenchmarkZip64Test with different sizes to tease
      apart construction costs vs. steady-state throughput.
      
      Results remain comparable with the non-parallel version with -cpu=1:
      
      benchmark                           old ns/op     new ns/op     delta
      BenchmarkCompressedZipGarbage       26832835      27506953      +2.51%
      BenchmarkCompressedZipGarbage-6     27172377      4321534       -84.10%
      BenchmarkZip64Test                  196758732     197765510     +0.51%
      BenchmarkZip64Test-6                193850605     192625458     -0.63%
      
      benchmark                           old allocs     new allocs     delta
      BenchmarkCompressedZipGarbage       44             44             +0.00%
      BenchmarkCompressedZipGarbage-6     44             44             +0.00%
      
      benchmark                           old bytes     new bytes     delta
      BenchmarkCompressedZipGarbage       5592          5664          +1.29%
      BenchmarkCompressedZipGarbage-6     5592          21946         +292.45%
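
For readers unfamiliar with the pattern, size-parameterized subbenchmarks combined with parallel execution can be sketched as follows. This is an illustration under assumptions, not the code from this change: `writeArchive`, the entry counts, and the shortened `test.benchtime` are all hypothetical choices made to keep the demo small and quick.

```go
package main

import (
	"archive/zip"
	"flag"
	"fmt"
	"io"
	"testing"
)

// writeArchive (hypothetical helper) builds a zip archive with n small
// entries and discards the output bytes.
func writeArchive(n int) error {
	w := zip.NewWriter(io.Discard)
	for i := 0; i < n; i++ {
		f, err := w.Create(fmt.Sprintf("file%d.txt", i))
		if err != nil {
			return err
		}
		if _, err := f.Write([]byte("hello, zip")); err != nil {
			return err
		}
	}
	return w.Close()
}

// benchmarkArchives runs one subbenchmark per archive size. Small sizes
// mostly measure construction cost; larger ones approach steady state.
func benchmarkArchives(b *testing.B) {
	for _, n := range []int{1, 64} {
		n := n
		b.Run(fmt.Sprintf("entries=%d", n), func(b *testing.B) {
			b.ReportAllocs()
			// RunParallel spreads the b.N iterations over several goroutines.
			b.RunParallel(func(pb *testing.PB) {
				for pb.Next() {
					if err := writeArchive(n); err != nil {
						b.Error(err)
						return
					}
				}
			})
		})
	}
}

func main() {
	// Register the testing flags so Benchmark can run outside "go test",
	// then shorten the run time for this demo.
	testing.Init()
	if err := flag.Set("test.benchtime", "100ms"); err != nil {
		panic(err)
	}
	res := testing.Benchmark(benchmarkArchives)
	fmt.Println(res, res.MemString())
}
```

Because each goroutine allocates its own writer state, per-operation byte counts can rise under parallelism even when the allocation count stays flat, which is consistent with the `-6` rows above.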
      
      Updates #18177
      
      Change-Id: Icfa359d9b1a8df5e085dacc07d2b9221b284764c
      Reviewed-on: https://go-review.googlesource.com/36719
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/link: on PPC64, put plt stubs at beginning of Textp · 15b37655
      Cherry Zhang authored
      Put call stubs at the beginning (instead of the end), so that
      the trampoline pass knows the addresses of the stubs and can
      insert trampolines when necessary.
      
      Fixes #19425.
      
      Change-Id: I1e06529ef837a6130df58917315610d45a6819ca
      Reviewed-on: https://go-review.googlesource.com/38131
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
    • cmd/compile: define roles for ssa.Func, ssa.Config, and ssa.Cache · 43afcb5c
      Josh Bleecher Snyder authored
      The line between ssa.Func and ssa.Config has blurred.
      Concurrent compilation in the backend will require more precision.
      This CL lays out an (aspirational) organization.
      The implementation will come in follow-up CLs,
      once the organization is settled.
      
      ssa.Config holds basic compiler configuration,
      mostly arch-specific information.
      It is configured once, early on, and is readonly,
      so it is safe for concurrent use.
      
      ssa.Func is a single-shot object used for
      compiling a single Func. It is not concurrency-safe
      and not re-usable.
      
      ssa.Cache is a multi-use object used to avoid
      expensive allocations during compilation.
      Each ssa.Func is given an ssa.Cache to use.
      ssa.Cache is not concurrency-safe.
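
The division of roles described above can be sketched with stand-in types. Everything here is hypothetical (`config`, `cache`, `fn`, `compileAll`, and the toy "compile" step are invented for illustration and are not the cmd/compile API): one read-only config is shared by all workers, each worker owns one cache, and each function gets a fresh single-use object.

```go
package main

import (
	"fmt"
	"sync"
)

// config stands in for ssa.Config: set up once, then only read,
// so it can be shared across goroutines.
type config struct {
	arch    string
	ptrSize int
}

// cache stands in for ssa.Cache: reusable scratch storage.
// It is not concurrency-safe, so each worker owns exactly one.
type cache struct {
	scratch []int
}

// fn stands in for ssa.Func: state for compiling one function, used once.
type fn struct {
	name  string
	cfg   *config
	cache *cache
}

func newFn(name string, cfg *config, c *cache) *fn {
	c.scratch = c.scratch[:0] // reuse the backing array from the previous fn
	return &fn{name: name, cfg: cfg, cache: c}
}

// compile is a toy stand-in for real work: it fills the scratch space
// and returns a made-up size.
func (f *fn) compile() int {
	for i := 0; i < len(f.name); i++ {
		f.cache.scratch = append(f.cache.scratch, int(f.name[i]))
	}
	return len(f.cache.scratch) * f.cfg.ptrSize
}

func compileAll(cfg *config, names []string, workers int) map[string]int {
	work := make(chan string)
	out := make(map[string]int)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c := new(cache) // owned by this goroutine only
			for name := range work {
				size := newFn(name, cfg, c).compile()
				mu.Lock()
				out[name] = size
				mu.Unlock()
			}
		}()
	}
	for _, n := range names {
		work <- n
	}
	close(work)
	wg.Wait()
	return out
}

func main() {
	cfg := &config{arch: "amd64", ptrSize: 8}
	fmt.Println(compileAll(cfg, []string{"main", "init", "helper"}, 2))
}
```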
      
      Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc
      Reviewed-on: https://go-review.googlesource.com/38160
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: put spills in better places · 886e9e60
      David Chase authored
      Previously we always issued a spill right after the op
      that was being spilled.  This CL pushes spills farther away
      from the generator, hopefully pushing them into unlikely branches.
      For example:
      
        x = ...
        if unlikely {
          call ...
        }
        ... use x ...
      
      Used to compile to
      
        x = ...
        spill x
        if unlikely {
          call ...
          restore x
        }
      
      It now compiles to
      
        x = ...
        if unlikely {
          spill x
          call ...
          restore x
        }
      
      This is particularly useful for code which appends, as the only
      call is an unlikely call to growslice.  It also helps for the
      spills needed around write barrier calls.
      
      The basic algorithm is to walk down the dominator tree following a
      path where the block still dominates all of the restores.  We're
      looking for a block that:
       1) dominates all restores
       2) has the value being spilled in a register
       3) has a loop depth no deeper than the value being spilled
      
      The walking-down code is iterative.  I was forced to limit it to
      searching 100 blocks so it doesn't become O(n^2).  Maybe one day
      we'll find a better way.
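
The walk described above can be sketched on a toy dominator tree. This is a simplified illustration, not the register allocator's code: the `block` type, its fields, `spillBlock`, and the way the 100-block limit is counted are all assumptions made for the example.

```go
package main

import "fmt"

// block is a hypothetical stand-in for an SSA basic block.
type block struct {
	name      string
	idom      *block   // immediate dominator; nil for the entry block
	kids      []*block // children in the dominator tree
	loopDepth int
	inReg     bool // the value being spilled is in a register here
}

func newBlock(name string, idom *block, loopDepth int, inReg bool) *block {
	b := &block{name: name, idom: idom, loopDepth: loopDepth, inReg: inReg}
	if idom != nil {
		idom.kids = append(idom.kids, b)
	}
	return b
}

// dominates reports whether a dominates b (a block dominates itself).
func dominates(a, b *block) bool {
	for ; b != nil; b = b.idom {
		if a == b {
			return true
		}
	}
	return false
}

func dominatesAll(a *block, restores []*block) bool {
	for _, r := range restores {
		if !dominates(a, r) {
			return false
		}
	}
	return true
}

// maxBlocks bounds the walk so the search cannot become quadratic.
const maxBlocks = 100

// spillBlock walks down the dominator tree from def, the block that
// defines the value, and returns the deepest block found that still
// dominates every restore, has the value in a register, and is not in
// a deeper loop than def.
func spillBlock(def *block, restores []*block) *block {
	if len(restores) == 0 {
		return def
	}
	best, cur := def, def
	for visited := 0; visited < maxBlocks; visited++ {
		var next *block
		for _, k := range cur.kids {
			if dominatesAll(k, restores) {
				next = k // at most one child can dominate all restores
				break
			}
		}
		if next == nil {
			break
		}
		cur = next
		if cur.inReg && cur.loopDepth <= def.loopDepth {
			best = cur
		}
	}
	return best
}

func main() {
	entry := newBlock("entry", nil, 0, true)
	unlikely := newBlock("unlikely", entry, 0, true)
	join := newBlock("join", entry, 0, true)

	// Restore only in the unlikely branch: the spill can move into it.
	fmt.Println(spillBlock(entry, []*block{unlikely}).name)
	// Restores on both paths: the spill has to stay at the definition.
	fmt.Println(spillBlock(entry, []*block{unlikely, join}).name)
}
```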
      
      I had to delete most of David's code which pushed spills out of loops.
      I suspect this CL subsumes most of the cases that his code handled.
      
      Generally positive performance improvements, but hard to tell for sure
      with all the noise.  (compilebench times are unchanged.)
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.91s ±15%     2.80s ±12%    ~     (p=0.063 n=10+10)
      Fannkuch11-12                3.47s ± 0%     3.30s ± 4%  -4.91%   (p=0.000 n=9+10)
      FmtFprintfEmpty-12          48.0ns ± 1%    47.4ns ± 1%  -1.32%    (p=0.002 n=9+9)
      FmtFprintfString-12         85.6ns ±11%    79.4ns ± 3%  -7.27%  (p=0.005 n=10+10)
      FmtFprintfInt-12            91.8ns ±10%    85.9ns ± 4%    ~      (p=0.203 n=10+9)
      FmtFprintfIntInt-12          135ns ±13%     127ns ± 1%  -5.72%   (p=0.025 n=10+9)
      FmtFprintfPrefixedInt-12     167ns ± 1%     168ns ± 2%    ~      (p=0.580 n=9+10)
      FmtFprintfFloat-12           249ns ±11%     230ns ± 1%  -7.32%  (p=0.000 n=10+10)
      FmtManyArgs-12               504ns ± 7%     506ns ± 1%    ~       (p=0.198 n=9+9)
      GobDecode-12                6.95ms ± 1%    7.04ms ± 1%  +1.37%  (p=0.001 n=10+10)
      GobEncode-12                6.32ms ±13%    6.04ms ± 1%    ~     (p=0.063 n=10+10)
      Gzip-12                      233ms ± 1%     235ms ± 0%  +1.01%   (p=0.000 n=10+9)
      Gunzip-12                   40.1ms ± 1%    39.6ms ± 0%  -1.12%   (p=0.000 n=10+8)
      HTTPClientServer-12          227µs ± 9%     221µs ± 5%    ~       (p=0.114 n=9+8)
      JSONEncode-12               16.1ms ± 2%    15.8ms ± 1%  -2.09%    (p=0.002 n=9+8)
      JSONDecode-12               61.8ms ±11%    57.9ms ± 1%  -6.30%   (p=0.000 n=10+9)
      Mandelbrot200-12            4.30ms ± 3%    4.28ms ± 1%    ~      (p=0.203 n=10+8)
      GoParse-12                  3.18ms ± 2%    3.18ms ± 2%    ~     (p=0.579 n=10+10)
      RegexpMatchEasy0_32-12      76.7ns ± 1%    77.5ns ± 1%  +0.92%    (p=0.002 n=9+8)
      RegexpMatchEasy0_1K-12       239ns ± 3%     239ns ± 1%    ~     (p=0.204 n=10+10)
      RegexpMatchEasy1_32-12      71.4ns ± 1%    70.6ns ± 0%  -1.15%   (p=0.000 n=10+9)
      RegexpMatchEasy1_1K-12       383ns ± 2%     390ns ±10%    ~       (p=0.181 n=8+9)
      RegexpMatchMedium_32-12      114ns ± 0%     113ns ± 1%  -0.88%    (p=0.000 n=9+8)
      RegexpMatchMedium_1K-12     36.3µs ± 1%    36.8µs ± 1%  +1.59%   (p=0.000 n=10+8)
      RegexpMatchHard_32-12       1.90µs ± 1%    1.90µs ± 1%    ~     (p=0.341 n=10+10)
      RegexpMatchHard_1K-12       59.4µs ±11%    57.8µs ± 1%    ~      (p=0.968 n=10+9)
      Revcomp-12                   461ms ± 1%     462ms ± 1%    ~       (p=1.000 n=9+9)
      Template-12                 67.5ms ± 1%    66.3ms ± 1%  -1.77%   (p=0.000 n=10+8)
      TimeParse-12                 314ns ± 3%     309ns ± 0%  -1.56%    (p=0.000 n=9+8)
      TimeFormat-12                340ns ± 2%     331ns ± 1%  -2.79%  (p=0.000 n=10+10)
      
      The go binary is 0.2% larger.  Not really sure why the size
      would change.
      
      Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c
      Reviewed-on: https://go-review.googlesource.com/34822
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile/internal/gc: mark generated wrappers as DUPOK · 710f4d3e
      Philip Hofer authored
      Interface wrapper functions now get compiled eagerly in some cases.
      Consequently, they may be present in multiple translation units.
      Mark them as DUPOK, just like closures.
      
      Fixes #19548
      Fixes #19550
      
      Change-Id: Ibe74adb5a62dbf6447db37fde22dcbb3479969ef
      Reviewed-on: https://go-review.googlesource.com/38156
      Reviewed-by: David Chase <drchase@google.com>
  3. 14 Mar, 2017 17 commits