1. 07 Sep, 2018 8 commits
    • cmd/compile: implement non-constant rotates using ROR on arm64 · 204cc14b
      erifan01 authored
      Add rules that match Go code such as:
      	y &= 63
      	x << y | x >> (64-y)
      or
      	y &= 63
      	x >> y | x << (64-y)
      as a ROR instruction. This makes math/bits.RotateLeft faster on arm64.
      
      Extends CL 132435 to arm64.
      
      Benchmarks of math/bits.RotateLeftxxN:
      name            old time/op       new time/op       delta
      RotateLeft-8    3.548750ns +- 1%  2.003750ns +- 0%  -43.54%  (p=0.000 n=8+8)
      RotateLeft8-8   3.925000ns +- 0%  3.925000ns +- 0%     ~     (p=1.000 n=8+8)
      RotateLeft16-8  3.925000ns +- 0%  3.927500ns +- 0%     ~     (p=0.608 n=8+8)
      RotateLeft32-8  3.925000ns +- 0%  2.002500ns +- 0%  -48.98%  (p=0.000 n=8+8)
      RotateLeft64-8  3.536250ns +- 0%  2.003750ns +- 0%  -43.34%  (p=0.000 n=8+8)
      
      Change-Id: I77622cd7f39b917427e060647321f5513973232c
      Reviewed-on: https://go-review.googlesource.com/122542
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      204cc14b
    • cmd/dist, go/types: add support for GOARCH=sparc64 · d8c8a142
      Tobias Klauser authored
      This is needed in addition to CL 102555 in order to be able to generate
      Go type definitions for linux/sparc64 in the golang.org/x/sys/unix
      package.
      
      Change-Id: I928185e320572fecb0c89396f871ea16cba8b9a6
      Reviewed-on: https://go-review.googlesource.com/132155
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      d8c8a142
    • fmt: add example for Sprint · 7bee8085
      Thanabodee Charoenpiriyakij authored
      Updates #27376
      
      Change-Id: I9ce6541a95b5ecd13f3932558427de1f597df07a
      Reviewed-on: https://go-review.googlesource.com/134036
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      7bee8085
    • fmt: add example for Print · b7182acf
      Thanabodee Charoenpiriyakij authored
      Updates #27376
      
      Change-Id: I2fa63b0d1981a419626072d985e6f3326f6013ff
      Reviewed-on: https://go-review.googlesource.com/134035
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      b7182acf
    • fmt: add example for Fprint · 9facf355
      Thanabodee Charoenpiriyakij authored
      Updates #27376
      
      Change-Id: I0ceb672a9fcd7bbf491be1577d7f135ef35b2561
      Reviewed-on: https://go-review.googlesource.com/133455
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      9facf355
    • debug/elf: add R_RISCV_32_PCREL relocation · f64c0b2a
      Tobias Klauser authored
      This was missed in CL 107339 as it is not documented (yet) in
      https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md
      
      But binutils already uses it. See
      https://github.com/riscv/riscv-elf-psabi-doc/issues/36
      
      Change-Id: I1b084cbf70eb6ac966136bed1bb654883a97b6a9
      Reviewed-on: https://go-review.googlesource.com/134015
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      f64c0b2a
    • runtime/pprof: remove "deleted" suffix while parsing maps file · 73b8e5f8
      Marko Kevac authored
      If the binary file of a running program is deleted or moved, the maps
      file (/proc/pid/maps) contains lines in which the binary's filename
      is suffixed with the string "(deleted)". This suffix stayed as part
      of the filename and made remote profiling slightly more difficult,
      because the user had to rename the binary file to include the
      suffix.
      
      This change removes the suffix from the filename and thus
      simplifies debugging.
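
      As a rough illustration of the cleanup described above, the following sketch pulls the pathname column out of one /proc/pid/maps line and strips the marker. It is not the runtime/pprof code: mappedFile and the sample line are hypothetical, and rejoining fields with single spaces is a simplification that would alter paths containing runs of spaces.

```go
package main

import (
	"fmt"
	"strings"
)

// mappedFile returns the pathname column of a /proc/pid/maps line with a
// trailing " (deleted)" marker removed. It returns "" for anonymous
// mappings, which have no pathname column.
func mappedFile(line string) string {
	// Columns: address perms offset dev inode pathname
	fields := strings.Fields(line)
	if len(fields) < 6 {
		return ""
	}
	// Fields splits "(deleted)" off as its own token, so rejoin first.
	name := strings.Join(fields[5:], " ")
	return strings.TrimSuffix(name, " (deleted)")
}

func main() {
	line := "00400000-00452000 r-xp 00000000 08:01 1234   /tmp/app (deleted)"
	fmt.Println(mappedFile(line))
}
```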
      
      Fixes #25740
      
      Change-Id: Ib3c8c3b9ef536c2ac037fcc14e8037fa5c960036
      Reviewed-on: https://go-review.googlesource.com/116395
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
      73b8e5f8
    • cmd/compile: optimize 386's comparison · 031a35ec
      Ben Shi authored
      The optimization "(CMPconst [0] (ANDL x y)) -> (TESTL x y)" only
      pays off if the result of x&y has no further use. Adding a
      uses==1 condition gives a slight improvement.
      
      1. The code size of pkg/linux_386 decreases about 300 bytes, excluding
      cmd/compile/.
      
      2. The go1 benchmark shows no regression, and even a slight improvement
      in test case FmtFprintfEmpty-4, excluding noise.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              3.34s ± 3%     3.32s ± 2%    ~     (p=0.197 n=30+30)
      Fannkuch11-4                3.48s ± 2%     3.47s ± 1%  -0.33%  (p=0.015 n=30+30)
      FmtFprintfEmpty-4          46.3ns ± 4%    44.8ns ± 4%  -3.33%  (p=0.000 n=30+30)
      FmtFprintfString-4         78.8ns ± 7%    77.3ns ± 5%    ~     (p=0.098 n=30+26)
      FmtFprintfInt-4            90.2ns ± 1%    90.0ns ± 7%  -0.23%  (p=0.027 n=18+30)
      FmtFprintfIntInt-4          144ns ± 4%     143ns ± 5%    ~     (p=0.945 n=30+29)
      FmtFprintfPrefixedInt-4     180ns ± 4%     180ns ± 5%    ~     (p=0.858 n=30+30)
      FmtFprintfFloat-4           409ns ± 4%     406ns ± 3%  -0.87%  (p=0.028 n=30+30)
      FmtManyArgs-4               611ns ± 5%     608ns ± 4%    ~     (p=0.812 n=30+30)
      GobDecode-4                7.30ms ± 5%    7.26ms ± 5%    ~     (p=0.522 n=30+29)
      GobEncode-4                6.90ms ± 7%    6.82ms ± 4%    ~     (p=0.086 n=29+28)
      Gzip-4                      396ms ± 4%     400ms ± 4%  +0.99%  (p=0.026 n=30+30)
      Gunzip-4                   41.1ms ± 3%    41.2ms ± 3%    ~     (p=0.495 n=30+30)
      HTTPClientServer-4         63.7µs ± 3%    63.3µs ± 2%    ~     (p=0.113 n=29+29)
      JSONEncode-4               16.1ms ± 2%    16.1ms ± 2%  -0.30%  (p=0.041 n=30+30)
      JSONDecode-4               60.9ms ± 3%    61.2ms ± 6%    ~     (p=0.187 n=30+30)
      Mandelbrot200-4            5.17ms ± 2%    5.19ms ± 3%    ~     (p=0.676 n=30+30)
      GoParse-4                  3.28ms ± 3%    3.25ms ± 2%  -0.97%  (p=0.002 n=30+30)
      RegexpMatchEasy0_32-4       103ns ± 4%     104ns ± 4%    ~     (p=0.352 n=30+30)
      RegexpMatchEasy0_1K-4       849ns ± 2%     845ns ± 2%    ~     (p=0.381 n=30+30)
      RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 4%    ~     (p=0.795 n=30+30)
      RegexpMatchEasy1_1K-4      1.03µs ± 3%    1.03µs ± 4%    ~     (p=0.275 n=25+30)
      RegexpMatchMedium_32-4      132ns ± 3%     132ns ± 3%    ~     (p=0.970 n=30+30)
      RegexpMatchMedium_1K-4     41.4µs ± 3%    41.4µs ± 3%    ~     (p=0.212 n=30+30)
      RegexpMatchHard_32-4       2.22µs ± 4%    2.22µs ± 4%    ~     (p=0.399 n=30+30)
      RegexpMatchHard_1K-4       67.2µs ± 3%    67.6µs ± 4%    ~     (p=0.359 n=30+30)
      Revcomp-4                   1.84s ± 2%     1.83s ± 2%    ~     (p=0.532 n=30+30)
      Template-4                 69.1ms ± 4%    68.8ms ± 3%    ~     (p=0.146 n=30+30)
      TimeParse-4                 441ns ± 3%     442ns ± 3%    ~     (p=0.154 n=30+30)
      TimeFormat-4                413ns ± 3%     414ns ± 3%    ~     (p=0.275 n=30+30)
      [Geo mean]                 66.2µs         66.0µs       -0.28%
      
      name                     old speed      new speed      delta
      GobDecode-4               105MB/s ± 5%   106MB/s ± 5%    ~     (p=0.514 n=30+29)
      GobEncode-4               111MB/s ± 5%   113MB/s ± 4%  +1.37%  (p=0.046 n=28+28)
      Gzip-4                   49.1MB/s ± 4%  48.6MB/s ± 4%  -0.98%  (p=0.028 n=30+30)
      Gunzip-4                  472MB/s ± 4%   472MB/s ± 3%    ~     (p=0.496 n=30+30)
      JSONEncode-4              120MB/s ± 2%   121MB/s ± 2%  +0.29%  (p=0.042 n=30+30)
      JSONDecode-4             31.9MB/s ± 3%  31.7MB/s ± 6%    ~     (p=0.186 n=30+30)
      GoParse-4                17.6MB/s ± 3%  17.8MB/s ± 2%  +0.98%  (p=0.002 n=30+30)
      RegexpMatchEasy0_32-4     309MB/s ± 4%   307MB/s ± 4%    ~     (p=0.501 n=30+30)
      RegexpMatchEasy0_1K-4    1.21GB/s ± 2%  1.21GB/s ± 2%    ~     (p=0.301 n=30+30)
      RegexpMatchEasy1_32-4     283MB/s ± 4%   282MB/s ± 3%    ~     (p=0.877 n=30+30)
      RegexpMatchEasy1_1K-4    1.00GB/s ± 3%  0.99GB/s ± 4%    ~     (p=0.276 n=25+30)
      RegexpMatchMedium_32-4   7.54MB/s ± 3%  7.55MB/s ± 3%    ~     (p=0.528 n=30+30)
      RegexpMatchMedium_1K-4   24.7MB/s ± 3%  24.7MB/s ± 3%    ~     (p=0.203 n=30+30)
      RegexpMatchHard_32-4     14.4MB/s ± 4%  14.4MB/s ± 4%    ~     (p=0.407 n=30+30)
      RegexpMatchHard_1K-4     15.3MB/s ± 3%  15.1MB/s ± 4%    ~     (p=0.306 n=30+30)
      Revcomp-4                 138MB/s ± 2%   139MB/s ± 2%    ~     (p=0.520 n=30+30)
      Template-4               28.1MB/s ± 4%  28.2MB/s ± 3%    ~     (p=0.149 n=30+30)
      [Geo mean]               81.5MB/s       81.5MB/s       +0.06%
      
      Change-Id: I7f75425f79eec93cdd8fdd94db13ad4f61b6a2f5
      Reviewed-on: https://go-review.googlesource.com/133657
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      031a35ec
  2. 06 Sep, 2018 8 commits
    • cmd/compile: optimize math.Copysign on arm64 · 2e5c3251
      fanzha02 authored
      Add rewrite rules to optimize math.Copysign() when the second
      argument is a negative floating-point constant.
      
      For example, math.Copysign(c, -2): The previous compile output is
      "AND $9223372036854775807, R0, R0; ORR $-9223372036854775808, R0, R0".
      The optimized compile output is "ORR $-9223372036854775808, R0, R0".
      
      Math package benchmark results.
      name                   old time/op  new time/op  delta
      Copysign-8             2.61ns ± 2%  2.49ns ± 0%  -4.55%  (p=0.000 n=10+10)
      Cos-8                  43.0ns ± 0%  41.5ns ± 0%  -3.49%  (p=0.000 n=10+10)
      Cosh-8                 98.6ns ± 0%  98.1ns ± 0%  -0.51%  (p=0.000 n=10+10)
      ExpGo-8                 107ns ± 0%   105ns ± 0%  -1.87%  (p=0.000 n=10+10)
      Exp2Go-8                100ns ± 0%   100ns ± 0%  +0.39%  (p=0.000 n=10+8)
      Max-8                  6.56ns ± 2%  6.45ns ± 1%  -1.63%  (p=0.002 n=10+10)
      Min-8                  6.66ns ± 3%  6.47ns ± 2%  -2.82%  (p=0.006 n=10+10)
      Mod-8                   107ns ± 1%   104ns ± 1%  -2.72%  (p=0.000 n=10+10)
      Frexp-8                11.5ns ± 1%  11.0ns ± 0%  -4.56%  (p=0.000 n=8+10)
      HypotGo-8              19.4ns ± 0%  19.4ns ± 0%  +0.36%  (p=0.019 n=10+10)
      Ilogb-8                8.63ns ± 0%  8.51ns ± 0%  -1.36%  (p=0.000 n=10+10)
      Jn-8                    584ns ± 0%   585ns ± 0%  +0.17%  (p=0.000 n=7+8)
      Ldexp-8                13.8ns ± 0%  13.5ns ± 0%  -2.17%  (p=0.002 n=8+10)
      Logb-8                 10.2ns ± 0%   9.9ns ± 0%  -2.65%  (p=0.000 n=10+7)
      Nextafter64-8          7.54ns ± 0%  7.51ns ± 0%  -0.37%  (p=0.000 n=10+10)
      Remainder-8            73.5ns ± 1%  70.4ns ± 1%  -4.27%  (p=0.000 n=10+10)
      SqrtGoLatency-8        79.6ns ± 0%  76.2ns ± 0%  -4.30%  (p=0.000 n=9+10)
      Yn-8                    582ns ± 0%   579ns ± 0%  -0.52%  (p=0.000 n=10+10)
      
      Change-Id: I0c9cd1ea87435e7b8bab94b4e79e6e29785f25b1
      Reviewed-on: https://go-review.googlesource.com/132915
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      2e5c3251
    • cmd/internal/obj/arm64: add CONSTRAINED UNPREDICTABLE behavior check for some load/store · 7ab4b558
      fanzha02 authored
      According to the ARM64 manual, it is "constrained unpredictable behavior"
      if the src and dst registers of some load/store instructions are the same.
      To prevent such unpredictable behavior entirely, add a check in the
      assembler for the load/store instructions that it supports.
      
      Add test cases.
      
      Update #25823
      
      Change-Id: I64c14ad99ee543d778e7ec8ae6516a532293dbb3
      Reviewed-on: https://go-review.googlesource.com/120660
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      7ab4b558
    • bytes: remove bootstrap array from Buffer · 9c2be4c2
      Iskander Sharipov authored
      Rationale: small buffer optimization does not work and it has
      made things slower since 2014. Until we can make it work,
      we should prefer simpler code that also turns out to be more
      efficient.
      
      With this change, it's possible to use
      NewBuffer(make([]byte, 0, bootstrapSize)) to get the desired
      stack-allocated initial buffer since escape analysis can
      prove the created slice to be non-escaping.
      
      New implementation key points:
      
          - Zero value bytes.Buffer performs better than before
          - You can have a truly stack-allocated buffer, and it's not even limited to 64 bytes
          - The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly
          - Empty writes don't cause allocations
      
      Buffer benchmarks from bytes package:
      
          name                       old time/op    new time/op    delta
          ReadString-8                 9.20µs ± 1%    9.22µs ± 1%     ~     (p=0.148 n=10+10)
          WriteByte-8                  28.1µs ± 0%    26.2µs ± 0%   -6.78%  (p=0.000 n=10+10)
          WriteRune-8                  64.9µs ± 0%    65.0µs ± 0%   +0.16%  (p=0.000 n=10+10)
          BufferNotEmptyWriteRead-8     469µs ± 0%     461µs ± 0%   -1.76%  (p=0.000 n=9+10)
          BufferFullSmallReads-8        108µs ± 0%     108µs ± 0%   -0.21%  (p=0.000 n=10+10)
      
          name                       old speed      new speed      delta
          ReadString-8               3.56GB/s ± 1%  3.55GB/s ± 1%     ~     (p=0.165 n=10+10)
          WriteByte-8                 146MB/s ± 0%   156MB/s ± 0%   +7.26%  (p=0.000 n=9+10)
          WriteRune-8                 189MB/s ± 0%   189MB/s ± 0%   -0.16%  (p=0.000 n=10+10)
      
          name                       old alloc/op   new alloc/op   delta
          ReadString-8                 32.8kB ± 0%    32.8kB ± 0%     ~     (all equal)
          WriteByte-8                   0.00B          0.00B          ~     (all equal)
          WriteRune-8                   0.00B          0.00B          ~     (all equal)
          BufferNotEmptyWriteRead-8    4.72kB ± 0%    4.67kB ± 0%   -1.02%  (p=0.000 n=10+10)
          BufferFullSmallReads-8       3.44kB ± 0%    3.33kB ± 0%   -3.26%  (p=0.000 n=10+10)
      
          name                       old allocs/op  new allocs/op  delta
          ReadString-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
          WriteByte-8                    0.00           0.00          ~     (all equal)
          WriteRune-8                    0.00           0.00          ~     (all equal)
          BufferNotEmptyWriteRead-8      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
          BufferFullSmallReads-8         3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)
      
      The most notable thing in go1 benchmarks is reduced allocs in HTTPClientServer (-1 alloc):
      
          HTTPClientServer-8           64.0 ± 0%      63.0 ± 0%  -1.56%  (p=0.000 n=10+10)
      
      For more explanations and benchmarks see the referenced issue.
      
      Updates #7921
      
      Change-Id: Ica0bf85e1b70fb4f5dc4f6a61045e2cf4ef72aa3
      Reviewed-on: https://go-review.googlesource.com/133715
      Reviewed-by: Martin Möhrmann <moehrmann@google.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      9c2be4c2
    • net: ensure WriteTo on Windows sends even zero-byte payloads · 3e5b5d69
      Jake B authored
      This builds on:
      https://github.com/golang/go/pull/27445
      
      "...And then send change to fix windows internal/poll.FD.WriteTo - together with making TestUDPZeroBytePayload run again."
      - alexbrainman - https://github.com/golang/go/issues/26668#issuecomment-408657503
      
      Fixes #26668
      
      Change-Id: Icd9ecb07458f13e580b3e7163a5946ccec342509
      GitHub-Last-Rev: 3bf2b8b46bb8cf79903930631433a1f2ce50ec42
      GitHub-Pull-Request: golang/go#27446
      Reviewed-on: https://go-review.googlesource.com/132781
      Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
      3e5b5d69
    • encoding/json: recover saved error context when unmarshalling · 22afb357
      Ian Davis authored
      Fixes: #27464
      
      Change-Id: I270c56fd0d5ae8787a1293029aff3072f4f52f33
      Reviewed-on: https://go-review.googlesource.com/132955
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      22afb357
    • expvar: fix name of Var interface · 6b7099ca
      Warren Fernandes authored
      Change-Id: Ibc40237981fdd20316f73f7f6f3dfa918dd0af5d
      Reviewed-on: https://go-review.googlesource.com/133658
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      6b7099ca
    • fmt: add example for GoStringer interface · 262d4f32
      Warren Fernandes authored
      Updates golang/go#27376.
      
      Change-Id: Ia8608561eb6a268aa7eae8c39c7098df100b643a
      Reviewed-on: https://go-review.googlesource.com/133075
      Reviewed-by: Kevin Burke <kev@inburke.com>
      Run-TryBot: Kevin Burke <kev@inburke.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      262d4f32
    • cmd/compile: don't crash reporting misuse of shadowed built-in function · 4a095b87
      taylorza authored
      The existing implementation causes a compiler panic if a function parameter shadows a built-in function and that shadowed name is then called.
      
      Fixes #27356
      Change-Id: I1ffb6dc01e63c7f499e5f6f75f77ce2318f35bcd
      Reviewed-on: https://go-review.googlesource.com/132876
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      4a095b87
  3. 05 Sep, 2018 17 commits
    • text/template: simplify line tracking in the lexer · 98fd6680
      Daniel Martí authored
      First, move the strings.Count logic out of emit, since only itemText
      requires that. Use it in those call sites. itemLeftDelim and
      itemRightDelim cannot contain newlines, as they're the "{{" and "}}"
      tokens.
      
      Secondly, introduce a startLine lexer field so that we don't have to
      keep track of it elsewhere. That's also a requirement to move the
      strings.Count out of emit, as emit modifies the start position field.
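
      The arrangement described above can be sketched with a toy lexer. This is not the text/template source: the types, the lex function, and emitText are simplified stand-ins written for illustration, and only the idea (count newlines at the text call sites, keep a startLine field that emit copies forward) is taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// item is one lexed token together with the line on which it starts.
type item struct {
	val  string
	line int
}

// lexer is a toy stand-in for a template lexer.
type lexer struct {
	input      string
	start, pos int
	line       int // 1 + number of newlines consumed so far
	startLine  int // line on which the pending item starts
	items      []item
}

// emit records input[start:pos]. It does no newline counting.
func (l *lexer) emit() {
	l.items = append(l.items, item{l.input[l.start:l.pos], l.startLine})
	l.start = l.pos
	l.startLine = l.line
}

// emitText is used for tokens that may contain newlines.
func (l *lexer) emitText() {
	l.line += strings.Count(l.input[l.start:l.pos], "\n")
	l.emit()
}

// lex splits input into text, "{{", action body, and "}}" items.
func lex(input string) []item {
	l := &lexer{input: input, line: 1, startLine: 1}
	for l.pos < len(input) {
		i := strings.Index(input[l.pos:], "{{")
		if i < 0 {
			l.pos = len(input)
			l.emitText()
			break
		}
		l.pos += i
		if l.pos > l.start {
			l.emitText()
		}
		l.pos += 2
		l.emit() // "{{" cannot contain a newline
		j := strings.Index(input[l.pos:], "}}")
		if j < 0 {
			l.pos = len(input)
			l.emitText()
			break
		}
		l.pos += j
		if l.pos > l.start {
			l.emitText()
		}
		l.pos += 2
		l.emit() // "}}" cannot contain a newline
	}
	return l.items
}

func main() {
	for _, it := range lex("a\nb{{.X}}\nc") {
		fmt.Printf("line %d: %q\n", it.line, it.val)
	}
}
```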
      
      Change-Id: I69175f403487607a8e5b561b3f1916ee9dc3c0c6
      Reviewed-on: https://go-review.googlesource.com/132275
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rob Pike <r@golang.org>
      98fd6680
    • cmd/compile: regenerate known formats for TestFormats · 2524ed19
      Michael Munday authored
      The formatting verb '%#x' was used for uint32 values in CL 132956.
      This fixes TestFormats.
      
      Change-Id: I3ab6519bde2cb74410fdca14829689cb46bf7022
      Reviewed-on: https://go-review.googlesource.com/133595
      Run-TryBot: Michael Munday <mike.munday@ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
      2524ed19
    • go/types: fix internal comments and add additional test case · 3c1b7bc7
      Robert Griesemer authored
      https://go-review.googlesource.com/c/go/+/132355 addressed
      a crash and inadvertently fixed #27346; however the comment
      added to the type-checker was incorrect and misleading.
      
      This CL fixes the comment, and adds a test case for #27346.
      
      Fixes #27346.
      Updates #22467.
      
      Change-Id: Ib6d5caedf302fd42929c4dacc55e973c1aebfe85
      Reviewed-on: https://go-review.googlesource.com/133415
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rebecca Stambler <rstambler@golang.org>
      3c1b7bc7
    • cmd/compile: fix store-to-load forwarding of 32-bit sNaNs · 48af3a8b
      Michael Munday authored
      Signalling NaNs were being converted to quiet NaNs during constant
      propagation through integer <-> float store-to-load forwarding.
      This occurs because we store float32 constants as float64
      values and CPU hardware 'quietens' NaNs during conversion between
      the two.
      
      Eventually we want to move to using float32 values to store float32
      constants, however this will be a big change since both the compiler
      and the assembler expect float64 values. So for now this is a small
      change that will fix the immediate issue.
      
      Fixes #27193.
      
      Change-Id: Iac54bd8c13abe26f9396712bc71f9b396f842724
      Reviewed-on: https://go-review.googlesource.com/132956
      Run-TryBot: Michael Munday <mike.munday@ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      48af3a8b
    • cmd/compile: optimize ARM's comparison · 067dfce2
      Ben Shi authored
      Optimizing (CMPconst [0] (ADD x y)) to (CMN x y) is only beneficial
      when the result of the addition is not used again; otherwise
      performance might even drop. This CL fixes that issue for
      CMP/CMN/TST/TEQ.
      
      There is little regression in the go1 benchmark (excluding noise),
      and the test case JSONDecode-4 even improves.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              21.6s ± 1%     21.6s ± 0%  -0.22%  (p=0.013 n=30+30)
      Fannkuch11-4                11.1s ± 0%     11.1s ± 0%  +0.11%  (p=0.000 n=30+29)
      FmtFprintfEmpty-4           297ns ± 0%     297ns ± 0%  +0.08%  (p=0.007 n=26+28)
      FmtFprintfString-4          589ns ± 1%     589ns ± 0%    ~     (p=0.659 n=30+25)
      FmtFprintfInt-4             644ns ± 1%     650ns ± 0%  +0.88%  (p=0.000 n=30+24)
      FmtFprintfIntInt-4          964ns ± 0%     977ns ± 0%  +1.33%  (p=0.000 n=30+30)
      FmtFprintfPrefixedInt-4    1.06µs ± 0%    1.07µs ± 0%  +1.31%  (p=0.000 n=29+27)
      FmtFprintfFloat-4          1.89µs ± 0%    1.92µs ± 0%  +1.25%  (p=0.000 n=29+29)
      FmtManyArgs-4              3.63µs ± 0%    3.67µs ± 0%  +1.33%  (p=0.000 n=29+27)
      GobDecode-4                38.1ms ± 1%    37.9ms ± 1%  -0.60%  (p=0.000 n=29+29)
      GobEncode-4                35.3ms ± 2%    35.2ms ± 1%    ~     (p=0.286 n=30+30)
      Gzip-4                      2.36s ± 0%     2.37s ± 2%    ~     (p=0.277 n=24+28)
      Gunzip-4                    264ms ± 1%     264ms ± 1%    ~     (p=0.104 n=28+30)
      HTTPClientServer-4         1.04ms ± 4%    1.02ms ± 4%  -1.65%  (p=0.000 n=28+28)
      JSONEncode-4               78.5ms ± 1%    79.6ms ± 1%  +1.34%  (p=0.000 n=27+28)
      JSONDecode-4                379ms ± 4%     352ms ± 5%  -7.09%  (p=0.000 n=29+30)
      Mandelbrot200-4            17.6ms ± 0%    17.6ms ± 0%    ~     (p=0.206 n=28+29)
      GoParse-4                  21.9ms ± 1%    22.1ms ± 1%  +0.87%  (p=0.000 n=28+26)
      RegexpMatchEasy0_32-4       631ns ± 0%     641ns ± 0%  +1.63%  (p=0.000 n=29+30)
      RegexpMatchEasy0_1K-4      4.11µs ± 0%    4.11µs ± 0%    ~     (p=0.700 n=30+30)
      RegexpMatchEasy1_32-4       670ns ± 0%     679ns ± 0%  +1.37%  (p=0.000 n=21+30)
      RegexpMatchEasy1_1K-4      5.31µs ± 0%    5.26µs ± 0%  -1.03%  (p=0.000 n=25+28)
      RegexpMatchMedium_32-4      905ns ± 0%     906ns ± 0%  +0.14%  (p=0.001 n=30+30)
      RegexpMatchMedium_1K-4      192µs ± 0%     191µs ± 0%  -0.45%  (p=0.000 n=29+27)
      RegexpMatchHard_32-4       11.8µs ± 0%    11.7µs ± 0%  -0.39%  (p=0.000 n=29+28)
      RegexpMatchHard_1K-4        347µs ± 0%     347µs ± 0%    ~     (p=0.084 n=29+30)
      Revcomp-4                  37.5ms ± 1%    37.5ms ± 1%    ~     (p=0.279 n=29+29)
      Template-4                  519ms ± 2%     519ms ± 2%    ~     (p=0.652 n=28+29)
      TimeParse-4                2.83µs ± 0%    2.78µs ± 0%  -1.90%  (p=0.000 n=27+28)
      TimeFormat-4               5.79µs ± 0%    5.60µs ± 0%  -3.23%  (p=0.000 n=29+29)
      [Geo mean]                  331µs          330µs       -0.16%
      
      name                     old speed      new speed      delta
      GobDecode-4              20.1MB/s ± 1%  20.3MB/s ± 1%  +0.61%  (p=0.000 n=29+29)
      GobEncode-4              21.7MB/s ± 2%  21.8MB/s ± 1%    ~     (p=0.294 n=30+30)
      Gzip-4                   8.23MB/s ± 1%  8.20MB/s ± 2%    ~     (p=0.099 n=26+28)
      Gunzip-4                 73.5MB/s ± 1%  73.4MB/s ± 1%    ~     (p=0.107 n=28+30)
      JSONEncode-4             24.7MB/s ± 1%  24.4MB/s ± 1%  -1.32%  (p=0.000 n=27+28)
      JSONDecode-4             5.13MB/s ± 4%  5.52MB/s ± 5%  +7.65%  (p=0.000 n=29+30)
      GoParse-4                2.65MB/s ± 1%  2.63MB/s ± 1%  -0.87%  (p=0.000 n=28+26)
      RegexpMatchEasy0_32-4    50.7MB/s ± 0%  49.9MB/s ± 0%  -1.58%  (p=0.000 n=29+29)
      RegexpMatchEasy0_1K-4     249MB/s ± 0%   249MB/s ± 0%    ~     (p=0.342 n=30+28)
      RegexpMatchEasy1_32-4    47.7MB/s ± 0%  47.1MB/s ± 0%  -1.39%  (p=0.000 n=26+30)
      RegexpMatchEasy1_1K-4     193MB/s ± 0%   195MB/s ± 0%  +1.04%  (p=0.000 n=25+28)
      RegexpMatchMedium_32-4   1.10MB/s ± 0%  1.10MB/s ± 0%  -0.42%  (p=0.000 n=30+26)
      RegexpMatchMedium_1K-4   5.33MB/s ± 0%  5.36MB/s ± 0%  +0.43%  (p=0.000 n=29+29)
      RegexpMatchHard_32-4     2.72MB/s ± 0%  2.73MB/s ± 0%  +0.37%  (p=0.000 n=29+30)
      RegexpMatchHard_1K-4     2.95MB/s ± 0%  2.95MB/s ± 0%    ~     (all equal)
      Revcomp-4                67.8MB/s ± 1%  67.7MB/s ± 1%    ~     (p=0.273 n=29+29)
      Template-4               3.74MB/s ± 2%  3.74MB/s ± 2%    ~     (p=0.665 n=28+29)
      [Geo mean]               15.2MB/s       15.2MB/s       +0.21%
      
      Change-Id: Ifed1fb8cc02d5ca52c8bc6c21b6b5bf6dbb2701a
      Reviewed-on: https://go-review.googlesource.com/132115
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      067dfce2
    • cmd/internal/obj/arm64: encode float constants into FMOVS/FMOVD instructions · c430adf1
      fanzha02 authored
      The current assembler rewrites float constants other than 0.0 to
      values stored in memory, which is not performant. This patch uses the
      FMOVS/FMOVD instructions to move the floating-point immediate constants
      that they can encode into SIMD&FP destination registers. The chipfloat7()
      function checks whether a constant can be encoded this way.
      
      go1 benchmark results.
      name                     old time/op    new time/op    delta
      BinaryTree17-8              6.27s ± 1%     6.27s ± 1%    ~     (p=0.762 n=10+8)
      Fannkuch11-8                5.42s ± 1%     5.38s ± 0%  -0.63%  (p=0.000 n=10+10)
      FmtFprintfEmpty-8          92.9ns ± 1%    93.4ns ± 0%  +0.47%  (p=0.004 n=9+8)
      FmtFprintfString-8          169ns ± 2%     170ns ± 4%    ~     (p=0.378 n=10+10)
      FmtFprintfInt-8             197ns ± 1%     196ns ± 1%  -0.77%  (p=0.009 n=10+9)
      FmtFprintfIntInt-8          284ns ± 1%     286ns ± 1%    ~     (p=0.051 n=10+10)
      FmtFprintfPrefixedInt-8     419ns ± 0%     422ns ± 1%  +0.69%  (p=0.038 n=6+10)
      FmtFprintfFloat-8           458ns ± 0%     463ns ± 1%  +1.14%  (p=0.000 n=10+10)
      FmtManyArgs-8              1.35µs ± 2%    1.36µs ± 1%  +0.91%  (p=0.043 n=10+10)
      GobDecode-8                16.0ms ± 2%    15.5ms ± 1%   -3.39%  (p=0.000 n=10+10)
      GobEncode-8                11.9ms ± 3%    11.4ms ± 1%   -3.98%  (p=0.000 n=10+9)
      Gzip-8                      621ms ± 0%     625ms ± 0%   +0.59%  (p=0.000 n=9+10)
      Gunzip-8                   74.0ms ± 1%    74.3ms ± 0%     ~     (p=0.059 n=9+8)
      HTTPClientServer-8          116µs ± 1%     116µs ± 1%     ~     (p=0.165 n=10+10)
      JSONEncode-8               29.3ms ± 1%    29.5ms ± 0%   +0.72%  (p=0.001 n=10+10)
      JSONDecode-8                145ms ± 1%     148ms ± 2%   +2.06%  (p=0.000 n=10+10)
      Mandelbrot200-8            9.67ms ± 0%    9.48ms ± 1%   -1.92%  (p=0.000 n=8+10)
      GoParse-8                  7.55ms ± 0%    7.60ms ± 0%   +0.57%  (p=0.000 n=9+10)
      RegexpMatchEasy0_32-8       234ns ± 0%     210ns ± 0%  -10.13%  (p=0.000 n=8+10)
      RegexpMatchEasy0_1K-8       753ns ± 1%     729ns ± 0%   -3.17%  (p=0.000 n=10+8)
      RegexpMatchEasy1_32-8       225ns ± 0%     224ns ± 0%   -0.44%  (p=0.000 n=9+9)
      RegexpMatchEasy1_1K-8      1.03µs ± 0%    1.04µs ± 1%   +1.29%  (p=0.000 n=10+10)
      RegexpMatchMedium_32-8      320ns ± 3%     296ns ± 6%   -7.50%  (p=0.000 n=10+10)
      RegexpMatchMedium_1K-8     77.0µs ± 5%    73.6µs ± 1%     ~     (p=0.393 n=10+10)
      RegexpMatchHard_32-8       3.93µs ± 0%    3.89µs ± 1%   -0.95%  (p=0.000 n=10+9)
      RegexpMatchHard_1K-8        120µs ± 5%     115µs ± 1%     ~     (p=0.739 n=10+10)
      Revcomp-8                   1.07s ± 0%     1.08s ± 1%   +0.63%  (p=0.000 n=10+9)
      Template-8                  165ms ± 1%     163ms ± 1%   -1.05%  (p=0.001 n=8+10)
      TimeParse-8                 751ns ± 1%     749ns ± 1%     ~     (p=0.209 n=10+10)
      TimeFormat-8                759ns ± 1%     751ns ± 1%   -0.96%  (p=0.001 n=10+10)
      
      name                     old speed      new speed      delta
      GobDecode-8              48.0MB/s ± 2%  49.6MB/s ± 1%   +3.50%  (p=0.000 n=10+10)
      GobEncode-8              64.5MB/s ± 3%  67.1MB/s ± 1%   +4.08%  (p=0.000 n=10+9)
      Gzip-8                   31.2MB/s ± 0%  31.1MB/s ± 0%   -0.55%  (p=0.000 n=9+8)
      Gunzip-8                  262MB/s ± 1%   261MB/s ± 0%     ~     (p=0.059 n=9+8)
      JSONEncode-8             66.3MB/s ± 1%  65.8MB/s ± 0%   -0.72%  (p=0.001 n=10+10)
      JSONDecode-8             13.4MB/s ± 1%  13.2MB/s ± 1%   -2.02%  (p=0.000 n=10+10)
      GoParse-8                7.67MB/s ± 0%  7.63MB/s ± 0%   -0.57%  (p=0.000 n=9+10)
      RegexpMatchEasy0_32-8     136MB/s ± 0%   152MB/s ± 0%  +11.45%  (p=0.000 n=10+10)
      RegexpMatchEasy0_1K-8    1.36GB/s ± 1%  1.40GB/s ± 0%   +3.25%  (p=0.000 n=10+8)
      RegexpMatchEasy1_32-8     142MB/s ± 0%   143MB/s ± 0%   +0.35%  (p=0.000 n=10+9)
      RegexpMatchEasy1_1K-8     992MB/s ± 0%   980MB/s ± 1%   -1.27%  (p=0.000 n=10+10)
      RegexpMatchMedium_32-8   3.12MB/s ± 3%  3.38MB/s ± 6%   +8.17%  (p=0.000 n=10+10)
      RegexpMatchMedium_1K-8   13.3MB/s ± 5%  13.9MB/s ± 1%     ~     (p=0.362 n=10+10)
      RegexpMatchHard_32-8     8.14MB/s ± 0%  8.21MB/s ± 1%   +0.95%  (p=0.000 n=10+9)
      RegexpMatchHard_1K-8     8.54MB/s ± 5%  8.90MB/s ± 1%     ~     (p=0.636 n=10+10)
      Revcomp-8                 238MB/s ± 0%   236MB/s ± 1%   -0.63%  (p=0.000 n=10+9)
      Template-8               11.8MB/s ± 1%  11.9MB/s ± 1%   +1.07%  (p=0.001 n=8+10)
      
      Change-Id: I57b372d8dcd47e6aec39893843b20385d5d9c37e
      Reviewed-on: https://go-review.googlesource.com/129555
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      c430adf1
    • Brad Fitzpatrick's avatar
      net: don't block forever in splice test cleanup on failure · 81957dd5
      Brad Fitzpatrick authored
      The ppc64x builders are failing on the new splice test from CL 113997
      but the actual failure is being obscured by a test deadlock.
      
      Change-Id: I7747f88bcdba9776a3c0d2f5066cfec572706108
      Reviewed-on: https://go-review.googlesource.com/133417
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
      81957dd5
    • Tobias Klauser's avatar
      net: skip splice unix-to-tcp tests on android · 5789f838
      Tobias Klauser authored
      The android builders are failing on the AF_UNIX part of the new splice
      test from CL 113997. Skip them.
      
      Change-Id: Ia0519aae922acb11d2845aa687633935bcd4b1b0
      Reviewed-on: https://go-review.googlesource.com/133515
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      5789f838
    • Iskander Sharipov's avatar
      cmd/compile/internal/gc: fix mayAffectMemory in esc.go · 4cf33e36
      Iskander Sharipov authored
      For OINDEX and other Left+Right nodes, we want the whole
      node to be considered as "may affect memory" if either
      of Left or Right affect memory. Initial implementation
      only considered node as such if both Left and Right were non-safe.
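
      The difference between the two conditions can be sketched with a toy
      expression tree. The node type and its safe field are hypothetical
      stand-ins for illustration, not the compiler's Node type or the real
      mayAffectMemory code:

```go
package main

import "fmt"

// node is a hypothetical, simplified expression node (not the compiler's Node).
type node struct {
	safe        bool  // leaf only: true if evaluating it cannot affect memory
	left, right *node // both non-nil for binary nodes such as an index expression
}

// mayAffectMemoryBuggy mirrors the initial behaviour described above:
// a binary node is flagged only if BOTH operands are non-safe.
func mayAffectMemoryBuggy(n *node) bool {
	if n.left == nil {
		return !n.safe
	}
	return mayAffectMemoryBuggy(n.left) && mayAffectMemoryBuggy(n.right)
}

// mayAffectMemoryFixed flags a binary node if EITHER operand is non-safe.
func mayAffectMemoryFixed(n *node) bool {
	if n.left == nil {
		return !n.safe
	}
	return mayAffectMemoryFixed(n.left) || mayAffectMemoryFixed(n.right)
}

func main() {
	// Models something like a[f()]: safe array operand, non-safe index operand.
	idx := &node{left: &node{safe: true}, right: &node{safe: false}}
	fmt.Println(mayAffectMemoryBuggy(idx), mayAffectMemoryFixed(idx)) // false true
}
```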
      
      Change-Id: Icfb965a0b4c24d8f83f3722216db068dad2eba95
      Reviewed-on: https://go-review.googlesource.com/133275
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      4cf33e36
    • Alessandro Arzilli's avatar
      misc/cgo/testplugin: disable DWARF tests on darwin · 3fd36498
      Alessandro Arzilli authored
      For some reason on darwin the linker still can't add debug sections to
      plugins. Executables importing "plugin" do have them, however.
      
      Because of issue 25841, plugins on darwin would likely have bad debug
      info anyway so, for now, this isn't a great loss.
      
      This disables the check for debug sections in plugins for darwin only.
      
      Updates #27502
      
      Change-Id: Ib8f62dac1e485006b0c2b3ba04f86d733db5ee9a
      Reviewed-on: https://go-review.googlesource.com/133435
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      3fd36498
    • Iskander Sharipov's avatar
      test: remove go:noinline from escape_because.go · bcf3e063
      Iskander Sharipov authored
      File is compiled with "-l" flag, so go:noinline is redundant.
      
      Change-Id: Ia269f3b9de9466857fc578ba5164613393e82369
      Reviewed-on: https://go-review.googlesource.com/133295
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      bcf3e063
    • Milan Knezevic's avatar
      doc: add GOMIPS64 to source installation docs · b88e4ad6
      Milan Knezevic authored
      Fixes #27258
      
      Change-Id: I1ac75087e2b811e6479990e12d71f2c1f4f47b64
      Reviewed-on: https://go-review.googlesource.com/132015
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      b88e4ad6
    • Tobias Klauser's avatar
      syscall: correct argument order for SyncFileRange syscall on linux/ppc64{,le} · eee1cfb0
      Tobias Klauser authored
      On linux/ppc64{,le} the SYS_SYNC_FILE_RANGE2 syscall is used to
      implement SyncFileRange. This syscall has a different argument order
      than SYS_SYNC_FILE_RANGE. Apart from that the implementations of both
      syscalls are the same, so use a simple wrapper to invoke the syscall
      with the correct argument order.
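
      A minimal sketch of such a wrapper follows. The lower-case
      syncFileRange2 function is a hypothetical stub that only records its
      arguments; it stands in for the raw system call and is not the code
      from the syscall package:

```go
package main

import "fmt"

// lastCall records the arguments in the order the stub received them.
var lastCall [4]int64

// syncFileRange2 is a hypothetical stand-in for the raw SYS_SYNC_FILE_RANGE2
// call, whose argument order is (fd, flags, offset, nbytes). It does no I/O.
func syncFileRange2(fd int, flags int, off int64, n int64) error {
	lastCall = [4]int64{int64(fd), int64(flags), off, n}
	return nil
}

// SyncFileRange keeps the (fd, offset, nbytes, flags) order used by
// SYS_SYNC_FILE_RANGE and reorders the arguments for the "2" variant.
func SyncFileRange(fd int, off int64, n int64, flags int) error {
	return syncFileRange2(fd, flags, off, n)
}

func main() {
	_ = SyncFileRange(3, 4096, 8192, 2)
	fmt.Println(lastCall) // [3 2 4096 8192]
}
```

      Callers keep one signature on every architecture; only the wrapper
      knows that flags moves to the second position.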
      
      For context see:
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edd5cd4a9424f22b0fa08bef5e299d41befd5622
      
      Updates #27485
      
      Change-Id: Ib94fb98376bf6c879df6f1b68c3bdd11ebcb5a44
      Reviewed-on: https://go-review.googlesource.com/133195
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      eee1cfb0
    • Tobias Klauser's avatar
      test: fix nilptr3 check for wasm · d7fc2205
      Tobias Klauser authored
      CL 131735 only updated nilptr3.go for the adjusted nil check. Adjust
      nilptr3_wasm.go as well.
      
      Change-Id: I4a6257d32bb212666fe768dac53901ea0b051138
      Reviewed-on: https://go-review.googlesource.com/133495
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d7fc2205
    • Ben Burkert's avatar
      net: use splice(2) on Linux when reading from UnixConn, rework splice tests · fc5edaca
      Ben Burkert authored
      Rework the splice tests and benchmarks. Move the reading and writing of
      the spliced connections to child processes so that the I/O is not part
      of benchmarks or profiles.
      
      Enable the use of splice(2) when reading from a unix connection and
      writing to a TCP connection. The updated benchmarks show a performance
      gain when using splice(2) to copy large chunks of data that the original
      benchmark did not capture.
      
        name                          old time/op    new time/op    delta
        Splice/tcp-to-tcp/1024-8        5.01µs ± 2%    5.08µs ± 3%      ~     (p=0.068 n=8+10)
        Splice/tcp-to-tcp/2048-8        4.76µs ± 5%    4.65µs ± 3%    -2.36%  (p=0.015 n=9+8)
        Splice/tcp-to-tcp/4096-8        4.91µs ± 2%    4.98µs ± 5%      ~     (p=0.315 n=9+10)
        Splice/tcp-to-tcp/8192-8        5.50µs ± 4%    5.44µs ± 3%      ~     (p=0.758 n=7+9)
        Splice/tcp-to-tcp/16384-8       7.65µs ± 7%    6.53µs ± 3%   -14.65%  (p=0.000 n=10+9)
        Splice/tcp-to-tcp/32768-8       15.3µs ± 7%     8.5µs ± 5%   -44.21%  (p=0.000 n=10+10)
        Splice/tcp-to-tcp/65536-8       30.0µs ± 6%    15.7µs ± 1%   -47.58%  (p=0.000 n=10+8)
        Splice/tcp-to-tcp/131072-8      59.2µs ± 2%    27.4µs ± 5%   -53.75%  (p=0.000 n=9+9)
        Splice/tcp-to-tcp/262144-8       121µs ± 4%      54µs ±19%   -55.56%  (p=0.000 n=9+10)
        Splice/tcp-to-tcp/524288-8       247µs ± 6%     108µs ±12%   -56.34%  (p=0.000 n=10+10)
        Splice/tcp-to-tcp/1048576-8      490µs ± 4%     199µs ±12%   -59.31%  (p=0.000 n=8+10)
        Splice/unix-to-tcp/1024-8       1.20µs ± 2%    1.35µs ± 7%   +12.47%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/2048-8       1.33µs ±12%    1.57µs ± 4%   +17.85%  (p=0.000 n=9+10)
        Splice/unix-to-tcp/4096-8       2.24µs ± 4%    1.67µs ± 4%   -25.14%  (p=0.000 n=9+10)
        Splice/unix-to-tcp/8192-8       4.59µs ± 8%    2.20µs ±10%   -52.01%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/16384-8      8.46µs ±13%    3.48µs ± 6%   -58.91%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/32768-8      18.5µs ± 9%     6.1µs ± 9%   -66.99%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/65536-8      35.9µs ± 7%    13.5µs ± 6%   -62.40%  (p=0.000 n=10+9)
        Splice/unix-to-tcp/131072-8     79.4µs ± 6%    25.7µs ± 4%   -67.62%  (p=0.000 n=10+9)
        Splice/unix-to-tcp/262144-8      157µs ± 4%      54µs ± 8%   -65.63%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/524288-8      311µs ± 3%     107µs ± 8%   -65.74%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/1048576-8     643µs ± 4%     185µs ±32%   -71.21%  (p=0.000 n=10+10)
      
        name                          old speed      new speed      delta
        Splice/tcp-to-tcp/1024-8       204MB/s ± 2%   202MB/s ± 3%      ~     (p=0.068 n=8+10)
        Splice/tcp-to-tcp/2048-8       430MB/s ± 5%   441MB/s ± 3%    +2.39%  (p=0.014 n=9+8)
        Splice/tcp-to-tcp/4096-8       833MB/s ± 2%   823MB/s ± 5%      ~     (p=0.315 n=9+10)
        Splice/tcp-to-tcp/8192-8      1.49GB/s ± 4%  1.51GB/s ± 3%      ~     (p=0.758 n=7+9)
        Splice/tcp-to-tcp/16384-8     2.14GB/s ± 7%  2.51GB/s ± 3%   +17.03%  (p=0.000 n=10+9)
        Splice/tcp-to-tcp/32768-8     2.15GB/s ± 7%  3.85GB/s ± 5%   +79.11%  (p=0.000 n=10+10)
        Splice/tcp-to-tcp/65536-8     2.19GB/s ± 5%  4.17GB/s ± 1%   +90.65%  (p=0.000 n=10+8)
        Splice/tcp-to-tcp/131072-8    2.22GB/s ± 2%  4.79GB/s ± 4%  +116.26%  (p=0.000 n=9+9)
        Splice/tcp-to-tcp/262144-8    2.17GB/s ± 4%  4.93GB/s ±17%  +127.25%  (p=0.000 n=9+10)
        Splice/tcp-to-tcp/524288-8    2.13GB/s ± 6%  4.89GB/s ±13%  +130.15%  (p=0.000 n=10+10)
        Splice/tcp-to-tcp/1048576-8   2.09GB/s ±10%  5.29GB/s ±11%  +153.36%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/1024-8      850MB/s ± 2%   757MB/s ± 7%   -10.94%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/2048-8     1.54GB/s ±11%  1.31GB/s ± 3%   -15.32%  (p=0.000 n=9+10)
        Splice/unix-to-tcp/4096-8     1.83GB/s ± 4%  2.45GB/s ± 4%   +33.59%  (p=0.000 n=9+10)
        Splice/unix-to-tcp/8192-8     1.79GB/s ± 9%  3.73GB/s ± 9%  +108.05%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/16384-8    1.95GB/s ±13%  4.68GB/s ± 3%  +139.80%  (p=0.000 n=10+9)
        Splice/unix-to-tcp/32768-8    1.78GB/s ± 9%  5.38GB/s ±10%  +202.71%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/65536-8    1.83GB/s ± 8%  4.85GB/s ± 6%  +165.70%  (p=0.000 n=10+9)
        Splice/unix-to-tcp/131072-8   1.65GB/s ± 6%  5.10GB/s ± 4%  +208.77%  (p=0.000 n=10+9)
        Splice/unix-to-tcp/262144-8   1.67GB/s ± 4%  4.87GB/s ± 7%  +191.19%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/524288-8   1.69GB/s ± 3%  4.93GB/s ± 7%  +192.38%  (p=0.000 n=10+10)
        Splice/unix-to-tcp/1048576-8  1.63GB/s ± 3%  5.60GB/s ±44%  +243.26%  (p=0.000 n=10+9)
      
      Change-Id: I1eae4c3459c918558c70fc42283db22ff7e0442c
      Reviewed-on: https://go-review.googlesource.com/113997
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      fc5edaca
    • Michael Munday's avatar
      cmd/compile: make math/bits.RotateLeft{32,64} intrinsics on s390x · f94de9c9
      Michael Munday authored
      Extends CL 132435 to s390x. s390x has 32- and 64-bit variable
      rotate left instructions.
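
      For reference, these are the two math/bits functions in question. The
      rotl32 and rotl64 wrappers are illustrative names only; whether a call
      compiles to a single rotate instruction depends on the target and
      compiler version:

```go
package main

import (
	"fmt"
	"math/bits"
)

// rotl32 rotates x left by k bits; a negative k rotates right.
func rotl32(x uint32, k int) uint32 { return bits.RotateLeft32(x, k) }

// rotl64 is the 64-bit counterpart.
func rotl64(x uint64, k int) uint64 { return bits.RotateLeft64(x, k) }

func main() {
	fmt.Printf("%#x\n", rotl32(0x80000001, 1)) // 0x3
	fmt.Printf("%#x\n", rotl64(1, -1))         // 0x8000000000000000
}
```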
      
      Change-Id: Ic4f1ebb0e0543207ed2fc8c119e0163b428138a5
      Reviewed-on: https://go-review.googlesource.com/133035
      Run-TryBot: Michael Munday <mike.munday@ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      f94de9c9
    • Ben Shi's avatar
      cmd/compile: optimize arm64's comparison · 0e9f1de0
      Ben Shi authored
      Add more optimizations using TST/CMN.
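
      On arm64, TST sets the condition flags from a bitwise AND and CMN sets
      them from an addition, in both cases without keeping the result. The
      Go shapes below are the kind of source such instructions apply to; the
      function names are illustrative, and whether the compiler picks
      TST/CMN for a given expression is an assumption, not something this
      sketch verifies:

```go
package main

import "fmt"

// noCommonBits compares a&b against zero; only the flags of the AND are needed.
func noCommonBits(a, b int64) bool { return a&b == 0 }

// sumIsZero compares a+b against zero; only the flags of the addition are needed.
func sumIsZero(a, b int64) bool { return a+b == 0 }

func main() {
	fmt.Println(noCommonBits(0b1010, 0b0101)) // true
	fmt.Println(sumIsZero(7, -7))             // true
}
```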
      
      1. A tiny benchmark shows more than 12% improvement.
      TSTCMN-4                    378µs ± 0%     332µs ± 0%  -12.15%  (p=0.000 n=30+27)
      (https://github.com/benshi001/ugo1/blob/master/tstcmn_test.go)
      
      2. There is little regression in the go1 benchmark, excluding noise.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              19.1s ± 0%     19.1s ± 0%    ~     (p=0.994 n=28+29)
      Fannkuch11-4                10.0s ± 0%     10.0s ± 0%    ~     (p=0.198 n=30+25)
      FmtFprintfEmpty-4           233ns ± 0%     233ns ± 0%  +0.14%  (p=0.002 n=24+30)
      FmtFprintfString-4          428ns ± 0%     428ns ± 0%    ~     (all equal)
      FmtFprintfInt-4             472ns ± 0%     472ns ± 0%    ~     (all equal)
      FmtFprintfIntInt-4          725ns ± 0%     725ns ± 0%    ~     (all equal)
      FmtFprintfPrefixedInt-4     889ns ± 0%     888ns ± 0%    ~     (p=0.632 n=28+30)
      FmtFprintfFloat-4          1.20µs ± 0%    1.20µs ± 0%  +0.05%  (p=0.001 n=18+30)
      FmtManyArgs-4              3.00µs ± 0%    2.99µs ± 0%  -0.07%  (p=0.001 n=27+30)
      GobDecode-4                42.1ms ± 0%    42.2ms ± 0%  +0.29%  (p=0.000 n=28+28)
      GobEncode-4                38.6ms ± 9%    38.8ms ± 9%    ~     (p=0.912 n=30+30)
      Gzip-4                      2.07s ± 1%     2.05s ± 1%  -0.64%  (p=0.000 n=29+30)
      Gunzip-4                    175ms ± 0%     175ms ± 0%  -0.15%  (p=0.001 n=30+30)
      HTTPClientServer-4          872µs ± 5%     880µs ± 6%    ~     (p=0.196 n=30+29)
      JSONEncode-4               88.5ms ± 1%    89.8ms ± 1%  +1.49%  (p=0.000 n=23+24)
      JSONDecode-4                393ms ± 1%     390ms ± 1%  -0.89%  (p=0.000 n=28+30)
      Mandelbrot200-4            19.5ms ± 0%    19.5ms ± 0%    ~     (p=0.405 n=29+28)
      GoParse-4                  19.9ms ± 0%    20.0ms ± 0%  +0.27%  (p=0.000 n=30+30)
      RegexpMatchEasy0_32-4       431ns ± 0%     431ns ± 0%    ~     (p=1.000 n=30+30)
      RegexpMatchEasy0_1K-4      1.61µs ± 0%    1.61µs ± 0%    ~     (p=0.527 n=26+26)
      RegexpMatchEasy1_32-4       443ns ± 0%     443ns ± 0%    ~     (all equal)
      RegexpMatchEasy1_1K-4      2.58µs ± 1%    2.58µs ± 1%    ~     (p=0.578 n=27+25)
      RegexpMatchMedium_32-4      740ns ± 0%     740ns ± 0%    ~     (p=0.357 n=30+30)
      RegexpMatchMedium_1K-4      223µs ± 0%     223µs ± 0%  +0.16%  (p=0.000 n=30+29)
      RegexpMatchHard_32-4       12.3µs ± 0%    12.3µs ± 0%    ~     (p=0.236 n=27+27)
      RegexpMatchHard_1K-4        371µs ± 0%     371µs ± 0%  +0.09%  (p=0.000 n=30+27)
      Revcomp-4                   2.85s ± 0%     2.85s ± 0%    ~     (p=0.057 n=28+25)
      Template-4                  408ms ± 1%     409ms ± 1%    ~     (p=0.117 n=29+29)
      TimeParse-4                1.93µs ± 0%    1.93µs ± 0%    ~     (p=0.535 n=29+28)
      TimeFormat-4               1.99µs ± 0%    1.99µs ± 0%    ~     (p=0.168 n=29+28)
      [Geo mean]                  306µs          307µs       +0.07%
      
      name                     old speed      new speed      delta
      GobDecode-4              18.3MB/s ± 0%  18.2MB/s ± 0%  -0.31%  (p=0.000 n=28+29)
      GobEncode-4              19.9MB/s ± 8%  19.8MB/s ± 9%    ~     (p=0.923 n=30+30)
      Gzip-4                   9.39MB/s ± 1%  9.45MB/s ± 1%  +0.65%  (p=0.000 n=29+30)
      Gunzip-4                  111MB/s ± 0%   111MB/s ± 0%  +0.15%  (p=0.001 n=30+30)
      JSONEncode-4             21.9MB/s ± 1%  21.6MB/s ± 1%  -1.45%  (p=0.000 n=23+23)
      JSONDecode-4             4.94MB/s ± 1%  4.98MB/s ± 1%  +0.84%  (p=0.000 n=27+30)
      GoParse-4                2.91MB/s ± 0%  2.90MB/s ± 0%  -0.34%  (p=0.000 n=21+22)
      RegexpMatchEasy0_32-4    74.1MB/s ± 0%  74.1MB/s ± 0%    ~     (p=0.469 n=29+28)
      RegexpMatchEasy0_1K-4     634MB/s ± 0%   634MB/s ± 0%    ~     (p=0.978 n=24+28)
      RegexpMatchEasy1_32-4    72.2MB/s ± 0%  72.2MB/s ± 0%    ~     (p=0.064 n=27+29)
      RegexpMatchEasy1_1K-4     396MB/s ± 1%   396MB/s ± 1%    ~     (p=0.583 n=27+25)
      RegexpMatchMedium_32-4   1.35MB/s ± 0%  1.35MB/s ± 0%    ~     (all equal)
      RegexpMatchMedium_1K-4   4.60MB/s ± 0%  4.59MB/s ± 0%  -0.14%  (p=0.000 n=30+26)
      RegexpMatchHard_32-4     2.61MB/s ± 0%  2.61MB/s ± 0%    ~     (all equal)
      RegexpMatchHard_1K-4     2.76MB/s ± 0%  2.76MB/s ± 0%    ~     (all equal)
      Revcomp-4                89.1MB/s ± 0%  89.1MB/s ± 0%    ~     (p=0.059 n=28+25)
      Template-4               4.75MB/s ± 1%  4.75MB/s ± 1%    ~     (p=0.106 n=29+29)
      [Geo mean]               18.3MB/s       18.3MB/s       -0.07%
      
      Change-Id: I3cd76ce63e84b0c3cebabf9fa3573b76a7343899
      Reviewed-on: https://go-review.googlesource.com/124935
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      0e9f1de0
  4. 04 Sep, 2018 7 commits
    • Ben Shi's avatar
      cmd/compile: optimize ARM64's code with MADD/MSUB · b4442151
      Ben Shi authored
      MADD does MUL-ADD in a single instruction, and MSUB does the
      similar simplification for MUL-SUB.
      
      The CL implements the optimization with MADD/MSUB.
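
      MADD computes a + x*y and MSUB computes a - x*y. The Go expressions
      below have that shape; mulAdd and mulSub are illustrative names, and
      the instruction actually selected depends on the compiler version and
      surrounding code:

```go
package main

import "fmt"

// mulAdd has the multiply-then-add shape that a single MADD can cover.
func mulAdd(a, x, y int64) int64 { return a + x*y }

// mulSub has the multiply-then-subtract shape that a single MSUB can cover.
func mulSub(a, x, y int64) int64 { return a - x*y }

func main() {
	fmt.Println(mulAdd(4, 2, 3))  // 10
	fmt.Println(mulSub(10, 2, 3)) // 4
}
```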
      
      1. The total size of pkg/android_arm64/ decreases about 20KB,
      excluding cmd/compile/.
      
      2. The go1 benchmark shows a little improvement for RegexpMatchHard_32-4
      and Template-4, excluding noise.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              16.3s ± 1%     16.5s ± 1%  +1.41%  (p=0.000 n=26+28)
      Fannkuch11-4                8.79s ± 1%     8.76s ± 0%  -0.36%  (p=0.000 n=26+28)
      FmtFprintfEmpty-4           172ns ± 0%     172ns ± 0%    ~     (all equal)
      FmtFprintfString-4          362ns ± 1%     364ns ± 0%  +0.55%  (p=0.000 n=30+30)
      FmtFprintfInt-4             416ns ± 0%     416ns ± 0%    ~     (p=0.099 n=22+30)
      FmtFprintfIntInt-4          655ns ± 1%     660ns ± 1%  +0.76%  (p=0.000 n=30+30)
      FmtFprintfPrefixedInt-4     810ns ± 0%     809ns ± 0%  -0.08%  (p=0.009 n=29+29)
      FmtFprintfFloat-4          1.08µs ± 0%    1.09µs ± 0%  +0.61%  (p=0.000 n=30+29)
      FmtManyArgs-4              2.70µs ± 0%    2.69µs ± 0%  -0.23%  (p=0.000 n=29+28)
      GobDecode-4                32.2ms ± 1%    32.1ms ± 1%  -0.39%  (p=0.000 n=27+26)
      GobEncode-4                27.4ms ± 2%    27.4ms ± 1%    ~     (p=0.864 n=28+28)
      Gzip-4                      1.53s ± 1%     1.52s ± 1%  -0.30%  (p=0.031 n=29+29)
      Gunzip-4                    146ms ± 0%     146ms ± 0%  -0.14%  (p=0.001 n=25+30)
      HTTPClientServer-4         1.00ms ± 4%    0.98ms ± 6%  -1.65%  (p=0.001 n=29+30)
      JSONEncode-4               67.3ms ± 1%    67.2ms ± 1%    ~     (p=0.520 n=28+28)
      JSONDecode-4                329ms ± 5%     330ms ± 4%    ~     (p=0.142 n=30+30)
      Mandelbrot200-4            17.3ms ± 0%    17.3ms ± 0%    ~     (p=0.055 n=26+29)
      GoParse-4                  16.9ms ± 1%    17.0ms ± 1%  +0.82%  (p=0.000 n=30+30)
      RegexpMatchEasy0_32-4       382ns ± 0%     382ns ± 0%    ~     (all equal)
      RegexpMatchEasy0_1K-4      1.33µs ± 0%    1.33µs ± 0%  -0.25%  (p=0.000 n=30+27)
      RegexpMatchEasy1_32-4       361ns ± 0%     361ns ± 0%  -0.08%  (p=0.002 n=30+28)
      RegexpMatchEasy1_1K-4      2.11µs ± 0%    2.09µs ± 0%  -0.54%  (p=0.000 n=30+29)
      RegexpMatchMedium_32-4      594ns ± 0%     592ns ± 0%  -0.32%  (p=0.000 n=30+30)
      RegexpMatchMedium_1K-4      173µs ± 0%     172µs ± 0%  -0.77%  (p=0.000 n=29+27)
      RegexpMatchHard_32-4       10.4µs ± 0%    10.1µs ± 0%  -3.63%  (p=0.000 n=28+27)
      RegexpMatchHard_1K-4        306µs ± 0%     301µs ± 0%  -1.64%  (p=0.000 n=29+30)
      Revcomp-4                   2.51s ± 1%     2.52s ± 0%  +0.18%  (p=0.017 n=26+27)
      Template-4                  394ms ± 3%     382ms ± 3%  -3.22%  (p=0.000 n=28+28)
      TimeParse-4                1.67µs ± 0%    1.67µs ± 0%  +0.05%  (p=0.030 n=27+30)
      TimeFormat-4               1.72µs ± 0%    1.70µs ± 0%  -0.79%  (p=0.000 n=28+26)
      [Geo mean]                  259µs          259µs       -0.33%
      
      name                     old speed      new speed      delta
      GobDecode-4              23.8MB/s ± 1%  23.9MB/s ± 1%  +0.40%  (p=0.001 n=27+26)
      GobEncode-4              28.0MB/s ± 2%  28.0MB/s ± 1%    ~     (p=0.863 n=28+28)
      Gzip-4                   12.7MB/s ± 1%  12.7MB/s ± 1%  +0.32%  (p=0.026 n=29+29)
      Gunzip-4                  133MB/s ± 0%   133MB/s ± 0%  +0.15%  (p=0.001 n=24+30)
      JSONEncode-4             28.8MB/s ± 1%  28.9MB/s ± 1%    ~     (p=0.475 n=28+28)
      JSONDecode-4             5.89MB/s ± 4%  5.87MB/s ± 5%    ~     (p=0.174 n=29+30)
      GoParse-4                3.43MB/s ± 0%  3.40MB/s ± 1%  -0.83%  (p=0.000 n=28+30)
      RegexpMatchEasy0_32-4    83.6MB/s ± 0%  83.6MB/s ± 0%    ~     (p=0.848 n=28+29)
      RegexpMatchEasy0_1K-4     768MB/s ± 0%   770MB/s ± 0%  +0.25%  (p=0.000 n=30+27)
      RegexpMatchEasy1_32-4    88.5MB/s ± 0%  88.5MB/s ± 0%    ~     (p=0.086 n=29+29)
      RegexpMatchEasy1_1K-4     486MB/s ± 0%   489MB/s ± 0%  +0.54%  (p=0.000 n=30+29)
      RegexpMatchMedium_32-4   1.68MB/s ± 0%  1.69MB/s ± 0%  +0.60%  (p=0.000 n=30+23)
      RegexpMatchMedium_1K-4   5.90MB/s ± 0%  5.95MB/s ± 0%  +0.85%  (p=0.000 n=18+20)
      RegexpMatchHard_32-4     3.07MB/s ± 0%  3.18MB/s ± 0%  +3.72%  (p=0.000 n=29+26)
      RegexpMatchHard_1K-4     3.35MB/s ± 0%  3.40MB/s ± 0%  +1.69%  (p=0.000 n=30+30)
      Revcomp-4                 101MB/s ± 0%   101MB/s ± 0%  -0.18%  (p=0.018 n=26+27)
      Template-4               4.92MB/s ± 4%  5.09MB/s ± 3%  +3.31%  (p=0.000 n=28+28)
      [Geo mean]               22.4MB/s       22.6MB/s       +0.62%
      
      Change-Id: I8f304b272785739f57b3c8f736316f658f8c1b2a
      Reviewed-on: https://go-review.googlesource.com/129119
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      b4442151
    • Ben Shi's avatar
      cmd/internal/obj/arm64: support more atomic instructions · 1018a80f
      Ben Shi authored
      LDADDALD (64-bit) and LDADDALW (32-bit) are already supported.
      This CL adds support for LDADDALH (16-bit) and LDADDALB (8-bit).
      
      Change-Id: I4eac61adcec226d618dfce88618a2b98f5f1afe7
      Reviewed-on: https://go-review.googlesource.com/132135
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      1018a80f
    • Agniva De Sarker's avatar
      cmd/go/internal/modcmd: remove non-existent -dir flag · 55ef4460
      Agniva De Sarker authored
      Fixes #27243
      
      Change-Id: If9230244938dabd03b9afaa6600310df8f97fe92
      Reviewed-on: https://go-review.googlesource.com/131775
      Reviewed-by: Bryan C. Mills <bcmills@google.com>
      55ef4460
    • Matthew Dempsky's avatar
      cmd/compile: use "N variables but M values" error for OAS · f7a633aa
      Matthew Dempsky authored
      Makes the error message more consistent between OAS and OAS2.
      
      Fixes #26616.
      
      Change-Id: I07ab46c5ef8a37efb2cb557632697f5d1bf789f7
      Reviewed-on: https://go-review.googlesource.com/131280
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      f7a633aa
    • Alessandro Arzilli's avatar
      cmd/link: move dwarf part of DWARF generation before type name mangling · 9c833831
      Alessandro Arzilli authored
      Splits part of dwarfgeneratedebugsyms into a new function,
      dwarfGenerateDebugInfo, which is called between deadcode elimination
      and type name mangling.
      This function takes care of collecting and processing the DIEs for
      all functions and package-level variables and also generates DIEs
      for all types used in the program.
      
      Fixes #23733
      
      Change-Id: I75ef0608fbed2dffc3be7a477f1b03e7e740ec61
      Reviewed-on: https://go-review.googlesource.com/111237
      Run-TryBot: Heschi Kreinick <heschi@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Heschi Kreinick <heschi@google.com>
      9c833831
    • Alexey Naidonov's avatar
      cmd/compile: remove unnecessary nil-check · 669fa8f3
      Alexey Naidonov authored
      Removes unnecessary nil-check when referencing offset from an
      address. Suggested by Keith Randall in golang/go#27180.
      
      Updates golang/go#27180
      
      Change-Id: I326ed7fda7cfa98b7e4354c811900707fee26021
      Reviewed-on: https://go-review.googlesource.com/131735
      Reviewed-by: Keith Randall <khr@golang.org>
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      669fa8f3
    • Josh Bleecher Snyder's avatar
      cmd/compile: prefer rematerializeable arg0 for HMUL · 24e51bbe
      Josh Bleecher Snyder authored
      This prevents accidental regalloc regressions
      that otherwise can occur from unrelated changes.
      
      Change-Id: Iea356fb1a24766361fce13748dc1b46e57b21cea
      Reviewed-on: https://go-review.googlesource.com/129375
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      24e51bbe