1. 09 Mar, 2019 9 commits
    • reflect: make all flag.mustBe* methods inlinable · 788e038e
      Daniel Martí authored
      mustBe was barely over budget, so manually inlining the first flag.kind
      call is enough. Add a TODO to reverse that in the future, once the
      compiler gets better.
      
      mustBeExported and mustBeAssignable were over budget by a larger amount,
      so add slow path functions instead. This is the same strategy used in
      the sync package for common methods like Once.Do, for example.
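The fast-path/slow-path split described above can be sketched as follows. This is a minimal illustration, not the reflect package's code: the `flag` type, the `flagRO` bit value, the panic messages, and the `panics` helper are all assumptions made up for the example.

```go
package main

import "fmt"

// flag is a hypothetical stand-in for reflect's unexported flag type.
type flag uintptr

// flagRO is a hypothetical "read-only" bit; the real layout differs.
const flagRO flag = 1 << 5

// mustBeExported is the fast path: one cheap check, small enough that
// the compiler has a chance to inline it at every call site.
func (f flag) mustBeExported() {
	if f == 0 || f&flagRO != 0 {
		f.mustBeExportedSlow()
	}
}

// mustBeExportedSlow holds the rarely executed panic logic, keeping it
// out of the body of the fast path.
func (f flag) mustBeExportedSlow() {
	if f == 0 {
		panic("call on zero Value")
	}
	if f&flagRO != 0 {
		panic("value obtained using unexported field")
	}
}

// panics reports whether fn panics.
func panics(fn func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	fn()
	return false
}

func main() {
	fmt.Println(panics(func() { flag(1).mustBeExported() }))
	fmt.Println(panics(func() { flagRO.mustBeExported() }))
}
```

Whether a given function actually fits the budget can be checked with `go build -gcflags=-m`, which prints the compiler's inlining decisions.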
      
      Lots of exported reflect.Value methods call these assert-like unexported
      methods, so avoiding the function call overhead in the common case does
      shave off a percent from most exported APIs.
      
      Finally, add the methods to TestIntendedInlining.
      
      While at it, replace a couple of uses of the 0 Kind with its descriptive
      name, Invalid.
      
      name     old time/op    new time/op    delta
      Call-8     68.0ns ± 1%    66.8ns ± 1%  -1.81%  (p=0.000 n=10+9)
      PtrTo-8    8.00ns ± 2%    7.83ns ± 0%  -2.19%  (p=0.000 n=10+9)
      
      Updates #7818.
      
      Change-Id: Ic1603b640519393f6b50dd91ec3767753eb9e761
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166462
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: update TestIntendedInlining · cc5dc001
      Daniel Martí authored
      Value.CanInterface and Value.pointer are now inlinable, since we have a
      limited form of mid-stack inlining. Their calls to panic were preventing
      that in previous Go releases. The other three methods still go over
      budget, so update that comment.
      
      In recent commits, sync.Once.Do and multiple lock/unlock methods have
      also been made inlinable, so add those as well. They have standalone
      tests like test/inline_sync.go already, but it's best if the funcs are
      in this global test table too. They aren't inlinable on every platform
      yet, though.
      
      Finally, use math/bits.UintSize to check if GOARCH is 64-bit, now that
      we can.
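As a small illustration of the last point, the width of `uint` can be read from the `math/bits.UintSize` constant. The `is64Bit` helper name is an assumption for this sketch and is not taken from the test.

```go
package main

import (
	"fmt"
	"math/bits"
)

// is64Bit reports whether uint is 64 bits wide on the target platform.
// bits.UintSize is a compile-time constant, so the comparison is free.
func is64Bit() bool {
	return bits.UintSize == 64
}

func main() {
	fmt.Println("64-bit target:", is64Bit())
}
```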
      
      Change-Id: I65cc681b77015f7746dba3126637e236dcd494e0
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166461
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: allow inlining the RWMutex.RUnlock fast path · 05051b56
      Carlo Alberto Ferraris authored
      RWMutex.RLock is already inlineable, so add a test for it as well.
      
      name                    old time/op  new time/op  delta
      RWMutexUncontended      66.5ns ± 0%  60.3ns ± 1%  -9.38%  (p=0.000 n=12+20)
      RWMutexUncontended-4    16.7ns ± 0%  15.3ns ± 1%  -8.49%  (p=0.000 n=17+20)
      RWMutexUncontended-16   7.86ns ± 0%  7.69ns ± 0%  -2.08%  (p=0.000 n=18+15)
      RWMutexWrite100         25.1ns ± 0%  24.0ns ± 1%  -4.28%  (p=0.000 n=20+18)
      RWMutexWrite100-4       46.7ns ± 5%  44.1ns ± 4%  -5.53%  (p=0.000 n=20+20)
      RWMutexWrite100-16      68.3ns ±11%  65.7ns ± 8%  -3.81%  (p=0.003 n=20+20)
      RWMutexWrite10          26.7ns ± 1%  25.7ns ± 0%  -3.75%  (p=0.000 n=17+14)
      RWMutexWrite10-4        34.9ns ± 2%  33.8ns ± 2%  -3.15%  (p=0.000 n=20+20)
      RWMutexWrite10-16       37.4ns ± 2%  36.1ns ± 2%  -3.51%  (p=0.000 n=18+20)
      RWMutexWorkWrite100      163ns ± 0%   162ns ± 0%  -0.89%  (p=0.000 n=18+20)
      RWMutexWorkWrite100-4    189ns ± 4%   184ns ± 4%  -2.89%  (p=0.000 n=19+20)
      RWMutexWorkWrite100-16   207ns ± 4%   200ns ± 2%  -3.07%  (p=0.000 n=19+20)
      RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%  -0.75%  (p=0.000 n=20+20)
      RWMutexWorkWrite10-4     177ns ± 1%   176ns ± 2%  -0.63%  (p=0.004 n=17+20)
      RWMutexWorkWrite10-16    191ns ± 2%   189ns ± 1%  -0.83%  (p=0.015 n=20+17)
      
      linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)
      
      The cumulative effect of this and the previous 3 commits is:
      
      name                    old time/op  new time/op  delta
      MutexUncontended        19.3ns ± 1%  16.4ns ± 1%  -15.13%  (p=0.000 n=20+20)
      MutexUncontended-4      5.24ns ± 0%  4.09ns ± 0%  -21.95%  (p=0.000 n=20+18)
      MutexUncontended-16     2.10ns ± 0%  2.12ns ± 0%   +0.95%  (p=0.000 n=15+17)
      Mutex                   19.6ns ± 0%  16.3ns ± 1%  -17.12%  (p=0.000 n=20+20)
      Mutex-4                 54.6ns ± 5%  45.6ns ±10%  -16.51%  (p=0.000 n=20+19)
      Mutex-16                 133ns ± 5%   130ns ± 3%   -1.99%  (p=0.002 n=20+20)
      MutexSlack              33.4ns ± 2%  16.2ns ± 0%  -51.44%  (p=0.000 n=19+20)
      MutexSlack-4             206ns ± 5%   209ns ± 9%     ~     (p=0.154 n=20+20)
      MutexSlack-16           89.4ns ± 1%  90.9ns ± 2%   +1.70%  (p=0.000 n=18+17)
      MutexWork               60.5ns ± 0%  55.3ns ± 1%   -8.59%  (p=0.000 n=12+20)
      MutexWork-4              105ns ± 5%    97ns ±11%   -7.95%  (p=0.000 n=20+20)
      MutexWork-16             157ns ± 1%   158ns ± 1%   +0.66%  (p=0.001 n=18+17)
      MutexWorkSlack          70.2ns ± 5%  55.3ns ± 0%  -21.30%  (p=0.000 n=19+18)
      MutexWorkSlack-4         277ns ±13%   260ns ±15%   -6.35%  (p=0.002 n=20+18)
      MutexWorkSlack-16        156ns ± 0%   146ns ± 1%   -6.40%  (p=0.000 n=16+19)
      MutexNoSpin              966ns ± 0%   976ns ± 1%   +0.97%  (p=0.000 n=15+17)
      MutexNoSpin-4            269ns ± 4%   272ns ± 4%   +1.15%  (p=0.048 n=20+18)
      MutexNoSpin-16           122ns ± 0%   119ns ± 1%   -2.63%  (p=0.000 n=19+15)
      MutexSpin               3.13µs ± 0%  3.12µs ± 0%   -0.17%  (p=0.000 n=18+18)
      MutexSpin-4              826ns ± 1%   833ns ± 1%   +0.84%  (p=0.000 n=19+17)
      MutexSpin-16             397ns ± 1%   394ns ± 1%   -0.78%  (p=0.000 n=19+19)
      Once                    5.67ns ± 0%  2.07ns ± 2%  -63.43%  (p=0.000 n=20+20)
      Once-4                  1.47ns ± 2%  0.54ns ± 3%  -63.49%  (p=0.000 n=19+20)
      Once-16                 0.58ns ± 0%  0.17ns ± 5%  -70.49%  (p=0.000 n=17+17)
      RWMutexUncontended      71.4ns ± 0%  60.3ns ± 1%  -15.60%  (p=0.000 n=16+20)
      RWMutexUncontended-4    18.4ns ± 4%  15.3ns ± 1%  -17.14%  (p=0.000 n=20+20)
      RWMutexUncontended-16   8.01ns ± 0%  7.69ns ± 0%   -3.91%  (p=0.000 n=18+15)
      RWMutexWrite100         24.9ns ± 0%  24.0ns ± 1%   -3.57%  (p=0.000 n=19+18)
      RWMutexWrite100-4       46.5ns ± 3%  44.1ns ± 4%   -5.09%  (p=0.000 n=17+20)
      RWMutexWrite100-16      68.9ns ± 3%  65.7ns ± 8%   -4.65%  (p=0.000 n=18+20)
      RWMutexWrite10          27.1ns ± 0%  25.7ns ± 0%   -5.25%  (p=0.000 n=17+14)
      RWMutexWrite10-4        34.8ns ± 1%  33.8ns ± 2%   -2.96%  (p=0.000 n=20+20)
      RWMutexWrite10-16       37.5ns ± 2%  36.1ns ± 2%   -3.72%  (p=0.000 n=20+20)
      RWMutexWorkWrite100      164ns ± 0%   162ns ± 0%   -1.49%  (p=0.000 n=12+20)
      RWMutexWorkWrite100-4    186ns ± 3%   184ns ± 4%     ~     (p=0.097 n=20+20)
      RWMutexWorkWrite100-16   204ns ± 2%   200ns ± 2%   -1.58%  (p=0.000 n=18+20)
      RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%   -1.21%  (p=0.000 n=20+20)
      RWMutexWorkWrite10-4     179ns ± 1%   176ns ± 2%   -1.25%  (p=0.000 n=19+20)
      RWMutexWorkWrite10-16    191ns ± 1%   189ns ± 1%   -0.94%  (p=0.000 n=15+17)
      
      Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
      Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • bytes: return early in Repeat if count is 0 · 0e9d7d43
      Tobias Klauser authored
      This matches the implementation of strings.Repeat and slightly increases
      performance:
      
      name      old time/op  new time/op  delta
      Repeat-8   145ns ±12%   125ns ±29%  -13.35%  (p=0.009 n=10+10)
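The early return can be sketched with a simplified, hypothetical `repeat` function. This is not the standard library implementation (which also guards against overflow and copies in doubling chunks); it only shows where the zero-count check sits.

```go
package main

import "fmt"

// repeat is a simplified stand-in for bytes.Repeat, for illustration only.
func repeat(b []byte, count int) []byte {
	// Early return: with count == 0 there is nothing to allocate or copy.
	if count == 0 {
		return []byte{}
	}
	if count < 0 {
		panic("repeat: negative count")
	}
	out := make([]byte, 0, len(b)*count)
	for i := 0; i < count; i++ {
		out = append(out, b...)
	}
	return out
}

func main() {
	fmt.Printf("%q\n", repeat([]byte("ab"), 3))
	fmt.Printf("%q\n", repeat([]byte("ab"), 0))
}
```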
      
      Change-Id: Ic0a0e2ea9e36591286a49def320ddb67fe0b2c50
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166399
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: allow inlining the Once.Do fast path · ca835484
      Carlo Alberto Ferraris authored
      Using Once.Do is now extremely cheap because the fast path is just an inlined
      atomic load of a variable that is written only once and a conditional jump.
      This is very beneficial for Once.Do because, due to its nature, the fast path
      will be used for every call after the first one.
      
      In an attempt to minimize the code size increase, reorder the fields so that
      the pointer to Once is also the pointer to Once.done, which is the only field
      used in the hot path. This allows the use of more compact instruction encodings
      or fewer instructions in the hot path (which is inlined at every call site).
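A minimal sketch of the layout and fast path described above, using a hypothetical lowercase `once` type rather than the real sync.Once. The field names and the `doSlow` helper are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// once is an illustrative re-implementation, not sync.Once itself.
type once struct {
	// done is the first field, so a pointer to once is also a pointer
	// to done and the hot path needs no extra offset.
	done uint32
	m    sync.Mutex
}

// Do is the fast path: one atomic load and one conditional branch.
func (o *once) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 0 {
		o.doSlow(f)
	}
}

// doSlow takes the lock and runs f if no other caller has done so yet.
func (o *once) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

func main() {
	var o once
	calls := 0
	for i := 0; i < 3; i++ {
		o.Do(func() { calls++ })
	}
	fmt.Println("calls:", calls)
}
```

Every call after the first takes only the load-and-branch path, which is why inlining it pays off at each call site.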
      
      name     old time/op  new time/op  delta
      Once     4.54ns ± 0%  2.06ns ± 0%  -54.59%  (p=0.000 n=19+16)
      Once-4   1.18ns ± 0%  0.55ns ± 0%  -53.39%  (p=0.000 n=15+16)
      Once-16  0.53ns ± 0%  0.17ns ± 0%  -67.92%  (p=0.000 n=18+17)
      
      linux/amd64 bin/go 14675861 (previous commit 14663387, +12474/+0.09%)
      
      Change-Id: Ie2708103ab473787875d66746d2f20f1d90a6916
      Reviewed-on: https://go-review.googlesource.com/c/go/+/152697
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • sync: allow inlining the Mutex.Lock fast path · 41cb0aed
      Carlo Alberto Ferraris authored
      name                    old time/op  new time/op  delta
      MutexUncontended        18.9ns ± 0%  16.2ns ± 0%  -14.29%  (p=0.000 n=19+19)
      MutexUncontended-4      4.75ns ± 1%  4.08ns ± 0%  -14.20%  (p=0.000 n=20+19)
      MutexUncontended-16     2.05ns ± 0%  2.11ns ± 0%   +2.93%  (p=0.000 n=19+16)
      Mutex                   19.3ns ± 1%  16.2ns ± 0%  -15.86%  (p=0.000 n=17+19)
      Mutex-4                 52.4ns ± 4%  48.6ns ± 9%   -7.22%  (p=0.000 n=20+20)
      Mutex-16                 139ns ± 2%   140ns ± 3%   +1.03%  (p=0.011 n=16+20)
      MutexSlack              18.9ns ± 1%  16.2ns ± 1%  -13.96%  (p=0.000 n=20+20)
      MutexSlack-4             225ns ± 8%   211ns ±10%   -5.94%  (p=0.000 n=18+19)
      MutexSlack-16           98.4ns ± 1%  90.9ns ± 1%   -7.60%  (p=0.000 n=17+18)
      MutexWork               58.2ns ± 3%  55.4ns ± 0%   -4.82%  (p=0.000 n=20+17)
      MutexWork-4              103ns ± 7%    95ns ±18%   -8.03%  (p=0.000 n=20+20)
      MutexWork-16             163ns ± 2%   155ns ± 2%   -4.47%  (p=0.000 n=18+18)
      MutexWorkSlack          57.7ns ± 1%  55.4ns ± 0%   -3.99%  (p=0.000 n=20+13)
      MutexWorkSlack-4         276ns ±13%   260ns ±10%   -5.64%  (p=0.001 n=19+19)
      MutexWorkSlack-16        147ns ± 0%   156ns ± 1%   +5.87%  (p=0.000 n=14+19)
      MutexNoSpin              968ns ± 0%   900ns ± 1%   -6.98%  (p=0.000 n=20+18)
      MutexNoSpin-4            270ns ± 2%   255ns ± 2%   -5.74%  (p=0.000 n=19+20)
      MutexNoSpin-16           120ns ± 4%   112ns ± 0%   -6.99%  (p=0.000 n=19+14)
      MutexSpin               3.13µs ± 1%  3.19µs ± 6%     ~     (p=0.401 n=20+20)
      MutexSpin-4              832ns ± 2%   831ns ± 1%   -0.17%  (p=0.023 n=16+18)
      MutexSpin-16             395ns ± 0%   399ns ± 0%   +0.94%  (p=0.000 n=17+19)
      RWMutexUncontended      69.5ns ± 0%  68.4ns ± 0%   -1.59%  (p=0.000 n=20+20)
      RWMutexUncontended-4    17.5ns ± 0%  16.7ns ± 0%   -4.30%  (p=0.000 n=18+17)
      RWMutexUncontended-16   7.92ns ± 0%  7.87ns ± 0%   -0.61%  (p=0.000 n=18+17)
      RWMutexWrite100         24.9ns ± 1%  25.0ns ± 1%   +0.32%  (p=0.000 n=20+20)
      RWMutexWrite100-4       46.2ns ± 4%  46.2ns ± 5%     ~     (p=0.840 n=19+20)
      RWMutexWrite100-16      69.9ns ± 5%  69.9ns ± 3%     ~     (p=0.545 n=20+19)
      RWMutexWrite10          27.0ns ± 2%  26.8ns ± 2%   -0.98%  (p=0.001 n=20+20)
      RWMutexWrite10-4        34.7ns ± 2%  35.0ns ± 4%     ~     (p=0.191 n=18+20)
      RWMutexWrite10-16       37.2ns ± 4%  37.3ns ± 2%     ~     (p=0.438 n=20+19)
      RWMutexWorkWrite100      164ns ± 0%   163ns ± 0%   -0.24%  (p=0.025 n=20+20)
      RWMutexWorkWrite100-4    193ns ± 3%   191ns ± 2%   -1.06%  (p=0.027 n=20+20)
      RWMutexWorkWrite100-16   210ns ± 3%   207ns ± 3%   -1.22%  (p=0.038 n=20+20)
      RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%     ~     (all equal)
      RWMutexWorkWrite10-4     178ns ± 2%   179ns ± 2%     ~     (p=0.186 n=20+20)
      RWMutexWorkWrite10-16    192ns ± 2%   192ns ± 2%     ~     (p=0.731 n=19+20)
      
      linux/amd64 bin/go 14663387 (previous commit 14630572, +32815/+0.22%)
      
      Change-Id: I98171006dce14069b1a62da07c3d165455a7906b
      Reviewed-on: https://go-review.googlesource.com/c/go/+/148959
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: reverse order of slice bounds checks · 83a33d38
      Keith Randall authored
      Turns out this makes the fix for 28797 unnecessary, because this order
      ensures that the RHS of IsSliceInBounds ops are always nonnegative.
      
      The real reason for this change is that it also makes dealing with
      <0 values easier for reporting values in bounds check panics (issue #30116).
      
      Makes cmd/go negligibly smaller.
      
      Update #28797
      
      Change-Id: I1f25ba6d2b3b3d4a72df3105828aa0a4b629ce85
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166377
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link: enable DWARF with external linker on aix/ppc64 · 3cf89e50
      Clément Chigot authored
      To allow DWARF with the external linker (ld), the symbol table is adapted.
      In internal linkmode, each package is treated as a separate .FILE. However,
      the current version of ld crashes on a few programs because of
      relocations between DWARF symbols. Treating all packages as part of
      a single .FILE seems to work around this bug.
      Since the bug might be fixed in a future release, the size of each package
      in the DWARF sections is still retrieved and can be used once it is fixed.
      This also improves internal linkmode, which should have done this
      anyway.
      
      Change-Id: If3d023fe118b24b9f0f46d201a4849eee8d5e333
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164006
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • debug/gosym: simplify parsing symbol name rule · b37b35ed
      LE Manh Cuong authored
      Symbol names with a linker prefix such as "type." or "go." are not parsed
      correctly: the prefix is returned as part of the package name.

      So just return the empty string for symbol names that start with a linker prefix.
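The rule can be sketched with a standalone, hypothetical `packageName` function. It is written for illustration and is not the debug/gosym source; the fallback logic for ordinary symbols (cut at the first dot after the last slash) is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// packageName is an illustrative helper, not the debug/gosym method.
// Symbols carrying a linker prefix belong to no package, so it returns "".
func packageName(sym string) string {
	if strings.HasPrefix(sym, "go.") || strings.HasPrefix(sym, "type.") {
		return ""
	}
	// Skip past the import path, then cut at the first dot after it.
	pathEnd := strings.LastIndex(sym, "/")
	if pathEnd < 0 {
		pathEnd = 0
	}
	if i := strings.Index(sym[pathEnd:], "."); i != -1 {
		return sym[:pathEnd+i]
	}
	return ""
}

func main() {
	for _, s := range []string{"type.*int", "main.main", "net/http.(*Server).Serve"} {
		fmt.Printf("%q -> %q\n", s, packageName(s))
	}
}
```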
      
      Fixes #29551
      
      Change-Id: Idb4ce872345e5781a5a5da2b2146faeeebd9e63b
      Reviewed-on: https://go-review.googlesource.com/c/go/+/156397
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
  2. 08 Mar, 2019 23 commits
  3. 07 Mar, 2019 8 commits
    • cmd/link: fix suspicious code in emitPcln · 3a62f4ee
      Cherry Zhang authored
      In cmd/link/internal/ld/pcln.go:emitPcln, the code and the
      comment don't match. I think the comment is right. Fix the code.
      
      As a consequence, on Linux/AMD64, internal linking with PIE
      buildmode with cgo (at least the cgo packages in the standard
      library) now works. Add a test.
      
      Change-Id: I091cf81ba89571052bc0ec1fa0a6a688dec07b04
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166017
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile/internal/ssa: set OFOR bBody.Pos to AST Pos · 7afd58d4
      Peter Waller authored
      Assign SSA OFOR's bBody.Pos to AST (*Node).Pos as it is created.
      
      An empty for loop has no other information which may be used to give
      correct position information in the resulting executable. Such a for
      loop may compile to a single `JMP *self` and it is important that the
      location of this is in the right place.
      
      Fixes #30167.
      
      Change-Id: Iec44f0281c462c33fac6b7b8ccfc2ef37434c247
      Reviewed-on: https://go-review.googlesource.com/c/go/+/163019
      Run-TryBot: David Chase <drchase@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: optimize arm64 comparison of x and 0.0 with "FCMP $(0.0), Fn" · 27cce773
      fanzha02 authored
      Code:
      func comp(x float64) bool {return x < 0}
      
      Previous version:
        FMOVD	"".x(FP), F0
        FMOVD	ZR, F1
        FCMPD	F1, F0
        CSET	MI, R0
        MOVB	R0, "".~r1+8(FP)
        RET	(R30)
      
      Optimized version:
        FMOVD	"".x(FP), F0
        FCMPD	$(0.0), F0
        CSET	MI, R0
        MOVB	R0, "".~r1+8(FP)
        RET	(R30)
      
      Math package benchmark results:
      name                   old time/op          new time/op          delta
      Acos-8                   77.500000ns +- 0%    77.400000ns +- 0%   -0.13%  (p=0.000 n=9+10)
      Acosh-8                  98.600000ns +- 0%    98.100000ns +- 0%   -0.51%  (p=0.000 n=10+9)
      Asin-8                   67.600000ns +- 0%    66.600000ns +- 0%   -1.48%  (p=0.000 n=9+10)
      Asinh-8                 108.000000ns +- 0%   109.000000ns +- 0%   +0.93%  (p=0.000 n=10+10)
      Atan-8                   36.788889ns +- 0%    36.000000ns +- 0%   -2.14%  (p=0.000 n=9+10)
      Atanh-8                 104.000000ns +- 0%   105.000000ns +- 0%   +0.96%  (p=0.000 n=10+10)
      Atan2-8                  67.100000ns +- 0%    66.600000ns +- 0%   -0.75%  (p=0.000 n=10+10)
      Cbrt-8                   89.100000ns +- 0%    82.000000ns +- 0%   -7.97%  (p=0.000 n=10+10)
      Erf-8                    43.500000ns +- 0%    43.000000ns +- 0%   -1.15%  (p=0.000 n=10+10)
      Erfc-8                   49.000000ns +- 0%    48.220000ns +- 0%   -1.59%  (p=0.000 n=9+10)
      Erfinv-8                 59.100000ns +- 0%    58.600000ns +- 0%   -0.85%  (p=0.000 n=10+10)
      Erfcinv-8                59.100000ns +- 0%    58.600000ns +- 0%   -0.85%  (p=0.000 n=10+10)
      Expm1-8                  56.600000ns +- 0%    56.040000ns +- 0%   -0.99%  (p=0.000 n=8+10)
      Exp2Go-8                 97.600000ns +- 0%    99.400000ns +- 0%   +1.84%  (p=0.000 n=10+10)
      Dim-8                     2.500000ns +- 0%     2.250000ns +- 0%  -10.00%  (p=0.000 n=10+10)
      Mod-8                   108.000000ns +- 0%   106.000000ns +- 0%   -1.85%  (p=0.000 n=8+8)
      Frexp-8                  12.000000ns +- 0%    12.500000ns +- 0%   +4.17%  (p=0.000 n=10+10)
      Gamma-8                  67.100000ns +- 0%    67.600000ns +- 0%   +0.75%  (p=0.000 n=10+10)
      Hypot-8                  17.100000ns +- 0%    17.000000ns +- 0%   -0.58%  (p=0.002 n=8+10)
      Ilogb-8                   9.010000ns +- 0%     8.510000ns +- 0%   -5.55%  (p=0.000 n=10+9)
      J1-8                    288.000000ns +- 0%   287.000000ns +- 0%   -0.35%  (p=0.000 n=10+10)
      Jn-8                    605.000000ns +- 0%   604.000000ns +- 0%   -0.17%  (p=0.001 n=8+9)
      Logb-8                   10.600000ns +- 0%    10.500000ns +- 0%   -0.94%  (p=0.000 n=9+10)
      Log2-8                   16.500000ns +- 0%    17.000000ns +- 0%   +3.03%  (p=0.000 n=10+10)
      PowFrac-8               232.000000ns +- 0%   233.000000ns +- 0%   +0.43%  (p=0.000 n=10+10)
      Remainder-8              70.600000ns +- 0%    69.600000ns +- 0%   -1.42%  (p=0.000 n=10+10)
      SqrtGoLatency-8          77.600000ns +- 0%    76.600000ns +- 0%   -1.29%  (p=0.000 n=10+10)
      Tanh-8                   97.600000ns +- 0%    94.100000ns +- 0%   -3.59%  (p=0.000 n=10+10)
      Y1-8                    289.000000ns +- 0%   288.000000ns +- 0%   -0.35%  (p=0.000 n=10+10)
      Yn-8                    603.000000ns +- 0%   589.000000ns +- 0%   -2.32%  (p=0.000 n=10+10)
      
      Change-Id: I6920734f8662b329aa58f5b8e4eeae73b409984d
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164719
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: change the condition flags of floating-point comparisons in arm64 backend · 6efd51c6
      fanzha02 authored
      The compiler currently reverses operands to work around NaN in
      "less than" and "less than or equal" comparisons. But if we
      want to use "FCMPD/FCMPS $(0.0), Fn" as an optimization,
      that workaround does not work, because the assembler does
      not support the instruction "FCMPD/FCMPS Fn, $(0.0)".
      
      This CL sets condition flags for floating-point comparisons
      to resolve this problem.
      
      Change-Id: Ia48076a1da95da64596d6e68304018cb301ebe33
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164718
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/compile: remove work-arounds for 0o/0O octals · a77f85a6
      Robert Griesemer authored
      With math/big supporting the new octal prefixes directly,
      the compiler doesn't have to manually convert such numbers
      into old-style 0-prefix octals anymore.
      
      Updates #12711.
      
      Change-Id: I300bdd095836595426a1478d68da179f39e5531a
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165861
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • math/big: support new octal prefixes 0o and 0O · 129c6e44
      Robert Griesemer authored
      This CL extends the various SetString and Parse methods for
      Ints, Rats, and Floats to accept the new octal prefixes.
      
      The main change is in natconv.go, all other changes are
      documentation and test updates.
      
      Finally, this CL also fixes TestRatSetString which silently
      dropped certain failures.
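For illustration, the new prefixes can be exercised through `(*big.Int).SetString` with base 0, which lets the prefix select the base. The `parseInt` wrapper is a hypothetical helper for this sketch, and it assumes a Go release that includes this change.

```go
package main

import (
	"fmt"
	"math/big"
)

// parseInt parses s with base 0, so the literal's prefix picks the base.
func parseInt(s string) (*big.Int, bool) {
	return new(big.Int).SetString(s, 0)
}

func main() {
	// The new-style prefixes and the old-style leading 0 all denote octal.
	for _, s := range []string{"0o17", "0O17", "017"} {
		n, ok := parseInt(s)
		fmt.Println(s, n, ok)
	}
}
```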
      
      Updates #12711.
      
      Change-Id: I5ee5879e25013ba1e6eda93ff280915f25ab5d55
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165898
      Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
    • test: improve test coverage for heap sampling · 2dd066d4
      Raul Silvera authored
      Update the test in test/heapsampling.go to more thoroughly validate heap sampling.
      Lower the sampling rate on the test to ensure allocations both smaller and
      larger than the sampling rate are tested.
      
      Tighten up the validation check to a 10% difference between the unsampled and correct value.
      Because of the nature of random sampling, it is possible that the unsampled value fluctuates
      over that range. To avoid flakes, run the experiment three times and only report an issue if the
      same location consistently falls out of range on all experiments.
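The flake-avoidance rule above (flag a location only if it is out of range in every run) can be sketched like this. The function names, the map-based data layout, and the sample numbers are assumptions for illustration and do not come from test/heapsampling.go.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// outOfRange reports whether got differs from want by more than tol (a fraction).
func outOfRange(got, want, tol float64) bool {
	return math.Abs(got-want) > tol*want
}

// consistentFailures returns the locations whose value is out of range
// in every run. A location that passes in at least one run is not reported.
func consistentFailures(runs []map[string]float64, want map[string]float64, tol float64) []string {
	var bad []string
	for loc, w := range want {
		failedEveryRun := len(runs) > 0
		for _, run := range runs {
			if !outOfRange(run[loc], w, tol) {
				failedEveryRun = false
				break
			}
		}
		if failedEveryRun {
			bad = append(bad, loc)
		}
	}
	sort.Strings(bad) // deterministic order for reporting
	return bad
}

func main() {
	want := map[string]float64{"allocSmall": 100, "allocLarge": 200}
	runs := []map[string]float64{
		{"allocSmall": 105, "allocLarge": 150},
		{"allocSmall": 130, "allocLarge": 160},
		{"allocSmall": 95, "allocLarge": 140},
	}
	fmt.Println(consistentFailures(runs, want, 0.10))
}
```

Here `allocSmall` misses the 10% window in one run only, so it is treated as sampling noise, while `allocLarge` misses it in all three runs and is reported.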
      
      This tests the sampling fix in cl/158337.
      
      Change-Id: I54a709e5c75827b8b1c2d87cdfb425ab09759677
      GitHub-Last-Rev: 7c04f126034f9e323efc220c896d75e7984ffd39
      GitHub-Pull-Request: golang/go#26944
      Reviewed-on: https://go-review.googlesource.com/c/go/+/129117
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • net/http: let Transport request body writes use sendfile · 6ebfbbaa
      Chris Marchesi authored
      net.TCPConn has the ability to send data out using system calls such as
      sendfile when the source data comes from an *os.File. However, the way
      that I/O has been laid out in the transport means that the File is
      actually wrapped behind two outer io.Readers, and as such the TCP stack
      cannot properly type-assert the reader, ensuring that it falls back to
      genericReadFrom.
      
      This commit does the following:
      
      * Removes transferBodyReader and moves its functionality to a new
      doBodyCopy helper. This is not an io.Reader implementation, but no
      functionality is lost this way, and it allows us to unwrap one layer
      from the body.
      
      * The second layer of the body is unwrapped if the original reader
      was wrapped with ioutil.NopCloser, which is what NewRequest wraps the
      body in if it's not a ReadCloser on its own. The unwrap operation
      passes through the existing body if there's no nopCloser.
      
      Note that this depends on change https://golang.org/cl/163737 to
      properly function, as the lack of ReaderFrom implementation otherwise
      means that this functionality is essentially walled off.
      
      Benchmarks between this commit and https://golang.org/cl/163862,
      incorporating https://golang.org/cl/163737:
      
      linux/amd64:
      name                        old time/op    new time/op    delta
      FileAndServer_1KB/NoTLS-4     53.2µs ± 0%    53.3µs ± 0%      ~     (p=0.075 n=10+9)
      FileAndServer_1KB/TLS-4       61.2µs ± 0%    60.7µs ± 0%    -0.77%  (p=0.000 n=10+9)
      FileAndServer_16MB/NoTLS-4    25.3ms ± 5%     3.8ms ± 6%   -84.95%  (p=0.000 n=10+10)
      FileAndServer_16MB/TLS-4      33.2ms ± 2%    13.4ms ± 2%   -59.57%  (p=0.000 n=10+10)
      FileAndServer_64MB/NoTLS-4     106ms ± 4%      16ms ± 2%   -84.45%  (p=0.000 n=10+10)
      FileAndServer_64MB/TLS-4       129ms ± 1%      54ms ± 3%   -58.32%  (p=0.000 n=8+10)
      
      name                        old speed      new speed      delta
      FileAndServer_1KB/NoTLS-4   19.2MB/s ± 0%  19.2MB/s ± 0%      ~     (p=0.095 n=10+9)
      FileAndServer_1KB/TLS-4     16.7MB/s ± 0%  16.9MB/s ± 0%    +0.78%  (p=0.000 n=10+9)
      FileAndServer_16MB/NoTLS-4   664MB/s ± 5%  4415MB/s ± 6%  +565.27%  (p=0.000 n=10+10)
      FileAndServer_16MB/TLS-4     505MB/s ± 2%  1250MB/s ± 2%  +147.32%  (p=0.000 n=10+10)
      FileAndServer_64MB/NoTLS-4   636MB/s ± 4%  4090MB/s ± 2%  +542.81%  (p=0.000 n=10+10)
      FileAndServer_64MB/TLS-4     522MB/s ± 1%  1251MB/s ± 3%  +139.95%  (p=0.000 n=8+10)
      
      darwin/amd64:
      name                        old time/op    new time/op     delta
      FileAndServer_1KB/NoTLS-8     93.0µs ± 5%     96.6µs ±11%      ~     (p=0.190 n=10+10)
      FileAndServer_1KB/TLS-8        105µs ± 7%      100µs ± 5%    -5.14%  (p=0.002 n=10+9)
      FileAndServer_16MB/NoTLS-8    87.5ms ±19%     10.0ms ± 6%   -88.57%  (p=0.000 n=10+10)
      FileAndServer_16MB/TLS-8      52.7ms ±11%     17.4ms ± 5%   -66.92%  (p=0.000 n=10+10)
      FileAndServer_64MB/NoTLS-8     363ms ±54%       39ms ± 7%   -89.24%  (p=0.000 n=10+10)
      FileAndServer_64MB/TLS-8       209ms ±13%       73ms ± 5%   -65.37%  (p=0.000 n=9+10)
      
      name                        old speed      new speed       delta
      FileAndServer_1KB/NoTLS-8   11.0MB/s ± 5%   10.6MB/s ±10%      ~     (p=0.184 n=10+10)
      FileAndServer_1KB/TLS-8     9.75MB/s ± 7%  10.27MB/s ± 5%    +5.26%  (p=0.003 n=10+9)
      FileAndServer_16MB/NoTLS-8   194MB/s ±16%   1680MB/s ± 6%  +767.83%  (p=0.000 n=10+10)
      FileAndServer_16MB/TLS-8     319MB/s ±10%    963MB/s ± 4%  +201.36%  (p=0.000 n=10+10)
      FileAndServer_64MB/NoTLS-8   180MB/s ±31%   1719MB/s ± 7%  +853.61%  (p=0.000 n=9+10)
      FileAndServer_64MB/TLS-8     321MB/s ±12%    926MB/s ± 5%  +188.24%  (p=0.000 n=9+10)
      
      Updates #30377.
      
      Change-Id: I631a73cea75371dfbb418c9cd487c4aa35e73fcd
      GitHub-Last-Rev: 4a77dd1b80140274bf3ed20ad7465ff3cc06febf
      GitHub-Pull-Request: golang/go#30378
      Reviewed-on: https://go-review.googlesource.com/c/go/+/163599
      Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>