1. 08 Mar, 2018 20 commits
    • cmd/compile: derive unsigned limits from signed limits in prove · 941fc129
      Austin Clements authored
      This adds a few simple deductions to the prove pass' fact table to
      derive unsigned concrete limits from signed concrete limits where
      possible.
      
      This tweak lets the pass prove 70 additional branch conditions in std
      and cmd.
      
      This is based on a comment from the recently-deleted factsTable.get:
      "// TODO: also use signed data if lim.min >= 0".
      
      Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392
      Reviewed-on: https://go-review.googlesource.com/87479
      Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
      Reviewed-by: Keith Randall <khr@golang.org>
      941fc129
    • cmd/compile: make prove pass use unsatisfiability · 669db2ce
      Austin Clements authored
      Currently the prove pass uses implication queries. For each block, it
      collects the set of branch conditions leading to that block, and
      queries this fact table for whether any of these facts imply the
      block's own branch condition (or its inverse). This works remarkably
      well considering it doesn't do any deduction on these facts, but it
      has various downsides:
      
      1. It requires an implementation both of adding facts to the table and
         determining implications. These are very nearly duals of each
         other, but require separate implementations. Likewise, the process
         of asserting facts of dominating branch conditions is very nearly
         the dual of the process of querying implied branch conditions.
      
      2. It leads to less effective use of derived facts. For example, the
         prove pass currently derives facts about the relations between len
         and cap, but can't make use of these unless a branch condition is
         in the exact form of a derived fact. If one of these derived facts
         contradicts another fact, it won't notice or make use of this.
      
      This CL changes the prove pass to use *contradiction* instead of
      implication. Rather than ever querying a branch condition, it simply
      adds branch conditions to the fact table. If this leads to a
      contradiction (specifically, it makes the fact set unsatisfiable),
      that branch is impossible and can be cut. As a result:
      
      1. We can eliminate the code for determining implications
         (factsTable.get disappears entirely). Also, there is now a single
         implementation of visiting and asserting branch conditions, since
         we don't have to flip them around to treat them as facts in one
         place and queries in another.
      
      2. Derived facts can be used effectively. It doesn't matter *why* the
         fact table is unsatisfiable; a contradiction in any of the facts is
         enough.
      
      3. As an added benefit, it's now quite easy to avoid traversing beyond
         provably-unreachable blocks. In contrast, the current
         implementation always visits all blocks.
      
      The prove pass already has nearly all of the mechanism necessary to
      compute unsatisfiability, which means this both simplifies the code
      and makes it more powerful.
      
      The only complication is that the current implication procedure has a
      hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and
      OpIsSliceInBounds. We replace this with asserting the appropriate fact
      when we process one of these conditions. This seems much cleaner
      anyway, and works because we can now take advantage of derived facts.
      
      This has no measurable effect on compiler performance.
      
      Effectiveness:
      
      There is exactly one condition in all of std and cmd that this fails
      to prove that the old implementation could: (int64(^uint(0)>>1) < x)
      in encoding/gob. This can never be true because x is an int, and it's
      basically coincidence that the old code gets this. (For example, it
      fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that
      immediately precedes it, and even though the conditions are logically
      unrelated, it wouldn't get the second one if it hadn't first processed
      the first!)
      
      It does, however, prove a few dozen additional branches. These come
      from facts that are added to the fact table about the relations
      between len and cap. These were almost never queried directly before,
      but could lead to contradictions, which the unsat-based approach is
      able to use.
      
      There are exactly two branches in std and cmd that this implementation
      proves in the *other* direction. This sounds scary, but is okay
      because both occur in already-unreachable blocks, so it doesn't matter
      which direction we choose. Because the fact table logic is sound but incomplete,
      it fails to prove that the block isn't reachable, even though it is
      able to prove that both outgoing branches are impossible. We could
      turn these blocks into BlockExit blocks, but it doesn't seem worth the
      trouble of the extra proof effort for something that happens twice in
      all of std and cmd.
      
      Tests:
      
      This CL updates test/prove.go to change the expected messages because
      it can no longer give a "reason" why it proved or disproved a
      condition. It also adds a new test of a branch it couldn't prove
      before.
      
      It mostly guts test/sliceopt.go, removing everything related to slice
      bounds optimizations and moving a few relevant tests to test/prove.go.
      Much of this test is actually unreachable. The new prove pass figures
      this out and doesn't try to prove anything about the unreachable
      parts. The output on the unreachable parts is already suspect because
      anything can be proved at that point, so it's really just a regression
      test for an algorithm the compiler no longer uses.
      
      This is a step toward fixing #23354. That issue is quite easy to fix
      once we can use derived facts effectively.
      
      Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8
      Reviewed-on: https://go-review.googlesource.com/87478
      Reviewed-by: Keith Randall <khr@golang.org>
      669db2ce
    • cmd/compile: simplify limit logic in prove · 2e9cf5f6
      Austin Clements authored
      This replaces the open-coded intersection of limits in the prove pass
      with a general limit intersection operation. This should get identical
      results except in one case where it's more precise: when handling an
      equality relation, if the value is *outside* the existing range, this
      will reduce the range to empty rather than resetting it. This will be
      important to a follow-up CL where we can take advantage of empty
      ranges.
      
      For #23354.
      
      Change-Id: I3d3d75924f61b1da1cb604b3a9d189b26fb3a14e
      Reviewed-on: https://go-review.googlesource.com/87477
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
      2e9cf5f6
    • cmd/compile: more String methods for prove types · 44e20b64
      Austin Clements authored
      These aid in debugging.
      
      Change-Id: Ieb38c996765f780f6103f8c3292639d408c25123
      Reviewed-on: https://go-review.googlesource.com/87476
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      44e20b64
    • cmd/compile: minor comment improvements/corrections · 491f409a
      Austin Clements authored
      Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3
      Reviewed-on: https://go-review.googlesource.com/87475
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      Reviewed-by: Keith Randall <khr@golang.org>
      Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
      491f409a
    • Revert "cmd/compile: cleanup nodpc and nodfp" · b55eedd1
      Matthew Dempsky authored
      This reverts commit dcac984b.
      
      Reason for revert: broke LR architectures (arm64, ppc64, s390x)
      
      Change-Id: I531d311c9053e81503c8c78d6cf044b318fc828b
      Reviewed-on: https://go-review.googlesource.com/99695
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      b55eedd1
    • math/big: allocate less in Float.Sqrt · 010579c2
      Alberto Donizetti authored
      The Newton sqrtInverse procedure we use to compute Float.Sqrt should
      not allocate a number of times proportional to the number of Newton
      iterations we need to reach the desired precision.
      
      At the beginning of the function the target precision is known, so even
      if we do want to perform the early steps at low precisions (to save
      time), it's still possible to pre-allocate larger backing arrays, both
      for the temp variables in the loop and the variable that'll hold the
      final result.
      
      There's one complication. At the following line:
      
        u.Sub(three, u)
      
      the Sub method will allocate, because the receiver aliases one of the
      arguments, and the large backing array we initially allocated for u
      will be replaced by a smaller one allocated by Sub. We can work around
      this by introducing a second temp variable u2 that we use to hold the
      Sub call result.
      
      Overall, the sqrtInverse procedure still allocates a number of times
      proportional to the number of Newton steps, because unfortunately a
      few of the Mul calls in the Newton function allocate; but at least we
      allocate less in the function itself.
      
      name                 old time/op    new time/op    delta
      FloatSqrt/256-4        1.97µs ± 1%    1.84µs ± 1%   -6.61%  (p=0.000 n=8+8)
      FloatSqrt/1000-4       4.80µs ± 3%    4.28µs ± 1%  -10.78%  (p=0.000 n=8+8)
      FloatSqrt/10000-4      40.0µs ± 1%    38.3µs ± 1%   -4.15%  (p=0.000 n=8+8)
      FloatSqrt/100000-4      955µs ± 1%     932µs ± 0%   -2.49%  (p=0.000 n=8+7)
      FloatSqrt/1000000-4    79.8ms ± 1%    79.4ms ± 1%     ~     (p=0.105 n=8+8)
      
      name                 old alloc/op   new alloc/op   delta
      FloatSqrt/256-4          816B ± 0%      512B ± 0%  -37.25%  (p=0.000 n=8+8)
      FloatSqrt/1000-4       2.50kB ± 0%    1.47kB ± 0%  -41.03%  (p=0.000 n=8+8)
      FloatSqrt/10000-4      23.5kB ± 0%    18.2kB ± 0%  -22.62%  (p=0.000 n=8+8)
      FloatSqrt/100000-4      251kB ± 0%     173kB ± 0%  -31.26%  (p=0.000 n=8+8)
      FloatSqrt/1000000-4    4.61MB ± 0%    2.86MB ± 0%  -37.90%  (p=0.000 n=8+8)
      
      name                 old allocs/op  new allocs/op  delta
      FloatSqrt/256-4          12.0 ± 0%       8.0 ± 0%  -33.33%  (p=0.000 n=8+8)
      FloatSqrt/1000-4         19.0 ± 0%       9.0 ± 0%  -52.63%  (p=0.000 n=8+8)
      FloatSqrt/10000-4        35.0 ± 0%      14.0 ± 0%  -60.00%  (p=0.000 n=8+8)
      FloatSqrt/100000-4       55.0 ± 0%      23.0 ± 0%  -58.18%  (p=0.000 n=8+8)
      FloatSqrt/1000000-4       122 ± 0%        75 ± 0%  -38.52%  (p=0.000 n=8+8)
      
      Change-Id: I950dbf61a40267a6cca82ae72524c3024bcb149c
      Reviewed-on: https://go-review.googlesource.com/87659
      Reviewed-by: Robert Griesemer <gri@golang.org>
      010579c2
    • math/big: speedup nat.setBytes for bigger slices · d2a5263a
      isharipo authored
      Set up to _S (the number of bytes in a Uint) bytes at a time
      by using BigEndian.Uint32 and BigEndian.Uint64.

      The performance improves for slices bigger than _S bytes.
      This is the case for 128/256-bit arithmetic that initializes
      its objects from bytes.
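The word-at-a-time idea can be sketched as follows. This is not the math/big implementation: `wordsFromBytes` is a hypothetical helper that assumes 64-bit words (so the word size is fixed at 8 bytes here) and returns the words least significant first.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wordsFromBytes interprets buf as a big-endian unsigned integer and
// returns its 64-bit words, least significant word first.
func wordsFromBytes(buf []byte) []uint64 {
	const s = 8 // bytes per word in this sketch
	z := make([]uint64, (len(buf)+s-1)/s)

	i := len(buf)
	k := 0
	// Fast path: take whole words from the least significant end.
	for ; i >= s; i -= s {
		z[k] = binary.BigEndian.Uint64(buf[i-s : i])
		k++
	}
	// Slow path: assemble the remaining most significant bytes one by one.
	if i > 0 {
		var d uint64
		for shift := uint(0); i > 0; shift += 8 {
			d |= uint64(buf[i-1]) << shift
			i--
		}
		z[k] = d
	}
	return z
}

func main() {
	w := wordsFromBytes([]byte{1, 2, 3, 4, 5, 6, 7, 8, 9})
	fmt.Printf("%#x\n", w)
}
```

Only the leftover bytes at the most significant end go through the byte-by-byte loop, which is why the gain shows up for inputs longer than one word.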
      
      name               old time/op  new time/op  delta
      NatSetBytes/8-4    29.8ns ± 1%  11.4ns ± 0%  -61.63%  (p=0.000 n=9+8)
      NatSetBytes/24-4    109ns ± 1%    56ns ± 0%  -48.75%  (p=0.000 n=9+8)
      NatSetBytes/128-4   420ns ± 2%   110ns ± 1%  -73.83%  (p=0.000 n=10+10)
      NatSetBytes/7-4    26.2ns ± 1%  21.3ns ± 2%  -18.63%  (p=0.000 n=8+9)
      NatSetBytes/23-4    106ns ± 1%    67ns ± 1%  -36.93%  (p=0.000 n=9+10)
      NatSetBytes/127-4   410ns ± 2%   121ns ± 0%  -70.46%  (p=0.000 n=9+8)
      
      Found this optimization opportunity by looking at ethereum_corevm
      community benchmark cpuprofile.
      
      name        old time/op  new time/op  delta
      OpDiv256-4   715ns ± 1%   596ns ± 1%  -16.57%  (p=0.008 n=5+5)
      OpDiv128-4   373ns ± 1%   314ns ± 1%  -15.83%  (p=0.008 n=5+5)
      OpDiv64-4    301ns ± 0%   285ns ± 1%   -5.12%  (p=0.008 n=5+5)
      
      Change-Id: I8e5a680ae6284c8233d8d7431d51253a8a740b57
      Reviewed-on: https://go-review.googlesource.com/98775
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      d2a5263a
    • cmd/compile: cleanup nodpc and nodfp · dcac984b
      Matthew Dempsky authored
      Instead of creating a new &nodfp expression for every recover() call,
      or a new nodpc variable for every function instrumented by the race
      detector, this CL introduces two new uintptr-typed pseudo-variables
      callerSP and callerPC. These pseudo-variables act just like calls to
      the runtime's getcallersp() and getcallerpc() functions.
      
      For consistency, change runtime.gorecover's builtin stub's parameter
      type from "*int32" to "uintptr".
      
      Passes toolstash-check, but toolstash-check -race fails because of
      register allocator changes.
      
      Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358
      Reviewed-on: https://go-review.googlesource.com/99416
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      dcac984b
    • cmd/compile: remove two out-of-phase calls to walk · 6a5cfa8b
      Matthew Dempsky authored
      All calls to walkstmt/walkexpr/etc should be rooted from funccompile,
      whereas transformclosure and fninit are called by main.
      
      Passes toolstash-check.
      
      Change-Id: Ic880e2d2d83af09618ce4daa8e7716f6b389e53e
      Reviewed-on: https://go-review.googlesource.com/99418
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      6a5cfa8b
    • cmd/compile: remove state.exitCode · 8b766e5d
      Matthew Dempsky authored
      We're holding onto the function's complete AST anyway, so might as
      well grab the exit code from there.
      
      Passes toolstash-check.
      
      Change-Id: I851b5dfdb53f991e9cd9488d25d0d2abc2a8379f
      Reviewed-on: https://go-review.googlesource.com/99417
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      8b766e5d
    • cmd/compile: fuse escape analysis parameter tagging loops · e3127f02
      Matthew Dempsky authored
      Simplifies the code somewhat and allows removing Param.Field.
      
      Passes toolstash-check.
      
      Change-Id: Id854416aea8afd27ce4830ff0f5ff940f7353792
      Reviewed-on: https://go-review.googlesource.com/99336
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      e3127f02
    • net/http: panic when a nil handler is passed to (*ServeMux)HandleFunc · 7d654af5
      Kunpei Sakai authored
      Fixes #24297
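The behavior named in the subject line can be observed with a small program. The helper `registerNil` is hypothetical and exists only to capture the panic; the example assumes a Go release that includes this change.

```go
package main

import (
	"fmt"
	"net/http"
)

// registerNil registers a nil handler function on a fresh ServeMux and
// returns whatever value the resulting panic carried (nil if none).
func registerNil() (recovered interface{}) {
	defer func() { recovered = recover() }()
	mux := http.NewServeMux()
	mux.HandleFunc("/", nil) // expected to panic at registration time
	return nil
}

func main() {
	fmt.Println("recovered:", registerNil())
}
```

Failing at registration makes the mistake visible at startup rather than when the first matching request arrives.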
      
      Change-Id: I759e88655632fda97dced240b3f13392b2785d0a
      Reviewed-on: https://go-review.googlesource.com/99575
      Reviewed-by: Andrew Bonventre <andybons@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Andrew Bonventre <andybons@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      7d654af5
    • time: add support for parsing timezones denoted by sign and offset · 9f2c611f
      Michael Kasch authored
      IANA Zoneinfo does not provide names for all timezones. Some are denoted
      by a sign and an offset only. For example, Europe/Turkey is currently +03
      and America/La_Paz is -04 (see https://data.iana.org/time-zones/releases/tzdata2018c.tar.gz).
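A minimal way to exercise this, assuming a Go release that includes the change: parse a timestamp whose zone abbreviation is just a sign and an offset, using the `MST` placeholder in the layout. The helper `parseZoneName` and the sample timestamp are made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// parseZoneName parses value with a layout that ends in a zone
// abbreviation and returns the abbreviation the parsed time reports.
func parseZoneName(value string) (string, error) {
	t, err := time.Parse("2006-01-02 15:04 MST", value)
	if err != nil {
		return "", err
	}
	name, _ := t.Zone()
	return name, nil
}

func main() {
	// "+03" stands in for a zone that has no alphabetic abbreviation.
	fmt.Println(parseZoneName("2018-03-08 12:00 +03"))
}
```

The sketch only checks that the abbreviation is accepted and preserved. As with other abbreviations, the offset that Parse attaches may depend on whether the local time zone database knows that name, so it is not asserted here.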
      
      Fixes #24071
      
      Change-Id: I9c230a719945e1263c5b52bab82084d22861be3e
      Reviewed-on: https://go-review.googlesource.com/98157
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      9f2c611f
    • runtime: use systemstack around throw in sysSigaction · 3d69ef37
      Ian Lance Taylor authored
      Try to fix the build on ppc64-linux and ppc64le-linux, avoiding:
      
      --- FAIL: TestInlinedRoutineRecords (2.12s)
      	dwarf_test.go:97: build: # command-line-arguments
      		runtime.systemstack: nosplit stack overflow
      			752	assumed on entry to runtime.sigtrampgo (nosplit)
      			480	after runtime.sigtrampgo (nosplit) uses 272
      			400	after runtime.sigfwdgo (nosplit) uses 80
      			264	after runtime.setsig (nosplit) uses 136
      			208	after runtime.sigaction (nosplit) uses 56
      			136	after runtime.sysSigaction (nosplit) uses 72
      			88	after runtime.throw (nosplit) uses 48
      			16	after runtime.dopanic (nosplit) uses 72
      			-16	after runtime.systemstack (nosplit) uses 32
      
      	dwarf_test.go:98: build error: exit status 2
      --- FAIL: TestAbstractOriginSanity (10.22s)
      	dwarf_test.go:97: build: # command-line-arguments
      		runtime.systemstack: nosplit stack overflow
      			752	assumed on entry to runtime.sigtrampgo (nosplit)
      			480	after runtime.sigtrampgo (nosplit) uses 272
      			400	after runtime.sigfwdgo (nosplit) uses 80
      			264	after runtime.setsig (nosplit) uses 136
      			208	after runtime.sigaction (nosplit) uses 56
      			136	after runtime.sysSigaction (nosplit) uses 72
      			88	after runtime.throw (nosplit) uses 48
      			16	after runtime.dopanic (nosplit) uses 72
      			-16	after runtime.systemstack (nosplit) uses 32
      
      	dwarf_test.go:98: build error: exit status 2
      FAIL
      FAIL	cmd/link/internal/ld	13.404s
      
      Change-Id: I4840604adb0e9f68a8d8e24f2f2a1a17d1634a58
      Reviewed-on: https://go-review.googlesource.com/99415
      Reviewed-by: Austin Clements <austin@google.com>
      3d69ef37
    • test/codegen: port 2^n muls tests to codegen harness · 3772b2e1
      Alberto Donizetti authored
      And delete them from the asm_test.go file.
      
      Change-Id: I124c8c352299646ec7db0968cdb0fe59a3b5d83d
      Reviewed-on: https://go-review.googlesource.com/99475
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
      3772b2e1
    • math/big: optimize addVW and subVW on arm64 · 0585d41c
      erifan01 authored
      The biggest hot spot of the existing implementation is its "load" operations,
      which lead to poor performance. By unrolling the loop 4 times and 2 times and
      using the "LDP" and "STP" instructions, this CL reduces the "load" cost and
      improves performance.
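The real change is arm64 assembly, but the unrolling structure can be sketched in portable Go. The function below is a hypothetical analogue written with `math/bits`, not the code from this CL: it adds a single word `y` to the vector `x`, processes four words per iteration, and finishes with a one-word-at-a-time tail loop.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVW computes z = x + y, where x and z are little-endian word vectors
// and y is a single word, and returns the final carry. It assumes
// len(x) >= len(z).
func addVW(z, x []uint64, y uint64) uint64 {
	x = x[:len(z)]
	c := y // the word to add acts as the incoming "carry" for word 0
	i := 0
	// Main loop, unrolled 4 times.
	for ; i+4 <= len(z); i += 4 {
		z[i], c = bits.Add64(x[i], c, 0)
		z[i+1], c = bits.Add64(x[i+1], c, 0)
		z[i+2], c = bits.Add64(x[i+2], c, 0)
		z[i+3], c = bits.Add64(x[i+3], c, 0)
	}
	// Tail loop for the remaining 0 to 3 words.
	for ; i < len(z); i++ {
		z[i], c = bits.Add64(x[i], c, 0)
	}
	return c
}

func main() {
	const max = ^uint64(0)
	x := []uint64{max, max, max, max, 5}
	z := make([]uint64, len(x))
	c := addVW(z, x, 1)
	fmt.Println(z, c)
}
```

In the assembly version, handling several words per iteration is also what allows paired load and store instructions to be used; the Go sketch shows only the loop shape.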
      
      Benchmarks:
      
      name                              old time/op    new time/op     delta
      AddVV/1-8                           21.5ns ± 0%     21.5ns ± 0%      ~     (all equal)
      AddVV/2-8                           13.5ns ± 0%     13.5ns ± 0%      ~     (all equal)
      AddVV/3-8                           15.5ns ± 0%     15.5ns ± 0%      ~     (all equal)
      AddVV/4-8                           17.5ns ± 0%     17.5ns ± 0%      ~     (all equal)
      AddVV/5-8                           19.5ns ± 0%     19.5ns ± 0%      ~     (all equal)
      AddVV/10-8                          29.5ns ± 0%     29.5ns ± 0%      ~     (all equal)
      AddVV/100-8                          217ns ± 0%      217ns ± 0%      ~     (all equal)
      AddVV/1000-8                        2.02µs ± 0%     2.02µs ± 0%      ~     (all equal)
      AddVV/10000-8                       20.3µs ± 0%     20.3µs ± 0%      ~     (p=0.603 n=5+5)
      AddVV/100000-8                       223µs ± 7%      228µs ± 8%      ~     (p=0.548 n=5+5)
      AddVW/1-8                           9.32ns ± 0%     9.26ns ± 0%    -0.64%  (p=0.008 n=5+5)
      AddVW/2-8                           19.8ns ± 3%     10.5ns ± 0%   -46.92%  (p=0.008 n=5+5)
      AddVW/3-8                           11.5ns ± 0%     11.0ns ± 0%    -4.35%  (p=0.008 n=5+5)
      AddVW/4-8                           13.0ns ± 0%     12.0ns ± 0%    -7.69%  (p=0.008 n=5+5)
      AddVW/5-8                           14.5ns ± 0%     12.5ns ± 0%   -13.79%  (p=0.008 n=5+5)
      AddVW/10-8                          22.0ns ± 0%     15.5ns ± 0%   -29.55%  (p=0.008 n=5+5)
      AddVW/100-8                          167ns ± 0%       81ns ± 0%   -51.44%  (p=0.008 n=5+5)
      AddVW/1000-8                        1.52µs ± 0%     0.64µs ± 0%   -57.58%  (p=0.008 n=5+5)
      AddVW/10000-8                       15.1µs ± 0%      7.2µs ± 0%   -52.55%  (p=0.008 n=5+5)
      AddVW/100000-8                       150µs ± 0%       71µs ± 0%   -52.95%  (p=0.008 n=5+5)
      SubVW/1-8                           9.32ns ± 0%     9.26ns ± 0%    -0.64%  (p=0.008 n=5+5)
      SubVW/2-8                           19.7ns ± 2%     10.5ns ± 0%   -46.70%  (p=0.008 n=5+5)
      SubVW/3-8                           11.5ns ± 0%     11.0ns ± 0%    -4.35%  (p=0.008 n=5+5)
      SubVW/4-8                           13.0ns ± 0%     12.0ns ± 0%    -7.69%  (p=0.008 n=5+5)
      SubVW/5-8                           14.5ns ± 0%     12.5ns ± 0%   -13.79%  (p=0.008 n=5+5)
      SubVW/10-8                          22.0ns ± 0%     15.5ns ± 0%   -29.55%  (p=0.008 n=5+5)
      SubVW/100-8                          167ns ± 0%       81ns ± 0%   -51.44%  (p=0.008 n=5+5)
      SubVW/1000-8                        1.52µs ± 0%     0.64µs ± 0%   -57.58%  (p=0.008 n=5+5)
      SubVW/10000-8                       15.1µs ± 0%      7.2µs ± 0%   -52.49%  (p=0.008 n=5+5)
      SubVW/100000-8                       150µs ± 0%       71µs ± 0%   -52.91%  (p=0.008 n=5+5)
      AddMulVVW/1-8                       32.4ns ± 1%     32.6ns ± 1%      ~     (p=0.119 n=5+5)
      AddMulVVW/2-8                       57.0ns ± 0%     57.0ns ± 0%      ~     (p=0.643 n=5+5)
      AddMulVVW/3-8                       90.8ns ± 0%     90.7ns ± 0%      ~     (p=0.524 n=5+5)
      AddMulVVW/4-8                        118ns ± 0%      118ns ± 1%      ~     (p=1.000 n=4+5)
      AddMulVVW/5-8                        144ns ± 1%      144ns ± 0%      ~     (p=0.794 n=5+4)
      AddMulVVW/10-8                       294ns ± 1%      296ns ± 0%    +0.48%  (p=0.040 n=5+5)
      AddMulVVW/100-8                     2.73µs ± 0%     2.73µs ± 0%      ~     (p=0.278 n=5+5)
      AddMulVVW/1000-8                    26.0µs ± 0%     26.5µs ± 0%    +2.14%  (p=0.008 n=5+5)
      AddMulVVW/10000-8                    297µs ± 0%      297µs ± 0%    +0.24%  (p=0.008 n=5+5)
      AddMulVVW/100000-8                  3.15ms ± 1%     3.13ms ± 0%      ~     (p=0.690 n=5+5)
      DecimalConversion-8                  311µs ± 2%      309µs ± 2%      ~     (p=0.310 n=5+5)
      FloatString/100-8                   2.55µs ± 2%     2.54µs ± 2%      ~     (p=1.000 n=5+5)
      FloatString/1000-8                  58.1µs ± 0%     58.1µs ± 0%      ~     (p=0.151 n=5+5)
      FloatString/10000-8                 4.59ms ± 0%     4.59ms ± 0%      ~     (p=0.151 n=5+5)
      FloatString/100000-8                 446ms ± 0%      446ms ± 0%    +0.01%  (p=0.016 n=5+5)
      FloatAdd/10-8                        183ns ± 0%      183ns ± 0%      ~     (p=0.333 n=4+5)
      FloatAdd/100-8                       187ns ± 1%      192ns ± 2%      ~     (p=0.056 n=5+5)
      FloatAdd/1000-8                      369ns ± 0%      371ns ± 0%    +0.54%  (p=0.016 n=4+5)
      FloatAdd/10000-8                    1.88µs ± 0%     1.88µs ± 0%    -0.14%  (p=0.000 n=4+5)
      FloatAdd/100000-8                   17.2µs ± 0%     17.1µs ± 0%    -0.37%  (p=0.008 n=5+5)
      FloatSub/10-8                        147ns ± 0%      147ns ± 0%      ~     (all equal)
      FloatSub/100-8                       145ns ± 0%      146ns ± 0%      ~     (p=0.238 n=5+4)
      FloatSub/1000-8                      241ns ± 0%      241ns ± 0%      ~     (p=0.333 n=5+4)
      FloatSub/10000-8                    1.06µs ± 0%     1.06µs ± 0%      ~     (p=0.444 n=5+5)
      FloatSub/100000-8                   9.50µs ± 0%     9.48µs ± 0%    -0.14%  (p=0.008 n=5+5)
      ParseFloatSmallExp-8                28.4µs ± 2%     28.5µs ± 1%      ~     (p=0.690 n=5+5)
      ParseFloatLargeExp-8                 125µs ± 1%      124µs ± 1%      ~     (p=0.095 n=5+5)
      GCD10x10/WithoutXY-8                 277ns ± 2%      278ns ± 3%      ~     (p=0.937 n=5+5)
      GCD10x10/WithXY-8                   2.08µs ± 3%     2.15µs ± 3%      ~     (p=0.056 n=5+5)
      GCD10x100/WithoutXY-8                592ns ± 3%      613ns ± 4%      ~     (p=0.056 n=5+5)
      GCD10x100/WithXY-8                  3.40µs ± 2%     3.42µs ± 4%      ~     (p=0.841 n=5+5)
      GCD10x1000/WithoutXY-8              1.37µs ± 2%     1.35µs ± 3%      ~     (p=0.460 n=5+5)
      GCD10x1000/WithXY-8                 7.34µs ± 2%     7.33µs ± 4%      ~     (p=0.841 n=5+5)
      GCD10x10000/WithoutXY-8             8.52µs ± 0%     8.51µs ± 1%      ~     (p=0.421 n=5+5)
      GCD10x10000/WithXY-8                27.5µs ± 2%     27.2µs ± 1%      ~     (p=0.151 n=5+5)
      GCD10x100000/WithoutXY-8            78.3µs ± 1%     78.5µs ± 1%      ~     (p=0.690 n=5+5)
      GCD10x100000/WithXY-8                231µs ± 0%      229µs ± 1%    -1.11%  (p=0.016 n=5+5)
      GCD100x100/WithoutXY-8              1.86µs ± 2%     1.86µs ± 2%      ~     (p=0.881 n=5+5)
      GCD100x100/WithXY-8                 27.1µs ± 2%     27.2µs ± 1%      ~     (p=0.421 n=5+5)
      GCD100x1000/WithoutXY-8             4.44µs ± 2%     4.41µs ± 1%      ~     (p=0.310 n=5+5)
      GCD100x1000/WithXY-8                36.3µs ± 1%     36.2µs ± 1%      ~     (p=0.310 n=5+5)
      GCD100x10000/WithoutXY-8            22.6µs ± 2%     22.5µs ± 1%      ~     (p=0.690 n=5+5)
      GCD100x10000/WithXY-8                145µs ± 1%      145µs ± 1%      ~     (p=1.000 n=5+5)
      GCD100x100000/WithoutXY-8            195µs ± 0%      196µs ± 1%      ~     (p=0.548 n=5+5)
      GCD100x100000/WithXY-8              1.10ms ± 0%     1.10ms ± 0%    -0.30%  (p=0.016 n=5+5)
      GCD1000x1000/WithoutXY-8            25.0µs ± 1%     25.2µs ± 2%      ~     (p=0.222 n=5+5)
      GCD1000x1000/WithXY-8                520µs ± 0%      520µs ± 1%      ~     (p=0.151 n=5+5)
      GCD1000x10000/WithoutXY-8           57.0µs ± 1%     56.9µs ± 1%      ~     (p=0.690 n=5+5)
      GCD1000x10000/WithXY-8              1.21ms ± 0%     1.21ms ± 1%      ~     (p=0.881 n=5+5)
      GCD1000x100000/WithoutXY-8           358µs ± 0%      359µs ± 1%      ~     (p=0.548 n=5+5)
      GCD1000x100000/WithXY-8             8.73ms ± 0%     8.73ms ± 0%      ~     (p=0.548 n=5+5)
      GCD10000x10000/WithoutXY-8           686µs ± 0%      687µs ± 0%      ~     (p=0.548 n=5+5)
      GCD10000x10000/WithXY-8             15.9ms ± 0%     15.9ms ± 0%      ~     (p=0.841 n=5+5)
      GCD10000x100000/WithoutXY-8         2.08ms ± 0%     2.08ms ± 0%      ~     (p=1.000 n=5+5)
      GCD10000x100000/WithXY-8            86.7ms ± 0%     86.7ms ± 0%      ~     (p=1.000 n=5+5)
      GCD100000x100000/WithoutXY-8        51.1ms ± 0%     51.0ms ± 0%      ~     (p=0.151 n=5+5)
      GCD100000x100000/WithXY-8            1.23s ± 0%      1.23s ± 0%      ~     (p=0.841 n=5+5)
      Hilbert-8                           2.41ms ± 1%     2.42ms ± 2%      ~     (p=0.690 n=5+5)
      Binomial-8                          4.86µs ± 1%     4.86µs ± 1%      ~     (p=0.889 n=5+5)
      QuoRem-8                            7.09µs ± 0%     7.08µs ± 0%    -0.09%  (p=0.024 n=5+5)
      Exp-8                                161ms ± 0%      161ms ± 0%    -0.08%  (p=0.032 n=5+5)
      Exp2-8                               161ms ± 0%      161ms ± 0%      ~     (p=1.000 n=5+5)
      Bitset-8                            40.7ns ± 0%     40.6ns ± 0%      ~     (p=0.095 n=4+5)
      BitsetNeg-8                          159ns ± 4%      148ns ± 0%    -6.92%  (p=0.016 n=5+4)
      BitsetOrig-8                         378ns ± 1%      378ns ± 1%      ~     (p=0.937 n=5+5)
      BitsetNegOrig-8                      647ns ± 5%      647ns ± 4%      ~     (p=1.000 n=5+5)
      ModSqrt225_Tonelli-8                7.26ms ± 0%     7.27ms ± 0%      ~     (p=1.000 n=5+5)
      ModSqrt224_3Mod4-8                  2.24ms ± 0%     2.24ms ± 0%      ~     (p=0.690 n=5+5)
      ModSqrt5430_Tonelli-8                62.8s ± 1%      62.5s ± 0%      ~     (p=0.063 n=5+4)
      ModSqrt5430_3Mod4-8                  20.8s ± 0%      20.8s ± 0%      ~     (p=0.310 n=5+5)
      Sqrt-8                               101µs ± 1%      101µs ± 0%    -0.35%  (p=0.032 n=5+5)
      IntSqr/1-8                          32.3ns ± 1%     32.5ns ± 1%      ~     (p=0.421 n=5+5)
      IntSqr/2-8                           157ns ± 5%      156ns ± 5%      ~     (p=0.651 n=5+5)
      IntSqr/3-8                           292ns ± 2%      291ns ± 3%      ~     (p=0.881 n=5+5)
      IntSqr/5-8                           738ns ± 6%      740ns ± 5%      ~     (p=0.841 n=5+5)
      IntSqr/8-8                          1.82µs ± 4%     1.83µs ± 4%      ~     (p=0.730 n=5+5)
      IntSqr/10-8                         2.92µs ± 1%     2.93µs ± 1%      ~     (p=0.643 n=5+5)
      IntSqr/20-8                         6.28µs ± 2%     6.28µs ± 2%      ~     (p=1.000 n=5+5)
      IntSqr/30-8                         13.8µs ± 2%     13.9µs ± 3%      ~     (p=1.000 n=5+5)
      IntSqr/50-8                         37.8µs ± 4%     37.9µs ± 4%      ~     (p=0.690 n=5+5)
      IntSqr/80-8                         95.9µs ± 1%     95.8µs ± 1%      ~     (p=0.841 n=5+5)
      IntSqr/100-8                         148µs ± 1%      148µs ± 1%      ~     (p=0.310 n=5+5)
      IntSqr/200-8                         586µs ± 1%      586µs ± 1%      ~     (p=0.841 n=5+5)
      IntSqr/300-8                        1.32ms ± 0%     1.31ms ± 0%      ~     (p=0.222 n=5+5)
      IntSqr/500-8                        2.48ms ± 0%     2.48ms ± 0%      ~     (p=0.556 n=5+4)
      IntSqr/800-8                        4.68ms ± 0%     4.68ms ± 0%      ~     (p=0.548 n=5+5)
      IntSqr/1000-8                       7.57ms ± 0%     7.56ms ± 0%      ~     (p=0.421 n=5+5)
      Mul-8                                311ms ± 0%      311ms ± 0%      ~     (p=0.548 n=5+5)
      Exp3Power/0x10-8                     559ns ± 1%      560ns ± 1%      ~     (p=0.984 n=5+5)
      Exp3Power/0x40-8                     641ns ± 1%      634ns ± 1%      ~     (p=0.063 n=5+5)
      Exp3Power/0x100-8                   1.39µs ± 2%     1.40µs ± 2%      ~     (p=0.381 n=5+5)
      Exp3Power/0x400-8                   8.27µs ± 1%     8.26µs ± 0%      ~     (p=0.571 n=5+5)
      Exp3Power/0x1000-8                  59.9µs ± 0%     59.7µs ± 0%    -0.23%  (p=0.008 n=5+5)
      Exp3Power/0x4000-8                   816µs ± 0%      816µs ± 0%      ~     (p=1.000 n=5+5)
      Exp3Power/0x10000-8                 7.77ms ± 0%     7.77ms ± 0%      ~     (p=0.841 n=5+5)
      Exp3Power/0x40000-8                 73.4ms ± 0%     73.4ms ± 0%      ~     (p=0.690 n=5+5)
      Exp3Power/0x100000-8                 665ms ± 0%      664ms ± 0%    -0.14%  (p=0.008 n=5+5)
      Exp3Power/0x400000-8                 5.98s ± 0%      5.98s ± 0%    -0.09%  (p=0.008 n=5+5)
      Fibo-8                               116ms ± 0%      116ms ± 0%    -0.25%  (p=0.008 n=5+5)
      NatSqr/1-8                           115ns ± 3%      116ns ± 2%      ~     (p=0.238 n=5+5)
      NatSqr/2-8                           237ns ± 1%      237ns ± 1%      ~     (p=0.683 n=5+5)
      NatSqr/3-8                           367ns ± 3%      368ns ± 3%      ~     (p=0.817 n=5+5)
      NatSqr/5-8                           807ns ± 3%      812ns ± 3%      ~     (p=0.913 n=5+5)
      NatSqr/8-8                          1.93µs ± 2%     1.93µs ± 3%      ~     (p=0.651 n=5+5)
      NatSqr/10-8                         2.98µs ± 2%     2.99µs ± 2%      ~     (p=0.690 n=5+5)
      NatSqr/20-8                         6.49µs ± 2%     6.46µs ± 2%      ~     (p=0.548 n=5+5)
      NatSqr/30-8                         14.4µs ± 2%     14.3µs ± 2%      ~     (p=0.690 n=5+5)
      NatSqr/50-8                         38.6µs ± 2%     38.7µs ± 2%      ~     (p=0.841 n=5+5)
      NatSqr/80-8                         96.1µs ± 2%     95.8µs ± 2%      ~     (p=0.548 n=5+5)
      NatSqr/100-8                         149µs ± 1%      149µs ± 1%      ~     (p=0.841 n=5+5)
      NatSqr/200-8                         593µs ± 1%      590µs ± 1%      ~     (p=0.421 n=5+5)
      NatSqr/300-8                        1.32ms ± 0%     1.32ms ± 1%      ~     (p=0.222 n=5+5)
      NatSqr/500-8                        2.49ms ± 0%     2.49ms ± 0%      ~     (p=0.690 n=5+5)
      NatSqr/800-8                        4.69ms ± 0%     4.69ms ± 0%      ~     (p=1.000 n=5+5)
      NatSqr/1000-8                       7.59ms ± 0%     7.58ms ± 0%      ~     (p=0.841 n=5+5)
      ScanPi-8                             322µs ± 0%      321µs ± 0%      ~     (p=0.095 n=5+5)
      StringPiParallel-8                  71.4µs ± 5%     68.8µs ± 4%      ~     (p=0.151 n=5+5)
      Scan/10/Base2-8                     1.10µs ± 0%     1.09µs ± 0%    -0.36%  (p=0.032 n=5+5)
      Scan/100/Base2-8                    7.78µs ± 0%     7.79µs ± 0%    +0.14%  (p=0.008 n=5+5)
      Scan/1000/Base2-8                   78.8µs ± 0%     79.0µs ± 0%    +0.24%  (p=0.008 n=5+5)
      Scan/10000/Base2-8                  1.22ms ± 0%     1.22ms ± 0%      ~     (p=0.056 n=5+5)
      Scan/100000/Base2-8                 55.1ms ± 0%     55.0ms ± 0%    -0.15%  (p=0.008 n=5+5)
      Scan/10/Base8-8                      514ns ± 0%      515ns ± 0%      ~     (p=0.079 n=5+5)
      Scan/100/Base8-8                    2.89µs ± 0%     2.89µs ± 0%    +0.15%  (p=0.008 n=5+5)
      Scan/1000/Base8-8                   31.0µs ± 0%     31.1µs ± 0%    +0.12%  (p=0.008 n=5+5)
      Scan/10000/Base8-8                   740µs ± 0%      740µs ± 0%      ~     (p=0.222 n=5+5)
      Scan/100000/Base8-8                 50.6ms ± 0%     50.5ms ± 0%    -0.06%  (p=0.016 n=4+5)
      Scan/10/Base10-8                     492ns ± 1%      490ns ± 1%      ~     (p=0.310 n=5+5)
      Scan/100/Base10-8                   2.67µs ± 0%     2.67µs ± 0%      ~     (p=0.056 n=5+5)
      Scan/1000/Base10-8                  28.7µs ± 0%     28.7µs ± 0%      ~     (p=1.000 n=5+5)
      Scan/10000/Base10-8                  717µs ± 0%      716µs ± 0%      ~     (p=0.222 n=5+5)
      Scan/100000/Base10-8                50.2ms ± 0%     50.3ms ± 0%    +0.05%  (p=0.008 n=5+5)
      Scan/10/Base16-8                     442ns ± 1%      442ns ± 0%      ~     (p=0.468 n=5+5)
      Scan/100/Base16-8                   2.46µs ± 0%     2.45µs ± 0%      ~     (p=0.159 n=5+5)
      Scan/1000/Base16-8                  27.2µs ± 0%     27.2µs ± 0%      ~     (p=0.841 n=5+5)
      Scan/10000/Base16-8                  721µs ± 0%      722µs ± 0%      ~     (p=0.548 n=5+5)
      Scan/100000/Base16-8                52.6ms ± 0%     52.6ms ± 0%    +0.07%  (p=0.008 n=5+5)
      String/10/Base2-8                    244ns ± 1%      242ns ± 1%      ~     (p=0.103 n=5+5)
      String/100/Base2-8                  1.48µs ± 0%     1.48µs ± 1%      ~     (p=0.786 n=5+5)
      String/1000/Base2-8                 13.3µs ± 1%     13.3µs ± 0%      ~     (p=0.222 n=5+5)
      String/10000/Base2-8                 132µs ± 1%      132µs ± 1%      ~     (p=1.000 n=5+5)
      String/100000/Base2-8               1.30ms ± 1%     1.30ms ± 1%      ~     (p=1.000 n=5+5)
      String/10/Base8-8                    167ns ± 1%      168ns ± 1%      ~     (p=0.135 n=5+5)
      String/100/Base8-8                   623ns ± 1%      626ns ± 1%      ~     (p=0.151 n=5+5)
      String/1000/Base8-8                 5.24µs ± 1%     5.24µs ± 0%      ~     (p=1.000 n=5+5)
      String/10000/Base8-8                50.0µs ± 1%     50.0µs ± 1%      ~     (p=1.000 n=5+5)
      String/100000/Base8-8                492µs ± 1%      489µs ± 1%      ~     (p=0.056 n=5+5)
      String/10/Base10-8                   503ns ± 1%      501ns ± 0%      ~     (p=0.183 n=5+5)
      String/100/Base10-8                 1.96µs ± 0%     1.97µs ± 0%      ~     (p=0.389 n=5+5)
      String/1000/Base10-8                12.4µs ± 1%     12.4µs ± 1%      ~     (p=0.841 n=5+5)
      String/10000/Base10-8               56.7µs ± 1%     56.6µs ± 0%      ~     (p=1.000 n=5+5)
      String/100000/Base10-8              25.6ms ± 0%     25.6ms ± 0%      ~     (p=0.222 n=5+5)
      String/10/Base16-8                   147ns ± 0%      148ns ± 2%      ~     (p=1.000 n=4+5)
      String/100/Base16-8                  505ns ± 0%      505ns ± 1%      ~     (p=0.778 n=5+5)
      String/1000/Base16-8                3.94µs ± 0%     3.94µs ± 0%      ~     (p=0.841 n=5+5)
      String/10000/Base16-8               37.4µs ± 1%     37.2µs ± 1%      ~     (p=0.095 n=5+5)
      String/100000/Base16-8               367µs ± 1%      367µs ± 0%      ~     (p=1.000 n=5+5)
      LeafSize/0-8                        6.64ms ± 0%     6.65ms ± 0%      ~     (p=0.690 n=5+5)
      LeafSize/1-8                        72.5µs ± 1%     72.4µs ± 1%      ~     (p=0.841 n=5+5)
      LeafSize/2-8                        72.6µs ± 1%     72.6µs ± 1%      ~     (p=1.000 n=5+5)
      LeafSize/3-8                         377µs ± 0%      377µs ± 0%      ~     (p=0.421 n=5+5)
      LeafSize/4-8                        71.2µs ± 1%     71.3µs ± 0%      ~     (p=0.278 n=5+5)
      LeafSize/5-8                         469µs ± 0%      469µs ± 0%      ~     (p=0.310 n=5+5)
      LeafSize/6-8                         376µs ± 0%      376µs ± 0%      ~     (p=0.841 n=5+5)
      LeafSize/7-8                         244µs ± 0%      244µs ± 0%      ~     (p=0.841 n=5+5)
      LeafSize/8-8                        71.9µs ± 1%     72.1µs ± 1%      ~     (p=0.548 n=5+5)
      LeafSize/9-8                         536µs ± 0%      536µs ± 0%      ~     (p=0.151 n=5+5)
      LeafSize/10-8                        470µs ± 0%      471µs ± 0%    +0.10%  (p=0.032 n=5+5)
      LeafSize/11-8                        458µs ± 0%      458µs ± 0%      ~     (p=0.881 n=5+5)
      LeafSize/12-8                        376µs ± 0%      376µs ± 0%      ~     (p=0.548 n=5+5)
      LeafSize/13-8                        341µs ± 0%      342µs ± 0%      ~     (p=0.222 n=5+5)
      LeafSize/14-8                        246µs ± 0%      245µs ± 0%      ~     (p=0.167 n=5+5)
      LeafSize/15-8                        168µs ± 0%      168µs ± 0%      ~     (p=0.548 n=5+5)
      LeafSize/16-8                       72.1µs ± 1%     72.2µs ± 1%      ~     (p=0.690 n=5+5)
      LeafSize/32-8                       81.5µs ± 1%     81.4µs ± 1%      ~     (p=1.000 n=5+5)
      LeafSize/64-8                        133µs ± 1%      134µs ± 1%      ~     (p=0.690 n=5+5)
      ProbablyPrime/n=0-8                 44.3ms ± 0%     44.2ms ± 0%    -0.28%  (p=0.008 n=5+5)
      ProbablyPrime/n=1-8                 64.8ms ± 0%     64.7ms ± 0%    -0.15%  (p=0.008 n=5+5)
      ProbablyPrime/n=5-8                  147ms ± 0%      147ms ± 0%    -0.11%  (p=0.008 n=5+5)
      ProbablyPrime/n=10-8                 250ms ± 0%      250ms ± 0%      ~     (p=0.056 n=5+5)
      ProbablyPrime/n=20-8                 456ms ± 0%      455ms ± 0%    -0.05%  (p=0.008 n=5+5)
      ProbablyPrime/Lucas-8               23.6ms ± 0%     23.5ms ± 0%    -0.29%  (p=0.008 n=5+5)
      ProbablyPrime/MillerRabinBase2-8    20.6ms ± 0%     20.6ms ± 0%      ~     (p=0.690 n=5+5)
      FloatSqrt/64-8                      2.01µs ± 1%     2.02µs ± 1%      ~     (p=0.421 n=5+5)
      FloatSqrt/128-8                     4.43µs ± 2%     4.38µs ± 2%      ~     (p=0.222 n=5+5)
      FloatSqrt/256-8                     6.64µs ± 1%     6.68µs ± 2%      ~     (p=0.516 n=5+5)
      FloatSqrt/1000-8                    31.9µs ± 0%     31.8µs ± 0%      ~     (p=0.095 n=5+5)
      FloatSqrt/10000-8                    595µs ± 0%      594µs ± 0%      ~     (p=0.056 n=5+5)
      FloatSqrt/100000-8                  17.9ms ± 0%     17.9ms ± 0%      ~     (p=0.151 n=5+5)
      FloatSqrt/1000000-8                  1.52s ± 0%      1.52s ± 0%      ~     (p=0.841 n=5+5)
      
      name                              old speed      new speed       delta
      AddVV/1-8                         2.97GB/s ± 0%   2.97GB/s ± 0%      ~     (p=0.971 n=4+4)
      AddVV/2-8                         9.47GB/s ± 0%   9.47GB/s ± 0%    +0.01%  (p=0.016 n=5+5)
      AddVV/3-8                         12.4GB/s ± 0%   12.4GB/s ± 0%      ~     (p=0.548 n=5+5)
      AddVV/4-8                         14.6GB/s ± 0%   14.6GB/s ± 0%      ~     (p=1.000 n=5+5)
      AddVV/5-8                         16.4GB/s ± 0%   16.4GB/s ± 0%      ~     (p=1.000 n=5+5)
      AddVV/10-8                        21.7GB/s ± 0%   21.7GB/s ± 0%      ~     (p=0.548 n=5+5)
      AddVV/100-8                       29.4GB/s ± 0%   29.4GB/s ± 0%      ~     (p=1.000 n=5+5)
      AddVV/1000-8                      31.7GB/s ± 0%   31.7GB/s ± 0%      ~     (p=0.524 n=5+4)
      AddVV/10000-8                     31.5GB/s ± 0%   31.5GB/s ± 0%      ~     (p=0.690 n=5+5)
      AddVV/100000-8                    28.8GB/s ± 7%   28.1GB/s ± 8%      ~     (p=0.548 n=5+5)
      AddVW/1-8                          859MB/s ± 0%    864MB/s ± 0%    +0.61%  (p=0.008 n=5+5)
      AddVW/2-8                          809MB/s ± 2%   1520MB/s ± 0%   +87.78%  (p=0.008 n=5+5)
      AddVW/3-8                         2.08GB/s ± 0%   2.18GB/s ± 0%    +4.54%  (p=0.008 n=5+5)
      AddVW/4-8                         2.46GB/s ± 0%   2.66GB/s ± 0%    +8.33%  (p=0.016 n=4+5)
      AddVW/5-8                         2.76GB/s ± 0%   3.20GB/s ± 0%   +16.03%  (p=0.008 n=5+5)
      AddVW/10-8                        3.63GB/s ± 0%   5.15GB/s ± 0%   +41.83%  (p=0.008 n=5+5)
      AddVW/100-8                       4.79GB/s ± 0%   9.87GB/s ± 0%  +106.12%  (p=0.008 n=5+5)
      AddVW/1000-8                      5.27GB/s ± 0%  12.42GB/s ± 0%  +135.74%  (p=0.008 n=5+5)
      AddVW/10000-8                     5.31GB/s ± 0%  11.19GB/s ± 0%  +110.71%  (p=0.008 n=5+5)
      AddVW/100000-8                    5.32GB/s ± 0%  11.32GB/s ± 0%  +112.56%  (p=0.008 n=5+5)
      SubVW/1-8                          859MB/s ± 0%    864MB/s ± 0%    +0.61%  (p=0.008 n=5+5)
      SubVW/2-8                          812MB/s ± 2%   1520MB/s ± 0%   +87.09%  (p=0.008 n=5+5)
      SubVW/3-8                         2.08GB/s ± 0%   2.18GB/s ± 0%    +4.55%  (p=0.008 n=5+5)
      SubVW/4-8                         2.46GB/s ± 0%   2.66GB/s ± 0%    +8.33%  (p=0.008 n=5+5)
      SubVW/5-8                         2.75GB/s ± 0%   3.20GB/s ± 0%   +16.03%  (p=0.008 n=5+5)
      SubVW/10-8                        3.63GB/s ± 0%   5.15GB/s ± 0%   +41.82%  (p=0.008 n=5+5)
      SubVW/100-8                       4.79GB/s ± 0%   9.87GB/s ± 0%  +106.13%  (p=0.008 n=5+5)
      SubVW/1000-8                      5.27GB/s ± 0%  12.42GB/s ± 0%  +135.74%  (p=0.008 n=5+5)
      SubVW/10000-8                     5.31GB/s ± 0%  11.17GB/s ± 0%  +110.44%  (p=0.008 n=5+5)
      SubVW/100000-8                    5.32GB/s ± 0%  11.31GB/s ± 0%  +112.35%  (p=0.008 n=5+5)
      AddMulVVW/1-8                     1.97GB/s ± 1%   1.96GB/s ± 1%      ~     (p=0.151 n=5+5)
      AddMulVVW/2-8                     2.24GB/s ± 0%   2.25GB/s ± 0%      ~     (p=0.095 n=5+5)
      AddMulVVW/3-8                     2.11GB/s ± 0%   2.12GB/s ± 0%      ~     (p=0.548 n=5+5)
      AddMulVVW/4-8                     2.17GB/s ± 1%   2.17GB/s ± 1%      ~     (p=0.548 n=5+5)
      AddMulVVW/5-8                     2.22GB/s ± 1%   2.21GB/s ± 1%      ~     (p=0.421 n=5+5)
      AddMulVVW/10-8                    2.17GB/s ± 1%   2.16GB/s ± 0%      ~     (p=0.095 n=5+5)
      AddMulVVW/100-8                   2.35GB/s ± 0%   2.35GB/s ± 0%      ~     (p=0.421 n=5+5)
      AddMulVVW/1000-8                  2.47GB/s ± 0%   2.41GB/s ± 0%    -2.09%  (p=0.008 n=5+5)
      AddMulVVW/10000-8                 2.16GB/s ± 0%   2.15GB/s ± 0%    -0.23%  (p=0.008 n=5+5)
      AddMulVVW/100000-8                2.03GB/s ± 1%   2.04GB/s ± 0%      ~     (p=0.690 n=5+5)
      
      name                              old alloc/op   new alloc/op    delta
      FloatString/100-8                     400B ± 0%       400B ± 0%      ~     (all equal)
      FloatString/1000-8                  3.22kB ± 0%     3.22kB ± 0%      ~     (all equal)
      FloatString/10000-8                 55.6kB ± 0%     55.5kB ± 0%      ~     (p=0.206 n=5+5)
      FloatString/100000-8                 627kB ± 0%      627kB ± 0%      ~     (all equal)
      FloatAdd/10-8                        0.00B           0.00B           ~     (all equal)
      FloatAdd/100-8                       0.00B           0.00B           ~     (all equal)
      FloatAdd/1000-8                      0.00B           0.00B           ~     (all equal)
      FloatAdd/10000-8                     0.00B           0.00B           ~     (all equal)
      FloatAdd/100000-8                    0.00B           0.00B           ~     (all equal)
      FloatSub/10-8                        0.00B           0.00B           ~     (all equal)
      FloatSub/100-8                       0.00B           0.00B           ~     (all equal)
      FloatSub/1000-8                      0.00B           0.00B           ~     (all equal)
      FloatSub/10000-8                     0.00B           0.00B           ~     (all equal)
      FloatSub/100000-8                    0.00B           0.00B           ~     (all equal)
      FloatSqrt/64-8                        416B ± 0%       416B ± 0%      ~     (all equal)
      FloatSqrt/128-8                       720B ± 0%       720B ± 0%      ~     (all equal)
      FloatSqrt/256-8                       816B ± 0%       816B ± 0%      ~     (all equal)
      FloatSqrt/1000-8                    2.50kB ± 0%     2.50kB ± 0%      ~     (all equal)
      FloatSqrt/10000-8                   23.5kB ± 0%     23.5kB ± 0%      ~     (all equal)
      FloatSqrt/100000-8                   251kB ± 0%      251kB ± 0%      ~     (all equal)
      FloatSqrt/1000000-8                 4.61MB ± 0%     4.61MB ± 0%      ~     (all equal)
      
      name                              old allocs/op  new allocs/op   delta
      FloatString/100-8                     8.00 ± 0%       8.00 ± 0%      ~     (all equal)
      FloatString/1000-8                    10.0 ± 0%       10.0 ± 0%      ~     (all equal)
      FloatString/10000-8                   42.0 ± 0%       42.0 ± 0%      ~     (all equal)
      FloatString/100000-8                   346 ± 0%        346 ± 0%      ~     (all equal)
      FloatAdd/10-8                         0.00            0.00           ~     (all equal)
      FloatAdd/100-8                        0.00            0.00           ~     (all equal)
      FloatAdd/1000-8                       0.00            0.00           ~     (all equal)
      FloatAdd/10000-8                      0.00            0.00           ~     (all equal)
      FloatAdd/100000-8                     0.00            0.00           ~     (all equal)
      FloatSub/10-8                         0.00            0.00           ~     (all equal)
      FloatSub/100-8                        0.00            0.00           ~     (all equal)
      FloatSub/1000-8                       0.00            0.00           ~     (all equal)
      FloatSub/10000-8                      0.00            0.00           ~     (all equal)
      FloatSub/100000-8                     0.00            0.00           ~     (all equal)
      FloatSqrt/64-8                        9.00 ± 0%       9.00 ± 0%      ~     (all equal)
      FloatSqrt/128-8                       13.0 ± 0%       13.0 ± 0%      ~     (all equal)
      FloatSqrt/256-8                       12.0 ± 0%       12.0 ± 0%      ~     (all equal)
      FloatSqrt/1000-8                      19.0 ± 0%       19.0 ± 0%      ~     (all equal)
      FloatSqrt/10000-8                     35.0 ± 0%       35.0 ± 0%      ~     (all equal)
      FloatSqrt/100000-8                    55.0 ± 0%       55.0 ± 0%      ~     (all equal)
      FloatSqrt/1000000-8                    122 ± 0%        122 ± 0%      ~     (all equal)
      
      Change-Id: I6888d84c037d91f9e2199f3492ea3f6a0ed77b24
      Reviewed-on: https://go-review.googlesource.com/77832
      Reviewed-by: Vlad Krasnov <vlad@cloudflare.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      0585d41c
    • Lynn Boger's avatar
      cmd/asm, cmd/internal/obj/ppc64: avoid unnecessary load zeros · 5b14c7b3
      Lynn Boger authored
      When the instructions add, and, or, xor, and movd have constant
      operands, the assembler in some cases generates more instructions
      than necessary.
      
      This adds more opcode/operand combinations to the optab
      and improves the code generation for the cases where the
      size and sign of the constant allow the use of 1
      instruction instead of 2.
      
      Example of previous code:
      	oris r3, r0, 0
      	ori  r3, r3, 65533
      
      now:
      	ori r3, r0, 65533
      
      This does not significantly reduce the overall binary size
      because the improvement depends on the constant value.
      Some procedures show a 1-2% reduction in size. This improvement
      could also be significant in cases where the extra instructions
      occur in a critical loop.
      
      Testcase ppc64enc.s was added to cmd/asm/internal/asm/testdata
      with the variations affected by this change.
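
A minimal sketch of the "size and sign" test described above, written in Go. The helper names are hypothetical and this is not the assembler's optab logic. It only illustrates that logical immediates such as the one in ori are zero-extended 16-bit values, while arithmetic immediates such as the one in addi are sign-extended 16-bit values, so 65533 fits a single ori.

```go
package main

import "fmt"

// fitsUint16 reports whether c can be encoded as the zero-extended 16-bit
// immediate of a logical instruction such as ori (hypothetical helper).
func fitsUint16(c int64) bool { return 0 <= c && c <= 0xFFFF }

// fitsInt16 reports whether c can be encoded as the sign-extended 16-bit
// immediate of an arithmetic instruction such as addi (hypothetical helper).
func fitsInt16(c int64) bool { return -0x8000 <= c && c <= 0x7FFF }

func main() {
	for _, c := range []int64{65533, -3, 0x12345} {
		fmt.Printf("%d: unsigned16=%v signed16=%v\n", c, fitsUint16(c), fitsInt16(c))
	}
}
```

A constant that passes neither test still needs the two-instruction sequence, which is why the size saving depends on the constant values a program uses.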
      
      Updates #23845
      
      Change-Id: I7fdf2320c95815d99f2755ba77d0c6921cd7fad7
      Reviewed-on: https://go-review.googlesource.com/95135
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      5b14c7b3
    • Joe Tsai's avatar
      encoding/csv: avoid mangling invalid UTF-8 in Writer · 0add9a4d
      Joe Tsai authored
      In the situation where a quoted field is necessary, avoid processing
      each UTF-8 rune one-by-one, which causes mangling of invalid sequences
      into utf8.RuneError, causing a loss of information.
      Instead, search only for the escaped characters, handle those specially
      and copy everything else in between verbatim.
      
      This symmetrically matches the behavior of Reader.
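
The approach can be sketched as follows. This is a simplified illustration, not the encoding/csv code: quoteField is a hypothetical helper that escapes only the double quote and ignores the newline handling a real Writer also needs. It searches for the next character that needs escaping and copies everything in between as raw bytes, so invalid UTF-8 passes through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteField is a hypothetical helper, not the encoding/csv implementation.
// It wraps field in double quotes and doubles embedded quotes, copying all
// other bytes (including invalid UTF-8) verbatim.
func quoteField(field string) string {
	var b strings.Builder
	b.WriteByte('"')
	for len(field) > 0 {
		i := strings.IndexByte(field, '"')
		if i < 0 {
			b.WriteString(field) // no more quotes: copy the rest unchanged
			break
		}
		b.WriteString(field[:i]) // bytes before the quote, unchanged
		b.WriteString(`""`)      // the escaped quote
		field = field[i+1:]
	}
	b.WriteByte('"')
	return b.String()
}

func main() {
	in := "a\xffb\"c" // contains an invalid UTF-8 byte and a quote
	fmt.Printf("%q\n", quoteField(in))
}
```

Because the loop never decodes runes, there is no point at which an invalid sequence could be replaced by utf8.RuneError.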
      
      Fixes #24298
      
      Change-Id: I9276f64891084ce8487678f663fad711b4095dbb
      Reviewed-on: https://go-review.googlesource.com/99297
      Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      0add9a4d
    • Matthew Dempsky's avatar
      cmd/compile: mark anonymous receiver parameters as non-escaping · 88466e93
      Matthew Dempsky authored
      This was already done for normal parameters, and the same logic
      applies for receiver parameters too.
      
      Updates #24305.
      
      Change-Id: Ia2a46f68d14e8fb62004ff0da1db0f065a95a1b7
      Reviewed-on: https://go-review.googlesource.com/99335
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      88466e93
  2. 07 Mar, 2018 20 commits
    • Ian Lance Taylor's avatar
      cmd/cover: don't crash on non-gofmt'ed input · 8b8625a3
      Ian Lance Taylor authored
      Without the change to cover.go, the new test fails with
      
      panic: overlapping edits: [4946,4950)->"", [4947,4947)->"thisNameMustBeVeryLongToCauseOverflowOfCounterIncrementStatementOntoNextLineForTest.Count[112]++;"
      
      The original code inserts "else{", deletes "else", and then positions
      a new block just after the "}" that must come before the "else".
      That works on gofmt'ed code, but fails if the code looks like "}else".
      When there is no space between the "}" and the "else", the new block
      is inserted into a location that we are deleting, leading to the
      "overlapping edits" mentioned above.
      
      This CL fixes this case by not deleting the "else" but just using the
      one that is already there. That requires adjusting the block offset to
      come after the "{" that we insert.
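
To see why the two edits in the panic message conflict, here is a small sketch in Go. The edit type and the overlaps function are hypothetical and only model the half-open ranges printed in the message; they are not the edit buffer used by cmd/cover.

```go
package main

import "fmt"

// edit replaces the half-open byte range [start, end) with text.
// An insertion has start == end; a deletion has empty text.
type edit struct {
	start, end int
	text       string
}

// overlaps reports whether b begins before a has ended, assuming
// a.start <= b.start (that is, the edits are sorted by start offset).
func overlaps(a, b edit) bool {
	return b.start < a.end
}

func main() {
	del := edit{4946, 4950, ""}           // the 4-byte deletion from the message
	ins := edit{4947, 4947, "counter++;"} // insertion inside the deleted range
	fmt.Println(overlaps(del, ins))

	// An insertion at the end of the deleted range does not overlap.
	fmt.Println(overlaps(del, edit{4950, 4950, "counter++;"}))
}
```

Keeping the existing "else" removes the deletion entirely, so the inserted block no longer has a deleted range to collide with.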
      
      Fixes #23927
      
      Change-Id: I40ef592490878765bbce6550ddb439e43ac525b2
      Reviewed-on: https://go-review.googlesource.com/98935
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      8b8625a3
    • Ian Lance Taylor's avatar
      runtime: get traceback from VDSO code · 419c0645
      Ian Lance Taylor authored
      Currently if a profiling signal arrives while executing within a VDSO
      the profiler will report _ExternalCode, which is needlessly confusing
      for a pure Go program. Change the VDSO calling code to record the
      caller's PC/SP, so that we can do a traceback from that point. If that
      fails for some reason, report _VDSO rather than _ExternalCode, which
      should at least point in the right direction.
      
      This adds some instructions to the code that calls the VDSO, but the
      slowdown is reasonably negligible:
      
      name                                  old time/op  new time/op  delta
      ClockVDSOAndFallbackPaths/vDSO-8      40.5ns ± 2%  41.3ns ± 1%  +1.85%  (p=0.002 n=10+10)
      ClockVDSOAndFallbackPaths/Fallback-8  41.9ns ± 1%  43.5ns ± 1%  +3.84%  (p=0.000 n=9+9)
      TimeNow-8                             41.5ns ± 3%  41.5ns ± 2%    ~     (p=0.723 n=10+10)
      
      Fixes #24142
      
      Change-Id: Iacd935db3c4c782150b3809aaa675a71799b1c9c
      Reviewed-on: https://go-review.googlesource.com/97315
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      419c0645
    • Ian Lance Taylor's avatar
      runtime: change from rt_sigaction to sigaction · c2f28de7
      Ian Lance Taylor authored
      This normalizes the Linux code to act like other targets. The size
      argument to the rt_sigaction system call is pushed to a single
      function, sysSigaction.
      
      This is intended as a simplification step for CL 93875 for #14327.
      
      Change-Id: I594788e235f0da20e16e8a028e27ac8c883907c4
      Reviewed-on: https://go-review.googlesource.com/99077
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      c2f28de7
    • Brad Fitzpatrick's avatar
      cmd/dist: skip rebuild before running tests when on the build systems · d8c9ef9e
      Brad Fitzpatrick authored
      Updates #24300
      
      Change-Id: I7752dab67e15a6dfe5fffe5b5ccbf3373bbc2c13
      Reviewed-on: https://go-review.googlesource.com/99296
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      d8c9ef9e
    • Vlad Krasnov's avatar
      math/big: implement addMulVVW on arm64 · fd3d2793
      Vlad Krasnov authored
      The lack of proper addMulVVW implementation for arm64 hurts RSA performance.
      
      This assembly implementation is optimized for arm64 based servers.
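
For orientation, the operation being optimized can be written portably as below. This is a reference sketch, not the arm64 assembly and not the math/big source: it uses uint instead of big.Word, assumes len(z) >= len(x), and relies on bits.Mul and bits.Add from math/bits, which were added to the standard library after this commit.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addMulVVW computes z[i] += x[i]*y with carry propagation and returns the
// final carry word. Portable sketch; assumes len(z) >= len(x).
func addMulVVW(z, x []uint, y uint) (c uint) {
	for i := range x {
		hi, lo := bits.Mul(x[i], y)     // full two-word product
		lo, cc := bits.Add(lo, z[i], 0) // add the existing z word
		hi += cc                        // cannot overflow: hi <= 2^w - 2 here
		lo, cc = bits.Add(lo, c, 0)     // add the carry from the previous word
		hi += cc                        // still fits: the total is < 2^(2w)
		z[i] = lo
		c = hi
	}
	return c
}

func main() {
	z := []uint{1, 2}
	x := []uint{3, 4}
	c := addMulVVW(z, x, 5)
	fmt.Println(z, c)
}
```

RSA spends most of its time in multi-word multiply-accumulate loops of this shape, which is consistent with the crypto/rsa numbers in the table below.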
      
      name                  old time/op    new time/op     delta
      pkg:math/big goos:linux goarch:arm64
      AddMulVVW/1             55.2ns ± 0%     11.9ns ± 1%    -78.37%  (p=0.000 n=8+10)
      AddMulVVW/2             67.0ns ± 0%     11.2ns ± 0%    -83.28%  (p=0.000 n=7+10)
      AddMulVVW/3             93.2ns ± 0%     13.2ns ± 0%    -85.84%  (p=0.000 n=10+10)
      AddMulVVW/4              126ns ± 0%       13ns ± 1%    -89.82%  (p=0.000 n=10+10)
      AddMulVVW/5              151ns ± 0%       17ns ± 0%    -88.87%  (p=0.000 n=10+9)
      AddMulVVW/10             323ns ± 0%       25ns ± 0%    -92.20%  (p=0.000 n=10+10)
      AddMulVVW/100           3.28µs ± 0%     0.14µs ± 0%    -95.82%  (p=0.000 n=10+10)
      AddMulVVW/1000          31.7µs ± 0%      1.3µs ± 0%    -96.00%  (p=0.000 n=10+8)
      AddMulVVW/10000          313µs ± 0%       13µs ± 0%    -95.98%  (p=0.000 n=10+10)
      AddMulVVW/100000        3.24ms ± 0%     0.13ms ± 1%    -96.13%  (p=0.000 n=9+9)
      pkg:crypto/rsa goos:linux goarch:arm64
      RSA2048Decrypt          44.7ms ± 0%      4.0ms ± 6%    -91.08%  (p=0.000 n=8+10)
      RSA2048Sign             46.3ms ± 0%      5.0ms ± 0%    -89.29%  (p=0.000 n=9+10)
      3PrimeRSA2048Decrypt    22.3ms ± 0%      2.4ms ± 0%    -89.26%  (p=0.000 n=10+10)
      
      Change-Id: I295f0bd5c51a4442d02c44ece1f6026d30dff0bc
      Reviewed-on: https://go-review.googlesource.com/76270
      Reviewed-by: Vlad Krasnov <vlad@cloudflare.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Vlad Krasnov <vlad@cloudflare.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      fd3d2793
    • David du Colombier's avatar
      cmd/go: skip TestVetWithOnlyCgoFiles when cgo is disabled · b1335037
      David du Colombier authored
      CL 99175 added TestVetWithOnlyCgoFiles. However, this
      test is failing on platforms where cgo is disabled,
      because no file can be built.
      
      This change fixes TestVetWithOnlyCgoFiles by skipping
      this test when cgo is disabled.
      
      Fixes #24304.
      
      Change-Id: Ibb38fcd3e0ed1a791782145d3f2866f12117c6fe
      Reviewed-on: https://go-review.googlesource.com/99275
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      b1335037
    • Elias Naur's avatar
      runtime/cgo: make sure nil is undefined before defining it · 7a2a96d6
      Elias Naur authored
      While working on standalone builds of gomobile bindings, I ran into
      errors of the form:
      
      gcc_darwin_arm.c:30:31: error: ambiguous expansion of macro 'nil' [-Werror,-Wambiguous-macro]
      /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS11.2.sdk/usr/include/MacTypes.h:94:15: note: expanding this definition of 'nil'
      
      Fix it by undefining nil before defining it in libcgo.h.
      
      Change-Id: I8e9660a68c6c351e592684d03d529f0d182c0493
      Reviewed-on: https://go-review.googlesource.com/99215
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      7a2a96d6
    • Ian Lance Taylor's avatar
      cmd/go: run vet on packages with only cgo files · 709da955
      Ian Lance Taylor authored
      CgoFiles is not included in GoFiles, so we need to check both.
      
      Fixes #24193
      
      Change-Id: I6a67bd912e3d9a4be0eae8fa8db6fa8a07fb5df3
      Reviewed-on: https://go-review.googlesource.com/99175
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      709da955
    • Matthew Dempsky's avatar
      cmd/compile: prevent untyped types from reaching walk · a3b3284d
      Matthew Dempsky authored
      We already require expressions to have been typechecked before
      reaching walk. Moreover, all untyped expressions should have been
      converted to their default type by walk.
      
      However, in practice, we've been somewhat sloppy and inconsistent
      about ensuring this. In particular, a lot of AST rewrites ended up
      leaving untyped bool expressions scattered around. These likely aren't
      harmful in practice, but it seems worth cleaning up.
      
      The two most common cases addressed by this CL are:
      
      1) When generating OIF and OFOR nodes, we would often typecheck the
      conditional expression, but not apply defaultlit to force it to the
      expression's default type.
      
      2) When rewriting string comparisons into more fundamental primitives,
      we were simply overwriting r.Type with the desired type, which didn't
      propagate the type to nested subexpressions. These are fixed by
      utilizing finishcompare, which correctly handles this (and is already
      used by other comparison lowering rewrites).
      
      Lastly, walkexpr is extended to assert that it's not called on untyped
      expressions.
      
      Fixes #23834.
      
      Change-Id: Icbd29648a293555e4015d3b06a95a24ccbd3f790
      Reviewed-on: https://go-review.googlesource.com/98337
      Reviewed-by: Robert Griesemer <gri@golang.org>
      a3b3284d
    • Kunpei Sakai's avatar
      cmd/compile: go fmt · ed8b7a77
      Kunpei Sakai authored
      Change-Id: I2eae33928641c6ed74badfe44d079ae90e5cc8c8
      Reviewed-on: https://go-review.googlesource.com/99195
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      ed8b7a77
    • Alberto Donizetti's avatar
      test/codegen: fix issue with arm64 memmove codegen test · c0289583
      Alberto Donizetti authored
      This recently added arm64 memmove codegen check:
      
        func movesmall() {
          // arm64:-"memmove"
          x := [...]byte{1, 2, 3, 4, 5, 6, 7}
          copy(x[1:], x[:])
        }
      
      is not correct, for two reasons:
      
      1. regexps are matched from the start of the disasm line (excluding
         line information). This means that a negative -"memmove" check will
         pass against a 'CALL runtime.memmove' line because the line does
         not start with 'memmove' (it starts with CALL...).
         The way to specify no 'memmove' match whatsoever on the line is
         -".*memmove"
      
      2. AFAIK comments on their own line are matched against the first
         subsequent non-comment line. So the code above only verifies that
         the x := ... line does not generate a memmove. The comment should
         be moved next to the copy() line, if that is the line that should
         not generate a memmove call.
      
      The fact that the test above is not effective can be checked by
      running `go run run.go -v codegen` in the toplevel test directory with
      a go1.10 toolchain (that does not have the memmove-elision
      optimization). The test will still pass (it shouldn't).
      
      This change updates the regexp to -".*memmove" and moves the comment
      next to the line it must not match.
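
      The first point can be illustrated with a minimal sketch. It assumes
      that "matched from the start of the disasm line" behaves like a
      pattern anchored with ^; the helper matchesFromStart and the sample
      disassembly line are hypothetical and are not taken from the test
      harness.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesFromStart reports whether pattern matches at the start of line.
// Anchoring with ^ is an assumption that models "matched from the start of
// the disasm line"; it is not copied from the codegen test harness.
func matchesFromStart(pattern, line string) bool {
	return regexp.MustCompile("^(?:" + pattern + ")").MatchString(line)
}

func main() {
	line := "CALL runtime.memmove(SB)" // hypothetical disassembly line

	// The line starts with CALL, so the bare pattern finds nothing and a
	// negative check built on it would pass even though memmove is called.
	fmt.Println(matchesFromStart("memmove", line))

	// With a leading .* the pattern can reach memmove anywhere on the line.
	fmt.Println(matchesFromStart(".*memmove", line))
}
```

      Under that assumption only the second pattern detects the call, which
      is why the negative check needs the leading .* to be effective.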
      
      Change-Id: Ie01ef4d775e77d92dc8d8b7856b89b200f5e5ef2
      Reviewed-on: https://go-review.googlesource.com/98977
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      c0289583
    • Tobias Klauser's avatar
      debug/pe: use bytes.IndexByte instead of a loop · aa00d974
      Tobias Klauser authored
      Follow CL 98759
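
      The commit message does not say which loop was replaced. As a hedged
      sketch, assume it is the common scan for the first NUL byte in a
      fixed-size name field; cstringLoop and cstringIndexByte are
      hypothetical names used only for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// cstringLoop returns the bytes before the first NUL using an explicit loop.
func cstringLoop(b []byte) string {
	i := 0
	for i < len(b) && b[i] != 0 {
		i++
	}
	return string(b[:i])
}

// cstringIndexByte does the same with bytes.IndexByte, which returns -1
// when the byte is absent.
func cstringIndexByte(b []byte) string {
	i := bytes.IndexByte(b, 0)
	if i == -1 {
		i = len(b)
	}
	return string(b[:i])
}

func main() {
	name := []byte{'.', 't', 'e', 'x', 't', 0, 0, 0} // hypothetical padded name
	fmt.Println(cstringLoop(name) == cstringIndexByte(name))
}
```

      Both helpers return the same string; the bytes.IndexByte version is
      shorter and leaves the scan to the library.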
      
      Change-Id: I58c8b769741b395e5bf4e723505b149d063d492a
      Reviewed-on: https://go-review.googlesource.com/99095
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      aa00d974
    • Tobias Klauser's avatar
      database/sql: fix typo in comment · 06572356
      Tobias Klauser authored
      Change-Id: Ie2966bae1dc2e542c42fb32d8059a4b2d4690014
      Reviewed-on: https://go-review.googlesource.com/99115
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      06572356
    • Hana Kim's avatar
      cmd/trace: force GC occasionally · 93b0261d
      Hana Kim authored
      to return memory to the OS after completing potentially
      large operations.
      
      Update #21870
      
      Sys went down to 3.7G
      
      $ DEBUG_MEMORY_USAGE=1 go tool trace trace.out
      
      2018/03/07 09:35:52 Parsing trace...
      after parsing trace
       Alloc:	3385754360 Bytes
       Sys:	3662047864 Bytes
       HeapReleased:	0 Bytes
       HeapSys:	3488907264 Bytes
       HeapInUse:	3426549760 Bytes
       HeapAlloc:	3385754360 Bytes
      Enter to continue...
      2018/03/07 09:36:09 Splitting trace...
      after spliting trace
       Alloc:	3238309424 Bytes
       Sys:	3684410168 Bytes
       HeapReleased:	0 Bytes
       HeapSys:	3488874496 Bytes
       HeapInUse:	3266461696 Bytes
       HeapAlloc:	3238309424 Bytes
      Enter to continue...
      2018/03/07 09:36:39 Opening browser. Trace viewer is listening on http://100.101.224.241:12345
      
      after httpJsonTrace
       Alloc:	3000633872 Bytes
       Sys:	3693978424 Bytes
       HeapReleased:	0 Bytes
       HeapSys:	3488743424 Bytes
       HeapInUse:	3030966272 Bytes
       HeapAlloc:	3000633872 Bytes
      Enter to continue...
      
      Change-Id: I56f64cae66c809cbfbad03fba7bd0d35494c1d04
      Reviewed-on: https://go-review.googlesource.com/92376
      Reviewed-by: Peter Weinberger <pjw@google.com>
      93b0261d
    • jimmyfrasche's avatar
      go/build: correct value of .Doc field · 20b14b71
      jimmyfrasche authored
      Build could use the package comment from test files to populate the .Doc
      field on *Package.
      
      As go list uses this data and several packages in the standard library
      have tests with package comments, this led to:
      
      $ go list -f '{{.Doc}}' flag container/heap image
      These examples demonstrate more intricate uses of the flag package.
      This example demonstrates an integer heap built using the heap interface.
      This example demonstrates decoding a JPEG image and examining its pixels.
      
      This change now only examines non-test files when attempting to populate
      .Doc, resulting in the expected behavior:
      
      $ gotip list -f '{{.Doc}}' flag container/heap image
      Package flag implements command-line flag parsing.
      Package heap provides heap operations for any type that implements heap.Interface.
      Package image implements a basic 2-D image library.
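
      The selection rule can be sketched as follows. This is an
      illustration built on go/parser and go/doc, not the go/build
      implementation; packageDoc and the in-memory file contents are
      hypothetical.

```go
package main

import (
	"fmt"
	"go/doc"
	"go/parser"
	"go/token"
	"strings"
)

// packageDoc returns the synopsis of the first package comment found in a
// non-test file. names gives the order in which files are examined and
// files maps each (hypothetical) file name to its source text.
func packageDoc(names []string, files map[string]string) (string, error) {
	fset := token.NewFileSet()
	for _, name := range names {
		if strings.HasSuffix(name, "_test.go") {
			continue // test files must not contribute to the package doc
		}
		f, err := parser.ParseFile(fset, name, files[name],
			parser.ImportsOnly|parser.ParseComments)
		if err != nil {
			return "", err
		}
		if f.Doc != nil {
			return doc.Synopsis(f.Doc.Text()), nil
		}
	}
	return "", nil
}

func main() {
	files := map[string]string{
		"example_test.go": "// These examples demonstrate more intricate uses of the flag package.\npackage flag_test\n",
		"flag.go":         "// Package flag implements command-line flag parsing.\npackage flag\n",
	}
	synopsis, err := packageDoc([]string{"example_test.go", "flag.go"}, files)
	if err != nil {
		panic(err)
	}
	fmt.Println(synopsis)
}
```

      Even though the test file is examined first, it is skipped, so the
      synopsis comes from flag.go.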
      
      Fixes #23594
      
      Change-Id: I37171c26ec5cc573efd273556a05223c6f675968
      Reviewed-on: https://go-review.googlesource.com/96976
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      20b14b71
    • Hana Kim's avatar
      cmd/trace: generate jsontrace data in a streaming fashion · ee465831
      Hana Kim authored
      Update #21870
      
      The Sys went down to 4.25G from 6.2G.
      
      $ DEBUG_MEMORY_USAGE=1 go tool trace trace.out
      2018/03/07 08:49:01 Parsing trace...
      after parsing trace
       Alloc:	3385757184 Bytes
       Sys:	3661195896 Bytes
       HeapReleased:	0 Bytes
       HeapSys:	3488841728 Bytes
       HeapInUse:	3426516992 Bytes
       HeapAlloc:	3385757184 Bytes
      Enter to continue...
      2018/03/07 08:49:18 Splitting trace...
      after spliting trace
       Alloc:	2352071904 Bytes
       Sys:	4243825464 Bytes
       HeapReleased:	0 Bytes
       HeapSys:	4025712640 Bytes
       HeapInUse:	2377703424 Bytes
       HeapAlloc:	2352071904 Bytes
      Enter to continue...
      after httpJsonTrace
       Alloc:	3228697832 Bytes
       Sys:	4250379064 Bytes
       HeapReleased:	0 Bytes
       HeapSys:	4025647104 Bytes
       HeapInUse:	3260014592 Bytes
       HeapAlloc:	3228697832 Bytes
      
      Change-Id: I546f26bdbc68b1e58f1af1235a0e299dc0ff115e
      Reviewed-on: https://go-review.googlesource.com/92375
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      Reviewed-by: Peter Weinberger <pjw@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      ee465831
    • Yuval Pavel Zholkover's avatar
      runtime: add missing build constraints to os_linux_{be64,noauxv,novdso,ppc64x}.go files · 083f3957
      Yuval Pavel Zholkover authored
      They do not match the file name patterns of
        *_GOOS
        *_GOARCH
        *_GOOS_GOARCH
      therefore the implicit linux constraint was not being added.
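
      A simplified sketch of the file-name rule, under stated assumptions:
      only the trailing name elements are examined, and the knownOS and
      knownArch tables are deliberately small illustrative subsets.
      hasImplicitConstraint is a hypothetical helper, not the go/build
      function.

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative subsets only; the real lists of GOOS and GOARCH values are
// much longer.
var knownOS = map[string]bool{"linux": true, "darwin": true, "windows": true}
var knownArch = map[string]bool{"386": true, "amd64": true, "arm": true, "arm64": true, "ppc64": true}

// hasImplicitConstraint reports whether name ends in _GOOS, _GOARCH, or
// _GOOS_GOARCH once the .go extension and an optional _test suffix are
// removed. In all three patterns the last element is a known OS or arch.
func hasImplicitConstraint(name string) bool {
	name = strings.TrimSuffix(name, ".go")
	name = strings.TrimSuffix(name, "_test")
	parts := strings.Split(name, "_")
	if len(parts) < 2 {
		return false
	}
	last := parts[len(parts)-1]
	return knownOS[last] || knownArch[last]
}

func main() {
	for _, name := range []string{
		"os_linux.go",
		"os_linux_arm64.go",
		"os_linux_be64.go",
		"os_linux_noauxv.go",
	} {
		fmt.Printf("%-20s implicit constraint: %v\n", name, hasImplicitConstraint(name))
	}
}
```

      Names such as os_linux_be64.go end in an element that is neither a
      GOOS nor a GOARCH, so under this rule the "linux" in the middle of
      the name is not considered and an explicit build constraint is
      needed.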
      
      Change-Id: Ie506c51cee6818db445516f96fffaa351df62cf5
      Reviewed-on: https://go-review.googlesource.com/99116
      Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      083f3957
    • Elias Naur's avatar
      androidtest.bash: don't require GOARCH set · 9094946f
      Elias Naur authored
      The host GOARCH is most likely supported (386, amd64, arm, arm64).
      
      Change-Id: I86324b9c00f22c592ba54bda7d2ae97c86bda904
      Reviewed-on: https://go-review.googlesource.com/99155
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
      9094946f
    • Alex Brainman's avatar
      os: use WIN32_FIND_DATA.Reserved0 to identify symlinks · e83601b4
      Alex Brainman authored
      The os.Stat implementation follows the instructions described at
      https://blogs.msdn.microsoft.com/oldnewthing/20100212-00/?p=14963/
      to distinguish symlinks. In particular, it calls
      GetFileAttributesEx or FindFirstFile and checks
      either WIN32_FILE_ATTRIBUTE_DATA.dwFileAttributes
      or WIN32_FIND_DATA.dwFileAttributes to see if
      the FILE_ATTRIBUTE_REPARSE_POINT flag is set.
      That seems to have worked fine so far.
      
      But we have now discovered that the OneDrive root folder
      is reported as a directory:
      
      c:\>dir C:\Users\Alex | grep OneDrive
      30/11/2017  07:25 PM    <DIR>          OneDrive
      c:\>
      
      while Go identifies it as a symlink.
      
      But we did not follow Microsoft's advice to the letter: we never
      checked WIN32_FIND_DATA.Reserved0. Adding that extra check
      makes Go treat OneDrive as a directory. So use FindFirstFile and
      WIN32_FIND_DATA.Reserved0 to determine symlinks.
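
      The two-step check can be sketched in portable Go. This is an
      illustration, not the os package code: isSymlink is a hypothetical
      helper, the constants are declared locally so the sketch builds on
      any platform, treating mount points like symlinks is an assumption,
      and otherTag is an arbitrary non-symlink tag value chosen for
      illustration.

```go
package main

import "fmt"

const (
	fileAttributeDirectory    = 0x10       // FILE_ATTRIBUTE_DIRECTORY
	fileAttributeReparsePoint = 0x400      // FILE_ATTRIBUTE_REPARSE_POINT
	ioReparseTagSymlink       = 0xA000000C // IO_REPARSE_TAG_SYMLINK
	ioReparseTagMountPoint    = 0xA0000003 // IO_REPARSE_TAG_MOUNT_POINT
)

// isSymlink takes the dwFileAttributes and Reserved0 values of a
// WIN32_FIND_DATA record. The reparse-point flag alone is not enough:
// the reparse tag in Reserved0 must also identify a symlink (or, by
// assumption in this sketch, a mount point).
func isSymlink(fileAttributes, reserved0 uint32) bool {
	if fileAttributes&fileAttributeReparsePoint == 0 {
		return false
	}
	return reserved0 == ioReparseTagSymlink || reserved0 == ioReparseTagMountPoint
}

func main() {
	const otherTag = 0x9000001A // illustrative tag that is not a symlink tag

	attrs := uint32(fileAttributeDirectory | fileAttributeReparsePoint)
	fmt.Println(isSymlink(attrs, ioReparseTagSymlink)) // reparse point with symlink tag
	fmt.Println(isSymlink(attrs, otherTag))            // reparse point with another tag
}
```

      Checking only the attribute flag would classify both cases as
      symlinks; the tag check separates them.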
      
      Fixes #22579
      
      Change-Id: I0cb88929eb8b47b1d24efaf1907ad5a0e20de83f
      Reviewed-on: https://go-review.googlesource.com/86556
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      e83601b4
    • Matthew Dempsky's avatar
      cmd/compile: remove funcdepth variables · d7eb4901
      Matthew Dempsky authored
      There were only two large classes of use for these variables:
      
      1) Testing "funcdepth != 0" or "funcdepth > 0", which is equivalent to
      checking "Curfn != nil".
      
      2) In oldname, detecting whether a closure variable has been created
      for the current function, which can be handled by instead testing
      "n.Name.Curfn != Curfn".
      
      Lastly, merge funcstart into funchdr, since it's only called once, and
      it better matches up with funcbody now.
      
      Passes toolstash-check.
      
      Change-Id: I8fe159a9d37ef7debc4cd310354cea22a8b23394
      Reviewed-on: https://go-review.googlesource.com/99076
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d7eb4901