1. 27 Feb, 2018 10 commits
    • os, syscall: use pipe2 instead of pipe syscall on OpenBSD · 2013ad89
      Tobias Klauser authored
      The pipe2 syscall is part of OpenBSD since version 5.7 and thus exists in
      all officially supported versions.
      
      Follows CL 38426 and CL 94035
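
For callers nothing changes: `os.Pipe` keeps its signature. The following is a minimal sketch of a round trip through a pipe, where `roundTrip` is a hypothetical helper and not part of this change. On Unix systems `os.Pipe` is expected to hand back descriptors with close-on-exec set, and `pipe2` lets the kernel apply such flags when the pipe is created.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// roundTrip (hypothetical helper) writes msg into a pipe obtained from
// os.Pipe and reads it back from the other end.
func roundTrip(msg string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()

	// A short message fits in the pipe buffer, so this write does not block.
	if _, err := w.WriteString(msg); err != nil {
		w.Close()
		return "", err
	}
	// Close the write end so the reader sees EOF.
	if err := w.Close(); err != nil {
		return "", err
	}

	data, err := io.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	got, err := roundTrip("hello")
	if err != nil {
		fmt.Println("pipe failed:", err)
		return
	}
	fmt.Println(got)
}
```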
      
      Change-Id: I8f93ecbc89664241f1b6b0d069e948776941b1d0
      Reviewed-on: https://go-review.googlesource.com/97356
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile/internal/ssa: clear branch likeliness in clobberBlock · 81786649
      Philip Hofer authored
      The branchelim pass makes some blocks unreachable, but does not
      remove them from Func.Values. Consequently, ssacheck complains
      when it finds a block with a non-zero likeliness value but no
      successors.
      
      Fixes #24014
      
      Change-Id: I2dcf1d8f4e769a2f363508dab3b11198ead336b6
      Reviewed-on: https://go-review.googlesource.com/96075
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Run-TryBot: Philip Hofer <phofer@umich.edu>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: improve 386/amd64 systemstack · c5d6c42d
      Josh Bleecher Snyder authored
      Minor improvements, noticed while investigating other things.
      
      Shorten the prologue.
      
      Make branch direction better for static branch prediction;
      the most common case by far is switching stacks (g==curg).
      
      Change-Id: Ib2211d3efecb60446355cda56194221ccb78057d
      Reviewed-on: https://go-review.googlesource.com/97377
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • go/doc: replace unexported values with underscore if necessary · f399af31
      Joe Tsai authored
      When a var or const declaration contains a mixture of exported and unexported
      identifiers, replace the unexported identifiers with underscore.
      Otherwise, the LHS and the RHS may mismatch or the declaration may mismatch
      with an iota from above.
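
The effect can be illustrated with ordinary Go declarations. This is a sketch with made-up names (`Alpha`, `beta`, `First`, and so on), not output produced by go/doc. Dropping `beta` alone would leave `var Alpha, Gamma = 1, 2, 3`, which does not compile, while the underscore form keeps each name lined up with its value and keeps `iota` rows intact.

```go
package main

import "fmt"

// Source declarations mixing exported and unexported names.
var Alpha, beta, Gamma = 1, 2, 3

const (
	First, second = iota, iota + 10 // 0, 10
	Third, fourth                   // 1, 11 (implicit repetition of the row above)
)

// The same declarations with each unexported name replaced by an underscore.
// The number of names still matches the number of values on every row.
var AlphaDoc, _, GammaDoc = 1, 2, 3

const (
	FirstDoc, _ = iota, iota + 10
	ThirdDoc, _
)

func main() {
	// The exported names keep the values they had in the source declarations.
	fmt.Println(Alpha == AlphaDoc, Gamma == GammaDoc)
	fmt.Println(First == FirstDoc, Third == ThirdDoc)
}
```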
      
      Fixes #22426
      
      Change-Id: Icd5fb81b4ece647232a9f7d05cb140227091e9cb
      Reviewed-on: https://go-review.googlesource.com/94877
      Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math: optimize sinh and cosh · ed6c6c9c
      erifan01 authored
      Improve performance by reducing unnecessary function calls
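
The commit message does not spell out which calls were removed. One plausible way to save a call, shown here purely as an assumption, is to evaluate the exponential once and reuse it through exp(-x) = 1/exp(x). The names `sinhOneExp`, `coshOneExp` and `closeTo` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// sinhOneExp (hypothetical) evaluates sinh(x) with a single math.Exp call,
// using exp(-x) == 1/exp(x). It loses accuracy for very small |x| and
// overflows for large |x|, so it is only a sketch for moderate arguments.
func sinhOneExp(x float64) float64 {
	ex := math.Exp(x)
	return (ex - 1/ex) * 0.5
}

// coshOneExp (hypothetical) is the matching sketch for cosh(x).
func coshOneExp(x float64) float64 {
	ex := math.Exp(x)
	return (ex + 1/ex) * 0.5
}

// closeTo reports whether a and b agree to within a relative tolerance.
func closeTo(a, b, tol float64) bool {
	return math.Abs(a-b) <= tol*math.Max(math.Abs(a), math.Abs(b))
}

func main() {
	for _, x := range []float64{0.5, 1, 2, 5} {
		fmt.Println(x,
			closeTo(sinhOneExp(x), math.Sinh(x), 1e-12),
			closeTo(coshOneExp(x), math.Cosh(x), 1e-12))
	}
}
```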
      
      Benchmarks:
      
      name     old time/op  new time/op  delta
      Cosh-8   229ns ± 0%   138ns ± 0%  -39.74%  (p=0.008 n=5+5)
      Sinh-8   231ns ± 0%   139ns ± 0%  -39.83%  (p=0.008 n=5+5)
      
      Change-Id: Icab5485849bbfaafca8429d06b67c558101f4f3c
      Reviewed-on: https://go-review.googlesource.com/85477
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • runtime: short-circuit typedmemmove when dst==src · 486caa26
      Josh Bleecher Snyder authored
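
The idea in the title can be sketched in user-level Go: when the destination and source are the same object, the copy is a no-op and can be skipped. `point` and `copyPoint` are hypothetical and are not the runtime's code.

```go
package main

import "fmt"

type point struct{ x, y, z int64 }

// copyPoint (hypothetical) copies *src into *dst. It returns early when both
// pointers refer to the same object, since the copy would change nothing.
// The result reports whether a copy was actually performed.
func copyPoint(dst, src *point) bool {
	if dst == src {
		return false
	}
	*dst = *src
	return true
}

func main() {
	a := point{1, 2, 3}
	var b point
	fmt.Println(copyPoint(&a, &a)) // same object: nothing to do
	fmt.Println(copyPoint(&b, &a), b == a)
}
```
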
      Change-Id: I855268a4c0d07ad602ec90f5da66422d3d87c5f2
      Reviewed-on: https://go-review.googlesource.com/94595
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: fix bit-test rules for highest bit · 68def820
      Giovanni Bajo authored
      Bit-test rules failed to match when matching the highest bit
      of a word because operands in SSA are signed int64. Fix
      them by treating them as unsigned (and correctly handling
      32-bit operands as well).
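
The failure mode can be reproduced outside the compiler. In this sketch, a mask for the top bit is stored as a signed int64, as the message describes for SSA operands, and a naive power-of-two test rejects it because the value is negative. The helper names are hypothetical and do not claim to match the compiler's own.

```go
package main

import "fmt"

// signedPow2 is the naive check on the signed representation.
func signedPow2(c int64) bool { return c > 0 && c&(c-1) == 0 }

// unsigned64Pow2 reinterprets c as an unsigned 64-bit value first.
func unsigned64Pow2(c int64) bool {
	u := uint64(c)
	return u != 0 && u&(u-1) == 0
}

// unsigned32Pow2 keeps only the low 32 bits, which matters when a 32-bit
// constant has been sign-extended into the int64.
func unsigned32Pow2(c int64) bool {
	u := uint32(c)
	return u != 0 && u&(u-1) == 0
}

func main() {
	top64 := int64(-1) << 63 // bit 63 set: the most negative int64
	top32 := int64(-1) << 31 // bit 31 set, sign-extended to 64 bits

	fmt.Println(signedPow2(top64), unsigned64Pow2(top64))
	fmt.Println(unsigned64Pow2(top32), unsigned32Pow2(top32))
}
```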
      
      Tests will be added in next CL.
      
      Change-Id: I491c4e88e7e2f87e9bb72bd0d9fa5d4025b90736
      Reviewed-on: https://go-review.googlesource.com/94765
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: fold bit masking on bits that have been shifted away · 098208a0
      Giovanni Bajo authored
      Spotted while working on #18943; it triggers once during bootstrap.
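
The message does not list the rewrite rules themselves. As a sketch of the underlying arithmetic only, a mask that selects nothing but bits already cleared by a preceding shift always yields zero, so the whole expression can be folded to a constant. `deadAfterShl` and `deadAfterShr` are hypothetical helpers.

```go
package main

import "fmt"

// deadAfterShl reports whether mask selects only bits that a left shift by s
// has already cleared, in which case (x << s) & mask is 0 for every x.
func deadAfterShl(s uint, mask uint64) bool {
	if s >= 64 {
		return true
	}
	return mask&(^uint64(0)<<s) == 0
}

// deadAfterShr is the same test for an unsigned right shift by s.
func deadAfterShr(s uint, mask uint64) bool {
	if s >= 64 {
		return true
	}
	return mask&(^uint64(0)>>s) == 0
}

func main() {
	fmt.Println(deadAfterShl(8, 0xFF), deadAfterShl(8, 0x1FF))
	fmt.Println(deadAfterShr(56, 0xFF00), deadAfterShr(56, 0xFF))

	// Spot check with a concrete value.
	x := uint64(0xDEADBEEFCAFEF00D)
	fmt.Println((x<<8)&0xFF == 0)
}
```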
      
      Change-Id: Ia4330ccc6395627c233a8eb4dcc0e3e2a770bea7
      Reviewed-on: https://go-review.googlesource.com/94764
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/ssa: combine zero stores into larger stores on arm64 · ecd9e8a2
      Chad Rosier authored
      This reduces the go tool binary on arm64 by 12k.
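
As an illustration of source that could benefit, the sketch below zeroes several adjacent fields. Whether a given compiler version actually merges these stores is an assumption. The generated code can be inspected with something like `GOARCH=arm64 go build -gcflags=-S`. `header` and `reset` are hypothetical.

```go
package main

import "fmt"

// header has four fields laid out next to each other in memory.
type header struct {
	a, b uint32
	c, d uint64
}

// reset zeroes adjacent fields one by one. Stores like these are candidates
// for being combined into fewer, wider zero stores by the compiler.
func reset(h *header) {
	h.a = 0
	h.b = 0
	h.c = 0
	h.d = 0
}

func main() {
	h := header{1, 2, 3, 4}
	reset(&h)
	fmt.Println(h == header{})
}
```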
      
      go1 results on Amberwing:
      name                   old time/op    new time/op    delta
      RegexpMatchEasy0_32       249ns ± 0%     249ns ± 0%    ~     (p=0.087 n=10+10)
      RegexpMatchEasy0_1K       584ns ± 0%     584ns ± 0%    ~     (all equal)
      RegexpMatchEasy1_32       246ns ± 0%     246ns ± 0%    ~     (p=1.000 n=10+10)
      RegexpMatchEasy1_1K       806ns ± 0%     806ns ± 0%    ~     (p=0.706 n=10+9)
      RegexpMatchMedium_32      314ns ± 0%     314ns ± 0%    ~     (all equal)
      RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%    ~     (p=0.245 n=10+8)
      RegexpMatchHard_32       2.75µs ± 1%    2.75µs ± 1%    ~     (p=0.690 n=10+10)
      RegexpMatchHard_1K       78.9µs ± 0%    78.9µs ± 1%    ~     (p=0.295 n=9+9)
      FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
      FmtFprintfString          112ns ± 0%     112ns ± 0%    ~     (all equal)
      FmtFprintfInt             117ns ± 0%     116ns ± 0%  -0.85%  (p=0.000 n=10+10)
      FmtFprintfIntInt          181ns ± 0%     181ns ± 0%    ~     (all equal)
      FmtFprintfPrefixedInt     222ns ± 0%     224ns ± 0%  +0.90%  (p=0.000 n=9+10)
      FmtFprintfFloat           318ns ± 1%     322ns ± 0%    ~     (p=0.059 n=10+8)
      FmtManyArgs               736ns ± 1%     735ns ± 0%    ~     (p=0.206 n=9+9)
      Gzip                      437ms ± 0%     436ms ± 0%  -0.25%  (p=0.000 n=10+10)
      HTTPClientServer         89.8µs ± 1%    90.2µs ± 2%    ~     (p=0.393 n=10+10)
      JSONEncode               20.1ms ± 1%    20.2ms ± 1%    ~     (p=0.065 n=9+10)
      JSONDecode               94.2ms ± 1%    93.9ms ± 1%  -0.42%  (p=0.043 n=10+10)
      GobDecode                12.7ms ± 1%    12.8ms ± 2%  +0.94%  (p=0.019 n=10+10)
      GobEncode                12.1ms ± 0%    12.1ms ± 0%    ~     (p=0.052 n=10+10)
      Mandelbrot200            5.06ms ± 0%    5.05ms ± 0%  -0.04%  (p=0.000 n=9+10)
      TimeParse                 450ns ± 3%     446ns ± 0%    ~     (p=0.238 n=10+9)
      TimeFormat                485ns ± 1%     483ns ± 1%    ~     (p=0.073 n=10+10)
      Template                 90.4ms ± 0%    90.7ms ± 0%  +0.29%  (p=0.000 n=8+10)
      GoParse                  6.01ms ± 0%    6.03ms ± 0%  +0.35%  (p=0.000 n=10+10)
      BinaryTree17              11.7s ± 0%     11.7s ± 0%    ~     (p=0.481 n=10+10)
      Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.315 n=10+10)
      Fannkuch11                3.40s ± 0%     3.37s ± 0%  -0.92%  (p=0.000 n=10+10)
      [Geo mean]               67.9µs         67.9µs       +0.02%
      
      name                   old speed      new speed      delta
      RegexpMatchEasy0_32     128MB/s ± 0%   128MB/s ± 0%  -0.08%  (p=0.003 n=8+10)
      RegexpMatchEasy0_1K    1.75GB/s ± 0%  1.75GB/s ± 0%    ~     (p=0.642 n=8+10)
      RegexpMatchEasy1_32     130MB/s ± 0%   130MB/s ± 0%    ~     (p=0.690 n=10+9)
      RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%    ~     (p=0.661 n=10+9)
      RegexpMatchMedium_32   3.18MB/s ± 0%  3.18MB/s ± 0%    ~     (all equal)
      RegexpMatchMedium_1K   19.7MB/s ± 0%  19.6MB/s ± 0%    ~     (p=0.190 n=10+9)
      RegexpMatchHard_32     11.6MB/s ± 0%  11.6MB/s ± 1%    ~     (p=0.669 n=10+10)
      RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.718 n=9+9)
      Gzip                   44.4MB/s ± 0%  44.5MB/s ± 0%  +0.24%  (p=0.000 n=10+10)
      JSONEncode             96.5MB/s ± 1%  96.1MB/s ± 1%    ~     (p=0.065 n=9+10)
      JSONDecode             20.6MB/s ± 1%  20.7MB/s ± 1%  +0.42%  (p=0.041 n=10+10)
      GobDecode              60.6MB/s ± 1%  60.0MB/s ± 2%  -0.92%  (p=0.016 n=10+10)
      GobEncode              63.4MB/s ± 0%  63.6MB/s ± 0%    ~     (p=0.055 n=10+10)
      Template               21.5MB/s ± 0%  21.4MB/s ± 0%  -0.30%  (p=0.000 n=9+10)
      GoParse                9.64MB/s ± 0%  9.61MB/s ± 0%  -0.36%  (p=0.000 n=10+10)
      Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.323 n=10+10)
      [Geo mean]             56.0MB/s       55.9MB/s       -0.07%
      
      Change-Id: Ia732fa57fbcf4767d72382516d9f16705d177736
      Reviewed-on: https://go-review.googlesource.com/96435
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/compile: tighten after lowering · 3a9e4440
      Josh Bleecher Snyder authored
      Moving tighten after lowering benefits from the removal of values by
      lowering and lowered CSE. It lets us make better decisions about
      which values are rematerializable and which generate flags.
      Empirically, it lowers stack usage (by avoiding spills)
      and generates slightly smaller and faster binaries.
      
      
      Fixes #19853
      Fixes #21041
      
      name        old time/op       new time/op       delta
      Template          195ms ± 4%        193ms ± 4%  -1.33%  (p=0.000 n=92+97)
      Unicode          94.1ms ± 9%       92.5ms ± 8%  -1.66%  (p=0.002 n=97+95)
      GoTypes           572ms ± 5%        566ms ± 7%  -0.92%  (p=0.001 n=95+98)
      Compiler          2.56s ± 4%        2.52s ± 3%  -1.41%  (p=0.000 n=94+97)
      SSA               6.52s ± 2%        6.47s ± 3%  -0.82%  (p=0.000 n=96+94)
      Flate             117ms ± 5%        116ms ± 7%  -0.72%  (p=0.018 n=97+97)
      GoParser          148ms ± 6%        146ms ± 4%  -0.97%  (p=0.002 n=98+95)
      Reflect           370ms ± 7%        363ms ± 6%  -1.79%  (p=0.000 n=99+98)
      Tar               175ms ± 6%        173ms ± 6%  -1.11%  (p=0.001 n=94+95)
      XML               204ms ± 6%        201ms ± 5%  -1.49%  (p=0.000 n=97+96)
      [Geo mean]        363ms             359ms       -1.22%
      
      name        old user-time/op  new user-time/op  delta
      Template          251ms ± 5%        245ms ± 5%  -2.40%  (p=0.000 n=97+93)
      Unicode           131ms ±10%        128ms ± 9%  -1.93%  (p=0.001 n=100+99)
      GoTypes           760ms ± 4%        752ms ± 4%  -0.96%  (p=0.000 n=97+95)
      Compiler          3.51s ± 3%        3.48s ± 2%  -1.04%  (p=0.000 n=96+95)
      SSA               9.57s ± 4%        9.52s ± 2%  -0.50%  (p=0.004 n=97+96)
      Flate             149ms ± 6%        147ms ± 6%  -1.46%  (p=0.000 n=98+96)
      GoParser          184ms ± 5%        181ms ± 7%  -1.84%  (p=0.000 n=98+97)
      Reflect           469ms ± 6%        461ms ± 6%  -1.69%  (p=0.000 n=100+98)
      Tar               219ms ± 8%        217ms ± 7%  -0.90%  (p=0.035 n=96+96)
      XML               255ms ± 5%        251ms ± 6%  -1.48%  (p=0.000 n=98+98)
      [Geo mean]        476ms             469ms       -1.42%
      
      name        old alloc/op      new alloc/op      delta
      Template         37.8MB ± 0%       37.8MB ± 0%  -0.17%  (p=0.000 n=100+100)
      Unicode          28.8MB ± 0%       28.8MB ± 0%  -0.02%  (p=0.000 n=100+95)
      GoTypes           112MB ± 0%        112MB ± 0%  -0.20%  (p=0.000 n=100+97)
      Compiler          466MB ± 0%        464MB ± 0%  -0.27%  (p=0.000 n=100+100)
      SSA              1.49GB ± 0%       1.49GB ± 0%  -0.08%  (p=0.000 n=100+99)
      Flate            24.4MB ± 0%       24.3MB ± 0%  -0.25%  (p=0.000 n=98+99)
      GoParser         30.7MB ± 0%       30.6MB ± 0%  -0.26%  (p=0.000 n=99+100)
      Reflect          76.4MB ± 0%       76.4MB ± 0%    ~     (p=0.253 n=100+100)
      Tar              38.9MB ± 0%       38.8MB ± 0%  -0.20%  (p=0.000 n=100+97)
      XML              41.5MB ± 0%       41.4MB ± 0%  -0.19%  (p=0.000 n=100+98)
      [Geo mean]       77.5MB            77.4MB       -0.16%
      
      name        old allocs/op     new allocs/op     delta
      Template           381k ± 0%         381k ± 0%  -0.15%  (p=0.000 n=100+100)
      Unicode            342k ± 0%         342k ± 0%  -0.01%  (p=0.000 n=100+98)
      GoTypes           1.19M ± 0%        1.18M ± 0%  -0.24%  (p=0.000 n=100+100)
      Compiler          4.52M ± 0%        4.50M ± 0%  -0.29%  (p=0.000 n=100+100)
      SSA               12.3M ± 0%        12.3M ± 0%  -0.11%  (p=0.000 n=100+100)
      Flate              234k ± 0%         234k ± 0%  -0.26%  (p=0.000 n=99+96)
      GoParser           318k ± 0%         317k ± 0%  -0.21%  (p=0.000 n=99+100)
      Reflect            974k ± 0%         974k ± 0%  -0.03%  (p=0.000 n=100+100)
      Tar                392k ± 0%         391k ± 0%  -0.17%  (p=0.000 n=100+99)
      XML                404k ± 0%         403k ± 0%  -0.24%  (p=0.000 n=99+99)
      [Geo mean]         794k              792k       -0.17%
      
      name        old object-bytes  new object-bytes  delta
      Template          393kB ± 0%        392kB ± 0%  -0.19%  (p=0.008 n=5+5)
      Unicode           207kB ± 0%        207kB ± 0%    ~     (all equal)
      GoTypes          1.23MB ± 0%       1.22MB ± 0%  -0.11%  (p=0.008 n=5+5)
      Compiler         4.34MB ± 0%       4.33MB ± 0%  -0.15%  (p=0.008 n=5+5)
      SSA              9.85MB ± 0%       9.85MB ± 0%  -0.07%  (p=0.008 n=5+5)
      Flate             235kB ± 0%        234kB ± 0%  -0.59%  (p=0.008 n=5+5)
      GoParser          297kB ± 0%        296kB ± 0%  -0.22%  (p=0.008 n=5+5)
      Reflect          1.03MB ± 0%       1.03MB ± 0%  -0.00%  (p=0.008 n=5+5)
      Tar               332kB ± 0%        331kB ± 0%  -0.15%  (p=0.008 n=5+5)
      XML               413kB ± 0%        412kB ± 0%  -0.19%  (p=0.008 n=5+5)
      [Geo mean]        728kB             727kB       -0.17%
      
      Change-Id: I9b5cdb668ed102a001897a05e833105acba220a2
      Reviewed-on: https://go-review.googlesource.com/95995
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
  2. 26 Feb, 2018 22 commits
  3. 25 Feb, 2018 1 commit
  4. 24 Feb, 2018 3 commits
    • os/user: obtain a user home path on Windows · 7a218942
      Lubomir I. Ivanov (VMware) authored
      newUserFromSid() is extended so that the retrieval of the user home
      path based on a user SID becomes possible.
      
      (1) The primary method is to look up the following key in the Windows
      registry:
        HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList\[SID]

      If the key does not exist, the user might not have logged in yet.
      If (1) fails, it falls back to (2).
      
      (2) The second method is to take the default home path for users
      (e.g. from WINAPI's GetProfilesDirectory()) and append the username
      to it. The procedure is along the lines of:
        c:\Users + \ + <username>
      
      The function newUser() now requires the following arguments:
        uid, gid, dir, username, domain
      This is done to avoid multiple calls to usid.String() and
      usid.LookupAccount("") in the case of a newUserFromSid()
      call stack.
      
      The functions current() and newUserFromSid() both call newUser()
      supplying the arguments in question. The helpers
      lookupUsernameAndDomain() and findHomeDirInRegistry() are
      added.
      
      This commit also updates:
      - go/build/deps_test.go, so that the test now includes the
      "internal/syscall/windows/registry" import.
      - os/user/user_test.go, so that User.HomeDir is tested on Windows.
      
      GitHub-Last-Rev: 25423e2a3820121f4c42321e7a77a3977f409724
      GitHub-Pull-Request: golang/go#23822
      Change-Id: I6c3ad1c4ce3e7bc0d1add024951711f615b84ee5
      Reviewed-on: https://go-review.googlesource.com/93935
      Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
      Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/syntax: use stringer for operators and tokens · c8791538
      Daniel Martí authored
      With its new -linecomment flag, it is now possible to use stringer on
      values whose strings aren't valid identifiers. This is the case with
      tokens and operators in Go.
      
      Operator already had inline comments with each operator's string
      representation; only minor modifications were needed. The inline
      comments were added to each of the token names, using the same strategy.
      
      Comments that were previously inline or part of the string arrays were
      moved to the line immediately before the name they correspond to.
      
      Finally, declare tokStrFast as a function that uses the generated arrays
      directly. Avoiding the branch and strconv call means that we avoid a
      performance regression in the scanner, perhaps due to the lack of
      mid-stack inlining.
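
A small sketch of the pattern, using a made-up three-operator type rather than the real syntax package. The tables `opNames` and `opIndex` are hand-written stand-ins for what stringer would generate from the line comments, and `opStrFast` is a hypothetical analogue of the direct-lookup function described above.

```go
package main

import (
	"fmt"
	"strconv"
)

type Operator uint

// A real setup would drive this with a go:generate directive that runs
// "stringer -type Operator -linecomment"; the line comments below would
// then become the string form of each value.
const (
	_   Operator = iota
	Add          // +
	Sub          // -
	Mul          // *
)

// Hand-written stand-ins for the generated name and index tables.
const opNames = "+-*"

var opIndex = [...]uint8{0, 1, 2, 3}

// String mirrors the usual generated shape: a range check with a numeric
// fallback built by strconv, then a slice of the name table.
func (op Operator) String() string {
	i := op - 1
	if i >= Operator(len(opIndex)-1) {
		return "Operator(" + strconv.FormatUint(uint64(op), 10) + ")"
	}
	return opNames[opIndex[i]:opIndex[i+1]]
}

// opStrFast indexes the tables directly, with no range check and no strconv
// call. It panics for values outside Add..Mul, so callers must pass valid ones.
func opStrFast(op Operator) string {
	return opNames[opIndex[op-1]:opIndex[op]]
}

func main() {
	fmt.Println(Add, Sub, Mul, Operator(9))
	fmt.Println(opStrFast(Mul))
}
```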
      
      Performance is not affected. Measured with 'go test -run StdLib -fast'
      on an X1 Carbon Gen2 (i5-4300U @ 1.90GHz, 8GB RAM, SSD), the best of 5
      runs before and after the changes are:
      
      	parsed 1709399 lines (3763 files) in 1.707402159s (1001169 lines/s)
      	allocated 449.282Mb (263.137Mb/s)
      
      	parsed 1709329 lines (3765 files) in 1.706663154s (1001562 lines/s)
      	allocated 449.290Mb (263.256Mb/s)
      
      Change-Id: Idcc4f83393fcadd6579700e3602c09496ea2625b
      Reviewed-on: https://go-review.googlesource.com/95357
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math/big: speed-up addMulVVW on amd64 · c3935c08
      Ilya Tocar authored
      Use MULX/ADOX/ADCX instructions to speed up addMulVVW
      when they are available. addMulVVW is a hotspot in rsa.
      This is faster than the ADD/ADC/IMUL version because ADOX/ADCX only
      modify the carry/overflow flags, so they can be interleaved with each other
      and with MULX, which doesn't modify flags at all.
      Increasing the unroll factor to e.g. 16 makes rsa 1% faster, but 3PrimeRSA2048Decrypt
      performance falls back to baseline.
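
For orientation, the operation being accelerated is a multiply-accumulate over word vectors: z += x*y with a carry word returned. The sketch below is a portable reference written with `math/bits`, fixed to 64-bit words for clarity. `addMulVVWRef` is a hypothetical name, and this is neither the assembly routine nor the package's own fallback code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addMulVVWRef (hypothetical name) computes z += x*y word by word and returns
// the final carry word. x must be at least as long as z.
func addMulVVWRef(z, x []uint64, y uint64) (carry uint64) {
	for i := range z {
		// 128-bit product of one word of x with y.
		hi, lo := bits.Mul64(x[i], y)

		// Add the existing word of z, then the carry from the previous step.
		// The two carry-outs are folded into hi, which cannot overflow because
		// x*y + z + carry always fits in two words.
		var c uint64
		lo, c = bits.Add64(lo, z[i], 0)
		hi += c
		lo, c = bits.Add64(lo, carry, 0)
		hi += c

		z[i] = lo
		carry = hi
	}
	return carry
}

func main() {
	z := []uint64{1, 0}
	c := addMulVVWRef(z, []uint64{2, 0}, 3)
	fmt.Println(z, c)
}
```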
      
      Updates #20058
      
      AddMulVVW/1-8                       3.28ns ± 2%     3.26ns ± 3%     ~     (p=0.107 n=10+10)
      AddMulVVW/2-8                       4.26ns ± 2%     4.24ns ± 3%     ~     (p=0.327 n=9+9)
      AddMulVVW/3-8                       5.07ns ± 2%     5.26ns ± 2%   +3.73%  (p=0.000 n=10+10)
      AddMulVVW/4-8                       6.40ns ± 2%     6.50ns ± 2%   +1.61%  (p=0.000 n=10+10)
      AddMulVVW/5-8                       6.77ns ± 2%     6.86ns ± 1%   +1.38%  (p=0.001 n=9+9)
      AddMulVVW/10-8                      12.2ns ± 2%     10.6ns ± 3%  -13.65%  (p=0.000 n=10+10)
      AddMulVVW/100-8                     79.7ns ± 2%     52.4ns ± 1%  -34.17%  (p=0.000 n=10+10)
      AddMulVVW/1000-8                     695ns ± 1%      491ns ± 2%  -29.39%  (p=0.000 n=9+10)
      AddMulVVW/10000-8                   7.26µs ± 2%     5.92µs ± 6%  -18.42%  (p=0.000 n=10+10)
      AddMulVVW/100000-8                  72.6µs ± 2%     62.2µs ± 2%  -14.31%  (p=0.000 n=10+10)
      
      crypto/rsa speed-up is smaller, but still noticeable:
      
      RSA2048Decrypt-8        1.61ms ± 1%  1.38ms ± 1%  -14.13%  (p=0.000 n=10+10)
      RSA2048Sign-8           1.93ms ± 1%  1.70ms ± 1%  -11.86%  (p=0.000 n=10+10)
      3PrimeRSA2048Decrypt-8   932µs ± 0%   828µs ± 0%  -11.15%  (p=0.000 n=10+10)
      
      Results on crypto/tls:
      
      HandshakeServer/RSA-8                        901µs ± 1%    777µs ± 0%  -13.70%  (p=0.000 n=10+8)
      HandshakeServer/ECDHE-P256-RSA-8            1.01ms ± 1%   0.90ms ± 0%  -11.53%  (p=0.000 n=10+9)
      
      Full math/big benchmarks:
      
      name                              old time/op    new time/op     delta
      AddVV/1-8                           3.74ns ± 6%     3.55ns ± 2%     ~     (p=0.082 n=10+8)
      AddVV/2-8                           3.96ns ± 2%     3.98ns ± 5%     ~     (p=0.794 n=10+9)
      AddVV/3-8                           4.97ns ± 2%     4.94ns ± 1%     ~     (p=0.081 n=10+9)
      AddVV/4-8                           5.59ns ± 2%     5.59ns ± 2%     ~     (p=0.809 n=10+10)
      AddVV/5-8                           6.63ns ± 1%     6.62ns ± 1%     ~     (p=0.560 n=9+10)
      AddVV/10-8                          8.11ns ± 1%     8.11ns ± 2%     ~     (p=0.402 n=10+10)
      AddVV/100-8                         46.9ns ± 2%     46.8ns ± 1%     ~     (p=0.809 n=10+10)
      AddVV/1000-8                         389ns ± 1%      391ns ± 4%     ~     (p=0.809 n=10+10)
      AddVV/10000-8                       5.05µs ± 5%     4.98µs ± 2%     ~     (p=0.113 n=9+10)
      AddVV/100000-8                      55.3µs ± 3%     55.2µs ± 3%     ~     (p=0.796 n=10+10)
      AddVW/1-8                           3.04ns ± 3%     3.02ns ± 3%     ~     (p=0.538 n=10+10)
      AddVW/2-8                           3.57ns ± 2%     3.61ns ± 2%   +1.12%  (p=0.032 n=9+9)
      AddVW/3-8                           3.77ns ± 1%     3.79ns ± 2%     ~     (p=0.719 n=10+10)
      AddVW/4-8                           4.69ns ± 1%     4.69ns ± 2%     ~     (p=0.920 n=10+9)
      AddVW/5-8                           4.58ns ± 1%     4.58ns ± 1%     ~     (p=0.812 n=10+10)
      AddVW/10-8                          7.62ns ± 2%     7.63ns ± 1%     ~     (p=0.926 n=10+10)
      AddVW/100-8                         41.1ns ± 2%     42.4ns ± 3%   +3.34%  (p=0.000 n=10+10)
      AddVW/1000-8                         386ns ± 2%      389ns ± 4%     ~     (p=0.514 n=10+10)
      AddVW/10000-8                       3.88µs ± 3%     3.87µs ± 3%     ~     (p=0.448 n=10+10)
      AddVW/100000-8                      41.2µs ± 3%     41.7µs ± 3%     ~     (p=0.148 n=10+10)
      AddMulVVW/1-8                       3.28ns ± 2%     3.26ns ± 3%     ~     (p=0.107 n=10+10)
      AddMulVVW/2-8                       4.26ns ± 2%     4.24ns ± 3%     ~     (p=0.327 n=9+9)
      AddMulVVW/3-8                       5.07ns ± 2%     5.26ns ± 2%   +3.73%  (p=0.000 n=10+10)
      AddMulVVW/4-8                       6.40ns ± 2%     6.50ns ± 2%   +1.61%  (p=0.000 n=10+10)
      AddMulVVW/5-8                       6.77ns ± 2%     6.86ns ± 1%   +1.38%  (p=0.001 n=9+9)
      AddMulVVW/10-8                      12.2ns ± 2%     10.6ns ± 3%  -13.65%  (p=0.000 n=10+10)
      AddMulVVW/100-8                     79.7ns ± 2%     52.4ns ± 1%  -34.17%  (p=0.000 n=10+10)
      AddMulVVW/1000-8                     695ns ± 1%      491ns ± 2%  -29.39%  (p=0.000 n=9+10)
      AddMulVVW/10000-8                   7.26µs ± 2%     5.92µs ± 6%  -18.42%  (p=0.000 n=10+10)
      AddMulVVW/100000-8                  72.6µs ± 2%     62.2µs ± 2%  -14.31%  (p=0.000 n=10+10)
      DecimalConversion-8                  108µs ±19%      104µs ± 4%     ~     (p=0.460 n=10+8)
      FloatString/100-8                    926ns ±14%      908ns ± 5%     ~     (p=0.398 n=9+9)
      FloatString/1000-8                  25.7µs ± 1%     25.7µs ± 1%     ~     (p=0.739 n=10+10)
      FloatString/10000-8                 2.13ms ± 1%     2.12ms ± 1%     ~     (p=0.353 n=10+10)
      FloatString/100000-8                 207ms ± 1%      206ms ± 2%     ~     (p=0.912 n=10+10)
      FloatAdd/10-8                       61.3ns ± 3%     61.9ns ± 3%     ~     (p=0.183 n=10+10)
      FloatAdd/100-8                      62.0ns ± 2%     62.9ns ± 4%     ~     (p=0.118 n=10+10)
      FloatAdd/1000-8                     84.7ns ± 2%     84.4ns ± 1%     ~     (p=0.591 n=10+10)
      FloatAdd/10000-8                     305ns ± 2%      306ns ± 1%     ~     (p=0.443 n=10+10)
      FloatAdd/100000-8                   2.45µs ± 1%     2.46µs ± 1%     ~     (p=0.782 n=10+10)
      FloatSub/10-8                       56.8ns ± 4%     56.5ns ± 5%     ~     (p=0.423 n=10+10)
      FloatSub/100-8                      57.3ns ± 4%     57.1ns ± 5%     ~     (p=0.540 n=10+10)
      FloatSub/1000-8                     66.8ns ± 4%     66.6ns ± 1%     ~     (p=0.868 n=10+10)
      FloatSub/10000-8                     199ns ± 1%      198ns ± 1%     ~     (p=0.287 n=10+9)
      FloatSub/100000-8                   1.47µs ± 2%     1.47µs ± 2%     ~     (p=0.920 n=10+9)
      ParseFloatSmallExp-8                8.74µs ±10%     9.48µs ±10%   +8.51%  (p=0.010 n=9+10)
      ParseFloatLargeExp-8                39.2µs ±25%     39.6µs ±12%     ~     (p=0.529 n=10+10)
      GCD10x10/WithoutXY-8                 173ns ±23%      177ns ±20%     ~     (p=0.698 n=10+10)
      GCD10x10/WithXY-8                    736ns ±12%      728ns ±16%     ~     (p=0.838 n=10+10)
      GCD10x100/WithoutXY-8                325ns ±16%      326ns ±14%     ~     (p=0.912 n=10+10)
      GCD10x100/WithXY-8                  1.14µs ±13%     1.16µs ± 6%     ~     (p=0.287 n=10+9)
      GCD10x1000/WithoutXY-8               851ns ±25%      820ns ±12%     ~     (p=0.592 n=10+10)
      GCD10x1000/WithXY-8                 2.89µs ±17%     2.85µs ± 5%     ~     (p=1.000 n=10+9)
      GCD10x10000/WithoutXY-8             6.66µs ±12%     6.82µs ±19%     ~     (p=0.529 n=10+10)
      GCD10x10000/WithXY-8                18.0µs ± 5%     17.2µs ±19%     ~     (p=0.315 n=7+10)
      GCD10x100000/WithoutXY-8            77.8µs ±18%     73.3µs ±11%     ~     (p=0.315 n=10+9)
      GCD10x100000/WithXY-8                186µs ±14%      204µs ±29%     ~     (p=0.218 n=10+10)
      GCD100x100/WithoutXY-8              1.09µs ± 1%     1.09µs ± 2%     ~     (p=0.117 n=9+10)
      GCD100x100/WithXY-8                 7.93µs ± 1%     7.97µs ± 1%   +0.52%  (p=0.006 n=10+10)
      GCD100x1000/WithoutXY-8             2.00µs ± 3%     2.04µs ± 6%     ~     (p=0.053 n=9+10)
      GCD100x1000/WithXY-8                9.23µs ± 1%     9.29µs ± 1%   +0.63%  (p=0.009 n=10+10)
      GCD100x10000/WithoutXY-8            10.2µs ±11%      9.7µs ± 6%     ~     (p=0.278 n=10+9)
      GCD100x10000/WithXY-8               33.3µs ± 4%     33.6µs ± 4%     ~     (p=0.481 n=10+10)
      GCD100x100000/WithoutXY-8            106µs ±17%      105µs ±13%     ~     (p=0.853 n=10+10)
      GCD100x100000/WithXY-8               289µs ±17%      276µs ± 8%     ~     (p=0.353 n=10+10)
      GCD1000x1000/WithoutXY-8            12.2µs ± 1%     12.1µs ± 1%   -0.45%  (p=0.007 n=10+10)
      GCD1000x1000/WithXY-8                131µs ± 1%      132µs ± 0%   +0.93%  (p=0.000 n=9+7)
      GCD1000x10000/WithoutXY-8           20.6µs ± 2%     20.6µs ± 1%     ~     (p=0.326 n=10+9)
      GCD1000x10000/WithXY-8               238µs ± 1%      237µs ± 1%     ~     (p=0.356 n=9+10)
      GCD1000x100000/WithoutXY-8           117µs ± 8%      114µs ±11%     ~     (p=0.190 n=10+10)
      GCD1000x100000/WithXY-8             1.51ms ± 1%     1.50ms ± 1%     ~     (p=0.053 n=9+10)
      GCD10000x10000/WithoutXY-8           220µs ± 1%      218µs ± 1%   -0.86%  (p=0.000 n=10+10)
      GCD10000x10000/WithXY-8             3.04ms ± 0%     3.05ms ± 0%   +0.33%  (p=0.001 n=9+10)
      GCD10000x100000/WithoutXY-8          513µs ± 0%      511µs ± 0%   -0.38%  (p=0.000 n=10+10)
      GCD10000x100000/WithXY-8            15.1ms ± 0%     15.0ms ± 0%     ~     (p=0.053 n=10+9)
      GCD100000x100000/WithoutXY-8        10.4ms ± 1%     10.4ms ± 2%     ~     (p=0.258 n=9+9)
      GCD100000x100000/WithXY-8            205ms ± 1%      205ms ± 1%     ~     (p=0.481 n=10+10)
      Hilbert-8                           1.25ms ±15%     1.24ms ±17%     ~     (p=0.853 n=10+10)
      Binomial-8                          3.03µs ±24%     2.90µs ±16%     ~     (p=0.481 n=10+10)
      QuoRem-8                            1.95µs ± 1%     1.95µs ± 2%     ~     (p=0.117 n=9+10)
      Exp-8                               5.12ms ± 2%     3.99ms ± 1%  -22.02%  (p=0.000 n=10+9)
      Exp2-8                              5.14ms ± 2%     3.98ms ± 0%  -22.55%  (p=0.000 n=10+9)
      Bitset-8                            16.4ns ± 2%     16.5ns ± 2%     ~     (p=0.311 n=9+10)
      BitsetNeg-8                         46.3ns ± 4%     45.8ns ± 4%     ~     (p=0.272 n=10+10)
      BitsetOrig-8                         250ns ±19%      247ns ±14%     ~     (p=0.671 n=10+10)
      BitsetNegOrig-8                      416ns ±14%      429ns ±14%     ~     (p=0.353 n=10+10)
      ModSqrt225_Tonelli-8                 400µs ± 0%      320µs ± 0%  -19.88%  (p=0.000 n=9+7)
      ModSqrt224_3Mod4-8                   123µs ± 1%       97µs ± 0%  -21.21%  (p=0.000 n=9+10)
      ModSqrt5430_Tonelli-8                1.87s ± 0%      1.39s ± 1%  -25.70%  (p=0.000 n=9+10)
      ModSqrt5430_3Mod4-8                  630ms ± 2%      465ms ± 1%  -26.12%  (p=0.000 n=10+10)
      Sqrt-8                              25.8µs ± 1%     25.9µs ± 0%   +0.66%  (p=0.002 n=10+8)
      IntSqr/1-8                          11.3ns ± 1%     11.3ns ± 2%     ~     (p=0.360 n=9+10)
      IntSqr/2-8                          26.6ns ± 1%     27.4ns ± 2%   +2.87%  (p=0.000 n=8+9)
      IntSqr/3-8                          36.5ns ± 6%     36.6ns ± 5%     ~     (p=0.589 n=10+10)
      IntSqr/5-8                          57.2ns ± 2%     57.8ns ± 1%   +0.92%  (p=0.045 n=10+9)
      IntSqr/8-8                           112ns ± 1%       93ns ± 1%  -16.60%  (p=0.000 n=10+10)
      IntSqr/10-8                          148ns ± 1%      129ns ± 5%  -12.85%  (p=0.000 n=10+10)
      IntSqr/20-8                          642ns ±28%      692ns ±21%     ~     (p=0.105 n=10+10)
      IntSqr/30-8                         1.03µs ±18%     1.06µs ±15%     ~     (p=0.422 n=10+8)
      IntSqr/50-8                         2.33µs ±14%     2.14µs ±20%     ~     (p=0.063 n=10+10)
      IntSqr/80-8                         4.06µs ±13%     3.72µs ±14%   -8.31%  (p=0.029 n=10+10)
      IntSqr/100-8                        5.79µs ±10%     5.20µs ±18%  -10.15%  (p=0.004 n=10+10)
      IntSqr/200-8                        17.1µs ± 1%     12.9µs ± 3%  -24.44%  (p=0.000 n=10+10)
      IntSqr/300-8                        35.9µs ± 0%     26.6µs ± 1%  -25.75%  (p=0.000 n=10+10)
      IntSqr/500-8                        84.9µs ± 0%     71.7µs ± 1%  -15.49%  (p=0.000 n=10+10)
      IntSqr/800-8                         170µs ± 1%      142µs ± 2%  -16.73%  (p=0.000 n=10+10)
      IntSqr/1000-8                        258µs ± 1%      218µs ± 1%  -15.65%  (p=0.000 n=10+10)
      Mul-8                               10.4ms ± 1%      8.3ms ± 0%  -20.05%  (p=0.000 n=10+9)
      Exp3Power/0x10-8                     311ns ±15%      321ns ±24%     ~     (p=0.447 n=10+10)
      Exp3Power/0x40-8                     358ns ±21%      346ns ±37%     ~     (p=0.591 n=10+10)
      Exp3Power/0x100-8                    611ns ±19%      570ns ±27%     ~     (p=0.393 n=10+10)
      Exp3Power/0x400-8                   1.31µs ±26%     1.34µs ±19%     ~     (p=0.853 n=10+10)
      Exp3Power/0x1000-8                  6.76µs ±23%     6.22µs ±16%     ~     (p=0.095 n=10+9)
      Exp3Power/0x4000-8                  37.6µs ±14%     36.4µs ±21%     ~     (p=0.247 n=10+10)
      Exp3Power/0x10000-8                  345µs ±14%      310µs ±11%   -9.99%  (p=0.005 n=10+10)
      Exp3Power/0x40000-8                 2.77ms ± 1%     2.34ms ± 1%  -15.47%  (p=0.000 n=10+10)
      Exp3Power/0x100000-8                25.1ms ± 1%     21.3ms ± 1%  -15.26%  (p=0.000 n=10+10)
      Exp3Power/0x400000-8                 225ms ± 1%      190ms ± 1%  -15.61%  (p=0.000 n=10+10)
      Fibo-8                              23.4ms ± 1%     23.3ms ± 0%     ~     (p=0.052 n=10+10)
      NatSqr/1-8                          58.4ns ±24%     59.8ns ±38%     ~     (p=0.739 n=10+10)
      NatSqr/2-8                           122ns ±21%      122ns ±16%     ~     (p=0.896 n=10+10)
      NatSqr/3-8                           140ns ±28%      148ns ±30%     ~     (p=0.288 n=10+10)
      NatSqr/5-8                           193ns ±29%      210ns ±34%     ~     (p=0.469 n=10+10)
      NatSqr/8-8                           317ns ±21%      296ns ±25%     ~     (p=0.393 n=10+10)
      NatSqr/10-8                          362ns ± 8%      373ns ±30%     ~     (p=0.617 n=9+10)
      NatSqr/20-8                         1.24µs ±16%     1.06µs ±29%  -14.57%  (p=0.019 n=10+10)
      NatSqr/30-8                         1.90µs ±32%     1.71µs ±10%     ~     (p=0.176 n=10+9)
      NatSqr/50-8                         4.22µs ±19%     3.67µs ± 7%  -13.03%  (p=0.017 n=10+9)
      NatSqr/80-8                         7.33µs ±20%     6.50µs ±15%  -11.26%  (p=0.009 n=10+10)
      NatSqr/100-8                        9.84µs ±18%     9.33µs ± 8%     ~     (p=0.280 n=10+10)
      NatSqr/200-8                        21.4µs ± 7%     20.0µs ±14%     ~     (p=0.075 n=10+10)
      NatSqr/300-8                        38.0µs ± 2%     31.3µs ±10%  -17.63%  (p=0.000 n=10+10)
      NatSqr/500-8                         102µs ± 5%      101µs ± 4%     ~     (p=0.780 n=9+10)
      NatSqr/800-8                         190µs ± 3%      166µs ± 6%  -12.29%  (p=0.000 n=10+10)
      NatSqr/1000-8                        277µs ± 2%      245µs ± 6%  -11.64%  (p=0.000 n=10+10)
      ScanPi-8                             144µs ±23%      149µs ±24%     ~     (p=0.579 n=10+10)
      StringPiParallel-8                  25.6µs ± 0%     25.8µs ± 0%   +0.69%  (p=0.000 n=9+10)
      Scan/10/Base2-8                      305ns ± 1%      309ns ± 1%   +1.32%  (p=0.000 n=10+9)
      Scan/100/Base2-8                    1.95µs ± 1%     1.98µs ± 1%   +1.10%  (p=0.000 n=10+10)
      Scan/1000/Base2-8                   19.5µs ± 1%     19.7µs ± 1%   +1.39%  (p=0.000 n=10+10)
      Scan/10000/Base2-8                   270µs ± 1%      272µs ± 1%   +0.58%  (p=0.024 n=9+9)
      Scan/100000/Base2-8                 10.3ms ± 0%     10.3ms ± 0%   +0.16%  (p=0.022 n=9+10)
      Scan/10/Base8-8                      146ns ± 4%      154ns ± 4%   +5.57%  (p=0.000 n=9+9)
      Scan/100/Base8-8                     748ns ± 1%      759ns ± 1%   +1.51%  (p=0.000 n=9+10)
      Scan/1000/Base8-8                   7.88µs ± 1%     8.00µs ± 1%   +1.64%  (p=0.000 n=10+10)
      Scan/10000/Base8-8                   155µs ± 1%      155µs ± 1%     ~     (p=0.968 n=10+9)
      Scan/100000/Base8-8                 9.11ms ± 0%     9.11ms ± 0%     ~     (p=0.604 n=9+10)
      Scan/10/Base10-8                     140ns ± 5%      149ns ± 5%   +6.39%  (p=0.000 n=9+10)
      Scan/100/Base10-8                    680ns ± 0%      688ns ± 1%   +1.08%  (p=0.000 n=9+10)
      Scan/1000/Base10-8                  7.09µs ± 1%     7.16µs ± 1%   +0.98%  (p=0.019 n=10+10)
      Scan/10000/Base10-8                  149µs ± 3%      150µs ± 3%     ~     (p=0.143 n=10+10)
      Scan/100000/Base10-8                9.16ms ± 0%     9.16ms ± 0%     ~     (p=0.661 n=10+9)
      Scan/10/Base16-8                     134ns ± 5%      135ns ± 3%     ~     (p=0.505 n=9+9)
      Scan/100/Base16-8                    560ns ± 1%      563ns ± 0%   +0.67%  (p=0.000 n=10+8)
      Scan/1000/Base16-8                  6.28µs ± 1%     6.26µs ± 1%     ~     (p=0.448 n=10+10)
      Scan/10000/Base16-8                  161µs ± 1%      162µs ± 1%   +0.74%  (p=0.008 n=9+9)
      Scan/100000/Base16-8                9.64ms ± 0%     9.64ms ± 0%     ~     (p=0.436 n=10+10)
      String/10/Base2-8                    116ns ±12%      118ns ±13%     ~     (p=0.645 n=10+10)
      String/100/Base2-8                   871ns ±23%      860ns ±22%     ~     (p=0.699 n=10+10)
      String/1000/Base2-8                 10.0µs ±20%     10.0µs ±23%     ~     (p=0.853 n=10+10)
      String/10000/Base2-8                 110µs ±21%      120µs ±25%     ~     (p=0.436 n=10+10)
      String/100000/Base2-8                768µs ±11%      733µs ±16%     ~     (p=0.393 n=10+10)
      String/10/Base8-8                   51.3ns ± 1%     51.0ns ± 3%     ~     (p=0.286 n=9+9)
      String/100/Base8-8                   284ns ± 9%      272ns ±12%     ~     (p=0.267 n=9+10)
      String/1000/Base8-8                 3.06µs ± 9%     3.04µs ±10%     ~     (p=0.739 n=10+10)
      String/10000/Base8-8                36.1µs ±14%     35.1µs ± 9%     ~     (p=0.447 n=10+9)
      String/100000/Base8-8                371µs ±12%      373µs ±16%     ~     (p=0.739 n=10+10)
      String/10/Base10-8                   167ns ±11%      165ns ± 9%     ~     (p=0.781 n=10+10)
      String/100/Base10-8                  727ns ± 1%      740ns ± 2%   +1.70%  (p=0.001 n=10+10)
      String/1000/Base10-8                5.30µs ±18%     5.37µs ±14%     ~     (p=0.631 n=10+10)
      String/10000/Base10-8               45.0µs ±14%     44.6µs ±10%     ~     (p=0.720 n=9+10)
      String/100000/Base10-8              5.10ms ± 1%     5.05ms ± 3%     ~     (p=0.211 n=9+10)
      String/10/Base16-8                  47.7ns ± 6%     47.7ns ± 6%     ~     (p=0.985 n=10+10)
      String/100/Base16-8                  221ns ±10%      234ns ±27%     ~     (p=0.541 n=10+10)
      String/1000/Base16-8                2.23µs ±11%     2.12µs ± 8%   -4.81%  (p=0.029 n=9+8)
      String/10000/Base16-8               28.3µs ±21%     28.5µs ±14%     ~     (p=0.796 n=10+10)
      String/100000/Base16-8               291µs ±16%      293µs ±15%     ~     (p=0.931 n=9+9)
      LeafSize/0-8                        2.43ms ± 1%     2.49ms ± 1%   +2.56%  (p=0.000 n=10+10)
      LeafSize/1-8                        49.7µs ± 9%     46.3µs ±16%   -6.78%  (p=0.017 n=10+9)
      LeafSize/2-8                        48.4µs ±18%     46.3µs ±19%     ~     (p=0.436 n=10+10)
      LeafSize/3-8                        81.7µs ± 3%     80.9µs ± 3%     ~     (p=0.278 n=10+9)
      LeafSize/4-8                        47.0µs ± 7%     47.9µs ±13%     ~     (p=0.905 n=9+10)
      LeafSize/5-8                        96.8µs ± 1%     97.3µs ± 2%     ~     (p=0.515 n=8+10)
      LeafSize/6-8                        82.5µs ± 4%     80.9µs ± 2%   -1.92%  (p=0.019 n=10+10)
      LeafSize/7-8                        67.2µs ±13%     66.6µs ± 9%     ~     (p=0.842 n=10+9)
      LeafSize/8-8                        46.0µs ±28%     45.1µs ±12%     ~     (p=0.739 n=10+10)
      LeafSize/9-8                         111µs ± 1%      111µs ± 1%     ~     (p=0.739 n=10+10)
      LeafSize/10-8                       98.8µs ± 4%     97.9µs ± 3%     ~     (p=0.278 n=10+9)
      LeafSize/11-8                       96.8µs ± 1%     96.4µs ± 1%     ~     (p=0.211 n=9+10)
      LeafSize/12-8                       81.0µs ± 4%     81.3µs ± 3%     ~     (p=0.579 n=10+10)
      LeafSize/13-8                       79.7µs ± 5%     79.2µs ± 3%     ~     (p=0.661 n=10+9)
      LeafSize/14-8                       67.6µs ±12%     65.8µs ± 7%     ~     (p=0.447 n=10+9)
      LeafSize/15-8                       63.9µs ±17%     66.3µs ±14%     ~     (p=0.481 n=10+10)
      LeafSize/16-8                       44.0µs ±28%     46.0µs ±27%     ~     (p=0.481 n=10+10)
      LeafSize/32-8                       46.2µs ±13%     43.5µs ±18%     ~     (p=0.156 n=9+10)
      LeafSize/64-8                       53.3µs ±10%     53.0µs ±19%     ~     (p=0.730 n=9+9)
      ProbablyPrime/n=0-8                 3.60ms ± 1%     3.39ms ± 1%   -5.87%  (p=0.000 n=10+9)
      ProbablyPrime/n=1-8                 4.42ms ± 1%     4.08ms ± 1%   -7.69%  (p=0.000 n=10+10)
      ProbablyPrime/n=5-8                 7.57ms ± 2%     6.79ms ± 1%  -10.24%  (p=0.000 n=10+10)
      ProbablyPrime/n=10-8                11.6ms ± 2%     10.2ms ± 1%  -11.69%  (p=0.000 n=10+10)
      ProbablyPrime/n=20-8                19.4ms ± 2%     16.9ms ± 2%  -12.89%  (p=0.000 n=10+10)
      ProbablyPrime/Lucas-8               2.81ms ± 2%     2.72ms ± 1%   -3.22%  (p=0.000 n=10+9)
      ProbablyPrime/MillerRabinBase2-8     797µs ± 1%      680µs ± 1%  -14.64%  (p=0.000 n=10+10)
      
      name                              old speed      new speed       delta
      AddVV/1-8                         17.1GB/s ± 6%   18.0GB/s ± 2%     ~     (p=0.122 n=10+8)
      AddVV/2-8                         32.4GB/s ± 2%   32.2GB/s ± 4%     ~     (p=0.661 n=10+9)
      AddVV/3-8                         38.6GB/s ± 2%   38.9GB/s ± 1%     ~     (p=0.113 n=10+9)
      AddVV/4-8                         45.8GB/s ± 2%   45.8GB/s ± 2%     ~     (p=0.796 n=10+10)
      AddVV/5-8                         48.1GB/s ± 2%   48.3GB/s ± 1%     ~     (p=0.315 n=10+10)
      AddVV/10-8                        78.9GB/s ± 1%   78.9GB/s ± 2%     ~     (p=0.353 n=10+10)
      AddVV/100-8                        136GB/s ± 2%    137GB/s ± 1%     ~     (p=0.971 n=10+10)
      AddVV/1000-8                       164GB/s ± 1%    164GB/s ± 4%     ~     (p=0.853 n=10+10)
      AddVV/10000-8                      126GB/s ± 6%    129GB/s ± 2%     ~     (p=0.063 n=10+10)
      AddVV/100000-8                     116GB/s ± 3%    116GB/s ± 3%     ~     (p=0.796 n=10+10)
      AddVW/1-8                         2.64GB/s ± 3%   2.64GB/s ± 3%     ~     (p=0.579 n=10+10)
      AddVW/2-8                         4.49GB/s ± 2%   4.44GB/s ± 2%   -1.09%  (p=0.040 n=9+9)
      AddVW/3-8                         6.36GB/s ± 1%   6.34GB/s ± 2%     ~     (p=0.684 n=10+10)
      AddVW/4-8                         6.83GB/s ± 1%   6.82GB/s ± 2%     ~     (p=0.905 n=10+9)
      AddVW/5-8                         8.75GB/s ± 1%   8.73GB/s ± 1%     ~     (p=0.796 n=10+10)
      AddVW/10-8                        10.5GB/s ± 2%   10.5GB/s ± 1%     ~     (p=0.971 n=10+10)
      AddVW/100-8                       19.5GB/s ± 2%   18.9GB/s ± 2%   -3.22%  (p=0.000 n=10+10)
      AddVW/1000-8                      20.7GB/s ± 2%   20.6GB/s ± 4%     ~     (p=0.631 n=10+10)
      AddVW/10000-8                     20.6GB/s ± 3%   20.7GB/s ± 3%     ~     (p=0.481 n=10+10)
      AddVW/100000-8                    19.4GB/s ± 2%   19.2GB/s ± 3%     ~     (p=0.165 n=10+10)
      AddMulVVW/1-8                     19.5GB/s ± 2%   19.7GB/s ± 3%     ~     (p=0.123 n=10+10)
      AddMulVVW/2-8                     30.1GB/s ± 2%   30.2GB/s ± 3%     ~     (p=0.297 n=9+9)
      AddMulVVW/3-8                     37.9GB/s ± 2%   36.5GB/s ± 2%   -3.63%  (p=0.000 n=10+10)
      AddMulVVW/4-8                     40.0GB/s ± 2%   39.4GB/s ± 2%   -1.58%  (p=0.001 n=10+10)
      AddMulVVW/5-8                     47.3GB/s ± 2%   46.6GB/s ± 1%   -1.35%  (p=0.001 n=9+9)
      AddMulVVW/10-8                    52.3GB/s ± 2%   60.6GB/s ± 3%  +15.76%  (p=0.000 n=10+10)
      AddMulVVW/100-8                   80.3GB/s ± 2%  122.1GB/s ± 1%  +51.92%  (p=0.000 n=10+10)
      AddMulVVW/1000-8                  92.0GB/s ± 1%  130.3GB/s ± 2%  +41.61%  (p=0.000 n=9+10)
      AddMulVVW/10000-8                 88.2GB/s ± 2%  108.2GB/s ± 5%  +22.66%  (p=0.000 n=10+10)
      AddMulVVW/100000-8                88.2GB/s ± 2%  102.9GB/s ± 2%  +16.69%  (p=0.000 n=10+10)
      
      Change-Id: Ic98e30c91d437d845fed03e07e976c3fdbf02b36
      Reviewed-on: https://go-review.googlesource.com/74851
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Adam Langley <agl@golang.org>
      c3935c08
  5. 23 Feb, 2018 4 commits
    • Joe Tsai's avatar
      archive/zip: fix handling of Info-ZIP Unix extended timestamps · 9697a119
      Joe Tsai authored
      The Info-ZIP Unix1 extra field is specified as such:
      >>>
      Value    Size   Description
      -----    ----   -----------
      0x5855   Short  tag for this extra block type ("UX")
      TSize    Short  total data size for this block
      AcTime   Long   time of last access (GMT/UTC)
      ModTime  Long   time of last modification (GMT/UTC)
      <<<
      
      The previous handling was incorrect in that it read the AcTime field
      instead of the ModTime field.
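
      The layout above can be sketched as a small parser. This is a minimal
      illustration, not the code from archive/zip: the function name
      unix1ModTime and its error messages are hypothetical, and it assumes
      the usual ZIP conventions of little-endian fields, 2-byte Shorts, and
      4-byte Longs holding seconds since the Unix epoch.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"time"
)

// unix1ExtraID is the "UX" tag from the field layout quoted above.
const unix1ExtraID = 0x5855

// unix1ModTime (hypothetical helper) walks a ZIP extra-field blob, finds the
// Info-ZIP Unix1 block, and returns its ModTime. It deliberately skips the
// first Long (AcTime) and reads the second one (ModTime).
func unix1ModTime(extra []byte) (time.Time, error) {
	for len(extra) >= 4 {
		tag := binary.LittleEndian.Uint16(extra[0:2])
		size := int(binary.LittleEndian.Uint16(extra[2:4]))
		extra = extra[4:]
		if size > len(extra) {
			return time.Time{}, errors.New("truncated extra field")
		}
		body := extra[:size]
		extra = extra[size:]
		if tag != unix1ExtraID {
			continue // some other extra block; keep scanning
		}
		if len(body) < 8 {
			return time.Time{}, errors.New("truncated Unix1 block")
		}
		// body[0:4] is AcTime, body[4:8] is ModTime.
		mod := binary.LittleEndian.Uint32(body[4:8])
		return time.Unix(int64(mod), 0).UTC(), nil
	}
	return time.Time{}, errors.New("no Unix1 block found")
}

func main() {
	// Tag "UX", TSize 8, AcTime = 1, ModTime = 2 (made-up sample values).
	extra := []byte{0x55, 0x58, 0x08, 0x00, 1, 0, 0, 0, 2, 0, 0, 0}
	mt, err := unix1ModTime(extra)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(mt.Unix())
}
```

      Reading body[0:4] instead of body[4:8] would reproduce the bug
      described above, since it would return the access time.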
      
      The test-osx.zip test unfortunately locked in the wrong behavior.
      Manually parsing that ZIP file shows that the encoded MS-DOS
      date and time are 0x4b5f and 0xa97d, which correspond to a
      date of 2017-10-31 21:11:58, which matches the correct mod time
      (off by 1 second due to MS-DOS timestamp resolution).
      
      Fixes #23901
      
      Change-Id: I567824c66e8316b9acd103dbecde366874a4b7ef
      Reviewed-on: https://go-review.googlesource.com/96895
      Run-TryBot: Joe Tsai <joetsai@google.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      9697a119
    • Ian Lance Taylor's avatar
      runtime: don't check for String/Error methods in printany · 804e3e56
      Ian Lance Taylor authored
      They have either already been called by preprintpanics, or they
      cannot be called safely because of the various conditions checked at the
      start of gopanic.
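
      The idea can be illustrated outside the runtime with a user-level
      sketch. The function describe and the type badErr are hypothetical and
      are not the runtime's printany; the sketch only shows a type switch
      that formats basic values directly and never invokes String or Error
      on the value it is given.

```go
package main

import "fmt"

// describe (hypothetical helper) formats a panic-like value without ever
// calling a String or Error method on it.
func describe(v interface{}) string {
	switch x := v.(type) {
	case nil:
		return "nil"
	case bool:
		return fmt.Sprintf("%t", x)
	case int:
		return fmt.Sprintf("%d", x)
	case string:
		return x
	default:
		// %T reports only the dynamic type; it does not call any
		// methods on the value.
		return fmt.Sprintf("(%T)", v)
	}
}

// badErr has an Error method that would panic if it were called.
type badErr struct{}

func (badErr) Error() string { panic("Error must not be called") }

func main() {
	fmt.Println(describe("boom"))
	fmt.Println(describe(42))
	fmt.Println(describe(badErr{}))
}
```

      Any method call on the value has to happen earlier, at a point where
      running user code is known to be safe; the printing step then only
      handles plain data.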
      
      Fixes #24059
      
      Change-Id: I4a6233d12c9f7aaaee72f343257ea108bae79241
      Reviewed-on: https://go-review.googlesource.com/96755
      Reviewed-by: Austin Clements <austin@google.com>
      804e3e56
    • Yuval Pavel Zholkover's avatar
      os: respect umask in Mkdir and OpenFile on BSD systems when perm has ModeSticky set · a5e8e2d9
      Yuval Pavel Zholkover authored
      Instead of calling Chmod directly on perm, stat the created file/dir to extract the
      actual permission bits which can be different from perm due to umask.
      
      Fixes #23120.
      
      Change-Id: I3e70032451fc254bf48ce9627e98988f84af8d91
      Reviewed-on: https://go-review.googlesource.com/84477
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      a5e8e2d9
    • Austin Clements's avatar
      runtime: reduce arena size to 4MB on 64-bit Windows · 78846472
      Austin Clements authored
      Currently, we use 64MB heap arenas on 64-bit platforms. This works
      well on UNIX-like OSes because they treat untouched pages as
      essentially free. However, on Windows, committed memory is charged
      against a process whether or not it has demand-faulted physical pages
      in. Hence, on Windows, even a process with a tiny heap will commit
      64MB for one heap arena, plus another 32MB for the arena map. Things
      are much worse under the race detector, which increases the heap
      commitment by a factor of 5.5, leading to 384MB of committed memory
      at runtime init.
      
      Fix this by reducing the heap arena size to 4MB on Windows.
      
      To counterbalance the effect of increasing the arena map size by a
      factor of 16, and to further reduce the impact of the commitment for
      the arena map, we switch from a single entry L1 arena map to a 64
      entry L1 arena map.
      
      Compared to the original arena design, this slows down the
      x/benchmarks garbage benchmark by 0.49% (the slowdown of this commit
      alone is 1.59%, but the previous commit bought us a 1% speed-up):
      
      name                       old time/op  new time/op  delta
      Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.29ms ± 1%  +0.49%  (p=0.000 n=17+18)
      
      (https://perf.golang.org/search?q=upload:20180223.1)
      
      (This was measured on linux/amd64 by modifying its arena configuration
      as above.)
      
      Fixes #23900.
      
      Change-Id: I6b7fa5ecebee2947bf20cfeb78c248809469c6b1
      Reviewed-on: https://go-review.googlesource.com/96780
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
      78846472