1. 12 Mar, 2019 15 commits
  2. 11 Mar, 2019 14 commits
    • net/http: add missing error checks in tests · 62bfa69e
      Leon Klingele authored
      Change-Id: I73441ba2eb349f0e0f25068e6b24c74dd33f1456
      GitHub-Last-Rev: b9e6705962b94af3b1b720cc9ad6d33d7d3f1425
      GitHub-Pull-Request: golang/go#30017
      Reviewed-on: https://go-review.googlesource.com/c/go/+/160441
      Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
      Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: make deadcode pass cheaper · c9ccdf1f
      Josh Bleecher Snyder authored
      The deadcode pass runs a lot.
      I'd like it to run even more.
      
      This change adds dedicated storage for deadcode to ssa.Cache.
      In addition to being a nice win now, it makes
      deadcode easier to add in other places in the future.
      
      name        old time/op       new time/op       delta
      Template          210ms ± 3%        209ms ± 2%    ~     (p=0.951 n=93+95)
      Unicode          92.2ms ± 3%       93.0ms ± 3%  +0.87%  (p=0.000 n=94+94)
      GoTypes           739ms ± 2%        733ms ± 2%  -0.84%  (p=0.000 n=92+94)
      Compiler          3.51s ± 2%        3.49s ± 2%  -0.57%  (p=0.000 n=94+91)
      SSA               9.80s ± 2%        9.75s ± 2%  -0.57%  (p=0.000 n=95+92)
      Flate             132ms ± 2%        132ms ± 3%    ~     (p=0.165 n=94+98)
      GoParser          160ms ± 3%        159ms ± 3%  -0.42%  (p=0.005 n=96+94)
      Reflect           446ms ± 4%        442ms ± 4%  -0.91%  (p=0.000 n=95+98)
      Tar               186ms ± 3%        186ms ± 2%    ~     (p=0.221 n=94+97)
      XML               252ms ± 2%        250ms ± 2%  -0.55%  (p=0.000 n=95+94)
      [Geo mean]        430ms             429ms       -0.34%
      
      name        old user-time/op  new user-time/op  delta
      Template          256ms ± 3%        257ms ± 3%    ~     (p=0.521 n=94+98)
      Unicode           120ms ± 9%        121ms ± 9%    ~     (p=0.074 n=99+100)
      GoTypes           935ms ± 3%        935ms ± 2%    ~     (p=0.574 n=82+96)
      Compiler          4.56s ± 1%        4.55s ± 2%    ~     (p=0.247 n=88+90)
      SSA               13.6s ± 2%        13.6s ± 1%    ~     (p=0.277 n=94+95)
      Flate             155ms ± 3%        156ms ± 3%    ~     (p=0.181 n=95+100)
      GoParser          193ms ± 8%        184ms ± 6%  -4.39%  (p=0.000 n=100+89)
      Reflect           549ms ± 3%        552ms ± 3%  +0.45%  (p=0.036 n=94+96)
      Tar               230ms ± 4%        230ms ± 4%    ~     (p=0.670 n=97+99)
      XML               315ms ± 5%        309ms ±12%  -2.05%  (p=0.000 n=99+99)
      [Geo mean]        540ms             538ms       -0.47%
      
      name        old alloc/op      new alloc/op      delta
      Template         40.3MB ± 0%       38.9MB ± 0%  -3.36%  (p=0.008 n=5+5)
      Unicode          28.6MB ± 0%       28.4MB ± 0%  -0.90%  (p=0.008 n=5+5)
      GoTypes           137MB ± 0%        132MB ± 0%  -3.65%  (p=0.008 n=5+5)
      Compiler          637MB ± 0%        609MB ± 0%  -4.40%  (p=0.008 n=5+5)
      SSA              2.19GB ± 0%       2.07GB ± 0%  -5.63%  (p=0.008 n=5+5)
      Flate            25.0MB ± 0%       24.1MB ± 0%  -3.80%  (p=0.008 n=5+5)
      GoParser         30.0MB ± 0%       29.1MB ± 0%  -3.17%  (p=0.008 n=5+5)
      Reflect          87.1MB ± 0%       84.4MB ± 0%  -3.05%  (p=0.008 n=5+5)
      Tar              37.3MB ± 0%       36.0MB ± 0%  -3.31%  (p=0.008 n=5+5)
      XML              49.8MB ± 0%       48.0MB ± 0%  -3.69%  (p=0.008 n=5+5)
      [Geo mean]       87.6MB            84.6MB       -3.50%
      
      name        old allocs/op     new allocs/op     delta
      Template           387k ± 0%         380k ± 0%  -1.76%  (p=0.008 n=5+5)
      Unicode            342k ± 0%         341k ± 0%  -0.31%  (p=0.008 n=5+5)
      GoTypes           1.39M ± 0%        1.37M ± 0%  -1.64%  (p=0.008 n=5+5)
      Compiler          5.68M ± 0%        5.60M ± 0%  -1.41%  (p=0.008 n=5+5)
      SSA               17.1M ± 0%        16.8M ± 0%  -1.49%  (p=0.008 n=5+5)
      Flate              240k ± 0%         236k ± 0%  -1.99%  (p=0.008 n=5+5)
      GoParser           309k ± 0%         304k ± 0%  -1.57%  (p=0.008 n=5+5)
      Reflect           1.01M ± 0%        0.99M ± 0%  -2.69%  (p=0.008 n=5+5)
      Tar                360k ± 0%         353k ± 0%  -1.91%  (p=0.008 n=5+5)
      XML                447k ± 0%         441k ± 0%  -1.26%  (p=0.008 n=5+5)
      [Geo mean]         858k              844k       -1.60%
      
      Fixes #15306
      
      Change-Id: I9f558adb911efddead3865542fe2ca71f66fe1da
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166718
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • misc/android: copy less from GOROOT to the device · 810809eb
      Elias Naur authored
      The Android emulator builders are running out of space after CL 165797
      copied most of GOROOT to the device.
      The pkg directory is by far the largest, so only include what seems
      necessary to build the x/ repositories: pkg/android_$GOARCH and
      pkg/tool/android_$GOARCH.
      
      While here, rename the device root directory to match the exec
      wrapper name and make sure the deferred cleanups actually run before
      os.Exit.
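
The os.Exit point above is a general Go pitfall: os.Exit ends the process immediately, so deferred functions in the calling function never run. A minimal sketch of the usual workaround follows. The names run and cleanedUp are hypothetical and this is not the misc/android code itself.

```go
package main

import (
	"fmt"
	"os"
)

// cleanedUp records whether the deferred cleanup ran (illustration only).
var cleanedUp bool

// run does the real work and returns a process exit code. Its deferred
// cleanup runs when run returns, which is before main can call os.Exit.
func run() int {
	defer func() { cleanedUp = true }()
	// ... work that may fail would go here ...
	return 0
}

func main() {
	code := run()
	fmt.Println("cleanup ran:", cleanedUp)
	if code != 0 {
		os.Exit(code)
	}
}
```

Keeping os.Exit in main and the defers in run means the cleanup order does not depend on where a failure occurs.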
      
      Hopefully fixes the emulator builders.
      
      Updates #23824
      
      Change-Id: I4d1e3ab2c89fd1e5818503d323ddb87f073094da
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166397
      Run-TryBot: Elias Naur <mail@eliasnaur.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/go: change the default value of GO111MODULE to 'on' · cf469165
      Bryan C. Mills authored
      Fixes #30228
      
      Change-Id: Ie45ba6483849b843eb6605272f686b9deffe5e48
      Reviewed-on: https://go-review.googlesource.com/c/go/+/162698
      Reviewed-by: Jay Conrod <jayconrod@google.com>
    • all: move internal/x to vendor/golang.org/x and revendor using 'go mod vendor' · c5cf6624
      Bryan C. Mills authored
      This also updates the vendored-in versions of several packages: 'go
      mod vendor' selects a consistent version of each module, but we had
      previously vendored an ad-hoc selection of packages.
      
      Notably, x/crypto/hkdf was previously vendored in at a much newer
      commit than the rest of x/crypto. Bringing the rest of x/crypto up to
      that commit introduced an import of golang.org/x/sys/cpu, which broke
      the js/wasm build, requiring an upgrade of x/sys to pick up CL 165749.
      
      Updates #30228
      Updates #30241
      Updates #25822
      
      Change-Id: I5b3dbc232b7e6a048a158cbd8d36137af1efb711
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164623
      Reviewed-by: Filippo Valsorda <filippo@golang.org>
    • cmd,std: add go.mod files · 0fc89a72
      Bryan C. Mills authored
      Updates #30241
      Updates #30228
      
      Change-Id: Ida0fe8263bf44e0498fed2048e22283ba5716835
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164622
      Run-TryBot: Bryan C. Mills <bcmills@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Jay Conrod <jayconrod@google.com>
    • cmd: refresh cmd/vendor to match 'go mod vendor' · 756a69c6
      Bryan C. Mills authored
      This change preserves the maximum versions from cmd/vendor/vendor.json
      where feasible, but bumps the versions of x/sys (for CL 162987) and
      x/tools (for CL 162989 and CL 160837) so that 'go test all' passes in
      module mode when run from a working directory in src/cmd.
      
      A small change to cmd/vet (not vendored) was necessary to preserve its
      flag behavior given a pristine copy of x/tools; see CL 162989 for more
      detail.
      
      This change was generated by running 'go mod vendor' at CL 164622.
      (Welcoooome to the fuuuuuture!)
      
      Updates #30228
      Updates #30241
      
      Change-Id: I889590318dc857d4a6e20c3023d09a27128d8255
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164618
      Run-TryBot: Bryan C. Mills <bcmills@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Jay Conrod <jayconrod@google.com>
    • test: fix memcombine tests · 486ca37b
      Keith Randall authored
      Two tests (load_le_byte8_uint64_inv and load_be_byte8_uint64)
      pass but the generated code isn't actually correct.
      
      The test regexp provides a false negative, as it matches the
      MOVQ (SP), BP instruction in the epilogue.
      
      Combined loads never worked for these cases - the test was added in error
      as part of a batch and not noticed because of the above false match.
      
      Normalize the amd64/386 tests to always negative match on narrower
      loads and OR.
      
      Change-Id: I256861924774d39db0e65723866c81df5ab5076f
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166837
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • testing: enable examples on js/wasm with non os.Pipe runExample · ac56baa0
      Emmanuel T Odeke authored
      os.Pipe is not implemented on js/wasm, so use a temporary file
      there instead. This change creates two versions
      of runExample:
      
      * runExample verbatim that still uses os.Pipe for non js/wasm
      * runExample that uses a temporary file
      
      Also added a TODO to re-unify these function versions back into
      example.go once js/wasm gets an os.Pipe implementation.
      
      Change-Id: I9f418a49b2c397e1667724c7442b7bbe8942225e
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165357
      Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: teach rulegen to |-expand multiple |s in a single op · 48d3c32b
      Josh Bleecher Snyder authored
      I want to be able to write
      
      MOV(Q|Q|L|L|L|W|W|B)loadidx(1|8|1|4|8|1|2|1)
      
      instead of
      
      MOV(Qloadidx1|Qloadidx8|Lloadidx1|Lloadidx4|Lloadidx8|Wloadidx1|Wloadidx2|Bloadidx1)
      
      in rewrite rules.
      
      Both are fairly cryptic and hard to review, but the former
      is at least compact, which helps to not obscure the structure
      of the rest of the rule.
      
      Support that by adjusting rulegen's expansion.
      
      Instead of looking for an op that begins with "(", ends with " ",
      and has exactly one set of parens in it, look for everything of the
      form "(...|...)".
      
      That has false positives: Go code in the && conditions and AuxInt expressions.
      Those are easily checked for syntactically: && conditions are between && and ->,
      and AuxInt expressions are inside square brackets.
      After ruling out those false positives, we can keep everything else,
      regardless of where it is.
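
The lockstep expansion described above can be sketched like this. It is a simplified illustration, not rulegen's code: expandOp and altGroup are hypothetical names, and the sketch skips the false-positive filtering for && conditions and AuxInt expressions that the message describes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// altGroup matches a parenthesized group that contains at least one '|'
// and no nested parentheses, e.g. "(Q|L|W)".
var altGroup = regexp.MustCompile(`\(([^()|]*(?:\|[^()|]*)+)\)`)

// expandOp expands every |-group in op in lockstep: the i-th result uses
// the i-th alternative of each group. All groups must have the same
// number of alternatives.
func expandOp(op string) ([]string, error) {
	groups := altGroup.FindAllStringSubmatch(op, -1)
	if len(groups) == 0 {
		return []string{op}, nil
	}
	n := len(strings.Split(groups[0][1], "|"))
	for _, g := range groups {
		if len(strings.Split(g[1], "|")) != n {
			return nil, fmt.Errorf("mismatched alternative counts in %q", op)
		}
	}
	out := make([]string, n)
	for i := range out {
		out[i] = altGroup.ReplaceAllStringFunc(op, func(m string) string {
			// m includes the surrounding parens; strip them before splitting.
			return strings.Split(m[1:len(m)-1], "|")[i]
		})
	}
	return out, nil
}

func main() {
	ops, err := expandOp("MOV(Q|L|W)loadidx(1|8|2)")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ops)
}
```

Because the groups advance together rather than forming a cross product, repeated entries such as Q|Q|L are what pair one width with several scales.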
      
      No change to the generated code for existing rules.
      
      Change-Id: I5b70a190e268989504f53cb2cce2f9a50170d8a2
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166737
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/compile: add scale field to SSA Ops · 07b4b4a1
      Josh Bleecher Snyder authored
      Refactoring only.
      
      This makes it easier to add ops
      that do indexed memory loads/stores.
      
      Passes toolstash-check.
      
      Change-Id: I82df0d4154718577ec42106fa1bc76571bf65096
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166425
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: normalize more whitespace in rewrite rules · b4fbd291
      Josh Bleecher Snyder authored
      If you write a rewrite rule:
      
      (something) && noteRule("X")-> (something)
      
      then rulegen will panic with an error message about commutativity.
      The real problem is the lack of a space between the ) and the ->.
      Normalize that bit of whitespace too.
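
A normalization step of this kind might look like the sketch below. The function name and the exact set of replacements are assumptions for illustration; only the ")->" case comes from the message above.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeWhitespace collapses runs of whitespace to single spaces and
// makes sure a closing paren is separated from a following "->".
func normalizeWhitespace(rule string) string {
	rule = strings.Join(strings.Fields(rule), " ")
	rule = strings.ReplaceAll(rule, ")->", ") ->")
	return rule
}

func main() {
	fmt.Println(normalizeWhitespace(`(something) && noteRule("X")-> (something)`))
}
```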
      
      Change-Id: Idbd53687cd0398fe275ff2702667688cad05b4ca
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166427
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/go: set GO111MODULE=off explicitly in TestScript/list_test_err · 65a54aef
      Bryan C. Mills authored
      This test was added after CL 162697.
      
      Updates #30228
      
      Change-Id: Ia33ad3adc99e53b0b03e68906dc1f2e39234e2cf
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166697
      Run-TryBot: Bryan C. Mills <bcmills@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/go: resolve non-standard imports from within GOROOT/src using vendor directories · fd080ea3
      Bryan C. Mills authored
      Updates #30228
      Fixes #26924
      
      Change-Id: Ie625c64721559c7633396342320536396cd1fcf5
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164621
      Run-TryBot: Bryan C. Mills <bcmills@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Jay Conrod <jayconrod@google.com>
  3. 10 Mar, 2019 3 commits
    • syscall: skip non-root user namespace test if kernel forbids · 1c2d4da1
      Alberto Donizetti authored
      The unprivileged_userns_clone sysctl prevents unprivileged users from
      creating namespaces, which the AmbientCaps test does. It's set to 0 by
      default in a few Linux distributions (Debian and Arch, possibly
      others), so we need to check it before running the test.
      
      I've verified that setting
      
        echo 1 > /proc/sys/kernel/unprivileged_userns_clone
      
      and then running the test *without this patch* makes it pass, which
      proves that checking unprivileged_userns_clone is indeed sufficient.
      
      Fixes #30698
      
      Change-Id: Ib2079b5e714d7f2440ddf979c3e7cfda9a9c5005
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166460
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • strings: remove unnecessary strings.s · e2dc41b4
      Tobias Klauser authored
      There are no empty function declarations in package strings anymore, so
      strings.s is no longer needed.
      
      Change-Id: I16fe161a9c06804811e98af0ca074f8f46e2f49d
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166458
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • runtime: safely load DLLs · 9b6e9f0c
      Jason A. Donenfeld authored
      While many other call sites have been moved to using the proper
      higher-level system loading, these areas were left out. This prevents
      DLL directory injection attacks. This includes both the runtime load
      calls (using LoadLibrary prior) and the implicitly linked ones via
      cgo_import_dynamic, which we move to our LoadLibraryEx. The goal is to
      only loosely load kernel32.dll and strictly load all others.
      
      Meanwhile we make sure that we never fall back to insecure loading on
      older or unpatched systems.
      
      This is CVE-2019-9634.
      
      Fixes #14959
      Fixes #28978
      Fixes #30642
      
      Change-Id: I401a13ed8db248ab1bb5039bf2d31915cac72b93
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165798
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
  4. 09 Mar, 2019 8 commits
    • cmd/compile: add pure Go math/big functions to TestIntendedInlining · 243c8eb8
      Josh Bleecher Snyder authored
      Change-Id: Id29a9e48a09965e457f923a0ff023722b38b27ef
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165157
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • math/big: add fast path for amd64 addVW for large z · 4d10aba3
      Josh Bleecher Snyder authored
      This matches the pure Go fast path added in the previous commit.
      
      I will leave other architectures to those with ready access to hardware.
      
      name            old time/op    new time/op    delta
      AddVW/1-8         3.60ns ± 3%    3.59ns ± 1%      ~     (p=0.147 n=91+86)
      AddVW/2-8         3.92ns ± 1%    3.91ns ± 2%    -0.36%  (p=0.000 n=86+92)
      AddVW/3-8         4.33ns ± 5%    4.46ns ± 5%    +2.94%  (p=0.000 n=96+97)
      AddVW/4-8         4.76ns ± 5%    4.82ns ± 5%    +1.28%  (p=0.000 n=95+92)
      AddVW/5-8         5.40ns ± 1%    5.42ns ± 0%    +0.47%  (p=0.000 n=76+71)
      AddVW/10-8        8.03ns ± 1%    7.80ns ± 5%    -2.90%  (p=0.000 n=73+96)
      AddVW/100-8       43.8ns ± 5%    17.9ns ± 1%   -59.12%  (p=0.000 n=94+81)
      AddVW/1000-8       428ns ± 4%      85ns ± 6%   -80.20%  (p=0.000 n=96+99)
      AddVW/10000-8     4.22µs ± 2%    1.80µs ± 3%   -57.32%  (p=0.000 n=69+92)
      AddVW/100000-8    44.8µs ± 8%    31.5µs ± 3%   -29.76%  (p=0.000 n=99+90)
      
      name            old time/op    new time/op    delta
      SubVW/1-8         3.53ns ± 2%    3.63ns ± 5%    +2.97%  (p=0.000 n=94+93)
      SubVW/2-8         4.33ns ± 5%    4.01ns ± 2%    -7.36%  (p=0.000 n=90+85)
      SubVW/3-8         4.32ns ± 2%    4.32ns ± 5%      ~     (p=0.084 n=87+97)
      SubVW/4-8         4.70ns ± 2%    4.83ns ± 6%    +2.77%  (p=0.000 n=85+96)
      SubVW/5-8         5.84ns ± 1%    5.35ns ± 1%    -8.35%  (p=0.000 n=87+87)
      SubVW/10-8        8.01ns ± 4%    7.54ns ± 4%    -5.84%  (p=0.000 n=98+97)
      SubVW/100-8       43.9ns ± 5%    17.9ns ± 1%   -59.20%  (p=0.000 n=98+76)
      SubVW/1000-8       426ns ± 2%      85ns ± 3%   -80.13%  (p=0.000 n=90+98)
      SubVW/10000-8     4.24µs ± 2%    1.81µs ± 3%   -57.28%  (p=0.000 n=74+91)
      SubVW/100000-8    44.5µs ± 4%    31.5µs ± 2%   -29.33%  (p=0.000 n=84+91)
      
      Change-Id: I10dd361cbaca22197c27e7734c0f50065292afbb
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164969
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math/big: add fast path for pure Go addVW for large z · fe24837c
      Josh Bleecher Snyder authored
      In the normal case, only a few words have to be updated when adding a word to a vector.
      When that happens, we can simply copy the rest of the words, which is much faster.
      However, the overhead of that makes it prohibitive for small vectors,
      so we check the size at the beginning.
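
The idea can be sketched as follows. This is a simplified illustration, not math/big's implementation: it works on plain []uint with the least significant word first, assumes len(x) == len(z), and tests the carry inside the loop instead of checking the vector size up front as the message describes.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVW sets z = x + y, where x is a multi-word number and y is a single
// word, and returns the final carry. Once the carry is zero the remaining
// words cannot change, so they are copied in bulk.
func addVW(z, x []uint, y uint) (c uint) {
	c = y
	for i := 0; i < len(z) && i < len(x); i++ {
		if c == 0 {
			copy(z[i:], x[i:])
			return 0
		}
		z[i], c = bits.Add(x[i], c, 0)
	}
	return c
}

func main() {
	x := []uint{^uint(0), 5, 7}
	z := make([]uint, len(x))
	c := addVW(z, x, 1)
	fmt.Println(z, c)
}
```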
      
      The implementation is a bit weird to allow addVW to continue to be inlined; see #30548.
      
      The AddVW benchmarks are surprising, but fully repeatable.
      The SubVW benchmarks are more or less as expected.
      I expect that removing the indirect function call will
      help both and make them a bit more normal.
      
      name            old time/op    new time/op     delta
      AddVW/1-8         4.27ns ± 2%     3.81ns ± 3%   -10.83%  (p=0.000 n=89+90)
      AddVW/2-8         4.91ns ± 2%     4.34ns ± 1%   -11.60%  (p=0.000 n=83+90)
      AddVW/3-8         5.77ns ± 4%     5.76ns ± 2%      ~     (p=0.365 n=91+87)
      AddVW/4-8         6.03ns ± 1%     6.03ns ± 1%      ~     (p=0.392 n=80+76)
      AddVW/5-8         6.48ns ± 2%     6.63ns ± 1%    +2.27%  (p=0.000 n=76+74)
      AddVW/10-8        9.56ns ± 2%     9.56ns ± 1%    -0.02%  (p=0.002 n=69+76)
      AddVW/100-8       90.6ns ± 0%     18.1ns ± 4%   -79.99%  (p=0.000 n=72+94)
      AddVW/1000-8       865ns ± 0%       85ns ± 6%   -90.14%  (p=0.000 n=66+96)
      AddVW/10000-8     8.57µs ± 2%     1.82µs ± 3%   -78.73%  (p=0.000 n=99+94)
      AddVW/100000-8    84.4µs ± 2%     31.8µs ± 4%   -62.29%  (p=0.000 n=93+98)
      
      name            old time/op    new time/op     delta
      SubVW/1-8         3.90ns ± 2%     4.13ns ± 4%    +6.02%  (p=0.000 n=92+95)
      SubVW/2-8         4.15ns ± 1%     5.20ns ± 1%   +25.22%  (p=0.000 n=83+85)
      SubVW/3-8         5.50ns ± 2%     6.22ns ± 6%   +13.21%  (p=0.000 n=91+97)
      SubVW/4-8         5.99ns ± 1%     6.63ns ± 1%   +10.63%  (p=0.000 n=79+61)
      SubVW/5-8         6.75ns ± 4%     6.88ns ± 2%    +1.82%  (p=0.000 n=98+73)
      SubVW/10-8        9.57ns ± 1%     9.56ns ± 1%    -0.13%  (p=0.000 n=77+64)
      SubVW/100-8       90.3ns ± 1%     18.1ns ± 2%   -80.00%  (p=0.000 n=75+94)
      SubVW/1000-8       860ns ± 4%       85ns ± 7%   -90.14%  (p=0.000 n=97+99)
      SubVW/10000-8     8.51µs ± 3%     1.77µs ± 6%   -79.21%  (p=0.000 n=100+97)
      SubVW/100000-8    84.4µs ± 3%     31.5µs ± 3%   -62.66%  (p=0.000 n=92+92)
      
      Change-Id: I721d7031d40f245b4a284f5bdd93e7bb85e7e937
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164968
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math/big: remove bounds checks in pure Go implementations · 4c227a09
      Josh Bleecher Snyder authored
      These routines are quite sensitive to BCE.
      
      This change eliminates bounds checks from loops.
      It does so at the cost of a bit of safety:
      malformed input will now return incorrect answers
      instead of panicking.
      
      This isn't as bad as it sounds: math/big has very good
      test coverage, and the alternative implementations are in
      assembly, which could do much worse things with malformed input.
      
      If the compiler's BCE improves, so could these routines.
      
      Notable BCE improvements for these routines would be:
      
      * Allowing and propagating more cross-slice length hints.
        Then hints like _ = y[:len(z)] would eliminate bounds checks for y[i].
      
      * Propagating enough information so that we could do
        n := len(x)
        if len(z) < n {
          n = len(z)
        }
        and then have i < n eliminate the same bounds checks as
        i < len(x) && i < len(z) currently does.
      
      * Providing some way to do BCE for unrolled loops.
        Now that we have math/bits implementations,
        it is possible to write things like ADC chains in
        pure Go, if you can reasonably unroll loops.
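
The loop shape that the message says currently eliminates bounds checks can be sketched like this. It is a minimal illustration over []uint, not the math/big source. Note the trade-off the message mentions: if x or y is shorter than z, the loop stops early and returns a wrong answer instead of panicking.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVV sets z = x + y word by word (least significant word first) and
// returns the final carry. Bounding i by all three lengths in the loop
// condition lets the compiler prove every index is in range.
func addVV(z, x, y []uint) (c uint) {
	for i := 0; i < len(z) && i < len(x) && i < len(y); i++ {
		z[i], c = bits.Add(x[i], y[i], c)
	}
	return c
}

func main() {
	z := make([]uint, 2)
	c := addVV(z, []uint{^uint(0), 0}, []uint{1, 0})
	fmt.Println(z, c)
}
```

Building with -gcflags=-d=ssa/check_bce/debug=1 reports any bounds checks that remain, which is one way to confirm the effect on a given compiler version.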
      
      Benchmarks below are for amd64, using -tags=math_big_pure_go.
      
      name            old time/op    new time/op    delta
      AddVV/1-8         5.15ns ± 3%    4.65ns ± 4%   -9.81%  (p=0.000 n=93+86)
      AddVV/2-8         6.40ns ± 2%    5.58ns ± 4%  -12.78%  (p=0.000 n=90+95)
      AddVV/3-8         7.07ns ± 2%    6.66ns ± 2%   -5.88%  (p=0.000 n=87+83)
      AddVV/4-8         7.94ns ± 5%    7.41ns ± 4%   -6.65%  (p=0.000 n=94+98)
      AddVV/5-8         8.55ns ± 1%    8.80ns ± 0%   +2.92%  (p=0.000 n=87+92)
      AddVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.12%  (p=0.000 n=83+71)
      AddVV/100-8        119ns ± 5%     117ns ± 4%   -1.60%  (p=0.000 n=93+90)
      AddVV/1000-8      1.14µs ± 4%    1.14µs ± 5%     ~     (p=0.812 n=95+91)
      AddVV/10000-8     11.4µs ± 5%    11.3µs ± 5%     ~     (p=0.503 n=97+96)
      AddVV/100000-8     114µs ± 4%     113µs ± 5%   -0.98%  (p=0.002 n=97+90)
      
      name            old time/op    new time/op    delta
      SubVV/1-8         5.23ns ± 5%    4.65ns ± 3%  -11.18%  (p=0.000 n=89+91)
      SubVV/2-8         6.49ns ± 5%    5.58ns ± 3%  -14.04%  (p=0.000 n=92+94)
      SubVV/3-8         7.10ns ± 3%    6.65ns ± 2%   -6.28%  (p=0.000 n=87+80)
      SubVV/4-8         8.04ns ± 1%    7.44ns ± 5%   -7.49%  (p=0.000 n=83+98)
      SubVV/5-8         8.55ns ± 2%    8.32ns ± 1%   -2.75%  (p=0.000 n=84+92)
      SubVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.09%  (p=0.000 n=80+75)
      SubVV/100-8        119ns ± 0%     116ns ± 3%   -1.83%  (p=0.000 n=87+98)
      SubVV/1000-8      1.13µs ± 5%    1.13µs ± 3%     ~     (p=0.082 n=96+98)
      SubVV/10000-8     11.2µs ± 1%    11.3µs ± 3%   +0.76%  (p=0.000 n=87+97)
      SubVV/100000-8     112µs ± 2%     113µs ± 3%   +0.55%  (p=0.000 n=76+88)
      
      name            old time/op    new time/op    delta
      AddVW/1-8         4.30ns ± 4%    3.96ns ± 6%  -8.02%  (p=0.000 n=89+97)
      AddVW/2-8         5.15ns ± 2%    4.91ns ± 1%  -4.56%  (p=0.000 n=87+80)
      AddVW/3-8         5.59ns ± 3%    5.75ns ± 2%  +2.91%  (p=0.000 n=91+88)
      AddVW/4-8         6.20ns ± 1%    6.03ns ± 1%  -2.71%  (p=0.000 n=75+90)
      AddVW/5-8         6.93ns ± 3%    6.49ns ± 2%  -6.35%  (p=0.000 n=100+82)
      AddVW/10-8        10.0ns ± 7%     9.6ns ± 0%  -4.02%  (p=0.000 n=98+74)
      AddVW/100-8       91.1ns ± 1%    90.6ns ± 1%  -0.55%  (p=0.000 n=84+80)
      AddVW/1000-8       866ns ± 1%     856ns ± 4%  -1.06%  (p=0.000 n=69+96)
      AddVW/10000-8     8.64µs ± 1%    8.53µs ± 4%  -1.25%  (p=0.000 n=67+99)
      AddVW/100000-8    84.3µs ± 2%    85.4µs ± 4%  +1.22%  (p=0.000 n=89+99)
      
      name            old time/op    new time/op    delta
      SubVW/1-8         4.28ns ± 2%    3.82ns ± 3%  -10.63%  (p=0.000 n=91+89)
      SubVW/2-8         4.61ns ± 1%    4.48ns ± 3%   -2.67%  (p=0.000 n=94+96)
      SubVW/3-8         5.54ns ± 1%    5.81ns ± 4%   +4.87%  (p=0.000 n=92+97)
      SubVW/4-8         6.20ns ± 1%    6.08ns ± 2%   -1.99%  (p=0.000 n=71+88)
      SubVW/5-8         6.91ns ± 3%    6.64ns ± 1%   -3.90%  (p=0.000 n=97+70)
      SubVW/10-8        9.85ns ± 2%    9.62ns ± 0%   -2.31%  (p=0.000 n=82+62)
      SubVW/100-8       91.1ns ± 1%    90.9ns ± 3%   -0.14%  (p=0.010 n=71+93)
      SubVW/1000-8       859ns ± 3%     867ns ± 1%   +0.98%  (p=0.000 n=99+78)
      SubVW/10000-8     8.54µs ± 5%    8.57µs ± 2%   +0.38%  (p=0.007 n=98+92)
      SubVW/100000-8    84.5µs ± 3%    84.6µs ± 3%     ~     (p=0.334 n=95+94)
      
      name                old time/op    new time/op    delta
      AddMulVVW/1-8         5.43ns ± 3%    4.36ns ± 2%  -19.67%  (p=0.000 n=95+94)
      AddMulVVW/2-8         6.56ns ± 4%    6.11ns ± 1%   -6.90%  (p=0.000 n=91+91)
      AddMulVVW/3-8         8.00ns ± 1%    7.80ns ± 4%   -2.52%  (p=0.000 n=83+95)
      AddMulVVW/4-8         9.81ns ± 2%    9.53ns ± 1%   -2.86%  (p=0.000 n=77+64)
      AddMulVVW/5-8         11.4ns ± 3%    11.3ns ± 5%   -0.89%  (p=0.000 n=95+97)
      AddMulVVW/10-8        18.9ns ± 5%    19.1ns ± 5%   +0.89%  (p=0.000 n=91+94)
      AddMulVVW/100-8        165ns ± 5%     165ns ± 4%     ~     (p=0.427 n=97+98)
      AddMulVVW/1000-8      1.56µs ± 3%    1.56µs ± 4%     ~     (p=0.167 n=98+96)
      AddMulVVW/10000-8     15.7µs ± 5%    15.6µs ± 5%   -0.31%  (p=0.044 n=95+97)
      AddMulVVW/100000-8     156µs ± 3%     157µs ± 8%     ~     (p=0.373 n=72+99)
      
      Change-Id: Ibc720785d5b95f6a797103b1363843205f4d56bf
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164966
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • reflect: make all flag.mustBe* methods inlinable · 788e038e
      Daniel Martí authored
      mustBe was barely over budget, so manually inlining the first flag.kind
      call is enough. Add a TODO to reverse that in the future, once the
      compiler gets better.
      
      mustBeExported and mustBeAssignable were over budget by a larger amount,
      so add slow path functions instead. This is the same strategy used in
      the sync package for common methods like Once.Do, for example.
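
The slow-path strategy can be sketched as below. The type and method names (roFlag, mustBeSettable, mustBeSettableSlow) are hypothetical stand-ins, not reflect's internals. The point is only that the cheap check stays in a tiny method while the bulky panic path lives in a separate function.

```go
package main

import "fmt"

// roFlag is a hypothetical stand-in for an internal flag type.
type roFlag uint

// flagRO is a hypothetical "read-only" bit.
const flagRO roFlag = 1 << 0

// mustBeSettable is the fast path: one comparison and, rarely, a call.
// Keeping it this small is what lets it fit the inlining budget.
func (f roFlag) mustBeSettable() {
	if f&flagRO != 0 {
		f.mustBeSettableSlow()
	}
}

// mustBeSettableSlow holds the rarely executed, more expensive code.
func (f roFlag) mustBeSettableSlow() {
	panic(fmt.Sprintf("value with flag %#x is read-only", uint(f)))
}

func main() {
	roFlag(0).mustBeSettable() // common case: never enters the slow path
	fmt.Println("ok")
}
```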
      
      Lots of exported reflect.Value methods call these assert-like unexported
      methods, so avoiding the function call overhead in the common case does
      shave off a percent from most exported APIs.
      
      Finally, add the methods to TestIntendedInlining.
      
      While at it, replace a couple of uses of the 0 Kind with its descriptive
      name, Invalid.
      
      name     old time/op    new time/op    delta
      Call-8     68.0ns ± 1%    66.8ns ± 1%  -1.81%  (p=0.000 n=10+9)
      PtrTo-8    8.00ns ± 2%    7.83ns ± 0%  -2.19%  (p=0.000 n=10+9)
      
      Updates #7818.
      
      Change-Id: Ic1603b640519393f6b50dd91ec3767753eb9e761
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166462
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: update TestIntendedInlining · cc5dc001
      Daniel Martí authored
      Value.CanInterface and Value.pointer are now inlinable, since we have a
      limited form of mid-stack inlining. Their calls to panic were preventing
      that in previous Go releases. The other three methods still go over
      budget, so update that comment.
      
      In recent commits, sync.Once.Do and multiple lock/unlock methods have
      also been made inlinable, so add those as well. They have standalone
      tests like test/inline_sync.go already, but it's best if the funcs are
      in this global test table too. They aren't inlinable on every platform
      yet, though.
      
      Finally, use math/bits.UintSize to check if GOARCH is 64-bit, now that
      we can.
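
For reference, a word-size check based on math/bits.UintSize might look like the sketch below. The helper name is64Bit is hypothetical, and how the test actually uses the constant is not shown in this log.

```go
package main

import (
	"fmt"
	"math/bits"
)

// is64Bit reports whether uint is 64 bits wide on the target GOARCH.
// bits.UintSize is a compile-time constant, so this folds to true or false.
func is64Bit() bool {
	return bits.UintSize == 64
}

func main() {
	fmt.Println("64-bit target:", is64Bit())
}
```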
      
      Change-Id: I65cc681b77015f7746dba3126637e236dcd494e0
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166461
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: allow inlining the RWMutex.RUnlock fast path · 05051b56
      Carlo Alberto Ferraris authored
      RWMutex.RLock is already inlineable, so add a test for it as well.
      
      name                    old time/op  new time/op  delta
      RWMutexUncontended      66.5ns ± 0%  60.3ns ± 1%  -9.38%  (p=0.000 n=12+20)
      RWMutexUncontended-4    16.7ns ± 0%  15.3ns ± 1%  -8.49%  (p=0.000 n=17+20)
      RWMutexUncontended-16   7.86ns ± 0%  7.69ns ± 0%  -2.08%  (p=0.000 n=18+15)
      RWMutexWrite100         25.1ns ± 0%  24.0ns ± 1%  -4.28%  (p=0.000 n=20+18)
      RWMutexWrite100-4       46.7ns ± 5%  44.1ns ± 4%  -5.53%  (p=0.000 n=20+20)
      RWMutexWrite100-16      68.3ns ±11%  65.7ns ± 8%  -3.81%  (p=0.003 n=20+20)
      RWMutexWrite10          26.7ns ± 1%  25.7ns ± 0%  -3.75%  (p=0.000 n=17+14)
      RWMutexWrite10-4        34.9ns ± 2%  33.8ns ± 2%  -3.15%  (p=0.000 n=20+20)
      RWMutexWrite10-16       37.4ns ± 2%  36.1ns ± 2%  -3.51%  (p=0.000 n=18+20)
      RWMutexWorkWrite100      163ns ± 0%   162ns ± 0%  -0.89%  (p=0.000 n=18+20)
      RWMutexWorkWrite100-4    189ns ± 4%   184ns ± 4%  -2.89%  (p=0.000 n=19+20)
      RWMutexWorkWrite100-16   207ns ± 4%   200ns ± 2%  -3.07%  (p=0.000 n=19+20)
      RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%  -0.75%  (p=0.000 n=20+20)
      RWMutexWorkWrite10-4     177ns ± 1%   176ns ± 2%  -0.63%  (p=0.004 n=17+20)
      RWMutexWorkWrite10-16    191ns ± 2%   189ns ± 1%  -0.83%  (p=0.015 n=20+17)
      
      linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)
      
      The cumulative effect of this and the previous 3 commits is:
      
      name                    old time/op  new time/op  delta
      MutexUncontended        19.3ns ± 1%  16.4ns ± 1%  -15.13%  (p=0.000 n=20+20)
      MutexUncontended-4      5.24ns ± 0%  4.09ns ± 0%  -21.95%  (p=0.000 n=20+18)
      MutexUncontended-16     2.10ns ± 0%  2.12ns ± 0%   +0.95%  (p=0.000 n=15+17)
      Mutex                   19.6ns ± 0%  16.3ns ± 1%  -17.12%  (p=0.000 n=20+20)
      Mutex-4                 54.6ns ± 5%  45.6ns ±10%  -16.51%  (p=0.000 n=20+19)
      Mutex-16                 133ns ± 5%   130ns ± 3%   -1.99%  (p=0.002 n=20+20)
      MutexSlack              33.4ns ± 2%  16.2ns ± 0%  -51.44%  (p=0.000 n=19+20)
      MutexSlack-4             206ns ± 5%   209ns ± 9%     ~     (p=0.154 n=20+20)
      MutexSlack-16           89.4ns ± 1%  90.9ns ± 2%   +1.70%  (p=0.000 n=18+17)
      MutexWork               60.5ns ± 0%  55.3ns ± 1%   -8.59%  (p=0.000 n=12+20)
      MutexWork-4              105ns ± 5%    97ns ±11%   -7.95%  (p=0.000 n=20+20)
      MutexWork-16             157ns ± 1%   158ns ± 1%   +0.66%  (p=0.001 n=18+17)
      MutexWorkSlack          70.2ns ± 5%  55.3ns ± 0%  -21.30%  (p=0.000 n=19+18)
      MutexWorkSlack-4         277ns ±13%   260ns ±15%   -6.35%  (p=0.002 n=20+18)
      MutexWorkSlack-16        156ns ± 0%   146ns ± 1%   -6.40%  (p=0.000 n=16+19)
      MutexNoSpin              966ns ± 0%   976ns ± 1%   +0.97%  (p=0.000 n=15+17)
      MutexNoSpin-4            269ns ± 4%   272ns ± 4%   +1.15%  (p=0.048 n=20+18)
      MutexNoSpin-16           122ns ± 0%   119ns ± 1%   -2.63%  (p=0.000 n=19+15)
      MutexSpin               3.13µs ± 0%  3.12µs ± 0%   -0.17%  (p=0.000 n=18+18)
      MutexSpin-4              826ns ± 1%   833ns ± 1%   +0.84%  (p=0.000 n=19+17)
      MutexSpin-16             397ns ± 1%   394ns ± 1%   -0.78%  (p=0.000 n=19+19)
      Once                    5.67ns ± 0%  2.07ns ± 2%  -63.43%  (p=0.000 n=20+20)
      Once-4                  1.47ns ± 2%  0.54ns ± 3%  -63.49%  (p=0.000 n=19+20)
      Once-16                 0.58ns ± 0%  0.17ns ± 5%  -70.49%  (p=0.000 n=17+17)
      RWMutexUncontended      71.4ns ± 0%  60.3ns ± 1%  -15.60%  (p=0.000 n=16+20)
      RWMutexUncontended-4    18.4ns ± 4%  15.3ns ± 1%  -17.14%  (p=0.000 n=20+20)
      RWMutexUncontended-16   8.01ns ± 0%  7.69ns ± 0%   -3.91%  (p=0.000 n=18+15)
      RWMutexWrite100         24.9ns ± 0%  24.0ns ± 1%   -3.57%  (p=0.000 n=19+18)
      RWMutexWrite100-4       46.5ns ± 3%  44.1ns ± 4%   -5.09%  (p=0.000 n=17+20)
      RWMutexWrite100-16      68.9ns ± 3%  65.7ns ± 8%   -4.65%  (p=0.000 n=18+20)
      RWMutexWrite10          27.1ns ± 0%  25.7ns ± 0%   -5.25%  (p=0.000 n=17+14)
      RWMutexWrite10-4        34.8ns ± 1%  33.8ns ± 2%   -2.96%  (p=0.000 n=20+20)
      RWMutexWrite10-16       37.5ns ± 2%  36.1ns ± 2%   -3.72%  (p=0.000 n=20+20)
      RWMutexWorkWrite100      164ns ± 0%   162ns ± 0%   -1.49%  (p=0.000 n=12+20)
      RWMutexWorkWrite100-4    186ns ± 3%   184ns ± 4%     ~     (p=0.097 n=20+20)
      RWMutexWorkWrite100-16   204ns ± 2%   200ns ± 2%   -1.58%  (p=0.000 n=18+20)
      RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%   -1.21%  (p=0.000 n=20+20)
      RWMutexWorkWrite10-4     179ns ± 1%   176ns ± 2%   -1.25%  (p=0.000 n=19+20)
      RWMutexWorkWrite10-16    191ns ± 1%   189ns ± 1%   -0.94%  (p=0.000 n=15+17)
      
      Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
      Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • Tobias Klauser's avatar
      bytes: return early in Repeat if count is 0 · 0e9d7d43
      Tobias Klauser authored
      This matches the implementation of strings.Repeat and slightly increases
      performance:
      
      name      old time/op  new time/op  delta
      Repeat-8   145ns ±12%   125ns ±29%  -13.35%  (p=0.009 n=10+10)
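
      A minimal sketch of the early-return idea, assuming a standalone
      helper named repeat (hypothetical; this is not the bytes package
      source, and the overflow check a real implementation needs is
      omitted):

```go
package main

import "fmt"

// repeat is a hypothetical stand-in for a Repeat-style helper.
// It returns count copies of b joined together.
func repeat(b []byte, count int) []byte {
	// Early return: with count == 0 there is nothing to allocate or copy.
	if count == 0 {
		return []byte{}
	}
	if count < 0 {
		panic("repeat: negative count")
	}
	// NOTE: no overflow check on len(b)*count; this is only a sketch.
	out := make([]byte, len(b)*count)
	n := copy(out, b)
	// Double the filled prefix until the whole buffer is written.
	for n < len(out) {
		copy(out[n:], out[:n])
		n *= 2
	}
	return out
}

func main() {
	fmt.Printf("%q\n", repeat([]byte("ab"), 3))
	fmt.Println(len(repeat([]byte("ab"), 0)))
}
```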
      
      Change-Id: Ic0a0e2ea9e36591286a49def320ddb67fe0b2c50
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166399
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>