1. 11 Mar, 2019 6 commits
  2. 10 Mar, 2019 3 commits
    • syscall: skip non-root user namespace test if kernel forbids · 1c2d4da1
      Alberto Donizetti authored
      The unprivileged_userns_clone sysctl prevents unprivileged users from
      creating namespaces, which the AmbientCaps test does. It's set to 0 by
      default in a few Linux distributions (Debian and Arch, possibly
      others), so we need to check it before running the test.
      
      I've verified that setting
      
        echo 1 > /proc/sys/kernel/unprivileged_userns_clone
      
      and then running the test *without this patch* makes it pass, which
      proves that checking unprivileged_userns_clone is indeed sufficient.
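      A minimal sketch of such a check, assuming the sysctl file holds "0"
      when unprivileged user namespaces are forbidden. The helper name
      usernsCloneAllowed is hypothetical and is not taken from the patch;
      any read error is treated here as "sysctl absent".

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// usernsCloneAllowed interprets the contents of the
// unprivileged_userns_clone sysctl file. If the file could not be read
// (exists == false), the sysctl is assumed to be absent and this
// particular restriction is assumed not to apply.
func usernsCloneAllowed(data string, exists bool) bool {
	if !exists {
		return true
	}
	return strings.TrimSpace(data) != "0"
}

func main() {
	const path = "/proc/sys/kernel/unprivileged_userns_clone"
	data, err := os.ReadFile(path)
	allowed := usernsCloneAllowed(string(data), err == nil)
	fmt.Println("unprivileged user namespaces allowed:", allowed)
}
```

      In a test, a false result would typically lead to a t.Skip call
      before any namespace is created.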
      
      Fixes #30698
      
      Change-Id: Ib2079b5e714d7f2440ddf979c3e7cfda9a9c5005
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166460
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • strings: remove unnecessary strings.s · e2dc41b4
      Tobias Klauser authored
      There are no empty function declarations in package strings anymore, so
      strings.s is no longer needed.
      
      Change-Id: I16fe161a9c06804811e98af0ca074f8f46e2f49d
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166458
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • runtime: safely load DLLs · 9b6e9f0c
      Jason A. Donenfeld authored
      While many other call sites have been moved to the proper
      higher-level system loading, these areas were left out. This change
      prevents DLL directory injection attacks. It covers both the runtime load
      calls (which previously used LoadLibrary) and the implicitly linked ones via
      cgo_import_dynamic, which we move to our LoadLibraryEx. The goal is to
      only loosely load kernel32.dll and strictly load all others.
      
      Meanwhile we make sure that we never fall back to insecure loading on
      older or unpatched systems.
      
      This is CVE-2019-9634.
      
      Fixes #14959
      Fixes #28978
      Fixes #30642
      
      Change-Id: I401a13ed8db248ab1bb5039bf2d31915cac72b93
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165798
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
  3. 09 Mar, 2019 13 commits
    • cmd/compile: add pure Go math/big functions to TestIntendedInlining · 243c8eb8
      Josh Bleecher Snyder authored
      Change-Id: Id29a9e48a09965e457f923a0ff023722b38b27ef
      Reviewed-on: https://go-review.googlesource.com/c/go/+/165157
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • math/big: add fast path for amd64 addVW for large z · 4d10aba3
      Josh Bleecher Snyder authored
      This matches the pure Go fast path added in the previous commit.
      
      I will leave other architectures to those with ready access to hardware.
      
      name            old time/op    new time/op    delta
      AddVW/1-8         3.60ns ± 3%    3.59ns ± 1%      ~     (p=0.147 n=91+86)
      AddVW/2-8         3.92ns ± 1%    3.91ns ± 2%    -0.36%  (p=0.000 n=86+92)
      AddVW/3-8         4.33ns ± 5%    4.46ns ± 5%    +2.94%  (p=0.000 n=96+97)
      AddVW/4-8         4.76ns ± 5%    4.82ns ± 5%    +1.28%  (p=0.000 n=95+92)
      AddVW/5-8         5.40ns ± 1%    5.42ns ± 0%    +0.47%  (p=0.000 n=76+71)
      AddVW/10-8        8.03ns ± 1%    7.80ns ± 5%    -2.90%  (p=0.000 n=73+96)
      AddVW/100-8       43.8ns ± 5%    17.9ns ± 1%   -59.12%  (p=0.000 n=94+81)
      AddVW/1000-8       428ns ± 4%      85ns ± 6%   -80.20%  (p=0.000 n=96+99)
      AddVW/10000-8     4.22µs ± 2%    1.80µs ± 3%   -57.32%  (p=0.000 n=69+92)
      AddVW/100000-8    44.8µs ± 8%    31.5µs ± 3%   -29.76%  (p=0.000 n=99+90)
      
      name            old time/op    new time/op    delta
      SubVW/1-8         3.53ns ± 2%    3.63ns ± 5%    +2.97%  (p=0.000 n=94+93)
      SubVW/2-8         4.33ns ± 5%    4.01ns ± 2%    -7.36%  (p=0.000 n=90+85)
      SubVW/3-8         4.32ns ± 2%    4.32ns ± 5%      ~     (p=0.084 n=87+97)
      SubVW/4-8         4.70ns ± 2%    4.83ns ± 6%    +2.77%  (p=0.000 n=85+96)
      SubVW/5-8         5.84ns ± 1%    5.35ns ± 1%    -8.35%  (p=0.000 n=87+87)
      SubVW/10-8        8.01ns ± 4%    7.54ns ± 4%    -5.84%  (p=0.000 n=98+97)
      SubVW/100-8       43.9ns ± 5%    17.9ns ± 1%   -59.20%  (p=0.000 n=98+76)
      SubVW/1000-8       426ns ± 2%      85ns ± 3%   -80.13%  (p=0.000 n=90+98)
      SubVW/10000-8     4.24µs ± 2%    1.81µs ± 3%   -57.28%  (p=0.000 n=74+91)
      SubVW/100000-8    44.5µs ± 4%    31.5µs ± 2%   -29.33%  (p=0.000 n=84+91)
      
      Change-Id: I10dd361cbaca22197c27e7734c0f50065292afbb
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164969
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math/big: add fast path for pure Go addVW for large z · fe24837c
      Josh Bleecher Snyder authored
      In the normal case, only a few words have to be updated when adding a word to a vector.
      When that happens, we can simply copy the rest of the words, which is much faster.
      However, the overhead of that makes it prohibitive for small vectors,
      so we check the size at the beginning.
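      
      The idea can be sketched as follows. This is an illustration, not
      the code from the commit: it works on plain []uint instead of
      math/big's Word type, and the cutoff largeThreshold is a made-up
      value.

```go
package main

import (
	"fmt"
	"math/bits"
)

// largeThreshold is a hypothetical cutoff above which the copy-based
// fast path is tried; the real value is not stated in the commit message.
const largeThreshold = 32

// addVW sets z = x + y, where y is a single word, and returns the final
// carry. It assumes len(x) >= len(z).
func addVW(z, x []uint, y uint) (c uint) {
	c = y
	if len(z) > largeThreshold {
		for i := 0; i < len(z) && i < len(x); i++ {
			if c == 0 {
				// No carry left: the remaining words are unchanged,
				// so copy them in bulk instead of adding zero to each.
				copy(z[i:], x[i:])
				return 0
			}
			z[i], c = bits.Add(x[i], c, 0)
		}
		return c
	}
	// Small vectors: plain loop, no per-word carry test.
	for i := 0; i < len(z) && i < len(x); i++ {
		z[i], c = bits.Add(x[i], c, 0)
	}
	return c
}

func main() {
	x := make([]uint, 40)
	x[0] = ^uint(0) // forces a carry out of the first word
	z := make([]uint, len(x))
	c := addVW(z, x, 1)
	fmt.Println(z[:3], c)
}
```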
      
      The implementation is a bit weird to allow addVW to continue to be inlined; see #30548.
      
      The AddVW benchmarks are surprising, but fully repeatable.
      The SubVW benchmarks are more or less as expected.
      I expect that removing the indirect function call will
      help both and make them a bit more normal.
      
      name            old time/op    new time/op     delta
      AddVW/1-8         4.27ns ± 2%     3.81ns ± 3%   -10.83%  (p=0.000 n=89+90)
      AddVW/2-8         4.91ns ± 2%     4.34ns ± 1%   -11.60%  (p=0.000 n=83+90)
      AddVW/3-8         5.77ns ± 4%     5.76ns ± 2%      ~     (p=0.365 n=91+87)
      AddVW/4-8         6.03ns ± 1%     6.03ns ± 1%      ~     (p=0.392 n=80+76)
      AddVW/5-8         6.48ns ± 2%     6.63ns ± 1%    +2.27%  (p=0.000 n=76+74)
      AddVW/10-8        9.56ns ± 2%     9.56ns ± 1%    -0.02%  (p=0.002 n=69+76)
      AddVW/100-8       90.6ns ± 0%     18.1ns ± 4%   -79.99%  (p=0.000 n=72+94)
      AddVW/1000-8       865ns ± 0%       85ns ± 6%   -90.14%  (p=0.000 n=66+96)
      AddVW/10000-8     8.57µs ± 2%     1.82µs ± 3%   -78.73%  (p=0.000 n=99+94)
      AddVW/100000-8    84.4µs ± 2%     31.8µs ± 4%   -62.29%  (p=0.000 n=93+98)
      
      name            old time/op    new time/op     delta
      SubVW/1-8         3.90ns ± 2%     4.13ns ± 4%    +6.02%  (p=0.000 n=92+95)
      SubVW/2-8         4.15ns ± 1%     5.20ns ± 1%   +25.22%  (p=0.000 n=83+85)
      SubVW/3-8         5.50ns ± 2%     6.22ns ± 6%   +13.21%  (p=0.000 n=91+97)
      SubVW/4-8         5.99ns ± 1%     6.63ns ± 1%   +10.63%  (p=0.000 n=79+61)
      SubVW/5-8         6.75ns ± 4%     6.88ns ± 2%    +1.82%  (p=0.000 n=98+73)
      SubVW/10-8        9.57ns ± 1%     9.56ns ± 1%    -0.13%  (p=0.000 n=77+64)
      SubVW/100-8       90.3ns ± 1%     18.1ns ± 2%   -80.00%  (p=0.000 n=75+94)
      SubVW/1000-8       860ns ± 4%       85ns ± 7%   -90.14%  (p=0.000 n=97+99)
      SubVW/10000-8     8.51µs ± 3%     1.77µs ± 6%   -79.21%  (p=0.000 n=100+97)
      SubVW/100000-8    84.4µs ± 3%     31.5µs ± 3%   -62.66%  (p=0.000 n=92+92)
      
      Change-Id: I721d7031d40f245b4a284f5bdd93e7bb85e7e937
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164968
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math/big: remove bounds checks in pure Go implementations · 4c227a09
      Josh Bleecher Snyder authored
      These routines are quite sensitive to BCE.
      
      This change eliminates bounds checks from loops.
      It does so at the cost of a bit of safety:
      malformed input will now return incorrect answers
      instead of panicking.
      
      This isn't as bad as it sounds: math/big has very good
      test coverage, and the alternative implementations are in
      assembly, which could do much worse things with malformed input.
      
      If the compiler's BCE improves, so could these routines.
      
      Notable BCE improvements for these routines would be:
      
      * Allowing and propagating more cross-slice length hints.
        Then hints like _ = y[:len(z)] would eliminate bounds checks for y[i].
      
      * Propagating enough information so that we could do
        n := len(x)
        if len(z) < n {
          n = len(z)
        }
        and then have i < n eliminate the same bounds checks as
        i < len(x) && i < len(z) currently does.
      
      * Providing some way to do BCE for unrolled loops.
        Now that we have math/bits implementations,
        it is possible to write things like ADC chains in
        pure Go, if you can reasonably unroll loops.
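      
      As an illustration of the trade-off described above, here is a
      sketch of a vector add whose loop condition bounds the index by all
      three slice lengths. It uses plain []uint rather than math/big's
      Word type, and whether the compiler removes every bounds check
      depends on the Go version; building with
      -gcflags=-d=ssa/check_bce/debug=1 should list any that remain.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVV sets z = x + y word by word and returns the final carry (0 or 1).
// Because i is compared against len(z), len(x), and len(y) in the loop
// condition, each index expression in the body is provably in range.
// The price is that slices of mismatched length no longer panic: the
// loop just stops early and z holds a partial result.
func addVV(z, x, y []uint) (c uint) {
	for i := 0; i < len(z) && i < len(x) && i < len(y); i++ {
		z[i], c = bits.Add(x[i], y[i], c)
	}
	return c
}

func main() {
	z := make([]uint, 3)
	c := addVV(z, []uint{^uint(0), 1, 2}, []uint{1, 0, 0})
	fmt.Println(z, c)

	// Malformed input: x is too short, so only z[0] is updated.
	c = addVV(z, []uint{7}, []uint{1, 2, 3})
	fmt.Println(z, c)
}
```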
      
      Benchmarks below are for amd64, using -tags=math_big_pure_go.
      
      name            old time/op    new time/op    delta
      AddVV/1-8         5.15ns ± 3%    4.65ns ± 4%   -9.81%  (p=0.000 n=93+86)
      AddVV/2-8         6.40ns ± 2%    5.58ns ± 4%  -12.78%  (p=0.000 n=90+95)
      AddVV/3-8         7.07ns ± 2%    6.66ns ± 2%   -5.88%  (p=0.000 n=87+83)
      AddVV/4-8         7.94ns ± 5%    7.41ns ± 4%   -6.65%  (p=0.000 n=94+98)
      AddVV/5-8         8.55ns ± 1%    8.80ns ± 0%   +2.92%  (p=0.000 n=87+92)
      AddVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.12%  (p=0.000 n=83+71)
      AddVV/100-8        119ns ± 5%     117ns ± 4%   -1.60%  (p=0.000 n=93+90)
      AddVV/1000-8      1.14µs ± 4%    1.14µs ± 5%     ~     (p=0.812 n=95+91)
      AddVV/10000-8     11.4µs ± 5%    11.3µs ± 5%     ~     (p=0.503 n=97+96)
      AddVV/100000-8     114µs ± 4%     113µs ± 5%   -0.98%  (p=0.002 n=97+90)
      
      name            old time/op    new time/op    delta
      SubVV/1-8         5.23ns ± 5%    4.65ns ± 3%  -11.18%  (p=0.000 n=89+91)
      SubVV/2-8         6.49ns ± 5%    5.58ns ± 3%  -14.04%  (p=0.000 n=92+94)
      SubVV/3-8         7.10ns ± 3%    6.65ns ± 2%   -6.28%  (p=0.000 n=87+80)
      SubVV/4-8         8.04ns ± 1%    7.44ns ± 5%   -7.49%  (p=0.000 n=83+98)
      SubVV/5-8         8.55ns ± 2%    8.32ns ± 1%   -2.75%  (p=0.000 n=84+92)
      SubVV/10-8        12.7ns ± 1%    12.3ns ± 1%   -3.09%  (p=0.000 n=80+75)
      SubVV/100-8        119ns ± 0%     116ns ± 3%   -1.83%  (p=0.000 n=87+98)
      SubVV/1000-8      1.13µs ± 5%    1.13µs ± 3%     ~     (p=0.082 n=96+98)
      SubVV/10000-8     11.2µs ± 1%    11.3µs ± 3%   +0.76%  (p=0.000 n=87+97)
      SubVV/100000-8     112µs ± 2%     113µs ± 3%   +0.55%  (p=0.000 n=76+88)
      
      name            old time/op    new time/op    delta
      AddVW/1-8         4.30ns ± 4%    3.96ns ± 6%  -8.02%  (p=0.000 n=89+97)
      AddVW/2-8         5.15ns ± 2%    4.91ns ± 1%  -4.56%  (p=0.000 n=87+80)
      AddVW/3-8         5.59ns ± 3%    5.75ns ± 2%  +2.91%  (p=0.000 n=91+88)
      AddVW/4-8         6.20ns ± 1%    6.03ns ± 1%  -2.71%  (p=0.000 n=75+90)
      AddVW/5-8         6.93ns ± 3%    6.49ns ± 2%  -6.35%  (p=0.000 n=100+82)
      AddVW/10-8        10.0ns ± 7%     9.6ns ± 0%  -4.02%  (p=0.000 n=98+74)
      AddVW/100-8       91.1ns ± 1%    90.6ns ± 1%  -0.55%  (p=0.000 n=84+80)
      AddVW/1000-8       866ns ± 1%     856ns ± 4%  -1.06%  (p=0.000 n=69+96)
      AddVW/10000-8     8.64µs ± 1%    8.53µs ± 4%  -1.25%  (p=0.000 n=67+99)
      AddVW/100000-8    84.3µs ± 2%    85.4µs ± 4%  +1.22%  (p=0.000 n=89+99)
      
      name            old time/op    new time/op    delta
      SubVW/1-8         4.28ns ± 2%    3.82ns ± 3%  -10.63%  (p=0.000 n=91+89)
      SubVW/2-8         4.61ns ± 1%    4.48ns ± 3%   -2.67%  (p=0.000 n=94+96)
      SubVW/3-8         5.54ns ± 1%    5.81ns ± 4%   +4.87%  (p=0.000 n=92+97)
      SubVW/4-8         6.20ns ± 1%    6.08ns ± 2%   -1.99%  (p=0.000 n=71+88)
      SubVW/5-8         6.91ns ± 3%    6.64ns ± 1%   -3.90%  (p=0.000 n=97+70)
      SubVW/10-8        9.85ns ± 2%    9.62ns ± 0%   -2.31%  (p=0.000 n=82+62)
      SubVW/100-8       91.1ns ± 1%    90.9ns ± 3%   -0.14%  (p=0.010 n=71+93)
      SubVW/1000-8       859ns ± 3%     867ns ± 1%   +0.98%  (p=0.000 n=99+78)
      SubVW/10000-8     8.54µs ± 5%    8.57µs ± 2%   +0.38%  (p=0.007 n=98+92)
      SubVW/100000-8    84.5µs ± 3%    84.6µs ± 3%     ~     (p=0.334 n=95+94)
      
      name                old time/op    new time/op    delta
      AddMulVVW/1-8         5.43ns ± 3%    4.36ns ± 2%  -19.67%  (p=0.000 n=95+94)
      AddMulVVW/2-8         6.56ns ± 4%    6.11ns ± 1%   -6.90%  (p=0.000 n=91+91)
      AddMulVVW/3-8         8.00ns ± 1%    7.80ns ± 4%   -2.52%  (p=0.000 n=83+95)
      AddMulVVW/4-8         9.81ns ± 2%    9.53ns ± 1%   -2.86%  (p=0.000 n=77+64)
      AddMulVVW/5-8         11.4ns ± 3%    11.3ns ± 5%   -0.89%  (p=0.000 n=95+97)
      AddMulVVW/10-8        18.9ns ± 5%    19.1ns ± 5%   +0.89%  (p=0.000 n=91+94)
      AddMulVVW/100-8        165ns ± 5%     165ns ± 4%     ~     (p=0.427 n=97+98)
      AddMulVVW/1000-8      1.56µs ± 3%    1.56µs ± 4%     ~     (p=0.167 n=98+96)
      AddMulVVW/10000-8     15.7µs ± 5%    15.6µs ± 5%   -0.31%  (p=0.044 n=95+97)
      AddMulVVW/100000-8     156µs ± 3%     157µs ± 8%     ~     (p=0.373 n=72+99)
      
      Change-Id: Ibc720785d5b95f6a797103b1363843205f4d56bf
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164966
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • reflect: make all flag.mustBe* methods inlinable · 788e038e
      Daniel Martí authored
      mustBe was barely over budget, so manually inlining the first flag.kind
      call is enough. Add a TODO to reverse that in the future, once the
      compiler gets better.
      
      mustBeExported and mustBeAssignable were over budget by a larger amount,
      so add slow path functions instead. This is the same strategy used in
      the sync package for common methods like Once.Do, for example.
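      
      The slow-path strategy can be sketched like this. The flag type and
      its bits below are hypothetical stand-ins, not reflect's real
      definitions; the point is only that the small wrapper holds one
      test and one call, while the panics live in a separate function.

```go
package main

import "fmt"

// flag is a hypothetical stand-in for reflect's internal flag type.
type flag uint

const (
	flagValid flag = 1 << iota // set for any usable value
	flagRO                     // value was reached through an unexported field
)

// mustBeExported is kept tiny so that it is a good inlining candidate:
// the common case is a single test that falls through.
func (f flag) mustBeExported() {
	if f == 0 || f&flagRO != 0 {
		f.mustBeExportedSlow()
	}
}

// mustBeExportedSlow holds the rarely executed error handling.
func (f flag) mustBeExportedSlow() {
	if f == 0 {
		panic("call on zero value")
	}
	if f&flagRO != 0 {
		panic("value obtained using unexported field")
	}
}

func main() {
	flagValid.mustBeExported() // fast path: no call into the slow function
	fmt.Println("exported value accepted")
}
```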
      
      Lots of exported reflect.Value methods call these assert-like unexported
      methods, so avoiding the function call overhead in the common case does
      shave off a percent from most exported APIs.
      
      Finally, add the methods to TestIntendedInlining.
      
      While at it, replace a couple of uses of the 0 Kind with its descriptive
      name, Invalid.
      
      name     old time/op    new time/op    delta
      Call-8     68.0ns ± 1%    66.8ns ± 1%  -1.81%  (p=0.000 n=10+9)
      PtrTo-8    8.00ns ± 2%    7.83ns ± 0%  -2.19%  (p=0.000 n=10+9)
      
      Updates #7818.
      
      Change-Id: Ic1603b640519393f6b50dd91ec3767753eb9e761
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166462
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: update TestIntendedInlining · cc5dc001
      Daniel Martí authored
      Value.CanInterface and Value.pointer are now inlinable, since we have a
      limited form of mid-stack inlining. Their calls to panic were preventing
      that in previous Go releases. The other three methods still go over
      budget, so update that comment.
      
      In recent commits, sync.Once.Do and multiple lock/unlock methods have
      also been made inlinable, so add those as well. They have standalone
      tests like test/inline_sync.go already, but it's best if the funcs are
      in this global test table too. They aren't inlinable on every platform
      yet, though.
      
      Finally, use math/bits.UintSize to check if GOARCH is 64-bit, now that
      we can.
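      
      For reference, a word-size check along these lines might look as
      follows. The constant name is64Bit is an assumption for this
      sketch and is not taken from the test.

```go
package main

import (
	"fmt"
	"math/bits"
)

// is64Bit is true when uint is 64 bits wide on the build target.
// bits.UintSize is a compile-time constant, either 32 or 64.
const is64Bit = bits.UintSize == 64

func main() {
	fmt.Println("64-bit target:", is64Bit)
}
```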
      
      Change-Id: I65cc681b77015f7746dba3126637e236dcd494e0
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166461
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: allow inlining the RWMutex.RUnlock fast path · 05051b56
      Carlo Alberto Ferraris authored
      RWMutex.RLock is already inlineable, so add a test for it as well.
      
      name                    old time/op  new time/op  delta
      RWMutexUncontended      66.5ns ± 0%  60.3ns ± 1%  -9.38%  (p=0.000 n=12+20)
      RWMutexUncontended-4    16.7ns ± 0%  15.3ns ± 1%  -8.49%  (p=0.000 n=17+20)
      RWMutexUncontended-16   7.86ns ± 0%  7.69ns ± 0%  -2.08%  (p=0.000 n=18+15)
      RWMutexWrite100         25.1ns ± 0%  24.0ns ± 1%  -4.28%  (p=0.000 n=20+18)
      RWMutexWrite100-4       46.7ns ± 5%  44.1ns ± 4%  -5.53%  (p=0.000 n=20+20)
      RWMutexWrite100-16      68.3ns ±11%  65.7ns ± 8%  -3.81%  (p=0.003 n=20+20)
      RWMutexWrite10          26.7ns ± 1%  25.7ns ± 0%  -3.75%  (p=0.000 n=17+14)
      RWMutexWrite10-4        34.9ns ± 2%  33.8ns ± 2%  -3.15%  (p=0.000 n=20+20)
      RWMutexWrite10-16       37.4ns ± 2%  36.1ns ± 2%  -3.51%  (p=0.000 n=18+20)
      RWMutexWorkWrite100      163ns ± 0%   162ns ± 0%  -0.89%  (p=0.000 n=18+20)
      RWMutexWorkWrite100-4    189ns ± 4%   184ns ± 4%  -2.89%  (p=0.000 n=19+20)
      RWMutexWorkWrite100-16   207ns ± 4%   200ns ± 2%  -3.07%  (p=0.000 n=19+20)
      RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%  -0.75%  (p=0.000 n=20+20)
      RWMutexWorkWrite10-4     177ns ± 1%   176ns ± 2%  -0.63%  (p=0.004 n=17+20)
      RWMutexWorkWrite10-16    191ns ± 2%   189ns ± 1%  -0.83%  (p=0.015 n=20+17)
      
      linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)
      
      The cumulative effect of this and the previous 3 commits is:
      
      name                    old time/op  new time/op  delta
      MutexUncontended        19.3ns ± 1%  16.4ns ± 1%  -15.13%  (p=0.000 n=20+20)
      MutexUncontended-4      5.24ns ± 0%  4.09ns ± 0%  -21.95%  (p=0.000 n=20+18)
      MutexUncontended-16     2.10ns ± 0%  2.12ns ± 0%   +0.95%  (p=0.000 n=15+17)
      Mutex                   19.6ns ± 0%  16.3ns ± 1%  -17.12%  (p=0.000 n=20+20)
      Mutex-4                 54.6ns ± 5%  45.6ns ±10%  -16.51%  (p=0.000 n=20+19)
      Mutex-16                 133ns ± 5%   130ns ± 3%   -1.99%  (p=0.002 n=20+20)
      MutexSlack              33.4ns ± 2%  16.2ns ± 0%  -51.44%  (p=0.000 n=19+20)
      MutexSlack-4             206ns ± 5%   209ns ± 9%     ~     (p=0.154 n=20+20)
      MutexSlack-16           89.4ns ± 1%  90.9ns ± 2%   +1.70%  (p=0.000 n=18+17)
      MutexWork               60.5ns ± 0%  55.3ns ± 1%   -8.59%  (p=0.000 n=12+20)
      MutexWork-4              105ns ± 5%    97ns ±11%   -7.95%  (p=0.000 n=20+20)
      MutexWork-16             157ns ± 1%   158ns ± 1%   +0.66%  (p=0.001 n=18+17)
      MutexWorkSlack          70.2ns ± 5%  55.3ns ± 0%  -21.30%  (p=0.000 n=19+18)
      MutexWorkSlack-4         277ns ±13%   260ns ±15%   -6.35%  (p=0.002 n=20+18)
      MutexWorkSlack-16        156ns ± 0%   146ns ± 1%   -6.40%  (p=0.000 n=16+19)
      MutexNoSpin              966ns ± 0%   976ns ± 1%   +0.97%  (p=0.000 n=15+17)
      MutexNoSpin-4            269ns ± 4%   272ns ± 4%   +1.15%  (p=0.048 n=20+18)
      MutexNoSpin-16           122ns ± 0%   119ns ± 1%   -2.63%  (p=0.000 n=19+15)
      MutexSpin               3.13µs ± 0%  3.12µs ± 0%   -0.17%  (p=0.000 n=18+18)
      MutexSpin-4              826ns ± 1%   833ns ± 1%   +0.84%  (p=0.000 n=19+17)
      MutexSpin-16             397ns ± 1%   394ns ± 1%   -0.78%  (p=0.000 n=19+19)
      Once                    5.67ns ± 0%  2.07ns ± 2%  -63.43%  (p=0.000 n=20+20)
      Once-4                  1.47ns ± 2%  0.54ns ± 3%  -63.49%  (p=0.000 n=19+20)
      Once-16                 0.58ns ± 0%  0.17ns ± 5%  -70.49%  (p=0.000 n=17+17)
      RWMutexUncontended      71.4ns ± 0%  60.3ns ± 1%  -15.60%  (p=0.000 n=16+20)
      RWMutexUncontended-4    18.4ns ± 4%  15.3ns ± 1%  -17.14%  (p=0.000 n=20+20)
      RWMutexUncontended-16   8.01ns ± 0%  7.69ns ± 0%   -3.91%  (p=0.000 n=18+15)
      RWMutexWrite100         24.9ns ± 0%  24.0ns ± 1%   -3.57%  (p=0.000 n=19+18)
      RWMutexWrite100-4       46.5ns ± 3%  44.1ns ± 4%   -5.09%  (p=0.000 n=17+20)
      RWMutexWrite100-16      68.9ns ± 3%  65.7ns ± 8%   -4.65%  (p=0.000 n=18+20)
      RWMutexWrite10          27.1ns ± 0%  25.7ns ± 0%   -5.25%  (p=0.000 n=17+14)
      RWMutexWrite10-4        34.8ns ± 1%  33.8ns ± 2%   -2.96%  (p=0.000 n=20+20)
      RWMutexWrite10-16       37.5ns ± 2%  36.1ns ± 2%   -3.72%  (p=0.000 n=20+20)
      RWMutexWorkWrite100      164ns ± 0%   162ns ± 0%   -1.49%  (p=0.000 n=12+20)
      RWMutexWorkWrite100-4    186ns ± 3%   184ns ± 4%     ~     (p=0.097 n=20+20)
      RWMutexWorkWrite100-16   204ns ± 2%   200ns ± 2%   -1.58%  (p=0.000 n=18+20)
      RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%   -1.21%  (p=0.000 n=20+20)
      RWMutexWorkWrite10-4     179ns ± 1%   176ns ± 2%   -1.25%  (p=0.000 n=19+20)
      RWMutexWorkWrite10-16    191ns ± 1%   189ns ± 1%   -0.94%  (p=0.000 n=15+17)
      
      Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
      Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • bytes: return early in Repeat if count is 0 · 0e9d7d43
      Tobias Klauser authored
      This matches the implementation of strings.Repeat and slightly increases
      performance:
      
      name      old time/op  new time/op  delta
      Repeat-8   145ns ±12%   125ns ±29%  -13.35%  (p=0.009 n=10+10)
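      
      A simplified sketch of the early return. The function repeat below
      is a stand-in written for this note, not the bytes package source,
      and it omits the overflow check a real implementation needs.

```go
package main

import "fmt"

// repeat returns count copies of b joined together.
func repeat(b []byte, count int) []byte {
	if count == 0 {
		// Nothing to build: skip the allocation and copy loop entirely.
		return []byte{}
	}
	if count < 0 {
		panic("repeat: negative count")
	}
	nb := make([]byte, len(b)*count)
	bp := copy(nb, b)
	// Double the filled prefix until the buffer is full.
	for bp < len(nb) {
		copy(nb[bp:], nb[:bp])
		bp *= 2
	}
	return nb
}

func main() {
	fmt.Printf("%q\n", repeat([]byte("ab"), 3))
	fmt.Printf("%q\n", repeat([]byte("ab"), 0))
}
```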
      
      Change-Id: Ic0a0e2ea9e36591286a49def320ddb67fe0b2c50
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166399
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: allow inlining the Once.Do fast path · ca835484
      Carlo Alberto Ferraris authored
      Using Once.Do is now extremely cheap because the fast path is just an inlined
      atomic load of a variable that is written only once and a conditional jump.
      This is very beneficial for Once.Do because, due to its nature, the fast path
      will be used for every call after the first one.
      
      In an attempt to minimize the code size increase, reorder the fields so that the
      pointer to Once is also the pointer to Once.done, which is the only field used
      in the hot path. This allows more compact instruction encodings or fewer
      instructions in the hot path (which is inlined at every call site).
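      
      The shape described above can be sketched with a small local type.
      The type once below is written for illustration and is not the
      sync package source: done is the first field, Do contains only the
      atomic load and a call, and everything else sits in doSlow.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// once is an illustrative re-creation of the pattern. done comes first
// so that a pointer to the struct is also a pointer to done.
type once struct {
	done uint32
	m    sync.Mutex
}

// Do is the fast path: one atomic load and a conditional call.
func (o *once) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 0 {
		o.doSlow(f)
	}
}

// doSlow runs f at most once, under the mutex, and marks completion
// only after f has returned.
func (o *once) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

func main() {
	var o once
	n := 0
	for i := 0; i < 3; i++ {
		o.Do(func() { n++ })
	}
	fmt.Println(n) // prints 1
}
```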
      
      name     old time/op  new time/op  delta
      Once     4.54ns ± 0%  2.06ns ± 0%  -54.59%  (p=0.000 n=19+16)
      Once-4   1.18ns ± 0%  0.55ns ± 0%  -53.39%  (p=0.000 n=15+16)
      Once-16  0.53ns ± 0%  0.17ns ± 0%  -67.92%  (p=0.000 n=18+17)
      
      linux/amd64 bin/go 14675861 (previous commit 14663387, +12474/+0.09%)
      
      Change-Id: Ie2708103ab473787875d66746d2f20f1d90a6916
      Reviewed-on: https://go-review.googlesource.com/c/go/+/152697
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • sync: allow inlining the Mutex.Lock fast path · 41cb0aed
      Carlo Alberto Ferraris authored
      name                    old time/op  new time/op  delta
      MutexUncontended        18.9ns ± 0%  16.2ns ± 0%  -14.29%  (p=0.000 n=19+19)
      MutexUncontended-4      4.75ns ± 1%  4.08ns ± 0%  -14.20%  (p=0.000 n=20+19)
      MutexUncontended-16     2.05ns ± 0%  2.11ns ± 0%   +2.93%  (p=0.000 n=19+16)
      Mutex                   19.3ns ± 1%  16.2ns ± 0%  -15.86%  (p=0.000 n=17+19)
      Mutex-4                 52.4ns ± 4%  48.6ns ± 9%   -7.22%  (p=0.000 n=20+20)
      Mutex-16                 139ns ± 2%   140ns ± 3%   +1.03%  (p=0.011 n=16+20)
      MutexSlack              18.9ns ± 1%  16.2ns ± 1%  -13.96%  (p=0.000 n=20+20)
      MutexSlack-4             225ns ± 8%   211ns ±10%   -5.94%  (p=0.000 n=18+19)
      MutexSlack-16           98.4ns ± 1%  90.9ns ± 1%   -7.60%  (p=0.000 n=17+18)
      MutexWork               58.2ns ± 3%  55.4ns ± 0%   -4.82%  (p=0.000 n=20+17)
      MutexWork-4              103ns ± 7%    95ns ±18%   -8.03%  (p=0.000 n=20+20)
      MutexWork-16             163ns ± 2%   155ns ± 2%   -4.47%  (p=0.000 n=18+18)
      MutexWorkSlack          57.7ns ± 1%  55.4ns ± 0%   -3.99%  (p=0.000 n=20+13)
      MutexWorkSlack-4         276ns ±13%   260ns ±10%   -5.64%  (p=0.001 n=19+19)
      MutexWorkSlack-16        147ns ± 0%   156ns ± 1%   +5.87%  (p=0.000 n=14+19)
      MutexNoSpin              968ns ± 0%   900ns ± 1%   -6.98%  (p=0.000 n=20+18)
      MutexNoSpin-4            270ns ± 2%   255ns ± 2%   -5.74%  (p=0.000 n=19+20)
      MutexNoSpin-16           120ns ± 4%   112ns ± 0%   -6.99%  (p=0.000 n=19+14)
      MutexSpin               3.13µs ± 1%  3.19µs ± 6%     ~     (p=0.401 n=20+20)
      MutexSpin-4              832ns ± 2%   831ns ± 1%   -0.17%  (p=0.023 n=16+18)
      MutexSpin-16             395ns ± 0%   399ns ± 0%   +0.94%  (p=0.000 n=17+19)
      RWMutexUncontended      69.5ns ± 0%  68.4ns ± 0%   -1.59%  (p=0.000 n=20+20)
      RWMutexUncontended-4    17.5ns ± 0%  16.7ns ± 0%   -4.30%  (p=0.000 n=18+17)
      RWMutexUncontended-16   7.92ns ± 0%  7.87ns ± 0%   -0.61%  (p=0.000 n=18+17)
      RWMutexWrite100         24.9ns ± 1%  25.0ns ± 1%   +0.32%  (p=0.000 n=20+20)
      RWMutexWrite100-4       46.2ns ± 4%  46.2ns ± 5%     ~     (p=0.840 n=19+20)
      RWMutexWrite100-16      69.9ns ± 5%  69.9ns ± 3%     ~     (p=0.545 n=20+19)
      RWMutexWrite10          27.0ns ± 2%  26.8ns ± 2%   -0.98%  (p=0.001 n=20+20)
      RWMutexWrite10-4        34.7ns ± 2%  35.0ns ± 4%     ~     (p=0.191 n=18+20)
      RWMutexWrite10-16       37.2ns ± 4%  37.3ns ± 2%     ~     (p=0.438 n=20+19)
      RWMutexWorkWrite100      164ns ± 0%   163ns ± 0%   -0.24%  (p=0.025 n=20+20)
      RWMutexWorkWrite100-4    193ns ± 3%   191ns ± 2%   -1.06%  (p=0.027 n=20+20)
      RWMutexWorkWrite100-16   210ns ± 3%   207ns ± 3%   -1.22%  (p=0.038 n=20+20)
      RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%     ~     (all equal)
      RWMutexWorkWrite10-4     178ns ± 2%   179ns ± 2%     ~     (p=0.186 n=20+20)
      RWMutexWorkWrite10-16    192ns ± 2%   192ns ± 2%     ~     (p=0.731 n=19+20)
      
      linux/amd64 bin/go 14663387 (previous commit 14630572, +32815/+0.22%)
      
      Change-Id: I98171006dce14069b1a62da07c3d165455a7906b
      Reviewed-on: https://go-review.googlesource.com/c/go/+/148959
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: reverse order of slice bounds checks · 83a33d38
      Keith Randall authored
      Turns out this makes the fix for #28797 unnecessary, because this order
      ensures that the RHS of IsSliceInBounds ops are always nonnegative.
      
      The real reason for this change is that it also makes dealing with
      <0 values easier for reporting values in bounds check panics (issue #30116).
      
      Makes cmd/go negligibly smaller.
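      
      One way to picture why the order matters, as a sketch only: the
      function below is an assumption about the general idea and is not
      derived from the compiler change itself. For s[i:j:k], checking k
      against the capacity first, then j against k, then i against j
      means the right-hand side of each unsigned comparison has already
      been shown to be nonnegative.

```go
package main

import "fmt"

// checkSlice3 validates the indices of a hypothetical s[i:j:k] on a
// slice with capacity c (c is assumed nonnegative). Each comparison is
// unsigned, so a negative left-hand side shows up as a huge value and
// fails the check without a separate "< 0" test.
func checkSlice3(i, j, k, c int) error {
	if uint(k) > uint(c) { // c >= 0, so this also rejects k < 0
		return fmt.Errorf("bad k=%d for capacity %d", k, c)
	}
	if uint(j) > uint(k) { // k is now known to be in [0, c]
		return fmt.Errorf("bad j=%d for k=%d", j, k)
	}
	if uint(i) > uint(j) { // j is now known to be in [0, k]
		return fmt.Errorf("bad i=%d for j=%d", i, j)
	}
	return nil
}

func main() {
	fmt.Println(checkSlice3(1, 2, 3, 4))
	fmt.Println(checkSlice3(-1, 2, 3, 4))
}
```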
      
      Update #28797
      
      Change-Id: I1f25ba6d2b3b3d4a72df3105828aa0a4b629ce85
      Reviewed-on: https://go-review.googlesource.com/c/go/+/166377
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link: enable DWARF with external linker on aix/ppc64 · 3cf89e50
      Clément Chigot authored
      To allow DWARF with ld, the symbol table is adapted.
      In internal linkmode, each package is treated as a .FILE. However, the
      current version of ld crashes on a few programs because of
      relocations between DWARF symbols. Treating all packages as part of
      one .FILE seems to bypass this bug.
      As it might be fixed in a future release, the size of each package
      in DWARF sections is still retrieved and can be used once it's fixed.
      Moreover, this improves internal linkmode, which should have done it
      anyway.
      
      Change-Id: If3d023fe118b24b9f0f46d201a4849eee8d5e333
      Reviewed-on: https://go-review.googlesource.com/c/go/+/164006
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • debug/gosym: simplify parsing symbol name rule · b37b35ed
      LE Manh Cuong authored
      Symbol names with a linker prefix like "type." or "go." are not parsed
      correctly: the prefix is returned as part of the package name.
      
      So just return an empty string for symbol names that start with a linker prefix.
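      
      The rule can be sketched as a standalone function. packageName is
      a free function written for illustration, not the debug/gosym
      method itself: names starting with "go." or "type." yield an empty
      package name, and otherwise the package is everything up to the
      first dot after the last slash.

```go
package main

import (
	"fmt"
	"strings"
)

// packageName extracts the package path from a symbol name, returning
// "" for linker-generated symbols and for names without a package part.
func packageName(name string) string {
	// Linker-prefixed symbols carry no package name.
	if strings.HasPrefix(name, "go.") || strings.HasPrefix(name, "type.") {
		return ""
	}
	// Skip any dots inside the import path, e.g. "github.com/...".
	pathEnd := strings.LastIndex(name, "/")
	if pathEnd < 0 {
		pathEnd = 0
	}
	if i := strings.Index(name[pathEnd:], "."); i != -1 {
		return name[:pathEnd+i]
	}
	return ""
}

func main() {
	for _, name := range []string{
		"fmt.Println",
		"github.com/user/pkg.(*T).Method",
		"type..eq.[2]interface {}",
		"go.string.hello",
	} {
		fmt.Printf("%q -> %q\n", name, packageName(name))
	}
}
```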
      
      Fixes #29551
      
      Change-Id: Idb4ce872345e5781a5a5da2b2146faeeebd9e63b
      Reviewed-on: https://go-review.googlesource.com/c/go/+/156397
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
  4. 08 Mar, 2019 18 commits