1. 01 May, 2018 8 commits
  2. 30 Apr, 2018 18 commits
    • math/big: return nil for nonexistent ModInverse · 4d44a872
      Brian Kessler authored
      Currently, the behavior of z.ModInverse(g, n) is undefined
      when g and n are not relatively prime.  In that case, no
      ModInverse exists, which can be easily checked during the
      computation of the ModInverse.  Because the ModInverse does
      not indicate whether the inverse exists, there are reimplementations
      of a "checked" ModInverse in crypto/rsa.  This change removes the
      undefined behavior.  If the ModInverse does not exist, the receiver z
      is unchanged and the return value is nil. This matches the behavior of
      ModSqrt for the case where the square root does not exist.
      
      name          old time/op    new time/op    delta
      ModInverse-4    2.40µs ± 4%    2.22µs ± 0%   -7.74%  (p=0.016 n=5+4)
      
      name          old alloc/op   new alloc/op   delta
      ModInverse-4    1.36kB ± 0%    1.17kB ± 0%  -14.12%  (p=0.008 n=5+5)
      
      name          old allocs/op  new allocs/op  delta
      ModInverse-4      10.0 ± 0%       9.0 ± 0%  -10.00%  (p=0.008 n=5+5)
      
      Fixes #24922
      
      Change-Id: If7f9d491858450bdb00f1e317152f02493c9c8a8
      Reviewed-on: https://go-review.googlesource.com/108996
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • runtime: perform crashes outside systemstack · b1d1ec91
      Elias Naur authored
      CL 93658 moved stack trace printing inside a systemstack call to
      sidestep complexity in case the runtime is in an inconsistent state.
      
      Unfortunately, debuggers generating backtraces for a Go panic
      will be confused and come up with a technically correct but useless
      stack. This CL moves just the crash itself (typically a SIGABRT
      signal) outside the systemstack call to improve backtraces.
      
      Unfortunately, the crash function now needs to be marked nosplit and
      that triggers the nosplit stack overflow check. To work around that,
      split fatalpanic in two: fatalthrow for runtime.throw and fatalpanic for
      runtime.gopanic. Only Go panics really need crashes on the right stack
      and there is enough stack for gopanic.
      
      Example program:
      
      package main
      
      import "runtime/debug"
      
      func main() {
      	debug.SetTraceback("crash")
      	crash()
      }
      
      func crash() {
      	panic("panic!")
      }
      
      Before:
      (lldb) bt
      * thread #1, name = 'simple', stop reason = signal SIGABRT
        * frame #0: 0x000000000044ffe4 simple`runtime.raise at <autogenerated>:1
          frame #1: 0x0000000000438cfb simple`runtime.dieFromSignal(sig=<unavailable>) at signal_unix.go:424
          frame #2: 0x0000000000438ec9 simple`runtime.crash at signal_unix.go:525
          frame #3: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
          frame #4: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
          frame #5: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
          frame #6: 0x000000000042a980 simple at proc.go:1094
          frame #7: 0x0000000000438ec9 simple`runtime.crash at signal_unix.go:525
          frame #8: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
          frame #9: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
          frame #10: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
          frame #11: 0x000000000042a980 simple at proc.go:1094
          frame #12: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
          frame #13: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
          frame #14: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
          frame #15: 0x000000000042a980 simple at proc.go:1094
          frame #16: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
          frame #17: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
      
      After:
      (lldb) bt
      * thread #7, stop reason = signal SIGABRT
        * frame #0: 0x0000000000450024 simple`runtime.raise at <autogenerated>:1
          frame #1: 0x0000000000438d1b simple`runtime.dieFromSignal(sig=<unavailable>) at signal_unix.go:424
          frame #2: 0x0000000000438ee9 simple`runtime.crash at signal_unix.go:525
          frame #3: 0x00000000004264e3 simple`runtime.fatalpanic(msgs=<unavailable>) at panic.go:664
          frame #4: 0x0000000000425f1b simple`runtime.gopanic(e=<unavailable>) at panic.go:537
          frame #5: 0x0000000000470c62 simple`main.crash at simple.go:11
          frame #6: 0x0000000000470c00 simple`main.main at simple.go:6
          frame #7: 0x0000000000427be7 simple`runtime.main at proc.go:198
          frame #8: 0x000000000044ef91 simple`runtime.goexit at <autogenerated>:1
      
      Updates #22716
      
      Change-Id: Ib5fa35c13662c1dac2f1eac8b59c4a5824b98d92
      Reviewed-on: https://go-review.googlesource.com/110065
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • cmd/asm: add vector instructions for ChaCha20Poly1305 on ARM64 · c789ce3f
      Balaram Makam authored
      This change provides the VZIP1, VZIP2, and VTBL instructions to support
      a ChaCha20Poly1305 implementation later.
      
      Change-Id: Ife7c87b8ab1a6495a444478eeb9d906ae4c5ffa9
      Reviewed-on: https://go-review.googlesource.com/110015
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • all: skip unsupported tests for js/wasm · e3c68477
      Richard Musiol authored
      The general policy for the current state of js/wasm is that it only
      has to support tests that are also supported by nacl.
      
      The test nilptr3.go makes assumptions about which nil checks can be
      removed. Since WebAssembly does not signal on reading a null pointer,
      all nil checks have to be explicit.
      
      Updates #18892
      
      Change-Id: I06a687860b8d22ae26b1c391499c0f5183e4c485
      Reviewed-on: https://go-review.googlesource.com/110096
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/internal/obj/arm: fix/rationalize checkpool distance check · 1b44167d
      Austin Clements authored
      When deciding whether to flush the constant pool, the distance check
      in checkpool can fail to account for padding inserted before the next
      instruction by nacl.
      
      For example, see this failure:
      https://go-review.googlesource.com/c/go/+/109350/2#message-07085b591227824bb1d646a7192cbfa7e0b97066
      Here, the pool should be flushed before a CALL instruction, but
      checkpool only considers the CALL instruction to be 4 bytes and
      doesn't account for the 8 extra bytes of alignment padding added
      before it by asmoutnacl. As a result, it flushes the pool after the
      CALL instruction, which is 4 bytes too late.
      
      Furthermore, there's no explanation for the rather convoluted
      expression used to decide if we need to emit the constant pool.
      
      This CL modifies checkpool to take the PC following the tentative
      instruction as an argument. The caller knows this already and this way
      checkpool doesn't have to guess (and get it wrong in the presence of
      padding). In the process, it rewrites the test to be structured and
      commented.
      
      Change-Id: I32a3d50ffb5a94d42be943e9bcd49036c7e9b95c
      Reviewed-on: https://go-review.googlesource.com/110017
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • syscall: enable some nacl code to be shared with js/wasm · 3bdbb5df
      Richard Musiol authored
      This commit only moves code in preparation for the following commit
      which adds the js/wasm architecture to the os package. There are no
      semantic changes in this commit.
      
      Updates #18892
      
      Change-Id: Ia44484216f905c25395c565c34cfe6996c305ed6
      Reviewed-on: https://go-review.googlesource.com/109976
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • os: find Hostname using Uname to fix Android · 3c7456c1
      Brad Fitzpatrick authored
      It's also fewer system calls. Fall back to longer read
      only if it seems like the Uname result is truncated.
      
      Fixes #24701
      
      Change-Id: Ib6550acede8dddaf184e8fa9de36377e17bbddab
      Reviewed-on: https://go-review.googlesource.com/110295
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • doc: document Go 1.10.2 · 4bced5e9
      Andrew Bonventre authored
      Change-Id: I84334dfd02ad9a27b3fb6d46a6b1c015a3f03511
      Reviewed-on: https://go-review.googlesource.com/110335
      Reviewed-by: Filippo Valsorda <filippo@golang.org>
    • doc: document Go 1.9.6 · 587416c1
      Andrew Bonventre authored
      Change-Id: I9699b22d3a308cda685aa684b32dcde99333df46
      Reviewed-on: https://go-review.googlesource.com/110315
      Reviewed-by: Filippo Valsorda <filippo@golang.org>
    • crypto/tls: add examples for [Load]X509KeyPair · 4154727e
      Kevin Burke authored
      I was confused about how to start an HTTP server if the server
      cert/key are in memory, not on disk. I thought it would be good to
      show an example of how to use these two functions to accomplish that.
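
      A rough sketch of the in-memory case, not the example added by this
      commit. The helpers selfSignedPEM and serverConfig are hypothetical;
      the throwaway self-signed certificate only stands in for wherever the
      real PEM bytes come from.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"log"
	"math/big"
	"time"
)

// selfSignedPEM (hypothetical helper) generates a throwaway self-signed
// certificate and its key, both PEM-encoded in memory.
func selfSignedPEM() (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "localhost"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		DNSNames:     []string{"localhost"},
	}
	// Self-signed: the template is its own parent.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// serverConfig (hypothetical helper) builds a tls.Config from PEM data
// held in memory, using tls.X509KeyPair instead of tls.LoadX509KeyPair.
func serverConfig(certPEM, keyPEM []byte) (*tls.Config, error) {
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, err
	}
	return &tls.Config{Certificates: []tls.Certificate{cert}}, nil
}

func main() {
	certPEM, keyPEM, err := selfSignedPEM()
	if err != nil {
		log.Fatal(err)
	}
	cfg, err := serverConfig(certPEM, keyPEM)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("certificates loaded:", len(cfg.Certificates))
	// An HTTP server could then use cfg as its TLSConfig and call
	// ListenAndServeTLS("", "") since the certificate is already set.
}
```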
      
      example-cert.pem and example-key.pem were generated using
      crypto/tls/generate_cert.go.
      
      Change-Id: I850e1282fb1c38aff8bd9aeb51988d21fe307584
      Reviewed-on: https://go-review.googlesource.com/72252
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/cgo: add support for GOARCH=riscv64 · 3334eee4
      Tobias Klauser authored
      Even though GOARCH=riscv64 is not supported by gc yet, it is easy
      to make cmd/cgo already support it.
      
      Together with the changes in debug/elf in CL 107339, this makes it
      possible, for example, to generate Go type definitions for linux/riscv64
      in the golang.org/x/sys/unix package without using gccgo.
      
      Change-Id: I6b849df2ddac56c8c483eb03d56009669ca36973
      Reviewed-on: https://go-review.googlesource.com/110066
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • misc/ios: script lldb directly with Python · 78219ab3
      Elias Naur authored
      The iOS exec wrapper uses ios-deploy to set up a device, install
      the wrapped app, and start a lldb session to run it. ios-deploy is
      not built to be scripted, as can be seen from the brittle way it is
      driven by the Go wrapper. There are many timeouts and comments such
      as
      
      "
      // lldb tries to be clever with terminals.
      // So we wrap it in script(1) and be clever
      // right back at it.
      "
      
      This CL replaces the use of ios-deploy with a lldb driver script in
      Python. lldb is designed to be scripted, so apart from getting rid
      of the ios-deploy dependency, we gain:
      
      - No timeouts, and no scripting of ios-deploy through stdin or
      parsing of stdout for responses.
      - Accurate exit codes.
      - Prompt exits when the wrapped binary fails for some reason. Before,
      the go test timeout would kick in to fail the test.
      - Support for environment variables.
      - No noise in the test output. Only the test binary output is output
      from the wrapper.
      
      We have to do more work with the lldb driver: mounting the developer
      image on the device, running idevicedebugserverproxy and installing
      the app. Even so, the CL removes almost as many lines as it adds.
      Furthermore, having the steps split up helps to tell setup errors
      from runtime errors.
      
      Change-Id: I48cccc32f475d17987283b2c93aacc3da18fe339
      Reviewed-on: https://go-review.googlesource.com/107337
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: fix newosproc darwin+arm/arm64 · eef27a8f
      Keith Randall authored
      Missed conversion of newosproc for the parts of darwin that
      weren't affected by my previous change.
      
      Update #25181
      
      Change-Id: I81a2935e192b6d0df358c59b7e785eb03c504c23
      Reviewed-on: https://go-review.googlesource.com/110123
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • doc: update FAQ on binary sizes · 0cdf2ec8
      Alberto Donizetti authored
      In the binary sizes FAQ, the approximate size of a Go hello world
      binary was said to be 1.5MB (it was about 1.6MB on go1.7 on
      linux/amd64). Sadly, this is no longer true. A Go1.10 hello world is
      2.0MB, and in 1.11 it'll be about 2.5MB.
      
      Just say "a couple megabytes" to stop this dance.
      
      Change-Id: Ib4dc13a47ccd51327c1a9d90d4116f79597513a4
      Reviewed-on: https://go-review.googlesource.com/110069
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime,cmd/compile: adjust and correct path names in comments of map code · 48bfc8db
      Martin Möhrmann authored
      Some of the relative paths in the comments do not exist, and
      reflect does not define its own hmap structure.
      
      Correct paths and consistently reference paths starting from the
      go src directory.
      
      Change-Id: I5204a3a98f77d65f17dcde98b847378cea05ad8a
      Reviewed-on: https://go-review.googlesource.com/94758
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: intrinsify runtime.getcallerpc on arm64 · bd8a8872
      Wei Xiao authored
      Add a compiler intrinsic for getcallerpc on arm64 for better code generation.
      
      Change-Id: I897e670a2b8ffa1a8c2fdc638f5b2c44bda26318
      Reviewed-on: https://go-review.googlesource.com/109276
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime,cmd/ld: on darwin, create threads using libc · b7f1777a
      Keith Randall authored
      Replace thread creation with calls to the pthread
      library in libc.
      
      Update #17490
      
      Change-Id: I1e19965c45255deb849b059231252fc6a7861d6c
      Reviewed-on: https://go-review.googlesource.com/108679
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile: use AuxInt to store shift boundedness · 743fd917
      Josh Bleecher Snyder authored
      Fixes ssacheck build.
      
      Change-Id: Idf1d2ea9a971a1f17f2fca568099e870bb5d913f
      Reviewed-on: https://go-review.googlesource.com/110122
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
  3. 29 Apr, 2018 14 commits
    • cmd/trace: use different colors for tasks · af5143e3
      Hana Kim authored
      and assign the same colors to spans belonging to the tasks
      (sadly, the trace viewer will change the saturation/lightness
      for asynchronous slices, so exact color mapping is impossible.
      But I hope they are not too far from each other).
      
      Change-Id: Idaaf0828a1e0dac8012d336dcefa1c6572ddca2e
      Reviewed-on: https://go-review.googlesource.com/109338
      Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Heschi Kreinick <heschi@google.com>
    • cmd/compile: better formatting for ssa phases options doc · 8a958bb8
      Alberto Donizetti authored
      Change the help doc of
      
        go tool compile -d=ssa/help
      
      from this:
      
        compile: GcFlag -d=ssa/<phase>/<flag>[=<value>|<function_name>]
        <phase> is one of:
        check, all, build, intrinsics, early_phielim, early_copyelim
        early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
        opt_deadcode, generic_cse, phiopt, nilcheckelim, prove, loopbce
        decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
        fuse, dse, writebarrier, insert_resched_checks, tighten, lower
        lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
        late_phielim, late_copyelim, phi_tighten, late_deadcode, critical
        likelyadjust, layout, schedule, late_nilcheck, flagalloc, regalloc
        loop_rotate, stackframe, trim
        <flag> is one of on, off, debug, mem, time, test, stats, dump
        <value> defaults to 1
        <function_name> is required for "dump", specifies name of function to dump after <phase>
        Except for dump, output is directed to standard out; dump appears in a file.
        Phase "all" supports flags "time", "mem", and "dump".
        Phases "intrinsics" supports flags "on", "off", and "debug".
        Interpretation of the "debug" value depends on the phase.
        Dump files are named <phase>__<function_name>_<seq>.dump.
      
      To this:
      
        compile: PhaseOptions usage:
      
            go tool compile -d=ssa/<phase>/<flag>[=<value>|<function_name>]
      
        where:
      
        - <phase> is one of:
            check, all, build, intrinsics, early_phielim, early_copyelim
            early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
            opt_deadcode, generic_cse, phiopt, nilcheckelim, prove
            decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
            branchelim, fuse, dse, writebarrier, insert_resched_checks, lower
            lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
            late_phielim, late_copyelim, tighten, phi_tighten, late_deadcode
            critical, likelyadjust, layout, schedule, late_nilcheck, flagalloc
            regalloc, loop_rotate, stackframe, trim
      
        - <flag> is one of:
            on, off, debug, mem, time, test, stats, dump
      
        - <value> defaults to 1
      
        - <function_name> is required for the "dump" flag, and specifies the
          name of function to dump after <phase>
      
        Phase "all" supports flags "time", "mem", and "dump".
        Phase "intrinsics" supports flags "on", "off", and "debug".
      
        If the "dump" flag is specified, the output is written on a file named
        <phase>__<function_name>_<seq>.dump; otherwise it is directed to stdout.
      
      Also add a few examples at the bottom.
      
      Fixes #20349
      
      Change-Id: I334799e951e7b27855b3ace5d2d966c4d6ec4cff
      Reviewed-on: https://go-review.googlesource.com/110062
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: simplify shifts using bounds from prove pass · 9eb4590a
      Josh Bleecher Snyder authored
      The prove pass sometimes has bounds information
      that later rewrite passes do not.
      
      Use this information to mark shifts as bounded,
      and then use that information to generate better code on amd64.
      It may prove to be helpful on other architectures, too.
      
      While here, coalesce the existing shift lowering rules.
      
      This triggers 35 times building std+cmd. The full list is below.
      
      Here's an example from runtime.heapBitsSetType:
      
      			if nb < 8 {
      				b |= uintptr(*p) << nb
      				p = add1(p)
      			} else {
      				nb -= 8
      			}
      
      We now generate better code on amd64 for that left shift.
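
      A standalone sketch of the same guarded-shift pattern. The function
      appendBits is a hypothetical illustration, not runtime code. Inside
      the guard the shift amount is known to be below 8, which is the kind
      of fact the prove pass can use; whether a particular compiler version
      marks this exact shift as bounded is an assumption.

```go
package main

import "fmt"

// appendBits (hypothetical helper) ORs the byte v into b at bit offset nb,
// but only when nb < 8. Inside the guard, nb is known to be in [0, 8), so
// the shift amount is smaller than the 64-bit operand width.
func appendBits(b uint64, v uint8, nb uint) uint64 {
	if nb < 8 {
		b |= uint64(v) << nb
	}
	return b
}

func main() {
	fmt.Printf("%#x\n", appendBits(0x1, 0xff, 4))
	fmt.Printf("%#x\n", appendBits(0x1, 0xff, 8)) // guard fails, b unchanged
}
```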
      
      Updates #25087
      
      vendor/golang_org/x/crypto/curve25519/mont25519_amd64.go:48:20: Proved Rsh8Ux64 bounded
      runtime/mbitmap.go:1252:22: Proved Lsh64x64 bounded
      runtime/mbitmap.go:1265:16: Proved Lsh64x64 bounded
      runtime/mbitmap.go:1275:28: Proved Lsh64x64 bounded
      runtime/mbitmap.go:1645:25: Proved Lsh64x64 bounded
      runtime/mbitmap.go:1663:25: Proved Lsh64x64 bounded
      runtime/mbitmap.go:1808:41: Proved Lsh64x64 bounded
      runtime/mbitmap.go:1831:49: Proved Lsh64x64 bounded
      syscall/route_bsd.go:227:23: Proved Lsh32x64 bounded
      syscall/route_bsd.go:295:23: Proved Lsh32x64 bounded
      syscall/route_darwin.go:40:23: Proved Lsh32x64 bounded
      compress/bzip2/bzip2.go:384:26: Proved Lsh64x16 bounded
      vendor/golang_org/x/net/route/address.go:370:14: Proved Lsh64x64 bounded
      compress/flate/inflate.go:201:54: Proved Lsh64x64 bounded
      math/big/prime.go:50:25: Proved Lsh64x64 bounded
      vendor/golang_org/x/crypto/cryptobyte/asn1.go:464:43: Proved Lsh8x8 bounded
      net/ip.go:87:21: Proved Rsh8Ux64 bounded
      cmd/internal/goobj/read.go:267:23: Proved Lsh64x64 bounded
      cmd/vendor/golang.org/x/arch/arm64/arm64asm/decode.go:534:27: Proved Lsh32x32 bounded
      cmd/vendor/golang.org/x/arch/arm64/arm64asm/decode.go:544:27: Proved Lsh32x32 bounded
      cmd/internal/obj/arm/asm5.go:1044:16: Proved Lsh32x64 bounded
      cmd/internal/obj/arm/asm5.go:1065:10: Proved Lsh32x32 bounded
      cmd/internal/obj/mips/obj0.go:1311:21: Proved Lsh32x64 bounded
      cmd/compile/internal/syntax/scanner.go:352:23: Proved Lsh64x64 bounded
      go/types/expr.go:222:36: Proved Lsh64x64 bounded
      crypto/x509/x509.go:1626:9: Proved Rsh8Ux64 bounded
      cmd/link/internal/loadelf/ldelf.go:823:22: Proved Lsh8x64 bounded
      net/http/h2_bundle.go:1470:17: Proved Lsh8x8 bounded
      net/http/h2_bundle.go:1477:46: Proved Lsh8x8 bounded
      net/http/h2_bundle.go:1481:31: Proved Lsh64x8 bounded
      cmd/compile/internal/ssa/rewriteARM64.go:18759:17: Proved Lsh64x64 bounded
      cmd/compile/internal/ssa/sparsemap.go:70:23: Proved Lsh32x64 bounded
      cmd/compile/internal/ssa/sparsemap.go:73:45: Proved Lsh32x64 bounded
      
      Change-Id: I58bb72f3e6f12f6ac69be633ea7222c245438142
      Reviewed-on: https://go-review.googlesource.com/109776
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
    • cmd/compile: pass arguments to convt2E/I integer functions by value · 22ff9521
      ChrisALiles authored
      The motivation is to avoid generating a pointer to the data being
      converted so it can be garbage collected.
      The change also slightly reduces binary size by shrinking call sites.
      
      Fixes #24286
      
      Benchmark results:
      name                   old time/op  new time/op  delta
      ConvT2ESmall-4         2.86ns ± 0%  2.80ns ± 1%  -2.12%  (p=0.000 n=29+28)
      ConvT2EUintptr-4       2.88ns ± 1%  2.88ns ± 0%  -0.20%  (p=0.002 n=28+30)
      ConvT2ELarge-4         19.6ns ± 0%  20.4ns ± 1%  +4.22%  (p=0.000 n=19+30)
      ConvT2ISmall-4         3.01ns ± 0%  2.85ns ± 0%  -5.32%  (p=0.000 n=24+28)
      ConvT2IUintptr-4       3.00ns ± 1%  2.87ns ± 0%  -4.44%  (p=0.000 n=29+25)
      ConvT2ILarge-4         20.4ns ± 1%  21.3ns ± 1%  +4.41%  (p=0.000 n=30+26)
      ConvT2Ezero/zero/16-4  2.84ns ± 1%  2.99ns ± 0%  +5.38%  (p=0.000 n=30+25)
      ConvT2Ezero/zero/32-4  2.83ns ± 2%  3.00ns ± 0%  +5.91%  (p=0.004 n=27+3)
      
      Change-Id: I65016ec94c53f97c52113121cab582d0c342b7a8
      Reviewed-on: https://go-review.googlesource.com/102636
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: teach prove to handle expressions like len(s)-delta · e0d37a33
      Giovanni Bajo authored
      When a loop has bound len(s)-delta, findIndVar detected it and
      returned len(s) as (conservative) upper bound. This little lie
      allowed loopbce to drop bound checks.
      
      It is obviously more generic to teach prove about relations like
      x+d<w for non-constant "w"; we already handled the case for
      constant "w", so we just want to learn that if d<0, then x+d<w
      proves that x<w.
      
      To be able to remove the code from findIndVar, we also need
      to teach prove that len() and cap() are always non-negative.
      
      This CL allows proving 633 more checks in cmd+std. Most
      of them are cases where the code was already testing before
      accessing a slice but the compiler didn't know it. For instance,
      take strings.HasSuffix:
      
          func HasSuffix(s, suffix string) bool {
              return len(s) >= len(suffix) && s[len(s)-len(suffix):] == suffix
          }
      
      When suffix is a literal string, the compiler now understands
      that the explicit check is enough to not emit a slice check.
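
      A small sketch of the same "test before slicing" pattern in user
      code. The function trimSuffix is a hypothetical illustration. The
      check_bce phase listed in the SSA help output can be used to see
      which bounds checks remain, for example with
      go tool compile -d=ssa/check_bce/debug=1; whether a given check is
      removed depends on the compiler version.

```go
package main

import "fmt"

// trimSuffix (hypothetical helper) removes suffix from the end of s if
// present. The explicit len(s) >= len(suffix) test guards both slice
// expressions, so len(s)-len(suffix) is known to be non-negative and
// no larger than len(s).
func trimSuffix(s, suffix string) string {
	if len(s) >= len(suffix) && s[len(s)-len(suffix):] == suffix {
		return s[:len(s)-len(suffix)]
	}
	return s
}

func main() {
	fmt.Println(trimSuffix("main.go", ".go"))
	fmt.Println(trimSuffix("go", "main.go")) // suffix longer than s: unchanged
}
```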
      
      I also found a loopbce test that was incorrectly
      written to detect an overflow but had an off-by-one (on the
      conservative side), so it unexpectedly passed with this CL; I
      changed it to really trigger the overflow as intended.
      
      Change-Id: Ib5abade337db46b8811425afebad4719b6e46c4a
      Reviewed-on: https://go-review.googlesource.com/105635
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: in prove, detect loops with negative increments · 6d379add
      Giovanni Bajo authored
      To be effective, this also requires being able to relax constraints
      on min/max bound inclusiveness; they are now exposed through flags,
      and prove has been updated to handle it correctly.
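
      For illustration, a loop with a negative increment of the kind this
      change targets. The function sumReverse is hypothetical. Here the
      induction variable i starts at len(s)-1 (an inclusive upper bound)
      and stops at 0 (an inclusive lower bound); that the compiler drops
      the bounds check on s[i] in this exact loop is an assumption.

```go
package main

import "fmt"

// sumReverse (hypothetical helper) adds the elements of s from last to
// first. The loop condition i >= 0 and the start value len(s)-1 together
// imply 0 <= i < len(s) on every iteration.
func sumReverse(s []int) int {
	total := 0
	for i := len(s) - 1; i >= 0; i-- {
		total += s[i]
	}
	return total
}

func main() {
	fmt.Println(sumReverse([]int{1, 2, 3}))
}
```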
      
      Change-Id: I3490e54461b7b9de8bc4ae40d3b5e2fa2d9f0556
      Reviewed-on: https://go-review.googlesource.com/104041
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: improve testing of induction variables · 980fdb8d
      Giovanni Bajo authored
      Test both minimum and maximum bound, and prepare
      formatting for more advanced tests (inclusive / exclusive bounds).
      
      Change-Id: Ibe432916d9c938343bc07943798bc9709ad71845
      Reviewed-on: https://go-review.googlesource.com/104040
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: remove loopbce pass · f49369b6
      Giovanni Bajo authored
      prove now is able to do what loopbce used to do.
      
      Passes toolstash -cmp.
      
      Compilebench of the whole series (master 9967582f):
      
      name       old time/op     new time/op     delta
      Template       208ms ±18%      198ms ± 4%    ~     (p=0.690 n=5+5)
      Unicode       99.1ms ±19%     96.5ms ± 4%    ~     (p=0.548 n=5+5)
      GoTypes        623ms ± 1%      633ms ± 1%    ~     (p=0.056 n=5+5)
      Compiler       2.94s ± 2%      3.02s ± 4%    ~     (p=0.095 n=5+5)
      SSA            6.77s ± 1%      7.11s ± 2%  +4.94%  (p=0.008 n=5+5)
      Flate          129ms ± 1%      136ms ± 0%  +4.87%  (p=0.016 n=5+4)
      GoParser       152ms ± 3%      156ms ± 1%    ~     (p=0.095 n=5+5)
      Reflect        380ms ± 2%      392ms ± 1%  +3.30%  (p=0.008 n=5+5)
      Tar            185ms ± 6%      184ms ± 2%    ~     (p=0.690 n=5+5)
      XML            223ms ± 2%      228ms ± 3%    ~     (p=0.095 n=5+5)
      StdCmd         26.8s ± 2%      28.0s ± 5%  +4.46%  (p=0.032 n=5+5)
      
      name       old user-ns/op  new user-ns/op  delta
      Template        252M ± 5%       248M ± 3%    ~     (p=1.000 n=5+5)
      Unicode         118M ± 7%       121M ± 4%    ~     (p=0.548 n=5+5)
      GoTypes         790M ± 2%       793M ± 2%    ~     (p=0.690 n=5+5)
      Compiler       3.78G ± 3%      3.91G ± 4%    ~     (p=0.056 n=5+5)
      SSA            8.98G ± 2%      9.52G ± 3%  +6.08%  (p=0.008 n=5+5)
      Flate           155M ± 1%       160M ± 0%  +3.47%  (p=0.016 n=5+4)
      GoParser        185M ± 4%       187M ± 2%    ~     (p=0.310 n=5+5)
      Reflect         469M ± 1%       481M ± 1%  +2.52%  (p=0.016 n=5+5)
      Tar             222M ± 4%       222M ± 2%    ~     (p=0.841 n=5+5)
      XML             269M ± 1%       274M ± 2%  +1.88%  (p=0.032 n=5+5)
      
      name       old text-bytes  new text-bytes  delta
      HelloSize       664k ± 0%       664k ± 0%    ~     (all equal)
      CmdGoSize      7.23M ± 0%      7.22M ± 0%  -0.06%  (p=0.008 n=5+5)
      
      name       old data-bytes  new data-bytes  delta
      HelloSize       134k ± 0%       134k ± 0%    ~     (all equal)
      CmdGoSize       390k ± 0%       390k ± 0%    ~     (all equal)
      
      name       old exe-bytes   new exe-bytes   delta
      HelloSize      1.39M ± 0%      1.39M ± 0%    ~     (all equal)
      CmdGoSize      14.4M ± 0%      14.4M ± 0%  -0.06%  (p=0.008 n=5+5)
      
      Go1 of the whole series:
      
      name                      old time/op    new time/op    delta
      BinaryTree17-16              5.40s ± 6%     5.38s ± 4%     ~     (p=1.000 n=12+10)
      Fannkuch11-16                4.04s ± 3%     3.81s ± 3%   -5.70%  (p=0.000 n=11+11)
      FmtFprintfEmpty-16          60.7ns ± 2%    60.2ns ± 3%     ~     (p=0.136 n=11+10)
      FmtFprintfString-16          115ns ± 2%     114ns ± 4%     ~     (p=0.175 n=11+10)
      FmtFprintfInt-16             118ns ± 2%     125ns ± 2%   +5.76%  (p=0.000 n=11+10)
      FmtFprintfIntInt-16          196ns ± 2%     204ns ± 3%   +4.42%  (p=0.000 n=10+11)
      FmtFprintfPrefixedInt-16     207ns ± 2%     214ns ± 2%   +3.23%  (p=0.000 n=10+11)
      FmtFprintfFloat-16           364ns ± 3%     357ns ± 2%   -1.88%  (p=0.002 n=11+11)
      FmtManyArgs-16               773ns ± 2%     775ns ± 1%     ~     (p=0.457 n=11+10)
      GobDecode-16                11.2ms ± 4%    11.0ms ± 3%   -1.51%  (p=0.022 n=10+9)
      GobEncode-16                9.91ms ± 6%    9.81ms ± 5%     ~     (p=0.699 n=11+11)
      Gzip-16                      339ms ± 1%     338ms ± 1%     ~     (p=0.438 n=11+11)
      Gunzip-16                   64.4ms ± 1%    65.2ms ± 1%   +1.28%  (p=0.001 n=10+11)
      HTTPClientServer-16          157µs ± 7%     160µs ± 5%     ~     (p=0.133 n=11+11)
      JSONEncode-16               22.3ms ± 4%    23.2ms ± 4%   +3.79%  (p=0.000 n=11+11)
      JSONDecode-16               96.7ms ± 3%    96.6ms ± 1%     ~     (p=0.562 n=11+11)
      Mandelbrot200-16            6.42ms ± 1%    6.40ms ± 1%     ~     (p=0.365 n=11+11)
      GoParse-16                  5.59ms ± 7%    5.42ms ± 5%   -3.07%  (p=0.020 n=11+10)
      RegexpMatchEasy0_32-16       113ns ± 2%     113ns ± 3%     ~     (p=0.968 n=11+10)
      RegexpMatchEasy0_1K-16       417ns ± 1%     416ns ± 3%     ~     (p=0.742 n=11+10)
      RegexpMatchEasy1_32-16       106ns ± 1%     107ns ± 3%     ~     (p=0.223 n=11+11)
      RegexpMatchEasy1_1K-16       654ns ± 2%     657ns ± 1%     ~     (p=0.672 n=11+8)
      RegexpMatchMedium_32-16      176ns ± 3%     177ns ± 1%     ~     (p=0.664 n=11+9)
      RegexpMatchMedium_1K-16     56.3µs ± 3%    56.7µs ± 3%     ~     (p=0.171 n=11+11)
      RegexpMatchHard_32-16       2.83µs ± 5%    2.83µs ± 4%     ~     (p=0.735 n=11+11)
      RegexpMatchHard_1K-16       82.7µs ± 2%    82.7µs ± 2%     ~     (p=0.853 n=10+10)
      Revcomp-16                   679ms ± 9%     782ms ±29%  +15.16%  (p=0.031 n=9+11)
      Template-16                  118ms ± 1%     109ms ± 2%   -7.49%  (p=0.000 n=11+11)
      TimeParse-16                 474ns ± 1%     462ns ± 1%   -2.59%  (p=0.000 n=11+11)
      TimeFormat-16                482ns ± 1%     494ns ± 1%   +2.49%  (p=0.000 n=10+11)
      
      name                      old speed      new speed      delta
      GobDecode-16              68.7MB/s ± 4%  69.8MB/s ± 3%   +1.52%  (p=0.022 n=10+9)
      GobEncode-16              77.6MB/s ± 6%  78.3MB/s ± 5%     ~     (p=0.699 n=11+11)
      Gzip-16                   57.2MB/s ± 1%  57.3MB/s ± 1%     ~     (p=0.428 n=11+11)
      Gunzip-16                  301MB/s ± 2%   298MB/s ± 1%   -1.07%  (p=0.007 n=11+11)
      JSONEncode-16             86.9MB/s ± 4%  83.7MB/s ± 4%   -3.63%  (p=0.000 n=11+11)
      JSONDecode-16             20.1MB/s ± 3%  20.1MB/s ± 1%     ~     (p=0.529 n=11+11)
      GoParse-16                10.4MB/s ± 6%  10.7MB/s ± 4%   +3.12%  (p=0.020 n=11+10)
      RegexpMatchEasy0_32-16     282MB/s ± 2%   282MB/s ± 3%     ~     (p=0.756 n=11+10)
      RegexpMatchEasy0_1K-16    2.45GB/s ± 1%  2.46GB/s ± 2%     ~     (p=0.705 n=11+10)
      RegexpMatchEasy1_32-16     299MB/s ± 1%   297MB/s ± 2%     ~     (p=0.151 n=11+11)
      RegexpMatchEasy1_1K-16    1.56GB/s ± 2%  1.56GB/s ± 1%     ~     (p=0.717 n=11+8)
      RegexpMatchMedium_32-16   5.67MB/s ± 4%  5.63MB/s ± 1%     ~     (p=0.538 n=11+9)
      RegexpMatchMedium_1K-16   18.2MB/s ± 3%  18.1MB/s ± 3%     ~     (p=0.156 n=11+11)
      RegexpMatchHard_32-16     11.3MB/s ± 5%  11.3MB/s ± 4%     ~     (p=0.711 n=11+11)
      RegexpMatchHard_1K-16     12.4MB/s ± 1%  12.4MB/s ± 2%     ~     (p=0.535 n=9+10)
      Revcomp-16                 370MB/s ± 5%   332MB/s ±24%     ~     (p=0.062 n=8+11)
      Template-16               16.5MB/s ± 1%  17.8MB/s ± 2%   +8.11%  (p=0.000 n=11+11)
      
      Change-Id: I41e46f375ee127785c6491f7ef5bd35581261ae6
      Reviewed-on: https://go-review.googlesource.com/104039
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
      f49369b6
    • cmd/compile: implement loop BCE in prove · 7ec25d0a
      Giovanni Bajo authored
      Reuse findIndVar to discover induction variables, and then
      register the facts we know about them into the facts table
      when entering the loop block.
      
      Moreover, handle "x+delta > w" while updating the facts table,
      to be able to prove accesses to slices with constant offsets
      such as slice[i-10].
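
The kind of loop this targets can be sketched as follows. This is only an illustration: the function name sumShifted is hypothetical, and whether the bounds check is actually removed depends on the compiler version (it can be inspected with -gcflags=-d=ssa/check_bce/debug=1).

```go
package main

import "fmt"

// sumShifted adds up a[i-10] for every i in [10, len(a)).
// Inside the loop, i is an induction variable with 10 <= i < len(a),
// so 0 <= i-10 < len(a) and the index is always in range.
func sumShifted(a []int) int {
	total := 0
	for i := 10; i < len(a); i++ {
		total += a[i-10] // constant-offset access of the form slice[i-10]
	}
	return total
}

func main() {
	a := make([]int, 15)
	for i := range a {
		a[i] = i
	}
	// Only a[0] through a[4] are visited.
	fmt.Println(sumShifted(a))
}
```

The facts needed are the loop bounds on i plus the "x+delta" form of the index, which is what the message above describes registering in the facts table.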
      
      Change-Id: I2a63d050ed58258136d54712ac7015b25c893d71
      Reviewed-on: https://go-review.googlesource.com/104038
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      Reviewed-by: David Chase <drchase@google.com>
      7ec25d0a
    • cmd/compile: in prove, infer unsigned relations while branching · 29162ec9
      Giovanni Bajo authored
      When a branch is followed, we apply the relation as described
      in the domain relation table. In case the relation is in the
      positive domain, we can also infer an unsigned relation if,
      by that point, we know that both operands are non-negative.
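
A small sketch of why the inference is sound, using hypothetical helper names (at, unsignedLess). It does not show the prove pass itself, only the arithmetic fact it relies on: for non-negative operands a signed "less than" and an unsigned "less than" agree, and a slice bounds check is essentially the unsigned form.

```go
package main

import "fmt"

// unsignedLess compares a and b after reinterpreting both as unsigned.
func unsignedLess(a, b int) bool {
	return uint(a) < uint(b)
}

// at guards the access with signed comparisons. Once i >= 0 is known
// (and len(s) is never negative), i < len(s) also holds as an unsigned
// relation, which is what the index expression s[i] needs.
func at(s []int, i int) (int, bool) {
	if i >= 0 && i < len(s) {
		return s[i], true
	}
	return 0, false
}

func main() {
	s := []int{10, 20, 30}
	fmt.Println(at(s, 1))
	fmt.Println(at(s, -1))
	// With a negative operand the two comparisons disagree, which is
	// why non-negativity has to be known first.
	fmt.Println(-1 < 3, unsignedLess(-1, 3))
}
```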
      
      Fixes #20393
      
      Change-Id: Ieaf0c81558b36d96616abae3eb834c788dd278d5
      Reviewed-on: https://go-review.googlesource.com/100278
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
      Reviewed-by: David Chase <drchase@google.com>
      29162ec9
    • cmd/compile: in prove, add transitive closure of relations · 5c402109
      Giovanni Bajo authored
      Implement it through a partial order data structure, which
      keeps the relations between SSA values in a forest of DAGs
      and is able to discover contradictions.
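
The idea can be illustrated with a toy version. This is not the compiler's implementation: the poset type, its string-keyed nodes, and the method names below are all assumptions made for the sketch, and it handles only strict "less than" facts.

```go
package main

import "fmt"

// poset is a toy strict partial order over named values.
// less[a] lists the values recorded as directly greater than a,
// so the recorded facts form a DAG.
type poset struct {
	less map[string][]string
}

func newPoset() *poset {
	return &poset{less: map[string][]string{}}
}

// reaches reports whether a < b follows transitively from the
// recorded facts, using a depth-first walk of the DAG.
func (p *poset) reaches(a, b string) bool {
	seen := map[string]bool{}
	stack := []string{a}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, m := range p.less[n] {
			if m == b {
				return true
			}
			if !seen[m] {
				seen[m] = true
				stack = append(stack, m)
			}
		}
	}
	return false
}

// setLess records a < b. It returns false, leaving the order
// unchanged, if the new fact contradicts what is already known.
func (p *poset) setLess(a, b string) bool {
	if a == b || p.reaches(b, a) {
		return false // contradiction: b < a (or a == b) is already implied
	}
	if !p.reaches(a, b) {
		p.less[a] = append(p.less[a], b)
	}
	return true
}

func main() {
	p := newPoset()
	p.setLess("x", "y")
	p.setLess("y", "z")
	fmt.Println(p.reaches("x", "z")) // implied by transitivity
	fmt.Println(p.setLess("z", "x")) // contradicts x < z
}
```

In a prover, a contradiction found this way means the branch being explored is unreachable, and an implied relation means a condition is already decided.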
      
      In make.bash, this patch is able to prove hundreds of conditions
      which were not proved before.
      
      Compilebench:
      
      name       old time/op       new time/op       delta
      Template         371ms ± 2%        368ms ± 1%    ~     (p=0.222 n=5+5)
      Unicode          203ms ± 6%        199ms ± 3%    ~     (p=0.421 n=5+5)
      GoTypes          1.17s ± 4%        1.18s ± 1%    ~     (p=0.151 n=5+5)
      Compiler         5.54s ± 2%        5.59s ± 1%    ~     (p=0.548 n=5+5)
      SSA              12.9s ± 2%        13.2s ± 1%  +2.96%  (p=0.032 n=5+5)
      Flate            245ms ± 2%        247ms ± 3%    ~     (p=0.690 n=5+5)
      GoParser         302ms ± 6%        302ms ± 4%    ~     (p=0.548 n=5+5)
      Reflect          764ms ± 4%        773ms ± 3%    ~     (p=0.095 n=5+5)
      Tar              354ms ± 6%        361ms ± 3%    ~     (p=0.222 n=5+5)
      XML              434ms ± 3%        429ms ± 1%    ~     (p=0.421 n=5+5)
      StdCmd           22.6s ± 1%        22.9s ± 1%  +1.40%  (p=0.032 n=5+5)
      
      name       old user-time/op  new user-time/op  delta
      Template         436ms ± 8%        426ms ± 5%    ~     (p=0.579 n=5+5)
      Unicode          219ms ±15%        219ms ±12%    ~     (p=1.000 n=5+5)
      GoTypes          1.47s ± 6%        1.53s ± 6%    ~     (p=0.222 n=5+5)
      Compiler         7.26s ± 4%        7.40s ± 2%    ~     (p=0.389 n=5+5)
      SSA              17.7s ± 4%        18.5s ± 4%  +4.13%  (p=0.032 n=5+5)
      Flate            257ms ± 5%        268ms ± 9%    ~     (p=0.333 n=5+5)
      GoParser         354ms ± 6%        348ms ± 6%    ~     (p=0.913 n=5+5)
      Reflect          904ms ± 2%        944ms ± 4%    ~     (p=0.056 n=5+5)
      Tar              398ms ±11%        430ms ± 7%    ~     (p=0.079 n=5+5)
      XML              501ms ± 7%        489ms ± 5%    ~     (p=0.444 n=5+5)
      
      name       old text-bytes    new text-bytes    delta
      HelloSize        670kB ± 0%        670kB ± 0%  +0.00%  (p=0.008 n=5+5)
      CmdGoSize       7.22MB ± 0%       7.21MB ± 0%  -0.07%  (p=0.008 n=5+5)
      
      name       old data-bytes    new data-bytes    delta
      HelloSize       9.88kB ± 0%       9.88kB ± 0%    ~     (all equal)
      CmdGoSize        248kB ± 0%        248kB ± 0%  -0.06%  (p=0.008 n=5+5)
      
      name       old bss-bytes     new bss-bytes     delta
      HelloSize        125kB ± 0%        125kB ± 0%    ~     (all equal)
      CmdGoSize        145kB ± 0%        144kB ± 0%  -0.20%  (p=0.008 n=5+5)
      
      name       old exe-bytes     new exe-bytes     delta
      HelloSize       1.43MB ± 0%       1.43MB ± 0%    ~     (all equal)
      CmdGoSize       14.5MB ± 0%       14.5MB ± 0%  -0.06%  (p=0.008 n=5+5)
      
      Fixes #19714
      Updates #20393
      
      Change-Id: Ia090f5b5dc1bcd274ba8a39b233c1e1ace1b330e
      Reviewed-on: https://go-review.googlesource.com/100277
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      Reviewed-by: David Chase <drchase@google.com>
      5c402109
    • runtime: iterate over set bits in adjustpointers · 5af0b28a
      Josh Bleecher Snyder authored
      There are several things combined in this change.
      
      First, eliminate the gobitvector type in favor
      of adding a ptrbit method to bitvector.
      In non-performance-critical code, use that method.
      In performance-critical code, though, load the bitvector data
      one byte at a time and iterate only over set bits.
      To support that, add and use sys.Ctz8.
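
The byte-at-a-time iteration can be sketched outside the runtime like this. The runtime-internal sys.Ctz8 is not importable, so the sketch substitutes math/bits.TrailingZeros8 from the standard library; the function name setBits is hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// setBits returns the indexes of all set bits in bitmap, lowest first.
// Each byte is loaded once and only its set bits are visited, so a
// mostly-zero bitmap costs little more than one test per byte.
func setBits(bitmap []byte) []int {
	var out []int
	for i, b := range bitmap {
		for b != 0 {
			j := bits.TrailingZeros8(b) // position of the lowest set bit
			out = append(out, i*8+j)
			b &= b - 1 // clear the lowest set bit
		}
	}
	return out
}

func main() {
	// 0x05 has bits 0 and 2 set; 0x80 has bit 7 set (overall index 15).
	fmt.Println(setBits([]byte{0x05, 0x80}))
}
```

Compared with testing all eight positions of every byte, the inner loop runs once per set bit, which pays off when pointer bitmaps are sparse.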
      
      name                old time/op  new time/op  delta
      StackCopyPtr-8      81.8ms ± 5%  78.9ms ± 3%   -3.58%  (p=0.000 n=97+96)
      StackCopy-8         65.9ms ± 3%  62.8ms ± 3%   -4.67%  (p=0.000 n=96+92)
      StackCopyNoCache-8   105ms ± 3%   102ms ± 3%   -3.38%  (p=0.000 n=96+95)
      
      Change-Id: I00b80f45612708bd440b1a411a57fa6dfa24aa74
      Reviewed-on: https://go-review.googlesource.com/109716
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      5af0b28a
    • runtime: add fast version of getArgInfo · 13cd0061
      Josh Bleecher Snyder authored
      getArgInfo is called a lot during stack copying.
      In the common case it doesn't do much work,
      but it cannot be inlined.
      
      This change works around that.
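
One general way to work around a non-inlinable function is to split off a tiny fast path that the compiler can inline and fall back to the full function only when needed. The sketch below shows that pattern with made-up types and names (funcInfo, argLenFast, argLenSlow, slowTable); it is an assumption about the shape of such a workaround, not the runtime's actual code.

```go
package main

import "fmt"

// argsUnknown marks a function whose argument size is not recorded
// statically. Hypothetical sentinel for this sketch.
const argsUnknown = -1

// funcInfo is a hypothetical stand-in for per-function metadata.
type funcInfo struct {
	name string
	args int
}

// slowTable simulates data that is expensive to consult.
var slowTable = map[string]int{"reflectCall": 24}

// argLenFast is small enough to be an inlining candidate. It reports
// ok == false when the caller must take the slow path.
func argLenFast(f funcInfo) (n int, ok bool) {
	return f.args, f.args != argsUnknown
}

// argLenSlow handles the uncommon case.
func argLenSlow(f funcInfo) int {
	return slowTable[f.name]
}

// argLen tries the fast path first and falls back only when needed.
func argLen(f funcInfo) int {
	if n, ok := argLenFast(f); ok {
		return n
	}
	return argLenSlow(f)
}

func main() {
	fmt.Println(argLen(funcInfo{name: "add", args: 16}))
	fmt.Println(argLen(funcInfo{name: "reflectCall", args: argsUnknown}))
}
```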
      
      name                old time/op  new time/op  delta
      StackCopyPtr-8       108ms ± 5%    96ms ± 4%  -10.40%  (p=0.000 n=20+20)
      StackCopy-8         82.6ms ± 3%  78.4ms ± 6%   -5.15%  (p=0.000 n=19+20)
      StackCopyNoCache-8   130ms ± 3%   122ms ± 3%   -6.44%  (p=0.000 n=20+20)
      
      Change-Id: If7d8a08c50a4e2e76e4331b399396c5dbe88c2ce
      Reviewed-on: https://go-review.googlesource.com/108945
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      13cd0061
    • runtime: use entry stack map at function entry · 0fd427fd
      Austin Clements authored
      Currently, when the runtime looks up the stack map for a frame, it
      uses frame.continpc - 1 unless continpc is the function entry PC, in
      which case it uses frame.continpc. As a result, if continpc is the
      function entry point (which happens for deferred frames), it will
      actually look up the stack map *following* the first instruction.
      
      I think, though I am not positive, that this is always okay today
      because the first instruction of a function can never change the stack
      map. It's usually not a CALL, so it doesn't have PCDATA. Or, if it is
      a CALL, it has to have the entry stack map.
      
      But we're about to start emitting stack maps at every instruction that
      changes them, which means the first instruction can have PCDATA
      (notably, in leaf functions that don't have a prologue).
      
      To prepare for this, tweak how the runtime looks up stack map indexes
      so that if continpc is the function entry point, it directly uses the
      entry stack map.
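
The difference between the two lookups can be modelled with a toy table. Everything here is hypothetical: the fn type, the after map standing in for PCDATA, and the assumption that the entry stack map has index 0. It only illustrates the rule described above, not the runtime's data structures.

```go
package main

import "fmt"

// entryStackMap is the index assumed for the entry stack map in this sketch.
const entryStackMap = 0

// fn is a toy function descriptor. after[pc] gives the index of the
// stack map in effect after the instruction at pc; missing PCs read
// as index 0.
type fn struct {
	entry uintptr
	after map[uintptr]int
}

// oldIndex models the previous rule: look up continpc-1, except at the
// entry PC, where continpc itself is used. At entry this returns the
// map that follows the first instruction.
func oldIndex(f fn, continpc uintptr) int {
	if continpc == f.entry {
		return f.after[continpc]
	}
	return f.after[continpc-1]
}

// newIndex models the tweaked rule: at the entry PC, use the entry
// stack map directly without consulting the table.
func newIndex(f fn, continpc uintptr) int {
	if continpc == f.entry {
		return entryStackMap
	}
	return f.after[continpc-1]
}

func main() {
	// A function whose very first instruction changes the stack map.
	f := fn{entry: 100, after: map[uintptr]int{100: 1, 104: 2}}
	fmt.Println(oldIndex(f, 100), newIndex(f, 100)) // differ at entry
	fmt.Println(oldIndex(f, 105), newIndex(f, 105)) // agree elsewhere
}
```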
      
      For #24543.
      
      Change-Id: I85aa818041cd26aff416f7b1fba186e9c8ca6568
      Reviewed-on: https://go-review.googlesource.com/109349
      Reviewed-by: Rick Hudson <rlh@golang.org>
      0fd427fd