1. 23 Apr, 2019 1 commit
    • David Chase's avatar
      cmd/link: revert/revise CL 98075 because LLDB is very picky now · 807761f3
      David Chase authored
      This was originally
      
      Revert "cmd/link: fix up debug_range for dsymutil (revert CL 72371)"
      
      which has the effect of no longer using Base Address Selection
      Entries in DWARF.  However, the build-time costs of that are
      about 2%, so instead the hacky fixup that generated technically
      incorrect DWARF was removed from the linker, and the choice
      is instead made in the compiler, dependent on platform, but
      also under control of a flag so that we can report this bug
      against LLDB/dsymutil/dwarfdump (really, the LLVM dwarf
      libraries).
      
      This however does not solve #31188; debugging still fails,
      but dwarfdump no longer complains.  There are at least two
      LLDB bugs involved, and this change will at allow us
      to report them without them being rejected because our
      now-obsolete workaround for the first bug creates
      not-quite-DWARF.
      
      Updates #31188.
      
      Change-Id: I5300c51ad202147bab7333329ebe961623d2b47d
      Reviewed-on: https://go-review.googlesource.com/c/go/+/170638
      
      
      Run-TryBot: David Chase <drchase@google.com>
      Reviewed-by: default avatarHeschi Kreinick <heschi@google.com>
      807761f3
  2. 15 Apr, 2019 1 commit
    • Michael Munday's avatar
      cmd/link, runtime: mark goexit as the top of the call stack · aafe2573
      Michael Munday authored
      This CL adds a new attribute, TOPFRAME, which can be used to mark
      functions that should be treated as being at the top of the call
      stack. The function `runtime.goexit` has been marked this way on
      architectures that use a link register.
      
      This will stop programs that use DWARF to unwind the call stack
      from unwinding past `runtime.goexit` on architectures that use a
      link register. For example, it eliminates "corrupt stack?"
      warnings when generating a backtrace that hits `runtime.goexit`
      in GDB on s390x.
      
      Similar code should be added for non-link-register architectures
      (i.e. amd64, 386). They mark the top of the call stack slightly
      differently to link register architectures so I haven't added
      that code (they need to mark "rip" as undefined).
      
      Fixes #24385.
      
      Change-Id: I15b4c69ac75b491daa0acf0d981cb80eb06488de
      Reviewed-on: https://go-review.googlesource.com/c/go/+/169726
      
      
      Run-TryBot: Michael Munday <mike.munday@ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      aafe2573
  3. 11 Jan, 2019 1 commit
    • Austin Clements's avatar
      cmd/compile: separate data and function LSyms · a2e79571
      Austin Clements authored
      Currently, obj.Ctxt's symbol table does not distinguish between ABI0
      and ABIInternal symbols. This is *almost* okay, since a given symbol
      name in the final object file is only going to belong to one ABI or
      the other, but it requires that the compiler mark a Sym as being a
      function symbol before it retrieves its LSym. If it retrieves the LSym
      first, that LSym will be created as ABI0, and later marking the Sym as
      a function symbol won't change the LSym's ABI.
      
      Marking a Sym as a function symbol before looking up its LSym sounds
      easy, except Syms have a dual purpose: they are used just as interned
      strings (every function, variable, parameter, etc with the same
      textual name shares a Sym), and *also* to store state for whatever
      package global has that name. As a result, it's easy to slip up and
      look up an LSym when a Sym is serving as the name of a local variable,
      and then later mark it as a function when it's serving as the global
      with the name.
      
      In general, we were careful to avoid this, but #29610 demonstrates one
      case where we messed up. Because of on-demand importing from indexed
      export data, it's possible to compile a method wrapper for a type
      imported from another package before importing an init function from
      that package. If the argument of the method is named "init", the
      "init" LSym will be created as a data symbol when compiling the
      wrapper, before it gets marked as a function symbol.
      
      To fix this, we separate obj.Ctxt's symbol tables for ABI0 and
      ABIInternal symbols. This way, the compiler will simply get a
      different LSym once the Sym takes on its package-global meaning as a
      function.
      
      This fixes the above ordering issue, and means we no longer need to go
      out of our way to create the "init" function early and mark it as a
      function symbol.
      
      Fixes #29610.
      Updates #27539.
      
      Change-Id: Id9458b40017893d46ef9e4a3f9b47fc49e1ce8df
      Reviewed-on: https://go-review.googlesource.com/c/157017
      
      
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarRobert Griesemer <gri@golang.org>
      a2e79571
  4. 28 Dec, 2018 1 commit
    • Keith Randall's avatar
      cmd/compile,runtime: redo mid-stack inlining tracebacks · 69c2c564
      Keith Randall authored
      Work involved in getting a stack trace is divided between
      runtime.Callers and runtime.CallersFrames.
      
      Before this CL, runtime.Callers returns a pc per runtime frame.
      runtime.CallersFrames is responsible for expanding a runtime frame
      into potentially multiple user frames.
      
      After this CL, runtime.Callers returns a pc per user frame.
      runtime.CallersFrames just maps those to user frame info.
      
      Entries in the result of runtime.Callers are now pcs
      of the calls (or of the inline marks), not of the instruction
      just after the call.
      
      Fixes #29007
      Fixes #28640
      Update #26320
      
      Change-Id: I1c9567596ff73dc73271311005097a9188c3406f
      Reviewed-on: https://go-review.googlesource.com/c/152537
      
      
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarDavid Chase <drchase@google.com>
      69c2c564
  5. 11 Dec, 2018 1 commit
    • Keith Randall's avatar
      cmd/compile: use innermost line number for -S · 01a1eaa1
      Keith Randall authored
      When functions are inlined, for instructions in the inlined body, does
      -S print the location of the call, or the location of the body? Right
      now, we do the former. I'd like to do the latter by default, it makes
      much more sense when reading disassembly. With mid-stack inlining
      enabled in more cases, this quandry will come up more often.
      
      The original behavior is still available with -S=2. Some tests
      use this mode (so they can find assembly generated by a particular
      source line).
      
      This helped me with understanding what the compiler was doing
      while fixing #29007.
      
      Change-Id: Id14a3a41e1b18901e7c5e460aa4caf6d940ed064
      Reviewed-on: https://go-review.googlesource.com/c/153241
      
      Reviewed-by: default avatarDavid Chase <drchase@google.com>
      01a1eaa1
  6. 27 Nov, 2018 1 commit
    • Clément Chigot's avatar
      cmd: fix symbols addressing for aix/ppc64 · 4295ed9b
      Clément Chigot authored
      This commit changes the code generated for addressing symbols on AIX
      operating system.
      
      On AIX, every symbol accesses must be done via another symbol near the TOC,
      named TOC anchor or TOC entry. This TOC anchor is a pointer to the symbol
      address.
      During Progedit function, when a symbol access is detected, its instructions
      are modified to create a load on its TOC anchor and retrieve the symbol.
      
      Change-Id: I00cf8f49c13004bc99fa8af13d549a709320f797
      Reviewed-on: https://go-review.googlesource.com/c/151039
      
      
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      4295ed9b
  7. 12 Nov, 2018 2 commits
    • Austin Clements's avatar
      cmd/internal/obj, cmd/link: record ABIs and aliases in Go obj files · c5718b6b
      Austin Clements authored
      This repurposes the "version" field of a symbol reference in the Go
      object file format to be an ABI field. Currently, this is just 0 or 1
      depending on whether the symbol is static (the linker turns it into a
      different internal version number), so it's already only tenuously a
      symbol version. We change this to be -1 for static symbols and
      otherwise by the ABI number.
      
      This also adds a separate list of ABI alias symbols to be recorded in
      the object file. The ABI aliases must be a separate list and not just
      part of the symbol definitions because it's possible to have a symbol
      defined in one package and the alias "defined" in a different package.
      For example, this can happen if a symbol is defined in assembly in one
      package and stubbed in a different package. The stub triggers the
      generation of the ABI alias, but in a different package from the
      definition.
      
      For #27539.
      
      Change-Id: I015c9fe54690c027de6ef77e22b5585976a01587
      Reviewed-on: https://go-review.googlesource.com/c/147157
      
      
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarDavid Chase <drchase@google.com>
      c5718b6b
    • Austin Clements's avatar
      cmd/compile: accept and parse symabis · 97e4010f
      Austin Clements authored
      This doesn't yet do anything with this information.
      
      For #27539.
      
      Change-Id: Ia12c905812aa1ed425eedd6ab2f55ec75d81c0ce
      Reviewed-on: https://go-review.googlesource.com/c/147099
      
      
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarDavid Chase <drchase@google.com>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      97e4010f
  8. 03 Nov, 2018 1 commit
    • Austin Clements's avatar
      cmd/compile: avoid duplicate GC bitmap symbols · 15265ec4
      Austin Clements authored
      Currently, liveness produces a distinct obj.LSym for each GC bitmap
      for each function. These are then named by content hash and only
      ultimately deduplicated by WriteObjFile.
      
      For various reasons (see next commit), we want to remove this
      deduplication behavior from WriteObjFile. Furthermore, it's
      inefficient to produce these duplicate symbols in the first place.
      
      GC bitmaps are the only source of duplicate symbols in the compiler.
      This commit eliminates these duplicate symbols by declaring them in
      the Ctxt symbol hash just like every other obj.LSym. As a result, all
      GC bitmaps with the same content now refer to the same obj.LSym.
      
      The next commit will remove deduplication from WriteObjFile.
      
      For #27539.
      
      Change-Id: I4f15e3d99530122cdf473b7a838c69ef5f79db59
      Reviewed-on: https://go-review.googlesource.com/c/146557
      
      
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarKeith Randall <khr@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      15265ec4
  9. 23 Oct, 2018 1 commit
    • Lynn Boger's avatar
      cmd/asm/internal,cmd/internal/obj/ppc64: add alignment directive to asm for ppc64x · bdba5565
      Lynn Boger authored
      This adds support for an alignment directive that can be used
      within Go asm to indicate preferred code alignment for ppc64x.
      This is intended to be used with loops to improve
      performance.
      
      This change only adds the directive and aligns the code based
      on it. Follow up changes will modify asm functions for
      ppc64x that benefit from preferred alignment.
      
      Fixes #14935
      
      Here is one example of the improvement in memmove when the
      directive is used on the loops in the code:
      
      Memmove/64      8.74ns ± 0%    8.64ns ± 0%   -1.19%  (p=0.000 n=8+8)
      Memmove/128     11.5ns ± 0%    11.0ns ± 0%   -4.35%  (p=0.000 n=8+8)
      Memmove/256     23.0ns ± 0%    15.3ns ± 0%  -33.48%  (p=0.000 n=8+8)
      Memmove/512     31.7ns ± 0%    31.8ns ± 0%   +0.32%  (p=0.000 n=8+8)
      Memmove/1024    52.3ns ± 0%    43.9ns ± 0%  -16.10%  (p=0.000 n=8+8)
      Memmove/2048    93.2ns ± 0%    76.2ns ± 0%  -18.24%  (p=0.000 n=8+8)
      Memmove/4096     174ns ± 0%     141ns ± 0%  -18.97%  (p=0.000 n=8+8)
      
      Change-Id: I200d77e923dd5d78c22fe3f8eb142a8fbaff57bf
      Reviewed-on: https://go-review.googlesource.com/c/144218
      
      
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarDavid Chase <drchase@google.com>
      bdba5565
  10. 03 Oct, 2018 1 commit
    • Keith Randall's avatar
      cmd/compile,runtime: implement stack objects · cbafcc55
      Keith Randall authored
      Rework how the compiler+runtime handles stack-allocated variables
      whose address is taken.
      
      Direct references to such variables work as before. References through
      pointers, however, use a new mechanism. The new mechanism is more
      precise than the old "ambiguously live" mechanism. It computes liveness
      at runtime based on the actual references among objects on the stack.
      
      Each function records all of its address-taken objects in a FUNCDATA.
      These are called "stack objects". The runtime then uses that
      information while scanning a stack to find all of the stack objects on
      a stack. It then does a mark phase on the stack objects, using all the
      pointers found on the stack (and ancillary structures, like defer
      records) as the root set. Only stack objects which are found to be
      live during this mark phase will be scanned and thus retain any heap
      objects they point to.
      
      A subsequent CL will remove all the "ambiguously live" logic from
      the compiler, so that the stack object tracing will be required.
      For this CL, the stack tracing is all redundant with the current
      ambiguously live logic.
      
      Update #22350
      
      Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f
      Reviewed-on: https://go-review.googlesource.com/c/134155
      
      Reviewed-by: default avatarAustin Clements <austin@google.com>
      cbafcc55
  11. 22 May, 2018 2 commits
    • Austin Clements's avatar
      cmd/compile, cmd/internal/obj: record register maps in binary · 9f95c9db
      Austin Clements authored
      This adds FUNCDATA and PCDATA that records the register maps much like
      the existing live arguments maps and live locals maps. The register
      map is indexed independently from the argument and locals maps since
      changes in register liveness tend not to correlate with changes to
      argument and local liveness.
      
      This is the final CL toward adding safe-points everywhere. The
      following CLs will optimize liveness analysis to bring down the cost.
      The effect of this CL is:
      
      name        old time/op       new time/op       delta
      Template          195ms ± 2%        197ms ± 1%    ~     (p=0.136 n=9+9)
      Unicode          98.4ms ± 2%       99.7ms ± 1%  +1.39%  (p=0.004 n=10+10)
      GoTypes           685ms ± 1%        700ms ± 1%  +2.06%  (p=0.000 n=9+9)
      Compiler          3.28s ± 1%        3.34s ± 0%  +1.71%  (p=0.000 n=9+8)
      SSA               7.79s ± 1%        7.91s ± 1%  +1.55%  (p=0.000 n=10+9)
      Flate             133ms ± 2%        133ms ± 2%    ~     (p=0.190 n=10+10)
      GoParser          161ms ± 2%        164ms ± 3%  +1.83%  (p=0.015 n=10+10)
      Reflect           450ms ± 1%        457ms ± 1%  +1.62%  (p=0.000 n=10+10)
      Tar               183ms ± 2%        185ms ± 1%  +0.91%  (p=0.008 n=9+10)
      XML               234ms ± 1%        238ms ± 1%  +1.60%  (p=0.000 n=9+9)
      [Geo mean]        411ms             417ms       +1.40%
      
      name        old exe-bytes     new exe-bytes     delta
      HelloSize         1.47M ± 0%        1.51M ± 0%  +2.79%  (p=0.000 n=10+10)
      
      Compared to just before "cmd/internal/obj: consolidate emitting entry
      stack map", the cumulative effect of adding stack maps everywhere and
      register maps is:
      
      name        old time/op       new time/op       delta
      Template          185ms ± 2%        197ms ± 1%   +6.42%  (p=0.000 n=10+9)
      Unicode          96.3ms ± 3%       99.7ms ± 1%   +3.60%  (p=0.000 n=10+10)
      GoTypes           658ms ± 0%        700ms ± 1%   +6.37%  (p=0.000 n=10+9)
      Compiler          3.14s ± 1%        3.34s ± 0%   +6.53%  (p=0.000 n=9+8)
      SSA               7.41s ± 2%        7.91s ± 1%   +6.71%  (p=0.000 n=9+9)
      Flate             126ms ± 1%        133ms ± 2%   +6.15%  (p=0.000 n=10+10)
      GoParser          153ms ± 1%        164ms ± 3%   +6.89%  (p=0.000 n=10+10)
      Reflect           437ms ± 1%        457ms ± 1%   +4.59%  (p=0.000 n=10+10)
      Tar               178ms ± 1%        185ms ± 1%   +4.18%  (p=0.000 n=10+10)
      XML               223ms ± 1%        238ms ± 1%   +6.39%  (p=0.000 n=10+9)
      [Geo mean]        394ms             417ms        +5.78%
      
      name        old alloc/op      new alloc/op      delta
      Template         34.5MB ± 0%       38.0MB ± 0%  +10.19%  (p=0.000 n=10+10)
      Unicode          29.3MB ± 0%       30.3MB ± 0%   +3.56%  (p=0.000 n=8+9)
      GoTypes           113MB ± 0%        125MB ± 0%  +10.89%  (p=0.000 n=10+10)
      Compiler          510MB ± 0%        575MB ± 0%  +12.79%  (p=0.000 n=10+10)
      SSA              1.46GB ± 0%       1.64GB ± 0%  +12.40%  (p=0.000 n=10+10)
      Flate            23.9MB ± 0%       25.9MB ± 0%   +8.56%  (p=0.000 n=10+10)
      GoParser         28.0MB ± 0%       30.8MB ± 0%  +10.08%  (p=0.000 n=10+10)
      Reflect          77.6MB ± 0%       84.3MB ± 0%   +8.63%  (p=0.000 n=10+10)
      Tar              34.1MB ± 0%       37.0MB ± 0%   +8.44%  (p=0.000 n=10+10)
      XML              42.7MB ± 0%       47.2MB ± 0%  +10.75%  (p=0.000 n=10+10)
      [Geo mean]       76.0MB            83.3MB        +9.60%
      
      name        old allocs/op     new allocs/op     delta
      Template           321k ± 0%         337k ± 0%   +4.98%  (p=0.000 n=10+10)
      Unicode            337k ± 0%         340k ± 0%   +1.04%  (p=0.000 n=10+9)
      GoTypes           1.13M ± 0%        1.18M ± 0%   +4.85%  (p=0.000 n=10+10)
      Compiler          4.67M ± 0%        4.96M ± 0%   +6.25%  (p=0.000 n=10+10)
      SSA               11.7M ± 0%        12.3M ± 0%   +5.69%  (p=0.000 n=10+10)
      Flate              216k ± 0%         226k ± 0%   +4.52%  (p=0.000 n=10+9)
      GoParser           271k ± 0%         283k ± 0%   +4.52%  (p=0.000 n=10+10)
      Reflect            927k ± 0%         972k ± 0%   +4.78%  (p=0.000 n=10+10)
      Tar                318k ± 0%         333k ± 0%   +4.56%  (p=0.000 n=10+10)
      XML                376k ± 0%         395k ± 0%   +5.04%  (p=0.000 n=10+10)
      [Geo mean]         730k              764k        +4.61%
      
      name        old exe-bytes     new exe-bytes     delta
      HelloSize         1.46M ± 0%        1.51M ± 0%   +3.66%  (p=0.000 n=10+10)
      
      For #24543.
      
      Change-Id: I91e003dc64151916b384274884bf02a2d6862547
      Reviewed-on: https://go-review.googlesource.com/109353
      
      
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarKeith Randall <khr@golang.org>
      9f95c9db
    • isharipo's avatar
      cmd/asm: enable AVX512 · 5437cde9
      isharipo authored
      - Uncomment tests for AVX512 encoder
      - Permit instruction suffixes for x86
      - Permit limited reg list [reg-reg] syntax for x86 for multi-source ops
      - EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.)
      - optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216)
      
      Note: suffix formatting implemented with updated CConv function.
      Now arch asm backend should register formatting function by
      calling RegisterOpSuffix.
      
      Updates #22779
      
      Change-Id: I076a167ee49582700e058c56ad74e6696710c8c8
      Reviewed-on: https://go-review.googlesource.com/113315
      
      
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      5437cde9
  12. 04 May, 2018 1 commit
  13. 30 Apr, 2018 1 commit
  14. 04 Apr, 2018 1 commit
    • David Chase's avatar
      cmd/link: process is_stmt data into dwarf line tables · dead03b7
      David Chase authored
      To improve debugging, instructions should be annotated with
      DWARF is_stmt.  The DWARF default before was is_stmt=1, and
      to remove "jumpy" stepping the optimizer was tagging
      instructions with a no-position position, which interferes
      with the accuracy of profiling information.  This allows
      that to be corrected, and also allows more "jumpy" positions
      to be annotated with is_stmt=0 (these changes were not made
      for 1.10 because of worries about further messing up
      profiling).
      
      The is_stmt values are placed in a pc-encoded table and
      passed through a symbol derived from the name of the
      function and processed in the linker alongside its
      processing of each function's pc/line tables.
      
      The only change in binary size is in the .debug_line tables
      measured with "objdump -h --section=.debug_line go1.test"
      For go1.test, these are 2614 bytes larger,
      or 0.72% of the size of .debug_line,
      or 0.025% of the file size.
      
      This will increase in proportion to how much the is_stmt
      flag is used (toggled).
      
      Change-Id: Ic1f1aeccff44591ad0494d29e1a0202a3c506a7a
      Reviewed-on: https://go-review.googlesource.com/93664
      
      
      Run-TryBot: David Chase <drchase@google.com>
      Reviewed-by: default avatarHeschi Kreinick <heschi@google.com>
      dead03b7
  15. 10 Mar, 2018 1 commit
  16. 04 Dec, 2017 1 commit
  17. 30 Nov, 2017 1 commit
  18. 02 Nov, 2017 1 commit
  19. 13 Oct, 2017 1 commit
    • Wei Xiao's avatar
      cmd/asm: refine Go assembly for ARM64 · 531e6c06
      Wei Xiao authored
      Some ARM64-specific instructions (such as SIMD instructions) are not supported.
      This patch adds support for the following:
      1. Extended register, e.g.:
           ADD	Rm.<ext>[<<amount], Rn, Rd
           <ext> can have the following values:
             UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW and SXTX
      2. Arrangement for SIMD instructions, e.g.:
           VADDP	Vm.<T>, Vn.<T>, Vd.<T>
           <T> can have the following values:
             B8, B16, H4, H8, S2, S4 and D2
      3. Width specifier and element index for SIMD instructions, e.g.:
           VMOV	Vn.<T>[index], Rd // MOV(to general register)
           <T> can have the following values:
             S and D
      4. Register List, e.g.:
           VLD1	(Rn), [Vt1.<T>, Vt2.<T>, Vt3.<T>]
      5. Register offset variant, e.g.:
           VLD1.P	(Rn)(Rm), [Vt1.<T>, Vt2.<T>] // Rm is the post-index register
      6. Go assembly for ARM64 reference manual
           new added instructions are required to have according explanation items in
           the manual and items for existed instructions will be added incrementally
      
      For more information about the refinement background, please refer to the
      discussion (https://groups.google.com/forum/#!topic/golang-dev/rWgDxCrL4GU)
      
      This patch only adds syntax and doesn't break any assembly that already exists.
      
      Change-Id: I34e90b7faae032820593a0e417022c354a882008
      Reviewed-on: https://go-review.googlesource.com/41654
      
      
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      531e6c06
  20. 11 Oct, 2017 1 commit
    • Keith Randall's avatar
      cmd/compile: abort earlier if stack frame too large · e130dcf0
      Keith Randall authored
      If the stack frame is too large, abort immediately.
      We used to generate code first, then abort.
      In issue 22200, generating code raised a panic
      so we got an ICE instead of an error message.
      
      Change the max frame size to 1GB (from 2GB).
      Stack frames between 1.1GB and 2GB didn't used to work anyway,
      the pcln table generation would have failed and generated an ICE.
      
      Fixes #22200
      
      Change-Id: I1d918ab27ba6ebf5c87ec65d1bccf973f8c8541e
      Reviewed-on: https://go-review.googlesource.com/69810
      
      
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: default avatarIan Lance Taylor <iant@golang.org>
      e130dcf0
  21. 15 Sep, 2017 1 commit
    • isharipo's avatar
      cmd/internal/obj: change Prog.From3 to RestArgs ([]Addr) · 8c67f210
      isharipo authored
      This change makes it easier to express instructions
      with arbitrary number of operands.
      
      Rationale: previous approach with operand "hiding" does
      not scale well, AVX and especially AVX512 have many
      instructions with 3+ operands.
      
      x86 asm backend is updated to handle up to 6 explicit operands.
      It also fixes issue with 4-th immediate operand type checks.
      All `ytab` tables are updated accordingly.
      
      Changes to non-x86 backends only include these patterns:
      `p.From3 = X` => `p.SetFrom3(X)`
      `p.From3.X = Y` => `p.GetFrom3().X = Y`
      
      Over time, other backends can adapt Prog.RestArgs
      and reduce the amount of workarounds.
      
      -- Performance --
      
      x/benchmark/build:
      
      $ benchstat upstream.bench patched.bench
      name      old time/op                 new time/op                 delta
      Build-48                  21.7s ± 2%                  21.8s ± 2%   ~     (p=0.218 n=10+10)
      
      name      old binary-size             new binary-size             delta
      Build-48                  10.3M ± 0%                  10.3M ± 0%   ~     (all equal)
      
      name      old build-time/op           new build-time/op           delta
      Build-48                  21.7s ± 2%                  21.8s ± 2%   ~     (p=0.218 n=10+10)
      
      name      old build-peak-RSS-bytes    new build-peak-RSS-bytes    delta
      Build-48                  145MB ± 5%                  148MB ± 5%   ~     (p=0.218 n=10+10)
      
      name      old build-user+sys-time/op  new build-user+sys-time/op  delta
      Build-48                  21.0s ± 2%                  21.2s ± 2%   ~     (p=0.075 n=10+10)
      
      Microbenchmark shows a slight slowdown.
      
      name        old time/op  new time/op  delta
      AMD64asm-4  49.5ms ± 1%  49.9ms ± 1%  +0.67%  (p=0.001 n=23+15)
      
      func BenchmarkAMD64asm(b *testing.B) {
        for i := 0; i < b.N; i++ {
          TestAMD64EndToEnd(nil)
          TestAMD64Encoder(nil)
        }
      }
      
      Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183
      Reviewed-on: https://go-review.googlesource.com/63490
      
      
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarMatthew Dempsky <mdempsky@google.com>
      8c67f210
  22. 27 Jul, 2017 1 commit
    • Heschi Kreinick's avatar
      [dev.debug] cmd/compile: better DWARF with optimizations on · 4c54a047
      Heschi Kreinick authored
      Debuggers use DWARF information to find local variables on the
      stack and in registers. Prior to this CL, the DWARF information for
      functions claimed that all variables were on the stack at all times.
      That's incorrect when optimizations are enabled, and results in
      debuggers showing data that is out of date or complete gibberish.
      
      After this CL, the compiler is capable of representing variable
      locations more accurately, and attempts to do so. Due to limitations of
      the SSA backend, it's not possible to be completely correct.
      
      There are a number of problems in the current design. One of the easier
      to understand is that variable names currently must be attached to an
      SSA value, but not all assignments in the source code actually result
      in machine code. For example:
      
        type myint int
        var a int
        b := myint(int)
      and
        b := (*uint64)(unsafe.Pointer(a))
      
      don't generate machine code because the underlying representation is the
      same, so the correct value of b will not be set when the user would
      expect.
      
      Generating the more precise debug information is behind a flag,
      dwarflocationlists. Because of the issues described above, setting the
      flag may not make the debugging experience much better, and may actually
      make it worse in cases where the variable actually is on the stack and
      the more complicated analysis doesn't realize it.
      
      A number of changes are included:
      - Add a new pseudo-instruction, RegKill, which indicates that the value
      in the register has been clobbered.
      - Adjust regalloc to emit RegKills in the right places. Significantly,
      this means that phis are mixed with StoreReg and RegKills after
      regalloc.
      - Track variable decomposition in ssa.LocalSlots.
      - After the SSA backend is done, analyze the result and build location
      lists for each LocalSlot.
      - After assembly is done, update the location lists with the assembled
      PC offsets, recompose variables, and build DWARF location lists. Emit the
      list as a new linker symbol, one per function.
      - In the linker, aggregate the location lists into a .debug_loc section.
      
      TODO:
      - currently disabled for non-X86/AMD64 because there are no data tables.
      
      go build -toolexec 'toolstash -cmp' -a std succeeds.
      
      With -dwarflocationlists false:
      before: f02812195637909ff675782c0b46836a8ff01976
      after:  06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
      benchstat -geomean  /tmp/220352263 /tmp/621364410
      completed   15 of   15, estimated time remaining 0s (eta 3:52PM)
      name        old time/op       new time/op       delta
      Template          199ms ± 3%        198ms ± 2%     ~     (p=0.400 n=15+14)
      Unicode          96.6ms ± 5%       96.4ms ± 5%     ~     (p=0.838 n=15+15)
      GoTypes           653ms ± 2%        647ms ± 2%     ~     (p=0.102 n=15+14)
      Flate             133ms ± 6%        129ms ± 3%   -2.62%  (p=0.041 n=15+15)
      GoParser          164ms ± 5%        159ms ± 3%   -3.05%  (p=0.000 n=15+15)
      Reflect           428ms ± 4%        422ms ± 3%     ~     (p=0.156 n=15+13)
      Tar               123ms ±10%        124ms ± 8%     ~     (p=0.461 n=15+15)
      XML               228ms ± 3%        224ms ± 3%   -1.57%  (p=0.045 n=15+15)
      [Geo mean]        206ms             377ms       +82.86%
      
      name        old user-time/op  new user-time/op  delta
      Template          292ms ±10%        301ms ±12%     ~     (p=0.189 n=15+15)
      Unicode           166ms ±37%        158ms ±14%     ~     (p=0.418 n=15+14)
      GoTypes           962ms ± 6%        963ms ± 7%     ~     (p=0.976 n=15+15)
      Flate             207ms ±19%        200ms ±14%     ~     (p=0.345 n=14+15)
      GoParser          246ms ±22%        240ms ±15%     ~     (p=0.587 n=15+15)
      Reflect           611ms ±13%        587ms ±14%     ~     (p=0.085 n=15+13)
      Tar               211ms ±12%        217ms ±14%     ~     (p=0.355 n=14+15)
      XML               335ms ±15%        320ms ±18%     ~     (p=0.169 n=15+15)
      [Geo mean]        317ms             583ms       +83.72%
      
      name        old alloc/op      new alloc/op      delta
      Template         40.2MB ± 0%       40.2MB ± 0%   -0.15%  (p=0.000 n=14+15)
      Unicode          29.2MB ± 0%       29.3MB ± 0%     ~     (p=0.624 n=15+15)
      GoTypes           114MB ± 0%        114MB ± 0%   -0.15%  (p=0.000 n=15+14)
      Flate            25.7MB ± 0%       25.6MB ± 0%   -0.18%  (p=0.000 n=13+15)
      GoParser         32.2MB ± 0%       32.2MB ± 0%   -0.14%  (p=0.003 n=15+15)
      Reflect          77.8MB ± 0%       77.9MB ± 0%     ~     (p=0.061 n=15+15)
      Tar              27.1MB ± 0%       27.0MB ± 0%   -0.11%  (p=0.029 n=15+15)
      XML              42.7MB ± 0%       42.5MB ± 0%   -0.29%  (p=0.000 n=15+15)
      [Geo mean]       42.1MB            75.0MB       +78.05%
      
      name        old allocs/op     new allocs/op     delta
      Template           402k ± 1%         398k ± 0%   -0.91%  (p=0.000 n=15+15)
      Unicode            344k ± 1%         344k ± 0%     ~     (p=0.715 n=15+14)
      GoTypes           1.18M ± 0%        1.17M ± 0%   -0.91%  (p=0.000 n=15+14)
      Flate              243k ± 0%         240k ± 1%   -1.05%  (p=0.000 n=13+15)
      GoParser           327k ± 1%         324k ± 1%   -0.96%  (p=0.000 n=15+15)
      Reflect            984k ± 1%         982k ± 0%     ~     (p=0.050 n=15+15)
      Tar                261k ± 1%         259k ± 1%   -0.77%  (p=0.000 n=15+15)
      XML                411k ± 0%         404k ± 1%   -1.55%  (p=0.000 n=15+15)
      [Geo mean]         439k              755k       +72.01%
      
      name        old text-bytes    new text-bytes    delta
      HelloSize         694kB ± 0%        694kB ± 0%   -0.00%  (p=0.000 n=15+15)
      
      name        old data-bytes    new data-bytes    delta
      HelloSize        5.55kB ± 0%       5.55kB ± 0%     ~     (all equal)
      
      name        old bss-bytes     new bss-bytes     delta
      HelloSize         133kB ± 0%        133kB ± 0%     ~     (all equal)
      
      name        old exe-bytes     new exe-bytes     delta
      HelloSize        1.04MB ± 0%       1.04MB ± 0%     ~     (all equal)
      
      Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
      Reviewed-on: https://go-review.googlesource.com/41770
      
      Reviewed-by: default avatarJosh Bleecher Snyder <josharian@gmail.com>
      4c54a047
  23. 18 May, 2017 1 commit
    • Alessandro Arzilli's avatar
      cmd/compile: output DWARF lexical blocks for local variables · 2ad41a30
      Alessandro Arzilli authored
      Change compiler and linker to emit DWARF lexical blocks in .debug_info
      section when compiling with -N -l.
      
      Version of debug_info is updated from DWARF v2 to DWARF v3 since
      version 2 does not allow lexical blocks with discontinuous PC ranges.
      
      Remaining open problems:
      - scope information is removed from inlined functions
      - variables records do not have DW_AT_start_scope attributes so a
      variable will shadow other variables with the same name as soon as its
      containing scope begins, even before its declaration.
      
      Updates #6913.
      Updates #12899.
      
      Change-Id: Idc6808788512ea20e7e45bcf782453acb416fb49
      Reviewed-on: https://go-review.googlesource.com/40095
      
      
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarMatthew Dempsky <mdempsky@google.com>
      Reviewed-by: default avatarJosh Bleecher Snyder <josharian@gmail.com>
      2ad41a30
  24. 27 Apr, 2017 1 commit
    • Josh Bleecher Snyder's avatar
      cmd/compile: add initial backend concurrency support · 756b9ce3
      Josh Bleecher Snyder authored
      This CL adds initial support for concurrent backend compilation.
      
      BACKGROUND
      
      The compiler currently consists (very roughly) of the following phases:
      
      1. Initialization.
      2. Lexing and parsing into the cmd/compile/internal/syntax AST.
      3. Translation into the cmd/compile/internal/gc AST.
      4. Some gc AST passes: typechecking, escape analysis, inlining,
         closure handling, expression evaluation ordering (order.go),
         and some lowering and optimization (walk.go).
      5. Translation into the cmd/compile/internal/ssa SSA form.
      6. Optimization and lowering of SSA form.
      7. Translation from SSA form to assembler instructions.
      8. Translation from assembler instructions to machine code.
      9. Writing lots of output: machine code, DWARF symbols,
         type and reflection info, export data.
      
      Phase 2 was already concurrent as of Go 1.8.
      
      Phase 3 is planned for eventual removal;
      we hope to go straight from syntax AST to SSA.
      
      Phases 5–8 are per-function; this CL adds support for
      processing multiple functions concurrently.
      The slowest phases in the compiler are 5 and 6,
      so this offers the opportunity for some good speed-ups.
      
      Unfortunately, it's not quite that straightforward.
      In the current compiler, the latter parts of phase 4
      (order, walk) are done function-at-a-time as needed.
      Making order and walk concurrency-safe proved hard,
      and they're not particularly slow, so there wasn't much reward.
      To enable phases 5–8 to be done concurrently,
      when concurrent backend compilation is requested,
      we complete phase 4 for all functions
      before starting later phases for any functions.
      
      Also, in reality, we automatically generate new
      functions in phase 9, such as method wrappers
      and equality and has routines.
      Those new functions then go through phases 4–8.
      This CL disables concurrent backend compilation
      after the first, big, user-provided batch of
      functions has been compiled.
      This is done to keep things simple,
      and because the autogenerated functions
      tend to be small, few, simple, and fast to compile.
      
      USAGE
      
      Concurrent backend compilation still defaults to off.
      To set the number of functions that may be backend-compiled
      concurrently, use the compiler flag -c.
      In future work, cmd/go will automatically set -c.
      
      Furthermore, this CL has been intentionally written
      so that the c=1 path has no backend concurrency whatsoever,
      not even spawning any goroutines.
      This helps ensure that, should problems arise
      late in the development cycle,
      we can simply have cmd/go set c=1 always,
      and revert to the original compiler behavior.
      
      MUTEXES
      
      Most of the work required to make concurrent backend
      compilation safe has occurred over the past month.
      This CL adds a handful of mutexes to get the rest of the way there;
      they are the mutexes that I didn't see a clean way to avoid.
      Some of them may still be eliminable in future work.
      
      In no particular order:
      
      * gc.funcsymsmu. The global funcsyms slice is populated
        lazily when we need function symbols for closures.
        This occurs during gc AST to SSA translation.
        The function funcsym also does a package lookup,
        which is a source of races on types.Pkg.Syms;
        funcsymsmu also covers that package lookup.
        This mutex is low priority: it adds a single global,
        it is in an infrequently used code path, and it is low contention.
        Since funcsyms may now be added in any order,
        we must sort them to preserve reproducible builds.
      
      * gc.largeStackFramesMu. We don't discover until after SSA compilation
        that a function's stack frame is gigantic.
        Recording that error happens basically never,
        but it does happen concurrently.
        Fix with a low priority mutex and sorting.
      
      * obj.Link.hashmu. ctxt.hash stores the mapping from
        types.Syms (compiler symbols) to obj.LSyms (linker symbols).
        It is accessed fairly heavily through all the phases.
        This is the only heavily contended mutex.
      
      * gc.signatlistmu. The global signatlist map is
        populated with types through several of the concurrent phases,
        including notably via ngotype during DWARF generation.
        It is low priority for removal.
      
      * gc.typepkgmu. Looking up symbols in the types package
        happens a fair amount during backend compilation
        and DWARF generation, particularly via ngotype.
        This mutex helps us to avoid a broader mutex on types.Pkg.Syms.
        It has low-to-moderate contention.
      
      * types.internedStringsmu. gc AST to SSA conversion and
        some SSA work introduce new autotmps.
        Those autotmps have their names interned to reduce allocations.
        That interning requires protecting types.internedStrings.
        The autotmp names are heavily re-used, and the mutex
        overhead and contention here are low, so it is probably
        a worthwhile performance optimization to keep this mutex.
      
      TESTING
      
      I have been testing this code locally by running
      'go install -race cmd/compile'
      and then doing
      'go build -a -gcflags=-c=128 std cmd'
      for all architectures and a variety of compiler flags.
      This obviously needs to be made part of the builders,
      but it is too expensive to make part of all.bash.
      I have filed #19962 for this.
      
      REPRODUCIBLE BUILDS
      
      This version of the compiler generates reproducible builds.
      Testing reproducible builds also needs automation, however,
      and is also too expensive for all.bash.
      This is #19961.
      
      Also of note is that some of the compiler flags used by 'toolstash -cmp'
      are currently incompatible with concurrent backend compilation.
      They still work fine with c=1.
      Time will tell whether this is a problem.
      
      NEXT STEPS
      
      * Continue to find and fix races and bugs,
        using a combination of code inspection, fuzzing,
        and hopefully some community experimentation.
        I do not know of any outstanding races,
        but there probably are some.
      * Improve testing.
      * Improve performance, for many values of c.
      * Integrate with cmd/go and fine tune.
      * Support concurrent compilation with the -race flag.
        It is a sad irony that it does not yet work.
      * Minor code cleanup that has been deferred during
        the last month due to uncertainty about the
        ultimate shape of this CL.
      
      PERFORMANCE
      
      Here's the buried lede, at last. :)
      
      All benchmarks are from my 8 core 2.9 GHz Intel Core i7 darwin/amd64 laptop.
      
      First, going from tip to this CL with c=1 has almost no impact.
      
      name        old time/op       new time/op       delta
      Template          195ms ± 3%        194ms ± 5%    ~     (p=0.370 n=30+29)
      Unicode          86.6ms ± 3%       87.0ms ± 7%    ~     (p=0.958 n=29+30)
      GoTypes           548ms ± 3%        555ms ± 4%  +1.35%  (p=0.001 n=30+28)
      Compiler          2.51s ± 2%        2.54s ± 2%  +1.17%  (p=0.000 n=28+30)
      SSA               5.16s ± 3%        5.16s ± 2%    ~     (p=0.910 n=30+29)
      Flate             124ms ± 5%        124ms ± 4%    ~     (p=0.947 n=30+30)
      GoParser          146ms ± 3%        146ms ± 3%    ~     (p=0.150 n=29+28)
      Reflect           354ms ± 3%        352ms ± 4%    ~     (p=0.096 n=29+29)
      Tar               107ms ± 5%        106ms ± 3%    ~     (p=0.370 n=30+29)
      XML               200ms ± 4%        201ms ± 4%    ~     (p=0.313 n=29+28)
      [Geo mean]        332ms             333ms       +0.10%
      
      name        old user-time/op  new user-time/op  delta
      Template          227ms ± 5%        225ms ± 5%    ~     (p=0.457 n=28+27)
      Unicode           109ms ± 4%        109ms ± 5%    ~     (p=0.758 n=29+29)
      GoTypes           713ms ± 4%        721ms ± 5%    ~     (p=0.051 n=30+29)
      Compiler          3.36s ± 2%        3.38s ± 3%    ~     (p=0.146 n=30+30)
      SSA               7.46s ± 3%        7.47s ± 3%    ~     (p=0.804 n=30+29)
      Flate             146ms ± 7%        147ms ± 3%    ~     (p=0.833 n=29+27)
      GoParser          179ms ± 5%        179ms ± 5%    ~     (p=0.866 n=30+30)
      Reflect           431ms ± 4%        429ms ± 4%    ~     (p=0.593 n=29+30)
      Tar               124ms ± 5%        123ms ± 5%    ~     (p=0.140 n=29+29)
      XML               243ms ± 4%        242ms ± 7%    ~     (p=0.404 n=29+29)
      [Geo mean]        415ms             415ms       +0.02%
      
      name        old obj-bytes     new obj-bytes     delta
      Template           382k ± 0%         382k ± 0%    ~     (all equal)
      Unicode            203k ± 0%         203k ± 0%    ~     (all equal)
      GoTypes           1.18M ± 0%        1.18M ± 0%    ~     (all equal)
      Compiler          3.98M ± 0%        3.98M ± 0%    ~     (all equal)
      SSA               8.28M ± 0%        8.28M ± 0%    ~     (all equal)
      Flate              230k ± 0%         230k ± 0%    ~     (all equal)
      GoParser           287k ± 0%         287k ± 0%    ~     (all equal)
      Reflect           1.00M ± 0%        1.00M ± 0%    ~     (all equal)
      Tar                190k ± 0%         190k ± 0%    ~     (all equal)
      XML                416k ± 0%         416k ± 0%    ~     (all equal)
      [Geo mean]         660k              660k       +0.00%
      
      Comparing this CL to itself, from c=1 to c=2
      improves real times 20-30%, costs 5-10% more CPU time,
      and adds about 2% alloc.
      The allocation increase comes from allocating more ssa.Caches.
      
      name       old time/op       new time/op       delta
      Template         202ms ± 3%        149ms ± 3%  -26.15%  (p=0.000 n=49+49)
      Unicode         87.4ms ± 4%       84.2ms ± 3%   -3.68%  (p=0.000 n=48+48)
      GoTypes          560ms ± 2%        398ms ± 2%  -28.96%  (p=0.000 n=49+49)
      Compiler         2.46s ± 3%        1.76s ± 2%  -28.61%  (p=0.000 n=48+46)
      SSA              6.17s ± 2%        4.04s ± 1%  -34.52%  (p=0.000 n=49+49)
      Flate            126ms ± 3%         92ms ± 2%  -26.81%  (p=0.000 n=49+48)
      GoParser         148ms ± 4%        107ms ± 2%  -27.78%  (p=0.000 n=49+48)
      Reflect          361ms ± 3%        281ms ± 3%  -22.10%  (p=0.000 n=49+49)
      Tar              109ms ± 4%         86ms ± 3%  -20.81%  (p=0.000 n=49+47)
      XML              204ms ± 3%        144ms ± 2%  -29.53%  (p=0.000 n=48+45)
      
      name       old user-time/op  new user-time/op  delta
      Template         246ms ± 9%        246ms ± 4%     ~     (p=0.401 n=50+48)
      Unicode          109ms ± 4%        111ms ± 4%   +1.47%  (p=0.000 n=44+50)
      GoTypes          728ms ± 3%        765ms ± 3%   +5.04%  (p=0.000 n=46+50)
      Compiler         3.33s ± 3%        3.41s ± 2%   +2.31%  (p=0.000 n=49+48)
      SSA              8.52s ± 2%        9.11s ± 2%   +6.93%  (p=0.000 n=49+47)
      Flate            149ms ± 4%        161ms ± 3%   +8.13%  (p=0.000 n=50+47)
      GoParser         181ms ± 5%        192ms ± 2%   +6.40%  (p=0.000 n=49+46)
      Reflect          452ms ± 9%        474ms ± 2%   +4.99%  (p=0.000 n=50+48)
      Tar              126ms ± 6%        136ms ± 4%   +7.95%  (p=0.000 n=50+49)
      XML              247ms ± 5%        264ms ± 3%   +6.94%  (p=0.000 n=48+50)
      
      name       old alloc/op      new alloc/op      delta
      Template        38.8MB ± 0%       39.3MB ± 0%   +1.48%  (p=0.008 n=5+5)
      Unicode         29.8MB ± 0%       30.2MB ± 0%   +1.19%  (p=0.008 n=5+5)
      GoTypes          113MB ± 0%        114MB ± 0%   +0.69%  (p=0.008 n=5+5)
      Compiler         443MB ± 0%        447MB ± 0%   +0.95%  (p=0.008 n=5+5)
      SSA             1.25GB ± 0%       1.26GB ± 0%   +0.89%  (p=0.008 n=5+5)
      Flate           25.3MB ± 0%       25.9MB ± 1%   +2.35%  (p=0.008 n=5+5)
      GoParser        31.7MB ± 0%       32.2MB ± 0%   +1.59%  (p=0.008 n=5+5)
      Reflect         78.2MB ± 0%       78.9MB ± 0%   +0.91%  (p=0.008 n=5+5)
      Tar             26.6MB ± 0%       27.0MB ± 0%   +1.80%  (p=0.008 n=5+5)
      XML             42.4MB ± 0%       43.4MB ± 0%   +2.35%  (p=0.008 n=5+5)
      
      name       old allocs/op     new allocs/op     delta
      Template          379k ± 0%         378k ± 0%     ~     (p=0.421 n=5+5)
      Unicode           322k ± 0%         321k ± 0%     ~     (p=0.222 n=5+5)
      GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.548 n=5+5)
      Compiler         4.12M ± 0%        4.11M ± 0%   -0.14%  (p=0.032 n=5+5)
      SSA              9.72M ± 0%        9.72M ± 0%     ~     (p=0.421 n=5+5)
      Flate             234k ± 1%         234k ± 0%     ~     (p=0.421 n=5+5)
      GoParser          316k ± 1%         315k ± 0%     ~     (p=0.222 n=5+5)
      Reflect           980k ± 0%         979k ± 0%     ~     (p=0.095 n=5+5)
      Tar               249k ± 1%         249k ± 1%     ~     (p=0.841 n=5+5)
      XML               392k ± 0%         391k ± 0%     ~     (p=0.095 n=5+5)
      
      From c=1 to c=4, real time is down ~40%, CPU usage up 10-20%, alloc up ~5%:
      
      name       old time/op       new time/op       delta
      Template         203ms ± 3%        131ms ± 5%  -35.45%  (p=0.000 n=50+50)
      Unicode         87.2ms ± 4%       84.1ms ± 2%   -3.61%  (p=0.000 n=48+47)
      GoTypes          560ms ± 4%        310ms ± 2%  -44.65%  (p=0.000 n=50+49)
      Compiler         2.47s ± 3%        1.41s ± 2%  -43.10%  (p=0.000 n=50+46)
      SSA              6.17s ± 2%        3.20s ± 2%  -48.06%  (p=0.000 n=49+49)
      Flate            126ms ± 4%         74ms ± 2%  -41.06%  (p=0.000 n=49+48)
      GoParser         148ms ± 4%         89ms ± 3%  -39.97%  (p=0.000 n=49+50)
      Reflect          360ms ± 3%        242ms ± 3%  -32.81%  (p=0.000 n=49+49)
      Tar              108ms ± 4%         73ms ± 4%  -32.48%  (p=0.000 n=50+49)
      XML              203ms ± 3%        119ms ± 3%  -41.56%  (p=0.000 n=49+48)
      
      name       old user-time/op  new user-time/op  delta
      Template         246ms ± 9%        287ms ± 9%  +16.98%  (p=0.000 n=50+50)
      Unicode          109ms ± 4%        118ms ± 5%   +7.56%  (p=0.000 n=46+50)
      GoTypes          735ms ± 4%        806ms ± 2%   +9.62%  (p=0.000 n=50+50)
      Compiler         3.34s ± 4%        3.56s ± 2%   +6.78%  (p=0.000 n=49+49)
      SSA              8.54s ± 3%       10.04s ± 3%  +17.55%  (p=0.000 n=50+50)
      Flate            149ms ± 6%        176ms ± 3%  +17.82%  (p=0.000 n=50+48)
      GoParser         181ms ± 5%        213ms ± 3%  +17.47%  (p=0.000 n=50+50)
      Reflect          453ms ± 6%        499ms ± 2%  +10.11%  (p=0.000 n=50+48)
      Tar              126ms ± 5%        149ms ±11%  +18.76%  (p=0.000 n=50+50)
      XML              246ms ± 5%        287ms ± 4%  +16.53%  (p=0.000 n=49+50)
      
      name       old alloc/op      new alloc/op      delta
      Template        38.8MB ± 0%       40.4MB ± 0%   +4.21%  (p=0.008 n=5+5)
      Unicode         29.8MB ± 0%       30.9MB ± 0%   +3.68%  (p=0.008 n=5+5)
      GoTypes          113MB ± 0%        116MB ± 0%   +2.71%  (p=0.008 n=5+5)
      Compiler         443MB ± 0%        455MB ± 0%   +2.75%  (p=0.008 n=5+5)
      SSA             1.25GB ± 0%       1.27GB ± 0%   +1.84%  (p=0.008 n=5+5)
      Flate           25.3MB ± 0%       26.9MB ± 1%   +6.31%  (p=0.008 n=5+5)
      GoParser        31.7MB ± 0%       33.2MB ± 0%   +4.61%  (p=0.008 n=5+5)
      Reflect         78.2MB ± 0%       80.2MB ± 0%   +2.53%  (p=0.008 n=5+5)
      Tar             26.6MB ± 0%       27.9MB ± 0%   +5.19%  (p=0.008 n=5+5)
      XML             42.4MB ± 0%       44.6MB ± 0%   +5.20%  (p=0.008 n=5+5)
      
      name       old allocs/op     new allocs/op     delta
      Template          380k ± 0%         379k ± 0%   -0.39%  (p=0.032 n=5+5)
      Unicode           321k ± 0%         321k ± 0%     ~     (p=0.841 n=5+5)
      GoTypes          1.14M ± 0%        1.14M ± 0%     ~     (p=0.421 n=5+5)
      Compiler         4.12M ± 0%        4.14M ± 0%   +0.52%  (p=0.008 n=5+5)
      SSA              9.72M ± 0%        9.76M ± 0%   +0.37%  (p=0.008 n=5+5)
      Flate             234k ± 1%         234k ± 1%     ~     (p=0.690 n=5+5)
      GoParser          316k ± 0%         317k ± 1%     ~     (p=0.841 n=5+5)
      Reflect           981k ± 0%         981k ± 0%     ~     (p=1.000 n=5+5)
      Tar               250k ± 0%         249k ± 1%     ~     (p=0.151 n=5+5)
      XML               393k ± 0%         392k ± 0%     ~     (p=0.056 n=5+5)
      
      Going beyond c=4 on my machine tends to increase CPU time and allocs
      without impacting real time.
      
      The CPU time numbers matter, because when there are many concurrent
      compilation processes, that will impact the overall throughput.
      
      The numbers above are in many ways the best case scenario;
      we can take full advantage of all cores.
      Fortunately, the most common compilation scenario is incremental
      re-compilation of a single package during a build/test cycle.
      
      Updates #15756
      
      Change-Id: I6725558ca2069edec0ac5b0d1683105a9fff6bea
      Reviewed-on: https://go-review.googlesource.com/40693
      
      Reviewed-by: default avatarMatthew Dempsky <mdempsky@google.com>
      Reviewed-by: default avatarRobert Griesemer <gri@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      756b9ce3
  25. 20 Apr, 2017 2 commits
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: eliminate LSym.Version · 405a280d
      Josh Bleecher Snyder authored
      There were only two versions, 0 and 1,
      and the only user of version 1 was the assembler,
      to indicate that a symbol was static.
      
      Rename LSym.Version to Static,
      and add it to LSym.Attributes.
      Simplify call-sites.
      
      Passes toolstash-check.
      
      Change-Id: Iabd39918f5019cce78f381d13f0481ae09f3871f
      Reviewed-on: https://go-review.googlesource.com/41201
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarMatthew Dempsky <mdempsky@google.com>
      405a280d
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: split Link.hash into version 0 and 1 · 6e97c71c
      Josh Bleecher Snyder authored
      Though LSym.Version is an int, it can only have the value 0 or 1.
      Using that, split Link.hash into two maps, one for version 0
      (which is far more common) and one for version 1.
      This lets use just the name for lookups,
      which is both faster and more compact.
      This matters because Link.hash map lookups are frequent,
      and will be contended once the backend is concurrent.
      
      name        old time/op       new time/op       delta
      Template          194ms ± 3%        192ms ± 5%  -1.46%  (p=0.000 n=47+49)
      Unicode          84.5ms ± 3%       83.8ms ± 3%  -0.81%  (p=0.011 n=50+49)
      GoTypes           543ms ± 2%        545ms ± 4%    ~     (p=0.566 n=46+49)
      Compiler          2.48s ± 2%        2.48s ± 3%    ~     (p=0.706 n=47+50)
      SSA               5.94s ± 3%        5.98s ± 2%  +0.55%  (p=0.040 n=49+50)
      Flate             119ms ± 6%        119ms ± 4%    ~     (p=0.681 n=48+47)
      GoParser          145ms ± 4%        145ms ± 3%    ~     (p=0.662 n=47+49)
      Reflect           348ms ± 3%        344ms ± 3%  -1.17%  (p=0.000 n=47+47)
      Tar               105ms ± 4%        104ms ± 3%    ~     (p=0.155 n=50+47)
      XML               197ms ± 2%        197ms ± 3%    ~     (p=0.666 n=49+49)
      [Geo mean]        332ms             331ms       -0.37%
      
      name        old user-time/op  new user-time/op  delta
      Template          230ms ±10%        226ms ±10%  -1.85%  (p=0.041 n=50+50)
      Unicode           104ms ± 6%        103ms ± 5%    ~     (p=0.076 n=49+49)
      GoTypes           707ms ± 4%        705ms ± 5%    ~     (p=0.521 n=50+50)
      Compiler          3.30s ± 3%        3.33s ± 4%  +0.76%  (p=0.003 n=50+49)
      SSA               8.17s ± 4%        8.23s ± 3%  +0.66%  (p=0.030 n=50+49)
      Flate             139ms ± 6%        138ms ± 8%    ~     (p=0.184 n=49+48)
      GoParser          174ms ± 5%        172ms ± 6%    ~     (p=0.107 n=48+49)
      Reflect           431ms ± 8%        420ms ± 5%  -2.57%  (p=0.000 n=50+46)
      Tar               119ms ± 6%        118ms ± 7%  -0.95%  (p=0.033 n=50+49)
      XML               236ms ± 4%        236ms ± 4%    ~     (p=0.935 n=50+48)
      [Geo mean]        410ms             407ms       -0.67%
      
      name        old alloc/op      new alloc/op      delta
      Template         38.7MB ± 0%       38.6MB ± 0%  -0.29%  (p=0.008 n=5+5)
      Unicode          29.8MB ± 0%       29.7MB ± 0%  -0.24%  (p=0.008 n=5+5)
      GoTypes           113MB ± 0%        113MB ± 0%  -0.29%  (p=0.008 n=5+5)
      Compiler          462MB ± 0%        462MB ± 0%  -0.12%  (p=0.008 n=5+5)
      SSA              1.27GB ± 0%       1.27GB ± 0%  -0.05%  (p=0.008 n=5+5)
      Flate            25.2MB ± 0%       25.1MB ± 0%  -0.37%  (p=0.008 n=5+5)
      GoParser         31.7MB ± 0%       31.6MB ± 0%    ~     (p=0.056 n=5+5)
      Reflect          77.5MB ± 0%       77.2MB ± 0%  -0.38%  (p=0.008 n=5+5)
      Tar              26.4MB ± 0%       26.3MB ± 0%    ~     (p=0.151 n=5+5)
      XML              41.9MB ± 0%       41.9MB ± 0%  -0.20%  (p=0.032 n=5+5)
      [Geo mean]       74.5MB            74.3MB       -0.23%
      
      name        old allocs/op     new allocs/op     delta
      Template           378k ± 1%         377k ± 1%    ~     (p=0.690 n=5+5)
      Unicode            321k ± 0%         322k ± 0%    ~     (p=0.595 n=5+5)
      GoTypes           1.14M ± 0%        1.14M ± 0%    ~     (p=0.310 n=5+5)
      Compiler          4.25M ± 0%        4.25M ± 0%    ~     (p=0.151 n=5+5)
      SSA               9.84M ± 0%        9.84M ± 0%    ~     (p=0.841 n=5+5)
      Flate              232k ± 1%         232k ± 0%    ~     (p=0.690 n=5+5)
      GoParser           315k ± 1%         315k ± 1%    ~     (p=0.841 n=5+5)
      Reflect            970k ± 0%         970k ± 0%    ~     (p=0.841 n=5+5)
      Tar                248k ± 0%         248k ± 1%    ~     (p=0.841 n=5+5)
      XML                389k ± 0%         389k ± 0%    ~     (p=1.000 n=5+5)
      [Geo mean]         724k              724k       +0.01%
      
      Updates #15756
      
      Change-Id: I2646332e89f0444ca9d5a41d7172537d904ed636
      Reviewed-on: https://go-review.googlesource.com/41050
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarMatthew Dempsky <mdempsky@google.com>
      6e97c71c
  26. 19 Apr, 2017 1 commit
    • Matthew Dempsky's avatar
      cmd/internal/objabi: extract shared functionality from obj · 1e3570ac
      Matthew Dempsky authored
      Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
      the assembler backends no longer requires reinstalling cmd/link or
      cmd/addr2line.
      
      There's also now one canonical definition of the object file format in
      cmd/internal/objabi/doc.go, with a warning to update all three
      implementations.
      
      objabi is still something of a grab bag of unrelated code (e.g., flag
      and environment variable handling probably belong in a separate "tool"
      package), but this is still progress.
      
      Fixes #15165.
      Fixes #20026.
      
      Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
      Reviewed-on: https://go-review.googlesource.com/40972
      
      
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarJosh Bleecher Snyder <josharian@gmail.com>
      1e3570ac
  27. 18 Apr, 2017 3 commits
    • Matthew Dempsky's avatar
      cmd/internal/obj: un-embed FuncInfo field in LSym · 17470786
      Matthew Dempsky authored
      Automated refactoring using github.com/mdempsky/unbed (to rewrite
      s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo
      field to just Func).
      
      Passes toolstash-check -all.
      
      Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a
      Reviewed-on: https://go-review.googlesource.com/40670
      
      
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarJosh Bleecher Snyder <josharian@gmail.com>
      17470786
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: unexport Link.Hash · 9d4a8467
      Josh Bleecher Snyder authored
      A prior CL eliminated the last reference to Ctxt.Hash
      from the compiler.
      
      Change-Id: Ic97ff84ed1a14e0c93fb0e8ec0b2617c3397c0e8
      Reviewed-on: https://go-review.googlesource.com/40699
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      9d4a8467
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: rework gclocals handling · da15fe68
      Josh Bleecher Snyder authored
      The compiler handled gcargs and gclocals LSyms unusually.
      It generated placeholder symbols (makefuncdatasym),
      filled them in, and then renamed them for content-addressability.
      This is an important binary size optimization;
      the same locals information occurs over and over.
      
      This CL continues to treat these LSyms unusually,
      but in a slightly more explicit way,
      and importantly for concurrent compilation,
      in a way that does not require concurrent
      modification of Ctxt.Hash.
      
      Instead of creating gcargs and gclocals in the usual way,
      by creating a types.Sym and then an obj.LSym,
      we add them directly to obj.FuncInfo,
      initialize them in obj.InitTextSym,
      and deduplicate and add them to ctxt.Data at the end.
      Then the backend's job is simply to fill them in
      and rename them appropriately.
      
      Updates #15756
      
      name       old alloc/op      new alloc/op      delta
      Template        38.8MB ± 0%       38.7MB ± 0%  -0.22%  (p=0.016 n=5+5)
      Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.690 n=5+5)
      GoTypes          113MB ± 0%        113MB ± 0%  -0.24%  (p=0.008 n=5+5)
      SSA             1.25GB ± 0%       1.24GB ± 0%  -0.39%  (p=0.008 n=5+5)
      Flate           25.3MB ± 0%       25.2MB ± 0%  -0.43%  (p=0.008 n=5+5)
      GoParser        31.7MB ± 0%       31.7MB ± 0%  -0.22%  (p=0.008 n=5+5)
      Reflect         78.2MB ± 0%       77.6MB ± 0%  -0.80%  (p=0.008 n=5+5)
      Tar             26.6MB ± 0%       26.3MB ± 0%  -0.85%  (p=0.008 n=5+5)
      XML             42.4MB ± 0%       41.9MB ± 0%  -1.04%  (p=0.008 n=5+5)
      
      name       old allocs/op     new allocs/op     delta
      Template          378k ± 0%         377k ± 1%    ~     (p=0.151 n=5+5)
      Unicode           321k ± 1%         321k ± 0%    ~     (p=0.841 n=5+5)
      GoTypes          1.14M ± 0%        1.14M ± 0%  -0.47%  (p=0.016 n=5+5)
      SSA              9.71M ± 0%        9.67M ± 0%  -0.33%  (p=0.008 n=5+5)
      Flate             233k ± 1%         232k ± 1%    ~     (p=0.151 n=5+5)
      GoParser          316k ± 0%         315k ± 0%  -0.49%  (p=0.016 n=5+5)
      Reflect           979k ± 0%         972k ± 0%  -0.75%  (p=0.008 n=5+5)
      Tar               250k ± 0%         247k ± 1%  -0.92%  (p=0.008 n=5+5)
      XML               392k ± 1%         389k ± 0%  -0.67%  (p=0.008 n=5+5)
      
      Change-Id: Idc36186ca9d2f8214b5f7720bbc27b6bb22fdc48
      Reviewed-on: https://go-review.googlesource.com/40697
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarMatthew Dempsky <mdempsky@google.com>
      da15fe68
  28. 13 Apr, 2017 1 commit
  29. 12 Apr, 2017 5 commits
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: stop storing Text flags in From3 · ce3ee7cd
      Josh Bleecher Snyder authored
      Prior to this CL, flags such as NOSPLIT
      on ATEXT Progs were stored in From3.Offset.
      Some but not all of those flags were also
      duplicated into From.Sym.Attribute.
      
      This CL migrates all of those flags into
      From.Sym.Attribute and stops creating a From3.
      
      A side-effect of this is that printing an
      ATEXT Prog can no longer simply dump From3.Offset.
      That's kind of good, since the raw flag value
      wasn't very informative anyway, but it did
      necessitate a bunch of updates to the cmd/asm tests.
      
      The reason I'm doing this work now is that
      avoiding storing flags in both From.Sym and From3.Offset
      simplifies some other changes to fix the data
      race first described in CL 40254.
      
      This CL almost passes toolstash-check -all.
      The only changes are in cases where the assembler
      has decided that a function's flags may be altered,
      e.g. to make a function with no calls in it NOSPLIT.
      Prior to this CL, that information was not printed.
      
      Sample before:
      
      "".Ctz64 t=1 size=63 args=0x10 locals=0x0
      	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	TEXT	"".Ctz64(SB), $0-16
      	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	FUNCDATA	$0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
      
      Sample after:
      
      "".Ctz64 t=1 nosplit size=63 args=0x10 locals=0x0
      	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	TEXT	"".Ctz64(SB), NOSPLIT, $0-16
      	0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35)	FUNCDATA	$0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
      
      Observe the additional "nosplit" in the first line
      and the additional "NOSPLIT" in the second line.
      
      Updates #15756
      
      Change-Id: I5c59bd8f3bdc7c780361f801d94a261f0aef3d13
      Reviewed-on: https://go-review.googlesource.com/40495
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      ce3ee7cd
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: remove Link.Plan9privates · 49f4b5a4
      Josh Bleecher Snyder authored
      Move it to the x86 package, matching our handling
      of deferreturn in x86 and arm.
      While we're here, improve the concurrency safety
      of both Plan9privates and deferreturn
      by eagerly initializing them in instinit.
      
      Updates #15756
      
      Change-Id: If3b1995c1e4ec816a5443a18f8d715631967a8b1
      Reviewed-on: https://go-review.googlesource.com/40408
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
      49f4b5a4
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: remove Link.Version · f30de83d
      Josh Bleecher Snyder authored
      It is zeroed pointlessly and never read.
      
      Change-Id: I65390501a878f545122ec558cb621b91e394a538
      Reviewed-on: https://go-review.googlesource.com/40406
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      f30de83d
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: remove Link.Debugdivmod · 30ddffad
      Josh Bleecher Snyder authored
      It is only used once and never written to.
      Switch to a local constant instead.
      
      Change-Id: Icdd84e47b81f0de44ad9ed56ab5f4f91df22e6b6
      Reviewed-on: https://go-review.googlesource.com/40405
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      30ddffad
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj: remove dead Link fields · eacfa592
      Josh Bleecher Snyder authored
      These are unused after CLs 39922, 40252, 40370, 40371, and 40372.
      
      Change-Id: I76f9276c581067a8cb555de761550d960f6e39b8
      Reviewed-on: https://go-review.googlesource.com/40404
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      eacfa592
  30. 11 Apr, 2017 2 commits
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj/s390x: make assembler almost concurrency-safe · 1e692454
      Josh Bleecher Snyder authored
      CL 39922 made the arm assembler concurrency-safe.
      This CL does the same, but for s390x.
      The approach is similar: introduce ctxtz to hold
      function-local state and thread it through
      the assembler as necessary.
      
      One race remains after this CL, similar to CL 40252.
      
      That race is conceptually unrelated to this refactoring,
      and will be addressed in a separate CL.
      
      Passes toolstash-check -all.
      
      Updates #15756
      
      Change-Id: Iabf17aa242b70c0b078c2e85dae3d93a5e512372
      Reviewed-on: https://go-review.googlesource.com/40371
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarMichael Munday <munday@ca.ibm.com>
      1e692454
    • Josh Bleecher Snyder's avatar
      cmd/internal/obj/arm64: make assembler almost concurrency-safe · b5931020
      Josh Bleecher Snyder authored
      CL 39922 made the arm assembler concurrency-safe.
      This CL does the same, but for arm64.
      The approach is similar: introduce ctxt7 to hold
      function-local state and thread it through
      the assembler as necessary.
      
      One race remains after this CL, deep in aclass,
      in the check that a Prog does not take the address
      of a TLS variable.
      
      That race is conceptually unrelated to this refactoring,
      and will be addressed in a separate CL.
      
      Passes toolstash-check -all.
      
      Updates #15756
      
      Change-Id: Icab1ef70008468f9a5b8bf728a77c4520bbcb67d
      Reviewed-on: https://go-review.googlesource.com/40252
      
      
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: default avatarCherry Zhang <cherryyz@google.com>
      b5931020