1. 13 Apr, 2017 8 commits
    • hash/crc32: optimize arm64 crc32 implementation · ab636b89
      Wei Xiao authored
      ARMv8 defines crc32 instructions.
      
      Compared with the original crc32 calculation, this patch uses the
      crc32 instructions instead of the multiple-lookup-table algorithms.
      
      ARMv8 provides instructions for the IEEE and Castagnoli polynomials,
      so the performance of these two kinds of crc32 improves significantly.
      
      name                                        old time/op   new time/op    delta
      CRC32/poly=IEEE/size=15/align=0-32            117ns ± 0%      38ns ± 0%   -67.44%
      CRC32/poly=IEEE/size=15/align=1-32            117ns ± 0%      38ns ± 0%   -67.52%
      CRC32/poly=IEEE/size=40/align=0-32            129ns ± 0%      41ns ± 0%   -68.37%
      CRC32/poly=IEEE/size=40/align=1-32            129ns ± 0%      41ns ± 0%   -68.29%
      CRC32/poly=IEEE/size=512/align=0-32           828ns ± 0%     246ns ± 0%   -70.29%
      CRC32/poly=IEEE/size=512/align=1-32           828ns ± 0%     132ns ± 0%   -84.06%
      CRC32/poly=IEEE/size=1kB/align=0-32          1.58µs ± 0%    0.46µs ± 0%   -70.98%
      CRC32/poly=IEEE/size=1kB/align=1-32          1.58µs ± 0%    0.46µs ± 0%   -70.92%
      CRC32/poly=IEEE/size=4kB/align=0-32          6.06µs ± 0%    1.74µs ± 0%   -71.27%
      CRC32/poly=IEEE/size=4kB/align=1-32          6.10µs ± 0%    1.74µs ± 0%   -71.44%
      CRC32/poly=IEEE/size=32kB/align=0-32         48.3µs ± 0%    13.7µs ± 0%   -71.61%
      CRC32/poly=IEEE/size=32kB/align=1-32         48.3µs ± 0%    13.7µs ± 0%   -71.60%
      CRC32/poly=Castagnoli/size=15/align=0-32      116ns ± 0%      38ns ± 0%   -67.07%
      CRC32/poly=Castagnoli/size=15/align=1-32      116ns ± 0%      38ns ± 0%   -66.90%
      CRC32/poly=Castagnoli/size=40/align=0-32      127ns ± 0%      40ns ± 0%   -68.11%
      CRC32/poly=Castagnoli/size=40/align=1-32      127ns ± 0%      40ns ± 0%   -68.11%
      CRC32/poly=Castagnoli/size=512/align=0-32     828ns ± 0%     132ns ± 0%   -84.06%
      CRC32/poly=Castagnoli/size=512/align=1-32     827ns ± 0%     132ns ± 0%   -84.04%
      CRC32/poly=Castagnoli/size=1kB/align=0-32    1.59µs ± 0%    0.22µs ± 0%   -85.89%
      CRC32/poly=Castagnoli/size=1kB/align=1-32    1.58µs ± 0%    0.22µs ± 0%   -85.79%
      CRC32/poly=Castagnoli/size=4kB/align=0-32    6.14µs ± 0%    0.77µs ± 0%   -87.40%
      CRC32/poly=Castagnoli/size=4kB/align=1-32    6.06µs ± 0%    0.77µs ± 0%   -87.25%
      CRC32/poly=Castagnoli/size=32kB/align=0-32   48.3µs ± 0%     5.9µs ± 0%   -87.71%
      CRC32/poly=Castagnoli/size=32kB/align=1-32   48.4µs ± 0%     6.0µs ± 0%   -87.69%
      CRC32/poly=Koopman/size=15/align=0-32         104ns ± 0%     104ns ± 0%    +0.00%
      CRC32/poly=Koopman/size=15/align=1-32         104ns ± 0%     104ns ± 0%    +0.00%
      CRC32/poly=Koopman/size=40/align=0-32         235ns ± 0%     235ns ± 0%    +0.00%
      CRC32/poly=Koopman/size=40/align=1-32         235ns ± 0%     235ns ± 0%    +0.00%
      CRC32/poly=Koopman/size=512/align=0-32       2.71µs ± 0%    2.71µs ± 0%    -0.07%
      CRC32/poly=Koopman/size=512/align=1-32       2.71µs ± 0%    2.71µs ± 0%    -0.04%
      CRC32/poly=Koopman/size=1kB/align=0-32       5.40µs ± 0%    5.39µs ± 0%    -0.06%
      CRC32/poly=Koopman/size=1kB/align=1-32       5.40µs ± 0%    5.40µs ± 0%    +0.02%
      CRC32/poly=Koopman/size=4kB/align=0-32       21.5µs ± 0%    21.5µs ± 0%    -0.16%
      CRC32/poly=Koopman/size=4kB/align=1-32       21.5µs ± 0%    21.5µs ± 0%    -0.05%
      CRC32/poly=Koopman/size=32kB/align=0-32       172µs ± 0%     172µs ± 0%    -0.07%
      CRC32/poly=Koopman/size=32kB/align=1-32       172µs ± 0%     172µs ± 0%    -0.01%
      
      name                                        old speed     new speed      delta
      CRC32/poly=IEEE/size=15/align=0-32          128MB/s ± 0%   394MB/s ± 0%  +207.95%
      CRC32/poly=IEEE/size=15/align=1-32          128MB/s ± 0%   394MB/s ± 0%  +208.09%
      CRC32/poly=IEEE/size=40/align=0-32          310MB/s ± 0%   979MB/s ± 0%  +216.07%
      CRC32/poly=IEEE/size=40/align=1-32          310MB/s ± 0%   979MB/s ± 0%  +216.16%
      CRC32/poly=IEEE/size=512/align=0-32         618MB/s ± 0%  2074MB/s ± 0%  +235.72%
      CRC32/poly=IEEE/size=512/align=1-32         618MB/s ± 0%  3852MB/s ± 0%  +523.55%
      CRC32/poly=IEEE/size=1kB/align=0-32         646MB/s ± 0%  2225MB/s ± 0%  +244.57%
      CRC32/poly=IEEE/size=1kB/align=1-32         647MB/s ± 0%  2225MB/s ± 0%  +243.87%
      CRC32/poly=IEEE/size=4kB/align=0-32         676MB/s ± 0%  2352MB/s ± 0%  +248.02%
      CRC32/poly=IEEE/size=4kB/align=1-32         672MB/s ± 0%  2352MB/s ± 0%  +250.15%
      CRC32/poly=IEEE/size=32kB/align=0-32        678MB/s ± 0%  2387MB/s ± 0%  +252.17%
      CRC32/poly=IEEE/size=32kB/align=1-32        678MB/s ± 0%  2388MB/s ± 0%  +252.11%
      CRC32/poly=Castagnoli/size=15/align=0-32    129MB/s ± 0%   393MB/s ± 0%  +205.51%
      CRC32/poly=Castagnoli/size=15/align=1-32    129MB/s ± 0%   390MB/s ± 0%  +203.41%
      CRC32/poly=Castagnoli/size=40/align=0-32    314MB/s ± 0%   988MB/s ± 0%  +215.04%
      CRC32/poly=Castagnoli/size=40/align=1-32    314MB/s ± 0%   987MB/s ± 0%  +214.68%
      CRC32/poly=Castagnoli/size=512/align=0-32   618MB/s ± 0%  3860MB/s ± 0%  +524.32%
      CRC32/poly=Castagnoli/size=512/align=1-32   619MB/s ± 0%  3859MB/s ± 0%  +523.66%
      CRC32/poly=Castagnoli/size=1kB/align=0-32   645MB/s ± 0%  4568MB/s ± 0%  +608.56%
      CRC32/poly=Castagnoli/size=1kB/align=1-32   650MB/s ± 0%  4567MB/s ± 0%  +602.94%
      CRC32/poly=Castagnoli/size=4kB/align=0-32   667MB/s ± 0%  5297MB/s ± 0%  +693.81%
      CRC32/poly=Castagnoli/size=4kB/align=1-32   676MB/s ± 0%  5297MB/s ± 0%  +684.00%
      CRC32/poly=Castagnoli/size=32kB/align=0-32  678MB/s ± 0%  5519MB/s ± 0%  +713.83%
      CRC32/poly=Castagnoli/size=32kB/align=1-32  677MB/s ± 0%  5497MB/s ± 0%  +712.04%
      CRC32/poly=Koopman/size=15/align=0-32       143MB/s ± 0%   144MB/s ± 0%    +0.27%
      CRC32/poly=Koopman/size=15/align=1-32       143MB/s ± 0%   144MB/s ± 0%    +0.33%
      CRC32/poly=Koopman/size=40/align=0-32       169MB/s ± 0%   170MB/s ± 0%    +0.12%
      CRC32/poly=Koopman/size=40/align=1-32       170MB/s ± 0%   170MB/s ± 0%    +0.08%
      CRC32/poly=Koopman/size=512/align=0-32      189MB/s ± 0%   189MB/s ± 0%    +0.07%
      CRC32/poly=Koopman/size=512/align=1-32      189MB/s ± 0%   189MB/s ± 0%    +0.04%
      CRC32/poly=Koopman/size=1kB/align=0-32      190MB/s ± 0%   190MB/s ± 0%    +0.05%
      CRC32/poly=Koopman/size=1kB/align=1-32      190MB/s ± 0%   190MB/s ± 0%    -0.01%
      CRC32/poly=Koopman/size=4kB/align=0-32      190MB/s ± 0%   190MB/s ± 0%    +0.15%
      CRC32/poly=Koopman/size=4kB/align=1-32      190MB/s ± 0%   191MB/s ± 0%    +0.05%
      CRC32/poly=Koopman/size=32kB/align=0-32     191MB/s ± 0%   191MB/s ± 0%    +0.06%
      CRC32/poly=Koopman/size=32kB/align=1-32     191MB/s ± 0%   191MB/s ± 0%    +0.02%
      
      Also fix a bug in the arm64 assembler.
      
      The optimization is mainly contributed by Fangming.Fang <fangming.fang@arm.com>
      
      Change-Id: I900678c2e445d7e8ad9e2a9ab3305d649230905f
      Reviewed-on: https://go-review.googlesource.com/40074
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
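As a rough illustration of what the benchmarks above exercise, the sketch below computes IEEE and Castagnoli checksums through the public `hash/crc32` API. The helper name `checksums` is made up for this example, and whether the arm64 instructions are actually used depends on the platform and Go version; callers do not select them explicitly.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// checksums is a hypothetical helper that returns the IEEE and
// Castagnoli CRC-32 of data using the standard library tables.
func checksums(data []byte) (ieee, castagnoli uint32) {
	ieee = crc32.ChecksumIEEE(data)
	castagnoli = crc32.Checksum(data, crc32.MakeTable(crc32.Castagnoli))
	return ieee, castagnoli
}

func main() {
	data := []byte("123456789")
	ieee, c := checksums(data)
	fmt.Printf("IEEE=%08x Castagnoli=%08x\n", ieee, c)

	// Koopman has no dedicated instruction in this patch, which matches
	// the unchanged Koopman rows in the benchmark tables.
	koopman := crc32.Checksum(data, crc32.MakeTable(crc32.Koopman))
	fmt.Printf("Koopman=%08x\n", koopman)
}
```

The same calls work on every architecture; only the implementation behind them changes.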
    • net/http/fcgi: expose cgi env vars in request context · aaf46821
      Meir Fischer authored
      The current interface can't access all environment
      variables directly or via cgi.RequestFromMap, which
      only reads variables on its "white list" to be set on
      the http.Request it returns. If an fcgi variable is
      not on the "white list" - e.g. REMOTE_USER - the old
      code has no access to its value.
      
      This passes variables in the Request context that aren't
      used to add data to the Request itself and adds a method
      that parses those env vars from the Request's context.
      
      Fixes #16546
      
      Change-Id: Ibf933a768b677ece1bb93d7bf99a14cef36ec671
      Reviewed-on: https://go-review.googlesource.com/40012
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • internal/poll: rename RecvFrom to ReadFrom for consistency · 7c3fa418
      Mikio Hara authored
      Also adds missing docs.
      
      Change-Id: Ibd8dbe8441bc7a41f01ed2e2033db98e479a5176
      Reviewed-on: https://go-review.googlesource.com/40412
      Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile: make TestAssembly resilient to output ordering · 0d36999a
      Josh Bleecher Snyder authored
      To preserve reproducible builds, the text entries
      during compilation will be sorted before being printed.
      TestAssembly currently assumes that function init
      comes after all user-defined functions.
      Remove that assumption.
      Instead of looking for "TEXT" to tell you where
      a function ends--which may now yield lots of
      non-function-code junk--look for a line beginning
      with non-whitespace.
      
      Updates #15756
      
      Change-Id: Ibc82dba6143d769ef4c391afc360e523b1a51348
      Reviewed-on: https://go-review.googlesource.com/39853
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
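The end-of-function rule described above can be sketched roughly as follows. This is not the test's actual code: `functionBody` and the sample listing are invented for illustration, and real compiler output has a different header format.

```go
package main

import (
	"fmt"
	"strings"
)

// functionBody is a hypothetical helper. It returns the indented lines
// that follow the first line starting with header, and stops at the
// next line that begins with a non-whitespace character.
func functionBody(asm, header string) []string {
	var body []string
	in := false
	for _, line := range strings.Split(asm, "\n") {
		switch {
		case !in:
			in = strings.HasPrefix(line, header)
		case line == "":
			// Skip blank lines; they do not end the function.
		case line[0] == ' ' || line[0] == '\t':
			body = append(body, line)
		default:
			return body // a non-indented line starts the next entry
		}
	}
	return body
}

// sample is a made-up listing, not real compiler output.
const sample = "f STEXT size=10\n" +
	"\tMOVQ\tAX, BX\n" +
	"\tRET\n" +
	"init STEXT size=5\n" +
	"\tRET\n"

func main() {
	for _, line := range functionBody(sample, "f ") {
		fmt.Println(line)
	}
}
```

Because the scan does not rely on where `init` appears, sorting the text entries does not affect the result.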
    • cmd/internal/obj: build ctxt.Text during Sym init · c18fd098
      Josh Bleecher Snyder authored
      Instead of constructing ctxt.Text in Flushplist,
      which will be called concurrently,
      do it in InitTextSym, which must be called serially.
      This allows us to avoid a mutex for ctxt.Text,
      and preserves the existing ordering of functions
      for debug output.
      
      Passes toolstash-check.
      
      Updates #15756
      
      Change-Id: I6322b4da24f9f0db7ba25e5b1b50e8d3be2deb37
      Reviewed-on: https://go-review.googlesource.com/40502
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
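The general pattern, building the shared list in a serial phase so that the concurrent phase never appends to it, might look like the sketch below. All types and names here (`ctxt`, `initTextSym`, `flush`, `build`) are simplified stand-ins, not the real `cmd/internal/obj` API.

```go
package main

import (
	"fmt"
	"sync"
)

// ctxt is a hypothetical stand-in for the assembler context.
type ctxt struct {
	text []string // function names in registration order
}

// initTextSym registers a function. It must be called serially,
// so the append needs no mutex.
func (c *ctxt) initTextSym(name string) {
	c.text = append(c.text, name)
}

// flush stands in for per-function work that may run concurrently.
// It does not touch the shared list.
func flush(name string) int {
	return len(name)
}

// build registers all names serially, then processes them concurrently.
func build(names []string) ([]string, []int) {
	c := &ctxt{}
	for _, n := range names {
		c.initTextSym(n) // serial phase fixes the ordering
	}
	sizes := make([]int, len(names))
	var wg sync.WaitGroup
	for i, n := range names {
		wg.Add(1)
		go func(i int, n string) {
			defer wg.Done()
			sizes[i] = flush(n) // each goroutine writes its own slot
		}(i, n)
	}
	wg.Wait()
	return c.text, sizes
}

func main() {
	text, sizes := build([]string{"main", "init", "helper"})
	fmt.Println(text, sizes)
}
```

The order of `text` depends only on the serial registration loop, which is what keeps debug output stable regardless of goroutine scheduling.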
    • cmd/dist: don't compile unneeded GOARCH SSA rewrite rules during bootstrap · 9dbba36a
      Brad Fitzpatrick authored
      Speeds up build (the bootstrap phase) by ~6 seconds.
      
      Bootstrap goes from ~18 seconds to ~12 seconds.
      
      Change-Id: I7e2ec8f5fc668bf6168d90098eaf70390b16e479
      Reviewed-on: https://go-review.googlesource.com/40503
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • hash/fnv: add 128-bit FNV hash support · e05de6a5
      Lucas Clemente authored
      The 128-bit FNV hash will be used in QUIC, for example.
      
      The algorithm is described at
      https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function
      
      Change-Id: I13f3ec39b0e12b7a5008824a6619dff2e708ee81
      Reviewed-on: https://go-review.googlesource.com/38356
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
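A short usage sketch, assuming the new hashes are exposed as `fnv.New128` (FNV-1) and `fnv.New128a` (FNV-1a), as they are in released versions of `hash/fnv`. The commit message itself does not list the function names, and `sum128` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// sum128 is a hypothetical helper that returns the 16-byte FNV-1 and
// FNV-1a digests of data.
func sum128(data []byte) (fnv1, fnv1a []byte) {
	h := fnv.New128()
	h.Write(data) // Write on a hash.Hash never returns an error
	ha := fnv.New128a()
	ha.Write(data)
	return h.Sum(nil), ha.Sum(nil)
}

func main() {
	a, b := sum128([]byte("hello"))
	fmt.Printf("FNV-1  128: %x\n", a)
	fmt.Printf("FNV-1a 128: %x\n", b)
}
```

Both constructors return a `hash.Hash`, so the digests are byte slices rather than a native integer type.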
    • encoding/asn1: support 31 bit identifiers with OID · 94aba766
      Monis Khan authored
      The current implementation uses a max of 28 bits when decoding an
      ObjectIdentifier.  This change makes it so that an int64 is used to
      accumulate up to 35 bits.  If the resulting data would not overflow
      an int32, it is used as an int.  Thus up to 31 bits may be used to
      represent each subidentifier of an ObjectIdentifier.
      
      Fixes #19933
      
      Change-Id: I95d74b64b24cdb1339ff13421055bce61c80243c
      Reviewed-on: https://go-review.googlesource.com/40436
      Reviewed-by: Adam Langley <agl@golang.org>
      Run-TryBot: Adam Langley <agl@golang.org>
  2. 12 Apr, 2017 26 commits
  3. 11 Apr, 2017 6 commits