    cmd/compile/internal/ssa: combine byte stores on amd64 · 0f2ef0ad
    Ilya Tocar authored
    On amd64 we optimize encoding/binary.BigEndian.PutUint{16,32,64}
    into bswap + single store, but strangely enough not LittleEndian.PutUint{16,32}.
    We have similar rules, but they use 64-bit shifts everywhere,
    and fail to match in the 16/32-bit cases. Add rules that match
    LittleEndian.PutUint{16,32}, and relevant tests. Performance results:
    
    LittleEndianPutUint16-6    1.43ns ± 0%    1.07ns ± 0%   -25.17%  (p=0.000 n=9+9)
    LittleEndianPutUint32-6    2.14ns ± 0%    0.94ns ± 0%   -56.07%  (p=0.019 n=6+8)
    
    LittleEndianPutUint16-6  1.40GB/s ± 0%  1.87GB/s ± 0%   +33.24%  (p=0.000 n=9+9)
    LittleEndianPutUint32-6  1.87GB/s ± 0%  4.26GB/s ± 0%  +128.54%  (p=0.000 n=8+8)
    
    Discovered while looking at ethereum_ethash from community benchmarks.
    
    Change-Id: Id86d5443687ecddd2803edf3203dbdd1246f61fe
    Reviewed-on: https://go-review.googlesource.com/95475
    Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Keith Randall <khr@golang.org>
rewriteAMD64.go 1.06 MB