    cmd/compile: add math/bits.{Add,Sub}64 intrinsics on s390x · 2c1b5130
    Michael Munday authored
    This CL adds intrinsics for the 64-bit addition and subtraction
    functions in math/bits. These intrinsics use the condition code
    to propagate the carry or borrow bit.
    
    To make the carry chains more efficient I've removed the
    'clobberFlags' property from most of the load and store
    operations. Originally these ops did clobber flags when using
    offsets that didn't fit in a signed 20-bit integer, but that is
    no longer the case.
    
    As with other platforms, the intrinsics are faster when executed
    in a chain rather than a loop, because currently we need to spill
    and restore the carry bit between each loop iteration. We may
    be able to reduce the need to do this on s390x (e.g. by using
    compare-and-branch instructions that do not clobber flags) in the
    future.
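
To illustrate the chain-versus-loop distinction, here is a hedged sketch of a four-limb (256-bit) addition written both ways. The helper names addChain and addLoop are hypothetical and are not the benchmark code from this CL. Both produce the same result; the difference lies only in how the compiler can carry the flag between calls.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addChain adds two 256-bit values (least significant limb first) with
// back-to-back Add64 calls, so the carry can pass straight from one
// call to the next.
func addChain(x, y [4]uint64) (z [4]uint64, carry uint64) {
	z[0], carry = bits.Add64(x[0], y[0], 0)
	z[1], carry = bits.Add64(x[1], y[1], carry)
	z[2], carry = bits.Add64(x[2], y[2], carry)
	z[3], carry = bits.Add64(x[3], y[3], carry)
	return z, carry
}

// addLoop computes the same sum in a loop. The loop's own compare and
// branch sit between the Add64 calls, which is where the commit message
// says the carry bit currently has to be spilled and restored.
func addLoop(x, y [4]uint64) (z [4]uint64, carry uint64) {
	for i := range x {
		z[i], carry = bits.Add64(x[i], y[i], carry)
	}
	return z, carry
}

func main() {
	x := [4]uint64{^uint64(0), ^uint64(0), 0, 0}
	y := [4]uint64{1, 0, 0, 0}

	z, c := addChain(x, y)
	fmt.Println(z, c) // [0 0 1 0] 0

	z, c = addLoop(x, y)
	fmt.Println(z, c) // [0 0 1 0] 0
}
```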
    
    name           old time/op  new time/op  delta
    Add64          1.21ns ± 2%  2.03ns ± 2%  +67.18%  (p=0.000 n=7+10)
    Add64multiple  2.98ns ± 3%  1.03ns ± 0%  -65.39%  (p=0.000 n=10+9)
    Sub64          1.23ns ± 4%  2.03ns ± 1%  +64.85%  (p=0.000 n=10+10)
    Sub64multiple  3.73ns ± 4%  1.04ns ± 1%  -72.28%  (p=0.000 n=10+8)
    
    Change-Id: I913bbd5e19e6b95bef52f5bc4f14d6fe40119083
    Reviewed-on: https://go-review.googlesource.com/c/go/+/174303
    Run-TryBot: Michael Munday <mike.munday@ibm.com>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Cherry Zhang <cherryyz@google.com>
opGen.go 772 KB