• Russ Cox's avatar
    index/suffixarray: add 32-bit implementation · 45be3530
    Russ Cox authored
    The original index/suffixarray used 32-bit ints on 64-bit machines,
    because that's what 'int' meant in Go at the time. When we changed
    the meaning of int, that doubled the space overhead of suffix arrays
    for all uses, even though the vast majority of them describe less
    than 2 GB of text.
    
    The space overhead of a suffix array compared to the text is not
    insignificant: there's a big difference for many uses between 4X and 8X.
    
    This CL adjusts names in qsufsort.go so that a global search and
    replace s/32/64/g produces a working 64-bit implementation,
    and then it modifies suffixarray.go to choose between the 32-bit
    and 64-bit implementation as appropriate depending on the input size.
    The 64-bit implementation is generated by 'go generate'.
    
    This CL also restructures the benchmarks, to test different
    input sizes, different input texts, and 32-bit vs 64-bit.
    
    The serialized form uses varint-encoded numbers and is unchanged,
    so on-disk suffix arrays written by older versions of Go will be
    readable by this version, and vice versa.
    
    The 32-bit version runs a up to 17% faster than the 64-bit version
    on real inputs, but more importantly it uses 50% less memory.
    
    I have a followup CL that also implements a faster algorithm
    on top of these improvements, but these are a good first step.
    
    name                                  64-bit speed   32-bit speed    delta
    New/text=opticks/size=100K/bits=*-12  4.44MB/s ± 0%  4.64MB/s ± 0%   +4.41%  (p=0.008 n=5+5)
    New/text=opticks/size=500K/bits=*-12  3.70MB/s ± 1%  3.82MB/s ± 0%   +3.30%  (p=0.008 n=5+5)
    New/text=go/size=100K/bits=*-12       4.40MB/s ± 0%  4.61MB/s ± 0%   +4.82%  (p=0.008 n=5+5)
    New/text=go/size=500K/bits=*-12       3.66MB/s ± 0%  3.77MB/s ± 0%   +3.01%  (p=0.016 n=4+5)
    New/text=go/size=1M/bits=*-12         3.29MB/s ± 0%  3.55MB/s ± 0%   +7.90%  (p=0.016 n=5+4)
    New/text=go/size=5M/bits=*-12         2.25MB/s ± 1%  2.65MB/s ± 0%  +17.81%  (p=0.008 n=5+5)
    New/text=go/size=10M/bits=*-12        1.82MB/s ± 0%  2.09MB/s ± 1%  +14.36%  (p=0.008 n=5+5)
    New/text=go/size=50M/bits=*-12        1.35MB/s ± 0%  1.51MB/s ± 1%  +12.33%  (p=0.008 n=5+5)
    New/text=zero/size=100K/bits=*-12     3.42MB/s ± 0%  3.32MB/s ± 0%   -2.74%  (p=0.000 n=5+4)
    New/text=zero/size=500K/bits=*-12     3.00MB/s ± 1%  2.97MB/s ± 0%   -1.13%  (p=0.016 n=5+4)
    New/text=zero/size=1M/bits=*-12       2.81MB/s ± 0%  2.78MB/s ± 2%     ~     (p=0.167 n=5+5)
    New/text=zero/size=5M/bits=*-12       2.46MB/s ± 0%  2.53MB/s ± 0%   +3.18%  (p=0.008 n=5+5)
    New/text=zero/size=10M/bits=*-12      2.35MB/s ± 0%  2.42MB/s ± 0%   +2.98%  (p=0.016 n=4+5)
    New/text=zero/size=50M/bits=*-12      2.12MB/s ± 0%  2.18MB/s ± 0%   +3.02%  (p=0.008 n=5+5)
    New/text=rand/size=100K/bits=*-12     6.98MB/s ± 0%  7.22MB/s ± 0%   +3.38%  (p=0.016 n=4+5)
    New/text=rand/size=500K/bits=*-12     5.53MB/s ± 0%  5.64MB/s ± 0%   +1.92%  (p=0.008 n=5+5)
    New/text=rand/size=1M/bits=*-12       4.62MB/s ± 1%  5.06MB/s ± 0%   +9.61%  (p=0.008 n=5+5)
    New/text=rand/size=5M/bits=*-12       3.09MB/s ± 0%  3.43MB/s ± 0%  +10.94%  (p=0.016 n=4+5)
    New/text=rand/size=10M/bits=*-12      2.68MB/s ± 0%  2.95MB/s ± 0%  +10.39%  (p=0.008 n=5+5)
    New/text=rand/size=50M/bits=*-12      1.92MB/s ± 0%  2.06MB/s ± 1%   +7.41%  (p=0.008 n=5+5)
    SaveRestore/bits=*-12                  243MB/s ± 1%   259MB/s ± 0%   +6.68%  (p=0.000 n=9+10)
    
    name                               64-bit alloc/op  32-bit alloc/op  delta
    New/text=opticks/size=100K/bits=*-12    1.62MB ± 0%    0.81MB ± 0%  -50.00%  (p=0.000 n=5+4)
    New/text=opticks/size=500K/bits=*-12    8.07MB ± 0%    4.04MB ± 0%  -49.89%  (p=0.008 n=5+5)
    New/text=go/size=100K/bits=*-12         1.62MB ± 0%    0.81MB ± 0%  -50.00%  (p=0.008 n=5+5)
    New/text=go/size=500K/bits=*-12         8.07MB ± 0%    4.04MB ± 0%  -49.89%  (p=0.029 n=4+4)
    New/text=go/size=1M/bits=*-12           16.1MB ± 0%     8.1MB ± 0%  -49.95%  (p=0.008 n=5+5)
    New/text=go/size=5M/bits=*-12           80.3MB ± 0%    40.2MB ± 0%     ~     (p=0.079 n=4+5)
    New/text=go/size=10M/bits=*-12           160MB ± 0%      80MB ± 0%  -50.00%  (p=0.008 n=5+5)
    New/text=go/size=50M/bits=*-12           805MB ± 0%     402MB ± 0%  -50.06%  (p=0.029 n=4+4)
    New/text=zero/size=100K/bits=*-12       3.02MB ± 0%    1.46MB ± 0%     ~     (p=0.079 n=4+5)
    New/text=zero/size=500K/bits=*-12       19.7MB ± 0%     8.7MB ± 0%  -55.98%  (p=0.008 n=5+5)
    New/text=zero/size=1M/bits=*-12         39.0MB ± 0%    19.7MB ± 0%  -49.60%  (p=0.000 n=5+4)
    New/text=zero/size=5M/bits=*-12          169MB ± 0%      85MB ± 0%  -49.46%  (p=0.029 n=4+4)
    New/text=zero/size=10M/bits=*-12         333MB ± 0%     169MB ± 0%  -49.43%  (p=0.000 n=5+4)
    New/text=zero/size=50M/bits=*-12        1.63GB ± 0%    0.74GB ± 0%  -54.61%  (p=0.008 n=5+5)
    New/text=rand/size=100K/bits=*-12       1.61MB ± 0%    0.81MB ± 0%  -50.00%  (p=0.000 n=5+4)
    New/text=rand/size=500K/bits=*-12       8.07MB ± 0%    4.04MB ± 0%  -49.89%  (p=0.000 n=5+4)
    New/text=rand/size=1M/bits=*-12         16.1MB ± 0%     8.1MB ± 0%  -49.95%  (p=0.029 n=4+4)
    New/text=rand/size=5M/bits=*-12         80.7MB ± 0%    40.3MB ± 0%  -50.06%  (p=0.008 n=5+5)
    New/text=rand/size=10M/bits=*-12         161MB ± 0%      81MB ± 0%  -50.03%  (p=0.008 n=5+5)
    New/text=rand/size=50M/bits=*-12         806MB ± 0%     403MB ± 0%  -50.00%  (p=0.016 n=4+5)
    SaveRestore/bits=*-12                   9.47MB ± 0%    5.28MB ± 0%  -44.29%  (p=0.000 n=9+8)
    
    https://perf.golang.org/search?q=upload:20190126.1+|+bits:64+vs+bits:32
    
    Fixes #6816.
    
    Change-Id: Ied2fbea519a202ecc43719debcd233344ce38847
    Reviewed-on: https://go-review.googlesource.com/c/go/+/174097
    Run-TryBot: Russ Cox <rsc@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: default avatarBrad Fitzpatrick <bradfitz@golang.org>
    45be3530
qsufsort64.go 5.02 KB