    encoding/json: speed up tokenization of literals · 2cc17bc5
    Daniel Martí authored
    Decoder.Decode and Unmarshal actually scan the input bytes twice - the
    first time to check for syntax errors and the length of the value, and
    the second to perform the decoding.
    
    It's in the second scan that we actually tokenize the bytes. Since the
    first scan has already ruled out syntax errors, we can take shortcuts.
    
    In particular, literals such as quoted strings are very common in JSON,
    so we can avoid a lot of work by special-casing them.
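
As an illustration of the kind of shortcut described above, here is a minimal sketch, not the code from this change. It assumes the input has already passed a syntax check, so it finds the end of a literal without running a full scanner state machine. The names literalEnd and isNumberByte are hypothetical and do not come from encoding/json.

```go
package main

import "fmt"

// isNumberByte reports whether c can appear inside a JSON number.
// (Hypothetical helper for this sketch.)
func isNumberByte(c byte) bool {
	return (c >= '0' && c <= '9') || c == '.' || c == 'e' || c == 'E' || c == '+' || c == '-'
}

// literalEnd returns the index just past the literal that starts at
// data[start]. It assumes data is already known to be valid JSON and that
// data[start] is the first byte of a string, number, true, false, or null,
// so it performs no syntax checks of its own.
func literalEnd(data []byte, start int) int {
	i := start
	switch data[i] {
	case '"': // string: look for the closing quote, skipping escaped bytes
		i++
		for i < len(data) {
			switch data[i] {
			case '\\':
				i += 2 // skip the backslash and the byte it escapes
			case '"':
				return i + 1 // include the closing quote
			default:
				i++
			}
		}
	case 't', 'n': // true, null
		return i + 4
	case 'f': // false
		return i + 5
	default: // number: consume every byte that can belong to it
		for i < len(data) && isNumberByte(data[i]) {
			i++
		}
	}
	return i
}

func main() {
	data := []byte(`{"k\"ey": -12.5e3, "ok": true}`)
	for _, start := range []int{1, 10, 25} {
		end := literalEnd(data, start)
		fmt.Printf("%s\n", data[start:end])
	}
}
```

Skipping two bytes on a backslash is enough even for \uXXXX escapes, because the hex digits that follow are ordinary bytes and can never be mistaken for the closing quote.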
    
    name                  old time/op    new time/op    delta
    CodeDecoder-8           10.3ms ± 1%     9.1ms ± 0%  -11.89%  (p=0.002 n=6+6)
    UnicodeDecoder-8         342ns ± 0%     283ns ± 0%  -17.25%  (p=0.000 n=6+5)
    DecoderStream-8          239ns ± 0%     230ns ± 0%   -3.90%  (p=0.000 n=6+5)
    CodeUnmarshal-8         11.0ms ± 0%     9.8ms ± 0%  -11.45%  (p=0.002 n=6+6)
    CodeUnmarshalReuse-8    10.3ms ± 0%     9.0ms ± 0%  -12.72%  (p=0.004 n=5+6)
    UnmarshalString-8        104ns ± 0%      92ns ± 0%  -11.35%  (p=0.002 n=6+6)
    UnmarshalFloat64-8      93.2ns ± 0%    87.6ns ± 0%   -6.01%  (p=0.010 n=6+4)
    UnmarshalInt64-8        74.5ns ± 0%    71.5ns ± 0%   -3.91%  (p=0.000 n=5+6)
    
    name                  old speed      new speed      delta
    CodeDecoder-8          189MB/s ± 1%   214MB/s ± 0%  +13.50%  (p=0.002 n=6+6)
    UnicodeDecoder-8      40.9MB/s ± 0%  49.5MB/s ± 0%  +20.96%  (p=0.002 n=6+6)
    CodeUnmarshal-8        176MB/s ± 0%   199MB/s ± 0%  +12.93%  (p=0.002 n=6+6)
    
    Updates #28923.
    
    Change-Id: I7a5e2aef51bd4ddf2004aad24210f6f50e01eaeb
    Reviewed-on: https://go-review.googlesource.com/c/go/+/151042
    Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
decode.go 35.5 KB