Commit 234092b8 authored by Kirill Smelkov

go/zodbpickle: Switch to pickling with protocol=3

Pickle protocol 3 natively supports all py bytes and string types and is
supported by zodbpickle/py on both py2 and py3 out of the box. ZODB/py3
has been using protocol=3 since 2013 and ZODB/py2 since 2018. In other
words, nowadays protocol=3 is the default protocol with which ZODB emits
pickles, and it can be read practically everywhere.

The difference between protocol=2 and protocol=3 is the addition of the
BINBYTES and SHORT_BINBYTES opcodes to represent bytes. Without those
opcodes bytes are encoded as

    `_codecs.encode(byt.decode('latin1'), 'latin1')`

which, when unpickled, results in bytes on py3 and str on py2.

Compared to using `BINBYTES byt` this form is much bigger in size, and,
more importantly, it might turn bytes back into str when decoded and
re-encoded on py2.
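
The difference can be observed with the standard `pickle` module. The
following is a minimal sketch, assuming stock CPython 3 rather than
zodbpickle; the sample payload `data` and the helper `opcodes` are
illustrative names, not part of this commit:

```python
import pickle
import pickletools

# Illustrative payload; any non-empty bytes value behaves the same way.
data = b"\x00\xffhello"

p2 = pickle.dumps(data, protocol=2)  # no bytes opcodes available
p3 = pickle.dumps(data, protocol=3)  # SHORT_BINBYTES / BINBYTES available


def opcodes(p):
    """Return the names of the pickle opcodes used in p, in order."""
    return [op.name for op, arg, pos in pickletools.genops(p)]


print(opcodes(p2))       # contains GLOBAL ... REDUCE for _codecs.encode
print(opcodes(p3))       # contains a single bytes opcode
print(len(p2), len(p3))  # the protocol=2 form is the larger one
```

Both pickles load back to the same bytes on py3; the py2 str/bytes
ambiguity described above only shows up when the protocol=2 form is
decoded on py2.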

This form of bytes encoding is also not accepted by ZEO/py, which rejects
it with

    2024-07-18T20:44:39 ERROR ZEO.asyncio.server Can't deserialize message
    Traceback (most recent call last):
      File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/server.py", line 100, in message_received
        message_id, async_, name, args = self.decode(message)
      File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/marshal.py", line 119, in pickle_server_decode
        return unpickler.load()  # msgid, flags, name, args
      File "/home/kirr/src/wendelin/z/ZEO5/src/ZEO/asyncio/marshal.py", line 176, in server_find_global
        raise ImportError("Module not allowed: %s" % (module,))
    ImportError: Module not allowed: _codecs
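
The rejection comes from the server restricting which globals a pickle
may reference. Below is a minimal sketch of that kind of check, assuming
stock CPython 3; `RestrictedUnpickler`, `ALLOWED_MODULES`,
`restricted_loads` and the sample message are hypothetical and do not
reproduce ZEO's actual policy in `ZEO/asyncio/marshal.py`:

```python
import io
import pickle

# Hypothetical allow-list, for illustration only.
ALLOWED_MODULES = {"builtins"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle references (GLOBAL opcode).
        if module not in ALLOWED_MODULES:
            raise ImportError("Module not allowed: %s" % (module,))
        return super().find_class(module, name)


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Illustrative message shaped like (msgid, flags, name, args) with a bytes argument.
msg = (1, 0, "ping", (b"\x00" * 8,))

# protocol=3 encodes the bytes natively, so no global lookup happens.
accepted = restricted_loads(pickle.dumps(msg, protocol=3))

# protocol=2 encodes the bytes via _codecs.encode, which the check refuses.
rejected = None
try:
    restricted_loads(pickle.dumps(msg, protocol=2))
except ImportError as e:
    rejected = str(e)
```

With such a check in place a protocol=2 message carrying bytes cannot
get through unless `_codecs` is explicitly allowed, while the protocol=3
form needs no global at all.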

All in all, using pickle protocol=3 is ok from both the py2 and py3
points of view, brings size optimization and correctness, and fixes the
ZEO issue and probably others.

So let the pickles that ZODB/go saves, and that other Go places emit, be
encoded with protocol=3 now as well.

For the reference, the

    // TODO 2 -> 3 since ZODB5 switched to it and uses zodbpickle.

dates from 2019: it was added in a16c9e06 (go/zodb: Teach Persistent to
serialize itself).
parent a50f1077
@@ -35,7 +35,11 @@
 // At application level utilities like ogórek.AsBytes and ogórek.AsString are
 // handy to work with unpickled data for pickles generated by either py2 or py3.
 //
-// The encoder emits pickles with protocol=2 in order to support pristine python2.
+// The encoder emits pickles with protocol=3 to natively support all py bytes
+// and strings types, and to stay interoperable with both py2 and py3: both ZODB4
+// and ZODB5 use zodbpickle which supports protocol=3 on py2 and py3, and ZODB5
+// actually saves pickles with protocol=3 on both py2 and py3 starting from
+// ZODB 5.4 . Pickles saved with protocol=3 are thus universally readable.
 //
 // See package github.com/kisielk/og-rek for details of pickling/unpickling on Go side.
 package zodbpickle
@@ -58,9 +62,7 @@ import (
 // See documentation of ogórek.EncoderConfig.PersistentRef for details.
 func NewPickler(w io.Writer, getref func(obj any) *pickle.Ref) *pickle.Encoder {
 	return pickle.NewEncoderWithConfig(w, &pickle.EncoderConfig{
-		// allow pristine python2 to decode the pickle.
-		// TODO 2 -> 3 since ZODB5 switched to it and uses zodbpickle.
-		Protocol: 2, // see top-level doc
+		Protocol: 3, // see top-level doc
 		StrictUnicode: true, // see top-level doc
 		PersistentRef: getref,
 	})
@@ -199,9 +199,6 @@ func withZEOSrv(t *testing.T, f func(t *testing.T, zsrv ZEOSrv), optv ...tOption
 }
 xtesting.NeedPy(t, needpy...)
 py2 := strings.HasSuffix(t.Name(), "/py2")
-if !msgpack && !py2 {
-	t.Skip("xfail")
-}
 withFS1(t, func(fs1path string) {
 	X := xtesting.FatalIf(t)