- 26 Jun, 2024 1 commit
-
-
Kirill Smelkov authored
I've tried to run `zodb dump --pretty=zpickledis` on wendelin.core test data in WCFS(*) and hit the following failure:

    (z-dev) kirr@deca:~/src/wendelin/wendelin.core/wcfs/internal/zdata/testdata$ zodb dump --pretty=zpickledis zblk.fs
    ...
    obj 0000000000000005 685 sha1:865171b709f575b355afd2cc9e1f32b9781c6510
    Traceback (most recent call last):
      File "/home/kirr/src/wendelin/venv/z-dev/bin/zodb", line 11, in <module>
        load_entry_point('zodbtools', 'console_scripts', 'zodb')()
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
        return command_module.main(argv)
      File "<decorator-gen-3>", line 2, in main
      File "/home/kirr/src/wendelin/venv/z-dev/lib/python2.7/site-packages/golang/__init__.py", line 103, in _
        return f(*argv, **kw)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbdump.py", line 341, in main
        zodbdump(stor, tidmin, tidmax, hashonly, pretty)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbdump.py", line 167, in zodbdump
        pickletools.dis(dataf, disf) # state
      File "/usr/lib/python2.7/pickletools.py", line 2005, in dis
        raise ValueError(errormsg)
    ValueError: memo key 1 has never been stored into

The problem turned out to be that the state part of the zpickle refers to another object with the same class as already saved in the class part of the zpickle. That class was therefore referred to via a GET matching the corresponding PUT done in the class part, but our zpickledis handler did not share the memo between those two parts, and so the GET became unmatched.

In more detail, the problem is illustrated by the following zpickle that corresponds to Object.value referring to the same Object. The first part of the zpickle contains the class part and refers to the __main__.Object global, putting it into memo[1]. The second part of the zpickle contains the state part and refers to that object by `(Object, 7) PERSID`, where Object is retrieved via a memo[1] GET:

    obj 0000000000000007 41 sha1:7108c96ccb9cbeaab1164d533174c300e51309f9
        0: \x80 PROTO      2
        2: c    GLOBAL     '__main__ Object'
       19: q    BINPUT     1     <-- NOTE
       21: .    STOP
    highest protocol among opcodes = 2
       22: \x80 PROTO      2
       24: U    SHORT_BINSTRING '\x00\x00\x00\x00\x00\x00\x00\x07'
       34: q    BINPUT     2
       36: h    BINGET     1     <-- NOTE
       38: \x86 TUPLE2
       39: Q    BINPERSID
       40: .    STOP
    highest protocol among opcodes = 2

To handle such zpickles well we need to share the memo when dumping class and state disassemblies, similarly to how ZODB does in its ObjectWriter._dump:

https://github.com/zopefoundation/ZODB/blob/5.8.1-0-g72cebe6bc/src/ZODB/serialize.py#L436-L443

pickletools.dis has explicit support for using a shared memo - originally added in https://github.com/python/cpython/commit/62235e701e37 and likely motivated by the ZODB use-case.

(*) https://lab.nexedi.com/nexedi/wendelin.core/-/blob/07087ec8/wcfs/internal/zdata/testdata/zblk.fs generated by wendelin.core@2c152d41

/reviewed-by @jerome
/reviewed-on !28
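The shared-memo behaviour can be reproduced with the standard library alone. The sketch below is not the zodbtools code: the helper name `dis_zpickle` is hypothetical, and the byte string is rebuilt by hand from the opcodes listed in the disassembly above rather than read from zblk.fs. It disassembles the two-part zpickle once with a shared memo and once without.

```python
import io
import pickletools

# Class part: PROTO 2, GLOBAL __main__.Object, BINPUT 1, STOP
CLASS_PART = b"\x80\x02c__main__\nObject\nq\x01."
# State part: PROTO 2, SHORT_BINSTRING <oid>, BINPUT 2, BINGET 1, TUPLE2, BINPERSID, STOP
STATE_PART = b"\x80\x02U\x08\x00\x00\x00\x00\x00\x00\x00\x07q\x02h\x01\x86Q."
ZPICKLE = CLASS_PART + STATE_PART


def dis_zpickle(data, share_memo=True):
    """Disassemble the class and state parts; return the disassembly text.

    Hypothetical helper for illustration only.
    """
    dataf = io.BytesIO(data)
    disf = io.StringIO()
    memo = {}
    pickletools.dis(dataf, disf, memo)                         # class
    pickletools.dis(dataf, disf, memo if share_memo else {})   # state
    return disf.getvalue()


# With a shared memo, BINGET 1 in the state part finds the PUT from the class part.
text = dis_zpickle(ZPICKLE)

# With separate memos, the same data reproduces the reported failure.
try:
    dis_zpickle(ZPICKLE, share_memo=False)
    unshared_error = None
except ValueError as e:
    unshared_error = str(e)
```

Passing the same dict to both `pickletools.dis` calls is the whole technique; the second call sees every PUT recorded by the first.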
-
- 16 Feb, 2024 1 commit
-
-
Jérome Perrin authored
Co-authored-by: Kirill Smelkov <kirr@nexedi.com>
-
- 01 Sep, 2023 3 commits
-
-
Jérome Perrin authored
Even though the interface of IStorageRestoreable.tpc_begin does not have a "status" argument, the notes below it describe that the actual implementation uses it:

https://github.com/zopefoundation/ZODB/blob/0632974d/src/ZODB/interfaces.py#L950-L956

This is used by FileStorage:

https://github.com/zopefoundation/ZODB/blob/0632974d/src/ZODB/FileStorage/format.py#L30-L39

and the storage methods seem to accept this argument:

https://github.com/zopefoundation/ZODB/blob/0632974d/src/ZODB/BaseStorage.py#L182
https://github.com/zopefoundation/ZEO/blob/e5637818/src/ZEO/ClientStorage.py#L888
https://lab.nexedi.com/nexedi/neoppod/blob/fd87e153/neo/client/app.py#L473

Propagating the status fixes some cases where restoring commits did not recreate a storage that is byte-for-byte equivalent. This happened with a FileStorage that was packed and contained transactions with "p" status.

Co-authored-by: Kirill Smelkov <kirr@nexedi.com>
Reviewed-on: nexedi/zodbtools!24
-
Kirill Smelkov authored
Until now we were generating only regular transactions with " " status and this does not cover e.g. restore case when it needs to replicate packed transaction: instead of recreating it bit-to-bit exactly as original with "p" status, restore recreates it with " " status, breaking restore promise. Adjusting testdata this way exposes that bug in restore: ======================================== FAILURES ======================================== ________________________________________ test_zodbrestore[!zext] ________________________________________ tmpdir = local('/tmp/pytest-of-kirr/pytest-17/test_zodbrestore__zext_0'), zext = <function _ at 0x7fd6b7a03750> @func def test_zodbrestore(tmpdir, zext): zkind = '_!zext' if zext.disabled else '' # restore from testdata/1.zdump.ok and verify it gives result that is # bit-to-bit identical to testdata/1.fs tdata = dirname(__file__) + "/testdata" @func def _(): zdump = open("%s/1%s.zdump.raw.ok" % (tdata, zkind), 'rb') defer(zdump.close) stor = storageFromURL('%s/2.fs' % tmpdir) defer(stor.close) zodbrestore(stor, zdump) _() zfs1 = readfile(fs1_testdata_py23(tmpdir, "%s/1%s.fs" % (tdata, zkind))) zfs2 = readfile("%s/2.fs" % tmpdir) > assert zfs1 == zfs2 E assert 'FS21\x02\x85...0\x00\x00\xb2' == 'FS21\x02\x85\...0\x00\x00\xb2' E Skipping 49 identical leading characters in diff, use -v to show E Skipping 22871 identical trailing characters in diff, use -v to show E - 0\x00\x00tp\x00\x08\x00\t\x00\x00user0.15step 0.15\x00\x00\x00\x00\x00\x00\x00\x03\x02\x85\xcb\xac\x83i\xd0f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00"\x80\x02c__main__ E ? ^ E + 0\x00\x00t \x00\x08\x00\t\x00\x00user0.15step 0.15\x00\x00\x00\x00\x00\x00\x00\x03\x02\x85\xcb\xac\x83i\xd0f\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00... 
E E ...Full output truncated (39 lines hidden), use '-vv' to show test_restore.py:53: AssertionError Having "p" transactions in the testdata will also make sure that all tools should handle such transactions well. The problem of restore not handling "p" status properly was reported by Jérome at nexedi/zodbtools!24. In the next patch we will fix that problem. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!24
-
Kirill Smelkov authored
In 80559a94 ("zodbdump: support --pretty option with a format to show pickles disassembly") we added support for zodbdump --pretty and adjusted files in testdata/ to be named like 1.zdump.{raw,zpickledis}.ok instead of just 1.zdump.ok. However, that renaming and generation of 1.zdump.zpickledis.ok, it seems, were done by hand, because rerunning gen_testdata.py still regenerates old 1.zdump.ok. It seems that during nexedi/zodbtools!22 I missed that gen_testdata.py was not updated. -> Fix it. Running gen_testdata.py with py2 and ZODB 5.8.1 regenerates *.fs and *.ok files in testdata/ in exactly the same state they were. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!24
-
- 08 Sep, 2022 1 commit
-
-
Kirill Smelkov authored
Penultimate patch needs `bstr` from pygolang to work ok (see kirr/pygolang@c9648c44), but it won't hurt if we merge this without waiting for pygolang bits because without bstr zodbtools continues to work ok on py2, and it will be py3 mode which will not work fully ok. Previous discussions and py3 porting attempts: - !8 (comment 73726) - !12 - conversation from !13 (comment 81553) to !13 (comment 81874) - !19 (comment 129023) - kirr/zodbtools@42799cf6 (comment 166403) /reviewed-by @jerome /reviewed-on !23
-
- 07 Sep, 2022 7 commits
-
-
Kirill Smelkov authored
Empty-range test added in b4824ad5 (analyze: fix ZeroDivisionErrors when report is empty) intended to use 0xffffffffffffffff TID, but used just 'ffffffffffffffff' string instead. It was passing on py2 partly by luck, but on py3 it fails because tidmin type is mismatched: _______________________________ test_zodbanalyze _______________________________ tmpdir = local('/tmp/pytest-of-kirr/pytest-30/test_zodbanalyze0') capsys = <_pytest.capture.CaptureFixture object at 0x7f7bb3f9a4f0> def test_zodbanalyze(tmpdir, capsys): ... # empty range report( > analyze( tfs1, use_dbm=False, delta_fs=False, tidmin="ffffffffffffffff", tidmax=None, ), csv=False, ) zodbtools/test/test_analyze.py:68: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ../../venv/py3.venv/lib/python3.9/site-packages/decorator.py:232: in fun return caller(func, *(extras + args), **kw) ../../../tools/go/pygolang/golang/__init__.py:103: in _ return f(*argv, **kw) zodbtools/zodbanalyze.py:181: in analyze fsi = fs.iterator(tidmin, tidmax) ../ZODB/src/ZODB/FileStorage/FileStorage.py:1381: in iterator return FileIterator(self._file_name, start, stop) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <ZODB.FileStorage.FileStorage.FileIterator object at 0x7f7bb348c6d0> filename = '/tmp/pytest-of-kirr/pytest-30/test_zodbanalyze0/1.fs' start = 'ffffffffffffffff', stop = None, pos = 4 def __init__(self, filename, start=None, stop=None, pos=4): assert isinstance(filename, STRING_TYPES) file = open(filename, 'rb') self._file = file self._file_name = filename if file.read(4) != packed_version: raise FileStorageFormatError(file.name) file.seek(0, 2) self._file_size = file.tell() if (pos < 4) or pos > self._file_size: raise ValueError("Given position is greater than the file size", pos, self._file_size) self._pos = pos > assert start is None or isinstance(start, bytes) E AssertionError ../ZODB/src/ZODB/FileStorage/FileStorage.py:1816: AssertionError 
    ------------------------------ Captured log call -------------------------------
    ERROR ZODB.FileStorage:FileStorage.py:480 loading index
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 25: ordinal not in range(128)

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/FileStorage/FileStorage.py", line 478, in _restore_index
        info = fsIndex.load(index_name)
      File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/fsIndex.py", line 138, in load
        v = unpickler.load()
    SystemError: <built-in method read of _io.BufferedReader object at 0x7f7bb3df03b0> returned a result with an error set
    ERROR ZODB.FileStorage:FileStorage.py:480 loading index
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb7 in position 25: ordinal not in range(128)
    ...

-> Fix it by properly preparing tidmin in the test as an 8-byte binary string.
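As a minimal sketch of the difference between the hex spelling of a TID and the TID itself: the helper name `tid_from_int` is hypothetical, and it uses only `struct`. `ZODB.utils.p64` performs the same big-endian 8-byte packing.

```python
import struct


def tid_from_int(n):
    """Pack an integer TID into the 8-byte big-endian form storages expect."""
    return struct.pack(">Q", n)


# What the test passed: 16 characters of text, not a TID.
tidmin_wrong = "ffffffffffffffff"

# What the test intended: the 8-byte binary value of 0xffffffffffffffff.
tidmin = tid_from_int(0xffffffffffffffff)

# The hex text can also be converted explicitly.
tidmin_from_hex = bytes.fromhex(tidmin_wrong)
```

The `isinstance(start, bytes)` assertion in `FileIterator.__init__` shown above accepts `tidmin` and rejects `tidmin_wrong` on py3.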
-
Kirill Smelkov authored
e.g. for ObjectData .hashfunc: In many contexts we need that .hashfunc to be like string, e.g. for accessing hashRegistry by keys. In many other contexts - e.g. when zodbdump input it parsed or emitted, it is more handy to handle it like raw bytes. If we let .hashfunc to be of type str - it breaks the second mode. If of type bytes - it breaks the first mode. And also in many places it is hard to constantly encode/decode str and bytes, especially in the places where an object is sometimes used in strings context, and sometimes in binary context. -> Fix it all in one go by using bytestring type from pygolang, which provides both unicode string and binary semantics simultaneously. This needs bstr from pygolang (see kirr/pygolang@c9648c44), but even if pygolang comes without bstr, with this patch zodbtools continues to work ok on py2 - it will be just py3 mode that won't work. The list of test failures before this patch is provided below: _______________________________ test_zodbanalyze _______________________________ tmpdir = local('/tmp/pytest-of-kirr/pytest-22/test_zodbanalyze0') capsys = <_pytest.capture.CaptureFixture object at 0x7f3de6835c70> def test_zodbanalyze(tmpdir, capsys): tfs1 = fs1_testdata_py23(tmpdir, os.path.join(os.path.dirname(__file__), "testdata", "1.fs")) for use_dbm in (False, True): > report( analyze( tfs1, use_dbm=use_dbm, delta_fs=False, tidmin=None, tidmax=None, ), csv=False, ) zodbtools/test/test_analyze.py:30: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ rep = <zodbtools.zodbanalyze.Report object at 0x7f3de5e16b20>, csv = False def report(rep, csv=False): ... 
print (fmtp % (t_display, rep.TYPEMAP[t], rep.TYPESIZE[t], pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t], > rep.COIDSMAP[t], rep.CBYTESMAP[t], rep.FOIDSMAP.get(t, 0), rep.FBYTESMAP.get(t, 0))) E KeyError: b'persistent.mapping.PersistentMapping' zodbtools/zodbanalyze.py:147: KeyError ____________________________ test_zodbcommit[!zext] ____________________________ zext = <function zext.<locals>._ at 0x7f3deb5c3e50> @func def test_zodbcommit(zext): tmpd = mkdtemp('', 'zodbcommit.') defer(lambda: rmtree(tmpd)) stor = storageFromURL('%s/2.fs' % tmpd) defer(stor.close) head = stor.lastTransaction() # commit some transactions via zodbcommit and verify if storage dump gives # what is expected. t1 = Transaction(z64, ' ', b'user name', b'description ...', zext(dumps({'a': 'b'}, _protocol)), [ ObjectData(p64(1), b'data1', 'sha1', sha1(b'data1')), ObjectData(p64(2), b'data2', 'sha1', sha1(b'data2'))]) t1.tid = zodbcommit(stor, head, t1) t2 = Transaction(z64, ' ', b'user2', b'desc2', b'', [ ObjectDelete(p64(2))]) t2.tid = zodbcommit(stor, t1.tid, t2) buf = BytesIO() zodbdump(stor, p64(u64(head)+1), None, out=buf) dumped = buf.getvalue() > assert dumped == b''.join([_.zdump() for _ in (t1, t2)]) zodbtools/test/test_commit.py:61: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ zodbtools/test/test_commit.py:61: in <listcomp> assert dumped == b''.join([_.zdump() for _ in (t1, t2)]) zodbtools/zodbdump.py:521: in zdump z += obj.zdump() _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <zodbtools.zodbdump.ObjectData object at 0x7f3de5d26d90> def zdump(self): data = self.data hashonly = isinstance(data, HashOnly) if hashonly: size = data.size else: size = len(data) > z = b'obj %s %d %s:%s' % (ashex(self.oid), size, self.hashfunc, ashex(self.hash_)) E TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str' zodbtools/zodbdump.py:569: TypeError _______________________________ 
test_dumpreader ________________________________ def test_dumpreader(): in_ = b"""\ txn 0123456789abcdef " " user "my name" description "o la-la..." extension "zzz123 def" obj 0000000000000001 delete obj 0000000000000002 from 0123456789abcdee obj 0000000000000003 54 adler32:01234567 - obj 0000000000000004 4 sha1:9865d483bc5a94f2e30056fc256ed3066af54d04 ZZZZ obj 0000000000000005 9 crc32:52fdeac5 ABC DEF! txn 0123456789abcdf0 " " user "author2" description "zzz" extension "qqq" """ r = DumpReader(BytesIO(in_)) > t1 = r.readtxn() zodbtools/test/test_dump.py:78: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ zodbtools/zodbdump.py:443: in readtxn self._badline('unknown hash function %s' % qq(hashfunc)) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <zodbtools.zodbdump.DumpReader object at 0x7f3de5d69cd0> msg = 'unknown hash function "adler32"' def _badline(self, msg): > raise RuntimeError("%s+%d: invalid line: %s (%s)" % (_ioname(self._r), self.lineno, msg, qq(self._line))) E RuntimeError: +7: invalid line: unknown hash function "adler32" ("obj 0000000000000003 54 adler32:01234567 -") zodbtools/zodbdump.py:382: RuntimeError ___________________________ test_zodbrestore[!zext] ____________________________ tmpdir = local('/tmp/pytest-of-kirr/pytest-22/test_zodbrestore__zext_0') zext = <function zext.<locals>._ at 0x7f3de5d6ddc0> @func def test_zodbrestore(tmpdir, zext): zkind = '_!zext' if zext.disabled else '' # restore from testdata/1.zdump.ok and verify it gives result that is # bit-to-bit identical to testdata/1.fs tdata = dirname(__file__) + "/testdata" @func def _(): zdump = open("%s/1%s.zdump.raw.ok" % (tdata, zkind), 'rb') defer(zdump.close) stor = storageFromURL('%s/2.fs' % tmpdir) defer(stor.close) zodbrestore(stor, zdump) > _() zodbtools/test/test_restore.py:49: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../venv/py3.venv/lib/python3.9/site-packages/decorator.py:232: in fun return caller(func, *(extras + args), **kw) ../../../tools/go/pygolang/golang/__init__.py:103: in _ return f(*argv, **kw) zodbtools/test/test_restore.py:48: in _ zodbrestore(stor, zdump) zodbtools/zodbrestore.py:39: in zodbrestore txn = zr.readtxn() zodbtools/zodbdump.py:443: in readtxn self._badline('unknown hash function %s' % qq(hashfunc)) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <zodbtools.zodbdump.DumpReader object at 0x7f3de5d79e20> msg = 'unknown hash function "sha1"' def _badline(self, msg): > raise RuntimeError("%s+%d: invalid line: %s (%s)" % (_ioname(self._r), self.lineno, msg, qq(self._line))) E RuntimeError: /home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1_!zext.zdump.raw.ok+5: invalid line: unknown hash function "sha1" ("obj 0000000000000000 61 sha1:664e6de0f153d8eaeda638d616a320c6e3c5feb1") zodbtools/zodbdump.py:382: RuntimeError
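The two conflicting usage modes described in this commit message can be shown without pygolang. The sketch below only demonstrates the problem on py3 with plain `str` and `bytes`; it does not use or imitate `bstr`, and `hash_registry` here is a hypothetical stand-in for the registry mentioned above.

```python
import hashlib

# Hypothetical registry keyed by text, as lookup tables usually are.
hash_registry = {"sha1": hashlib.sha1}

hashfunc_str = "sha1"
hashfunc_bytes = b"sha1"

# Mode 1: registry lookup. Text works; bytes does not match a text key on py3.
found_str = hashfunc_str in hash_registry
found_bytes = hashfunc_bytes in hash_registry

# Mode 2: emitting a binary dump line. Bytes works; text raises TypeError on py3.
line = b"obj %s" % hashfunc_bytes
try:
    b"obj %s" % hashfunc_str
    format_error = None
except TypeError as e:
    format_error = type(e).__name__
```

Whichever of the two plain types is chosen for `.hashfunc`, one of the modes breaks, which is the motivation given above for a type with both semantics.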
-
Kirill Smelkov authored
Zodbcommit reads input in zodbdump format from stdin and then uses zodbdump.DumpReader to parse that input. The parser works on binary data. However, zodbcommit was preparing that input data by mixing bytes and strings, which fails on py3:

    (py3.venv) kirr@deca:~/src/wendelin/z/zodbtools$ zodb commit 1.fs 00
    Ignoring index for /home/kirr/src/wendelin/z/zodbtools/1.fs
    aaa
    Traceback (most recent call last):
      File "/home/kirr/src/wendelin/venv/py3.venv/bin/zodb", line 33, in <module>
        sys.exit(load_entry_point('zodbtools', 'console_scripts', 'zodb')())
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
        return command_module.main(argv)
      File "/home/kirr/src/wendelin/venv/py3.venv/lib/python3.9/site-packages/decorator.py", line 232, in fun
        return caller(func, *(extras + args), **kw)
      File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
        return f(*argv, **kw)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 222, in main
        zin += sys.stdin.read()
    TypeError: can't concat str to bytes

-> Fix it by reading stdin in binary mode.

No test for now, as zodbcommit.main is not covered by tests (yet, hopefully).
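A minimal sketch of reading a text stream in binary mode, assuming a helper named `as_binary` (hypothetical, not the zodbtools function) and a fake stdin so that it runs without real input. The `zin` prefix is a placeholder, not what zodbcommit actually prepends.

```python
import io


def as_binary(stream):
    """Return the underlying binary stream of a text stream.

    On py3, text streams such as sys.stdin expose the raw bytes via .buffer;
    streams without that attribute are returned unchanged.
    """
    return getattr(stream, "buffer", stream)


# Stand-in for sys.stdin: a text wrapper around bytes.
fake_stdin = io.TextIOWrapper(io.BytesIO(b"aaa\n"))

zin = b"placeholder header\n"          # bytes prepared by the caller
zin += as_binary(fake_stdin).read()    # bytes + bytes: no TypeError
```

In a real command the call would be `as_binary(sys.stdin).read()`.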
-
Kirill Smelkov authored
pickletools.dis, which is used to handle --pretty=zpickledis (*), expects output stream be text-like, not binary. We were passing a binary stream to it. As the result pickle disassembly was failing on py3: _______________________ test_zodbdump[!zext-zpickledis] ________________________ tmpdir = local('/tmp/pytest-of-kirr/pytest-11/test_zodbdump__zext_zpickledis0') zext = <function zext.<locals>._ at 0x7f538b508670>, pretty = 'zpickledis' @mark.parametrize('pretty', ('raw', 'zpickledis')) def test_zodbdump(tmpdir, zext, pretty): tdir = dirname(__file__) zkind = '_!zext' if zext.disabled else '' tfs1 = fs1_testdata_py23(tmpdir, '%s/testdata/1%s.fs' % (tdir, zkind)) stor = FileStorage(tfs1, read_only=True) with open('%s/testdata/1%s.zdump.%s.ok' % (tdir, zkind, pretty), 'rb') as f: dumpok = f.read() out = BytesIO() > zodbdump(stor, None, None, pretty=pretty, out=out) zodbtools/test/test_dump.py:48: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ zodbtools/zodbdump.py:165: in zodbdump pickletools.dis(dataf, disf) # class _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ pickle = <_io.BytesIO object at 0x7f538b577130> out = <_io.BytesIO object at 0x7f538b49f8b0>, memo = {}, indentlevel = 4 annotate = 0 def dis(pickle, out=None, memo=None, indentlevel=4, annotate=0): """Produce a symbolic disassembly of a pickle...""" ... for opcode, arg, pos in genops(pickle): if pos is not None: > print("%5d:" % pos, end=' ', file=out) E TypeError: a bytes-like object is required, not 'str' /usr/lib/python3.9/pickletools.py:2450: TypeError -> Fix it by letting pickletools.dis to emit its output to StringIO instead of BytesIO. (*) see 80559a94 "zodbdump: support --pretty option with a format to show pickles disassembly"
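The fix can be illustrated with the standard library only. This is a sketch of the general pattern (disassemble into a text buffer, then encode into the binary output), not the zodbdump code; the sample pickle and the UTF-8 encoding are assumptions.

```python
import io
import pickle
import pickletools

data = pickle.dumps({"a": 1}, protocol=2)

# Passing a binary stream as output fails on py3, as in the traceback above.
try:
    pickletools.dis(io.BytesIO(data), io.BytesIO())
    binary_error = None
except TypeError as e:
    binary_error = type(e).__name__

# Disassemble into a text buffer first, then encode for the binary output.
disf = io.StringIO()
pickletools.dis(io.BytesIO(data), disf)

out = io.BytesIO()
out.write(disf.getvalue().encode("utf-8"))
```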
-
Kirill Smelkov authored
FileStorage/py2 uses `FS21` magic in file header, whereas FileStorage/py3 uses `FS30` magic: https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/_compat.py#L39 https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/_compat.py#L74 And if, upon opening the database, file magic does not match to what ZODB expects, open is rejected: https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/FileStorage/FileStorage.py#L88 https://github.com/zopefoundation/ZODB/blob/0e72b8b13657/src/ZODB/FileStorage/FileStorage.py#L1625-L1630 This is done with the idea for a database, that was written from Python2, to be rejected to be opened from Python3 and vice-versa because strings/bytes semantics changed in between py23. As the result, many zodbtools tests currently fail on py3 when they try to access prepared FileStorage database in testdata, because that database was originally prepared on py2. Here is, for example, how test_zodbdump fails: ___________________________ test_zodbdump[zext-raw] ____________________________ zext = <function zext.<locals>._ at 0x7f28530bf9d0>, pretty = 'raw' @mark.parametrize('pretty', ('raw', 'zpickledis')) def test_zodbdump(zext, pretty): tdir = dirname(__file__) zkind = '_!zext' if zext.disabled else '' > stor = FileStorage('%s/testdata/1%s.fs' % (tdir, zkind), read_only=True) zodbtools/test/test_dump.py:41: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ../ZODB/src/ZODB/FileStorage/FileStorage.py:315: in __init__ self._pos, self._oid, tid = read_index( _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ file = <_io.BufferedReader name='/home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1.fs'> name = '/home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1.fs' index = <ZODB.fsIndex.fsIndex object at 0x7f2852fee2b0>, tindex = {} stop = b'\xff\xff\xff\xff\xff\xff\xff\xff' ltid = b'\x00\x00\x00\x00\x00\x00\x00\x00', start = 4 maxoid = 
b'\x00\x00\x00\x00\x00\x00\x00\x00', recover = 0, read_only = True def read_index(file, name, index, tindex, stop=b'\377'*8, ltid=z64, start=4, maxoid=z64, recover=0, read_only=0): """Scan the file storage and update the index.""" ... if file_size: if file_size < start: raise FileStorageFormatError(file.name) seek(0) if read(4) != packed_version: > raise FileStorageFormatError(name) E ZODB.FileStorage.FileStorage.FileStorageFormatError: /home/kirr/src/wendelin/z/zodbtools/zodbtools/test/testdata/1.fs ../ZODB/src/ZODB/FileStorage/FileStorage.py:1630: FileStorageFormatError Since zodbtools primarily work on raw data without decoding stored pickles, unlike Zope or ERP5, it should not be a problem for zodbtools to work on py3 with the database that was prepared on py2. -> Adjust all tests to use FileStorage data generated on the fly based on original files in testdata/ but with FileStorage header being rewritten to match current python.
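Rewriting the header amounts to replacing the first four bytes of the file. The sketch below assumes hypothetical helper names (`rewrite_fs_magic`, `current_magic`) and a fake file body; it is not the helper used by the zodbtools tests, and the magic values are the ones quoted in this commit message.

```python
import os
import sys
import tempfile

# Magic values as named in the commit message above.
MAGIC_PY2 = b"FS21"
MAGIC_PY3 = b"FS30"


def current_magic():
    """Magic that FileStorage expects under the running Python major version."""
    return MAGIC_PY2 if sys.version_info[0] == 2 else MAGIC_PY3


def rewrite_fs_magic(src, dst, magic):
    """Copy src to dst, replacing the 4-byte FileStorage magic with `magic`."""
    with open(src, "rb") as f:
        data = f.read()
    if data[:4] not in (MAGIC_PY2, MAGIC_PY3):
        raise ValueError("%s: not a FileStorage file" % src)
    with open(dst, "wb") as f:
        f.write(magic + data[4:])


# Demonstration on a fake file; the body after the magic is left untouched.
tmpd = tempfile.mkdtemp()
src = os.path.join(tmpd, "1.fs")
dst = os.path.join(tmpd, "1.rewritten.fs")
with open(src, "wb") as f:
    f.write(MAGIC_PY2 + b"fake transaction records")

rewrite_fs_magic(src, dst, MAGIC_PY3)
with open(dst, "rb") as f:
    result = f.read()
```

Writing to a separate destination keeps the files in testdata/ unmodified, which matches the "generated on the fly" approach described above.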
-
Kirill Smelkov authored
A counterpart to readfile - to write a file instead of reading it. We will need this function in the next patch.
-
Kirill Smelkov authored
Soon we will need to use it not only from test_restore.py
-
- 29 Mar, 2022 1 commit
-
-
Jérome Perrin authored
Showing pickle disassembly can sometimes be useful to analyse details of the pickle content. We realized that in some data structures used in ERP5 the same string was saved multiple times in the same pickle and by using the exact same string (ie. for which `s1 is s2` is True), the pickle will have the string only once and pickles are a bit smaller. For more reference, the context was nexedi/erp5!1560 (comment 154825) This introduces a new --pretty option that we will be able to extend later with more output formats. Co-authored-by: Kirill Smelkov <kirr@nexedi.com> Reviewed-on: nexedi/zodbtools!22
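The effect of string identity on pickle size can be checked directly. In this sketch, with made-up sample strings, two equal but distinct string objects are pickled in full, while one object referenced twice is written once and then fetched from the memo.

```python
import pickle

# Two equal strings built at runtime are distinct objects.
s1 = "".join(["same ", "text ", "stored ", "twice"])
s2 = "".join(["same ", "text ", "stored ", "twice"])

shared = pickle.dumps([s1, s1], protocol=2)   # second item is a memo reference
copied = pickle.dumps([s1, s2], protocol=2)   # string body is written twice

saved_bytes = len(copied) - len(shared)
```

Running `pickletools.dis` on `shared` and `copied` shows the difference as a GET opcode in place of the second string opcode.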
-
- 01 Apr, 2021 1 commit
-
-
Jérome Perrin authored
@kirr wrote (nexedi/zodbtools!19 (comment 129442)) For the reference - contrary to ZODB5, restore tests on ZODB4 are currently [broken](https://nexedijs.erp5.net/#/test_result_module/20210317-B3AC205A/2). Restored file is not bit-to-bit identical to the original. The problem is that on commit/restore, we need to save user/description/extension. For extension `zodbdump.Transaction` provides .extension_bytes, which ZODB5 uses to save its raw copy. However ZODB4 goes through `.extension` and pickles it: https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbdump.py#L425-453 https://github.com/zopefoundation/ZODB/blob/4/src/ZODB/BaseStorage.py#L220-L240 This leads to unpickle-repickle round-trip and different extension being committed on restore: ```diff diff --git a/1zdump b/2zdump index 5033bc1..a3a32aa 100644 --- a/1zdump +++ b/2zdump @@ -10,7 +10,7 @@ q^A. txn 0285cbac3d0369e6 " " user "user0.0" description "step 0.0" -extension "\x80\x02}q\x01(U\tx-cookieSU\x05RF9IEU\vx-generatorq\x02U\fzodb/py2 (f)u." +extension "}q\x01(U\tx-cookieSU\x05RF9IEU\vx-generatorU\fzodb/py2 (f)u." obj 0000000000000000 98 sha1:eba252d1984f975ecb636bc1b3a89c953dd20527 ... ``` What might save us is to somehow in Transaction.extension returns a dict-subclass object that is somehow pickled to the exact bytes remembered when it was created. However, after briefly checking, I could not find a mechanism to do so yet... @jerome wrote (nexedi/zodbtools!19 (comment 129479)) @kirr we already have pytest fixtures to test differently depending on whether the ZODB version has support for extension_bytes, so what about using it in the test and testing restoring the extension bytes version of the dump only for ZODB5 ? @kirr wrote (nexedi/zodbtools!19 (comment 129482)) @jerome, yes we have this, but I believe we should actually fix zodbrestore to be reliable whatever ZODB is used. For ZODB5 it works. For ZODB4-wc2 we can adjust ZODB code to use extension_bytes similarly to how ZODB5 does. 
But unpatched ZODB4 is currently out of luck. As it was decided that Nexedi will use both ZODB4 and ZODB4-wc2, I think we should fix zodbrestore to work on all those versions to be reliable. /cc @tomo

@kirr: -> No universal ZODB4 fix for now (this would require monkey-patching ZODB in several places), so mark the "restore with extension" test as xfail, similarly to how we already do for the "dump with extension" test. This brings -ZODB4 and -ZODB4-wc2 tests back to PASS state. Even though on ZODB4 the extension is not restored bit-to-bit exactly, it is restored to be the same dictionary, equal to what was used to produce the dump. Not ideal, but still not losing the information in practice. One more reason to switch to ZODB5...
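Why an unpickle-repickle round trip does not preserve bytes can be illustrated with the extension value from the diff above. This is only a sketch of the principle on py3 with the standard `pickle` module: the bytes it produces differ from the original in other ways than the ZODB4/py2 output shown in the diff, but the conclusion is the same, namely an equal dictionary with a different serialization.

```python
import pickle

# Extension bytes as shown on the "-" line of the diff above
# (a protocol-2 pickle written on py2).
ext_bytes = (b"\x80\x02}q\x01(U\tx-cookieSU\x05RF9IE"
             b"U\x0bx-generatorq\x02U\x0czodb/py2 (f)u.")

ext = pickle.loads(ext_bytes)               # decode to a dictionary
repickled = pickle.dumps(ext, protocol=2)   # encode it again

same_dict = pickle.loads(repickled) == ext
same_bytes = repickled == ext_bytes
```

Keeping the raw bytes around, as `extension_bytes` does, avoids depending on the pickler reproducing its own input.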
-
- 16 Mar, 2021 2 commits
-
-
Kirill Smelkov authored
In the previous patch we taught the object copy handler to report more details, but it was still incomplete: the error was missing details about which operation was run - commit, or restore of a particular transaction. Noting that, it can also be noted that other errors reported from that function lack such context.

-> So fix it universally, at least for zodbcommit for now: set the top-level runctx to the topic of what we are doing, and use that runctx when generating errors.

Runctx describes what we are running, and could also be used later for logging and tracing. That's why it is called runctx instead of just errctx for "error context".

TODO currently only exceptions that we explicitly raise get the context. If an exception is raised by something that we call, the context won't be added. It would be good to later rework error handling and append such context to any raised error. Defer and https://lab.nexedi.com/kirr/go123/blob/863c4602/xerr/__init__.py have something preliminary for this.

The particular error when restoring a missing object copy becomes

    ValueError: /tmp/demo002868462/δ0285cbac75555580/δ.fs: restore 0285cbacb70a3db3 @0285cbacb258bf66: object 0000000000000003: copy from @0285cbac70a3d733: no data

instead of the older

    ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data

/reviewed-by @jerome
/reviewed-on nexedi/zodbtools!20
-
Kirill Smelkov authored
When zodbdump input says to copy an object, we first load that object. However if object does not exist loadBefore raises POSKeyError, and when object at copied-from revision was deleted loadBefore returns None. -> Handle that explicitly to provide failure details to the user, so that instead of cryptic === RUN TestLoad/δstart=0285cbac75555580 Traceback (most recent call last): File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module> main() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main return command_module.main(argv) File "<decorator-gen-6>", line 2, in main File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _ return f(*argv, **kw) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main zodbrestore(stor, asbinstream(sys.stdin), _) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore zodbcommit(stor, at, txn) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 122, in zodbcommit _() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 91, in _ data, _, _ = stor.loadBefore(obj.oid, p64(u64(obj.copy_from)+1)) TypeError: 'NoneType' object is not iterable xtesting.go:483: /tmp/demo009767458/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1 it fails with something more understandable: === RUN TestLoad/δstart=0285cbac75555580 Traceback (most recent call last): File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module> main() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", 
line 129, in main return command_module.main(argv) File "<decorator-gen-6>", line 2, in main File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _ return f(*argv, **kw) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main zodbrestore(stor, asbinstream(sys.stdin), _) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore zodbcommit(stor, at, txn) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 129, in zodbcommit _() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 97, in _ (stor.getName(), ashex(obj.oid), ashex(obj.copy_from))) ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data xtesting.go:483: /tmp/demo358030847/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1 For the implementation it would be easier to use loadAt (https://github.com/zopefoundation/ZODB/pull/323), but we don't have that yet. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!20
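The explicit handling described above can be sketched as follows. Everything here is a stand-in so that the example runs without ZODB: `POSKeyError` is a local class mimicking `ZODB.POSException.POSKeyError`, `StubStorage` is a hypothetical storage, and `load_copy_source` is a hypothetical helper name. Only the `loadBefore` contract (a tuple, `None` for a deleted object, `POSKeyError` for a missing one) and the error text are taken from the message above.

```python
import struct


class POSKeyError(KeyError):
    """Local stand-in for ZODB.POSException.POSKeyError."""


class StubStorage:
    """Hypothetical storage: oid -> data, where None means 'deleted'."""

    def __init__(self, objects):
        self._objects = objects

    def getName(self):
        return "stub.fs"

    def loadBefore(self, oid, tid):
        if oid not in self._objects:
            raise POSKeyError(oid)          # object never existed
        data = self._objects[oid]
        if data is None:
            return None                     # object was deleted at that revision
        return data, b"\x00" * 8, None      # (data, serial, next_serial)


def load_copy_source(stor, oid, copy_from):
    """Load the data to copy, turning both 'no data' cases into one clear error."""
    before = struct.pack(">Q", struct.unpack(">Q", copy_from)[0] + 1)
    try:
        xdata = stor.loadBefore(oid, before)
    except POSKeyError:
        xdata = None
    if xdata is None:
        raise ValueError("%s: object %s: copy from @%s: no data"
                         % (stor.getName(), oid.hex(), copy_from.hex()))
    data, _, _ = xdata
    return data
```

Unpacking the result only after the `None` check is what removes the `'NoneType' object is not iterable` failure shown in the first traceback.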
-
- 15 Mar, 2021 4 commits
-
-
Kirill Smelkov authored
Suggested by @jerome here: !19 (comment 129181) Co-authored-with: Jérome Perrin <jerome@nexedi.com> /reviewed-on !19
-
Kirill Smelkov authored
Zodbrestore is the long-awaited counterpart to zodbdump. The implementation is internally based on a reworked zodbcommit. For FileStorage, the restored database is verified via test to be bit-to-bit identical to the original. For NEO this won't be exactly the case, as NEO does not implement IStorageRestoreable: there is only tpc_begin(tid=...) but no restore(). /helped-by @jerome /reviewed-on nexedi/zodbtools!19
-
Kirill Smelkov authored
This current serial will not be needed on new codepaths to be added to zodbcommit in the next patch. -> Move the computation to function to trigger it only from places where knowing current serial is actually needed. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!19
-
Kirill Smelkov authored
Two-phase commit protocol assumes that tpc_begin is followed either by successful tpc_vote + tpc_finish, or by tpc_abort. We were not calling tpc_abort on an error, potentially leaving the storage in "commit is in progress" state. /reviewed-by @jerome /reviewed-on !19
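The begin / vote+finish / abort rule can be illustrated with a small sketch. It assumes a hypothetical `RecordingStorage` stand-in with a simplified `store(oid, data, txn)` signature; real ZODB storages take more arguments, and this is not the actual zodbcommit code.

```python
class RecordingStorage(object):
    """Hypothetical stand-in that records which two-phase commit calls were made."""

    def __init__(self, fail_on_vote=False):
        self.calls = []
        self.fail_on_vote = fail_on_vote

    def tpc_begin(self, txn):
        self.calls.append("tpc_begin")

    def store(self, oid, data, txn):   # simplified signature (assumption)
        self.calls.append("store")

    def tpc_vote(self, txn):
        self.calls.append("tpc_vote")
        if self.fail_on_vote:
            raise RuntimeError("vote failed")

    def tpc_finish(self, txn):
        self.calls.append("tpc_finish")

    def tpc_abort(self, txn):
        self.calls.append("tpc_abort")


def commit(stor, txn, objects):
    """Commit objects; on any error after tpc_begin, call tpc_abort and re-raise."""
    stor.tpc_begin(txn)
    try:
        for oid, data in objects:
            stor.store(oid, data, txn)
        stor.tpc_vote(txn)
        stor.tpc_finish(txn)
    except BaseException:
        # do not leave the storage in "commit is in progress" state
        stor.tpc_abort(txn)
        raise
```

With `fail_on_vote=True` the recorded calls end with `tpc_abort`, and the original exception still propagates to the caller.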
-
- 10 Mar, 2021 2 commits
-
-
Kirill Smelkov authored
Nexedi stack is dropping support for that old ZODB version - see e.g. - nexedi/slapos@70d05199 - nexedi/neoppod@3a8f6f03 - nexedi/wendelin.core@0802da2b Regarding test/gen_testdata.py: even though ZODB4 uses zodbpickle, and so should be able to load pickles encoded with protocol 3 even on python2, in practice it does not work so well: ZODB4 tests fail if I set --- a/src/ZODB/_compat.py +++ b/src/ZODB/_compat.py @@ -34,7 +34,7 @@ HIGHEST_PROTOCOL = cPickle.HIGHEST_PROTOCOL IMPORT_MAPPING = {} NAME_MAPPING = {} - _protocol = 1 + _protocol = 3 FILESTORAGE_MAGIC = b"FS21" else: # Python 3.x: can't use stdlib's pickle because -> so continue to preserve protocol < 3 when generating the test database for compatibility - now with ZODB4/py2. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!18
-
Kirill Smelkov authored
The patch that provides raw-extension functionality was merged into ZODB 5.6: https://github.com/zopefoundation/ZODB/commit/2f8cc67a3ba3 So when testing with ZODB5 >= 5.6 the tests will exercise the code path that uses txn.extension_bytes, and when testing with ZODB4 the tests will exercise the code path that works around the lack of txn.extension_bytes. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!18
-
- 02 Nov, 2020 1 commit
-
-
Kirill Smelkov authored
Nxdtest[1] is tox-like tool to run tests under Nexedi testing infrastructure. [1] https://lab.nexedi.com/nexedi/nxdtest /reviewed-on nexedi/zodbtools!17
-
- 30 Apr, 2020 1 commit
-
-
Kirill Smelkov authored
Flushing changes from yet another attempt. Still not completely there yet, but closer. Reviewed-by: @jerome Reviewed-on: nexedi/zodbtools!16
-
- 29 Apr, 2020 6 commits
-
-
Kirill Smelkov authored
ashex gives bytes, whereas reference_tid was str.
-
Kirill Smelkov authored
On Python3 dict.keys() returns a view that cannot be randomly accessed, e.g.

    In [5]: d = {1:2}

    In [6]: kv = d.keys()

    In [7]: kv
    Out[7]: dict_keys([1])

    In [8]: kv[0]
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-8-643f90e1910b> in <module>()
    ----> 1 kv[0]

    TypeError: 'dict_keys' object is not subscriptable

-> Use list(dict.keys()) in places where we need random access.
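A minimal sketch of the fix named above; the variable names are illustrative only. Wrapping the keys in `list()` gives an indexable sequence on both Python 2 and Python 3.

```python
d = {1: 2}

# list() materializes the keys, so indexing works on both py2 and py3
keys = list(d.keys())
first = keys[0]
```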
-
Kirill Smelkov authored
Otherwise it breaks with str on py3: In [1]: from io import BytesIO In [2]: BytesIO("abc") --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-2-52a130edd46d> in <module>() ----> 1 BytesIO("abc") TypeError: a bytes-like object is required, not 'str'
-
Kirill Smelkov authored
Zodbdump format is text-binary and is saved into files opened in binary mode. -> We have to emit bytes - not strings - into it, since otherwise it would break on Python3. This needs qq support from pygolang[1] so that qq can be used with both string and bytestring formats, e.g. for "hello %s" % qq(name) and b"hello %s" % qq(name) to give the same output regardless of whether name is str or bytes. [1] nexedi/pygolang!1
-
Kirill Smelkov authored
Zodbdump format is already described as semi text-binary in top-level zodbdump.py documentation. However zdump() docstring was referring to it as "text". Fix it and use binary to handle places where zdump is loaded/saved.
-
Kirill Smelkov authored
%r has different output for strings and bytes on python3:

    In [1]: a = 'hello'

    In [2]: b = b'hello'

    In [3]: repr(a)
    Out[3]: "'hello'"

    In [4]: repr(b)
    Out[4]: "b'hello'"

-> Use qq whose output is stable regardless of whether input is string or bytes.
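To illustrate the idea of quoting that does not depend on the input type, here is a rough sketch. `stable_quote` is a hypothetical stand-in written for this example. It is not pygolang's qq and it only escapes backslashes and double quotes.

```python
def stable_quote(s):
    """Hypothetical stand-in for a type-independent quote (NOT pygolang's qq).

    Assumption: input is str or UTF-8 encoded bytes; only backslash and
    double-quote are escaped here.
    """
    if isinstance(s, bytes):
        s = s.decode("utf-8", "replace")
    return '"%s"' % s.replace("\\", "\\\\").replace('"', '\\"')


quoted_str = stable_quote("hello")
quoted_bytes = stable_quote(b"hello")
```

Unlike `repr`, both calls give the same text, so dump output does not change with the type of the value being formatted.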
-
- 13 Mar, 2020 1 commit
-
-
Kirill Smelkov authored
zodbinfo: Provide "head" as command to query DB head; Turn "last_tid" into deprecated alias for head Similarly to go version: neo@151d8b79.
-
- 14 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
Starting with upcoming ZODB 5.5.2 ZODB tries to preserve `extension_bytes` transaction metadata property in the raw form as it was stored on disk in the database: https://github.com/zopefoundation/ZODB/commit/2f8cc67a

However running test/gen_testdata.py with ZODB with that patch (and gen_testdata.py refuses to work if it detects that ZODB does not properly support .extension_bytes property because we want it to be present in the generated test database [1,2]) now breaks:

    $ ./gen_testdata.py
    Traceback (most recent call last):
      File "./gen_testdata.py", line 230, in <module>
        main()
      File "./gen_testdata.py", line 224, in main
        gen_testdb("%s.fs" % dbname, zext=zext)
      File "./gen_testdata.py", line 194, in gen_testdb
        stor.tpc_begin(txn)
      File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/BaseStorage.py", line 193, in tpc_begin
        ext = transaction.extension_bytes
    AttributeError: 'Transaction' object has no attribute 'extension_bytes'

The breakage is because, as specified in ZODB interfaces[3,4], storage requires ZODB.IStorageTransactionMetaData, not transaction.ITransaction instance gen_testdata.py was using. The script used to work before just by luck. The fix is to convert transaction instance into storage transaction metadata object for the place where we talk to storage at raw level.

HOWEVER, when checking regenerated database and its dump I noticed: ZODB >= 5.4.0 uses pickle protocol 3 on both python2 and python3 https://github.com/zopefoundation/ZODB/commit/12ee41c4 In other words it saves e.g. OID of an object as pickle binary, which decodes as bytes on py3 and zodbpickle.binary on py2 when decoding via zodbpickle. However it will result in *DecodeError* when decoding on py2 with standard pickle module. The latter means that ZODB3 will _fail_ to load data from test database, because ZODB3 - contrary to ZODB4 and ZODB5 - uses std pickle module, not zodbpickle.
We still care about ZODB3 and in particular it is included into zodbtools test matrix: https://lab.nexedi.com/nexedi/zodbtools/blob/7bc0385e/tox.ini#L9-14 so we cannot break it. -> Temporarily patch ZODB at runtime to make sure it emits data with older protocol and without using zodbpickle.binary for oid, so that generated test database could be loaded on ZODB3 as well. gen_testdata.py now works with latest ZODB, but produces exactly the same bit-to-bit output as before. [1] https://lab.nexedi.com/nexedi/zodbtools/blob/7bc0385e/zodbtools/test/gen_testdata.py#L215-217 [2] https://lab.nexedi.com/nexedi/zodbtools/blob/7bc0385e/zodbtools/test/testutil.py#L31-63 [3] https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L815-L818 [4] https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L538-L575 /reviewed-on nexedi/zodbtools!15
-
- 09 Jul, 2019 1 commit
-
-
Kirill Smelkov authored
-> Use .[test] to refer to them. https://stackoverflow.com/a/41398850/9456786 /reviewed-by @jerome /reviewed-on nexedi/zodbtools!14
-
- 03 Jun, 2019 1 commit
-
-
Kirill Smelkov authored
@jerome, I was trying to make zodbtools work with Python3 and along that road picked some bits of your work from nexedi/zodbtools!12. At present the migration to Python3 is not complete, and even though I now have the answer to how to handle strings in both python2/3 in a compatible and reasonable way (I can share details if you are interested), I have to put that work on hold for some time and use https://pypi.org/project/pep3134 directly in wcfs tests, since getting all string details right, even after figuring out how to do it, will take time. Anyway the bits presented here should be ready for master and could be merged now. Could you please have a look? Thanks beforehand, Kirill /reviewed-on nexedi/zodbtools!13
-
- 24 May, 2019 5 commits
-
-
Kirill Smelkov authored
Zodbdump format is mixed text+binary so dumping to unicode stdout won't work. Based on patch by Jérome Perrin.
-
Kirill Smelkov authored
Because on Py3: def test_dumpreader(): in_ = b"""\ txn 0123456789abcdef " " user "my name" description "o la-la..." extension "zzz123 def" obj 0000000000000001 delete obj 0000000000000002 from 0123456789abcdee obj 0000000000000003 54 adler32:01234567 - obj 0000000000000004 4 sha1:9865d483bc5a94f2e30056fc256ed3066af54d04 ZZZZ obj 0000000000000005 9 crc32:52fdeac5 ABC DEF! txn 0123456789abcdf0 " " user "author2" description "zzz" extension "qqq" """ r = DumpReader(BytesIO(in_)) t1 = r.readtxn() assert isinstance(t1, Transaction) > assert t1.tid == '0123456789abcdef'.decode('hex') E AttributeError: 'str' object has no attribute 'decode' test/test_dump.py:77: AttributeError Based on patch by Jérome Perrin.
-
Kirill Smelkov authored
self = <zodbtools.util.CRC32Hasher object at 0x7f887ae465f8> def __init__(self): > self._h = crc32('') E TypeError: a bytes-like object is required, not 'str' util.py:208: TypeError Based on patch by Jérome Perrin.
-
Kirill Smelkov authored
data = 'data1' def sha1(data): m = hashlib.sha1() > m.update(data) E TypeError: Unicode-objects must be encoded before hashing zodbtools/util.py:38: TypeError Based on patch by Jérome Perrin.
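The failure above comes from passing text to `hashlib`. A minimal sketch of hashing that works on both Python 2 and Python 3 is to pass bytes; the helper below mirrors the shape of the `sha1` function in the traceback, but its return value (raw digest) is an assumption made for illustration.

```python
import hashlib


def sha1(data):
    """Return sha1 digest of data; data must be bytes."""
    m = hashlib.sha1()
    m.update(data)
    return m.digest()


digest = sha1(b"data1")   # bytes literal works on py2 and py3
```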
-
Kirill Smelkov authored
s = b'\x03\xc4\x85v\x00\x00\x00\x00' def ashex(s): > return s.encode('hex') E AttributeError: 'bytes' object has no attribute 'encode' zodbtools/util.py:29: AttributeError s.encode('hex') used to work on Py2 but fails on Py3: In [1]: s = "abc" In [2]: b = b"def" In [3]: s.encode('hex') --------------------------------------------------------------------------- LookupError Traceback (most recent call last) <ipython-input-3-75ae843597fe> in <module>() ----> 1 s.encode('hex') LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs In [4]: b.encode('hex') --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-4-ec2fccff20bc> in <module>() ----> 1 b.encode('hex') AttributeError: 'bytes' object has no attribute 'encode' In [5]: import codecs In [6]: codecs.encode(b, 'hex') Out[6]: b'646566' In [7]: codecs.encode(s, 'hex') --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /usr/lib/python3.7/encodings/hex_codec.py in hex_encode(input, errors) 14 assert errors == 'strict' ---> 15 return (binascii.b2a_hex(input), len(input)) 16 TypeError: a bytes-like object is required, not 'str' The above exception was the direct cause of the following exception: TypeError Traceback (most recent call last) <ipython-input-7-7fcb16cead4f> in <module>() ----> 1 codecs.encode(s, 'hex') TypeError: encoding with 'hex' codec failed (TypeError: a bytes-like object is required, not 'str') After the patch it works with bytes and raises for str. Fromhex does not need to be changed - it already uses codecs.decode way as originally added in dd959b28 (zodbdump += DumpReader - to read/parse zodbdump stream). Based on patch by Jérome Perrin.
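The codecs-based approach described above can be sketched as a pair of helpers. This is an illustration in the spirit of the commit message, not a copy of zodbtools/util.py; the bodies are assumptions based on the `codecs.encode` / `codecs.decode` calls shown in the session.

```python
import codecs


def ashex(s):
    """bytes -> hex-encoded bytes; on Python 3 a str argument raises TypeError."""
    return codecs.encode(s, 'hex')


def fromhex(s):
    """hex-encoded bytes -> raw bytes."""
    return codecs.decode(s, 'hex')


tid = b'\x03\xc4\x85v\x00\x00\x00\x00'
tid_hex = ashex(tid)
```

`fromhex(ashex(x))` round-trips any bytes value, which is what the dump reader and writer rely on.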
-