- 29 Mar, 2022 1 commit
-
-
Jérome Perrin authored
Showing pickle disassembly can sometimes be useful to analyse details of the pickle content. We realized that in some data structures used in ERP5 the same string was saved multiple times in the same pickle; by using the exact same string object (i.e. one for which `s1 is s2` is True), the pickle stores the string only once and is a bit smaller. For reference, the context was nexedi/erp5!1560 (comment 154825). This introduces a new --pretty option that we will be able to extend later with more output formats. Co-authored-by: Kirill Smelkov <kirr@nexedi.com> Reviewed-on: nexedi/zodbtools!22
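The effect described above can be reproduced with the standard library alone. The following is a minimal sketch, not the zodbtools implementation: the sample string and the choice of protocol 2 are assumptions made for illustration.

```python
import pickle
import pickletools

# Two equal strings that are distinct objects (built at runtime, so not shared).
s1 = "".join(["x-generator"] * 3)
s2 = "".join(["x-generator"] * 3)
assert s1 == s2 and s1 is not s2

# Same object twice: the pickler memoizes it and emits a short back-reference.
shared = pickle.dumps([s1, s1], protocol=2)
# Equal but distinct objects: the string data is written twice.
duplicated = pickle.dumps([s1, s2], protocol=2)

print(len(shared), len(duplicated))
# The disassembly shows the string stored once, then fetched from the memo.
pickletools.dis(shared)
```

Both pickles load back to equal lists; only the on-disk size differs.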
-
- 01 Apr, 2021 1 commit
-
-
Jérome Perrin authored
@kirr wrote (!19 (comment 129442)) For the reference - contrary to ZODB5, restore tests on ZODB4 are currently [broken](https://nexedijs.erp5.net/#/test_result_module/20210317-B3AC205A/2). Restored file is not bit-to-bit identical to the original. The problem is that on commit/restore, we need to save user/description/extension. For extension `zodbdump.Transaction` provides .extension_bytes, which ZODB5 uses to save its raw copy. However ZODB4 goes through `.extension` and pickles it: https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbdump.py#L425-453 https://github.com/zopefoundation/ZODB/blob/4/src/ZODB/BaseStorage.py#L220-L240 This leads to unpickle-repickle round-trip and different extension being committed on restore: ```diff diff --git a/1zdump b/2zdump index 5033bc1..a3a32aa 100644 --- a/1zdump +++ b/2zdump @@ -10,7 +10,7 @@ q^A. txn 0285cbac3d0369e6 " " user "user0.0" description "step 0.0" -extension "\x80\x02}q\x01(U\tx-cookieSU\x05RF9IEU\vx-generatorq\x02U\fzodb/py2 (f)u." +extension "}q\x01(U\tx-cookieSU\x05RF9IEU\vx-generatorU\fzodb/py2 (f)u." obj 0000000000000000 98 sha1:eba252d1984f975ecb636bc1b3a89c953dd20527 ... ``` What might save us is to somehow in Transaction.extension returns a dict-subclass object that is somehow pickled to the exact bytes remembered when it was created. However, after briefly checking, I could not find a mechanism to do so yet... @jerome wrote (!19 (comment 129479)) @kirr we already have pytest fixtures to test differently depending on whether the ZODB version has support for extension_bytes, so what about using it in the test and testing restoring the extension bytes version of the dump only for ZODB5 ? @kirr wrote (!19 (comment 129482)) @jerome, yes we have this, but I believe we should actually fix zodbrestore to be reliable whatever ZODB is used. For ZODB5 it works. For ZODB4-wc2 we can adjust ZODB code to use extension_bytes similarly to how ZODB5 does. But unpatched ZODB4 is currently out of luck. 
As it was decided that Nexedi will use both ZODB4 and ZODB4-wc2, I think we should fix zodbrestore to work on all those versions to be reliable. /cc @tomo @kirr: -> No universal ZODB4 fix for now (this would require monkey-patching ZODB in several places), so mark "restore with extension" test as xfail similarly to how we already do for "dump with extension" test. This brings -ZODB4 and -ZODB4-wc2 tests back to PASS state. Even though on ZODB4 the extension is not restored bit-to-bit exactly, it is restored as a dictionary equal to the one that was used to produce the dump. Not ideal, but still not losing the information in practice. One more reason to switch to ZODB5...
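The unpickle-repickle round-trip can be illustrated without ZODB. This is only a sketch: the dictionary content and the two pickle protocols are assumptions chosen to make the difference visible, and are not taken from the ZODB4 code.

```python
import pickle

ext = {"x-generator": "zodb/py2 (f)", "x-cookie": "RF9IE"}

# Raw bytes as they might sit on disk (here: protocol 1, no PROTO header).
raw = pickle.dumps(ext, protocol=1)

# A storage that only sees the dict has to pickle it again, possibly differently.
repickled = pickle.dumps(pickle.loads(raw), protocol=2)

print(raw == repickled)                              # the bytes differ
print(pickle.loads(raw) == pickle.loads(repickled))  # the dictionaries are equal
```

This is the same situation as in the commit message: the information survives, but the bytes are not identical.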
-
- 16 Mar, 2021 2 commits
-
-
Kirill Smelkov authored
In the previous patch we taught object copy handler to report more details, but it was still incomplete - the error was missing details about which operation was run - commit, or restore of particular transaction. It can also be noted that other errors reported from that function lack such context. -> So fix it universally, at least for zodbcommit for now: set top-level runctx to topic of what we are doing, and use that runctx when generating errors. Runctx describes what we are running, and could be also later used for logging and tracing. That's why it is called runctx instead of just errctx for "error context". TODO currently it is only exceptions that we explicitly raise which get the context. If an exception is raised by something that we call - the context won't be added. It would be good to later rework error handling and append such context for any raised error. Defer and https://lab.nexedi.com/kirr/go123/blob/863c4602/xerr/__init__.py has something preliminary for this. The particular error when restoring a missing object copy becomes ValueError: /tmp/demo002868462/δ0285cbac75555580/δ.fs: restore 0285cbacb70a3db3 @0285cbacb258bf66: object 0000000000000003: copy from @0285cbac70a3d733: no data instead of older ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data /reviewed-by @jerome /reviewed-on !20
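The idea of a run context can be sketched as follows. The function names and the list-of-strings shape of runctx are hypothetical and only illustrate how the error prefix could be assembled; they are not the actual zodbcommit code.

```python
def restore_object(runctx, oid, copy_from, has_data):
    # runctx: list of strings describing what is currently running (assumed shape).
    if not has_data:
        raise ValueError("%s: object %s: copy from @%s: no data"
                         % (": ".join(runctx), oid, copy_from))


def restore_txn(stor_name, tid, at, objs):
    # The top-level context names the storage and the operation being run.
    runctx = [stor_name, "restore %s @%s" % (tid, at)]
    for oid, copy_from, has_data in objs:
        restore_object(runctx, oid, copy_from, has_data)


try:
    restore_txn("demo.fs", "0285cbacb70a3db3", "0285cbacb258bf66",
                [("0000000000000003", "0285cbac70a3d733", False)])
except ValueError as e:
    msg = str(e)
    print(msg)
```

Only errors raised explicitly get the prefix here, which matches the limitation noted in the TODO.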
-
Kirill Smelkov authored
When zodbdump input says to copy an object, we first load that object. However if object does not exist loadBefore raises POSKeyError, and when object at copied-from revision was deleted loadBefore returns None. -> Handle that explicitly to provide failure details to the user, so that instead of cryptic === RUN TestLoad/δstart=0285cbac75555580 Traceback (most recent call last): File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module> main() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main return command_module.main(argv) File "<decorator-gen-6>", line 2, in main File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _ return f(*argv, **kw) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main zodbrestore(stor, asbinstream(sys.stdin), _) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore zodbcommit(stor, at, txn) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 122, in zodbcommit _() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 91, in _ data, _, _ = stor.loadBefore(obj.oid, p64(u64(obj.copy_from)+1)) TypeError: 'NoneType' object is not iterable xtesting.go:483: /tmp/demo009767458/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1 it fails with something more understandable: === RUN TestLoad/δstart=0285cbac75555580 Traceback (most recent call last): File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module> main() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", 
line 129, in main return command_module.main(argv) File "<decorator-gen-6>", line 2, in main File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _ return f(*argv, **kw) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main zodbrestore(stor, asbinstream(sys.stdin), _) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore zodbcommit(stor, at, txn) File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 129, in zodbcommit _() File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 97, in _ (stor.getName(), ashex(obj.oid), ashex(obj.copy_from))) ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data xtesting.go:483: /tmp/demo358030847/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1 For the implementation it would be easier to use loadAt (https://github.com/zopefoundation/ZODB/pull/323), but we don't have that yet. /reviewed-by @jerome /reviewed-on nexedi/zodbtools!20
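A minimal sketch of the explicit handling, assuming a stand-in storage: `FakeStorage` and `load_for_copy` are hypothetical names, and `KeyError` stands in for POSKeyError so that the example runs without ZODB installed.

```python
class FakeStorage:
    """Hypothetical stand-in for a ZODB storage; not the real API surface."""

    def __init__(self, revisions):
        self._revisions = revisions      # {oid: data, or None if deleted}

    def getName(self):
        return "fake.fs"

    def loadBefore(self, oid, tid):
        if oid not in self._revisions:
            raise KeyError(oid)          # stand-in for POSKeyError
        data = self._revisions[oid]
        if data is None:                 # object was deleted at that revision
            return None
        return data, b"\0" * 8, None     # (data, serial, next_serial)


def load_for_copy(stor, oid, before):
    try:
        xdata = stor.loadBefore(oid, before)
    except KeyError:
        xdata = None
    if xdata is None:
        # Report a readable error instead of failing on tuple unpacking.
        raise ValueError("%s: object %r: copy from before %r: no data"
                         % (stor.getName(), oid, before))
    data, _, _ = xdata
    return data


stor = FakeStorage({b"\0" * 7 + b"\x01": b"payload", b"\0" * 7 + b"\x02": None})
print(load_for_copy(stor, b"\0" * 7 + b"\x01", b"\xff" * 8))
```

Both failure modes (missing object and deleted revision) end up in the same ValueError.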
-
- 15 Mar, 2021 4 commits
-
-
Kirill Smelkov authored
Suggested by @jerome here: nexedi/zodbtools!19 (comment 129181) Co-authored-with: Jérome Perrin <jerome@nexedi.com> /reviewed-on nexedi/zodbtools!19
-
Kirill Smelkov authored
Zodbrestore is the long-overdue counterpart to zodbdump. The implementation is internally based on the reworked zodbcommit. For FileStorage, the restored database is verified via test to be bit-to-bit identical to the original. For NEO this won't be exactly the case, as NEO does not implement IStorageRestoreable: there is only tpc_begin(tid=...) but no restore(). /helped-by @jerome /reviewed-on !19
-
Kirill Smelkov authored
This current serial will not be needed on new codepaths to be added to zodbcommit in the next patch. -> Move the computation to function to trigger it only from places where knowing current serial is actually needed. /reviewed-by @jerome /reviewed-on !19
-
Kirill Smelkov authored
Two-phase commit protocol assumes that after tpc_begin, it will be either successful tpc_vote + tpc_finish, or tpc_abort. We were not calling tpc_abort on an error, potentially leaving storage in "commit is in progress" state on an error. /reviewed-by @jerome /reviewed-on !19
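The pattern can be sketched with a stub storage that records the calls it receives. `RecordingStorage` and `commit` are hypothetical names for illustration; the real zodbcommit does more work between tpc_begin and tpc_vote.

```python
class RecordingStorage:
    """Hypothetical storage stub that records which TPC calls were made."""

    def __init__(self, fail_on_vote=False):
        self.calls = []
        self.fail_on_vote = fail_on_vote

    def tpc_begin(self, txn):
        self.calls.append("tpc_begin")

    def tpc_vote(self, txn):
        self.calls.append("tpc_vote")
        if self.fail_on_vote:
            raise RuntimeError("vote failed")

    def tpc_finish(self, txn):
        self.calls.append("tpc_finish")

    def tpc_abort(self, txn):
        self.calls.append("tpc_abort")


def commit(stor, txn):
    stor.tpc_begin(txn)
    try:
        stor.tpc_vote(txn)
        stor.tpc_finish(txn)
    except Exception:
        # Do not leave the storage in "commit is in progress" state.
        stor.tpc_abort(txn)
        raise


ok = RecordingStorage()
commit(ok, txn=None)
print(ok.calls)
```

On the failing path the error is still propagated to the caller after the abort.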
-
- 10 Mar, 2021 2 commits
-
-
Kirill Smelkov authored
Nexedi stack is dropping support for that old ZODB version - see e.g. - slapos@70d05199 - neoppod@3a8f6f03 - wendelin.core@0802da2b Regarding test/gen_testdata.py: even though ZODB4 uses zodbpickle, and so should be able to load pickles encoded with protocol 3 even on python2, in practice it does not work so well: ZODB4 tests fail if I set --- a/src/ZODB/_compat.py +++ b/src/ZODB/_compat.py @@ -34,7 +34,7 @@ HIGHEST_PROTOCOL = cPickle.HIGHEST_PROTOCOL IMPORT_MAPPING = {} NAME_MAPPING = {} - _protocol = 1 + _protocol = 3 FILESTORAGE_MAGIC = b"FS21" else: # Python 3.x: can't use stdlib's pickle because -> so continue to preserve protocol < 3 when generating the test database for compatibility - now with ZODB4/py2. /reviewed-by @jerome /reviewed-on !18
-
Kirill Smelkov authored
The patch that provides raw-extension functionality was merged into ZODB 5.6: https://github.com/zopefoundation/ZODB/commit/2f8cc67a3ba3 So when testing with ZODB5 >= 5.6 the tests will exercise the code path that uses txn.extension_bytes, and when testing with ZODB4 the tests will exercise the code path that works around the lack of txn.extension_bytes. /reviewed-by @jerome /reviewed-on !18
-
- 02 Nov, 2020 1 commit
-
-
Kirill Smelkov authored
Nxdtest[1] is tox-like tool to run tests under Nexedi testing infrastructure. [1] https://lab.nexedi.com/nexedi/nxdtest /reviewed-on !17
-
- 30 Apr, 2020 1 commit
-
-
Kirill Smelkov authored
Flushing changes from yet another attempt. Still not completely there yet, but closer. Reviewed-by: @jerome Reviewed-on: !16
-
- 29 Apr, 2020 6 commits
-
-
Kirill Smelkov authored
ashex gives bytes, whereas reference_tid was str.
-
Kirill Smelkov authored
The sequence cannot be randomly accessed, e.g. In [5]: d = {1:2} In [6]: kv = d.keys() In [7]: kv Out[7]: dict_keys([1]) In [8]: kv[0] --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-8-643f90e1910b> in <module>() ----> 1 kv[0] TypeError: 'dict_keys' object is not subscriptable -> Use list(dict.keys()) in places where we need random access.
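For illustration, the same dictionary as in the session above, with the keys materialized into a list before indexing:

```python
d = {1: 2}

# A real list supports random access on both Python 2 and Python 3.
kv = list(d.keys())
first = kv[0]
print(first)
```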
-
Kirill Smelkov authored
Otherwise it breaks with str on py3: In [1]: from io import BytesIO In [2]: BytesIO("abc") --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-2-52a130edd46d> in <module>() ----> 1 BytesIO("abc") TypeError: a bytes-like object is required, not 'str'
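A small sketch of a form that works on both Python versions; whether the actual patch used bytes literals or an explicit encode is an assumption here.

```python
from io import BytesIO

# A bytes literal is accepted by BytesIO on both Python 2 and Python 3.
buf = BytesIO(b"abc")
data = buf.read()

# Text has to be encoded explicitly before it is wrapped.
buf2 = BytesIO(u"abc".encode("utf-8"))
data2 = buf2.read()
print(data, data2)
```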
-
Kirill Smelkov authored
Zodbdump format is text-binary and is saved into files opened in binary mode. -> We have to emit bytes - not strings - into it, since otherwise on Python3 it would break. This needs qq support from pygolang[1] to be able to use qq with both string and bytestring format, e.g. for "hello %s" % qq(name), and b"hello %s" % qq(name) to give the same output regardless of whether name is str or bytes. [1] pygolang!1
-
Kirill Smelkov authored
Zodbdump format is already described as semi text-binary in top-level zodbdump.py documentation. However zdump() docstring was referring to it as "text". Fix it and use binary to handle places where zdump is loaded/saved.
-
Kirill Smelkov authored
%r has different output for strings and bytes on python3: In [1]: a = 'hello' In [2]: b = b'hello' In [3]: repr(a) Out[3]: "'hello'" In [4]: repr(b) Out[4]: "b'hello'" -> Use qq whose output is stable regardless of whether input is string or bytes.
-
- 13 Mar, 2020 1 commit
-
-
Kirill Smelkov authored
zodbinfo: Provide "head" as command to query DB head; Turn "last_tid" into deprecated alias for head Similarly to go version: neo@151d8b79.
-
- 14 Feb, 2020 1 commit
-
-
Kirill Smelkov authored
Starting with upcoming ZODB 5.5.2 ZODB tries to preserve `extension_bytes` transaction metadata property in the raw form as it was stored on disk in the database: https://github.com/zopefoundation/ZODB/commit/2f8cc67a However when running test/gen_testdata.py with ZODB with that patch (and gen_testdata.py refuses to work if it detects that ZODB does not properly support .extension_bytes property because we want it to be present in the generated test database [1,2]) it now breaks: $ ./gen_testdata.py Traceback (most recent call last): File "./gen_testdata.py", line 230, in <module> main() File "./gen_testdata.py", line 224, in main gen_testdb("%s.fs" % dbname, zext=zext) File "./gen_testdata.py", line 194, in gen_testdb stor.tpc_begin(txn) File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/BaseStorage.py", line 193, in tpc_begin ext = transaction.extension_bytes AttributeError: 'Transaction' object has no attribute 'extension_bytes' The breakage is because, as specified in ZODB interfaces[3,4], storage requires ZODB.IStorageTransactionMetaData, not transaction.ITransaction instance gen_testdata.py was using. The script used to work before just by luck. The fix is to convert transaction instance into storage transaction metadata object for the place where we talk to storage at raw level. HOWEVER, when checking regenerated database and its dump I noticed: ZODB >= 5.4.0 uses pickle protocol 3 on both python2 and python3 https://github.com/zopefoundation/ZODB/commit/12ee41c4 In other words it saves e.g. OID of an object as pickle binary, which decodes as bytes on py3 and zodbpickle.binary on py2 when decoding via zodbpickle. However it will result in *DecodeError* when decoding on py2 with standard pickle module. The latter means that ZODB3 will _fail_ to load data from test database, because ZODB3 - contrary to ZODB4 and ZODB5 - uses std pickle module, not zodbpickle. 
We still care about ZODB3 and in particular it is included into zodbtools test matrix: https://lab.nexedi.com/nexedi/zodbtools/blob/7bc0385e/tox.ini#L9-14 so we cannot break it. -> Temporarily patch ZODB at runtime to make sure it emits data with older protocol and without using zodbpickle.binary for oid, so that generated test database could be loaded on ZODB3 as well. gen_testdata.py now works with latest ZODB, but produces exactly the same bit-to-bit output as before. [1] https://lab.nexedi.com/nexedi/zodbtools/blob/7bc0385e/zodbtools/test/gen_testdata.py#L215-217 [2] https://lab.nexedi.com/nexedi/zodbtools/blob/7bc0385e/zodbtools/test/testutil.py#L31-63 [3] https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L815-L818 [4] https://github.com/zopefoundation/ZODB/blob/5.5.1-35-gb5895a5c2/src/ZODB/interfaces.py#L538-L575 /reviewed-on !15
-
- 09 Jul, 2019 1 commit
-
-
Kirill Smelkov authored
-> Use .[test] to refer to them. https://stackoverflow.com/a/41398850/9456786 /reviewed-by @jerome /reviewed-on !14
-
- 03 Jun, 2019 1 commit
-
-
Kirill Smelkov authored
@jerome, I was trying to make zodbtools work with Python3 and along that road picked some bits of your work from !12. At present the migration to Python3 is not complete, and even though now I have the answer to how to handle strings in both python2/3 in a compatible and reasonable way (I can share details if you are interested), I have to put that work on hold for some time and use https://pypi.org/project/pep3134 directly in wcfs tests, since getting all string details right, even after figuring out how to do it, will take time. Anyway the bits presented here should be ready for master and could be merged now. Could you please have a look? Thanks beforehand, Kirill /reviewed-on !13
-
- 24 May, 2019 8 commits
-
-
Kirill Smelkov authored
Zodbdump format is mixed text+binary so dumping to unicode stdout won't work. Based on patch by Jérome Perrin.
-
Kirill Smelkov authored
Because on Py3: def test_dumpreader(): in_ = b"""\ txn 0123456789abcdef " " user "my name" description "o la-la..." extension "zzz123 def" obj 0000000000000001 delete obj 0000000000000002 from 0123456789abcdee obj 0000000000000003 54 adler32:01234567 - obj 0000000000000004 4 sha1:9865d483bc5a94f2e30056fc256ed3066af54d04 ZZZZ obj 0000000000000005 9 crc32:52fdeac5 ABC DEF! txn 0123456789abcdf0 " " user "author2" description "zzz" extension "qqq" """ r = DumpReader(BytesIO(in_)) t1 = r.readtxn() assert isinstance(t1, Transaction) > assert t1.tid == '0123456789abcdef'.decode('hex') E AttributeError: 'str' object has no attribute 'decode' test/test_dump.py:77: AttributeError Based on patch by Jérome Perrin.
-
Kirill Smelkov authored
self = <zodbtools.util.CRC32Hasher object at 0x7f887ae465f8> def __init__(self): > self._h = crc32('') E TypeError: a bytes-like object is required, not 'str' util.py:208: TypeError Based on patch by Jérome Perrin.
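A minimal sketch of an incremental CRC32 hasher seeded with a bytes literal. The class below only borrows the name from the traceback; its methods are assumptions for illustration and not the zodbtools implementation.

```python
import zlib


class CRC32Hasher:
    """Sketch of an incremental CRC32 hasher (hypothetical interface)."""

    def __init__(self):
        # b'' is accepted on both Python 2 and Python 3; '' fails on Python 3.
        self._h = zlib.crc32(b"")

    def update(self, data):
        # Passing the running value continues the checksum over more data.
        self._h = zlib.crc32(data, self._h)

    def hexdigest(self):
        return "%08x" % (self._h & 0xffffffff)


h = CRC32Hasher()
h.update(b"ABC")
h.update(b" DEF")
print(h.hexdigest())
```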
-
Kirill Smelkov authored
data = 'data1' def sha1(data): m = hashlib.sha1() > m.update(data) E TypeError: Unicode-objects must be encoded before hashing zodbtools/util.py:38: TypeError Based on patch by Jérome Perrin.
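For illustration, the helper from the traceback fed with bytes. Returning the raw digest and encoding the text as UTF-8 at the call site are assumptions of this sketch, since the original return statement is not shown.

```python
import hashlib


def sha1(data):
    # data must be bytes; text has to be encoded explicitly before hashing.
    m = hashlib.sha1()
    m.update(data)
    return m.digest()


digest = sha1(u"data1".encode("utf-8"))
print(len(digest))
```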
-
Kirill Smelkov authored
s = b'\x03\xc4\x85v\x00\x00\x00\x00' def ashex(s): > return s.encode('hex') E AttributeError: 'bytes' object has no attribute 'encode' zodbtools/util.py:29: AttributeError s.encode('hex') used to work on Py2 but fails on Py3: In [1]: s = "abc" In [2]: b = b"def" In [3]: s.encode('hex') --------------------------------------------------------------------------- LookupError Traceback (most recent call last) <ipython-input-3-75ae843597fe> in <module>() ----> 1 s.encode('hex') LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs In [4]: b.encode('hex') --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-4-ec2fccff20bc> in <module>() ----> 1 b.encode('hex') AttributeError: 'bytes' object has no attribute 'encode' In [5]: import codecs In [6]: codecs.encode(b, 'hex') Out[6]: b'646566' In [7]: codecs.encode(s, 'hex') --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /usr/lib/python3.7/encodings/hex_codec.py in hex_encode(input, errors) 14 assert errors == 'strict' ---> 15 return (binascii.b2a_hex(input), len(input)) 16 TypeError: a bytes-like object is required, not 'str' The above exception was the direct cause of the following exception: TypeError Traceback (most recent call last) <ipython-input-7-7fcb16cead4f> in <module>() ----> 1 codecs.encode(s, 'hex') TypeError: encoding with 'hex' codec failed (TypeError: a bytes-like object is required, not 'str') After the patch it works with bytes and raises for str. Fromhex does not need to be changed - it already uses codecs.decode way as originally added in dd959b28 (zodbdump += DumpReader - to read/parse zodbdump stream). Based on patch by Jérome Perrin.
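A sketch of the codecs-based helpers that the message describes, under the assumption that both take and return bytes; the function bodies are reconstructed from the session above rather than copied from zodbtools/util.py.

```python
import codecs


def ashex(s):
    # bytes -> hex-encoded bytes; passing str raises TypeError on Python 3.
    return codecs.encode(s, "hex")


def fromhex(s):
    # hex-encoded bytes -> raw bytes
    return codecs.decode(s, "hex")


tid = b"\x03\xc4\x85v\x00\x00\x00\x00"
h = ashex(tid)
print(h)
```

The two helpers are inverses of each other, so `fromhex(ashex(tid))` gives back the original bytes.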
-
Kirill Smelkov authored
There is no cStringIO on Python3: test_dump.py:26: in <module> from cStringIO import StringIO E ModuleNotFoundError: No module named 'cStringIO' Based on patch by Jérome Perrin.
-
Jérome Perrin authored
This makes zodb command driver tests added in the previous patch to pass on both python2 and python3.
-
Jérome Perrin authored
---- kirr: factor running `zodb ...` into zodbrun + add test for `zodb -h`. Added test currently passes on py2, but fails on py3: out = <_io.TextIOWrapper encoding='UTF-8'> def usage(out): print("""\ Zodb is a tool for managing ZODB databases. Usage: zodb command [arguments] The commands are: """, file=out) cmdv = command_dict.keys() > cmdv.sort() E AttributeError: 'dict_keys' object has no attribute 'sort' zodbtools/zodb.py:55: AttributeError It will be fixed in the next patch.
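One way to make this work on both Python versions is `sorted()`, which returns a new list; whether the next patch uses exactly this form is an assumption, and the dictionary content below is made up for the example.

```python
# Hypothetical command table; only the keys matter for this example.
command_dict = {"dump": None, "analyze": None, "info": None}

# sorted() accepts any iterable, including a dict_keys view on Python 3.
cmdv = sorted(command_dict.keys())
print(cmdv)
```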
-
- 07 Mar, 2019 1 commit
-
-
Jérome Perrin authored
-
- 31 Jan, 2019 6 commits
-
-
Jérome Perrin authored
-
Jérome Perrin authored
---- kirr: use logging as log and keep emitting warnings on one line.
-
Jérome Perrin authored
This silences a warning about \w being an unknown escape sequence ---- kirr: preserved _obj_re definition to be on 1 line.
-
Jérome Perrin authored
-
Jérome Perrin authored
until https://github.com/zopefoundation/ZODB/pull/183 gets merged, let's run also the tests for this, since we have support for this extension.
-
Jérome Perrin authored
also simplify a bit definition as ZODB is common in all versions ---- kirr: - cover only last 2 py3 releases: 3.6 and 3.7 for now (3.8 is not yet released) - separate ZODB3 as it supports only python2. Py3 tests are failing for now and we'll be getting them to pass incrementally - step by step.
-
- 30 Jan, 2019 3 commits
-
-
Jérome Perrin authored
Using dateparser to support absolute and relative dates in natural language. /reviewed-by @kirr /reviewed-on !8
-
Jérome Perrin authored
-
Jérome Perrin authored
-