- 01 Nov, 2020 21 commits
-
-
Kirill Smelkov authored
-
Kirill Smelkov authored
This protocol version corresponds to the protocol version used by NEO/py v1.12 and was set in NEO/py commit c6453626 (Bump protocol version). The protocol definition was updated to match that NEO/py release in the previous patches.
-
Kirill Smelkov authored
This corresponds to NEO/py commit 2a27239d (tweak: add option to simulate).
-
Kirill Smelkov authored
This corresponds to NEO/py commit c2c9e99d (Better error reporting from the master to neoctl for denied requests).
-
Kirill Smelkov authored
This corresponds to NEO/py commit 21190ee7 (Make 'neoctl print pt' report the number of replicas).
-
Kirill Smelkov authored
This corresponds to NEO/py commit ef5fc508 (Make the number of replicas modifiable when the cluster is running). One important change in the protocol is that Client no longer queries Master for the partition table - instead M pushes partTab to C right after identification (after pushing nodeTab). See also: https://neo.nexedi.com/P-NEO-Protocol.Specification.2019?portal_skin=CI_slideshow#/9/5
-
Kirill Smelkov authored
This corresponds to NEO/py commit 27e3f620 (New --new-nid storage option for fast cloning).
-
Kirill Smelkov authored
NEO 1.12 * tag 'v1.12': (28 commits) Release version 1.12 master: reject drop/tweak ctl commands that could lead to unwanted status qa: extend test reproducing the migration of a big ZODB to NEO neoctl: better display of full partition tables Bump protocol version tweak: add option to simulate tweak: do not crash when trying to remove all nodes tweak: do not touch cells of nodes that are intended to be dropped Better error reporting from the master to neoctl for denied requests Make 'neoctl print pt' report the number of replicas Make the number of replicas modifiable when the cluster is running New --new-nid storage option for fast cloning qa: fix 2 tests with ZODB5 qa: new tools/stress options to evaluate MySQL engines qa: provide a way to let tests start 1 mysqld per storage node mysql: make 'user' actually optional in the DB connection string mysql: specify column families for RocksDB qa: add testIncremental (testImporter) test importer: fix hidden "maximum recursion depth exceeded" at startup importer: fix closure of ZODB, and also do it when the import is finished sqlite: fix resumption of migration to NEO with Importer qa: fix a random failure in threaded tests importer: speed up startup when the import is already finished importer: fix replication (as source) once import is finished storage: fix DatabaseManager.getLastTID with max_tid qa: remove 2 useless unit tests storage: allow the master to change our node id Rename --uuid command-line options into --nid importer: fix possible data loss on writeback
-
Kirill Smelkov authored
-
Kirill Smelkov authored
This protocol version corresponds to the protocol version used by NEO/py v1.11 and was set in NEO/py commit 9a5b46dd (Bump protocol version). The protocol definition was updated to match that NEO/py release in the previous patch.
-
Kirill Smelkov authored
This corresponds to NEO/py commit 64826794 (New neoctl command to flush the logs of all nodes in the cluster).
-
Kirill Smelkov authored
NEO 1.11 * tag 'v1.11': (52 commits) Release version 1.11 Fix short descriptions of neoctl & neomigrate in their headers Update copyright year qa: new tool to stress-test NEO master: fix typo in comment Fix error handling when setting up a listening connector Fix incomplete/incorrect mapping of node ids in logs Fix log corruption on rotation in multi-threaded applications (e.g. client) sqlite: optimize storage of metadata neolog: do not die when a table is corrupted neolog: add support for zstd-compressed logs neolog: do not hardcode default value of -L option in help message fixup! New log format to show node id (and optionally cluster name) in node column New log format to show node id (and optionally cluster name) in node column fixup! client: discard late answers to lockless writes client: fix race condition between Storage.load() and invalidations client: fix race condition in refcounting dispatched answer packets More RTMIN+2 (log) information for clients and connections storage: check for conflicts when notifying that the a partition is replicated storage: clarify several assertions qa: new expectedFailure testcase method client: merge ConnectionPool inside Application client: prepare merge of ConnectionPool inside Application client: fix AssertionError when trying to reconnect too quickly after an error qa: fix attributeTracker storage: fix storage leak when an oid is stored several times within a transaction client: discard late answers to lockless writes qa: in threaded tests, log queued packets when "tic is looping forever" In logs, dump the partition table in a more compact and readable way storage: fix write-locking bug when a deadlock happens at the end of a replication client: log_flush most exceptions raised from Application to ZODB client: fix assertion failure in case of conflict + storage disconnection client: simplify connection management in transaction contexts client: also vote to nodes that only check serials qa: deindent code Bump protocol 
version client: fix undetected disconnections to storage nodes during commit Fix data corruption due to undetected conflicts after storage failures master: notify replicating nodes of aborted watched transactions New neoctl command to flush the logs of all nodes in the cluster storage: fix premature write-locking during rebase when replication ends client: fix race condition when a storage connection is closed just after identification storage: relax assertion comments, unused import storage: fix write-lock leak client: fix possible corruption in case of network failure with a storage qa: comment about potential freeze when a functional test ends storage: fix assertion failure in case of connection reset with a client node qa: document a rare random failure in testExport debug: add script to trace all accesses to the client cache Use argparse instead of optparse neolog: use argparse instead of optparse Add comment about dormant bug when sending a lot of data to a slow node client: make clearer that max_size attribute is used from outside ClientCache
-
Kirill Smelkov authored
-
Kirill Smelkov authored
This protocol version corresponds to the protocol version used by NEO/py v1.10. The protocol definition was updated to match that NEO/py release in the previous patches.
-
Kirill Smelkov authored
This corresponds to NEO/py commit 97af23cc (Maximize resiliency by taking into account the topology of storage nodes).
-
Kirill Smelkov authored
- Rename GetObject .Tid -> .Before
- Rename GetObject .Serial -> .At
- Sync docstrings

This corresponds to NEO/py commit 9f0f2afe (protocol: update packet docstrings).
-
Kirill Smelkov authored
This corresponds to NEO/py commit 52db5607 ("protocol: a single byte is more than enough to encode enums").
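The size effect of this change can be illustrated with a minimal Go sketch. The `NodeType` enum and both encoder functions are hypothetical, and the 4-byte "before" form is only an assumption made for comparison; this is not the actual NEO/go encoder.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// NodeType is a hypothetical enum used only for illustration.
type NodeType int

const (
	Master NodeType = iota
	Storage
	Client
	Admin
)

// encodeEnum32 encodes v as a 4-byte big-endian integer
// (assumed "before" form, for comparison only).
func encodeEnum32(v NodeType) []byte {
	b := make([]byte, 4)
	binary.BigEndian.PutUint32(b, uint32(v))
	return b
}

// encodeEnum8 encodes v as a single byte; this is enough as long as
// the enum has at most 256 members.
func encodeEnum8(v NodeType) []byte {
	return []byte{byte(v)}
}

func main() {
	fmt.Println(len(encodeEnum32(Client)), len(encodeEnum8(Client)))
}
```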
-
Kirill Smelkov authored
Don't skip a code when going request1->request2 through `Request1 Answer1 Request2`. For example before this patch:

    1              RequestIdentification
    1 | answerBit  AcceptIdentification
    3              Ping
    3 | answerBit  Pong
    ...

after this patch:

    1              RequestIdentification
    1 | answerBit  AcceptIdentification
    2              Ping
    2 | answerBit  Pong
    ...

This corresponds to NEO/py commit a00ab78b ("protocol: small cleanup in packet registration").
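The numbering scheme described in this message can be sketched as follows. The `pkt` type, the `register` helper and the concrete value of `answerBit` are assumptions made for the example, not the actual NEO/go registration code.

```go
package main

import "fmt"

// answerBit marks a packet code as an answer.
// The concrete value is an assumption made for this sketch.
const answerBit = 0x8000

// pkt describes one registered packet: its name and its wire code.
type pkt struct {
	name string
	code uint16
}

// register assigns codes to request/answer pairs. Each request takes the
// next sequential code; its answer (if any) reuses that code with answerBit
// set, so an answer does not consume a code of its own.
func register(pairs [][2]string) []pkt {
	var out []pkt
	code := uint16(1) // starts at 1, as in the listing of the commit message
	for _, p := range pairs {
		out = append(out, pkt{p[0], code})
		if p[1] != "" {
			out = append(out, pkt{p[1], code | answerBit})
		}
		code++
	}
	return out
}

func main() {
	for _, p := range register([][2]string{
		{"RequestIdentification", "AcceptIdentification"},
		{"Ping", "Pong"},
	}) {
		fmt.Printf("%#04x %s\n", p.code, p.name)
	}
}
```

With this scheme Ping gets code 2 rather than 3, because the preceding answer shares the code of its request.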
-
Kirill Smelkov authored
Corresponds to NEO/py commit b3dd6973 ("Optimize resumption of replication by starting from a greater TID").
-
Kirill Smelkov authored
Corresponds to NEO/py commit 3efbbfe3 ("master: automatically discard feeding cells that get out-of-date").
-
Kirill Smelkov authored
NEO 1.10 * tag 'v1.10': (55 commits) Release version 1.10 Maximize resiliency by taking into account the topology of storage nodes storage: also commit updated cell TID at each replicated chunk of 'obj' records storage: skip useless work when unlocking transactions qa: flush logs at the end of each test when -L is not used qa: add a log in case that a mysterious bug happens again storage: clarify log about data deletion of discarded cells debug: new example to run the profiler for 1 minute mysql: fix replication of big oids (> 16M) tests/cluster: speedup waiting a bit protocol: update packet docstrings Bump protocol version protocol: a single byte is more than enough to encode enums protocol: small cleanup in packet registration Optimize resumption of replication by starting from a greater TID importer: update comment about a workaround for ZODB3 Micro-optimization of p64/u64 qa: add a log in testBackupNodeLost for easier debugging Document that the bug when checking replicas may also cause the master to crash storage: stop logging 'Abort TXN' for txn that have been locked storage: split _migrate2() for reusable _alterTable() qa: new testStorageUpgrade qa: update testStorageUpgrade data for what is not automatically upgraded qa: original data for the future testStorageUpgrade sqlite: fix indexes of upgraded db importer: fix NameError when recovering during tpc_finish fixup! importer: fetch and process the data to import in a separate process Serialize empty transaction extension with an empty string client: fix partial import from a source storage qa: give a title to subprocesses of functional tests importer: give a title to the 'import' and 'writeback' subprocesses importer: fetch and process the data to import in a separate process importer: new option to write back new transactions to the source database importer: log when the transaction index for FileStorage DB is built importer: open imported zodb in read-only whenever possible fixup! 
mysql: fix remaining places where a server disconnection was not catched fixup! storage: speed up replication by sending bigger network packets mysql: do not full-scan for duplicates of big oids if deduplication is disabled mysql: fix remaining places where a server disconnection was not catched fixup! Add support for custom compression levels importer: reenable compression by default qa: review testImporter qa: remove a few uses of 'chr' Fix a few issues with ZODB5 importer: small code cleanup in speedupFileStorageTxnLookup patch importer: do not trigger speedupFileStorageTxnLookup uselessly Add support for custom compression levels setup: update MANIFEST.in importer: do not checksum data twice client: store uncompressed if compressed size is equal fixup! master: automatically discard feeding cells that get out-of-date master: automatically discard feeding cells that get out-of-date qa: remove useless indentation in testSafeTweak bench: new option to mesure ZEO perfs in matrix test bench: reduce number of partitions in matrix test storage: fix replication of creation undone
-
- 16 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
ZEO4 does not have msgpack support and does not take $ZEO_MSGPACK into account. With ZEO4 this test was failing before:

    --- FAIL: TestHandshake (0.46s)
        --- FAIL: TestHandshake/py/msgpack=true (0.24s)
            zeo_test.go:241: handshake: encoding=Z ; want M

We don't have infrastructure to check Python package versions, so check it by verifying that ZEO.asyncio is present.
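One way to probe for a Python module from Go is to try importing it in a subprocess. The sketch below is only an illustration of that idea: the `hasPyModule` helper and the `python` interpreter name are assumptions, and the actual test may perform the check differently.

```go
package main

import (
	"fmt"
	"os/exec"
)

// hasPyModule reports whether `<python> -c "import <module>"` succeeds.
// It returns false if the interpreter is missing or the import fails.
func hasPyModule(python, module string) bool {
	return exec.Command(python, "-c", "import "+module).Run() == nil
}

func main() {
	// "python" as the interpreter name is an assumption; adjust to the environment.
	if hasPyModule("python", "ZEO.asyncio") {
		fmt.Println("ZEO.asyncio found: msgpack handshake can be tested")
	} else {
		fmt.Println("ZEO.asyncio not found: skip the msgpack=true case")
	}
}
```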
-
- 12 Oct, 2020 1 commit
-
-
Kirill Smelkov authored
In that case at0 was initialized as 0 and still considered uninitialized by flushEventq0:

    (neo) (z-dev) (g.env) kirr@deco:~/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo$ go test -run Empty
    ------
    2020-10-12T07:39:25 INFO ZEO.runzeo (146240) opening storage '1' using FileStorage
    ------
    2020-10-12T07:39:25 INFO ZEO.StorageServer StorageServer created RW with storages: 1:RW:/tmp/zeo905263273/1.fs
    ------
    2020-10-12T07:39:25 INFO ZEO.asyncio.server listening on /tmp/zeo905263273/1.fs.zeosock
    ------
    2020-10-12T07:39:25 INFO ZEO.asyncio.base Connected server protocol
    ------
    2020-10-12T07:39:25 INFO ZEO.asyncio.server received handshake 'Z5'
    2020/10/12 07:39:25 /tmp/zeo905263273/1.fs.zeosock: EOF
    --- FAIL: TestEmptyDB (0.22s)
        --- FAIL: TestEmptyDB/py/msgpack=false (0.22s)
    panic: flush, but .at0 not yet initialized [recovered]
            panic: flush, but .at0 not yet initialized

    goroutine 7 [running]:
    testing.tRunner.func1.1(0x644a60, 0x6e1a50)
            /home/kirr/src/tools/go/go/src/testing/testing.go:1072 +0x30d
    testing.tRunner.func1(0xc000001e00)
            /home/kirr/src/tools/go/go/src/testing/testing.go:1075 +0x41a
    panic(0x644a60, 0x6e1a50)
            /home/kirr/src/tools/go/go/src/runtime/panic.go:969 +0x175
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.(*zeo).flushEventq0(0xc00018a000)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo.go:180 +0xf3
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.openByURL(0x6e9ca0, 0xc000016108, 0xc000138120, 0xc000153d98, 0x0, 0x0, 0x0, 0x0, 0x0)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo.go:488 +0x5ba
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.zeoOpen(0xc000018740, 0x1e, 0xc000049d98, 0x0, 0x0, 0x0, 0x0)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo_test.go:285 +0x17b
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.withZEO.func1(0xc000001e00, 0x6e9ea0, 0xc00005e6c0)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo_test.go:219 +0xd0
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.withZEOSrv.func2.1(0xc0000185c0, 0x16)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo_test.go:205 +0xfb
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.withZEOSrv.func1(0xc000001e00, 0xc00000e5a0)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo_test.go:185 +0x129
    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo.withZEOSrv.func2(0xc000001e00)
            /home/kirr/src/neo/src/lab.nexedi.com/kirr/neo/go/zodb/storage/zeo/zeo_test.go:197 +0x105
    testing.tRunner(0xc000001e00, 0xc00000e440)
            /home/kirr/src/tools/go/go/src/testing/testing.go:1123 +0xef
    created by testing.(*T).Run
            /home/kirr/src/tools/go/go/src/testing/testing.go:1168 +0x2b3
    exit status 2
    FAIL    lab.nexedi.com/kirr/neo/go/zodb/storage/zeo     0.227s

-> Fix it by using a dedicated field that marks whether .at0 has been initialized.
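The idea of the fix, using an explicit flag instead of treating at0 == 0 as "not set", can be sketched like this. The `conn` type, its field names and `setAt0` are hypothetical and heavily reduced; only the panic message and the `flushEventq0` name are taken from the trace.

```go
package main

import "fmt"

// Tid is a stand-in for a ZODB transaction ID; 0 is a valid value
// for the head of an empty database.
type Tid uint64

// conn is a hypothetical, reduced connection state used only for this sketch.
type conn struct {
	at0            Tid   // database head at open time; may legitimately be 0
	at0Initialized bool  // set once at0 has been assigned
	eventq0        []Tid // events queued before at0 is known
}

// setAt0 records the database head and marks it as initialized.
func (c *conn) setAt0(at Tid) {
	c.at0 = at
	c.at0Initialized = true
}

// flushEventq0 hands over the queued events and empties the queue.
// It panics only if at0 was never set, not merely because at0 == 0.
func (c *conn) flushEventq0() []Tid {
	if !c.at0Initialized {
		panic("flush, but .at0 not yet initialized")
	}
	q := c.eventq0
	c.eventq0 = nil
	return q
}

func main() {
	c := &conn{eventq0: []Tid{1, 2}}
	c.setAt0(0) // empty database: at0 is 0 but initialized
	fmt.Println(len(c.flushEventq0()))
}
```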
-
- 24 Sep, 2020 2 commits
-
-
Kirill Smelkov authored
For virtio NICs /sys/class/net/<NIC>/device leads to $pcidev/virtioX, not just $pcidev, e.g.:

    $ realpath /sys/class/net/ens3/device
    /sys/devices/pci0000:00/0000:00:03.0/virtio0

and we were extracting virtio0 instead of 0000:00:03.0 as the PCI device identifier.

-> Fix it by recognizing and stripping the /virtioX suffix.
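The suffix handling can be illustrated with a small Go sketch. It is written in Go purely for illustration, independent of the language of the actual fix; `pciDevID` is a hypothetical helper and the second path in `main` is made up for the example.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// virtioRe matches a trailing /virtioN path component.
var virtioRe = regexp.MustCompile(`/virtio[0-9]+$`)

// pciDevID returns the PCI device identifier for a resolved
// /sys/class/net/<NIC>/device path, ignoring a trailing /virtioN component.
func pciDevID(devPath string) string {
	devPath = virtioRe.ReplaceAllString(devPath, "")
	return path.Base(devPath)
}

func main() {
	// virtio NIC: the path from the commit message.
	fmt.Println(pciDevID("/sys/devices/pci0000:00/0000:00:03.0/virtio0"))
	// non-virtio NIC: hypothetical path without the suffix.
	fmt.Println(pciDevID("/sys/devices/pci0000:00/0000:00:19.0"))
}
```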
-
Kirill Smelkov authored
For example under KVM it was failing as

    cpu: Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz
    cat: /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq: No such file or directory
      File "<string>", line 1
        print '%.2fGHz' % ( / 1E6)
                            ^
    SyntaxError: invalid syntax

    $ lscpu
    Architecture:        x86_64
    CPU op-mode(s):      32-bit, 64-bit
    Byte Order:          Little Endian
    Address sizes:       40 bits physical, 48 bits virtual
    CPU(s):              40
    On-line CPU(s) list: 0-39
    Thread(s) per core:  1
    Core(s) per socket:  1
    Socket(s):           40
    NUMA node(s):        1
    Vendor ID:           GenuineIntel
    CPU family:          6
    Model:               63
    Model name:          Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz
    Stepping:            2
    CPU MHz:             2494.238
    BogoMIPS:            4988.47
    Virtualization:      VT-x
    Hypervisor vendor:   KVM
    Virtualization type: full
    L1d cache:           32K
    L1i cache:           32K
    L2 cache:            4096K
    L3 cache:            16384K
    NUMA node0 CPU(s):   0-39
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat

-> Fix it by checking whether cpufreq/cpuidle directories are available, and display "?" if they are not.
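The fallback behaviour can be sketched in Go as below. This is an illustration only, not the code of the fix: `freqGHz` is a hypothetical helper that reads one cpufreq file (a value in kHz, hence the division by 1e6 as in the failing snippet) and returns "?" when the file is unavailable.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// freqGHz reads a cpufreq file holding a frequency in kHz and formats it
// in GHz. It returns "?" if the file is missing or cannot be parsed,
// e.g. under a hypervisor that does not expose cpufreq.
func freqGHz(file string) string {
	data, err := os.ReadFile(file)
	if err != nil {
		return "?"
	}
	khz, err := strconv.ParseFloat(strings.TrimSpace(string(data)), 64)
	if err != nil {
		return "?"
	}
	return fmt.Sprintf("%.2fGHz", khz/1e6)
}

func main() {
	fmt.Println(freqGHz("/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq"))
}
```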
-
- 03 Aug, 2020 2 commits
-
-
Kirill Smelkov authored
Starting from Go1.14 defer is no longer slow: https://golang.org/doc/go1.14#runtime
-
Kirill Smelkov authored
Starting from Go1.14 defer is no longer slow: https://golang.org/doc/go1.14#runtime
-
- 14 Jul, 2020 13 commits
-
-
Kirill Smelkov authored
This makes the zeo tests pass. The bug is reported upstream here: https://github.com/zopefoundation/ZODB/issues/318
-
Kirill Smelkov authored
-
Kirill Smelkov authored
If xres fails to convert to a tuple, we should report the error about xres, not about res, which is always tuple(nil) in that case.
-
Kirill Smelkov authored
Receive invalidations from the server and send the corresponding events to watchq. Take care to send only events with tid > at0, where at0 is what we initially returned when opening. Tests pass, but they need https://github.com/zopefoundation/ZEO/pull/160
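The "only events with tid > at0" rule amounts to a simple filter. A minimal sketch, with a hypothetical `eventsAfter` helper and a stand-in `Tid` type rather than the real watch code:

```go
package main

import "fmt"

// Tid is a stand-in for a ZODB transaction ID.
type Tid uint64

// eventsAfter returns the tids strictly greater than at0, preserving order.
// Events at or before at0 are already covered by the state returned at open.
func eventsAfter(tids []Tid, at0 Tid) []Tid {
	var out []Tid
	for _, tid := range tids {
		if tid > at0 {
			out = append(out, tid)
		}
	}
	return out
}

func main() {
	fmt.Println(eventsAfter([]Tid{3, 5, 7}, 5)) // prints [7]
}
```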
-
Kirill Smelkov authored
This will be used in the next patch to handle invalidateTransaction messages.
-
Kirill Smelkov authored
ZEO5 adds a way for messages to be encoded via either pickles or MessagePack. However until now we were always using pickles. Let's add msgpack support to be able to e.g. use the wire encoding that the server prefers.

MsgPack support is almost fully localized in encoding. We use tinylib/msgp runtime routines to decode/encode msg fields with known types, and shamaton/msgpack to decode/encode msg.arg, which is interface{}, because msgp does not generally work for arbitrary reflections.

For msgpack=true, the state of the tests is the same as with pickles: handshake works, but load fails when verifying that Load returns the correct error for a deleted object:

    TestLoad/py/msgpack=false: xtesting.go:272: load 0285cbacc06d3a4c:0000000000000007: returned err unexpected:
        have: /tmp/zeo170183943/1.fs.zeosock: load 0285cbacc06d3a4c:0000000000000007: 0000000000000007: no such object
        want: /tmp/zeo170183943/1.fs.zeosock: load 0285cbacc06d3a4c:0000000000000007: 0000000000000007: object was deleted @0285cbacc06d3a4c
    TestLoad/py/msgpack=false: xtesting.go:272: load 0285cbad858bf2e6:0000000000000006: returned err unexpected:
        have: /tmp/zeo170183943/1.fs.zeosock: load 0285cbad858bf2e6:0000000000000006: 0000000000000006: no such object
        want: /tmp/zeo170183943/1.fs.zeosock: load 0285cbad858bf2e6:0000000000000006: 0000000000000006: object was deleted @0285cbad858bf2e6
    TestLoad/py/msgpack=false: xtesting.go:290: load 7fffffffffffffff:0000000000000007: returned err unexpected:
        have: /tmp/zeo170183943/1.fs.zeosock: load 7fffffffffffffff:0000000000000007: 0000000000000007: no such object
        want: /tmp/zeo170183943/1.fs.zeosock: load 7fffffffffffffff:0000000000000007: 0000000000000007: object was deleted @0285cbacc06d3a4c
    TestLoad/py/msgpack=false: xtesting.go:290: load 7fffffffffffffff:0000000000000006: returned err unexpected:
        have: /tmp/zeo170183943/1.fs.zeosock: load 7fffffffffffffff:0000000000000006: 0000000000000006: no such object
        want: /tmp/zeo170183943/1.fs.zeosock: load 7fffffffffffffff:0000000000000006: 0000000000000006: object was deleted @0285cbad858bf2e6
    TestLoad/py/msgpack=true: xtesting.go:272: load 0285cbacc06d3a4c:0000000000000007: returned err unexpected:
        have: /tmp/zeo247652538/1.fs.zeosock: load 0285cbacc06d3a4c:0000000000000007: 0000000000000007: no such object
        want: /tmp/zeo247652538/1.fs.zeosock: load 0285cbacc06d3a4c:0000000000000007: 0000000000000007: object was deleted @0285cbacc06d3a4c
    TestLoad/py/msgpack=true: xtesting.go:272: load 0285cbad858bf2e6:0000000000000006: returned err unexpected:
        have: /tmp/zeo247652538/1.fs.zeosock: load 0285cbad858bf2e6:0000000000000006: 0000000000000006: no such object
        want: /tmp/zeo247652538/1.fs.zeosock: load 0285cbad858bf2e6:0000000000000006: 0000000000000006: object was deleted @0285cbad858bf2e6
    TestLoad/py/msgpack=true: xtesting.go:290: load 7fffffffffffffff:0000000000000007: returned err unexpected:
        have: /tmp/zeo247652538/1.fs.zeosock: load 7fffffffffffffff:0000000000000007: 0000000000000007: no such object
        want: /tmp/zeo247652538/1.fs.zeosock: load 7fffffffffffffff:0000000000000007: 0000000000000007: object was deleted @0285cbacc06d3a4c
    TestLoad/py/msgpack=true: xtesting.go:290: load 7fffffffffffffff:0000000000000006: returned err unexpected:
        have: /tmp/zeo247652538/1.fs.zeosock: load 7fffffffffffffff:0000000000000006: 0000000000000006: no such object
        want: /tmp/zeo247652538/1.fs.zeosock: load 7fffffffffffffff:0000000000000006: 0000000000000006: object was deleted @0285cbad858bf2e6

This is due to https://github.com/zopefoundation/ZODB/issues/318
-
Kirill Smelkov authored
It is already documented in the pktDecodeZ comment that flags is int|bool. However until now we were decoding it only as int. So far this happened to work, but it will fail when receiving e.g. an invalidateTransaction message, because the ZEO/py server actually uses bool when sending it: https://github.com/zopefoundation/ZEO/blob/5.2.1-20-gcb26281d/src/ZEO/asyncio/base.py#L139 Fix it. This will be covered by watch tests once watch support is added in a followup patch.
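Accepting both representations is a matter of a type switch on the decoded value. A minimal sketch, assuming the decoder hands over flags as `interface{}`; `decodeFlags` is a hypothetical helper, and mapping true/false to 1/0 is an assumption that mirrors Python's bool-as-int behaviour.

```go
package main

import "fmt"

// decodeFlags accepts message flags encoded either as an integer or as a bool.
// A bool is mapped to 1/0 (assumption of this sketch).
func decodeFlags(xflags interface{}) (int64, error) {
	switch v := xflags.(type) {
	case int64:
		return v, nil
	case int:
		return int64(v), nil
	case bool:
		if v {
			return 1, nil
		}
		return 0, nil
	}
	return 0, fmt.Errorf("flags: unexpected type %T", xflags)
}

func main() {
	for _, x := range []interface{}{int64(0), true, "oops"} {
		flags, err := decodeFlags(x)
		fmt.Println(flags, err)
	}
}
```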
-
Kirill Smelkov authored
Handle pickled lists as valid input when decoding tuples.
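A minimal sketch of such a conversion is shown below. `Tuple` is a local stand-in for whatever tuple type the pickle decoder produces, and `asTuple` is a hypothetical helper; neither is taken from the actual code.

```go
package main

import "fmt"

// Tuple is a local stand-in for a decoded pickle tuple.
type Tuple []interface{}

// asTuple converts x to a Tuple. Both a tuple and a plain list
// ([]interface{}) are accepted; anything else is rejected.
func asTuple(x interface{}) (Tuple, bool) {
	switch v := x.(type) {
	case Tuple:
		return v, true
	case []interface{}:
		return Tuple(v), true
	}
	return nil, false
}

func main() {
	tup, ok := asTuple([]interface{}{1, "a"})
	fmt.Println(len(tup), ok)
}
```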
-
Kirill Smelkov authored
-
Kirill Smelkov authored
This is more uniform and will be followed by all data types in the next patch. Here rename tid/oid pack/unpack routines correspondingly. Add docstrings for X=tid|oid.
-
Kirill Smelkov authored
Keep information about which message encoding is used on the wire in the encoding type. Make pktDecode/pktEncode and the data type conversion utilities methods of this type. For now there is only the 'pickles' encoding, but we'll soon introduce 'msgpack'. Currently not everything related to pickles is localized in encoding - we'll be moving more bits to encoding in the followup patches.
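The shape of such a type can be sketched as follows. Everything here is an assumption made for illustration: the 'Z'/'M' values mirror the handshake letters that appear in test logs elsewhere in this history, the `tidPack` method body and its return representations are guesses, and the msgpack case is included only to show how a second encoding would slot in.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encoding identifies the wire encoding of messages (sketch).
type encoding byte

const (
	pickles encoding = 'Z'
	msgpack encoding = 'M'
)

// tidPack converts a transaction ID to its wire representation:
// 8 big-endian bytes, carried as a string for pickles and as a byte
// slice for msgpack (both representations are assumptions).
func (e encoding) tidPack(tid uint64) interface{} {
	var b [8]byte
	binary.BigEndian.PutUint64(b[:], tid)
	switch e {
	case pickles:
		return string(b[:])
	case msgpack:
		return b[:]
	}
	panic(fmt.Sprintf("invalid encoding %q", byte(e)))
}

func main() {
	fmt.Printf("%T %T\n", pickles.tidPack(1), msgpack.tidPack(1))
}
```

Keeping the conversions as methods means callers never branch on the encoding themselves; they only carry the `encoding` value obtained at handshake time.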
-
Kirill Smelkov authored
-
Kirill Smelkov authored
First step:

- move msg and msgFlags
- move pktDecode
- move functions to pack/unpack tid and oid

more to come.
-