- 24 Apr, 2017 3 commits
-
-
Julien Muchembled authored
The election is no longer a separate process: it happens during the RECOVERING phase, and timeouts are not used anymore. Each master node keeps a timestamp of when it started to play the primary role, and the node with the smallest timestamp is elected. The election stops when the cluster is started: as long as it is operational, the primary master can't be deposed.

An election must happen whenever the cluster is not operational anymore, to handle the case of a network cut between the primary master and all other nodes: another master node (secondary) then takes over, and when the initial primary master is back, it loses against the new primary master if the cluster is already started.
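A minimal sketch of the election rule described above, in Python; the MasterNode record and elect_primary helper are illustrative names, not NEO's actual classes:

from collections import namedtuple

# Each master remembers when it started to play the primary role.
MasterNode = namedtuple('MasterNode', 'id started_playing_primary')

def elect_primary(masters):
    # The node with the smallest start timestamp wins (ties broken by id).
    return min(masters, key=lambda m: (m.started_playing_primary, m.id))

# Example: the node that took the role earliest keeps it.
nodes = [MasterNode('m1', 1493035200.5), MasterNode('m2', 1493035201.2)]
assert elect_primary(nodes).id == 'm1'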
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 31 Mar, 2017 2 commits
-
-
Julien Muchembled authored
This is a follow-up of commit 64afd7d2, which focused on read accesses when there is no transaction activity. This commit also includes a test for a simpler scenario than the one described in the previous commit.
-
Julien Muchembled authored
After an attempt to read from a non-readable cell, which happens when a client has a newer or older partition table than the storage's, the client now retries the read. This bugfix covers all kinds of read access except undoLog, which can still report incomplete results.
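A minimal sketch of the retry behaviour described above, assuming hypothetical load_from_cell and refresh_partition_table helpers; these names are illustrative, not NEO's client API:

class NonReadableCell(Exception):
    # Raised when the contacted cell is not readable for this partition.
    pass

def load_with_retry(client, oid, max_retries=3):
    for _ in range(max_retries):
        try:
            return client.load_from_cell(oid)
        except NonReadableCell:
            # The client's partition table is newer or older than the
            # storage's: fetch an up-to-date one and try again.
            client.refresh_partition_table()
    raise NonReadableCell('still unreadable after %d retries' % max_retries)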
-
- 21 Feb, 2017 1 commit
-
-
Julien Muchembled authored
This is a first version with several optimizations possible:
- improve EventQueue (or implement a specific queue) to minimize deadlocks
- turn the RebaseObject packet into a notification

Sorting oids could also be useful to reduce the probability of deadlocks, but that would never be enough to avoid them completely, even if there's a single storage. For example:
1. C1 does a first store (x or y)
2. C2 stores x and y; one is delayed
3. C1 stores the other -> deadlock

When solving the deadlock, the data of the first store may only exist on the storage.

2 functional tests are removed because they're redundant, either with ZODB tests or with the new threaded tests.
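As an aside on the sorting idea above, here is a small illustration (not NEO code; storage.lock and storage.store are hypothetical) of locking objects in a globally consistent order. It removes the classic "each side holds one lock and wants the other" pattern, but cannot prevent the 3-step scenario above, because C1's two stores are separate requests:

def store_all(storage, txn, oids):
    # Locking in sorted oid order makes the lock order identical for all
    # clients that send their stores together, reducing deadlocks.
    for oid in sorted(oids):
        storage.lock(txn, oid)
        storage.store(txn, oid)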
-
- 14 Feb, 2017 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 02 Feb, 2017 1 commit
-
-
Julien Muchembled authored
Now that we do inequality comparisons between timestamps, the master must use a monotonic clock, to avoid issues when the clock is turned back. Before, the probability that time.time() returned the same value twice was negligible.
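A minimal sketch of a monotonically non-decreasing clock built on time.time(), which can jump backwards when the system clock is adjusted; this is illustrative only, not NEO's actual implementation:

import time

class MonotonicClock(object):
    def __init__(self):
        self._last = time.time()

    def __call__(self):
        now = time.time()
        if now <= self._last:
            # The wall clock went back or did not advance: never return a
            # value smaller than or equal to one already handed out.
            now = self._last + 1e-6
        self._last = now
        return now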
-
- 25 Nov, 2016 1 commit
-
-
Julien Muchembled authored
-
- 29 Aug, 2016 1 commit
-
-
Julien Muchembled authored
-
- 22 Mar, 2016 1 commit
-
-
Julien Muchembled authored
-
- 04 Mar, 2016 1 commit
-
-
Julien Muchembled authored
Before this change, a storage node did 3 commits per transaction:
- once all data are stored
- when locking the transaction
- when unlocking the transaction

The last one is not important for ACID: in case of a crash, the transaction is simply unlocked again during the verification phase. By deferring it by 1 second, we only have 2 commits per transaction during high activity, because all pending changes are merged with the commits caused by other transactions.

This change compensates for the extra commit(s) per transaction that were introduced in commit 7eb7cf1b ("Minimize the amount of work during tpc_finish").
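A minimal sketch of deferring the non-critical unlock commit so that it can be merged with commits triggered by other transactions; the 1-second delay and the commit callable are assumptions for illustration only:

import threading

class DeferredCommitter(object):
    def __init__(self, commit, delay=1.0):
        self._commit = commit      # flushes all pending changes to disk
        self._delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def commit_later(self):
        with self._lock:
            if self._timer is None:
                # Nothing scheduled yet: commit once, a bit later, so that
                # changes from other transactions get merged into it.
                self._timer = threading.Timer(self._delay, self._flush)
                self._timer.start()

    def _flush(self):
        with self._lock:
            self._timer = None
        self._commit()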
-
- 01 Dec, 2015 1 commit
-
-
Julien Muchembled authored
With the previous commit, the request to truncate the DB was not stored persistently, which means that this operation was still vulnerable to the case where the master is restarted after some nodes, but not all, have already truncated. The master didn't have the information to fix this and the result was a partially truncated DB.

-> On a Truncate packet, a storage node only stores the tid somewhere, to send it back to the master, which stays in RECOVERING state as long as any node has a different value than that of the node with the latest partition table.

We also want to make sure that there is no unfinished data, because a user may truncate at a tid higher than a locked one.

-> Truncation is now effective at the end of the VERIFYING phase, just before returning the last ids to the master.

Lastly, all nodes should be truncated, to avoid an offline node coming back with a different history. Currently, this would not be an issue since replication always restarts from the beginning, but later we'd like nodes to remember where they stopped replicating.

-> If a truncation is requested, the master waits for all nodes to be pending, even if the cluster was previously started (the user can still force the cluster to start with neoctl). And any node lost during verification also causes the master to go back to recovery.

Obviously, the protocol has been changed to split the LastIDs packet and introduce a new Recovery one, since it does not make sense anymore to ask for last ids during recovery.
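A minimal sketch of the recovery-side check described above: the master stays in RECOVERING as long as any storage reports a truncation tid different from that of the node with the latest partition table. The Storage record and its fields are hypothetical names, not NEO's actual classes:

from collections import namedtuple

Storage = namedtuple('Storage', 'name pt_id truncate_tid')

def truncation_complete(storages):
    reference = max(storages, key=lambda s: s.pt_id)  # latest partition table
    return all(s.truncate_tid == reference.truncate_tid for s in storages)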
-
- 30 Nov, 2015 1 commit
-
-
Julien Muchembled authored
NEO did not ensure that all data and metadata were written to disk before tpc_finish, and it was for example vulnerable to ENOSPC errors. In other words, some work had to be moved to tpc_vote:
- In tpc_vote, all involved storage nodes are now asked to write all metadata to ttrans/tobj and _commit_. Because the final tid is not known yet, the tid column of ttrans and tobj now contains NULL and the ttid respectively.
- In tpc_finish, AskLockInformation is still required for read locking; ttrans.tid is updated with the final value and this change is _committed_.
- The verification phase is greatly simplified, more reliable and faster. For all voted transactions, we can know if a tpc_finish was started by getting the final tid from the ttid, either from ttrans or from trans. And we know that such transactions can't be partial, so we don't need to check oids.

So in addition to minimizing the risk of failures during tpc_finish, we also fix a bug causing the verification phase to discard transactions with objects for which readCurrent was called.

On the performance side:
- Although tpc_vote now asks all involved storages, instead of only those storing the transaction metadata, the client has been improved to do this in parallel. The additional commits are also all done in parallel.
- A possible improvement to compensate for the additional commits is to delay the commit done by the unlock.
- By minimizing the time to lock transactions, objects are read-locked for a much shorter period. This is all the more important as locked transactions must be unlocked in the same order.

Transactions with too many modified objects will now time out inside tpc_vote instead of tpc_finish. Of course, such transactions may still cause other transactions to time out in tpc_finish.
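A minimal sketch of the vote/finish split described above, with hypothetical helper names (write_metadata, set_final_tid, commit); it only mirrors the ordering of the durable writes, not NEO's real protocol:

def tpc_vote(storages, ttid, metadata):
    # Every involved storage durably records the transaction under its
    # temporary id (ttid); an ENOSPC error or crash is caught here, not
    # during tpc_finish.
    for s in storages:
        s.write_metadata(ttid, metadata)  # final tid still unknown (NULL)
        s.commit()

def tpc_finish(storages, ttid, tid):
    # Only a small, already-prepared update remains: bind the final tid.
    for s in storages:
        s.set_final_tid(ttid, tid)
        s.commit()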
-
- 25 Nov, 2015 1 commit
-
-
Julien Muchembled authored
This is a workaround to fix holes if replication is interrupted after new data is committed.
-
- 29 Oct, 2015 1 commit
-
-
Julien Muchembled authored
-
- 05 Oct, 2015 1 commit
-
-
Julien Muchembled authored
-
- 24 Sep, 2015 1 commit
-
-
Julien Muchembled authored
-
- 23 Sep, 2015 1 commit
-
-
Julien Muchembled authored
There remains only one leak, in ClientApplicationTests.test_connectToPrimaryNode, because of Mock objects.
-
- 15 Sep, 2015 1 commit
-
-
Julien Muchembled authored
-
- 28 Aug, 2015 1 commit
-
-
Julien Muchembled authored
Deadlocks mainly happened while stopping a cluster, hence the complete review of NEOCluster.stop(). A major change is to make the client node handle its lock like other nodes (i.e. in the polling thread itself), to better know when to call Serialized.background() (there was a race condition with the test of 'self.poll_thread.isAlive()' in ClientApplication.close).
-
- 14 Aug, 2015 1 commit
-
-
Julien Muchembled authored
For example, a backup storage node that was rejected because the upstream cluster was not ready could reconnect in a loop without delay, using 100% CPU and flooding the logs. A new 'setReconnectionNoDelay' method on Connection can be used for cases where it is legitimate to reconnect quickly. With this new delayed reconnection, it is possible to remove the remaining time.sleep().
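A minimal sketch of delayed reconnection, assuming a hypothetical connect callable; beyond the setReconnectionNoDelay name quoted above, nothing here claims to match NEO's Connection class:

import time

class Reconnector(object):
    def __init__(self, connect, delay=1.0):
        self._connect = connect
        self._delay = delay
        self._no_delay = False

    def setReconnectionNoDelay(self):
        # For cases where it is legitimate to reconnect immediately.
        self._no_delay = True

    def reconnect(self):
        if not self._no_delay:
            time.sleep(self._delay)  # avoid busy-looping at 100% CPU
        self._no_delay = False
        return self._connect()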
-
- 12 Aug, 2015 1 commit
-
-
Julien Muchembled authored
-
- 13 Jul, 2015 1 commit
-
-
Julien Muchembled authored
-
- 10 Jul, 2015 1 commit
-
-
Julien Muchembled authored
-
- 25 Jul, 2014 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
-
- 20 Jun, 2014 1 commit
-
-
Julien Muchembled authored
Export:
- Remove the leftover warning about a bug that was fixed in commit e76af297.
- In the neomigrate script, open the NEO storage read-only.
- IStorageIteration is already implemented.

Import:
- Review comments.
- In the neomigrate script, warn that IStorageRestoreable is not implemented.
- Do not call the 'close' method on the source iterator. BaseStorage does not do it and this is not part of the ZODB API. In the case of FileStorage, resources are freed automatically during garbage collection.
-
- 03 Jun, 2014 1 commit
-
-
Julien Muchembled authored
One entry should have been removed before v1.1
-
- 29 May, 2014 1 commit
-
-
Julien Muchembled authored
-
- 07 Jan, 2014 1 commit
-
-
Julien Muchembled authored
If anything wrong happens after a transaction is locked and before the end of onTransactionCommitted, the recovery phase should be run again, so that the master gets the correct last tid. The following patch by Vincent is an attempt to fix this:

--- a/neo/master/app.py
+++ b/neo/master/app.py
@@ -329,8 +329,8 @@ def playPrimaryRole(self):
         # recover the cluster status at startup
         try:
-            self.runManager(RecoveryManager)
             while True:
+                self.runManager(RecoveryManager)
                 self.runManager(VerificationManager)
                 try:
                     if self.backup_tid:
@@ -338,10 +338,6 @@ def playPrimaryRole(self):
                             raise RuntimeError("No upstream cluster to backup"
                                 " defined in configuration")
                         self.backup_app.provideService()
-                        # Reset connection with storages (and go through a
-                        # recovery phase) when leaving backup mode in order
-                        # to get correct last oid/tid.
-                        self.runManager(RecoveryManager)
                         continue
                     self.provideService()
                 except OperationFailure:
-
- 23 Aug, 2012 1 commit
-
-
Julien Muchembled authored
-
- 20 Aug, 2012 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
- catch OperationFailure
- reset the transaction manager when leaving backup mode
- send the appropriate target tid to a storage that updates an outdated cell
- clean up the partition table when leaving BACKINGUP state unexpectedly
- make sure all readable cells of a partition have the same 'backup_tid' if they have the same data, so that we know when internal replication is finished when leaving backup mode
- fix a storage not finishing internal replication when leaving backup mode
-
- 16 Aug, 2012 1 commit
-
-
Julien Muchembled authored
-
- 15 Aug, 2012 1 commit
-
-
Julien Muchembled authored
-
- 10 Aug, 2012 1 commit
-
-
Julien Muchembled authored
SQL tables can be upgraded using:

UPDATE config SET name = 'nid' WHERE name = 'uuid';

then, for MySQL:

ALTER TABLE pt CHANGE uuid nid INT NOT NULL;

or SQLite:

ALTER TABLE pt RENAME TO old_pt;
CREATE TABLE pt (rid INTEGER NOT NULL, nid INTEGER NOT NULL, state INTEGER NOT NULL, PRIMARY KEY (rid, nid));
INSERT INTO pt SELECT * FROM old_pt;
DROP TABLE old_pt;
-
- 23 Jul, 2012 2 commits
-
-
Julien Muchembled authored
-
Julien Muchembled authored
This completely changes how data that is not too recent is fetched from storages: NEO now behaves as expected by ZODB, instead of trying to snapshot at the Storage level. However, ZODB should probably be changed to avoid double loading when an invalidation is received during a load.
-