- 28 Aug, 2015 5 commits
Julien Muchembled authored
This fixes a random failure in testClientReconnection:

    Traceback (most recent call last):
      File "neo/tests/threaded/test.py", line 754, in testClientReconnection
        self.assertTrue(cluster.client.history(x1._p_oid))
    failureException: None is not true
Julien Muchembled authored
    Traceback (most recent call last):
      File "neo/tests/threaded/test.py", line 838, in testRecycledClientUUID
        x = client.load(ZERO_TID)
      [...]
      File "neo/tests/threaded/test.py", line 822, in notReady
        m2s.remove(delayNotifyInformation)
      File "neo/tests/threaded/__init__.py", line 482, in remove
        del self.filter_dict[filter]
    KeyError: <function delayNotifyInformation at 0x7f511063a578>
Julien Muchembled authored
NEOCluster.tic() gets a new 'slave' parameter that must be True when a client node is in 'master' mode (i.e. setPoll(True)). In this case, tic() waits until all nodes have finished their work and the client polls with a non-zero timeout.

Here, tic(slave=1) is used to wait for the storage to process the NotifyUnlockInformation notification from the master.

    Traceback (most recent call last):
      File "neo/tests/threaded/test.py", line 80, in testBasicStore
        self.assertEqual(data_info, cluster.storage.getDataLockInfo())
      File "neo/tests/__init__.py", line 170, in assertEqual
        return super(NeoTestBase, self).assertEqual(first, second, msg=msg)
    failureException: {('\x0b\xee\xc7\xb5\xea?\x0f\xdb\xc9]\r\xd4\x7f<[\xc2u\xda\x8a3', 0): 0} != {('\x0b\xee\xc7\xb5\xea?\x0f\xdb\xc9]\r\xd4\x7f<[\xc2u\xda\x8a3', 0): 1}
Julien Muchembled authored
All these changes were useful to debug deadlocks in threaded tests:
- New verbose Semaphore.
- Logs with numerical 'ident' were too annoying to read, so revert to thread names (as before commit 5b69d553), with an exception for threaded tests. There remains one case where the result is not unique: when several client apps are instantiated.
- Make deadlock detection optional.
- Make it possible to name locks.
- Make output more compact.
- Remove useless 'debug_lock' option.
- Add timing information.
- Make the exception more verbose when an unacquired lock is released.

Here is how I used 'locking':

    --- a/neo/tests/threaded/__init__.py
    +++ b/neo/tests/threaded/__init__.py
    @@ -37,0 +38 @@
    +from neo.lib.locking import VerboseSemaphore
    @@ -71 +72,2 @@ def init(cls):
    -        cls._global_lock = threading.Semaphore(0)
    +        cls._global_lock = VerboseSemaphore(0, check_owner=False,
    +            name="Serialized._global_lock")
    @@ -265 +267,2 @@ def start(self):
    -        self.em._lock = l = threading.Semaphore(0)
    +        self.em._lock = l = VerboseSemaphore(0, check_owner=False,
    +            name=self.node_name)
    @@ -346 +349,2 @@ def __init__(self, master_nodes, name, **kw):
    -        self.em._lock = threading.Semaphore(0)
    +        self.em._lock = VerboseSemaphore(0, check_owner=False,
    +            name=repr(self))
Julien Muchembled authored
Deadlocks mainly happened while stopping a cluster, hence the complete review of NEOCluster.stop().

A major change is to make the client node handle its lock like other nodes (i.e. in the polling thread itself), to better know when to call Serialized.background() (there was a race condition with the test of 'self.poll_thread.isAlive()' in ClientApplication.close).
- 14 Aug, 2015 2 commits
Julien Muchembled authored
Julien Muchembled authored
For example, a backup storage node that was rejected because the upstream cluster was not ready could reconnect in a loop without any delay, using 100% CPU and flooding the logs.

A new 'setReconnectionNoDelay' method on Connection can be used for cases where it is legitimate to reconnect quickly.

With this new delayed reconnection, it is possible to remove the remaining time.sleep().
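The delayed-reconnection behaviour described above can be sketched as a small backoff timer. This is a hypothetical illustration, not NEO's actual code; the class and method names (`ReconnectionTimer`, `set_no_delay`, `next_delay`) are invented for the example, mirroring the role of `setReconnectionNoDelay` on Connection:

```python
class ReconnectionTimer:
    """Sketch of delayed reconnection with a fast-retry override
    (hypothetical names; NEO's real logic lives in its Connection class)."""

    def __init__(self, max_delay=32):
        self._max_delay = max_delay
        self._delay = 0  # seconds to wait before the next attempt

    def set_no_delay(self):
        # A legitimate fast reconnection (e.g. the peer asked us to retry):
        # reset the backoff so the next attempt happens immediately.
        self._delay = 0

    def next_delay(self):
        """Return the delay for the next attempt, then double the backoff."""
        delay = self._delay
        self._delay = min(self._max_delay, max(1, self._delay * 2))
        return delay
```

Without the exponential backoff, a rejected node would retry in a tight loop; with it, repeated failures space the attempts out up to the cap.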
- 12 Aug, 2015 16 commits
Julien Muchembled authored
Such a test has never helped to detect regressions, and any bug in EpollEventManager would quickly be reported by other tests. testConnection may go the same way if it keeps annoying me too much.
Julien Muchembled authored
This is currently not an issue because the 'time.sleep(1)' calls in iterateForObject (storage) and _connectToPrimaryNode (master) leave enough time. What could happen is a new connection attempt for a node that already has a connection, causing an assertion failure in Node.setConnection.
Julien Muchembled authored
This could happen if a file descriptor was reallocated by the kernel.
Julien Muchembled authored
Julien Muchembled authored
Julien Muchembled authored
With this patch, the epolling object is no longer woken up every second to check whether a timeout has expired. The API of Connection is changed so that the smallest pending timeout can be retrieved.
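The idea of such a tickless loop is to compute the poll timeout from the earliest pending deadline instead of waking up on a fixed tick. A minimal sketch, assuming each connection exposes an absolute deadline; the helper name `smallest_timeout` is invented for the example:

```python
def smallest_timeout(deadlines, now):
    """Return how long a poll call may block: the time until the earliest
    absolute deadline, clamped at 0 if one already expired, or -1
    (block indefinitely) when no timeout is pending."""
    if not deadlines:
        return -1
    return max(0, min(deadlines) - now)

# Usage with an epoll-style loop (sketch):
#   timeout = smallest_timeout(
#       [c.deadline for c in connections if c.deadline is not None],
#       time.time())
#   events = epoll.poll(timeout)
```

A timeout of -1 makes epoll block until an event arrives, so an idle process consumes no CPU at all between events.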
Julien Muchembled authored
Julien Muchembled authored
Julien Muchembled authored
This is a prerequisite for tickless poll loops.
Julien Muchembled authored
Julien Muchembled authored
This mainly changes several methods to lock automatically instead of asserting that the caller did it. This removes any overhead for non-MT classes, and the use of 'with' statements instead of lock/unlock method pairs also simplifies the API.
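The style described above can be illustrated with a toy thread-safe container (a hypothetical example, not NEO code): each public method takes the lock itself via 'with', so callers need no locking discipline and the lock is released even if an exception is raised:

```python
import threading

class MTDict:
    """Toy example of methods that lock automatically (hypothetical name)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._dict = {}

    def set(self, key, value):
        with self._lock:  # acquired here, released on return or exception
            self._dict[key] = value

    def pop(self, key, default=None):
        with self._lock:
            return self._dict.pop(key, default)
```

A single-threaded variant simply omits the lock; callers use the same API either way, which is how the MT and non-MT classes can share an interface without overhead.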
Julien Muchembled authored
Julien Muchembled authored
Shutdown is implicit because we don't duplicate sockets.
Julien Muchembled authored
Julien Muchembled authored
- For all threads except the main one, the id is displayed instead of the name, because the latter is not always unique.
- Outputs may be interleaved by concurrent threads, so tracebacks are also prefixed by their idents.
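The naming rule above (main thread by its well-known name, any other thread by its numeric ident, which is unique among live threads) could be sketched like this; the helper `thread_label` is hypothetical, not the actual NEO logging code:

```python
import threading

def thread_label():
    """Return a label suitable for log prefixes: the main thread keeps its
    unique name, other threads are shown by their numeric ident."""
    t = threading.current_thread()
    if t is threading.main_thread():
        return t.name
    return str(t.ident)
```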
Julien Muchembled authored
- 28 Jul, 2015 1 commit
Julien Muchembled authored
- 13 Jul, 2015 2 commits
Julien Muchembled authored
Julien Muchembled authored
- 10 Jul, 2015 1 commit
Julien Muchembled authored
- 09 Jul, 2015 1 commit
Julien Muchembled authored
- 03 Jul, 2015 3 commits
Julien Muchembled authored
Julien Muchembled authored
Julien Muchembled authored
- 01 Jul, 2015 1 commit
Julien Muchembled authored
- 30 Jun, 2015 2 commits
Julien Muchembled authored
Julien Muchembled authored
- 29 Jun, 2015 2 commits
Julien Muchembled authored
Julien Muchembled authored
- 24 Jun, 2015 4 commits
Julien Muchembled authored
When the connection to the primary master node is lost, the node manager no longer has a reliable list of running nodes, so iterateForObject() must not retry any cell.
Julien Muchembled authored
Julien Muchembled authored
Julien Muchembled authored