1. 01 Apr, 2019 3 commits
    • Make the number of replicas modifiable when the cluster is running · 314778a5
      Julien Muchembled authored
      neoctl gets a new command to change the number of replicas.
      
      The number of replicas becomes a new partition table attribute and,
      like the PT id, it is stored in the config table. Conversely, the
      configuration value for the number of partitions is dropped, since
      it can be computed from the partition table, which is always stored
      in full.
      
      The -p/-r master options now only apply at database creation.
      
      Some implementation notes:
      
      - The protocol is slightly optimized in that the master now
        automatically sends the whole partition table to the admin & client
        nodes upon connection, as it already does for storage nodes.
        This makes the protocol more consistent, and the master is the
        only remaining node to request partition tables, which it does
        during recovery.
      
      - Some parts become tricky because app.pt can be None in more cases.
        For example, this is why the extra condition in NodeManager.update
        (before app.pt.dropNode) was added, and why the 'loadPartitionTable'
        method (storage) is not inlined (because of unit tests).
        Overall, this commit simplifies more than it complicates.
      
      - In the master handlers, we stop hijacking the 'connectionCompleted'
        method for tasks to be performed on handler switches (often sending
        the full partition table).
      
      - The admin's 'bootstrapped' flag could have been removed earlier:
        race conditions can't happen since the AskNodeInformation packet
        was removed (commit d048a52d).
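
      The point that the partition count no longer needs to be stored can be
      sketched as follows, using hypothetical structures rather than NEO's
      real classes: once the partition table is always persisted in full,
      the number of partitions is derivable from it, so only the PT id and
      the number of replicas still need a slot in the config table.

```python
# Hypothetical sketch (not NEO's real classes) of the idea above:
# with the partition table always stored in full, the number of
# partitions is just its row count, so only the PT id and the number
# of replicas still need to be persisted in the config table.

class PartitionTable(object):
    def __init__(self, ptid, num_replicas, rows):
        self.ptid = ptid                  # persisted in the config table
        self.num_replicas = num_replicas  # persisted in the config table
        self.rows = rows                  # full table: one row per partition

    @property
    def num_partitions(self):
        # Derived, never stored: this is why the separate configuration
        # value for the number of partitions could be dropped.
        return len(self.rows)

pt = PartitionTable(ptid=4107, num_replicas=1, rows=[[] for _ in range(24)])
assert pt.num_partitions == 24
```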
    • New --new-nid storage option for fast cloning · a9c15fd3
      Julien Muchembled authored
      It is often faster to set up replicas by stopping a node (and any
      underlying database server like MariaDB) and doing a raw copy of the
      database (e.g. with rsync). So far, this required stopping the whole
      cluster and using tools like 'mysql' or 'sqlite3' to edit:
      - the 'pt' table in the databases,
      - the 'config.nid' values of the new nodes.
      
      With this new option, if you already have 1 replica, you can set up
      new replicas with such a fast raw copy, and without interruption of
      service. Obviously, this implies less redundancy during the operation.
    • qa: remove 2 useless unit tests · b10cc750
      Julien Muchembled authored
  2. 21 Mar, 2019 2 commits
  3. 16 Mar, 2019 1 commit
    • importer: fix possible data loss on writeback · e387ad59
      Julien Muchembled authored
      If the source DB is lost during the import and then restored from a
      backup, all new transactions have to be written back again on resume.
      This is the most common case in which the writeback hits the maximum
      number of transactions to process per partition at each iteration;
      the previous code was buggy in that it could skip transactions.
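
      The skipping pitfall can be illustrated with a minimal sketch
      (illustrative names, not NEO's actual importer API): as long as each
      pass resumes strictly after the last transaction actually processed,
      hitting the per-iteration limit can never skip anything.

```python
# Illustrative sketch (not NEO's actual importer code) of bounded
# writeback that cannot skip transactions: each pass processes at most
# MAX_TXN transactions and resumes strictly after the last tid actually
# written back.

MAX_TXN = 1000

def writeback(fetch_after, process, last_tid=0):
    # fetch_after(tid, limit) must return up to `limit` tids strictly
    # greater than `tid`, in ascending order.
    while True:
        batch = fetch_after(last_tid, MAX_TXN)
        if not batch:
            return last_tid
        for tid in batch:
            process(tid)
        last_tid = batch[-1]  # resume point: nothing in between is skipped
```

      The bug class this guards against is advancing the resume point past
      `batch[-1]`, or fetching with a non-strict bound, which skips or
      re-processes transactions whenever the limit is hit.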
  4. 11 Mar, 2019 3 commits
  5. 26 Feb, 2019 2 commits
    • qa: new tool to stress-test NEO · 38e98a12
      Julien Muchembled authored
      Example output:
      
          stress: yes (toggle with F1)
          cluster state: RUNNING
          last oid: 0x44c0
          last tid: 0x3cdee272ef19355 (2019-02-26 15:35:11.002419)
          clients: 2308, 2311, 2302, 2173, 2226, 2215, 2306, 2255, 2314, 2356 (+48)
                  8m53.988s (42.633861/s)
          pt id: 4107
              RRRDDRRR
           0: OU......
           1: ..UO....
           2: ....OU..
           3: ......UU
           4: OU......
           5: ..UO....
           6: ....OU..
           7: ......UU
           8: OU......
           9: ..UO....
          10: ....OU..
          11: ......UU
          12: OU......
          13: ..UO....
          14: ....OU..
          15: ......UU
          16: OU......
          17: ..UO....
          18: ....OU..
          19: ......UU
          20: OU......
          21: ..UO....
          22: ....OU..
          23: ......UU
    • master: fix typo in comment · ce25e429
      Julien Muchembled authored
  6. 25 Feb, 2019 1 commit
  7. 31 Dec, 2018 7 commits
  8. 05 Dec, 2018 1 commit
  9. 21 Nov, 2018 4 commits
    • fixup! client: discard late answers to lockless writes · 8ef1ddba
      Julien Muchembled authored
      Since commit 50e7fe52, some code can be simplified.
    • client: fix race condition between Storage.load() and invalidations · a2e278d5
      Julien Muchembled authored
      This fixes a bug that could manifest as follows:
      
        Traceback (most recent call last):
          File "neo/client/app.py", line 432, in load
            self._cache.store(oid, data, tid, next_tid)
          File "neo/client/cache.py", line 223, in store
            assert item.tid == tid, (item, tid)
        AssertionError: (<CacheItem oid='\x00\x00\x00\x00\x00\x00\x00\x01' tid='\x03\xcb\xc6\xca\xfd\xc7\xda\xee' next_tid='\x03\xcb\xc6\xca\xfd\xd8\t\x88' data='...' counter=1 level=1 expire=10000 prev=<...> next=<...>>, '\x03\xcb\xc6\xca\xfd\xd8\t\x88')
      
      The big changes in the threaded test framework are required because we
      need to reproduce a race condition between client threads, and this
      conflicts with the serialization of epoll events (deadlock).
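
      The shape of the fix can be sketched as follows (illustrative names,
      not NEO's actual cache code): an answer to load() must be discarded if
      an invalidation for the same oid arrived while the answer was in
      flight, otherwise the cached tid can disagree with a later store,
      which is what the failed assertion detected.

```python
# Illustrative sketch (not NEO's actual cache) of the race fixed above:
# a late load() answer for an oid invalidated in the meantime is dropped
# instead of being stored, so the cache never holds a stale tid.
import threading

class LoadGuard(object):
    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}           # oid -> (data, tid)
        self._invalidated = set()  # oids invalidated since begin_load

    def begin_load(self, oid):
        with self._lock:
            self._invalidated.discard(oid)

    def invalidate(self, oid):
        # called when an invalidation for oid is received
        with self._lock:
            self._invalidated.add(oid)
            self._cache.pop(oid, None)

    def finish_load(self, oid, data, tid):
        with self._lock:
            if oid in self._invalidated:
                return False  # late answer: drop it instead of caching
            self._cache[oid] = (data, tid)
            return True
```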
    • client: fix race condition in refcounting dispatched answer packets · 743026d5
      Julien Muchembled authored
      This was found when stress-testing a big cluster. 1 client node was stuck:
      
        (Pdb) pp app.dispatcher.__dict__
        {'lock_acquire': <built-in method acquire of thread.lock object at 0x7f788c6e4250>,
        'lock_release': <built-in method release of thread.lock object at 0x7f788c6e4250>,
        'message_table': {140155667614608: {},
                          140155668875280: {},
                          140155671145872: {},
                          140155672381008: {},
                          140155672381136: {},
                          140155672381456: {},
                          140155673002448: {},
                          140155673449680: {},
                          140155676093648: {170: <neo.lib.locking.SimpleQueue object at 0x7f788a109c58>},
                          140155677536464: {},
                          140155679224336: {},
                          140155679876496: {},
                          140155680702992: {},
                          140155681851920: {},
                          140155681852624: {},
                          140155682773584: {},
                          140155685988880: {},
                          140155693061328: {},
                          140155693062224: {},
                          140155693074960: {},
                          140155696334736: {278: <neo.lib.locking.SimpleQueue object at 0x7f788a109c58>},
                          140155696411408: {},
                          140155696414160: {},
                          140155696576208: {},
                          140155722373904: {}},
        'queue_dict': {140155673622936: 1, 140155689147480: 2}}
      
      140155673622936 should not be in queue_dict
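
      The invariant that was broken can be sketched as follows (names
      borrowed from the dump above, but this is not NEO's real Dispatcher):
      the per-queue refcount in queue_dict must be updated under the same
      lock as message_table, otherwise a queue can keep a stale non-zero
      count with no pending answer, which is the stuck state shown.

```python
# Illustrative sketch of refcounted answer dispatching: registering an
# expected answer and popping it update message_table and queue_dict
# atomically, so queue_dict can never hold a leftover refcount.
import threading

class Dispatcher(object):
    def __init__(self):
        self._lock = threading.Lock()
        self.message_table = {}  # conn id -> {msg id -> queue}
        self.queue_dict = {}     # id(queue) -> pending answer count

    def register(self, conn_id, msg_id, queue):
        with self._lock:
            self.message_table.setdefault(conn_id, {})[msg_id] = queue
            key = id(queue)
            self.queue_dict[key] = self.queue_dict.get(key, 0) + 1

    def pop(self, conn_id, msg_id):
        with self._lock:
            queue = self.message_table[conn_id].pop(msg_id)
            key = id(queue)
            count = self.queue_dict[key] - 1
            if count:
                self.queue_dict[key] = count
            else:
                del self.queue_dict[key]  # no stale refcount survives
            return queue
```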
  10. 15 Nov, 2018 3 commits
  11. 08 Nov, 2018 13 commits
    • 7494de84
    • client: fix AssertionError when trying to reconnect too quickly after an error · 305dda86
      Julien Muchembled authored
      When ConnectionPool._initNodeConnection first fails with:
      
        StorageError: protocol error: already connected
      
      the following assertion failure happens when trying to reconnect before the
      previous connection is actually closed (currently, only the node sending an
      error message closes the connection, as commented in EventHandler):
      
        Traceback (most recent call last):
          File "neo/client/Storage.py", line 82, in load
            return self.app.load(oid)[:2]
          File "neo/client/app.py", line 367, in load
            data, tid, next_tid, _ = self._loadFromStorage(oid, tid, before_tid)
          File "neo/client/app.py", line 399, in _loadFromStorage
            askStorage)
          File "neo/client/app.py", line 293, in _askStorageForRead
            conn = cp.getConnForNode(node)
          File "neo/client/pool.py", line 98, in getConnForNode
            conn = self._initNodeConnection(node)
          File "neo/client/pool.py", line 48, in _initNodeConnection
            dispatcher=app.dispatcher)
          File "neo/lib/connection.py", line 704, in __init__
            super(MTClientConnection, self).__init__(*args, **kwargs)
          File "neo/lib/connection.py", line 602, in __init__
            node.setConnection(self)
          File "neo/lib/node.py", line 122, in setConnection
            attributeTracker.whoSet(self, '_connection'))
        AssertionError
    • qa: fix attributeTracker · 163858ed
      Julien Muchembled authored
    • client: discard late answers to lockless writes · 50e7fe52
      Julien Muchembled authored
      This fixes:
      
        Traceback (most recent call last):
          File "neo/client/Storage.py", line 108, in tpc_vote
            return self.app.tpc_vote(transaction)
          File "neo/client/app.py", line 546, in tpc_vote
            self.waitStoreResponses(txn_context)
          File "neo/client/app.py", line 539, in waitStoreResponses
            _waitAnyTransactionMessage(txn_context)
          File "neo/client/app.py", line 160, in _waitAnyTransactionMessage
            self._handleConflicts(txn_context)
          File "neo/client/app.py", line 514, in _handleConflicts
            self._store(txn_context, oid, serial, data)
          File "neo/client/app.py", line 452, in _store
            self._waitAnyTransactionMessage(txn_context, False)
          File "neo/client/app.py", line 155, in _waitAnyTransactionMessage
            self._waitAnyMessage(queue, block=block)
          File "neo/client/app.py", line 142, in _waitAnyMessage
            _handlePacket(conn, packet, kw)
          File "neo/lib/threaded_app.py", line 133, in _handlePacket
            handler.dispatch(conn, packet, kw)
          File "neo/lib/handler.py", line 72, in dispatch
            method(conn, *args, **kw)
          File "neo/client/handlers/storage.py", line 143, in answerRebaseObject
            assert cached == data
        AssertionError
    • storage: fix write-locking bug when a deadlock happens at the end of a replication · 7fff11f6
      Julien Muchembled authored
      During rebase, writes could stay lockless although the partition was
      replicated. Another transaction could then take locks prematurely, leading to
      the following crash:
      
        Traceback (most recent call last):
          File "neo/lib/handler.py", line 72, in dispatch
            method(conn, *args, **kw)
          File "neo/storage/handlers/master.py", line 36, in notifyUnlockInformation
            self.app.tm.unlock(ttid)
          File "neo/storage/transactions.py", line 329, in unlock
            self.abort(ttid, even_if_locked=True)
          File "neo/storage/transactions.py", line 573, in abort
            not self._replicated.get(self.getPartition(oid))), x
        AssertionError: ('\x00\x00\x00\x00\x00\x03\x03v', '\x03\xca\xb44J\x13\x99\x88', '\x03\xca\xb44J\xe0\xdcU', {}, set(['\x00\x00\x00\x00\x00\x03\x03v']))
    • client: log_flush most exceptions raised from Application to ZODB · efaae043
      Julien Muchembled authored
      Flushing logs will help fix NEO bugs (e.g. failed assertions).
    • client: fix assertion failure in case of conflict + storage disconnection · a746f812
      Julien Muchembled authored
      This fixes:
      
        Traceback (innermost last):
          ...
          Module transaction._transaction, line 393, in _commitResources
            rm.tpc_vote(self)
          Module ZODB.Connection, line 797, in tpc_vote
            s = vote(transaction)
          Module neo.client.Storage, line 95, in tpc_vote
            return self.app.tpc_vote(transaction)
          Module neo.client.app, line 546, in tpc_vote
            self.waitStoreResponses(txn_context)
          Module neo.client.app, line 539, in waitStoreResponses
            _waitAnyTransactionMessage(txn_context)
          Module neo.client.app, line 160, in _waitAnyTransactionMessage
            self._handleConflicts(txn_context)
          Module neo.client.app, line 471, in _handleConflicts
            assert oid is None, (oid, serial)
        AssertionError: ('\x00\x00\x00\x00\x00\x02\n\xe3', '\x03\xca\xad\xcb!\x92\xb6\x9c')
    • client: simplify connection management in transaction contexts · 2851a274
      Julien Muchembled authored
      With the previous commit, there is no longer any point in
      distinguishing storage nodes for which we only check serials.
    • client: also vote to nodes that only check serials · ab435b28
      Julien Muchembled authored
      Not doing so was an incorrect optimization. Checking serials does take
      write-locks, and they must not be released when a client-storage
      connection breaks between vote and lock; otherwise, a concurrent
      transaction modifying such serials may finish first.
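
      The invariant can be sketched with a hypothetical lock table (not
      NEO's TransactionManager): a serial check takes the same write-lock
      as a store, and the locks of a transaction that has already voted
      must survive a broken connection.

```python
# Minimal sketch of the invariant described above (hypothetical API):
# serial checks and stores share the same write-lock, and locks of a
# voted transaction are kept when the client connection breaks.

class LockTable(object):
    def __init__(self):
        self._locks = {}   # oid -> ttid of the write-lock holder
        self._voted = set()

    def lock(self, ttid, oid):
        # used both for stores and for serial checks
        holder = self._locks.setdefault(oid, ttid)
        return holder == ttid  # False -> conflict, caller must wait

    def vote(self, ttid):
        self._voted.add(ttid)

    def on_connection_lost(self, ttid):
        # Only release locks of transactions that have not voted yet:
        # releasing the write-locks of a voted serial check would let a
        # concurrent transaction modifying those serials finish first.
        if ttid not in self._voted:
            self._locks = {o: t for o, t in self._locks.items() if t != ttid}
```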