1. 11 Jan, 2017 3 commits
  2. 09 Jan, 2017 1 commit
  3. 06 Jan, 2017 3 commits
  4. 04 Jan, 2017 2 commits
  5. 03 Jan, 2017 1 commit
  6. 30 Dec, 2016 1 commit
  7. 28 Dec, 2016 6 commits
  8. 27 Dec, 2016 1 commit
  9. 26 Dec, 2016 5 commits
  10. 23 Dec, 2016 1 commit
  11. 22 Dec, 2016 1 commit
  12. 21 Dec, 2016 3 commits
    • Julien Muchembled's avatar
      storage: start replicating the partition which is furthest behind · 4d3f3723
      Julien Muchembled authored
      This fixes the following case when the backup is far behing the upstream DB,
      and there are transactions being committed at the same time:
      
      1. replicate partition 0
      2. replicate partition 0
      3. replicate partition 1
      4. replicate partition 0
      5. replicate partition 1
      6. replicate partition 2
      7. replicate partition 0
      ...
      and so on in a quadratic way.
      
      When the upstream activity was too high, the backup could even be stuck looping
      on the first partitions.
      4d3f3723
    • Julien Muchembled's avatar
      master: fix possibly wrong knowledge of cells' backup_tid when resuming backup · 17af3b47
      Julien Muchembled authored
      The issue happens when there were commits while the backup cluster was down.
      In this case, the master thinks that these commits are already replicated,
      reporting wrong backup_tid to neoctl. It solved by itself once:
      - there are new commits triggering replication for all partitions;
      - all storage nodes have really replicated.
      
      This also resulted in an inconsistent database when leaving backup mode during
      this period.
      17af3b47
    • Julien Muchembled's avatar
      Minor comment/doc changes · c95c6c39
      Julien Muchembled authored
      c95c6c39
  13. 20 Dec, 2016 1 commit
  14. 06 Dec, 2016 2 commits
    • Julien Muchembled's avatar
      master,client: ignore notifications before complete initialization · 36b2d141
      Julien Muchembled authored
      A backup master crashed with the following traceback after a reconnection:
      
          Traceback (most recent call last):
            File "neo/master/app.py", line 127, in run
              self._run()
            File "neo/master/app.py", line 147, in _run
              self.playPrimaryRole()
            File "neo/master/app.py", line 348, in playPrimaryRole
              self.backup_app.provideService())
            File "neo/master/backup_app.py", line 123, in provideService
              poll(1)
            File "neo/lib/event.py", line 126, in poll
              to_process.process()
            File "neo/lib/connection.py", line 500, in process
              self._handlers.handle(self, self._queue.pop(0))
            File "neo/lib/connection.py", line 110, in handle
              self._handle(connection, packet)
            File "neo/lib/connection.py", line 125, in _handle
              handler.packetReceived(connection, packet)
            File "neo/lib/handler.py", line 117, in packetReceived
              self.dispatch(*args)
            File "neo/lib/handler.py", line 66, in dispatch
              method(conn, *args, **kw)
            File "neo/master/handlers/backup.py", line 52, in invalidateObjects
              app.invalidatePartitions(tid, partition_set)
            File "neo/master/backup_app.py", line 257, in invalidatePartitions
              self.triggerBackup(node)
            File "neo/master/backup_app.py", line 281, in triggerBackup
              assert cell_list, offset
          AssertionError: 0
      36b2d141
    • Julien Muchembled's avatar
  15. 01 Dec, 2016 5 commits
    • Julien Muchembled's avatar
      Remove dead code found by coverage · 23b9544d
      Julien Muchembled authored
      23b9544d
    • Julien Muchembled's avatar
      Remove some useless unit tests · 1e4a4178
      Julien Muchembled authored
      Many "unit" tests (!= "threaded" tests) don't do more than checking
      implementation details, and increase coverage artificially. As with testEvent
      in commit 71e30fb9, most of these tests will
      either be removed or rewritten as threaded tests.
      
      The fact that the remaining unit tests actually cover code that other test
      don't gives motivation to maintain them. It will be also less code to update
      when switching to https://pypi.python.org/pypi/mock
      
      I proceeded as follows:
      
      1. Measure coverage for all tests except unit tests. While checking my work,
         I found that coverage stats for threaded/functional/zodb tests are quite
         unstable, so I restarted from the beginning by doing this measure several
         times and only keeping the intersection of coverage data.
      
      2. Measure coverage individually for each 'unit' tests, and substract the
         each result with the data in 1.
      
      3. The candidates for deletion are those without any code covered.
      
      Tests I didn't delete:
      
      - neo.tests.master.testElectionHandler: I always do minimal changes about
        election, as long as there's no serious review.
      
      - neo.tests.master.testMasterPT.MasterPartitionTableTests.test_13_outdate
      
      - 4 tests in neo.tests.testPT:
        test_01_Cell, test_04_removeCell, test_06_clear, test_08_filled
      
      - neo.tests.storage.testStorage{MySQL,SQLite}
      
      - neo.tests.testUtil.UtilTests.testReadBufferRead
      
      In a way, this commit is actually quite conservative. There are still many
      useless tests that only check error paths and for simple tested methods, this
      is just duplicating thie tested code.
      1e4a4178
    • Julien Muchembled's avatar
    • Julien Muchembled's avatar
      Remove unused imports, found by pylint · 3b5a6edb
      Julien Muchembled authored
      3b5a6edb
    • Julien Muchembled's avatar
      TODO: tweak should be safer · e0a2a217
      Julien Muchembled authored
      e0a2a217
  16. 30 Nov, 2016 3 commits
    • Julien Muchembled's avatar
      Various neoctl/neolog formatting improvements/fixes · 264f6f57
      Julien Muchembled authored
      - format IPv6 inside [] when followed by :<port>
      - unify rendering of node lists
      - neoctl: do not crash on empty DB (no PT id)
      264f6f57
    • Julien Muchembled's avatar
    • Julien Muchembled's avatar
      client: fix simultaneous (re)connections to the master · ec031cdf
      Julien Muchembled authored
      This fixes a reqression in commit c39d5c67,
      which could leads to failures like:
      
      2016-11-29 09:56:58,756 ERROR ZODB.Connection Couldn't load state for 0x4843
      Traceback (most recent call last):
        File "ZODB/Connection.py", line 860, in setstate
          self._setstate(obj)
        File "ZODB/Connection.py", line 901, in _setstate
          p, serial = self._storage.load(obj._p_oid, '')
        File "neo/client/Storage.py", line 82, in load
          return self.app.load(oid)[:2]
        File "neo/client/app.py", line 352, in load
          data, tid, next_tid, _ = self._loadFromStorage(oid, tid, before_tid)
        File "neo/client/app.py", line 372, in _loadFromStorage
          for node, conn in self.cp.iterateForObject(oid, readable=True):
        File "neo/client/pool.py", line 91, in iterateForObject
          pt = self.app.pt
        File "neo/client/app.py", line 146, in __getattr__
          return self.__getattribute__(attr)
      AttributeError: 'Application' object has no attribute 'pt'
      ec031cdf
  17. 28 Nov, 2016 1 commit