1. 29 Nov, 2016 1 commit
  2. 28 Nov, 2016 5 commits
  3. 27 Nov, 2016 11 commits
    • Bump protocol version · 8eb14b01
      Julien Muchembled authored
      8eb14b01
    • Fix identification issues, including a race condition causing id conflicts · 9385706f
      Julien Muchembled authored
      The added test describes how the new id timestamps fix the race condition.
      These timestamps could be any unique opaque values, and the protocol is
      extended to exchange them along with node ids.
      
      Internally, nodes also reuse timestamps as a marker to identify the first
      NotifyNodeInformation packet from the master: since this packet is a complete
      list of nodes in the cluster, any other node in the node manager has left the
      cluster for good and is removed.
      
      Secondary masters didn't receive updates about master nodes.
      It's also useless to send them information about non-master nodes.
      9385706f
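The idea in the commit message above can be illustrated with a small sketch. This is not NEO's code: `IdRegistry`, `allocate` and `is_current` are hypothetical names, and a simple counter stands in for "any unique opaque value". It only shows why pairing an id with a timestamp lets a node tell the current owner of an id from a stale one.

```python
import itertools


class IdRegistry:
    """Hypothetical registry pairing each node id with an opaque id timestamp."""

    def __init__(self):
        # A counter stands in for any source of unique opaque values.
        self._counter = itertools.count(1)
        # node id -> id timestamp of the latest allocation of that id
        self._owners = {}

    def allocate(self, node_id):
        """(Re)allocate node_id and return the timestamp of this allocation."""
        id_timestamp = next(self._counter)
        self._owners[node_id] = id_timestamp
        return id_timestamp

    def is_current(self, node_id, id_timestamp):
        """True only if the pair matches the latest allocation of node_id."""
        return self._owners.get(node_id) == id_timestamp


registry = IdRegistry()
first = registry.allocate(5)    # a first client is given id 5
second = registry.allocate(5)   # id 5 is later reallocated to another client
```

With the id alone, both clients would look identical; with the `(id, timestamp)` pair, only the second allocation is recognised as current.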
    • 54e819ff
    • Remove AskNodeInformation packet · d048a52d
      Julien Muchembled authored
      When client (including backup master) and admin nodes are identified,
      the primary master now automatically sends them all nodes with
      NotifyNodeInformation, as it does for storage nodes.
      d048a52d
    • master: fix crashes in identification due to buggy nodes · 35664759
      Julien Muchembled authored
      - check address conflicts
      - on invalid values, reject peer instead of dying
      35664759
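The two points listed in this commit (check address conflicts; reject the peer on invalid values instead of dying) can be sketched as follows. This is only an illustration under stated assumptions: `PeerRejected` and `check_identification` are hypothetical names, and the validation rules shown here are not taken from NEO.

```python
class PeerRejected(Exception):
    """Hypothetical error: the peer is refused, but the master keeps running."""


def check_identification(addresses_in_use, address):
    """Validate the address announced by a peer (illustrative rules only)."""
    if address is None:
        return  # assumption: some nodes announce no address at all
    try:
        host, port = address
    except (TypeError, ValueError):
        # Malformed value from a buggy peer: reject it instead of crashing.
        raise PeerRejected("malformed address: %r" % (address,))
    if not isinstance(host, str) or not isinstance(port, int):
        raise PeerRejected("invalid address: %r" % (address,))
    if not 0 < port < 65536:
        raise PeerRejected("invalid port: %r" % (port,))
    if (host, port) in addresses_in_use:
        raise PeerRejected("address conflict: %s:%d" % (host, port))


in_use = {("10.0.0.1", 2051)}
check_identification(in_use, ("10.0.0.2", 2051))  # accepted: returns silently
```

The point of the design is that every bad input ends in a single, catchable rejection, so the caller can close that one connection and carry on.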
    • lib.node: fix NodeManager accessors returning identified nodes · e7cccf01
      Julien Muchembled authored
      Listing connected/connecting nodes with a UUID is used:
      - in one place by storage nodes: here, it does not matter if we skip nodes that
        aren't really identified
      - in many places by the master, only for server connections, in which case we
        have equivalence with real identification
      
      So in practice, NodeManager is only simplified to reuse the 'identified'
      property of nodes.
      e7cccf01
    • lib.node: code refactoring · 5941b27d
      Julien Muchembled authored
      5941b27d
    • storage: only accept clients that are known by the master · c17f5f91
      Julien Muchembled authored
      Therefore, a client node in the node manager is always RUNNING.
      c17f5f91
    • Give new ids to clients whose ids were already reallocated · d752aadb
      Julien Muchembled authored
      Although the change applies to any node with a temporary id (all but storage),
      only clients don't have addresses and are therefore not recognizable.
      
      After a client is disconnected from the master and before it reconnects, another
      client may join the cluster and "steal" the id of the first client. This issue
      leads to stuck clients, failing in a loop with exceptions like the following one:
      
          ERROR ZODB.Connection Couldn't load state for 0x0251
          Traceback (most recent call last):
            File "ZODB/Connection.py", line 860, in setstate
              self._setstate(obj)
            File "ZODB/Connection.py", line 901, in _setstate
              p, serial = self._storage.load(obj._p_oid, '')
            File "neo/client/Storage.py", line 82, in load
              return self.app.load(oid)[:2]
            File "neo/client/app.py", line 353, in load
              data, tid, next_tid, _ = self._loadFromStorage(oid, tid, before_tid)
            File "neo/client/app.py", line 373, in _loadFromStorage
              for node, conn in self.cp.iterateForObject(oid, readable=True):
            File "neo/client/pool.py", line 91, in iterateForObject
              pt = self.app.pt
            File "neo/client/app.py", line 145, in __getattr__
              self._getMasterConnection()
            File "neo/client/app.py", line 214, in _getMasterConnection
              result = self.master_conn = self._connectToPrimaryNode()
            File "neo/client/app.py", line 246, in _connectToPrimaryNode
              handler=handler)
            File "neo/lib/threaded_app.py", line 154, in _ask
              _handlePacket(qconn, qpacket, kw, handler)
            File "neo/lib/threaded_app.py", line 135, in _handlePacket
              handler.dispatch(conn, packet, kw)
            File "neo/lib/handler.py", line 66, in dispatch
              method(conn, *args, **kw)
            File "neo/lib/handler.py", line 188, in error
              getattr(self, Errors[code])(conn, message)
            File "neo/client/handlers/__init__.py", line 23, in protocolError
              raise StorageError("protocol error: %s" % message)
          StorageError: protocol error: already connected
      d752aadb
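The scenario described in this commit can be replayed with a toy allocator. The names `ToyMaster`, `identify` and `disconnect` are hypothetical, and the "lowest free id" policy is an assumption chosen only to make the id reuse visible; it is not claimed to be NEO's allocation policy.

```python
class ToyMaster:
    """Hypothetical master handing out temporary ids to clients."""

    def __init__(self):
        self._connected = {}  # id -> client name

    def _new_id(self):
        # Assumption for the demo: reuse the lowest id that is currently free.
        node_id = 1
        while node_id in self._connected:
            node_id += 1
        return node_id

    def identify(self, client, requested_id=None):
        """Return the id granted to `client`.

        If the requested id now belongs to another connected client, grant a
        new id instead of answering with an "already connected" error.
        """
        if requested_id is None or requested_id in self._connected:
            node_id = self._new_id()
        else:
            node_id = requested_id
        self._connected[node_id] = client
        return node_id

    def disconnect(self, node_id):
        self._connected.pop(node_id, None)


master = ToyMaster()
a = master.identify("client A")                    # A is given an id
master.disconnect(a)                               # A loses its connection
b = master.identify("client B")                    # B joins and reuses A's id
a2 = master.identify("client A", requested_id=a)   # A reconnects with its old id
```

Here `a2` differs from `a`: the reconnecting client is simply given a fresh id rather than being rejected in a loop.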
    • spelling: oudated -> outdated · b62b8dc3
      Julien Muchembled authored
      b62b8dc3
    • Fix spelling mistakes · 6e32ebb7
      Julien Muchembled authored
      6e32ebb7
  4. 25 Nov, 2016 7 commits
  5. 24 Nov, 2016 2 commits
  6. 23 Nov, 2016 10 commits
    • . · 8c736e77
      Kirill Smelkov authored
      8c736e77
    • . · fa68b9e4
      Kirill Smelkov authored
      fa68b9e4
    • Merge branch 'x/go' into t · 0b751f74
      Kirill Smelkov authored
      * x/go:
        .
        .
        .
        X notes on partition table
        .
        .
        .
        .
        .
        .
      0b751f74
    • . · 0d0ce246
      Kirill Smelkov authored
      0d0ce246
    • Merge branch 'master' into x/go · f84a1095
      Kirill Smelkov authored
      * master:
        client: fix item eviction from cache, which could break loading from storage
        Bump protocol version for new read-only mode in BACKUPING state
        backup: Teach cluster in BACKUPING state to also serve regular ZODB clients in read-only mode
        tests/threaded: Add handy shortcuts to NEOCluster to concisely check cluster properties in tests
      f84a1095
    • . · cb46ccd2
      Kirill Smelkov authored
      cb46ccd2
    • . · 6c996814
      Kirill Smelkov authored
      6c996814
    • . · d0c3276a
      Kirill Smelkov authored
      d0c3276a
    • Merge branch 'master' into t · 5d2baac5
      Kirill Smelkov authored
      * master:
        client: fix item eviction from cache, which could break loading from storage
        Bump protocol version for new read-only mode in BACKUPING state
        backup: Teach cluster in BACKUPING state to also serve regular ZODB clients in read-only mode
        tests/threaded: Add handy shortcuts to NEOCluster to concisely check cluster properties in tests
      5d2baac5
  7. 21 Nov, 2016 2 commits
    • client: fix item eviction from cache, which could break loading from storage · 4ef05b9e
      Julien Muchembled authored
      `ClientCache._oid_dict` shall not have empty values. For a given oid, when the
      last item is removed from the cache, the oid must be removed as well to free
      memory. In some cases, this was not done.
      
      A consequence of this bug is the following exception:
      
          ERROR ZODB.Connection Couldn't load state for 0x02d1e1e4
          Traceback (most recent call last):
            File "ZODB/Connection.py", line 860, in setstate
              self._setstate(obj)
            File "ZODB/Connection.py", line 901, in _setstate
              p, serial = self._storage.load(obj._p_oid, '')
            File "neo/client/Storage.py", line 82, in load
              return self.app.load(oid)[:2]
            File "neo/client/app.py", line 358, in load
              self._cache.store(oid, data, tid, next_tid)
            File "neo/client/cache.py", line 228, in store
              prev = item_list[-1]
          IndexError: list index out of range
      4ef05b9e
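The invariant stated in this commit (no empty values in `_oid_dict`) can be illustrated with a much simplified cache. `TinyCache` and its methods are hypothetical and bear no relation to the real `ClientCache` implementation beyond the attribute name `_oid_dict`, which comes from the commit message.

```python
class TinyCache:
    """Hypothetical, much simplified stand-in for a per-oid cache."""

    def __init__(self):
        # oid -> non-empty list of (tid, data) items
        self._oid_dict = {}

    def store(self, oid, tid, data):
        self._oid_dict.setdefault(oid, []).append((tid, data))

    def remove(self, oid, tid):
        """Drop the item for (oid, tid), keeping the invariant on _oid_dict."""
        item_list = self._oid_dict[oid]
        item_list[:] = [item for item in item_list if item[0] != tid]
        if not item_list:
            # Last item for this oid is gone: remove the key as well, so that
            # no empty list is left behind for later code to index into.
            del self._oid_dict[oid]


cache = TinyCache()
cache.store(0x02d1, 10, b"old")
cache.store(0x02d1, 20, b"new")
cache.remove(0x02d1, 10)   # one item left, key stays
cache.remove(0x02d1, 20)   # last item removed, key is deleted too
```

Without the final `del`, a later `item_list[-1]` on the leftover empty list would raise the `IndexError` shown in the traceback above.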
  8. 18 Nov, 2016 1 commit
  9. 15 Nov, 2016 1 commit
    • backup: Teach cluster in BACKUPING state to also serve regular ZODB clients in read-only mode · d4944062
      Kirill Smelkov authored
      For tids <= backup_tid, a backup cluster has all the data needed to provide
      regular read-only ZODB service. Having regular ZODB access to the data can be
      handy, e.g. for externally verifying data for consistency between the
      main and backup clusters. Peeking around without disturbing the main
      cluster might also be useful sometimes.
      
      In this patch:
      
      - master & storage nodes are taught:
      
          * to instantiate read-only or regular client service handler depending on cluster state:
            RUNNING   -> regular
            BACKINGUP -> read-only
      
          * in read-only client handler:
            + to reject write-related operations
            + to provide read operations but adjust semantics as if last_tid in the
              database were backup_tid
      
      - a new READ_ONLY_ACCESS protocol error code is introduced so that the client can
        raise POSException.ReadOnlyError upon receiving it.
      
      I have not implemented a back-channel for invalidations in read-only mode (yet?).
      This way, once a client connects to a cluster in backup state, it won't see
      new data fetched by the backup cluster from upstream after the client connected.
      
      The reason invalidations are not implemented is that for now (imho)
      there is no ready infrastructure to get updates from the
      replicating node on a transaction-by-transaction basis (it currently only
      notifies when a whole batch is done). For consistency verification (the main
      reason for this patch) we also don't need invalidations to work, as in
      that task we always connect afresh to the backup. So I simply put
      relevant TODOs about invalidations for now.
      
      The patch is not very polished but should work.
      
      /reviewed-on !4
      d4944062
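The handler selection and the read-only behaviour listed in this commit message can be sketched as below. This is a minimal illustration, not the patch itself: the class names, `make_handler`, `visible_tids` and the list-of-tids "database" are all hypothetical, and the local `ReadOnlyError` merely stands in for `POSException.ReadOnlyError`.

```python
class ReadOnlyError(Exception):
    """Local stand-in for POSException.ReadOnlyError."""


class ClientServiceHandler:
    """Hypothetical regular handler: reads and writes are both served."""

    def __init__(self, tids):
        self.tids = tids  # sorted list of committed tids (toy database)

    def last_tid(self):
        return self.tids[-1]

    def visible_tids(self):
        return list(self.tids)

    def commit(self, tid):
        self.tids.append(tid)


class ReadOnlyClientServiceHandler(ClientServiceHandler):
    """Hypothetical read-only handler used while the cluster is backing up."""

    def __init__(self, tids, backup_tid):
        super().__init__(tids)
        self.backup_tid = backup_tid

    def last_tid(self):
        # Behave as if the last tid in the database were backup_tid.
        return self.backup_tid

    def visible_tids(self):
        return [tid for tid in self.tids if tid <= self.backup_tid]

    def commit(self, tid):
        # Write-related operations are rejected in read-only mode.
        raise ReadOnlyError("cluster is in backup state")


def make_handler(cluster_state, tids, backup_tid=None):
    """Pick the client service handler from the cluster state."""
    if cluster_state == "RUNNING":
        return ClientServiceHandler(tids)
    if cluster_state == "BACKINGUP":
        return ReadOnlyClientServiceHandler(tids, backup_tid)
    raise ValueError("no client service in state %r" % (cluster_state,))


handler = make_handler("BACKINGUP", [10, 20, 30], backup_tid=20)
```

In this toy setup, `handler.last_tid()` reports `backup_tid` even though a later tid has already been fetched, and any `commit` attempt raises `ReadOnlyError`.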