1. 20 Aug, 2024 1 commit
    • Levin Zimmermann's avatar
      identification: Set possible answers to same msgid · b7da3220
      Levin Zimmermann authored
      When a node tries to connect to another node it initially sends a
      'RequestIdentification' packet. The other node can either reply with
      'AcceptIdentification' or in case of a secondary master with
      'NotPrimaryMaster'.
      
      In the second case the message id differs from the initial requests
      message id. This makes it difficult in a multi-threaded implementation
      to proceed this answer: due to the different msg/connection - id the
      multi-threaded implementation tries to proceed this incoming message in
      a different thread, while the requesting thread waits forever for its
      peers reply.
      
      The most straightforward solution is to use the same connection - id for
      both possible answers to the 'RequestIdentification' packet. This
      doesn't break given NEO/py implementation and is only a small patch.
      A workaround in a multi-threaded implementation on the other hand
      seems to be much more complicated and time-consuming. Finally it also makes
      sense semantically, because "Message IDs are used to identify response
      packets" and in the given context the 'NotPrimaryMaster' *is* the de facto
      response of a 'RequestIdentification'.
      b7da3220
  2. 19 Feb, 2023 1 commit
  3. 16 Feb, 2023 2 commits
  4. 14 Feb, 2023 3 commits
  5. 10 Feb, 2023 1 commit
  6. 02 Feb, 2022 1 commit
    • Kirill Smelkov's avatar
      Fix breakage with zodbpickle >= 2 · d5afef8e
      Kirill Smelkov authored
      Starting from zodbpickle 2 its binary class does not allow users to set
      arbitrary attributes and so
      
      	binary._pack = bytes.__str__
      
      fails with
      
      	TypeError: can't set attributes of built-in/extension type 'zodbpickle.binary'
      
      -> Fix it by explicitly checking for binary type on encoding instead of
      setting binary._pack
      
      See slapos@27f574bc for pre-history.
      
      /cc @jerome
      d5afef8e
  7. 04 Jun, 2021 1 commit
    • Julien Muchembled's avatar
      admin: fix crash if not operational and a downstream cluster is RUNNING · 7f81ac2d
      Julien Muchembled authored
      Traceback (most recent call last):
        ...
        File ".../neo/lib/handler.py", line 75, in dispatch
          method(conn, *args, **kw)
        File ".../neo/admin/handler.py", line 174, in wrapper
          return func(self, name, *args, **kw)
        File ".../neo/admin/handler.py", line 190, in notifyMonitorInformation
          self.app.updateMonitorInformation(name, **info)
        File ".../neo/admin/app.py", line 290, in updateMonitorInformation
          self._notify(self.operational)
        File ".../neo/admin/app.py", line 315, in _notify
          body += '', name, '    ' + backup.formatSummary(upstream)[1]
        File ".../neo/admin/app.py", line 83, in formatSummary
          tid = self.ltid
      AttributeError: 'Backup' object has no attribute 'ltid'
      7f81ac2d
  8. 11 May, 2021 1 commit
  9. 02 Apr, 2021 5 commits
  10. 22 Mar, 2021 1 commit
  11. 04 Mar, 2021 2 commits
  12. 15 Jan, 2021 2 commits
    • Julien Muchembled's avatar
      ssl: don't care whether EOF is ragged or not · d98205d0
      Julien Muchembled authored
      The purpose of suppress_ragged_eofs=False was to micro-optimize the
      normal case: when there's no EOF.
      
      But commit 061cd5d8 showed that this
      option only turns ragged EOF into an exception. It may be easier for
      alternate NEO implementations to close the SSL connection properly. Or
      the performance benefit was not worth the risk to freeze a NEO process.
      d98205d0
    • Kirill Smelkov's avatar
      ssl: Don't ignore non-ragged EOF · 061cd5d8
      Kirill Smelkov authored
      Testing NEO/go client wrt NEO/py server revealed a bug in NEO/py SSL
      handling: proper non-ragged EOF from a peer is ignored, and so leads to
      hang in infinite loop inside _SSL.receive with read_buf memory growing
      indefinitely. Details are below:
      
      NEO/py wraps raw sockets with
      
      	ssl.wrap_socket(suppress_ragged_eofs=False)
      
      which instructs SSL layer to convert unexpected EOF when receiving a TLS
      record into SSLEOFError exception. However when remote peer properly
      closes its side of the connection, socket.read() still returns b'' to
      report non-ragged regular EOF:
      
      https://github.com/python/cpython/blob/v2.7.18/Lib/ssl.py#L630-L650
      
      The code was handling SSLEOFError but not b'' return from socket recv.
      Thus after NEO/go client was disconnecting and properly closing its side
      of the connection, the code started to loop indefinitely in _SSL.receive
      under `while 1` with  b'' returned by self.socket.recv() appended to
      read_buf again and again.
      
      -> Fix it by detecting non-ragged EOF as well and, similarly to how
      SSLEOFError is handled, converting them into self._error('recv', None).
      
      See merge request !17
      061cd5d8
  13. 11 Jan, 2021 4 commits
  14. 02 Oct, 2020 1 commit
    • Julien Muchembled's avatar
      Fix handling of -m/--masters arg · fa63d856
      Julien Muchembled authored
      For the master, the purpose of -m/--masters is to specify addresses
      of other master nodes, since its own address is already known via
      -b/--bind. Therefore, an empty value for -m/--masters is valid.
      The user remains free to repeat the -b value in -m.
      
      More generally, a node may choose to only specify master addresses
      via -D/--dynamic-master-list, so the check that at least one master
      address is specified is moved where the NodeManager is expected to be
      initialized.
      fa63d856
  15. 29 Sep, 2020 1 commit
  16. 25 Sep, 2020 4 commits
    • Julien Muchembled's avatar
    • Julien Muchembled's avatar
      New algorithm for deadlock avoidance · 5e7f34d2
      Julien Muchembled authored
      The time complexity of previous one was too bad. With several tens of
      concurrent transactions, we saw commits take minutes to complete and
      the whole application looked frozen.
      
      This new algorithm is much simpler. Instead of asking the oldest
      transaction to somewhat restart (we used the "rebase" term because
      the concept was similar to what git-rebase does), the storage gives
      it priority and the newest is asked to relock (this request is ignored
      if vote already happened, which means there was actually no deadlock).
      
      testLocklessWriteDuringConflictResolution was initially more complex
      because Transaction.written (client) ignored KeyError (which is not the
      case anymore since commit 8ef1ddba).
      5e7f34d2
    • Julien Muchembled's avatar
      qa: deindent code · d98b576c
      Julien Muchembled authored
      d98b576c
    • Julien Muchembled's avatar
      Update comments · dbf128b7
      Julien Muchembled authored
      dbf128b7
  17. 10 Sep, 2020 3 commits
  18. 04 Sep, 2020 1 commit
  19. 21 Aug, 2020 1 commit
  20. 25 Jun, 2020 1 commit
  21. 24 Jun, 2020 1 commit
  22. 12 Jun, 2020 1 commit
    • Julien Muchembled's avatar
      qa: skip broken ZODB test · f4cb59d2
      Julien Muchembled authored
      ======================================================================
      FAIL: check_tid_ordering_w_commit (neo.tests.zodb.testBasic.BasicTests)
      ----------------------------------------------------------------------
      Traceback (most recent call last):
        File "ZODB/tests/BasicStorage.py", line 397, in check_tid_ordering_w_commit
          self.assertEqual(results.pop('lastTransaction'), tids[1])
        File "neo/tests/__init__.py", line 301, in assertEqual
          return super(NeoTestBase, self).assertEqual(first, second, msg=msg)
      failureException: '\x03\xd8\x85H\xbffp\xbb' != '\x03\xd8\x85H\xbfs\x0b\xdd'
      f4cb59d2
  23. 11 Jun, 2020 1 commit