WIP: Teach NEO/go to handle multiple master nodes
In the process of migrating to WC2 on our WWM/wind clone instance, I realized that it is not yet possible to use a NEO/go client with a NEO server that has more than one master node.
There are multiple reasons why this isn't possible yet, and not all of them are related to NEO/go: I also made a mistake in nexedi/slapos@0cf70a6e when reworking the NEO URL formatting for WCFS (I forgot to separate the master nodes' IPv6 addresses from each other with a comma).
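To illustrate why the comma matters, here is a minimal Go sketch. It assumes the master list is written as comma-separated host:port entries with IPv6 hosts in brackets; the helper name parseMasterAddrs and the addresses are hypothetical and not taken from NEO/go or SlapOS.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseMasterAddrs splits a comma-separated master list into host:port
// entries and validates each one with net.SplitHostPort.
// Hypothetical helper for illustration, not part of NEO/go.
func parseMasterAddrs(s string) ([]string, error) {
	var addrs []string
	for _, a := range strings.Split(s, ",") {
		a = strings.TrimSpace(a)
		if _, _, err := net.SplitHostPort(a); err != nil {
			return nil, fmt.Errorf("invalid master address %q: %w", a, err)
		}
		addrs = append(addrs, a)
	}
	return addrs, nil
}

func main() {
	// With the comma, both IPv6 masters are recovered.
	fmt.Println(parseMasterAddrs("[2001:db8::1]:5552,[2001:db8::2]:5552"))
	// Without it, the two addresses fuse into one entry that fails validation.
	fmt.Println(parseMasterAddrs("[2001:db8::1]:5552[2001:db8::2]:5552"))
}
```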
On the NEO/go side, as far as I can see, we need to finish the following steps:
We need to fix the deadlock that occurs when a node connects to another node and this other node sends a packet with a different connection id (which is documented as to-fix in the source code). This needs to be fixed here because, when we have multiple masters and we try to connect to a secondary master, this secondary master sends us a NotPrimaryMaster packet, which is not an answer to RequestIdentification and therefore has a different msg id, which results in a deadlock.
Dial should instead return an error, so that TalkMaster could try a different node.
If I'm not mistaken, this also means we need to change the Node type to something like MasterAddrArray (and do the same in related code for instance
I'm fine with working on all steps. I'm opening this MR to discuss the best way to do this. I hope (2), (3) and (4) won't be too difficult, but (1) seems to be more complicated.
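The fallback part of these steps could look roughly like the sketch below. It assumes that dialing a secondary master returns an error instead of deadlocking; talkMaster, dialFunc and errNotPrimary are hypothetical stand-ins and do not match the real signatures of Dial or TalkMaster in NEO/go.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotPrimary stands in for the error Dial would return after receiving
// NotPrimaryMaster. Hypothetical name.
var errNotPrimary = errors.New("peer is not the primary master")

// dialFunc abstracts dialing a single master address. Hypothetical type;
// the real Dial in NEO/go has a different signature.
type dialFunc func(addr string) error

// talkMaster tries each master address in order and returns the first one
// that accepts the connection.
func talkMaster(addrs []string, dial dialFunc) (string, error) {
	if len(addrs) == 0 {
		return "", errors.New("no master addresses configured")
	}
	var lastErr error
	for _, addr := range addrs {
		if lastErr = dial(addr); lastErr == nil {
			return addr, nil
		}
	}
	return "", fmt.Errorf("no usable master among %v: %w", addrs, lastErr)
}

func main() {
	// Simulated cluster: only the second address is the primary master.
	dial := func(addr string) error {
		if addr == "[2001:db8::2]:5552" {
			return nil
		}
		return errNotPrimary
	}
	addr, err := talkMaster([]string{"[2001:db8::1]:5552", "[2001:db8::2]:5552"}, dial)
	fmt.Println(addr, err) // [2001:db8::2]:5552 <nil>
}
```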
Of course, the easiest solution would be for any answer to a request to carry the same connection id; I guess this would make the most sense for the NEO/go connection-based model. I also guess this won't happen, so let's find a different way.
You already sketched a possible solution with

    link.CloseAccept()
    link.Ask1(reqID, accept)
    link.Listen()

where I assume that Listen is meant to be Accept in current NEO/go/t?
When trying this I had the problem that the other node's packet was dropped/not accepted in serveRecv. Then, in serveRecv's next iteration, it caught an EOF error, which caused the NodeLink to be closed and therefore produced an error in Conn.recvPkt, which is finally propagated back to
Is there another case where we send a message to another node and the other node is expected to reply with a packet which is not the answer e.g. has a different message id (and is not a simple
I can't think of any solution which stays in the isolated-connection-model as long as we can receive answers with a different connection id.
If this is an exception, we could introduce something like Ask1RPC.
In pseudo-code, it could look something like this:
    func Ask1RPC(link, request, ...response):
        conn = link.newConn()
        conn.sendMsgDirect(request)
        return link.ExpectRPC(request, ...response)

    func ExpectRPC(link, request, ...response):
        loop forever:
            for each connection in link:
                if connection.hasNewPackage:
                    if connection.newPackage is any of response:
                        return connection.newPackage index in response
                    else:
                        return error
Ask1RPC could temporarily ignore the connection-isolation and process any incoming packet from any connection.
I think in the case of initial dialing this could be OK, because there aren't that many different packets we can expect to arrive from the dialed node.
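As a rough illustration of the matching step only, here is a small Go sketch. It models the link as a channel of simplified packets instead of real NodeLink/Conn objects; pkt, expectRPC and the use of message names as strings are assumptions made for the example, not NEO/go APIs.

```go
package main

import "fmt"

// pkt is a simplified packet: a connection id plus a message type name.
// Hypothetical type, much simpler than NEO/go's packet representation.
type pkt struct {
	connID uint32
	msg    string
}

// expectRPC reads the next packet from the link, ignoring which connection
// it belongs to, and returns the index of the matching expected response.
func expectRPC(link <-chan pkt, expected ...string) (int, error) {
	p, ok := <-link
	if !ok {
		return -1, fmt.Errorf("link closed before any response arrived")
	}
	for i, want := range expected {
		if p.msg == want {
			return i, nil
		}
	}
	return -1, fmt.Errorf("unexpected %s on connection %d", p.msg, p.connID)
}

func main() {
	link := make(chan pkt, 1)
	// A secondary master replies on a different connection id than the request used.
	link <- pkt{connID: 0, msg: "NotPrimaryMaster"}
	idx, err := expectRPC(link, "AcceptIdentification", "NotPrimaryMaster")
	fmt.Println(idx, err) // 1 <nil>
}
```

Returning the index lets the caller tell an accepted identification apart from a NotPrimaryMaster reply and decide whether to try the next master.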
What do you think about this, Kirill? Do you think this could be an acceptable trial, or would you choose a different approach?