    tipc: introduce message to synchronize broadcast link · c64f7a6a
    Jon Maloy authored
    Upon establishing a first link between two nodes, there is
    currently a risk that the two endpoints will disagree on the
    sequence number at which reception and acknowledgement of
    broadcast packets should start.
    
    The following scenarios may happen:
    
    1: Node A sends an ACTIVATE message to B, telling it to start acking
       packets from sequence number N.
    2: Node A sends out broadcast N, but does not expect an acknowledgement
       from B, since B is not yet in its broadcast receivers list.
    3: Node A receives ACK for N from all nodes except B, and releases
       packet N.
    4: Node B receives the ACTIVATE, activates its link endpoint, and
       stores the value N as sequence number of first expected packet.
    5: Node B sends a NAME_DISTR message to A.
    6: Node A receives the NAME_DISTR message, and activates its endpoint.
       At this moment B is added to A's broadcast receivers set.
       Node A also sets sequence number 0 as the first broadcast packet
       to be received from B.
    7: Node A sends broadcast N+1.
    8: B receives N+1, determines there is a gap in the sequence, since
       it is expecting N, and sends a NACK for N back to A.
    9: Node A has already released N, so no retransmission is possible.
       The broadcast link in direction A->B is stale.
    
    In addition to, or instead of, 7-9 above, the following may happen:
    
    10: Node B sends broadcast M > 0 to A.
    11: Node A receives M, falsely decides there must be a gap, since
        it is expecting packet 0, and asks for retransmission of packets
        [0,M-1].
    12: Node B has already released these packets, so the broadcast
        link is stale in direction B->A.
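
    Both failure modes above come down to the same check: a gap is
    only repairable if the first missing packet is still held by the
    sender. The following is a minimal sketch of that check, not code
    from link.c. The struct and function names (bc_receiver, bc_sender,
    bc_receive) are hypothetical, and sequence number wraparound is
    ignored for brevity.

```c
#include <stdint.h>

/* Hypothetical, simplified model; these are not the kernel's TIPC
 * structures. Sequence number wraparound is deliberately ignored. */

/* What the receiving endpoint believes comes next. */
struct bc_receiver {
    uint32_t next_expected;
};

/* What the sending endpoint is still able to retransmit. */
struct bc_sender {
    uint32_t oldest_unreleased; /* every packet below this has been freed */
};

enum bc_result {
    BC_DELIVERED,      /* in sequence, accepted */
    BC_DUPLICATE,      /* already received, ignored */
    BC_GAP_REPAIRABLE, /* gap, but the sender still holds the missing packets */
    BC_STALE           /* gap, and the first missing packet is already released */
};

static enum bc_result bc_receive(struct bc_receiver *r,
                                 const struct bc_sender *s,
                                 uint32_t seqno)
{
    if (seqno < r->next_expected)
        return BC_DUPLICATE;

    if (seqno == r->next_expected) {
        r->next_expected++;
        return BC_DELIVERED;
    }

    /* Gap: the receiver would NACK r->next_expected. Retransmission
     * only works if the sender has not released that packet yet. */
    if (r->next_expected >= s->oldest_unreleased)
        return BC_GAP_REPAIRABLE;

    return BC_STALE;
}
```

    In steps 7-9, B expects N while A has already released it, so the
    arrival of N+1 yields BC_STALE. In steps 10-12, A expects 0 while
    B has released everything below M, which gives the same result in
    the opposite direction.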
    
    We solve this problem by introducing a new unicast message type,
    BCAST_PROTOCOL/STATE, to convey the sequence number of the next
    broadcast packet to be sent to the other endpoint, at exactly the
    moment that endpoint is added to the sending node's broadcast
    receivers list, and before any other unicast messages are
    permitted to be sent.
    
    Furthermore, we don't allow any node to start receiving and
    processing broadcast packets until this new synchronization
    message has been received.
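
    The receive-side rule can be sketched as follows. This is an
    illustration under assumed names (bc_peer, bc_peer_rcv_sync,
    bc_peer_rcv_bcast are hypothetical), not the code added to link.c;
    NACK handling and sequence number wraparound are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-peer broadcast receive state; not the kernel's
 * actual data structure. */
struct bc_peer {
    bool     synced;        /* true once the sync message has arrived */
    uint32_t next_expected; /* meaningful only when synced is true */
};

/* Handle the synchronization message: adopt the sender's view of the
 * next broadcast sequence number and open up for reception. */
static void bc_peer_rcv_sync(struct bc_peer *p, uint32_t next_sent)
{
    p->next_expected = next_sent;
    p->synced = true;
}

/* Handle a broadcast packet. Returns true if the packet is accepted.
 * Before synchronization every broadcast packet is dropped, so no
 * gap can be detected against an arbitrary starting value. */
static bool bc_peer_rcv_bcast(struct bc_peer *p, uint32_t seqno)
{
    if (!p->synced)
        return false;

    if (seqno != p->next_expected)
        return false; /* out of sequence; NACK handling omitted here */

    p->next_expected++;
    return true;
}
```

    Because the starting value now comes from the sender at the moment
    it adds the peer to its receivers list, the first packet the peer
    waits for is one the sender cannot release without the peer's
    acknowledgement.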
    
    To maintain backwards compatibility, we still open up for
    broadcast reception if we receive a NAME_DISTR message without
    any preceding broadcast sync message. In this case, we must
    assume that the other end runs an older code version, and will
    never send out the new synchronization message. Hence, for a mix
    of old and new nodes, the problems described in steps 7-12 above
    may occur with the same probability as before.
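
    A sketch of this fallback, again with hypothetical names (msg_kind,
    bc_rx_mode, peer_node, peer_rcv_unicast) rather than the
    identifiers used in link.c. It assumes the receiver keeps whatever
    starting sequence number the old mechanism gave it when no sync
    message arrives.

```c
#include <stdint.h>

/* Hypothetical message classification; not the kernel's constants. */
enum msg_kind { MSG_BCAST_STATE, MSG_NAME_DISTR, MSG_OTHER };

/* Hypothetical broadcast reception modes for one peer. */
enum bc_rx_mode {
    BC_RX_CLOSED, /* nothing received yet, broadcasts are dropped */
    BC_RX_SYNCED, /* opened by the new synchronization message */
    BC_RX_LEGACY  /* opened by NAME_DISTR from a presumed older peer */
};

struct peer_node {
    enum bc_rx_mode mode;
    uint32_t        next_expected; /* set by the old mechanism or by sync */
};

/* Process one unicast message from the peer. seqno is only used for
 * MSG_BCAST_STATE, where it carries the peer's next broadcast number. */
static void peer_rcv_unicast(struct peer_node *n, enum msg_kind kind,
                             uint32_t seqno)
{
    switch (kind) {
    case MSG_BCAST_STATE:
        n->next_expected = seqno;
        n->mode = BC_RX_SYNCED;
        break;
    case MSG_NAME_DISTR:
        /* No sync seen first: assume an older peer and open up
         * anyway, keeping the possibly wrong starting value. */
        if (n->mode == BC_RX_CLOSED)
            n->mode = BC_RX_LEGACY;
        break;
    default:
        break;
    }
}
```

    In the BC_RX_LEGACY case next_expected is left untouched, which is
    why the mismatch from steps 7-12 remains possible against older
    peers.
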
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>