    tipc: reduce transmission rate of reset messages when link is down · 88e8ac70
    Jon Paul Maloy authored
    When a link is down, it will continuously try to re-establish contact
    with the peer by sending out a RESET or an ACTIVATE message at each
    timeout interval. The default value for this interval is currently
    375 ms. This is wasteful, and may become a problem in very large
    clusters with dozens or hundreds of nodes being down simultaneously.
    
    We now introduce a simple backoff algorithm for these cases. The
    first five messages are sent at the default rate; thereafter a
    message is sent only at every 16th timer interval.
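
The backoff rule described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in link.c: the struct, the function names and the two constants are hypothetical and chosen only to mirror the numbers given in this message (five full-rate messages, then one message per 16 timer intervals).

```c
#include <stdbool.h>

/* Hypothetical constants mirroring the numbers in the commit message. */
#define FULL_RATE_MSGS   5   /* first five messages go out at the default rate */
#define BACKOFF_INTERVAL 16  /* afterwards, one message per 16 timer intervals */

/* Hypothetical per-link state; this is not the kernel's link structure. */
struct link_backoff {
    unsigned int timeouts;  /* timer expiries seen since the link went down */
};

/* Call whenever the link goes down or comes back up. */
static void backoff_reset(struct link_backoff *b)
{
    b->timeouts = 0;
}

/*
 * Call on every timer expiry while the link is down.
 * Returns true if a RESET or ACTIVATE message should be sent on this expiry.
 */
static bool backoff_should_send(struct link_backoff *b)
{
    unsigned int n = b->timeouts++;

    /* Expiries 0..4: send at the default rate. */
    if (n < FULL_RATE_MSGS)
        return true;

    /* Later expiries: send only on the last of each group of 16. */
    return (n - FULL_RATE_MSGS) % BACKOFF_INTERVAL == BACKOFF_INTERVAL - 1;
}
```

In this sketch the counter must be cleared with backoff_reset() each time the link changes state, so that a freshly reset link starts again at the full rate.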
    
    This will cover the vast majority of link recycling cases, since the
    endpoint starting last will transmit at the higher speed, and the link
    should normally be established well before the rate needs to be
    reduced.
    
    The only case where we will see a degradation of link re-establishment
    times is when the endpoints remain intact, and a glitch in the
    transmission media is causing the link reset. We will then experience
    a worst-case re-establishing time of 6 seconds, something we deem
    acceptable.
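
The 6 second figure follows from the numbers above, assuming the worst case is one full gap between two reduced-rate messages (the glitch clears just after a message was sent). The macro and function names below are hypothetical and used only to show the arithmetic.

```c
/* Values taken from the commit message; names are illustrative only. */
#define TIMER_INTERVAL_MS 375  /* default timeout interval */
#define SLOW_RATE_DIVISOR 16   /* one message per 16 intervals after backoff */

/* Longest wait between two reduced-rate messages, in milliseconds. */
static unsigned int worst_case_reestablish_ms(void)
{
    return SLOW_RATE_DIVISOR * TIMER_INTERVAL_MS;  /* 16 * 375 = 6000 */
}
```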
    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
link.c 53.3 KB