    tcp: new TCP_INFO stats for RTO events · 3868ab0f
    Aananth V authored
    The 2023 SIGCOMM paper "Improving Network Availability with Protective
    ReRoute" showed that Linux TCP's RTO-triggered txhash rehashing can
    effectively reduce application disruption during outages. To better
    measure the efficacy of this feature, this patch adds three more
    detailed stats for RTO recovery and exports them via TCP_INFO.
    Applications and monitoring systems can leverage this data to measure
    the network path diversity and end-to-end repair latency during network
    outages to improve their network infrastructure.
    
    The following counters are added to tcp_sock in order to track RTO
    events over the lifetime of a TCP socket.
    
    1. u16 total_rto - Counts the total number of RTOs (retransmission
                       timeouts).
    2. u16 total_rto_recoveries - Counts the total number of RTO recoveries.
    3. u32 total_rto_time - Counts the total time (ms) spent in RTO
                            recoveries, i.e. the time spent in the CA_Loss
                            and CA_Recovery states.
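
As a rough illustration of how a monitoring tool might use these three counters, the sketch below derives two per-socket averages from a snapshot of them. The struct `rto_stats` and both helper functions are hypothetical names invented for this example and are not part of the patch. Reading the RTO count per recovery as a proxy for how many paths were tried is an assumption, not something the patch states.

```c
#include <stdint.h>

/* Hypothetical snapshot of the three counters; not a kernel structure. */
struct rto_stats {
    uint16_t total_rto;            /* number of RTOs that fired */
    uint16_t total_rto_recoveries; /* number of RTO recovery episodes */
    uint32_t total_rto_time;       /* ms spent in RTO recoveries */
};

/* Average time (ms) spent per recovery episode; 0 if there were none. */
static double mean_repair_latency_ms(const struct rto_stats *s)
{
    if (s->total_rto_recoveries == 0)
        return 0.0;
    return (double)s->total_rto_time / s->total_rto_recoveries;
}

/* Average number of RTOs per recovery episode; 0 if there were none.
 * Assumption: since each RTO may trigger a txhash rehash, this is a
 * rough hint at how many paths were tried before one worked. */
static double mean_rtos_per_recovery(const struct rto_stats *s)
{
    if (s->total_rto_recoveries == 0)
        return 0.0;
    return (double)s->total_rto / s->total_rto_recoveries;
}
```

Both helpers guard against division by zero, since a socket that never hit an RTO reports all three counters as zero.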
    
    To compute total_rto_time, we add a new u32 rto_stamp field to
    tcp_sock. rto_stamp records the start timestamp (ms) of the last RTO
    recovery (CA_Loss).
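
The bookkeeping described above can be sketched as a small userspace model. This is not the kernel code: `struct rto_model`, `model_on_rto()`, `model_on_recovered()` and the `in_recovery` flag are hypothetical names for illustration, and the choice to count a recovery when the episode begins (on its first RTO) is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the per-socket RTO accounting (hypothetical). */
struct rto_model {
    uint16_t total_rto;            /* every RTO that fires */
    uint16_t total_rto_recoveries; /* recovery episodes (assumed: counted at start) */
    uint32_t total_rto_time;       /* accumulated ms spent in recovery */
    uint32_t rto_stamp;            /* start time (ms) of the current recovery */
    bool in_recovery;              /* model-only flag: a recovery is in progress */
};

/* Called when the retransmission timer fires at time now_ms. */
static void model_on_rto(struct rto_model *m, uint32_t now_ms)
{
    if (!m->in_recovery) {
        /* First RTO of a new episode: remember when it started. */
        m->in_recovery = true;
        m->total_rto_recoveries++;
        m->rto_stamp = now_ms;
    }
    m->total_rto++;
}

/* Called when the connection leaves loss recovery at time now_ms. */
static void model_on_recovered(struct rto_model *m, uint32_t now_ms)
{
    if (!m->in_recovery)
        return;
    /* Unsigned subtraction stays correct across a 32-bit clock wrap. */
    m->total_rto_time += now_ms - m->rto_stamp;
    m->rto_stamp = 0;
    m->in_recovery = false;
}
```

For example, three RTOs at 1000, 1200 and 1600 ms followed by recovery at 2000 ms leave the model with `total_rto == 3`, `total_rto_recoveries == 1` and `total_rto_time == 1000`.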
    
    Corresponding fields are also added to the tcp_info struct.

    Signed-off-by: Aananth V <aananthv@google.com>
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_input.c 204 KB