    tcp: derive delack_max from rto_min · bbf80d71
    Eric Dumazet authored
    While BPF allows setting icsk->icsk_delack_max
    and/or icsk->icsk_rto_min, we have an ip route
    attribute (RTAX_RTO_MIN) to be able to tune rto_min,
    but nothing to consequently adjust max delayed ack,
    which varies from 40 ms to 200 ms (TCP_DELACK_{MIN|MAX}).
    
    This makes RTAX_RTO_MIN of almost no practical use,
    unless customers are in big trouble.
    
    Modern datacenter communications want to set
    rto_min to ~5 ms, and the max delayed ack one jiffy
    smaller to avoid spurious retransmits.
    
    After this patch, an "rto_min 5" route attribute will
    effectively lower max delayed ack timers to 4 ms.
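
The derivation described above can be sketched in plain C. This is a userspace model under stated assumptions, not the kernel code from this patch: the function name derive_delack_max(), its parameters, and the HZ=1000 reading (one jiffy equals 1 ms) are hypothetical and chosen only for illustration. It caps the existing max delayed ack at one jiffy below rto_min, and only when the route attribute is actually set.

```c
#include <stdint.h>

/*
 * Hypothetical model of "derive delack_max from rto_min".
 * All durations are in jiffies; with the assumed HZ=1000, 1 jiffy = 1 ms.
 *
 * delack_max  : current max delayed ack for the socket (e.g. 200)
 * rto_min_set : nonzero if the route carries an rto_min attribute
 * rto_min     : value of that attribute (e.g. 5)
 */
static uint32_t derive_delack_max(uint32_t delack_max,
                                  int rto_min_set,
                                  uint32_t rto_min)
{
    if (rto_min_set) {
        /* One jiffy below rto_min, but never below one jiffy. */
        uint32_t from_rto_min = rto_min > 1 ? rto_min - 1 : 1;

        /* Only ever lower the delayed ack ceiling, never raise it. */
        if (from_rto_min < delack_max)
            delack_max = from_rto_min;
    }
    return delack_max;
}
```

Under these assumptions, derive_delack_max(200, 1, 5) yields 4, which matches the "rto_min 5" to 4 ms case above. A socket without the route attribute keeps its existing value.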
    
    Note in the following ss output, "rto:6 ... ato:4"
    
    $ ss -temoi dst XXXXXX
    State Recv-Q Send-Q           Local Address:Port       Peer Address:Port  Process
    ESTAB 0      0        [2002:a05:6608:295::]:52950   [2002:a05:6608:297::]:41597
         ino:255134 sk:1001 <->
             skmem:(r0,rb1707063,t872,tb262144,f0,w0,o0,bl0,d0) ts sack
     cubic wscale:8,8 rto:6 rtt:0.02/0.002 ato:4 mss:4096 pmtu:4500
     rcvmss:536 advmss:4096 cwnd:10 bytes_sent:54823160 bytes_acked:54823121
     bytes_received:54823120 segs_out:1370582 segs_in:1370580
     data_segs_out:1370579 data_segs_in:1370578 send 16.4Gbps
     pacing_rate 32.6Gbps delivery_rate 1.72Gbps delivered:1370579
     busy:26920ms unacked:1 rcv_rtt:34.615 rcv_space:65920
     rcv_ssthresh:65535 minrtt:0.015 snd_wnd:65536
    
    While we could argue this patch fixes a bug with RTAX_RTO_MIN,
    I do not add a Fixes: tag, so that we can soak it a bit before
    asking for backports to stable branches.
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>