    tcp: fix premature termination of FIN_WAIT2 time-wait sockets · 80a1096b
    Octavian Purdila authored
    There is a race condition in the time-wait sockets code that can lead
    to premature termination of FIN_WAIT2 and, subsequently, to RST
    generation when the FIN,ACK from the peer finally arrives:
    
    Time     TCP header
    0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
    0.000008 http > 30755 [SYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
    0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
    0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
    0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
    0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
    0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
    0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
    0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0
    
    Say twdr->slot = 1 and inet_twdr_hangman is running, and in this
    instance inet_twdr_do_twkill_work returns 1, meaning the slot was not
    fully purged. At that point we mark slot 1, schedule
    inet_twdr_twkill_work, and advance twdr->slot to 2.
    
    Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
    is called, which creates a new FIN_WAIT2 time-wait socket and places
    it in the last slot to be reached, i.e. slot 1 (twdr->slot - 1).
    
    Now suppose inet_twdr_twkill_work runs: it starts destroying the
    time-wait sockets in slot 1, including the just added TCP_FIN_WAIT2
    one.
    
    To avoid this issue we increment the slot only if all entries in the
    slot have been purged.
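
The scenario above can be reproduced with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: hangman, worker, schedule_fin_wait2, simulate, QUOTA and the expired/fresh counters are hypothetical names and simplifications chosen for illustration. Each slot only counts sockets, and the `fixed` flag selects whether the slot is advanced unconditionally (old behaviour) or only after a full purge (this change).

```c
#include <stdbool.h>

#define SLOTS 8   /* ring size, a power of two (assumed for the model) */
#define QUOTA 2   /* hypothetical limit on kills per timer run */

struct death_row {
    int slot;            /* next slot the timer will reap */
    unsigned pending;    /* bitmask of slots deferred to the worker */
    int expired[SLOTS];  /* sockets whose timeout has elapsed */
    int fresh[SLOTS];    /* sockets just scheduled, not yet due */
};

/* Timer handler: reap up to QUOTA sockets from the current slot and
 * defer the remainder to the worker. */
static void hangman(struct death_row *dr, bool fixed)
{
    int s = dr->slot;

    /* Anything in the slot the timer has reached is due by now. */
    dr->expired[s] += dr->fresh[s];
    dr->fresh[s] = 0;

    int kill = dr->expired[s] < QUOTA ? dr->expired[s] : QUOTA;
    dr->expired[s] -= kill;

    bool leftover = dr->expired[s] > 0;
    if (leftover)
        dr->pending |= 1u << s;

    /* Old behaviour: always advance. Fixed: advance only once purged. */
    if (!fixed || !leftover)
        dr->slot = (s + 1) & (SLOTS - 1);
}

/* Place a new FIN_WAIT2 socket in the last slot to be reached. */
static int schedule_fin_wait2(struct death_row *dr)
{
    int s = (dr->slot + SLOTS - 1) & (SLOTS - 1);

    dr->fresh[s]++;
    return s;
}

/* Deferred worker: purge every pending slot completely and return how
 * many not-yet-due sockets were destroyed along the way. */
static int worker(struct death_row *dr)
{
    int premature = 0;

    for (int s = 0; s < SLOTS; s++) {
        if (!(dr->pending & (1u << s)))
            continue;
        premature += dr->fresh[s];
        dr->expired[s] = 0;
        dr->fresh[s] = 0;
        dr->pending &= ~(1u << s);
    }
    return premature;
}

/* Replay the sequence from the commit message. */
static int simulate(bool fixed)
{
    struct death_row dr = { .slot = 1 };

    dr.expired[1] = QUOTA + 1;  /* more than one timer run can purge */
    hangman(&dr, fixed);        /* slot 1 is deferred to the worker */
    schedule_fin_wait2(&dr);    /* connection closes before worker runs */
    return worker(&dr);         /* sockets killed too early */
}
```

In this model simulate(false) returns 1: the new socket lands in slot 1 and the worker destroys it. simulate(true) returns 0, because the slot is not advanced and the new socket lands in slot 0 instead.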
    
    This change may delay slot cleanup by one time-wait death row
    period, but only if the worker thread did not have time to run and
    purge the current slot in the next period (6 seconds with default
    sysctl settings). However, on such a busy system, even without this
    change, we would probably see delays...
    Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
inet_timewait_sock.c 12.5 KB