Commit 44f5324b authored by Jerry Chu, committed by David S. Miller

TCP: fix a bug that triggers large number of TCP RST by mistake

This patch fixes a bug that causes TCP RST packets to be generated
for otherwise well-behaved applications, e.g., ones that leave no
unread data on close. To trigger the bug, at least two conditions
must be met:

1. The FIN flag is set on the last data packet, i.e., it's not on a
separate, FIN only packet.
2. The size of the last data chunk on the receive side matches
exactly with the size of buffer posted by the receiver, and the
receiver closes the socket without any further read attempt.

This bug was first noticed on our netperf-based testbed for our IW10
proposal to the IETF, where a large number of RST packets were observed.
netperf's read-side code always meets condition 2 above.
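
For illustration only, the receive pattern in condition 2 can be sketched
in user space as below. The helper name read_exact_then_close() is
hypothetical; it is not taken from netperf or from the kernel. It posts a
buffer of exactly the expected size and closes as soon as that buffer is
full, with no further read to observe EOF.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for this sketch): read exactly
 * `want` bytes into `buf`, then close the socket without a further
 * read attempt. Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_exact_then_close(int fd, char *buf, size_t want)
{
	size_t got = 0;

	assert(buf != NULL && want > 0);
	while (got < want) {
		ssize_t n = recv(fd, buf + got, want - got, 0);

		if (n < 0 && errno == EINTR)
			continue;	/* interrupted, retry */
		if (n < 0) {
			close(fd);
			return -1;
		}
		if (n == 0)
			break;		/* peer closed before sending everything */
		got += (size_t)n;
	}
	/* Condition 2: close right away, no extra recv() to see EOF. */
	close(fd);
	return (ssize_t)got;
}
```

Whether a given run hits the bug also depends on condition 1, i.e. on the
sender placing FIN on the last data packet, which this sketch does not
control.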

Before the fix, tcp_data_queue() will queue the last skb that meets
condition 1 to sk_receive_queue even though it has fully copied out
(skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
tcp_recvmsg() often returns all the copied out data successfully
without actually consuming the skb, due to a check
"if ((chunk = len - tp->ucopy.len) != 0) {"
and
"len -= chunk;"
after tcp_prequeue_process() that causes "len" to become 0 and an
early exit from the big while loop.
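
The accounting described above can be followed with a small user-space
model. This is a sketch under stated assumptions, not kernel code: struct
model and both functions are hypothetical stand-ins that keep only "len",
tp->ucopy.len and a count of skbs left on sk_receive_queue. The `fixed`
flag selects between the old "eaten" predicate and the one without the
FIN test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the few kernel fields discussed here. */
struct model {
	int len;	/* bytes the caller still wants: "len" in tcp_recvmsg() */
	int ucopy_len;	/* models tp->ucopy.len */
	int queued;	/* skbs left on the modelled sk_receive_queue */
};

/*
 * Models the direct-copy branch of tcp_data_queue(): up to
 * ucopy_len bytes of the skb are copied straight to the user buffer.
 */
static void model_data_queue(struct model *m, int skb_len, bool fin, bool fixed)
{
	int chunk = skb_len < m->ucopy_len ? skb_len : m->ucopy_len;
	bool eaten;

	assert(skb_len > 0);
	m->ucopy_len -= chunk;
	eaten = fixed ? (chunk == skb_len) : (chunk == skb_len && !fin);
	if (!eaten)
		m->queued++;	/* skb is queued even if all its data was copied */
}

/* Models the check after tcp_prequeue_process(); returns bytes copied. */
static int model_after_prequeue(struct model *m)
{
	int chunk = m->len - m->ucopy_len;

	if (chunk != 0)
		m->len -= chunk;
	return chunk;
}
```

With a 100-byte buffer and a 100-byte final skb carrying FIN, the old
predicate leaves len at 0 (so the loop exits) while one skb is still
queued; with `fixed` set, len is also 0 but nothing remains queued.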

I don't see any reason not to free the skb whose data have been fully
consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
get there if MSG_PEEK is on. Am I missing some arcane cases related
to urgent data?

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 73a8bd74
@@ -4399,7 +4399,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 			if (!skb_copy_datagram_iovec(skb, 0, tp->ucopy.iov, chunk)) {
 				tp->ucopy.len -= chunk;
 				tp->copied_seq += chunk;
-				eaten = (chunk == skb->len && !th->fin);
+				eaten = (chunk == skb->len);
 				tcp_rcv_space_adjust(sk);
 			}
 			local_bh_disable();
...