Commit 6ff50cd5 authored by Eric Dumazet's avatar Eric Dumazet Committed by David S. Miller

tcp: gso: do not generate out of order packets

GSO TCP handler has following issues :

1) ooo_okay from original GSO packet is duplicated to all segments
2) segments (but the last one) are orphaned, so transmit path can not
get transmit queue number from the socket. This happens if GSO
segmentation is done before stacked device for example.

Result is we can send packets from a given TCP flow to different TX
queues (if using multiqueue NICS). This generates OOO problems and
spurious SACK & retransmits.

Fix this by keeping socket pointer set for all segments.

This means that every segment must also have a destructor, and the
original gso skb truesize must be split on all segments, to keep
precise sk->sk_wmem_alloc accounting.
Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent 5c4b2749
......@@ -2887,6 +2887,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
unsigned int mss;
struct sk_buff *gso_skb = skb;
__sum16 newcheck;
bool ooo_okay, copy_destructor;
if (!pskb_may_pull(skb, sizeof(*th)))
goto out;
......@@ -2927,10 +2928,18 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
goto out;
}
copy_destructor = gso_skb->destructor == tcp_wfree;
ooo_okay = gso_skb->ooo_okay;
/* All segments but the first should have ooo_okay cleared */
skb->ooo_okay = 0;
segs = skb_segment(skb, features);
if (IS_ERR(segs))
goto out;
/* Only first segment might have ooo_okay set */
segs->ooo_okay = ooo_okay;
delta = htonl(oldlen + (thlen + mss));
skb = segs;
......@@ -2950,6 +2959,17 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
thlen, skb->csum));
seq += mss;
if (copy_destructor) {
skb->destructor = gso_skb->destructor;
skb->sk = gso_skb->sk;
/* {tcp|sock}_wfree() use exact truesize accounting :
* sum(skb->truesize) MUST be exactly be gso_skb->truesize
* So we account mss bytes of 'true size' for each segment.
* The last segment will contain the remaining.
*/
skb->truesize = mss;
gso_skb->truesize -= mss;
}
skb = skb->next;
th = tcp_hdr(skb);
......@@ -2962,7 +2982,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
* is freed at TX completion, and not right now when gso_skb
* is freed by GSO engine
*/
if (gso_skb->destructor == tcp_wfree) {
if (copy_destructor) {
swap(gso_skb->sk, skb->sk);
swap(gso_skb->destructor, skb->destructor);
swap(gso_skb->truesize, skb->truesize);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment