Commit 6fc2f383 authored by Jakub Kicinski's avatar Jakub Kicinski Committed by David S. Miller

ipv6: gro: flush instead of assuming different flows on hop_limit mismatch

IPv6 GRO considers packets to belong to different flows when their
hop_limit is different. This seems counter-intuitive, the flow is
the same. hop_limit may vary because of various bugs or hacks but
that doesn't mean it's okay for GRO to reorder packets.

Practical impact of this problem on overall TCP performance
is unclear, but TCP itself detects this reordering and bumps
TCPSACKReorder resulting in user complaints.

Eric warns that there may be performance regressions in setups
which do packet spraying across links with similar RTT but different
hop count. To be safe let's target -next and not treat this
as a fix. If the packet spraying is using flow label there should
be no difference in behavior as flow label is checked first.

Note that the code plays an easy to miss trick by upcasting next_hdr
to a u16 pointer and compares next_hdr and hop_limit in one go.
Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent 10cdc794
...@@ -249,7 +249,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head, ...@@ -249,7 +249,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
if ((first_word & htonl(0xF00FFFFF)) || if ((first_word & htonl(0xF00FFFFF)) ||
!ipv6_addr_equal(&iph->saddr, &iph2->saddr) || !ipv6_addr_equal(&iph->saddr, &iph2->saddr) ||
!ipv6_addr_equal(&iph->daddr, &iph2->daddr) || !ipv6_addr_equal(&iph->daddr, &iph2->daddr) ||
*(u16 *)&iph->nexthdr != *(u16 *)&iph2->nexthdr) { iph->nexthdr != iph2->nexthdr) {
not_same_flow: not_same_flow:
NAPI_GRO_CB(p)->same_flow = 0; NAPI_GRO_CB(p)->same_flow = 0;
continue; continue;
...@@ -260,7 +260,8 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head, ...@@ -260,7 +260,8 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
goto not_same_flow; goto not_same_flow;
} }
/* flush if Traffic Class fields are different */ /* flush if Traffic Class fields are different */
NAPI_GRO_CB(p)->flush |= !!(first_word & htonl(0x0FF00000)); NAPI_GRO_CB(p)->flush |= !!((first_word & htonl(0x0FF00000)) |
(__force __be32)(iph->hop_limit ^ iph2->hop_limit));
NAPI_GRO_CB(p)->flush |= flush; NAPI_GRO_CB(p)->flush |= flush;
/* If the previous IP ID value was based on an atomic /* If the previous IP ID value was based on an atomic
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment