Commit a695641c authored by Coco Li's avatar Coco Li Committed by Jakub Kicinski

gve: Support IPv6 Big TCP on DQ

Add support for using IPv6 Big TCP on DQ which can handle large TSO/GRO
packets. See https://lwn.net/Articles/895398/. This can improve the
throughput and CPU usage.

Perf test result:
ip -d link show $DEV
gso_max_size 185000 gso_max_segs 65535 tso_max_size 262143 tso_max_segs 65535 gro_max_size 185000

For performance, tested with neper using 9k MTU on hardware that supports 200Gb/s line rate.

In single streams when line rate is not saturated, we expect throughput improvements.
When the networking is performing at line rate, we expect cpu usage improvements.

Tcp_stream (unidirectional stream test, T=thread, F=flow):
skb=180kb, T=1, F=1, no zerocopy: throughput average=64576.88 Mb/s, sender stime=8.3, receiver stime=10.68
skb=64kb,  T=1, F=1, no zerocopy: throughput average=64862.54 Mb/s, sender stime=9.96, receiver stime=12.67
skb=180kb, T=1, F=1, yes zerocopy:  throughput average=146604.97 Mb/s, sender stime=10.61, receiver stime=5.52
skb=64kb,  T=1, F=1, yes zerocopy:  throughput average=131357.78 Mb/s, sender stime=12.11, receiver stime=12.25

skb=180kb, T=20, F=100, no zerocopy:  throughput average=182411.37 Mb/s, sender stime=41.62, receiver stime=79.4
skb=64kb,  T=20, F=100, no zerocopy:  throughput average=182892.02 Mb/s, sender stime=57.39, receiver stime=72.69
skb=180kb, T=20, F=100, yes zerocopy: throughput average=182337.65 Mb/s, sender stime=27.94, receiver stime=39.7
skb=64kb,  T=20, F=100, yes zerocopy: throughput average=182144.20 Mb/s, sender stime=47.06, receiver stime=39.01
Signed-off-by: default avatarZiwei Xiao <ziweixiao@google.com>
Signed-off-by: default avatarCoco Li <lixiaoyan@google.com>
Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230522201552.3585421-1-ziweixiao@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parent 51c78a4d
...@@ -31,6 +31,7 @@ ...@@ -31,6 +31,7 @@
// Minimum amount of time between queue kicks in msec (10 seconds) // Minimum amount of time between queue kicks in msec (10 seconds)
#define MIN_TX_TIMEOUT_GAP (1000 * 10) #define MIN_TX_TIMEOUT_GAP (1000 * 10)
#define DQO_TX_MAX 0x3FFFF
const char gve_version_str[] = GVE_VERSION; const char gve_version_str[] = GVE_VERSION;
static const char gve_version_prefix[] = GVE_VERSION_PREFIX; static const char gve_version_prefix[] = GVE_VERSION_PREFIX;
...@@ -2047,6 +2048,10 @@ static int gve_init_priv(struct gve_priv *priv, bool skip_describe_device) ...@@ -2047,6 +2048,10 @@ static int gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
goto err; goto err;
} }
/* Big TCP is only supported on DQ*/
if (!gve_is_gqi(priv))
netif_set_tso_max_size(priv->dev, DQO_TX_MAX);
priv->num_registered_pages = 0; priv->num_registered_pages = 0;
priv->rx_copybreak = GVE_DEFAULT_RX_COPYBREAK; priv->rx_copybreak = GVE_DEFAULT_RX_COPYBREAK;
/* gvnic has one Notification Block per MSI-x vector, except for the /* gvnic has one Notification Block per MSI-x vector, except for the
......
...@@ -8,6 +8,7 @@ ...@@ -8,6 +8,7 @@
#include "gve_adminq.h" #include "gve_adminq.h"
#include "gve_utils.h" #include "gve_utils.h"
#include "gve_dqo.h" #include "gve_dqo.h"
#include <net/ip.h>
#include <linux/tcp.h> #include <linux/tcp.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/skbuff.h> #include <linux/skbuff.h>
...@@ -646,6 +647,9 @@ static int gve_try_tx_skb(struct gve_priv *priv, struct gve_tx_ring *tx, ...@@ -646,6 +647,9 @@ static int gve_try_tx_skb(struct gve_priv *priv, struct gve_tx_ring *tx,
goto drop; goto drop;
} }
if (unlikely(ipv6_hopopt_jumbo_remove(skb)))
goto drop;
num_buffer_descs = gve_num_buffer_descs_needed(skb); num_buffer_descs = gve_num_buffer_descs_needed(skb);
} else { } else {
num_buffer_descs = gve_num_buffer_descs_needed(skb); num_buffer_descs = gve_num_buffer_descs_needed(skb);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment