Commit da954ae1 authored by Jakub Kicinski

Merge branch 'add-tx-push-buf-len-param-to-ethtool'

Shay Agroskin says:

====================
Add tx push buf len param to ethtool

This patchset adds a new sub-configuration to ethtool get/set queue
params (ethtool -g) called 'tx-push-buf-len'.

This configuration specifies the maximum number of bytes of a
transmitted packet a driver can push directly to the underlying
device ('push' mode). Pushing some of the bytes to the device has the
advantages of

- Allowing a smart device to take fast actions based on the packet's
  header
- Reducing latency for small packets that can be copied completely into
  the device

In practice this new param is similar to the tx-copybreak value that can
be set through ethtool's tunables, but conceptually it serves a
different purpose. While tx-copybreak is used to reduce the overhead of
DMA mapping and makes no sense to use if less than the whole segment
gets copied, tx-push-buf-len improves performance by letting the device
analyze the packet's data (usually headers) before the DMA operation is
performed.

The configuration can be queried and set using the commands:

    $ ethtool -g [interface]

    # ethtool -G [interface] tx-push-buf-len [number of bytes]

This patchset also adds support for the new configuration in the ENA
driver, for which this parameter ensures efficient resource management
on the device side.
====================

Link: https://lore.kernel.org/r/20230323163610.1281468-1-shayagr@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
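As a rough illustration of the cover letter above, the split between bytes pushed to the device and bytes left for the regular DMA path can be sketched in plain C. The type `push_split` and the function `split_for_push` are hypothetical names invented for this sketch. They are not part of ethtool or of any driver, and a real driver may apply additional per-packet eligibility checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: how many bytes of a packet go where. */
struct push_split {
	size_t pushed;	/* bytes copied directly into the device */
	size_t dma;	/* bytes left for the regular DMA path */
};

/*
 * Split a packet of pkt_len bytes given a configured tx-push-buf-len.
 * At most tx_push_buf_len bytes are pushed; the rest is fetched by DMA.
 */
static struct push_split split_for_push(size_t pkt_len, size_t tx_push_buf_len)
{
	struct push_split s;

	s.pushed = pkt_len < tx_push_buf_len ? pkt_len : tx_push_buf_len;
	s.dma = pkt_len - s.pushed;

	/* Every byte is accounted for exactly once. */
	assert(s.pushed + s.dma == pkt_len);
	return s;
}
```

Under these assumptions, a packet no longer than the push buffer is copied completely into the device, which is the small-packet latency case the cover letter mentions, while a larger packet has only its leading bytes (usually headers) pushed.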
parents 2bcc74ff 060cdac2
......@@ -165,6 +165,12 @@ attribute-sets:
-
name: rx-push
type: u8
-
name: tx-push-buf-len
type: u32
-
name: tx-push-buf-len-max
type: u32
-
name: mm-stat
......@@ -311,6 +317,8 @@ operations:
- cqe-size
- tx-push
- rx-push
- tx-push-buf-len
- tx-push-buf-len-max
dump: *ring-get-op
-
name: rings-set
......
......@@ -860,22 +860,24 @@ Request contents:
Kernel response contents:
==================================== ====== ===========================
``ETHTOOL_A_RINGS_HEADER`` nested reply header
``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring
``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring
``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
==================================== ====== ===========================
======================================= ====== ===========================
``ETHTOOL_A_RINGS_HEADER`` nested reply header
``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring
``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring
``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer
======================================= ====== ===========================
``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
......@@ -891,6 +893,18 @@ through MMIO writes, thus reducing the latency. However, enabling this feature
may increase the CPU cost. Drivers may enforce additional per-packet
eligibility checks (e.g. on packet size).
``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` specifies the maximum number of bytes of a
transmitted packet a driver can push directly to the underlying device
('push' mode). Pushing some of the payload bytes to the device has the
advantages of reducing latency for small packets by avoiding DMA mapping (same
as ``ETHTOOL_A_RINGS_TX_PUSH`` parameter) as well as allowing the underlying
device to process packet headers ahead of fetching its payload.
This can help the device to make fast actions based on the packet's headers.
This is similar to the "tx-copybreak" parameter, which copies the packet to a
preallocated DMA memory area instead of mapping new memory. However, the
tx-push-buf-len parameter copies the packet directly to the device to allow the
device to take faster actions on the packet.
RINGS_SET
=========
......@@ -908,6 +922,7 @@ Request contents:
``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
==================================== ====== ===========================
Kernel checks that requested ring sizes do not exceed limits reported by
......
......@@ -10,6 +10,10 @@
/* head update threshold in units of (queue size / ENA_COMP_HEAD_THRESH) */
#define ENA_COMP_HEAD_THRESH 4
/* we allow 2 DMA descriptors per LLQ entry */
#define ENA_LLQ_ENTRY_DESC_CHUNK_SIZE (2 * sizeof(struct ena_eth_io_tx_desc))
#define ENA_LLQ_HEADER (128UL - ENA_LLQ_ENTRY_DESC_CHUNK_SIZE)
#define ENA_LLQ_LARGE_HEADER (256UL - ENA_LLQ_ENTRY_DESC_CHUNK_SIZE)
struct ena_com_tx_ctx {
struct ena_com_tx_meta ena_meta;
......
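The two ENA_LLQ_* header macros in the hunk above are an LLQ entry size minus the room reserved for two TX descriptors. The arithmetic can be checked with a minimal standalone sketch, assuming a 16-byte descriptor made of four 32-bit words. `struct tx_desc_sketch` is a stand-in invented here, not the driver's real `struct ena_eth_io_tx_desc`, and the helper `llq_header_room` is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in descriptor; the four-word layout is an assumption. */
struct tx_desc_sketch {
	uint32_t len_ctrl;
	uint32_t meta_ctrl;
	uint32_t buff_addr_lo;
	uint32_t buff_addr_hi_hdr_sz;
};

/* Mirrors the macros in the diff, using the stand-in descriptor. */
#define LLQ_ENTRY_DESC_CHUNK_SIZE (2 * sizeof(struct tx_desc_sketch))
#define LLQ_HEADER                (128UL - LLQ_ENTRY_DESC_CHUNK_SIZE)
#define LLQ_LARGE_HEADER          (256UL - LLQ_ENTRY_DESC_CHUNK_SIZE)

/* Header bytes that fit in an LLQ entry of entry_size bytes. */
static unsigned long llq_header_room(unsigned long entry_size)
{
	assert(entry_size >= LLQ_ENTRY_DESC_CHUNK_SIZE);
	return entry_size - LLQ_ENTRY_DESC_CHUNK_SIZE;
}
```

With a 16-byte descriptor this gives 96 bytes of header room for a 128-byte entry and 224 bytes for a 256-byte entry.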
......@@ -476,6 +476,21 @@ static void ena_get_ringparam(struct net_device *netdev,
ring->tx_max_pending = adapter->max_tx_ring_size;
ring->rx_max_pending = adapter->max_rx_ring_size;
if (adapter->ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) {
bool large_llq_supported = adapter->large_llq_header_supported;
kernel_ring->tx_push = true;
kernel_ring->tx_push_buf_len = adapter->ena_dev->tx_max_header_size;
if (large_llq_supported)
kernel_ring->tx_push_buf_max_len = ENA_LLQ_LARGE_HEADER;
else
kernel_ring->tx_push_buf_max_len = ENA_LLQ_HEADER;
} else {
kernel_ring->tx_push = false;
kernel_ring->tx_push_buf_max_len = 0;
kernel_ring->tx_push_buf_len = 0;
}
ring->tx_pending = adapter->tx_ring[0].ring_size;
ring->rx_pending = adapter->rx_ring[0].ring_size;
}
......@@ -486,7 +501,8 @@ static int ena_set_ringparam(struct net_device *netdev,
struct netlink_ext_ack *extack)
{
struct ena_adapter *adapter = netdev_priv(netdev);
u32 new_tx_size, new_rx_size;
u32 new_tx_size, new_rx_size, new_tx_push_buf_len;
bool changed = false;
new_tx_size = ring->tx_pending < ENA_MIN_RING_SIZE ?
ENA_MIN_RING_SIZE : ring->tx_pending;
......@@ -496,11 +512,51 @@ static int ena_set_ringparam(struct net_device *netdev,
ENA_MIN_RING_SIZE : ring->rx_pending;
new_rx_size = rounddown_pow_of_two(new_rx_size);
if (new_tx_size == adapter->requested_tx_ring_size &&
new_rx_size == adapter->requested_rx_ring_size)
changed |= new_tx_size != adapter->requested_tx_ring_size ||
new_rx_size != adapter->requested_rx_ring_size;
/* This value is ignored if LLQ is not supported */
new_tx_push_buf_len = adapter->ena_dev->tx_max_header_size;
if ((adapter->ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) !=
kernel_ring->tx_push) {
NL_SET_ERR_MSG_MOD(extack, "Push mode state cannot be modified");
return -EINVAL;
}
/* Validate that the push buffer is supported on the underlying device */
if (kernel_ring->tx_push_buf_len) {
enum ena_admin_placement_policy_type placement;
new_tx_push_buf_len = kernel_ring->tx_push_buf_len;
placement = adapter->ena_dev->tx_mem_queue_type;
if (placement == ENA_ADMIN_PLACEMENT_POLICY_HOST)
return -EOPNOTSUPP;
if (new_tx_push_buf_len != ENA_LLQ_HEADER &&
new_tx_push_buf_len != ENA_LLQ_LARGE_HEADER) {
bool large_llq_sup = adapter->large_llq_header_supported;
char large_llq_size_str[40];
snprintf(large_llq_size_str, 40, ", %lu", ENA_LLQ_LARGE_HEADER);
NL_SET_ERR_MSG_FMT_MOD(extack,
"Supported tx push buff values: [%lu%s]",
ENA_LLQ_HEADER,
large_llq_sup ? large_llq_size_str : "");
return -EINVAL;
}
changed |= new_tx_push_buf_len != adapter->ena_dev->tx_max_header_size;
}
if (!changed)
return 0;
return ena_update_queue_sizes(adapter, new_tx_size, new_rx_size);
return ena_update_queue_params(adapter, new_tx_size, new_rx_size,
new_tx_push_buf_len);
}
static u32 ena_flow_hash_to_flow_type(u16 hash_fields)
......@@ -909,6 +965,8 @@ static int ena_set_tunable(struct net_device *netdev,
static const struct ethtool_ops ena_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
.supported_ring_params = ETHTOOL_RING_USE_TX_PUSH_BUF_LEN |
ETHTOOL_RING_USE_TX_PUSH,
.get_link_ksettings = ena_get_link_ksettings,
.get_drvinfo = ena_get_drvinfo,
.get_msglevel = ena_get_msglevel,
......
......@@ -334,6 +334,14 @@ struct ena_adapter {
u32 msg_enable;
/* large_llq_header_enabled is used for two purposes:
* 1. Indicates that large LLQ has been requested.
* 2. Indicates whether large LLQ is set or not after device
* initialization / configuration.
*/
bool large_llq_header_enabled;
bool large_llq_header_supported;
u16 max_tx_sgl_size;
u16 max_rx_sgl_size;
......@@ -388,9 +396,10 @@ void ena_dump_stats_to_buf(struct ena_adapter *adapter, u8 *buf);
int ena_update_hw_stats(struct ena_adapter *adapter);
int ena_update_queue_sizes(struct ena_adapter *adapter,
u32 new_tx_size,
u32 new_rx_size);
int ena_update_queue_params(struct ena_adapter *adapter,
u32 new_tx_size,
u32 new_rx_size,
u32 new_llq_header_len);
int ena_update_queue_count(struct ena_adapter *adapter, u32 new_channel_count);
......
......@@ -75,6 +75,8 @@ enum {
* @tx_push: The flag of tx push mode
* @rx_push: The flag of rx push mode
* @cqe_size: Size of TX/RX completion queue event
* @tx_push_buf_len: Size of TX push buffer
* @tx_push_buf_max_len: Maximum allowed size of TX push buffer
*/
struct kernel_ethtool_ringparam {
u32 rx_buf_len;
......@@ -82,6 +84,8 @@ struct kernel_ethtool_ringparam {
u8 tx_push;
u8 rx_push;
u32 cqe_size;
u32 tx_push_buf_len;
u32 tx_push_buf_max_len;
};
/**
......@@ -90,12 +94,14 @@ struct kernel_ethtool_ringparam {
* @ETHTOOL_RING_USE_CQE_SIZE: capture for setting cqe_size
* @ETHTOOL_RING_USE_TX_PUSH: capture for setting tx_push
* @ETHTOOL_RING_USE_RX_PUSH: capture for setting rx_push
* @ETHTOOL_RING_USE_TX_PUSH_BUF_LEN: capture for setting tx_push_buf_len
*/
enum ethtool_supported_ring_param {
ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
ETHTOOL_RING_USE_CQE_SIZE = BIT(1),
ETHTOOL_RING_USE_TX_PUSH = BIT(2),
ETHTOOL_RING_USE_RX_PUSH = BIT(3),
ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
ETHTOOL_RING_USE_CQE_SIZE = BIT(1),
ETHTOOL_RING_USE_TX_PUSH = BIT(2),
ETHTOOL_RING_USE_RX_PUSH = BIT(3),
ETHTOOL_RING_USE_TX_PUSH_BUF_LEN = BIT(4),
};
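The enum above is a bitmask that a driver sets in `supported_ring_params` to declare which ring parameters it accepts. A minimal userspace sketch of that gating follows. The `RING_USE_*` names, the local `BIT` macro, and `ring_param_supported` are illustrative stand-ins for this sketch, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Stand-in for enum ethtool_supported_ring_param, same bit positions. */
enum ring_param_sketch {
	RING_USE_RX_BUF_LEN      = BIT(0),
	RING_USE_CQE_SIZE        = BIT(1),
	RING_USE_TX_PUSH         = BIT(2),
	RING_USE_RX_PUSH         = BIT(3),
	RING_USE_TX_PUSH_BUF_LEN = BIT(4),
};

/* True if the driver's mask advertises the given parameter. */
static bool ring_param_supported(uint32_t supported, uint32_t flag)
{
	assert(flag != 0);	/* the caller must name at least one bit */
	return (supported & flag) != 0;
}
```

A driver that advertises `RING_USE_TX_PUSH_BUF_LEN | RING_USE_TX_PUSH`, as the ENA ops table in this diff does with the real constants, passes the check for the push buffer length but not for RX push.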
#define __ETH_RSS_HASH_BIT(bit) ((u32)1 << (bit))
......
......@@ -161,9 +161,31 @@ struct netlink_ext_ack {
} \
} while (0)
#define NL_SET_ERR_MSG_ATTR_POL_FMT(extack, attr, pol, fmt, args...) do { \
struct netlink_ext_ack *__extack = (extack); \
\
if (!__extack) \
break; \
\
if (snprintf(__extack->_msg_buf, NETLINK_MAX_FMTMSG_LEN, \
"%s" fmt "%s", "", ##args, "") >= \
NETLINK_MAX_FMTMSG_LEN) \
net_warn_ratelimited("%s" fmt "%s", "truncated extack: ", \
##args, "\n"); \
\
do_trace_netlink_extack(__extack->_msg_buf); \
\
__extack->_msg = __extack->_msg_buf; \
__extack->bad_attr = (attr); \
__extack->policy = (pol); \
} while (0)
#define NL_SET_ERR_MSG_ATTR(extack, attr, msg) \
NL_SET_ERR_MSG_ATTR_POL(extack, attr, NULL, msg)
#define NL_SET_ERR_MSG_ATTR_FMT(extack, attr, msg, args...) \
NL_SET_ERR_MSG_ATTR_POL_FMT(extack, attr, NULL, msg, ##args)
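The formatted-message macro above relies on the return value of `snprintf` to detect truncation: the function returns the length the full string would have had, so a result greater than or equal to the buffer size means the message was cut short. A small userspace sketch of the same check follows. `MSG_BUF_LEN` and `format_msg` are hypothetical names, and the buffer size of 80 is an assumption standing in for `NETLINK_MAX_FMTMSG_LEN`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

#define MSG_BUF_LEN 80	/* assumed size, stand-in for NETLINK_MAX_FMTMSG_LEN */

/*
 * Format a message into buf. Returns true if the whole message fit,
 * false if it was truncated (or if formatting failed).
 */
static bool format_msg(char buf[MSG_BUF_LEN], const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf && fmt);

	va_start(ap, fmt);
	n = vsnprintf(buf, MSG_BUF_LEN, fmt, ap);
	va_end(ap);

	/* vsnprintf returns the untruncated length, excluding the NUL. */
	return n >= 0 && n < MSG_BUF_LEN;
}
```

Even when truncation is reported, the buffer still holds a NUL-terminated prefix of the message, which is why the macro can warn and still hand the buffer to the caller.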
#define NL_SET_ERR_ATTR_MISS(extack, nest, type) do { \
struct netlink_ext_ack *__extack = (extack); \
\
......
......@@ -357,6 +357,8 @@ enum {
ETHTOOL_A_RINGS_CQE_SIZE, /* u32 */
ETHTOOL_A_RINGS_TX_PUSH, /* u8 */
ETHTOOL_A_RINGS_RX_PUSH, /* u8 */
ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, /* u32 */
ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, /* u32 */
/* add new constants above here */
__ETHTOOL_A_RINGS_CNT,
......
......@@ -413,7 +413,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT
extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1];
extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1];
extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1];
extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_RX_PUSH + 1];
extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1];
extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
......
......@@ -11,6 +11,7 @@ struct rings_reply_data {
struct ethnl_reply_data base;
struct ethtool_ringparam ringparam;
struct kernel_ethtool_ringparam kernel_ringparam;
u32 supported_ring_params;
};
#define RINGS_REPDATA(__reply_base) \
......@@ -32,6 +33,8 @@ static int rings_prepare_data(const struct ethnl_req_info *req_base,
if (!dev->ethtool_ops->get_ringparam)
return -EOPNOTSUPP;
data->supported_ring_params = dev->ethtool_ops->supported_ring_params;
ret = ethnl_ops_begin(dev);
if (ret < 0)
return ret;
......@@ -57,7 +60,9 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
nla_total_size(sizeof(u8)) + /* _RINGS_TCP_DATA_SPLIT */
nla_total_size(sizeof(u32) + /* _RINGS_CQE_SIZE */
nla_total_size(sizeof(u8)) + /* _RINGS_TX_PUSH */
nla_total_size(sizeof(u8))); /* _RINGS_RX_PUSH */
nla_total_size(sizeof(u8))) + /* _RINGS_RX_PUSH */
nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN */
nla_total_size(sizeof(u32)); /* _RINGS_TX_PUSH_BUF_LEN_MAX */
}
static int rings_fill_reply(struct sk_buff *skb,
......@@ -67,6 +72,7 @@ static int rings_fill_reply(struct sk_buff *skb,
const struct rings_reply_data *data = RINGS_REPDATA(reply_base);
const struct kernel_ethtool_ringparam *kr = &data->kernel_ringparam;
const struct ethtool_ringparam *ringparam = &data->ringparam;
u32 supported_ring_params = data->supported_ring_params;
WARN_ON(kr->tcp_data_split > ETHTOOL_TCP_DATA_SPLIT_ENABLED);
......@@ -98,7 +104,12 @@ static int rings_fill_reply(struct sk_buff *skb,
(kr->cqe_size &&
(nla_put_u32(skb, ETHTOOL_A_RINGS_CQE_SIZE, kr->cqe_size))) ||
nla_put_u8(skb, ETHTOOL_A_RINGS_TX_PUSH, !!kr->tx_push) ||
nla_put_u8(skb, ETHTOOL_A_RINGS_RX_PUSH, !!kr->rx_push))
nla_put_u8(skb, ETHTOOL_A_RINGS_RX_PUSH, !!kr->rx_push) ||
((supported_ring_params & ETHTOOL_RING_USE_TX_PUSH_BUF_LEN) &&
(nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,
kr->tx_push_buf_max_len) ||
nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,
kr->tx_push_buf_len))))
return -EMSGSIZE;
return 0;
......@@ -117,6 +128,7 @@ const struct nla_policy ethnl_rings_set_policy[] = {
[ETHTOOL_A_RINGS_CQE_SIZE] = NLA_POLICY_MIN(NLA_U32, 1),
[ETHTOOL_A_RINGS_TX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
[ETHTOOL_A_RINGS_RX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] = { .type = NLA_U32 },
};
static int
......@@ -158,6 +170,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info,
return -EOPNOTSUPP;
}
if (tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] &&
!(ops->supported_ring_params & ETHTOOL_RING_USE_TX_PUSH_BUF_LEN)) {
NL_SET_ERR_MSG_ATTR(info->extack,
tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN],
"setting tx push buf len is not supported");
return -EOPNOTSUPP;
}
return ops->get_ringparam && ops->set_ringparam ? 1 : -EOPNOTSUPP;
}
......@@ -189,6 +209,8 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
tb[ETHTOOL_A_RINGS_TX_PUSH], &mod);
ethnl_update_u8(&kernel_ringparam.rx_push,
tb[ETHTOOL_A_RINGS_RX_PUSH], &mod);
ethnl_update_u32(&kernel_ringparam.tx_push_buf_len,
tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod);
if (!mod)
return 0;
......@@ -209,6 +231,14 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
return -EINVAL;
}
if (kernel_ringparam.tx_push_buf_len > kernel_ringparam.tx_push_buf_max_len) {
NL_SET_ERR_MSG_ATTR_FMT(info->extack, tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN],
"Requested TX push buffer exceeds the maximum of %u",
kernel_ringparam.tx_push_buf_max_len);
return -EINVAL;
}
ret = dev->ethtool_ops->set_ringparam(dev, &ringparam,
&kernel_ringparam, info->extack);
return ret < 0 ? ret : 1;
......
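Read together, the hunks above validate a requested push buffer length in two stages: the ethtool core rejects a value above the reported maximum, and the ENA driver then accepts only its supported header sizes. The flow can be sketched in standalone C as follows. All function names here are hypothetical, and the sizes 96 and 224 are assumed values for ENA_LLQ_HEADER and ENA_LLQ_LARGE_HEADER rather than figures stated in this diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LLQ_HEADER       96u	/* assumed ENA_LLQ_HEADER */
#define SKETCH_LLQ_LARGE_HEADER 224u	/* assumed ENA_LLQ_LARGE_HEADER */

static_assert(SKETCH_LLQ_HEADER < SKETCH_LLQ_LARGE_HEADER,
	      "large header must exceed the regular one");

/* Stage 1, modeled on the core check: reject values above the maximum. */
static int core_check(uint32_t requested, uint32_t max_len)
{
	return requested > max_len ? -EINVAL : 0;
}

/* Stage 2, modeled on the driver check: only two sizes are accepted. */
static int driver_check(uint32_t requested, bool llq_placement)
{
	if (requested == 0)
		return 0;		/* nothing was requested */
	if (!llq_placement)
		return -EOPNOTSUPP;	/* host placement has no push buffer */
	if (requested != SKETCH_LLQ_HEADER &&
	    requested != SKETCH_LLQ_LARGE_HEADER)
		return -EINVAL;
	return 0;
}

/* Run both stages in order, stopping at the first failure. */
static int set_push_buf_len(uint32_t requested, uint32_t max_len,
			    bool llq_placement)
{
	int ret = core_check(requested, max_len);

	return ret ? ret : driver_check(requested, llq_placement);
}
```

In this sketch, a device without large LLQ support reports a maximum of 96, so a request for 224 fails in the first stage, while a request for 100 on a device with a maximum of 224 passes the first stage and is rejected by the second.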