Commit f96b48c6 authored by David S. Miller's avatar David S. Miller

Merge tag 'mlx5-updates-2021-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-08-19

This series introduces the support for two new mlx5 features:

1) Sample offload for tunneled traffic
2) devlink rate objects support

1) From Chris Mi: Sample offload for tunneled traffic
=====================================================

Background and solution
-----------------------

Currently the sample offload actions send the encapsulated packet
to software. This series de-capsulates the packet before performing
the sampling and set the tunnel properties on the skb metadata
fields to make the behavior consistent with OVS sFlow.

If de-capsulating first, we can't use the same match like before in
default table. So instantiate a post action instance to continue
processing the action list. If HW can preserve reg_c, also use the
post action instance.

Post action infrastructure
--------------------------

Some tc actions are modeled in hardware using multiple tables
causing a tc action list split. For example, CT action is modeled
by jumping to a ct table which is controlled by nf flow table.
sFlow jumps in hardware to a sample table, which continues to a
"default table" where it should continue processing the action list.

Multi table actions are modeled in hardware using a unique fte_id.
The fte_id is set before jumping to a table. Split actions continue
to a post-action table where the matched fte_id value continues the
execution the tc action list.

This series also introduces post action infrastructure. Both ct and
sample use it.

Sample for tunnel in TC SW
--------------------------

tc filter add dev vxlan1 protocol ip parent ffff: prio 3		\
	flower src_mac 24:25:d0:e1:00:00 dst_mac 02:25:d0:13:01:02	\
	enc_src_ip 192.168.1.14 enc_dst_ip 192.168.1.13			\
	enc_dst_port 4789 enc_key_id 4					\
	action sample rate 1 group 6					\
	action tunnel_key unset						\
	action mirred egress redirect dev enp4s0f0_1

MLX5 sample HW offload
----------------------

For the following typical flow table:

+-------------------------------+
+       original flow table     +
+-------------------------------+
+         original match        +
+-------------------------------+
+ sample action + other actions +
+-------------------------------+

We translate the tc filter with sample action to the following HW model:

        +---------------------+
        + original flow table +
        +---------------------+
        +   original match    +
        +---------------------+
              | set fte_id (if reg_c preserve cap)
              | do decap
              v
+------------------------------------------------+
+                Flow Sampler Object             +
+------------------------------------------------+
+                    sample ratio                +
+------------------------------------------------+
+    sample table id    |    default table id    +
+------------------------------------------------+
           |                            |
           v                            v
+-----------------------------+  +-------------------+
+        sample table         +  +   default table   +
+-----------------------------+  +-------------------+
+ forward to management vport +             |
+-----------------------------+             |
                                    +-------+------+
                                    |              |reg_c preserve cap
                                    |              |or decap action
                                    v              v
                       +-----------------+   +-------------+
                       + per vport table +   + post action +
                       +-----------------+   +-------------+
                       + original match  +
                       +-----------------+
                       + other actions   +
                       +-----------------+

2) From Dmytro Linkin: devlink rate object support for mlx5_core driver
=======================================================================

HIGH-LEVEL OVERVIEW

Devlink leaf rate objects created per vport (VF/SF, and PF on BlueField)
in switchdev mode on devlink port registration.
Implement devlink ops callbacks to create/destroy rate groups, set TX
rate values of the vport/group, assign vport to the group.
Driver accepts TX rate values as fraction of 1Mbps.

Refactor existing eswitch QoS infrastructure to be accessible by legacy
NDO rate API and new devlink rate API. NDO rate API is not
removed/disabled in switchdev mode to not break existing users. Rate
values configured with NDO rate API are not visible for devlink
infrastructure, therefore APIs should not be used simultaneously.

IMPLEMENTATION DETAILS

Driver provide two level rate hierarchy to manage bandwidth - group
level and vport level. Initially each vport added to internal unlimited
group created by default. Each rate element (vport or group) receive
bandwidth relative to its parent element (for groups the parent is a
physical link itself) in a Round Robin manner, where element get
bandwidth value according to its weight. Example:

Created four rate groups with tx_share limits:

$ devlink port function rate add \
    pci/0000:06:00.0/group_1 tx_share 30gbit
$ devlink port function rate add \
    pci/0000:06:00.0/group_2 tx_share 20gbit
$ devlink port function rate add \
    pci/0000:06:00.0/group_3 tx_share 20gbit
$ devlink port function rate add \
    pci/0000:06:00.0/group_4 tx_share 10gbit

Weights created in HW for each group are relative to the bigest tx_share
value, which is 30gbit:

<group_1> 1.0
<group_2> 0.67
<group_3> 0.67
<group_4> 0.33

Assuming link speed is 50 Gbit/sec and each group can sustain such
amount of traffic, maximum bandwidth is 50 / (1.0 + 0.67 + 0.67 + 0.33)
 = ~18.75 Gbit/sec. Normilized bandwidth values for groups:

<group_1> 18.75 * 1.0  = 18.75 Gbit/sec
<group_2> 18.75 * 0.67 = 12.5 Gbit/sec
<group_3> 18.75 * 0.67 = 12.5 Gbit/sec
<group_4> 18.75 * 0.33 = 6.25 Gbit/sec

If in example above group_1 doesn't produce any traffic, then maximum
bandwidth becomes 50 / (0.67 + 0.67 + 0.33) = ~30.0 Gbit/sec. Normalized
values:

<group_2> 30.0 * 0.67 = 20.0 Gbit/sec
<group_3> 30.0 * 0.67 = 20.0 Gbit/sec
<group_4> 30.0 * 0.33 = 10.0 Gbit/sec

Same normalization applied to each vport in the group.

Normalized values are internal, therefore driver provides QoS
tracepoints for next events:

* vport rate element creation/deletion:
* vport rate element configuration;
* group rate element creation/deletion;
* group rate element configuration.

PATCHES OVERVIEW

1 - Moving and isolation of eswitch QoS logic in separate file;

2 - Implement devlink leaf rate object support for vports;

3 - Implement rate groups creation/deletion;

4 - Implement TX rate management for the groups;

5 - Implement parent set for vports;

6 - Eswitch QoS tracepoints.
====================
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parents e61fbee7 3202ea65
......@@ -656,3 +656,47 @@ Bridge offloads tracepoints:
$ cat /sys/kernel/debug/tracing/trace
...
ip-5387 [000] ...1 573713: mlx5_esw_bridge_vport_cleanup: vport_num=1
Eswitch QoS tracepoints:
- mlx5_esw_vport_qos_create: trace creation of transmit scheduler arbiter for vport::
$ echo mlx5:mlx5_esw_vport_qos_create >> /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace
...
<...>-23496 [018] .... 73136.838831: mlx5_esw_vport_qos_create: (0000:82:00.0) vport=2 tsar_ix=4 bw_share=0, max_rate=0 group=000000007b576bb3
- mlx5_esw_vport_qos_config: trace configuration of transmit scheduler arbiter for vport::
$ echo mlx5:mlx5_esw_vport_qos_config >> /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace
...
<...>-26548 [023] .... 75754.223823: mlx5_esw_vport_qos_config: (0000:82:00.0) vport=1 tsar_ix=3 bw_share=34, max_rate=10000 group=000000007b576bb3
- mlx5_esw_vport_qos_destroy: trace deletion of transmit scheduler arbiter for vport::
$ echo mlx5:mlx5_esw_vport_qos_destroy >> /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace
...
<...>-27418 [004] .... 76546.680901: mlx5_esw_vport_qos_destroy: (0000:82:00.0) vport=1 tsar_ix=3
- mlx5_esw_group_qos_create: trace creation of transmit scheduler arbiter for rate group::
$ echo mlx5:mlx5_esw_group_qos_create >> /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace
...
<...>-26578 [008] .... 75776.022112: mlx5_esw_group_qos_create: (0000:82:00.0) group=000000008dac63ea tsar_ix=5
- mlx5_esw_group_qos_config: trace configuration of transmit scheduler arbiter for rate group::
$ echo mlx5:mlx5_esw_group_qos_config >> /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace
...
<...>-27303 [020] .... 76461.455356: mlx5_esw_group_qos_config: (0000:82:00.0) group=000000008dac63ea tsar_ix=5 bw_share=100 max_rate=20000
- mlx5_esw_group_qos_destroy: trace deletion of transmit scheduler arbiter for group::
$ echo mlx5:mlx5_esw_group_qos_destroy >> /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace
...
<...>-27418 [006] .... 76547.187258: mlx5_esw_group_qos_destroy: (0000:82:00.0) group=000000007b576bb3 tsar_ix=1
......@@ -44,19 +44,22 @@ mlx5_core-$(CONFIG_MLX5_CLS_ACT) += en_tc.o en/rep/tc.o en/rep/neigh.o \
lib/fs_chains.o en/tc_tun.o \
esw/indir_table.o en/tc_tun_encap.o \
en/tc_tun_vxlan.o en/tc_tun_gre.o en/tc_tun_geneve.o \
en/tc_tun_mplsoudp.o diag/en_tc_tracepoint.o
en/tc_tun_mplsoudp.o diag/en_tc_tracepoint.o \
en/tc/post_act.o
mlx5_core-$(CONFIG_MLX5_TC_CT) += en/tc_ct.o
mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += en/tc/sample.o
#
# Core extra
#
mlx5_core-$(CONFIG_MLX5_ESWITCH) += eswitch.o eswitch_offloads.o eswitch_offloads_termtbl.o \
ecpf.o rdma.o esw/legacy.o
ecpf.o rdma.o esw/legacy.o \
esw/devlink_port.o esw/vporttbl.o esw/qos.o
mlx5_core-$(CONFIG_MLX5_ESWITCH) += esw/acl/helper.o \
esw/acl/egress_lgcy.o esw/acl/egress_ofld.o \
esw/acl/ingress_lgcy.o esw/acl/ingress_ofld.o \
esw/devlink_port.o esw/vporttbl.o
mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += esw/sample.o
esw/acl/ingress_lgcy.o esw/acl/ingress_ofld.o
mlx5_core-$(CONFIG_MLX5_BRIDGE) += esw/bridge.o en/rep/bridge.o
mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o
......
......@@ -7,6 +7,7 @@
#include "fw_reset.h"
#include "fs_core.h"
#include "eswitch.h"
#include "esw/qos.h"
#include "sf/dev/dev.h"
#include "sf/sf.h"
......@@ -292,6 +293,13 @@ static const struct devlink_ops mlx5_devlink_ops = {
.eswitch_encap_mode_get = mlx5_devlink_eswitch_encap_mode_get,
.port_function_hw_addr_get = mlx5_devlink_port_function_hw_addr_get,
.port_function_hw_addr_set = mlx5_devlink_port_function_hw_addr_set,
.rate_leaf_tx_share_set = mlx5_esw_devlink_rate_leaf_tx_share_set,
.rate_leaf_tx_max_set = mlx5_esw_devlink_rate_leaf_tx_max_set,
.rate_node_tx_share_set = mlx5_esw_devlink_rate_node_tx_share_set,
.rate_node_tx_max_set = mlx5_esw_devlink_rate_node_tx_max_set,
.rate_node_new = mlx5_esw_devlink_rate_node_new,
.rate_node_del = mlx5_esw_devlink_rate_node_del,
.rate_leaf_parent_set = mlx5_esw_devlink_rate_parent_set,
#endif
#ifdef CONFIG_MLX5_SF_MANAGER
.port_new = mlx5_devlink_sf_port_new,
......
......@@ -7,6 +7,8 @@
#include "mod_hdr.h"
#include "lib/fs_ttc.h"
struct mlx5e_post_act;
enum {
MLX5E_TC_FT_LEVEL = 0,
MLX5E_TC_TTC_FT_LEVEL,
......@@ -19,6 +21,7 @@ struct mlx5e_tc_table {
struct mutex t_lock;
struct mlx5_flow_table *t;
struct mlx5_fs_chains *chains;
struct mlx5e_post_act *post_act;
struct rhashtable ht;
......
......@@ -17,7 +17,7 @@
#include "en/mapping.h"
#include "en/tc_tun.h"
#include "lib/port_tun.h"
#include "esw/sample.h"
#include "en/tc/sample.h"
struct mlx5e_rep_indr_block_priv {
struct net_device *netdev;
......@@ -516,7 +516,6 @@ void mlx5e_rep_tc_netdevice_event_unregister(struct mlx5e_rep_priv *rpriv)
mlx5e_rep_indr_block_unbind);
}
#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
static bool mlx5e_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb,
struct mlx5e_tc_update_priv *tc_priv,
u32 tunnel_id)
......@@ -609,12 +608,13 @@ static bool mlx5e_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb,
return true;
}
static bool mlx5e_restore_skb(struct sk_buff *skb, u32 chain, u32 reg_c1,
static bool mlx5e_restore_skb_chain(struct sk_buff *skb, u32 chain, u32 reg_c1,
struct mlx5e_tc_update_priv *tc_priv)
{
struct mlx5e_priv *priv = netdev_priv(skb->dev);
u32 tunnel_id = (reg_c1 >> ESW_TUN_OFFSET) & TUNNEL_ID_MASK;
#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
if (chain) {
struct mlx5_rep_uplink_priv *uplink_priv;
struct mlx5e_rep_priv *uplink_rpriv;
......@@ -636,9 +636,25 @@ static bool mlx5e_restore_skb(struct sk_buff *skb, u32 chain, u32 reg_c1,
zone_restore_id))
return false;
}
#endif /* CONFIG_NET_TC_SKB_EXT */
return mlx5e_restore_tunnel(priv, skb, tc_priv, tunnel_id);
}
#endif /* CONFIG_NET_TC_SKB_EXT */
static void mlx5e_restore_skb_sample(struct mlx5e_priv *priv, struct sk_buff *skb,
struct mlx5_mapped_obj *mapped_obj,
struct mlx5e_tc_update_priv *tc_priv)
{
if (!mlx5e_restore_tunnel(priv, skb, tc_priv, mapped_obj->sample.tunnel_id)) {
netdev_dbg(priv->netdev,
"Failed to restore tunnel info for sampled packet\n");
return;
}
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
mlx5e_tc_sample_skb(skb, mapped_obj);
#endif /* CONFIG_MLX5_TC_SAMPLE */
mlx5_rep_tc_post_napi_receive(tc_priv);
}
bool mlx5e_rep_tc_update_skb(struct mlx5_cqe64 *cqe,
struct sk_buff *skb,
......@@ -647,7 +663,7 @@ bool mlx5e_rep_tc_update_skb(struct mlx5_cqe64 *cqe,
struct mlx5_mapped_obj mapped_obj;
struct mlx5_eswitch *esw;
struct mlx5e_priv *priv;
u32 reg_c0, reg_c1;
u32 reg_c0;
int err;
reg_c0 = (be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK);
......@@ -659,8 +675,6 @@ bool mlx5e_rep_tc_update_skb(struct mlx5_cqe64 *cqe,
*/
skb->mark = 0;
reg_c1 = be32_to_cpu(cqe->ft_metadata);
priv = netdev_priv(skb->dev);
esw = priv->mdev->priv.eswitch;
err = mapping_find(esw->offloads.reg_c0_obj_pool, reg_c0, &mapped_obj);
......@@ -671,18 +685,14 @@ bool mlx5e_rep_tc_update_skb(struct mlx5_cqe64 *cqe,
return false;
}
#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN)
return mlx5e_restore_skb(skb, mapped_obj.chain, reg_c1, tc_priv);
#endif /* CONFIG_NET_TC_SKB_EXT */
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
if (mapped_obj.type == MLX5_MAPPED_OBJ_SAMPLE) {
mlx5_esw_sample_skb(skb, &mapped_obj);
if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN) {
u32 reg_c1 = be32_to_cpu(cqe->ft_metadata);
return mlx5e_restore_skb_chain(skb, mapped_obj.chain, reg_c1, tc_priv);
} else if (mapped_obj.type == MLX5_MAPPED_OBJ_SAMPLE) {
mlx5e_restore_skb_sample(priv, skb, &mapped_obj, tc_priv);
return false;
}
#endif /* CONFIG_MLX5_TC_SAMPLE */
if (mapped_obj.type != MLX5_MAPPED_OBJ_SAMPLE &&
mapped_obj.type != MLX5_MAPPED_OBJ_CHAIN) {
} else {
netdev_dbg(priv->netdev, "Invalid mapped object type: %d\n", mapped_obj.type);
return false;
}
......
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
// Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#include "en_tc.h"
#include "post_act.h"
#include "mlx5_core.h"
struct mlx5e_post_act {
enum mlx5_flow_namespace_type ns_type;
struct mlx5_fs_chains *chains;
struct mlx5_flow_table *ft;
struct mlx5e_priv *priv;
struct xarray ids;
};
struct mlx5e_post_act_handle {
enum mlx5_flow_namespace_type ns_type;
struct mlx5_flow_attr *attr;
struct mlx5_flow_handle *rule;
u32 id;
};
#define MLX5_POST_ACTION_BITS (mlx5e_tc_attr_to_reg_mappings[FTEID_TO_REG].mlen)
#define MLX5_POST_ACTION_MAX GENMASK(MLX5_POST_ACTION_BITS - 1, 0)
#define MLX5_POST_ACTION_MASK MLX5_POST_ACTION_MAX
struct mlx5e_post_act *
mlx5e_tc_post_act_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains,
enum mlx5_flow_namespace_type ns_type)
{
struct mlx5e_post_act *post_act;
int err;
if (ns_type == MLX5_FLOW_NAMESPACE_FDB &&
!MLX5_CAP_ESW_FLOWTABLE_FDB(priv->mdev, ignore_flow_level)) {
mlx5_core_warn(priv->mdev, "firmware level support is missing\n");
err = -EOPNOTSUPP;
goto err_check;
} else if (!MLX5_CAP_FLOWTABLE_NIC_RX(priv->mdev, ignore_flow_level)) {
mlx5_core_warn(priv->mdev, "firmware level support is missing\n");
err = -EOPNOTSUPP;
goto err_check;
}
post_act = kzalloc(sizeof(*post_act), GFP_KERNEL);
if (!post_act) {
err = -ENOMEM;
goto err_check;
}
post_act->ft = mlx5_chains_create_global_table(chains);
if (IS_ERR(post_act->ft)) {
err = PTR_ERR(post_act->ft);
mlx5_core_warn(priv->mdev, "failed to create post action table, err: %d\n", err);
goto err_ft;
}
post_act->chains = chains;
post_act->ns_type = ns_type;
post_act->priv = priv;
xa_init_flags(&post_act->ids, XA_FLAGS_ALLOC1);
return post_act;
err_ft:
kfree(post_act);
err_check:
return ERR_PTR(err);
}
void
mlx5e_tc_post_act_destroy(struct mlx5e_post_act *post_act)
{
if (IS_ERR_OR_NULL(post_act))
return;
xa_destroy(&post_act->ids);
mlx5_chains_destroy_global_table(post_act->chains, post_act->ft);
kfree(post_act);
}
struct mlx5e_post_act_handle *
mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *attr)
{
u32 attr_sz = ns_to_attr_sz(post_act->ns_type);
struct mlx5e_post_act_handle *handle = NULL;
struct mlx5_flow_attr *post_attr = NULL;
struct mlx5_flow_spec *spec = NULL;
int err;
handle = kzalloc(sizeof(*handle), GFP_KERNEL);
spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
post_attr = mlx5_alloc_flow_attr(post_act->ns_type);
if (!handle || !spec || !post_attr) {
kfree(post_attr);
kvfree(spec);
kfree(handle);
return ERR_PTR(-ENOMEM);
}
memcpy(post_attr, attr, attr_sz);
post_attr->chain = 0;
post_attr->prio = 0;
post_attr->ft = post_act->ft;
post_attr->inner_match_level = MLX5_MATCH_NONE;
post_attr->outer_match_level = MLX5_MATCH_NONE;
post_attr->action &= ~(MLX5_FLOW_CONTEXT_ACTION_DECAP);
handle->ns_type = post_act->ns_type;
/* Splits were handled before post action */
if (handle->ns_type == MLX5_FLOW_NAMESPACE_FDB)
post_attr->esw_attr->split_count = 0;
err = xa_alloc(&post_act->ids, &handle->id, post_attr,
XA_LIMIT(1, MLX5_POST_ACTION_MAX), GFP_KERNEL);
if (err)
goto err_xarray;
/* Post action rule matches on fte_id and executes original rule's
* tc rule action
*/
mlx5e_tc_match_to_reg_match(spec, FTEID_TO_REG,
handle->id, MLX5_POST_ACTION_MASK);
handle->rule = mlx5_tc_rule_insert(post_act->priv, spec, post_attr);
if (IS_ERR(handle->rule)) {
err = PTR_ERR(handle->rule);
netdev_warn(post_act->priv->netdev, "Failed to add post action rule");
goto err_rule;
}
handle->attr = post_attr;
kvfree(spec);
return handle;
err_rule:
xa_erase(&post_act->ids, handle->id);
err_xarray:
kfree(post_attr);
kvfree(spec);
kfree(handle);
return ERR_PTR(err);
}
void
mlx5e_tc_post_act_del(struct mlx5e_post_act *post_act, struct mlx5e_post_act_handle *handle)
{
mlx5_tc_rule_delete(post_act->priv, handle->rule, handle->attr);
xa_erase(&post_act->ids, handle->id);
kfree(handle->attr);
kfree(handle);
}
struct mlx5_flow_table *
mlx5e_tc_post_act_get_ft(struct mlx5e_post_act *post_act)
{
return post_act->ft;
}
/* Allocate a header modify action to write the post action handle fte id to a register. */
int
mlx5e_tc_post_act_set_handle(struct mlx5_core_dev *dev,
struct mlx5e_post_act_handle *handle,
struct mlx5e_tc_mod_hdr_acts *acts)
{
return mlx5e_tc_match_to_reg_set(dev, acts, handle->ns_type, FTEID_TO_REG, handle->id);
}
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
#ifndef __MLX5_POST_ACTION_H__
#define __MLX5_POST_ACTION_H__
#include "en.h"
#include "lib/fs_chains.h"
struct mlx5_flow_attr;
struct mlx5e_priv;
struct mlx5e_tc_mod_hdr_acts;
struct mlx5e_post_act *
mlx5e_tc_post_act_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains,
enum mlx5_flow_namespace_type ns_type);
void
mlx5e_tc_post_act_destroy(struct mlx5e_post_act *post_act);
struct mlx5e_post_act_handle *
mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *attr);
void
mlx5e_tc_post_act_del(struct mlx5e_post_act *post_act, struct mlx5e_post_act_handle *handle);
struct mlx5_flow_table *
mlx5e_tc_post_act_get_ft(struct mlx5e_post_act *post_act);
int
mlx5e_tc_post_act_set_handle(struct mlx5_core_dev *dev,
struct mlx5e_post_act_handle *handle,
struct mlx5e_tc_mod_hdr_acts *acts);
#endif /* __MLX5_POST_ACTION_H__ */
......@@ -4,39 +4,38 @@
#ifndef __MLX5_EN_TC_SAMPLE_H__
#define __MLX5_EN_TC_SAMPLE_H__
#include "en.h"
#include "eswitch.h"
struct mlx5e_priv;
struct mlx5_flow_attr;
struct mlx5_esw_psample;
struct mlx5e_tc_psample;
struct mlx5e_post_act;
struct mlx5_sample_attr {
struct mlx5e_sample_attr {
u32 group_num;
u32 rate;
u32 trunc_size;
u32 restore_obj_id;
u32 sampler_id;
struct mlx5_flow_table *sample_default_tbl;
struct mlx5_sample_flow *sample_flow;
struct mlx5e_sample_flow *sample_flow;
};
void mlx5_esw_sample_skb(struct sk_buff *skb, struct mlx5_mapped_obj *mapped_obj);
void mlx5e_tc_sample_skb(struct sk_buff *skb, struct mlx5_mapped_obj *mapped_obj);
struct mlx5_flow_handle *
mlx5_esw_sample_offload(struct mlx5_esw_psample *sample_priv,
mlx5e_tc_sample_offload(struct mlx5e_tc_psample *sample_priv,
struct mlx5_flow_spec *spec,
struct mlx5_flow_attr *attr);
struct mlx5_flow_attr *attr,
u32 tunnel_id);
void
mlx5_esw_sample_unoffload(struct mlx5_esw_psample *sample_priv,
mlx5e_tc_sample_unoffload(struct mlx5e_tc_psample *sample_priv,
struct mlx5_flow_handle *rule,
struct mlx5_flow_attr *attr);
struct mlx5_esw_psample *
mlx5_esw_sample_init(struct mlx5e_priv *priv);
struct mlx5e_tc_psample *
mlx5e_tc_sample_init(struct mlx5_eswitch *esw, struct mlx5e_post_act *post_act);
void
mlx5_esw_sample_cleanup(struct mlx5_esw_psample *esw_psample);
mlx5e_tc_sample_cleanup(struct mlx5e_tc_psample *tc_psample);
#endif /* __MLX5_EN_TC_SAMPLE_H__ */
......@@ -92,7 +92,8 @@ struct mlx5_ct_attr {
struct mlx5_tc_ct_priv *
mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains,
struct mod_hdr_tbl *mod_hdr,
enum mlx5_flow_namespace_type ns_type);
enum mlx5_flow_namespace_type ns_type,
struct mlx5e_post_act *post_act);
void
mlx5_tc_ct_clean(struct mlx5_tc_ct_priv *ct_priv);
......@@ -132,7 +133,8 @@ mlx5e_tc_ct_restore_flow(struct mlx5_tc_ct_priv *ct_priv,
static inline struct mlx5_tc_ct_priv *
mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains,
struct mod_hdr_tbl *mod_hdr,
enum mlx5_flow_namespace_type ns_type)
enum mlx5_flow_namespace_type ns_type,
struct mlx5e_post_act *post_act)
{
return NULL;
}
......
......@@ -60,6 +60,7 @@ struct mlx5e_neigh_update_table {
struct mlx5_tc_ct_priv;
struct mlx5e_rep_bond;
struct mlx5e_tc_tun_encap;
struct mlx5e_post_act;
struct mlx5_rep_uplink_priv {
/* Filters DB - instantiated by the uplink representor and shared by
......@@ -88,8 +89,9 @@ struct mlx5_rep_uplink_priv {
/* maps tun_enc_opts to a unique id*/
struct mapping_ctx *tunnel_enc_opts_mapping;
struct mlx5e_post_act *post_act;
struct mlx5_tc_ct_priv *ct_priv;
struct mlx5_esw_psample *esw_psample;
struct mlx5e_tc_psample *tc_psample;
/* support eswitch vports bonding */
struct mlx5e_rep_bond *bond;
......
......@@ -47,6 +47,7 @@
#include <net/bareudp.h>
#include <net/bonding.h>
#include "en.h"
#include "en/tc/post_act.h"
#include "en_rep.h"
#include "en/rep/tc.h"
#include "en/rep/neigh.h"
......@@ -60,7 +61,7 @@
#include "en/mod_hdr.h"
#include "en/tc_priv.h"
#include "en/tc_tun_encap.h"
#include "esw/sample.h"
#include "en/tc/sample.h"
#include "lib/devcom.h"
#include "lib/geneve.h"
#include "lib/fs_chains.h"
......@@ -246,7 +247,7 @@ get_ct_priv(struct mlx5e_priv *priv)
}
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
static struct mlx5_esw_psample *
static struct mlx5e_tc_psample *
get_sample_priv(struct mlx5e_priv *priv)
{
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
......@@ -257,7 +258,7 @@ get_sample_priv(struct mlx5e_priv *priv)
uplink_rpriv = mlx5_eswitch_get_uplink_priv(esw, REP_ETH);
uplink_priv = &uplink_rpriv->uplink_priv;
return uplink_priv->esw_psample;
return uplink_priv->tc_psample;
}
return NULL;
......@@ -1147,7 +1148,8 @@ mlx5e_tc_offload_fdb_rules(struct mlx5_eswitch *esw,
mod_hdr_acts);
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
} else if (flow_flag_test(flow, SAMPLE)) {
rule = mlx5_esw_sample_offload(get_sample_priv(flow->priv), spec, attr);
rule = mlx5e_tc_sample_offload(get_sample_priv(flow->priv), spec, attr,
mlx5e_tc_get_flow_tun_id(flow));
#endif
} else {
rule = mlx5_eswitch_add_offloaded_rule(esw, spec, attr);
......@@ -1186,7 +1188,7 @@ void mlx5e_tc_unoffload_fdb_rules(struct mlx5_eswitch *esw,
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
if (flow_flag_test(flow, SAMPLE)) {
mlx5_esw_sample_unoffload(get_sample_priv(flow->priv), flow->rule[0], attr);
mlx5e_tc_sample_unoffload(get_sample_priv(flow->priv), flow->rule[0], attr);
return;
}
#endif
......@@ -1550,6 +1552,7 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv,
else
mlx5e_detach_mod_hdr(priv, flow);
}
kfree(attr->sample_attr);
kvfree(attr->parse_attr);
kvfree(attr->esw_attr->rx_tun_attr);
......@@ -1559,7 +1562,6 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv,
if (flow_flag_test(flow, L3_TO_L2_DECAP))
mlx5e_detach_decap(priv, flow);
kfree(flow->attr->esw_attr->sample);
kfree(flow->attr);
}
......@@ -1624,17 +1626,22 @@ static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
}
}
static int flow_has_tc_fwd_action(struct flow_cls_offload *f)
static bool flow_requires_tunnel_mapping(u32 chain, struct flow_cls_offload *f)
{
struct flow_rule *rule = flow_cls_offload_flow_rule(f);
struct flow_action *flow_action = &rule->action;
const struct flow_action_entry *act;
int i;
if (chain)
return false;
flow_action_for_each(i, act, flow_action) {
switch (act->id) {
case FLOW_ACTION_GOTO:
return true;
case FLOW_ACTION_SAMPLE:
return true;
default:
continue;
}
......@@ -1875,7 +1882,7 @@ static int parse_tunnel_attr(struct mlx5e_priv *priv,
return -EOPNOTSUPP;
needs_mapping = !!flow->attr->chain;
sets_mapping = !flow->attr->chain && flow_has_tc_fwd_action(f);
sets_mapping = flow_requires_tunnel_mapping(flow->attr->chain, f);
*match_inner = !needs_mapping;
if ((needs_mapping || sets_mapping) &&
......@@ -3716,13 +3723,13 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv,
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct mlx5e_tc_flow_parse_attr *parse_attr;
struct mlx5e_rep_priv *rpriv = priv->ppriv;
struct mlx5e_sample_attr sample_attr = {};
const struct ip_tunnel_info *info = NULL;
struct mlx5_flow_attr *attr = flow->attr;
int ifindexes[MLX5_MAX_FLOW_FWD_VPORTS];
bool ft_flow = mlx5e_is_ft_flow(flow);
const struct flow_action_entry *act;
struct mlx5_esw_flow_attr *esw_attr;
struct mlx5_sample_attr sample = {};
bool encap = false, decap = false;
u32 action = attr->action;
int err, i, if_count = 0;
......@@ -3993,10 +4000,10 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv,
NL_SET_ERR_MSG_MOD(extack, "Sample action with connection tracking is not supported");
return -EOPNOTSUPP;
}
sample.rate = act->sample.rate;
sample.group_num = act->sample.psample_group->group_num;
sample_attr.rate = act->sample.rate;
sample_attr.group_num = act->sample.psample_group->group_num;
if (act->sample.truncate)
sample.trunc_size = act->sample.trunc_size;
sample_attr.trunc_size = act->sample.trunc_size;
flow_flag_set(flow, SAMPLE);
break;
default:
......@@ -4081,10 +4088,10 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv,
* no errors after parsing.
*/
if (flow_flag_test(flow, SAMPLE)) {
esw_attr->sample = kzalloc(sizeof(*esw_attr->sample), GFP_KERNEL);
if (!esw_attr->sample)
attr->sample_attr = kzalloc(sizeof(*attr->sample_attr), GFP_KERNEL);
if (!attr->sample_attr)
return -ENOMEM;
*esw_attr->sample = sample;
*attr->sample_attr = sample_attr;
}
return 0;
......@@ -4682,7 +4689,7 @@ static int apply_police_params(struct mlx5e_priv *priv, u64 rate,
rate_mbps = max_t(u32, rate, 1);
}
err = mlx5_esw_modify_vport_rate(esw, vport_num, rate_mbps);
err = mlx5_esw_qos_modify_vport_rate(esw, vport_num, rate_mbps);
if (err)
NL_SET_ERR_MSG_MOD(extack, "failed applying action to hardware");
......@@ -4895,8 +4902,9 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
goto err_chains;
}
tc->post_act = mlx5e_tc_post_act_init(priv, tc->chains, MLX5_FLOW_NAMESPACE_KERNEL);
tc->ct = mlx5_tc_ct_init(priv, tc->chains, &priv->fs.tc.mod_hdr,
MLX5_FLOW_NAMESPACE_KERNEL);
MLX5_FLOW_NAMESPACE_KERNEL, tc->post_act);
tc->netdevice_nb.notifier_call = mlx5e_tc_netdev_event;
err = register_netdevice_notifier_dev_net(priv->netdev,
......@@ -4912,6 +4920,7 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
err_reg:
mlx5_tc_ct_clean(tc->ct);
mlx5e_tc_post_act_destroy(tc->post_act);
mlx5_chains_destroy(tc->chains);
err_chains:
mapping_destroy(chains_mapping);
......@@ -4950,6 +4959,7 @@ void mlx5e_tc_nic_cleanup(struct mlx5e_priv *priv)
mutex_destroy(&tc->t_lock);
mlx5_tc_ct_clean(tc->ct);
mlx5e_tc_post_act_destroy(tc->post_act);
mapping_destroy(tc->mapping);
mlx5_chains_destroy(tc->chains);
}
......@@ -4970,13 +4980,16 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht)
priv = netdev_priv(rpriv->netdev);
esw = priv->mdev->priv.eswitch;
uplink_priv->post_act = mlx5e_tc_post_act_init(priv, esw_chains(esw),
MLX5_FLOW_NAMESPACE_FDB);
uplink_priv->ct_priv = mlx5_tc_ct_init(netdev_priv(priv->netdev),
esw_chains(esw),
&esw->offloads.mod_hdr,
MLX5_FLOW_NAMESPACE_FDB);
MLX5_FLOW_NAMESPACE_FDB,
uplink_priv->post_act);
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
uplink_priv->esw_psample = mlx5_esw_sample_init(netdev_priv(priv->netdev));
uplink_priv->tc_psample = mlx5e_tc_sample_init(esw, uplink_priv->post_act);
#endif
mapping_id = mlx5_query_nic_system_image_guid(esw->dev);
......@@ -5022,11 +5035,12 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht)
mapping_destroy(uplink_priv->tunnel_mapping);
err_tun_mapping:
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
mlx5_esw_sample_cleanup(uplink_priv->esw_psample);
mlx5e_tc_sample_cleanup(uplink_priv->tc_psample);
#endif
mlx5_tc_ct_clean(uplink_priv->ct_priv);
netdev_warn(priv->netdev,
"Failed to initialize tc (eswitch), err: %d", err);
mlx5e_tc_post_act_destroy(uplink_priv->post_act);
return err;
}
......@@ -5043,9 +5057,10 @@ void mlx5e_tc_esw_cleanup(struct rhashtable *tc_ht)
mapping_destroy(uplink_priv->tunnel_mapping);
#if IS_ENABLED(CONFIG_MLX5_TC_SAMPLE)
mlx5_esw_sample_cleanup(uplink_priv->esw_psample);
mlx5e_tc_sample_cleanup(uplink_priv->tc_psample);
#endif
mlx5_tc_ct_clean(uplink_priv->ct_priv);
mlx5e_tc_post_act_destroy(uplink_priv->post_act);
}
int mlx5e_tc_num_filters(struct mlx5e_priv *priv, unsigned long flags)
......
......@@ -70,6 +70,7 @@ struct mlx5_flow_attr {
struct mlx5_fc *counter;
struct mlx5_modify_hdr *modify_hdr;
struct mlx5_ct_attr ct_attr;
struct mlx5e_sample_attr *sample_attr;
struct mlx5e_tc_flow_parse_attr *parse_attr;
u32 chain;
u16 prio;
......
......@@ -91,9 +91,15 @@ int mlx5_esw_offloads_devlink_port_register(struct mlx5_eswitch *esw, u16 vport_
if (err)
goto reg_err;
err = devlink_rate_leaf_create(dl_port, vport);
if (err)
goto rate_err;
vport->dl_port = dl_port;
return 0;
rate_err:
devlink_port_unregister(dl_port);
reg_err:
mlx5_esw_dl_port_free(dl_port);
return err;
......@@ -109,6 +115,12 @@ void mlx5_esw_offloads_devlink_port_unregister(struct mlx5_eswitch *esw, u16 vpo
vport = mlx5_eswitch_get_vport(esw, vport_num);
if (IS_ERR(vport))
return;
if (vport->dl_port->devlink_rate) {
mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
devlink_rate_leaf_destroy(vport->dl_port);
}
devlink_port_unregister(vport->dl_port);
mlx5_esw_dl_port_free(vport->dl_port);
vport->dl_port = NULL;
......@@ -148,8 +160,16 @@ int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_p
if (err)
return err;
err = devlink_rate_leaf_create(dl_port, vport);
if (err)
goto rate_err;
vport->dl_port = dl_port;
return 0;
rate_err:
devlink_port_unregister(dl_port);
return err;
}
void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num)
......@@ -159,6 +179,12 @@ void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num
vport = mlx5_eswitch_get_vport(esw, vport_num);
if (IS_ERR(vport))
return;
if (vport->dl_port->devlink_rate) {
mlx5_esw_qos_vport_update_group(esw, vport, NULL, NULL);
devlink_rate_leaf_destroy(vport->dl_port);
}
devlink_port_unregister(vport->dl_port);
vport->dl_port = NULL;
}
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM mlx5
#if !defined(_MLX5_ESW_TP_) || defined(TRACE_HEADER_MULTI_READ)
#define _MLX5_ESW_TP_
#include <linux/tracepoint.h>
#include "eswitch.h"
TRACE_EVENT(mlx5_esw_vport_qos_destroy,
TP_PROTO(const struct mlx5_vport *vport),
TP_ARGS(vport),
TP_STRUCT__entry(__string(devname, dev_name(vport->dev->device))
__field(unsigned short, vport_id)
__field(unsigned int, tsar_ix)
),
TP_fast_assign(__assign_str(devname, dev_name(vport->dev->device));
__entry->vport_id = vport->vport;
__entry->tsar_ix = vport->qos.esw_tsar_ix;
),
TP_printk("(%s) vport=%hu tsar_ix=%u\n",
__get_str(devname), __entry->vport_id, __entry->tsar_ix
)
);
DECLARE_EVENT_CLASS(mlx5_esw_vport_qos_template,
TP_PROTO(const struct mlx5_vport *vport, u32 bw_share, u32 max_rate),
TP_ARGS(vport, bw_share, max_rate),
TP_STRUCT__entry(__string(devname, dev_name(vport->dev->device))
__field(unsigned short, vport_id)
__field(unsigned int, tsar_ix)
__field(unsigned int, bw_share)
__field(unsigned int, max_rate)
__field(void *, group)
),
TP_fast_assign(__assign_str(devname, dev_name(vport->dev->device));
__entry->vport_id = vport->vport;
__entry->tsar_ix = vport->qos.esw_tsar_ix;
__entry->bw_share = bw_share;
__entry->max_rate = max_rate;
__entry->group = vport->qos.group;
),
TP_printk("(%s) vport=%hu tsar_ix=%u bw_share=%u, max_rate=%u group=%p\n",
__get_str(devname), __entry->vport_id, __entry->tsar_ix,
__entry->bw_share, __entry->max_rate, __entry->group
)
);
DEFINE_EVENT(mlx5_esw_vport_qos_template, mlx5_esw_vport_qos_create,
TP_PROTO(const struct mlx5_vport *vport, u32 bw_share, u32 max_rate),
TP_ARGS(vport, bw_share, max_rate)
);
DEFINE_EVENT(mlx5_esw_vport_qos_template, mlx5_esw_vport_qos_config,
TP_PROTO(const struct mlx5_vport *vport, u32 bw_share, u32 max_rate),
TP_ARGS(vport, bw_share, max_rate)
);
DECLARE_EVENT_CLASS(mlx5_esw_group_qos_template,
TP_PROTO(const struct mlx5_core_dev *dev,
const struct mlx5_esw_rate_group *group,
unsigned int tsar_ix),
TP_ARGS(dev, group, tsar_ix),
TP_STRUCT__entry(__string(devname, dev_name(dev->device))
__field(const void *, group)
__field(unsigned int, tsar_ix)
),
TP_fast_assign(__assign_str(devname, dev_name(dev->device));
__entry->group = group;
__entry->tsar_ix = tsar_ix;
),
TP_printk("(%s) group=%p tsar_ix=%u\n",
__get_str(devname), __entry->group, __entry->tsar_ix
)
);
DEFINE_EVENT(mlx5_esw_group_qos_template, mlx5_esw_group_qos_create,
TP_PROTO(const struct mlx5_core_dev *dev,
const struct mlx5_esw_rate_group *group,
unsigned int tsar_ix),
TP_ARGS(dev, group, tsar_ix)
);
DEFINE_EVENT(mlx5_esw_group_qos_template, mlx5_esw_group_qos_destroy,
TP_PROTO(const struct mlx5_core_dev *dev,
const struct mlx5_esw_rate_group *group,
unsigned int tsar_ix),
TP_ARGS(dev, group, tsar_ix)
);
TRACE_EVENT(mlx5_esw_group_qos_config,
TP_PROTO(const struct mlx5_core_dev *dev,
const struct mlx5_esw_rate_group *group,
unsigned int tsar_ix, u32 bw_share, u32 max_rate),
TP_ARGS(dev, group, tsar_ix, bw_share, max_rate),
TP_STRUCT__entry(__string(devname, dev_name(dev->device))
__field(const void *, group)
__field(unsigned int, tsar_ix)
__field(unsigned int, bw_share)
__field(unsigned int, max_rate)
),
TP_fast_assign(__assign_str(devname, dev_name(dev->device));
__entry->group = group;
__entry->tsar_ix = tsar_ix;
__entry->bw_share = bw_share;
__entry->max_rate = max_rate;
),
TP_printk("(%s) group=%p tsar_ix=%u bw_share=%u max_rate=%u\n",
__get_str(devname), __entry->group, __entry->tsar_ix,
__entry->bw_share, __entry->max_rate
)
);
#endif /* _MLX5_ESW_TP_ */
/* This part must be outside protection */
#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH esw/diag
#undef TRACE_INCLUDE_FILE
#define TRACE_INCLUDE_FILE qos_tracepoint
#include <trace/define_trace.h>
......@@ -11,6 +11,7 @@
#include "mlx5_core.h"
#include "eswitch.h"
#include "fs_core.h"
#include "esw/qos.h"
enum {
LEGACY_VEPA_PRIO = 0,
......@@ -508,3 +509,22 @@ int mlx5_eswitch_set_vport_trust(struct mlx5_eswitch *esw,
mutex_unlock(&esw->state_lock);
return err;
}
int mlx5_eswitch_set_vport_rate(struct mlx5_eswitch *esw, u16 vport,
u32 max_rate, u32 min_rate)
{
struct mlx5_vport *evport = mlx5_eswitch_get_vport(esw, vport);
int err;
if (!mlx5_esw_allowed(esw))
return -EPERM;
if (IS_ERR(evport))
return PTR_ERR(evport);
mutex_lock(&esw->state_lock);
err = mlx5_esw_qos_set_vport_min_rate(esw, evport, min_rate, NULL);
if (!err)
err = mlx5_esw_qos_set_vport_max_rate(esw, evport, max_rate, NULL);
mutex_unlock(&esw->state_lock);
return err;
}
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/* Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
#ifndef __MLX5_ESW_QOS_H__
#define __MLX5_ESW_QOS_H__
#ifdef CONFIG_MLX5_ESWITCH
int mlx5_esw_qos_set_vport_min_rate(struct mlx5_eswitch *esw,
struct mlx5_vport *evport,
u32 min_rate,
struct netlink_ext_ack *extack);
int mlx5_esw_qos_set_vport_max_rate(struct mlx5_eswitch *esw,
struct mlx5_vport *evport,
u32 max_rate,
struct netlink_ext_ack *extack);
void mlx5_esw_qos_create(struct mlx5_eswitch *esw);
void mlx5_esw_qos_destroy(struct mlx5_eswitch *esw);
int mlx5_esw_qos_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport,
u32 max_rate, u32 bw_share);
void mlx5_esw_qos_vport_disable(struct mlx5_eswitch *esw, struct mlx5_vport *vport);
int mlx5_esw_devlink_rate_leaf_tx_share_set(struct devlink_rate *rate_leaf, void *priv,
u64 tx_share, struct netlink_ext_ack *extack);
int mlx5_esw_devlink_rate_leaf_tx_max_set(struct devlink_rate *rate_leaf, void *priv,
u64 tx_max, struct netlink_ext_ack *extack);
int mlx5_esw_devlink_rate_node_tx_share_set(struct devlink_rate *rate_node, void *priv,
u64 tx_share, struct netlink_ext_ack *extack);
int mlx5_esw_devlink_rate_node_tx_max_set(struct devlink_rate *rate_node, void *priv,
u64 tx_max, struct netlink_ext_ack *extack);
int mlx5_esw_devlink_rate_node_new(struct devlink_rate *rate_node, void **priv,
struct netlink_ext_ack *extack);
int mlx5_esw_devlink_rate_node_del(struct devlink_rate *rate_node, void *priv,
struct netlink_ext_ack *extack);
int mlx5_esw_devlink_rate_parent_set(struct devlink_rate *devlink_rate,
struct devlink_rate *parent,
void *priv, void *parent_priv,
struct netlink_ext_ack *extack);
#endif
#endif
......@@ -46,7 +46,7 @@
#include "lib/fs_chains.h"
#include "sf/sf.h"
#include "en/tc_ct.h"
#include "esw/sample.h"
#include "en/tc/sample.h"
enum mlx5_mapped_obj_type {
MLX5_MAPPED_OBJ_CHAIN,
......@@ -61,6 +61,7 @@ struct mlx5_mapped_obj {
u32 group_id;
u32 rate;
u32 trunc_size;
u32 tunnel_id;
} sample;
};
};
......@@ -75,11 +76,6 @@ struct mlx5_mapped_obj {
#define MLX5_MAX_MC_PER_VPORT(dev) \
(1 << MLX5_CAP_GEN(dev, log_max_current_mc_list))
#define MLX5_MIN_BW_SHARE 1
#define MLX5_RATE_TO_BW_SHARE(rate, divider, limit) \
min_t(u32, max_t(u32, (rate) / (divider), MLX5_MIN_BW_SHARE), limit)
#define mlx5_esw_has_fwd_fdb(dev) \
MLX5_CAP_ESW_FLOWTABLE(dev, fdb_multi_path_to_table)
......@@ -181,6 +177,7 @@ struct mlx5_vport {
u32 bw_share;
u32 min_rate;
u32 max_rate;
struct mlx5_esw_rate_group *group;
} qos;
u16 vport;
......@@ -309,7 +306,9 @@ struct mlx5_eswitch {
struct {
bool enabled;
u32 root_tsar_id;
u32 root_tsar_ix;
struct mlx5_esw_rate_group *group0;
struct list_head groups; /* Protected by esw->state_lock */
} qos;
struct mlx5_esw_bridge_offloads *br_offloads;
......@@ -335,8 +334,7 @@ int mlx5_esw_offloads_vport_metadata_set(struct mlx5_eswitch *esw, bool enable);
u32 mlx5_esw_match_metadata_alloc(struct mlx5_eswitch *esw);
void mlx5_esw_match_metadata_free(struct mlx5_eswitch *esw, u32 metadata);
int mlx5_esw_modify_vport_rate(struct mlx5_eswitch *esw, u16 vport_num,
u32 rate_mbps);
int mlx5_esw_qos_modify_vport_rate(struct mlx5_eswitch *esw, u16 vport_num, u32 rate_mbps);
/* E-Switch API */
int mlx5_eswitch_init(struct mlx5_core_dev *dev);
......@@ -359,6 +357,10 @@ int mlx5_eswitch_set_vport_trust(struct mlx5_eswitch *esw,
u16 vport_num, bool setting);
int mlx5_eswitch_set_vport_rate(struct mlx5_eswitch *esw, u16 vport,
u32 max_rate, u32 min_rate);
int mlx5_esw_qos_vport_update_group(struct mlx5_eswitch *esw,
struct mlx5_vport *vport,
struct mlx5_esw_rate_group *group,
struct netlink_ext_ack *extack);
int mlx5_eswitch_set_vepa(struct mlx5_eswitch *esw, u8 setting);
int mlx5_eswitch_get_vepa(struct mlx5_eswitch *esw, u8 *setting);
int mlx5_eswitch_get_vport_config(struct mlx5_eswitch *esw,
......@@ -469,7 +471,6 @@ struct mlx5_esw_flow_attr {
} dests[MLX5_MAX_FLOW_FWD_VPORTS];
struct mlx5_rx_tun_attr *rx_tun_attr;
struct mlx5_pkt_reformat *decap_pkt_reformat;
struct mlx5_sample_attr *sample;
};
int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode,
......
......@@ -187,12 +187,12 @@ esw_cleanup_decap_indir(struct mlx5_eswitch *esw,
static int
esw_setup_sampler_dest(struct mlx5_flow_destination *dest,
struct mlx5_flow_act *flow_act,
struct mlx5_esw_flow_attr *esw_attr,
struct mlx5_flow_attr *attr,
int i)
{
flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_SAMPLER;
dest[i].sampler_id = esw_attr->sample->sampler_id;
dest[i].sampler_id = attr->sample_attr->sampler_id;
return 0;
}
......@@ -435,7 +435,7 @@ esw_setup_dests(struct mlx5_flow_destination *dest,
attr->flags |= MLX5_ESW_ATTR_FLAG_SRC_REWRITE;
if (attr->flags & MLX5_ESW_ATTR_FLAG_SAMPLE) {
esw_setup_sampler_dest(dest, flow_act, esw_attr, *i);
esw_setup_sampler_dest(dest, flow_act, attr, *i);
(*i)++;
} else if (attr->dest_ft) {
esw_setup_ft_dest(dest, flow_act, esw, attr, spec, *i);
......@@ -540,10 +540,7 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR)
flow_act.modify_hdr = attr->modify_hdr;
/* esw_attr->sample is allocated only when there is a sample action */
if (esw_attr->sample && esw_attr->sample->sample_default_tbl) {
fdb = esw_attr->sample->sample_default_tbl;
} else if (split) {
if (split) {
fwd_attr.chain = attr->chain;
fwd_attr.prio = attr->prio;
fwd_attr.vport = esw_attr->in_rep->vport;
......
......@@ -865,7 +865,8 @@ struct mlx5_ifc_qos_cap_bits {
u8 nic_bw_share[0x1];
u8 nic_rate_limit[0x1];
u8 packet_pacing_uid[0x1];
u8 reserved_at_c[0x14];
u8 log_esw_max_sched_depth[0x4];
u8 reserved_at_10[0x10];
u8 reserved_at_20[0xb];
u8 log_max_qos_nic_queue_group[0x5];
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment