- 19 Oct, 2021 33 commits
-
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Break the address up into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Invert the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced an rbtree for faster Ethernet address lookup. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
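The conversion these commits describe can be sketched in userspace C roughly as follows. This is a minimal illustration, not kernel code: `struct fake_netdev`, `fake_eth_hw_addr_set()`, `read_mac_byte()` and `probe_set_mac()` are hypothetical stand-ins for `struct net_device`, `eth_hw_addr_set()` and a driver's hardware reads, and the `addr_updates` counter only stands in for whatever bookkeeping the real helper performs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
	uint8_t dev_addr[ETH_ALEN];
	unsigned int addr_updates; /* stands in for the address bookkeeping */
};

/* Hypothetical stand-in for eth_hw_addr_set(): the only writer of dev_addr. */
static void fake_eth_hw_addr_set(struct fake_netdev *dev, const uint8_t *addr)
{
	assert(dev && addr);
	memcpy(dev->dev_addr, addr, ETH_ALEN);
	dev->addr_updates++;
}

/* Hypothetical hardware access: returns one byte of the MAC address. */
static uint8_t read_mac_byte(const uint8_t *regs, int i)
{
	return regs[i];
}

/* Read the address into a stack array first, then hand it to the helper,
 * instead of writing into dev->dev_addr byte by byte. */
static void probe_set_mac(struct fake_netdev *dev, const uint8_t *regs)
{
	uint8_t addr[ETH_ALEN];
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		addr[i] = read_mac_byte(regs, i);
	fake_eth_hw_addr_set(dev, addr);
}
```

The point of the temporary array is that the device address is updated in one call, so the helper sees every write.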
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: Multi-level qdisc offload Petr says: Currently, mlxsw admits for offload a suitable root qdisc, and its children. Thus up to two levels of hierarchy are offloaded. Often, this is enough: one can configure TCs with RED and TCs with a shaper on, and can even see counters for each TC by looking at a qdisc at a sufficiently shallow position. While simple, the system has obvious shortcomings. It is not possible to configure both RED and shaping on one TC. It is not possible to place a PRIO below root TBF, which would then be offloaded as port shaper. FIFOs are only offloaded at root or directly below, which is confusing to users, because RED and TBF of course have their own FIFO. This patch set lifts assumptions that prevent offloading multi-level qdisc trees. In patch #1, offload of a graft operation is added to TBF. Grafts are issued as another qdisc is linked to the qdisc in question, and give drivers a chance to react to the linking. The absence of this event was not a major issue so far, because TBF was not considered classful, which changes with this patchset. The codebase currently assumes that ETS and PRIO are the only classful qdiscs. The following patches gradually lift this assumption. In patch #2, calculation of traffic class and priomap of a qdisc is fixed. Patch #3 fixes handling of future FIFOs. Child FIFO qdiscs may be created and notified before their parent qdisc exists and therefore need special handling. Patches #4, #5 and #6 unify, respectively, child destruction, child grafting, and cleanup of statistics. Patch #7 adds a function that validates whether a given qdisc topology is offloadable. Finally in patch #8, TBF and RED become classful. At this point, FIFO qdiscs grafted to an offloaded qdisc should always be offloaded. Patch #9 adds a selftest to verify some offloadable and unoffloadable qdisc trees. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
This checks that various qdisc configurations either are or are not offloaded. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Permit offloading qdiscs below RED and TBF. In order to avoid having to implement trivial propagating callbacks for get_prio_bitmap and get_tclass_num, extend mlxsw_sp_qdisc_get_prio_bitmap() and ..._get_tclass_num() to handle the lack of the callback as a cue to forward the request to the parent. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
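The "missing callback means forward to the parent" idea can be sketched as below. This is an assumption-laden userspace model, not the mlxsw code: all `sketch_*` names are hypothetical, the band-to-traffic-class mapping is simplified to the identity, and the default value is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_qdisc;

struct sketch_qdisc_ops {
	/* Optional. NULL is the cue to forward the question to the parent. */
	int (*get_tclass_num)(struct sketch_qdisc *self,
			      struct sketch_qdisc *child);
};

struct sketch_qdisc {
	const struct sketch_qdisc_ops *ops;
	struct sketch_qdisc *parent;
	int band; /* position of this qdisc under its parent */
};

#define SKETCH_DEFAULT_TCLASS 0 /* hypothetical default when nobody answers */

/* Ask the nearest ancestor that implements the callback. */
static int sketch_get_tclass_num(struct sketch_qdisc *q)
{
	struct sketch_qdisc *child = q;
	struct sketch_qdisc *parent = q->parent;

	while (parent) {
		if (parent->ops->get_tclass_num)
			return parent->ops->get_tclass_num(parent, child);
		/* No callback: the parent asks on its own behalf one level up. */
		child = parent;
		parent = parent->parent;
	}
	return SKETCH_DEFAULT_TCLASS;
}

/* An ETS/PRIO-like qdisc knows the answer right away. */
static int sketch_ets_get_tclass_num(struct sketch_qdisc *self,
				     struct sketch_qdisc *child)
{
	(void)self;
	return child->band;
}

static const struct sketch_qdisc_ops sketch_ets_ops = {
	.get_tclass_num = sketch_ets_get_tclass_num,
};

/* RED/TBF-like qdiscs leave the callback unset and thus forward. */
static const struct sketch_qdisc_ops sketch_forward_ops = {
	.get_tclass_num = NULL,
};
```

With this layout, a FIFO grafted below a RED that sits in band 3 of the root resolves to the same traffic class as the RED itself, without RED needing a trivial pass-through callback.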
-
Petr Machata authored
A following patch will enable offloading qdiscs that are deeper than directly under root qdisc. Currently the topology validation consists of demanding a root qdisc position for ETS and PRIO. Since RED and TBF are considered classless, this is enough. In order to prevent some nonsensical combinations when RED and TBF become classful, introduce a more general topology validator. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
On Spectrum, there are no per-TC TX counters. Instead, mlxsw uses per-prio counters and aggregates them according to the priomap. Therefore when priomap changes, the counter base values need to be reset to reflect the change. Previously, this was only done for the sole child qdisc, but a following patch makes RED and TBF classful. Thus apply the request to the whole sub-tree. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
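A rough model of why the base values must be reset when the priomap changes, assuming eight priorities and a bitmask priomap. The `sketch_*` names and the struct layout are hypothetical and do not mirror the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_PRIO 8

/* Sum the per-priority counters selected by the priomap bitmask. */
static uint64_t sketch_sum_prio(const uint64_t per_prio[SKETCH_NUM_PRIO],
				uint8_t priomap)
{
	uint64_t total = 0;
	int prio;

	for (prio = 0; prio < SKETCH_NUM_PRIO; prio++)
		if (priomap & (1u << prio))
			total += per_prio[prio];
	return total;
}

struct sketch_tc_stats {
	uint8_t priomap;
	uint64_t base; /* aggregate at the time the priomap was set */
};

/* Changing the priomap changes which counters are aggregated, so the
 * base has to be re-read, otherwise the next delta would be bogus. */
static void sketch_set_priomap(struct sketch_tc_stats *s,
			       const uint64_t per_prio[SKETCH_NUM_PRIO],
			       uint8_t priomap)
{
	s->priomap = priomap;
	s->base = sketch_sum_prio(per_prio, priomap);
}

/* TX count attributed to this TC since the last base reset. */
static uint64_t sketch_tc_tx(const struct sketch_tc_stats *s,
			     const uint64_t per_prio[SKETCH_NUM_PRIO])
{
	return sketch_sum_prio(per_prio, s->priomap) - s->base;
}
```

In a multi-level tree the same reset would have to be applied to every qdisc in the sub-tree that reports these aggregated values, which is the change the commit describes.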
-
Petr Machata authored
Qdisc graft operations have so far been reported at PRIO, ETS and RED, with RED events ignored, because RED was not considered a classful qdisc. A following patch will make mlxsw recognize RED and TBF as classful qdiscs, and thus it is necessary to validate grafting at these qdiscs as well. Rename the existing graft validator to make it clear that it is a generic function, and invoke it for RED and TBF graft events as well. Drop the unnecessary PRIO helper and invoke the graft validator directly for PRIO as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Currently ETS and PRIO are the only offloaded classful qdiscs. Since they are both similar, their destroy handler is the same, and it handles children destruction itself. But now it is possible to do it generically for any classful qdisc. Therefore promote the recursive destruction from the ETS handler to mlxsw_sp_qdisc_destroy(), so that RED and TBF pick it up in follow-up patches. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Extract from __mlxsw_sp_qdisc_ets_replace() two helpers for handling of one future FIFO resp. reinitializing the array of future FIFOs. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Currently when keeping track of qdiscs, mlxsw notes the TC and priomap corresponding to each qdisc. That is fine currently, as there only ever is one level of qdiscs to update: the direct children of ETS / PRIO. However as deeper structures are made offloadable, ETS would need to update these values for the complete subtree, and interim qdiscs would need to remember to propagate the value. Instead, reverse the responsibility: child qdiscs can ask their parent what their TC and priomap are. ETS / PRIO know the answer right away, or there are defaults for when the root qdisc does not assign them (e.g. when RED is used as root qdisc). When RED and TBF become classful, they will simply forward the request up to their parent. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
As another qdisc is linked to the TBF, the latter should issue an event to give drivers a chance to react to the grafting. In other qdiscs, this event is called GRAFT, so follow suit with TBF as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
David S. Miller authored
Saeed Mahameed says: mlx5-updates-2021-10-18 Maor Gottlieb says: ======================== Use hash to select the affinity port in VF LAG. The current VF LAG architecture is based on QP association with a port. A QP must be created after LAG is enabled to allow association with a non-native port. VM packets going on the slow path to the eSwitch manager (SW path or hairpin) will be transmitted through a different QP than the VM. This means that different packets of the same flow might egress from different physical ports. This patch-set solves this issue by moving the port selection to be based on the hash function defined by the bond. When the device is moved to VF LAG mode, the driver creates TTC (traffic type classifier) flow tables in order to classify the packet and steer it to the relevant hash function, similar to what is done in the mlx5 RSS implementation. Each rule in the TTC table forwards the packet to the port selection flow table, which has one hash split flow group containing two "catch all" flow table entries. Each entry points to the respective uplink port, as shown below:

               -------------------
               | FT              |
TTC rule ->    |  -----------    |
               |  FG| FTE --|----|-----> uplink of port #1
               |    | FTE --|----|-----> uplink of port #2
               |  -----------    |
               -------------------

A hash split flow group is a flow group that is created with type HASH_SPLIT and associated with a match definer. The match definer defines the fields that are included in the hash calculation. The driver creates the match definer according to the xmit hash policy of the bond driver. Patches overview: ======================== Minor E-Switch updates: - Patch #12, dynamic allocation of dest array - Patch #13, increase number of forward destinations to 32 Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
David S. Miller authored
Mateusz Palczewski says: ==================== 40GbE Intel Wired LAN Driver Updates 2021-10-18 Use a single state machine for driver initialization and for servicing the initialized driver. The init state machine implemented in init_task() is merged into the watchdog_task(). The init_task() function is removed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maor Dickman authored
Increase supported number of forward destinations in the same rule, local and remote, from 2 to 32. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Dickman authored
Use dynamic allocation for the dest array in preparation for the next patch, which increases MLX5_MAX_FLOW_FWD_VPORTS and would cause the stack allocation to be bigger than 1024 bytes. Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
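The shape of this change, moving a large fixed-size array from the stack to the heap, might look like the userspace sketch below. The struct size, the limit macro and the function are hypothetical, and `calloc()`/`free()` stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_MAX_FWD_DESTS 32 /* hypothetical limit */

/* Hypothetical destination descriptor; padded so that 32 of them
 * would clearly exceed a 1024-byte stack frame budget. */
struct sketch_dest {
	int vport;
	unsigned char other_fields[60];
};

/* Build the destination array on the heap and count the usable entries.
 * Returns -1 on bad input or allocation failure. */
static int sketch_count_valid_dests(const int *vports, size_t n)
{
	struct sketch_dest *dest;
	int valid = 0;
	size_t i;

	if (n > SKETCH_MAX_FWD_DESTS)
		return -1;

	/* Previously this would have been: struct sketch_dest dest[MAX]; */
	dest = calloc(SKETCH_MAX_FWD_DESTS, sizeof(*dest));
	if (!dest)
		return -1;

	for (i = 0; i < n; i++) {
		dest[i].vport = vports[i];
		if (dest[i].vport >= 0)
			valid++;
	}

	free(dest);
	return valid;
}
```

The cost is an extra failure path for the allocation, which every caller of such a function then has to handle.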
-
Maor Gottlieb authored
Use the steering based solution to select the affinity port when the LAG mode is based on hash policy and the device supports the port selection flow table. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Add a create function that builds the steering tables, TTC and definers according to the LAG hash type. The destroy function destroys all the steering components. The modify function is used when the bond mapping changes; it iterates over all the rules in the definers and modifies them to steer the packet to the relevant active ports. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Add support to create inner and outer TTC tables for LAG port selection. These tables are used to classify the packets in order to select the related definer. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Every definer will consist of a flow table with a single hash group with exactly two flow table entries, one for each device port. The destination of these entries is the uplink vport according to the port state and hash policy. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
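A software model of such a definer could look like the sketch below: a mask selects which header fields feed the hash, and the hash result picks one of exactly two entries, each pointing at an uplink port. Everything here is hypothetical (the `sketch_*` names, the key layout, and the multiply-xor mix, which is for illustration only and does not model the hardware hash).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_PORTS 2

struct sketch_flow_key {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

struct sketch_definer {
	struct sketch_flow_key mask;        /* fields included in the hash */
	uint8_t fte_port[SKETCH_NUM_PORTS]; /* uplink each entry points to */
};

/* Illustrative mixing step only; not the device's hash function. */
static uint32_t sketch_mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 16777619u;
	return h;
}

static uint32_t sketch_hash(const struct sketch_definer *d,
			    const struct sketch_flow_key *k)
{
	uint32_t h = 2166136261u;

	h = sketch_mix(h, k->src_ip & d->mask.src_ip);
	h = sketch_mix(h, k->dst_ip & d->mask.dst_ip);
	h = sketch_mix(h, (uint32_t)(k->src_port & d->mask.src_port));
	h = sketch_mix(h, (uint32_t)(k->dst_port & d->mask.dst_port));
	return h;
}

/* The hash result selects one of the two flow table entries. */
static uint8_t sketch_select_port(const struct sketch_definer *d,
				  const struct sketch_flow_key *k)
{
	return d->fte_port[sketch_hash(d, k) % SKETCH_NUM_PORTS];
}

/* Re-point the entries, e.g. both at the surviving port after a link
 * failure, without touching the hash definition itself. */
static void sketch_modify_ports(struct sketch_definer *d,
				uint8_t port1, uint8_t port2)
{
	d->fte_port[0] = port1;
	d->fte_port[1] = port2;
}
```

Because the selection depends only on the masked fields, all packets of one flow map to the same entry, and a change in port state only requires rewriting the two destinations.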
-
Maor Gottlieb authored
Set the related bits in the match definer mask according to the TT mapping. This mask will be used to create the match definers. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Generate a traffic type bitmap that will define which steering objects we need to create for the steering based LAG. Bits in this bitmap are set according to the LAG hash type. In addition, have a field that indicates whether the LAG is in encap mode or not. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Downstream patches add another lag related file so it makes sense to have all the lag files in a dedicated directory. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
The uplink destination type should be used in rules to steer the packet to the uplink when the device is in steering based LAG mode. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Introduce new APIs to create and destroy a flow matcher for a given format id. The flow match definer object is used for defining the fields and mask used for the hash calculation. The user should mask the desired fields as is done in the match criteria. This object is assigned to a flow group of type hash. In this flow group type, packet lookup is done based on the hash result. This patch also adds the required bits to create such a flow group. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Add a new port selection flow steering namespace. Flow steering rules in this namespace are used to determine the physical port for egress packets. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Maor Gottlieb authored
Add bitmasks to ttc_params to indicate whether a rule is valid or not. This allows creating a TTC table that supports only a subset of the traffic types. In later patches, which introduce the steering based LAG port selection, TTC will be created with only a subset of the rules, according to the hash type. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
- 18 Oct, 2021 7 commits
-
-
Shai Malin authored
Change the common TCP variable "iscsi_ooo" to "ooo_opq". This variable is common to all the TCP L5 protocols and is not specific to iSCSI. Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Link: https://lore.kernel.org/r/20211015124118.29041-2-smalin@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Shai Malin authored
Optimize the ll2 TCP out-of-order likely flows: - Optimize the non-error flows of the ll2 ooo data path. - Optimize "QED_OOO_RIGHT_BUF" over "QED_OOO_LEFT_BUF". Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Link: https://lore.kernel.org/r/20211015124118.29041-1-smalin@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Lukas Bulwahn authored
Commit e330fb14 ("of: net: move of_net under net/") moves of_net.c to ./net/core/, but fails to adjust the reference to this file in MAINTAINERS. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains: warning: no file matches F: drivers/of/of_net.c Adjust the file entry after this file movement. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20211016055815.14397-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Mateusz Palczewski authored
Use a single state machine for driver initialization and for servicing the initialized driver. The init state machine implemented in init_task() is merged into the watchdog_task(). The init_task() function is removed. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Mateusz Palczewski authored
This commit adds a new state, __IAVF_INIT_FAILED to the state machine. From now on initialization functions report errors not by returning an error value, but by changing the state to indicate that something went wrong. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Mateusz Palczewski authored
Replace state changes of the iavf state machine with a method that also tracks the previous state the machine was in. This change is required for further work on refactoring the init and watchdog state machines. Tracking the previous state would help us recover iavf after a failure has occurred. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
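A minimal sketch of a state setter that remembers the previous state, together with an init step that reports failure through the state instead of a return value. The enum values, struct and function names are hypothetical and only loosely modeled on the description; the recovery policy shown (go back to the state that was active before the failure) is an assumption.

```c
#include <assert.h>

enum sketch_state {
	SKETCH_STARTUP,
	SKETCH_INIT_FAILED,
	SKETCH_RUNNING,
};

struct sketch_adapter {
	enum sketch_state state;
	enum sketch_state last_state;
};

/* Single place where the state changes, so the previous state is
 * always recorded. A no-op transition leaves last_state untouched. */
static void sketch_change_state(struct sketch_adapter *a,
				enum sketch_state state)
{
	if (a->state != state) {
		a->last_state = a->state;
		a->state = state;
	}
}

/* Init step: signals failure by moving to SKETCH_INIT_FAILED rather
 * than by returning an error code. hw_ok is a hypothetical input. */
static void sketch_startup(struct sketch_adapter *a, int hw_ok)
{
	sketch_change_state(a, hw_ok ? SKETCH_RUNNING : SKETCH_INIT_FAILED);
}

/* Assumed recovery policy: retry from the state we failed in. */
static void sketch_recover(struct sketch_adapter *a)
{
	if (a->state == SKETCH_INIT_FAILED)
		sketch_change_state(a, a->last_state);
}
```

Keeping the previous state next to the current one is what lets a single periodic task decide where to resume after a failed initialization step.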
-
Jakub Kicinski authored
mlx5_tout_ms() returns a u64, which we can't directly divide. This is not a problem here: @timeout, which is the value that actually matters, is already a ulong, so storing the return value of mlx5_tout_ms() in a ulong should be fine. This fixes: ERROR: modpost: "__udivdi3" [drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko] undefined! Fixes: 32def412 ("net/mlx5: Read timeout values from DTOR") Link: https://lore.kernel.org/r/20211018172608.1069754-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
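The shape of the fix can be illustrated as follows. On 32-bit builds a plain `/` on a 64-bit operand may be compiled into a call to the libgcc helper `__udivdi3`, which the kernel does not provide; narrowing the value to `unsigned long` before dividing avoids that. The function names and the constant below are hypothetical, and the narrowing is only safe when the value is known to fit, as the commit argues it does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a helper returning a 64-bit timeout in ms. */
static uint64_t sketch_tout_ms(void)
{
	return 60000;
}

static unsigned long sketch_timeout_secs(void)
{
	/* Narrow first: the division below is then done on the native
	 * word size and needs no 64-bit divide helper on 32-bit builds. */
	unsigned long tout_ms = (unsigned long)sketch_tout_ms();

	return tout_ms / 1000;
}
```

Where the full 64-bit range is actually needed, kernel code would normally use the dedicated 64-bit division helpers instead of narrowing.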
-