Commits · e8609e69470f369509b44d5f2619f94541fe9df6 · Kirill Smelkov / linux

11 Mar, 2022 38 commits

net: ethernet: ti: am65-cpsw: Convert to PHYLINK · e8609e69

Siddharth Vadapalli authored Mar 09, 2022

Convert am65-cpsw driver and am65-cpsw ethtool to use Phylink APIs
as described at Documentation/networking/sfp-phylink.rst. All calls
to Phy APIs are replaced with their equivalent Phylink APIs.

No functional change intended. Use Phylink instead of conventional
Phylib, in preparation to add support for SGMII/QSGMII modes.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e8609e69

powerpc/net: Implement powerpc specific csum_shift() to remove branch · 3af722cb

Christophe Leroy authored Mar 08, 2022

Today's implementation of csum_shift() leads to branching based on
parity of 'offset'

	000002f8 <csum_block_add>:
	     2f8:	70 a5 00 01 	andi.   r5,r5,1
	     2fc:	41 a2 00 08 	beq     304 <csum_block_add+0xc>
	     300:	54 84 c0 3e 	rotlwi  r4,r4,24
	     304:	7c 63 20 14 	addc    r3,r3,r4
	     308:	7c 63 01 94 	addze   r3,r3
	     30c:	4e 80 00 20 	blr

Use first bit of 'offset' directly as input of the rotation instead of
branching.

	000002f8 <csum_block_add>:
	     2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
	     2fc:	20 a5 00 20 	subfic  r5,r5,32
	     300:	5c 84 28 3e 	rotlw   r4,r4,r5
	     304:	7c 63 20 14 	addc    r3,r3,r4
	     308:	7c 63 01 94 	addze   r3,r3
	     30c:	4e 80 00 20 	blr

And change to left shift instead of right shift to skip one more
instruction. This has no impact on the final sum.

	000002f8 <csum_block_add>:
	     2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
	     2fc:	5c 84 28 3e 	rotlw   r4,r4,r5
	     300:	7c 63 20 14 	addc    r3,r3,r4
	     304:	7c 63 01 94 	addze   r3,r3
	     308:	4e 80 00 20 	blr

Seems like only powerpc benefits from a branchless implementation.
Other main architectures like ARM or X86 get better code with
the generic implementation and its branch.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

3af722cb

Merge tag 'mlx5-updates-2022-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 8ef1dc4d

David S. Miller authored Mar 11, 2022

Saeed Mahameed says:

====================
mlx5-updates-2022-03-10

1) Leon removes useless includes from both mlx5 and mlx4
2) Tariq adds node awareness to some object allocations
3) Gal Cleanups and improvements to EEPROM query
4) Paul adds Software steering to Connection Tracking, to speed up
   CT Rules insertion.

Paul Blakey Says:
=================
To improve insertion rate, this series allows for using software
steering API directly instead of going through the fs_core layer.
This can be done for CT because it doesn't need fs_core layer extra
facilities, such as autogroups, FTE IDs and modifications (which require
a copy of the flow key/mask). Skipping fs_core layer also allows to
create the software steering objects (dr_* objects) ahead of time and
re-use them for multiple rules, whereas software steering under fs_core
creates them on the fly and discards them. This in turn increased insertion
rate.

The series first introduces a lightweight CT flow steering provider
with the first implementations using fs_core layer, and moves CT to use it.
The next patches implement a provider using software steering directly,
bypassing fs_core, and uses it if software steering is available.

=================

====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8ef1dc4d

net/mlx5e: Remove overzealous validations in netlink EEPROM query · 970adfb7

Gal Pressman authored Jan 26, 2022

Unlike the legacy EEPROM callbacks, when using the netlink EEPROM query
(get_module_eeprom_by_page) the driver should not try to validate the
query parameters, but just perform the read requested by the userspace.

Recent discussion in the mailing list:
https://lore.kernel.org/netdev/20220120093051.70845141@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net/Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

970adfb7

net/mlx5: Parse module mapping using mlx5_ifc · fcb610a8

Gal Pressman authored Jan 17, 2022

The assumption that the first byte in the module mapping dword is the
module number shouldn't be hard-coded in the driver, but come from
mlx5_ifc structs.

While at it, fix the incorrect width for the 'rx_lane' and 'tx_lane'
fields.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

fcb610a8

net/mlx5: Query the maximum MCIA register read size from firmware · 271907ee

Gal Pressman authored Jan 17, 2022

The MCIA register supports either 12 or 32 dwords, use the correct value
by querying the capability from the MCAM register.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

271907ee

net/mlx5: CT: Create smfs dr matchers dynamically · fbf6836d

Paul Blakey authored Sep 05, 2021

SMFS dr matchers are processed sequentially in hardware according to
their priorities, and not skipped if empty.

Currently, smfs ct fs creates four predefined dr matchers per ct
table (ct/ct nat) with hardcoded priority. Compared to dmfs ct fs
using autogroups, this might cause additional hops in fastpath for
traffic patterns that match later priorties, even if previous
priorites are empty, e.g user only using ipv6 UDP traffic will
have additional 3 hops.

Create the matchers dynamically, using the highest priority available,
on first rule usage, and remove them on last usage.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

fbf6836d

net/mlx5: CT: Add software steering ct flow steering provider · 3ee61ebb

Paul Blakey authored Sep 29, 2021

fs_core layer adds extra book keeping that is either unneeded for CT, or
unused by the underlying software steering, such as allocating FTEs and
FTE ids, saving the match key and mask, and autogroups management.
On top of that, direct steering has a translation layer (fs_dr) from PRM
commands to direct steering objects, for example, creating temporary
dr_action objects. This has a performance impact when dealing
with CT high insertion rate.

To use direct steering (smfs) directly for ct, add a tc ct fs smfs
implementation. Instead of dmfs autogroups, smfs ct fs uses one of 4
predefined dr matchers in CT and CT-NAT tables, for each combination
of tuple ethertype (ipv4/ipv6), and tuple ip_proto (udp/tcp) that
is currently used by nf flow table flow offload.

At rule insertions, validate the flow rule fits one of the predfined
matcher, and insert to it.

To fill the dr_actions of the rule efficiently, create the fwd to post_ct
tbl dr_action at fs init, the count dr_action at counter creation,
and re-use the already pre-allocated modify header dr_action.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

3ee61ebb

net/mlx5: Add smfs lib to export direct steering API to CT · c6fef514

Paul Blakey authored Sep 30, 2021

Add a thin layer that exports selected direct steering (dr) API
which will be used by a ct fs implementation in a following
patch.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

c6fef514

net/mlx5: DR, Add helper to get backing dr table from a mlx5 flow table · 34ea969d

Paul Blakey authored Jul 13, 2021

If sw steering was used to create the table, dr steeering fs creates
a backing dr table for the mlx5 flow table.

Add helper to return this table so it can be used to create matchers and
add rules on it directly instead of passing via eswitch_offloads/fs_core
insertion.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

34ea969d

net/mlx5: CT: Introduce a platform for multiple flow steering providers · 76909000

Paul Blakey authored Nov 23, 2021

Currently, fs_core layer provides flow steering services to the driver
including: autogroups, allocating FTEs (flow table entries) and FTE ids,
and support of fte action modification. If then software steering is
configured, rule insertion will go through a translation layer from
firmware buffers to software steering objects (see fs_dr.c).

The connection tracking table is a system table that is not directly
controlled by the user and is a very high scale table. These fs_core
services introduces an overhead that may be optimized by using software
steering API directly.

Introduce ct flow steering interface to allow multiple flow steering
providers. Use the new interface to implement the current dmfs (device
managed flow steering) provider which uses fs_core insertion.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

76909000

net/mlx5: Node-aware allocation for the doorbell pgdir · a3540eff

Tariq Toukan authored Dec 30, 2020

The function is node-aware and gets the node as an argument.
Use a node-aware allocation for the doorbell pgdir structure.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

a3540eff

net/mlx5: Node-aware allocation for UAR · b5e4c307

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

b5e4c307

net/mlx5: Node-aware allocation for the EQs · 7f880719

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

7f880719

net/mlx5: Node-aware allocation for the EQ table · e894246d

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

e894246d

net/mlx5: Node-aware allocation for the IRQ table · 196df17a

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

196df17a

net/mlx5: Delete useless module.h include · 71ab5807

Leon Romanovsky authored Jan 11, 2022

There is no need in include of module.h in the following files.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

71ab5807

net/mlx4: Delete useless moduleparam include · 04263701

Leon Romanovsky authored Jan 10, 2022

Remove inclusion of not used moduleparam.h.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

04263701

Merge branch 'net-ipa-use-bulk-interconnect-interfaces' · 63f13b2e

Jakub Kicinski authored Mar 10, 2022

Alex Elder says:

====================
net: ipa: use bulk interconnect interfaces

The IPA code currently enables and disables interconnects by setting
the bandwidth of each to a non-zero value, or to zero.  The
interconnect API now supports enable/disable functions, so we can
use those instead.  In addition, the interconnect API provides bulk
interfaces that allow all interconnects to be operated on at once.

This series converts the IPA driver to use the bulk enable and
disable interfaces.  In the process it uses some existing data
structures rather than defining new ones.
====================

Link: https://lore.kernel.org/r/20220309192037.667879-1-elder@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>