Commits · c205d53c4923e4b32b4a2a2773618f08e176a250 · Kirill Smelkov / linux

11 Mar, 2022 40 commits

ptp: ocp: Add firmware capability bits for feature gating · c205d53c

Jonathan Lemon authored Mar 10, 2022

Add the ability to group sysfs nodes behind a firmware feature
check.  This way non-present sysfs attributes are omitted on
older firmware, which does not have newer features.

This will be used in the upcoming patches which adds more
features to the timecard.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c205d53c

ptp: ocp: Add GND and VCC output selectors · cd09193f

Jonathan Lemon authored Mar 10, 2022

These will provide constant outputs.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd09193f

ptp: ocp: Rename output selector 'GNSS' to 'GNSS1' · be69087c

Jonathan Lemon authored Mar 10, 2022

As there are may be 2 GNSS outputs, rename the first one for clarity.
This also works around a parsing issue when specifying selectors.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be69087c

ptp: ocp: Add ability to disable input selectors. · b2c4f0ac

Jonathan Lemon authored Mar 10, 2022

This adds support for the "IN: None" selector, which disables
the input on a sma pin.  This should be compatible with old firmware
(the firmware will ignore it if not supported).
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2c4f0ac

ptp: ocp: Add support for selectable SMA directions. · a509a7c6

Jonathan Lemon authored Mar 10, 2022

Assuming the firmware allows it, the direction of each SMA connector
is no longer fixed.  Handle remapping directions for each pin.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a509a7c6

net: lan966x: Improve the CPU TX bitrate. · fb9eb027

Horatiu Vultur authored Mar 10, 2022

When doing manual injection of the frame, it is required to check if the
TX FIFO is ready to accept the next word of the frame. For this we are
using 'readx_poll_timeout_atomic', the only problem is that before it
actually checks the status, is determining the time when to finish polling
the status. Which seems to be an expensive operation.
Therefore check the status of the TX FIFO before calling
'readx_poll_timeout_atomic'.
Doing this will improve the TX bitrate by ~70%. Because 99% the FIFO is
ready by that time. The measurements were done using iperf3.

Before:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.03  sec  55.2 MBytes  46.2 Mbits/sec    0 sender
[  5]   0.00-10.04  sec  53.8 MBytes  45.0 Mbits/sec      receiver

After:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.10  sec  95.0 MBytes  78.9 Mbits/sec    0 sender
[  5]   0.00-10.11  sec  95.0 MBytes  78.8 Mbits/sec      receiver
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb9eb027

net: ethernet: ezchip: fix platform_get_irq.cocci warning · 89ff05d5

Yihao Han authored Mar 10, 2022

Remove dev_err() messages after platform_get_irq*() failures.
platform_get_irq() already prints an error.

Generated by: scripts/coccinelle/api/platform_get_irq.cocci
Signed-off-by: Yihao Han <hanyihao@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

89ff05d5

flow_dissector: Add support for HSRv0 · f65e5844

Kurt Kanzenbach authored Mar 10, 2022

Commit bf08824a ("flow_dissector: Add support for HSR") added support for
HSR within the flow dissector. However, it only works for HSR in version
1. Version 0 uses a different Ether Type. Add support for it.
Reported-by: Anthony Harivel <anthony.harivel@linutronix.de>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

f65e5844

net: mv643xx_eth: use platform_get_irq() instead of platform_get_resource() · bf2b8342

Minghao Chi authored Mar 10, 2022

It is not recommened to use platform_get_resource(pdev, IORESOURCE_IRQ)
for requesting IRQ's resources any more, as they can be not ready yet in
case of DT-booting.

platform_get_irq() instead is a recommended way for getting IRQ even if
it was not retrieved earlier.

It also makes code simpler because we're getting "int" value right away
and no conversion from resource to int is required.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf2b8342

net: ethernet: ti: davinci_emac: Use platform_get_irq() to get the interrupt · 7cd08f10

Lad Prabhakar authored Mar 10, 2022

platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq() for DT users only.

While at it propagate error code in emac_dev_stop() in case
platform_get_irq_optional() fails.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7cd08f10

net: ethernet: ti: am65-cpsw: Convert to PHYLINK · e8609e69

Siddharth Vadapalli authored Mar 09, 2022

Convert am65-cpsw driver and am65-cpsw ethtool to use Phylink APIs
as described at Documentation/networking/sfp-phylink.rst. All calls
to Phy APIs are replaced with their equivalent Phylink APIs.

No functional change intended. Use Phylink instead of conventional
Phylib, in preparation to add support for SGMII/QSGMII modes.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e8609e69

powerpc/net: Implement powerpc specific csum_shift() to remove branch · 3af722cb

Christophe Leroy authored Mar 08, 2022

Today's implementation of csum_shift() leads to branching based on
parity of 'offset'

	000002f8 <csum_block_add>:
	     2f8:	70 a5 00 01 	andi.   r5,r5,1
	     2fc:	41 a2 00 08 	beq     304 <csum_block_add+0xc>
	     300:	54 84 c0 3e 	rotlwi  r4,r4,24
	     304:	7c 63 20 14 	addc    r3,r3,r4
	     308:	7c 63 01 94 	addze   r3,r3
	     30c:	4e 80 00 20 	blr

Use first bit of 'offset' directly as input of the rotation instead of
branching.

	000002f8 <csum_block_add>:
	     2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
	     2fc:	20 a5 00 20 	subfic  r5,r5,32
	     300:	5c 84 28 3e 	rotlw   r4,r4,r5
	     304:	7c 63 20 14 	addc    r3,r3,r4
	     308:	7c 63 01 94 	addze   r3,r3
	     30c:	4e 80 00 20 	blr

And change to left shift instead of right shift to skip one more
instruction. This has no impact on the final sum.

	000002f8 <csum_block_add>:
	     2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
	     2fc:	5c 84 28 3e 	rotlw   r4,r4,r5
	     300:	7c 63 20 14 	addc    r3,r3,r4
	     304:	7c 63 01 94 	addze   r3,r3
	     308:	4e 80 00 20 	blr

Seems like only powerpc benefits from a branchless implementation.
Other main architectures like ARM or X86 get better code with
the generic implementation and its branch.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

3af722cb

Merge tag 'mlx5-updates-2022-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 8ef1dc4d

David S. Miller authored Mar 11, 2022

Saeed Mahameed says:

====================
mlx5-updates-2022-03-10

1) Leon removes useless includes from both mlx5 and mlx4
2) Tariq adds node awareness to some object allocations
3) Gal Cleanups and improvements to EEPROM query
4) Paul adds Software steering to Connection Tracking, to speed up
   CT Rules insertion.

Paul Blakey Says:
=================
To improve insertion rate, this series allows for using software
steering API directly instead of going through the fs_core layer.
This can be done for CT because it doesn't need fs_core layer extra
facilities, such as autogroups, FTE IDs and modifications (which require
a copy of the flow key/mask). Skipping fs_core layer also allows to
create the software steering objects (dr_* objects) ahead of time and
re-use them for multiple rules, whereas software steering under fs_core
creates them on the fly and discards them. This in turn increased insertion
rate.

The series first introduces a lightweight CT flow steering provider
with the first implementations using fs_core layer, and moves CT to use it.
The next patches implement a provider using software steering directly,
bypassing fs_core, and uses it if software steering is available.

=================

====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8ef1dc4d

net/mlx5e: Remove overzealous validations in netlink EEPROM query · 970adfb7

Gal Pressman authored Jan 26, 2022

Unlike the legacy EEPROM callbacks, when using the netlink EEPROM query
(get_module_eeprom_by_page) the driver should not try to validate the
query parameters, but just perform the read requested by the userspace.

Recent discussion in the mailing list:
https://lore.kernel.org/netdev/20220120093051.70845141@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net/Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

970adfb7

net/mlx5: Parse module mapping using mlx5_ifc · fcb610a8

Gal Pressman authored Jan 17, 2022

The assumption that the first byte in the module mapping dword is the
module number shouldn't be hard-coded in the driver, but come from
mlx5_ifc structs.

While at it, fix the incorrect width for the 'rx_lane' and 'tx_lane'
fields.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

fcb610a8

net/mlx5: Query the maximum MCIA register read size from firmware · 271907ee

Gal Pressman authored Jan 17, 2022

The MCIA register supports either 12 or 32 dwords, use the correct value
by querying the capability from the MCAM register.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

271907ee

net/mlx5: CT: Create smfs dr matchers dynamically · fbf6836d

Paul Blakey authored Sep 05, 2021

SMFS dr matchers are processed sequentially in hardware according to
their priorities, and not skipped if empty.

Currently, smfs ct fs creates four predefined dr matchers per ct
table (ct/ct nat) with hardcoded priority. Compared to dmfs ct fs
using autogroups, this might cause additional hops in fastpath for
traffic patterns that match later priorties, even if previous
priorites are empty, e.g user only using ipv6 UDP traffic will
have additional 3 hops.

Create the matchers dynamically, using the highest priority available,
on first rule usage, and remove them on last usage.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

fbf6836d

net/mlx5: CT: Add software steering ct flow steering provider · 3ee61ebb

Paul Blakey authored Sep 29, 2021

fs_core layer adds extra book keeping that is either unneeded for CT, or
unused by the underlying software steering, such as allocating FTEs and
FTE ids, saving the match key and mask, and autogroups management.
On top of that, direct steering has a translation layer (fs_dr) from PRM
commands to direct steering objects, for example, creating temporary
dr_action objects. This has a performance impact when dealing
with CT high insertion rate.

To use direct steering (smfs) directly for ct, add a tc ct fs smfs
implementation. Instead of dmfs autogroups, smfs ct fs uses one of 4
predefined dr matchers in CT and CT-NAT tables, for each combination
of tuple ethertype (ipv4/ipv6), and tuple ip_proto (udp/tcp) that
is currently used by nf flow table flow offload.

At rule insertions, validate the flow rule fits one of the predfined
matcher, and insert to it.

To fill the dr_actions of the rule efficiently, create the fwd to post_ct
tbl dr_action at fs init, the count dr_action at counter creation,
and re-use the already pre-allocated modify header dr_action.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

3ee61ebb

net/mlx5: Add smfs lib to export direct steering API to CT · c6fef514

Paul Blakey authored Sep 30, 2021

Add a thin layer that exports selected direct steering (dr) API
which will be used by a ct fs implementation in a following
patch.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

c6fef514

net/mlx5: DR, Add helper to get backing dr table from a mlx5 flow table · 34ea969d

Paul Blakey authored Jul 13, 2021

If sw steering was used to create the table, dr steeering fs creates
a backing dr table for the mlx5 flow table.

Add helper to return this table so it can be used to create matchers and
add rules on it directly instead of passing via eswitch_offloads/fs_core
insertion.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

34ea969d

net/mlx5: CT: Introduce a platform for multiple flow steering providers · 76909000

Paul Blakey authored Nov 23, 2021

Currently, fs_core layer provides flow steering services to the driver
including: autogroups, allocating FTEs (flow table entries) and FTE ids,
and support of fte action modification. If then software steering is
configured, rule insertion will go through a translation layer from
firmware buffers to software steering objects (see fs_dr.c).

The connection tracking table is a system table that is not directly
controlled by the user and is a very high scale table. These fs_core
services introduces an overhead that may be optimized by using software
steering API directly.

Introduce ct flow steering interface to allow multiple flow steering
providers. Use the new interface to implement the current dmfs (device
managed flow steering) provider which uses fs_core insertion.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

76909000

net/mlx5: Node-aware allocation for the doorbell pgdir · a3540eff

Tariq Toukan authored Dec 30, 2020

The function is node-aware and gets the node as an argument.
Use a node-aware allocation for the doorbell pgdir structure.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

a3540eff

net/mlx5: Node-aware allocation for UAR · b5e4c307

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

b5e4c307

net/mlx5: Node-aware allocation for the EQs · 7f880719

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

7f880719

net/mlx5: Node-aware allocation for the EQ table · e894246d

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

e894246d

net/mlx5: Node-aware allocation for the IRQ table · 196df17a

Tariq Toukan authored Dec 30, 2020

Prefer the aware allocation, use the device NUMA node.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

196df17a

net/mlx5: Delete useless module.h include · 71ab5807

Leon Romanovsky authored Jan 11, 2022

There is no need in include of module.h in the following files.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

71ab5807

net/mlx4: Delete useless moduleparam include · 04263701

Leon Romanovsky authored Jan 10, 2022

Remove inclusion of not used moduleparam.h.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

04263701

Merge branch 'net-ipa-use-bulk-interconnect-interfaces' · 63f13b2e

Jakub Kicinski authored Mar 10, 2022

Alex Elder says:

====================
net: ipa: use bulk interconnect interfaces

The IPA code currently enables and disables interconnects by setting
the bandwidth of each to a non-zero value, or to zero.  The
interconnect API now supports enable/disable functions, so we can
use those instead.  In addition, the interconnect API provides bulk
interfaces that allow all interconnects to be operated on at once.

This series converts the IPA driver to use the bulk enable and
disable interfaces.  In the process it uses some existing data
structures rather than defining new ones.
====================

Link: https://lore.kernel.org/r/20220309192037.667879-1-elder@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>