Commits · f48298d3fbfaadedd7e7bd1cdcbb3f1291a8d42d · Kirill Smelkov / linux

10 Mar, 2021 40 commits

staging: dpaa2-switch: move the driver out of staging · f48298d3

Ioana Ciornei authored Mar 10, 2021

Now that the dpaa2-switch driver has basic I/O capabilities on the
switch port net_devices and multiple bridging domains are supported,
move the driver out of staging.

The dpaa2-switch driver is placed right next to the dpaa2-eth driver
since, in the near future, they will be sharing most of the data path.
I didn't implement code reuse in this patch series because I wanted to
keep it as small as possible.

Also, the README is removed from staging with the intention to add
proper rst documentation afterwards to actually match what is supported
by the driver.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: prevent joining a bridge while VLAN uppers are present · 1c4928fc

Ioana Ciornei authored Mar 10, 2021

Each time a switch port joins a bridge, it will start to use a FDB table
common with all the other switch ports that are under the same bridge.
This means that any VLAN added prior to a bridge join will retain its
previous FDB table destination. With this patch, I choose to restrict
when a switch port can change its upper device (either join or leave)
so that the driver does not have to delete all the previously installed
VLANs from the previous FDB and add them into the new one.

Thus, in the PRECHANGEUPPER notification we check if there are any VLAN
type upper devices and if that's true, deny the CHANGEUPPER.

This way, the user is not restricted in the topology but rather in the
order in which the setup is done: it must first create the bridging
domain layout and only after that add any VLAN devices that are
needed. The teardown is similar: the VLAN devices will need to be
destroyed prior to a change in the bridging layout.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
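
The PRECHANGEUPPER check described above can be sketched in plain C as
follows. This is a minimal userspace model, not the driver code: the
port_model structure, the upper_kind enum, the function name and the
choice of -EOPNOTSUPP as the error code are all assumptions made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical classification of the devices stacked on top of a port. */
enum upper_kind { UPPER_BRIDGE, UPPER_VLAN };

/* Hypothetical stand-in for a switch port and its list of upper devices. */
struct port_model {
	const enum upper_kind *uppers;
	size_t n_uppers;
};

/*
 * Deny a change of upper device while any VLAN upper exists.
 * Returns 0 when the change may proceed, a negative errno otherwise
 * (the specific error code is an assumption).
 */
static int prechangeupper_check(const struct port_model *port)
{
	for (size_t i = 0; i < port->n_uppers; i++)
		if (port->uppers[i] == UPPER_VLAN)
			return -EOPNOTSUPP;
	return 0;
}
```

In this model a port with no VLAN uppers may join or leave a bridge,
while a port that still has one is refused until the VLAN device is
destroyed.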

staging: dpaa2-switch: add fast-ageing on bridge leave · 685b4801

Ioana Ciornei authored Mar 10, 2021

Upon leaving a bridge, any MAC addresses learnt on the switch port prior
to this point have to be removed so that we preserve the bridging domain
configuration.

Restructure the dpaa2_switch_port_fdb_dump() function in order to have a
common dpaa2_switch_fdb_iterate() function between the FDB dump callback
and the fast age procedure. To accomplish this, add a new callback -
dpaa2_switch_fdb_cb_t - which will be called on each MAC addr and,
depending on the situation, will either dump the FDB entry into a
netlink message or will delete the address from the FDB table, in case
of the fast-age.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
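
The shared iterate-plus-callback structure described above might look
roughly like this. It is a simplified sketch under stated assumptions:
the fdb_entry layout, the function names and the counting callback
(standing in for the netlink dump) are hypothetical and do not mirror
the driver's actual types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified FDB entry: owning port and a validity flag. */
struct fdb_entry {
	int port;
	bool valid;
};

/* Called once per matching entry; a non-zero return stops the walk. */
typedef int (*fdb_cb_t)(struct fdb_entry *entry, void *data);

/* Walk the table and invoke cb on every valid entry learnt on 'port'. */
static int fdb_iterate(struct fdb_entry *table, size_t n, int port,
		       fdb_cb_t cb, void *data)
{
	for (size_t i = 0; i < n; i++) {
		int err;

		if (!table[i].valid || table[i].port != port)
			continue;
		err = cb(&table[i], data);
		if (err)
			return err;
	}
	return 0;
}

/* "Dump" flavour: count entries instead of filling a netlink message. */
static int fdb_count_cb(struct fdb_entry *entry, void *data)
{
	(void)entry;
	(*(int *)data)++;
	return 0;
}

/* "Fast-age" flavour: remove the entry from the table. */
static int fdb_delete_cb(struct fdb_entry *entry, void *data)
{
	(void)data;
	entry->valid = false;
	return 0;
}
```

Both users share the walk and differ only in the callback they pass,
which is the point of the restructuring.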

staging: dpaa2-switch: accept only vlan-aware upper devices · d671407f

Ioana Ciornei authored Mar 10, 2021

The DPAA2 Switch is not capable of handling traffic in a VLAN unaware
fashion, thus the previous handling of both the accepted upper devices
and the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING flag was wrong.

Fix this by checking if the bridge that we are joining is indeed VLAN
aware and, if it is not, returning an error. Also, the RX VLAN filtering
feature is defined as 'on [fixed]' and the .ndo_vlan_rx_add_vid() and
.ndo_vlan_rx_kill_vid() callbacks are implemented just by recreating a
switchdev_obj_port_vlan object and then calling the same functions used
on the switchdev notifier path.
In addition, changing the vlan_filtering flag to 0 on a bridge under
which a DPAA2 switch interface is present is not supported, thus
rejected when SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING is received with
such a request.

This patch is also adding the use of the switchdev_handle_port_attr_set
function so that we can iterate through all the lower devices of the
bridge that the notification was received on and actually catch if the
user is trying to change the vlan_filtering state. Since on a VLAN
filtering change the net_device is the bridge, we also move the
dpaa2_switch_port_dev_check call so that we do not return NOTIFY_DONE
right away.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: move the notifier register to module_init() · 16abb6ad

Ioana Ciornei authored Mar 10, 2021

Move the notifier blocks register into the module_init() step, instead of
object probe, so that all DPSW devices probed by the dpaa2-switch driver
can use the same notifiers.

This will enable us to have a more straightforward approach in
determining if an event is intended for an object managed by this driver
or not. Previously, the dpaa2_switch_port_dev_check() function was
forced to also check the notifier block beside the net_device_ops
structure to determine if the event is for us or not.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: properly setup switching domains · 539dda3c

Ioana Ciornei authored Mar 10, 2021

Until now, the DPAA2 switch was not capable of properly setting up its
switching domains depending on the existence, or lack thereof, of an
upper bridge device. This meant that all switch ports of a DPSW object
were switching by default even though they were not under the same
bridge device.

Another issue was the inability to actually add the CPU in the flooding
domains (broadcast, unknown unicast etc) of a particular switch port.
This meant that a simple ping on a switch interface was not possible
since no broadcast ARP frame would actually reach the CPU queues.

This patch tries to fix exactly these problems by:

* Creating and managing an FDB table for each flooding domain. This means
that when a switch interface is not bridged it will use its own FDB
table. While in bridged mode, all DPAA2 switch interfaces under the
same upper will use the same FDB table, thus leveraging the same FDB
entries.

* Adding a new MC firmware command - dpsw_set_egress_flood() - through
which the driver can set up the flooding domains as needed. For
example, when the switch interface is standalone, thus not in a
bridge with any other DPAA2 switch port, it will set up its broadcast
and unknown unicast flooding domains to only include the control
interface (the queues that reach the CPU and the driver can dequeue
from). This flooding domain changes when the interface joins a bridge
and is configured to include, besides the control interface, all other
DPAA2 switch interfaces.

We impose a minimum limit of FDB tables available equal to the number of
switch interfaces so that we guarantee that, in the maximal
configuration - all interfaces are standalone, each switch port will
have a private FDB table. At the same time, we only probe DPSW objects
that have the flooding and broadcast replicators configured to be per
FDB (DPSW_*_PER_FDB). Without this, the dpaa2-switch driver would not
be able to configure multiple switching domains.

At probe time, an FDB table will be allocated for each port. At a bridge
join event, the switch port will either continue to use the current FDB
table (if it's the first dpaa2-switch port to join that bridge) or will
switch to use the FDB table associated with a port that is already
under the bridge. If an FDB switch is necessary, the private FDB table
which was previously used will be returned to the pool of unused FDBs.

Upon a bridge leave, the switch port needs a private FDB table thus it
will search and get the first unused FDB table. This way, all the other
ports remaining under the bridge will continue to use the same FDB
table.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
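
The FDB bookkeeping on bridge join and leave can be modelled as below.
This is a toy userspace sketch, not the driver's implementation: the
structures, function names and NUM_PORTS are hypothetical, and keeping
the current FDB when the last port leaves a bridge is an assumption of
this model rather than something stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PORTS 4	/* as many FDB tables as ports, per the minimum limit */

struct fdb_model { bool in_use; };
struct port_model { int fdb; int bridge; };	/* bridge 0 = standalone */

struct sw_model {
	struct fdb_model fdbs[NUM_PORTS];
	struct port_model ports[NUM_PORTS];
};

/* Probe time: every port gets a private FDB table. */
static void sw_init(struct sw_model *sw)
{
	for (int i = 0; i < NUM_PORTS; i++) {
		sw->fdbs[i].in_use = true;
		sw->ports[i].fdb = i;
		sw->ports[i].bridge = 0;
	}
}

/* Return the index of another port already under 'bridge', or -1. */
static int sw_find_bridged(const struct sw_model *sw, int self, int bridge)
{
	for (int i = 0; i < NUM_PORTS; i++)
		if (i != self && sw->ports[i].bridge == bridge)
			return i;
	return -1;
}

/* Join: adopt the bridge's FDB if one exists, else keep the private one. */
static void sw_join(struct sw_model *sw, int p, int bridge)
{
	int other = sw_find_bridged(sw, p, bridge);

	assert(bridge != 0 && sw->ports[p].bridge == 0);
	if (other >= 0) {
		sw->fdbs[sw->ports[p].fdb].in_use = false; /* back to the pool */
		sw->ports[p].fdb = sw->ports[other].fdb;
	}
	sw->ports[p].bridge = bridge;
}

/* Leave: take the first unused FDB so the remaining ports keep theirs. */
static void sw_leave(struct sw_model *sw, int p)
{
	int other;

	assert(sw->ports[p].bridge != 0);
	other = sw_find_bridged(sw, p, sw->ports[p].bridge);
	sw->ports[p].bridge = 0;
	if (other < 0)
		return;	/* assumption: the last port keeps its current FDB */
	for (int i = 0; i < NUM_PORTS; i++) {
		if (!sw->fdbs[i].in_use) {
			sw->fdbs[i].in_use = true;
			sw->ports[p].fdb = i;
			return;
		}
	}
}
```

Because the pool holds as many tables as there are ports, a leaving port
that shared an FDB with others always finds an unused one.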

staging: dpaa2-switch: enable the control interface · 613c0a58

Ioana Ciornei authored Mar 10, 2021

Enable the CTRL_IF of the switch object, now that all the pieces are in
place (buffer and queue management, interrupts, NAPI instances etc).
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: add .ndo_start_xmit() callback · 7fd94d86

Ioana Ciornei authored Mar 10, 2021

Implement the .ndo_start_xmit() callback for the switch port interfaces.
For each of the switch ports, gather the corresponding queue
destination ID (QDID) necessary for Tx enqueueing.

We'll reserve 64 bytes for software annotations, where we keep a skb
backpointer used on the Tx confirmation side for releasing the allocated
memory. At the moment, we only support linear skbs.

Also, add support for the Tx confirmation path which for the most part
shares the code path with the normal Rx queue.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: handle Rx path on control interface · 0b1b7137

Ioana Ciornei authored Mar 10, 2021

The dpaa2-ethsw supports only one Rx queue that is shared by all switch
ports. This means that information about which port was the ingress port
for a specific frame needs to be passed in metadata. In our case, the
Flow Context (FLC) field from the frame descriptor holds this
information. Besides the interface ID of the ingress port we also
receive the virtual QDID of the port. Below is a visual description of
the 64 bits of FLC.

63           47           31           15           0
+---------------------------------------------------+
|            |            |            |            |
|  RESERVED  |    IF_ID   |  RESERVED  |  IF QDID   |
|            |            |            |            |
+---------------------------------------------------+

Because all switch ports share the same Rx and Tx conf queues, the NAPI
instance is enabled whenever at least one switch interface is open.

The Rx path is common, for the most part, for both Rx and Tx conf, with
the exception that each of them has its own function for consuming a
frame descriptor. Dequeueing from an FQ, consuming the dequeued store and
the NAPI poll function are common between both queues.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
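
Extracting the two fields from the FLC word shown in the diagram above
could be sketched as follows. The bit positions are read off the
diagram (IF_ID assumed in bits 47..32, IF QDID in bits 15..0); the
macro and function names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions, read off the diagram in the commit message. */
#define FLC_IF_ID_SHIFT	32
#define FLC_FIELD_MASK	0xFFFFu

/* Interface ID of the ingress port (assumed bits 47..32). */
static inline uint16_t flc_get_if_id(uint64_t flc)
{
	return (uint16_t)((flc >> FLC_IF_ID_SHIFT) & FLC_FIELD_MASK);
}

/* Virtual QDID of the ingress port (assumed bits 15..0). */
static inline uint16_t flc_get_qdid(uint64_t flc)
{
	return (uint16_t)(flc & FLC_FIELD_MASK);
}

/* Build an FLC value with both reserved fields left at zero. */
static inline uint64_t flc_pack(uint16_t if_id, uint16_t qdid)
{
	return ((uint64_t)if_id << FLC_IF_ID_SHIFT) | qdid;
}
```

Masking after the shift keeps the reserved bits from leaking into
either field, whatever the hardware places there.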

staging: dpaa2-switch: setup dpio · 04abc97d

Ioana Ciornei authored Mar 10, 2021

Set up interrupts on the control interface queues. We do not force an
exact affinity between the interrupts received from a specific queue and
a cpu.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: setup buffer pool and RX path rings · 2877e4f7

Ioana Ciornei authored Mar 10, 2021

Allocate and set up a buffer pool, needed on the Rx path of the control
interface. Also, define the Rx buffer size seen by the WRIOP from the
PAGE_SIZE buffers seeded.

Also, create the needed Rx rings for both frame queues used on the
control interface.  On the Rx path, when a pull-dequeue operation is
performed on a software portal, available frame descriptors are put in a
ring - a DMA memory storage - for further usage.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: get control interface attributes · 26d419f3

Ioana Ciornei authored Mar 10, 2021

Introduce a new structure to hold all necessary info related to an RX
queue for the control interface and populate the FQ IDs.
We only have one Rx queue and one Tx confirmation queue on the control
interface, both shared by all the switch ports.

Also, increase the minimum version of the object supported by the driver
since basic switch driver support needs some ABIs that were added in the
latest version of firmware.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: remove obsolete .ndo_fdb_{add|del} callbacks · 5dda9a79

Ioana Ciornei authored Mar 10, 2021

Since the dpaa2-switch already listens for SWITCHDEV_FDB_ADD_TO_DEVICE /
SWITCHDEV_FDB_DEL_TO_DEVICE events emitted by the bridge, we don't need
the bridge bypass operations, and now is a good time to delete them. All
'bridge fdb' commands need the 'master' flag specified now.

In fact, having the obsolete .ndo_fdb_{add|del} callbacks would even
complicate the bridge leave/join procedures without any real benefit.
Every FDB entry is installed in an FDB ID as far as the hardware is
concerned, and the dpaa2-switch ports change their FDB ID when they join
or leave a bridge. So we would need to manually delete these FDB entries
when the FDB ID changes. That's because, unlike FDB entries added
through switchdev, where the bridge automatically deletes those on
leave, there isn't anybody who will remove the static FDB entries
installed via the bridge bypass operations upon a change in the upper
device.

Note that we still need .ndo_fdb_dump though. The dpaa2-switch does not
emit any interrupts when a new address is learnt, so we cannot keep the
bridge FDB in sync with the hardware FDB. Therefore, we need this
callback to get a chance to print the FDB entries that were dynamically
learnt by our hardware.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: fix up initial forwarding configuration done by firmware · 282d47de

Ioana Ciornei authored Mar 10, 2021

By default, the DPSW object is configured with VLAN ID 1 in the VLAN
table, of which all ports are members. This entry in the VLAN table
selects the same FDB ID for all ports, meaning that forwarding between
ports is permitted. This is unlike the switchdev model, where each port
should operate as standalone by default.

To make the switch operate in standalone ports mode, we need the VLAN
table to select a unique FDB ID for each port. In order to do that, we
need to simply delete the VLAN 1 created automatically by firmware, and
let dpaa2_switch_port_init take over, by readding VLAN ID 1, but
pointing towards a unique FDB ID.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

staging: dpaa2-switch: remove broken learning and flooding support · 93a4d0ab

Ioana Ciornei authored Mar 10, 2021

This patch is removing the current configuration of learning and
flooding states per switch port because they are essentially broken in
terms of integration with the switchdev APIs and the bridge
understanding of these states.

First of all, the learning state is a per switch port configuration
while the dpaa2-switch driver was using it to configure the entire
bridging domain. This is broken since the software learning state could
be out of sync with the hardware state when ports from the same bridging
domain are configured by the user with different learning parameters.

The BR_FLOOD flag has been misinterpreted as well. Instead of denoting
whether unicast traffic for which there is no FDB entry will be flooded
towards a given port, the dpaa2-switch used the flag to configure
whether or not a frame with an unknown destination received on a given
port should be flooded or not. In summary, it was used as an ingress
setting instead of an egress one.

Also, remove the unnecessary call to dpsw_if_set_broadcast() and the API
definition. The HW default is to let all switch ports flood broadcast
traffic, thus there is no need to call the API again.

Instead of trying to patch things up, just remove the support for the
moment so that we'll add it back cleanly once the driver is out of
staging.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'enetc-cleanups' · 157611c8

David S. Miller authored Mar 10, 2021

Vladimir Oltean says:

====================
Refactoring/cleanup for NXP ENETC

This series performs the following:
- makes the API for Control Buffer Descriptor Rings in enetc_cbdr.c a
  bit more tightly knit.
- moves more logic into enetc_rxbd_next to make the callers simpler
- moves more logic into enetc_refill_rx_ring to make the callers simpler
- removes forward declarations
- simplifies the probe path to unify probing for used and unused PFs.

Nothing radical.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: make enetc_refill_rx_ring update the consumer index · 7a5222cb

Vladimir Oltean authored Mar 10, 2021

Since commit fd5736bf ("enetc: Workaround for MDIO register access
issue"), enetc_refill_rx_ring no longer updates the RX BD ring's
consumer index, that is left to be done by the caller. This has led to
bugs such as the ones found in 96a5223b ("net: enetc: remove bogus
write to SIRXIDR from enetc_setup_rxbdr") and 3a5d12c9 ("net: enetc:
keep RX ring consumer index in sync with hardware"), so it is desirable
that we move back the update of the consumer index into enetc_refill_rx_ring.

The trouble with that is the different MDIO locking context for the two
callers of enetc_refill_rx_ring:

- enetc_clean_rx_ring runs under enetc_lock_mdio()
- enetc_setup_rxbdr runs outside enetc_lock_mdio()

Simplify the callers of enetc_refill_rx_ring by making enetc_setup_rxbdr
explicitly take enetc_lock_mdio() around the call. It will be the only
place in need of ensuring the hot accessors can be used.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: remove forward declaration for enetc_map_tx_buffs · 0486185e

Vladimir Oltean authored Mar 10, 2021

There is no reason for this forward declaration to exist other than
poor ordering of the functions.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: remove forward-declarations of enetc_clean_{rx,tx}_ring · 8580b3c3

Vladimir Oltean authored Mar 10, 2021

This patch moves the NAPI enetc_poll after enetc_clean_rx_ring such that
we can delete the forward declarations.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: use enum enetc_active_offloads · 7f071a45

Vladimir Oltean authored Mar 10, 2021

The active_offloads variable of enetc_ndev_priv has an enum type, use it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: simplify callers of enetc_rxbd_next · c027aa92

Vladimir Oltean authored Mar 10, 2021

When we iterate through the BDs in the RX ring, the software producer
index (which is already passed by value to enetc_rxbd_next) lags behind,
and we end up with this funny looking "++i == rx_ring->bd_count" check
so that we drag it after us.

Let's pass the software producer index "i" by reference, so that
enetc_rxbd_next can increment it by itself (mod rx_ring->bd_count),
especially since enetc_rxbd_next has to increment the index anyway.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
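
The by-reference index update described above can be illustrated with a
small standalone sketch. The rx_ring_model structure and the
ring_index_next name are hypothetical; only the wrap-around idea is
taken from the commit message.

```c
#include <assert.h>

/* Hypothetical stand-in for the RX ring; only the BD count matters here. */
struct rx_ring_model {
	int bd_count;
};

/*
 * Advance the software index in place, wrapping at the end of the ring,
 * so that callers no longer need their own "++i == bd_count" check.
 */
static void ring_index_next(const struct rx_ring_model *ring, int *i)
{
	assert(ring->bd_count > 0);
	if (++(*i) == ring->bd_count)
		*i = 0;
}
```

With the index passed by pointer, the wrap logic lives in one place
instead of being repeated at every call site.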

net: enetc: don't initialize unused ports from a separate code path · 4b47c0b8

Vladimir Oltean authored Mar 10, 2021

Since commit 3222b5b6 ("net: enetc: initialize RFS/RSS memories for
unused ports too") there is a requirement to initialize the memories of
unused PFs too, which has left the probe path in a bit of a rough shape,
because we basically have a minimal initialization path for unused PFs
which is separate from the main initialization path.

Now that initializing a control BD ring is as simple as calling
enetc_setup_cbdr, let's move that outside of enetc_alloc_si_resources
(unused PFs don't need classification rules, so no point in allocating
them just to free them later).

But enetc_alloc_si_resources is called both for PFs and for VFs, so now
that enetc_setup_cbdr is no longer called from this common function, it
means that the VF probe path needs to explicitly call enetc_setup_cbdr
too.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: pass bd_count as an argument to enetc_setup_cbdr · 5b4daa7f

Vladimir Oltean authored Mar 10, 2021

It makes no sense from an API perspective to first initialize some
portion of struct enetc_cbdr outside enetc_setup_cbdr, then leave that
function to initialize the rest. enetc_setup_cbdr should be able to
perform all initialization given a zero-initialized struct enetc_cbdr.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: squash clear_cbdr and free_cbdr into teardown_cbdr · 0bfde022

Vladimir Oltean authored Mar 10, 2021

All call sites call enetc_clear_cbdr and enetc_free_cbdr one after
another, so let's combine the two functions into a single method named
enetc_teardown_cbdr which does both, and in the same order.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: save the mode register address inside struct enetc_cbdr · 27f9025d

Vladimir Oltean authored Mar 10, 2021

enetc_clear_cbdr depends on struct enetc_hw because it must disable the
ring through a register write. We'd like to remove that dependency, so
let's do what's already done with the producer and consumer indices,
which is to save the iomem address in a variable kept in struct enetc_cbdr.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: squash enetc_alloc_cbdr and enetc_setup_cbdr · 24be14e3

Vladimir Oltean authored Mar 10, 2021

enetc_alloc_cbdr and enetc_setup_cbdr are always called one after
another, so we can simplify the callers and make enetc_setup_cbdr do
everything that's needed.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: save the DMA device for enetc_free_cbdr · 01121ab7

Vladimir Oltean authored Mar 10, 2021

We shouldn't need to pass the struct device *dev to enetc CBDR APIs over
and over again, so save this inside struct enetc_cbdr::dma_dev and avoid
calling it from the enetc_free_cbdr functions.

This breaks the dependency of the cbdr API on struct enetc_si (the
station interface).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: enetc: move the CBDR API to enetc_cbdr.c · 176769d1

Vladimir Oltean authored Mar 10, 2021

Since there is a dedicated file in this driver for interacting with
control BD rings, it makes sense to move these functions there.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'defxx-updates' · e2359fad

David S. Miller authored Mar 10, 2021

Maciej W. Rozycki says:

====================
FDDI: defxx: CSR access fixes and improvements

As a lab upgrade I have recently replaced a dated 32-bit x86 server with
a new POWER9 system. One of the purposes of the system has been providing
network based resources to clients over my FDDI network. As such the new
server has also received a new DEFPA FDDI network adapter.

As it turned out the interface did not work with the driver as shipped by
the most recent stable Debian release (Linux version 5.9.15) for ppc64el.
Symptoms were inconclusive, and the DEFPA adapter turned out to have a
manufacturing defect as well, however eventually I have figured out the
PCIe host bridge used with the system, Power Systems Host Bridge 4 (PHB4),
does not (anymore) implement PCI I/O transactions, while the binary defxx
driver as shipped by Debian comes configured for port I/O, and then a bug
in resource handling causes the driver to try and use an unassigned port
I/O range for adapter's PDQ main ASIC's CSR access.

Fortunately the PFI PCI interface ASIC used with the DEFPA adapter has
been designed such as to provide for both PCI I/O and PCI memory accesses
to be used for PDQ CSR access, via a pair of BARs to be alternatively
used.

Originally the defxx driver only supported port I/O access, but in the
course of interfacing it to the TURBOchannel bus I had to implement MMIO
access too, and while at it I have added a kernel configuration option to
globally switch between port I/O and MMIO at compilation time, however
conservatively defaulting to port I/O for EISA bus support where the use
of MMIO currently requires the adapter to have been suitably configured
via ECU (EISA Configuration Utility), supplied externally.

With the kernel configuration option set to MMIO the DEFPA interface
works correctly with my POWER9 system. Therefore I have prepared this
small patch series consisting of a pair of conservative bug fixes, to be
backported to stable branches, and then a pair of improvements for the
robustness of the driver.

So changes 1/4 and 2/4 apply both to net and net-next, and then changes
3/4 and 4/4 apply on top of them to net-next only. In particular there
are diff context dependencies going like this: 1/4 -> 3/4 -> 4/4. Let me
know if this submission needs to be sorted differently.

See individual change descriptions for further details as to the actual
changes made.

NB the ESIC interface chip used for slave address decoding with the DEFEA
EISA adapter has decoding implemented for address bits 31:10 and therefore
supports full 32-bit range for the allocation of the CSR decoding window.
For DOS compatibility reasons ECU however only allows allocations between
0x000c0000 and 0x000effff.

Given that for other compatibility reasons EISA is subtractively decoded
on mixed PCI/EISA systems we could allocate an MMIO region from arbitrary
unoccupied memory space and program the ESIC suitably without regard for
that compatibility limitation. In fact I have a proof-of-concept change
and it seems to work reliably.

However with these patches applied the driver continues supporting port
I/O as fallback and the EISA product ID register is located in the EISA
slot-specific port I/O address space, so any EISA system however modern
(sounds like a joke, eh?) also has to support port I/O access somehow.

So while I think such a dynamic MMIO allocation would be an example of
good engineering, it would require changes to our EISA core and
therefore it may have made sense 25 years ago when EISA was still
mainstream, but not nowadays when EISA systems are I suppose more of a
curiosity rather than the usual equipment.

This patch series has been thoroughly verified with Linux 5.11.0 as
released and then a Raptor Talos II POWER9 system and a Malta 5Kc MIPS64
system for PCI DEFPA adapter support, an Advanced Integrated Research
486EI x86 system for EISA DEFEA adapter support, and a Digital Equipment
DECstation 5000 model 260 MIPS III system for TURBOchannel DEFTA adapter
support, covering both port I/O and MMIO operation where applicable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

FDDI: defxx: Use driver's name with resource requests · 4e052626

Maciej W. Rozycki authored Mar 10, 2021

Replace repeated "defxx" strings with a reference to the DRV_NAME macro
and then use the driver's name rather than the bus address with resource
requests so as to have contents of /proc/iomem and /proc/ioports more
meaningful to the user, in line with what drivers usually do.

So rather than say:

5000-50ff : DEC FDDIcontroller/EISA Adapter
  5000-503f : 00:05
  5040-5043 : 00:05
5400-54ff : DEC FDDIcontroller/EISA Adapter
5800-58ff : DEC FDDIcontroller/EISA Adapter
5c00-5cff : DEC FDDIcontroller/EISA Adapter
  5c80-5cbf : 00:05

or:

620c080020000-620c08002007f : 0031:02:04.0
  620c080020000-620c08002007f : 0031:02:04.0
620c080030000-620c08003ffff : 0031:02:04.0

or:

1f100000-1f10003f : tc2

we report:

5000-50ff : DEC FDDIcontroller/EISA Adapter
  5000-503f : defxx
  5040-5043 : defxx
5400-54ff : DEC FDDIcontroller/EISA Adapter
5800-58ff : DEC FDDIcontroller/EISA Adapter
5c00-5cff : DEC FDDIcontroller/EISA Adapter
  5c80-5cbf : defxx

and:

620c080020000-620c08002007f : 0031:02:04.0
  620c080020000-620c08002007f : defxx
620c080030000-620c08003ffff : 0031:02:04.0

and:

1f100000-1f10003f : defxx

respectively for the DEFEA (EISA), DEFPA (PCI), and DEFTA (TURBOchannel)
adapters.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

FDDI: defxx: Implement dynamic CSR I/O address space selection · 795e272e

Maciej W. Rozycki authored Mar 10, 2021

Recent versions of the PCI Express specification have deprecated support
for I/O transactions and actually some PCIe host bridges, such as Power
Systems Host Bridge 4 (PHB4), do not implement them. Conversely a DEFEA
adapter can have its MMIO decoding disabled with ECU (EISA Configuration
Utility) and therefore not available for us with the resource allocation
infrastructure we implement.

However either I/O address space will always be available for use with
the DEFEA (EISA) and DEFPA (PCI) adapters and both have double address
decoding implemented in hardware for Control and Status Register access.
The two kinds of adapters can be present both at once in a single mixed
PCI/EISA system. For the DEFTA (TURBOchannel) variant there is no issue
as there has been no port I/O address space defined for that bus.

To make people's life easier and the driver more robust, remove the
DEFXX_MMIO configuration option so that, rather than the choice of the
I/O address space being made at build time for all the adapters
installed in the system, the driver chooses the most suitable address
space dynamically on a case-by-case basis at run time. Make MMIO the
default and resort to port I/O should the default fail for some reason.

This way multiple adapters installed in one system can use different I/O
address spaces each, in particular in the presence of DEFEA adapters in
a pure-EISA or a mixed EISA/PCI system (it is expected that DEFPA boards
will use MMIO in normal circumstances).

The choice of the I/O address space to use continues being reported by
the driver on startup, e.g.:

eisa 00:05: EISA: slot 5: DEC3002 detected
defxx: v1.12 2021/03/10 Lawrence V. Stefani and others
00:05: DEFEA at I/O addr = 0x5000, IRQ = 10, Hardware addr = 00-00-f8-c8-b3-b6
00:05: registered as fddi0

and:

defxx: v1.12 2021/03/10 Lawrence V. Stefani and others
0031:02:04.0: DEFPA at MMIO addr = 0x620c080020000, IRQ = 57, Hardware addr = 00-60-6d-93-91-98
0031:02:04.0: registered as fddi0

and:

defxx: v1.12 2021/03/10 Lawrence V. Stefani and others
tc2: DEFTA at MMIO addr = 0x1f100000, IRQ = 21, Hardware addr = 08-00-2b-b0-8b-1e
tc2: registered as fddi0

so there is no need to add further information.

The change is supposed to cause a negligible performance hit as I/O
accessors will now have code executed conditionally at run time.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
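
The run-time choice between MMIO and port I/O, and the conditional
accessor it implies, can be modelled as below. This is a userspace
sketch with simulated register arrays; the types, names and the
four-register layout are assumptions and do not reflect the defxx
driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum csr_space { CSR_NONE, CSR_MMIO, CSR_PORT_IO };

/* Hypothetical summary of which address spaces could be claimed. */
struct adapter_res {
	bool mmio_ok;
	bool port_io_ok;
};

/* Prefer MMIO and fall back to port I/O if MMIO is unavailable. */
static enum csr_space csr_select(const struct adapter_res *res)
{
	if (res->mmio_ok)
		return CSR_MMIO;
	if (res->port_io_ok)
		return CSR_PORT_IO;
	return CSR_NONE;
}

/* Simulated adapter: one register bank per address space. */
struct adapter_model {
	enum csr_space space;
	uint32_t mmio_regs[4];
	uint32_t port_regs[4];
};

/* The accessor branches per adapter at run time instead of per build. */
static uint32_t csr_read(const struct adapter_model *a, unsigned int reg)
{
	assert(reg < 4 && a->space != CSR_NONE);
	return a->space == CSR_MMIO ? a->mmio_regs[reg] : a->port_regs[reg];
}
```

Since the selection is stored per adapter, two boards in the same
system can end up using different address spaces, and the extra branch
in the accessor is the performance cost the commit message refers to.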

FDDI: defxx: Make MMIO the configuration default except for EISA · 193ced4a

Maciej W. Rozycki authored Mar 10, 2021

Recent versions of the PCI Express specification have deprecated support
for I/O transactions and actually some PCIe host bridges, such as Power
Systems Host Bridge 4 (PHB4), do not implement them.

The default kernel configuration choice for the defxx driver is the use
of I/O ports rather than MMIO for PCI and EISA systems. It may have
made sense as a conservative backwards compatible choice back when MMIO
operation support was added to the driver as a part of TURBOchannel bus
support. However nowadays this configuration choice makes the driver
unusable with systems that do not implement I/O transactions for PCIe.

Make DEFXX_MMIO the configuration default then, except where configured
for EISA. This exception is because an EISA adapter can have its MMIO
decoding disabled with ECU (EISA Configuration Utility), in which case
MMIO is not available through the resource allocation infrastructure we
implement, while port I/O is always readily available as it uses
slot-specific addressing, directly mapped to the slot an option card has
been placed in and handled with our EISA bus support core. Conversely, a
kernel that supports modern systems which may not have I/O transactions
implemented for PCIe will usually not be expected to handle legacy EISA
systems.

The change of the default will make it easier for people, including but
not limited to distribution packagers, to make a working choice for the
driver.

Update the option description accordingly and while at it replace the
potentially ambiguous PIO acronym with IOP for "port I/O" vs "I/O ports"
according to our nomenclature used elsewhere.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: e89a2cfb ("[TC] defxx: TURBOchannel support")
Cc: stable@vger.kernel.org # v2.6.21+
Signed-off-by: David S. Miller <davem@davemloft.net>

193ced4a

FDDI: defxx: Bail out gracefully with unassigned PCI resource for CSR · f626ca68

Maciej W. Rozycki authored Mar 10, 2021

Recent versions of the PCI Express specification have deprecated support
for I/O transactions and actually some PCIe host bridges, such as Power
Systems Host Bridge 4 (PHB4), do not implement them.

For those systems the PCI BARs that request a mapping in the I/O space
have the length recorded in the corresponding PCI resource set to zero,
which makes it unassigned:

# lspci -s 0031:02:04.0 -v
0031:02:04.0 FDDI network controller: Digital Equipment Corporation PCI-to-PDQ Interface Chip [PFI] FDDI (DEFPA) (rev 02)
	Subsystem: Digital Equipment Corporation FDDIcontroller/PCI (DEFPA)
	Flags: bus master, medium devsel, latency 136, IRQ 57, NUMA node 8
	Memory at 620c080020000 (32-bit, non-prefetchable) [size=128]
	I/O ports at <unassigned> [disabled]
	Memory at 620c080030000 (32-bit, non-prefetchable) [size=64K]
	Capabilities: [50] Power Management version 2
	Kernel driver in use: defxx
	Kernel modules: defxx

#

Regardless, the driver goes ahead and requests it (here observed with a
Raptor Talos II POWER9 system), resulting in an odd /proc/ioports entry:

# cat /proc/ioports
00000000-ffffffffffffffff : 0031:02:04.0
#

Furthermore, the system gets confused as the driver actually continues
and pokes at those locations, causing a flood of messages being output
to the system console by the underlying system firmware, like:

defxx: v1.11 2014/07/01  Lawrence V. Stefani and others
defxx 0031:02:04.0: enabling device (0140 -> 0142)
LPC[000]: Got SYNC no-response error. Error address reg: 0xd0010000
IPMI: dropping non severe PEL event
LPC[000]: Got SYNC no-response error. Error address reg: 0xd0010014
IPMI: dropping non severe PEL event
LPC[000]: Got SYNC no-response error. Error address reg: 0xd0010014
IPMI: dropping non severe PEL event

and so on and so on (possibly intermixed actually, as there's no locking
between the kernel and the firmware in console port access with this
particular system, but cleaned up above for clarity), and once some 10k
of such pairs of the latter two messages have been produced an interface
eventually shows up in a useless state:

0031:02:04.0: DEFPA at I/O addr = 0x0, IRQ = 57, Hardware addr = 00-00-00-00-00-00

This was not anticipated when resource handling was added to the driver
a while ago, because it was not known at that time that a PCI system
could exist that cannot assign port I/O resources, and oddly enough
`request_region' does not fail, which would have caught it.

Correct the problem then by checking for a length of zero for the CSR
resource and bail out gracefully, refusing to register an interface if
that turns out to be the case, producing messages like:

defxx: v1.11 2014/07/01  Lawrence V. Stefani and others
0031:02:04.0: Cannot use I/O, no address set, aborting
0031:02:04.0: Recompile driver with "CONFIG_DEFXX_MMIO=y"

Keep the original check for the EISA MMIO resource as implemented,
because in that case the length is hardwired to 0x400 as a consequence
of how the compare/mask address decoding works in the ESIC chip and it
is only the base address that is set to zero if MMIO has been disabled
for the adapter in EISA configuration, which in turn could be a valid
bus address in a legacy-free system implementing PCI, especially for
port I/O.

Where the EISA MMIO resource has been disabled for the adapter in EISA
configuration this arrangement keeps producing messages like:

eisa 00:05: EISA: slot 5: DEC3002 detected
defxx: v1.11 2014/07/01  Lawrence V. Stefani and others
00:05: Cannot use MMIO, no address set, aborting
00:05: Recompile driver with "CONFIG_DEFXX_MMIO=n"
00:05: Or run ECU and set adapter's MMIO location

with the last two lines now swapped for easier handling in the driver.

There is no need to check for and catch the case of a port I/O resource
not having been assigned for EISA as the adapter uses the slot-specific
I/O space, which gets assigned by how EISA has been specified and maps
directly to the particular slot an option card has been placed in.  And
the EISA variant of the adapter has additional registers that are only
accessible via the port I/O space anyway.
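
The three cases described above can be summarised in a small
user-space sketch in C. It is only an illustration under stated
assumptions: `struct res', `enum bus' and `csr_usable' are hypothetical
names rather than kernel or driver interfaces, and TURBOchannel is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a bus resource; not the kernel's struct resource. */
struct res {
	uint64_t start; /* base bus address */
	uint64_t len;   /* length in bytes */
};

enum bus { BUS_PCI, BUS_EISA };

/* Decide whether the CSR resource can be used before touching it. */
static bool csr_usable(enum bus bus, bool use_mmio, const struct res *csr)
{
	assert(csr != NULL);

	if (bus == BUS_EISA) {
		/*
		 * EISA MMIO: the length is hardwired, so a zero base is
		 * what marks the window as disabled.  EISA port I/O is
		 * slot-specific and needs no check.
		 */
		return use_mmio ? csr->start != 0 : true;
	}

	/* PCI: a zero length marks the resource as unassigned. */
	return csr->len != 0;
}
```

A caller would refuse to register the interface whenever `csr_usable'
returns false, rather than request the region and access it.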

While at it factor out the error message calls into helpers and fix an
argument order bug with the `pr_err' call now in `dfx_register_res_err'.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: 4d0438e5 ("defxx: Clean up DEFEA resource management")
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: David S. Miller <davem@davemloft.net>

f626ca68

Merge branch 'mlxsw-misc-updates' · a3c39230

David S. Miller authored Mar 10, 2021

Ido Schimmel says:

====================
mlxsw: Misc updates

This patch set contains miscellaneous updates for mlxsw.

Patches #1-#2 reword an extack message to make it clearer and fix a
comment.

Patch #3 bumps the minimum firmware version enforced by mlxsw. This is
needed for two upcoming features: Resilient hashing and per-flow
sampling.

Patches #4-#6 improve the information reported via devlink-health for
'fw_fatal' events.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a3c39230

mlxsw: Adjust some MFDE fields shift and size to fw implementation · 4734a750

Danielle Ratson authored Mar 10, 2021

MFDE.irisc_id and MFDE.event_id were adjusted according to what is
actually implemented in firmware.

Adjust the shift and size of these fields in mlxsw as well.

Note that the displacement of the first field is not a regression.
It was always incorrect and therefore reported "0".
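
To illustrate why a wrong shift can make a field read as "0", here is a
minimal sketch in C of extracting a field by shift and size. The helper
`field_get' is a hypothetical name and the layout in the usage note
below is made up; neither reproduces the mlxsw register macros or the
actual MFDE layout.

```c
#include <assert.h>
#include <stdint.h>

/* Extract a `size'-bit field starting at bit `shift' of a 32-bit word. */
static uint32_t field_get(uint32_t word, unsigned int shift, unsigned int size)
{
	assert(size >= 1 && size <= 32 && shift <= 32 - size);

	/* Avoid shifting by the full width, which is undefined in C. */
	uint32_t mask = size == 32 ? UINT32_MAX : (UINT32_C(1) << size) - 1;

	return (word >> shift) & mask;
}
```

With a word of 0x00AB0000 where the value actually sits at bits 16-23,
`field_get(word, 16, 8)' yields 0xAB, while reading with a displaced
shift of 24 yields 0.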
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4734a750

mlxsw: core: Expose MFDE.log_ip to devlink health · 315afd20

Danielle Ratson authored Mar 10, 2021

Add the MFDE.log_ip field to devlink health reporter in order to ease
firmware debug. This field encodes the instruction pointer that triggered
the CR space timeout.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

315afd20

mlxsw: reg: Extend MFDE register with new log_ip field · ff12ba3a

Danielle Ratson authored Mar 10, 2021

Extend MFDE (Monitoring FW Debug) register with new field specifying the
instruction pointer that triggered the CR space timeout.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff12ba3a

mlxsw: spectrum: Bump minimum FW version to xx.2008.2406 · 2ab781c2

Petr Machata authored Mar 10, 2021

The indicated version fixes the following two issues:

- MIRROR_SAMPLER_ACTION.mirror_probability_rate was inverted. This has
  implications for per-flow sampling.

- When adjacency is replaced-if-inactive (RATR.opcode=3), bad parameter
  was reported when replacing an active entry. This breaks offload of
  resilient next-hop groups.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2ab781c2

mlxsw: reg: Fix comment about slot_index field in PMAOS register · 675e5a1e

Amit Cohen authored Mar 10, 2021

The comment did not include the register name.
Add `pmaos` to align the comment with other comments.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

675e5a1e

mlxsw: spectrum: Reword an error message for Q-in-Q veto · 825e8885

Danielle Ratson authored Mar 10, 2021

'Uppers' is not clear enough for all users when referring to upper
devices.

Reword the error message so it will be clearer.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

825e8885