- 31 Oct, 2020 35 commits
-
-
Rakesh Babu authored
This patch modifies NIX functions to operate with nix_hw context so that existing functions can be used for both NIX0 and NIX1 blocks. And the NIX blocks present in the system are initialized during driver init and freed during exit. Signed-off-by: Rakesh Babu <rsaladi2@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rakesh Babu authored
AF manages the tasks of allocating, freeing LFs from RVU blocks to PF and VFs. With new NIX1 and CPT1 blocks in 98xx, this patch adds support for handling new blocks too. Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Rakesh Babu <rsaladi2@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Subbaraya Sundeep authored
Since multiple blocks of same type are present in 98xx, modify functions which get resource count and which update resource count to work with individual block address instead of block type. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Rakesh Babu <rsaladi2@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Robert Hancock authored
Update the axienet driver to properly support the Xilinx PCS/PMA PHY component which is used for 1000BaseX and SGMII modes, including properly configuring the auto-negotiation mode of the PHY and reading the negotiated state from the PHY. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Link: https://lore.kernel.org/r/20201028171429.1699922-1-robert.hancock@calian.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
The previous commit added support for IPA having up to six source and destination resources. But currently nothing uses more than four. (Five of each are used in a newer version of the hardware.) I find that in one of my build environments the compiler complains about newly-added code in two spots. Inspection shows that the warnings have no merit, but this compiler does not recognize that.
    ipa_main.c:457:39: warning: array index 5 is past the end of the array (which contains 4 elements) [-Warray-bounds]
(and the same warning at line 483)
We can make this warning go away by changing the number of elements in the source and destination resource limit arrays--now rather than waiting until we need it to support the newer hardware. This change was coming soon anyway; make it now to get rid of the warning. Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20201031151524.32132-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Heiner Kallweit says: ==================== net: add functionality to net core byte/packet counters and use it in r8169 This series adds missing functionality to the net core handling of byte/packet counters and statistics. The extensions are then used to remove private rx/tx byte/packet counters in r8169 driver. ==================== Link: https://lore.kernel.org/r/1fdb8ecd-be0a-755d-1d92-c62ed8399e77@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Heiner Kallweit authored
After switching to the net core rx/tx byte/packet counters we can remove the now unused private version. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Heiner Kallweit authored
Switch to the net core rx/tx byte/packet counter infrastructure. This simplifies the code; the only small drawback is some memory overhead, because we use just one queue but allocate the counters per CPU. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Heiner Kallweit authored
We have netdev_alloc_pcpu_stats(), and we have devm_alloc_percpu(). Add a managed version of netdev_alloc_pcpu_stats, e.g. for allocating the per-cpu stats in the probe() callback of a driver. It needs to be a macro for dealing properly with the type argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
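A minimal sketch of how a driver's probe() might use the new managed helper; example_priv and example_probe are hypothetical names, while devm_netdev_alloc_pcpu_stats() and struct pcpu_sw_netstats are the real kernel symbols:
    #include <linux/netdevice.h>
    #include <linux/platform_device.h>

    struct example_priv {
            struct pcpu_sw_netstats __percpu *stats;
    };

    static int example_probe(struct platform_device *pdev)
    {
            struct example_priv *priv;

            priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
            if (!priv)
                    return -ENOMEM;

            /* Per-cpu stats are freed automatically when the device is unbound. */
            priv->stats = devm_netdev_alloc_pcpu_stats(&pdev->dev,
                                                       struct pcpu_sw_netstats);
            if (!priv->stats)
                    return -ENOMEM;

            platform_set_drvdata(pdev, priv);
            return 0;
    }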
-
Heiner Kallweit authored
Add dev_sw_netstats_tx_add(), complementing the already existing dev_sw_netstats_rx_add(). Unlike dev_sw_netstats_rx_add(), it allows the number of packets to be passed as a function argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
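A hedged sketch of typical call sites for the two helpers; the surrounding function names are illustrative, and dev->tstats is assumed to have been allocated with netdev_alloc_pcpu_stats() (or the devm_ variant above):
    #include <linux/netdevice.h>
    #include <linux/skbuff.h>

    static void example_rx_complete(struct net_device *dev, struct sk_buff *skb)
    {
            /* RX helper: one packet implied, only the byte count is passed */
            dev_sw_netstats_rx_add(dev, skb->len);
    }

    static void example_tx_complete(struct net_device *dev, unsigned int pkts,
                                    unsigned int bytes)
    {
            /* TX helper: unlike the RX one, the packet count is an argument */
            dev_sw_netstats_tx_add(dev, pkts, bytes);
    }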
-
Jakub Kicinski authored
Sebastian Andrzej Siewior says: ==================== in_interrupt() cleanup, part 2 in the discussion about preempt count consistency across kernel configurations: https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/ Linus clearly requested that code in drivers and libraries which changes behaviour based on execution context should either be split up so that e.g. task context invocations and BH invocations have different interfaces or if that's not possible the context information has to be provided by the caller which knows in which context it is executing. This includes conditional locking, allocation mode (GFP_*) decisions and avoidance of code paths which might sleep. In the long run, usage of 'preemptible, in_*irq etc.' should be banned from driver code completely. This is part two addressing remaining drivers except for orinoco-usb. ==================== Cherry picking only Ethernet changes. Link: https://lore.kernel.org/r/20201027225454.3492351-1-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sebastian Andrzej Siewior authored
The driver uses in_irq() to determine if the tlan_priv::lock has to be acquired in tlan_mii_read_reg() and tlan_mii_write_reg(). The interrupt handler acquires the lock outside of these functions so the in_irq() check is meant to prevent a lock recursion deadlock. But this check is incorrect when interrupt force threading is enabled because then the handler runs in thread context and in_irq() correctly returns false. The usage of in_*() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be separated or the context be conveyed in an argument passed by the caller, which usually knows the context. tlan_set_timer() has this conditional as well, but this function is only invoked from task context or the timer callback itself. So it always has to lock and the check can be removed. tlan_mii_read_reg(), tlan_mii_write_reg() and tlan_phy_print() are invoked from interrupt and other contexts. Split out the actual function body into helper variants which are called from interrupt context and make the original functions wrappers which acquire tlan_priv::lock unconditionally. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Samuel Chessman <chessman@tux.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
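An illustrative sketch of the split described above, with simplified names (the real driver's helpers and private data differ):
    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct example_priv {
            spinlock_t lock;
    };

    /* Shared body; the caller must already hold priv->lock. */
    static u16 __example_mii_read_reg(struct example_priv *priv, u16 phy, u16 reg)
    {
            /* ... actual MII register access ... */
            return 0;
    }

    /* Variant for the interrupt handler, which already holds priv->lock. */
    static u16 example_mii_read_reg_irq(struct example_priv *priv, u16 phy, u16 reg)
    {
            return __example_mii_read_reg(priv, phy, reg);
    }

    /* Wrapper for all other contexts; takes the lock unconditionally. */
    static u16 example_mii_read_reg(struct example_priv *priv, u16 phy, u16 reg)
    {
            unsigned long flags;
            u16 val;

            spin_lock_irqsave(&priv->lock, flags);
            val = __example_mii_read_reg(priv, phy, reg);
            spin_unlock_irqrestore(&priv->lock, flags);
            return val;
    }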
-
Sebastian Andrzej Siewior authored
nv_update_stats() triggers a WARN_ON() when invoked from hard interrupt context because the locks in use are not hard interrupt safe. It also has an assert_spin_locked() which was the lock check before the lockdep era. Lockdep has way broader locking correctness checks and covers both issues, so replace the warning and the lock assert with lockdep_assert_held(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Rain River <rain.1986.08.12@gmail.com> Cc: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
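A small hedged sketch of the replacement; "np" stands in for the driver private data and this is not the literal forcedeth code:
    #include <linux/lockdep.h>
    #include <linux/spinlock.h>

    struct example_priv {
            spinlock_t lock;
    };

    static void example_update_stats(struct example_priv *np)
    {
            /*
             * Previously: assert_spin_locked(&np->lock) plus a WARN_ON() for
             * hard interrupt context.  lockdep_assert_held() checks that the
             * current context holds the lock, catches hardirq-unsafe usage
             * through lockdep's tracking, and compiles away without lockdep.
             */
            lockdep_assert_held(&np->lock);

            /* ... read the hardware counters ... */
    }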
-
Sebastian Andrzej Siewior authored
wait_for_cmd_complete() uses in_interrupt() to detect whether it is safe to sleep or not. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be separated or the context be conveyed in an argument passed by the caller, which usually knows the context. in_interrupt() also is only partially correct because it fails to choose the correct code path when just preemption or interrupts are disabled. Add an argument 'may_block' to both functions and adjust the callers to pass the context information. The following call chains which end up invoking wait_for_cmd_complete() were analyzed to be safe to sleep:
    s2io_card_up()
    s2io_set_multicast()
    init_nic()
    init_tti()
    s2io_close()
    do_s2io_delete_unicast_mc()
    do_s2io_add_mac()
    s2io_set_mac_addr()
    do_s2io_prog_unicast()
    do_s2io_add_mac()
    s2io_reset()
    do_s2io_restore_unicast_mc()
    do_s2io_add_mc()
    do_s2io_add_mac()
    s2io_open()
    do_s2io_prog_unicast()
    do_s2io_add_mac()
The following call chains which end up invoking wait_for_cmd_complete() can be invoked from atomic context and must not sleep:
    __dev_set_rx_mode()
    s2io_set_multicast()
    s2io_txpic_intr_handle()
    s2io_link()
    init_tti()
Add a may_sleep argument to wait_for_cmd_complete(), s2io_set_multicast() and init_tti() and hand the context information in from the call sites. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Jon Mason <jdmason@kudzu.us> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
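A hedged sketch of the pattern, with an invented register poll loop (the real driver's registers and timeouts differ): the caller states whether sleeping is allowed instead of the function guessing with in_interrupt():
    #include <linux/delay.h>
    #include <linux/errno.h>
    #include <linux/io.h>

    static int example_wait_cmd_complete(void __iomem *addr, u32 busy_bit,
                                         bool may_sleep)
    {
            int i;

            for (i = 0; i < 100; i++) {
                    if (!(readl(addr) & busy_bit))
                            return 0;
                    if (may_sleep)
                            msleep(1);      /* task context: sleep */
                    else
                            mdelay(1);      /* atomic context: busy-wait */
            }
            return -ETIMEDOUT;
    }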
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== L2 multicast forwarding for Ocelot switch This series enables the mscc_ocelot switch to forward raw L2 (non-IP) mdb entries as configured by the bridge driver after this patch: https://patchwork.ozlabs.org/project/netdev/patch/20201028233831.610076-1-vladimir.oltean@nxp.com/ ==================== Link: https://lore.kernel.org/r/20201029022738.722794-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
There is one main difference in mscc_ocelot between IP multicast and L2 multicast. With IP multicast, destination ports are encoded into the upper bytes of the multicast MAC address. Example: to deliver the address 01:00:5E:11:22:33 to ports 3, 8, and 9, one would need to program the address of 00:03:08:11:22:33 into hardware. Whereas for L2 multicast, the MAC table entry points to a Port Group ID (PGID), and that PGID contains the port mask that the packet will be forwarded to. As to why it is this way, no clue. My guess is that not all port combinations can be supported simultaneously with the limited number of PGIDs, and this was somehow an issue for IP multicast but not for L2 multicast. Anyway. Prior to this change, the raw L2 multicast code was bogus, due to the fact that there wasn't really any way to test it using the bridge code. There were 2 issues:
- A multicast PGID was allocated for each MDB entry, but it wasn't in fact programmed to hardware. It was dummy.
- In fact we don't want to reserve a multicast PGID for every single MDB entry. That would be odd because we can only have ~60 PGIDs, but thousands of MDB entries. So instead, we want to reserve a multicast PGID for every single port combination for multicast traffic. And since we can have 2 (or more) MDB entries delivered to the same port group (and therefore PGID), we need to reference-count the PGIDs.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
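A hedged sketch of the IPv4-multicast encoding described above (byte layout inferred from the example in the text): ports 3, 8 and 9 give a port mask of 0x308, which replaces the top three MAC bytes and turns 01:00:5E:11:22:33 into 00:03:08:11:22:33:
    #include <linux/types.h>

    static void example_encode_ipv4_mc_ports(u8 mac[6], u16 port_mask)
    {
            mac[0] = 0;
            mac[1] = port_mask >> 8;        /* 0x03 for mask 0x308 */
            mac[2] = port_mask & 0xff;      /* 0x08 for mask 0x308 */
            /* mac[3..5] keep the low bytes of the group address: 11:22:33 */
    }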
-
Vladimir Oltean authored
This saves a re-classification of the MDB address on deletion. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
It is not needed; a comment will suffice. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Since a helper is available for copying Ethernet addresses, let's use it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
ocelot.h says:
    /* MAC table entry types.
     * ENTRYTYPE_NORMAL is subject to aging.
     * ENTRYTYPE_LOCKED is not subject to aging.
     * ENTRYTYPE_MACv4 is not subject to aging. For IPv4 multicast.
     * ENTRYTYPE_MACv6 is not subject to aging. For IPv6 multicast.
     */
We don't want the permanent entries added with 'bridge mdb' to be subject to aging. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
When creating a new multicast port group, there is implicit conversion between the __u8 state member of struct br_mdb_entry and the unsigned char flags member of struct net_bridge_port_group. This implicit conversion relies on the fact that MDB_PERMANENT is equal to MDB_PG_FLAGS_PERMANENT. Let's be more explicit and convert the state to flags manually. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201028234815.613226-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
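A hedged sketch of the explicit conversion (MDB_PG_FLAGS_PERMANENT lives in the bridge's internal br_private.h, so this only compiles inside the bridge module; the helper name is illustrative):
    #include <linux/if_bridge.h>

    static unsigned char example_mdb_flags(const struct br_mdb_entry *entry)
    {
            unsigned char flags = 0;

            /* translate the uapi state instead of relying on the values of
             * MDB_PERMANENT and MDB_PG_FLAGS_PERMANENT coinciding
             */
            if (entry->state == MDB_PERMANENT)
                    flags |= MDB_PG_FLAGS_PERMANENT;
            return flags;
    }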
-
Nikolay Aleksandrov authored
Extend the bridge multicast control and data path to configure routes for L2 (non-IP) multicast groups. The uapi struct br_mdb_entry union u is extended with another variant, mac_addr, which does not change the structure size, and which is valid when the proto field is zero. To be compatible with the forwarding code that is already in place, which acts as an IGMP/MLD snooping bridge with querier capabilities, we need to declare that for L2 MDB entries (for which there exists no such thing as IGMP/MLD snooping/querying), that there is always a querier. Otherwise, these entries would be flooded to all bridge ports and not just to those that are members of the L2 multicast group. Needless to say, only permanent L2 multicast groups can be installed on a bridge port. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201028233831.610076-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
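A hedged sketch of how a user of the RTM_NEWMDB netlink interface might fill in the extended entry, assuming the mac_addr union member added here; proto == 0 selects the raw-L2 variant and only permanent entries are accepted:
    #include <linux/if_bridge.h>
    #include <linux/if_ether.h>
    #include <string.h>

    static void example_fill_l2_mdb(struct br_mdb_entry *entry, int ifindex,
                                    const unsigned char group[ETH_ALEN])
    {
            memset(entry, 0, sizeof(*entry));
            entry->ifindex = ifindex;
            entry->state = MDB_PERMANENT;   /* only permanent L2 groups allowed */
            entry->addr.proto = 0;          /* 0 means raw L2, not IPv4/IPv6 */
            memcpy(entry->addr.u.mac_addr, group, ETH_ALEN);
    }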
-
Jakub Kicinski authored
Edward Cree says: ==================== sfc: EF100 TSO enhancements Support TSO over encapsulation (with GSO_PARTIAL), and over VLANs (which the code already handled but we didn't advertise). Also correct our handling of IPID mangling. I couldn't find documentation of exactly what shaped SKBs we can get given, so patch #2 is slightly guesswork, but when I tested TSO over both underlay and (VxLAN) overlay, the checksums came out correctly, so at least in those cases the edits we're making must be the right ones. Similarly, I'm not 100% sure I've correctly understood how FIXEDID and MANGLEID are supposed to work in patch #3. ==================== Link: https://lore.kernel.org/r/6e1ea05f-faeb-18df-91ef-572445691d89@solarflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Edward Cree authored
AIUI, the NETIF_F_TSO_MANGLEID flag is a signal to the stack that a driver may _need_ to mangle IDs in order to do TSO, and conversely a signal from the stack that the driver is permitted to do so. Since we support both fixed and incrementing IPIDs, we should rely on the SKB_GSO_FIXEDID flag on a per-skb basis, rather than using the MANGLEID feature to make all TSOs fixed-id. Includes other minor cleanups of ef100_make_tso_desc() coding style. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
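A hedged sketch of the per-skb check implied above; the helper name is illustrative, while SKB_GSO_FIXEDID is the real flag:
    #include <linux/skbuff.h>

    static bool example_tso_wants_fixed_ipid(const struct sk_buff *skb)
    {
            /* if the stack marked the skb fixed-id, the NIC must not
             * increment the IPv4 Identification field between segments
             */
            return skb_shinfo(skb)->gso_type & SKB_GSO_FIXEDID;
    }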
-
Edward Cree authored
The NIC only needs to know where the headers it has to edit (TCP and inner and outer IPv4) are, which fits GSO_PARTIAL nicely. It also supports non-PARTIAL offload of UDP tunnels, again just needing to be told the outer transport offset so that it can edit the UDP length field. (It's not clear to me whether the stack will ever use the non-PARTIAL version with the netdev feature flags we're setting here.) Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Edward Cree authored
We need EFX_POPULATE_OWORD_17 for an encap TSO descriptor on EF100. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Alex Elder says: ==================== net: ipa: minor bug fixes This series fixes several bugs. They are minor, in that the code currently works on supported platforms even without these patches applied, but they're bugs nevertheless and should be fixed. Version 2 improves the commit message for the fourth patch. It also fixes a bug in two spots in the last patch. Both of these changes were suggested by Willem de Bruijn. ==================== Link: https://lore.kernel.org/r/20201028194148.6659-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
The minimum and maximum limits for resources assigned to a given resource group are programmed in pairs, with the limits for two groups set in a single register. If the number of supported resource groups is odd, only half of the register that defines these limits is valid for the last group; that group has no second group in the pair. Currently we ignore this constraint, and it turns out to be harmless, but it is not guaranteed to be. This patch addresses that, and adds support for programming the 5th resource group's limits. Rework how the resource group limit registers are programmed by having a single function program all group pairs rather than having one function program each pair. Add the programming of the 4-5 resource group pair limits to this function. If a resource group is not supported, pass a null pointer to ipa_resource_config_common() for that group and have that function write zeroes in that case. Tested-by: Sujit Kautkar <sujitka@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
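A loose sketch of the pairing constraint, with an invented register layout (the real IPA field widths and offsets differ): one register carries the limits of two consecutive groups, and the odd half is written as zero when that group does not exist:
    #include <linux/types.h>

    struct example_limits {
            u32 min;
            u32 max;
    };

    static u32 example_encode_group_pair(const struct example_limits *even,
                                         const struct example_limits *odd)
    {
            u32 val = 0;

            val |= (even->min & 0x3f) << 0;
            val |= (even->max & 0x3f) << 8;
            if (odd) {                      /* NULL for a non-existent group */
                    val |= (odd->min & 0x3f) << 16;
                    val |= (odd->max & 0x3f) << 24;
            }
            return val;
    }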
-
Alex Elder authored
The number of resource groups supported by the hardware can be different for source and destination resources. Determine the number supported for each using separate functions. Make the functions inline and move their definitions into "ipa_reg.h", because they determine whether certain register definitions are valid. Pass just the IPA hardware version as argument. IPA_RESOURCE_GROUP_COUNT represents the maximum number of resource groups the driver supports for any hardware version. Change that symbol to be two separate constants, one for source and the other for destination resource groups. Rename them to end with "_MAX" rather than "_COUNT", to reflect their true purpose. Tested-by: Sujit Kautkar <sujitka@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
The IPA hardware manages various resources (e.g. descriptors) internally to perform its functions. The resources are grouped, allowing different endpoints to use separate resource pools. This way one group of endpoints can be configured to operate unaffected by the resource use of endpoints in a different group. Endpoints should be assigned to a resource group, but we currently don't do that. Define a new resource_group field in the endpoint configuration data, and use it to assign the proper resource group to use for each AP endpoint. Tested-by: Sujit Kautkar <sujitka@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
The mask for the RSRC_GRP field in the INIT_RSRC_GRP endpoint initialization register is incorrectly defined for IPA v4.2 (where it is only one bit wide). So we need to fix this. The fix is not straightforward, however. Field masks are passed to functions like u32_encode_bits(), and for that they must be constant. To address this, we define a new inline function that returns the *encoded* value to use for a given RSRC_GRP field, which depends on the IPA version. The caller can then use something like this, to assign a given endpoint resource id 1:
    u32 offset = IPA_REG_ENDP_INIT_RSRC_GRP_N_OFFSET(endpoint_id);
    u32 val = rsrc_grp_encoded(ipa->version, 1);
    iowrite32(val, ipa->reg_virt + offset);
The next patch requires this fix. Tested-by: Sujit Kautkar <sujitka@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
At the end of ipa_mem_setup() we write the local packet processing context base register to tell it where the processing context memory is. But we are writing the wrong value. The value written turns out to be the offset of the modem header memory region (assigned earlier in the function). Fix this bug. Tested-by: Sujit Kautkar <sujitka@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Moritz Fischer authored
The driver does not implement a shutdown handler, which leads to issues when using kexec in certain scenarios. The NIC keeps on fetching descriptors, which get flagged by the IOMMU with errors like this:
    DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
    DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
    DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
    DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
    DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
Signed-off-by: Moritz Fischer <mdf@kernel.org> Link: https://lore.kernel.org/r/20201028172125.496942-1-mdf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
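A hedged sketch of the kind of PCI .shutdown handler such a fix adds: quiesce the netdev so it stops DMA before kexec takes over. All names are illustrative; the real driver uses its own teardown helpers:
    #include <linux/netdevice.h>
    #include <linux/pci.h>
    #include <linux/rtnetlink.h>

    static void example_pci_shutdown(struct pci_dev *pdev)
    {
            struct net_device *netdev = pci_get_drvdata(pdev);

            rtnl_lock();
            if (netif_running(netdev))
                    dev_close(netdev);      /* stops the queues and DMA rings */
            rtnl_unlock();
    }

    static struct pci_driver example_driver = {
            .name     = "example",
            /* .probe and .remove omitted */
            .shutdown = example_pci_shutdown,
    };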
-
Robert Hancock authored
The Finisar FCLF8520P2BTL 1000BaseT SFP module uses a Marvell 88E1111 PHY with a modified PHY ID. Add support for this ID using the 88E1111 methods. By default these modules do not have 1000BaseX auto-negotiation enabled, which is not generally desirable with Linux networking drivers. Add handling to enable 1000BaseX auto-negotiation when these modules are used in 1000BaseX mode. Also, some special handling is required to ensure that 1000BaseT auto-negotiation is enabled properly when desired. Based on existing handling in the AMD xgbe driver and the information in the Finisar FAQ: https://www.finisar.com/sites/default/files/resources/an-2036_1000base-t_sfp_faqreve1.pdf Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20201028171540.1700032-1-robert.hancock@calian.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 30 Oct, 2020 5 commits
-
-
Jakub Kicinski authored
Xin Long says: ==================== sctp: Implement RFC6951: UDP Encapsulation of SCTP
Description:
From the RFC, the main reasons:
o To allow SCTP traffic to pass through legacy NATs, which do not provide native SCTP support as specified in [BEHAVE] and [NATSUPP].
o To allow SCTP to be implemented on hosts that do not provide direct access to the IP layer. In particular, applications can use their own SCTP implementation if the operating system does not provide one.
Implementation Notes:
UDP-encapsulated SCTP is normally communicated between SCTP stacks using the IANA-assigned UDP port number 9899 (sctp-tunneling) on both ends. There are circumstances where other ports may be used on either end, and it might be required to use ports other than the registered port. Each SCTP stack uses a single local UDP encapsulation port number as the destination port for all its incoming SCTP packets, this greatly simplifies implementation design. An SCTP implementation supporting UDP encapsulation MUST maintain a remote UDP encapsulation port number per destination address for each SCTP association. Again, because the remote stack may be using ports other than the well-known port, each port may be different from each stack. However, because of remapping of ports by NATs, the remote ports associated with different remote IP addresses may not be identical, even if they are associated with the same stack. Because the well-known port might not be used, implementations need to allow other port numbers to be specified as a local or remote UDP encapsulation port number through APIs.
Patches:
This patchset is using the udp4/6 tunnel APIs to implement the UDP Encapsulation of SCTP with not much change in the SCTP protocol stack and with all current SCTP features kept in the Linux kernel.
1 - 4: Fix some UDP issues that may be triggered by SCTP over UDP.
5 - 7: Process incoming UDP encapsulated packets and ICMP packets.
8 -10: Remote encap port's update by sysctl, sockopt and packets.
11-14: Process outgoing packets with UDP encapsulated and its GSO.
15-16: Add the part from draft-tuexen-tsvwg-sctp-udp-encaps-cons-03.
17: Enable this feature.
Tests:
- lksctp-tools/src/func_tests with UDP Encapsulation enabled/disabled: Both make v4test and v6test passed.
- sctp-tests with UDP Encapsulation enabled/disabled: repeatability/procdumps/sctpdiag/gsomtuchange/extoverflow/sctphashtable passed. Others failed as expected due to those "iptables -p sctp" rules.
- netperf on lo/netns/virtio_net, with gso enabled/disabled and with ip_checksum enabled/disabled, with UDP Encapsulation enabled/disabled: No clear performance drop.
v1->v2:
- Fix some incorrect code in the patches 5,6,8,10,11,13,14,17, suggested by Marcelo.
- Append two patches 15-16 to add the Additional Considerations for UDP Encapsulation of SCTP from draft-tuexen-tsvwg-sctp-udp-encaps-cons-03.
v2->v3:
- remove the cleanup code in patch 2, suggested by Willem.
- remove the patch 3 and fix the checksum in the new patch 3 after talking with Paolo, Marcelo and Guillaume.
- add 'select NET_UDP_TUNNEL' in patch 4 to solve a compiling error.
- fix __be16 type cast warning in patch 8.
- fix the wrong endian orders when setting values in 14,16.
v3->v4:
- add entries in ip-sysctl.rst in patch 7,16, as Marcelo suggested.
- not create udp socks when udp_port is set to 0 in patch 16, as Marcelo noticed.
v4->v5:
- improve the description for udp_port and encap_port entries in patch 7, 16.
- use 0 as the default udp_port.
==================== Link: https://lore.kernel.org/r/cover.1603955040.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xin Long authored
This patch is to enable udp tunneling socks by calling sctp_udp_sock_start() in sctp_ctrlsock_init(), and sctp_udp_sock_stop() in sctp_ctrlsock_exit(). Also add sysctl udp_port to allow users to change the listening sock's port. With this patch, the whole sctp over udp feature can be enabled and used. v1->v2: - Also update ctl_sock udp_port in proc_sctp_do_udp_port() where netns udp_port gets changed. v2->v3: - Call htons() when setting sk udp_port from netns udp_port. v3->v4: - Not call sctp_udp_sock_start() when new_value is 0. - Add udp_port entry in ip-sysctl.rst. v4->v5: - Not call sctp_udp_sock_start/stop() in sctp_ctrlsock_init/exit(). - Improve the description of udp_port in ip-sysctl.rst. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
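A hedged userspace sketch: after the admin sets the listening port (e.g. sysctl -w net.sctp.udp_port=9899), an application can point an association at the peer's encapsulation port with the SCTP_REMOTE_UDP_ENCAPS_PORT socket option from this series; struct sctp_udpencaps is assumed as defined in the kernel uapi <linux/sctp.h>:
    #include <arpa/inet.h>
    #include <linux/sctp.h>
    #include <string.h>
    #include <sys/socket.h>

    static int example_set_encap_port(int sd, const struct sockaddr *peer,
                                      socklen_t peer_len, unsigned short port)
    {
            struct sctp_udpencaps encaps;

            memset(&encaps, 0, sizeof(encaps));
            memcpy(&encaps.sue_address, peer, peer_len);
            encaps.sue_port = htons(port);  /* remote UDP encapsulation port */

            return setsockopt(sd, IPPROTO_SCTP, SCTP_REMOTE_UDP_ENCAPS_PORT,
                              &encaps, sizeof(encaps));
    }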
-
Xin Long authored
This is from Section 4 of draft-tuexen-tsvwg-sctp-udp-encaps-cons-03, and it requires responding with an abort chunk with an error cause when the udp source port of the received init chunk doesn't match the encap port of the transport. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xin Long authored
This patch is to add the function to make the abort chunk with the error cause for new encapsulation port restart, defined in Section 4.4 of draft-tuexen-tsvwg-sctp-udp-encaps-cons-03. v1->v2: - no change. v2->v3: - no need to call htons() when setting nep.cur_port/new_port. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xin Long authored
This one basically does in sctp_v6_xmit() the same things that the last patch did for the udp4 sock, just note that: 1. label needs to be calculated, as it's the param of udp_tunnel6_xmit_skb(). 2. The 'nocheck' param of udp_tunnel6_xmit_skb() is false, as required by RFC. v1->v2: - Use sp->udp_port instead in sctp_v6_xmit(), which is safer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-