Commits · 230f3d53a5477bf8b04e649dca67da85635cd1eb · Kirill Smelkov / linux

31 Jul, 2023 5 commits

Jan Sokolowski authored Jul 28, 2023

Replace uses of i40e_status with error codes that are as equivalent as possible.
Remove enum i40e_status as it is no longer needed.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230728171336.2446156-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

230f3d53

tcp: Remove unused function declarations · 68223f96

Yue Haibing authored Jul 29, 2023

Commit 8a59f9d1 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
left behind the tcp_bpf_get_proto() declaration, and the
tcp_v4_tw_remember_stamp() function was removed in commit ccb7c410
("timewait_sock: Create and use getpeer op.").
Since commit 68698970 ("tcp: simplify tcp_mark_skb_lost")
tcp_skb_mark_lost_uncond_verify() declaration is not used anymore.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230729122644.10648-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

68223f96

devlink: Remove unused extern declaration devlink_port_region_destroy() · 2628d408

Yue Haibing authored Jul 28, 2023

devlink_port_region_destroy() is never implemented since
commit 544e7c33 ("net: devlink: Add support for port regions").
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230728132113.32888-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2628d408

net: Use sockaddr_storage for getsockopt(SO_PEERNAME). · 8936bf53

Kuniyuki Iwashima authored Jul 28, 2023

Commit df8fc4e9 ("kbuild: Enable -fstrict-flex-arrays=3") started
applying strict rules to standard string functions.

It does not work well with conventional socket code around each protocol-
specific sockaddr_XXX struct, which is cast from sockaddr_storage and has
a bigger size than fortified functions expect.  See these commits:

 commit 06d4c8a8 ("af_unix: Fix fortify_panic() in unix_bind_bsd().")
 commit ecb4534b ("af_unix: Terminate sun_path when bind()ing pathname socket.")
 commit a0ade840 ("af_packet: Fix warning of fortified memcpy() in packet_getname().")

We must cast the protocol-specific address back to sockaddr_storage
to call such functions.

However, in the case of getsockopt(SO_PEERNAME), the rationale is a bit
unclear as the buffer is defined as char[128], which is the same size as
sockaddr_storage.

Let's use sockaddr_storage explicitly.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8936bf53
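
The idea in the commit above, declaring the address buffer as a sockaddr_storage rather than as an untyped char array, can be sketched in user-space C. This is only an illustration under stated assumptions: demo_store_peer() is a hypothetical helper, not a kernel function, and the kernel code itself differs.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper (not a kernel API): keep a peer address in a
 * sockaddr_storage instead of an untyped char buffer, so the declared
 * object is large enough for any protocol-specific sockaddr. */
static bool demo_store_peer(struct sockaddr_storage *dst,
			    const void *src, size_t len)
{
	if (len > sizeof(*dst))
		return false;	/* would not fit; refuse rather than overflow */

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, len);
	return true;
}
```

A caller would pass a protocol-specific address such as a struct sockaddr_in together with its size, then read ss_family from the stored result.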

net: flow_dissector: Use 64bits for used_keys · 2b3082c6

Ratheesh Kannoth authored Jul 29, 2023

As 32bits of dissector->used_keys are exhausted,
increase the size to 64bits.

This is the base change for the ESP/AH flow dissector patch.
Please find the patch and discussion at
https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw
Tested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b3082c6
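
Why the wider mask matters can be shown with a small sketch: once a key id reaches 32, the bit for it no longer fits in a 32-bit mask, and the shift has to be done on a 64-bit value. The key ids and helper names below are hypothetical and chosen for illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key ids, for illustration only; not the kernel's enum values. */
enum demo_key_id {
	DEMO_KEY_BASIC = 1,
	DEMO_KEY_IPSEC = 33,	/* an id that does not fit in a 32-bit mask */
};

/* Stand-in for a dissector whose used_keys mask is 64 bits wide. */
struct demo_dissector {
	uint64_t used_keys;
};

/* Mark a key as used. The shift is done on a 64-bit constant so that
 * ids of 32 and above do not overflow. */
static void demo_set_key(struct demo_dissector *d, enum demo_key_id id)
{
	d->used_keys |= UINT64_C(1) << id;
}

/* Test whether a key is marked as used. */
static bool demo_uses_key(const struct demo_dissector *d, enum demo_key_id id)
{
	return (d->used_keys & (UINT64_C(1) << id)) != 0;
}
```

Any code that builds or tests the mask with a 32-bit shift would need the same widening, which is presumably why the change touches the drivers named in the review tags.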

29 Jul, 2023 8 commits

team: Remove NULL check before dev_{put, hold} · 64a37272

Yang Li authored Jul 27, 2023

dev_{put, hold} call netdev_{put, hold}, which already check for NULL,
so there is no need to check before calling dev_{put, hold}.
Remove the check to silence the warning:

./drivers/net/team/team.c:2325:3-10: WARNING: NULL check before dev_{put, hold} functions is not needed.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5991
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

64a37272

net: ethernet: mtk_eth_soc: enable nft hw flowtable_offload for MT7988 SoC · 88efedf5

Lorenzo Bianconi authored Jul 27, 2023

Enable hw Packet Process Engine (PPE) for MT7988 SoC.
Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/5e86341b0220a49620dadc02d77970de5ded9efc.1690441576.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

88efedf5

net: ethernet: mtk_eth_soc: enable page_pool support for MT7988 SoC · 58ea461b

Lorenzo Bianconi authored Jul 27, 2023

In order to recycle pages, enable page_pool allocator for MT7988 SoC.
Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/fd4e8693980e47385a543e7b002eec0b88bd09df.1690440675.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

58ea461b

net: bcmasp: Clean up redundant dev_err_probe() · c88c157d

Chen Jiahao authored Jul 27, 2023

According to the definition of platform_get_irq(), the return value has
already been checked and an error message has already been printed via
dev_err_probe() if ret < 0. Calling dev_err_probe() one more time
outside platform_get_irq() is redundant.

Remove the dev_err_probe() call outside platform_get_irq() to clean
this up.
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Justin Chen <justin.chen@broadcom.com>
Link: https://lore.kernel.org/r/20230727115551.2655840-1-chenjiahao16@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c88c157d

bonding: 3ad: Remove unused declaration bond_3ad_update_lacp_active() · 61c51453

YueHaibing authored Jul 26, 2023

This is not used since commit 3a755cd8 ("bonding: add new option lacp_active")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20230726143816.15280-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

61c51453

Merge branch 'r8152-reduce-control-transfer' · 4e1db4a8

Jakub Kicinski authored Jul 28, 2023

Hayes Wang says:

====================
r8152: reduce control transfer

The two patches reduce the number of control transfers when
accessing registers in bulk.
====================

Link: https://lore.kernel.org/r/20230726030808.9093-417-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

4e1db4a8

r8152: set bp in bulk · e5c266a6

Hayes Wang authored Jul 26, 2023

PLA_BP_0 ~ PLA_BP_15 (0xfc28 ~ 0xfc46) are contiguous registers, so we
can combine the control transfers into one control transfer.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230726030808.9093-419-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e5c266a6

r8152: adjust generic_ocp_write function · 57df0fb9

Hayes Wang authored Jul 26, 2023

Reduce the number of control transfers if all bytes of the first or the
last DWORD are written.

The original method splits the control transfer into three parts
(the first DWORD, the middle contiguous data, and the last DWORD). However,
they can be combined if all bytes of the first DWORD or the last DWORD
are written. That is, the first DWORD or the last DWORD can be combined
with the middle contiguous data if the byte_en is 0xff.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230726030808.9093-418-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

57df0fb9
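
The merging rule described in the commit above can be modelled as a count of control transfers. This is a simplified sketch, not the driver's code: it assumes that the low nibble of byte_en covers the first DWORD and the high nibble covers the last DWORD, and demo_count_transfers() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch: the low nibble of byte_en covers the
 * first DWORD, the high nibble covers the last DWORD. */
#define DEMO_FIRST_MASK 0x0f
#define DEMO_LAST_MASK  0xf0

/* Number of control transfers for a write of ndwords DWORDs (ndwords >= 1). */
static unsigned int demo_count_transfers(size_t ndwords, uint8_t byte_en)
{
	unsigned int first_partial = (byte_en & DEMO_FIRST_MASK) != DEMO_FIRST_MASK;
	unsigned int last_partial = (byte_en & DEMO_LAST_MASK) != DEMO_LAST_MASK;

	if (ndwords == 1)
		return 1;	/* a single DWORD is always one transfer */

	/* Each partially written end DWORD needs its own transfer; fully
	 * written end DWORDs are folded into the bulk transfer. */
	size_t bulk = ndwords - first_partial - last_partial;

	return first_partial + last_partial + (bulk > 0 ? 1 : 0);
}
```

In this model a four-DWORD write with byte_en 0xff costs one transfer, while the same write with both end DWORDs partially enabled costs three, which corresponds to the original three-part split.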

28 Jul, 2023 27 commits

net: ethernet: slicoss: remove redundant increment of pointer data · 3bdd85e2

Colin Ian King authored Jul 26, 2023

The pointer data is being incremented but this change to the pointer
is not used afterwards. The increment is redundant and can be removed.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Link: https://lore.kernel.org/r/20230726164522.369206-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3bdd85e2

Merge branch 'in-kernel-support-for-the-tls-alert-protocol' · 05191d88

Jakub Kicinski authored Jul 28, 2023

Chuck Lever says:

====================
In-kernel support for the TLS Alert protocol

IMO the kernel doesn't need user space (ie, tlshd) to handle the TLS
Alert protocol. Instead, a set of small helper functions can be used
to handle sending and receiving TLS Alerts for in-kernel TLS
consumers.
====================

Merged on top of a tag in case it's needed in the NFS tree.

Link: https://lore.kernel.org/r/169047923706.5241.1181144206068116926.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

05191d88

net/handshake: Trace events for TLS Alert helpers · b470985c

Chuck Lever authored Jul 27, 2023

Add observability for the new TLS Alert infrastructure.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047947409.5241.14548832149596892717.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b470985c

SUNRPC: Use new helpers to handle TLS Alerts · 39067dda

Chuck Lever authored Jul 27, 2023

Use the helpers to parse the level and description fields in
incoming alerts. "Warning" alerts are discarded, and "fatal"
alerts mean the session is no longer valid.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047944747.5241.1974889594004407123.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

39067dda
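
The handling described in the commit above (parse level and description, discard warnings, treat fatal alerts as ending the session) can be sketched as follows. A TLS Alert body consists of a one-byte level followed by a one-byte description, with level 1 meaning warning and level 2 meaning fatal. The names demo_handle_alert() and enum demo_action are hypothetical and are not the kernel helpers added by this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Alert levels as defined by the TLS specifications. */
enum { DEMO_ALERT_WARNING = 1, DEMO_ALERT_FATAL = 2 };

/* What the consumer should do after seeing an alert (hypothetical). */
enum demo_action {
	DEMO_IGNORE,		/* warning: discard the alert */
	DEMO_SESSION_INVALID,	/* fatal: the session is no longer valid */
	DEMO_MALFORMED,		/* too short or unknown level */
};

/* Parse a TLS Alert body: byte 0 is the level, byte 1 the description. */
static enum demo_action demo_handle_alert(const uint8_t *buf, size_t len,
					  uint8_t *description)
{
	if (len < 2)
		return DEMO_MALFORMED;

	*description = buf[1];

	switch (buf[0]) {
	case DEMO_ALERT_WARNING:
		return DEMO_IGNORE;
	case DEMO_ALERT_FATAL:
		return DEMO_SESSION_INVALID;
	default:
		return DEMO_MALFORMED;
	}
}
```

How a malformed alert should be treated is a choice made for this sketch; the source does not say what the kernel does in that case.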

net/handshake: Add helpers for parsing incoming TLS Alerts · 39d0e38d

Chuck Lever authored Jul 27, 2023

Kernel TLS consumers can replace common TLS Alert parsing code with
these helpers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047942074.5241.13791647439480672048.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

39d0e38d

SUNRPC: Send TLS Closure alerts before closing a TCP socket · 5dd5ad68

Chuck Lever authored Jul 27, 2023

Before closing a TCP connection, the TLS protocol wants peers to
send session close Alert notifications. Add those in both the RPC
client and server.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047939404.5241.14392506226409865832.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5dd5ad68

net/handshake: Add API for sending TLS Closure alerts · 35b1b538

Chuck Lever authored Jul 27, 2023

This helper sends an alert only if a TLS session was established.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047936730.5241.618595693821012638.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

35b1b538

net/tls: Add TLS Alert definitions · 02574271

Chuck Lever authored Jul 27, 2023

I'm about to add support for kernel handshake API consumers to send
TLS Alerts, so introduce the needed protocol definitions in the new
header tls_prot.h.

This presages support for Closure alerts. Also, support for alerts
is a prerequisite for handling session re-keying, where one peer will
signal the need for a re-key by sending a TLS Alert.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047934064.5241.8377890858495063518.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

02574271

net/tls: Move TLS protocol elements to a separate header · 6a7eccef

Chuck Lever authored Jul 27, 2023

Kernel TLS consumers will need definitions of various parts of the
TLS protocol, but often do not need the function declarations and
other infrastructure provided in <net/tls.h>.

Break out existing standardized protocol elements into a separate
header, and make room for a few more elements in subsequent patches.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047931374.5241.7713175865185969309.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6a7eccef

octeontx2-af: Initialize 'cntr_val' to fix uninitialized symbol error · 222a6c42

Suman Ghosh authored Jul 27, 2023

drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c:860
otx2_tc_update_mcam_table_del_req()
error: uninitialized symbol 'cntr_val'.

Fixes: ec87f054 ("octeontx2-af: Install TC filter rules in hardware based on priority")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Link: https://lore.kernel.org/r/20230727163101.2793453-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

222a6c42

Merge branch 'eth-bnxt-fix-a-couple-of-w-1-c-1-warnings' · a4989bee

Jakub Kicinski authored Jul 28, 2023

Jakub Kicinski says:

====================
eth: bnxt: fix a couple of W=1 C=1 warnings

Fix a couple of build warnings.
====================

Link: https://lore.kernel.org/r/20230727190726.1859515-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

a4989bee

eth: bnxt: fix warning for define in struct_group · 9f49db62

Jakub Kicinski authored Jul 27, 2023

Fix C=1 warning with sparse 0.6.4:

drivers/net/ethernet/broadcom/bnxt/bnxt.c: note: in included file:
drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h:30:1: warning: directive in macro's argument list

Don't put defines in a struct_group().
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20230727190726.1859515-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9f49db62

eth: bnxt: fix one of the W=1 warnings about fortified memcpy() · 833c4a81

Jakub Kicinski authored Jul 27, 2023

Fix a W=1 warning with gcc 13.1:

In function ‘fortify_memcpy_chk’,
    inlined from ‘bnxt_hwrm_queue_cos2bw_cfg’ at drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:133:3:
include/linux/fortify-string.h:592:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
  592 |                         __read_overflow2_field(q_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The field group is already defined and starts at queue_id:

struct bnxt_cos2bw_cfg {
	u8			pad[3];
	struct_group_attr(cfg, __packed,
		u8		queue_id;
		__le32		min_bw;
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20230727190726.1859515-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

833c4a81

Merge tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · b10d10a7

Jakub Kicinski authored Jul 28, 2023

Saeed Mahameed says:

====================
mlx5-updates-2023-07-24

1) Generalize devcom implementation to be independent of number of ports
   or device's GUID.

2) Save memory on command interface statistics.

3) General code cleanups

* tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Give esw_offloads_load/unload_rep() "mlx5_" prefix
  net/mlx5: Make mlx5_eswitch_load/unload_vport() static
  net/mlx5: Make mlx5_esw_offloads_rep_load/unload() static
  net/mlx5: Remove pointless devlink_rate checks
  net/mlx5: Don't check vport->enabled in port ops
  net/mlx5e: Make flow classification filters static
  net/mlx5e: Remove duplicate code for user flow
  net/mlx5: Allocate command stats with xarray
  net/mlx5: split mlx5_cmd_init() to probe and reload routines
  net/mlx5: Remove redundant cmdif revision check
  net/mlx5: Re-organize mlx5_cmd struct
  net/mlx5e: E-Switch, Allow devcom initialization on more vports
  net/mlx5e: E-Switch, Register devcom device with switch id key
  net/mlx5: Devcom, Infrastructure changes
  net/mlx5: Use shared code for checking lag is supported
====================

Link: https://lore.kernel.org/r/20230727183914.69229-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b10d10a7

Merge branch 'mlxsw-avoid-non-tracker-helpers-when-holding-and-putting-netdevices' · 97d0dca7

Jakub Kicinski authored Jul 28, 2023

Petr Machata says:

====================
mlxsw: Avoid non-tracker helpers when holding and putting netdevices

Using the tracking helpers, netdev_hold() and netdev_put(), makes it easier
to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER
is enabled. For example, the following traceback shows the callpath to the
point of an outstanding hold that was never put:

    unregister_netdevice: waiting for swp3 to become free. Usage count = 6
    ref_tracker: eth%d@ffff888123c9a580 has 1/5 users at
	mlxsw_sp_switchdev_event+0x6bd/0xcc0 [mlxsw_spectrum]
	notifier_call_chain+0xbf/0x3b0
	atomic_notifier_call_chain+0x78/0x200
	br_switchdev_fdb_notify+0x25f/0x2c0 [bridge]
	fdb_notify+0x16a/0x1a0 [bridge]
	[...]

In this patchset, get rid of all non-ref-tracking helpers in mlxsw.

- Patch #1 drops two functions that are not used anymore, but contain
  dev_hold() / dev_put() calls.

- Patch #2 avoids taking a reference in one function which is called
  under RTNL.

- The remaining patches convert individual hold/put sites one by one
  from trackerless to tracker-enabled.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/netdev/4c056da27c19d95ffeaba5acf1427ecadfc3f94c.camel@redhat.com/
====================

Link: https://lore.kernel.org/r/cover.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

97d0dca7

mlxsw: spectrum_router: IPv6 events: Use tracker helpers to hold & put netdevices · cb211620

Petr Machata authored Jul 27, 2023

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with IPv6 address events.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/f0af6ad4722b4ca6e598fd4fda8311a3041651ec.1690471775.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

cb211620

mlxsw: spectrum_router: RIF: Use tracker helpers to hold & put netdevices · d0e0e880

Petr Machata authored Jul 27, 2023

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with RIF allocation.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/8b7701a7b439ac268e4be4040eff99d01e27ae47.1690471775.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d0e0e880

mlxsw: spectrum_router: hw_stats: Use tracker helpers to hold & put netdevices · b17b2d57

Petr Machata authored Jul 27, 2023

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with hw_stats events.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/b972314cfef4f4c24e66e60d13cffa5d606d1bf3.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b17b2d57

mlxsw: spectrum_router: FIB: Use tracker helpers to hold & put netdevices · deeaa371

Petr Machata authored Jul 27, 2023

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with FIB events.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/5221a92e751c40447c55959f622267ccc999ed04.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

deeaa371

mlxsw: spectrum_switchdev: Use tracker helpers to hold & put netdevices · 1ae489ab

Petr Machata authored Jul 27, 2023

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
switchdev module.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/774c3d7b5b0231f1435df2ec9dd660192e382756.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

1ae489ab

mlxsw: spectrum_nve: Do not take reference when looking up netdevice · 16f8c846

Petr Machata authored Jul 27, 2023

mlxsw_sp_nve_fid_disable() is always called under RTNL. It is therefore
safe to call __dev_get_by_index() to get the netdevice pointer without
bumping the reference count, because we can be sure the netdevice is not
going away. That then obviates the need to put the netdevice later in the
function.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/341d1046f89d8d839d9d00e4a3d58cdc351e9397.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

16f8c846

mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put() · 569f98b3

Petr Machata authored Jul 27, 2023

As of commit 151b89f6 ("mlxsw: spectrum_router: Reuse work neighbor
initialization in work scheduler"), the functions
mlxsw_sp_port_lower_dev_hold() and mlxsw_sp_port_dev_put() have no users.
Drop them.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/d0adcd7cb4ea19416294a0f861100edba84c9f36.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

569f98b3

net: change accept_ra_min_rtr_lft to affect all RA lifetimes · 5027d54a

Patrick Rohr authored Jul 26, 2023

accept_ra_min_rtr_lft only considered the lifetime of the default route
and discarded entire RAs accordingly.

This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and
applies the value to individual RA sections; in particular, router
lifetime, PIO preferred lifetime, and RIO lifetime. If any of those
lifetimes are lower than the configured value, the specific RA section
is ignored.

In order for the sysctl to be useful to Android, it should really apply
to all lifetimes in the RA, since that is what determines the minimum
frequency at which RAs must be processed by the kernel. Android uses
hardware offloads to drop RAs for a fraction of the minimum of all
lifetimes present in the RA (some networks have very frequent RAs (5s)
with high lifetimes (2h)). Despite this, we have encountered networks
that set the router lifetime to 30s which results in very frequent CPU
wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the
WiFi firmware) entirely on such networks, it seems better to ignore the
misconfigured routers while still processing RAs from other IPv6 routers
on the same network (i.e. to support IoT applications).

The previous implementation dropped the entire RA based on router
lifetime. This turned out to be hard to expand to the other lifetimes
present in the RA in a consistent manner; dropping the entire RA based
on RIO/PIO lifetimes would essentially require parsing the whole thing
twice.

Fixes: 1671bcfd ("net: add sysctl accept_ra_min_rtr_lft")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5027d54a
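
The per-section behaviour described in the commit above can be sketched as a filter that accepts or ignores each part of an RA independently. The struct and function names are hypothetical, and the sketch follows only the rule stated in the commit message (a section whose lifetime is below the configured minimum is ignored); any special cases the kernel handles, such as zero lifetimes, are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the lifetimes carried by one RA, in seconds. */
struct demo_ra {
	uint32_t router_lifetime;
	uint32_t pio_preferred_lifetime;
	uint32_t rio_lifetime;
};

/* Which sections of the RA would be processed. */
struct demo_ra_accept {
	bool router;
	bool pio;
	bool rio;
};

/* Apply one minimum lifetime to each section separately, instead of
 * dropping the whole RA based on the router lifetime alone. */
static struct demo_ra_accept demo_filter_ra(const struct demo_ra *ra,
					    uint32_t min_lft)
{
	struct demo_ra_accept acc = {
		.router = ra->router_lifetime >= min_lft,
		.pio = ra->pio_preferred_lifetime >= min_lft,
		.rio = ra->rio_lifetime >= min_lft,
	};

	return acc;
}
```

With a minimum of 180 seconds, an RA advertising a 30 second router lifetime and two hour PIO and RIO lifetimes would lose only its router section in this model, while the previous behaviour would have dropped the whole RA.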

Merge branch 'net-store-netdevs-in-an-xarray' · 5bdc312c

Jakub Kicinski authored Jul 28, 2023

Jakub Kicinski says:

====================
net: store netdevs in an xarray

One of the more annoying developer experience gaps we have in netlink
is iterating over netdevs. It's painful. Add an xarray to make
it trivial.

v1: https://lore.kernel.org/all/20230722014237.4078962-1-kuba@kernel.org/
====================

Link: https://lore.kernel.org/r/20230726185530.2247698-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5bdc312c

net: convert some netlink netdev iterators to depend on the xarray · 84e00d9b

Jakub Kicinski authored Jul 26, 2023

Reap the benefits of easier iteration thanks to the xarray.
Convert just the genetlink ones, those are easier to test.
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

84e00d9b

net: store netdevs in an xarray · 759ab1ed

Jakub Kicinski authored Jul 26, 2023

Iterating over the netdev hash table for netlink dumps is hard.
Dumps are done in "chunks" so we need to save the position
after each chunk, so we know where to restart from. Because
netdevs are stored in a hash table we remember which bucket
we were in and how many devices we dumped.

Since we don't hold any locks across the "chunks" - devices may
come and go while we're dumping. If that happens we may miss
a device (if device is deleted from the bucket we were in).
We indicate to user space that this may have happened by setting
NLM_F_DUMP_INTR. User space is supposed to dump again (I think)
if it sees that. Somehow I doubt most user space gets this right..

To illustrate let's look at an example:

               System state:
  start:       # [A, B, C]
  del:  B      # [A, C]

with the hash table we may dump [A, B], missing C completely even
tho it existed both before and after the "del B".

Add an xarray and use it to allocate ifindexes. This way we
can iterate ifindexes in order, without the worry that we'll
skip one. We may still generate a dump of a state which "never
existed", for example for a set of values and sequence of ops:

               System state:
  start:       # [A, B]
  add:  C      # [A, C, B]
  del:  B      # [A, C]

we may generate a dump of [A], if C got an index between A and B.
System has never been in such state. But I'm 90% sure that's perfectly
fine, important part is that we can't _miss_ devices which exist before
and after. User space which wants to mirror kernel's state subscribes
to notifications and does periodic dumps so it will know that C exists
from the notification about its creation or from the next dump
(next dump is _guaranteed_ to include C, if it doesn't get removed).

To avoid any perf regressions keep the hash table for now. Most
net namespaces have very few devices and microbenchmarking 1M lookups
on Skylake I get the following results (not counting loopback
to number of devs):

 #devs | hash |  xa  | delta
    2  | 18.3 | 20.1 | + 9.8%
   16  | 18.3 | 20.1 | + 9.5%
   64  | 18.3 | 26.3 | +43.8%
  128  | 20.4 | 26.3 | +28.6%
  256  | 20.0 | 26.4 | +32.1%
 1024  | 26.6 | 26.7 | + 0.2%
 8192  |541.3 | 33.5 | -93.8%

No surprises since the hash table has 256 entries.
The microbenchmark scans indexes in order, if the pattern is more
random xa starts to win at 512 devices already. But that's a lot
of devices, in practice.
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

759ab1ed
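
The difference between the two ways of resuming a chunked dump in the commit above can be simulated in a few lines of user-space C. This is a toy model under stated assumptions: a small array indexed by ifindex stands in for both the hash bucket and the xarray, and all names are hypothetical. It reproduces the first example from the commit message, where deleting B between chunks makes a position-based dump miss C.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_IFINDEX 8

/* present[i] is true when a device with ifindex i exists. */
struct demo_netns {
	bool present[DEMO_MAX_IFINDEX];
};

/* Position-based resume: skip as many live devices as were dumped so far.
 * Models remembering "how many devices we dumped" within a bucket. */
static size_t demo_dump_by_position(const struct demo_netns *ns,
				    size_t *dumped, size_t budget, int *out)
{
	size_t seen = 0, n = 0;

	for (int i = 0; i < DEMO_MAX_IFINDEX && n < budget; i++) {
		if (!ns->present[i])
			continue;
		if (seen++ < *dumped)
			continue;	/* assumed to have been dumped earlier */
		out[n++] = i;
	}
	*dumped += n;
	return n;
}

/* Index-based resume: continue from the next ifindex, in order. */
static size_t demo_dump_by_index(const struct demo_netns *ns,
				 int *next_ifindex, size_t budget, int *out)
{
	size_t n = 0;
	int i;

	for (i = *next_ifindex; i < DEMO_MAX_IFINDEX && n < budget; i++) {
		if (ns->present[i])
			out[n++] = i;
	}
	*next_ifindex = i;
	return n;
}
```

With devices at ifindex 1, 2 and 3 (A, B, C), a chunk budget of two, and B deleted between chunks, the position-based variant returns nothing in the second chunk, while the index-based variant still returns C.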

Merge branch 'ynl-couple-of-unrelated-fixes' · 083476a2

Jakub Kicinski authored Jul 28, 2023

Stanislav Fomichev says:

====================
ynl: couple of unrelated fixes

- spelling of xdp-features
- s/xdp_zc_max_segs/xdp-zc-max-segs/
- expose xdp-zc-max-segs
- add /* private: */
- regenerate headers
- print xdp_zc_max_segs from sample
====================

Link: https://lore.kernel.org/r/20230727163001.3952878-1-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

083476a2