Commits · cbbb7abdd00ed583bec374526967b09d8840e264 · Kirill Smelkov / linux

16 Aug, 2021 4 commits

David S. Miller authored Aug 16, 2021

Luo Jie says:

====================
net: mdio: Add IPQ MDIO reset related function

This patch series add the MDIO reset features, which includes
configuring MDIO clock source frequency and indicating CMN_PLL that
ethernet LDO has been ready, this ethernet LDO is dedicated in the
IPQ5018 platform.

Specify more chipset IPQ40xx, IPQ807x, IPQ60xx and IPQ50xx supported by
this MDIO driver.

Changes in v3:
	* simplify the function ipq_mdio_reset.

Changes in v2:
	* Addressed review comments (Andrew Lunn).
	* Remove the IS_ERR().
	* make binding patch part of series.
	* document the property 'reg' and 'clock'.

Changes in v1:
	* make MDIO_IPQ4019 unchanged for backwards compatibility.
	* remove the PHY reset functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cbbb7abd

dt-bindings: net: Add the properties for ipq4019 MDIO · 2a4c32e7

Luo Jie authored Aug 12, 2021

The new added properties resource "reg" is for configuring
ethernet LDO in the IPQ5018 chipset, the property "clocks"
is for configuring the MDIO clock source frequency.
Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a4c32e7

MDIO: Kconfig: Specify more IPQ chipset supported · c76ee263

Luo Jie authored Aug 12, 2021

The IPQ MDIO driver currently supports the chipset IPQ40xx, IPQ807x,
IPQ60xx and IPQ50xx.

Add the compatible 'qcom,ipq5018-mdio' because of ethernet LDO dedicated
to the IPQ5018 platform.
Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

c76ee263

net: mdio: Add the reset function for IPQ MDIO driver · 23a890d4

Luo Jie authored Aug 12, 2021

1. configure the MDIO clock source frequency.
2. the LDO resource is needed to configure the ethernet LDO available
for CMN_PLL.
Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

23a890d4

14 Aug, 2021 34 commits

MAINTAINERS: Remove the ipx network layer info · e4637f62

Cai Huoqing authored Aug 13, 2021

commit <47595e32> ("<MAINTAINERS: Mark some staging directories>")
indicated the ipx network layer as obsolete in Jan 2018,
updated in the MAINTAINERS file.

now, after being exposed for 3 years to refactoring, so to
remove the ipx network layer info from MAINTAINERS.
additionally, there is no module that depends on ipx.h
except a broken staging driver(r8188eu)
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4637f62

net: Remove net/ipx.h and uapi/linux/ipx.h header files · 6c9b4084

Cai Huoqing authored Aug 13, 2021

commit <47595e32> ("<MAINTAINERS: Mark some staging directories>")
indicated the ipx network layer as obsolete in Jan 2018,
updated in the MAINTAINERS file

now, after being exposed for 3 years to refactoring, so to
delete uapi/linux/ipx.h and net/ipx.h header files for good.
additionally, there is no module that depends on ipx.h except
a broken staging driver(r8188eu)
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c9b4084

Merge branch 'iupa-last-things-before-pm-conversion' · fda4e19d

David S. Miller authored Aug 14, 2021

Alex Elder says:

====================
net: ipa: last things before PM conversion

This series contains a few remaining changes needed before fully
switching over to using runtime power management rather than the
previous "IPA clock" mechanism.

The first patch moves the calls to enable and disable the IPA
interrupt as a system wakeup interrupt into "ipa_clock.c" with the
rest of the power-related code.

The second adds a flag to make it possible to distinguish runtime
suspend from system suspend.

The third and fourth patches arrange for the ->start_xmit path to
resume hardware if necessary, to ensure it is powered.  If power is
not active, the TX queue is stopped, and arrangements are made for
the queue to be restarted once hardware power is active again.

The fifth patch keeps the TX queue active during suspend.  This
isn't necessary for system suspend but it's important for runtime
suspend.

And the last patch makes it so we don't hold the hardware active
while the modem network device is open.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fda4e19d

net: ipa: don't hold clock reference while netdev open · 8dc181f2

Alex Elder authored Aug 12, 2021

Currently a clock reference is taken whenever the ->ndo_open
callback for the modem netdev is called.  That reference is dropped
when the device is closed, in ipa_stop().

We no longer need this, because ipa_start_xmit() now handles the
situation where the hardware power state is not active.

Drop the clock reference in ipa_open() when we're done, and take a
new reference in ipa_stop() before we begin closing the interface.

Finally (and unrelated, but trivial), change the return type of
ipa_start_xmit() to be netdev_tx_t instead of int.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8dc181f2

net: ipa: don't stop TX on suspend · 8dcf8bb3

Alex Elder authored Aug 12, 2021

Currently we stop the modem netdev transmit queue when suspending
the hardware.  For system suspend this ensured we'd never attempt
to transmit while attempting to suspend the modem endpoints.

For runtime suspend, the IPA hardware might get suspended while the
system is operating.  In that case we want an attempt to transmit a
packet to cause the hardware to resume if necessary.  But if we
disable the queue this cannot happen.

So stop disabling the queue on suspend.  In case we end up disabling
it in ipa_start_xmit() (see the previous commit), we still arrange
to start the TX queue on resume.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8dcf8bb3

net: ipa: ensure hardware has power in ipa_start_xmit() · 6b51f802

Alex Elder authored Aug 12, 2021

We need to ensure the hardware is powered when we transmit a packet.
But if it's not, we can't block to wait for it.  So asynchronously
request power in ipa_start_xmit(), and only proceed if the return
value indicates the power state is active.

If the hardware is not active, a runtime resume request will have
been initiated.  In that case, stop the network stack from further
transmit attempts until the resume completes.  Return NETDEV_TX_BUSY,
to retry sending the packet once the queue is restarted.

If the power request returns an error (other than -EINPROGRESS,
which just means a resume requested elsewhere isn't complete), just
drop the packet.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b51f802

net: ipa: re-enable transmit in PM WQ context · a96e73fa

Alex Elder authored Aug 12, 2021

Create a new work structure in the modem private data, and use it to
re-enable the modem network device transmit queue when resuming.

This is needed by the next patch, which stops the TX queue if IPA
power isn't active when a transmit request arrives. Packets will
start arriving the instant the TX queue is enabled, but resuming
isn't complete until ipa_modem_resume() returns. This way we're
sure to be resumed before transmits are allowed again.

Cancel it before calling ipa_stop() in ipa_modem_stop() to ensure
the transmit queue restart completes before it gets stopped there.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a96e73fa

net: ipa: distinguish system from runtime suspend · b9c532c1

Alex Elder authored Aug 12, 2021

Add a new flag that is set when the hardware is suspended due to a
system suspend operation, distingishing it from runtime suspend.
Use it in the SUSPEND IPA interrupt handler to determine whether to
trigger a system resume because of the event.  Define new suspend
and resume power management callback functions to set and clear the
new flag, respectively.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9c532c1

net: ipa: enable wakeup in ipa_power_setup() · d430fe4b

Alex Elder authored Aug 12, 2021

Move the call to enable the IPA interrupt as a wakeup interrupt into
ipa_power_setup(), disable it in ipa_power_teardown().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d430fe4b

Merge branch 'bridgge-mcast' · 8db102a6

David S. Miller authored Aug 14, 2021

Nikolay Aleksandrov says:

====================
net: bridge: mcast: dump querier state

This set adds the ability to dump the current multicast querier state.
This is extremely useful when debugging multicast issues, we've had
many cases of unexpected queriers causing strange behaviour and mcast
test failures. The first patch changes the querier struct to record
a port device's ifindex instead of a pointer to the port itself so we
can later retrieve it, I chose this way because it's much simpler
and doesn't require us to do querier port ref counting, it is best
effort anyway. Then patch 02 makes the querier address/port updates
consistent via a combination of multicast_lock and seqcount, so readers
can only use seqcount to get a consistent snapshot of address and port.
Patch 03 is a minor cleanup in preparation for the dump support, it
consolidates IPv4 and IPv6 querier selection paths as they share most of
the logic (except address comparisons of course). Finally the last three
patches add the new querier state dumping support, for the bridge's
global multicast context we embed the BRIDGE_QUERIER_xxx attributes
into IFLA_BR_MCAST_QUERIER_STATE and for the per-vlan global mcast
contexts we embed them into BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE.

The structure is:
  [IFLA_BR_MCAST_QUERIER_STATE / BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE]
  `[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier
  `[BRIDGE_QUERIER_IP_PORT]    - bridge port ifindex where the querier was
                                 seen (set only if external querier)
  `[BRIDGE_QUERIER_IP_OTHER_TIMER]   -  other querier timeout
  `[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier
  `[BRIDGE_QUERIER_IPV6_PORT]    - bridge port ifindex where the querier
                                   was seen (set only if external querier)
  `[BRIDGE_QUERIER_IPV6_OTHER_TIMER]   -  other querier timeout

Later we can also add IGMP version of seen queriers and last seen values
from the queries.
====================

8db102a6

net: bridge: vlan: dump mcast ctx querier state · ddc649d1

Nikolay Aleksandrov authored Aug 13, 2021

Use the new mcast querier state dump infrastructure and export vlans'
mcast context querier state embedded in attribute
BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddc649d1

net: bridge: mcast: dump ipv6 querier state · 85b41082

Nikolay Aleksandrov authored Aug 13, 2021

Add support for dumping global IPv6 querier state, we dump the state
only if our own querier is enabled or there has been another external
querier which has won the election. For the bridge global state we use
a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside.
The structure is:
  [IFLA_BR_MCAST_QUERIER_STATE]
   `[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier
   `[BRIDGE_QUERIER_IPV6_PORT]    - bridge port ifindex where the querier
                                    was seen (set only if external querier)
   `[BRIDGE_QUERIER_IPV6_OTHER_TIMER]   -  other querier timeout

IPv4 and IPv6 attributes are embedded at the same level of
IFLA_BR_MCAST_QUERIER_STATE. If we didn't dump anything we cancel the nest
and return.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85b41082

net: bridge: mcast: dump ipv4 querier state · c7fa1d9b

Nikolay Aleksandrov authored Aug 13, 2021

Add support for dumping global IPv4 querier state, we dump the state
only if our own querier is enabled or there has been another external
querier which has won the election. For the bridge global state we use
a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside.
The structure is:
 [IFLA_BR_MCAST_QUERIER_STATE]
  `[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier
  `[BRIDGE_QUERIER_IP_PORT]    - bridge port ifindex where the querier was
                                 seen (set only if external querier)
  `[BRIDGE_QUERIER_IP_OTHER_TIMER]   -  other querier timeout
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7fa1d9b

net: bridge: mcast: consolidate querier selection for ipv4 and ipv6 · c3fb3698

Nikolay Aleksandrov authored Aug 13, 2021

We can consolidate both functions as they share almost the same logic.
This is easier to maintain and we have a single querier update function.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3fb3698

net: bridge: mcast: make sure querier port/address updates are consistent · 67b746f9

Nikolay Aleksandrov authored Aug 13, 2021

Use a sequence counter to make sure port/address updates can be read
consistently without requiring the bridge multicast_lock. We need to
zero out the port and address when the other querier has expired and
we're about to select ourselves as querier. br_multicast_read_querier
will be used later when dumping querier state. Updates are done only
with the multicast spinlock and softirqs disabled, while reads are done
from process context and from softirqs (due to notifications).
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

67b746f9

net: bridge: mcast: record querier port device ifindex instead of pointer · bb18ef8e

Nikolay Aleksandrov authored Aug 13, 2021

Currently when a querier port is detected its net_bridge_port pointer is
recorded, but it's used only for comparisons so it's fine to have stale
pointer, in order to dereference and use the port pointer a proper
accounting of its usage must be implemented adding unnecessary
complexity. To solve the problem we can just store the netdevice ifindex
instead of the port pointer and retrieve the bridge port. It is a best
effort and the device needs to be validated that is still part of that
bridge before use, but that is small price to pay for avoiding querier
reference counting for each port/vlan.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb18ef8e

Merge branch 'devlink-cleanup-for-delay-event' · 2fa16787

David S. Miller authored Aug 14, 2021

Leon Romanovsky says:

====================
Devlink cleanup for delay event series

Jakub's request to make sure that devlink events are delayed and not
printed till they fully accessible [1] requires us to implement delayed
event notification system in the devlink.

In order to do it, I moved some of my patches (xarray e.t.c) from the future
series to be before "Move devlink_register to be near devlink_reload_enable" [2].

That allows us to rely on DEVLINK_REGISTERED xarray mark to decide if to print
event or not.

Other patches are simple cleanup which is needed anyway.

[1] https://lore.kernel.org/lkml/20210811071817.4af5ab34@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
[2] https://lore.kernel.org/lkml/cover.1628599239.git.leonro@nvidia.com

Next in the queue:
 * Delay event series
 * Move devlink_register to be near devlink_reload_enable"
 * Extension of devlink_ops to be set dynamically
 * devlink_reload_* delete
 * Devlink locks rework to user xarray and reference counting
 * ????
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2fa16787

net: hns3: remove always exist devlink pointer check · a1fcb106

Leon Romanovsky authored Aug 14, 2021

The devlink pointer always exists after hclge_devlink_init() succeed.
Remove that check together with NULL setting after release and ensure
that devlink_register is last command prior to call to devlink_reload_enable().

Fixes: b741269b ("net: hns3: add support for registering devlink for PF")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1fcb106

devlink: Clear whole devlink_flash_notify struct · ed43fbac

Leon Romanovsky authored Aug 14, 2021

The { 0 } doesn't clear all fields in the struct, but tells to the
compiler to set all fields to zero and doesn't touch any sub-fields
if they exists.

The {} is an empty initialiser that instructs to fully initialize whole
struct including sub-fields, which is error-prone for future
devlink_flash_notify extensions.

Fixes: 6700acc5 ("devlink: collect flash notify params into a struct")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed43fbac

devlink: Use xarray to store devlink instances · 11a861d7

Leon Romanovsky authored Aug 14, 2021

We can use xarray instead of linearly organized linked lists for the
devlink instances. This will let us revise the locking scheme in favour
of internal xarray locking that protects database.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11a861d7

devlink: Count struct devlink consumers · 437ebfd9

Leon Romanovsky authored Aug 14, 2021

The struct devlink itself is protected by internal lock and doesn't
need global lock during operation. That global lock is used to protect
addition/removal new devlink instances from the global list in use by
all devlink consumers in the system.

The future conversion of linked list to be xarray will allow us to
actually delete that lock, but first we need to count all struct devlink
users.

The reference counting provides us a way to ensure that no new user
space commands success to grab devlink instance which is going to be
destroyed makes it is safe to access it without lock.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

437ebfd9

devlink: Remove check of always valid devlink pointer · 7ca973dc

Leon Romanovsky authored Aug 14, 2021

Devlink objects are accessible only after they were registered and
have valid devlink_*->devlink pointers.

Remove that check and simplify respective fill functions as an outcome
of such change.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7ca973dc

devlink: Simplify devlink_pernet_pre_exit call · cbf6ab67

Leon Romanovsky authored Aug 14, 2021

The devlink_pernet_pre_exit() will be called if net namespace exits.

That routine is relevant for devlink instances that were assigned to
that namespaces first. This assignment is possible only with the following
command: "devlink reload DEV netns ...", which already checks reload support.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cbf6ab67

Merge branch 'mptcp-improve-backup-subflows' · 38e3bfa8

David S. Miller authored Aug 14, 2021

Mat Martineau says:

====================
mptcp: Improve use of backup subflows

Multipath TCP combines multiple TCP subflows in to one stream, and the
MPTCP-level socket must decide which subflow to use when sending (or
resending) chunks of data. The choice of the "best" subflow to transmit
on can vary depending on the priority (normal or backup) for each
subflow and how well the subflow is performing.

In order to improve MPTCP performance when some subflows are failing,
this patch set changes how backup subflows are utilized and introduces
tracking of "stale" subflows that are still connected but not making
progress.

Patch 1 adjusts MPTCP-level retransmit timeouts to use data from all
subflows.

Patch 2 makes MPTCP-level retransmissions less aggressive to avoid
resending data that's still queued at the TCP level.

Patch 3 changes the way pending data is handled when subflows are
closed. Unacked MPTCP-level data still in the subflow tx queue is
immediately moved to another subflow for transmission instead of waiting
for MPTCP-level timeouts to trigger retransmission.

Patch 4 has some sysctl code cleanup.

Patches 5 and 6 add tracking of "stale" subflows, so only underlying TCP
subflow connections that appear to be making progress are considered
when selecting a subflow to (re)transmit data. How fast a subflow goes
stale is configurable with a per-namespace sysctl. Related MIBS are
added too.

Patch 7 makes sure the backup flag is always correctly recorded when the
MP_JOIN SYN/ACK is received for an added subflow.

Patch 8 adds more test cases for backup subflows and stale subflows.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

38e3bfa8

selftests: mptcp: add testcase for active-back · 7d1e6f16

Paolo Abeni authored Aug 13, 2021

Add more test-case for link failures scenario,
including recovery from link failure using only
backup subflows and bi-directional transfer.

Additionally explicitly check for stale count
Co-developed-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d1e6f16

mptcp: backup flag from incoming MPJ ack option · 0460ce22

Paolo Abeni authored Aug 13, 2021

the parsed incoming backup flag is not propagated
to the subflow itself, the client may end-up using it
to send data.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/191Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0460ce22

mptcp: add mibs for stale subflows processing · fc1b4e3b

Paolo Abeni authored Aug 13, 2021

This allows monitoring exceptional events like
active backup scenarios.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc1b4e3b

mptcp: faster active backup recovery · ff5a0b42

Paolo Abeni authored Aug 13, 2021

The msk can use backup subflows to transmit in-sequence data
only if there are no other active subflow. On active backup
scenario, the MPTCP connection can do forward progress only
due to MPTCP retransmissions - rtx can pick backup subflows.

This patch introduces a new flag flow MPTCP subflows: if the
underlying TCP connection made no progresses for long time,
and there are other less problematic subflows available, the
given subflow become stale.

Stale subflows are not considered active: if all non backup
subflows become stale, the MPTCP scheduler can pick backup
subflows for plain transmissions.

Stale subflows can return in active state, as soon as any reply
from the peer is observed.

Active backup scenarios can now leverage the available b/w
with no restrinction.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff5a0b42

mptcp: cleanup sysctl data and helpers · 6da14d74

Paolo Abeni authored Aug 13, 2021

Reorder the data in mptcp_pernet to avoid wasting space
with no reasons and constify the access helpers.

No functional changes intended.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6da14d74

mptcp: handle pending data on closed subflow · 1e1d9d6f

Paolo Abeni authored Aug 13, 2021

The PM can close active subflow, e.g. due to ingress RM_ADDR
option. Such subflow could carry data still unacked at the
MPTCP-level, both in the write and the rtx_queue, which has
never reached the other peer.

Currently the mptcp-level retransmission will deliver such data,
but at a very low rate (at most 1 DSM for each MPTCP rtx interval).

We can speed-up the recovery a lot, moving all the unacked in the
tcp write_queue, so that it will be pushed again via other
subflows, at the speed allowed by them.

Also make available the new helper for later patches.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1e1d9d6f

mptcp: less aggressive retransmission strategy · 71b7dec2

Paolo Abeni authored Aug 13, 2021

The current mptcp re-inject strategy is very aggressive,
we have mptcp-level retransmissions even on single subflow
connection, if the link in-use is lossy.

Let's be a little more conservative: we do retransmit
only if at least a subflow has write and rtx queue empty.

Additionally use the backup subflows only if the active
subflows are stale - no progresses in at least an rtx period
and ignore stale subflows for rtx timeout update

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

71b7dec2

mptcp: more accurate timeout · 33d41c9c

Paolo Abeni authored Aug 13, 2021

As reported by Maxim, we have a lot of MPTCP-level
retransmissions when multilple links with different latencies
are in use.

This patch refactor the mptcp-level timeout accounting so that
the maximum of all the active subflow timeout is used. To avoid
traversing the subflow list multiple times, the update is
performed inside the packet scheduler.

Additionally clean-up a bit timeout handling.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

33d41c9c

ethernet: fix PTP_1588_CLOCK dependencies · e5f31552

Arnd Bergmann authored Aug 12, 2021

The 'imply' keyword does not do what most people think it does, it only
politely asks Kconfig to turn on another symbol, but does not prevent
it from being disabled manually or built as a loadable module when the
user is built-in. In the ICE driver, the latter now causes a link failure:

aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_eth_ioctl':
ice_main.c:(.text+0x13b0): undefined reference to `ice_ptp_get_ts_config'
ice_main.c:(.text+0x13b0): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_get_ts_config'
aarch64-linux-ld: ice_main.c:(.text+0x13bc): undefined reference to `ice_ptp_set_ts_config'
ice_main.c:(.text+0x13bc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_set_ts_config'
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_prepare_for_reset':
ice_main.c:(.text+0x31fc): undefined reference to `ice_ptp_release'
ice_main.c:(.text+0x31fc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_release'
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_rebuild':

This is a recurring problem in many drivers, and we have discussed
it several times befores, without reaching a consensus. I'm providing
a link to the previous email thread for reference, which discusses
some related problems.

To solve the dependency issue better than the 'imply' keyword, introduce a
separate Kconfig symbol "CONFIG_PTP_1588_CLOCK_OPTIONAL" that any driver
can depend on if it is able to use PTP support when available, but works
fine without it. Whenever CONFIG_PTP_1588_CLOCK=m, those drivers are
then prevented from being built-in, the same way as with a 'depends on
PTP_1588_CLOCK || !PTP_1588_CLOCK' dependency that does the same trick,
but that can be rather confusing when you first see it.

Since this should cover the dependencies correctly, the IS_REACHABLE()
hack in the header is no longer needed now, and can be turned back
into a normal IS_ENABLED() check. Any driver that gets the dependency
wrong will now cause a link time failure rather than being unable to use
PTP support when that is in a loadable module.

However, the two recently added ptp_get_vclocks_index() and
ptp_convert_timestamp() interfaces are only called from builtin code with
ethtool and socket timestamps, so keep the current behavior by stubbing
those out completely when PTP is in a loadable module. This should be
addressed properly in a follow-up.

As Richard suggested, we may want to actually turn PTP support into a
'bool' option later on, preventing it from being a loadable module
altogether, which would be one way to solve the problem with the ethtool
interface.

Fixes: 06c16d89 ("ice: register 1588 PTP clock device object for E810 devices")
Link: https://lore.kernel.org/netdev/20210804121318.337276-1-arnd@kernel.org/
Link: https://lore.kernel.org/netdev/CAK8P3a06enZOf=XyZ+zcAwBczv41UuCTz+=0FMf2gBz1_cOnZQ@mail.gmail.com/
Link: https://lore.kernel.org/netdev/CAK8P3a3=eOxE-K25754+fB_-i_0BZzf9a9RfPTX3ppSwu9WZXw@mail.gmail.com/
Link: https://lore.kernel.org/netdev/20210726084540.3282344-1-arnd@kernel.org/Acked-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210812183509.1362782-1-arnd@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>

e5f31552

net: phy: marvell: add SFP support for 88E1510 · b697d9d3

Ivan Bornyakov authored Aug 12, 2021

Add support for SFP cages connected to the Marvell 88E1512 transceiver.
88E1512 supports for SGMII/1000Base-X/100Base-FX media type with RGMII
on system interface. Configure PHY to appropriate mode depending on the
type of SFP inserted. On SFP removal configure PHY to the RGMII-copper
mode so RJ-45 port can still work.
Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Link: https://lore.kernel.org/r/20210812134256.2436-1-i.bornyakov@metrotek.ruSigned-off-by: Jakub Kicinski <kuba@kernel.org>

b697d9d3

13 Aug, 2021 2 commits

Merge branch 'kconfig-symbol-clean-up-on-net' · a44fc4b6

Jakub Kicinski authored Aug 13, 2021

Lukas Bulwahn says:

====================
Kconfig symbol clean-up on net

The script ./scripts/checkkconfigsymbols.py warns on invalid references to
Kconfig symbols (often, minor typos, name confusions or outdated references).

This patch series addresses all issues reported by
./scripts/checkkconfigsymbols.py in ./net/ and ./drivers/net/ for Kconfig
and Makefile files. Issues in the Kconfig and Makefile files indicate some
shortcomings in the overall build definitions, and often are true actionable
issues to address.

These issues can be identified and filtered by:

  ./scripts/checkkconfigsymbols.py \
  | grep -E "(drivers/)?net/.*(Kconfig|Makefile)" -B 1 -A 1

After applying this patch series on linux-next (next-20210811), the command
above yields no further issues to address.
====================

Link: https://lore.kernel.org/r/20210812083806.28434-1-lukas.bulwahn@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

a44fc4b6

net: dpaa_eth: remove dead select in menuconfig FSL_DPAA_ETH · f75d8155

Lukas Bulwahn authored Aug 12, 2021

The menuconfig FSL_DPAA_ETH selects config FSL_FMAN_MAC, but the config
FSL_FMAN_MAC never existed in the kernel tree.

Hence, ./scripts/checkkconfigsymbols.py warns:

FSL_FMAN_MAC
Referencing files: drivers/net/ethernet/freescale/dpaa/Kconfig

Remove this dead select in menuconfig FSL_DPAA_ETH.

Fixes: 9ad1a374 ("dpaa_eth: add support for DPAA Ethernet")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f75d8155