- 07 May, 2024 20 commits
-
-
Daniel Jurgens authored
Stop storing RSS setting in the control buffer. This is prep work for removing RTNL lock protection of the control buffer. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Tested-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Arınç ÜNAL authored
Currently, the MT7530 DSA subdriver configures the MT7530 switch to provide direct access to switch PHYs, meaning the switch PHYs listen on the MDIO bus the switch listens on. The PHY muxing feature makes use of this. This is problematic as the PHY may be attached before the switch is initialised, in which case the PHY will fail to be attached. Since commit 91374ba5 ("net: dsa: mt7530: support OF-based registration of switch MDIO bus"), we can describe the switch PHYs on the MDIO bus of the switch in the device tree. Extend the check to detect PHY muxing when the PHY is defined on the MDIO bus of the switch in the device tree. When the PHY is described this way, the switch will be initialised first, then the switch MDIO bus will be registered. Only after these steps will the PHY be attached. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/20240430-b4-for-netnext-mt7530-use-switch-mdio-bus-for-phy-muxing-v2-1-9104d886d0db@arinc9.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Eric Dumazet says: ==================== rtnetlink: more rcu conversions for rtnl_fill_ifinfo() We want to no longer rely on RTNL for the "ip link show" command. This is a long road; this series takes care of some parts. ==================== Link: https://lore.kernel.org/r/20240503192059.3884225-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
We want to be able to run rtnl_fill_ifinfo() under RCU protection instead of RTNL in the future. All rtnl_link_ops->get_link_net() methods already using dev_net() are ready; I added READ_ONCE() annotations to the others. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
dev->xdp_prog is protected by RCU, so we can lift the RTNL requirement from rtnl_xdp_prog_skb(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
Change dev_change_proto_down() and dev_change_proto_down_reason() to write once on dev->proto_down and dev->proto_down_reason. Then rtnl_fill_proto_down() can use READ_ONCE() annotations and run locklessly. rtnl_proto_down_size() should assume the worst case, because reading dev->proto_down_reason multiple times would be racy without RTNL in the future. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
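A minimal sketch of the annotation pattern (the field names come from the commit; the helper functions here are illustrative, not the kernel's actual code, which emits netlink attributes):

    /* Writer side, still under RTNL: publish each field with WRITE_ONCE()
     * so lockless readers never observe a torn value.
     */
    static void proto_down_write(struct net_device *dev, bool down, u32 reason)
    {
            WRITE_ONCE(dev->proto_down, down);
            WRITE_ONCE(dev->proto_down_reason, reason);
    }

    /* Reader side, no RTNL: read each field exactly once. Reading
     * dev->proto_down_reason twice could return two different values,
     * which is why rtnl_proto_down_size() must assume the worst case.
     */
    static bool proto_down_read(const struct net_device *dev, u32 *reason)
    {
            *reason = READ_ONCE(dev->proto_down_reason);
            return READ_ONCE(dev->proto_down);
    }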
-
Eric Dumazet authored
The following device fields can be read locklessly in rtnl_fill_ifinfo(): type, ifindex, operstate, link_mode, mtu, min_mtu, max_mtu, group, promiscuity, allmulti, num_tx_queues, gso_max_segs, gso_max_size, gro_max_size, gso_ipv4_max_size, gro_ipv4_max_size, tso_max_size, tso_max_segs, num_rx_queues. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
In the following patch we want to read dev->allmulti and dev->promiscuity locklessly from rtnl_fill_ifinfo(). In this patch I change __dev_set_promiscuity() and __dev_set_allmulti() to write these fields (and dev->flags) only if they succeed, with WRITE_ONCE() annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
rtnl_fill_ifinfo() can read dev->tx_queue_len locklessly, provided we add the corresponding READ_ONCE()/WRITE_ONCE() annotations. Add the missing READ_ONCE(dev->tx_queue_len) in teql_enqueue(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
We can use netdev_copy_name() to no longer rely on RTNL to fetch dev->name. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
dev->qdisc can be read using RCU protection. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Asbjørn Sloth Tønnesen says: ==================== net: qede: don't restrict error codes This series fixes the qede driver so that when a helper function fails, the caller returns the helper's error code, instead of just assuming that the error is e.g. -EINVAL. The patches in this series reduce the chance of future bugs, so new error codes can be returned from the helpers without having to update the call sites. This is a follow-up to my recent series "net: qede: avoid overruling error codes", which fixed the cases where the implicit assumption of failing with specific error codes had been broken. https://lore.kernel.org/netdev/20240426091227.78060-1-ast@fiberby.net/ Asbjørn Sloth Tønnesen (3): net: qede: use return from qede_parse_actions() for flow_spec net: qede: use return from qede_flow_spec_validate_unused() net: qede: use return from qede_flow_parse_ports() .../net/ethernet/qlogic/qede/qede_filter.c | 27 ++++++++++++------- 1 file changed, 18 insertions(+), 9 deletions(-) ==================== Link: https://lore.kernel.org/r/20240503105505.839342-1-ast@fiberby.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Asbjørn Sloth Tønnesen authored
When calling qede_flow_parse_ports(), the return code was only used for a non-zero check, after which -EINVAL was returned. qede_flow_parse_ports() can currently only fail with -EINVAL. This patch changes qede_flow_parse_v{4,6}_common() to use the actual return code from qede_flow_parse_ports(), so it's no longer assumed that all errors are -EINVAL. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Asbjørn Sloth Tønnesen authored
When calling qede_flow_spec_validate_unused(), the return code was only used for a non-zero check, after which -EOPNOTSUPP was returned. qede_flow_spec_validate_unused() can currently only fail with -EOPNOTSUPP. This patch changes qede_flow_spec_to_rule() to use the actual return code from qede_flow_spec_validate_unused(), so it's no longer assumed that all errors are -EOPNOTSUPP. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Asbjørn Sloth Tønnesen authored
In qede_flow_spec_to_rule(), when calling qede_parse_actions(), the return code was only used for a non-zero check, after which -EINVAL was returned. qede_parse_actions() can currently fail with -EINVAL or -EOPNOTSUPP. Commit 319a1d19 ("flow_offload: check for basic action hw stats type") broke the implicit assumption that it could only fail with -EINVAL, by changing it to return -EOPNOTSUPP when hardware stats are requested. However, AFAICT it's not possible to trigger qede_parse_actions() to return -EOPNOTSUPP when called from qede_flow_spec_to_rule(), as hardware stats can't be requested by ethtool_rx_flow_rule_create(). This patch changes qede_flow_spec_to_rule() to use the actual return code from qede_parse_actions(), so it's no longer assumed that all errors are -EINVAL. Only compile tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
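The shape of the fix, shared by all three patches in this series, is roughly the following (the argument list of qede_parse_actions() is abbreviated here):

    /* Before: the helper's specific error code is discarded. */
    if (qede_parse_actions(edev, flow_action, extack))
            return -EINVAL;

    /* After: whatever the helper reports (-EINVAL, -EOPNOTSUPP, ...) is
     * propagated unchanged, so a new error code added to the helper
     * needs no call-site updates.
     */
    rc = qede_parse_actions(edev, flow_action, extack);
    if (rc)
            return rc;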
-
Jakub Kicinski authored
Merge tag 'ipsec-next-2024-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2024-05-03 1) Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support. This was defined by an early version of an IETF draft that did not make it to a standard. 2) Introduce a direction attribute for xfrm states. xfrm states have a direction: a state is used either for input or for output packet processing. Add a direction to xfrm states to make it clear what an xfrm state is used for. * tag 'ipsec-next-2024-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: Restrict SA direction attribute to specific netlink message types xfrm: Add dir validation to "in" data path lookup xfrm: Add dir validation to "out" data path lookup xfrm: Add Direction to the SA in or out udpencap: Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support ==================== Link: https://lore.kernel.org/r/20240503082732.2835810-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Shi-Sheng Yang authored
This patch fixes the spelling mistakes in comments. The changes were generated using codespell and reviewed manually. eariler -> earlier greceful -> graceful Signed-off-by: Shi-Sheng Yang <fourcolor4c@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240502154740.249839-1-fourcolor4c@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
According to GCC, the construction of irq_name in otx2_open() may, theoretically, be truncated. This patch takes the approach of treating such a situation as an error, which it detects by making use of the return value of snprintf, which is the total number of bytes, excluding the trailing '\0', that would have been written. Based on the approach taken to a similar problem in commit 54b90943 ("rtc: fix snprintf() checking in is_rtc_hctosys()") Flagged by gcc-13 W=1 builds as: .../otx2_pf.c:1933:58: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=] 1933 | snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d", pf->netdev->name, | ^ .../otx2_pf.c:1933:17: note: 'snprintf' output between 8 and 33 bytes into a destination of size 32 1933 | snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d", pf->netdev->name, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1934 | qidx); | ~~~~~ Compile tested only. Tested-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240503-octeon2-pf-irq_name-truncation-v2-1-91099177b942@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
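The underlying pattern is the standard snprintf truncation check: snprintf returns the length it would have needed, so a return value of NAME_SIZE or more means the name was cut short. A self-contained sketch (the function name and error convention are illustrative, not the driver's actual code):

    #include <stdio.h>

    #define NAME_SIZE 32

    /* Returns 0 on success, -1 if irq_name would have been truncated. */
    static int build_irq_name(char irq_name[NAME_SIZE],
                              const char *netdev_name, int qidx)
    {
            int n = snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d",
                             netdev_name, qidx);

            /* n is the byte count snprintf wanted to write, excluding
             * the trailing '\0'; n >= NAME_SIZE means truncation.
             */
            if (n < 0 || n >= NAME_SIZE)
                    return -1;
            return 0;
    }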
-
Dr. David Alan Gilbert authored
This list looks like it's been unused since the OF conversion in 2008 in commit 826b6cfc ("fore200e: Convert over to pure OF driver.") This also means we can remove the 'entry' member for the list. Build tested only. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240503001822.183061-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Shailend Chand authored
The new netdev queue api is implemented for gve. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Link: https://lore.kernel.org/all/20240501232549.1327174-11-shailend@google.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 06 May, 2024 8 commits
-
-
Paolo Abeni authored
Felix Fietkau says: ==================== Add TCP fraglist GRO support When forwarding TCP after GRO, software segmentation is very expensive, especially when the checksum needs to be recalculated. One case where that's currently unavoidable is when routing packets over PPPoE. Performance improves significantly when using fraglist GRO implemented in the same way as for UDP. When NETIF_F_GRO_FRAGLIST is enabled, perform a lookup for an established socket in the same netns as the receiving device. While this may not cover all relevant use cases in multi-netns configurations, it should be good enough for most configurations that need this. Here's a measurement of running 2 TCP streams through a MediaTek MT7622 device (2-core Cortex-A53), which runs NAT with flow offload enabled from one ethernet port to PPPoE on another ethernet port + cake qdisc set to 1Gbps. rx-gro-list off: 630 Mbit/s, CPU 35% idle rx-gro-list on: 770 Mbit/s, CPU 40% idle Changes since v4: - add likely() to prefer the non-fraglist path in check Changes since v3: - optimize __tcpv4_gso_segment_csum - add unlikely() - reorder dev_net/skb_gro_network_header calls after NETIF_F_GRO_FRAGLIST check - add support for ipv6 nat - drop redundant pskb_may_pull check Changes since v2: - create tcp_gro_header_pull helper function to pull tcp header only once - optimize __tcpv4_gso_segment_list_csum, drop obsolete flags check Changes since v1: - revert bogus tcp flags overwrite on segmentation - fix kbuild issue with !CONFIG_IPV6 - only perform socket lookup for the first skb in the GRO train Changes since RFC: - split up patches - handle TCP flags mutations ==================== Link: https://lore.kernel.org/r/20240502084450.44009-1-nbd@nbd.name Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Felix Fietkau authored
When forwarding TCP after GRO, software segmentation is very expensive, especially when the checksum needs to be recalculated. One case where that's currently unavoidable is when routing packets over PPPoE. Performance improves significantly when using fraglist GRO implemented in the same way as for UDP. When NETIF_F_GRO_FRAGLIST is enabled, perform a lookup for an established socket in the same netns as the receiving device. While this may not cover all relevant use cases in multi-netns configurations, it should be good enough for most configurations that need this. Here's a measurement of running 2 TCP streams through a MediaTek MT7622 device (2-core Cortex-A53), which runs NAT with flow offload enabled from one ethernet port to PPPoE on another ethernet port + cake qdisc set to 1Gbps. rx-gro-list off: 630 Mbit/s, CPU 35% idle rx-gro-list on: 770 Mbit/s, CPU 40% idle Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
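In outline, the heuristic is a socket lookup in the GRO receive path; a simplified sketch of the IPv4 side (details such as only performing the lookup for the first skb of a GRO train are omitted):

    /* If the device enabled fraglist GRO and the 4-tuple matches no
     * established local socket, the flow is being forwarded: mark it
     * so segmentation later uses the cheap fraglist path.
     */
    struct net *net = dev_net(skb->dev);
    struct sock *sk;

    sk = __inet_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
                                   iph->saddr, th->source,
                                   iph->daddr, ntohs(th->dest),
                                   inet_iif(skb), inet_sdif(skb));
    NAPI_GRO_CB(skb)->is_flist = !sk;       /* no socket: forwarded flow */
    if (sk)
            sock_put(sk);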
-
Felix Fietkau authored
Pull the code out of tcp_gro_receive in order to access the tcp header from tcp4/6_gro_receive. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Felix Fietkau authored
This pulls the flow port matching out of tcp_gro_receive, so that it can be reused for the next change, which adds the TCP fraglist GRO heuristic. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Felix Fietkau authored
This implements fraglist GRO similar to how it's handled in UDP, however no functional changes are added yet. The next change adds a heuristic for using fraglist GRO instead of regular GRO. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Felix Fietkau authored
Preparation for adding TCP fraglist GRO support. It expects packets to be combined in a similar way as UDP fraglist GSO packets. For IPv4 packets, NAT is handled in the same way as UDP fraglist GSO. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Felix Fietkau authored
This helper function will be used for TCP fraglist GRO support. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Rengarajan S authored
The PTP_CMD_CTL is a self-clearing register which controls the PTP clock values. In the current implementation the driver waits for a duration of 20 sec in case of HW failure to clear the PTP_CMD_CTL register bit. This 20 sec timeout is far too long for recognizing a HW failure, as the bit is typically cleared in one clock (<16 ns). Hence reducing the timeout to 1 sec is sufficient to conclude whether a HW failure has occurred. The usleep_range will sleep somewhere between 1 msec and 20 msec for each iteration. By setting PTP_CMD_CTL_TIMEOUT_CNT to 50, the max timeout comes to 1 sec. Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240502050300.38689-1-rengarajan.s@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
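A sketch of the resulting wait loop (the register accessor and adapter type are placeholders; only the timing scheme is from the patch):

    #define PTP_CMD_CTL_TIMEOUT_CNT 50      /* 50 * up to ~20 msec = ~1 sec */

    static int ptp_wait_cmd_clear(struct lan743x_adapter *adapter, u32 bit)
    {
            int i;

            for (i = 0; i < PTP_CMD_CTL_TIMEOUT_CNT; i++) {
                    /* On working HW the bit self-clears within one clock;
                     * ptp_reg_read() is a placeholder accessor.
                     */
                    if (!(ptp_reg_read(adapter, PTP_CMD_CTL) & bit))
                            return 0;
                    usleep_range(1000, 20000);      /* 1 msec .. 20 msec */
            }
            return -ETIMEDOUT;      /* HW failure: bit never cleared */
    }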
-
- 05 May, 2024 10 commits
-
-
David S. Miller authored
Shailend Chand says: ==================== gve: Implement queue api Following the discussion on https://patchwork.kernel.org/project/linux-media/patch/20240305020153.2787423-2-almasrymina@google.com/, the queue api defined by Mina is implemented for gve. The first patch is just Mina's introduction of the api. The rest of the patches make surgical changes in gve to enable it to work correctly with only a subset of queues present (thus far it had assumed that either all queues are up or all are down). The final patch has the api implementation. Changes since v1: clang warning fixes, kdoc warning fix, and addressed review comments. ==================== Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
Every tx and rx ring has its own queue-page-list (QPL) that serves as the bounce buffer. Previously we were allocating QPLs for all queues before the queues themselves were allocated and later associating a QPL with a queue. This is avoidable complexity: it is much more natural for each queue to allocate and free its own QPL. Moreover, the advent of new queue-manipulating ndo hooks makes it hard to keep things as is: we would need to transfer a QPL from an old queue to a new queue, and that is unpleasant. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
We now account for the fact that the NIC might send us stats for a subset of queues. Without this change, gve_get_ethtool_stats might make an invalid access on the priv->stats_report->stats array. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
This does not fix any existing bug. In anticipation of the ndo queue api hooks that alloc/free/start/stop a single Rx queue, the already existing per-queue stop functions are being made more robust. Specifically for this use case: rx_queue_n.stop() + rx_queue_n.start() Note that this is not the use case being used in devmem tcp (the first place these new ndo hooks would be used). There the usecase is: new_queue.alloc() + old_queue.stop() + new_queue.start() + old_queue.free() Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
In order to make possible the implementation of per-queue ndo hooks, gve_turnup was changed in a previous patch to account for queues already having some unprocessed descriptors: it does a one-off napi_schedule to handle them. If conditions of consistent high traffic persist in the immediate aftermath of this, the poll routine for a queue can be "stuck" on the cpu on which the ndo hooks ran, instead of the cpu its irq has affinity with. This situation is exacerbated by the fact that the ndo hooks for all the queues are invoked on the same cpu, potentially causing all the napi poll routines to be residing on the same cpu. A self-correcting mechanism in the poll method itself solves this problem. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
gVNIC has a requirement that all queues have to be quiesced before any queue is operated on (created or destroyed). To enable the implementation of future ndo hooks that work on a single queue, we need to evolve gve_turnup to account for queues already having some unprocessed descriptors in the ring. Say rxq 4 is being stopped and started via the queue api. Due to gve's requirement of quiescence, queues 0 through 3 are not processing their rings while queue 4 is being toggled. Once they are made live, these queues need to be poked to cause them to check their rings for descriptors that were written during their brief period of quiescence. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
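Conceptually, the poke is a one-shot napi_schedule() on every queue being brought back up, so each poll routine re-examines its ring without waiting for a fresh interrupt. A sketch (the field layout is hypothetical; the driver's real loop walks tx and rx queues separately):

    /* After re-enabling the queues in gve_turnup(), kick each rx napi
     * once so descriptors written during the quiescent period get
     * processed. (priv->rx[idx].napi is a hypothetical layout.)
     */
    for (idx = 0; idx < priv->rx_cfg.num_queues; idx++)
            napi_schedule(&priv->rx[idx].napi);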
-
Shailend Chand authored
Currently the queues are either all live or all dead, toggling from one state to the other via the ndo open and stop hooks. The future addition of single-queue ndo hooks changes this, and thus gve_turnup and gve_turndown should evolve to account for a state where some queues are live and some aren't. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
This allows for implementing future ndo hooks that act on a single queue. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shailend Chand authored
Although this is not fixing any existing double free bug, making these functions idempotent allows for a simpler implementation of future ndo hooks that act on a single queue. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mina Almasry authored
This API enables the net stack to reset the queues used for devmem TCP. Signed-off-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
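The ops table introduced by this patch looks roughly as follows (signatures reconstructed from the series discussion; treat the details as approximate):

    /* Per-driver hooks for operating on a single queue. A restart is:
     * alloc new queue memory -> stop old queue -> start new queue ->
     * free old queue memory, so a failed allocation never takes the
     * old queue down.
     */
    struct netdev_queue_mgmt_ops {
            size_t  ndo_queue_mem_size;     /* size of per-queue memory blob */
            int     (*ndo_queue_mem_alloc)(struct net_device *dev,
                                           void *per_queue_mem, int idx);
            void    (*ndo_queue_mem_free)(struct net_device *dev,
                                          void *per_queue_mem);
            int     (*ndo_queue_start)(struct net_device *dev,
                                       void *per_queue_mem, int idx);
            int     (*ndo_queue_stop)(struct net_device *dev,
                                      void *per_queue_mem, int idx);
    };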
-
- 03 May, 2024 2 commits
-
-
Mina Almasry authored
This reverts commit a580ea99. This revert is to resolve Dragos's report of a page_pool leak here: https://lore.kernel.org/lkml/20240424165646.1625690-2-dtatulea@nvidia.com/ The reverted patch interacts very badly with commit 2cc3aeb5 ("skbuff: Fix a potential race while recycling page_pool packets"). The reverted commit hopes that the pp_recycle + is_pp_page variables do not change between the skb_frag_ref and skb_frag_unref operations. If such a change occurs, the skb_frag_ref/unref will not operate on the same reference type. In the case of Dragos's report, the grabbed ref was a pp ref, but the unref was a page ref, because the pp_recycle setting on the skb was changed. Attempting to fix this issue on the fly is risky. Let's revert, and I hope to reland this with better understanding and testing to ensure we don't regress some edge case while streamlining skb reffing. Fixes: a580ea99 ("net: mirror skb frag ref/unref helpers") Reported-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Mina Almasry <almasrymina@google.com> Link: https://lore.kernel.org/r/20240502175423.2456544-1-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
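The race the revert sidesteps, in schematic form (conceptual only; skb_frag_ref/skb_frag_unref are real helpers, the interleaving comments paraphrase Dragos's report):

    /* CPU A: skb->pp_recycle is set, so this takes a page_pool ref. */
    skb_frag_ref(skb, i);

    /* Meanwhile another context changes the skb's pp_recycle setting... */

    /* CPU A: ...so the unref now drops a plain page ref instead of a
     * page_pool ref, leaking the pp ref taken above.
     */
    skb_frag_unref(skb, i);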
-
David Wei authored
Current net-next/main does not boot for older chipsets, e.g. Stratus. Sample dmesg: [ 11.368315] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): Able to reserve only 0 out of 9 requested RX rings [ 11.390181] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): Unable to reserve tx rings [ 11.438780] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): 2nd rings reservation failed. [ 11.487559] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): Not enough rings available. [ 11.506012] bnxt_en 0000:02:00.0: probe with driver bnxt_en failed with error -12 This is caused by bnxt_get_avail_msix() returning a negative value for these chipsets not using the new resource manager, i.e. !BNXT_NEW_RM. This in turn causes hwr.cp in __bnxt_reserve_rings() to be set to 0. In the current call stack, __bnxt_reserve_rings() is called from bnxt_set_dflt_rings() before bnxt_init_int_mode(). Therefore, bp->total_irqs is always 0, and for !BNXT_NEW_RM bnxt_get_avail_msix() always returns a negative number. Historically, MSIX vectors were requested by the RoCE driver during run-time and bnxt_get_avail_msix() was used for this purpose. Today, RoCE MSIX vectors are statically allocated. bnxt_get_avail_msix() should only be called in the BNXT_NEW_RM() case, to reserve the MSIX ahead of time for RoCE use. bnxt_get_avail_msix() is also simplified to handle the BNXT_NEW_RM() case only. Fixes: d630624e ("bnxt_en: Utilize ulp client resources if RoCE is not registered") Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240502203757.3761827-1-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-