- 19 Oct, 2021 40 commits
-
-
Jesse Brandeburg authored
The driver was having trouble with unreliable latency when doing single threaded ping-pong tests. This was root caused to the DIM algorithm landing on a too slow interrupt value, which caused high latency, and it was especially present when queues were being switched frequently by the scheduler as happens on default setups today. In attempting to improve this, we allow the upper rate limit for interrupts to move to rate limit of 4 microseconds as a max, which means that no vector can generate more than 250,000 interrupts per second. The old config was up to 100,000. The driver previously tried to program the rate limit too frequently and if the receive and transmit side were both active on the same vector, the INTRL would be set incorrectly, and this change fixes that issue as a side effect of the redesign. This driver will operate from now on with a slightly changed DIM table with more emphasis towards latency sensitivity by having more table entries with lower latency than with high latency (high being >= 64 microseconds). The driver also resets the DIM algorithm state with a new stats set when there is no work done and the data becomes stale (older than 1 second), for the respective receive or transmit portion of the interrupt. Add a new helper for setting rate limit, which will be used more in a followup patch. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Brett Creeley authored
Implement ndo_set_vf_rate to support setting of min_tx_rate and max_tx_rate; set the appropriate bandwidth in the scheduler for the node representing the specified VF VSI. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Cai Huoqing authored
Replacing dma_pool_alloc/memset() with dma_pool_zalloc() to simplify the code. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christophe JAILLET authored
'destroy_workqueue()' already drains the queue before destroying it, so there is no need to flush it explicitly. Remove the redundant 'flush_workqueue()' calls. This was generated with coccinelle: @@ expression E; @@ - flush_workqueue(E); destroy_workqueue(E); Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Leon Romanovsky authored
PCI core code in the pci_call_probe() has a path that doesn't hold device_lock. It happens because the ->probe() is called through the workqueue mechanism. 349 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev, 350 const struct pci_device_id *id) 351 { 352 .... 377 if (cpu < nr_cpu_ids) 378 error = work_on_cpu(cpu, local_pci_probe, &ddi); Luckily enough, the core still ensures that only single flow is executed, so it safe to remove the assert checks that anyway were added for annotations purposes. Fixes: b88f7b12 ("devlink: Annotate devlink API calls") Reported-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
luo penghao authored
The variable will be assigned again later in the if condition, there is no meaning there. drivers/net/ethernet/broadcom/tg3.c:5750:2 warning: Value stored to 'current_link_up' is never read. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Robert Hancock authored
The auto-negotiation state in the PCS as set by phylink_mii_c22_pcs_config was previously always enabled when the driver is configured for in-band autonegotiation, even if autonegotiation was disabled on the interface with ethtool. Update the code to set the BMCR_ANENABLE bit based on the interface's autonegotiation enabled state. Update phylink_mii_c22_pcs_get_state to not check autonegotiation-related fields when autonegotiation is disabled. Update phylink_mac_pcs_get_state to initialize the state based on the interface's configured speed, duplex and pause parameters rather than to unknown when autonegotiation is disabled, before calling the driver's pcs_get_state functions, as they are not likely to provide meaningful data for these fields when autonegotiation is disabled. In this case the driver is really just filling in the link state field. Note that in cases where there is a downstream PHY connected, such as with SGMII and a copper PHY, the configuration set by ethtool is handled by phy_ethtool_ksettings_set and not propagated to the PCS. This is correct since SGMII or 1000Base-X autonegotiation with the PCS should normally still be used even if the copper side has disabled it. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sebastian Andrzej Siewior authored
Eric reported that the rate estimator reads statics from the softirq which in turn triggers a warning introduced in the statistics rework. The warning is too cautious. The updates happen in the softirq context so reads from softirq are fine since the writes can not be preempted. The updates/writes happen during qdisc_run() which ensures one writer and the softirq context. The remaining bad context for reading statistics remains in hard-IRQ because it may preempt a writer. Fixes: 29cbcd85 ("net: sched: Remove Qdisc::running sequence counter") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Commit ea269a6f ("net: phylink: Update SFP selected interface on advertising changes") added a better solution to selecting the interface mode for SFPs using the advertisement mask. This method will work for mvneta and mvpp2 when selecting between 2500base-X and 1000base-X without needing to use the basex helper, or indicate that we support both 1000base-X and 2500base-X when in either of these two interface modes. Hence, we need to eliminate the validation prior to selecting the interface, otherwise when we clean up mvneta's validation function, we will end up locking to 2500base-X as we validate with an interface mode of PHY_INERFACE_MODE_2500BASEX. The supported mask will already have been reduced down to the union of support for the SFP and MAC already, so we can be confident that the advertisement mask is already appropriately restricted. We only need to select the appropriate interface, and then revalidate with the new interface mode. We get rid of the check for pl->sfp_port too, this is meaningless here as it doesn't get cleared when a module is removed, so it doesn't indicate if a module is present. Just rely on pl->sfp_bus. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
luo penghao authored
This assignment statement is meaningless, because the statement will execute to the tag "set_itr_now". The clang_analyzer complains as follows: drivers/net/ethernet/intel/e1000e/netdev.c:2552:3 warning: Value stored to 'current_itr' is never read. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== ethernet: add eth_hw_addr_gen() for switches While doing the last polishing of the drivers/ethernet changes I realized we have a handful of drivers offsetting some base MAC addr by an id. So I decided to add a helper for it. The helper takes care of wrapping which is probably not 100% necessary but seems like a good idea. And it saves driver side LoC (the diffstat is actually negative if we compare against the changes I'd have to make if I was to convert all these drivers to not operate directly on netdev->dev_addr). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Vadym and Taras report that the current behavior of the driver is not exactly expected and it's better to add the port id in like other drivers do. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
We have 5 drivers which offset base MAC addr by port id. Create a helper for them. This helper takes care of overflows, which some drivers did not do, please complain if that's going to break anything! Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== ethernet: manual netdev->dev_addr conversions (part 2) Manual conversions of Ethernet drivers writing directly to netdev->dev_addr (part 2 out of 3). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Break the address up into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Break the address up into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Invert the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 406f42fa ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Read the address into an array on the stack, then call eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: Multi-level qdisc offload Petr says: Currently, mlxsw admits for offload a suitable root qdisc, and its children. Thus up to two levels of hierarchy are offloaded. Often, this is enough: one can configure TCs with RED and TCs with a shaper on, and can even see counters for each TC by looking at a qdisc at a sufficiently shallow position. While simple, the system has obvious shortcomings. It is not possible to configure both RED and shaping on one TC. It is not possible to place a PRIO below root TBF, which would then be offloaded as port shaper. FIFOs are only offloaded at root or directly below, which is confusing to users, because RED and TBF of course have their own FIFO. This patch set lifts assumptions that prevent offloading multi-level qdisc trees. In patch #1, offload of a graft operation is added to TBF. Grafts are issued as another qdisc is linked to the qdisc in question, and give drivers a chance to react to the linking. The absence of this event was not a major issue so far, because TBF was not considered classful, which changes with this patchset. The codebase currently assumes that ETS and PRIO are the only classful qdiscs. The following patches gradually lift this assumption. In patch #2, calculation of traffic class and priomap of a qdisc is fixed. Patch #3 fixes handling of future FIFOs. Child FIFO qdiscs may be created and notified before their parent qdisc exists and therefore need special handling. Patches #4, #5 and #6 unify, respectively, child destruction, child grafting, and cleanup of statistics. Patch #7 adds a function that validates whether a given qdisc topology is offloadable. Finally in patch #8, TBF and RED become classful. At this point, FIFO qdiscs grafted to an offloaded qdisc should always be offloaded. Patch #9 adds a selftest to verify some offloadable and unoffloadable qdisc trees. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
This checks that various qdisc configurations either are or are not offloaded. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Permit offloading qdiscs below RED and TBF. In order to avoid having to implement trivial propagating callbacks for get_prio_bitmap and get_tclass_num, extend mlxsw_sp_qdisc_get_prio_bitmap() and ..._get_tclass_num() to handle the lack of the callback as a cue to forward the request to the parent. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
A following patch will enable offloading qdiscs that are deeper than directly under root qdisc. Currently the topology validation consists of demanding a root qdisc position for ETS and PRIO. Since RED and TBF are considered classless, this is enough. In order to prevent some nonsensical combinations when RED and TBF become classful, introduce a more general topology validator. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
On Spectrum, there are no per-TC TX counters. Instead, mlxsw uses per-prio counters and aggregates them according to the priomap. Therefore when priomap changes, the counter base values need to be reset to reflect the change. Previously, this was only done for the sole child qdisc, but a following patch makes RED and TBF classful. Thus apply the request to the whole sub-tree. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Qdisc graft operations have so far been reported at PRIO, ETS and RED, with RED events ignored, because RED was not considered a classful qdisc. A following patch will make mlxsw recognize RED and TBF as classful qdiscs, and thus it is necessary to validate grafting at these qdiscs as well. Rename the existing graft validator to make it clear that it is a generic function, and invoke for RED and TBF graft events as well. Drop the unnecessary PRIO helper and invoke the graft validator directly for PRIO as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Currently ETS and PRIO are the only offloaded classful qdiscs. Since they are both similar, their destroy handler is the same, and it handles children destruction itself. But now it is possible to do it generically for any classful qdisc. Therefore promote the recursive destruction from the ETS handler to mlxsw_sp_qdisc_destroy(), so that RED and TBF pick it up in follow-up patches. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Extract from __mlxsw_sp_qdisc_ets_replace() two helpers for handling of one future FIFO resp. reinitializing the array of future FIFOs. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Currently when keeping track of qdiscs, mlxsw notes the TC and priomap corresponding to each qdisc. That is fine currently, as there only ever is one level of qdiscs to update: the direct children of ETS / PRIO. However as deeper structures are made offloadable, ETS would need to update these values for the complete subtree, and interim qdiscs would need to remember to propagate the value. Instead, reverse the responsibility: child qdiscs can ask their parent what their TC and priomap are. ETS / PRIO know the answer right away, or there are defaults for when the root qdisc does not assign them (e.g. when RED is used as root qdisc). When RED and TBF become classful, they will simply forward the request up to their parent. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
As another qdisc is linked to the TBF, the latter should issue an event to give drivers a chance to react to the grafting. In other qdiscs, this event is called GRAFT, so follow suit with TBF as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-