Commit 9e2bc267 authored by Robert Hancock's avatar Robert Hancock Committed by David S. Miller

net: axienet: Use NAPI for TX completion path

This driver was using the TX IRQ handler to perform all TX completion
tasks. Under heavy TX network load, this can cause significant irqs-off
latencies (found to be in the hundreds of microseconds using ftrace).
This can cause other issues, such as overrunning serial UART FIFOs when
using high baud rates with limited UART FIFO sizes.

Switch to using a NAPI poll handler to perform the TX completion work
to get this out of hard IRQ context and avoid the IRQ latency impact.
A separate poll handler is used for TX and RX since they have separate
IRQs on this controller, so that the completion work for each of them
stays on the same CPU as the interrupt.

Testing on a Xilinx MPSoC ZU9EG platform using iperf3 from a Linux PC
through a switch at 1G link speed showed no significant change in TX or
RX throughput, with approximately 941 Mbps before and after. Hard IRQ
time in the TX throughput test was significantly reduced from 12% to
below 1% on the CPU handling TX interrupts, with total hard+soft IRQ CPU
usage dropping from about 56% down to 48%.
Signed-off-by: default avatarRobert Hancock <robert.hancock@calian.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent f0cf4000
......@@ -385,7 +385,6 @@ struct axidma_bd {
* @phy_node: Pointer to device node structure
* @phylink: Pointer to phylink instance
* @phylink_config: phylink configuration settings
* @napi: NAPI control structure
* @pcs_phy: Reference to PCS/PMA PHY if used
* @pcs: phylink pcs structure for PCS PHY
* @switch_x_sgmii: Whether switchable 1000BaseX/SGMII mode is enabled in the core
......@@ -396,7 +395,22 @@ struct axidma_bd {
* @regs_start: Resource start for axienet device addresses
* @regs: Base address for the axienet_local device address space
* @dma_regs: Base address for the axidma device address space
* @napi_rx: NAPI RX control structure
* @rx_dma_cr: Nominal content of RX DMA control register
* @rx_bd_v: Virtual address of the RX buffer descriptor ring
* @rx_bd_p: Physical address(start address) of the RX buffer descr. ring
* @rx_bd_num: Size of RX buffer descriptor ring
* @rx_bd_ci: Stores the index of the Rx buffer descriptor in the ring being
* accessed currently.
* @napi_tx: NAPI TX control structure
* @tx_dma_cr: Nominal content of TX DMA control register
* @tx_bd_v: Virtual address of the TX buffer descriptor ring
* @tx_bd_p: Physical address(start address) of the TX buffer descr. ring
* @tx_bd_num: Size of TX buffer descriptor ring
* @tx_bd_ci: Stores the next Tx buffer descriptor in the ring that may be
* complete. Only updated at runtime by TX NAPI poll.
* @tx_bd_tail: Stores the index of the next Tx buffer descriptor in the ring
* to be populated.
* @dma_err_task: Work structure to process Axi DMA errors
* @tx_irq: Axidma TX IRQ number
* @rx_irq: Axidma RX IRQ number
......@@ -404,19 +418,6 @@ struct axidma_bd {
* @phy_mode: Phy type to identify between MII/GMII/RGMII/SGMII/1000 Base-X
* @options: AxiEthernet option word
* @features: Stores the extended features supported by the axienet hw
* @tx_bd_v: Virtual address of the TX buffer descriptor ring
* @tx_bd_p: Physical address(start address) of the TX buffer descr. ring
* @tx_bd_num: Size of TX buffer descriptor ring
* @rx_bd_v: Virtual address of the RX buffer descriptor ring
* @rx_bd_p: Physical address(start address) of the RX buffer descr. ring
* @rx_bd_num: Size of RX buffer descriptor ring
* @tx_bd_ci: Stores the index of the Tx buffer descriptor in the ring being
* accessed currently. Used while alloc. BDs before a TX starts
* @tx_bd_tail: Stores the index of the Tx buffer descriptor in the ring being
* accessed currently. Used while processing BDs after the TX
* completed.
* @rx_bd_ci: Stores the index of the Rx buffer descriptor in the ring being
* accessed currently.
* @max_frm_size: Stores the maximum size of the frame that can be that
* Txed/Rxed in the existing hardware. If jumbo option is
* supported, the maximum frame size would be 9k. Else it is
......@@ -436,8 +437,6 @@ struct axienet_local {
struct phylink *phylink;
struct phylink_config phylink_config;
struct napi_struct napi;
struct mdio_device *pcs_phy;
struct phylink_pcs pcs;
......@@ -453,7 +452,20 @@ struct axienet_local {
void __iomem *regs;
void __iomem *dma_regs;
struct napi_struct napi_rx;
u32 rx_dma_cr;
struct axidma_bd *rx_bd_v;
dma_addr_t rx_bd_p;
u32 rx_bd_num;
u32 rx_bd_ci;
struct napi_struct napi_tx;
u32 tx_dma_cr;
struct axidma_bd *tx_bd_v;
dma_addr_t tx_bd_p;
u32 tx_bd_num;
u32 tx_bd_ci;
u32 tx_bd_tail;
struct work_struct dma_err_task;
......@@ -465,16 +477,6 @@ struct axienet_local {
u32 options;
u32 features;
struct axidma_bd *tx_bd_v;
dma_addr_t tx_bd_p;
u32 tx_bd_num;
struct axidma_bd *rx_bd_v;
dma_addr_t rx_bd_p;
u32 rx_bd_num;
u32 tx_bd_ci;
u32 tx_bd_tail;
u32 rx_bd_ci;
u32 max_frm_size;
u32 rxmem;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment