Commit 8fce33317023 introduced per-channel NAPI and independent cleaning
of the TX path.

This currently hurts performance in some cases. The problem shows up
when all packets are received on Queue 0 but TX is performed on a
Queue != 0.

I didn't look very deep, but it seems that the NAPI for Queue 0 cleans
the RX path while TX sits in a different NAPI, and that one is called at
a slower rate, which kills TX performance. I suspect this is because TX
cleaning takes much longer than RX and because NAPI gets canceled once
we return with 0 budget consumed (e.g. when TX is still not done it
returns 0 budget).

Fix this by looking at all TX channels in the NAPI poll function.

Signed-off-by: Jose Abreu <joab...@synopsys.com>
Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
Cc: Joao Pinto <jpi...@synopsys.com>
Cc: David S. Miller <da...@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavall...@st.com>
Cc: Alexandre Torgue <alexandre.tor...@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h      |  1 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++++------
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 63e1064b27a2..8f6741a626d8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -82,7 +82,6 @@ struct stmmac_channel {
 	struct stmmac_priv *priv_data;
 	u32 index;
 	int has_rx;
-	int has_tx;
 };
 
 struct stmmac_tc_entry {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 685d20472358..5bf5f8ebb4b6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2031,13 +2031,13 @@ static int stmmac_napi_check(struct stmmac_priv *priv, u32 chan)
 	struct stmmac_channel *ch = &priv->channel[chan];
 	bool needs_work = false;
 
-	if ((status & handle_rx) && ch->has_rx) {
+	if (status & handle_rx) {
 		needs_work = true;
 	} else {
 		status &= ~handle_rx;
 	}
 
-	if ((status & handle_tx) && ch->has_tx) {
+	if (status & handle_tx) {
 		needs_work = true;
 	} else {
 		status &= ~handle_tx;
@@ -3528,11 +3528,12 @@ static int stmmac_napi_poll(struct napi_struct *napi, int budget)
 	struct stmmac_priv *priv = ch->priv_data;
 	int work_done, rx_done = 0, tx_done = 0;
 	u32 chan = ch->index;
+	int i;
 
 	priv->xstats.napi_poll++;
 
-	if (ch->has_tx)
-		tx_done = stmmac_tx_clean(priv, budget, chan);
+	for (i = 0; i < priv->plat->tx_queues_to_use; i++)
+		tx_done += stmmac_tx_clean(priv, budget, i);
 
 	if (ch->has_rx)
 		rx_done = stmmac_rx(priv, budget, chan);
@@ -4325,8 +4326,6 @@ int stmmac_dvr_probe(struct device *device,
 
 		if (queue < priv->plat->rx_queues_to_use)
 			ch->has_rx = true;
-		if (queue < priv->plat->tx_queues_to_use)
-			ch->has_tx = true;
 
 		netif_napi_add(ndev, &ch->napi, stmmac_napi_poll,
 			       NAPI_POLL_WEIGHT);
-- 
2.7.4
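
For readers who want to see the behavioural difference in isolation, here is a small user-space toy model of the two poll strategies. It is only a sketch and not driver code: struct toy_priv, toy_tx_clean(), toy_poll_per_channel() and toy_poll_all_tx() are hypothetical names made up for illustration, and the per-queue pending counters stand in for the real TX descriptor rings.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_QUEUES 8

/* Hypothetical stand-in for the driver's private state. */
struct toy_priv {
	uint32_t tx_queues_to_use;
	int tx_pending[TOY_MAX_QUEUES];	/* descriptors awaiting cleanup */
};

/* Reclaim up to 'budget' descriptors from one TX queue. */
static int toy_tx_clean(struct toy_priv *priv, int budget, uint32_t queue)
{
	int n;

	assert(queue < TOY_MAX_QUEUES);

	n = priv->tx_pending[queue];
	if (n > budget)
		n = budget;
	priv->tx_pending[queue] -= n;

	return n;
}

/* Old behaviour (modelled): a channel's poll only cleans its own TX queue. */
static int toy_poll_per_channel(struct toy_priv *priv, int budget, uint32_t chan)
{
	if (chan < priv->tx_queues_to_use)
		return toy_tx_clean(priv, budget, chan);

	return 0;
}

/* New behaviour (modelled): any channel's poll walks every TX queue. */
static int toy_poll_all_tx(struct toy_priv *priv, int budget)
{
	int tx_done = 0;
	uint32_t i;

	for (i = 0; i < priv->tx_queues_to_use; i++)
		tx_done += toy_tx_clean(priv, budget, i);

	return tx_done;
}
```

With work pending only on TX queues 1 and 3, a poll for channel 0 reclaims nothing in the per-channel model, while the all-queues model drains both in one call. The toy does not model interrupt timing or NAPI rescheduling, so it says nothing about the actual throughput numbers.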