Commit 8fce33317023 introduced per-channel NAPI and independent
cleaning of the TX path.

This currently hurts performance in some cases. The problem shows up
when all packets are received in Queue 0 but TX is performed in a
Queue != 0.

I didn't look very deep, but it seems that the NAPI for Queue 0 cleans
the RX path while TX, which sits in a different NAPI, gets polled at a
slower rate, and that kills TX performance. I suspect this is because TX
cleaning takes much longer than RX and because NAPI gets canceled once
we return with 0 budget consumed (e.g. when TX is still not done it
will return 0 budget).

Fix this by cleaning all TX channels in the NAPI poll function.
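
As a rough illustration of the difference (a toy userspace model, not
driver code), the sketch below contrasts a poll that cleans only its own
TX queue with one that walks every TX queue. The names tx_pending,
toy_tx_clean, poll_own_queue and poll_all_queues are hypothetical
stand-ins invented for this example; only the loop shape mirrors the
change in stmmac_napi_poll().

```c
#include <assert.h>

#define NUM_TX_QUEUES 4	/* assumed queue count, for illustration only */

/* Hypothetical state: completed TX descriptors waiting per queue. */
static int tx_pending[NUM_TX_QUEUES];

/* Toy stand-in for stmmac_tx_clean(): reclaim up to 'budget' entries
 * from one queue and return how many were reclaimed.
 */
static int toy_tx_clean(int budget, int queue)
{
	int n;

	assert(queue >= 0 && queue < NUM_TX_QUEUES);
	n = tx_pending[queue] < budget ? tx_pending[queue] : budget;
	tx_pending[queue] -= n;
	return n;
}

/* Old shape: the poll for channel 'chan' only cleans its own TX queue. */
static int poll_own_queue(int budget, int chan)
{
	return toy_tx_clean(budget, chan);
}

/* New shape: any poll cleans every TX queue, so TX work queued on
 * Queue != 0 is reclaimed even when only Queue 0's NAPI is running.
 */
static int poll_all_queues(int budget)
{
	int i, tx_done = 0;

	for (i = 0; i < NUM_TX_QUEUES; i++)
		tx_done += toy_tx_clean(budget, i);
	return tx_done;
}
```

In this model, with work pending only on queue 2, poll_own_queue(64, 0)
reclaims nothing, while poll_all_queues(64) drains queue 2 in one call.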

Signed-off-by: Jose Abreu <joab...@synopsys.com>
Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
Cc: Joao Pinto <jpi...@synopsys.com>
Cc: David S. Miller <da...@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavall...@st.com>
Cc: Alexandre Torgue <alexandre.tor...@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h      |  1 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 11 +++++------
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 63e1064b27a2..8f6741a626d8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -82,7 +82,6 @@ struct stmmac_channel {
        struct stmmac_priv *priv_data;
        u32 index;
        int has_rx;
-       int has_tx;
 };
 
 struct stmmac_tc_entry {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 685d20472358..5bf5f8ebb4b6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2031,13 +2031,13 @@ static int stmmac_napi_check(struct stmmac_priv *priv, u32 chan)
        struct stmmac_channel *ch = &priv->channel[chan];
        bool needs_work = false;
 
-       if ((status & handle_rx) && ch->has_rx) {
+       if (status & handle_rx) {
                needs_work = true;
        } else {
                status &= ~handle_rx;
        }
 
-       if ((status & handle_tx) && ch->has_tx) {
+       if (status & handle_tx) {
                needs_work = true;
        } else {
                status &= ~handle_tx;
@@ -3528,11 +3528,12 @@ static int stmmac_napi_poll(struct napi_struct *napi, int budget)
        struct stmmac_priv *priv = ch->priv_data;
        int work_done, rx_done = 0, tx_done = 0;
        u32 chan = ch->index;
+       int i;
 
        priv->xstats.napi_poll++;
 
-       if (ch->has_tx)
-               tx_done = stmmac_tx_clean(priv, budget, chan);
+       for (i = 0; i < priv->plat->tx_queues_to_use; i++)
+               tx_done += stmmac_tx_clean(priv, budget, i);
        if (ch->has_rx)
                rx_done = stmmac_rx(priv, budget, chan);
 
@@ -4325,8 +4326,6 @@ int stmmac_dvr_probe(struct device *device,
 
                if (queue < priv->plat->rx_queues_to_use)
                        ch->has_rx = true;
-               if (queue < priv->plat->tx_queues_to_use)
-                       ch->has_tx = true;
 
                netif_napi_add(ndev, &ch->napi, stmmac_napi_poll,
                               NAPI_POLL_WEIGHT);
-- 
2.7.4
