MANA hardware requires at least one doorbell ring every 8 wraparounds
of the CQ. The driver rings the doorbell as a form of flow control to
inform hardware that CQEs have been consumed.

The NAPI poll functions mana_poll_tx_cq() and mana_poll_rx_cq() can
poll up to CQE_POLLING_BUFFER (512) completions per call. If the CQ
has fewer than 512 entries, a single poll call can span more than one
wraparound, and with fewer than 128 entries it can process more than
4 wraparounds without ringing the doorbell. The doorbell threshold
check also uses ">" instead of ">=", delaying the ring by one extra
CQE beyond 4 wraparounds. Combined, these issues can cause the driver
to exceed the 8-wraparound hardware limit, leading to missed
completions and stalled queues.

Fix this by capping the number of CQEs polled per call to 4 wraparounds
of the CQ in both TX and RX paths. Also change the doorbell threshold
from ">" to ">=" so the doorbell is rung as soon as 4 wraparounds are
reached.

Cc: [email protected]
Fixes: 58a63729c957 ("net: mana: Fix doorbell out of order violation and avoid unnecessary doorbell rings")
Signed-off-by: Long Li <[email protected]>
---
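A rough, standalone sketch of the worst-case arithmetic described in the
commit message, for anyone who wants to check the numbers. It is a
simplified model and not driver code. It assumes that
work_done_since_doorbell counts CQEs and that the counter can sit right
at the threshold before one more full poll is added. The helper names
(capped_poll_budget, worst_case_before_patch, worst_case_after_patch)
are hypothetical; only CQE_POLLING_BUFFER (512) and the 4 and 8
wraparound figures come from the patch.

```c
#include <stdint.h>

#define CQE_POLLING_BUFFER 512u /* value quoted in the commit message */

/* Per-poll limit after the patch: min(4 wraparounds, CQE_POLLING_BUFFER). */
static inline uint32_t capped_poll_budget(uint32_t cq_entries)
{
	uint32_t four_wraps = cq_entries * 4u;

	return four_wraps < CQE_POLLING_BUFFER ? four_wraps : CQE_POLLING_BUFFER;
}

/* Modelled worst case of CQEs consumed between two doorbell rings with
 * the old code: the ">" check lets the counter rest at exactly 4
 * wraparounds, then one more poll adds up to CQE_POLLING_BUFFER.
 */
static inline uint32_t worst_case_before_patch(uint32_t cq_entries)
{
	return cq_entries * 4u + CQE_POLLING_BUFFER;
}

/* Modelled worst case with the patch: the ">=" check lets the counter
 * rest at most one CQE below 4 wraparounds, then one more poll adds at
 * most the capped limit.
 */
static inline uint32_t worst_case_after_patch(uint32_t cq_entries)
{
	return cq_entries * 4u - 1u + capped_poll_budget(cq_entries);
}
```

For a hypothetical 64-entry CQ, this model gives 768 CQEs (12
wraparounds) before the patch and 511 CQEs (one short of 8 wraparounds)
after it.
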
 drivers/net/ethernet/microsoft/mana/mana_en.c | 23 +++++++++++++++----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 9919183ad39e..fe667e0d930d 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1770,8 +1770,14 @@ static void mana_poll_tx_cq(struct mana_cq *cq)
        ndev = txq->ndev;
        apc = netdev_priv(ndev);
 
+       /* Limit CQEs polled to 4 wraparounds of the CQ to ensure the
+        * doorbell can be rung in time for the hardware's requirement
+        * of at least one doorbell ring every 8 wraparounds.
+        */
        comp_read = mana_gd_poll_cq(cq->gdma_cq, completions,
-                                   CQE_POLLING_BUFFER);
+                                   min_t(u32, (cq->gdma_cq->queue_size /
+                                          COMP_ENTRY_SIZE) * 4,
+                                         CQE_POLLING_BUFFER));
 
        if (comp_read < 1)
                return;
@@ -2156,7 +2162,14 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
        struct mana_rxq *rxq = cq->rxq;
        int comp_read, i;
 
-       comp_read = mana_gd_poll_cq(cq->gdma_cq, comp, CQE_POLLING_BUFFER);
+       /* Limit CQEs polled to 4 wraparounds of the CQ to ensure the
+        * doorbell can be rung in time for the hardware's requirement
+        * of at least one doorbell ring every 8 wraparounds.
+        */
+       comp_read = mana_gd_poll_cq(cq->gdma_cq, comp,
+                                   min_t(u32, (cq->gdma_cq->queue_size /
+                                          COMP_ENTRY_SIZE) * 4,
+                                         CQE_POLLING_BUFFER));
        WARN_ON_ONCE(comp_read > CQE_POLLING_BUFFER);
 
        rxq->xdp_flush = false;
@@ -2201,11 +2214,11 @@ static int mana_cq_handler(void *context, struct gdma_queue *gdma_queue)
                mana_gd_ring_cq(gdma_queue, SET_ARM_BIT);
                cq->work_done_since_doorbell = 0;
                napi_complete_done(&cq->napi, w);
-       } else if (cq->work_done_since_doorbell >
-                  cq->gdma_cq->queue_size / COMP_ENTRY_SIZE * 4) {
+       } else if (cq->work_done_since_doorbell >=
+                  (cq->gdma_cq->queue_size / COMP_ENTRY_SIZE) * 4) {
                /* MANA hardware requires at least one doorbell ring every 8
                 * wraparounds of CQ even if there is no need to arm the CQ.
-                * This driver rings the doorbell as soon as we have exceeded
+                * This driver rings the doorbell as soon as it has processed
                 * 4 wraparounds.
                 */
                mana_gd_ring_cq(gdma_queue, 0);
-- 
2.43.0

