According to the report in [1], this driver can hit the following
error in ravb_tx_timeout_work():

ravb e6800000.ethernet ethernet: failed to switch device to config mode

This error means that the hardware could not change its state
from "Operation" to "Configuration" while some TX and/or RX queues
were still operating. After that, ravb_config() in ravb_dmac_init() fails,
and the descriptors are never allocated again, so a NULL pointer
dereference happens later in ravb_start_xmit().

To fix the issue, ravb_tx_timeout_work() should check the return
value of ravb_stop_dma() to determine whether the hardware can be
re-initialized. If ravb_stop_dma() fails, ravb_tx_timeout_work()
re-enables TX and RX and just exits.
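
The control flow described above can be sketched as a small standalone
model. This is only an illustration, not driver code: struct model_dev
and every model_*() function are hypothetical stand-ins, and the only
assumption taken from the driver is that ravb_stop_dma() returns
nonzero on failure.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real ravb_private. */
struct model_dev {
	bool dma_stuck;     /* hardware refuses to leave "Operation" mode */
	bool rings_valid;   /* descriptor rings are allocated and usable */
	bool txrx_enabled;  /* TX and RX are enabled */
	int reinit_count;   /* completed full re-initializations */
};

/* Models ravb_stop_dma(): 0 on success, nonzero on failure (assumed). */
static int model_stop_dma(struct model_dev *dev)
{
	dev->txrx_enabled = false;
	return dev->dma_stuck ? -1 : 0;
}

/* Models the ring free / ring init / DMAC init / EMAC init sequence. */
static void model_reinit(struct model_dev *dev)
{
	dev->rings_valid = false;       /* rings are freed first */
	if (dev->dma_stuck)
		return;                 /* config mode switch fails, no rings */
	dev->rings_valid = true;
	dev->txrx_enabled = true;
	dev->reinit_count++;
}

/* Models the patched timeout handler. */
static void model_tx_timeout_work(struct model_dev *dev)
{
	if (model_stop_dma(dev)) {
		/* Keep the existing rings, re-enable TX/RX, skip re-init. */
		dev->txrx_enabled = true;
		return;
	}
	model_reinit(dev);
}
```

In this model, calling model_reinit() while dma_stuck is set leaves
rings_valid false, which corresponds to the state that leads to the
NULL pointer dereference. The early return keeps the old rings in place.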

[1]
https://lore.kernel.org/linux-renesas-soc/20200518045452.2390-1-dirk.be...@de.bosch.com/

Reported-by: Dirk Behme <dirk.be...@de.bosch.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda...@renesas.com>
---
 Changes from RFC v1:
 - Check the return value of ravb_stop_dma() and exit if the hardware
   cannot be re-initialized in the TX timeout handler.
 - Update the commit subject and description.
 - Fix some typos.
 https://patchwork.kernel.org/patch/11570217/

 Unfortunately, I have not been able to reproduce the issue yet, so
 this patch is still marked as RFC.

 drivers/net/ethernet/renesas/ravb_main.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index a442bcf6..500f5c1 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1458,7 +1458,18 @@ static void ravb_tx_timeout_work(struct work_struct *work)
                ravb_ptp_stop(ndev);
 
        /* Wait for DMA stopping */
-       ravb_stop_dma(ndev);
+       if (ravb_stop_dma(ndev)) {
+               /* If ravb_stop_dma() fails, the hardware is still running
+                * in "Operation" mode for TX and/or RX. In that case, do
+                * not call the following functions, because
+                * ravb_dmac_init() can fail too. Also, do not retry
+                * ravb_stop_dma() in a loop here, because that could
+                * wait forever. Instead, just re-enable TX and RX and
+                * skip the following re-initialization procedure.
+                */
+               ravb_rcv_snd_enable(ndev);
+               goto out;
+       }
 
        ravb_ring_free(ndev, RAVB_BE);
        ravb_ring_free(ndev, RAVB_NC);
@@ -1467,6 +1478,7 @@ static void ravb_tx_timeout_work(struct work_struct *work)
        ravb_dmac_init(ndev);
        ravb_emac_init(ndev);
 
+out:
        /* Initialise PTP Clock driver */
        if (priv->chip_id == RCAR_GEN2)
                ravb_ptp_init(ndev, priv->pdev);
-- 
2.7.4
