Hi all, I finally found why in my mpc5200 based system (and in the lite5200) I had leaks of skbs when receiving ethernet traffic at 100 Mbit/s so that we generated an rxfifo error (to be able to get it I even reduced the speed of the CPU using the PLL jumpers).
The problem was: The reception packets are handled in the interrupt handler: mpc52xx_fec_rx_interrupt. So this interrupt is being handled all the time because we are receiving packets continuously. The reception fifo errors are handled in the interrupt handler: mpc52xx_fec_interrupt. This interruption is being handled very often because we are receiving more packets that what we can handle, and that generates a fifo reception error. So it is very usual that the mpc52xx_fec_interrupt handler is executed at the middle of the mpc52xx_fec_rx_interrupt handler. And I think that it is not designed for that purpose. The mpc52xx_fec_rx_interrupt uses bcom_retrieve_buffer and bcom_submit_next_buffer. The mpc52xx_fec_interrupt calls to mpc52xx_fec_reset which calls to mpc52xx_fec_free_rx_buffers and mpc52xx_fec_alloc_rx_buffers, which at the end call bcom_retrieve_buffer and bcom_submit_next_buffer. Then we have two problems because of the same origin: - bcom_retrieve_buffer and bcom_submit_next_buffer doesn't seem to be atomic, so they cannot be called from one interrupt nested from another one being executing those functions. - Even if they were atomic, the mpc52xx_fec_rx_interrupt and the mpc52xx_fec_reset functions are not intended to be called at the same time. Patch I managed to do: I've managed to perform a patch that avoids the leak disabling the irqs, but I have a remaining problem: The last packet to be transmited is waiting for a packet to be received in order to egress (I don't know why but I detected it experimentaly). Anyone with more experience can tell me what's wrong in this patch? Thank you in advance, Asier Signed-off-by: Asier Llano <a.ll...@ziv.es> --- drivers/net/fec_mpc52xx.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff -urpN linux-2.6.31.2/drivers/net/fec_mpc52xx.c linux-2.6.31.2-fec_mpc52xx_skb_leak/drivers/net/fec_mpc52xx.c ================================================================ --- linux-2.6.31.2/drivers/net/fec_mpc52xx.c +++ linux-2.6.31.2-fec_mpc52xx_skb_leak/drivers/net/fec_mpc52xx.c @@ -85,13 +85,20 @@ MODULE_PARM_DESC(debug, "debugging messa static void mpc52xx_fec_tx_timeout(struct net_device *dev) { + struct mpc52xx_fec_priv *priv = netdev_priv(dev); + unsigned long flags; + dev_warn(&dev->dev, "transmit timed out\n"); + spin_lock_irqsave(&priv->lock, flags); + mpc52xx_fec_reset(dev); dev->stats.tx_errors++; netif_wake_queue(dev); + + spin_unlock_irqrestore(&priv->lock, flags); } static void mpc52xx_fec_set_paddr(struct net_device *dev, u8 *mac) @@ -359,8 +366,9 @@ static irqreturn_t mpc52xx_fec_tx_interr { struct net_device *dev = dev_id; struct mpc52xx_fec_priv *priv = netdev_priv(dev); + unsigned long flags; - spin_lock(&priv->lock); + spin_lock_irqsave(&priv->lock, flags); while (bcom_buffer_done(priv->tx_dmatsk)) { struct sk_buff *skb; @@ -375,7 +383,7 @@ static irqreturn_t mpc52xx_fec_tx_interr netif_wake_queue(dev); - spin_unlock(&priv->lock); + spin_unlock_irqrestore(&priv->lock, flags); return IRQ_HANDLED; } @@ -384,6 +392,9 @@ static irqreturn_t mpc52xx_fec_rx_interr { struct net_device *dev = dev_id; struct mpc52xx_fec_priv *priv = netdev_priv(dev); + unsigned long flags; + + spin_lock_irqsave(&priv->lock, flags); while (bcom_buffer_done(priv->rx_dmatsk)) { struct sk_buff *skb; @@ -445,6 +456,8 @@ static irqreturn_t mpc52xx_fec_rx_interr bcom_submit_next_buffer(priv->rx_dmatsk, skb); } + spin_unlock_irqrestore(&priv->lock, flags); + return IRQ_HANDLED; } @@ -454,6 +467,7 @@ static irqreturn_t mpc52xx_fec_interrupt struct mpc52xx_fec_priv *priv = netdev_priv(dev); struct mpc52xx_fec __iomem *fec = priv->fec; u32 ievent; + unsigned long flags; ievent = in_be32(&fec->ievent); @@ -471,9 +485,14 @@ static irqreturn_t mpc52xx_fec_interrupt if (net_ratelimit() && (ievent & FEC_IEVENT_XFIFO_ERROR)) dev_warn(&dev->dev, "FEC_IEVENT_XFIFO_ERROR\n"); + spin_lock_irqsave(&priv->lock, flags); + mpc52xx_fec_reset(dev); netif_wake_queue(dev); + + spin_unlock_irqrestore(&priv->lock, flags); + return IRQ_HANDLED; } ----------------------------------------- PLEASE NOTE ------------------------------------------- This message, along with any attachments, may be confidential or legally privileged. It is intended only for the named person(s), who is/are the only authorized recipients. If this message has reached you in error, kindly destroy it without review and notify the sender immediately. Thank you for your help. ZIV uses virus scanning software but excludes any liability for viruses contained in any attachment. ------------------------------------ ROGAMOS LEA ESTE TEXTO ------------------------------- Este mensaje y sus anexos pueden contener información confidencial y/o con derecho legal. Está dirigido únicamente a la/s persona/s o entidad/es reseñadas como único destinatario autorizado. Si este mensaje le hubiera llegado por error, por favor elimínelo sin revisarlo ni reenviarlo y notifíquelo inmediatamente al remitente. Gracias por su colaboración. ZIV utiliza software antivirus, pero no se hace responsable de los virus contenidos en los ficheros anexos. _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev