On Mon, Jun 05, 2006 at 04:11:25PM -0700, Kok, Auke wrote: > > Netpoll was broken due to the earlier addition of multiqueue. > > Signed-off-by: Mitch Williams <[EMAIL PROTECTED]> > Signed-off-by: Auke Kok <[EMAIL PROTECTED]> > --- > > drivers/net/e1000/e1000_main.c | 9 ++++++++- > 1 files changed, 8 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c > index ed15fca..7103a0e 100644 > --- a/drivers/net/e1000/e1000_main.c > +++ b/drivers/net/e1000/e1000_main.c > @@ -4629,10 +4629,17 @@ static void > e1000_netpoll(struct net_device *netdev) > { > struct e1000_adapter *adapter = netdev_priv(netdev); > +#ifdef CONFIG_E1000_NAPI > + int budget = 0; > + > + disable_irq(adapter->pdev->irq); > + e1000_clean_tx_irq(adapter, adapter->tx_ring); > + adapter->clean_rx(adapter, adapter->rx_ring, &budget, netdev->weight); > +#else > + I've been speaking about this fix with a Jeff Moyer, and we've come up with some concerns regarding its implementation. Specifically the call to adapter->clean_rx in the case of the e1000 driver is rather a layering violation in the netpoll code, in the sense that the function pointed to by clean_rx is functionality that is nominally used by the dev->poll method. In fact in this case, it would appear possible since dev->poll is called under the poll_lock, but dev->poll_controller is not, that is is possible to have cpus in a system executing in e1000_clean_rx_irq_[ps] at the same time leading to data corruption:
CPU0: netpoll_poll_dev dev->poll_controller (e1000_netpoll) adapter->clean_rx (e1000_clean_rx_irq) CPU1: napi_poll dev->poll (e1000_clean) e1000_clean_rx_irq I'm not sure what the right fix is here. A spinlock in e1000_clean_rx_irq[_ps] would seem to be an easy and direct fix, but I don't know that thats the best solution. Something I don't understand is why the call to adapter->clean_rx is required in the first place when the napi_poll routine calls it itself directly. All other drivers schedule a napi poll and receive frames via that path. My guess is that it has to do with the fact that we schedule polls on the device in the polling_netdev array, rather than the actual registered netdev. Looking at the driver code I note that while an entire array is allocated for polling netdevs, we only ever use entry 0. Would it make sense to just remove the the polling_netdev array and use the registered device like all the other drivers. I expect that would eliminate the need for this patch as well. Regards Neil > disable_irq(adapter->pdev->irq); > e1000_intr(adapter->pdev->irq, netdev, NULL); > e1000_clean_tx_irq(adapter, adapter->tx_ring); > -#ifndef CONFIG_E1000_NAPI > adapter->clean_rx(adapter, adapter->rx_ring); > #endif > enable_irq(adapter->pdev->irq); > > > > -- > Auke Kok <[EMAIL PROTECTED]> > - > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html -- /*************************************************** *Neil Horman *Software Engineer *gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu ***************************************************/ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html