On Mon, Jun 05, 2006 at 04:11:25PM -0700, Kok, Auke wrote:
> 
> Netpoll was broken due to the earlier addition of multiqueue.
> 
> Signed-off-by: Mitch Williams <[EMAIL PROTECTED]>
> Signed-off-by: Auke Kok <[EMAIL PROTECTED]>
> ---
> 
>  drivers/net/e1000/e1000_main.c |    9 ++++++++-
>  1 files changed, 8 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
> index ed15fca..7103a0e 100644
> --- a/drivers/net/e1000/e1000_main.c
> +++ b/drivers/net/e1000/e1000_main.c
> @@ -4629,10 +4629,17 @@ static void
>  e1000_netpoll(struct net_device *netdev)
>  {
>       struct e1000_adapter *adapter = netdev_priv(netdev);
> +#ifdef CONFIG_E1000_NAPI
> +     int budget = 0;
> +
> +     disable_irq(adapter->pdev->irq);
> +     e1000_clean_tx_irq(adapter, adapter->tx_ring);
> +     adapter->clean_rx(adapter, adapter->rx_ring, &budget, netdev->weight);
> +#else
> +
I've been speaking about this fix with a Jeff Moyer, and we've come up with some
concerns regarding its implementation.  Specifically the call to
adapter->clean_rx in the case of the e1000 driver is rather a layering
violation in the netpoll code, in the sense that the function pointed to by 
clean_rx
is functionality that is nominally used by the dev->poll method.  In fact in
this case, it would appear possible since dev->poll is called under the
poll_lock, but dev->poll_controller is not, that is is possible to have cpus in
a system executing in e1000_clean_rx_irq_[ps] at the same time leading to data
corruption:

CPU0:
netpoll_poll_dev
 dev->poll_controller (e1000_netpoll)
  adapter->clean_rx (e1000_clean_rx_irq)

CPU1:
napi_poll
 dev->poll (e1000_clean)
  e1000_clean_rx_irq

I'm not sure what the right fix is here.  A spinlock in e1000_clean_rx_irq[_ps]
would seem to be an easy and direct fix, but I don't know that thats the best
solution.

Something I don't understand is why the call to adapter->clean_rx is
required in the first place when the napi_poll routine calls it itself directly.
All other drivers schedule a napi poll and receive frames via that path.  My
guess is that it has to do with the fact that we schedule polls on the device in
the polling_netdev array, rather than the actual registered netdev. Looking at
the driver code I note that while an entire array is allocated for polling
netdevs, we only ever use entry 0.  Would it make sense to just remove the the
polling_netdev array and use the registered device like all the other drivers.
I expect that would eliminate the need for this patch as well.

Regards
Neil

>       disable_irq(adapter->pdev->irq);
>       e1000_intr(adapter->pdev->irq, netdev, NULL);
>       e1000_clean_tx_irq(adapter, adapter->tx_ring);
> -#ifndef CONFIG_E1000_NAPI
>       adapter->clean_rx(adapter, adapter->rx_ring);
>  #endif
>       enable_irq(adapter->pdev->irq);
> 
> 
> 
> --
> Auke Kok <[EMAIL PROTECTED]>
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *gpg keyid: 1024D / 0x92A74FA1 - http://pgp.mit.edu
 ***************************************************/
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to