On Fri, 2005-08-05 at 15:55 +0200, Andi Kleen wrote: > > This is fixing the symptom and is not the cure. Unfortunately I don't > > have a e1000 card so I can't try a fix. But I did have a e100 card that > > would lock up the same way. The problem was that netpoll_poll calls the > > cards netpoll routine (in e1000_main.c e1000_netpoll). In the e100 > > case, when the transmit buffer would fill up, the queue would go down. > > But the netpoll routine in the e100 code never put it back up after it > > was all transfered. So this would lock up the kernel when that happened. > > In my case the hang happened when no cable was connected.
But should come back when the cable is reconnected. OK, I admit, it shouldn't hang in the first place. > > There is no way to handle this in any other way. You eventually > have to bail out. > > > > > repeat: > > - if(!np || !np->dev || !netif_running(np->dev)) { > > + if(try-- == 0 || !np || !np->dev || !netif_running(np->dev)) { > > + if (!try) > > + printk(KERN_WARNING "net driver is stuck down, maybe a" > > + " problem with the driver's netpoll\n"); > > ... and nobody will see that. It will not even trigger an output. Since one would be using net console right? :-) Oops! I forgot that. Well it may make it to the logs, since this patch also bails out. That's why I think your first patch with this warning as well as a fix for the e1000 should be submitted. Since the e1000 shouldn't lock up netpoll just because the queue was put down. Hmm, how bad is it to have a printk in a routine that is registered to printk? If this does print, a "static once" variable should be added so that this is only printed once and not everytime it tries to print this message. -- Steve - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html