From: Eugene Surovegin <[EMAIL PROTECTED]>
Date: Tue, 6 Sep 2005 15:04:17 -0700

> David, correct me if I'm wrong, but I think there is a major problem
> with current netconsole/netpoll approach.

You're preaching to the choir.  I think the whole netpoll
implementation is fundamentally flawed, and the locking problems we
keep bumping into are merely a symptom.

People want this thing so badly that I keep letting them patch it
into a quasi-working state, even though its foundations are what is
so problematic.

It's never going to work 100% reliably, I think.  Here's why:

The core issue, and conflict, is that the desire is to have the
responses be immediate and come at the moment the event occurs,
because the situation may be so dire that deferring into a more
appropriate software IRQ context may not be possible, and thus we'd
lose the log message or event.

So we try to spit out netconsole messages in hw IRQ context and stuff
like that, as you stated.  The tg3 driver is susceptible to the
problem you mention, as is bnx2, because they use purely software
interrupt spinlocking, and thus their timers will deadlock if any hw
IRQ context netpoll operations occur.
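
A toy, userspace model of that deadlock may make the sequence easier
to follow.  This is only a sketch under stated assumptions: it models
a single CPU, and every name in it (toy_lock, toy_lock_bh, and so on)
is hypothetical and is not a kernel or driver API.  The "_bh"-style
lock stands in for a lock that keeps software IRQs out but leaves
hardware IRQs enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model.  None of these names are real kernel APIs. */
struct toy_lock {
	bool held;
};

/* Models a "_bh"-style lock: software IRQs are kept out, but a
 * hardware IRQ can still fire while the lock is held. */
void toy_lock_bh(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

void toy_unlock_bh(struct toy_lock *l)
{
	l->held = false;
}

/* What a netpoll transmit issued from hw IRQ context would face.
 * On one CPU the lock holder cannot run again until the IRQ handler
 * returns, so spinning on a held lock would never finish. */
bool toy_hwirq_xmit_would_deadlock(const struct toy_lock *l)
{
	return l->held;
}

/* A driver timer holds the lock when a hw IRQ logs via netconsole. */
bool toy_timer_interrupted_by_hwirq(void)
{
	struct toy_lock tx_lock = { false };
	bool deadlock;

	toy_lock_bh(&tx_lock);
	/* The hardware IRQ is assumed to fire at this point. */
	deadlock = toy_hwirq_xmit_would_deadlock(&tx_lock);
	toy_unlock_bh(&tx_lock);
	return deadlock;
}

/* The same hw IRQ arriving while nobody holds the lock is harmless. */
bool toy_hwirq_while_idle(void)
{
	struct toy_lock tx_lock = { false };

	return toy_hwirq_xmit_would_deadlock(&tx_lock);
}
```

The model reports the would-be deadlock instead of actually spinning,
which is the only reason it can return at all.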

There is a way to fix all of this, deferring all netpoll operations to
software IRQ context, but you sacrifice reliability when the system is
in such a bad state that software IRQs are not occurring any more
or are deadlocked.
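
The trade-off can be sketched with a small userspace queue.  This is
an illustration under stated assumptions, not netpoll's actual code:
toy_queue, toy_netpoll_enqueue, toy_netpoll_flush and toy_stub_xmit
are hypothetical names, and the queue size is arbitrary.  The hw IRQ
side only copies the message; the software IRQ side is the only place
that transmits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_QUEUE_LEN 4
#define TOY_MSG_LEN   64

/* Hypothetical fixed-size message queue; not a kernel structure. */
struct toy_queue {
	char   msgs[TOY_QUEUE_LEN][TOY_MSG_LEN];
	size_t head;     /* next slot to flush */
	size_t count;    /* messages waiting */
	size_t dropped;  /* messages lost because the queue was full */
};

size_t toy_xmitted;  /* messages handed to the stub transmitter */

/* Stand-in for the real transmit path. */
void toy_stub_xmit(const char *msg)
{
	(void)msg;
	toy_xmitted++;
}

/* hw IRQ context: copy the message and return, never touch the
 * driver.  Returns 0 on success, -1 if the message was dropped. */
int toy_netpoll_enqueue(struct toy_queue *q, const char *msg)
{
	size_t slot;

	assert(q != NULL && msg != NULL);
	if (q->count == TOY_QUEUE_LEN) {
		q->dropped++;
		return -1;
	}
	slot = (q->head + q->count) % TOY_QUEUE_LEN;
	strncpy(q->msgs[slot], msg, TOY_MSG_LEN - 1);
	q->msgs[slot][TOY_MSG_LEN - 1] = '\0';
	q->count++;
	return 0;
}

/* Software IRQ context: taking driver locks is safe here, so this
 * is where messages are sent.  Returns the number transmitted. */
size_t toy_netpoll_flush(struct toy_queue *q, void (*xmit)(const char *))
{
	size_t sent = 0;

	assert(q != NULL && xmit != NULL);
	while (q->count > 0) {
		xmit(q->msgs[q->head]);
		q->head = (q->head + 1) % TOY_QUEUE_LEN;
		q->count--;
		sent++;
	}
	return sent;
}
```

If the flush side never runs, because software IRQs have stopped or
are deadlocked, the queue fills and every later message is counted in
dropped rather than sent.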