On Sat, Aug 06, 2005 at 09:45:03AM +0200, Ingo Molnar wrote: > > * Andi Kleen <[EMAIL PROTECTED]> wrote: > > > On Fri, Aug 05, 2005 at 01:01:57PM -0700, Matt Mackall wrote: > > > The netpoll philosophy is to assume that its traffic is an absolute > > > priority - it is better to potentially hang trying to deliver a panic > > > message than to give up and crash silently. > > > > That would be ok if netpoll was only used to deliver panics. But it is > > not. It delivers all messages, and you cannot hang the kernel during > > that. Actually even for panics it is wrong, because often it is more > > important to reboot in a panic than (with a panic timeout) to actually > > deliver the panic. That's needed e.g. in a failover cluster. > > without going into the merits of this discussion, reliable failover > clusters must include (and do include) an external ability to cut power. > No amount of in-kernel logic will prevent the kernel from hanging, given > a bad enough kernel bug.
Ok, true, but we should do a best effort. > > So the right question is not 'can we prevent the kernel from hanging, > ever' (we cannot), but 'which change makes it less likely for the kernel > to hang'. (and, obviously: assuming all other kernel components are > functioning per specification, netpoll itself most not hang :-) > > even a plain printk to VGA can hang in certain kernel crashes. Netpoll > is more complex and thus has more exposure to hangs. E.g. netpoll relies > on the network driver to correctly recycle skbs within a bound amount of > time. If the network driver leaks skbs, it's game over for netpoll. I don't think we even need to think about such rare cases, until the easy cases ("everything hangs when the cable is pulled") are not fixed. > [ i'd prefer a hang over nondeterministic behavior, and e.g. losing > console messages is sure nondeterministic behavior. What if the > console message is "WARNING: the box has just been broken into"? ] That just makes netconsole useless in production. If it causes frequenet hangs people will not use it. > > we could do one thing (see the patch below): i think it would be useful > to fill up the netlogging skb queue straight at initialization time. > Especially if netpoll is used for dumping alone, the system might not be > in a situation to fill up the queue at the point of crash, so better be > a bit more prepared and keep the pipeline filled. You're solving a completely different issue here? -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/