Jiri Kosina <[EMAIL PROTECTED]> writes: > On Tue, 3 Apr 2007, Jiri Kosina wrote: > >> > we're also having problems reproducing it on that same combination >> > (2.6.21-rc4 + my tree), so it points to something in -mm. Since your >> > trace is completely different right now it looks like something else >> > is fuzzing it up. Since the e1000 changes are in rc5-mm3 as well, that >> > might help to narrow it down quickly. >> I don't know (yet) whether rc5-mm3 was OK in this respect, I didn't boot >> it on this machine. I only know that both rc5 and rc5 + e1000 tree are >> OK, but rc5-mm4 panics on ifconfig/dhclient on e1000 card immediately on >> my system. >> I will start bisection when I get back to the respective machine >> (tomorrow) and will let you know. > > And the bisection winner is > > i386-irq-kill-nr_irq_vectors-and-increase-nr_irqs.patch > > I don't immediately see how it could be causing it, so adding CCs which > are listed in the patch.
Weird. I will have to look at that in a little more detail. Do you know if this problem happens on x86_64? What does your .config look like? What does /proc/interrupts look like? What kind of hardware you running this kernel on? Can anyone else reproduce this? The oops clearly shows something using -1 and calling that as an address I don't know why, but I'm guessing I have triggered a memory stomp somewhere. I think this is the first time I have seen a small negative number causing a NULL pointer dereference. That patch looks innocuous enough that either: - I just missed changing something I should have. - Your configuration has an increase in NR_IRQS and that triggered something. - The patch simply permuted things so a memory stomp now happens on the e1000 data structures instead of somewhere else. - Something doesn't like large irq numbers. This work is essentially a backport from x86_64 so if your hardware is 64bit capable testing that should be a fairly easy test, and be able to rule out large irq numbers as the culprit. Until I get a good look at -mm I'm going to have a hard time guessing. But a roving memory stomp is my best guess. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/