On Wed, Sep 10, 2014 at 09:18:35AM -0400, Vince Weaver wrote:
> Somehow something is stomping over memory with a forking workload (likely
> an improper free with RCU like we've seen before) but the fact that it
> causes a reboot immediately makes it *really* hard to debug this.

Yes, the insta-reboot thing is a total pain.

Too bad Steve is out for a spell; the only thing I can think of is
trying to 'preserve' the trace buffer over the reboot. It's a warm
reboot, so memory contents should be 'stable'. If we can get the new
boot to agree with the old kernel's idea of where the trace buffers
live, we might retain enough.

Another approach would be using the firewire debug facility to read the
trace buffer post-mortem. Of course, that requires you have FW in at
least two boxes and an appropriate cable (not something I've actually
ever done due to lack of FW hardware).

Maybe the EHCI debug port (USB) might provide similar capabilities;
again, significant lack of experience here due to not actually having
hardware for that.

I think I've once managed to hit the triple fault reboot in qemu/kvm,
which makes inspecting the dead state tons easier. If you can manage to
reproduce in a virt environment you've got a chance (of course, the
problem at that time was not perf and so a lot less sensitive to
hardware).