On 10/07/2013 23:09, Kevin Day wrote:
>>
>> Those sound useful. Just out of curiosity, however, since we're on the
>> topic of kernel dumps: Has anyone even looked into the notion of an
>> emergency fall-back network stack to enable remote kernel panic (or system
>> hang) debugging, the way OS X lets you do? I can't tell you the number of
>> times I've NMI'd a Mac and connected to it remotely in a scenario where
>> everything was totally wedged and just a couple of minutes in kgdb (or now
>> lldb) quickly showed that everything was waiting on a specific lock and the
>> problem became manifestly clear.
>>
>> The feature also lets you scrape a panic'd machine with automation, running
>> some kgdb scripts against it to glean useful information for later analysis
>> vs having to have someone schlep the dump image manually to triage. It's
>> going to be damn hard to live without this now, and if someone else isn't
>> working on it, that's good to know too!
>
> At a previous employer, we had a system where on a panic it had a totally
> separate stack capable of just IP/UDP/TFTP and would save its core via TFTP
> to a server. This isn’t as nice as full remote debugging, but it was a whole
> lot easier to develop. The caveats I remember were:
>
> 1) We didn’t want to implement ARP, so you had to write the mac address of
> the “dump server” to the kernel via sysctl before crashing.
> 2) We also didn’t want to have to deal with routing tables, so you had to
> manually specify what interface to blast packets out to, also via sysctl.
> 3) After a panic we didn’t want to rely on interrupt processing working, so
> it polled the network interface and blocked whenever it needed to. Since this
> was an embedded system, it wasn’t too big of a deal - only one network driver
> had to be hacked to support this. Basically a flag that would switch to
> “disable normal processing, switch to polled fifos for input and output”
> until reboot.
> 4) The whole system used only preallocated buffers and its own stack (carved
> out from memory on boot) so even if the kernel’s malloc was trashed, we could
> still dump.
>
> I’m not sure this really would scratch your itch, but I believe this took me
> no more than a day or two to implement. Parts #1 and #2 would be pretty easy,
> but I’m not sure how generic the kernel could support an emergency network
> mode that doesn’t require interrupts for every network card out there. Maybe
> that isn’t as important to you as it was to us.
>
> The whole exercise is much easier if you don’t use TFTP but a custom protocol
> that doesn’t require the crashing system to receive any packets, if it can
> just blast away at some random host oblivious if it’s working or not, it’s a
> lot less code to write.
>

There was some work on something similar at one point, not sure what came of it.
http://lists.freebsd.org/pipermail/freebsd-current/2010-September/020164.html
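
To make the "blast away without receiving anything" idea above a bit more concrete, here is a rough sketch of how such a send-only dump protocol might be framed. Everything in it is an assumption made for illustration and not taken from Kevin's system or from the linked thread: the header layout, the `KDMP` magic, the 1024-byte chunk size, and the function names are all hypothetical. Python stands in for both ends purely to show the framing; the real sender would be polled C code in the panicking kernel. Because the sender never hears back, every datagram carries its own offset, the total image size, and a CRC, so the dump server can place chunks in any order and report which byte ranges never arrived.

```python
import struct
import zlib

# Hypothetical wire format (an assumption, not from the thread):
# magic, byte offset into the dump, total dump size, payload length,
# CRC32 of the payload. Network byte order, no padding: 26 bytes.
HEADER = struct.Struct("!4sQQHI")
MAGIC = b"KDMP"
CHUNK_SIZE = 1024  # header + payload stays under a typical 1500-byte MTU


def make_chunks(image, chunk_size=CHUNK_SIZE):
    """Sender side: yield one self-describing datagram per chunk."""
    for offset in range(0, len(image), chunk_size):
        payload = image[offset:offset + chunk_size]
        header = HEADER.pack(MAGIC, offset, len(image),
                             len(payload), zlib.crc32(payload))
        yield header + payload


def reassemble(datagrams):
    """Receiver side: rebuild the image from whatever arrived.

    Returns (image, missing), where missing is a list of (start, end)
    byte ranges that were lost or failed validation. Lost ranges are
    left zero-filled in the returned image.
    """
    chunks = {}
    total = None
    for dgram in datagrams:
        if len(dgram) < HEADER.size:
            continue  # too short to hold a header
        magic, offset, total_size, length, crc = HEADER.unpack_from(dgram)
        payload = dgram[HEADER.size:]
        if magic != MAGIC or len(payload) != length:
            continue  # not ours, or truncated
        if zlib.crc32(payload) != crc or offset + length > total_size:
            continue  # corrupted in transit
        total = total_size
        chunks[offset] = payload

    if total is None:
        return b"", []

    image = bytearray(total)
    missing = []
    cursor = 0
    for offset in sorted(chunks):
        payload = chunks[offset]
        if offset > cursor:
            missing.append((cursor, offset))  # gap before this chunk
        image[offset:offset + len(payload)] = payload
        cursor = max(cursor, offset + len(payload))
    if cursor < total:
        missing.append((cursor, total))  # tail never arrived
    return bytes(image), missing
```

On the server, the output of `reassemble()` could be written to disk along with the list of missing ranges, so whoever triages the dump knows which parts of the image are holes rather than real zeroed memory. Sending the whole image twice would be a cheap way to paper over loss without adding any receive path to the crashing side.
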

Vince
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"