[osol-code] Queries regarding crash dump .....

Gaurav Dhiman Sun, 29 Oct 2006 01:54:47 -0700

Hi,

I am studying the crash dump routines of OpenSolaris these days, mainly the flow of the panic() function.

Can someone let me know the answers to the following questions?

- What can be configured as a dump device?

- A swap partition on a local disk is OK; that is the default.

- A dedicated raw partition on a local disk is OK, but I don't know how to set it up.

- A partition across the network (I mean through NFS): is it possible? I went through the code of nfs_dump(), and when I make my kernel crash and follow the flow in KMDB, the flow goes as far as nfs_dump() ---> nd_init() and then returns with an error, so the dump is not written across the network. I figured out the cause of the error code. The first thing nd_init() does is check the version number in the vnode structure of the configured dump file. In my case that comes out as 0, whereas the only NFS versions supported for dumping are 2 and 3, as per the code in nd_init(). Can someone explain what I should do to get the dump across the network? One more thing: is it possible to have the dump on a normal partition, or only on a swap partition?
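To make the failure concrete, here is a minimal sketch of the version check as I understand it. It is not the actual nd_init() source: the struct, the function name and the EINVAL return value are all hypothetical placeholders, and only the "versions 2 and 3 are accepted, 0 is rejected" behaviour comes from what I saw.

```c
#include <errno.h>

/* Hypothetical stand-in for the state derived from the dump file's vnode. */
struct sketch_dump_target {
	int nfs_version;	/* 0 in my case: no usable NFS version */
};

#define	SKETCH_NFS_V2	2
#define	SKETCH_NFS_V3	3

/*
 * Hypothetical model of the first check done when network dumping is
 * initialised: return 0 if the version is supported, an error otherwise.
 * EINVAL is a placeholder; I am not claiming it is the real error code.
 */
int
sketch_nd_version_check(const struct sketch_dump_target *t)
{
	switch (t->nfs_version) {
	case SKETCH_NFS_V2:
	case SKETCH_NFS_V3:
		return (0);
	default:
		return (EINVAL);
	}
}
```

With nfs_version set to 0, as in my case, this returns an error before any dump I/O is attempted, which matches what I see in KMDB.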

- Can I print the minimum debugging messages, such as the processor state and the panic stack, on the local or serial console at the time of panicking? How is that possible, if at all?

- What is the purpose of putting the CPUs other than the one that panicked into an infinite loop? As far as I understand, this is done to keep the other CPUs busy so that they do not disturb the physical memory that the panic code is taking a snapshot of for the dump. I don't think we report the state of the other CPUs in the panic dump, do we? If we do, I could not find the code that dumps the other CPUs' state.

- If someone has already gone through the kernel panic and crash dump code, can you let me know what the low-level, arch-dependent APIs in the crash dumping code are? I could figure out a few of them: vpanic() (which dumps the panicking CPU's state onto the panic stack), panic_stop_cpus() (which sends IPIs and puts the other CPUs into an infinite loop; sending IPIs is arch-dependent), panic_savetrap() (I don't know what it does), and panic_saveregs() (which copies the processor state from the panic stack to the panic buffer).

- Why are we maintaining two separate buffers, panicbuf and dumpbuf? Right now my understanding is that panicbuf is only used to save the CPU state and the panic stack, whereas dumpbuf is used to save the system dump information, such as the kernel symbol table, the page table and the physical memory dump. Once these buffers are filled, they are sent to dumpvp_flush() to put the dump on disk or across the network. Why are the two buffers maintained separately? Is it because in the live dump case the CPU state and panic stack are not part of the dump, and we save only the system information?
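To show the mental model I have of the "fill the buffer, then flush" step, here is a small sketch. It is only my assumption of the general pattern, not the real dumpbuf or dumpvp_flush() code: all names, the tiny buffer size and the counters are hypothetical, and the flush just counts bytes instead of writing to a dump device.

```c
#include <stddef.h>
#include <string.h>

#define	SKETCH_BUF_SIZE	8	/* deliberately tiny, for illustration only */

/* Hypothetical staging buffer for dump data. */
struct sketch_buf {
	char	data[SKETCH_BUF_SIZE];
	size_t	used;		/* bytes currently staged */
	size_t	flushed_bytes;	/* total bytes "written" so far */
	unsigned flush_count;	/* number of flushes performed */
};

/* Stand-in for the flush step: account for the staged bytes and reset. */
void
sketch_flush(struct sketch_buf *b)
{
	if (b->used == 0)
		return;
	b->flushed_bytes += b->used;
	b->flush_count++;
	b->used = 0;
}

/* Copy data into the buffer, flushing every time it becomes full. */
void
sketch_write(struct sketch_buf *b, const void *src, size_t len)
{
	const char *p = src;

	while (len > 0) {
		size_t room = SKETCH_BUF_SIZE - b->used;
		size_t n = (len < room) ? len : room;

		memcpy(b->data + b->used, p, n);
		b->used += n;
		p += n;
		len -= n;
		if (b->used == SKETCH_BUF_SIZE)
			sketch_flush(b);
	}
}
```

Writing 10 bytes into a zero-initialised sketch_buf triggers one flush of 8 bytes and leaves 2 bytes staged until a final sketch_flush() call. My question is why the panic path needs two such buffers rather than one.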

I have asked a lot of questions in one go; can someone answer at least a few of them?

_______________________________________________
opensolaris-code mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code
