on 24/12/2012 20:17 Derek Kulinski said the following:
> Hello Andriy,
>
> Monday, December 24, 2012, 8:01:26 AM, you wrote:
>
>> on 24/12/2012 00:23 Derek Kulinski said the following:
>>> Dumping 3701 out of 8072
>>> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
>> So do you have the crash dump(s)?
>
> Yes, but they are 3.5GB each. I attached text dump to GNATS but I can
> resend it to you (I don't know if it's ok to send attachments to the
> mailing list). If you would prefer I could give you access to the
> box.

Derek,

I've looked through the cores, and in all cases some sort of memory
corruption appears to be the precursor to the subsequent crash.
I can't say definitively whether the corruption is caused by the hardware,
by some code overwriting random memory locations (a "rogue" driver), or by
a "simpler" bug like a use-after-free.
I am always inclined to suspect the hardware first.

You can try to reproduce the problem with some additional checks enabled in
the kernel. Those should catch the problem earlier and thus make its source
clearer. I recommend the following:

options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_MEMGUARD
makeoptions DEBUG+="-DDEBUG"

The last one is really needed only for the ZFS and OpenSolaris compat code.
It may result in some extra noise from unrelated subsystems. Perhaps you
could instead just add "#define DEBUG" to
sys/cddl/contrib/opensolaris/uts/common/sys/debug.h.
I haven't tested this approach though.

Also, please put vm.memguard.desc="arc_buf_hdr_t" into loader.conf.

Please note that these options will make your system significantly slower.

-- 
Andriy Gapon