owner-freebsd-hack...@freebsd.org wrote:
> On Thu, 6 May 2010, Boris Kochergin wrote:
> 
>> My experience with bad memory is that if it causes the machine to
>> crash, it won't always happen while the machine is running the same
>> process (or kernel thread)--so look for it crashing in a wide
>> variety of places--and upon inspection of the core dump, a pointer
>> somewhere will be pointing to garbage.
> ============
> 
> so really i'd need to collect two or more crash dumps, and if they
> point to different addresses then i can reasonably say the RAM is bad?
> 
> thanks...

Different addresses alone are not enough; what you are looking for is garbage
in many completely independent places: bad registers or return addresses
pulled off the stack, say, or garbage in random, unrelated buffers,
structures, and pointers. On the other hand, if the garbage keeps turning up
in one particular structure's "foo" pointer, that points to a problem (maybe
locking) in how your code sets that foo pointer. It's a subtle difference.
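
To make that distinction concrete, here is a rough sketch of how one might
tally what was found corrupted across several dumps. The function name, the
site labels, and the cut-offs are all made up for illustration; this is a
bookkeeping aid under those assumptions, not a diagnostic tool.

```python
from collections import Counter

def summarize_sites(sites):
    """Summarize where corruption was seen, one label per crash dump.

    `sites` is a list of hypothetical labels such as "foo_ptr" or
    "stack return address", written down by hand after inspecting each dump.
    """
    if len(sites) < 2:
        return "need more dumps"
    counts = Counter(sites)
    site, hits = counts.most_common(1)[0]
    if hits == len(sites):
        # Every dump shows the same thing corrupted: look at the code first.
        return f"always {site}: suspect the code that manages it"
    if len(counts) == len(sites):
        # No two dumps agree: consistent with (but not proof of) bad memory.
        return "all different: consistent with bad memory"
    return "mixed: collect more dumps"
```

For example, summarize_sites(["foo_ptr", "foo_ptr", "foo_ptr"]) lands in the
"suspect the code" case, while three unrelated labels land in the "bad memory"
case.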

It is also useful to check that the garbage itself differs from crash to
crash. As mentioned before, a single-bit error in an otherwise valid value, or
a missing or scrambled byte, is a good indication of a memory problem. If
random places are often overwritten with something else entirely, that could
just be another piece of misbehaving code writing somewhere it shouldn't. I've
often found code that writes a buffer into, e.g., a piece of memory it no
longer owns; it looks like memory corruption until you realize the garbage is
always something specific, like a vnode structure.
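
When you know (or can guess) what a value should have been, the comparison
above can be sketched as a small helper. This is only an illustration: the
function name, the 64-bit word width, and the category labels are assumptions,
and the sample values are invented rather than taken from a real dump.

```python
def classify_corruption(expected, observed, width=64):
    """Compare an expected word with the one found in the dump."""
    mask = (1 << width) - 1
    diff = (expected ^ observed) & mask   # bits that differ
    if diff == 0:
        return "intact"
    if diff & (diff - 1) == 0:
        # Exactly one differing bit: the classic bad-RAM signature.
        return "single-bit flip"
    lowest_byte = ((diff & -diff).bit_length() - 1) // 8
    highest_byte = (diff.bit_length() - 1) // 8
    if lowest_byte == highest_byte:
        # All differing bits sit inside one byte.
        return "single-byte damage"
    # Most of the word changed: more likely a stray write by other code.
    return "overwritten"
```

Calling classify_corruption(ptr, ptr ^ (1 << 17)) for any valid-looking ptr
falls in the single-bit case. A value replaced wholesale by unrelated data
falls in the "overwritten" case, which is where you would go on to check
whether the garbage always looks like the same kind of structure.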

/Andrew
