On Thursday 5 September 2024 10:08:08 BST Frank Steinmetzger wrote:
> On Wed, Sep 04, 2024 at 11:38:01PM +0100, Michael wrote:
> > Some MoBos are more tolerant than others.
> >
> > Regarding Dale's question, which has already been answered - yes, anything
> > the bad memory has touched is suspect of corruption. Without ECC RAM a
> > dodgy module can cause a lot of damage before it is discovered.
>
> Actually I was wondering: DDR5 has built-in ECC. But that’s not the same as
> the server-grade stuff, because it all happens inside the module with no
> communication to the CPU or the OS. So what is the point of it if it still
> causes errors like in Dale’s case?
>
> Maybe that it only catches 1-bit errors, but Dale has more broken bits?

Or it could be that Dale's kit is DDR4?

Either way, as you say, DDR5 is manufactured with On-Die ECC capable of correcting a single-bit error. This is necessary because DDR5 chip density has increased to the point where single-bit flips become unavoidable. It also allows manufacturers to ship chips which would otherwise fail the JEDEC specification.

On-Die ECC will only correct bit flips *within* the memory chip, and only a single bit at a time, so your guess is plausible: more than one flipped bit in the same protected block is beyond what it can fix. Conventional Side-Band ECC, with one additional chip dedicated to ECC, can also correct errors which occur while data is being moved by the memory controller between the memory module and the CPU/GPU. It does much more of the heavy lifting, and this is why ECC memory is slower.