Paul Schlie <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Paul Schlie writes:
>>> - yes, if you're willing to compute true CRC's as opposed to simpler
>>> checksums, which may be worth the price if in fact many/most data
>>> check failures are truly caused by single bit errors somewhere in the
>>> chain,
>>
>> FWIW, not one of the corrupted-data problems I've investigated has ever
>> looked like a single-bit error. So the theoretical basis for using a
>> CRC here seems pretty weak. I doubt we'd even consider automatic repair
>> attempts anyway.
>
> - although I accept that you may be correct in your assessment that most
> errors are in fact multi-bit;

I've seen bad memory in a SCSI controller cause single-bit errors in storage. It was quite confusing, since the symptom was syntax errors in the C code we were compiling on the server. The sysadmin actually caught it reliably corrupting a block of source text that was written out and read back.

I've also seen single-bit errors caused by bad memory in a network interface. *Twice*. That's particularly nasty since the checksum on TCP packets is only 16 bits, so a large enough FTP transfer would eventually finish despite the packet loss, but with the occasional bit flipped. In these days of SAN/NAS and SCSI over IP that's pretty scary...

Several cases on list have come down to "filesystem secretly replaces entire block of data with Folger's Crystals(tm) -- let's see if the database notices". Any checksum would help in that case, but I wouldn't discount single-bit errors either.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
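
For anyone who wants to see why a 16-bit additive checksum lets some corruption through, here is a minimal Python sketch. It is an illustration under stated assumptions, not anything from the PostgreSQL source: `internet_checksum` and `flip_bit` are hypothetical helper names, the sample buffer is made up, and the checksum covers only the payload (a real TCP checksum also covers the header and a pseudo-header). It shows that a single flipped bit changes the ones' complement sum, while two flips that cancel each other out do not, and that CRC-32 (via the standard `zlib.crc32`) catches both.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement sum over big-endian 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted (hypothetical helper)."""
    buf = bytearray(data)
    buf[byte_index] ^= 1 << bit
    return bytes(buf)


# Made-up sample payload: a scrap of C source text.
original = b"int main(void) { return 0; }"

# One flipped bit: 'i' (0x69) becomes 'h' (0x68).
one_flip = flip_bit(original, 0, 0)

# A second flip at the same bit position of the next 16-bit word, in the
# opposite direction: 't' (0x74) becomes 'u' (0x75). The first word drops by
# 0x0100 and the second rises by 0x0100, so the sum is unchanged.
two_flips = flip_bit(one_flip, 2, 0)

print("single flip caught by 16-bit sum:",
      internet_checksum(one_flip) != internet_checksum(original))
print("compensating flips caught by 16-bit sum:",
      internet_checksum(two_flips) != internet_checksum(original))
print("compensating flips caught by CRC-32:",
      zlib.crc32(two_flips) != zlib.crc32(original))
```

The compensating pair is contrived, but it is the kind of damage an additive sum cannot see. More generally, a randomly corrupted packet has roughly a 1 in 65536 chance of matching any 16-bit check value, which is why a long enough transfer over a faulty interface can complete with a few bad bits in it.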