Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> Actually, the real problem to me seems to be that to check the checksum
> when you read the page in, you need to look at the contents of the page
> and "assume" some of the values in there are correct, before you can
> even calculate the checksum. If the page really is corrupted, chances
> are the item pointers are going to be bogus, but you need to read them
> to calculate the checksum...
Hmm. You could verify the values closely enough to ensure you don't
crash while redoing the CRC calculation, which ought to be sufficient.
Still, I agree that the whole thing looks too Rube Goldbergian to count
as a reliability enhancer, which is what the point is after all.

> Double-buffering allows you to simply checksum the whole page, so
> creating a COMP_CRC32_WITH_COPY() macro would do it. Just allocate a
> block on the stack, copy/checksum it there, do the write() syscall and
> forget it.

I think the argument is about whether we increase our vulnerability to
torn-page problems if we just add a CRC and don't do anything else to
the overall writing process. Right now, a partial write on a
hint-bit-only update merely results in some hint bits getting lost
(as long as you discount the scenario where the disk fails to read a
partially-written sector at all --- maybe we're fooling ourselves to
ignore that?). With a CRC added, that suddenly becomes a
corrupted-page situation, and it's not easy to tell that no real harm
was done.

Again, the real bottom line here is whether there will be a *net* gain
in reliability. If a CRC adds too many false-positive reports of bad
data, it's not going to be a win.

			regards, tom lane
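
For concreteness, here is a minimal sketch in C of the copy-then-checksum
write path described in the quoted proposal. It is not PostgreSQL code and
not the proposed COMP_CRC32_WITH_COPY() macro: PAGE_SIZE, CRC_OFFSET,
crc32_buf(), page_copy_with_crc(), page_crc_ok() and write_page_with_crc()
are all hypothetical names, the CRC is assumed to live in the first four
bytes of the page, and a plain bitwise CRC-32 stands in for whatever CRC
would really be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE  8192   /* assumed block size */
#define CRC_OFFSET 0      /* hypothetical: CRC kept in the first 4 bytes */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); slow but short. */
static uint32_t crc32_buf(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Copy the shared page into a private buffer and stamp the CRC into the
 * copy.  The CRC field is zeroed before the calculation so that the
 * stored value does not feed into itself.
 */
static void page_copy_with_crc(unsigned char *dst,
                               const unsigned char *shared_page)
{
    uint32_t crc;

    assert(CRC_OFFSET + sizeof(crc) <= PAGE_SIZE);
    memcpy(dst, shared_page, PAGE_SIZE);
    memset(dst + CRC_OFFSET, 0, sizeof(crc));
    crc = crc32_buf(dst, PAGE_SIZE);
    memcpy(dst + CRC_OFFSET, &crc, sizeof(crc));
}

/* Recompute the CRC of a page image and compare with the stored value. */
static int page_crc_ok(const unsigned char *page)
{
    unsigned char tmp[PAGE_SIZE];
    uint32_t stored, actual;

    memcpy(tmp, page, PAGE_SIZE);
    memcpy(&stored, tmp + CRC_OFFSET, sizeof(stored));
    memset(tmp + CRC_OFFSET, 0, sizeof(stored));
    actual = crc32_buf(tmp, PAGE_SIZE);
    return stored == actual;
}

/* Checksum a stack copy and write that copy, never the shared buffer. */
static ssize_t write_page_with_crc(int fd, const unsigned char *shared_page)
{
    unsigned char copy[PAGE_SIZE];

    page_copy_with_crc(copy, shared_page);
    return write(fd, copy, PAGE_SIZE);
}
```

Because the CRC is computed over the private copy, a hint bit set in the
shared buffer after the memcpy() cannot make the written image disagree
with its own checksum. What the sketch does not address is the torn-page
case: if only part of the copy reaches disk, page_crc_ok() fails even when
the only difference is a lost hint bit, which is exactly the
false-positive concern raised above.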