On Wed, 2010-12-01 at 23:22 -0500, Robert Haas wrote:
> Well, let's think about what we'd need to do to make CRCs work
> reliably. There are two problems.
>
> 1. [...] If we CRC the entire page, the torn pages are never
> acceptable, so every action that modifies the page must be WAL-logged.
>
> 2. Currently, we allow hint bits on a page to be updated while holding
[...]

The way I see it, here are the rules we are breaking, and why:

* We don't get an exclusive lock when dirtying a page with hint bits
  - Why: we write while reading, and we want good concurrency.
  - Why': because after a bulk load, we don't have any hint bits, and
    the only way to get them set without VACUUM is to write while
    reading.

  I've never been entirely sure why VACUUM isn't good enough in this
  case, aside from the fact that a user might not run VACUUM (and
  autovacuum might not either, if it was only a bulk load and no
  updates/deletes).

* We don't WAL log setting hint bits (which dirties a page)
  - Why: because after a bulk load, we don't want to write the data a
    4th time

Hypothetically, if we had a bulk loading strategy, these problems would
go away, and we could follow the rules. Right? Is there a case other
than bulk loading which demands that we break these rules?

And, if we had a bulk loading path, we could probably get away with
writing the data only twice (today, we write it 3 times including the
hint bits) or maybe once if WAL archiving is off.

So, is there a case other than bulk loading for which we need to break
these rules? If not, perhaps we should consider bulk loading a different
problem, and simplify the design of all of these other features (and
allow new storage-touching features to come about, like CRCs, without
exponentially increasing the complexity with each one).

Regards,
	Jeff Davis

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
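
For readers following the torn-page concern in the quoted text, here is a toy sketch of why an un-WAL-logged hint-bit update conflicts with a whole-page CRC. It is not PostgreSQL code and does not reflect the real page layout: the 4-byte CRC prefix, the 512-byte sector size, the hint-bit offset, and every function name are assumptions made up for the illustration.

```python
import zlib

PAGE_SIZE = 8192     # assumption: a typical 8 kB block
SECTOR_SIZE = 512    # assumption: unit the disk writes atomically
CRC_BYTES = 4        # hypothetical layout: CRC stored in the first 4 bytes
HINT_OFFSET = 4096   # hypothetical position of one tuple's hint bits in the body


def seal(body: bytes) -> bytes:
    """Build a page: a 4-byte CRC followed by the body it covers."""
    return zlib.crc32(body).to_bytes(CRC_BYTES, "big") + body


def is_valid(page: bytes) -> bool:
    """Check the stored CRC against the body actually present."""
    stored = int.from_bytes(page[:CRC_BYTES], "big")
    return stored == zlib.crc32(page[CRC_BYTES:])


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first N sectors of `new` hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


# The page as it already exists on disk.
old_body = bytes(PAGE_SIZE - CRC_BYTES)
old_page = seal(old_body)

# A reader sets one hint bit; nothing about this change is WAL-logged.
new_body = bytearray(old_body)
new_body[HINT_OFFSET] |= 0x01
new_page = seal(bytes(new_body))

# Crash mid-write: the sector holding the new CRC lands, the sector
# holding the hint bit does not.
on_disk = torn_write(old_page, new_page, sectors_written=1)

print(is_valid(old_page), is_valid(new_page), is_valid(on_disk))
```

In this model both the old and the new page verify, but the torn mix of the two does not, and with no WAL record for the hint-bit change there is nothing to replay to repair it. That is the sense in which every page modification would have to be WAL-logged once the CRC covers the whole page.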