On Tue, May 24, 2011 at 11:52 PM, Bruce Momjian <br...@momjian.us> wrote:
> Robert Haas wrote:
>> 2. The other fairly obvious alternative is to adjust our existing WAL
>> record types to be idempotent - i.e. to not rely on the existing page
>> contents. For XLOG_HEAP_INSERT, we currently store the target tid and
>> the tuple contents. I'm not sure if there's anything else, but we
>> would obviously need the offset where the new tuple should be written,
>> which we currently infer from reading the existing page contents. For
>> XLOG_HEAP_DELETE, we store just the TID of the target tuple; we would
>> certainly need to store its offset within the block, and maybe the
>> infomask. For XLOG_HEAP_UPDATE, we'd need the old and new offsets and
>> perhaps also the old and new infomasks. Assuming that's all we need
>> and I'm not missing anything (which I won't bet on), that means we'd
>> be adding, say, 4 bytes per insert or delete and 8 bytes per update.
>> So, if checkpoints are spread out widely enough that there will be
>> more than ~2K operations per page between checkpoints, then it makes
>> more sense to just do a full page write and call it good. If not,
>> this idea might have legs.
>
> I vote for "wal_level = idempotent" because so few people will know what
> idempotent means. ;-)

That idea has the additional advantage of confusing the level of
detail of our WAL logging (minimal vs. archive vs. hot standby) with
the mechanism used to protect against torn pages (full page writes vs.
idempotent WAL records vs. prayer). When they set it wrong and
destroy their system, we can tell them it's their own fault for not
configuring the system properly! Bwahahahaha!

In all seriousness, I can't imagine that we'd make this
user-configurable in the first place, since that would amount to
having two sets of WAL records each of which would be even less well
tested than what we have now; and for a project this complex, we
probably shouldn't even consider changing things that seem to work now
unless the new system is clearly better than the old.

> Idempotent does seem like the most promising idea.

I tend to agree with you, but I'm worried it won't actually work out
to a win. By the time we augment the records with enough additional
information we may have eaten up a lot of the benefit we were hoping
to get.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
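
As a back-of-the-envelope check on the "~2K operations per page" figure in the quoted proposal, the break-even point can be sketched as below. This is only a rough model, not anything from the PostgreSQL source: it assumes the default 8192-byte block size, takes the 4 and 8 byte per-record estimates straight from the quoted text, and treats a full page write as costing exactly one page (ignoring record headers and any hole removal). The function and variable names are made up for illustration.

```python
# Rough break-even model for idempotent WAL records vs. a full page write.
# Assumptions (not from the PostgreSQL source): one full page image costs
# exactly BLCKSZ bytes, and the per-record overheads are the estimates
# quoted in the thread.

BLCKSZ = 8192  # PostgreSQL's default block size, in bytes

# Extra bytes per WAL record if records were made idempotent (thread estimates).
EXTRA_BYTES = {"insert": 4, "delete": 4, "update": 8}


def extra_wal_bytes(ops):
    """Total extra WAL bytes for a dict like {"insert": 100, "update": 50}."""
    return sum(EXTRA_BYTES[kind] * count for kind, count in ops.items())


def break_even_ops(kind, page_size=BLCKSZ):
    """Operations of one kind on a page whose extra bytes equal one page image."""
    return page_size // EXTRA_BYTES[kind]


def full_page_write_is_cheaper(ops, page_size=BLCKSZ):
    """True if the extra idempotent-record bytes exceed one full page image."""
    return extra_wal_bytes(ops) > page_size


if __name__ == "__main__":
    for kind in EXTRA_BYTES:
        print(kind, break_even_ops(kind))
```

Under these assumptions the insert/delete break-even is 8192 / 4 = 2048 operations per page between checkpoints, which matches the "~2K" in the quoted text; for a page that sees only updates it would be half that.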