On Thu, May 9, 2013 at 10:45 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 9 May 2013 22:39, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Simon Riggs <si...@2ndquadrant.com> writes:
>>> If the current WAL record is corrupt and the next WAL record is in
>>> every way valid, we can potentially continue.
>>
>> That seems like a seriously bad idea.
>
> I agree. But if you knew that were true, is stopping a better idea?

Having one corrupt record followed by a valid record is not an abnormal situation. It could easily be the correct end of WAL. I think it's not possible to protect 100% against this without giving up the checksum optimization, which would mean doing two fsyncs per commit instead of one.

However, it is possible to reduce the window. Every time the transaction log is synced, a separate file can be updated with a known minimum transaction log recovery point. Even if it's not synced consistently on every transaction commit or WAL sync, it would serve as a low water mark. Recovering to that point is not sufficient, but it is necessary, for a consistent recovery. That file could be synced lazily, say every 10s or so, which would guarantee that any WAL corruption is caught except in the last 10s of WAL traffic.

If you're only interested in database consistency and not lost commits, then that file could be synced on buffer xlog flushes (making a painful case even more painful). Off the top of my head, that would be sufficient to guarantee that a corrupt xlog that would create an inconsistent database would not be missed. I may be missing cases involving checkpoints or the like, though.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
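
To make the low-water-mark idea above a bit more concrete, here is a toy sketch in Python. It is not PostgreSQL code: the Record type, the recover() function and the integer LSNs are all made up for illustration, and the low water mark is simply passed in as a number rather than read from a lazily synced side file. The only point is the decision rule: a bad checksum past the mark is treated as the end of WAL, while a bad checksum (or a log that simply stops) at or before the mark is reported as corruption.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical WAL record: a position, a payload, and a stored checksum."""
    lsn: int
    payload: bytes
    crc: int


class CorruptWAL(Exception):
    """Raised when the log is damaged before the known-synced point."""


def make_record(lsn, payload):
    return Record(lsn, payload, zlib.crc32(payload))


def is_valid(rec):
    return zlib.crc32(rec.payload) == rec.crc


def recover(records, low_water_mark):
    """Replay records in order and return the LSN of the last one replayed.

    low_water_mark stands in for the value stored in the side file:
    every record with lsn <= low_water_mark is known to have been synced.
    """
    last_replayed = 0
    for rec in records:
        if is_valid(rec):
            last_replayed = rec.lsn
            continue
        if rec.lsn <= low_water_mark:
            # This record was known to be on disk, so a bad checksum
            # here cannot be the normal end of WAL.
            raise CorruptWAL(f"record {rec.lsn} is corrupt")
        # Past the mark, a bad checksum is assumed to be the end of WAL.
        break
    if last_replayed < low_water_mark:
        # The log ended before the point we know was synced.
        raise CorruptWAL("WAL ends before the low water mark")
    return last_replayed
```

With five records where the fourth has a bad checksum, recover(log, 3) stops quietly after record 3, while recover(log, 4) raises CorruptWAL. Anything written after the mark was last synced is still unprotected, which is the remaining window mentioned above.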