Here's v2. Jakub Wartak pointed out to me off-list that this broke the case where a chain of incrementals crosses a timeline switch. That made me realize that I also need to add the WAL level to XLOG_END_OF_RECOVERY, so this version does that.
I also forgot to mention that this patch changes behavior in the case where you've been running with summarize_wal=off for a while and then turn it on. Previously, we'd start summarizing from the oldest WAL record we could still read from pg_wal. Now, we'll start summarizing from the first checkpoint (or timeline switch) after that. That's necessary, because when we read the oldest record available, we can't know for sure what WAL level was used to generate it, so we have to assume the worst case, i.e. minimal, and thus skip summarizing that WAL. However, it's also harmless, because a WAL summary that covers part of a checkpoint cycle is useless to us anyway. We need complete WAL summaries from the start of the prior backup to the start of the current one to be able to do an incremental backup, and the previous backup and the current backup must have each started with a checkpoint, so a summary covering part of a checkpoint cycle can never make an incremental backup possible where it would not otherwise have been possible.

One more thing I forgot to mention is that we can't fix this problem by making summarize_wal PGC_POSTMASTER. That doesn't work because of what is mentioned in the previous paragraph: when summarize_wal is turned on, the summarizer goes back and tries to summarize any older WAL that is still around, so we need this infrastructure to know whether or not that older WAL is safe to summarize. And I don't think we can remove the behavior where we back up and try to summarize old WAL, either, because then after a crash you'd always have a gap in your summary files and you would have to take a new full backup afterwards, which would suck.
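To make the start-point rule above concrete, here is a minimal Python sketch. It is only an illustration of the reasoning, not the patch's C code: the Record type, the record-kind constants, and summarizable_lsns() are hypothetical names, and it assumes that only checkpoint and end-of-recovery records tell us the WAL level.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record kinds; real WAL has many more record types.
CHECKPOINT = "checkpoint"
END_OF_RECOVERY = "end_of_recovery"  # stands in for a timeline switch
OTHER = "other"


@dataclass
class Record:
    lsn: int
    kind: str
    # Assumption: only checkpoint / end-of-recovery records carry the level.
    wal_level: Optional[str] = None


def summarizable_lsns(records: List[Record]) -> List[int]:
    """Return the LSNs of records that would be summarized.

    Until a checkpoint or end-of-recovery record is seen, the WAL level
    is unknown, so the worst case (minimal) is assumed and the WAL is
    skipped.
    """
    known_level: Optional[str] = None  # None means "unknown, assume minimal"
    result: List[int] = []
    for rec in records:
        if rec.kind in (CHECKPOINT, END_OF_RECOVERY):
            known_level = rec.wal_level
        if known_level is not None and known_level != "minimal":
            result.append(rec.lsn)
    return result
```

In this model, records that precede the first checkpoint are never summarized, which matches the point that a summary covering only part of a checkpoint cycle could not have enabled an incremental backup anyway.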
I continue to think that a lot of the value of this feature is in making sure that it *always* works -- when you start to add cases where full backups are required, this becomes a lot less useful to the target audience for the feature, namely, people whose databases are so large that full backups take an unreasonably long time to complete.

...Robert
v2-0001-Do-not-summarize-WAL-if-generated-with-wal_level-.patch