On 12/21/23 07:37, Andres Freund wrote:
> On 2023-12-20 13:11:37 -0400, David Steele wrote:
>> I've run this through a bunch of scenarios (in my head) with parallel
>> backups and it does seem to hold up.
>>
>> I think we'd need to write the state file before XLOG_BACKUP_START just in
>> case. Seems better to have an extra state file rather than have one be
>> missing.
>
> That'd very significantly weaken the approach, afaict, because "external"
> base backup could end up copying those files. The whole point is to detect
> broken procedures, so relying on such files being excluded from the base
> backup seems like a bad idea.
>
> I also see no need to do so - because we'd only verify that a backup start has
> been replayed when replaying XLOG_BACKUP_STOP, there's no danger in not
> creating the files during XLOG_BACKUP_START, but doing so just before logging
> the XLOG_BACKUP_STOP.

Ugh, I meant XLOG_BACKUP_STOP. So sounds like we are on the same page.

>> Probably we'd want to exclude *all* state files from backups, though.
>
> I don't think so - I think we want the opposite? As noted above, I think in a
> safety net like this we shouldn't assume that backup procedures were followed
> correctly.

Fair enough.

>> Seems like in various PITR scenarios it could be hard to determine when to
>> remove them.
>
> Why? I think we can basically remove the files when:
>
> a) after the checkpoint during which XLOG_BACKUP_STOP was replayed - I think
>    we already have the infrastructure to queue file deletions that we can hook
>    into
>
> b) when replaying a shutdown checkpoint / after creation of a shutdown
>    checkpoint

I thought about this some more. I *think* any state files a backup can
see would have to be for XLOG_BACKUP_STOP records generated during the
backup and they would get removed before the cluster had recovered to
consistency.
I'd still prefer to exclude state files from the backup, but I agree
there is no actual need to do so.
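
For illustration only, the two removal rules quoted above (a and b) could be
modeled roughly as follows. This is a toy in-memory Python sketch, not
PostgreSQL code: the class, its methods, and the backup ids are all made up
for the example, and it assumes that a deletion queued while replaying
XLOG_BACKUP_STOP is carried out when the next checkpoint completes.

```python
class BackupStateTracker:
    """Toy model of state-file lifetime. All names here are hypothetical."""

    def __init__(self):
        # Backup ids that currently have a state file.
        self.state_files = set()
        # Deletions queued until the current checkpoint cycle completes.
        self.pending_unlink = set()

    def create_state_file(self, backup_id):
        # Per the thread: created just before XLOG_BACKUP_STOP is logged,
        # not at XLOG_BACKUP_START.
        self.state_files.add(backup_id)

    def replay_backup_stop(self, backup_id):
        # Rule (a): do not delete immediately, only queue the deletion.
        if backup_id in self.state_files:
            self.pending_unlink.add(backup_id)

    def checkpoint(self, shutdown=False):
        if shutdown:
            # Rule (b): a shutdown checkpoint removes every state file.
            self.state_files.clear()
        else:
            # Rule (a): carry out the deletions queued during this cycle.
            self.state_files -= self.pending_unlink
        self.pending_unlink.clear()


tracker = BackupStateTracker()
tracker.create_state_file("backup-a")
tracker.create_state_file("backup-b")

tracker.replay_backup_stop("backup-a")   # queued, file still present
tracker.checkpoint()                     # "backup-a" removed here
tracker.checkpoint(shutdown=True)        # everything else removed here
```

In this model a state file survives exactly from its creation until the
checkpoint that follows replay of its XLOG_BACKUP_STOP, or until a shutdown
checkpoint, whichever comes first.
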
Regards,
-David