Re: Detecting some cases of missing backup_label

David Steele Thu, 21 Dec 2023 04:26:52 -0800

On 12/21/23 07:37, Andres Freund wrote:


On 2023-12-20 13:11:37 -0400, David Steele wrote:

I've run this through a bunch of scenarios (in my head) with parallel
backups and it does seem to hold up.

I think we'd need to write the state file before XLOG_BACKUP_START just in
case. Seems better to have an extra state file rather than have one be
missing.


That'd very significantly weaken the approach, afaict, because "external" base
backups could end up copying those files. The whole point is to detect
broken procedures, so relying on such files being excluded from the base
backup seems like a bad idea.

I also see no need to do so - because we'd only verify that a backup start has
been replayed when replaying XLOG_BACKUP_STOP, there's no danger in not
creating the files during XLOG_BACKUP_START and instead creating them just
before logging the XLOG_BACKUP_STOP.
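
As a rough illustration of the check described above, here is a toy Python model, not PostgreSQL code. The record names come from this thread; the WalRecord type, the backup_id field, and the in-memory set (standing in for whatever on-disk state is eventually used) are assumptions made purely for the sketch.

```python
# Toy model of the replay-time check; not PostgreSQL code.
# Record names follow the thread; the backup_id field is an assumption.
from dataclasses import dataclass


class MissingBackupStartError(Exception):
    """Raised when a backup stop is replayed without its matching start."""


@dataclass(frozen=True)
class WalRecord:
    kind: str            # "XLOG_BACKUP_START", "XLOG_BACKUP_STOP", or other
    backup_id: str = ""  # hypothetical identifier linking start and stop


def replay(records):
    """Replay records in order; return the ids of backups verified at stop."""
    started = set()  # backup starts seen so far during this recovery
    verified = []
    for rec in records:
        if rec.kind == "XLOG_BACKUP_START":
            started.add(rec.backup_id)
        elif rec.kind == "XLOG_BACKUP_STOP":
            # The only point at which anything is verified.
            if rec.backup_id not in started:
                raise MissingBackupStartError(rec.backup_id)
            started.discard(rec.backup_id)
            verified.append(rec.backup_id)
    return verified
```

In this model, a recovery that begins after the start record (as could happen without backup_label) raises as soon as the stop record is replayed, while interleaved parallel backups pass as long as each stop has seen its own start.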


Ugh, I meant XLOG_BACKUP_STOP. So sounds like we are on the same page.

Probably we'd want to exclude *all* state files from backups, though.


I don't think so - I think we want the opposite? As noted above, I think in a
safety net like this we shouldn't assume that backup procedures were followed
correctly.


Fair enough.

Seems like in various PITR scenarios it could be hard to determine when to
remove them.


Why? I think we can basically remove the files when:

a) after the checkpoint during which XLOG_BACKUP_STOP was replayed - I think
    we already have the infrastructure to queue file deletions that we can hook
    into
b) when replaying a shutdown checkpoint / after creation of a shutdown
    checkpoint
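
The two removal rules above could be modelled roughly as follows. This is a toy Python sketch, not PostgreSQL code: the StateFiles class, its method names, and the use of backup ids as file names are all hypothetical, and the "queue" is just a set standing in for the existing deferred-deletion infrastructure mentioned in (a).

```python
# Toy model of the state-file removal rules; not PostgreSQL code.
class StateFiles:
    def __init__(self):
        self.present = set()  # state files currently "on disk"
        self.pending = set()  # files queued for deletion at next checkpoint

    def create(self, backup_id):
        """Record that a state file exists for this backup."""
        self.present.add(backup_id)

    def replay_backup_stop(self, backup_id):
        """Rule (a): queue the file; it goes away once the checkpoint ends."""
        if backup_id in self.present:
            self.pending.add(backup_id)

    def checkpoint_done(self, shutdown=False):
        """Delete queued files, or every file for a shutdown checkpoint."""
        # Rule (b): a shutdown checkpoint removes all state files.
        removed = set(self.present) if shutdown else set(self.pending)
        self.present -= removed
        self.pending -= removed
        return removed
```

For example, with files for backups "a" and "b" present, replaying the stop for "a" followed by an ordinary checkpoint removes only "a"; a later shutdown checkpoint removes "b" as well.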

I thought about this some more. I *think* any state files a backup can
see would have to be for XLOG_BACKUP_STOP records generated during the
backup and they would get removed before the cluster had recovered to
consistency.

I'd still prefer to exclude state files from the backup, but I agree
there is no actual need to do so.


Regards,
-David
