On 11/30/21 19:54, Michael Paquier wrote:
> On Tue, Nov 30, 2021 at 05:58:15PM -0500, David Steele wrote:
>> I did figure out how to keep the safe part of exclusive backup (not having
>> to maintain a connection) while removing the dangerous part (writing
>> backup_label into PGDATA), but it was a substantial amount of work and I
>> felt that it had little chance of being committed.
>
> Which was, I guess, done by storing the backup_label contents within a
> file different than backup_label, still maintained in the main data
> folder to ensure that it gets included in the backup?

That, or emit it from pg_start_backup() so the user can write it wherever they please. That would include writing it into PGDATA if they really wanted to, but that would be on them and the default behavior would be safe. The problem with this is that if the user does not rename/supply backup_label on restore, they will get corruption and not know it.

Here's another idea. Since the contents of pg_wal are not supposed to be copied, we could add a file there to indicate that the cluster should remove backup_label on restart. Our instructions also say to remove the contents of pg_wal on restore if they were originally copied, so hopefully one of the two would happen. But, again, if they fail to follow the directions it would lead to corruption.

Order would be important here. When starting the backup the proper order would be to write pg_wal/backup_in_progress and then backup_label. When stopping the backup they would be removed in the reverse order.
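
To make the ordering concrete, here is a minimal sketch in Python rather than server code. The function names and the `_write_durably` helper are hypothetical and only illustrate the sequence; a real implementation would also need to fsync the containing directories.

```python
import os
from pathlib import Path

# File names as used in the proposal; everything else here is illustrative.
MARKER = Path("pg_wal") / "backup_in_progress"
LABEL = Path("backup_label")


def _write_durably(path: Path, data: bytes) -> None:
    """Hypothetical helper: write a file and fsync it before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def start_exclusive_backup(pgdata, label_contents: bytes) -> None:
    """Write the marker first, then backup_label."""
    pgdata = Path(pgdata)
    _write_durably(pgdata / MARKER, b"")
    _write_durably(pgdata / LABEL, label_contents)


def stop_exclusive_backup(pgdata) -> None:
    """Remove in the reverse order: backup_label first, then the marker."""
    pgdata = Path(pgdata)
    (pgdata / LABEL).unlink()
    (pgdata / MARKER).unlink()
```

With this ordering, a crash between the two steps never leaves backup_label alone in a live data directory: on start the marker is already there, and on stop the marker is the last file to go.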

On a restart, if both are present, then delete both in the correct order (backup_label first) and start crash recovery using the info in pg_control. If only backup_label is present, then go into recovery using the info from backup_label.

It's possible for pg_wal/backup_in_progress to be present by itself if the server crashes after deleting backup_label but before deleting pg_wal/backup_in_progress. In that case the server should simply remove it on start and go into crash recovery using the info from pg_control.
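
The startup cases above can be summarized as a small decision function. This is only a sketch in Python, not server code: the function name and the returned strings are hypothetical, and the case where neither file exists is not covered by the proposal, so the sketch assumes it means an ordinary start from pg_control.

```python
from pathlib import Path


def choose_startup_action(pgdata) -> str:
    """Decide how to start based on which of the two files are present."""
    pgdata = Path(pgdata)
    label = pgdata / "backup_label"
    marker = pgdata / "pg_wal" / "backup_in_progress"

    if marker.exists():
        # The marker means this is the original cluster, not a restored copy.
        # Remove backup_label (if it is still there) before the marker, so a
        # crash between the two unlinks leaves only the marker behind.
        if label.exists():
            label.unlink()
        marker.unlink()
        return "crash_recovery_from_pg_control"

    if label.exists():
        # backup_label without the marker: treat as a restored backup.
        return "recovery_from_backup_label"

    # Assumption: neither file present means an ordinary start.
    return "startup_from_pg_control"
```

Note that the marker-only case falls out of the first branch without special handling, which is what makes the delete order safe.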

The advantage of this idea is that it does not change the current instructions as far as I can see. If the user is already following them, they'll be fine. If they are not, then they'll need to start doing so.

Of course, none of this affects users who are using non-exclusive backup, which I do hope covers the majority by now.

Thoughts?

Regards,
-David

