On Fri, May 17, 2024 at 12:46:49PM +1000, David Steele wrote: > This is greatly simplified implementation of the patch proposed in [1] and > hopefully it addresses the concerns expressed there. Since the > implementation is quite different it seemed like a new thread was > appropriate, especially since the old thread title would be very misleading > regarding the new functionality.
- /* No backup_label file has been found if we are here. */ + /* + * No backup_label file has been found if we are here. Error if the + * control file requires backup_label. + */ + if (ControlFile->backupLabelRequired) + ereport(FATAL, + (errmsg("could not find backup_label required for recovery"), + errhint("backup_label must be present for recovery to succeed"))); I thought that this had some similarities with my last fight in this area, where xlogrecovery.c would fail hard if there was a backup_label file but no signal files, but nope that's not the case: https://www.postgresql.org/message-id/flat/ZArVOMifjzE7f8W7%40paquier.xyz > The basic idea is to harden recovery by returning a copy of pg_control from > pg_backup_stop() that has a flag set to prevent recovery if the backup_label > file is missing. Instead of backup software copying pg_control from PGDATA, > it stores an updated version that is returned from pg_backup_stop(). Hmm, okay. There is also a slight impact for BASE_BACKUP, requiring basebackup_progress_wait_wal_archive() and do_pg_backup_stop() to be called earlier when sending the main data directory which is the last one in the list of tablespaces. As far as I can see, this does not change the logic because do_pg_backup_stop() does not touch the control file, but it seems to me that we'd rather keep these two calls as they are now, and send the control file once we are out of the for loop that processes all the tablespaces. That seems slightly cleaner to me, still I agree that both are the same things. Anyway, finishing do_pg_backup_stop() and then sending the control file is just a consequence of the implementation choice driven by the output required for the SQL function so as this is stored in the backup state to get it down to pg_backup_stop() in xlogfuncs.c, so we could also take one step back and forget about the SQL function, setting only the new flag when sending a BASE_BACKUP, or just not use the backupState to store this data as the exercise involves forcing one boolean and recalculate a CRC32. > * We don't need to worry about backup software seeing a torn copy of > pg_control, since Postgres can safely read it out of memory and provide a > valid copy via pg_backup_stop(). This solves torn reads without needing to > write pg_control via a temp file, which may affect performance on a standby. We're talking about a 8kB file which has a size of 512B (PG_CONTROL_MAX_SAFE_SIZE) to avoid such issues. So I'm not sure to see your point here? > * For backup from standby, we no longer need to instruct the backup software > to copy pg_control last. In fact the backup software should not copy > pg_control from PGDATA at all. Yep. Not from PGDATA, but from the function. > These changes have no impact on current backup software and they are free to > use the pg_control available from pg_stop_backup() or continue to use > pg_control from PGDATA. Of course they will miss the benefits of getting a > consistent copy of pg_control and the backup_label checking, but will be no > worse off than before. There is a large comment block in do_pg_backup_stop() around backup_stopped_in_recovery. Perhaps it should be improved based on this patch. The main concern that I have over this patch is: who is actually going to use this extension of the SQL stop function? Perhaps existing backup solutions are good enough risk vs reward is not worth it? The label_file and the tablespace map are text, this is a series of bytes that has no visibility for the end-user unless checked on the client-side. This adds a new step where backups would need to copy the control file to the data folder. -- Michael
signature.asc
Description: PGP signature