On Sun, Nov 05, 2023 at 01:45:39PM -0400, David Steele wrote: > Rebased on 151ffcf6.
I like this patch a lot. Even if the backup_label file is removed, we still have all the debug information from the backup history file, thanks to its LABEL, BACKUP METHOD and BACKUP FROM, so no information is lost. It does a 1:1 replacement of the contents parsed from the backup_label needed by recovery by fetching them from the control file. Sounds like a straight-forward change to me. The patch is failing the recovery test 039_end_of_wal.pl. Could you look at the failure? /* Build and save the contents of the backup history file */ - history_file = build_backup_content(state, true); + history_file = build_backup_content(state); build_backup_content() sounds like an incorrect name if it is a routine onlyused to build the contents of backup history files. Why is there nothing updated in src/bin/pg_controldata/? + /* Clear fields used to initialize recovery */ + ControlFile->backupCheckPoint = InvalidXLogRecPtr; + ControlFile->backupStartPointTLI = 0; + ControlFile->backupRecoveryRequired = false; + ControlFile->backupFromStandby = false; These variables in the control file are cleaned up when the backup_label file was read previously, but backup_label is renamed to backup_label.old a bit later than that. Your logic looks correct seen from here, but shouldn't these variables be set much later, aka just *after* UpdateControlFile(). This gap between the initialization of the control file and the in-memory reset makes the code quite brittle, IMO. - basebackup_progress_wait_wal_archive(&state); - do_pg_backup_stop(backup_state, !opt->nowait); Why is that moved? - The backup label - file includes the label string you gave to <function>pg_backup_start</function>, - as well as the time at which <function>pg_backup_start</function> was run, and - the name of the starting WAL file. In case of confusion it is therefore - possible to look inside a backup file and determine exactly which - backup session the dump file came from. The tablespace map file includes + The tablespace map file includes It may be worth mentioning that the backup history file holds this information on the primary's pg_wal, as well. The changes in sendFileWithContent() may be worth a patch of its own. --- a/src/include/catalog/pg_control.h +++ b/src/include/catalog/pg_control.h @@ -146,6 +146,9 @@ typedef struct ControlFileData @@ -160,14 +163,25 @@ typedef struct ControlFileData XLogRecPtr minRecoveryPoint; TimeLineID minRecoveryPointTLI; + XLogRecPtr backupCheckPoint; XLogRecPtr backupStartPoint; + TimeLineID backupStartPointTLI; XLogRecPtr backupEndPoint; + bool backupRecoveryRequired; + bool backupFromStandby; This increases the size of the control file from 296B to 312B with an 8-byte alignment, as far as I can see. The size of the control file has been always a sensitive subject especially with the hard limit of PG_CONTROL_MAX_SAFE_SIZE. Well, the point of this patch is that this is the price to pay to prevent users from doing something stupid with a removal of a backup_label when they should not. Do others have an opinion about this increase in size? Actually, grouping backupStartPointTLI and minRecoveryPointTLI should reduce more the size with some alignment magic, no? - /* - * BACKUP METHOD lets us know if this was a typical backup ("streamed", - * which could mean either pg_basebackup or the pg_backup_start/stop - * method was used) or if this label came from somewhere else (the only - * other option today being from pg_rewind). If this was a streamed - * backup then we know that we need to play through until we get to the - * end of the WAL which was generated during the backup (at which point we - * will have reached consistency and backupEndRequired will be reset to be - * false). - */ - if (fscanf(lfp, "BACKUP METHOD: %19s\n", backuptype) == 1) - { - if (strcmp(backuptype, "streamed") == 0) - *backupEndRequired = true; - } backupRecoveryRequired in the control file is switched to false for pg_rewind and true for streamed backups. My gut feeling is telling me that this should be OK, as out-of-core tools would need an upgrade if they relied on the backend_label file anyway. I can see that this change makes use lose some documentation, unfortunately. Shouldn't these removed lines be moved to pg_control.h instead for the description of backupEndRequired? doc/src/sgml/ref/pg_rewind.sgml and src/backend/access/transam/xlogrecovery.c still include references to the backup_label file. -- Michael
signature.asc
Description: PGP signature