I talked to Tom on the phone today, and I think we have a procedure for doing backup/restore in a fairly foolproof way.
As outlined below, we need to record the start/stop and checkpoint WAL file names and offsets, and somehow pass those on to restore. I think any system that requires users to link those values together is going to cause confusion and be error-prone. My idea is to do much of this automatically.

First, create a server-side function called pitr_backup_start() which creates a file in the /data directory containing the WAL filenames/offsets for the last checkpoint and the start. Then do the backup of the data directory. Then call pitr_backup_stop(), which adds the stop filename/offset to the file, and archive that file in the same place as the WAL files.

To restore, you untar the backup of /data. The recovering backend then reads the file created by pitr_backup_start() to find the name of the backup parameter file. It recovers that file from the archive location and uses the start/stop/checkpoint filename/offset information to do the restore.

The advantage of this is that the tar backup contains everything needed to find the proper parameter file for restore. Ideally we could get all the parameters into the tar backup, but that isn't possible because we can't push the stop counters into the backup after the backup has completed.

I recommend the pitr_backup_start() file be named for the current WAL filename/offset, perhaps 000000000000032c.3da390.backup or something like that. The file would be a simple text file in pg_xlog/archive_status:

	# start 2004-07-14 21:35:22.324579-04
	wal_checkpoint = 0000000000000319.021233
	wal_start = 000000000000032c.92a9cb

	...added after backup completes...

	wal_stop = 000000000000034a.3db030
	# stop 2004-07-14 21:32:22.0923213-04

The timestamps are for documentation only. These files give admins looking in the archive directory information on backup times.

(As an idea, there is no need for the user to specify a recovery mode. If the postmaster starts and sees the pitr_backup_start() file in /data, it can go into recovery mode automatically.
If the archiver can't find the file in the archive location, it can assume it is just being started after a power failure. However, if it finds the file in the archive location, it can assume it is to enter recovery mode.

There is a race condition: a crash during the copy of the file to the archive location would be a problem. The solution would be to create a special flag file before copying the file to the archive, then archive it and remove the flag file. If the postmaster starts up and sees the pitr_backup_start() file both in /data and in the archive location, and it doesn't see the flag file, it then knows it is doing a restore, because the flag file would never appear in a backup. Anyway, this is just an idea.)

---------------------------------------------------------------------------

Simon Riggs wrote:
> On Wed, 2004-07-14 at 10:57, Zeugswetter Andreas SB SD wrote:
> > > The recovery mechanism doesn't rely upon you knowing 1 or 3. The
> > > recovery reads pg_control (from the backup) and then attempts to
> > > de-archive the appropriate xlog segment file and then starts
> > > rollforward
> >
> > Unfortunately this only works if pg_control was the first file to be
> > backed up (or by chance no checkpoint happened after backup start and
> > pg_control backup)
> >
> > Other db's have commands for:
> > start/end external backup
>
> OK...this idea has come up a few times. Here's my take:
>
> - OS and hardware facilities exist now to make instant copies of sets of
> files. Some of these are open source, others not. If you use these, you
> have no requirement for this functionality....but these alone are no
> replacement for archive recovery.... I accept that some people may not
> wish to go to the expense or effort to use those options, but in my mind
> these are the people that will not be using archive_mode anyway.
>
> - all we would really need to do is to stop the bgwriter from doing
> anything during backup. pgcontrol is only updated at checkpoint.
> The current xlog is updated constantly, but this need not be copied because
> we are already archiving it as soon as its full. That leaves the
> bgwriter, which is now responsible for both lazy writing and
> checkpoints.
>
> So, put a switch into bgwriter to halt for a period, then turn it back
> on at the end. Could be a SIGHUP GUC...or...
>
> ...and with my greatest respects....
>
> - please could somebody else code that?... my time is limited
>
> Best regards, Simon Riggs

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
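P.S. Coming back to the flag-file idea above, the postmaster's startup-mode decision can be sketched as a small decision table. This is a hypothetical Python illustration only (the real check would be C code in the postmaster, and the mode names are mine):

```python
# Sketch of the automatic startup-mode decision described above.
# The boolean arguments stand in for actual file-existence checks.

def startup_mode(backup_file_in_data, backup_file_in_archive, flag_file_exists):
    """Decide how the postmaster should start under the flag-file scheme."""
    if not backup_file_in_data:
        return "normal"            # no pitr_backup_start() file at all
    if not backup_file_in_archive:
        return "crash-recovery"    # plain restart after a power failure
    if flag_file_exists:
        return "crash-recovery"    # crashed mid-copy to the archive
    return "archive-recovery"      # file in both places, no flag: a restore
```

The key property is the last branch: the flag file never appears in a backup, so seeing the parameter file in both /data and the archive without the flag file can only mean a restore is in progress.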