Greetings, * Lucas (r...@sud0.nz) wrote: > On 26/10/2021, at 6:13 AM, Stephen Frost <sfr...@snowman.net> wrote: > > * Mladen Gogala (gogala.mla...@gmail.com) wrote: > >> On 10/23/21 23:12, Lucas wrote: > >>> This has proven to work very well for me. I had to restore a few backups > >>> already and it always worked. The bad part is that I need to stop the > >>> database before performing the Snapshot, for data integrity, so that means > >>> that I have a hot-standby server only for these snapshots. > >>> Lucas > >> > >> Actually, you don't need to stop the database. You need to execute > >> pg_start_backup() before taking a snapshot and then pg_stop_backup() when > >> the snapshot is done. You will need to recover the database when you finish > >> the restore but you will not lose any data. I know that pg_begin_backup() > >> and pg_stop_backup() are deprecated but since PostgreSQL doesn't have any > >> API for storage or file system snapshots, that's the only thing that can > >> help you use storage snapshots as backups. To my knowledge,the only > >> database > >> that does have API for storage snapshots is DB2. The API is called > >> "Advanced > >> Copy Services" or ACS. It's documented here: > >> > >> https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs > >> > >> For Postgres, the old begin/stop backup functions should be sufficient. > > > > No, it's not- you must also be sure to archive any WAL that's generated > > between the pg_start_backup and pg_stop_backup and then to be sure and > > add into the snapshot the appropriate signal files or recovery.conf, > > depending on PG version, to indicate that you're restoring from a backup > > and make sure that the WAL is made available via restore_command. > > > > Just doing stat/stop backup is *not* enough and you run the risk of > > having an invalid backup or corruption when you restore. > > > > If the entire system is on a single volume then you could possibly just > > take a snapshot of it (without any start/stop backup stuff) but it's > > very risky to do that and then try to do PITR with it because we don't > > know where consistency is reached in such a case (we *must* play all the > > way through to the end of the WAL which existed at the time of the > > snapshot in order to reach consistency). > > > > In the end though, really, it's much, much, much better to use a proper > > backup and archiving tool that's written specifically for PG than to try > > and roll your own, using snapshots or not. > > When I create a snapshot, the script gets the latest WAL file applied from > [1] and adds that information to the Snapshot Tags in AWS. I then use that > information in the future when restoring the snapshot. The script will read > the tag and it will download 50 WAL Files before that and all the WAL files > after that required. > The WAL files are being backed up to S3. > > I had to restore the database to a PITR state many times, and it always > worked very well. > > I also create slaves using the snapshot method. So, I don’t mind having to > stop/start the Database for the snapshot process, as it’s proven to work fine > for the last 5 years.
I have to say that the process used here isn't terribly clear to me (you cleanly shut down the database ... and also copy the WAL files?), so I don't really want to comment on if it's actually correct or not because I can't say one way or the other if it is or isn't. I do want to again stress that I don't recommend writing your own tools for doing backup/restore/PITR and I would caution people against people trying to use this approach you've suggested. Also, being able to tell when such a process *doesn't* work is non-trivial (look at how long it took us to discover the issues around fsync..), so saying that it seems to have worked for a long time for you isn't really enough to make me feel comfortable with it. Thanks, Stephen
signature.asc
Description: PGP signature