Greetings,

* Lucas (r...@sud0.nz) wrote:
> On 26/10/2021, at 6:13 AM, Stephen Frost <sfr...@snowman.net> wrote:
> > * Mladen Gogala (gogala.mla...@gmail.com) wrote:
> >> On 10/23/21 23:12, Lucas wrote:
> >>> This has proven to work very well for me. I had to restore a few backups
> >>> already and it always worked. The bad part is that I need to stop the
> >>> database before performing the Snapshot, for data integrity, so that means
> >>> that I have a hot-standby server only for these snapshots.
> >>> Lucas
> >> 
> >> Actually, you don't need to stop the database. You need to execute
> >> pg_start_backup() before taking a snapshot and then pg_stop_backup() when
> >> the snapshot is done. You will need to recover the database when you finish
> >> the restore but you will not lose any data. I know that pg_begin_backup()
> >> and pg_stop_backup() are deprecated but since PostgreSQL doesn't have any
> >> API for storage or file system snapshots, that's the only thing that can
> >> help you use storage snapshots as backups. To my knowledge,the only 
> >> database
> >> that does have API for storage snapshots is DB2. The API is called 
> >> "Advanced
> >> Copy Services" or ACS. It's documented here:
> >> 
> >> https://www.ibm.com/docs/en/db2/11.1?topic=recovery-db2-advanced-copy-services-acs
> >> 
> >> For Postgres, the old begin/stop backup functions should be sufficient.
> > 
> > No, it's not- you must also be sure to archive any WAL that's generated
> > between the pg_start_backup and pg_stop_backup and then to be sure and
> > add into the snapshot the appropriate signal files or recovery.conf,
> > depending on PG version, to indicate that you're restoring from a backup
> > and make sure that the WAL is made available via restore_command.
> > 
> > Just doing stat/stop backup is *not* enough and you run the risk of
> > having an invalid backup or corruption when you restore.
> > 
> > If the entire system is on a single volume then you could possibly just
> > take a snapshot of it (without any start/stop backup stuff) but it's
> > very risky to do that and then try to do PITR with it because we don't
> > know where consistency is reached in such a case (we *must* play all the
> > way through to the end of the WAL which existed at the time of the
> > snapshot in order to reach consistency).
> > 
> > In the end though, really, it's much, much, much better to use a proper
> > backup and archiving tool that's written specifically for PG than to try
> > and roll your own, using snapshots or not.
> 
> When I create a snapshot, the script gets the latest WAL file applied from 
> [1] and adds that information to the Snapshot Tags in AWS. I then use that 
> information in the future when restoring the snapshot. The script will read 
> the tag and it will download 50 WAL Files before that and all the WAL files 
> after that required.
> The WAL files are being backed up to S3.
> 
> I had to restore the database to a PITR state many times, and it always 
> worked very well.
> 
> I also create slaves using the snapshot method. So, I don’t mind having to 
> stop/start the Database for the snapshot process, as it’s proven to work fine 
> for the last 5 years.

I have to say that the process used here isn't terribly clear to me (you
cleanly shut down the database ... and also copy the WAL files?), so I
don't really want to comment on if it's actually correct or not because
I can't say one way or the other if it is or isn't.

I do want to again stress that I don't recommend writing your own tools
for doing backup/restore/PITR and I would caution people against people
trying to use this approach you've suggested.  Also, being able to tell
when such a process *doesn't* work is non-trivial (look at how long it
took us to discover the issues around fsync..), so saying that it seems
to have worked for a long time for you isn't really enough to make me
feel comfortable with it.

Thanks,

Stephen

Attachment: signature.asc
Description: PGP signature

Reply via email to