Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> I think the fundamental problem with all these approaches is that there is
> no safe way to distinguish a server crashed in backup mode from a restored
> backup. This is what makes the problem so hard.

Right- if you want to just call start/stop and take a snapshot in the
middle and then be able to restore that directly and start up the
database, then there *can't* be any way to distinguish between the two,
which is, I'm pretty sure, where this whole discussion ended up back
during the 9.6 development cycle and why it's still an issue. If there
were an easy way to fix this, I feel like we would have done it already.

> The existing exclusive backup is in my opinion the safest variant: it refuses
> to create a corrupted cluster without manual intervention and gives you a dire
> warning to consider if you are doing the right thing.

... it's the least dangerous if you limit yourself to that method, but
that doesn't make it safe. :(

In the end, to eliminate this risk, you basically *have* to have a way
of extracting the data needed for the backup (start/stop WAL and such)
that doesn't make the running cluster look like it's a backup being
restored, you *have* to make that information available to the database
cluster somehow when it's restored, and you *have* to notify PG that
it's doing backup recovery and *not* crash recovery. That's pretty hard
to manage if all you want to do is snapshot the filesystem.

Of course, you have to have a solution for WAL too, and the thought has
crossed my mind that maybe there's something we could do when it comes
to stashing all the info needed in the WAL archive, but I'm still not
sure how we'd solve for knowing if we're doing backup recovery or crash
recovery in that case without some kind of marker or something external
telling us that's what we're doing.

As you proposed previously, but with a bit of a twist, maybe we could
just always do backup recovery if we find a .backup (or whatever) file
in the WAL that, when compared to pg_control, shows that we were in the
process of doing a backup...
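
To make the hand-waving above a bit more concrete, here is a rough
sketch of what such a startup check might look like. Everything in it
is an assumption for illustration: BackupMarker, choose_recovery and the
rule "pg_control's checkpoint falls inside a backup's start/stop window"
are hypothetical and are not existing PG code or file formats. Only the
LSN text format (two hex numbers separated by a slash) is real.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' (two hex numbers) to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass(frozen=True)
class BackupMarker:
    """Hypothetical marker fetched from the WAL archive (the '.backup or
    whatever' file); not an actual PostgreSQL structure."""
    start_lsn: int
    stop_lsn: Optional[int]  # None if no stop was ever recorded


def choose_recovery(control_checkpoint_lsn: int,
                    markers: Iterable[BackupMarker]) -> Tuple[str, int]:
    """Return (mode, lsn_to_start_replay_from).

    Assumed rule: if the checkpoint recorded in pg_control is at or after
    a backup's start and not after its stop, the data directory may have
    been copied mid-backup, so replay from the backup's start.
    """
    for m in markers:
        started = m.start_lsn <= control_checkpoint_lsn
        finished_before = (m.stop_lsn is not None
                           and m.stop_lsn < control_checkpoint_lsn)
        if started and not finished_before:
            return ("backup", m.start_lsn)
    # No overlapping backup found: ordinary crash recovery.
    return ("crash", control_checkpoint_lsn)
```

Note that this sketch has exactly the weakness described in this mail: a
server that really did crash while a backup was running is
indistinguishable from a restored snapshot, so it would also be sent
down the "backup" path and replay from the backup's start.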

That approach would require that everyone always have a restore_command
set, which wasn't possible before because that went into recovery.conf,
but it's possible to just always have that set now, and that would
eliminate the risk of us running the system out of disk space by keeping
all the WAL that's generated during the backup local.

Obviously, a lot of this is pretty hand-wavy, and there are still some
unfortunate parts:

- if you're actually recovering from a crash that just happened to
  happen while you were taking a backup, then you could be replaying a
  heck of a lot more WAL than you needed to;
- you have to have a working restore_command on the primary;
- you'd have to figure out a way for PG to check for these .backup (or
  whatever) files on startup that doesn't take forever or require
  stepping through every WAL segment or something.

Maybe those concerns could be addressed, though.

Thanks!

Stephen