On Wed, Jul 1, 2020 at 11:08 PM David Steele <da...@pgmasters.net> wrote:

> On 7/1/20 4:39 PM, Magnus Hagander wrote:
> > On Wed, Jul 1, 2020 at 10:28 PM David Steele <da...@pgmasters.net> wrote:
> > > Here's a thought. What if we just stored the oldest starting LSN and a
> > > count of how many backups have been requested. When the backup ends it
> > > checks that backup count is > 0 and starting LSN is <= its starting
> > > LSN. If not, it throws an error. When backups go to 0 FPWs are turned
> > > off if they were off before the first backup.
> >
> > I guess the weak spot of that one is if some script does stop without
> > doing start first, it will break somebody else's backup. (And yes, I've
> > seen scripts make this mistake many times -- it equally breaks the
> > exclusive backups in the current system...)
>
> Well, they'd have to pass in a backup_label with a start LSN >= the min
> LSN or they would just get an error and not decrement the backup count.
>

Oh as long as we're still requiring pg_stop_backup() to pass in something
that it received from pg_start_backup() so that we can verify it's the
correct one, then that problem wouldn't exist.

> The real issue would be if they called pg_stop_backup twice. We might be
> able to stop that with a rolling max stop lsn to keep anyone from
> calling pg_stop_backup() twice.
>
> But yeah, it would be possible to kill somebody else's session with some
> finagling. Still, worse case would be an error'd backup rather than a
> corrupt one.
>

What about the case of:

Session A - start backup
Session B - stop backup (but A is still running of course)
Session C - start backup
Session A - stop backup

At this point, session A can still stop the backup because there is one
running -- but there has been time in between the two when no backup was
running. That could lead to Session A getting a corrupt backup, I think --
unless we pass some unique identifier back in pg_stop_backup that matches
it up.
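
To make the A/B/C sequence concrete, here is a rough toy model in Python.
It is only a sketch of the bookkeeping being discussed, not PostgreSQL
code: the class and function names are made up for illustration, the LSN
check is left out, and the "token" stands in for whatever identifier
pg_start_backup() would hand back.

```python
import itertools


class CountOnlyTracker:
    """Hypothetical model: the server keeps only a count of running backups."""

    def __init__(self):
        self.running = 0

    def start(self):
        self.running += 1

    def stop(self):
        if self.running == 0:
            raise RuntimeError("no backup is running")
        self.running -= 1


class TokenTracker:
    """Hypothetical model: start returns a token that stop must present."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.active = set()

    def start(self):
        token = next(self._ids)
        self.active.add(token)
        return token

    def stop(self, token):
        if token not in self.active:
            raise RuntimeError("unknown or already stopped backup")
        self.active.remove(token)


def replay_count_only():
    """Replay A start, B stop, C start, A stop against the counter."""
    t = CountOnlyTracker()
    t.start()                 # session A starts
    t.stop()                  # session B stops without having started
    gap = t.running == 0      # nothing is considered "in backup" here
    t.start()                 # session C starts
    t.stop()                  # session A stops: accepted despite the gap
    return gap


def replay_with_tokens():
    """Replay the same sequence when stop must present a valid token."""
    t = TokenTracker()
    a = t.start()             # session A starts
    try:
        t.stop(None)          # session B has nothing valid to present
        b_rejected = False
    except RuntimeError:
        b_rejected = True
    c = t.start()             # session C starts
    t.stop(a)                 # session A stops its own backup
    return b_rejected, t.active == {c}
```

In the counter-only replay the final stop goes through even though the
count dropped to zero in between; with a token, B's stray stop is refused
and A's backup is never interrupted.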

(And if we do pass that up, then session B running pg_stop_backup() would
fail, thus leaving the backup started by A still running.)

> But really, that's only if FPWs are turned off. We can also do some
> extra validation if the session is left open, which for most software is
> the norm now.
>

Session left open should really still be the default, as it's the safest
one :) And yes, most backup *software* does it. But the entire reason we
want another mode from the current non-exclusive backup is people *not*
using one of the ready-made backup solutions.

> > And don't we need the combination of the start/stop location for the
> > history file?
>
> You mean the .backup file for the WAL? All that needs is the
> backup_label and the stop LSN that's determined in pg_stop_backup(). Am
> I missing something?
>

I mean the .backup file in the archive, yes. That one contains both the
start and the stop location, timeline and time.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ <http://www.hagander.net/>
 Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>