On Wed, Jul 1, 2020 at 11:08 PM David Steele <da...@pgmasters.net> wrote:

> On 7/1/20 4:39 PM, Magnus Hagander wrote:
> > On Wed, Jul 1, 2020 at 10:28 PM David Steele <da...@pgmasters.net
> >     Here's a thought. What if we just stored the oldest starting LSN and
> >     a count of how many backups have been requested. When the backup ends
> >     it checks that backup count is > 0 and starting LSN is <= its starting
> >     LSN. If not, it throws an error. When backups go to 0 FPWs are turned
> >     off if they were off before the first backup.
> >
> > I guess the weak spot of that one is if some script does stop without
> > doing start first, it will break somebody else's backup. (And yes, I've
> > seen scripts make this mistake many times -- it equally breaks the
> > exclusive backups in the current system...)
>
> Well, they'd have to pass in a backup_label with a start LSN >= the min
> LSN or they would just get an error and not decrement the backup count.
>

Oh, as long as we're still requiring pg_stop_backup() to pass in something
that it received from pg_start_backup() so that we can verify it's the
correct one, then that problem wouldn't exist.
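
To pin down the rules being discussed, here is a rough sketch of the bookkeeping (a count plus the oldest starting LSN, with the stop call handing back the start LSN from its backup_label). It is a toy model in Python, not server code; the class and method names are made up for illustration and LSNs are plain integers.

```python
# Illustrative model only; not PostgreSQL code. All names are hypothetical.
class BackupError(Exception):
    pass


class BackupState:
    """Shared state: a backup count plus the oldest starting LSN."""

    def __init__(self, fpw_enabled):
        self.count = 0
        self.min_start_lsn = None
        self.fpw_configured = fpw_enabled  # setting before the first backup
        self.fpw_enabled = fpw_enabled

    def start_backup(self, current_lsn):
        if self.count == 0:
            self.min_start_lsn = current_lsn
            self.fpw_enabled = True  # full-page writes stay on during backups
        self.count += 1
        return current_lsn  # stands in for the start LSN in backup_label

    def stop_backup(self, label_start_lsn):
        if self.count <= 0:
            raise BackupError("no backup in progress")
        if label_start_lsn < self.min_start_lsn:
            raise BackupError("start LSN predates the oldest tracked backup")
        self.count -= 1
        if self.count == 0:
            # Last backup finished: restore the pre-backup FPW setting.
            self.min_start_lsn = None
            self.fpw_enabled = self.fpw_configured
```

A stop call with a stale label raises before the count is touched, so it cannot decrement somebody else's backup. A label with a start LSN at or above the minimum is still accepted no matter who started it.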


> The real issue would be if they called pg_stop_backup twice. We might be
> able to stop that with a rolling max stop lsn to keep anyone from
> calling pg_stop_backup() twice.


> But yeah, it would be possible to kill somebody else's session with some
> finagling. Still, worst case would be an error'd backup rather than a
> corrupt one.
>

What about the case of:
Session A - start backup
Session B - stop backup (but A is still running of course)
Session C - start backup
Session A - stop backup

At this point, session A can still stop the backup because there is one
running -- but there has been time in between the two when no backup was
running. That could lead to Session A getting a corrupt backup, I think --
unless we pass some unique identifier back in pg_stop_backup that matches
it up. (And if we do pass that up, then session B running pg_stop_backup()
would fail, thus leaving the backup started by A still running.)
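
The interleaving can be replayed with a toy model. This is a sketch in Python with made-up names, not server code. It assumes the simplest possible counter and, for the alternative, a random token handed out at start and required at stop.

```python
# Illustrative model only; not PostgreSQL code. All names are hypothetical.
import uuid


class CounterOnly:
    """Tracks just a count, so any caller can end 'a' backup."""

    def __init__(self):
        self.count = 0
        self.gaps = 0  # number of times the count dropped to zero

    def start(self):
        self.count += 1

    def stop(self):
        if self.count == 0:
            raise RuntimeError("no backup in progress")
        self.count -= 1
        if self.count == 0:
            self.gaps += 1


class TokenRegistry:
    """Hands out a unique token at start and demands it back at stop."""

    def __init__(self):
        self.active = set()

    def start(self):
        token = uuid.uuid4().hex
        self.active.add(token)
        return token

    def stop(self, token):
        if token not in self.active:
            raise RuntimeError("unknown or already stopped backup")
        self.active.remove(token)


def replay_counter():
    """Replay A start, B stop, C start, A stop against the bare counter."""
    c = CounterOnly()
    c.start()  # Session A - start backup
    c.stop()   # Session B - stop backup (A is still copying files)
    gaps_while_a_running = c.gaps
    c.start()  # Session C - start backup
    c.stop()   # Session A - stop backup, accepted because count was 1
    return gaps_while_a_running
```

With the bare counter, replay_counter() reports one gap in the middle of A's backup, yet A's stop still succeeds. With the token registry, session B has nothing valid to pass in, so its stop raises and A's entry stays active.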


> But really, that's only if FPWs are turned off. We can also do some
> extra validation if the session is left open, which for most software is
> the norm now.
>

Session left open should really still be the default, as it's the safest
one :) And yes, most backup *software* does it. But the entire reason we
want another mode from the current non-exclusive backup is people *not*
using one of the ready-made backup solutions.



> > And don't we need the combination of the start/stop location for the
> > history file?
>
> You mean the .backup file for the WAL? All that needs is the
> backup_label and the stop LSN that's determined in pg_stop_backup(). Am
> I missing something?
>

I mean the .backup file in the archive, yes. That one contains both the
start and the stop location, timeline and time.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
