On 7/1/20 5:44 PM, Magnus Hagander wrote:
On Wed, Jul 1, 2020 at 11:08 PM David Steele <da...@pgmasters.net> wrote:

    But yeah, it would be possible to kill somebody else's session with some
    finagling. Still, worst case would be an error'd backup rather than a
    corrupt one.

What about the case of:
Session A - start backup
Session B - stop backup (but A is still running of course)
Session C - start backup
Session A - stop backup

At this point, session A can still stop the backup because there is one running -- but there has been time in between the two when no backup was running. That could lead to Session A getting a corrupt backup, I think -- unless we pass some unique identifier back in pg_stop_backup that matches it up. (And if we do pass that up, then session B running pg_stop_backup() would fail, thus leaving the backup started by A still running.)

This is fine because the min start LSN would have been advanced after B stopped. When A tries to stop, the min start LSN will be later than its start LSN, so it will error.
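
To make the sequence concrete, here is a toy model of that check in Python. It is only a sketch of the idea, not PostgreSQL code: the BackupState class, its fields, and the integer LSN counter are all made up for illustration, and I'm assuming a session that did not start the backup (B) calls stop without a start LSN of its own.

```python
class BackupError(Exception):
    """Raised when a start/stop request cannot be honored."""


class BackupState:
    """Hypothetical stand-in for the shared backup state (not PostgreSQL's)."""

    def __init__(self):
        self.current_lsn = 0      # toy WAL position, a plain counter
        self.min_start_lsn = 0    # oldest start LSN still considered valid
        self.running = False

    def _advance_wal(self):
        self.current_lsn += 1
        return self.current_lsn

    def start_backup(self):
        if self.running:
            raise BackupError("a backup is already in progress")
        self.running = True
        return self._advance_wal()  # the caller remembers this start LSN

    def stop_backup(self, start_lsn=None):
        if not self.running:
            raise BackupError("no backup in progress")
        # Assumption: a session that started a backup passes its start LSN;
        # a session that did not (B in the scenario) passes None.
        if start_lsn is not None and start_lsn < self.min_start_lsn:
            raise BackupError("backup was stopped by another session")
        self.running = False
        # Advance the minimum so earlier start LSNs can no longer stop cleanly.
        self.min_start_lsn = self._advance_wal()


def run_scenario():
    state = BackupState()
    a_lsn = state.start_backup()       # Session A - start backup
    state.stop_backup()                # Session B - stop backup
    c_lsn = state.start_backup()       # Session C - start backup
    try:
        state.stop_backup(a_lsn)       # Session A - stop backup
        a_result = "ok"
    except BackupError as exc:
        a_result = f"error: {exc}"
    state.stop_backup(c_lsn)           # C's backup is unaffected by A's error
    return a_result, state.running
```

In this model A's stop fails instead of returning a backup that spans the gap, and C's backup is left running until C stops it.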

It might be easier/better to just keep the one exclusive slot in shared memory and store the backup label in it. We only allow one exclusive backup now so it wouldn't be a loss in functionality.
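
Roughly what I mean, again as a Python sketch rather than real code. ExclusiveBackupSlot and its methods are hypothetical names, and handing the label back from stop is my assumption about how the caller would get at it:

```python
class ExclusiveBackupSlot:
    """Hypothetical single slot holding the exclusive backup's label."""

    def __init__(self):
        self.backup_label = None   # None means no exclusive backup is running

    def start(self, label_text):
        if self.backup_label is not None:
            raise RuntimeError("an exclusive backup is already in progress")
        # The label lives in the slot instead of in the data directory.
        self.backup_label = label_text

    def stop(self):
        if self.backup_label is None:
            raise RuntimeError("no exclusive backup in progress")
        label, self.backup_label = self.backup_label, None
        return label               # assumption: the caller stores it with the backup
```

Since there is only one slot, a second start fails just as it does today.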

None of this really solves the problem of what happens when the user dumps the backup_label into the data directory. With traditional backup software that's pretty much going to be the only choice. Is telling them not to do it and washing our hands of it really enough?

In particular, I'm worried about the logic in postmaster.c that would be removed if we no longer save the backup_label explicitly during an exclusive backup. If backup_label is no longer removed on a clean shutdown it seems we'll just make the situation worse.

Regards,
--
-David
da...@pgmasters.net

