Re: Add recovery to pg_control and remove backup_label

David Steele Tue, 21 Nov 2023 10:14:21 -0800

On 11/20/23 16:37, Andres Freund wrote:


On 2023-11-20 11:11:13 -0500, Robert Haas wrote:

I think we need more votes to make a change this big. I have a
concern, which I think I've expressed before, that we keep whacking
around the backup APIs, and that has a cost which is potentially
larger than the benefits.


+1.  The amount of whacking around in this area has been substantial, and it's
hard for operators to keep up. And realistically, with data sizes today, the
pressure to do basebackups with disk snapshots etc is not going to shrink.

True enough, but disk snapshots aren't really backups in themselves, in most scenarios, because they reside on the same storage as the cluster. Of course, snapshots can be exported, but that's also expensive.

I see snapshots as an adjunct to backups -- a safe backup offsite somewhere for DR and snapshots for day to day operations. Even so, managing snapshots as backups is harder than people think. It is easy to get wrong and end up with silent corruption.

Leaving that concern aside, I am still on the fence about this proposal. I
think it does decrease the chance of getting things wrong in the
streaming-basebackup case. But for external backups, it seems almost
universally worse (with the exception of the torn pg_control issue, that we
also can address otherwise):

Why universally worse? The software stores pg_control instead of backup label. The changes to pg_basebackup were pretty trivial and the changes to external backup are pretty much the same, at least in my limited sample of one.

And I don't believe we have a satisfactory solution to the torn pg_control issue yet. Certainly it has not been committed and Thomas has shown enthusiasm for this approach, to the point of hoping it could be back patched (it can't).

It doesn't reduce the risk of getting things wrong, you can still omit placing
a file into the data directory and get silent corruption as a consequence. In
addition, it's harder to see when looking at a base backup whether the process
was right or not, because now the good and bad state look the same if you just
look on the filesystem level!

This is one of the reasons I thought writing just the first 512 bytes of pg_control would be valuable. It would give an easy indicator that pg_control came from a backup. Michael was not in favor of conflating that change with this patch -- but I still think it's a good idea.
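
As a rough illustration only (this is not part of the patch, and the truncated-file behavior is just the idea floated above, not something PostgreSQL does today), a filesystem-level check could look something like the sketch below. The two constants mirror PG_CONTROL_MAX_SAFE_SIZE and PG_CONTROL_FILE_SIZE from pg_control.h; the function name and the assumption that a backup would store a 512-byte pg_control are hypothetical.

```python
import os

# Values mirror the constants in PostgreSQL's pg_control.h.
PG_CONTROL_MAX_SAFE_SIZE = 512   # portion of pg_control that holds real data
PG_CONTROL_FILE_SIZE = 8192      # size of pg_control in a normal cluster


def looks_like_backup_pg_control(path):
    """Hypothetical helper: return True if the file at `path` is exactly
    512 bytes long, which under the idea discussed above would indicate
    that pg_control was written by a backup rather than by a running
    cluster. This only inspects the file size, not its contents."""
    return os.path.getsize(path) == PG_CONTROL_MAX_SAFE_SIZE
```

For a data directory this would be called on global/pg_control. A normal cluster's 8192-byte file returns False, so the good and bad states would no longer look the same at the filesystem level.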

Then there's the issue of making ad-hoc debugging harder by not having a
human readable file with information anymore, including when looking at the
history, via backup_label.old.

Yeah, you'd need to use pg_controldata instead. But as Michael has suggested, we could also write backup_label as backup_info so there is human-readable information available.

Given that, I wonder if what we should do is to just add a new field to
pg_control that says "error out if backup_label does not exist", that we set
when creating a streaming base backup

I'm not in favor of a change only accessible to pg_basebackup or external software that can manipulate pg_control.


Regards,
-David
