On 11/20/23 16:37, Andres Freund wrote:
On 2023-11-20 11:11:13 -0500, Robert Haas wrote:
I think we need more votes to make a change this big. I have a
concern, which I think I've expressed before, that we keep whacking
around the backup APIs, and that has a cost which is potentially
larger than the benefits.
+1. The amount of whacking around in this area has been substantial, and it's
hard for operators to keep up. And realistically, with data sizes today, the
pressure to do basebackups with disk snapshots etc is not going to shrink.
True enough, but disk snapshots aren't really backups in themselves, in
most scenarios, because they reside on the same storage as the cluster.
Of course, snapshots can be exported, but that's also expensive.
I see snapshots as an adjunct to backups -- a safe backup offsite
somewhere for DR and snapshots for day to day operations. Even so,
managing snapshots as backups is harder than people think. It is easy to
get wrong and end up with silent corruption.
Leaving that concern aside, I am still on the fence about this proposal. I
think it does decrease the chance of getting things wrong in the
streaming-basebackup case. But for external backups, it seems almost
universally worse (with the exception of the torn pg_control issue, that we
also can address otherwise):
Why universally worse? The software stores pg_control instead of backup
label. The changes to pg_basebackup were pretty trivial and the changes
to external backup are pretty much the same, at least in my limited
sample of one.
And I don't believe we have a satisfactory solution to the torn
pg_control issue yet. Certainly it has not been committed and Thomas has
shown enthusiasm for this approach, to the point of hoping it could be
back patched (it can't).
It doesn't reduce the risk of getting things wrong, you can still omit placing
a file into the data directory and get silent corruption as a consequence. In
addition, it's harder to see when looking at a base backup whether the process
was right or not, because now the good and bad state look the same if you just
look on the filesystem level!
This is one of the reasons I thought writing just the first 512 bytes of
pg_control would be valuable. It would give an easy indicator that
pg_control came from a backup. Michael was not in favor of conflating
that change with this patch -- but I still think it's a good idea.
Then there's the issue of making ad-hoc debugging harder by not having a
human readable file with information anymore, including when looking at the
history, via backup_label.old.
Yeah, you'd need to use pg_controldata instead. But as Michael has
suggested, we could also write backup_label as backup_info so there is
human-readable information available.
Given that, I wonder if what we should do is to just add a new field to
pg_control that says "error out if backup_label does not exist", that we set
when creating a streaming base backup
I'm not in favor of a change only accessible to pg_basebackup or
external software that can manipulate pg_control.
Regards,
-David