Re: The danger of deleting backup_label

2023-10-19 Thread David Steele
On 10/19/23 10:56, Robert Haas wrote: On Thu, Oct 19, 2023 at 10:43 AM David Steele wrote: What I meant here (but said badly) is that in the case of snapshot backups, the backup_label and tablespace_map will likely need to be stored somewhere off the server since they can't be part of the snaps

Re: The danger of deleting backup_label

2023-10-19 Thread David G. Johnston
On Thursday, October 19, 2023, David Steele wrote: > On 10/19/23 10:24, Robert Haas wrote: > >> On Wed, Oct 18, 2023 at 7:15 PM David Steele wrote: >> >>> pg_llbackup -d $CONNTR --backup-label=PATH --tablespace-map=PATH --copy-data-directory=SHELLCOMMAND I think in most cases

Re: The danger of deleting backup_label

2023-10-19 Thread Robert Haas
On Thu, Oct 19, 2023 at 10:43 AM David Steele wrote:
> What I meant here (but said badly) is that in the case of snapshot
> backups, the backup_label and tablespace_map will likely need to be
> stored somewhere off the server since they can't be part of the
> snapshot, perhaps in a key store. In t

Re: The danger of deleting backup_label

2023-10-19 Thread David Steele
On 10/19/23 10:24, Robert Haas wrote: On Wed, Oct 18, 2023 at 7:15 PM David Steele wrote: (b) be stored someplace else, I don't think the additional fields *need* to be stored anywhere at all, at least not by us. We can provide them as output from pg_backup_stop() and the caller can do as the

Re: The danger of deleting backup_label

2023-10-19 Thread Robert Haas
On Wed, Oct 18, 2023 at 7:15 PM David Steele wrote:
> > (b) be stored someplace
> > else,
>
> I don't think the additional fields *need* to be stored anywhere at all,
> at least not by us. We can provide them as output from pg_backup_stop()
> and the caller can do as they please. None of those fie

Re: The danger of deleting backup_label

2023-10-18 Thread David G. Johnston
On Wednesday, October 18, 2023, David Steele wrote:
> On 10/18/23 08:39, Robert Haas wrote:
>
>> On Tue, Oct 17, 2023 at 4:17 PM David Steele wrote:
>>
>>> Given that the above can't be back patched, I'm thinking we don't need
>>> backup_label at all going forward. We just write the values we ne

Re: The danger of deleting backup_label

2023-10-18 Thread David Steele
On 10/18/23 08:39, Robert Haas wrote: On Tue, Oct 17, 2023 at 4:17 PM David Steele wrote: Given that the above can't be back patched, I'm thinking we don't need backup_label at all going forward. We just write the values we need for recovery into pg_control and return *that* from pg_backup_stop

Re: The danger of deleting backup_label

2023-10-18 Thread David Steele
On 10/17/23 22:13, Kyotaro Horiguchi wrote: At Tue, 17 Oct 2023 16:16:42 -0400, David Steele wrote in Given that the above can't be back patched, I'm thinking we don't need backup_label at all going forward. We just write the values we need for recovery into pg_control and return *that* from pg

Re: The danger of deleting backup_label

2023-10-18 Thread Robert Haas
On Tue, Oct 17, 2023 at 4:17 PM David Steele wrote:
> Given that the above can't be back patched, I'm thinking we don't need
> backup_label at all going forward. We just write the values we need for
> recovery into pg_control and return *that* from pg_backup_stop() and
> tell the user to store it

Re: The danger of deleting backup_label

2023-10-17 Thread Kyotaro Horiguchi
At Tue, 17 Oct 2023 16:16:42 -0400, David Steele wrote in
> Given that the above can't be back patched, I'm thinking we don't need
> backup_label at all going forward. We just write the values we need
> for recovery into pg_control and return *that* from pg_backup_stop()
> and tell the user to st

Re: The danger of deleting backup_label

2023-10-17 Thread David Steele
On 10/14/23 11:30, David Steele wrote: On 10/12/23 10:19, David Steele wrote: On 10/11/23 18:10, Thomas Munro wrote: As Stephen mentioned[1], we could perhaps also complain if both backup label and control file exist, and then hint that the user should remove the *control file* (not the backup

Re: The danger of deleting backup_label

2023-10-17 Thread Robert Haas
On Tue, Oct 17, 2023 at 3:17 PM Stephen Frost wrote:
> I'd also put out there that while people don't do restore testing
> nearly as much as they should, they tend to at _least_ try to do it once
> after taking their first backup and if that fails then they try to figure
> out why and what they're

Re: The danger of deleting backup_label

2023-10-17 Thread Stephen Frost
Greetings,
* David Steele (da...@pgmasters.net) wrote:
> On 10/16/23 15:06, Robert Haas wrote:
> > On Mon, Oct 16, 2023 at 1:00 PM David Steele wrote:
> > > After some agonizing (we hope) they decide to delete backup_label and,
> > > wow, it just works! So now they merrily go on their way with a

Re: The danger of deleting backup_label

2023-10-16 Thread David Steele
On 10/16/23 15:06, Robert Haas wrote: On Mon, Oct 16, 2023 at 1:00 PM David Steele wrote: After some agonizing (we hope) they decide to delete backup_label and, wow, it just works! So now they merrily go on their way with a corrupted cluster. They also remember for the next time that deleting b

Re: The danger of deleting backup_label

2023-10-16 Thread Michael Paquier
On Mon, Oct 16, 2023 at 12:25:59PM -0400, Robert Haas wrote:
> On Mon, Oct 16, 2023 at 11:45 AM David Steele wrote:
> > If you start from the last checkpoint (which is what will generally be
> > stored in pg_control) then the effect is pretty similar.
>
> If the backup didn't span a checkpoint, t

Re: The danger of deleting backup_label

2023-10-16 Thread Robert Haas
On Mon, Oct 16, 2023 at 1:00 PM David Steele wrote:
> After some agonizing (we hope) they decide to delete backup_label and,
> wow, it just works! So now they merrily go on their way with a corrupted
> cluster. They also remember for the next time that deleting backup_label
> is definitely a good

Re: The danger of deleting backup_label

2023-10-16 Thread David Steele
On 10/16/23 12:25, Robert Haas wrote: On Mon, Oct 16, 2023 at 11:45 AM David Steele wrote: Hmmm, the reason to back patch this is that it would fix [1], which sure looks like a problem to me even if it is not a "bug". We can certainly require backup software to retry pg_control until the checks

Re: The danger of deleting backup_label

2023-10-16 Thread Robert Haas
On Mon, Oct 16, 2023 at 11:45 AM David Steele wrote:
> Hmmm, the reason to back patch this is that it would fix [1], which sure
> looks like a problem to me even if it is not a "bug". We can certainly
> require backup software to retry pg_control until the checksum is valid
> but that seems like a

Re: The danger of deleting backup_label

2023-10-16 Thread David Steele
On 10/16/23 10:55, Robert Haas wrote: On Sat, Oct 14, 2023 at 11:33 AM David Steele wrote: All of this is fixable in HEAD, but seems incredibly dangerous to back patch. Even so, I have attached the patch in case somebody sees an opportunity that I do not. I really do not think we should be ev

Re: The danger of deleting backup_label

2023-10-16 Thread Robert Haas
On Sat, Oct 14, 2023 at 11:33 AM David Steele wrote:
> All of this is fixable in HEAD, but seems incredibly dangerous to back
> patch. Even so, I have attached the patch in case somebody sees an
> opportunity that I do not.
I really do not think we should be even thinking about back-patching some

Re: The danger of deleting backup_label

2023-10-14 Thread David Steele
On 10/12/23 10:19, David Steele wrote: On 10/11/23 18:10, Thomas Munro wrote: As Stephen mentioned[1], we could perhaps also complain if both backup label and control file exist, and then hint that the user should remove the *control file* (not the backup label!). I had originally suggested we

Re: The danger of deleting backup_label

2023-10-12 Thread David Steele
On 10/11/23 18:22, Michael Paquier wrote: On Tue, Oct 10, 2023 at 05:06:45PM -0400, David Steele wrote: That fails because there is a check to make sure the checkpoint is valid when pg_control is loaded. Another possibility is to use a special LSN like we use for unlogged tables. Anything >=

Re: The danger of deleting backup_label

2023-10-12 Thread David Steele
Hi Thomas, On 10/11/23 18:10, Thomas Munro wrote: Even though I spent a whole bunch of time trying to figure out how to make concurrent reads of the control file sufficiently atomic for backups (pg_basebackup and low level filesystem tools), and we explored multiple avenues with varying results

Re: The danger of deleting backup_label

2023-10-11 Thread Michael Paquier
On Tue, Oct 10, 2023 at 05:06:45PM -0400, David Steele wrote:
> That fails because there is a check to make sure the checkpoint is valid
> when pg_control is loaded. Another possibility is to use a special LSN like
> we use for unlogged tables. Anything >= 24 and < WAL segment size will work
> fine

Re: The danger of deleting backup_label

2023-10-11 Thread Thomas Munro
Hi David, Even though I spent a whole bunch of time trying to figure out how to make concurrent reads of the control file sufficiently atomic for backups (pg_basebackup and low level filesystem tools), and we explored multiple avenues with varying results, and finally came up with something that b

Re: The danger of deleting backup_label

2023-10-10 Thread David Steele
On 9/28/23 22:30, Michael Paquier wrote: On Thu, Sep 28, 2023 at 05:14:22PM -0400, David Steele wrote: Recovery worked perfectly as long as backup_label was present and failed hard when it was not:
LOG: invalid primary checkpoint record
PANIC: could not locate a valid checkpoint record
It's

Re: The danger of deleting backup_label

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 05:14:22PM -0400, David Steele wrote:
> While reading through [1] I saw there were two instances where backup_label
> was removed to achieve a "successful" restore. This might work on trivial
> test restores but is an invitation to (silent) disaster in a production
> environ