On 2023-Nov-16, Alvaro Herrera wrote:

> On 2023-Oct-04, Robert Haas wrote:
>
> > - Right now, I have a hard-coded 60 second timeout for WAL
> > summarization. If you try to take an incremental backup and the WAL
> > summaries you need don't show up within 60 seconds, the backup times
> > out. I think that's a reasonable default, but should it be
> > configurable? If yes, should that be a GUC or, perhaps better, a
> > pg_basebackup option?
>
> I'd rather have a way for the server to provide diagnostics on why the
> summaries aren't being produced. Maybe a server running under valgrind
> is going to fail and need a longer one, but otherwise a hardcoded
> timeout seems sufficient.
>
> You did say later that you thought summary files would just go from one
> checkpoint to the next. So the only question is at what point the file
> for the last checkpoint (i.e. from the previous one up to the one
> requested by pg_basebackup) is written. If walsummarizer keeps almost
> the complete state in memory and just waits for the checkpoint record to
> write it, then it's probably okay.

On 2023-Nov-16, Alvaro Herrera wrote:

> On 2023-Nov-16, Robert Haas wrote:
>
> > On Thu, Nov 16, 2023 at 5:21 AM Alvaro Herrera <alvhe...@alvh.no-ip.org>
> > wrote:
> > > It's not clear to me if WalSummarizerCtl->pending_lsn if fulfilling some
> > > purpose or it's just a leftover from prior development. I see it's only
> > > read in an assertion ... Maybe if we think this cross-check is
> > > important, it should be turned into an elog? Otherwise, I'd remove it.
> >
> > I've been thinking about that. One thing I'm not quite sure about
> > though is introspection. Maybe there should be a function that shows
> > summarized_tli and summarized_lsn from WalSummarizerData, and maybe it
> > should expose pending_lsn too.
>
> True.

Putting those two thoughts together, I think pg_basebackup with
--progress could tell you "still waiting for the summary file up to LSN
%X/%X to appear, and the walsummarizer is currently handling lsn %X/%X"
or something like that.
This would probably require two concurrent connections, one to run
BASE_BACKUP and another to query server state; but this should be easy
enough to integrate together with parallel basebackup later.

-- 
Álvaro Herrera        PostgreSQL Developer  —  https://www.EnterpriseDB.com/
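
To make the proposed --progress message concrete, here is a minimal sketch of how the text could be formatted from two LSNs. It is only an illustration, not part of any patch: the function name format_summary_wait_message is hypothetical, the XLogRecPtr typedef merely mirrors PostgreSQL's 64-bit LSN type so that the example stands alone, and how pg_basebackup would actually obtain the two values (for instance over the second connection) is left open.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for PostgreSQL's 64-bit LSN type, so the sketch compiles alone. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper: format the progress message suggested above.
 * "wanted" is the LSN up to which a summary file is needed, and "current"
 * is the LSN the walsummarizer reports it is working on.  Each LSN is
 * printed in the usual %X/%X form: high 32 bits, then low 32 bits.
 * Returns what snprintf returns.
 */
static int
format_summary_wait_message(char *buf, size_t len,
                            XLogRecPtr wanted, XLogRecPtr current)
{
    assert(buf != NULL && len > 0);

    return snprintf(buf, len,
                    "still waiting for the summary file up to LSN %X/%X "
                    "to appear, and the walsummarizer is currently "
                    "handling lsn %X/%X",
                    (unsigned int) (wanted >> 32),
                    (unsigned int) (wanted & 0xFFFFFFFFu),
                    (unsigned int) (current >> 32),
                    (unsigned int) (current & 0xFFFFFFFFu));
}
```

With wanted = 0x0000000116B3A4F8 and current = 0x00000001160000C8, the two LSNs would print as 1/16B3A4F8 and 1/160000C8 respectively.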