On Fri, Jul 26, 2024 at 4:10 PM Alexander Korotkov <aekorot...@gmail.com> wrote: > 0002 could also use pg_indent and pgperltidy. And I have couple other > notes regarding 0002.
As Tom said elsewhere, we don't have a practice of requiring perltidy for every commit, and I normally don't bother. The tree is pgindent-clean currently so I believe I got that part right. > > In the process of fixing these bugs, I realized that the logic to wait > > for WAL summarization to catch up was spread out in a way that made > > it difficult to reuse properly, so this code refactors things to make > > it easier. > > It would be nice to split refactoring out of material logic changed. > This way it would be easier to review and would look cleaner in the > git history. I support that idea in general but felt it was overkill in this case: it's new code, and there was only one existing caller of the function that got refactored, and I'm not a huge fan of cluttering the git history with a bunch of tiny little refactoring commits to fix a single bug. I might have changed it if I'd seen this note before committing, though. > > To make this fix work, also teach the WAL summarizer that after a > > promotion has occurred, no more WAL can appear on the previous > > timeline: previously, the WAL summarizer wouldn't switch to the new > > timeline until we actually started writing WAL there, but that meant > > that when the startup process was waiting for the WAL summarizer, it > > was waiting for an action that the summarizer wasn't yet prepared to > > take. > > I think this is a pretty long sentence, and I'm not sure I can > understand it correctly. Does startup process wait for the WAL > summarizer without this patch? If not, it's not clear for me that > word "previously" doesn't distribute to this part of sentence. > Breaking this into multiple sentences could improve the clarity for > me. Yes, I think that phrasing is muddled. It's too late to amend the commit message, but for posterity: I initially thought that I could fix this bug by just teaching the startup process to wait for WAL summarization before performing the .partial renaming, but that was not enough by itself. The problem is that the WAL summarizer would read all the records that were present in the final WAL file on the old timeline, but it wouldn't actually write out a summary file, because that only happens when it reaches XLOG_CHECKPOINT_REDO or a timeline switch point. Since no WAL had appeared on the new timeline yet, it didn't view the end of the old timeline as a switch point, so it just sat there waiting, without ever writing out a file; and the startup process sat there waiting for it. So the second part of the fix is to make the WAL summarizer realize that once the startup process has chosen a new timeline ID, no more WAL is going to appear on the old timeline, and thus it can (and should) write out the final summary file for the old timeline and prepare to read WAL from the new timeline. -- Robert Haas EDB: http://www.enterprisedb.com