On Thu, Jun 17, 2021 at 1:23 AM Amul Sul <sula...@gmail.com> wrote: > Attached is rebase for the latest master head. Also, I added one more > refactoring code that deduplicates the code setting database state in the > control file. The same code set the database state is also needed for this > feature.
I started studying 0001 today and found that it rearranged the order of operations in StartupXLOG() more than I was expecting. It does, as per previous discussions, move a bunch of things to the place where we now call XLogParamters(). But, unsatisfyingly, InRecovery = false and XLogReaderFree() then have to move down even further. Since the goal here is to get to a situation where we sometimes XLogAcceptWrites() after InRecovery = false, it didn't seem nice for this refactoring patch to still end up with a situation where this stuff happens while InRecovery = true. In fact, with the patch, the amount of code that runs with InRecovery = true actually *increases*, which is not what I think should be happening here. That's why the patch ends up having to adjust SetMultiXactIdLimit to not Assert(!InRecovery). And then I started to wonder how this was ever going to work as part of the larger patch set, because as you have it here, XLogAcceptWrites() takes arguments XLogReaderState *xlogreader, XLogRecPtr EndOfLog, and TimeLineID EndOfLogTLI and if the checkpointer is calling that at a later time after the user issues pg_prohibit_wal(false), it's going to have none of those things. So I had a quick look at that part of the code and found this in checkpointer.c: XLogAcceptWrites(true, NULL, InvalidXLogRecPtr, 0); For those following along from home, the additional "true" is a bool needChkpt argument added to XLogAcceptWrites() by 0003. Well, none of this is very satisfying. The whole purpose of passing the xlogreader is so we can figure out whether we need a checkpoint (never mind the question of whether the existing algorithm for determining that is really sensible) but now we need a second argument that basically serves the same purpose since one of the two callers to this function won't have an xlogreader. And then we're passing the EndOfLog and EndOfLogTLI as dummy values which seems like it's probably just totally wrong, but if for some reason it works correctly there sure don't seem to be any comments explaining why. So I started doing a bit of hacking myself and ended up with the attached, which I think is not completely the right thing yet but I think it's better than your version. I split this into three parts. 0001 splits up the logic that currently decides whether to write an end-of-recovery record or a checkpoint record and if the latter how the checkpoint ought to be performed into two functions. DetermineRecoveryXlogAction() figures out what we want to do, and PerformRecoveryXlogAction() does it. It also moves the code to run recovery_end_command and related stuff into a new function CleanupAfterArchiveRecovery(). 0002 then builds on this by postponing UpdateFullPageWrites(), PerformRecoveryXLogAction(), and CleanupAfterArchiveRecovery() to just before we XLogReportParameters(). Because of the refactoring done by 0001, this is only a small amount of code movement. Because of the separation between DetermineRecoveryXlogAction() and PerformRecoveryXlogAction(), the latter doesn't need the xlogreader. So we can do DetermineRecoveryXlogAction() at the same time as now, while the xlogreader is available, and then we don't need it later when we PerformRecoveryXlogAction(), because we already know what we need to know. I think this is all fine as far as it goes. My 0003 is where I see some lingering problems. It creates XLogAcceptWrites(), moves the appropriate stuff there, and doesn't need the xlogreader. But it doesn't really solve the problem of how checkpointer.c would be able to call this function with proper arguments. It is at least better in not needing two arguments to decide what to do, but how is checkpointer.c supposed to know what to pass for xlogaction? Worse yet, how is checkpointer.c supposed to know what to pass for EndOfLogTLI and EndOfLog? Actually, EndOfLog doesn't seem too problematic, because that value has been stored in four (!) places inside XLogCtl by this code: LogwrtResult.Write = LogwrtResult.Flush = EndOfLog; XLogCtl->LogwrtResult = LogwrtResult; XLogCtl->LogwrtRqst.Write = EndOfLog; XLogCtl->LogwrtRqst.Flush = EndOfLog; Presumably we could relatively easily change things around so that we finish one of those values ... probably one of the "write" values .. back out of XLogCtl instead of passing it as a parameter. That would work just as well from the checkpointer as from the startup process, and there seems to be no way for the value to change until after XLogAcceptWrites() has been called, so it seems fine. But that doesn't help for the other arguments. What I'm thinking is that we should just arrange to store EndOfLogTLI and xlogaction into XLogCtl also, and then XLogAcceptWrites() can fish those values out of there as well, which should be enough to make it work and do the same thing regardless of which process is calling it. But I have run out of time for today so have not explored coding that up. -- Robert Haas EDB: http://www.enterprisedb.com
0001-Refactor-some-end-of-recovery-code-out-of-StartupXLO.patch
Description: Binary data
0002-Postpone-some-end-of-recovery-operations-relating-to.patch
Description: Binary data
0003-Create-XLogAcceptWrites-function-with-code-from-Star.patch
Description: Binary data