Hi,

On 2021-06-16 16:30:45 +0300, Heikki Linnakangas wrote:
> xlog.c is very large. We've split off some functions from it over the years,
> but it's still large and it keeps growing.
>
> Attached is a proposal to split functions related to WAL replay, standby
> mode, fetching files from archive, computing the recovery target and so on,
> to new source file called xlogrecovery.c.

Wohoo!

I think this is desperately needed. I personally am more concerned about
the size of StartupXLOG() etc than the size of xlog.c itself, but since
both reasonably are done at the same time...


> That's a fairly clean split.  StartupXLOG() stays in xlog.c, but much of the
> code from it has been moved to new functions InitWalRecovery(),
> PerformWalRecovery() and EndWalRecovery(). The general idea is that xlog.c is
> still responsible for orchestrating the servers startup, but xlogrecovery.c
> is responsible for figuring out whether WAL recovery is needed, performing
> it, and deciding when it can stop.

For some reason "recovery" bothers me a tiny bit, even though it's obviously
already in use. Using "apply", or "replay" seems more descriptive to me, but
whatever.


> There's surely more refactoring we could do. xlog.c has a lot of global
> variables, with similar names but slightly different meanings for example.
> (Quick: what's the difference between InRedo, InRecovery, InArchiveRecovery,
> and RecoveryInProgress()? I have to go check the code every time to remind
> myself). But this patch tries to just move source code around for clarity.

Agreed, it's quite chaotic. I think a good initial step to clean up that mess
would be to just collect the relevant variables into one or two structs.


> There are small changes in the order that some of things are done in
> StartupXLOG(), for readability. I tried to be careful and check that the
> changes are safe, but a second pair of eyes would be appreciated on that.

I think it might be worth trying to break this into a bit more incremental
changes - it's a huge commit and mixing code movement with code changes makes
it really hard to review the non-movement portion.

> +void
> +PerformWalRecovery(void)
> +{

> +
> +     if (record != NULL)
> +     {
> +             ErrorContextCallback errcallback;
> +             TimestampTz xtime;
> +             PGRUsage        ru0;
> +             XLogRecPtr      ReadRecPtr;
> +             XLogRecPtr      EndRecPtr;
> +
> +             pg_rusage_init(&ru0);
> +
> +             InRedo = true;
> +
> +             /* Initialize resource managers */
> +             for (rmid = 0; rmid <= RM_MAX_ID; rmid++)
> +             {
> +                     if (RmgrTable[rmid].rm_startup != NULL)
> +                             RmgrTable[rmid].rm_startup();
> +             }
> +
> +             ereport(LOG,
> +                             (errmsg("redo starts at %X/%X",
> +                                             
> LSN_FORMAT_ARGS(xlogreader->ReadRecPtr))));
> +
> +             /*
> +              * main redo apply loop
> +              */
> +             do
> +             {

If we're refactoring all of this, can we move the apply-one-record part into
its own function as well? Happy to do that as a followup or precursor patch
too. The per-record logic has grown complicated enough to make that quite
worthwhile imo - and imo most of the time one either is interested in the
per-record work, or in the rest of the StartupXLog/PerformWalRecovery logic.

Greetings,

Andres Freund


Reply via email to