Re: [HACKERS] ThisTimeLineID in checkpointer and bgwriter processes

Heikki Linnakangas Fri, 21 Dec 2012 00:14:50 -0800

On 21.12.2012 08:18, Amit Kapila wrote:

On Thursday, December 20, 2012 11:15 PM Heikki Linnakangas wrote:

On 20.12.2012 18:19, Fujii Masao wrote:

InstallXLogFileSegment() also uses ThisTimeLineID. But your recent commit
doesn't take care of it and prevents the standby from recycling the WAL
files properly. Specifically, the standby recycles the WAL file to wrong
name.


A-ha, good catch. So that's actually a live bug in 9.1 and 9.2 as well:
after the recovery target timeline has changed, restartpoints will
continue to preallocate/recycle WAL files for the old timeline. That's
otherwise harmless, but the useless WAL files waste space, and
walreceiver will have to always create new files.
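
To illustrate why a stale timeline ID produces a misnamed file: a WAL
segment's file name starts with the timeline ID as eight hex digits,
followed by the log and segment numbers. The following is only a minimal
sketch of that naming layout; wal_file_name() is a hypothetical stand-in
written for this example, not the server's own macro.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL source): build a 24-character WAL
 * segment file name from a timeline ID, a log number and a segment number.
 */
static void
wal_file_name(char *buf, size_t len, uint32_t tli, uint32_t log, uint32_t seg)
{
    snprintf(buf, len, "%08X%08X%08X",
             (unsigned) tli, (unsigned) log, (unsigned) seg);
}
```

With this layout, a segment recycled while the process still believes it is
on timeline 1 gets a name beginning with 00000001, whereas walreceiver on
timeline 2 looks for a name beginning with 00000002 and so has to create a
new file.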

So instead of always running with ThisTimeLineID = 0 in the checkpointer
process, I guess we'll have to update it to the timeline being replayed,
when creating a restartpoint.


Shouldn't there be a check if(RecoveryInProgress), before assigning
RecoveryTargetTLI to ThisTimeLineID in CreateRestartPoint()?

Hmm, I don't think so. You're not supposed to get that far in
CreateRestartPoint() if recovery has already ended, or is just being
ended. The startup process "ends recovery", i.e. makes
RecoveryInProgress() return false, only after writing the
end-of-recovery checkpoint. And after the end-of-recovery checkpoint has
been written, CreateRestartPoint() will do nothing, because the
end-of-recovery checkpoint is later than the last potential
restartpoint. I'm talking about this check in CreateRestartPoint():

        if (XLogRecPtrIsInvalid(lastCheckPointRecPtr) ||
                XLByteLE(lastCheckPoint.redo, ControlFile->checkPointCopy.redo))
        {
                ereport(DEBUG2,
                                (errmsg("skipping restartpoint, already performed at %X/%X",
                                                (uint32) (lastCheckPoint.redo >> 32), (uint32) lastCheckPoint.redo)));
                ...
                return false;
        }


However, there's this just before we recycle WAL segments:

        /*
         * Update pg_control, using current time.  Check that it still shows
         * IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing;
         * this is a quick hack to make sure nothing really bad happens if
         * somehow we get here after the end-of-recovery checkpoint.
         */
        LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
        if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&
                XLByteLT(ControlFile->checkPointCopy.redo, lastCheckPoint.redo))
        {

                ...

but I believe that "quick hack" is just paranoia. You should not get
that far after the end-of-recovery checkpoint.

In any case, if you somehow get there anyway, the worst that will happen
is that some old WAL segments are recycled/preallocated on the old
timeline, wasting some space until the next checkpoint.
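
The two guards discussed above can be modelled in isolation. This is only a
rough sketch under simplifying assumptions: WalPos, DbState,
restartpoint_needed() and may_update_control() are hypothetical names
invented for the example, WAL positions are treated as plain 64-bit
integers with 0 meaning "invalid", and none of the real locking or shared
state is represented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for illustration only, not PostgreSQL types. */
typedef uint64_t WalPos;
#define INVALID_WAL_POS ((WalPos) 0)

typedef enum { IN_ARCHIVE_RECOVERY, IN_PRODUCTION } DbState;

/*
 * First guard: there is nothing to do if no restartpoint candidate has
 * been recorded, or if pg_control already points at a checkpoint whose
 * redo position is at or beyond the candidate's (for instance the
 * end-of-recovery checkpoint).
 */
static bool
restartpoint_needed(WalPos last_rec_ptr, WalPos last_redo, WalPos control_redo)
{
    if (last_rec_ptr == INVALID_WAL_POS || last_redo <= control_redo)
        return false;
    return true;
}

/*
 * Second guard (the "quick hack"): only update pg_control while it still
 * shows archive recovery and a strictly older checkpoint.
 */
static bool
may_update_control(DbState state, WalPos control_redo, WalPos last_redo)
{
    return state == IN_ARCHIVE_RECOVERY && control_redo < last_redo;
}
```

In this model, once control_redo has advanced to the end-of-recovery
checkpoint, restartpoint_needed() returns false for any older candidate, so
the second guard is never reached on that path.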


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
