On Fri, Oct 1, 2021 at 2:29 AM Robert Haas <robertmh...@gmail.com> wrote:

> On Thu, Sep 30, 2021 at 7:59 AM Amul Sul <sula...@gmail.com> wrote:
> > To find the value of InRecovery after we clear it, patch still uses
> > ControlFile's DBState, but now the check condition changed to a more
> > specific one which is less confusing.
> >
> > In casual off-list discussion, the point was made to check
> > SharedRecoveryState to find out the InRecovery value afterward, and
> > check that using RecoveryInProgress().  But we can't depend on
> > SharedRecoveryState because at the start it gets initialized to
> > RECOVERY_STATE_CRASH irrespective of InRecovery that happens later.
> > Therefore, we can't use RecoveryInProgress() which always returns
> > true if SharedRecoveryState != RECOVERY_STATE_DONE.
>
> Uh, this change has crept into 0002, but it should be in 0004 with the
> rest of the changes to remove dependencies on variables specific to
> the startup process. Like I said before, we should really be trying to
> separate code movement from functional changes. Also, 0002 doesn't
> actually apply for me. Did you generate these patches with 'git
> format-patch'?
>
> [rhaas pgsql]$ patch -p1 <
> ~/Downloads/v36-0001-Refactor-some-end-of-recovery-code-out-of-Startu.patch
> patching file src/backend/access/transam/xlog.c
> Hunk #1 succeeded at 889 (offset 9 lines).
> Hunk #2 succeeded at 939 (offset 12 lines).
> Hunk #3 succeeded at 5734 (offset 37 lines).
> Hunk #4 succeeded at 8038 (offset 70 lines).
> Hunk #5 succeeded at 8248 (offset 70 lines).
> [rhaas pgsql]$ patch -p1 <
> ~/Downloads/v36-0002-Postpone-some-end-of-recovery-operations-relatin.patch
> patching file src/backend/access/transam/xlog.c
> Reversed (or previously applied) patch detected!  Assume -R? [n]
> Apply anyway? [n] y
> Hunk #1 FAILED at 7954.
> Hunk #2 succeeded at 8079 (offset 70 lines).
> 1 out of 2 hunks FAILED -- saving rejects to file
> src/backend/access/transam/xlog.c.rej
> [rhaas pgsql]$ git reset --hard
> HEAD is now at b484ddf4d2 Treat ETIMEDOUT as indicating a
> non-recoverable connection failure.
> [rhaas pgsql]$ patch -p1 <
> ~/Downloads/v36-0002-Postpone-some-end-of-recovery-operations-relatin.patch
> patching file src/backend/access/transam/xlog.c
> Reversed (or previously applied) patch detected!  Assume -R? [n]
> Apply anyway? [n]
> Skipping patch.
> 2 out of 2 hunks ignored -- saving rejects to file
> src/backend/access/transam/xlog.c.rej
>
>
I tried to apply the patch on the master branch head and it's failing
with conflicts.

Later applied patch on below commit and it got applied cleanly:

commit 7d1aa6bf1c27bf7438179db446f7d1e72ae093d0
Author: Tom Lane <t...@sss.pgh.pa.us>
Date:   Mon Sep 27 18:48:01 2021 -0400

    Re-enable contrib/bloom's TAP tests.

rushabh@rushabh:postgresql$ git apply
v36-0001-Refactor-some-end-of-recovery-code-out-of-Startu.patch
rushabh@rushabh:postgresql$ git apply
v36-0002-Postpone-some-end-of-recovery-operations-relatin.patch
rushabh@rushabh:postgresql$ git apply
v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch
v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch:34: space
before tab in indent.
  /*
v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch:38: space
before tab in indent.
  */
v36-0003-Create-XLogAcceptWrites-function-with-code-from-.patch:39: space
before tab in indent.
  Insert->fullPageWrites = lastFullPageWrites;
warning: 3 lines add whitespace errors.
rushabh@rushabh:postgresql$ git apply
v36-0004-Remove-dependencies-on-startup-process-specifica.patch

There are whitespace errors on patch 0003.


> It seems to me that the approach you're pursuing here can't work,
> because the long-term goal is to get to a place where, if the system
> starts up read-only, XLogAcceptWrites() might not be called until
> later, after StartupXLOG() has exited. But in that case the control
> file state would be DB_IN_PRODUCTION. But my idea of using
> RecoveryInProgress() won't work either, because we set
> RECOVERY_STATE_DONE just after we set DB_IN_PRODUCTION. Put
> differently, the question we want to answer is not "are we in recovery
> now?" but "did we perform recovery?". After studying the code a bit, I
> think a good test might be
> !XLogRecPtrIsInvalid(XLogCtl->lastReplayedEndRecPtr). If InRecovery
> gets set to true, then we're certain to enter the if (InRecovery)
> block that contains the main redo loop. And that block unconditionally
> does XLogCtl->lastReplayedEndRecPtr = XLogCtl->replayEndRecPtr. I
> think that replayEndRecPtr can't be 0 because it's supposed to
> represent the record we're pretending to have last replayed, as
> explained by the comments. And while lastReplayedEndRecPtr will get
> updated later as we replay more records, I think it will never be set
> back to 0. It's only going to increase, as we replay more records. On
> the other hand if InRecovery = false then we'll never change it, and
> it seems that it starts out as 0.
>
> I was hoping to have more time today to comment on 0004, but the day
> seems to have gotten away from me. One quick thought is that it looks
> a bit strange to be getting EndOfLog from GetLastSegSwitchData() which
> returns lastSegSwitchLSN while getting EndOfLogTLI from replayEndTLI
> ... because there's also replayEndRecPtr, which seems to go with
> replayEndTLI. It feels like we should use a source for the TLI that
> clearly matches the source for the corresponding LSN, unless there's
> some super-good reason to do otherwise.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
>
>

-- 
Rushabh Lathia

Reply via email to