>> IMO a test is needed that makes sure no one is going to break this in >> the future. > > You definitely need more complex test scenarios for that. If you can > come up with new ways to make the TAP tests of pg_rewind mode modular > in handling more complicated node setups, that would be a nice > addition, for example.
I'm sorry for lacking tests. For now, I started off with a simple test that cause the problem I mentioned. The updated WIP patch 0001 includes the new test for pg_rewind. And also, I'm afraid that I'm not sure what kind of tests I have to make for fix this behavior. Would you mind giving me some advice? > This patch is at least incorrect in its handling of crash recovery, > because these two should *never* be set in this case as we want to > replay up to the end of WAL. For example, see xlog.c or the top of > xlogrecovery.c about the assumptions behind these variables: > /* crash recovery should always recover to the end of WAL */ > ControlFile->minRecoveryPoint = InvalidXLogRecPtr; > ControlFile->minRecoveryPointTLI = 0; > > If an end-of-recovery record is replayed during crash recovery, these > assumptions are plain broken. That make sense! I really appreciate your knowledgeable review. > One thing that we could consider is to be more aggressive with > restartpoints when replaying this record for a standby, see a few > lines above the lines added by your patch, for example. And we could > potentially emulate a post-promotion restart point to get a refresh of > the control file as it should, with the correct code paths involved in > the updates of minRecoveryPoint when the checkpointer does the job. I'm not confident but you meant we could make restartpoint (i.e., call `RequestCheckpoint()`) instead of my old patch? The patch 0001 also contains my understanding. I also found a bug (maybe). If we call `CreateRestartPoint()` during crash-recovery, the assertion fails at ComputeXidHorizon() in procarray.c. It's inherently orthogonal to the problem I already reported. So you can reproduce this at HEAD with this procedure. 1. Start primary and standby server 2. Modify checkpoint_timeout to 1h on standby 3. Insert 10^10 records and concurrently run CHECKPOINT every second on primary 4. Do an immediate stop on both standby and primary at the end of the insert 5. Modify checkpoint_timeout to 30 on standby 6. Remove standby.signal on standby 7. Restart standby (it will start crash-recovery) 8. Assertion failure is raised shortly I think this is because `TruncateSUBTRANS();` in `CreateRestartPoint()` is called but `StartupSUBTRANS()` isn't called yet. In `StartupXLOG()`, we call `StartupSUBTRANS()` if `(ArchiveRecoveryRequested && EnableHotStandby)`. However, in `CreateRestartPoint()`, we call `TruncateSUBTRANS()` if `(EnableHotStandby)`. I guess the difference causes this bug. The latter possibly be called even crash-recovery while former isn't. The attached patch 0002 fixes it. I think we could discuss about this bug in another thread if needed. Best regards! Masaki Kuwamura
v2-0001-pg_rewind-Fix-bug-using-cascade-standby-as-source.patch
Description: Binary data
v1-0002-Fix-restartpoint-during-crash-recovery.patch
Description: Binary data