On Fri, May 11, 2018 at 8:39 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> Michael Paquier wrote: > > On Thu, May 10, 2018 at 10:52:12AM +0530, Pavan Deolasee wrote: > > > I propose that we should always clear the minRecoveryPoint after > promotion > > > to ensure that crash recovery always run to the end if a just-promoted > > > standby crashes before completing its first regular checkpoint. A WIP > patch > > > is attached. > > > > I have been playing with your patch and upgraded the test to check as > > well for cascading standbys. We could use that in the final patch. > > That's definitely something to add in the recovery test suite, and the > > sleep phases should be replaced by waits on replay and/or flush. > > > > Still, that approach looks sensitive to me. A restart point could be > > running while the end-of-recovery record is inserted, so your patch > > could update minRecoveryPoint to InvalidXLogRecPtr, and then a restart > > point would happily update again the control file's minRecoveryPoint to > > lastCheckPointEndPtr because it would see that the former is older than > > lastCheckPointEndPtr (let's not forget that InvalidXLogRecPtr is 0), so > > you could still crash on invalid pages? > > Yeah, I had this exact comment, but I was unable to come up with a test > case that would cause a problem. > Looks like I didn't understand Alvaro's comment when he mentioned it to me off-list. But I now see what Michael and Alvaro mean and that indeed seems like a problem. I was thinking that the test for (ControlFile->state == DB_IN_ARCHIVE_RECOVERY) will ensure that minRecoveryPoint can't be updated after the standby is promoted. While that's true for a DB_IN_PRODUCTION, the RestartPoint may finish after we have written end-of-recovery record, but before we're in production and thus the minRecoveryPoint may again be set. > > > I need to think a bit more about that stuff, but one idea would be to > > use a special state in the control file to mark it as ending recovery, > > this way we would control race conditions with restart points. > > Hmm. Can we change the control file in released branches? (It should > be possible to make the new server understand both old and new formats, > but I think this is breaking new ground and it looks easy to introduce > more bugs there.) > > Can't we just remember that in shared memory state instead of writing to the control file? So if we've already performed end-of-recovery, we don't update the minRecoveryPoint when RestartPoint completes. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services