2018-08-30 19:34 GMT+02:00 Michael Paquier <mich...@paquier.xyz>:

> I have been struggling for a couple of hours to get a deterministic test
> case out of my pocket, and I did not get one as you would need to get
> the bgwriter to flush a page before crash recovery finishes, we could do
In my case the active standby server had crashed; it wasn't in crash
recovery mode.

> the time, particularly for slow machines. Anyway, I did more code
> review and I think that I found another issue with XLogNeedsFlush(),
> which could enforce updateMinRecoveryPoint to false if called before
> XLogFlush during crash recovery from another process than the startup
> process, so if it got called before XLogFlush() we'd still have the same
> issue for a process doing both operations. Hence, I have come up with

At least XLogNeedsFlush() is called from just a couple of places and
doesn't break the bgwriter, but anyway, thanks for finding it.

> the attached, which actually brings back the code to what it was before
> 8d68ee6 for those routines, except that we have fast-exit paths for the
> startup process so as it is still able to replay all WAL available and
> avoid page reference issues post-promotion, deciding when to update its
> own copy of minRecoveryPoint when it finishes crash recovery. This also
> saves from a couple of locks on the control file from the startup
> process.

Sounds good.

> If you apply the patch and try it on your standby, are you able to get
> things up and working?

Nope, minRecoveryPoint in pg_control is still wrong, and therefore startup
still aborts at the same place if there are connections open. I think there
is no way to fix it other than to let it replay a sufficient amount of WAL
without open connections.

Just judging from the timestamps of the WAL files in pg_xlog, it is obvious
that a few moments before the hardware crashed postgres was replaying
0000000500000AB300000057, because the next file has a smaller mtime (it was
recycled):

-rw------- 1 akukushkin akukushkin 16777216 Aug 22 07:22 0000000500000AB300000057
-rw------- 1 akukushkin akukushkin 16777216 Aug 22 07:01 0000000500000AB300000058

Minimum recovery ending location is AB3/4A1B3118, but at the same time I
managed to find pages from 0000000500000AB300000053 on disk (at least in
the index files). That can only mean that the bgwriter was flushing dirty
pages while pg_control wasn't being properly updated, and that this happened
not during recovery after the hardware crash, but while postgres was running
before the hardware crash.

The only possible way to recover such a standby is to cut off all possible
connections and let it replay all the WAL files it managed to write to disk
before the first crash.

Regards,
--
Alexander Kukushkin
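
For reference, the two checks described above (the control file's minimum
recovery ending location versus the WAL segments actually written before
the crash) can be reproduced with pg_controldata and a plain directory
listing. This is only a sketch of the procedure: the $PGDATA path is an
assumption, while the field name, segment names and timestamps are taken
from the report above.

# Assumption: PGDATA points at the standby's data directory (pre-10
# layout, so WAL lives in pg_xlog).
pg_controldata "$PGDATA" | grep 'Minimum recovery ending location'
#   Minimum recovery ending location:     AB3/4A1B3118

# The last segment written before the crash is the newest one by mtime;
# 0000000500000AB300000058 has an older mtime because it is a recycled,
# not-yet-used segment, so replay had reached 0000000500000AB300000057.
ls -lt "$PGDATA"/pg_xlog | head
#   -rw------- 1 akukushkin akukushkin 16777216 Aug 22 07:22 0000000500000AB300000057
#   -rw------- 1 akukushkin akukushkin 16777216 Aug 22 07:01 0000000500000AB300000058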
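
One possible way to implement "cut off all possible connections and let it
replay" is sketched below. This is a suggestion under stated assumptions,
not something verified in this thread: it assumes a standby where refusing
read-only connections during recovery is acceptable, and uses hot_standby
as the mechanism for keeping clients out.

# Assumption: temporarily disabling hot standby is an acceptable way to
# keep all client connections off the standby.
# In postgresql.conf on the broken standby:
#     hot_standby = off

# Start the standby and let the startup process replay everything that is
# already present in pg_xlog, following replay progress in the server log:
pg_ctl -D "$PGDATA" start

# Once replay has gone past the WAL written before the first crash,
# set hot_standby back to on and restart to accept read-only connections:
pg_ctl -D "$PGDATA" restart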