On 2019-Dec-13, Kyotaro Horiguchi wrote:

> At Thu, 12 Dec 2019 22:50:20 +0000, "Bossart, Nathan" <bossa...@amazon.com> wrote in
> > The crux of the issue seems to be that XLogWrite() does not wait for
> > the entire record to be written to disk before creating the ".ready"
> > file.  Instead, it just waits for the last page of the segment to be
> > written before notifying the archiver.  If PostgreSQL crashes before
> > it is able to write the rest of the record, it will end up reusing the
> > ".ready" segment at the end of crash recovery.  In the meantime, the
> > archiver process may have already processed the old version of the
> > segment.
>
> Yeah, that can happen if the server restarted after the crash.

... which is the normal way to run things, no?

> > servers after the primary server has crashed because it ran out of
> > disk space.
>
> In the first place, it's quite bad to set restart_after_crash to on,
> or just restart crashed master in replication set.

Why is it bad?  It's the default value.

> The standby can be inconsistent at the time of master crash, so it
> should be fixed using pg_rewind or should be recreated from a base
> backup.

Surely the master will just come up and replay its WAL, and there should
be no inconsistency.  You seem to be thinking that a standby is promoted
immediately on crash of the master, but this is not a given.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services