On Fri, Dec 13, 2019 at 7:50 AM Bossart, Nathan <bossa...@amazon.com> wrote:
>
> Hi hackers,
>
> I believe I've uncovered a bug that may cause archive status ".ready"
> files to be created too early, which in turn may cause an incorrect
> version of the corresponding WAL segment to be archived.
>
> The crux of the issue seems to be that XLogWrite() does not wait for
> the entire record to be written to disk before creating the ".ready"
> file. Instead, it just waits for the last page of the segment to be
> written before notifying the archiver. If PostgreSQL crashes before
> it is able to write the rest of the record, it will end up reusing the
> ".ready" segment at the end of crash recovery. In the meantime, the
> archiver process may have already processed the old version of the
> segment.

Maybe I'm missing something... But since XLogWrite() seems to call
issue_xlog_fsync() before XLogArchiveNotifySeg(), ISTM that this trouble
shouldn't happen. No?

Regards,

--
Fujii Masao
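
For readers following the thread, the scenario in the quoted report can be illustrated with a toy model. This is only a sketch and not PostgreSQL code: SEGMENT_SIZE, Record, and the three helper functions are hypothetical names invented here, and the rule that recovery ends at the last fully flushed record is a simplifying assumption. It contrasts marking a segment ready once its last byte is flushed with waiting until a record that crosses the segment boundary is also fully flushed.

```python
from dataclasses import dataclass

SEGMENT_SIZE = 16  # toy segment size in bytes; real WAL segments are far larger


@dataclass
class Record:
    start: int  # byte offset where the record begins
    end: int    # byte offset one past the record's last byte


def ready_by_segment_end(seg_no, flushed_upto):
    """Policy A (hypothetical): ready as soon as the segment's last byte is flushed."""
    return flushed_upto >= (seg_no + 1) * SEGMENT_SIZE


def ready_by_record_end(seg_no, flushed_upto, records):
    """Policy B (hypothetical): also require every record that starts in the
    segment to be completely flushed, even if it ends in a later segment."""
    seg_end = (seg_no + 1) * SEGMENT_SIZE
    if flushed_upto < seg_end:
        return False
    return all(r.end <= flushed_upto for r in records if r.start < seg_end)


def recovery_end(flushed_upto, records):
    """Simplified assumption: replay stops after the last complete record."""
    complete = [r.end for r in records if r.end <= flushed_upto]
    return max(complete, default=0)


# The second record starts in segment 0 and ends in segment 1.
records = [Record(0, 12), Record(12, 20)]
flushed_upto = 16  # crash right after segment 0's last byte reached disk

print("policy A ready:", ready_by_segment_end(0, flushed_upto))
print("policy B ready:", ready_by_record_end(0, flushed_upto, records))
print("recovery ends at offset:", recovery_end(flushed_upto, records))
```

In this model, policy A reports segment 0 as ready even though recovery ends inside segment 0, so new WAL could later be written into a segment that was already handed to the archiver. Whether the real code path behaves this way is exactly the question being discussed in the thread.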