Re: archive status ".ready" files may be created too early

Alvaro Herrera Fri, 30 Jul 2021 15:22:31 -0700

On 2021-Jul-30, Bossart, Nathan wrote:

> On 7/30/21, 11:34 AM, "Alvaro Herrera" <[email protected]> wrote:
> > Hmm ... I'm not sure we're prepared to backpatch this kind of change.
> > It seems a bit too disruptive to how replay works.  I think we
> > should be focusing solely on patch 0001 to surgically fix the precise
> > bug you see.  Does patch 0002 exist because you think that a system with
> > only 0001 will not correctly deal with a crash at the right time?
> 
> Yes, that was what I was worried about.  However, I just performed a
> variety of tests with just 0001 applied, and I am beginning to suspect
> my concerns were unfounded.  With wal_buffers set very high,
> synchronous_commit set to off, and a long sleep at the end of
> XLogWrite(), I can reliably cause the archive status files to lag far
> behind the current open WAL segment.  However, even if I crash at this
> time, the .ready files are created when the server restarts (albeit
> out of order).  This appears to be due to the call to
> XLogArchiveCheckDone() in RemoveOldXlogFiles().  Therefore, we can
> likely abandon 0002.


That's great to hear.  I'll give 0001 a look again.
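
For anyone following along, here is a rough toy model of the restart
behavior described above, as I understand it: when old segments are
scanned, any segment with neither a .done nor a .ready file gets its
.ready file created at that point.  This is a Python sketch, not
PostgreSQL code.  archive_check_done() and scan_old_segments() are
hypothetical stand-ins for XLogArchiveCheckDone() and
RemoveOldXlogFiles(), and the directory layout is simplified.

```python
import os
import tempfile

HEX = set("0123456789ABCDEF")


def archive_check_done(status_dir, segname):
    """Toy stand-in for XLogArchiveCheckDone(): True only if already archived."""
    done = os.path.join(status_dir, segname + ".done")
    ready = os.path.join(status_dir, segname + ".ready")
    if os.path.exists(done):
        return True
    if os.path.exists(ready):
        return False
    # Neither status file exists: create the missing .ready file now.
    open(ready, "w").close()
    return False


def scan_old_segments(wal_dir, current_seg):
    """Toy stand-in for the scan in RemoveOldXlogFiles().

    Returns the segments that would be safe to remove or recycle.
    """
    status_dir = os.path.join(wal_dir, "archive_status")
    removable = []
    # os.listdir() gives no ordering guarantee, so any .ready files
    # created here may appear out of order.
    for name in os.listdir(wal_dir):
        if len(name) != 24 or not set(name) <= HEX:
            continue  # not a WAL segment file name
        if name >= current_seg:
            continue  # only segments older than the current one
        if archive_check_done(status_dir, name):
            removable.append(name)
    return sorted(removable)


def demo():
    """Simulate a crash that left segments 3 and 4 without status files."""
    with tempfile.TemporaryDirectory() as wal_dir:
        status_dir = os.path.join(wal_dir, "archive_status")
        os.mkdir(status_dir)
        segs = ["0000000100000000%08X" % i for i in range(1, 6)]
        for seg in segs:
            open(os.path.join(wal_dir, seg), "w").close()
        open(os.path.join(status_dir, segs[0] + ".done"), "w").close()
        open(os.path.join(status_dir, segs[1] + ".ready"), "w").close()
        removable = scan_old_segments(wal_dir, current_seg=segs[4])
        return removable, sorted(os.listdir(status_dir))
```

In demo(), segment 1 is already archived, segment 2 is already marked
ready, and segments 3 and 4 have no status file at all, which is the
lagging state produced by the test setup above.  After the scan, 3 and
4 have .ready files and only segment 1 is reported as removable.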

> > Now, the reason I'm looking at this patch series is that we're seeing a
> > related problem with walsender/walreceiver, which apparently are capable
> > of creating a file in the replica that ends up not existing in the
> > primary after a crash, for a reason closely related to what you
> > describe for WAL archival.  I'm not sure what is going on just yet, so
> > I'm not going to try and explain because I'm likely to get it wrong.
> 
> I've suspected that this is due to the use of the flushed location for
> the send pointer, which AFAICT needn't align with a WAL record
> boundary.

Yeah, I had gotten as far as the GetFlushRecPtr but haven't tracked down
what happens with a contrecord.
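
To make the boundary question concrete, here is a small Python sketch
of what "needn't align with a WAL record boundary" means.  It is only
an illustration and says nothing about where the actual bug is.  The
record layout is made up, page headers and alignment padding are
ignored, and record_containing() and crosses_segment() are hypothetical
helpers, not PostgreSQL functions.  The 16MB segment size is the
default wal_segment_size.

```python
from bisect import bisect_right

SEG_SIZE = 16 * 1024 * 1024  # default wal_segment_size (16MB)


def record_containing(starts, lengths, ptr):
    """Return (start, end) of the record that ptr falls strictly inside.

    starts must be sorted.  Returns None when ptr sits exactly on a
    record boundary or before the first record.
    """
    i = bisect_right(starts, ptr) - 1
    if i < 0:
        return None
    start = starts[i]
    end = start + lengths[i]
    if start < ptr < end:
        return (start, end)
    return None


def crosses_segment(start, end, seg_size=SEG_SIZE):
    """True if the record's bytes [start, end) span two or more segments."""
    return start // seg_size != (end - 1) // seg_size


# Two made-up records; the second one straddles the first segment boundary,
# so its tail would be continuation data in the next segment.
STARTS = [SEG_SIZE - 400, SEG_SIZE - 100]
LENGTHS = [300, 300]

# A flush pointer that lands exactly on the segment boundary is in the
# middle of the second record.
MID_RECORD = record_containing(STARTS, LENGTHS, SEG_SIZE)
```

With the flush pointer at SEG_SIZE, the first segment is complete on
disk, but the record that begins 100 bytes before its end is not, and
the remainder lives in the next segment.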


-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
