> On Mar 25, 2025, at 13:58, Phillip Diffley <phillip6...@gmail.com> wrote:
>
> Oh I see! I was conflating the data I see coming out of a replication slot
> with the internal organization of the WAL. I think the more specific question
> I am trying to answer is, as a consumer of a replication slot, how do I
> reason about what replication records will be made unavailable when I confirm
> an LSN? Here I am worried about situations where the replication connection
> is interrupted or the program processing the records crashes, and we need to
> replay records that may have been previously sent but were not fully
> processed.

It's up to the consuming client to keep track of where it is in the WAL (using
an LSN). When the client connects, it specifies the LSN to start streaming
from. For a logical slot, if that LSN is older than the slot's
`confirmed_flush_lsn`, the publisher starts at `confirmed_flush_lsn` instead;
for physical replication, requesting WAL that has already been removed returns
an error.

The client shouldn't confirm an LSN as flushed until everything up to that
point has been durably processed on its side, since any WAL before a confirmed
LSN should be assumed to be unavailable.
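As a minimal sketch of that "make it durable first, confirm second" rule, assuming each record arrives as an (LSN, payload) pair that represents a complete unit of work: the names `DurableCheckpoint`, `handle_batch`, `apply`, and `send_feedback` are hypothetical, and `send_feedback` stands in for whatever call your client library uses to report the flushed LSN.

```python
import json
import os
import tempfile


def parse_lsn(text: str) -> int:
    """Convert the 'XXX/XXX' text form of an LSN into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back into the 'XXX/XXX' text form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


class DurableCheckpoint:
    """Hypothetical helper that keeps the last fully processed LSN on disk."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> int:
        """Return the stored LSN, or 0 if nothing has been stored yet."""
        try:
            with open(self.path) as f:
                return parse_lsn(json.load(f)["lsn"])
        except FileNotFoundError:
            return 0

    def save(self, lsn: int) -> None:
        """Write to a temp file, fsync it, then atomically rename it."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump({"lsn": format_lsn(lsn)}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)


def handle_batch(checkpoint, records, apply, send_feedback):
    """Process records in order; confirm an LSN only after it is durable."""
    for lsn, payload in records:
        apply(payload)         # may raise; nothing is confirmed in that case
        checkpoint.save(lsn)   # durable on the client side first
        send_feedback(lsn)     # only now tell the publisher it was flushed
```

If `apply` fails partway through, the stored checkpoint and the last confirmed LSN both still point at the previous record, so the failed record can be streamed again after a restart.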
> For example, are the records sent by a replication slot always sent in the
> same order such that if I advance the confirmed_flush_lsn of a slot to the
> LSN of record "A", I will know that any records that had been streamed after
> record "A" will be replayable?

You know that any WAL generated after `confirmed_flush_lsn` is available for
replay. That's the oldest LSN that the client can specify on connection
(although it can specify a later one, if it exists). By default, logical
decoding streams transactions in commit order, so the order is the same when
the stream is replayed after a reconnect.

You shouldn't need to manually advance the replication slot. Instead, the
client specifies where it wants to start when it connects. The client is also
expected to send back regular feedback messages letting the publisher /
primary know that it has successfully consumed up to a particular point in the
WAL, so the publisher / primary knows it can release the WAL before that
point.
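To make the replay guarantee concrete, here is a toy model of a slot from the consumer's point of view. It is a sketch under simplifying assumptions, not PostgreSQL's implementation: `SlotModel` is a hypothetical name, LSNs are plain integers, and each record stands for one committed transaction identified by its end LSN.

```python
class SlotModel:
    """Toy model of which records a consumer can still replay."""

    def __init__(self, records):
        # records: iterable of (lsn, payload), kept in LSN order
        self.records = sorted(records)
        self.confirmed_flush_lsn = 0

    def confirm(self, lsn: int) -> None:
        """Record client feedback; the confirmed position never moves back."""
        self.confirmed_flush_lsn = max(self.confirmed_flush_lsn, lsn)

    def stream_from(self, requested_lsn: int):
        """Return the records a client receives when it (re)connects.

        Streaming starts at the later of the requested LSN and the
        confirmed LSN, so nothing at or before the confirmed point
        is sent again.
        """
        start = max(requested_lsn, self.confirmed_flush_lsn)
        return [(lsn, p) for lsn, p in self.records if lsn > start]


slot = SlotModel([(100, "A"), (200, "B"), (300, "C")])

# Before any confirmation, everything after the requested LSN is replayable.
before = slot.stream_from(0)

# After confirming "A", "A" is gone but "B" and "C" are still replayable,
# in the same order, even if the client asks for an older position.
slot.confirm(100)
after = slot.stream_from(0)
```

In this model `before` holds all three records and `after` holds only "B" and "C". With psycopg2, for instance, the confirmation step would correspond to the replication cursor's `send_feedback(flush_lsn=...)` call.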