Yes, thanks for the clarification. I have a basic understanding of it now.
What I mean is: is this considered a bug or a design defect in the codebase?
If so, should we prevent it from occurring in general, not just in this
specific test?

vignesh C <vignes...@gmail.com> wrote:

>
> We have three processes involved in this scenario:
> - A walsender process on the publisher, responsible for decoding and
>   sending WAL changes.
> - An apply worker process on the subscriber, which applies the changes.
> - A session executing the ALTER SUBSCRIPTION command.
>
> Due to the asynchronous nature of these processes, the ALTER
> SUBSCRIPTION command may not be immediately observed by the apply
> worker. Meanwhile, the walsender may process and decode an INSERT
> statement.
> If the insert targets a table (e.g., tab_3) that does not belong to
> the current publication (pub1), the walsender silently skips
> replicating the record and advances its decoding position. This
> position is sent in a keepalive message to the subscriber, and since
> there are no pending transactions to flush, the apply worker reports
> it as the latest received LSN.
> Later, when the apply worker eventually detects the subscription
> change, it restarts—but by then, the insert has already been skipped
> and is no longer eligible for replay, as the table was not part of the
> publication (pub1) at the time of decoding.
> This race condition arises because the three processes run
> independently and may progress at different speeds due to CPU
> scheduling or system load.
> Thoughts?
>
> Regards,
> Vignesh
>
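
To check my understanding of the ordering described above, here is a rough
sketch of the two possible interleavings. It is only a toy model, not
PostgreSQL code: the LSN value 100, the table tab_1, and the idea that the
altered subscription covers tab_1 and tab_3 are all assumptions made up for
illustration.

```python
def simulate(worker_sees_alter_first: bool) -> list:
    """Toy model of the race; returns the tables whose INSERTs get applied."""
    wal = [(100, "tab_3")]            # one INSERT at a made-up LSN
    old_tables = {"tab_1"}            # assumed contents of pub1
    new_tables = {"tab_1", "tab_3"}   # assumed set after ALTER SUBSCRIPTION
    received_lsn = 0                  # last position reported by the apply worker
    applied = []

    def stream(tables, start_lsn):
        # Models one walsender run: decode WAL after start_lsn, send only
        # changes for published tables, but advance the position either way
        # (the keepalive carries the new position to the subscriber).
        nonlocal received_lsn
        for lsn, table in wal:
            if lsn <= start_lsn:
                continue              # already confirmed, never replayed
            if table in tables:
                applied.append(table)
            received_lsn = lsn

    if not worker_sees_alter_first:
        # The walsender decodes the INSERT while the worker still uses pub1.
        stream(old_tables, received_lsn)
    # The worker notices the change and restarts from its last received LSN.
    stream(new_tables, received_lsn)
    return applied


print(simulate(worker_sees_alter_first=True))
print(simulate(worker_sees_alter_first=False))
```

In this model the row is applied only when the worker restarts before the
walsender decodes the INSERT. In the other ordering the confirmed position
has already moved past it, so the restart has nothing to replay.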
