Yes, thanks for the clarification. I have a basic understanding of it now. My question is: is this considered a bug or a design defect in the codebase? If so, should we prevent it from occurring in general, not just for this specific test?

vignesh C <vignes...@gmail.com>
>
> We have three processes involved in this scenario:
> A walsender process on the publisher, responsible for decoding and
> sending WAL changes.
> An apply worker process on the subscriber, which applies the changes.
> A session executing the ALTER SUBSCRIPTION command.
>
> Due to the asynchronous nature of these processes, the ALTER
> SUBSCRIPTION command may not be immediately observed by the apply
> worker. Meanwhile, the walsender may process and decode an INSERT
> statement.
> If the insert targets a table (e.g., tab_3) that does not belong to
> the current publication (pub1), the walsender silently skips
> replicating the record and advances its decoding position. This
> position is sent in a keepalive message to the subscriber, and since
> there are no pending transactions to flush, the apply worker reports
> it as the latest received LSN.
> Later, when the apply worker eventually detects the subscription
> change, it restarts—but by then, the insert has already been skipped
> and is no longer eligible for replay, as the table was not part of the
> publication (pub1) at the time of decoding.
> This race condition arises because the three processes run
> independently and may progress at different speeds due to CPU
> scheduling or system load.
> Thoughts?
>
> Regards,
> Vignesh
>
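
To make the interleaving described above concrete, here is a toy, deterministic model of it in Python. This is not PostgreSQL code: the class, the function names, the second publication (pub2), the table tab_1, and the LSN value are all made up for illustration. Only pub1, tab_3, and the order of events come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical publication contents; only pub1 and tab_3 come from the thread.
PUB_TABLES = {"pub1": {"tab_1"}, "pub2": {"tab_3"}}
# One WAL record: (lsn, table, row). The LSN value is arbitrary.
WAL = [(100, "tab_3", "row-1")]


@dataclass
class ApplyWorker:
    publications: set                # list the running worker streams with
    pending: Optional[set] = None    # list set by ALTER SUBSCRIPTION, not yet observed
    received_lsn: int = 0            # last position reported back to the publisher
    applied: list = field(default_factory=list)

    def notice_alter(self):
        """Worker detects the subscription change and restarts with the new list."""
        if self.pending is not None:
            self.publications, self.pending = self.pending, None


def stream(worker):
    """Walsender decodes WAL past the worker's position, filtering by publication."""
    for lsn, table, row in WAL:
        if lsn <= worker.received_lsn:
            continue  # already confirmed, so it is not decoded again
        if any(table in PUB_TABLES[p] for p in worker.publications):
            worker.applied.append(row)
        # The position advances either way: with the change or via a keepalive.
        worker.received_lsn = lsn


def run(worker_notices_first):
    worker = ApplyWorker(publications={"pub1"})
    worker.pending = {"pub1", "pub2"}  # session runs ALTER SUBSCRIPTION
    if worker_notices_first:
        worker.notice_alter()
    stream(worker)         # walsender decodes the INSERT into tab_3
    worker.notice_alter()  # late restart (no-op if already done)
    stream(worker)         # nothing left to replay past received_lsn
    return worker.applied
```

In this model, run(True) returns the inserted row, while run(False) returns an empty list: the row was filtered out under the old publication list, and the confirmed position had already moved past it by the time the worker restarted.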