Dear Amit,

> Your analysis sounds correct to me.

Okay, so we have the same picture...

> > IIUC, the root cause is that pg_create_logical_replication_slot() returns a
> > LSN which is not generated yet. So, I think both mine [1] and Euler's approach
> > [2] can solve the issue. My proposal was to add an extra WAL record after the
> > final slot creation, and Euler's one was to use a restart_lsn as the
> > recovery_target_lsn.
> >
>
> I don't think it is correct to set restart_lsn as consistent_lsn point
> because the same is used to set replication origin progress. Later
> when we start the subscriber, the system will use that LSN as a
> start_decoding_at point which is the point after which all the commits
> will be replicated. So, we will end up incorrectly using restart_lsn
> (LSN from where we start reading the WAL) as start_decoding_at point.
> How could that be correct?

I didn't say we could use restart_lsn as the consistent point of logical
replication, but I agree that the approach has issues.

> Now, even if we use restart_lsn as recovery_target_lsn and the LSN
> returned by pg_create_logical_replication_slot() as consistent LSN to
> set replication progress, that also could lead to data loss because
> the subscriber may never get data between restart_lsn value and
> consistent LSN value.

Are you considering the case where, e.g., tuples are inserted just after the
restart_lsn but before the RUNNING_XACTS record? In that case, yes: streaming
replication finishes before replicating those tuples, and logical replication
then skips them because they precede the consistent LSN. So Euler's approach
cannot be used as-is.

Best regards,
Hayato Kuroda
FUJITSU LIMITED