On Fri, 31 May 2019 at 17:31, Amit Khandekar <amitdkhan...@gmail.com> wrote:
>
> On Fri, 31 May 2019 at 11:08, Amit Khandekar <amitdkhan...@gmail.com> wrote:
> >
> > On Thu, 30 May 2019 at 20:13, Andres Freund <and...@anarazel.de> wrote:
> > >
> > > Hi,
> > >
> > > On 2019-05-30 19:46:26 +0530, Amit Khandekar wrote:
> > > > @@ -1042,7 +1042,8 @@ ReplicationSlotReserveWal(void)
> > > >     if (!RecoveryInProgress() && SlotIsLogical(slot))
> > > >     {
> > > >         ....
> > > >     }
> > > >     else
> > > >     {
> > > > -       restart_lsn = GetRedoRecPtr();
> > > > +       restart_lsn = SlotIsLogical(slot) ?
> > > > +           GetXLogReplayRecPtr(&ThisTimeLineID) : GetRedoRecPtr();
> > > >
> > > > But then when I do pg_create_logical_replication_slot(), it hangs in
> > > > DecodingContextFindStartpoint(), waiting to find new records
> > > > (XLogReadRecord).
> > >
> > > But just till the primary has logged the necessary WAL records? If you
> > > just do CHECKPOINT; or such on the primary, it should succeed quickly?
> >
> > Yes, it waits until there is a commit record, or (just tried) until a
> > checkpoint command.
>
> Is XLOG_RUNNING_XACTS record essential for the logical decoding to
> build a consistent snapshot ?
> Since the restart_lsn is now ReplayRecPtr, there is no
> XLOG_RUNNING_XACTS record, and so the snapshot state is not yet
> SNAPBUILD_CONSISTENT. And so
> DecodingContextFindStartpoint()=>DecodingContextReady() never returns
> true, and hence DecodingContextFindStartpoint() goes in an infinite
> loop, until it gets XLOG_RUNNING_XACTS.

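To make the hang described above concrete, here is a toy model of the
wait loop. It is only a sketch and not backend code: every toy_/Toy name
is made up for illustration, and it assumes the simplest case, where a
single XLOG_RUNNING_XACTS record is enough to reach a consistent
snapshot (the real snapshot builder can need more than one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins. The names echo PostgreSQL's, but none of this is backend code. */
typedef enum { TOY_XLOG_OTHER, TOY_XLOG_RUNNING_XACTS } ToyRecordType;
typedef enum { TOY_SNAPBUILD_START, TOY_SNAPBUILD_CONSISTENT } ToySnapState;

/* Plays the role of DecodingContextReady(): true once the snapshot is consistent. */
static bool
toy_context_ready(ToySnapState state)
{
    return state == TOY_SNAPBUILD_CONSISTENT;
}

/*
 * Plays the role of DecodingContextFindStartpoint(): consume records until
 * the context is ready. Returns the number of records consumed, or -1 if
 * the stream ran out first. The real function would not return in that
 * case; it would keep waiting for more WAL, which is the hang.
 */
static int
toy_find_startpoint(const ToyRecordType *wal, size_t nrecords)
{
    ToySnapState state = TOY_SNAPBUILD_START;

    assert(wal != NULL || nrecords == 0);

    for (size_t i = 0; i < nrecords; i++)
    {
        /* Assumption: one running-xacts record makes the snapshot consistent. */
        if (wal[i] == TOY_XLOG_RUNNING_XACTS)
            state = TOY_SNAPBUILD_CONSISTENT;

        if (toy_context_ready(state))
            return (int) (i + 1);
    }

    return -1;
}
```

With a stream that contains no TOY_XLOG_RUNNING_XACTS entry, the
function returns -1, which corresponds to waiting indefinitely on an
idle primary.
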
After giving this more thought, I think it might make sense to arrange
for an xl_running_xacts record to be sent from the master to the standby
when a logical slot is to be created on the standby. How about the
standby sending a new message type to the master, requesting an
xl_running_xacts record? Then, on the master, ProcessStandbyMessage()
would process this new message type and call LogStandbySnapshot().

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
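
PS: A rough, standalone sketch of the dispatch idea above, in case it
helps the discussion. It is not a patch. The toy_ functions are
stand-ins, and the 'x' message byte is a made-up placeholder for the new
request type; only 'r' (status update) and 'h' (hot standby feedback)
correspond to message types that ProcessStandbyMessage() handles today.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the toy master "logged" a running-xacts snapshot. */
static int toy_snapshots_logged = 0;

/* Stand-in for LogStandbySnapshot(); the real one writes a WAL record. */
static void
toy_log_standby_snapshot(void)
{
    toy_snapshots_logged++;
}

/*
 * Stand-in for ProcessStandbyMessage() on the master. Returns true if the
 * message type was recognised, false otherwise.
 */
static bool
toy_process_standby_message(char msgtype)
{
    switch (msgtype)
    {
        case 'r':               /* standby status update */
            return true;
        case 'h':               /* hot standby feedback */
            return true;
        case 'x':               /* hypothetical: request a running-xacts record */
            toy_log_standby_snapshot();
            return true;
        default:                /* unknown message type */
            return false;
    }
}
```

Calling toy_process_standby_message('x') bumps toy_snapshots_logged by
one, while 'r' and 'h' leave it untouched.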