On Thu, Jul 9, 2020 at 6:14 PM Petr Jelinek <p...@2ndquadrant.com> wrote: > > Hi, > > On 09/07/2020 14:34, Amit Kapila wrote: > > On Thu, Jul 9, 2020 at 5:16 PM Petr Jelinek <p...@2ndquadrant.com> wrote: > >> > >> On 09/07/2020 13:10, Amit Kapila wrote: > >>> On Thu, Feb 6, 2020 at 2:40 PM Amit Kapila <amit.kapil...@gmail.com> > >>> wrote: > >>>> > >>>> During logical decoding, we send replication_origin and > >>>> replication_origin_lsn when we decode commit. In pgoutput_begin_txn, > >>>> we send values for these two but never used on the subscriber side. > >>>> Though we have provided a function (logicalrep_read_origin) to read > >>>> these two values but that is not used in code anywhere. > >>>> > >> > >> We don't use the origin message anywhere really because we don't support > >> origin forwarding in the built-in replication yet. That part I left out > >> intentionally in the original PG10 patchset as it's mostly useful for > >> circular replication detection when you want to replicate both ways. > >> However that's relatively useless without also having some kind of > >> conflict detection which would be another huge pile of code and I > >> expected we would end up not getting logical replication in PG10 at all > >> if I tried to push conflict detection as well :) > >> > > > > Fair enough. However, without tests and more documentation about this > > concept, it is likely that future development might break it. It is > > good that you and others who know this part well are there to respond > > but still, the more documentation and tests would be preferred. > > > > Honestly that part didn't even need to be committed given it's unused. > Protocol supports versioning so it could have been added at later time. > > >>> > >>> For the purpose of decoding in-progress transactions, I think we can > >>> send replication_origin in the first 'start' message as it is present > >>> with each WAL record, however replication_origin_lsn is only logged at > >>> commit time, so can't send it before commit. The > >>> replication_origin_lsn is set by pg_replication_origin_xact_setup() > >>> but it is not clear how and when that function can be used. Do we > >>> really need replication_origin_lsn before we decode the commit record? > >>> > >> > >> That's the SQL interface, C interface does not require that and I don't > >> think we need to do that. > >> > > > > I think when you are saying SQL interface, you referred to > > pg_replication_origin_xact_setup() but I am not sure which C interface > > you are referring to in the above sentence? > > > > All the stuff pg_replication_origin_xact_setup does internally. > > >> The existing apply code sets the > >> replorigin_session_origin_lsn only when processing commit message IIRC. > >> > > > > That's correct. However, we do send it via 'begin' callback which > > won't be possible with the streaming of in-progress transactions. Do > > we need to send this origin related information (origin, origin_lsn) > > while streaming of in-progress transactions? If so, when? As far as > > I can see, the origin_id can be sent with the first 'start' message. > > The origin_lsn and origin_commit can be sent with the last 'start' of > > streaming commit if we want but not sure if that is of use. If we > > need to send origin_lsn earlier than that then we need to record it > > with other WAL records (other than Commit WAL record). > > > > If we were to support the origin forwarding, then strictly speaking we > need everything only at commit time from correctness perspective, >
Okay. Anyway streaming mode is optional, so in such cases, we can keep it 'off' > but > ideally origin_id would be best sent with first message as it can be > used to filter out changes at decoding stage rather than while we > process the commit so having it set early improves performance of decoding. > Yeah, makes sense. So, we will just send origin_id (with first streaming start message) and leave others. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com