On Tue, Feb 9, 2021 at 11:29 AM Ashutosh Bapat
<ashutosh.bapat....@gmail.com> wrote:
>
> On Tue, Feb 9, 2021 at 8:32 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Mon, Feb 8, 2021 at 8:36 PM Markus Wanner
> > <markus.wan...@enterprisedb.com> wrote:
> > >
> > > Hello Amit,
> > >
> > > thanks for your very quick response.
> > >
> > > On 08.02.21 11:13, Amit Kapila wrote:
> > > > /*
> > > >  * It is possible that this transaction is not decoded at prepare time
> > > >  * either because by that time we didn't have a consistent snapshot or it
> > > >  * was decoded earlier but we have restarted. We can't distinguish between
> > > >  * those two cases so we send the prepare in both the cases and let
> > > >  * downstream decide whether to process or skip it. We don't need to
> > > >  * decode the xact for aborts if it is not done already.
> > > >  */
> > >
> > > The way I read the surrounding code, the only case a 2PC transaction
> > > does not get decoded a prepare time is if the transaction is empty. Or
> > > are you aware of any other situation that might currently happen?
> > >
> >
> > We also skip decoding at prepare time if we haven't reached a
> > consistent snapshot by that time. See below code in DecodePrepare().
> > DecodePrepare()
> > {
> >     ..
> >     /* We can't start streaming unless a consistent state is reached. */
> >     if (SnapBuildCurrentState(builder) < SNAPBUILD_CONSISTENT)
> >     {
> >         ReorderBufferSkipPrepare(ctx->reorder, xid);
> >         return;
> >     }
> >     ..
> > }
>
> Can you please provide steps which can lead to this situation?
>

Ajin has already shared the example with you.

> If
> there is an earlier discussion which has example scenarios, please
> point us to the relevant thread.
>

It started in the email [1], and from there you can read the later
emails to know more about this.

> If we are not sending PREPARED transactions that's fine,
>

Hmm, I am not sure that is fine, because if the output plugin sets
the two-phase-commit option, it would expect all prepared xacts to
arrive, not only some of them.

> but sending
> the same prepared transaction as many times as the WAL sender is
> restarted between sending prepare and commit prepared is a waste of
> network bandwidth.
>

I think something similar happens even without any of the work done in
PG-14, if we restart the apply worker before the commit completes on
the subscriber. After the restart, we will send the start_decoding_at
point based on some previous commit, which will make the publisher send
the entire transaction again. I don't think a restart of the WAL sender
or WAL receiver is such a common thing. It can only happen because of
some bug in the code, because the user wishes to stop the nodes, or
because some crash happened.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2Bd3gzCyzsYjt1m6sfGf_C_uFmo9JK%3D3Wafp6yR8Mg8uQ%40mail.gmail.com

--
With Regards,
Amit Kapila.
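
To illustrate the "let downstream decide whether to process or skip it"
part of the comment quoted above, here is a minimal sketch of a
downstream that remembers which prepared transactions it has already
applied and skips a PREPARE that arrives again after a restart.
Everything in it is a hypothetical assumption for illustration only:
PreparedSet, downstream_handle_prepare(),
downstream_handle_commit_prepared(), the fixed-size array, and the
choice to identify a transaction by its GID string. It is not
PostgreSQL code and not necessarily how the patch or any subscriber
does it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limits, chosen only for this sketch. */
#define MAX_PREPARED 64
#define GID_LEN      200

typedef enum PrepareAction
{
    PREPARE_APPLY,          /* first time we see this GID: apply it */
    PREPARE_SKIP_DUPLICATE, /* already applied earlier: skip it */
    PREPARE_ERROR           /* cannot track it (set full or GID too long) */
} PrepareAction;

/* Hypothetical downstream bookkeeping; not a PostgreSQL structure. */
typedef struct PreparedSet
{
    char gids[MAX_PREPARED][GID_LEN];
    int  count;
} PreparedSet;

/* Return the index of gid in the set, or -1 if it is not present. */
static int
prepared_set_find(const PreparedSet *set, const char *gid)
{
    for (int i = 0; i < set->count; i++)
    {
        if (strcmp(set->gids[i], gid) == 0)
            return i;
    }
    return -1;
}

/*
 * Decide what to do with an incoming PREPARE. The upstream may send the
 * same PREPARE again after a restart, so a GID we already hold is skipped.
 */
static PrepareAction
downstream_handle_prepare(PreparedSet *set, const char *gid)
{
    assert(set != NULL && gid != NULL);

    if (prepared_set_find(set, gid) >= 0)
        return PREPARE_SKIP_DUPLICATE;

    if (set->count >= MAX_PREPARED || strlen(gid) >= GID_LEN)
        return PREPARE_ERROR;

    strcpy(set->gids[set->count], gid);
    set->count++;
    return PREPARE_APPLY;
}

/* Forget the GID once COMMIT PREPARED (or ROLLBACK PREPARED) is applied. */
static void
downstream_handle_commit_prepared(PreparedSet *set, const char *gid)
{
    int idx;

    assert(set != NULL && gid != NULL);

    idx = prepared_set_find(set, gid);
    if (idx < 0)
        return;

    /* Fill the hole with the last entry, unless idx is the last entry. */
    if (idx != set->count - 1)
        strcpy(set->gids[idx], set->gids[set->count - 1]);
    set->count--;
}
```

With this, a PREPARE for the same GID received twice is applied once
and skipped the second time. A real downstream would have to keep this
state durably so that it survives its own restart, which the sketch
does not attempt.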