On Sat, Apr 2, 2022 at 5:47 AM Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > > On 4/1/22 17:02, Tomas Vondra wrote: > > So, I investigated this a bit more, and I wrote a couple test_decoding > isolation tests (patch attached) demonstrating the issue. Actually, I > should say "issues" because it's a bit worse than you described ... > > The whole problem is in this chunk of code in sequence_decode(): > > > /* Skip the change if already processed (per the snapshot). */ > if (transactional && > !SnapBuildProcessChange(builder, xid, buf->origptr)) > return; > else if (!transactional && > (SnapBuildCurrentState(builder) != SNAPBUILD_CONSISTENT || > SnapBuildXactNeedsSkip(builder, buf->origptr))) > return; > > /* Queue the increment (or send immediately if not transactional). */ > snapshot = SnapBuildGetOrBuildSnapshot(builder, xid); > ReorderBufferQueueSequence(ctx->reorder, xid, snapshot, buf->endptr, > origin_id, target_node, transactional, > xlrec->created, tuplebuf); > > With the script you described, the increment is non-transactional, so we > end up in the second branch, return and thus discard the increment. > > But it's also possible the change is transactional, which can only > trigger the first branch. But it does not, so we start building the > snapshot. But the first thing SnapBuildGetOrBuildSnapshot does is > > Assert(builder->state == SNAPBUILD_CONSISTENT); > > and we're still not in a consistent snapshot, so it just crashes and > burn :-( > > The sequences.spec file has two definitions of s2restart step, one empty > (resulting in non-transactional change), one with ALTER SEQUENCE (which > means the change will be transactional). > > > The really "funny" thing is this is not new code - this is an exact copy > from logicalmsg_decode(), and logical messages have all those issues > too. We may discard some messages, trigger the same Assert, etc. There's > a messages2.spec demonstrating this (s2message step defines whether the > message is transactional or not). >
It seems to me that the Assert in SnapBuildGetOrBuildSnapshot() is wrong. It is required only for non-transactional logical messages. For transactional message(s), we decide at the commit time whether the snapshot has reached a consistent state and then decide whether to skip the entire transaction or not. So, the possible fix for Assert could be that we pass an additional parameter 'transactional' to SnapBuildGetOrBuildSnapshot() and then assert only when it is false. I have also checked the development thread for this work and it appears to be introduced for non-transactional cases only. See email[1], this new function and Assert was for the non-transactional case and later while rearranging the code, this problem got introduced. Now, for the non-transactional cases, I am not sure if there is a one-to-one mapping with the sequence case. The way sequences are dealt with on the subscriber-side (first we copy initial data and then replicate the incremental changes) appears more as we deal with the table and its incremental changes. There is some commonality with non-transactional messages w.r.t the case where we want sequence changes to be sent even on rollbacks unless some DDL has happened for them but if we see the overall solution it doesn't appear that we can use it similar to messages. I think this is the reason we are facing the other problems w.r.t to syncing sequences to subscribers including the problem reported by Sawada-San yesterday. Now, the particular case where we won't send a non-transactional logical message unless the snapshot is consistent could be considered as its behavior and may be documented better. I am not very sure about this as there is no example of the way sync for these messages happens in the core but if someone outside the core wants a different behavior and presents the case then we can probably try to enhance it. I feel the same is not true for sequences because it can cause the replica (subscriber) to go out of sync with the master (publisher). > So I guess we need to fix both places, perhaps in a similar way. > It depends but I think for logical messages we should do the minimal fix required for Asserts and probably document the current behavior bit better unless we think there is a case to make it behave similar to sequences. [1] - https://www.postgresql.org/message-id/56D4B3AD.5000207%402ndquadrant.com -- With Regards, Amit Kapila.