On 2019-Jul-26, Andres Freund wrote: > 2) We could simply assign the subtransaction to the parent using > ReorderBufferAssignChild() in SnapBuildProcessNewCid() or it's > caller. That ought to also fix the bug > > I also has the advantage that we can save some memory in transactions > that have some, but fewer than the ASSIGNMENT limit subtransactions, > because it allows us to avoid having a separate base snapshot for > them (c.f. ReorderBufferTransferSnapToParent()).
I'm not sure I understood this suggestion correctly. I first tried with this, which seems the simplest rendition: --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -772,6 +772,12 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid, { CommandId cid; + if ((SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT) && + (xlrec->top_xid != xid)) + { + ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, lsn); + } + /* * we only log new_cid's if a catalog tuple was modified, so mark the * transaction as containing catalog modifications test_decoding's tests pass with that, but if I try the example script provided by Ildar, all pgbench clients die with this: client 19 script 1 aborted in command 1 query 0: ERROR: subtransaction logged without previous top-level txn record I thought I would create the main txn before calling AssignChild in snapbuild; however, ReorderBufferTXNByXid is static in reorderbuffer.c. So that seems out. My next try was to remove the elog() that was causing the failure ... but that leads pretty quickly to a crash with this backtrace: #2 0x00005653241fb823 in ExceptionalCondition (conditionName=conditionName@entry=0x5653243c1960 "!(prev_first_lsn < cur_txn->first_lsn)", errorType=errorType@entry=0x565324250596 "FailedAssertion", fileName=fileName@entry=0x5653243c18e8 "/pgsql/source/master/src/backend/replication/logical/reorderbuffer.c", lineNumber=lineNumber@entry=680) at /pgsql/source/master/src/backend/utils/error/assert.c:54 #3 0x0000565324062a84 in AssertTXNLsnOrder (rb=rb@entry=0x565326304fa8) at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:680 #4 0x0000565324062e39 in ReorderBufferTXNByXid (rb=rb@entry=0x565326304fa8, xid=<optimized out>, xid@entry=185613, create=create@entry=true, is_new=is_new@entry=0x0, lsn=lsn@entry=2645271944, create_as_top=create_as_top@entry=true) at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:559 #5 0x0000565324067365 in ReorderBufferAddNewTupleCids (rb=0x565326304fa8, xid=185613, lsn=lsn@entry=2645271944, node=..., tid=..., cmin=0, cmax=4294967295, combocid=4294967295) at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:2100 #6 0x0000565324069451 in SnapBuildProcessNewCid (builder=0x56532630afd8, xid=185614, lsn=2645271944, xlrec=0x5653262efc78) at /pgsql/source/master/src/backend/replication/logical/snapbuild.c:787 Now this failure goes away if I relax the < to <= in the complained-about line ... but at this point it's two sanity checks that I've lobotomized in order to get this to run at all. Not really comfortable with that. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 5fa3d7323e..55538fa44c 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -677,7 +677,7 @@ AssertTXNLsnOrder(ReorderBuffer *rb) /* Current initial LSN must be strictly higher than previous */ if (prev_first_lsn != InvalidXLogRecPtr) - Assert(prev_first_lsn < cur_txn->first_lsn); + Assert(prev_first_lsn <= cur_txn->first_lsn); /* known-as-subtxn txns must not be listed */ Assert(!cur_txn->is_known_as_subxact); @@ -778,9 +778,6 @@ ReorderBufferAssignChild(ReorderBuffer *rb, TransactionId xid, txn = ReorderBufferTXNByXid(rb, xid, true, &new_top, lsn, true); subtxn = ReorderBufferTXNByXid(rb, subxid, true, &new_sub, lsn, false); - if (new_top && !new_sub) - elog(ERROR, "subtransaction logged without previous top-level txn record"); - if (!new_sub) { if (subtxn->is_known_as_subxact) diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index dc64b1e0c2..c8b1fcd96c 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -772,6 +772,10 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid, { CommandId cid; + if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT && + xlrec->top_xid != xid) + ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, lsn); + /* * we only log new_cid's if a catalog tuple was modified, so mark the * transaction as containing catalog modifications