On 2019-Jul-26, Andres Freund wrote:

> 2) We could simply assign the subtransaction to the parent using
>    ReorderBufferAssignChild() in SnapBuildProcessNewCid() or it's
>    caller. That ought to also fix the bug
> 
>    I also has the advantage that we can save some memory in transactions
>    that have some, but fewer than the ASSIGNMENT limit subtransactions,
>    because it allows us to avoid having a separate base snapshot for
>    them (c.f. ReorderBufferTransferSnapToParent()).

I'm not sure I understood this suggestion correctly.  I first tried with
this, which seems the simplest rendition:

--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -772,6 +772,12 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId 
xid,
 {
        CommandId       cid;
 
+       if ((SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT) &&
+               (xlrec->top_xid != xid))
+       {
+               ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, 
lsn);
+       }
+
        /*
         * we only log new_cid's if a catalog tuple was modified, so mark the
         * transaction as containing catalog modifications

test_decoding's tests pass with that, but if I try the example script
provided by Ildar, all pgbench clients die with this:

client 19 script 1 aborted in command 1 query 0: ERROR:  subtransaction logged 
without previous top-level txn record

I thought I would create the main txn before calling AssignChild in
snapbuild; however, ReorderBufferTXNByXid is static in reorderbuffer.c.
So that seems out.  My next try was to remove the elog() that was
causing the failure ... but that leads pretty quickly to a crash with
this backtrace:

#2  0x00005653241fb823 in ExceptionalCondition 
(conditionName=conditionName@entry=0x5653243c1960 "!(prev_first_lsn < 
cur_txn->first_lsn)", 
    errorType=errorType@entry=0x565324250596 "FailedAssertion", 
    fileName=fileName@entry=0x5653243c18e8 
"/pgsql/source/master/src/backend/replication/logical/reorderbuffer.c", 
    lineNumber=lineNumber@entry=680) at 
/pgsql/source/master/src/backend/utils/error/assert.c:54
#3  0x0000565324062a84 in AssertTXNLsnOrder (rb=rb@entry=0x565326304fa8)
    at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:680
#4  0x0000565324062e39 in ReorderBufferTXNByXid (rb=rb@entry=0x565326304fa8, 
xid=<optimized out>, xid@entry=185613, create=create@entry=true, 
    is_new=is_new@entry=0x0, lsn=lsn@entry=2645271944, 
create_as_top=create_as_top@entry=true)
    at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:559
#5  0x0000565324067365 in ReorderBufferAddNewTupleCids (rb=0x565326304fa8, 
xid=185613, lsn=lsn@entry=2645271944, node=..., tid=..., cmin=0, 
    cmax=4294967295, combocid=4294967295) at 
/pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:2100
#6  0x0000565324069451 in SnapBuildProcessNewCid (builder=0x56532630afd8, 
xid=185614, lsn=2645271944, xlrec=0x5653262efc78)
    at /pgsql/source/master/src/backend/replication/logical/snapbuild.c:787

Now this failure goes away if I relax the < to <= in the
complained-about line ... but at this point it's two sanity checks that
I've lobotomized in order to get this to run at all.  Not really
comfortable with that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5fa3d7323e..55538fa44c 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -677,7 +677,7 @@ AssertTXNLsnOrder(ReorderBuffer *rb)
 
 		/* Current initial LSN must be strictly higher than previous */
 		if (prev_first_lsn != InvalidXLogRecPtr)
-			Assert(prev_first_lsn < cur_txn->first_lsn);
+			Assert(prev_first_lsn <= cur_txn->first_lsn);
 
 		/* known-as-subtxn txns must not be listed */
 		Assert(!cur_txn->is_known_as_subxact);
@@ -778,9 +778,6 @@ ReorderBufferAssignChild(ReorderBuffer *rb, TransactionId xid,
 	txn = ReorderBufferTXNByXid(rb, xid, true, &new_top, lsn, true);
 	subtxn = ReorderBufferTXNByXid(rb, subxid, true, &new_sub, lsn, false);
 
-	if (new_top && !new_sub)
-		elog(ERROR, "subtransaction logged without previous top-level txn record");
-
 	if (!new_sub)
 	{
 		if (subtxn->is_known_as_subxact)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index dc64b1e0c2..c8b1fcd96c 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -772,6 +772,10 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
 {
 	CommandId	cid;
 
+	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT &&
+		xlrec->top_xid != xid)
+		ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, lsn);
+
 	/*
 	 * we only log new_cid's if a catalog tuple was modified, so mark the
 	 * transaction as containing catalog modifications

Reply via email to