Hi hackers,

I believe we found a bug in logical decoding. It only occurs with casserts
enabled. It was originally discovered and reproduced by Murat Kabilov and
Ildus Kurbangaliev. Here is the stack trace we've got:
#0  0x00007facc66ef82f in raise () from /usr/lib/libc.so.6
#1  0x00007facc66da672 in abort () from /usr/lib/libc.so.6
#2  0x0000000000ac4ebf in ExceptionalCondition (conditionName=0xccdea8 "!(prev_first_lsn < cur_txn->first_lsn)", errorType=0xccdce4 "FailedAssertion", fileName=0xccdd38 "reorderbuffer.c", lineNumber=680) at assert.c:54
#3  0x00000000008a9515 in AssertTXNLsnOrder (rb=0x25ca128) at reorderbuffer.c:680
#4  0x00000000008a900f in ReorderBufferTXNByXid (rb=0x25ca128, xid=65609, create=true, is_new=0x0, lsn=211590864, create_as_top=true) at reorderbuffer.c:559
#5  0x00000000008abf0d in ReorderBufferAddNewTupleCids (rb=0x25ca128, xid=65609, lsn=211590864, node=..., tid=..., cmin=0, cmax=4294967295, combocid=4294967295) at reorderbuffer.c:2098
#6  0x00000000008b096b in SnapBuildProcessNewCid (builder=0x25d0158, xid=65610, lsn=211590864, xlrec=0x25d60b8) at snapbuild.c:781
#7  0x000000000089d01c in DecodeHeap2Op (ctx=0x25ba0b8, buf=0x7ffd0e294da0) at decode.c:382
#8  0x000000000089c8ca in LogicalDecodingProcessRecord (ctx=0x25ba0b8, record=0x25ba378) at decode.c:125
#9  0x00000000008a124c in DecodingContextFindStartpoint (ctx=0x25ba0b8) at logical.c:492
#10 0x00000000008b9c3d in CreateReplicationSlot (cmd=0x257be20) at walsender.c:957
#11 0x00000000008baa60 in exec_replication_command (cmd_string=0x24f5b08 "CREATE_REPLICATION_SLOT temp_slot_name TEMPORARY LOGICAL pgoutput USE_SNAPSHOT") at walsender.c:1531
#12 0x0000000000937230 in PostgresMain (argc=1, argv=0x25233a8, dbname=0x2523380 "postgres", username=0x24f23c8 "zilder") at postgres.c:4245
#13 0x0000000000881453 in BackendRun (port=0x251a900) at postmaster.c:4431
#14 0x0000000000880b4f in BackendStartup (port=0x251a900) at postmaster.c:4122
#15 0x000000000087cbbe in ServerLoop () at postmaster.c:1704
#16 0x000000000087c34a in PostmasterMain (argc=3, argv=0x24f0330) at postmaster.c:1377
#17 0x00000000007926b6 in main (argc=3, argv=0x24f0330) at main.c:228

After viewing the coredump we see that `prev_first_lsn == cur_txn->first_lsn`.
The problem seems to be that ReorderBuffer adds two ReorderBufferTXNs with the
same LSN but different transaction ids: a subxid and the top-level xid. See the
FIX part below.
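For context, the assertion that fires (AssertTXNLsnOrder() at
reorderbuffer.c:680) walks the reorder buffer's toplevel_by_lsn list and
requires each transaction's first_lsn to be strictly greater than the previous
one. A simplified sketch of that check, based on the assertion text in the
trace (not the verbatim source):

    /* simplified from AssertTXNLsnOrder(); not verbatim source */
    XLogRecPtr  prev_first_lsn = InvalidXLogRecPtr;
    dlist_iter  iter;

    /* walk the top-level transactions, ordered by their first LSN */
    dlist_foreach(iter, &rb->toplevel_by_lsn)
    {
        ReorderBufferTXN *cur_txn = dlist_container(ReorderBufferTXN, node, iter.cur);

        /* each entry must start strictly after the previous one */
        if (prev_first_lsn != InvalidXLogRecPtr)
            Assert(prev_first_lsn < cur_txn->first_lsn);

        prev_first_lsn = cur_txn->first_lsn;
    }

Since the second ReorderBufferTXN is created with exactly the same first_lsn
as the entry already on the list, the strict '<' comparison fails.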
STEPS TO REPRODUCE
------------------

We were able to reproduce it on 10, 11 and on the master branch. Postgres was
configured as:

./configure --enable-cassert CFLAGS='-ggdb3 -O0' --prefix=$HOME/pg12

Additional options in postgresql.conf:

wal_level='logical'
max_connections=1000
max_replication_slots=100
max_wal_senders=100
max_logical_replication_workers=100

pgbench scripts:

$ cat create_table.sql
BEGIN;
SAVEPOINT p1;
CREATE temp TABLE t_t (id INT) ON COMMIT DROP;
ROLLBACK TO SAVEPOINT p1;
ROLLBACK;

$ cat create_slot.sql
BEGIN ISOLATION LEVEL REPEATABLE READ READ ONLY;
SELECT pg_create_logical_replication_slot('test' || pg_backend_pid(), 'pgoutput', true);
SELECT pg_drop_replication_slot('test' || pg_backend_pid());
ROLLBACK;

Run in parallel terminals:

$HOME/pg12/bin/pgbench postgres -f create_table.sql -T1000 -c50 -j50
$HOME/pg12/bin/pgbench postgres -f create_slot.sql -T1000 -c50 -j50

It may take some time; on my local machine it breaks within a few seconds.

FIX?
----

I can't say that I have enough understanding of what's going on in the logical
decoding code, but the one thing I've noticed is an inconsistency in the xids
used for the ReorderBufferTXNByXid() calls:

1. first, in DecodeHeap2Op(), ReorderBufferProcessXid() is called with the
   subtransaction id; it actually creates a ReorderBufferTXN and adds it to
   the reorder buffer's hash table and toplevel_by_lsn list;

2. second, ReorderBufferXidSetCatalogChanges() uses the same subxid to look up
   the ReorderBufferTXN created before, successfully;

3. now ReorderBufferAddNewTupleCids() instead uses the top-level transaction
   id for the lookup; it cannot find that xid in the hash table and tries to
   add a new entry with the same LSN, which fails because this LSN is already
   in the toplevel_by_lsn list (see the call-site sketch after the patch below
   for where the two xids come from).

Attached is a simple patch that uses the subxid instead of the top-level xid
in the ReorderBufferAddNewTupleCids() call. It seems to fix the bug, but I'm
not sure this is a valid change. Can someone please verify it or maybe suggest
a better solution for the issue?

Best regards,
Ildar
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 3e9d4cd79f..6c03c85768 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -778,7 +778,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
 	 */
 	ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
 
-	ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
+	ReorderBufferAddNewTupleCids(builder->reorder, xid, lsn,
 								 xlrec->target_node, xlrec->target_tid,
 								 xlrec->cmin, xlrec->cmax,
 								 xlrec->combocid);
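For reference, here is roughly where the two xids come from, condensed from
the decode path shown in stack frames #5-#7 above; this is a simplified
sketch, not the verbatim source:

    /* Condensed from DecodeHeap2Op() (decode.c); simplified sketch */
    TransactionId    xid = XLogRecGetXid(buf->record);  /* xid that logged the
                                                         * record, i.e. the
                                                         * subtransaction */
    SnapBuild       *builder = ctx->snapshot_builder;

    ReorderBufferProcessXid(ctx->reorder, xid, buf->origptr);
    ...
    /* for an XLOG_HEAP2_NEW_CID record: */
    xl_heap_new_cid *xlrec = (xl_heap_new_cid *) XLogRecGetData(buf->record);

    /* xlrec->top_xid carries the top-level transaction's xid */
    SnapBuildProcessNewCid(builder, xid, buf->origptr, xlrec);

In the stack trace above, SnapBuildProcessNewCid() is called with xid=65610
(the subtransaction), while ReorderBufferAddNewTupleCids() receives 65609
(xlrec->top_xid, the top-level transaction), which matches the subxid /
top-level xid mismatch described in the FIX part.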