Hi hackers,

While creating a patch that allows ALTER SUBSCRIPTION SET (two_phase) [1], we found some issues related to logical replication and two_phase. I think they can happen not only on HEAD but also on PG14+, but for now I am sharing patches for HEAD only.

Issue #1

When handling a PREPARE message, the subscriber mistakenly used the wrong LSN (the end position of the last commit) as the end position of the current prepare. This can be fixed by adding a new global variable that records the end position of the last prepare. The 0001 patch fixes the issue.

Issue #2

When the subscriber enables two-phase commit but does not set max_prepared_transactions > 0, and a transaction is prepared on the publisher, the apply worker reports an ERROR on the subscriber. After that, the prepared transaction is never replayed, which means it is lost forever. The attached script can emulate the situation.

--
ERROR:  prepared transactions are disabled
HINT:  Set "max_prepared_transactions" to a nonzero value.
--

The reason is that we advance the origin progress when aborting a transaction as well (RecordTransactionAbort->replorigin_session_advance). So, once replorigin_session_origin_lsn has been set, any ERROR raised while preparing the transaction aborts it, and the abort incorrectly advances the origin LSN past the prepare that was never applied.

The easiest fix is to reset the session replication origin before calling RecordTransactionAbort(). I think this can happen when 1) LogicalRepApplyLoop() raises an ERROR or 2) the apply worker exits. The 0002 patch fixes the issue.

What do you think?

[1]: https://www.postgresql.org/message-id/flat/8fab8-65d74c80-1-2f28e880@39088166

Best regards,
Hayato Kuroda
FUJITSU LIMITED
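
P.S. To make the Issue #2 mechanism easier to follow, here is a minimal toy model in plain C. It is only a sketch and not PostgreSQL code: every name in it (Lsn, toy_record_abort, toy_apply_prepare, toy_run, the two static variables) is hypothetical and merely stands in for the session origin LSN and the recorded origin progress described above. It assumes the abort path advances the progress whenever a session origin LSN is set, which is the behaviour the 0002 patch is meant to avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;               /* hypothetical stand-in for an LSN */
#define INVALID_LSN ((Lsn) 0)

/* Toy session state; NOT the real PostgreSQL globals. */
static Lsn session_origin_lsn = INVALID_LSN; /* set before applying a change */
static Lsn origin_progress = INVALID_LSN;    /* where replay resumes after restart */

/* Models the abort path: progress moves whenever a session LSN is set. */
static void toy_record_abort(void)
{
    if (session_origin_lsn != INVALID_LSN)
        origin_progress = session_origin_lsn;
}

/*
 * Apply a PREPARE that ends at end_lsn. With max_prepared == 0 the prepare
 * fails (modelling "prepared transactions are disabled") and the
 * transaction is aborted.
 */
static bool toy_apply_prepare(Lsn end_lsn, int max_prepared,
                              bool reset_before_abort)
{
    assert(end_lsn != INVALID_LSN);
    session_origin_lsn = end_lsn;

    if (max_prepared > 0)
    {
        origin_progress = end_lsn;          /* prepare succeeded */
        session_origin_lsn = INVALID_LSN;
        return true;
    }

    /* ERROR path: the proposed fix resets the session origin first. */
    if (reset_before_abort)
        session_origin_lsn = INVALID_LSN;
    toy_record_abort();
    session_origin_lsn = INVALID_LSN;
    return false;
}

/* Start with progress at 100, fail a PREPARE ending at 200, report progress. */
Lsn toy_run(bool reset_before_abort)
{
    origin_progress = 100;
    session_origin_lsn = INVALID_LSN;
    (void) toy_apply_prepare(200, 0, reset_before_abort);
    return origin_progress;
}
```

In this model, toy_run(false) leaves the progress at 200 even though the prepare failed, so a restart would skip it. toy_run(true) leaves the progress at 100, so the prepare would be requested again.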
test_2pc.sh
0001-Add-XactLastPrepareEnd-to-indicate-the-last-PREPARE-.patch
0002-Prevent-origin-progress-advancement-if-failed-to-app.patch