Hi all, (CCing some folks who worked on this area lately) The following sequence of commands generates an assertion failure, as of $subject: select pg_replication_origin_create('popo'); select pg_replication_origin_session_setup('popo'); begin; select txid_current(); prepare transaction 'popo'; -- assertion fails
The problem originates from 1eb6d65, down to 11, where we finish by triggering this assertion before replorigin_session_origin_lsn is not valid: + if (replorigin) + { + Assert(replorigin_session_origin_lsn != InvalidXLogRecPtr); + hdr->origin_lsn = replorigin_session_origin_lsn; + hdr->origin_timestamp = replorigin_session_origin_timestamp; + } As far as I understand this code and based on the docs, pg_replication_origin_xact_setup(), that would set of the session origin LSN and timestamp, is an optional choise. replorigin_session_advance() would be a no-op for remote_lsn, and local_lsn requires an update. Now please note that I am really fluent with this stuff, so feel free to correct me. The intention of the code also seems that XACT_XINFO_HAS_ORIGIN should still be set, but with no data. At the end, it seems to me that the assertion could just be dropped as per the attached, as we'd finish by setting up origin_lsn and origin_timestamp in the 2PC file header with some invalid data. The else block could just be removed, but then one would need to guess from origin.c that both replorigin_session_* may not be set. I am not completely sure that this is the right move, though, as pg_replication_origin_status has a *very* limited amount of tests. Adding a test to replorigin.sql with 2PC transactions able to trigger the problem is easy once we rely on a different slot to not influence the existing tests, as the expected contents of pg_replication_origin_status could be messed up in the follow-up tests. Thoughts? -- Michael
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 28b153abc3..7518f83925 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -1127,9 +1127,9 @@ EndPrepare(GlobalTransaction gxact) replorigin = (replorigin_session_origin != InvalidRepOriginId && replorigin_session_origin != DoNotReplicateId); - if (replorigin) + if (replorigin && + replorigin_session_origin_lsn != InvalidXLogRecPtr) { - Assert(replorigin_session_origin_lsn != InvalidXLogRecPtr); hdr->origin_lsn = replorigin_session_origin_lsn; hdr->origin_timestamp = replorigin_session_origin_timestamp; } diff --git a/contrib/test_decoding/expected/replorigin.out b/contrib/test_decoding/expected/replorigin.out index 8077318755..895783d917 100644 --- a/contrib/test_decoding/expected/replorigin.out +++ b/contrib/test_decoding/expected/replorigin.out @@ -181,3 +181,65 @@ SELECT pg_replication_origin_drop('regress_test_decoding: regression_slot'); (1 row) +-- Transactions with no origin LSN and commit timestamps set for this +-- session yet. +SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_no_lsn', 'test_decoding'); + ?column? +---------- + init +(1 row) + +SELECT pg_replication_origin_create('regress_test_decoding: regression_slot_no_lsn'); + pg_replication_origin_create +------------------------------ + 1 +(1 row) + +-- mark session as replaying +SELECT pg_replication_origin_session_setup('regress_test_decoding: regression_slot_no_lsn'); + pg_replication_origin_session_setup +------------------------------------- + +(1 row) + +-- Normal transaction +BEGIN; +INSERT INTO target_tbl(data) + SELECT data FROM + pg_logical_slot_get_changes('regression_slot_no_lsn', NULL, NULL, 'include-xids', '0'); +COMMIT; +-- 2PC transaction +BEGIN; +INSERT INTO target_tbl(data) + SELECT data FROM + pg_logical_slot_get_changes('regression_slot_no_lsn', NULL, NULL, 'include-xids', '0'); +PREPARE TRANSACTION 'replorigin_prepared'; +COMMIT PREPARED 'replorigin_prepared'; +SELECT local_id, external_id, + remote_lsn <> '0/0' AS valid_remote_lsn, + local_lsn <> '0/0' AS valid_local_lsn + FROM pg_replication_origin_status; + local_id | external_id | valid_remote_lsn | valid_local_lsn +----------+-----------------------------------------------+------------------+----------------- + 1 | regress_test_decoding: regression_slot_no_lsn | f | t +(1 row) + +-- This ensures that everything can be dropped. +SELECT pg_replication_origin_session_reset(); + pg_replication_origin_session_reset +------------------------------------- + +(1 row) + +SELECT pg_drop_replication_slot('regression_slot_no_lsn'); + pg_drop_replication_slot +-------------------------- + +(1 row) + +SELECT pg_replication_origin_drop('regress_test_decoding: regression_slot_no_lsn'); + pg_replication_origin_drop +---------------------------- + +(1 row) + diff --git a/contrib/test_decoding/sql/replorigin.sql b/contrib/test_decoding/sql/replorigin.sql index b68f819fa1..87e57d50ea 100644 --- a/contrib/test_decoding/sql/replorigin.sql +++ b/contrib/test_decoding/sql/replorigin.sql @@ -87,3 +87,31 @@ SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'inc SELECT pg_drop_replication_slot('regression_slot'); SELECT pg_replication_origin_drop('regress_test_decoding: regression_slot'); + +-- Transactions with no origin LSN and commit timestamps set for this +-- session yet. +SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot_no_lsn', 'test_decoding'); +SELECT pg_replication_origin_create('regress_test_decoding: regression_slot_no_lsn'); +-- mark session as replaying +SELECT pg_replication_origin_session_setup('regress_test_decoding: regression_slot_no_lsn'); +-- Normal transaction +BEGIN; +INSERT INTO target_tbl(data) + SELECT data FROM + pg_logical_slot_get_changes('regression_slot_no_lsn', NULL, NULL, 'include-xids', '0'); +COMMIT; +-- 2PC transaction +BEGIN; +INSERT INTO target_tbl(data) + SELECT data FROM + pg_logical_slot_get_changes('regression_slot_no_lsn', NULL, NULL, 'include-xids', '0'); +PREPARE TRANSACTION 'replorigin_prepared'; +COMMIT PREPARED 'replorigin_prepared'; +SELECT local_id, external_id, + remote_lsn <> '0/0' AS valid_remote_lsn, + local_lsn <> '0/0' AS valid_local_lsn + FROM pg_replication_origin_status; +-- This ensures that everything can be dropped. +SELECT pg_replication_origin_session_reset(); +SELECT pg_drop_replication_slot('regression_slot_no_lsn'); +SELECT pg_replication_origin_drop('regress_test_decoding: regression_slot_no_lsn');
signature.asc
Description: PGP signature