Hi,

On 2021-12-22 22:34:45 +0900, Masahiko Sawada wrote:
> I've attached an updated patch. Please review it.

Sorry for dropping the ball on this again :(. I've pushed the fix with some
very minor polishing.


> > The attached detects that bug, but I'm not sure it's worth expending
> > test time, or this might be in the server test suit.
>
> Thanks. It's convenient to test this issue but I'm also not sure it's
> worth adding to the test suit.

I think it's definitely worth adding a test, but I don't particularly like the
specific test implementation. Primarily because I think it's better to test
this in a cluster that stays running, so that we can verify that the slot drop
worked. It also doesn't seem necessary to create a separate cluster.

I wrote the attached isolation test. I ended up not committing it yet - I'm
worried that there could be some OS dependent output difference, due to some
libpq error handling issues. See [1], which Tom pointed out is caused by the
issue discussed in [2].

Greetings,

Andres Freund

[1] https://postgr.es/m/20220215004143.dlzsn72oqsmqa7uw%40alap3.anarazel.de
[2] https://postgr.es/m/20220215004143.dlzsn72oqsmqa7uw%40alap3.anarazel.de
>From 343106442eafde4fcc55cc75e78501010d50a883 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 14 Feb 2022 17:17:37 -0800
Subject: [PATCH v1] Add isolation test for errors during logical slot
 creation.

Discussion: https://postgr.es/m/cad21aodaeepabzeyyjspzjumsparicvsbobal7spaofnkz+...@mail.gmail.com
---
 contrib/test_decoding/Makefile                |   2 +-
 .../expected/slot_creation_error.out          | 113 ++++++++++++++++++
 .../specs/slot_creation_error.spec            |  41 +++++++
 3 files changed, 155 insertions(+), 1 deletion(-)
 create mode 100644 contrib/test_decoding/expected/slot_creation_error.out
 create mode 100644 contrib/test_decoding/specs/slot_creation_error.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 56ddc3abaeb..36929dd97d3 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -9,7 +9,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	sequence
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
-	twophase_snapshot
+	twophase_snapshot slot_creation_error
 
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/slot_creation_error.out b/contrib/test_decoding/expected/slot_creation_error.out
new file mode 100644
index 00000000000..fc1c98b5248
--- /dev/null
+++ b/contrib/test_decoding/expected/slot_creation_error.out
@@ -0,0 +1,113 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_b s1_xid s2_init s1_view_slot s1_cancel_s2 s1_view_slot s1_c
+step s1_b: BEGIN ISOLATION LEVEL SERIALIZABLE;
+step s1_xid: SELECT 'xid' FROM txid_current();
+?column?
+--------
+xid     
+(1 row)
+
+step s2_init: 
+    SELECT 'init' FROM pg_create_logical_replication_slot('slot_creation_error', 'test_decoding');
+ <waiting ...>
+step s1_view_slot: 
+    SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'slot_creation_error'
+
+slot_name          |slot_type|active
+-------------------+---------+------
+slot_creation_error|logical  |t     
+(1 row)
+
+step s1_cancel_s2: 
+    SELECT pg_cancel_backend(pid)
+    FROM pg_stat_activity
+    WHERE application_name = 'isolation/slot_creation_error/s2';
+
+pg_cancel_backend
+-----------------
+t                
+(1 row)
+
+step s2_init: <... completed>
+ERROR:  canceling statement due to user request
+step s1_view_slot: 
+    SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'slot_creation_error'
+
+slot_name|slot_type|active
+---------+---------+------
+(0 rows)
+
+step s1_c: COMMIT;
+
+starting permutation: s1_b s1_xid s2_init s1_c s1_view_slot s1_drop_slot
+step s1_b: BEGIN ISOLATION LEVEL SERIALIZABLE;
+step s1_xid: SELECT 'xid' FROM txid_current();
+?column?
+--------
+xid     
+(1 row)
+
+step s2_init: 
+    SELECT 'init' FROM pg_create_logical_replication_slot('slot_creation_error', 'test_decoding');
+ <waiting ...>
+step s1_c: COMMIT;
+step s2_init: <... completed>
+?column?
+--------
+init    
+(1 row)
+
+step s1_view_slot: 
+    SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'slot_creation_error'
+
+slot_name          |slot_type|active
+-------------------+---------+------
+slot_creation_error|logical  |f     
+(1 row)
+
+step s1_drop_slot: 
+    SELECT pg_drop_replication_slot('slot_creation_error');
+
+pg_drop_replication_slot
+------------------------
+                        
+(1 row)
+
+
+starting permutation: s1_b s1_xid s2_init s1_terminate_s2 s1_c s1_view_slot
+step s1_b: BEGIN ISOLATION LEVEL SERIALIZABLE;
+step s1_xid: SELECT 'xid' FROM txid_current();
+?column?
+--------
+xid     
+(1 row)
+
+step s2_init: 
+    SELECT 'init' FROM pg_create_logical_replication_slot('slot_creation_error', 'test_decoding');
+ <waiting ...>
+step s1_terminate_s2: 
+    SELECT pg_terminate_backend(pid)
+    FROM pg_stat_activity
+    WHERE application_name = 'isolation/slot_creation_error/s2';
+
+pg_terminate_backend
+--------------------
+t                   
+(1 row)
+
+step s2_init: <... completed>
+FATAL:  terminating connection due to administrator command
+FATAL:  terminating connection due to administrator command
+server closed the connection unexpectedly
+	This probably means the server terminated abnormally
+	before or while processing the request.
+
+step s1_c: COMMIT;
+step s1_view_slot: 
+    SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'slot_creation_error'
+
+slot_name|slot_type|active
+---------+---------+------
+(0 rows)
+
diff --git a/contrib/test_decoding/specs/slot_creation_error.spec b/contrib/test_decoding/specs/slot_creation_error.spec
new file mode 100644
index 00000000000..582b282331f
--- /dev/null
+++ b/contrib/test_decoding/specs/slot_creation_error.spec
@@ -0,0 +1,41 @@
+# Test that erroring out during logical slot creation is handled properly
+
+session "s1"
+setup { SET synchronous_commit=on; }
+
+step s1_b { BEGIN ISOLATION LEVEL SERIALIZABLE; }
+step s1_xid { SELECT 'xid' FROM txid_current(); }
+step s1_c { COMMIT; }
+step s1_cancel_s2 {
+    SELECT pg_cancel_backend(pid)
+    FROM pg_stat_activity
+    WHERE application_name = 'isolation/slot_creation_error/s2';
+}
+
+step s1_terminate_s2 {
+    SELECT pg_terminate_backend(pid)
+    FROM pg_stat_activity
+    WHERE application_name = 'isolation/slot_creation_error/s2';
+}
+
+step s1_view_slot {
+    SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'slot_creation_error'
+}
+
+step s1_drop_slot {
+    SELECT pg_drop_replication_slot('slot_creation_error');
+}
+
+session s2
+setup { SET synchronous_commit=on; }
+step s2_init {
+    SELECT 'init' FROM pg_create_logical_replication_slot('slot_creation_error', 'test_decoding');
+}
+
+# The tests first start a transaction with an xid assigned in s1, then create
+# a slot in s2. The slot creation waits for s1's transaction to end. Instead
+# we cancel / terminate s2.
+permutation s1_b s1_xid s2_init s1_view_slot s1_cancel_s2 s1_view_slot s1_c
+permutation s1_b s1_xid s2_init s1_c s1_view_slot s1_drop_slot # check slot creation still works
+permutation s1_b s1_xid s2_init s1_terminate_s2 s1_c s1_view_slot
+# can't run tests after this, due to s2's connection failure
-- 
2.34.0

Reply via email to