On Wed, 10 Jul 2024 at 12:28, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Jul 9, 2024 at 8:14 PM vignesh C <vignes...@gmail.com> wrote:
> >
> > On Tue, 9 Jul 2024 at 17:05, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Mon, Jul 1, 2024 at 10:51 AM vignesh C <vignes...@gmail.com> wrote:
> > > >
> > > >
> > > > This issue is present in all supported versions. I was able to
> > > > reproduce it using the steps recommended by Andres and Tomas's
> > > > scripts. I also conducted a small test through TAP tests to verify the
> > > > problem. Attached is the alternate_lock_HEAD.patch, which includes the
> > > > lock modification(Tomas's change) and the TAP test.
> > > >
> > >
> > > @@ -1568,7 +1568,7 @@ OpenTableList(List *tables)
> > >   /* Allow query cancel in case this takes a long time */
> > >   CHECK_FOR_INTERRUPTS();
> > >
> > > - rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
> > > + rel = table_openrv(t->relation, ShareRowExclusiveLock);
> > >
> > > The comment just above this code ("Open, share-lock, and check all the
> > > explicitly-specified relations") needs modification. It would be
> > > better to explain the reason of why we would need SRE lock here.
> >
> > Updated comments for the same.
> >
>
> The patch missed to use the ShareRowExclusiveLock for partitions, see
> attached. I haven't tested it but they should also face the same
> problem. Apart from that, I have changed the comments in a few places
> in the patch.

I could not hit the updated ShareRowExclusiveLock changes through the
partition table, instead I could verify it using the inheritance
table. Added a test for the same and also attaching the backbranch
patch.

Regards,
Vignesh
From d300868b61c65a6b575078c29c0d20994acae1fa Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Tue, 9 Jul 2024 19:23:10 +0530
Subject: [PATCH v4] Fix data loss during initial sync in logical replication.

Previously, when adding tables to a publication in PostgreSQL, they were
locked using ShareUpdateExclusiveLock mode. This mode allowed the lock to
succeed even if there were ongoing DML transactions on that table. As a
consequence, the ALTER PUBLICATION command could be completed before these
transactions, leading to a scenario where the catalog snapshot used for
replication did not include changes from transactions initiated before the
alteration.

To fix this issue, tables are now locked using ShareRowExclusiveLock mode
during the addition to a publication. This change ensures that the
ALTER PUBLICATION command waits for any ongoing transactions on the tables
(to be added to the publication) to be completed before proceeding. As a
result, transactions initiated before the publication alteration are
correctly included in the replication process.

Reported-by: Tomas Vondra
Diagnosed-by: Andres Freund
Author: Vignesh C, Tomas Vondra
Reviewed-by: Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com
---
 src/backend/commands/publicationcmds.c |  16 ++-
 src/test/subscription/t/100_bugs.pl    | 142 +++++++++++++++++++++++++
 2 files changed, 153 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 6ea709988e..341ea0318d 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -1542,8 +1542,14 @@ RemovePublicationSchemaById(Oid psoid)
 
 /*
  * Open relations specified by a PublicationTable list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
- * add them to a publication.
+ *
+ * The returned tables are locked in ShareRowExclusiveLock mode to add them
+ * to a publication. The table needs to be locked in ShareRowExclusiveLock
+ * mode to ensure that any ongoing transactions involving that table are
+ * completed before adding it to the publication. Otherwise, the transaction
+ * initiated before the alteration of the publication will continue to use a
+ * catalog snapshot predating the publication change, leading to
+ * non-replication of these transaction changes.
  */
 static List *
 OpenTableList(List *tables)
@@ -1568,7 +1574,7 @@ OpenTableList(List *tables)
 		/* Allow query cancel in case this takes a long time */
 		CHECK_FOR_INTERRUPTS();
 
-		rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+		rel = table_openrv(t->relation, ShareRowExclusiveLock);
 		myrelid = RelationGetRelid(rel);
 
 		/*
@@ -1594,7 +1600,7 @@ OpenTableList(List *tables)
 						 errmsg("conflicting or redundant column lists for table \"%s\"",
 								RelationGetRelationName(rel))));
 
-			table_close(rel, ShareUpdateExclusiveLock);
+			table_close(rel, ShareRowExclusiveLock);
 			continue;
 		}
 
@@ -1622,7 +1628,7 @@ OpenTableList(List *tables)
 			List	   *children;
 			ListCell   *child;
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
+			children = find_all_inheritors(myrelid, ShareRowExclusiveLock,
 										   NULL);
 
 			foreach(child, children)
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index cb36ca7b16..670c574547 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -487,6 +487,148 @@ $result =
 is( $result, qq(2|f
 3|t), 'check replicated update on subscriber');
 
+# The bug was that the incremental data synchronization was being skipped when
+# a new table is added to the publication in presence of a concurrent active
+# transaction performing the DML on the same table.
+
+# Create tables in publisher and subscriber.
+$node_publisher->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+$node_subscriber->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Initiate a background session that keeps a transaction active.
+my $background_psql1 = $node_publisher->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+# Maintain an active transaction with the table.
+$background_psql1->set_query_timer_restart();
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab_conc VALUES (1);
+]);
+
+# Add the table to the publication using background_psql, as the alter
+# publication operation will wait for the lock and can only be completed after
+# the previous open transaction is committed.
+my $background_psql2 = $node_publisher->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+$background_psql2->set_query_timer_restart();
+
+# This operation will wait because there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab_conc;\n");
+
+# Verify that the table addition is waiting to acquire a ShareRowExclusiveLock.
+$node_publisher->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab_conc'::regclass AND mode = 'ShareRowExclusiveLock' AND waitstart IS NOT NULL;"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+
+# Maintain an active transaction with inheritance table.
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab1_conc_child VALUES (1);
+]);
+
+# Add an inheritance table to the publication, this operation will wait because
+# there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab1_conc;\n");
+
+# Verify that the child table addition is waiting to acquire a
+# ShareRowExclusiveLock.
+$node_publisher->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab1_conc_child'::regclass AND mode = 'ShareRowExclusiveLock' AND waitstart IS NOT NULL;"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+$background_psql1->quit;
+
+# Wait till the tables are added to the publication.
+$node_publisher->poll_query_until('postgres',
+	"SELECT COUNT(1) = 2 FROM pg_publication_rel WHERE prrelid IN ('tab_conc'::regclass, 'tab1_conc'::regclass);"
+  )
+  or die
+  "Timed out while waiting for alter publication to add the table to the publication";
+
+$background_psql1->quit;
+
+$node_publisher->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (2);
+	INSERT INTO tab1_conc_child VALUES (2);
+));
+
+# Refresh the publication.
+$node_subscriber->safe_psql('postgres',
+	'ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION');
+
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'sub1');
+
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+$result =
+  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab1_conc_child table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+# Perform an insert.
+$node_publisher->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (3);
+	INSERT INTO tab1_conc_child VALUES (3);
+));
+$node_publisher->wait_for_catchup('sub1');
+
+# Verify that the insert is replicated to the subscriber.
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab_conc added after table synchronization is replicated to the subscriber'
+);
+
+$result =
+  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab1_conc_child added after table synchronization is replicated to the subscriber'
+);
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');
 
-- 
2.34.1

From 654c8d3f6f599889c628090194aa28639bf3430d Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Wed, 10 Jul 2024 21:43:43 +0530
Subject: [PATCH v5] Fix data loss during initial sync in logical replication.

Previously, when adding tables to a publication in PostgreSQL, they were
locked using ShareUpdateExclusiveLock mode. This mode allowed the lock to
succeed even if there were ongoing DML transactions on that table. As a
consequence, the ALTER PUBLICATION command could be completed before these
transactions, leading to a scenario where the catalog snapshot used for
replication did not include changes from transactions initiated before the
alteration.

To fix this issue, tables are now locked using ShareRowExclusiveLock mode
during the addition to a publication. This change ensures that the
ALTER PUBLICATION command waits for any ongoing transactions on the tables
(to be added to the publication) to be completed before proceeding. As a
result, transactions initiated before the publication alteration are
correctly included in the replication process.

Reported-by: Tomas Vondra
Diagnosed-by: Andres Freund
Author: Vignesh C, Tomas Vondra
Reviewed-by: Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com
---
 src/backend/commands/publicationcmds.c |  16 ++-
 src/test/subscription/t/100_bugs.pl    | 161 ++++++++++++++++++++++++-
 2 files changed, 171 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index e288dd41cd..0215f3999f 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -547,8 +547,14 @@ RemovePublicationById(Oid pubid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
- * add them to a publication.
+ *
+ * The returned tables are locked in ShareRowExclusiveLock mode to add them
+ * to a publication. The table needs to be locked in ShareRowExclusiveLock
+ * mode to ensure that any ongoing transactions involving that table are
+ * completed before adding it to the publication. Otherwise, the transaction
+ * initiated before the alteration of the publication will continue to use a
+ * catalog snapshot predating the publication change, leading to
+ * non-replication of these transaction changes.
  */
 static List *
 OpenTableList(List *tables)
@@ -570,7 +576,7 @@ OpenTableList(List *tables)
 		/* Allow query cancel in case this takes a long time */
 		CHECK_FOR_INTERRUPTS();
 
-		rel = table_openrv(rv, ShareUpdateExclusiveLock);
+		rel = table_openrv(rv, ShareRowExclusiveLock);
 		myrelid = RelationGetRelid(rel);
 
 		/*
@@ -582,7 +588,7 @@ OpenTableList(List *tables)
 		 */
 		if (list_member_oid(relids, myrelid))
 		{
-			table_close(rel, ShareUpdateExclusiveLock);
+			table_close(rel, ShareRowExclusiveLock);
 			continue;
 		}
 
@@ -600,7 +606,7 @@ OpenTableList(List *tables)
 			List	   *children;
 			ListCell   *child;
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
+			children = find_all_inheritors(myrelid, ShareRowExclusiveLock,
 										   NULL);
 
 			foreach(child, children)
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index cce91891ab..501b94a332 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 9;
+use Test::More tests => 13;
 
 # Bug #15114
 
@@ -362,3 +362,162 @@ is( $node_subscriber_d_cols->safe_psql(
 
 $node_publisher_d_cols->stop('fast');
 $node_subscriber_d_cols->stop('fast');
+
+# The bug was that the incremental data synchronization was being skipped when
+# a new table is added to the publication in presence of a concurrent active
+# transaction performing the DML on the same table.
+my $node_publisher1 = get_new_node('node_publisher1');
+$node_publisher1->init(allows_streaming => 'logical');
+$node_publisher1->start;
+
+my $node_subscriber1 = get_new_node('node_subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+$publisher_connstr = $node_publisher1->connstr . ' dbname=postgres';
+$node_publisher1->safe_psql('postgres',
+	"CREATE PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1"
+);
+
+# Create tables in publisher and subscriber.
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+$node_subscriber1->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Initiate a background session that keeps a transaction active.
+my $background_psql1 = $node_publisher1->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+# Maintain an active transaction with the table.
+$background_psql1->set_query_timer_restart();
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab_conc VALUES (1);
+]);
+
+# Add the table to the publication using background_psql, as the alter
+# publication operation will wait for the lock and can only be completed after
+# the previous open transaction is committed.
+my $background_psql2 = $node_publisher1->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+$background_psql2->set_query_timer_restart();
+
+# This operation will wait because there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab_conc;\n");
+
+# Verify that the table addition is waiting to acquire a ShareRowExclusiveLock.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab_conc'::regclass AND mode = 'ShareRowExclusiveLock' AND waitstart IS NOT NULL;"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+
+# Maintain an active transaction with inheritance table.
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab1_conc_child VALUES (1);
+]);
+
+# Add an inheritance table to the publication, this operation will wait because
+# there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab1_conc;\n");
+
+# Verify that the child table addition is waiting to acquire a
+# ShareRowExclusiveLock.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab1_conc_child'::regclass AND mode = 'ShareRowExclusiveLock' AND waitstart IS NOT NULL;"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+$background_psql1->quit;
+
+# Wait till the tables are added to the publication.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 2 FROM pg_publication_rel WHERE prrelid IN ('tab_conc'::regclass, 'tab1_conc'::regclass);"
+  )
+  or die
+  "Timed out while waiting for alter publication to add the table to the publication";
+
+$background_psql1->quit;
+
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (2);
+	INSERT INTO tab1_conc_child VALUES (2);
+));
+
+# Refresh the publication.
+$node_subscriber1->safe_psql('postgres',
+	'ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION');
+
+$node_subscriber1->wait_for_subscription_sync($node_publisher1, 'sub1');
+
+my $result = $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+$result =
+  $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab1_conc_child table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+# Perform an insert.
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (3);
+	INSERT INTO tab1_conc_child VALUES (3);
+));
+$node_publisher1->wait_for_catchup('sub1');
+
+# Verify that the insert is replicated to the subscriber.
+$result = $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab_conc added after table synchronization is replicated to the subscriber'
+);
+
+$result =
+  $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab1_conc_child added after table synchronization is replicated to the subscriber'
+);
+
+$node_publisher1->stop('fast');
+$node_subscriber1->stop('fast');
-- 
2.34.1

From 7b2d8e60eb2192c4a10408b40a49914b3e1b5019 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Wed, 10 Jul 2024 21:57:57 +0530
Subject: [PATCH v5] Fix data loss during initial sync in logical replication.

Previously, when adding tables to a publication in PostgreSQL, they were
locked using ShareUpdateExclusiveLock mode. This mode allowed the lock to
succeed even if there were ongoing DML transactions on that table. As a
consequence, the ALTER PUBLICATION command could be completed before these
transactions, leading to a scenario where the catalog snapshot used for
replication did not include changes from transactions initiated before the
alteration.

To fix this issue, tables are now locked using ShareRowExclusiveLock mode
during the addition to a publication. This change ensures that the
ALTER PUBLICATION command waits for any ongoing transactions on the tables
(to be added to the publication) to be completed before proceeding. As a
result, transactions initiated before the publication alteration are
correctly included in the replication process.

Reported-by: Tomas Vondra
Diagnosed-by: Andres Freund
Author: Vignesh C, Tomas Vondra
Reviewed-by: Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com
---
 src/backend/commands/publicationcmds.c |  15 ++-
 src/test/subscription/t/100_bugs.pl    | 161 ++++++++++++++++++++++++-
 2 files changed, 171 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 4f70af07ba..bf9761779f 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -506,7 +506,14 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode.
+ *
+ * The returned tables are locked in ShareRowExclusiveLock mode to add them
+ * to a publication. The table needs to be locked in ShareRowExclusiveLock
+ * mode to ensure that any ongoing transactions involving that table are
+ * completed before adding it to the publication. Otherwise, the transaction
+ * initiated before the alteration of the publication will continue to use a
+ * catalog snapshot predating the publication change, leading to
+ * non-replication of these transaction changes.
  */
 static List *
 OpenTableList(List *tables)
@@ -528,7 +535,7 @@ OpenTableList(List *tables)
 		/* Allow query cancel in case this takes a long time */
 		CHECK_FOR_INTERRUPTS();
 
-		rel = table_openrv(rv, ShareUpdateExclusiveLock);
+		rel = table_openrv(rv, ShareRowExclusiveLock);
 		myrelid = RelationGetRelid(rel);
 
 		/*
@@ -540,7 +547,7 @@ OpenTableList(List *tables)
 		 */
 		if (list_member_oid(relids, myrelid))
 		{
-			table_close(rel, ShareUpdateExclusiveLock);
+			table_close(rel, ShareRowExclusiveLock);
 			continue;
 		}
 
@@ -553,7 +560,7 @@ OpenTableList(List *tables)
 			List	   *children;
 			ListCell   *child;
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
+			children = find_all_inheritors(myrelid, ShareRowExclusiveLock,
 										   NULL);
 
 			foreach(child, children)
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index f5968ffa97..ba11528750 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 7;
+use Test::More tests => 11;
 
 # Bug #15114
 
@@ -244,3 +244,162 @@ is( $node_subscriber_d_cols->safe_psql(
 
 $node_publisher_d_cols->stop('fast');
 $node_subscriber_d_cols->stop('fast');
+
+# The bug was that the incremental data synchronization was being skipped when
+# a new table is added to the publication in presence of a concurrent active
+# transaction performing the DML on the same table.
+my $node_publisher1 = get_new_node('node_publisher1');
+$node_publisher1->init(allows_streaming => 'logical');
+$node_publisher1->start;
+
+my $node_subscriber1 = get_new_node('node_subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+$publisher_connstr = $node_publisher1->connstr . ' dbname=postgres';
+$node_publisher1->safe_psql('postgres',
+	"CREATE PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1"
+);
+
+# Create tables in publisher and subscriber.
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+$node_subscriber1->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Initiate a background session that keeps a transaction active.
+my $background_psql1 = $node_publisher1->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+# Maintain an active transaction with the table.
+$background_psql1->set_query_timer_restart();
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab_conc VALUES (1);
+]);
+
+# Add the table to the publication using background_psql, as the alter
+# publication operation will wait for the lock and can only be completed after
+# the previous open transaction is committed.
+my $background_psql2 = $node_publisher1->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+$background_psql2->set_query_timer_restart();
+
+# This operation will wait because there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab_conc;\n");
+
+# Verify that the table addition is waiting to acquire a ShareRowExclusiveLock.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab_conc'::regclass AND mode = 'ShareRowExclusiveLock';"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+
+# Maintain an active transaction with inheritance table.
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab1_conc_child VALUES (1);
+]);
+
+# Add an inheritance table to the publication, this operation will wait because
+# there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab1_conc;\n");
+
+# Verify that the child table addition is waiting to acquire a
+# ShareRowExclusiveLock.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab1_conc_child'::regclass AND mode = 'ShareRowExclusiveLock';"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+$background_psql1->quit;
+
+# Wait till the tables are added to the publication.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 2 FROM pg_publication_rel WHERE prrelid IN ('tab_conc'::regclass, 'tab1_conc'::regclass);"
+  )
+  or die
+  "Timed out while waiting for alter publication to add the table to the publication";
+
+$background_psql1->quit;
+
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (2);
+	INSERT INTO tab1_conc_child VALUES (2);
+));
+
+# Refresh the publication.
+$node_subscriber1->safe_psql('postgres',
+	'ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION');
+
+$node_subscriber1->wait_for_subscription_sync($node_publisher1, 'sub1');
+
+my $result = $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+$result =
+  $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab1_conc_child table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+# Perform an insert.
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (3);
+	INSERT INTO tab1_conc_child VALUES (3);
+));
+$node_publisher1->wait_for_catchup('sub1');
+
+# Verify that the insert is replicated to the subscriber.
+$result = $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab_conc added after table synchronization is replicated to the subscriber'
+);
+
+$result =
+  $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab1_conc_child added after table synchronization is replicated to the subscriber'
+);
+
+$node_publisher1->stop('fast');
+$node_subscriber1->stop('fast');
-- 
2.34.1

From 496463b1d4b4e7d0685f03144ed23c7c3b24a7a0 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Wed, 10 Jul 2024 21:43:43 +0530
Subject: [PATCH v5] Fix data loss during initial sync in logical replication.

Previously, when adding tables to a publication in PostgreSQL, they were
locked using ShareUpdateExclusiveLock mode. This mode allowed the lock to
succeed even if there were ongoing DML transactions on that table. As a
consequence, the ALTER PUBLICATION command could be completed before these
transactions, leading to a scenario where the catalog snapshot used for
replication did not include changes from transactions initiated before the
alteration.

To fix this issue, tables are now locked using ShareRowExclusiveLock mode
during the addition to a publication. This change ensures that the
ALTER PUBLICATION command waits for any ongoing transactions on the tables
(to be added to the publication) to be completed before proceeding. As a
result, transactions initiated before the publication alteration are
correctly included in the replication process.

Reported-by: Tomas Vondra
Diagnosed-by: Andres Freund
Author: Vignesh C, Tomas Vondra
Reviewed-by: Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com
---
 src/backend/commands/publicationcmds.c |  16 ++-
 src/test/subscription/t/100_bugs.pl    | 161 ++++++++++++++++++++++++-
 2 files changed, 171 insertions(+), 6 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 7ee8825522..8135db2cc0 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -548,8 +548,14 @@ RemovePublicationRelById(Oid proid)
 
 /*
  * Open relations specified by a RangeVar list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
- * add them to a publication.
+ *
+ * The returned tables are locked in ShareRowExclusiveLock mode to add them
+ * to a publication. The table needs to be locked in ShareRowExclusiveLock
+ * mode to ensure that any ongoing transactions involving that table are
+ * completed before adding it to the publication. Otherwise, the transaction
+ * initiated before the alteration of the publication will continue to use a
+ * catalog snapshot predating the publication change, leading to
+ * non-replication of these transaction changes.
  */
 static List *
 OpenTableList(List *tables)
@@ -571,7 +577,7 @@ OpenTableList(List *tables)
 		/* Allow query cancel in case this takes a long time */
 		CHECK_FOR_INTERRUPTS();
 
-		rel = table_openrv(rv, ShareUpdateExclusiveLock);
+		rel = table_openrv(rv, ShareRowExclusiveLock);
 		myrelid = RelationGetRelid(rel);
 
 		/*
@@ -583,7 +589,7 @@ OpenTableList(List *tables)
 		 */
 		if (list_member_oid(relids, myrelid))
 		{
-			table_close(rel, ShareUpdateExclusiveLock);
+			table_close(rel, ShareRowExclusiveLock);
 			continue;
 		}
 
@@ -601,7 +607,7 @@ OpenTableList(List *tables)
 			List	   *children;
 			ListCell   *child;
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
+			children = find_all_inheritors(myrelid, ShareRowExclusiveLock,
 										   NULL);
 
 			foreach(child, children)
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 7ebb97bbcf..bdbacef33c 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 9;
+use Test::More tests => 13;
 
 # Bug #15114
 
@@ -291,3 +291,162 @@ is( $node_subscriber_d_cols->safe_psql(
 
 $node_publisher_d_cols->stop('fast');
 $node_subscriber_d_cols->stop('fast');
+
+# The bug was that the incremental data synchronization was being skipped when
+# a new table is added to the publication in presence of a concurrent active
+# transaction performing the DML on the same table.
+my $node_publisher1 = get_new_node('node_publisher1');
+$node_publisher1->init(allows_streaming => 'logical');
+$node_publisher1->start;
+
+my $node_subscriber1 = get_new_node('node_subscriber1');
+$node_subscriber1->init(allows_streaming => 'logical');
+$node_subscriber1->start;
+
+$publisher_connstr = $node_publisher1->connstr . ' dbname=postgres';
+$node_publisher1->safe_psql('postgres',
+	"CREATE PUBLICATION pub1");
+$node_subscriber1->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1"
+);
+
+# Create tables in publisher and subscriber.
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+$node_subscriber1->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Initiate a background session that keeps a transaction active.
+my $background_psql1 = $node_publisher1->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+# Maintain an active transaction with the table.
+$background_psql1->set_query_timer_restart();
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab_conc VALUES (1);
+]);
+
+# Add the table to the publication using background_psql, as the alter
+# publication operation will wait for the lock and can only be completed after
+# the previous open transaction is committed.
+my $background_psql2 = $node_publisher1->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+$background_psql2->set_query_timer_restart();
+
+# This operation will wait because there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab_conc;\n");
+
+# Verify that the table addition is waiting to acquire a ShareRowExclusiveLock.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab_conc'::regclass AND mode = 'ShareRowExclusiveLock';"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+
+# Maintain an active transaction with inheritance table.
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab1_conc_child VALUES (1);
+]);
+
+# Add an inheritance table to the publication, this operation will wait because
+# there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab1_conc;\n");
+
+# Verify that the child table addition is waiting to acquire a
+# ShareRowExclusiveLock.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab1_conc_child'::regclass AND mode = 'ShareRowExclusiveLock';"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+$background_psql1->quit;
+
+# Wait till the tables are added to the publication.
+$node_publisher1->poll_query_until('postgres',
+	"SELECT COUNT(1) = 2 FROM pg_publication_rel WHERE prrelid IN ('tab_conc'::regclass, 'tab1_conc'::regclass);"
+  )
+  or die
+  "Timed out while waiting for alter publication to add the table to the publication";
+
+$background_psql1->quit;
+
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (2);
+	INSERT INTO tab1_conc_child VALUES (2);
+));
+
+# Refresh the publication.
+$node_subscriber1->safe_psql('postgres',
+	'ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION');
+
+$node_subscriber1->wait_for_subscription_sync($node_publisher1, 'sub1');
+
+my $result = $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+$result =
+  $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab1_conc_child table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+# Perform an insert.
+$node_publisher1->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (3);
+	INSERT INTO tab1_conc_child VALUES (3);
+));
+$node_publisher1->wait_for_catchup('sub1');
+
+# Verify that the insert is replicated to the subscriber.
+$result = $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab_conc added after table synchronization is replicated to the subscriber'
+);
+
+$result =
+  $node_subscriber1->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab1_conc_child added after table synchronization is replicated to the subscriber'
+);
+
+$node_publisher1->stop('fast');
+$node_subscriber1->stop('fast');
-- 
2.34.1

From acd506f960cb1851e47ed7c3966b71618d6d2182 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Wed, 10 Jul 2024 20:58:04 +0530
Subject: [PATCH v5] Fix data loss during initial sync in logical replication.

Previously, when adding tables to a publication in PostgreSQL, they were
locked using ShareUpdateExclusiveLock mode. This mode allowed the lock to
succeed even if there were ongoing DML transactions on that table. As a
consequence, the ALTER PUBLICATION command could be completed before these
transactions, leading to a scenario where the catalog snapshot used for
replication did not include changes from transactions initiated before the
alteration.

To fix this issue, tables are now locked using ShareRowExclusiveLock mode
during the addition to a publication. This change ensures that the
ALTER PUBLICATION command waits for any ongoing transactions on the tables
(to be added to the publication) to be completed before proceeding. As a
result, transactions initiated before the publication alteration are
correctly included in the replication process.

Reported-by: Tomas Vondra
Diagnosed-by: Andres Freund
Author: Vignesh C, Tomas Vondra
Reviewed-by: Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com
---
 src/backend/commands/publicationcmds.c |  16 ++-
 src/test/subscription/t/100_bugs.pl    | 142 +++++++++++++++++++++++++
 2 files changed, 153 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f4ba572697..4bd56edb9b 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -1549,8 +1549,14 @@ RemovePublicationSchemaById(Oid psoid)
 
 /*
  * Open relations specified by a PublicationTable list.
- * The returned tables are locked in ShareUpdateExclusiveLock mode in order to
- * add them to a publication.
+ *
+ * The returned tables are locked in ShareRowExclusiveLock mode to add them
+ * to a publication. The table needs to be locked in ShareRowExclusiveLock
+ * mode to ensure that any ongoing transactions involving that table are
+ * completed before adding it to the publication. Otherwise, the transaction
+ * initiated before the alteration of the publication will continue to use a
+ * catalog snapshot predating the publication change, leading to
+ * non-replication of these transaction changes.
  */
 static List *
 OpenTableList(List *tables)
@@ -1575,7 +1581,7 @@ OpenTableList(List *tables)
 		/* Allow query cancel in case this takes a long time */
 		CHECK_FOR_INTERRUPTS();
 
-		rel = table_openrv(t->relation, ShareUpdateExclusiveLock);
+		rel = table_openrv(t->relation, ShareRowExclusiveLock);
 		myrelid = RelationGetRelid(rel);
 
 		/*
@@ -1601,7 +1607,7 @@ OpenTableList(List *tables)
 						 errmsg("conflicting or redundant column lists for table \"%s\"",
 								RelationGetRelationName(rel))));
 
-			table_close(rel, ShareUpdateExclusiveLock);
+			table_close(rel, ShareRowExclusiveLock);
 			continue;
 		}
 
@@ -1629,7 +1635,7 @@ OpenTableList(List *tables)
 			List	   *children;
 			ListCell   *child;
 
-			children = find_all_inheritors(myrelid, ShareUpdateExclusiveLock,
+			children = find_all_inheritors(myrelid, ShareRowExclusiveLock,
 										   NULL);
 
 			foreach(child, children)
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 091da5a506..1d087a74c0 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -488,6 +488,148 @@ $result = $node_subscriber->safe_psql('postgres',
 is($result, qq(2|f
 3|t), 'check replicated update on subscriber');
 
+# The bug was that the incremental data synchronization was being skipped when
+# a new table is added to the publication in presence of a concurrent active
+# transaction performing the DML on the same table.
+
+# Create tables in publisher and subscriber.
+$node_publisher->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+$node_subscriber->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc(a int);
+	CREATE TABLE tab1_conc_child() inherits (tab1_conc);
+));
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Initiate a background session that keeps a transaction active.
+my $background_psql1 = $node_publisher->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+# Maintain an active transaction with the table.
+$background_psql1->set_query_timer_restart();
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab_conc VALUES (1);
+]);
+
+# Add the table to the publication using background_psql, as the alter
+# publication operation will wait for the lock and can only be completed after
+# the previous open transaction is committed.
+my $background_psql2 = $node_publisher->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+$background_psql2->set_query_timer_restart();
+
+# This operation will wait because there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab_conc;\n");
+
+# Verify that the table addition is waiting to acquire a ShareRowExclusiveLock.
+$node_publisher->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab_conc'::regclass AND mode = 'ShareRowExclusiveLock' AND waitstart IS NOT NULL;"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+
+# Maintain an active transaction with inheritance table.
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab1_conc_child VALUES (1);
+]);
+
+# Add an inheritance table to the publication, this operation will wait because
+# there is an open transaction holding a lock.
+$background_psql2->query_until(qr//,
+	"ALTER PUBLICATION pub1 ADD TABLE tab1_conc;\n");
+
+# Verify that the child table addition is waiting to acquire a
+# ShareRowExclusiveLock.
+$node_publisher->poll_query_until('postgres',
+	"SELECT COUNT(1) = 1 FROM pg_locks WHERE relation = 'tab1_conc_child'::regclass AND mode = 'ShareRowExclusiveLock' AND waitstart IS NOT NULL;"
+  )
+  or die
+  "Timed out while waiting for alter publication tries to wait on ShareRowExclusiveLock";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+$background_psql1->quit;
+
+# Wait till the tables are added to the publication.
+$node_publisher->poll_query_until('postgres',
+	"SELECT COUNT(1) = 2 FROM pg_publication_rel WHERE prrelid IN ('tab_conc'::regclass, 'tab1_conc'::regclass);"
+  )
+  or die
+  "Timed out while waiting for alter publication to add the table to the publication";
+
+$background_psql1->quit;
+
+$node_publisher->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (2);
+	INSERT INTO tab1_conc_child VALUES (2);
+));
+
+# Refresh the publication.
+$node_subscriber->safe_psql('postgres',
+	'ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION');
+
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'sub1');
+
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+$result =
+  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2),
+	'Ensure that the data from the tab1_conc_child table is synchronized to the subscriber after the subscription is refreshed'
+);
+
+# Perform an insert.
+$node_publisher->safe_psql(
+	'postgres', qq(
+	INSERT INTO tab_conc VALUES (3);
+	INSERT INTO tab1_conc_child VALUES (3);
+));
+$node_publisher->wait_for_catchup('sub1');
+
+# Verify that the insert is replicated to the subscriber.
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab_conc added after table synchronization is replicated to the subscriber'
+);
+
+$result =
+  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2
+3),
+	'Verify that the incremental data for table tab1_conc_child added after table synchronization is replicated to the subscriber'
+);
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');
 
-- 
2.34.1

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index cb36ca7b16..bd8c305c7d 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -487,6 +487,95 @@ $result =
 is( $result, qq(2|f
 3|t), 'check replicated update on subscriber');
 
+# Incremental data synchronization skipped when a new table is added, if
+# there is a concurrent active transaction involving the same table.
+
+# Create table in publisher and subscriber.
+$node_publisher->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc (a int);
+	CREATE TABLE tab1_conc_child () inherits (tab1_conc);
+));
+$node_subscriber->safe_psql(
+	'postgres', qq(
+	CREATE TABLE tab_conc(a int);
+	CREATE TABLE tab1_conc (a int);
+	CREATE TABLE tab1_conc_child () inherits (tab1_conc);
+));
+
+# Bump the query timeout to avoid false negatives on slow test systems.
+my $psql_timeout_secs = 4 * $PostgreSQL::Test::Utils::timeout_default;
+
+# Initiate a background session that keeps a transaction active.
+my $background_psql1 = $node_publisher->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+# Maintain an active transaction with the table.
+$background_psql1->set_query_timer_restart();
+$background_psql1->query_safe(
+	qq[
+	BEGIN;
+	INSERT INTO tab_conc VALUES (1);
+	INSERT INTO tab1_conc_child VALUES (1);
+]);
+
+# Add the table to the publication from background_psql
+my $background_psql2 = $node_publisher->background_psql(
+	'postgres',
+	on_error_stop => 0,
+	timeout => $psql_timeout_secs);
+
+$background_psql2->set_query_timer_restart();
+
+# This will wait as the open transaction holding a lock.
+$background_psql2->query_until(qr//, "ALTER PUBLICATION pub1 ADD TABLE tab_conc, tab1_conc;\n");
+
+$node_publisher->poll_query_until('postgres',
+"SELECT COUNT(1) = 2 FROM pg_publication_rel WHERE prrelid = 'tab_conc'::regclass OR prrelid = 'tab1_conc'::regclass;"
+  )
+  or die
+  "Timed out while waiting for the table tab_conc is added to pg_publication_rel";
+
+# Complete the old transaction.
+$background_psql1->query_safe(qq[COMMIT]);
+$background_psql1->quit;
+
+$background_psql1->query_safe(qq[INSERT INTO tab_conc VALUES (2)]);
+$background_psql1->query_safe(qq[INSERT INTO tab1_conc_child VALUES (2)]);
+$background_psql1->quit;
+
+# Refresh the publication
+$node_subscriber->safe_psql('postgres',
+	'ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION');
+
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'sub1');
+
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2), 'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed');
+
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2), 'Ensure that the data from the tab_conc table is synchronized to the subscriber after the subscription is refreshed');
+
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_conc values(3)");
+$node_publisher->safe_psql('postgres', "INSERT INTO tab1_conc_child values(3)");
+
+$node_publisher->wait_for_catchup('sub1');
+
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_conc");
+is( $result, qq(1
+2
+3), 'Verify that the incremental data added after table synchronization is replicated to the subscriber');
+
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab1_conc_child");
+is( $result, qq(1
+2
+3), 'Verify that the incremental data added after table synchronization is replicated to the subscriber');
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');
 

Reply via email to