On Sat, Feb 13, 2021 at 5:58 PM Erik Rijkers <e...@xs4all.nl> wrote:
>
> > On 02/13/2021 11:49 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Fri, Feb 12, 2021 at 10:00 PM <e...@xs4all.nl> wrote:
> > >
> > > > On 02/12/2021 1:51 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > >
> > > > On Fri, Feb 12, 2021 at 6:04 PM Erik Rijkers <e...@xs4all.nl> wrote:
> > > > >
> > > > > I am seeing errors in replication in a test program that I've been 
> > > > > running for years with very little change (since 2017, really [1]).
> > >
> > > Hi,
> > >
> > > Here is a test program.  Careful, it deletes stuff.  And it will need 
> > > some changes:
> > >
> >
> > Thanks for sharing the test. I think I have found the problem.
> > Actually, it was an existing code problem exposed by the commit
> > ce0fdbfe97. In pgoutput_begin_txn(), we were sometimes sending the
> > prepare_write ('w') message but then the actual message was not being
> > sent. This was the case when we didn't find the origin of a txn. This
> > can happen after that commit because we have now started using origins
> > for tablesync workers as well and those origins are removed once the
> > tablesync workers are finished. We might want to change the behavior
> > related to the origin messages as indicated in the comments but for
> > now, fixing the existing code.
> >
> > Can you please test if the attached fixes the problem at your end as well?
>
> > [fix_origin_message_1.patch]
>
> I compiled just now a binary from HEAD, and a binary from HEAD+patch
>
> HEAD is still broken; your patch rescues it, so yes, fixed.
>
> Maybe a test (check or check-world) should be added to run a second replica?  
> (Assuming that would have caught this bug)
>

+1 for the idea of having a test for this; I have written one and attached it.
Thanks for the fix, Amit. I could reproduce the issue without your fix
and verified that it is resolved with the patch you shared.
Thoughts?
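
For readers following the thread, the failure mode Amit describes above (a 'w' header sent without the message that should follow it) can be modeled with a small toy program. This is only a sketch under stated assumptions: it is not the pgoutput code and not the fix in fix_origin_message_1.patch, and every name in it (Stream, prepare_write, flush_write, send_origin_buggy, send_origin_fixed, count_empty_frames) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an outgoing replication stream (hypothetical type). */
typedef struct {
    char   buf[64];       /* frame under construction */
    size_t len;           /* bytes currently in buf */
    int    frames_sent;   /* frames flushed so far */
    int    empty_frames;  /* flushed frames that held only the header */
} Stream;

/* Open a frame by writing the 'w' header byte. */
static void prepare_write(Stream *s)
{
    s->len = 0;
    s->buf[s->len++] = 'w';
}

/* Append a payload to the open frame. */
static void append(Stream *s, const char *payload)
{
    size_t n = strlen(payload);

    assert(s->len + n <= sizeof s->buf);
    memcpy(s->buf + s->len, payload, n);
    s->len += n;
}

/* Flush the frame and record whether it carried any payload. */
static void flush_write(Stream *s)
{
    s->frames_sent++;
    if (s->len == 1)
        s->empty_frames++;
    s->len = 0;
}

/* Buggy shape: the header is written even when no origin is found. */
static void send_origin_buggy(Stream *s, const char *origin)
{
    prepare_write(s);
    if (origin != NULL)
        append(s, origin);
    flush_write(s);
}

/* Fixed shape: open the frame only when there is something to send. */
static void send_origin_fixed(Stream *s, const char *origin)
{
    if (origin == NULL)
        return;
    prepare_write(s);
    append(s, origin);
    flush_write(s);
}

/* Run one variant and report how many header-only frames it emitted. */
int count_empty_frames(bool fixed, const char *origin)
{
    Stream s = {0};

    if (fixed)
        send_origin_fixed(&s, origin);
    else
        send_origin_buggy(&s, origin);
    return s.empty_frames;
}
```

Calling count_empty_frames(false, NULL) models the case where the origin has already been removed and the buggy shape still emits a header-only frame; count_empty_frames(true, NULL) models the guarded shape, which emits nothing in that case.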

Regards,
Vignesh
From eff13970ae34bcdc8590a9602eddce5eb6200195 Mon Sep 17 00:00:00 2001
From: vignesh <vignesh@localhost.localdomain>
Date: Mon, 15 Feb 2021 11:41:55 +0530
Subject: [PATCH v1] Test for verifying data is replicated in cascaded setup.

Test for verifying data is replicated in cascaded setup.
---
 src/test/subscription/t/100_bugs.pl | 65 +++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index d1e407a..afb2d08 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -153,3 +153,68 @@ is($node_twoways->safe_psql('d2', "SELECT count(f) FROM t"),
 	$rows * 2, "2x$rows rows in t");
 is($node_twoways->safe_psql('d2', "SELECT count(f) FROM t2"),
 	$rows * 2, "2x$rows rows in t2");
+
+# Verify table data is synced with cascaded replication setup.
+my $node_pub = get_new_node('testpublisher1');
+$node_pub->init(allows_streaming => 'logical');
+$node_pub->start;
+
+my $node_pub_sub = get_new_node('testpublisher_subscriber');
+$node_pub_sub->init(allows_streaming => 'logical');
+$node_pub_sub->start;
+
+my $node_sub = get_new_node('testsubscriber1');
+$node_sub->init(allows_streaming => 'logical');
+$node_sub->start;
+
+# Create the tables in all nodes.
+$node_pub->safe_psql('postgres', "CREATE TABLE tab1 (a int)");
+$node_pub_sub->safe_psql('postgres', "CREATE TABLE tab1 (a int)");
+$node_sub->safe_psql('postgres', "CREATE TABLE tab1 (a int)");
+
+# Create a cascaded replication setup like:
+# N1 - Create publication testpub1.
+# N2 - Create publication testpub2 and subscription testsub1, which subscribes
+#      to testpub1.
+# N3 - Create subscription testsub2, which subscribes to testpub2.
+$node_pub->safe_psql('postgres',
+	"CREATE PUBLICATION testpub1 FOR TABLE tab1");
+
+$node_pub_sub->safe_psql('postgres',
+	"CREATE PUBLICATION testpub2 FOR TABLE tab1");
+
+my $publisher1_connstr = $node_pub->connstr . ' dbname=postgres';
+my $publisher2_connstr = $node_pub_sub->connstr . ' dbname=postgres';
+
+$node_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION testsub2 CONNECTION '$publisher2_connstr' PUBLICATION testpub2"
+);
+
+$node_pub_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION testsub1 CONNECTION '$publisher1_connstr' PUBLICATION testpub1"
+);
+
+$node_pub->safe_psql('postgres',
+	"INSERT INTO tab1 values(generate_series(1,10))");
+
+# Wait for the data to cascade from testpub1 to testsub1 and further from
+# testpub2 (which had testsub1) to testsub2.
+$node_pub->wait_for_catchup('testsub1');
+$node_pub_sub->wait_for_catchup('testsub2');
+
+# Drop subscriptions as we don't need them anymore
+$node_pub_sub->safe_psql('postgres', "DROP SUBSCRIPTION testsub1");
+$node_sub->safe_psql('postgres', "DROP SUBSCRIPTION testsub2");
+
+# Drop publications as we don't need them anymore
+$node_pub->safe_psql('postgres', "DROP PUBLICATION testpub1");
+$node_pub_sub->safe_psql('postgres', "DROP PUBLICATION testpub2");
+
+# Clean up the tables on all three nodes as we don't need them
+$node_pub->safe_psql('postgres', "DROP TABLE tab1");
+$node_pub_sub->safe_psql('postgres', "DROP TABLE tab1");
+$node_sub->safe_psql('postgres', "DROP TABLE tab1");
+
+$node_pub->stop('fast');
+$node_pub_sub->stop('fast');
+$node_sub->stop('fast');
-- 
1.8.3.1