On Sat, Feb 13, 2021 at 5:58 PM Erik Rijkers <e...@xs4all.nl> wrote:
>
> > On 02/13/2021 11:49 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Fri, Feb 12, 2021 at 10:00 PM <e...@xs4all.nl> wrote:
> > >
> > > > On 02/12/2021 1:51 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > >
> > > > On Fri, Feb 12, 2021 at 6:04 PM Erik Rijkers <e...@xs4all.nl> wrote:
> > > > >
> > > > > I am seeing errors in replication in a test program that I've been
> > > > > running for years with very little change (since 2017, really [1]).
> > >
> > > Hi,
> > >
> > > Here is a test program. Careful, it deletes stuff. And it will need
> > > some changes:
> > >
> >
> > Thanks for sharing the test. I think I have found the problem.
> > Actually, it was an existing code problem exposed by the commit
> > ce0fdbfe97. In pgoutput_begin_txn(), we were sometimes sending the
> > prepare_write ('w') message but then the actual message was not being
> > sent. This was the case when we didn't find the origin of a txn. This
> > can happen after that commit because we have now started using origins
> > for tablesync workers as well and those origins are removed once the
> > tablesync workers are finished. We might want to change the behavior
> > related to the origin messages as indicated in the comments but for
> > now, fixing the existing code.
> >
> > Can you please test if the attached fixes the problem at your end as well?
>
> > [fix_origin_message_1.patch]
>
> I compiled just now a binary from HEAD, and a binary from HEAD+patch
>
> HEAD is still broken; your patch rescues it, so yes, fixed.
>
> Maybe a test (check or check-world) should be added to run a second replica?
> (Assuming that would have caught this bug)
>

+1 for the idea of having a test for this. I have written one.

Thanks for the fix, Amit. I could reproduce the issue without your fix and
verified that it is fixed with the patch you shared. Attached is a patch
adding the test. Thoughts?

Regards,
Vignesh

From eff13970ae34bcdc8590a9602eddce5eb6200195 Mon Sep 17 00:00:00 2001
From: vignesh <vignesh@localhost.localdomain>
Date: Mon, 15 Feb 2021 11:41:55 +0530
Subject: [PATCH v1] Test for verifying data is replicated in cascaded setup.

Test for verifying data is replicated in cascaded setup.
---
 src/test/subscription/t/100_bugs.pl | 65 +++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index d1e407a..afb2d08 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -153,3 +153,68 @@ is($node_twoways->safe_psql('d2', "SELECT count(f) FROM t"),
 	$rows * 2, "2x$rows rows in t");
 is($node_twoways->safe_psql('d2', "SELECT count(f) FROM t2"),
 	$rows * 2, "2x$rows rows in t2");
+
+# Verify table data is synced with cascaded replication setup.
+my $node_pub = get_new_node('testpublisher1');
+$node_pub->init(allows_streaming => 'logical');
+$node_pub->start;
+
+my $node_pub_sub = get_new_node('testpublisher_subscriber');
+$node_pub_sub->init(allows_streaming => 'logical');
+$node_pub_sub->start;
+
+my $node_sub = get_new_node('testsubscriber1');
+$node_sub->init(allows_streaming => 'logical');
+$node_sub->start;
+
+# Create the tables in all nodes.
+$node_pub->safe_psql('postgres', "CREATE TABLE tab1 (a int)");
+$node_pub_sub->safe_psql('postgres', "CREATE TABLE tab1 (a int)");
+$node_sub->safe_psql('postgres', "CREATE TABLE tab1 (a int)");
+
+# Create a cascaded replication setup like:
+# N1 - Create publication testpub1.
+# N2 - Create publication testpub2 and also include subscriber which subscribes
+# to testpub1.
+# N3 - Create subscription testsub2 subscribes to testpub2.
+$node_pub->safe_psql('postgres',
+	"CREATE PUBLICATION testpub1 FOR TABLE tab1");
+
+$node_pub_sub->safe_psql('postgres',
+	"CREATE PUBLICATION testpub2 FOR TABLE tab1");
+
+my $publisher1_connstr = $node_pub->connstr . ' dbname=postgres';
+my $publisher2_connstr = $node_pub_sub->connstr . ' dbname=postgres';
+
+$node_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION testsub2 CONNECTION '$publisher2_connstr' PUBLICATION testpub2"
+);
+
+$node_pub_sub->safe_psql('postgres',
+	"CREATE SUBSCRIPTION testsub1 CONNECTION '$publisher1_connstr' PUBLICATION testpub1"
+);
+
+$node_pub->safe_psql('postgres',
+	"INSERT INTO tab1 values(generate_series(1,10))");
+
+# Verify that the data is cascaded from testpub1 to testsub1 and further from
+# testpub2 (which had testsub1) to testsub2.
+$node_pub->wait_for_catchup('testsub1');
+$node_pub_sub->wait_for_catchup('testsub2');
+
+# Drop subscriptions as we don't need them anymore
+$node_pub_sub->safe_psql('postgres', "DROP SUBSCRIPTION testsub1");
+$node_sub->safe_psql('postgres', "DROP SUBSCRIPTION testsub2");
+
+# Drop publications as we don't need them anymore
+$node_pub->safe_psql('postgres', "DROP PUBLICATION testpub1");
+$node_pub_sub->safe_psql('postgres', "DROP PUBLICATION testpub2");
+
+# Clean up the tables on both publisher and subscriber as we don't need them
+$node_pub->safe_psql('postgres', "DROP TABLE tab1");
+$node_pub_sub->safe_psql('postgres', "DROP TABLE tab1");
+$node_sub->safe_psql('postgres', "DROP TABLE tab1");
+
+$node_pub->stop('fast');
+$node_pub_sub->stop('fast');
+$node_sub->stop('fast');
-- 
1.8.3.1
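
For readers following the quoted analysis, the failure mode it describes (a 'w' frame is started in pgoutput_begin_txn() but no actual message follows when the origin of the txn is not found) can be illustrated with a small toy model. This is only a sketch of one reading of that description, not PostgreSQL source: `ToyStream`, `prepare_write`, `append_payload`, `write_out`, `begin_txn_buggy`, `begin_txn_fixed`, and `count_empty_frames` are all hypothetical names invented for the illustration, and the `origin_found` flag merely stands in for the origin lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FRAMES 8
#define MAX_PAYLOAD 32

/* Toy output stream: each frame is an implicit 'w' header plus a payload. */
typedef struct
{
    char frames[MAX_FRAMES][MAX_PAYLOAD]; /* payloads of frames already sent */
    int  nframes;                         /* number of frames sent so far */
    char cur[MAX_PAYLOAD];                /* payload currently being assembled */
    bool open;                            /* true between prepare and write */
} ToyStream;

/* Start a new frame (hypothetical stand-in for a prepare-write step). */
void prepare_write(ToyStream *s)
{
    assert(!s->open);
    s->cur[0] = '\0';
    s->open = true;
}

/* Add message content to the frame being assembled. */
void append_payload(ToyStream *s, const char *msg)
{
    assert(s->open);
    strncat(s->cur, msg, MAX_PAYLOAD - strlen(s->cur) - 1);
}

/* Send the assembled frame, even if nothing was appended to it. */
void write_out(ToyStream *s)
{
    assert(s->open && s->nframes < MAX_FRAMES);
    strcpy(s->frames[s->nframes++], s->cur);
    s->open = false;
}

/* Assumed buggy shape: the second frame is started before the lookup result
 * is known, so a failed lookup still sends a frame with no payload. */
void begin_txn_buggy(ToyStream *s, bool has_origin_id, bool origin_found)
{
    prepare_write(s);
    append_payload(s, "BEGIN");
    if (has_origin_id)
    {
        write_out(s);      /* message boundary */
        prepare_write(s);  /* frame started unconditionally */
        if (origin_found)
            append_payload(s, "ORIGIN");
    }
    write_out(s);
}

/* Assumed fixed shape: start the second frame only when there is an origin
 * message to put in it. */
void begin_txn_fixed(ToyStream *s, bool has_origin_id, bool origin_found)
{
    prepare_write(s);
    append_payload(s, "BEGIN");
    if (has_origin_id && origin_found)
    {
        write_out(s);      /* message boundary */
        prepare_write(s);
        append_payload(s, "ORIGIN");
    }
    write_out(s);
}

/* Count frames that were sent with an empty payload. */
int count_empty_frames(const ToyStream *s)
{
    int empty = 0;

    for (int i = 0; i < s->nframes; i++)
        if (s->frames[i][0] == '\0')
            empty++;
    return empty;
}
```

Calling `begin_txn_buggy` on a zero-initialized `ToyStream` with an origin id set but the lookup failing leaves one empty frame in the stream, while `begin_txn_fixed` with the same arguments sends only the BEGIN frame. In the thread's scenario the lookup can fail because tablesync origins are removed once the tablesync workers finish, which is the situation the cascaded test in the patch sets up.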