Fair question, I should have been more elaborate.

When I have the creation of a replication slot, before I insert any data
into the table in my example (the SQL-file named "broken" in my gist),
after I then create the subscription on the receiving side, it does the
initial sync of data and then the replication fails with these messages
repeating on the receiving server:

[67] LOG:  logical replication apply worker for subscription "widgets_sub"
has started
[67] DEBUG:  connecting to publisher using connection string "dbname=testdb
host=master port=5432 user=repuser password=abc123"
[67] ERROR:  could not receive data from WAL stream: ERROR:  publication
"widgets_pub" does not exist
  CONTEXT:  slot "my_slot", output plugin "pgoutput", in the change
callback, associated LSN 0/1674958
[1] DEBUG:  unregistering background worker "logical replication worker for
subscription 16396"
[1] LOG:  background worker "logical replication worker" (PID 67) exited
with exit code 1

I can verify that the publication called widgets_pub does exist, and I am
not seeing any errors on the sending side.

The SQL-file named "works" just has the creation of the replication slot
moved down to after I insert some data into the table I want to replicate.
And that works as expected.


On Wed, 20 May 2020 at 15:00, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2020-05-19 09:22, Anders Bøgh Bruun wrote:
> > I have run into a (to me) weird issue with logical replication. We are
> > running Zalandos postgres-operator in our Kubernetes clusters and have
> > recently had a use-case where we wanted to start doing logical
> > replication of select tables to a data warehouse, also running postgres.
> > It worked as expected at first, but then after a pod-restart in
> > Kubernetes, the replication slots that were created for the subscription
> > were gone. A bit of reading later, and I learn we need to tell Patroni
> > which slots should be permanently available, so we specify a slot and
> > try to set this up, but then run into an error which says the
> > publication does not exist, even though we can verify that it does. At
> > first I suspected Patroni handling the replication slots to be the cause
> > of the problem, but about a week's worth of learning and experimenting
> > later, I can now reliably replicate the problem in pure postgres.
> > Patroni is kind of the catalyst, since my findings are that if the
> > replication slot is created before data is inserted into the source
> > database, and a publication is created, then it breaks. If the
> > replication slot is created after data is inserted and the publication
> > is created, then it works. We just can't tell Patroni to not create it
> > until some arbitrary point in time. I am guessing this is either a bug
> > or a case of us not knowing what we are doing...
>
> What does "works" and "breaks" mean?  Are there error messages shown
> from commands or in the server logs?  Is replication progressing, but
> doing something your are not expecting?  etc.
>
> --
> Peter Eisentraut              http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


-- 
Anders Bøgh Bruun

Infrastructure Architect

CellPoint digital
cellpointdigital.com
*WE MAKE TRAVEL EASIER™*

M: +45 31 14 87 41
E: and...@cellpointdigital.com

Chicago | *Copenhagen* | Dubai | London | Miami | Pune | Singapore

Reply via email to