On Thu, Jan 18, 2024 at 8:16 AM Peter Smith <smithpb2...@gmail.com> wrote:
>
> Hi,
>
> I had reported a possible subscription 'disable_on_error' bug found
> while reviewing another patch.
>
> I am including my initial report and Nisha's analysis again here so
> that this topic has its own thread.
>
> ==================
> INITIAL REPORT [1]
> ==================
>
> ...
> I see now that any ALTER of the subscription's connection, even to
> some value that fails, will restart a new worker (like ALTER of any
> other subscription parameters). For a bad connection, it will continue
> to relaunch-worker/ERROR over and over. e.g.
>
> ----------
> test_sub=# \r2024-01-17 09:34:28.665 AEDT [11274] LOG: logical replication apply worker for subscription "sub4" has started
> 2024-01-17 09:34:28.666 AEDT [11274] ERROR: could not connect to the publisher: invalid port number: "-1"
> 2024-01-17 09:34:28.667 AEDT [928] LOG: background worker "logical replication apply worker" (PID 11274) exited with exit code 1
> 2024-01-17 09:34:33.669 AEDT [11391] LOG: logical replication apply worker for subscription "sub4" has started
> 2024-01-17 09:34:33.669 AEDT [11391] ERROR: could not connect to the publisher: invalid port number: "-1"
> 2024-01-17 09:34:33.670 AEDT [928] LOG: background worker "logical replication apply worker" (PID 11391) exited with exit code 1
> etc...
> ----------
>
> While experimenting with the bad connection ALTER I also tried setting
> 'disable_on_error' like below:
>
> ALTER SUBSCRIPTION sub4 SET (disable_on_error);
> ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1';
>
> ...but here the subscription did not become DISABLED as I expected it
> would do on the next connection error iteration. It remains enabled
> and just continues to loop relaunch/ERROR indefinitely same as before.
>
> That looks like it may be a bug. Thoughts?

Although we can improve it to handle this case too, I'm not sure it's
a bug. The doc says[1]:

    Specifies whether the subscription should be automatically disabled
    if any errors are detected by subscription workers during data
    replication from the publisher.

When an apply worker is trying to establish a connection, it's not
replicating data from the publisher.

Regards,

[1] https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-PARAMS-WITH-DISABLE-ON-ERROR

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com