Hi hackers,

When testing some other logical replication related patches, I found two unexpected behaviours of ALTER SUBSCRIPTION ADD/DROP PUBLICATION.
(1) When I execute the following SQL, the data of the tables registered to 'pub' isn't copied to the subscriber side, which means the tablesync worker didn't start.

-----
sub originally had 2 publications (pub, pub2)
ALTER SUBSCRIPTION sub DROP PUBLICATION pub;
ALTER SUBSCRIPTION sub ADD PUBLICATION pub;
-----

(2) And when I execute the following SQL, the data of the table registered to 'pub2' is synced again.

-----
sub originally had 2 publications (pub, pub2)
ALTER SUBSCRIPTION sub DROP PUBLICATION pub;
ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
-----

After looking into this problem, I think the reason is that ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION misuses the function AlterSubscription_refresh().

For the DROP case:
Currently, AlterSubscription_refresh() first fetches the target tables from the target publication, and also fetches the subscriber-side tables from pg_subscription_rel. Then it checks each table from the local pg_subscription_rel, and if the table does not exist among the tables fetched from the target publication, it drops it.

The logic above only works for SET PUBLICATION. For DROP PUBLICATION, however, the tables fetched from the target publication are actually the tables that need to be dropped. If we reuse the above logic, it drops the wrong tables, which results in the unexpected behaviour in (1) and (2). (ADD PUBLICATION has a similar problem.)

So, I think we'd better fix this problem. I tried adding some additional checks in AlterSubscription_refresh() which avoid the problem, as in the attached patch. I'm not sure whether we need to refactor further.

Best regards,
houzj
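To illustrate the DROP case described above, here is a toy model of the reconciliation logic in Python. It is only a sketch under stated assumptions, not the PostgreSQL code or the attached patch: plain sets stand in for pg_subscription_rel and for the table lists fetched from the publisher, the table names t1 and t2 are hypothetical, and the helpers refresh() and tables_of() are made up for this example. It also assumes each publication holds one distinct table.

```python
# Toy model only: sets stand in for pg_subscription_rel and for the table
# lists fetched from the publisher. Table names t1/t2 are hypothetical.
PUB_TABLES = {"pub": {"t1"}, "pub2": {"t2"}}


def refresh(local_rels, target_tables):
    """Reconcile local_rels against target_tables, SET PUBLICATION style.

    Returns (new_local_rels, added, removed). Tables in `added` are the
    ones that would get an initial copy (a tablesync worker).
    """
    added = target_tables - local_rels
    removed = local_rels - target_tables
    return (local_rels - removed) | added, added, removed


def tables_of(pubs):
    """Union of the tables published by the given publications."""
    result = set()
    for p in pubs:
        result |= PUB_TABLES[p]
    return result


# The subscription starts with both publications, both tables already synced.
start = {"t1", "t2"}

# DROP PUBLICATION pub: the refresh logic is handed the tables of the
# publication being dropped, so it keeps t1 and removes t2.
after_drop, drop_added, drop_removed = refresh(start, tables_of(["pub"]))

# Case (1): ADD PUBLICATION pub afterwards (assumed here to pass only the
# added publication's tables). t1 is still listed locally, so nothing is
# added and no initial copy is started for it.
after_add, add_added, add_removed = refresh(after_drop, tables_of(["pub"]))

# Case (2): REFRESH PUBLICATION afterwards, with only pub2 subscribed.
# t2 was wrongly removed, so it is added back and copied again.
after_refresh, refresh_added, refresh_removed = refresh(
    after_drop, tables_of(["pub2"])
)

print("after DROP:", sorted(after_drop), "removed:", sorted(drop_removed))
print("case (1) newly synced:", sorted(add_added))
print("case (2) newly synced:", sorted(refresh_added))
```

In this model the DROP step keeps t1 and removes t2, which is the reverse of what the command should do, and both symptoms follow from that one wrong step.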
0001-fix-ALTER-SUB-ADD-DROP-PUBLICATION.patch