On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignes...@gmail.com> wrote: > > Amit and I engaged in an offline discussion regarding the design and > contemplated that it could be like below:
If I understand correctly, does this require the sequences to already exist on the subscribing node before creating the subscription, or will it also copy any non-existing sequences? > 1) CREATE PUBLICATION syntax enhancement: > CREATE PUBLICATION ... FOR ALL SEQUENCES; > The addition of a new column titled "all sequences" in the > pg_publication system table will signify whether the publication is > designated as all sequences publication or not. > > 2) CREATE SUBSCRIPTION -- no syntax change. > Upon creation of a subscription, the following additional steps will > be managed by the subscriber: > i) The subscriber will retrieve the list of sequences associated with > the subscription's publications. > ii) For each sequence: a) Retrieve the sequence value from the > publisher by invoking the pg_sequence_state function. b) Set the > sequence with the value obtained from the publisher. iv) Once the > subscription creation is completed, all sequence values will become > visible at the subscriber's end. > > An alternative design approach could involve retrieving the sequence > list from the publisher during subscription creation and inserting the > sequences with an "init" state into the pg_subscription_rel system > table. These tasks could be executed by a single sequence sync worker, > which would: > i) Retrieve the list of sequences in the "init" state from the > pg_subscription_rel system table. > ii) Initiate a transaction. > iii) For each sequence: a) Obtain the sequence value from the > publisher by utilizing the pg_sequence_state function. b) Update the > sequence with the value obtained from the publisher. > iv) Commit the transaction. > > The benefit with the second approach is that if there are large number > of sequences, the sequence sync can be enhanced to happen in parallel > and also if there are any locks held on the sequences in the > publisher, the sequence worker can wait to acquire the lock instead of > blocking the whole create subscription command which will delay the > initial copy of the tables too. Yeah w.r.t. this point second approach seems better. > 3) Refreshing the sequence can be achieved through the existing > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > here). > The subscriber identifies stale sequences, meaning sequences present > in pg_subscription_rel but absent from the publication, and removes > them from the pg_subscription_rel system table. The subscriber also > checks for newly added sequences in the publisher and synchronizes > their values from the publisher using the steps outlined in the > subscription creation process. It's worth noting that previously > synchronized sequences won't be synchronized again; the sequence sync > will occur solely for the newly added sequences. > 4) Introducing a new command for refreshing all sequences: ALTER > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > The subscriber will remove stale sequences and add newly added > sequences from the publisher. Following this, it will re-synchronize > the sequence values for all sequences in the updated list from the > publisher, following the steps outlined in the subscription creation > process. Okay, this answers my first question: we will remove the sequences that are removed from the publisher and add the new sequences. I don't see any problem with this, but doesn't it seem like we are effectively doing DDL replication only for sequences without having a comprehensive plan for overall DDL replication? > 5) Incorporate the pg_sequence_state function to fetch the sequence > value from the publisher, along with the page LSN. Incorporate > SetSequence function, which will procure a new relfilenode for the > sequence and set the new relfilenode with the specified value. This > will facilitate rollback in case of any failures. I do not understand this point, you mean whenever we are fetching the sequence value from the publisher we need to create a new relfilenode on the subscriber? Why not just update the catalog tuple is sufficient? Or this is for handling the ALTER SEQUENCE case? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com