On Fri, 18 Jul 2025 at 14:11, Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Fri, Jul 18, 2025 at 10:44 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Thu, Jul 17, 2025 at 4:52 PM vignesh C <vignes...@gmail.com> wrote:
> > >
>
> I was looking at the high level idea of the sequence sync worker patch,
> i.e. 0005. So far I haven't found anything problematic there, but I
> haven't completed the review and testing yet. Here are some comments
> I have from reading through the patch. I will try to do a more
> thorough review and testing next week.
>
> 1.
> + /*
> +  * Count running sync workers for this subscription, while we have the
> +  * lock.
> +  */
> + nsyncworkers = logicalrep_sync_worker_count(MyLogicalRepWorker->subid);
> +
> + /* Now safe to release the LWLock */
> + LWLockRelease(LogicalRepWorkerLock);
> +
> + /*
> +  * If there is a free sync worker slot, start a new sequencesync worker,
> +  * and break from the loop.
> +  */
> + if (nsyncworkers < max_sync_workers_per_subscription)
> + {
> +     TimestampTz now = GetCurrentTimestamp();
> +
> +     /*
> +      * To prevent starting the sequencesync worker at a high frequency
> +      * after a failure, we store its last failure time. We start the
> +      * sequencesync worker again after waiting at least
> +      * wal_retrieve_retry_interval.
> +      */
> +     if (!MyLogicalRepWorker->sequencesync_failure_time ||
> +         TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time,
> +                                    now, wal_retrieve_retry_interval))
> +     {
> +         MyLogicalRepWorker->sequencesync_failure_time = 0;
> +
> +         if (!logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC,
> +                                       MyLogicalRepWorker->dbid,
> +                                       MySubscription->oid,
> +                                       MySubscription->name,
> +                                       MyLogicalRepWorker->userid,
> +                                       InvalidOid,
> +                                       DSM_HANDLE_INVALID))
> +             MyLogicalRepWorker->sequencesync_failure_time = now;
> +     }
>
> This code seems to duplicate much of the logic found in
> ProcessSyncingTablesForApply() within its final else block, with only
> minor differences (perhaps 1-2 lines).
>
> To improve code maintainability and avoid redundancy, consider
> extracting the common logic into a static function. This function
> could then be called from both places.
Modified.

> 2.
> +/*
> + * Common function to setup the leader apply, tablesync worker and
> + * sequencesync worker.
> + */
>
> Change to "Common function to setup the leader apply, tablesync and
> sequencesync worker"

Modified.

> 3.
> + /*
> +  * To prevent starting the sequencesync worker at a high frequency
> +  * after a failure, we store its last failure time. We start the
> +  * sequencesync worker again after waiting at least
> +  * wal_retrieve_retry_interval.
> +  */
>
> We haven't explained the rationale behind comparing with the last
> failure time for the sequencesync worker, whereas for the tablesync
> worker we compare with the last start time.

Since we use a single sequencesync worker to handle all sequence
synchronization, I had considered marking a failure when the worker
exits and using that as the trigger for retries. However, since
tablesync relies on the start time for retries, it is more consistent
to apply the same approach here. The v20250720 version of the patch
attached at [1] includes this change.

[1] - https://www.postgresql.org/message-id/CALDaNm2swnY6nYAg%3D%3D7-4ah3yyaBQ_5wyr57p%3D%2BvtpfuSOT%2Bag%40mail.gmail.com

Regards,
Vignesh