Re: Synchronizing slots from primary to standby

2024-01-11 Thread shveta malik
On Thu, Jan 11, 2024 at 7:28 AM Peter Smith wrote: > > Here are some review comments for patch v58-0002 Thank You for the feedback. These are addressed in v60. Please find my response inline for a few. > (FYI - I quickly checked with the latest v59-0002 and AFAIK all these > review comments belo

Re: Synchronizing slots from primary to standby

2024-01-12 Thread shveta malik
On Fri, Jan 12, 2024 at 5:30 PM Masahiko Sawada wrote: > > On Thu, Jan 11, 2024 at 7:53 PM Amit Kapila wrote: > > > > On Tue, Jan 9, 2024 at 6:39 PM Amit Kapila wrote: > > > > > > +static bool > > > +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot) > > > { > > > ... > > > +

Re: Synchronizing slots from primary to standby

2024-01-15 Thread shveta malik
On Sat, Jan 13, 2024 at 12:54 PM Amit Kapila wrote: > > On Fri, Jan 12, 2024 at 5:50 PM shveta malik wrote: > > > > There are multiple approaches discussed and tried when it comes to > > starting a slot-sync worker. I am summarizing all here: > > > > 1)

Re: Synchronizing slots from primary to standby

2024-01-16 Thread shveta malik
On Tue, Jan 16, 2024 at 12:59 PM Masahiko Sawada wrote: > > On Tue, Jan 16, 2024 at 1:07 PM Amit Kapila wrote: > > > > On Tue, Jan 16, 2024 at 9:03 AM shveta malik wrote: > > > > > > On Sat, Jan 13, 2024 at 12:54 PM Amit Kapila > > > wrote: &g

Re: Synchronizing slots from primary to standby

2024-01-16 Thread shveta malik
On Sat, Jan 13, 2024 at 12:54 PM Amit Kapila wrote: > > On Fri, Jan 12, 2024 at 5:50 PM shveta malik wrote: > > > > There are multiple approaches discussed and tried when it comes to > > starting a slot-sync worker. I am summarizing all here: > > > > 1)

Re: Synchronizing slots from primary to standby

2024-01-16 Thread shveta malik
On Wed, Jan 17, 2024 at 6:43 AM Masahiko Sawada wrote: > > On Tue, Jan 16, 2024 at 6:40 PM shveta malik wrote: > > > > On Tue, Jan 16, 2024 at 12:59 PM Masahiko Sawada > > wrote: > > > > > > On Tue, Jan 16, 2024 at 1:07 PM Amit Kapila > > &g

Re: Synchronizing slots from primary to standby

2024-01-17 Thread shveta malik
On Wed, Jan 17, 2024 at 3:08 PM Bertrand Drouvot wrote: > > Hi, > > On Tue, Jan 16, 2024 at 05:27:05PM +0530, shveta malik wrote: > > PFA v62. Details: > > Thanks! > > > v62-003: > > It is a new patch which attempts to implement slot-sync worker as a

Re: Synchronizing slots from primary to standby

2024-01-18 Thread shveta malik
On Thu, Jan 18, 2024 at 10:31 AM Peter Smith wrote: > > I have one question about the new code in v63-0002. > > == > src/backend/replication/logical/slotsync.c > > 1. ReplSlotSyncWorkerMain > > + Assert(SlotSyncWorker->pid == InvalidPid); > + > + /* > + * Startup process signaled the slot sync

Re: Synchronizing slots from primary to standby

2024-01-18 Thread shveta malik
On Fri, Jan 19, 2024 at 11:23 AM Amit Kapila wrote: > > > > 5 === (coming from v62-0002) > > > + Assert(tuplestore_tuple_count(res->tuplestore) == 1); > > > > > > Is it even possible for the related query to not return only one row? (I > > > think the > > > "count" ensures it). > > > > I th

Re: Synchronizing slots from primary to standby

2024-01-19 Thread shveta malik
On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada wrote: > > > Thank you for updating the patch. I have some comments: > > --- > +latestWalEnd = GetWalRcvLatestWalEnd(); > +if (remote_slot->confirmed_lsn > latestWalEnd) > +{ > +elog(ERROR, "exiting from slot

Re: Synchronizing slots from primary to standby

2024-01-19 Thread shveta malik
On Thu, Jan 18, 2024 at 4:49 PM Amit Kapila wrote: > 2. > +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot) > { > ... > + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > + xmin_horizon = GetOldestSafeDecodingTransactionId(true); > + SpinLockAcquire(&slot->mutex); > + slot->dat

Re: Synchronizing slots from primary to standby

2024-01-22 Thread shveta malik
On Mon, Jan 22, 2024 at 12:28 PM Amit Kapila wrote: > > On Fri, Jan 19, 2024 at 3:55 PM shveta malik wrote: > > > > On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada > > wrote: > > > > > > > > > Thank you for updating the patch. I have some co

Re: Synchronizing slots from primary to standby

2024-01-22 Thread shveta malik
On Fri, Jan 19, 2024 at 11:48 AM Peter Smith wrote: > > Here are some review comments for patch v63-0003. Thanks Peter. I have addressed all in v65. > > 4b. > It was a bit different when there were ERRORs but now they are LOGs > somehow it seems wrong for this function to say what the *caller* w

Re: Synchronizing slots from primary to standby

2024-01-23 Thread shveta malik
On Tue, Jan 23, 2024 at 9:45 AM Peter Smith wrote: > > Here are some review comments for v65-0002 Thanks Peter for the feedback. I have addressed these in v66. > > 4. GetStandbyFlushRecPtr > > /* > - * Returns the latest point in WAL that has been safely flushed to disk, and > - * can be sent t

Re: Synchronizing slots from primary to standby

2024-01-24 Thread shveta malik
On Wed, Jan 24, 2024 at 2:38 PM Bertrand Drouvot wrote: > > Hi, > > On Wed, Jan 24, 2024 at 01:51:54PM +0530, Amit Kapila wrote: > > On Wed, Jan 24, 2024 at 11:24 AM Masahiko Sawada > > wrote: > > > > > > On Wed, Jan 24, 2024 at 2:43 PM Amit Kapila > > > wrote: > > > > > > > > > > > > > > +/*

Re: Synchronizing slots from primary to standby

2024-01-24 Thread shveta malik
On Thu, Jan 25, 2024 at 9:13 AM Amit Kapila wrote: > > > 3) Removed syncing 'failover' on standby from remote_slot. The > > 'failover' field will be false for synced slots. Since we do not > > support sync to cascading standbys yet, thus failover=true was > > misleading and unused there. > > > > B

Re: Synchronizing slots from primary to standby

2024-01-25 Thread shveta malik
On Thu, Jan 25, 2024 at 10:39 AM Peter Smith wrote: > 2. synchronize_one_slot > > + /* > + * Sanity check: Make sure that concerned WAL is received and flushed > + * before syncing slot to target lsn received from the primary server. > + * > + * This check should never pass as on the primary serv

Re: Synchronizing slots from primary to standby

2024-01-30 Thread shveta malik
On Tue, Jan 30, 2024 at 11:31 AM Amit Kapila wrote: > > In this regard, I feel we don't need to dump/restore the 'FAILOVER' > option non-binary upgrade paths similar to the 'ENABLE' option. For > binary upgrade, if the failover option is enabled, then we can enable > it using Alter Subscription SE

Re: Synchronizing slots from primary to standby

2024-02-01 Thread shveta malik
On Thu, Feb 1, 2024 at 11:21 AM Peter Smith wrote: > > Here are some review comments for v740001. Thanks Peter for the feedback. > == > src/sgml/logicaldecoding.sgml > > 1. > + > +Replication Slot Synchronization > + > + A logical replication slot on the primary can be synchro

Re: Synchronizing slots from primary to standby

2024-02-01 Thread shveta malik
On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada wrote: > > --- > +static char * > +wait_for_valid_params_and_get_dbname(void) > +{ > + char *dbname; > + int rc; > + > + /* Sanity check. */ > + Assert(enable_syncslot); > + > + for (;;) > + { > + if (validate_paramete

Re: Synchronizing slots from primary to standby

2024-02-02 Thread shveta malik
On Fri, Feb 2, 2024 at 12:25 PM Peter Smith wrote: > > Here are some review comments for v750002. Thanks for the feedback Peter. Addressed all in v76 except one. > (this is a WIP but this is what I found so far...) > I wonder if it is better to log all the problems in one go instead of > making

Re: Synchronizing slots from primary to standby

2024-02-05 Thread shveta malik
On Mon, Feb 5, 2024 at 4:36 PM Amit Kapila wrote: > > I have pushed the first patch. Next, a few comments on 0002 are as follows: Thanks for the feedback Amit. Some of these are addressed in v78. Rest will be addressed in the next version. > 1. > +static bool > +validate_parameters_and_get_dbnam

Re: Synchronizing slots from primary to standby

2024-02-06 Thread shveta malik
On Tue, Feb 6, 2024 at 7:21 PM Zhijie Hou (Fujitsu) wrote: > > > --- > > +/* Slot sync worker objects */ > > +extern PGDLLIMPORT char *PrimaryConnInfo; extern PGDLLIMPORT char > > +*PrimarySlotName; > > > > These two variables are declared also in xlogrecovery.h. Is it intentional? > > If so, I >

Re: Synchronizing slots from primary to standby

2024-02-07 Thread shveta malik
On Tue, Feb 6, 2024 at 12:25 PM Bertrand Drouvot wrote: > > Hi, > That said, I still think the commit message needs some re-wording, what about? > > = > If a logical slot on the primary is valid but is invalidated on the standby, > then that slot is dropped and can be recreated on the standby

Re: Synchronizing slots from primary to standby

2024-02-08 Thread shveta malik
On Thu, Feb 8, 2024 at 12:08 PM Peter Smith wrote: > > Here are some review comments for patch v80_2-0001. Thanks for the feedback Peter. Addressed the comments in v81. Attached patch001 for early feedback. Rest of the patches need rebasing and thus will post those later. It also addresses comme

Re: Synchronizing slots from primary to standby

2024-02-08 Thread shveta malik
On Thu, Feb 8, 2024 at 4:03 PM Amit Kapila wrote: > > Few comments on 0001 > === Thanks Amit. Addressed these in v81. > 1. > + * the slots on the standby and synchronize them. This is done on every call > + * to SQL function pg_sync_replication_slots. > > > > I think the second s

Re: Synchronizing slots from primary to standby

2024-02-08 Thread shveta malik
On Thu, Feb 8, 2024 at 4:31 PM shveta malik wrote: > > On Thu, Feb 8, 2024 at 12:08 PM Peter Smith wrote: > > > > Here are some review comments for patch v80_2-0001. > > Thanks for the feedback Peter. Addressed the comments in v81. Missed to mention, Hou-san helped in

Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-03-14 Thread shveta malik
On Thu, Mar 14, 2024 at 10:57 AM Amit Kapila wrote: > > On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada wrote: > > > > This fact makes me think that the slotsync worker might be able to > > accept the primary_conninfo value even if there is no dbname in the > > value. That is, if there is no dbna

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-14 Thread shveta malik
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy wrote: > > On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila wrote: > > > > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > > > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > > as a text field. However, I still think

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-15 Thread shveta malik
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy wrote: > > While we wait to hear from others on this, I'm attaching the v9 patch > set implementing the above idea (check 0001 patch). Please have a > look. I'll come back to the other review comments soon. > patch002: 1) I would like to understa

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-24 Thread shveta malik
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy wrote: > > I've attached the v18 patch set here. Thanks for the patches. Please find few comments: patch 001: 1) slot.h: + /* The time at which this slot become inactive */ + TimestampTz last_inactive_time; become -->became -

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-24 Thread shveta malik
On Mon, Mar 25, 2024 at 10:33 AM shveta malik wrote: > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > wrote: > > > > I've attached the v18 patch set here. > I have a question. Don't we allow creating subscriptions on an existing slot with a non

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 11:53 AM shveta malik wrote: > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik wrote: > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > wrote: > > > > > > I've attached the v18 patch set here. I have one

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot wrote: > > Hi, > > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik > > wrote: > > > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik &

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot wrote: > > Hi, > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > Yes, you are right, last_inactive_time

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy wrote: > > Right. Done that way i.e. not setting the last_inactive_time for slots > both while releasing the slot and restoring from the disk. > > Also, I've added a TAP function to check if the captured times are > sane per Bertrand's review comme

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila wrote: > > I think we should keep pg_alter_replication_slot() as the last > priority among the remaining patches for this release. Let's try to > first finish the primary functionality of inactive_timeout patch. > Otherwise, I agree that the problem repo

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 12:43 PM shveta malik wrote: > > I have one concern, for synced slots on standby, how do we disallow > invalidation due to inactive-timeout immediately after promotion? > > For synced slots, last_inactive_time and inactive_timeout are both > set. Let&

Re: pgsql: Track last_inactive_time in pg_replication_slots.

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot wrote: > > Hi, > > On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 6:57 PM Robert Haas wrote: > > > And I'm suspicious that having an exception for slots being synced is > > > a bad idea. That makes too much

Re: pgsql: Track last_inactive_time in pg_replication_slots.

2024-03-25 Thread shveta malik
On Tue, Mar 26, 2024 at 1:50 AM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart > wrote: > > > > On Mon, Mar 25, 2024 at 04:49:12PM +, Bertrand Drouvot wrote: > > > On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote: > > >> In the same vein, I think deact

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot wrote: > > > > The issue that I can see with your proposal is: what if one synced the slots > > manually (with pg_sync_replication_slots()) but does not use the sync > > worker? > > Then I think ShutDownSlotSync() is not going to help in that case

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 9:30 AM shveta malik wrote: > > > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik > > wrote: > > > > > > I have one concern, for synced slots on standby, how do we d

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy wrote: > > > > > 1) > > Commti msg: > > > > ensures the value is set to current timestamp during the > > shutdown to help correctly interpret the time if the standby gets > > promoted without a restart. > > > > shutdown --> shutdown of slot sync wo

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
update inactive_since on standby only at 3 occasions: 1) at the time of creation of the synced slot. 2) during standby restart. 3) during promotion of standby. I have attached a sample patch for this idea as.txt file. I am fine with any of these approaches. One gives data synced from primary for syn

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot wrote: > > Hi, > > > I think there may have been some misunderstanding here. > > Indeed ;-) > > > But now if I > > rethink this, I am fine with 'inactive_since' getting synced from > > primary to standby. But if we do that, we need to add docs stati

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 4:18 PM shveta malik wrote: > > > > > What about another approach?: inactive_since gives data synced from > > > primary for > > > synced slots and another dedicate

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > wrote: > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the slot

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila wrote: > > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy > wrote: > > > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > > wrote: > > > > > 3) > > > update_synced_slots_inactive_time(): > > > > > > This assert is removed, is it intentional?

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot wrote: > > > > We can think on that later if we really need another > > field which give us sync time. > > I think that calling GetCurrentTimestamp() so frequently could be too costly, > so > I'm not sure we should. Agreed. > > In my second appro

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy wrote: > > Fixed an issue in synchronize_slots where DatumGetLSN is being used in > place of DatumGetTimestampTz. Found this via CF bot member [1], not on > my dev system. > > Please find the attached v6 patch. Thanks for the patch. Few trivial t

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy wrote: > > On Wed, Mar 27, 2024 at 11:39 AM shveta malik wrote: > > > > Thanks for the patch. Few trivial things: > > Thanks for reviewing. > > > -- > > 1) > > system-views.sgml: > > &g

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy wrote: > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > standby for sync slots. 0002 implementing inactive timeout GUC based > invalidation mechanism. > > Please have a look. Thanks for the patches. v29-001 looks good t

Re: Synchronizing slots from primary to standby

2024-03-28 Thread shveta malik
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) wrote: > > Attach a new version patch which fixed an un-initialized variable issue and > added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so > that > we can analyze the possible CFbot failures easily. As suggested by A

Re: Synchronizing slots from primary to standby

2024-03-31 Thread shveta malik
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote: > > On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy > wrote: > > > > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > [2] The steps to reproduce the data miss issue on a primary->standby > > > setup: > > > > I'm tr

Re: Synchronizing slots from primary to standby

2024-04-01 Thread shveta malik
On Mon, Apr 1, 2024 at 5:05 PM Amit Kapila wrote: > > > 2 === > > > > + { > > + if (SnapBuildSnapshotExists(remote_slot->restart_lsn)) > > + { > > > > That could call SnapBuildSnapshotExists() multiple times for the same > > "restart_lsn" (for example in case of m

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread shveta malik
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot wrote: > > Hi, > > On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote: > > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy > > > > FWIW, coming to this thread late, I think that the inactive_since > > should not be synchronized from t

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread shveta malik
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy wrote: > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables > &g

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Wed, Apr 3, 2024 at 3:36 PM Amit Kapila wrote: > > On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote: > > > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy > > wrote: > > > > > I quickly looked at v8, and have a nit, rest all looks good. > > > > > > +if (DecodingContextReady(ctx) &

promotion related handling in pg_sync_replication_slots()

2024-04-04 Thread shveta malik
Hello hackers, Currently, promotion related handling is missing in the slot sync SQL function pg_sync_replication_slots(). Here is the background on how it is done in slot sync worker: During promotion, the startup process in order to shut down the slot-sync worker, sets the 'stopSignaled' flag,

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > Prior to commit 2ec005b, this check was okay, as we did not expect > restart_lsn of the synced slot to be ahead of remote since we were > directly copying the lsns. But now when we use 'advance' to do logical > d

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy wrote: > > On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila wrote: > > > > > Thanks for the changes. v34-0001 LGTM. > > > > I was doing a final review before pushing 0001 and found that > > 'inactive_since' could be set twice during startup after promoti

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot wrote: > > Hi, > > On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote: > > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > > > > > > > Prior to commit 2ec005b, this check was okay,

Re: promotion related handling in pg_sync_replication_slots()

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 5:17 PM Amit Kapila wrote: > > On Thu, Apr 4, 2024 at 5:05 PM shveta malik wrote: > > > > Hello hackers, > > > > Currently, promotion related handling is missing in the slot sync SQL > > function pg_sync_replication_slots(). Here is

Re: Synchronizing slots from primary to standby

2024-04-05 Thread shveta malik
On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot wrote: > > What about something like? > > ereport(LOG, > errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs from > remote slot", > remote_slot->name), > errdetail("Remote slot has LSN %X/%X but local slot has L

Re: Synchronizing slots from primary to standby

2024-04-05 Thread shveta malik
On Fri, Apr 5, 2024 at 4:31 PM Bertrand Drouvot wrote: > > BTW, I just realized that the LSN I used in my example in the > LSN_FORMAT_ARGS() > are not the right ones. Noted. Thanks. Please find v3 with the comments addressed. thanks Shveta v3-0001-Correct-sanity-check-to-compare-confirmed_ls

Re: Synchronizing slots from primary to standby

2024-02-12 Thread shveta malik
On Tue, Feb 13, 2024 at 6:45 AM Zhijie Hou (Fujitsu) wrote: > > On Monday, February 12, 2024 5:40 PM Amit Kapila > wrote: > > Thanks for the comments, I have addressed them. > > Here is the new version patch which addressed above and > most of Bertrand's comments. Thanks for the patch. I am tr

Re: Synchronizing slots from primary to standby

2024-02-13 Thread shveta malik
On Fri, Feb 9, 2024 at 10:04 AM Amit Kapila wrote: > > +reserve_wal_for_local_slot(XLogRecPtr restart_lsn) > { > ... > + /* > + * Find the oldest existing WAL segment file. > + * > + * Normally, we can determine it by using the last removed segment > + * number. However, if no WAL segment files ha

Re: About a recently-added message

2024-02-14 Thread shveta malik
On Thu, Feb 15, 2024 at 8:26 AM Amit Kapila wrote: > > On Wed, Feb 14, 2024 at 7:51 PM Euler Taveira wrote: > > > > On Wed, Feb 14, 2024, at 8:45 AM, Amit Kapila wrote: > > > > Now, I am less clear about whether to quote "logical" or not in the > > above message. Do you have any suggestions? > >

Re: Synchronizing slots from primary to standby

2024-02-18 Thread shveta malik
On Sun, Feb 18, 2024 at 7:40 PM Zhijie Hou (Fujitsu) wrote: > > On Friday, February 16, 2024 6:41 PM shveta malik > wrote: > Thanks for the patch. Here are few comments: Thanks for the comments. > > 2. > > +static bool > +validate_remote_info(WalRecei

Re: A new message seems missing a punctuation

2024-02-18 Thread shveta malik
On Mon, Feb 19, 2024 at 10:31 AM Robert Haas wrote: > > On Mon, Feb 19, 2024 at 10:10 AM Kyotaro Horiguchi > wrote: > > A recent commit (7a424ece48) added the following message: > > > > > could not sync slot information as remote slot precedes local slot: > > > remote slot "%s": LSN (%X/%X), cata

Re: About a recently-added message

2024-02-18 Thread shveta malik
On Mon, Feb 19, 2024 at 11:10 AM Amit Kapila wrote: > > On Thu, Feb 15, 2024 at 11:49 AM Kyotaro Horiguchi > wrote: > > > > At Thu, 15 Feb 2024 09:22:23 +0530, shveta malik > > wrote in > > > > > > +1 on changing the msg(s) suggested way. Plea

Re: Synchronizing slots from primary to standby

2024-02-19 Thread shveta malik
On Mon, Feb 19, 2024 at 5:32 PM Amit Kapila wrote: > > Few comments on 0001 Thanks for the feedback. > > 1. I think it is better to error out when the valid GUC or option is > not set in ensure_valid_slotsync_params() and > ensure_valid_remote_info() instead of waiting. And

Re: Synchronizing slots from primary to standby

2024-02-19 Thread shveta malik
On Tue, Feb 20, 2024 at 8:25 AM Masahiko Sawada wrote: > > > I've reviewed the v91 patch. Here are random comments: Thanks for the comments. > --- > /* > * Checks the remote server info. > * > - * We ensure that the 'primary_slot_name' exists on the remote server and the > - * remote server

Re: About a recently-added message

2024-02-20 Thread shveta malik
On Tue, Feb 20, 2024 at 2:13 PM Amit Kapila wrote: > > I would prefer the changed ones as those clearly explain the problem > without additional information. okay, attached v2 patch with changed error msgs and double quotes around logical. thanks Shveta v2-0001-Fix-quotation-of-variable-names.

Re: A new message seems missing a punctuation

2024-02-20 Thread shveta malik
On Tue, Feb 20, 2024 at 5:01 PM Amit Kapila wrote: > > > We do expose the required information (restart_lsn, catalog_xmin, > synced, temporary, etc) via pg_replication_slots. So, I feel the LOG > message here is sufficient to DEBUG (or know the details) when the > slot sync doesn't succeed. > Ple

'Shutdown <= SmartShutdown' check while launching processes in postmaster.

2024-02-20 Thread shveta malik
Hi hackers, I would like to know that why we have 'Shutdown <= SmartShutdown' check before launching few processes (WalReceiver, WalSummarizer, AutoVacuum worker) while rest of the processes (BGWriter, WalWriter, Checkpointer, Archiver etc) do not have any such check. If I have to launch a new pro

Re: Synchronizing slots from primary to standby

2024-02-20 Thread shveta malik
On Tue, Feb 20, 2024 at 6:19 PM Masahiko Sawada wrote: > > On Tue, Feb 20, 2024 at 12:33 PM shveta malik wrote: > Thank you for the explanation. It makes sense to me to move the check. > > > 2. If the wal_level is not logical, the server will need to restart > anyway to cha

Re: 'Shutdown <= SmartShutdown' check while launching processes in postmaster.

2024-02-21 Thread shveta malik
On Wed, Feb 21, 2024 at 10:01 AM Tom Lane wrote: > > shveta malik writes: > > I would like to know that why we have 'Shutdown <= SmartShutdown' > > check before launching few processes (WalReceiver, WalSummarizer, > > AutoVacuum worker) while rest

Re: Synchronizing slots from primary to standby

2024-02-21 Thread shveta malik
On Wed, Feb 21, 2024 at 5:19 PM Amit Kapila wrote: > > A few minor comments: Thanks for the feedback. > = > 1. > +/* > + * Is stopSignaled set in SlotSyncCtx? > + */ > +bool > +IsStopSignaledSet(void) > +{ > + bool signaled; > + > + SpinLockAcquire(&SlotSyncCtx->mutex); > + signa

Re: Synchronizing slots from primary to standby

2024-02-21 Thread shveta malik
On Thu, Feb 22, 2024 at 10:31 AM shveta malik wrote: > > Please find patch001 attached. There is a CFBot failure in patch002. > The test added there needs some adjustment. We will rebase and post > rest of the patches once we fix that issue. > There was a recent commit 801792e t

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Thu, Feb 22, 2024 at 3:44 PM Bertrand Drouvot wrote: > > Hi, > > Thanks! > > Some random comments about v92_001 (Sorry if it has already been discussed > up-thread): Thanks for the feedback. The patch is pushed 15 minutes back. I will prepare a top-up patch for your comments. > 1 === > > +

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot wrote: > > Suppose that in synchronize_slots() the query would be: > > const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn," > " restart_lsn, catalog_xmin, two_phase, failover," > " database, conflict_reason" >

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Fri, Feb 23, 2024 at 8:35 AM shveta malik wrote: > > On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot > wrote: > > > > Suppose that in synchronize_slots() the query would be: > > > > const char *query = "SELECT slot_name, plugin, confirmed

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Fri, Feb 23, 2024 at 10:06 AM Zhijie Hou (Fujitsu) wrote: > > > I noticed one CFbot failure[1] which is because the tap-test doesn't wait for > the > standby to catch up before promoting, thus the data inserted after promotion > could not be replicated to the subscriber. Add a wait_for_replay_

Re: Synchronizing slots from primary to standby

2024-02-23 Thread shveta malik
On Fri, Feb 23, 2024 at 1:28 PM Bertrand Drouvot wrote: > > Hi, > > Because one could create say the "=" OPERATOR in their own schema, attach a > function to it doing undesired stuff and change the search_path for the > database > the sync slot worker connects to. > > Then this new "=" operator w

Re: Synchronizing slots from primary to standby

2024-02-25 Thread shveta malik
t). Thanks for the patch, changes look good. I have corporated it in the patch which addresses the rest of your comments in [1]. I have attached the patch as .txt [1]: https://www.postgresql.org/message-id/ZdcejBDCr%2BwlVGnO%40ip-10-97-1-34.eu-west-3.compute.internal thanks Shveta From f1489f783

Regardign RecentFlushPtr in WalSndWaitForWal()

2024-02-26 Thread shveta malik
Hi hackers, I would like to understand why we have code [1] that retrieves RecentFlushPtr in WalSndWaitForWal() outside of the loop. We utilize RecentFlushPtr later within the loop, but prior to that, we already have [2]. Wouldn't [2] alone be sufficient? Just to check the impact, I ran 'make che

Re: Synchronizing slots from primary to standby

2024-02-26 Thread shveta malik
On Mon, Feb 26, 2024 at 5:18 PM Amit Kapila wrote: > > > > > + if (!ok) > > > > + GUC_check_errdetail("List syntax is invalid."); > > > > + > > > > + /* > > > > +* If there is a syntax error in the name or if the replication > > > > slots' > > > > +* data

Re: Synchronizing slots from primary to standby

2024-02-27 Thread shveta malik
On Wed, Feb 28, 2024 at 8:49 AM Amit Kapila wrote: > > > Few comments: Thanks for the feedback. > === > 1. > - if (logical) > + if (logical || !replication) > { > > Can we add a comment about connection types that require > ALWAYS_SECURE_SEARCH_PATH_SQL? > > 2. > Can we add a test

Re: Synchronizing slots from primary to standby

2024-02-28 Thread shveta malik
On Wed, Feb 28, 2024 at 1:33 PM Bertrand Drouvot wrote: > > Hi, > A few comments: Thanks for reviewing. > > 1 === > > +* used to run normal SQL queries > > s/run normal SQL/run SQL/ ? > > As mentioned up-thread I don't like that much the idea of creating such a test > but if we do then h

Re: Synchronizing slots from primary to standby

2024-02-29 Thread shveta malik
On Thu, Feb 29, 2024 at 7:04 AM Zhijie Hou (Fujitsu) wrote: > > Here is the v101 patch set which addressed above comments. > Thanks for the patch. Few comments: 1) Shall we mention in doc that shutdown will wait for standbys in standby_slot_names to confirm receiving WAL: Suggestion for logica

Re: Synchronizing slots from primary to standby

2024-02-29 Thread shveta malik
On Wed, Feb 28, 2024 at 4:51 PM Amit Kapila wrote: > > On Wed, Feb 28, 2024 at 3:26 PM shveta malik wrote: > > > > > > Here is the patch which addresses the above comments. Also optimized > > the test a little bit. Now we use pg_sync_replication_slots() function >

Add comment to specify timeout unit in ConditionVariableTimedSleep()

2024-03-04 Thread shveta malik
Hi hackers, ConditionVariableTimedSleep() accepts a timeout parameter, but it doesn't explicitly state the unit for the timeout anywhere. To determine this, one needs to look into the details of the function to find it out from the comments of the internally called function WaitLatch(). It would b

Re: Synchronizing slots from primary to standby

2024-03-04 Thread shveta malik
lush-position(and not on each change) should be outside of this function. It is too embedded. And the comment which states the order of wait (first flush and then standbys confirmation) should be outside the for-loop in WalSndWaitForWal(), but yes we do need both the comments. Attached a patch (.txt

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread shveta malik
On Tue, Mar 5, 2024 at 1:25 PM Heikki Linnakangas wrote: > SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED > mode, when you pass need_lock=true. So that at least should be changed > to false. > Also don't we need to release the lock when we return here: /* * Nothing to do f

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread shveta malik
On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot wrote: > > > /* > > * Nothing to do for physical slots as we collect stats only for logical > > * slots. > > */ > > if (SlotIsPhysical(slot)) > > return; > > D'oh! Thanks! Fixed in v2 shared up-thread. Thanks. Can we try to get rid of multiple LwLo

Re: Synchronizing slots from primary to standby

2024-03-06 Thread shveta malik
On Wed, Mar 6, 2024 at 6:54 PM Zhijie Hou (Fujitsu) wrote: > > On Wednesday, March 6, 2024 9:13 PM Zhijie Hou (Fujitsu) > wrote: > > > > On Wednesday, March 6, 2024 11:04 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > On Wednesday, March 6, 2024 9:30 AM Masahiko Sawada > > > wrote: > > > > > >

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-06 Thread shveta malik
On Wed, Mar 6, 2024 at 2:36 PM Bertrand Drouvot wrote: > > Hi, > > On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote: > > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot > > wrote: > > Thanks. Can we try to get rid of multiple LwLockRelease in > >

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-06 Thread shveta malik
On Thu, Mar 7, 2024 at 11:12 AM Michael Paquier wrote: > > Yeah, it is possible that what you retrieve from > pgstat_fetch_replslot() does not refer exactly to the slot's content > under concurrent activity, but you cannot protect a single scan of > pg_stat_replication_slots as of an effect of it

Re: Synchronizing slots from primary to standby

2024-03-07 Thread shveta malik
On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian wrote: > >> Pushed with minor modifications. I'll keep an eye on BF. >> >> BTW, one thing that we should try to evaluate a bit more is the >> traversal of slots in StandbySlotsHaveCaughtup() where we verify if >> all the slots mentioned in standby_slot_n

  1   2   3   4   5   >