On Thu, Jan 11, 2024 at 7:28 AM Peter Smith wrote:
>
> Here are some review comments for patch v58-0002
Thank You for the feedback. These are addressed in v60. Please find my
response inline for a few.
> (FYI - I quickly checked with the latest v59-0002 and AFAIK all these
> review comments belo
On Fri, Jan 12, 2024 at 5:30 PM Masahiko Sawada wrote:
>
> On Thu, Jan 11, 2024 at 7:53 PM Amit Kapila wrote:
> >
> > On Tue, Jan 9, 2024 at 6:39 PM Amit Kapila wrote:
> > >
> > > +static bool
> > > +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot)
> > > {
> > > ...
> > > +
On Sat, Jan 13, 2024 at 12:54 PM Amit Kapila wrote:
>
> On Fri, Jan 12, 2024 at 5:50 PM shveta malik wrote:
> >
> > There are multiple approaches discussed and tried when it comes to
> > starting a slot-sync worker. I am summarizing all here:
> >
> > 1)
On Tue, Jan 16, 2024 at 12:59 PM Masahiko Sawada wrote:
>
> On Tue, Jan 16, 2024 at 1:07 PM Amit Kapila wrote:
> >
> > On Tue, Jan 16, 2024 at 9:03 AM shveta malik wrote:
> > >
> > > On Sat, Jan 13, 2024 at 12:54 PM Amit Kapila
> > > wrote:
&g
On Sat, Jan 13, 2024 at 12:54 PM Amit Kapila wrote:
>
> On Fri, Jan 12, 2024 at 5:50 PM shveta malik wrote:
> >
> > There are multiple approaches discussed and tried when it comes to
> > starting a slot-sync worker. I am summarizing all here:
> >
> > 1)
On Wed, Jan 17, 2024 at 6:43 AM Masahiko Sawada wrote:
>
> On Tue, Jan 16, 2024 at 6:40 PM shveta malik wrote:
> >
> > On Tue, Jan 16, 2024 at 12:59 PM Masahiko Sawada
> > wrote:
> > >
> > > On Tue, Jan 16, 2024 at 1:07 PM Amit Kapila
> > &g
On Wed, Jan 17, 2024 at 3:08 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Tue, Jan 16, 2024 at 05:27:05PM +0530, shveta malik wrote:
> > PFA v62. Details:
>
> Thanks!
>
> > v62-003:
> > It is a new patch which attempts to implement slot-sync worker as a
On Thu, Jan 18, 2024 at 10:31 AM Peter Smith wrote:
>
> I have one question about the new code in v63-0002.
>
> ==
> src/backend/replication/logical/slotsync.c
>
> 1. ReplSlotSyncWorkerMain
>
> + Assert(SlotSyncWorker->pid == InvalidPid);
> +
> + /*
> + * Startup process signaled the slot sync
On Fri, Jan 19, 2024 at 11:23 AM Amit Kapila wrote:
>
> > > 5 === (coming from v62-0002)
> > > + Assert(tuplestore_tuple_count(res->tuplestore) == 1);
> > >
> > > Is it even possible for the related query to not return only one row? (I
> > > think the
> > > "count" ensures it).
> >
> > I th
On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada wrote:
>
>
> Thank you for updating the patch. I have some comments:
>
> ---
> +latestWalEnd = GetWalRcvLatestWalEnd();
> +if (remote_slot->confirmed_lsn > latestWalEnd)
> +{
> +elog(ERROR, "exiting from slot
On Thu, Jan 18, 2024 at 4:49 PM Amit Kapila wrote:
> 2.
> +synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot)
> {
> ...
> + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> + xmin_horizon = GetOldestSafeDecodingTransactionId(true);
> + SpinLockAcquire(&slot->mutex);
> + slot->dat
On Mon, Jan 22, 2024 at 12:28 PM Amit Kapila wrote:
>
> On Fri, Jan 19, 2024 at 3:55 PM shveta malik wrote:
> >
> > On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada
> > wrote:
> > >
> > >
> > > Thank you for updating the patch. I have some co
On Fri, Jan 19, 2024 at 11:48 AM Peter Smith wrote:
>
> Here are some review comments for patch v63-0003.
Thanks Peter. I have addressed all in v65.
>
> 4b.
> It was a bit different when there were ERRORs but now they are LOGs
> somehow it seems wrong for this function to say what the *caller* w
On Tue, Jan 23, 2024 at 9:45 AM Peter Smith wrote:
>
> Here are some review comments for v65-0002
Thanks Peter for the feedback. I have addressed these in v66.
>
> 4. GetStandbyFlushRecPtr
>
> /*
> - * Returns the latest point in WAL that has been safely flushed to disk, and
> - * can be sent t
On Wed, Jan 24, 2024 at 2:38 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Wed, Jan 24, 2024 at 01:51:54PM +0530, Amit Kapila wrote:
> > On Wed, Jan 24, 2024 at 11:24 AM Masahiko Sawada
> > wrote:
> > >
> > > On Wed, Jan 24, 2024 at 2:43 PM Amit Kapila
> > > wrote:
> > > >
> > > > >
> > > > > +/*
On Thu, Jan 25, 2024 at 9:13 AM Amit Kapila wrote:
>
> > 3) Removed syncing 'failover' on standby from remote_slot. The
> > 'failover' field will be false for synced slots. Since we do not
> > support sync to cascading standbys yet, thus failover=true was
> > misleading and unused there.
> >
>
> B
On Thu, Jan 25, 2024 at 10:39 AM Peter Smith wrote:
> 2. synchronize_one_slot
>
> + /*
> + * Sanity check: Make sure that concerned WAL is received and flushed
> + * before syncing slot to target lsn received from the primary server.
> + *
> + * This check should never pass as on the primary serv
On Tue, Jan 30, 2024 at 11:31 AM Amit Kapila wrote:
>
> In this regard, I feel we don't need to dump/restore the 'FAILOVER'
> option non-binary upgrade paths similar to the 'ENABLE' option. For
> binary upgrade, if the failover option is enabled, then we can enable
> it using Alter Subscription SE
On Thu, Feb 1, 2024 at 11:21 AM Peter Smith wrote:
>
> Here are some review comments for v740001.
Thanks Peter for the feedback.
> ==
> src/sgml/logicaldecoding.sgml
>
> 1.
> +
> +Replication Slot Synchronization
> +
> + A logical replication slot on the primary can be synchro
On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada wrote:
>
> ---
> +static char *
> +wait_for_valid_params_and_get_dbname(void)
> +{
> + char *dbname;
> + int rc;
> +
> + /* Sanity check. */
> + Assert(enable_syncslot);
> +
> + for (;;)
> + {
> + if (validate_paramete
On Fri, Feb 2, 2024 at 12:25 PM Peter Smith wrote:
>
> Here are some review comments for v750002.
Thanks for the feedback Peter. Addressed all in v76 except one.
> (this is a WIP but this is what I found so far...)
> I wonder if it is better to log all the problems in one go instead of
> making
On Mon, Feb 5, 2024 at 4:36 PM Amit Kapila wrote:
>
> I have pushed the first patch. Next, a few comments on 0002 are as follows:
Thanks for the feedback Amit. Some of these are addressed in v78. Rest
will be addressed in the next version.
> 1.
> +static bool
> +validate_parameters_and_get_dbnam
On Tue, Feb 6, 2024 at 7:21 PM Zhijie Hou (Fujitsu)
wrote:
>
> > ---
> > +/* Slot sync worker objects */
> > +extern PGDLLIMPORT char *PrimaryConnInfo; extern PGDLLIMPORT char
> > +*PrimarySlotName;
> >
> > These two variables are declared also in xlogrecovery.h. Is it intentional?
> > If so, I
>
On Tue, Feb 6, 2024 at 12:25 PM Bertrand Drouvot
wrote:
>
> Hi,
> That said, I still think the commit message needs some re-wording, what about?
>
> =
> If a logical slot on the primary is valid but is invalidated on the standby,
> then that slot is dropped and can be recreated on the standby
On Thu, Feb 8, 2024 at 12:08 PM Peter Smith wrote:
>
> Here are some review comments for patch v80_2-0001.
Thanks for the feedback Peter. Addressed the comments in v81.
Attached patch001 for early feedback. Rest of the patches need
rebasing and thus will post those later.
It also addresses comme
On Thu, Feb 8, 2024 at 4:03 PM Amit Kapila wrote:
>
> Few comments on 0001
> ===
Thanks Amit. Addressed these in v81.
> 1.
> + * the slots on the standby and synchronize them. This is done on every call
> + * to SQL function pg_sync_replication_slots.
> >
>
> I think the second s
On Thu, Feb 8, 2024 at 4:31 PM shveta malik wrote:
>
> On Thu, Feb 8, 2024 at 12:08 PM Peter Smith wrote:
> >
> > Here are some review comments for patch v80_2-0001.
>
> Thanks for the feedback Peter. Addressed the comments in v81.
Missed to mention, Hou-san helped in
On Thu, Mar 14, 2024 at 10:57 AM Amit Kapila wrote:
>
> On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada wrote:
> >
> > This fact makes me think that the slotsync worker might be able to
> > accept the primary_conninfo value even if there is no dbname in the
> > value. That is, if there is no dbna
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy
wrote:
>
> On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila wrote:
> >
> > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
> > >
> > > Yes, there will be some sort of duplicity if we emit conflict_reason
> > > as a text field. However, I still think
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy
wrote:
>
> While we wait to hear from others on this, I'm attaching the v9 patch
> set implementing the above idea (check 0001 patch). Please have a
> look. I'll come back to the other review comments soon.
>
patch002:
1)
I would like to understa
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
wrote:
>
> I've attached the v18 patch set here.
Thanks for the patches. Please find few comments:
patch 001:
1)
slot.h:
+ /* The time at which this slot become inactive */
+ TimestampTz last_inactive_time;
become -->became
-
On Mon, Mar 25, 2024 at 10:33 AM shveta malik wrote:
>
> On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
> wrote:
> >
> > I've attached the v18 patch set here.
>
I have a question. Don't we allow creating subscriptions on an
existing slot with a non
On Mon, Mar 25, 2024 at 11:53 AM shveta malik wrote:
>
> On Mon, Mar 25, 2024 at 10:33 AM shveta malik wrote:
> >
> > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
> > wrote:
> > >
> > > I've attached the v18 patch set here.
I have one
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
> > On Mon, Mar 25, 2024 at 12:43 PM shveta malik
> > wrote:
> > >
> > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik
&
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> Yeah, and I can see last_inactive_time is moving on the standby (while not the
> case on the primary), probably due to the sync worker slot acquisition/release
> which does not seem right.
>
Yes, you are right, last_inactive_time
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy
wrote:
>
> Right. Done that way i.e. not setting the last_inactive_time for slots
> both while releasing the slot and restoring from the disk.
>
> Also, I've added a TAP function to check if the captured times are
> sane per Bertrand's review comme
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila wrote:
>
> I think we should keep pg_alter_replication_slot() as the last
> priority among the remaining patches for this release. Let's try to
> first finish the primary functionality of inactive_timeout patch.
> Otherwise, I agree that the problem repo
On Mon, Mar 25, 2024 at 12:43 PM shveta malik wrote:
>
> I have one concern, for synced slots on standby, how do we disallow
> invalidation due to inactive-timeout immediately after promotion?
>
> For synced slots, last_inactive_time and inactive_timeout are both
> set. Let&
On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote:
> > On Mon, Mar 25, 2024 at 6:57 PM Robert Haas wrote:
> > > And I'm suspicious that having an exception for slots being synced is
> > > a bad idea. That makes too much
On Tue, Mar 26, 2024 at 1:50 AM Bharath Rupireddy
wrote:
>
> On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart
> wrote:
> >
> > On Mon, Mar 25, 2024 at 04:49:12PM +, Bertrand Drouvot wrote:
> > > On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote:
> > >> In the same vein, I think deact
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot
wrote:
> >
> > The issue that I can see with your proposal is: what if one synced the slots
> > manually (with pg_sync_replication_slots()) but does not use the sync
> > worker?
> > Then I think ShutDownSlotSync() is not going to help in that case
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy
wrote:
>
> On Tue, Mar 26, 2024 at 9:30 AM shveta malik wrote:
> >
> > On Mon, Mar 25, 2024 at 12:43 PM shveta malik
> > wrote:
> > >
> > > I have one concern, for synced slots on standby, how do we d
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy
wrote:
>
> >
> > 1)
> > Commti msg:
> >
> > ensures the value is set to current timestamp during the
> > shutdown to help correctly interpret the time if the standby gets
> > promoted without a restart.
> >
> > shutdown --> shutdown of slot sync wo
update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.
I have attached a sample patch for this idea as.txt file.
I am fine with any of these approaches. One gives data synced from
primary for syn
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> > I think there may have been some misunderstanding here.
>
> Indeed ;-)
>
> > But now if I
> > rethink this, I am fine with 'inactive_since' getting synced from
> > primary to standby. But if we do that, we need to add docs stati
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
wrote:
>
> On Tue, Mar 26, 2024 at 4:18 PM shveta malik wrote:
> >
> > > What about another approach?: inactive_since gives data synced from
> > > primary for
> > > synced slots and another dedicate
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy
wrote:
>
> On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
> wrote:
> >
> > If we just sync inactive_since value for synced slots while in
> > recovery from the primary, so be it. Why do we need to update it to
> > the current time when the slot
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila wrote:
>
> On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
> wrote:
> >
> > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
> > wrote:
> >
> > > 3)
> > > update_synced_slots_inactive_time():
> > >
> > > This assert is removed, is it intentional?
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot
wrote:
>
>
> > We can think on that later if we really need another
> > field which give us sync time.
>
> I think that calling GetCurrentTimestamp() so frequently could be too costly,
> so
> I'm not sure we should.
Agreed.
> > In my second appro
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy
wrote:
>
> Fixed an issue in synchronize_slots where DatumGetLSN is being used in
> place of DatumGetTimestampTz. Found this via CF bot member [1], not on
> my dev system.
>
> Please find the attached v6 patch.
Thanks for the patch. Few trivial t
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy
wrote:
>
> On Wed, Mar 27, 2024 at 11:39 AM shveta malik wrote:
> >
> > Thanks for the patch. Few trivial things:
>
> Thanks for reviewing.
>
> > --
> > 1)
> > system-views.sgml:
> >
&g
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
wrote:
>
> Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
> standby for sync slots. 0002 implementing inactive timeout GUC based
> invalidation mechanism.
>
> Please have a look.
Thanks for the patches. v29-001 looks good t
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu)
wrote:
>
> Attach a new version patch which fixed an un-initialized variable issue and
> added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so
> that
> we can analyze the possible CFbot failures easily.
As suggested by A
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote:
>
> On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy
> wrote:
> >
> > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
> > wrote:
> > >
> > > [2] The steps to reproduce the data miss issue on a primary->standby
> > > setup:
> >
> > I'm tr
On Mon, Apr 1, 2024 at 5:05 PM Amit Kapila wrote:
>
> > 2 ===
> >
> > + {
> > + if (SnapBuildSnapshotExists(remote_slot->restart_lsn))
> > + {
> >
> > That could call SnapBuildSnapshotExists() multiple times for the same
> > "restart_lsn" (for example in case of m
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote:
> > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
> >
> > FWIW, coming to this thread late, I think that the inactive_since
> > should not be synchronized from t
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy
wrote:
>
> On Wed, Apr 3, 2024 at 8:38 AM shveta malik wrote:
> >
> > > > Or a simple solution is that the slotsync worker updates
> > > > inactive_since as it does for non-synced slots, and disables
> &g
On Wed, Apr 3, 2024 at 3:36 PM Amit Kapila wrote:
>
> On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote:
> >
> > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
> > wrote:
> >
> > > I quickly looked at v8, and have a nit, rest all looks good.
> > >
> > > +if (DecodingContextReady(ctx) &
Hello hackers,
Currently, promotion related handling is missing in the slot sync SQL
function pg_sync_replication_slots(). Here is the background on how
it is done in slot sync worker:
During promotion, the startup process in order to shut down the
slot-sync worker, sets the 'stopSignaled' flag,
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
>
>
> Prior to commit 2ec005b, this check was okay, as we did not expect
> restart_lsn of the synced slot to be ahead of remote since we were
> directly copying the lsns. But now when we use 'advance' to do logical
> d
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy
wrote:
>
> On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila wrote:
> >
> > > Thanks for the changes. v34-0001 LGTM.
> >
> > I was doing a final review before pushing 0001 and found that
> > 'inactive_since' could be set twice during startup after promoti
On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote:
> > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
> > >
> > >
> > > Prior to commit 2ec005b, this check was okay,
On Thu, Apr 4, 2024 at 5:17 PM Amit Kapila wrote:
>
> On Thu, Apr 4, 2024 at 5:05 PM shveta malik wrote:
> >
> > Hello hackers,
> >
> > Currently, promotion related handling is missing in the slot sync SQL
> > function pg_sync_replication_slots(). Here is
On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot
wrote:
>
> What about something like?
>
> ereport(LOG,
> errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs from
> remote slot",
> remote_slot->name),
> errdetail("Remote slot has LSN %X/%X but local slot has L
On Fri, Apr 5, 2024 at 4:31 PM Bertrand Drouvot
wrote:
>
> BTW, I just realized that the LSN I used in my example in the
> LSN_FORMAT_ARGS()
> are not the right ones.
Noted. Thanks.
Please find v3 with the comments addressed.
thanks
Shveta
v3-0001-Correct-sanity-check-to-compare-confirmed_ls
On Tue, Feb 13, 2024 at 6:45 AM Zhijie Hou (Fujitsu)
wrote:
>
> On Monday, February 12, 2024 5:40 PM Amit Kapila
> wrote:
>
> Thanks for the comments, I have addressed them.
>
> Here is the new version patch which addressed above and
> most of Bertrand's comments.
Thanks for the patch.
I am tr
On Fri, Feb 9, 2024 at 10:04 AM Amit Kapila wrote:
>
> +reserve_wal_for_local_slot(XLogRecPtr restart_lsn)
> {
> ...
> + /*
> + * Find the oldest existing WAL segment file.
> + *
> + * Normally, we can determine it by using the last removed segment
> + * number. However, if no WAL segment files ha
On Thu, Feb 15, 2024 at 8:26 AM Amit Kapila wrote:
>
> On Wed, Feb 14, 2024 at 7:51 PM Euler Taveira wrote:
> >
> > On Wed, Feb 14, 2024, at 8:45 AM, Amit Kapila wrote:
> >
> > Now, I am less clear about whether to quote "logical" or not in the
> > above message. Do you have any suggestions?
> >
On Sun, Feb 18, 2024 at 7:40 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Friday, February 16, 2024 6:41 PM shveta malik
> wrote:
> Thanks for the patch. Here are few comments:
Thanks for the comments.
>
> 2.
>
> +static bool
> +validate_remote_info(WalRecei
On Mon, Feb 19, 2024 at 10:31 AM Robert Haas wrote:
>
> On Mon, Feb 19, 2024 at 10:10 AM Kyotaro Horiguchi
> wrote:
> > A recent commit (7a424ece48) added the following message:
> >
> > > could not sync slot information as remote slot precedes local slot:
> > > remote slot "%s": LSN (%X/%X), cata
On Mon, Feb 19, 2024 at 11:10 AM Amit Kapila wrote:
>
> On Thu, Feb 15, 2024 at 11:49 AM Kyotaro Horiguchi
> wrote:
> >
> > At Thu, 15 Feb 2024 09:22:23 +0530, shveta malik
> > wrote in
> > >
> > > +1 on changing the msg(s) suggested way. Plea
On Mon, Feb 19, 2024 at 5:32 PM Amit Kapila wrote:
>
> Few comments on 0001
Thanks for the feedback.
>
> 1. I think it is better to error out when the valid GUC or option is
> not set in ensure_valid_slotsync_params() and
> ensure_valid_remote_info() instead of waiting. And
On Tue, Feb 20, 2024 at 8:25 AM Masahiko Sawada wrote:
>
>
> I've reviewed the v91 patch. Here are random comments:
Thanks for the comments.
> ---
> /*
> * Checks the remote server info.
> *
> - * We ensure that the 'primary_slot_name' exists on the remote server and the
> - * remote server
On Tue, Feb 20, 2024 at 2:13 PM Amit Kapila wrote:
>
> I would prefer the changed ones as those clearly explain the problem
> without additional information.
okay, attached v2 patch with changed error msgs and double quotes
around logical.
thanks
Shveta
v2-0001-Fix-quotation-of-variable-names.
On Tue, Feb 20, 2024 at 5:01 PM Amit Kapila wrote:
>
>
> We do expose the required information (restart_lsn, catalog_xmin,
> synced, temporary, etc) via pg_replication_slots. So, I feel the LOG
> message here is sufficient to DEBUG (or know the details) when the
> slot sync doesn't succeed.
>
Ple
Hi hackers,
I would like to know that why we have 'Shutdown <= SmartShutdown'
check before launching few processes (WalReceiver, WalSummarizer,
AutoVacuum worker) while rest of the processes (BGWriter, WalWriter,
Checkpointer, Archiver etc) do not have any such check. If I have to
launch a new pro
On Tue, Feb 20, 2024 at 6:19 PM Masahiko Sawada wrote:
>
> On Tue, Feb 20, 2024 at 12:33 PM shveta malik wrote:
> Thank you for the explanation. It makes sense to me to move the check.
>
>
> 2. If the wal_level is not logical, the server will need to restart
> anyway to cha
On Wed, Feb 21, 2024 at 10:01 AM Tom Lane wrote:
>
> shveta malik writes:
> > I would like to know that why we have 'Shutdown <= SmartShutdown'
> > check before launching few processes (WalReceiver, WalSummarizer,
> > AutoVacuum worker) while rest
On Wed, Feb 21, 2024 at 5:19 PM Amit Kapila wrote:
>
> A few minor comments:
Thanks for the feedback.
> =
> 1.
> +/*
> + * Is stopSignaled set in SlotSyncCtx?
> + */
> +bool
> +IsStopSignaledSet(void)
> +{
> + bool signaled;
> +
> + SpinLockAcquire(&SlotSyncCtx->mutex);
> + signa
On Thu, Feb 22, 2024 at 10:31 AM shveta malik wrote:
>
> Please find patch001 attached. There is a CFBot failure in patch002.
> The test added there needs some adjustment. We will rebase and post
> rest of the patches once we fix that issue.
>
There was a recent commit 801792e t
On Thu, Feb 22, 2024 at 3:44 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> Thanks!
>
> Some random comments about v92_001 (Sorry if it has already been discussed
> up-thread):
Thanks for the feedback. The patch is pushed 15 minutes back. I will
prepare a top-up patch for your comments.
> 1 ===
>
> +
On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot
wrote:
>
> Suppose that in synchronize_slots() the query would be:
>
> const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
> " restart_lsn, catalog_xmin, two_phase, failover,"
> " database, conflict_reason"
>
On Fri, Feb 23, 2024 at 8:35 AM shveta malik wrote:
>
> On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot
> wrote:
> >
> > Suppose that in synchronize_slots() the query would be:
> >
> > const char *query = "SELECT slot_name, plugin, confirmed
On Fri, Feb 23, 2024 at 10:06 AM Zhijie Hou (Fujitsu)
wrote:
>
>
> I noticed one CFbot failure[1] which is because the tap-test doesn't wait for
> the
> standby to catch up before promoting, thus the data inserted after promotion
> could not be replicated to the subscriber. Add a wait_for_replay_
On Fri, Feb 23, 2024 at 1:28 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> Because one could create say the "=" OPERATOR in their own schema, attach a
> function to it doing undesired stuff and change the search_path for the
> database
> the sync slot worker connects to.
>
> Then this new "=" operator w
t).
Thanks for the patch, changes look good. I have corporated it in the
patch which addresses the rest of your comments in [1]. I have
attached the patch as .txt
[1]:
https://www.postgresql.org/message-id/ZdcejBDCr%2BwlVGnO%40ip-10-97-1-34.eu-west-3.compute.internal
thanks
Shveta
From f1489f783
Hi hackers,
I would like to understand why we have code [1] that retrieves
RecentFlushPtr in WalSndWaitForWal() outside of the loop. We utilize
RecentFlushPtr later within the loop, but prior to that, we already
have [2]. Wouldn't [2] alone be sufficient?
Just to check the impact, I ran 'make che
On Mon, Feb 26, 2024 at 5:18 PM Amit Kapila wrote:
>
> > > > + if (!ok)
> > > > + GUC_check_errdetail("List syntax is invalid.");
> > > > +
> > > > + /*
> > > > +* If there is a syntax error in the name or if the replication
> > > > slots'
> > > > +* data
On Wed, Feb 28, 2024 at 8:49 AM Amit Kapila wrote:
>
>
> Few comments:
Thanks for the feedback.
> ===
> 1.
> - if (logical)
> + if (logical || !replication)
> {
>
> Can we add a comment about connection types that require
> ALWAYS_SECURE_SEARCH_PATH_SQL?
>
> 2.
> Can we add a test
On Wed, Feb 28, 2024 at 1:33 PM Bertrand Drouvot
wrote:
>
> Hi,
> A few comments:
Thanks for reviewing.
>
> 1 ===
>
> +* used to run normal SQL queries
>
> s/run normal SQL/run SQL/ ?
>
> As mentioned up-thread I don't like that much the idea of creating such a test
> but if we do then h
On Thu, Feb 29, 2024 at 7:04 AM Zhijie Hou (Fujitsu)
wrote:
>
> Here is the v101 patch set which addressed above comments.
>
Thanks for the patch. Few comments:
1) Shall we mention in doc that shutdown will wait for standbys in
standby_slot_names to confirm receiving WAL:
Suggestion for logica
On Wed, Feb 28, 2024 at 4:51 PM Amit Kapila wrote:
>
> On Wed, Feb 28, 2024 at 3:26 PM shveta malik wrote:
> >
> >
> > Here is the patch which addresses the above comments. Also optimized
> > the test a little bit. Now we use pg_sync_replication_slots() function
>
Hi hackers,
ConditionVariableTimedSleep() accepts a timeout parameter, but it
doesn't explicitly state the unit for the timeout anywhere. To
determine this, one needs to look into the details of the function to
find it out from the comments of the internally called function
WaitLatch(). It would b
lush-position(and not on each change) should be outside of this
function. It is too embedded. And the comment which states the order
of wait (first flush and then standbys confirmation) should be outside
the for-loop in WalSndWaitForWal(), but yes we do need both the
comments. Attached a patch (.txt
On Tue, Mar 5, 2024 at 1:25 PM Heikki Linnakangas wrote:
> SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED
> mode, when you pass need_lock=true. So that at least should be changed
> to false.
>
Also don't we need to release the lock when we return here:
/*
* Nothing to do f
On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot
wrote:
>
> > /*
> > * Nothing to do for physical slots as we collect stats only for logical
> > * slots.
> > */
> > if (SlotIsPhysical(slot))
> > return;
>
> D'oh! Thanks! Fixed in v2 shared up-thread.
Thanks. Can we try to get rid of multiple LwLo
On Wed, Mar 6, 2024 at 6:54 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Wednesday, March 6, 2024 9:13 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Wednesday, March 6, 2024 11:04 AM Zhijie Hou (Fujitsu)
> > wrote:
> > >
> > > On Wednesday, March 6, 2024 9:30 AM Masahiko Sawada
> > > wrote:
> > >
> > >
On Wed, Mar 6, 2024 at 2:36 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote:
> > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot
> > wrote:
> > Thanks. Can we try to get rid of multiple LwLockRelease in
> >
On Thu, Mar 7, 2024 at 11:12 AM Michael Paquier wrote:
>
> Yeah, it is possible that what you retrieve from
> pgstat_fetch_replslot() does not refer exactly to the slot's content
> under concurrent activity, but you cannot protect a single scan of
> pg_stat_replication_slots as of an effect of it
On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian wrote:
>
>> Pushed with minor modifications. I'll keep an eye on BF.
>>
>> BTW, one thing that we should try to evaluate a bit more is the
>> traversal of slots in StandbySlotsHaveCaughtup() where we verify if
>> all the slots mentioned in standby_slot_n
1 - 100 of 443 matches
Mail list logo