Hi,
On Tue, Apr 08, 2025 at 03:27:41PM +0900, Michael Paquier wrote:
> On Tue, Apr 08, 2025 at 06:19:02AM +, Bertrand Drouvot wrote:
> > - A new injection_points_wakeup_detach() function that is holding the
> > spinlock
> > during the whole duration to ensure that no process can wait in betwe
Hi,
On Tue, Apr 08, 2025 at 02:13:20PM +0900, Michael Paquier wrote:
> On Tue, Apr 08, 2025 at 02:00:35AM +, Hayato Kuroda (Fujitsu) wrote:
> > Your patch looks good to me and it could pass on my env. PSA patches for
> > PG16.
> > Patch for PG17 is not changed, just renamed.
>
> @@ -1287,6 +
On Wed, Apr 09, 2025 at 12:07:31PM +0530, Amit Kapila wrote:
> On Wed, Apr 9, 2025 at 11:24 AM Bertrand Drouvot
> wrote:
> I can't think of a good reason to have this DEBUG1 as there is no
> predictability of it getting generated even with tests using an
> injection point. OTOH, I don't have any o
On Wed, Apr 9, 2025 at 11:24 AM Bertrand Drouvot
wrote:
>
> On Wed, Apr 09, 2025 at 12:03:06PM +0900, Michael Paquier wrote:
> > On Tue, Apr 08, 2025 at 06:42:53AM +, Bertrand Drouvot wrote:
> > > Fully agree. Will need to find another way to prevent a process to wait
> > > between the
> > >
Hi,
On Wed, Apr 09, 2025 at 12:03:06PM +0900, Michael Paquier wrote:
> On Tue, Apr 08, 2025 at 06:42:53AM +, Bertrand Drouvot wrote:
> > Fully agree. Will need to find another way to prevent a process to wait
> > between the
> > wakeup and the detach. I'll open a dedicated thread.
>
> By the
On Tue, Apr 08, 2025 at 06:42:53AM +, Bertrand Drouvot wrote:
> Fully agree. Will need to find another way to prevent a process to wait
> between the
> wakeup and the detach. I'll open a dedicated thread.
By the way, there is a small thing that's itching me a bit about the
change done in LogS
On Tue, Apr 08, 2025 at 06:19:02AM +, Bertrand Drouvot wrote:
> - A new injection_points_wakeup_detach() function that is holding the spinlock
> during the whole duration to ensure that no process can wait in between the
> wakeup and the detach.
That would not a correct spinlock use. injectio
Hi,
On 2025-04-08 02:00:35 +, Hayato Kuroda (Fujitsu) wrote:
> > I have changed quite a few comments and commit message for the PG17
> > patch in the attached. Can you update PG16 patch based on this and
> > also use the same commit message as used in attached for all the three
> > patches?
>
On Tue, Apr 08, 2025 at 02:00:35AM +, Hayato Kuroda (Fujitsu) wrote:
> Your patch looks good to me and it could pass on my env. PSA patches for PG16.
> Patch for PG17 is not changed, just renamed.
@@ -1287,6 +1288,17 @@ LogStandbySnapshot(void)
Assert(XLogStandbyInfoActive());
+#ifdef
Dear Amit,
>
> I have changed quite a few comments and commit message for the PG17
> patch in the attached. Can you update PG16 patch based on this and
> also use the same commit message as used in attached for all the three
> patches?
Your patch looks good to me and it could pass on my env. PSA
On Thu, Apr 3, 2025 at 11:29 AM Hayato Kuroda (Fujitsu)
wrote:
>
I have changed quite a few comments and commit message for the PG17
patch in the attached. Can you update PG16 patch based on this and
also use the same commit message as used in attached for all the three
patches?
--
With Regards
Hi,
On Mon, Apr 07, 2025 at 03:16:07PM +0530, Amit Kapila wrote:
> On Thu, Apr 3, 2025 at 3:15 PM Amit Kapila wrote:
> > Hmm, but adding some additional smarts also makes this test less easy
> > to backpatch. I see your points related to the benefits, but I still
> > mildly prefer to go with the
On Thu, Apr 3, 2025 at 3:15 PM Amit Kapila wrote:
>
> On Thu, Apr 3, 2025 at 12:29 PM Bertrand Drouvot
> wrote:
> >
> > On Thu, Apr 03, 2025 at 05:34:10AM +, Hayato Kuroda (Fujitsu) wrote:
> > > Dear Bertrand, Amit,
> > >
> > > > > I do prefer v5-PG17-2 as it is "closer" to HEAD. That said, I
Hi Kuroda-san,
On Mon, Apr 07, 2025 at 06:15:13AM +, Hayato Kuroda (Fujitsu) wrote:
> I had been debugging and found the case that VACUUM FULL also has a timing
> issue.
> This means the we cannot keep the testcase.
>
> PSA the reproducer for PG17. IIUC this can happen even in PG16.
> I cons
Dear Bertrand,
> I wonder if we could not keep this test and make the slot active for the
> vacuum full case. Looking at drongo's failure in [1], there is no occurence
> of "vacuum full" and that's probably linked to Andres's explanation in [2]:
>
> "
> a VACUUM FULL on pg_class is
> used, which
On Thu, Apr 3, 2025 at 11:04 AM Hayato Kuroda (Fujitsu)
wrote:
>
> Dear Bertrand, Amit,
>
> > > I do prefer v5-PG17-2 as it is "closer" to HEAD. That said, I think that
> > > we should
> > > keep the slots active and only avoid doing the checks for them (they are
> > invalidated
> > > that's fine
On Wed, Apr 02, 2025 at 12:13:52PM +, Hayato Kuroda (Fujitsu) wrote:
> Dear Amit, Bertrand,
>
> > The other idea to simplify the changes for backbranches:
> > sub reactive_slots_change_hfs_and_wait_for_xmins
> > {
> > ...
> > + my ($slot_prefix, $hsf, $invalidated, $needs_active_slot) = @_;
>
Hi,
On Thu, Apr 03, 2025 at 05:34:10AM +, Hayato Kuroda (Fujitsu) wrote:
> Dear Bertrand, Amit,
>
> > > I do prefer v5-PG17-2 as it is "closer" to HEAD. That said, I think that
> > > we should
> > > keep the slots active and only avoid doing the checks for them (they are
> > invalidated
> >
On Wed, Apr 2, 2025 at 2:06 PM Bertrand Drouvot
wrote:
>
> Hi Kuroda-san,
>
> On Wed, Apr 02, 2025 at 07:16:25AM +, Hayato Kuroda (Fujitsu) wrote:
>
> As far v4-0001:
>
> === 1
>
> +# would advance an active replication slot's catalog_xmin
>
> s/would/could/? I mean the system also needs to be
Hi,
On Wed, Apr 02, 2025 at 03:04:07PM +0530, Amit Kapila wrote:
> I have changed it based on your suggestions and made a few other
> changes in the comments. Please see attached.
Thanks!
> *
> + if (IS_INJECTION_POINT_ATTACHED("log-running-xacts"))
>
> It is better to name the injection point
Dear Amit, Bertrand,
> The other idea to simplify the changes for backbranches:
> sub reactive_slots_change_hfs_and_wait_for_xmins
> {
> ...
> + my ($slot_prefix, $hsf, $invalidated, $needs_active_slot) = @_;
>
> # create the logical slots
> - create_logical_slots($node_standby, $slot_prefix)
On Thu, Apr 3, 2025 at 12:29 PM Bertrand Drouvot
wrote:
>
> On Thu, Apr 03, 2025 at 05:34:10AM +, Hayato Kuroda (Fujitsu) wrote:
> > Dear Bertrand, Amit,
> >
> > > > I do prefer v5-PG17-2 as it is "closer" to HEAD. That said, I think
> > > > that we should
> > > > keep the slots active and on
> Isn't patch 0001-Fix-invalid-referring-of-hash-ref-for-replication-sl
> unrelated to this thread? Or am, I missing something?
I did attach wrongly, PSA correct set. Sorry for inconvenience.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
v5-0001-Stabilize-035_standby_logical_decoding.pl-by-usin
Dear Bertrand, Amit,
> > I do prefer v5-PG17-2 as it is "closer" to HEAD. That said, I think that we
> > should
> > keep the slots active and only avoid doing the checks for them (they are
> invalidated
> > that's fine, they are not that's fine too).
> >
>
> I don't mind doing that, but there is
On Wed, Apr 2, 2025 at 8:30 PM Bertrand Drouvot
wrote:
>
> On Wed, Apr 02, 2025 at 12:13:52PM +, Hayato Kuroda (Fujitsu) wrote:
> > Dear Amit, Bertrand,
> >
> > > The other idea to simplify the changes for backbranches:
> > > sub reactive_slots_change_hfs_and_wait_for_xmins
> > > {
> > > ...
>
On Wed, Apr 2, 2025 at 2:06 PM Bertrand Drouvot
wrote:
>
>
> As far PG17-v4-0001:
>
>
> === 4
>
> It looks like that check_slots_conflict_reason() is not called with
> checks_active_slot
> as argument.
>
> === 5
>
> I think that we could remove the need for the drop_active_slot parameter in
> dro
Dear Amit, Bertrand,
> You have not added any injection point for the above case. Isn't it
> possible that if running_xact record is logged concurrently to the
> pruning record, it should move the active slot on standby, and the
> same failure should occur in this case as well?
I considered that
Hi Kuroda-san,
On Wed, Apr 02, 2025 at 07:16:25AM +, Hayato Kuroda (Fujitsu) wrote:
> Dear Amit, Bertrand,
>
> > You have not added any injection point for the above case. Isn't it
> > possible that if running_xact record is logged concurrently to the
> > pruning record, it should move the ac
On Tue, Apr 1, 2025 at 2:02 PM Bertrand Drouvot
wrote:
>
> Hi Kuroda-san,
>
> On Tue, Apr 01, 2025 at 01:22:49AM +, Hayato Kuroda (Fujitsu) wrote:
> > Dear Bertrand,
> >
>
> Thanks for the updated patch!
>
> > > s/to avoid the seeing a xl_running_xacts/to avoid seeing a
> > > xl_running_xacts
Dear Bertrand,
> s/to avoid the seeing a xl_running_xacts/to avoid seeing a xl_running_xacts/?
Fixed.
>
> === 2 (Nit)
>
> /* For testing slot invalidation due to the conflict */
>
> Not sure "due to the conflict" is needed.
>
OK, removed.
> About PG17-v2-0001
>
> === 3
>
> The commit
On Tue, Apr 1, 2025 at 6:53 AM Hayato Kuroda (Fujitsu)
wrote:
>
With respect to 0001, can't this problem happen for the following case as well?
# Recovery conflict: Invalidate conflicting slots, including in-use slots
# Scenario 5: conflict due to on-access pruning.
You have not added any inject
Hi,
On Tue, Apr 01, 2025 at 04:55:06PM +0530, Amit Kapila wrote:
> On Tue, Apr 1, 2025 at 2:02 PM Bertrand Drouvot
> wrote:
>
> > But if we "really" want to produce a "new" WAL record, what about using
> > LogLogicalMessage()?
> >
>
> We are using injection points for testing purposes, which me
Dear Bertrand,
> > > s/to avoid the seeing a xl_running_xacts/to avoid seeing a
> > > xl_running_xacts/?
> >
> > Fixed.
Sorry, I misunderstood your comment and wrongly fixed. I will address in next
version.
> === 1
>
> +* XXX What value should we return here? Originally this
>
Hi Kuroda-san,
On Tue, Apr 01, 2025 at 01:22:49AM +, Hayato Kuroda (Fujitsu) wrote:
> Dear Bertrand,
>
Thanks for the updated patch!
> > s/to avoid the seeing a xl_running_xacts/to avoid seeing a
> > xl_running_xacts/?
>
> Fixed.
hmm, not sure as I still can see:
+# Note that injection_
Hi Kuroda-san and Amit,
On Fri, Mar 28, 2025 at 09:02:29AM +, Hayato Kuroda (Fujitsu) wrote:
> Dear Amit,
>
> >
> > Right, I think this is a better idea.
I like it too and the bonus point is that this injection point can be used
in more tests (more use cases).
A few comments:
About v
Dear Amit,
>
> Right, I think this is a better idea. I have a few comments:
> 1.
> + /*
> + * In 035_standby_logical_decoding.pl, RUNNING_XACTS could move slots's
> + * xmin forward and cause random failures.
>
> No need to use test file name in code comments.
Fixed.
> 2. The comments atop wai
On Wed, Mar 26, 2025 at 1:17 PM Hayato Kuroda (Fujitsu)
wrote:
>
> > Seeing all these failures, I wonder whether we can reliably test
> > active slots apart from wal_level change test (aka Scenario 6:
> > incorrect wal_level on primary.). Sure, we can try by having some
> > injection point kind of
Dear Amit, Bertrand,
> Seeing all these failures, I wonder whether we can reliably test
> active slots apart from wal_level change test (aka Scenario 6:
> incorrect wal_level on primary.).
Hmm, agreed. We do not have good solution to stabilize tests, at least for now.
I've created a patch for PG1
Dear Amit, Bertrand,
> Seeing all these failures, I wonder whether we can reliably test
> active slots apart from wal_level change test (aka Scenario 6:
> incorrect wal_level on primary.). Sure, we can try by having some
> injection point kind of tests, but is it really worth because, anyway
> the
On Fri, Mar 21, 2025 at 9:48 PM Bertrand Drouvot
wrote:
>
> So, I'm not sure I like the idea that much, but thinking out loud: I wonder if
> we could bypass the "active" slot checks in 16 and 17 and use injection
> points as
> proposed as of 18 (as we need the injection points changes proposed in
Hi Kuroda-san,
On Mon, Mar 24, 2025 at 04:54:21AM +, Hayato Kuroda (Fujitsu) wrote:
> > So, I'm not sure I like the idea that much, but thinking out loud: I wonder
> > if
> > we could bypass the "active" slot checks in 16 and 17 and use injection
> > points as
> > proposed as of 18 (as we ne
Dear Bertrand,
I'm also working on the thread to resolve the random failure.
> Yes, that's also my understanding. It's also easy to "simulate" by adding
> a checkpoint on the primary and a long enough sleep after we launched our sql
> in
> wait_until_vacuum_can_remove().
Thanks for letting me k
Dear Bertrand,
> > SIGSTOP signal for pg_recvlogical may do the idea,
>
> Yeah, but would we be "really" testing an "active" slot?
Yeah, this is also a debatable point.
> At the end we want to produce an invalidation that may or not happen on a real
> environment. The corner case is in the test
Hi Kuroda-san,
On Fri, Mar 21, 2025 at 12:28:10PM +, Hayato Kuroda (Fujitsu) wrote:
> I'm also working on the thread to resolve the random failure.
Thanks for looking at it!
> I've also tried the idea with the living transaction via background_psql(),
> but I got the same result. The test co
Hi,
On Wed, Mar 19, 2025 at 12:12:19PM +0530, Amit Kapila wrote:
> On Mon, Feb 10, 2025 at 8:12 PM Bertrand Drouvot
> wrote:
> >
> > Please find attached a patch to $SUBJECT.
> >
> > In rare circumstances (and on slow machines) it is possible that a
> > xl_running_xacts
> > is emitted and that t
On Mon, Feb 10, 2025 at 8:12 PM Bertrand Drouvot
wrote:
>
> Please find attached a patch to $SUBJECT.
>
> In rare circumstances (and on slow machines) it is possible that a
> xl_running_xacts
> is emitted and that the catalog_xmin of a logical slot on the standby advances
> past the conflict poin
Hi hackers,
Please find attached a patch to $SUBJECT.
In rare circumstances (and on slow machines) it is possible that a
xl_running_xacts
is emitted and that the catalog_xmin of a logical slot on the standby advances
past the conflict point. In that case, no conflict is reported and the test
fai
47 matches
Mail list logo