On Mon, Jun 23, 2025 at 8:04 PM vignesh C wrote:
>
> On Mon, 23 Jun 2025 at 04:36, Alexander Korotkov wrote:
> >
> > On Fri, Jun 20, 2025 at 2:24 PM vignesh C wrote:
> > > On Fri, 20 Jun 2025 at 05:54, Alexander Korotkov
> > > wrote:
> > > > Dear Kuroda-san,
> > > >
> > > > On Thu, Jun 19, 202
On Mon, 23 Jun 2025 at 04:36, Alexander Korotkov wrote:
>
> On Fri, Jun 20, 2025 at 2:24 PM vignesh C wrote:
> > On Fri, 20 Jun 2025 at 05:54, Alexander Korotkov
> > wrote:
> > > Dear Kuroda-san,
> > >
> > > On Thu, Jun 19, 2025 at 2:05 PM Hayato Kuroda (Fujitsu)
> > > wrote:
> > > > > > Regar
On Fri, Jun 20, 2025 at 2:24 PM vignesh C wrote:
> On Fri, 20 Jun 2025 at 05:54, Alexander Korotkov wrote:
> > Dear Kuroda-san,
> >
> > On Thu, Jun 19, 2025 at 2:05 PM Hayato Kuroda (Fujitsu)
> > wrote:
> > > > > Regarding assertion failure, I've found that assert in
> > > > > PhysicalConfirmRec
Hi, Vignesh!
On Fri, Jun 20, 2025 at 3:42 PM vignesh C wrote:
> On Fri, 20 Jun 2025 at 05:54, Alexander Korotkov wrote:
> > On Thu, Jun 19, 2025 at 2:05 PM Hayato Kuroda (Fujitsu)
> > wrote:
> > > > > Regarding assertion failure, I've found that assert in
> > > > > PhysicalConfirmReceivedLocati
On Fri, Jun 20, 2025 at 5:48 AM Alexander Korotkov wrote:
>
> >
> > If what I said above is correct, then the following part of the commit
> > message will be incorrect:
> > "As stated in the ReplicationSlotReserveWal() comment, this is not
> > always true. Additionally, this issue has been spotte
Dear Kuroda-san,
On Thu, Jun 19, 2025 at 2:05 PM Hayato Kuroda (Fujitsu)
wrote:
> > > Regarding assertion failure, I've found that assert in
> > > PhysicalConfirmReceivedLocation() conflicts with restart_lsn
> > > previously set by ReplicationSlotReserveWal(). As I can see,
> > > ReplicationSlot
On Thu, Jun 19, 2025 at 1:29 PM Amit Kapila wrote:
> On Wed, Jun 18, 2025 at 10:17 PM Alexander Korotkov
> wrote:
> >
> > On Wed, Jun 18, 2025 at 6:50 PM Vitaly Davydov
> > wrote:
> > > > I think, it is a good idea. Once we do not use the generated data, it
> > > > is ok
> > > > just to genera
Dear Amit, Alexander,
> > Regarding assertion failure, I've found that assert in
> > PhysicalConfirmReceivedLocation() conflicts with restart_lsn
> > previously set by ReplicationSlotReserveWal(). As I can see,
> > ReplicationSlotReserveWal() just picks fresh XLogCtl->RedoRecPtr lsn.
> > So, it d
On Wed, Jun 18, 2025 at 10:17 PM Alexander Korotkov
wrote:
>
> On Wed, Jun 18, 2025 at 6:50 PM Vitaly Davydov
> wrote:
> > > I think, it is a good idea. Once we do not use the generated data, it is
> > > ok
> > > just to generate WAL segments using the proposed function. I've tested
> > > this
On Wed, Jun 18, 2025 at 6:50 PM Vitaly Davydov wrote:
> > I think, it is a good idea. Once we do not use the generated data, it is ok
> > just to generate WAL segments using the proposed function. I've tested this
> > function. The tests worked as expected with and without the fix. The
> > attach
On Wed, 18 Jun 2025 at 14:35, Vitaly Davydov wrote:
>
> Dear Hayato,
>
> > To confirm, can you tell me the theory why the walsender received old LSN?
> > It is sent by the walreceiver, so is there a case that
> > LogstreamResult.Flush can go backward?
> > Not sure we can accept the situation.
>
>
Dear Vitaly,
I've been working on the bug...
> This assert was introduced in the patch. Now, I think, it is a wrong one. Let
> me
> please explain one of the possible scenarios when it can be triggered. In case
> of physical replication, when walsender receives a standby reply message, it
> call
vignesh C writes:
> While tracking buildfarm for one of other commits, I noticed this failure:
> TRAP: failed Assert("s->data.restart_lsn >=
> s->last_saved_restart_lsn"), File:
> "../pgsql/src/backend/replication/slot.c", Line: 1813, PID: 3945797
My animal mamba is also showing this assertion fa
Hi, Vitaly!
On Tue, Jun 17, 2025 at 6:02 PM Vitaly Davydov wrote:
> Thank you for reporting the issue.
>
> >While tracking buildfarm for one of other commits, I noticed this failure:
> >TRAP: failed Assert("s->data.restart_lsn >=
> >s->last_saved_restart_lsn"), File:
> >"../pgsql/src/backend/repl
Hi Alexander,
While tracking buildfarm for one of other commits, I noticed this failure:
TRAP: failed Assert("s->data.restart_lsn >=
s->last_saved_restart_lsn"), File:
"../pgsql/src/backend/replication/slot.c", Line: 1813, PID: 3945797
postgres: standby: checkpointer (ExceptionalCondition+0x83) [0
Dear Alexander,
> Thank you! All of these totally make sense. The updated patch is attached.
Thanks for the update. I found another point.
```
-# Another 2M rows; that's about 260MB (~20 segments) worth of WAL.
+# Another 50K rows; that's about 86MB (~5 segments) worth of WAL.
$node->safe_psq
Dear Kuroda-san,
On Mon, Jun 16, 2025 at 12:11 PM Hayato Kuroda (Fujitsu)
wrote:
> Thanks for pushing the fix patch! BTW, I have few comments for your commits.
> Can you check and include them if needed?
>
> 01.
> ```
> $node->append_conf('postgresql.conf',
> "shared_preload_libraries = '
Dear Alexander,
Thanks for pushing the fix patch! BTW, I have few comments for your commits.
Can you check and include them if needed?
01.
```
$node->append_conf('postgresql.conf',
"shared_preload_libraries = 'injection_points'");
```
No need to set shared_preload_libraries in 046/047. I
Hi, Tom!
On Sun, Jun 15, 2025 at 7:05 PM Tom Lane wrote:
> BTW, while you're cleaning up this commit, could you remove the
> excess newlines in some of the "note" commands in 046 and 047, like
>
> note('starting checkpoint\n');
>
> This produces bizarre output, as shown in the buildfarm logs:
Th
BTW, while you're cleaning up this commit, could you remove the
excess newlines in some of the "note" commands in 046 and 047, like
note('starting checkpoint\n');
This produces bizarre output, as shown in the buildfarm logs:
[04:04:38.953](603.550s) # starting checkpoint\\n
15.06.2025 14:02, Alexander Korotkov wrote:
Could you, please, check this patch? On my system it makes 046 and
047 execute in 140 secs with -O0 and -DRELCACHE_FORCE_RELEASE
-DCATCACHE_FORCE_RELEASE.
Thank you for the patch!
It decreases the test's duration significantly:
# +++ tap check in sr
On Sun, Jun 15, 2025 at 12:00 PM Alexander Lakhin wrote:
>
> Hello Alexander,
>
> 10.06.2025 23:14, Alexander Korotkov wrote:
>
> So, my proposal is to commit the attached patchset to the HEAD, and
> commit [1] to the back branches. Any objections?
>
>
> As the buildfarm animal prion shows [1], t
Hi, Alexander!
On Sun, Jun 15, 2025 at 12:00 PM Alexander Lakhin wrote:
>
> Hello Alexander,
>
> 10.06.2025 23:14, Alexander Korotkov wrote:
>
> So, my proposal is to commit the attached patchset to the HEAD, and
> commit [1] to the back branches. Any objections?
>
>
> As the buildfarm animal pr
Hello Alexander,
10.06.2025 23:14, Alexander Korotkov wrote:
So, my proposal is to commit the attached patchset to the HEAD, and
commit [1] to the back branches. Any objections?
As the buildfarm animal prion shows [1], the 046_checkpoint_logical_slot
test fails with "-DRELCACHE_FORCE_RELEASE
On Wed, Jun 11, 2025 at 1:44 AM Alexander Korotkov wrote:
>
> On Mon, Jun 9, 2025 at 7:09 PM Vitaly Davydov
> wrote:
> > > I think we can use this approach for HEAD and probably keep the
> > > previous idea for backbranches. Keeping some value in shared_memory
> > > per slot sounds risky to me i
On Mon, Jun 9, 2025 at 7:09 PM Vitaly Davydov wrote:
> > I think we can use this approach for HEAD and probably keep the
> > previous idea for backbranches. Keeping some value in shared_memory
> > per slot sounds risky to me in terms of introducing new bugs.
>
> Not sure, what kind of problems may
Hi Amit,
> I think we can use this approach for HEAD and probably keep the
> previous idea for backbranches. Keeping some value in shared_memory
> per slot sounds risky to me in terms of introducing new bugs.
Not sure, what kind of problems may occur. I propose to allocate in shmem an
array of la
On Tue, Jun 3, 2025 at 6:51 PM Alexander Korotkov wrote:
>
> >
> > As per my understanding, for logical slots, effective_xmin is only set
> > during the initial copy phase (or say if one has to export a
> > snapshot), after that, its value won't change. Please read the
> > comments in CreateInitDe
On Thu, Jun 5, 2025 at 8:51 PM Vitaly Davydov wrote:
>
> Dear Alexander, Amit
>
> Alexander Korotkov wrote:
> > Also, I've changed ReplicationSlotsComputeRequiredLSN() call to
> > CheckPointReplicationSlots() to update required LSN after
> > SaveSlotToPath() updated last_saved_restart_lsn. This h
Dear Alexander, Amit
Alexander Korotkov wrote:
> Also, I've changed ReplicationSlotsComputeRequiredLSN() call to
> CheckPointReplicationSlots() to update required LSN after
> SaveSlotToPath() updated last_saved_restart_lsn. This helps to pass
> checks in 001_stream_rep.pl without additional hacks
On Mon, Jun 2, 2025 at 2:53 PM Amit Kapila wrote:
>
> On Thu, May 29, 2025 at 5:29 PM Alexander Korotkov
> wrote:
> >
> > On Tue, May 27, 2025 at 2:26 PM Amit Kapila wrote:
> > > Yeah, we should be able to change ABI during beta, but I can't comment
> > > on the idea of effective_restart_lsn wi
On Thu, May 29, 2025 at 5:29 PM Alexander Korotkov wrote:
>
> On Tue, May 27, 2025 at 2:26 PM Amit Kapila wrote:
> > Yeah, we should be able to change ABI during beta, but I can't comment
> > on the idea of effective_restart_lsn without seeing the patch or a
> > detailed explanation of this idea.
On Tue, May 27, 2025 at 2:26 PM Amit Kapila wrote:
> Yeah, we should be able to change ABI during beta, but I can't comment
> on the idea of effective_restart_lsn without seeing the patch or a
> detailed explanation of this idea.
Could you, please, check the patch [1]. It implements this idea
ex
On Tue, May 27, 2025 at 2:48 PM Alexander Korotkov wrote:
>
> On Tue, May 27, 2025 at 12:12 PM Alexander Korotkov
> wrote:
> >
> > On Tue, May 27, 2025 at 7:08 AM Amit Kapila wrote:
> > > On Mon, May 26, 2025 at 10:36 PM Alexander Korotkov
> > > wrote:
> > > >
> > > > On Mon, May 26, 2025 at 2:
On Tue, May 27, 2025 at 12:12 PM Alexander Korotkov
wrote:
>
> On Tue, May 27, 2025 at 7:08 AM Amit Kapila wrote:
> > On Mon, May 26, 2025 at 10:36 PM Alexander Korotkov
> > wrote:
> > >
> > > On Mon, May 26, 2025 at 2:43 PM Amit Kapila
> > > wrote:
> > > >
> > > > On Mon, May 26, 2025 at 3:52
On Tue, May 27, 2025 at 7:08 AM Amit Kapila wrote:
> On Mon, May 26, 2025 at 10:36 PM Alexander Korotkov
> wrote:
> >
> > On Mon, May 26, 2025 at 2:43 PM Amit Kapila wrote:
> > >
> > > On Mon, May 26, 2025 at 3:52 PM Vitaly Davydov
> > > wrote:
> >
> > > OTOH, if we don't want to adjust physic
On Mon, May 26, 2025 at 10:36 PM Alexander Korotkov
wrote:
>
> On Mon, May 26, 2025 at 2:43 PM Amit Kapila wrote:
> >
> > On Mon, May 26, 2025 at 3:52 PM Vitaly Davydov
> > wrote:
>
> > OTOH, if we don't want to adjust physical
> > slot machinery, it seems saving the logical slots to disk immed
On Mon, May 26, 2025 at 2:43 PM Amit Kapila wrote:
>
> On Mon, May 26, 2025 at 3:52 PM Vitaly Davydov
> wrote:
> >
> > Dear Alexander, Amit, All
> >
> > > Amit wrote:
> > > > Is my understanding correct that we need 0001 because
> > > > PhysicalConfirmReceivedLocation() doesn't save the slot to
Dear Amit,
> OTOH, if we don't want to adjust physical
> slot machinery, it seems saving the logical slots to disk immediately
> when its restart_lsn is updated is a waste of effort after your patch,
> no? If so, why are we okay with that?
I agree, that saving logical slots at advance is a possib
On Mon, May 26, 2025 at 9:49 AM Amit Kapila wrote:
>
> On Sat, May 24, 2025 at 6:59 PM Alexander Korotkov
> wrote:
> >
> > Hi, Amit!
> >
> > Thank you for your attention to this patchset!
> >
> > On Sat, May 24, 2025 at 2:15 PM Amit Kapila wrote:
> > > On Sat, May 24, 2025 at 4:08 PM Alexander
On Mon, May 26, 2025 at 3:52 PM Vitaly Davydov wrote:
>
> Dear Alexander, Amit, All
>
> > Amit wrote:
> > > Is my understanding correct that we need 0001 because
> > > PhysicalConfirmReceivedLocation() doesn't save the slot to disk after
> > > changing the slot's restart_lsn?
> >
> > Yes. Also, e
Dear Alexander, Amit, All
> Amit wrote:
> > Is my understanding correct that we need 0001 because
> > PhysicalConfirmReceivedLocation() doesn't save the slot to disk after
> > changing the slot's restart_lsn?
>
> Yes. Also, even if it would save slot to the disk, there is still
> race condition t
On Sat, May 24, 2025 at 6:59 PM Alexander Korotkov wrote:
>
> Hi, Amit!
>
> Thank you for your attention to this patchset!
>
> On Sat, May 24, 2025 at 2:15 PM Amit Kapila wrote:
> > On Sat, May 24, 2025 at 4:08 PM Alexander Korotkov
> > wrote:
> > >
> > > I spend more time on this. The next re
Hi, Amit!
Thank you for your attention to this patchset!
On Sat, May 24, 2025 at 2:15 PM Amit Kapila wrote:
> On Sat, May 24, 2025 at 4:08 PM Alexander Korotkov
> wrote:
> >
> > I spend more time on this. The next revision is attached. It
> > contains revised comments and other cosmetic chan
On Sat, May 24, 2025 at 4:08 PM Alexander Korotkov wrote:
>
> I spend more time on this. The next revision is attached. It
> contains revised comments and other cosmetic changes. I'm going to
> backpatch 0001 to all supported branches,
>
Is my understanding correct that we need 0001 because
Ph
On Fri, May 23, 2025 at 12:10 AM Alexander Korotkov
wrote:
>
> Hi, Vitaly!
>
> On Tue, May 20, 2025 at 6:44 PM Vitaly Davydov
> wrote:
> >
> > Thank you very much for the review!
> >
> > > The patchset doesn't seem to build after 371f2db8b0, which adjusted
> > > the signature of the INJECTION_PO
Hi, Vitaly!
On Tue, May 20, 2025 at 6:44 PM Vitaly Davydov wrote:
>
> Thank you very much for the review!
>
> > The patchset doesn't seem to build after 371f2db8b0, which adjusted
> > the signature of the INJECTION_POINT() macro. Could you, please,
> > update the patchset accordingly.
>
> I've u
Hi Alexander,
Thank you very much for the review!
> The patchset doesn't seem to build after 371f2db8b0, which adjusted
> the signature of the INJECTION_POINT() macro. Could you, please,
> update the patchset accordingly.
I've updated the patch (see attached). Thanks.
> I see in 0004 patch we'
Hi Vitaly!
On Fri, May 2, 2025 at 8:47 PM Vitaly Davydov wrote:
> Thank you for the attention to the patch. I updated a patch with a better
> solution for the master branch which can be easily backported to the other
> branches as we agree on the final solution.
>
> Two tests are introduced which
Dear All,
Thank you for the attention to the patch. I updated a patch with a better
solution for the master branch which can be easily backported to the other
branches as we agree on the final solution.
Two tests are introduced which are based on Tomas Vondra's test for logical
slots
with inject
On Mon, Apr 28, 2025 at 6:39 PM Alexander Korotkov wrote:
>
> On Tue, Apr 29, 2025 at 4:03 AM Masahiko Sawada wrote:
> >
> > On Mon, Apr 28, 2025 at 8:17 AM Alexander Korotkov
> > wrote:
> > >
> > > > I have a question - is there any interest to backport the solution into
> > > > existing major
On Tue, Apr 29, 2025 at 4:03 AM Masahiko Sawada wrote:
>
> On Mon, Apr 28, 2025 at 8:17 AM Alexander Korotkov
> wrote:
> >
> > > I have a question - is there any interest to backport the solution into
> > > existing major releases?
> >
> > As long as this is the bug, it should be backpatched to
On Mon, Apr 28, 2025 at 8:17 AM Alexander Korotkov wrote:
>
> > I have a question - is there any interest to backport the solution into
> > existing major releases?
>
> As long as this is the bug, it should be backpatched to all supported
> affected releases.
Yes, but I think we cannot back-patch
On Thu, Apr 24, 2025 at 5:32 PM Vitaly Davydov wrote:
> Thank you for the review. I apologize for a late reply. I missed your email.
>
> > 1) As ReplicationSlotsComputeRequiredLSN() is called each time we need
> > to advance the position of WAL needed by replication slots, the usage
> > pattern pr
Hi Alexander,
Thank you for the review. I apologize for a late reply. I missed your email.
> 1) As ReplicationSlotsComputeRequiredLSN() is called each time we need
> to advance the position of WAL needed by replication slots, the usage
> pattern probably could be changed. Thus, we probably need
Hi, Vitaly!
On Mon, Mar 3, 2025 at 5:12 PM Vitaly Davydov wrote:
> The slot data is flushed to the disk at the beginning of checkpoint. If
> an existing slot is advanced in the middle of checkpoint execution, its
> advanced restart LSN is taken to calculate the oldest LSN for WAL
> segments remov
Dear Hackers,
Let me please introduce a new version of the patch.
Patch description:
The slot data is flushed to the disk at the beginning of checkpoint. If
an existing slot is advanced in the middle of checkpoint execution, its
advanced restart LSN is taken to calculate the oldest LSN for WAL
s
> On 11/21/24 14:59, Tomas Vondra wrote:
>
> I don't have a great idea how to improve this. It seems wrong for
> ReplicationSlotsComputeRequiredLSN() to calculate the LSN using values
> from dirty slots, so maybe it should simply retry if any slot is dirty?
> Or retry on that one slot? But various
On 11/21/24 14:59, Tomas Vondra wrote:
>
> ...
>
> But then there's the SQL API - pg_logical_slot_get_changes(). And it
> turns out it ends up syncing the slot to disk pretty often, because for
> RUNNING_XACTS we call LogicalDecodingProcessRecord() + standby_decode(),
> which ends up calling SaveS
On Thursday, November 21, 2024 17:56 MSK, "Vitaly Davydov"
wrote:
> I'm trying to create a perl test to reproduce it. Please, give me some time
> to create the test script.
Attached is the test script which reproduces my problem. It should be run on a
patched postgresql with the following cha
Hi Tomas,
Thank you for the reply and your interest to the investigation.
On Wednesday, November 20, 2024 20:24 MSK, Tomas Vondra wrote:
> If an existing physical slot is advanced in the middle of checkpoint
> execution, WAL segments, which are related to saved on disk restart LSN
> may be r
On 11/20/24 23:19, Tomas Vondra wrote:
> On 11/20/24 18:24, Tomas Vondra wrote:
>>
>> ...
>>
>> What confuses me a bit is that we update the restart_lsn (and call
>> ReplicationSlotsComputeRequiredLSN() to recalculate the global value)
>> all the time. Walsender does that in PhysicalConfirmRecei
On 10/31/24 11:18, Vitaly Davydov wrote:
> Dear Hackers,
>
>
>
> I'd like to discuss a problem with replication slots's restart LSN.
> Physical slots are saved to disk at the beginning of checkpoint. At the
> end of checkpoint, old WAL segments are recycled or removed from disk,
> if they are n
On 11/20/24 18:24, Tomas Vondra wrote:
>
> ...
>
> What confuses me a bit is that we update the restart_lsn (and call
> ReplicationSlotsComputeRequiredLSN() to recalculate the global value)
> all the time. Walsender does that in PhysicalConfirmReceivedLocation for
> example. So we actually see the
On 11/20/24 14:40, Vitaly Davydov wrote:
> Dear Hackers,
>
>
>
> To ping the topic, I'd like to clarify what may be wrong with the idea
> described here, because I do not see any interest from the community.
> The topic is related to physical replication. The primary idea is to
> define the
Dear Hackers,
To ping the topic, I'd like to clarify what may be wrong with the idea
described here, because I do not see any interest from the community. The topic
is related to physical replication. The primary idea is to define the horizon
of WAL segments (files) removal based on saved on
Dear Hackers,
I'd like to introduce an improved version of my patch (see the attached file).
My original idea was to take into account saved on disk restart_lsn
(slot→restart_lsn_flushed) for persistent slots when removing WAL segment
files. It helps tackle errors like: ERROR: requested WAL s
Sorry, attached the missed patch.
On Thursday, October 31, 2024 13:18 MSK, "Vitaly Davydov"
wrote:
Dear Hackers,
I'd like to discuss a problem with replication slots's restart LSN. Physical
slots are saved to disk at the beginning of checkpoint. At the end of
checkpoint, old WAL segments a
68 matches
Mail list logo