On Thu, Aug 21, 2025 at 2:09 PM Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>
> On Thursday, August 21, 2025 2:01 PM shveta malik <shveta.ma...@gmail.com> 
> wrote:
> >
> > On Wed, Aug 20, 2025 at 12:12 PM Zhijie Hou (Fujitsu)
> > <houzj.f...@fujitsu.com> wrote:
> > >
> > >
> > > I agree. Here is V63 version which implements this approach.
> > >
> >
> > Thank You for the patches.
> >
> > > The retention status is recorded in the pg_subscription catalog
> > > (subretentionactive) to prevent unnecessary retention initiation upon
> > > server restarts. The apply worker is responsible for updating this
> > > flag based on the retention duration. Meanwhile, the column is set to
> > > true when retain_dead_tuples is enabled or when creating a new
> > > subscription with retain_dead_tuples enabled, and it is set to false when
> > > retain_dead_tuples is disabled.
> > >
> >
> > +1 on the idea.
> >
> > Please find few initial testing feedback:
>
> Thanks for the comments.
>
> >
> > 1)
> > When it stops, it does not resume until we restart the server. It keeps on
> > waiting in wait_for_publisher_status and it never receives one.
> >
> > 2)
> > When we do: alter subscription sub1 set (max_conflict_retention_duration=0);
> >
> > It does not resume in this scenario too.
> > should_resume_retention_immediately() does not return true due to
> > wait-status on publisher.
>
> Fixed in the V64 patches.
>
>
> > 3)
> > AlterSubscription():
> >  * retention will be stopped gain soon in such cases, and
> >
> > stopped gain --> stopped again
>
> Sorry, I missed this typo in V64, I will fix it in the next version.
>

Sure. Thanks.
Please find a few more comments:

1)
There is an issue in retention resumption. It is observed in a multi
pub-sub setup where one sub is retaining info while another one has
stopped retention. Even if I set max_conflict_retention_duration=0
for the sub that has stopped retention, it does not resume. I have
attached the steps in the txt file.

2)
In the same testcase, sub1 does not resume on its own either. Even if
we do not set max_conflict_retention_duration to 0, it should resume
after a while, as there is no other txn on the pub that is stuck in
the commit phase. In a single pub-sub setup this works well; only the
multi pub-sub setup has this issue.

3)
ApplyLauncherMain() has some processing under 'if
(sub->retaindeadtuples)', all of which depends on sub->retentionactive.
Would it be better to write it as:

    if (sub->retaindeadtuples)
    {
        retain_dead_tuples = true;
        CreateConflictDetectionSlot();

        if (sub->retentionactive)
        {
            retention_inactive = false;
            can_advance_xmin &= sub->enabled;

            if (!TransactionIdIsValid(MyReplicationSlot->data.xmin))
                init_conflict_slot_xmin();
        }
    }

All 'sub->retentionactive' based logic under one 'if' would be easier
to understand.
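
To make the intended behaviour concrete, here is a minimal standalone
sketch of the nesting suggested above. It is only a model, not the patch
code: SubModel, LauncherModel, launcher_cycle_begin() and
launcher_visit_sub() are hypothetical names, the per-cycle defaults are an
assumption, and the two boolean slot fields merely stand in for
CreateConflictDetectionSlot() and init_conflict_slot_xmin().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the subscription fields the launcher reads. */
typedef struct SubModel
{
    bool        enabled;
    bool        retaindeadtuples;
    bool        retentionactive;
} SubModel;

/* Hypothetical stand-in for the launcher's state; not the real variables. */
typedef struct LauncherModel
{
    bool        retain_dead_tuples;
    bool        retention_inactive;
    bool        can_advance_xmin;
    bool        slot_created;       /* models CreateConflictDetectionSlot() */
    bool        slot_xmin_valid;    /* models init_conflict_slot_xmin() */
} LauncherModel;

/* Reset the per-cycle flags (assumed defaults); slot state persists. */
static void
launcher_cycle_begin(LauncherModel *st)
{
    st->retain_dead_tuples = false;
    st->retention_inactive = true;
    st->can_advance_xmin = true;
}

/* Process one subscription using the nesting proposed in 3). */
static void
launcher_visit_sub(LauncherModel *st, const SubModel *sub)
{
    if (!sub->retaindeadtuples)
        return;

    st->retain_dead_tuples = true;
    st->slot_created = true;

    /* Everything that depends on active retention lives under one 'if'. */
    if (sub->retentionactive)
    {
        st->retention_inactive = false;
        st->can_advance_xmin &= sub->enabled;

        if (!st->slot_xmin_valid)
            st->slot_xmin_valid = true;
    }
}
```

In this model, visiting only a sub that has stopped retention leaves
retention_inactive set and the slot's xmin untouched, while visiting an
additional sub with active retention clears the flag and initializes the
xmin.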

thanks
Shveta
Pub and Sub:
create table tab1(i int primary key);
create table tab2(i int primary key);
create table tab3(i int primary key);
 
Pub:
create publication pub1 for table tab1;
 
Sub:
create subscription sub1 connection 'dbname=postgres host=localhost user=shveta port=5433'
  publication pub1 WITH (retain_dead_tuples = true, max_conflict_retention_duration=10000);
 
Pub:
insert into tab1 values(10); -- hold debugger in RecordTransactionCommit
 
Sub:
insert into tab1 values(100);
 
--Wait for sub1 to stop retention and slot's xmin to become NULL:
select subname, subenabled, subretaindeadtuples, submaxconflretention, 
subretentionactive from pg_subscription;
select slot_name, xmin from pg_replication_slots;
 
--Now disable sub1:
alter subscription sub1 disable;
--And release pub's insert debugger.
 
 
--Now create another pub and sub pair:
create publication pub2 for table tab2;
create subscription sub2 connection 'dbname=postgres host=localhost user=shveta port=5433'
  publication pub2 WITH (retain_dead_tuples = true, max_conflict_retention_duration=10000);
 
--Sub2 will resume normally, slot's xmin will be valid now.
select subname, subenabled, subretaindeadtuples, submaxconflretention, 
subretentionactive from pg_subscription;
select slot_name, xmin from pg_replication_slots;
 
--Now enable sub1, it should still be in stop-retention mode
alter subscription sub1 enable;
select subname, subenabled, subretaindeadtuples, submaxconflretention, 
subretentionactive from pg_subscription;
select slot_name, xmin from pg_replication_slots;
 
--Now set sub1's max_conflict_retention_duration to 0, expectation is to resume retention immediately, but it does not.
alter subscription sub1 set (max_conflict_retention_duration=0);
