On Thu, Nov 27, 2025 at 4:59 AM Amit Kapila <[email protected]> wrote:
>
> On Thu, Nov 27, 2025 at 2:32 AM Masahiko Sawada <[email protected]> wrote:
> >
> > I've squashed all fixup patches and attached the updated patch.
> >
>
> 1.
> <literal>wal_level_insufficient</literal> means that the
> -          primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
> -          perform logical decoding.  It is set only for logical slots.
> +          primary doesn't have a <xref linkend="guc-effective-wal-level"/>
> +          to perform logical decoding.
>
> The word "sufficient" is missing after "guc-effective-wal-level".
>
> 2.
> + * With 'minimal' WAL level, there are not logical replication slots
> + * during recovery.
>
> /not/no. Typo
>
> 3.
> case XLOG_LOGICAL_DECODING_STATUS_CHANGE:
>   {
> - xl_parameter_change *xlrec =
> - (xl_parameter_change *) XLogRecGetData(buf->record);
> + bool logical_decoding;
>
> - /*
> - * If wal_level on the primary is reduced to less than
> - * logical, we want to prevent existing logical slots from
> - * being used.  Existing logical slots on the standby get
> - * invalidated when this WAL record is replayed; and further,
> - * slot creation fails when wal_level is not sufficient; but
> - * all these operations are not synchronized, so a logical
> - * slot may creep in while the wal_level is being reduced.
> - * Hence this extra check.
> - */
> - if (xlrec->wal_level < WAL_LEVEL_LOGICAL)
> + memcpy(&logical_decoding, XLogRecGetData(buf->record), sizeof(bool));
>
> The patch has entirely removed this comment, but I feel we should write
> something similar, especially for the part: "Existing logical slots on
> the standby get invalidated when this WAL record is replayed; and
> further, slot creation fails when wal_level is not sufficient; but all
> these operations are not synchronized, so a logical slot may creep in
> while the wal_level is being reduced. Hence this extra check." Did
> anything change about this part of the comment?
>
> 4.
> WaitLSN "Waiting to read or update shared Wait-for-LSN state."
> +LogicalDecodingControl "Waiting to access logical decoding status information."
>
> Seeing the description just above, wouldn't it be more correct to say
> "Waiting to read or update logical decoding status information."?

Fixed the above points.
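
For point 3, I've restored a comment along the lines you suggested next
to the new check. Roughly, it now looks like this (a sketch against the
new bool payload; the exact comment and error message wording may differ
from the patch):

case XLOG_LOGICAL_DECODING_STATUS_CHANGE:
    {
        bool        logical_decoding;

        memcpy(&logical_decoding, XLogRecGetData(buf->record), sizeof(bool));

        /*
         * If logical decoding is disabled on the primary, we want to
         * prevent existing logical slots from being used.  Existing
         * logical slots on the standby get invalidated when this WAL
         * record is replayed, and slot creation fails while logical
         * decoding is disabled; but these operations are not
         * synchronized, so a logical slot may creep in while the
         * status is being changed.  Hence this extra check.
         */
        if (!logical_decoding)
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                     errmsg("logical decoding on standby requires logical decoding to be enabled on the primary server")));
        break;
    }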

>
> 5. The newly added test took approximately 8s on my machine, whereas
> other similar tests normally took 2-6s on the same machine, though
> there are some exceptions, such as 035_standby_logical_decoding.pl.
> See below the results of some of the tests:
> -------
> [10:03:37] t/028_pitr_timelines.pl ............... ok     2254 ms ( 0.00 usr  0.00 sys +  0.39 cusr  0.83 csys =  1.22 CPU)
> [10:03:39] t/029_stats_restart.pl ................ ok     2915 ms ( 0.00 usr  0.00 sys +  0.34 cusr  0.42 csys =  0.76 CPU)
> [10:03:42] t/030_stats_cleanup_replica.pl ........ ok     2282 ms ( 0.00 usr  0.00 sys +  0.42 cusr  0.66 csys =  1.08 CPU)
> [10:03:45] t/031_recovery_conflict.pl ............ ok     2705 ms ( 0.00 usr  0.00 sys +  0.39 cusr  0.64 csys =  1.03 CPU)
> [10:03:47] t/032_relfilenode_reuse.pl ............ ok     2611 ms ( 0.01 usr  0.00 sys +  0.37 cusr  0.61 csys =  0.99 CPU)
> [10:03:50] t/033_replay_tsp_drops.pl ............. ok     4860 ms ( 0.00 usr  0.00 sys +  0.57 cusr  1.60 csys =  2.17 CPU)
> [10:03:55] t/034_create_database.pl .............. ok      922 ms ( 0.00 usr  0.00 sys +  0.19 cusr  0.19 csys =  0.38 CPU)
> [10:03:56] t/035_standby_logical_decoding.pl ..... ok    10899 ms ( 0.01 usr  0.00 sys +  1.13 cusr  2.21 csys =  3.35 CPU)
> [10:04:07] t/036_truncated_dropped.pl ............ ok     1781 ms ( 0.00 usr  0.00 sys +  0.21 cusr  0.22 csys =  0.43 CPU)
> [10:04:09] t/037_invalid_database.pl ............. ok      944 ms ( 0.00 usr  0.00 sys +  0.19 cusr  0.21 csys =  0.40 CPU)
> [10:04:09] t/038_save_logical_slots_shutdown.pl .. ok     1562 ms ( 0.00 usr  0.00 sys +  0.21 cusr  0.36 csys =  0.57 CPU)
> [10:04:11] t/039_end_of_wal.pl ................... ok     4638 ms ( 0.00 usr  0.00 sys +  0.48 cusr  0.66 csys =  1.14 CPU)
> [10:04:16] t/040_standby_failover_slots_sync.pl .. ok     7418 ms ( 0.01 usr  0.00 sys +  0.81 cusr  1.82 csys =  2.64 CPU)
> [10:04:23] t/041_checkpoint_at_promote.pl ........ ok     1535 ms ( 0.00 usr  0.00 sys +  0.29 cusr  0.51 csys =  0.80 CPU)
> [10:04:25] t/042_low_level_backup.pl ............. ok     2842 ms ( 0.00 usr  0.00 sys +  0.37 cusr  0.66 csys =  1.03 CPU)
> [10:04:27] t/043_no_contrecord_switch.pl ......... ok     1946 ms ( 0.00 usr  0.00 sys +  0.32 cusr  0.69 csys =  1.01 CPU)
> [10:04:29] t/044_invalidate_inactive_slots.pl .... ok      603 ms ( 0.00 usr  0.00 sys +  0.19 cusr  0.17 csys =  0.36 CPU)
> [10:04:30] t/045_archive_restartpoint.pl ......... ok     4324 ms ( 0.00 usr  0.00 sys +  0.97 cusr  0.66 csys =  1.63 CPU)
> [10:04:34] t/046_checkpoint_logical_slot.pl ...... ok     3322 ms ( 0.00 usr  0.00 sys +  0.33 cusr  0.55 csys =  0.88 CPU)
> [10:04:38] t/047_checkpoint_physical_slot.pl ..... ok     1919 ms ( 0.00 usr  0.00 sys +  0.28 cusr  0.43 csys =  0.71 CPU)
> [10:04:40] t/048_vacuum_horizon_floor.pl ......... ok     1413 ms ( 0.01 usr  0.00 sys +  0.26 cusr  0.53 csys =  0.80 CPU)
> [10:04:41] t/049_wait_for_lsn.pl ................. ok     6851 ms ( 0.00 usr  0.00 sys +  0.40 cusr  0.71 csys =  1.11 CPU)
> [10:04:48] t/050_effective_wal_level.pl .......... ok     8106 ms ( 0.00 usr  0.00 sys +  0.83 cusr  1.79 csys =  2.62 CPU)
> ---------
>
> I haven't investigated whether we can optimize or reduce the test
> timing without impacting the coverage or functionality, but please see
> if we can reduce it. If you think we can't do anything on this front
> without compromising functionality coverage, then I think we can live
> with it.

I guess we cannot avoid making this test somewhat heavy, given that it
involves multiple replication setups, standby promotions, injection
points, etc. I've trimmed several tests, and I hope that helps reduce
the test duration in your environment. It has been reduced a bit in my
environment, but the test time is unstable.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

