On Fri, Apr 5, 2024 at 1:14 PM Bertrand Drouvot <bertranddrouvot...@gmail.com> wrote: > > > Please find the attached v36 patch. > > A few comments: > > 1 === > > + <para> > + The timeout is measured from the time since the slot has become > + inactive (known from its > + <structfield>inactive_since</structfield> value) until it gets > + used (i.e., its <structfield>active</structfield> is set to true). > + </para> > > That's right except when it's invalidated during the checkpoint (as the slot > is not acquired in CheckPointReplicationSlots()). > > So, what about adding: "or a checkpoint occurs"? That would also explain that > the invalidation could occur during checkpoint.
Reworded. > 2 === > > + /* If the slot has been invalidated, recalculate the resource limits > */ > + if (invalidated) > + { > > /If the slot/If a slot/? Modified it to be like elsewhere. > 3 === > > + * NB - this function also runs as part of checkpoint, so avoid raising > errors > > s/NB - this/NB: This function/? (that looks more consistent with other > comments > in the code) Done. > 4 === > > + * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause > instead > > I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause > instead" > looks weird to me. Maybe it would make sense to reword this a bit. Reworded. > 5 === > > + * considered not active as they don't actually perform logical > decoding. > > Not sure that's 100% accurate as we switched in fast forward logical > in 2ec005b4e2. > > "as they perform only fast forward logical decoding (or not at all)", maybe? Changed it to "as they don't perform logical decoding to produce the changes". In fast_forward mode no changes are produced. > 6 === > > + if (RecoveryInProgress() && slot->data.synced) > + return false; > + > + if (replication_slot_inactive_timeout == 0) > + return false; > > What about just using one if? It's more a matter of taste but it also probably > reduces the object file size a bit for non optimized build. Changed. > 7 === > > + /* > + * Do not invalidate the slots which are currently being > synced from > + * the primary to the standby. > + */ > + if (RecoveryInProgress() && slot->data.synced) > + return false; > > I think we don't need this check as the exact same one is done just before. Right. Removed. > 8 === > > +sub check_for_slot_invalidation_in_server_log > +{ > + my ($node, $slot_name, $offset) = @_; > + my $invalidated = 0; > + > + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; > $i++) > + { > + $node->safe_psql('postgres', "CHECKPOINT"); > > Wouldn't be better to wait for the replication_slot_inactive_timeout time > before > instead of triggering all those checkpoints? (it could be passed as an extra > arg > to wait_for_slot_invalidation()). Done. > 9 === > > # Synced slot mustn't get invalidated on the standby, it must sync > invalidation > # from the primary. So, we must not see the slot's invalidation message in > server > # log. > ok( !$standby1->log_contains( > "invalidating obsolete replication slot \"lsub1_sync_slot\"", > $standby1_logstart), > 'check that syned slot has not been invalidated on the standby'); > > Would that make sense to trigger a checkpoint on the standby before this test? > I mean I think that without a checkpoint on the standby we should not see the > invalidation in the log anyway. Done. Please find the attached v37 patch for further review. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
v37-0001-Add-inactive_timeout-based-replication-slot-inva.patch
Description: Binary data