Hi,

On 4/24/23 11:45 AM, Amit Kapila wrote:
On Mon, Apr 24, 2023 at 11:54 AM Amit Kapila <amit.kapil...@gmail.com> wrote:

On Mon, Apr 24, 2023 at 11:24 AM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:


Few comments:
============


+# We can not test if the WAL file still exists immediately.
+# We need to let some time to the standby to actually "remove" it.
+my $i = 0;
+while (1)
+{
+ last if !-f $standby_walfile;
+ if ($i++ == 10 * $default_timeout)
+ {
+ die
+   "could not determine if WAL file has been retained or not, can't continue";
+ }
+ usleep(100_000);
+}

Is this adhoc wait required because we can't guarantee that the
checkpoint is complete on standby even after using wait_for_catchup?

Yes, the restart point on the standby is not necessarily completed even after
wait_for_catchup is done.

Is there a guarantee that it can never fail on some slower machines?


We wait here for at most 10 * $default_timeout iterations of 100 ms each
(so $default_timeout seconds, i.e. 3 minutes) before we time out. Would you
prefer a maximum wait longer than 3 minutes?
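
To make the arithmetic behind that bound explicit, here is a minimal sketch of
the same wait written as a helper. The name wait_for_file_removal and its
parameters are hypothetical and not part of the patch; the 3 minute figure
assumes $default_timeout is 180 seconds.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper (not part of the patch): poll until $path no longer
# exists, giving up after roughly $timeout_secs seconds.
# Returns 1 if the file disappeared, 0 on timeout.
sub wait_for_file_removal
{
	my ($path, $timeout_secs, $interval_usecs) = @_;
	$interval_usecs //= 100_000;    # 100 ms, same interval as the patch

	# Number of sleeps that fit into the timeout. With a 100 ms interval
	# this is 10 * $timeout_secs, matching the condition in the patch.
	my $max_attempts = int($timeout_secs * 1_000_000 / $interval_usecs);

	for my $attempt (0 .. $max_attempts)
	{
		return 1 if !-f $path;
		usleep($interval_usecs) if $attempt < $max_attempts;
	}
	return 0;
}

# Possible usage in the test (variables as in the patch):
#   wait_for_file_removal($standby_walfile, $default_timeout)
#     or die "could not determine if WAL file has been retained or not";
```

Taking the timeout in seconds keeps the bound readable if the sleep interval
is ever changed.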

BTW, for the second test is it necessary that we first ensure that the
WAL file has not been retained on the primary?


I was not sure it's worth it either. The idea was more: it's useless to verify
that it is removed on the standby if we are not 100% sure it has been removed
on the primary first. But yeah, we can get rid of this test if you prefer.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

