Hi,
On 4/24/23 11:45 AM, Amit Kapila wrote:
On Mon, Apr 24, 2023 at 11:54 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
On Mon, Apr 24, 2023 at 11:24 AM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:
Few comments:
============
+# We can not test if the WAL file still exists immediately.
+# We need to let some time to the standby to actually "remove" it.
+my $i = 0;
+while (1)
+{
+ last if !-f $standby_walfile;
+ if ($i++ == 10 * $default_timeout)
+ {
+ die
+ "could not determine if WAL file has been retained or not, can't continue";
+ }
+ usleep(100_000);
+}
Is this adhoc wait required because we can't guarantee that the
checkpoint is complete on standby even after using wait_for_catchup?
Yes, the restart point on the standby is not necessarily completed even after
wait_for_catchup is done.
Is there a guarantee that it can never fail on some slower machines?
We are waiting here for at most 10 * $default_timeout iterations of 100ms each
(that is, 3 minutes) before we time out. Would you prefer to wait longer than
3 minutes?
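To make the bound explicit, here is a minimal standalone sketch of the same polling logic. The helper names wait_for_file_removal and worst_case_wait_secs are hypothetical and not part of the patch, and the 180 used in the usage note assumes the default timeout that the 3 minutes figure implies:

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper, not part of the patch: poll until $path no longer
# exists, sleeping 100ms between checks, for at most $timeout_secs seconds.
# Returns 1 if the file disappeared, 0 if the timeout was reached.
sub wait_for_file_removal
{
	my ($path, $timeout_secs) = @_;
	my $max_attempts = 10 * $timeout_secs;    # 10 checks per second

	for (my $i = 0; $i < $max_attempts; $i++)
	{
		return 1 if !-f $path;
		usleep(100_000);
	}

	# One last check after the final sleep, as the loop in the patch does.
	return (-f $path) ? 0 : 1;
}

# Worst-case wait in seconds: (10 * timeout) sleeps of 100_000 microseconds.
sub worst_case_wait_secs
{
	my ($timeout_secs) = @_;
	return 10 * $timeout_secs * 100_000 / 1_000_000;
}
```

With a timeout of 180, worst_case_wait_secs(180) gives 180 seconds, so the wait scales with whatever timeout the test run is configured with rather than being a fixed 3 minutes.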
BTW, for the second test is it necessary that we first ensure that the
WAL file has not been retained on the primary?
I was not sure it was worth it either. The idea was more that it's useless to
verify the file is removed on the standby if we are not 100% sure it has been
removed on the primary first. But yeah, we can get rid of this test if you
prefer.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com