On 2015-01-19 17:16:11 +0900, Michael Paquier wrote:
> On Mon, Jan 19, 2015 at 4:10 PM, Michael Paquier
> <michael.paqu...@gmail.com> wrote:
> > On Sat, Jan 17, 2015 at 2:44 AM, Andres Freund <and...@2ndquadrant.com>
> > wrote:
> >> Not this patch's fault, but I'm getting a bit tired seeing the above open
> >> coded. How about adding a function that does the sleeping based on a
> >> timestamptz and a ms interval?
> > You mean in plugins, right? I don't recall seeing similar patterns in
> > other code paths of backend. But I think that we can use something
> > like that in timestamp.c then because we need to leverage that between
> > two timestamps, the last failure and now():
> > TimestampSleepDifference(start_time, stop_time, internal_ms);
> > Perhaps you have something else in mind?
> >
> > Attached is an updated patch.
> Actually I came with better than last patch by using a boolean flag as
> return value of TimestampSleepDifference and use
> TimestampDifferenceExceeds directly inside it.

> Subject: [PATCH] Add wal_availability_check_interval to control WAL fetching
>  on failure

I think that name isn't very good. And it isn't very accurate
either. How about wal_retrieve_retry_interval? Not very nice, but I think
it's still better than the above.

> +     <varlistentry id="wal-availability-check-interval" xreflabel="wal_availability_check_interval">
> +      <term><varname>wal_availability_check_interval</varname> (<type>integer</type>)
> +      <indexterm>
> +        <primary><varname>wal_availability_check_interval</> recovery parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        This parameter specifies the amount of time to wait when
> +        WAL is not available for a node in recovery. Default value is
> +        <literal>5s</>.
> +       </para>
> +       <para>
> +        A node in recovery will wait for this amount of time if
> +        <varname>restore_command</> returns nonzero exit status code when
> +        fetching new WAL segment files from archive or when a WAL receiver
> +        is not able to fetch a WAL record when using streaming replication.
> +       </para>
> +      </listitem>
> +     </varlistentry>
> +
>      </variablelist>

Walreceiver doesn't wait that amount of time, but rather for as long as
the connection is intact. And restore_command may or may not retry.

>  /*-------
>   * Standby mode is implemented by a state machine:
> @@ -10490,15 +10511,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
>                       * machine, so we've exhausted all the options for
>                       * obtaining the requested WAL. We're going to loop back
>                       * and retry from the archive, but if it hasn't been long
> -                     * since last attempt, sleep 5 seconds to avoid
> -                     * busy-waiting.
> +                     * since last attempt, sleep the amount of time specified
> +                     * by wal_availability_check_interval to avoid busy-waiting.
>                       */
> -                    now = (pg_time_t) time(NULL);
> -                    if ((now - last_fail_time) < 5)
> -                    {
> -                        pg_usleep(1000000L * (5 - (now - last_fail_time)));
> -                        now = (pg_time_t) time(NULL);
> -                    }
> +                    now = GetCurrentTimestamp();
> +                    if (TimestampSleepDifference(last_fail_time, now,
> +                                                 wal_availability_check_interval))
> +                        now = GetCurrentTimestamp();

Not bad, that's much easier to read imo.

Greetings,

Andres Freund
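
For illustration only, the sleeping helper discussed above could look
roughly like the standalone C sketch below. It is not the code from the
patch: timestamp_us, remaining_wait_us() and sleep_remaining_interval()
are hypothetical names, timestamps are assumed to be plain microsecond
counts, and nanosleep() stands in for pg_usleep().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Assumption: a stand-in for TimestampTz, counted in microseconds. */
typedef int64_t timestamp_us;

/*
 * Microseconds still to wait so that at least interval_ms has passed
 * between start and stop. Returns 0 if the interval already elapsed.
 */
static int64_t
remaining_wait_us(timestamp_us start, timestamp_us stop, int interval_ms)
{
    int64_t elapsed = stop - start;
    int64_t wanted = (int64_t) interval_ms * 1000;

    assert(interval_ms >= 0);

    /* If the clock went backwards, treat it as no time having elapsed. */
    if (elapsed < 0)
        elapsed = 0;

    return (elapsed >= wanted) ? 0 : wanted - elapsed;
}

/*
 * Sleep for whatever is left of the interval. Returns true if it slept,
 * so the caller knows its notion of "now" is stale and must be refreshed.
 */
static bool
sleep_remaining_interval(timestamp_us start, timestamp_us stop, int interval_ms)
{
    int64_t wait_us = remaining_wait_us(start, stop, interval_ms);
    struct timespec ts;

    if (wait_us == 0)
        return false;

    ts.tv_sec = (time_t) (wait_us / 1000000);
    ts.tv_nsec = (long) (wait_us % 1000000) * 1000;
    nanosleep(&ts, NULL);
    return true;
}
```

A caller would pass the last failure time, the current time and the
configured interval, and re-read the current time only when the function
returns true, which mirrors the shape of the quoted hunk.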