Hello. Thanks, I understand, what look in another part of code. Hope right now I did right changes.
To not modify current pg_usleep calculation, I changed restore_command_retry_interval value to seconds (not milliseconds). In this case, min value - 1 second. Mon, 29 Dec 2014 00:15:03 +0900 от Michael Paquier <michael.paqu...@gmail.com>: >On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev < leopard...@inbox.ru > wrote: >> Thanks for suggestions. >> >> Patch updated. > >Cool, thanks. I just had an extra look at it. > >+ This is useful, if I using for restore of wal logs some >+ external storage (like AWS S3) and no matter what the slave database >+ will lag behind the master. The problem, what for each request to >+ AWS S3 need to pay, what is why for N nodes, which try to get next >+ wal log each 5 seconds will be bigger price, than for example each >+ 30 seconds. >I reworked this portion of the docs, it is rather incorrect as the >documentation should not use first-person subjects, and I don't >believe that referencing any commercial products is a good thing in >this context. > >+# specifies an optional timeout after nonzero code of restore_command. >+# This can be useful to increase/decrease number of a restore_command calls. >This is still referring to a timeout. That's not good. And the name of >the parameter at the top of this comment block is missing. > >+static int restore_command_retry_interval = 5000L; >I think that it would be more adapted to set that to 5000, and >multiply by 1L. I am also wondering about having a better lower bound, >like 100ms to avoid some abuse with this feature in the retries? > >+ ereport(ERROR, >+ >(errcode(ERRCODE_INVALID_PARAMETER_VALUE), >+ errmsg("\"%s\" must >be bigger zero", >+ "restore_command_retry_interval"))); >I'd rather rewrite that to "must have a strictly positive value". > >- * Wait for more WAL to >arrive. Time out after 5 seconds, >+ * Wait for more WAL to >arrive. Time out after >+ * >restore_command_retry_interval (5 seconds by default), > * like when polling the >archive, to react to a trigger > * file promptly. > */ > >WaitLatch(&XLogCtl->recoveryWakeupLatch, > WL_LATCH_SET >| WL_TIMEOUT, >- 5000L); >+ >restore_command_retry_interval); >I should have noticed earlier, but in its current state your patch >actually does not work. What you are doing here is tuning the time >process waits for WAL from stream. In your case what you want to >control is the retry time for a restore_command in archive recovery, >no? >-- >Michael > > >-- >Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >To make changes to your subscription: >http://www.postgresql.org/mailpref/pgsql-hackers -- Alexey Vasiliev
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml index ef78bc0..38420a5 100644 --- a/doc/src/sgml/recovery-config.sgml +++ b/doc/src/sgml/recovery-config.sgml @@ -145,6 +145,26 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows </listitem> </varlistentry> + <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval"> + <term><varname>restore_command_retry_interval</varname> (<type>integer</type>) + <indexterm> + <primary><varname>restore_command_retry_interval</> recovery parameter</primary> + </indexterm> + </term> + <listitem> + <para> + If <varname>restore_command</> returns nonzero exit status code, retry + command after the interval of time specified by this parameter. + Default value is <literal>5s</>. + </para> + <para> + This is useful, if I using for restore of wal logs some + external storage and no matter what the slave database + will lag behind the master. + </para> + </listitem> + </varlistentry> + </variablelist> </sect1> diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample index b777400..5b63f60 100644 --- a/src/backend/access/transam/recovery.conf.sample +++ b/src/backend/access/transam/recovery.conf.sample @@ -58,6 +58,11 @@ # #recovery_end_command = '' # +# specifies an optional retry interval of restore_command command, if previous return nonzero exit status code. +# This can be useful to increase/decrease number of a restore_command calls. +# +#restore_command_retry_interval = 5s +# #--------------------------------------------------------------------------- # RECOVERY TARGET PARAMETERS #--------------------------------------------------------------------------- diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index e5dddd4..83a6db0 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime; static char *recoveryTargetName; static int recovery_min_apply_delay = 0; static TimestampTz recoveryDelayUntilTime; +static int restore_command_retry_interval = 5; /* options taken from recovery.conf for XLOG streaming */ static bool StandbyModeRequested = false; @@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void) (errmsg_internal("trigger_file = '%s'", TriggerFile))); } + else if (strcmp(item->name, "restore_command_retry_interval") == 0) + { + const char *hintmsg; + + if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_S, + &hintmsg)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("parameter \"%s\" requires a temporal value", + "restore_command_retry_interval"), + hintmsg ? errhint("%s", _(hintmsg)) : 0)); + ereport(DEBUG2, + (errmsg_internal("restore_command_retry_interval = '%s'", item->value))); + + if (restore_command_retry_interval < 1) + { + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("\"%s\" must have a strictly positive value", + "restore_command_retry_interval"))); + } + } else if (strcmp(item->name, "recovery_min_apply_delay") == 0) { const char *hintmsg; @@ -10495,13 +10518,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, * machine, so we've exhausted all the options for * obtaining the requested WAL. We're going to loop back * and retry from the archive, but if it hasn't been long - * since last attempt, sleep 5 seconds to avoid - * busy-waiting. + * since last attempt, sleep restore_command_retry_interval + * (by default 5 seconds) to avoid busy-waiting. */ now = (pg_time_t) time(NULL); - if ((now - last_fail_time) < 5) + if ((now - last_fail_time) < restore_command_retry_interval) { - pg_usleep(1000000L * (5 - (now - last_fail_time))); + pg_usleep(1000000L * (restore_command_retry_interval - (now - last_fail_time))); now = (pg_time_t) time(NULL); } last_fail_time = now;
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers