Hi,

It is quite common to set a global statement_timeout of a few seconds
(or minutes) in postgresql.conf in order to protect a production
server against slow or bad queries.
This value applies to every connection in the system unless it is
redefined per database, per user, or explicitly changed in the session.
pg_rewind runs quite a few queries on the primary, and if
statement_timeout cancels one of them, the whole process fails. I
can't tell for sure whether this leaves things in an unrecoverable
state, or whether a second run of pg_rewind would be able to finish
the job. But even if a retry is possible, this makes pg_rewind hard to
use for reliable automation.
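
To make the failure mode concrete, here is a rough libpq sketch
(purely illustrative, not part of the patch; the connection string,
the pg_sleep() query and the assumed server-side timeout are made up)
of what a client observes when statement_timeout fires: the query
fails with SQLSTATE 57014 (query_canceled), and pg_rewind treats such
an error as fatal.

/*
 * Illustrative sketch only: what a libpq client sees when
 * statement_timeout cancels one of its queries.  The connection
 * string, the pg_sleep() call and the assumed server-side timeout
 * are placeholders.
 */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "libpq-fe.h"

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		exit(1);
	}

	/* assume the server has, say, statement_timeout = '5s' set globally */
	res = PQexec(conn, "SELECT pg_sleep(60)");
	if (PQresultStatus(res) != PGRES_TUPLES_OK)
	{
		/* SQLSTATE 57014 = query_canceled, raised when the timeout fires */
		const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

		if (sqlstate && strcmp(sqlstate, "57014") == 0)
			fprintf(stderr, "cancelled by statement_timeout: %s",
					PQresultErrorMessage(res));
	}
	PQclear(res);
	PQfinish(conn);
	return 0;
}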

There are a few workarounds to this problem:
1. ALTER DATABASE postgres SET statement_timeout = 0;
2. ALTER ROLE rewind_username SET statement_timeout = 0;
3. Run export PGOPTIONS="-c statement_timeout=0" before calling
pg_rewind (a libpq sketch of the same idea follows this list).
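
For reference, the third workaround can also be expressed from a
libpq client rather than through the environment, using the standard
"options" connection keyword; the function name, host and database
below are placeholders:

/*
 * Sketch only: pass the per-connection override in the connection
 * string via libpq's "options" keyword, the in-process equivalent of
 * PGOPTIONS.
 */
#include "libpq-fe.h"

PGconn *
connect_without_statement_timeout(void)
{
	/* hypothetical helper; host/dbname are placeholders */
	return PQconnectdb("host=primary dbname=postgres "
					   "options='-c statement_timeout=0'");
}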

All of them have certain pros and cons. The third approach works well
for automation, but IMHO we should simply fix pg_rewind itself and
have it SET statement_timeout = 0 right after establishing a
connection, so that everybody benefits from it.

Patch attached.

Regards,
--
Alexander Kukushkin
diff --git a/src/bin/pg_rewind/libpq_fetch.c b/src/bin/pg_rewind/libpq_fetch.c
index 37eccc3126..18740c15d0 100644
--- a/src/bin/pg_rewind/libpq_fetch.c
+++ b/src/bin/pg_rewind/libpq_fetch.c
@@ -54,6 +54,13 @@ libpqConnect(const char *connstr)
 	if (showprogress)
 		pg_log_info("connected to server");
 
+	/* We don't want our queries cancelled due to statement timeout */
+	res = PQexec(conn, "SET statement_timeout = 0");
+	if (PQresultStatus(res) != PGRES_COMMAND_OK)
+		pg_fatal("could not set up statement_timeout: %s",
+				 PQresultErrorMessage(res));
+	PQclear(res);
+
 	res = PQexec(conn, ALWAYS_SECURE_SEARCH_PATH_SQL);
 	if (PQresultStatus(res) != PGRES_TUPLES_OK)
 		pg_fatal("could not clear search_path: %s",
