This is a bit of a detour, but probably a useful one. Attached is a
patch that replaces a tight PostmasterIsAlive() polling loop in the AV
launcher with a latch, making use of the new WL_POSTMASTER_DEATH
functionality. It's similar to what we've already done for the
archiver. It is relatively straightforward as these auxiliary process
polling loop elimination patches go (certainly compared to what we're
contemplating with the WAL writer), but it raises some questions that
we were lucky to have been able to avoid when I worked on the
archiver. Obviously, this patch isn't finished.

We register various generic signal handlers for the AVLauncher,
including StatementCancelHandler(). Of course, signals that are
handled generically have the same potential to invalidate WaitLatch()
timeout as any other type of signal. We should be mindful of this.

ISTM that these generic handlers ought to be handling this
generically, and that there should be a Latch for just this purpose
for each process within PGPROC. We already have this Latch in PGPROC:

Latch           waitLatch;              /* allow us to wait for sync rep */

Maybe its purpose should be expanded to "current process Latch"?

Another concern is, what happens when we receive a signal, generically
handled or otherwise, and have to SetLatch() to avoid time-out
invalidation? Should we just live with a spurious
AutoVacLauncherMain() iteration, or should we do something like check
if the return value of WaitLatch indicates that we woke up due to a
SetLatch() call, which must have been within a singal handler, and
that we should therefore goto just before WaitLatch() and elide the
spurious iteration? Given that we can expect some signals to occur
relatively frequently, spurious iterations could be a real concern.

Incidentally, should I worry about the timeout long for WaitLatch()
overflowing?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 2f3fcbf..cf5eb59 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -84,6 +84,7 @@
 #include "postmaster/postmaster.h"
 #include "storage/bufmgr.h"
 #include "storage/ipc.h"
+#include "storage/latch.h"
 #include "storage/pmsignal.h"
 #include "storage/proc.h"
 #include "storage/procsignal.h"
@@ -267,6 +268,10 @@ int			AutovacuumLauncherPid = 0;
 static pid_t avlauncher_forkexec(void);
 static pid_t avworker_forkexec(void);
 #endif
+/*
+ * Latch used by signal handlers to wake up the sleep in the main loop.
+ */
+static Latch mainloop_latch;
 NON_EXEC_STATIC void AutoVacWorkerMain(int argc, char *argv[]);
 NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
 
@@ -545,6 +550,8 @@ AutoVacLauncherMain(int argc, char *argv[])
 	 */
 	rebuild_database_list(InvalidOid);
 
+	InitLatch(&mainloop_latch);
+
 	for (;;)
 	{
 		struct timeval nap;
@@ -567,38 +574,20 @@ AutoVacLauncherMain(int argc, char *argv[])
 
 		/*
 		 * Sleep for a while according to schedule.
+		 * Wait on Latch.
 		 *
-		 * On some platforms, signals won't interrupt the sleep.  To ensure we
-		 * respond reasonably promptly when someone signals us, break down the
-		 * sleep into 1-second increments, and check for interrupts after each
-		 * nap.
+		 * We handle wait time invalidation by calling
+		 * SetLatch() in signal handlers.
 		 */
-		while (nap.tv_sec > 0 || nap.tv_usec > 0)
-		{
-			uint32		sleeptime;
-
-			if (nap.tv_sec > 0)
-			{
-				sleeptime = 1000000;
-				nap.tv_sec--;
-			}
-			else
-			{
-				sleeptime = nap.tv_usec;
-				nap.tv_usec = 0;
-			}
-			pg_usleep(sleeptime);
-
-			/*
-			 * Emergency bailout if postmaster has died.  This is to avoid the
-			 * necessity for manual cleanup of all postmaster children.
-			 */
-			if (!PostmasterIsAlive())
-				proc_exit(1);
-
-			if (got_SIGTERM || got_SIGHUP || got_SIGUSR2)
-				break;
-		}
+		WaitLatch(&mainloop_latch, WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH, (nap.tv_sec * 1000000L) + nap.tv_usec);
+		/*
+		 * Emergency bailout if postmaster has died.  This is to avoid the
+		 * necessity for manual cleanup of all postmaster children.
+		 *
+		 * This happens again here because we may have slept for quite a while.
+		 */
+		if (!PostmasterIsAlive())
+			proc_exit(1);
 
 		DisableCatchupInterrupt();
 
@@ -1322,6 +1311,8 @@ static void
 avl_sighup_handler(SIGNAL_ARGS)
 {
 	got_SIGHUP = true;
+	/* let the waiting loop iterate */
+	SetLatch(&mainloop_latch);
 }
 
 /* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
@@ -1329,6 +1320,8 @@ static void
 avl_sigusr2_handler(SIGNAL_ARGS)
 {
 	got_SIGUSR2 = true;
+	/* let the waiting loop iterate */
+	SetLatch(&mainloop_latch);
 }
 
 /* SIGTERM: time to die */
@@ -1336,6 +1329,8 @@ static void
 avl_sigterm_handler(SIGNAL_ARGS)
 {
 	got_SIGTERM = true;
+	/* let the waiting loop iterate */
+	SetLatch(&mainloop_latch);
 }
 
 
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to