This is a bit of a detour, but probably a useful one. Attached is a patch that replaces a tight PostmasterIsAlive() polling loop in the AV launcher with a latch, making use of the new WL_POSTMASTER_DEATH functionality. It's similar to what we've already done for the archiver. It is relatively straightforward as these auxiliary process polling loop elimination patches go (certainly compared to what we're contemplating with the WAL writer), but it raises some questions that we were lucky to have been able to avoid when I worked on the archiver. Obviously, this patch isn't finished.
We register various generic signal handlers for the AVLauncher, including StatementCancelHandler(). Of course, signals that are handled generically have the same potential to invalidate WaitLatch() timeout as any other type of signal. We should be mindful of this. ISTM that these generic handlers ought to be handling this generically, and that there should be a Latch for just this purpose for each process within PGPROC. We already have this Latch in PGPROC: Latch waitLatch; /* allow us to wait for sync rep */ Maybe its purpose should be expanded to "current process Latch"? Another concern is, what happens when we receive a signal, generically handled or otherwise, and have to SetLatch() to avoid time-out invalidation? Should we just live with a spurious AutoVacLauncherMain() iteration, or should we do something like check if the return value of WaitLatch indicates that we woke up due to a SetLatch() call, which must have been within a singal handler, and that we should therefore goto just before WaitLatch() and elide the spurious iteration? Given that we can expect some signals to occur relatively frequently, spurious iterations could be a real concern. Incidentally, should I worry about the timeout long for WaitLatch() overflowing? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 2f3fcbf..cf5eb59 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -84,6 +84,7 @@ #include "postmaster/postmaster.h" #include "storage/bufmgr.h" #include "storage/ipc.h" +#include "storage/latch.h" #include "storage/pmsignal.h" #include "storage/proc.h" #include "storage/procsignal.h" @@ -267,6 +268,10 @@ int AutovacuumLauncherPid = 0; static pid_t avlauncher_forkexec(void); static pid_t avworker_forkexec(void); #endif +/* + * Latch used by signal handlers to wake up the sleep in the main loop. + */ +static Latch mainloop_latch; NON_EXEC_STATIC void AutoVacWorkerMain(int argc, char *argv[]); NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]); @@ -545,6 +550,8 @@ AutoVacLauncherMain(int argc, char *argv[]) */ rebuild_database_list(InvalidOid); + InitLatch(&mainloop_latch); + for (;;) { struct timeval nap; @@ -567,38 +574,20 @@ AutoVacLauncherMain(int argc, char *argv[]) /* * Sleep for a while according to schedule. + * Wait on Latch. * - * On some platforms, signals won't interrupt the sleep. To ensure we - * respond reasonably promptly when someone signals us, break down the - * sleep into 1-second increments, and check for interrupts after each - * nap. + * We handle wait time invalidation by calling + * SetLatch() in signal handlers. */ - while (nap.tv_sec > 0 || nap.tv_usec > 0) - { - uint32 sleeptime; - - if (nap.tv_sec > 0) - { - sleeptime = 1000000; - nap.tv_sec--; - } - else - { - sleeptime = nap.tv_usec; - nap.tv_usec = 0; - } - pg_usleep(sleeptime); - - /* - * Emergency bailout if postmaster has died. This is to avoid the - * necessity for manual cleanup of all postmaster children. - */ - if (!PostmasterIsAlive()) - proc_exit(1); - - if (got_SIGTERM || got_SIGHUP || got_SIGUSR2) - break; - } + WaitLatch(&mainloop_latch, WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH, (nap.tv_sec * 1000000L) + nap.tv_usec); + /* + * Emergency bailout if postmaster has died. This is to avoid the + * necessity for manual cleanup of all postmaster children. + * + * This happens again here because we may have slept for quite a while. + */ + if (!PostmasterIsAlive()) + proc_exit(1); DisableCatchupInterrupt(); @@ -1322,6 +1311,8 @@ static void avl_sighup_handler(SIGNAL_ARGS) { got_SIGHUP = true; + /* let the waiting loop iterate */ + SetLatch(&mainloop_latch); } /* SIGUSR2: a worker is up and running, or just finished, or failed to fork */ @@ -1329,6 +1320,8 @@ static void avl_sigusr2_handler(SIGNAL_ARGS) { got_SIGUSR2 = true; + /* let the waiting loop iterate */ + SetLatch(&mainloop_latch); } /* SIGTERM: time to die */ @@ -1336,6 +1329,8 @@ static void avl_sigterm_handler(SIGNAL_ARGS) { got_SIGTERM = true; + /* let the waiting loop iterate */ + SetLatch(&mainloop_latch); }
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers