On 24.05.2011 23:43, Peter Geoghegan wrote:
> Attached is the latest revision of the latch implementation that monitors postmaster death, plus the archiver client that now relies on that new functionality and thereby works well without a tight PostmasterIsAlive() polling loop.
The Unix stuff looks good to me at first glance.
> The lifesign terminology has been dropped. We now close() the file descriptor that represents "ownership" - the write end of our anonymous pipe - in each child backend directly in the forking machinery (the thin fork() wrapper for the non-EXEC_BACKEND case), through a call to ReleasePostmasterDeathWatchHandle(). We don't have to do that on Windows, and we don't.
There's one reference left to "life sign" in comments. (FWIW, I don't have a problem with that terminology myself)
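For readers following along, here is a minimal, self-contained sketch of the Unix-side idea described above. It is not code from the patch: wait_for_parent_death() and run_death_watch_demo() are hypothetical names, and a throwaway process stands in for the postmaster. The parent keeps the write end of an anonymous pipe, the child closes its inherited copy of that write end, and select() on the read end then reports EOF once the parent is gone.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, run in the child: block in select() on the read
 * end of the pipe.  Returns 'D' if EOF was seen (every write end is
 * closed, i.e. the parent died), 'T' on timeout, 'E' on error.
 */
static char
wait_for_parent_death(int watch_fd, int timeout_secs)
{
    fd_set rfds;
    struct timeval tv;
    char buf;
    int rc;

    assert(watch_fd >= 0 && watch_fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(watch_fd, &rfds);
    tv.tv_sec = timeout_secs;
    tv.tv_usec = 0;

    rc = select(watch_fd + 1, &rfds, NULL, NULL, &tv);
    if (rc == 0)
        return 'T';
    if (rc < 0)
        return 'E';
    /* Readable with zero bytes available means EOF. */
    return (read(watch_fd, &buf, 1) == 0) ? 'D' : 'E';
}

/*
 * Hypothetical demo driver: forks a stand-in "postmaster" that forks a
 * child and then exits abruptly.  Returns 1 if the child noticed.
 */
int
run_death_watch_demo(void)
{
    int report[2];      /* child -> caller: result of the wait */
    pid_t parent_pid;
    char result = 'E';

    if (pipe(report) < 0)
        return 0;
    parent_pid = fork();
    if (parent_pid < 0)
        return 0;
    if (parent_pid == 0)
    {
        /* Stand-in postmaster: owns the write end of death_pipe. */
        int death_pipe[2];
        pid_t child;
        struct timespec pause = {0, 100 * 1000 * 1000};

        if (pipe(death_pipe) < 0)
            _exit(1);
        child = fork();
        if (child < 0)
            _exit(1);
        if (child == 0)
        {
            char r;

            /* Without this close(), the child would never see EOF. */
            close(death_pipe[1]);
            close(report[0]);
            r = wait_for_parent_death(death_pipe[0], 5);
            if (write(report[1], &r, 1) != 1)
                _exit(1);
            _exit(0);
        }
        nanosleep(&pause, NULL);    /* let the child reach select() */
        _exit(0);                   /* die without any cleanup */
    }

    close(report[1]);
    waitpid(parent_pid, NULL, 0);
    if (read(report[0], &result, 1) != 1)
        result = 'E';
    close(report[0]);
    return result == 'D';
}
```

Calling run_death_watch_demo() from a small main() should return 1 on a POSIX system. Commenting out the close(death_pipe[1]) in the child makes the wait time out instead, which is the reason the write end has to be released in every forked child.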
> Disappointingly, and despite a big effort, there doesn't seem to be a way to have the win32 WaitForMultipleObjects() call wake on postmaster death in addition to everything else in the same way that select() does, so there are now two blocking calls, each in a thread of its own (when the latch code is interested in postmaster death - otherwise, it's single threaded as before). The threading stuff (in particular, the fact that we use a named pipe in a thread where the name of the pipe comes from the process PID) is inspired by win32 signal emulation, src/backend/port/win32/signal.c.
That's a pity; all those threads and named pipes are a bit gross for a safety mechanism like this.
Looking at the MSDN docs again, can't you simply include PostmasterHandle in the WaitForMultipleObjects() call to have it return when the process dies? It should be possible to mix different kinds of handles in one call, including process handles. Does it not work as advertised?
> You can easily observe that it works as advertised on Windows by starting Postgres with archiving, using task manager to monitor processes, and doing the following to the postmaster (assuming it has a PID of 1234). This is the Windows equivalent of kill -9:
>
>   C:\Users\Peter>taskkill /pid 1234 /F
>
> You'll see that it takes about a second for the archiver to exit. All processes exit.
Hmm, shouldn't the archiver exit almost instantaneously now that there's no polling anymore?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com