Chris Travers <chris.trav...@adjust.com> writes: > Here's a new patch. No rush on it. I am moving it to next commitfest > anyway because as code documentation I think this is a low priority late in > the release cycle. > The changes mostly address Andres's feedback above.
I took a look through this version. (BTW, please add a version number to the patchfile name in future, so people can keep track of which one we're talking about.) +src/backend/utils/misc/README.SIGNAL_HANDLING Hmm ... I'm not very sure of a good place to put this, but utils/misc/ seems kinda random, because there's no code at all in there that's particularly related to this topic. One alternate suggestion is utils/init/, which is at least related by virtue of the signal flags being defined in globals.c. Another idea is src/backend/tcop/, since postgres.c is where the rubber meets the road for these issues, at least in regular backends. Also, "README.SIGNAL_HANDLING" seems both overly long and not very consistent with the naming of other README files. The ones that aren't just "README" are README.git doc/src/sgml/README.links src/backend/access/heap/README.HOT src/backend/access/heap/README.tuplock src/backend/access/transam/README.parallel src/backend/libpq/README.SSL src/backend/statistics/README.dependencies src/backend/statistics/README.mcv src/backend/storage/lmgr/README-SSI src/backend/storage/lmgr/README.barrier src/interfaces/ecpg/README.dynSQL src/interfaces/ecpg/preproc/README.parser so there's a pretty clear preference for "README.lowercasesomething". Hence I suggest "README.signals". +Implementation Notes on Globals and Signal/Event Handling +========================================================= I'd drop "Globals and", it's not very relevant here; it's surely not the lead keyword. Maybe "Signal Handling in Postgres" would be even more apropos. +The approch to signal handling in PostgreSQL is designed to strictly conform +with the C89 standard and designed to run with as few platform assumptions as +possible. s/approch/approach/, s/conform with/conform to/, s/run with/make/ (or just drop "and designed..." entirely; C spec compliance is as much as we need to say). +Most signals (except SIGSEGV, and SIGKILL) are blocked by PostgreSQL +during process startup. The parenthetical remark is wrong, or at least incomplete; there's other stuff we don't block such as SIGBUS. I think I'd skip any attempt to list them here and just say "Most blockable signals are blocked ...". (Also, reading down, I note you return to this topic where you mention SIGILL. Perhaps better to enlarge on never-blocked signals at that point.) +An exception is made for SIGQUIT which is used by the postmaster to terminate +backend sessions quickly when another backend dies so that the postmaster +may re-initialize shared memory and otherwise return to a known-good state. It'd likely be worth the verbiage to enlarge on this. There is a clear risk of the SIGQUIT handler malfunctioning because it interrupts something that's not safe to interrupt. We live with this on the theory that we're just trying to kill the process anyway, so corruption of its internal state is tolerable; plus the postmaster has SIGKILL-after-a-timeout logic in case a child's SIGQUIT handler gets completely stuck. +CHECK_FOR_INTERRUPTS() may be called only when there is no more cleanup to do +because the query or process may be aborted. This seems pretty wrong or at least confusing. Maybe better "CHECK_FOR_INTERRUPTS must be called only in places where it is safe to invoke transaction abort and/or process termination. Because Postgres generally expects backend subsystems to maintain enough state to clean up after themselves at transaction abort, this is a weak restriction. However, functions that execute only straight-line code are allowed to put persistent variables into transiently inconsistent states; that's safe exactly as long as no CHECK_FOR_INTERRUPTS is executed while the process state is unsafe for transaction abort". +SIGHUP - Reload config file (Postmaster Only) Nope, that's not postmaster-only. Backends and most other child processes also re-read the config file on SIGHUP. +SIGINT - (Postmaster) Fast shutdown, (Backend) Cancel Query +SIGQUIT - (Postmaster) Immediate Shutdown, (Backend) Exit Immediately +SIGTERM - Terminate backend gracefully IIRC, SIGTERM is also Smart Shutdown in the postmaster; why isn't this documented similarly to SIGINT/SIGQUIT? +SIGUSR1 - Pending IPC or LWLock event Probably better to explain this as a general purpose signal multiplexor, because that's what it is. LWLock etc are specific use-cases. +SIGINFO - Parent died +SIGPWR - Parent died (alternative) I was totally confused by this to start with, until grepping reminded me that POSTMASTER_DEATH_SIGNAL is implememented as one or the other of them. It might be better to use that name in this documentation, and then explain that under the hood it is SIGINFO, SIGPWR, or possibly some other otherwise-unused signal. "Parent died" seems inappropriately general, too, because it's specifically the postmaster we care about. +A few other signals, such as SIGUSR2 may have different meanings in different +worker processes. In the case of SIGUSR2, we use it to promote a postmaster, +but also to exit a checkpointer or to stop replication. In general CPU-based +interrupts (such as SIGILL) are not caught or handled by PostgreSQL. Not sure if we want such an incomplete listing of SIGUSR2 uses as that. Also, the point about SIGILL surely should be a separate para? +CHECK_FOR_INTERRUPTS() may cause a non-local exit because the function it wraps +utilizes PostgreSQL's built-in exception framework (ereport) to abort queries +and can call exit() to exit a orocess. CHECK_FOR_INTERRUPTS() MUST NOT be called +in places where the current function assumes that cleanup of shared resources +may be required, or where other problems with non-local exits are foreseen in the +case of ordinary usage of the function. In cases where expectations of normal +safety is suspended (critical sections and the like) there is a mechanism for +holding off interrupts, as documented where the HOLD_INTERRUPTS() macro is defined. s/orocess/process/, also this seems mighty repetitive with what was written earlier. Maybe best for the earlier text to be very minimal and just say use of CHECK_FOR_INTERRUPTS is explained below. +Because InterruptPending will always be set when QueryCancelPending and +ProcDiePending are set, checking it first is a useful operation where the number +of possible calls are very high. In cases where the number of calls are lower, +this micro-optimization may be omitted. This para seems pretty pointless. Are there any such places? regards, tom lane