On Wed, Oct 10, 2018 at 7:29 AM Robert Haas <robertmh...@gmail.com> wrote: > On Thu, Oct 4, 2018 at 7:37 PM Thomas Munro > <thomas.mu...@enterprisedb.com> wrote: > > 0. The default SIGTERM handler for bgworkers is bgworker_die(), which > > generates a FATAL ereport "terminating background worker \"%s\" due to > > administrator command", directly in the signal handler (hmm, see also > > recent discussions about the legality of similar code in quickdie()). > > Yeah, I think that code is bad news, for the same reasons discussed > with regard to quickdie().
So I suppose we should just remove it, with something like 0002. I'm a bit uneasy about existing code out there that might be not calling CFI. OTOH I suspect that a lot of code copied worker_spi.c and installed its own handler. > > 1. Some processes install their own custom SIGTERM handler that sets > > a flag, and use that to break out of their main loop and exit quietly. > > Example: autovacuum.c, or the open-source pg_cron extension, and > > probably many other things that just want a nice quiet shutdown. > > > > 2. Some processes install the standard SIGTERM handler die(), and > > then use CHECK_FOR_INTERRUPTS() to break out of their main loop. By > > default this looks like "FATAL: terminating connection due to > > administrator command". This approach is effectively required if the > > main loop will reach other code that has a CHECK_FOR_INTERRUPTS() wait > > loop. For example, a "launcher"-type process that calls > > WaitForBackgroundWorkerStartup() could hang forever if it used > > approach 1 (ask me how I know). > > My experience with background workers has been that approach #1 does > not really work. I mean, if you aren't doing anything complicated it > might be OK, if for example you are executing SQL queries, and you try > to do #1, then your SQL queries don't respond to interrupts. And that > sucks. So I have generally adopted approach #2. Maybe src/test/modules/worker_spi.c shouldn't use approach #1 (even if it might technically be OK for that code)? I think I might have been guilty of copying that. > To put that another way, nearly everything can reach > CHECK_FOR_INTERRUPTS(), so saying that this is required whenever that > in the case isn't much different from saying that it is required, full > stop. > > 3. Some system processes generally use approach 2, but have a special > > case in ProcessInterrupts() to suppress or alter the usual FATAL > > message or behaviour. This includes the logical replication launcher. > > Maybe we could replace this by a general-purpose hook. So instead of > the various tests for process types that are there now, we would just > have > > if (procdie_hook != NULL) > (*procdie_hook)(); > > And that hook can do whatever you like (but probably including dying). Ok, patch 0001 is a sketch like that, for discussion. -- Thomas Munro http://www.enterprisedb.com
0001-Add-proc_die_hook-to-customize-die-interrupt-handlin.patch
Description: Binary data
0002-Remove-async-signal-unsafe-bgworker_die-function.patch
Description: Binary data