On Thu, Oct 4, 2018 at 7:37 PM Thomas Munro <thomas.mu...@enterprisedb.com> wrote: > I still think the current situation is non-ideal. I don't have a > strong view on whether some or all system-wide processes should say > hello and goodbye explicitly in the log, but I do think we need a way > to make that not look like an error condition, and ideally without > special cases for known built-in processes. > > I looked into this a bit today, while debugging an extension that runs > a background worker. Here are some assorted approaches to shutdown: > > 0. The default SIGTERM handler for bgworkers is bgworker_die(), which > generates a FATAL ereport "terminating background worker \"%s\" due to > administrator command", directly in the signal handler (hmm, see also > recent discussions about the legality of similar code in quickdie()).
Yeah, I think that code is bad news, for the same reasons discussed with regard to quickdie(). > 1. Some processes install their own custom SIGTERM handler that sets > a flag, and use that to break out of their main loop and exit quietly. > Example: autovacuum.c, or the open-source pg_cron extension, and > probably many other things that just want a nice quiet shutdown. > > 2. Some processes install the standard SIGTERM handler die(), and > then use CHECK_FOR_INTERRUPTS() to break out of their main loop. By > default this looks like "FATAL: terminating connection due to > administrator command". This approach is effectively required if the > main loop will reach other code that has a CHECK_FOR_INTERRUPTS() wait > loop. For example, a "launcher"-type process that calls > WaitForBackgroundWorkerStartup() could hang forever if it used > approach 1 (ask me how I know). My experience with background workers has been that approach #1 does not really work. I mean, if you aren't doing anything complicated it might be OK, if for example you are executing SQL queries, and you try to do #1, then your SQL queries don't respond to interrupts. And that sucks. So I have generally adopted approach #2. To put that another way, nearly everything can reach CHECK_FOR_INTERRUPTS(), so saying that this is required whenever that in the case isn't much different from saying that it is required, full stop. > 3. Some system processes generally use approach 2, but have a special > case in ProcessInterrupts() to suppress or alter the usual FATAL > message or behaviour. This includes the logical replication launcher. Maybe we could replace this by a general-purpose hook. So instead of the various tests for process types that are there now, we would just have if (procdie_hook != NULL) (*procdie_hook)(); And that hook can do whatever you like (but probably including dying). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company