Hi, In case of smart shutdown postmaster receives SIGTERM from the pg_ctl, it "disallows new connections, but lets existing sessions end their work normally". Which means that it doesn't abort any ongoing txns in any of the sessions and it lets the sessions to exit(on their own) and then the postmaster is shut down. Looks like this behavior is true only if the sessions are executing non-parallel queries. Parallel queries are getting aborted, see [1].
Although the postmaster receives two different signals for smart(SIGTERM) and fast(SIGINT) shutdowns, it only sends SIGTERM to bgworkers for both the cases. (see pmdie() -> SignalSomeChildren() in postmaster.c). In StartBackgroundWorker(), bgworkers have the bgworker_die() as default handler for SIGTERM, which just reports FATAL error. Is this handler correct for both fast and smart shutdown for all types of bgworkers? For parallel workers in ParallelWorkerMain(), SIGTERM handler gets changed to die()(which means for both smart and fast shutdowns, the same handler gets used), which sets ProcDiePending = true; and later if the parallel workers try to CHECK_FOR_INTERRUPTS(); (for parallel seq scan, it is done in ExecScanFetch()), since ProcDiePending was set to true, the parallel workers throw error "terminating connection due to administrator command" in ProcessInterrupts(). Having die() as a handler for fast shutdown may be correct, but for smart shutdown, as mentioned in $subject, it looks inconsistent. 1. In general, do we need to allow postmaster to send different signals to bgworkers for fast and smart shutdowns and let them differentiate the two modes(if needed)? 2. Do we need to handle smart shutdown for dynamic bgworkers(parallel workers) with a different signal than SIGTERM and retain SIGTERM handler die() as is, since SIGTERM is being sent to bgworkers from other places as well? If we do so, we can block that new signal until the parallel workers finish the current query execution or completely ignore it in ParallelWorkerMain(). If the bgw_flags flag is BGWORKER_CLASS_PARALLEL, we can do some changes in postmaster's SignalSomeChildren() to detect and send that new signal. Looks like SIGUSR2 remains currently ignored for dynamic bgworker, and can be used for this purpose. Thoughts? Thanks @vignesh C for inputs and writeup review. [1] (smart shutdown issued with pg_ctl, while the parallel query is running). postgres=# EXPLAIN ANALYZE SELECT COUNT(*) FROM t_test t1, t_test t2 WHERE t1.many = t2.many; ERROR: terminating connection due to administrator command CONTEXT: parallel worker [2] The usual behavior of: smart shutdown - lets existing sessions end their work normally and fast shutdown - abort their current transactions and exit promptly. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com