On Thu, Aug 13, 2020 at 3:32 AM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote:
> After a smart shutdown is issued (with pg_ctl), run a parallel query,
> then the query hangs. The postmaster doesn't inform backends about the
> smart shutdown (see pmdie() -> SIGTERM -> BACKEND_TYPE_NORMAL are not
> informed), so if they request parallel workers, the postmaster is
> unable to fork any workers as its status (pmState) gets changed to
> PM_WAIT_BACKENDS (see maybe_start_bgworkers() -->
> bgworker_should_start_now() returns false).
>
> Few ways we could solve this:
> 1. Do we want to disallow parallelism when there is a pending smart
> shutdown? - If yes, then, we can let the postmaster notify the regular
> backends whenever a smart shutdown is received and the backends use
> this info to not consider parallelism. If we use SIGTERM to notify,
> since the backends have die() as handlers, they just cancel the
> queries which is again an inconsistent behaviour[1]. Could any other
> signal like SIGUSR2 (I think it's currently ignored by backends) be
> used here? If the signals are overloaded, can we multiplex SIGTERM
> similar to SIGUSR1? If we don't want to use signals at all, the
> postmaster can make an entry of its status in bg worker shared memory
> i.e. BackgroundWorkerData, RegisterDynamicBackgroundWorker() can
> simply return, without requesting the postmaster for parallel workers.
>
> 2. If we want to allow parallelism, then, we can tweak
> bgworker_should_start_now(), detect that the pending bg worker fork
> requests are for parallelism, and let the postmaster start the
> workers.
>
> Thoughts?

Hello Bharath,

Yeah, the current situation is not good. I think your option 2 sounds better, because the documented behaviour of smart shutdown is that it "lets existing sessions end their work normally". I think that means that a query that is already running or allowed to start should be able to start new workers and not have its existing workers terminated. Arseny Sher wrote a couple of different patches to try that last year, but they fell through the cracks:

https://www.postgresql.org/message-id/flat/CA%2BhUKGLrJij0BuFtHsMHT4QnLP54Z3S6vGVBCWR8A49%2BNzctCw%40mail.gmail.com
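
To make option 2 a bit more concrete, here is a rough standalone sketch of the decision it implies. This is not the postmaster code and not any of the earlier patches: the enums are simplified stand-ins for the real state, and WorkerRequest, its is_parallel_worker flag and worker_should_start_now() are hypothetical names made up for illustration. The only point is that while waiting for backends during a smart shutdown, requests made on behalf of a parallel query would still be honoured, while other background workers would not be started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for postmaster state; NOT the real definitions. */
typedef enum { PM_RUN, PM_WAIT_BACKENDS, PM_SHUTDOWN } PMState;
typedef enum { NO_SHUTDOWN, SMART_SHUTDOWN, FAST_SHUTDOWN } ShutdownMode;

/* Hypothetical description of a pending background worker request. */
typedef struct
{
    bool is_parallel_worker;    /* assumed flag: requested by a parallel query */
} WorkerRequest;

/*
 * Hypothetical analogue of the start-now check: decide whether a pending
 * worker request may be forked in the given state.
 */
bool
worker_should_start_now(PMState state, ShutdownMode mode,
                        const WorkerRequest *req)
{
    assert(req != NULL);

    /* Normal operation: every request may start. */
    if (state == PM_RUN)
        return true;

    /*
     * Smart shutdown, still waiting for sessions to finish: let running
     * queries obtain parallel workers, but start nothing else.
     */
    if (state == PM_WAIT_BACKENDS && mode == SMART_SHUTDOWN)
        return req->is_parallel_worker;

    /* Fast shutdown or later shutdown phases: start nothing. */
    return false;
}
```

With a rule like this, a parallel worker request arriving in PM_WAIT_BACKENDS under a smart shutdown would be started, whereas the same request under a fast shutdown would still be refused.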