Re: Parallel query hangs after a smart shutdown is issued

2020-08-14 Thread Tom Lane
Arseny Sher writes: > FWIW, I've also looked through the patch and it's fine. Moderate testing > also found no issues, check-world works, bgws are started during smart > shutdown as expected. And surely this is better than the initial > shorthack of allowing only parallel workers. Thanks, apprecia

Re: Parallel query hangs after a smart shutdown is issued

2020-08-14 Thread Arseny Sher
Tom Lane writes: > Thomas Munro writes: >> On Fri, Aug 14, 2020 at 4:45 AM Tom Lane wrote: >>> After some more rethinking and testing, here's a v5 that feels >>> fairly final to me. I realized that the logic in canAcceptConnections >>> was kind of backwards: it's better to check the main pmS

Re: Parallel query hangs after a smart shutdown is issued

2020-08-14 Thread Tom Lane
Thomas Munro writes: > On Fri, Aug 14, 2020 at 4:45 AM Tom Lane wrote: >> After some more rethinking and testing, here's a v5 that feels >> fairly final to me. I realized that the logic in canAcceptConnections >> was kind of backwards: it's better to check the main pmState restrictions >> first

Re: Parallel query hangs after a smart shutdown is issued

2020-08-13 Thread Thomas Munro
On Fri, Aug 14, 2020 at 4:45 AM Tom Lane wrote: > I wrote: > > Hmmm ... maybe that should be more like > > if (smartShutState != SMART_NORMAL_USAGE && > > backend_type == BACKEND_TYPE_NORMAL) > > After some more rethinking and testing, here's a v5 that feels > fairly final to m

Re: Parallel query hangs after a smart shutdown is issued

2020-08-13 Thread Tom Lane
I wrote: > Hmmm ... maybe that should be more like > if (smartShutState != SMART_NORMAL_USAGE && > backend_type == BACKEND_TYPE_NORMAL) After some more rethinking and testing, here's a v5 that feels fairly final to me. I realized that the logic in canAcceptConnections was kind
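
The condition quoted in the message above can be read as a small standalone check. What follows is only a rough sketch of that one test, not the v5 patch or the real postmaster code: the enum definitions, the value `SMART_SHUTDOWN_REQUESTED`, the result `CAC_REJECT`, and the function name `sketch_can_accept` are all hypothetical stand-ins made up for illustration. Only the identifiers `smartShutState`, `SMART_NORMAL_USAGE`, `BACKEND_TYPE_NORMAL`, `BACKEND_TYPE_AUTOVAC` and `CAC_OK` appear in the thread itself.

```c
/* Hypothetical, simplified stand-ins; NOT the actual postmaster definitions. */
typedef enum
{
    SMART_NORMAL_USAGE,         /* name taken from the thread */
    SMART_SHUTDOWN_REQUESTED    /* assumed name, for illustration only */
} SmartShutState;

typedef enum
{
    BACKEND_TYPE_NORMAL,        /* ordinary client backend */
    BACKEND_TYPE_AUTOVAC,       /* autovacuum worker */
    BACKEND_TYPE_BGWORKER       /* background (e.g. parallel) worker */
} BackendType;

typedef enum
{
    CAC_OK,                     /* name taken from the thread */
    CAC_REJECT                  /* assumed name, for illustration only */
} CacResult;

/*
 * Sketch of the quoted test: once a smart shutdown has been requested,
 * refuse new ordinary client connections, but keep letting other
 * backend types (autovacuum, background workers) start.
 */
static CacResult
sketch_can_accept(SmartShutState smartShutState, BackendType backend_type)
{
    if (smartShutState != SMART_NORMAL_USAGE &&
        backend_type == BACKEND_TYPE_NORMAL)
        return CAC_REJECT;

    return CAC_OK;
}
```

Under these assumptions, `sketch_can_accept(SMART_SHUTDOWN_REQUESTED, BACKEND_TYPE_BGWORKER)` yields `CAC_OK`, which is the behaviour a parallel query running during a smart shutdown depends on, while a new `BACKEND_TYPE_NORMAL` connection in the same state is refused.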

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Tom Lane
Thomas Munro writes: > Makes sense. I tested this version on a primary and a replica and > verified that parallel workers launch, but I saw that autovacuum > workers still can't start without something like this: > @@ -2463,7 +2463,8 @@ canAcceptConnections(int backend_type) > * be retu

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Thomas Munro
On Thu, Aug 13, 2020 at 2:37 PM Tom Lane wrote: > I experimented with separating the shutdown-in-progress state into a > separate variable, letting us actually reduce not increase the number of > pmStates. This way, PM_RUN and other states still apply until we're > ready to pull the shutdown trig

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Tom Lane
Thomas Munro writes: > On Thu, Aug 13, 2020 at 10:21 AM Tom Lane wrote: >> Also, the state before PM_WAIT_READONLY could have been >> PM_RECOVERY or PM_STARTUP, in which case we don't really want to think >> it's like PM_HOT_STANDBY either; only the BgWorkerStart_PostmasterStart >> case should be

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Thomas Munro
On Thu, Aug 13, 2020 at 10:21 AM Tom Lane wrote: > Thomas Munro writes: > > @@ -5911,11 +5912,11 @@ bgworker_should_start_now(BgWorkerStartTime > > start_time) > > + case PM_WAIT_READONLY: > > + case PM_WAIT_CLIENTS: > > case PM_RUN: > > So the questio

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Tom Lane
Thomas Munro writes: > I think we also need: > + else if (Shutdown <= SmartShutdown && > +backend_type == BACKEND_TYPE_AUTOVAC) > + result = CAC_OK; Hm, ok. > Retesting the original complaint, I think we need: > @@ -5911,11 +5

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Thomas Munro
On Thu, Aug 13, 2020 at 8:59 AM Tom Lane wrote: > I wrote: > > Oh, excellent point! I'd not thought to look at tests of the Shutdown > > variable, but yeah, those should be <= SmartShutdown if we want autovac > > to continue to operate in this state. > > On looking closer, there's another problem

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Tom Lane
I wrote: > Oh, excellent point! I'd not thought to look at tests of the Shutdown > variable, but yeah, those should be <= SmartShutdown if we want autovac > to continue to operate in this state. On looking closer, there's another problem: setting start_autovac_launcher isn't enough to get the AV

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Tom Lane
Thomas Munro writes: > On Thu, Aug 13, 2020 at 6:00 AM Tom Lane wrote: >> One other thing I changed here was to remove PM_WAIT_READONLY from the >> set of states in which we'll allow promotion to occur or a new walreceiver >> to start. I'm not convinced that either of those behaviors aren't >> b

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Thomas Munro
On Thu, Aug 13, 2020 at 6:00 AM Tom Lane wrote: > Thomas Munro writes: > > On Thu, Aug 13, 2020 at 3:32 AM Bharath Rupireddy > > wrote: > >> After a smart shutdown is issued (with pg_ctl), run a parallel query, > >> then the query hangs. > > > Yeah, the current situation is not good. I think you

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Tom Lane
Thomas Munro writes: > On Thu, Aug 13, 2020 at 3:32 AM Bharath Rupireddy > wrote: >> After a smart shutdown is issued (with pg_ctl), run a parallel query, >> then the query hangs. > Yeah, the current situation is not good. I think your option 2 sounds > better, because the documented behaviour o

Re: Parallel query hangs after a smart shutdown is issued

2020-08-12 Thread Thomas Munro
On Thu, Aug 13, 2020 at 3:32 AM Bharath Rupireddy wrote: > After a smart shutdown is issued (with pg_ctl), run a parallel query, > then the query hangs. The postmaster doesn't inform backends about the > smart shutdown (see pmdie() -> SIGTERM -> BACKEND_TYPE_NORMAL are not > informed), so if they