On Mon, Mar 27, 2017 at 3:43 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> On 03/25/2017 05:18 PM, Rushabh Lathia wrote:
>>
>> On Sat, Mar 25, 2017 at 7:01 PM, Peter Eisentraut
>> <peter.eisentr...@2ndquadrant.com> wrote:
>>>
>>> On 3/25/17 09:01, David Rowley wrote:
>>>> On 25 March 2017 at 23:09, Rushabh Lathia <rushabh.lat...@gmail.com> wrote:
>>>>> Also, another point which I think we should fix: when someone sets
>>>>> max_parallel_workers = 0, we should also set
>>>>> max_parallel_workers_per_gather to zero. That way we can avoid
>>>>> generating a gather path when max_parallel_workers = 0.
>>>> I see that it was actually quite useful that it works the way it does.
>>>> If it had worked the same as max_parallel_workers_per_gather, then
>>>> likely Tomas would never have found this bug.
>>>
>>> Another problem is that the GUC system doesn't really support cases
>>> where the validity of one setting depends on the current value of
>>> another setting. So each individual setting needs to be robust against
>>> cases of related settings being nonsensical.
>>
>> Okay.
>>
>> About the original issue reported by Tomas, I did more debugging and
>> found that the problem was that gather_merge_clear_slots() was not
>> returning the cleared slot when nreaders is zero (i.e. when
>> nworkers_launched = 0). Because of that, the scan continued even after
>> all the tuples were exhausted, and then ended up crashing the server in
>> gather_merge_getnext(). In the patch I also added an Assert in
>> gather_merge_getnext() that the index should be less than nreaders + 1
>> (the leader).
>>
>> PFA a simple patch to fix the problem.
>
> I think there are two issues at play here - the first one is that we
> still produce parallel plans even with max_parallel_workers=0, and the
> second one is the crash in GatherMerge when nworkers=0.
> Your patch fixes the latter (thanks for looking into it), which is
> obviously a good thing - getting 0 workers on a busy system is quite
> possible, because all the parallel workers can already be chewing on
> some other query.

Thanks.

> But it seems a bit futile to produce the parallel plan in the first
> place, because with max_parallel_workers=0 we can't possibly get any
> parallel workers ever. I wonder why compute_parallel_worker() only
> looks at max_parallel_workers_per_gather, i.e. why shouldn't it do:
>
>     parallel_workers = Min(parallel_workers, max_parallel_workers);

I agree with you here. Producing a parallel plan when
max_parallel_workers = 0 is wrong. But rather than your suggested fix, I
think we should do something like:

    /*
     * In no case use more than max_parallel_workers_per_gather or
     * max_parallel_workers.
     */
    parallel_workers = Min(parallel_workers,
                           Min(max_parallel_workers,
                               max_parallel_workers_per_gather));

> Perhaps this was discussed and is actually intentional, though.

Yes, I am not quite sure about this.

> Regarding handling this at the GUC level - I agree with Peter that
> that's not a good idea. I suppose we could deal with checking the
> values in the GUC check/assign hooks, but what we don't have is a way
> to undo the changes in all the GUCs. That is, if I do
>
>     SET max_parallel_workers = 0;
>     SET max_parallel_workers = 16;
>
> I expect to end up with just the max_parallel_workers GUC changed and
> nothing else.
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Rushabh Lathia