On Wed, Aug 2, 2017 at 11:12 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Wed, Jul 12, 2017 at 7:08 PM, Amit Kapila <amit.kapil...@gmail.com>
> wrote:
>>
>> On Wed, Jul 12, 2017 at 11:20 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> > On Tue, Jul 11, 2017 at 10:25 PM, Amit Kapila <amit.kapil...@gmail.com>
>> > wrote:
>> >>
>> >> On Wed, Jul 12, 2017 at 1:50 AM, Jeff Janes <jeff.ja...@gmail.com>
>> >> wrote:
>> >> > On Mon, Jul 10, 2017 at 9:51 PM, Dilip Kumar <dilipbal...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> So because of this high projection cost the seqpath and parallel
>> >> >> path both have fuzzily same cost but seqpath is winning because
>> >> >> it's parallel safe.
>> >> >
>> >> > I think you are correct.  However, unless parallel_tuple_cost is
>> >> > set very low, apply_projection_to_path never gets called with the
>> >> > Gather path as an argument.  It gets ruled out at some earlier
>> >> > stage, presumably because it assumes the projection step cannot
>> >> > make it win if it is already behind by enough.
>> >> >
>> >>
>> >> I think that is genuine because tuple communication cost is very high.
>> >
>> > Sorry, I don't know which you think is genuine, the early pruning or my
>> > complaint about the early pruning.
>> >
>>
>> Early pruning.  See, currently, we don't have a way to maintain both
>> parallel and non-parallel paths till later stage and then decide which
>> one is better. If we want to maintain both parallel and non-parallel
>> paths, it can increase planning cost substantially in the case of
>> joins.  Now, surely it can have benefit in many cases, so it is a
>> worthwhile direction to pursue.
>
> If I understand it correctly, we have a way, it just can lead to exponential
> explosion problem, so we are afraid to use it, correct?  If I just
> lobotomize the path domination code (make pathnode.c line 466 always test
> false)
>
> if (JJ_all_paths==0 && costcmp != COSTS_DIFFERENT)
>
> Then it keeps the parallel plan and later chooses to use it (after applying
> your other patch in this thread) as the overall best plan.  It even doesn't
> slow down "make installcheck-parallel" by very much, which I guess just
> means the regression tests don't have a lot of complex joins.
>
> But what is an acceptable solution?  Is there a heuristic for when retaining
> a parallel path could be helpful, the same way there is for fast-start
> paths?  It seems like the best thing would be to include the evaluation
> costs in the first place at this step.
>
> Why is the path-cost domination code run before the cost of the function
> evaluation is included?

Because the function evaluation is part of the target list, and we create
the path target only after the base paths have been created (see the call
to create_pathtarget @ planner.c:1696).

> Is that because the information needed to compute
> it is not available at that point,

Right.

I see two ways to include the cost of the target list for parallel
paths before rejecting them:

(a) Don't reject parallel paths (Gather/GatherMerge) during add_path.
This has the danger of path explosion.

(b) In the case of parallel paths, somehow try to identify that the
path has a costly target list (maybe just check if the target list has
anything other than vars) and use that as a heuristic to decide whether
a parallel path can be retained.

I think the preference will be to do something along the lines of
approach (b), but I am not sure whether we can easily do that.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
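
To make approach (b) a little more concrete, here is a minimal standalone
sketch of the kind of check it describes. It is a toy model only: the
types ToyExprKind, ToyTargetEntry and ToyPath and the functions
toy_tlist_is_costly and toy_keep_path are hypothetical names invented for
illustration, not PostgreSQL structures or planner functions, and the
real add_path logic compares far more than a single total cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for planner structures (not PostgreSQL types). */
typedef enum { TOY_VAR, TOY_FUNC_CALL } ToyExprKind;

typedef struct {
    ToyExprKind kind;       /* plain column reference or something costlier */
} ToyTargetEntry;

typedef struct {
    double total_cost;      /* cost estimated before the target list is applied */
    bool   is_parallel;     /* true for a Gather-style path */
} ToyPath;

/* Heuristic from (b): the target list has anything other than plain vars. */
bool
toy_tlist_is_costly(const ToyTargetEntry *tlist, size_t n)
{
    assert(tlist != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (tlist[i].kind != TOY_VAR)
            return true;
    }
    return false;
}

/*
 * Decide whether a newly proposed path survives pruning against the
 * current cheapest path.  A cheaper path is always kept.  A costlier
 * parallel path is kept only when the target list looks costly, because
 * that is the case where the later projection step could still change
 * the ranking.
 */
bool
toy_keep_path(const ToyPath *new_path, const ToyPath *cheapest,
              const ToyTargetEntry *tlist, size_t n)
{
    if (new_path->total_cost < cheapest->total_cost)
        return true;
    return new_path->is_parallel && toy_tlist_is_costly(tlist, n);
}
```

With a target list of plain vars, a Gather-style path that is already
more expensive is pruned as it is today; with a function call in the
target list, the same path is retained for the later comparison. This
keeps the extra paths limited to queries where the projection cost could
matter, which is the point of preferring (b) over (a).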