On Thu, Nov 19, 2015 at 12:27 AM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Wed, Nov 18, 2015 at 7:25 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > Don't we need the startup cost incase we need to build partial paths for
> > joinpaths like mergepath?
> > Also, I think there are other cases for single relation scan where startup
> > cost can matter like when there are psuedoconstants in qualification
> > (refer cost_qual_eval_walker()) or let us say if someone has disabled
> > seq scan (disable_cost is considered as startup cost.)
>
> I'm not saying that we don't need to compute it. I'm saying we don't
> need to take it into consideration when deciding which paths have
> merit. Note that consider_statup is set this way:
>
>     rel->consider_startup = (root->tuple_fraction > 0);
>

Even when consider_startup is false, startup_cost is still used in the
cost calculation. Ignoring it may be okay for partial paths, but it
seems worth thinking about why that is okay for partial paths even
though add_path() does use it.

+ * We don't generate parameterized partial paths because they seem unlikely
+ * ever to be worthwhile. The only way we could ever use such a path is
+ * by executing a nested loop with a complete path on the outer side - thus,
+ * each worker would scan the entire outer relation - and the partial path
+ * on the inner side - thus, each worker would scan only part of the inner
+ * relation. This is silly: a parameterized path is generally going to be
+ * based on an index scan, and we can't generate a partial path for that.

Wouldn't it be useful to consider parameterized paths for plans like the
one below, where we can push the join tree to the workers, each worker
scans the complete outer relation A, and the rest of the work is divided
among the workers? (Of course there can be other ways to parallelize such
joins, but the way described also seems to be possible.)

NestLoop
-> Seq Scan on A
   Hash Join
   Join Condition: B.Y = C.W
   -> Seq Scan on B
   -> Index Scan using C_Z_IDX on C
      Index Condition: C.Z = A.X

- Is the main reason for having add_partial_path() that it has fewer
checks, or is it that the current add_path() would give wrong answers in
some cases?

If there is no case where add_path() can't work, then there is some
advantage in retaining add_path(), at least in terms of maintaining the
code.

+void
+add_partial_path(RelOptInfo *parent_rel, Path *new_path)
{
..
+ /* Unless pathkeys are incompable, keep just one of the two paths. */
..

typo - 'incompable'

> > A.
> > This means that for inheritance child relations for which rel pages are
> > less than parallel_threshold, it will always consider the cost shared
> > between 1 worker and leader as per below calc in cost_seqscan:
> > if (path->parallel_degree > 0)
> > run_cost = run_cost / (path->parallel_degree + 0.5);
> >
> > I think this might not be the appropriate cost model for even for
> > non-inheritence relations which has pages more than parallel_threshold,
> > but it seems to be even worst for inheritance children which have
> > pages less than parallel_threshold
>
> Why?

Because I think that, the way the code is written, it assumes that for
each inheritance child relation with fewer pages than the threshold, the
master backend will contribute 0.5 of a worker's share of the work, which
doesn't seem to be the right distribution. Consider a case where there
are three such children, each costing 100 to scan. They will be costed as
100/1.5 + 100/1.5 + 100/1.5, which means that for each worker it counts
0.5 of the master backend's work, which seems to be wrong.

I think that for the Append case, we should account for this cost during
Append path creation in create_append_path(). Basically, we can make
cost_seqscan() ignore the cost reduction due to parallel_degree for
inheritance relations, and then apply it during Append path creation,
counting the master backend's work unit as 0.5 with respect to the
overall work.

--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
+plan as possible. Expanding the range of cases in which more work can be
+pushed below the Gather (and costly them accurately) is likely to keep us
+busy for a long time to come.

There seems to be a typo in the above text: /costly/cost

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
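
P.S. To make the three-children arithmetic above concrete, here is a
minimal standalone sketch. It is not planner code: the function names
and the idea of passing a single degree for the whole Append are
hypothetical, used only for illustration. The only thing taken from the
patch is the divisor (parallel_degree + 0.5) from the quoted
cost_seqscan snippet. How the Append-level degree would actually be
chosen is not specified here and is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: apply the quoted cost_seqscan discount to each
 * child separately and sum the results.  With parallel_degree = 1, every
 * child is divided by 1.5, so the leader's 0.5 is counted once per child.
 */
static double
per_child_discounted_cost(const double *child_costs, size_t nchildren,
						  int parallel_degree)
{
	double		total = 0.0;
	size_t		i;

	assert(child_costs != NULL || nchildren == 0);

	for (i = 0; i < nchildren; i++)
	{
		double		run_cost = child_costs[i];

		if (parallel_degree > 0)
			run_cost = run_cost / (parallel_degree + 0.5);
		total += run_cost;
	}
	return total;
}

/*
 * Hypothetical helper: leave the children undiscounted and apply the
 * discount once at the Append level, so the leader's 0.5 is counted once
 * with respect to the overall work.  append_degree is an assumed input.
 */
static double
append_level_discounted_cost(const double *child_costs, size_t nchildren,
							 int append_degree)
{
	double		total = 0.0;
	size_t		i;

	assert(child_costs != NULL || nchildren == 0);

	for (i = 0; i < nchildren; i++)
		total += child_costs[i];

	if (append_degree > 0)
		total = total / (append_degree + 0.5);
	return total;
}
```

For three children costing 100 each, the per-child form with
parallel_degree = 1 gives 100/1.5 + 100/1.5 + 100/1.5 = 200. The
Append-level form with an assumed degree of 3 gives 300/3.5, roughly
85.7. If the same degree is passed to both, the two forms give the same
number, so the difference comes entirely from which degree is used and
from counting the leader's 0.5 only once.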