On Thu, Sep 17, 2015 at 1:40 AM, Robert Haas <robertmh...@gmail.com> wrote: > > On Thu, Sep 10, 2015 at 12:12 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > >> 2. I think it's probably a good idea - at least for now, and maybe > >> forever - to avoid nesting parallel plans inside of other parallel > >> plans. It's hard to imagine that being a win in a case like this, and > >> it certainly adds a lot more cases to think about. > > > > I also think that avoiding nested parallel plans is a good step forward. > > Doing that as a part of the assess parallel safety patch was trivial, so I did. >
As per my understanding, what you have done there will not prohibit such cases. + * For now, we don't try to use parallel mode if we're running inside + * a parallel worker. We might eventually be able to relax this + * restriction, but for now it seems best not to have parallel workers + * trying to create their own parallel workers. + */ + glob->parallelModeOK = (cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 && + IsUnderPostmaster && dynamic_shared_memory_type != DSM_IMPL_NONE && + parse->commandType == CMD_SELECT && !parse->hasModifyingCTE && + parse->utilityStmt == NULL && !IsParallelWorker() && + !contain_parallel_unsafe((Node *) parse); IIUC, your are referring to !IsParallelWorker() check in above code. If yes, then I think it won't work because we generate the plan in master backend, parallel worker will never exercise this code. I have tested it as well with below example and it still generates SubPlan as Funnel. CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM generate_series(1, 10000000) g; CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM generate_series(1, 1000000) g; set parallel_seqscan_degree=2; set cpu_tuple_comm_cost=0.01; explain select * from t1 where c1 not in (select c1 from t2 where c2 = 'xxxx'); QUERY PLAN -------------------------------------------------------------------------------- ---- Funnel on t1 (cost=11536.88..126809.17 rows=3432492 width=36) Filter: (NOT (hashed SubPlan 1)) Number of Workers: 2 -> Partial Seq Scan on t1 (cost=11536.88..58159.32 rows=3432492 width=36) Filter: (NOT (hashed SubPlan 1)) SubPlan 1 -> Funnel on t2 (cost=0.00..11528.30 rows=3433 width=4) Filter: (c2 = 'xxxx'::text) Number of Workers: 2 -> Partial Seq Scan on t2 (cost=0.00..4662.68 rows=3433 width =4) Filter: (c2 = 'xxxx'::text) SubPlan 1 -> Funnel on t2 (cost=0.00..11528.30 rows=3433 width=4) Filter: (c2 = 'xxxx'::text) Number of Workers: 2 -> Partial Seq Scan on t2 (cost=0.00..4662.68 rows=3433 width=4) Filter: (c2 = 'xxxx'::text) (17 rows) Here the subplan is generated before the top level plan and while generation of subplan we can't predict whether it is okay to generate it as Funnel or not, because it might be that top level plan is non-Funnel. Also if such a subplan is actually an InitPlan, then we are safe (as we execute the InitPlans in master backend and then pass the result to parallel worker) even if top level plan is Funnel. I think the place where we can catch this is during the generation of Funnel path, basically we can evaluate if any nodes beneath Funnel node has 'filter' or 'targetlist' as another Funnel node, then we have two options to proceed: a. Mark such a filter or target list as non-pushable which will indicate that they need to be executed only in master backend. If we go with this option, then we have to make Funnel node capable of evaluating Filter and Targetlist which is not a big thing. b. Don't choose the current path as Funnel path. I prefer second one as that seems to be simpler as compare to first and there doesn't seem to be much benefit in going by first. Any better ideas? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com