"Li, Zheng" <zhe...@amazon.com> writes: > * I find it entirely unacceptable to stick some planner temporary > fields into struct SubLink. If you need that storage you'll have > to find some other place to put it. But in point of fact I don't > think you need it; it doesn't look to me to be critical to generate > the subquery's plan any earlier than make_subplan would have done it. > Moreover, you should really strive to *not* do that, because it's > likely to get in the way of other future optimizations. As the > existing comment in make_subplan already suggests, we might want to > delay subplan planning even further than that in future.
> The reason for calling make_subplan this early is that we want to > Call subplan_is_hashable(plan), to decide whether to proceed with the proposed > transformation. Well, you're going to have to find another way, because this one won't do. If you really need to get whacked over the head with a counterexample for this approach, consider what happens if some part of the planner decides to pass the SubLink through copyObject, expression_tree_mutator, etc in between where you've done the planning and where make_subplan looks at it. Since you haven't taught copyfuncs.c about these fields, they'll semi-accidentally wind up as NULL afterwards, meaning you lost the information anyway. (In fact, I wouldn't be surprised if that's happening already in some cases; you couldn't really tell, since make_subplan will just repeat the work.) On the other hand, you can't have copyfuncs.c copying such fields either --- we don't have copyfuncs support for PlannerInfo, and if we did, the case would end up as infinite recursion. Nor would it be particularly cool to try to fake things out by copying the pointers as scalars; that will lead to dangling pointers later on. BTW, so far as I can see, the only reason you're bothering with the whole thing is to compare the size of the subquery output with work_mem, because that's all that subplan_is_hashable does. I wonder whether that consideration is even still necessary in the wake of 1f39bce02. If it is, I wonder whether there isn't a cheaper way to figure it out. (Note similar comment in make_subplan.) Also ... > We try to stick with the fast hashed subplan when possible to avoid > potential performance degradation from the transformation which may > restrict the planner to choose Nested Loop Anti Join in order to handle > Null properly, But can't you detect that case directly? It seems like you'd need to figure out the NULL situation anyway to know whether the transformation to antijoin is valid in the first place. regards, tom lane