On Tue, 20 Feb 2024 at 05:26, Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:
> I doubt CreatePlanContext is a great way to achieve this. For one, it
> breaks the long-standing custom that PlannerInfo is the first parameter,
> usually followed by RelOptInfo, etc. CreatePlanContext is added to some
> functions (but not all), which makes it ... unpredictable.
I suggested this to Melih as a way to do this based on what Andy wrote
in [1]. I agree with Andy that it's not great to add est_calls to
every function in createplan.c. I feel that CreatePlanContext is a way
to take the hit of adding a parameter once and hopefully never having
to do it again to this degree. I wondered if PlannerInfo could be a
field in CreatePlanContext (roughly as in the sketch further down).

> FWIW it's not clear to me if/how this solves the problem with early
> create_plan() calls for subplans. Or is it still the same?
>
> Wouldn't it be simpler to just build the plan as we do now, and then
> have an expression_tree_walker that walks the complete plan top-down,
> inspects the nodes, enables JIT where appropriate and so on? That can
> have arbitrary context, no problem with that.

Why walk the entire plan tree again to do something we could do when
building it in the first place? Recursively walking trees isn't great
from a performance point of view. (The PS below sketches roughly what
such a walk would need to do.) It would be nice to avoid this if we
can find some other way to handle subplans.

I do have a few other reasons up my sleeve that subplan creation
should be delayed until later, so maybe we should fix that to unblock
those issues.

> Considering we decide JIT pretty late anyway (long after costing and
> other stuff that might affect the plan selection), the result should
> be exactly the same, without the extensive createplan.c disruption ...
>
> (usual caveat: I haven't tried, maybe there's something that means
> this can't work)

It's not like we can look at the top node's cost as a pre-check to
skip the recursive step for cheap plans, as it's perfectly valid for a
node closer to the root of the plan tree to have a lower total cost
than that node's subnodes, e.g. LIMIT 1.

> > - The number of estimated calls are especially useful where a node
> > is expected to be rescanned, such as Gather, Gather Merge, Memoize
> > and Nested Loop. Knowing the estimated number of calls for a node
> > allows us to rely on total cost multiplied by the estimated calls
> > instead of only total cost for the node.
>
> Not sure I follow. Why would be these nodes (Gather, Gather Merge, ...)
> more likely to be rescanned compared to other nodes?

I think Melih is listing nodes that can change the est_calls. Any node
can be rescanned, but only a subset of nodes can adjust the number of
times they call their subnode vs how many times they themselves are
called.

> > - For each node, the planner considers if the node should be JITed.
> > If the cost of the node * the number of estimated calls is greater
> > than jit_above_cost, it's decided to be JIT compiled. Note that this
> > changes the meaning of jit_above_cost, it's now a threshold for a
> > single plan node and not the whole query. Additionally, this change
> > in JIT consideration is only for JIT compilations. Inlining and
> > optimizations continue to be for the whole query and based on the
> > overall cost of the query.
>
> It's not clear to me why JIT compilation is decided for each node,
> while the inlining/optimization is decided for the plan as a whole.
> I'm not familiar with the JIT stuff, so maybe it's obvious to others ...

This is a problem with LLVM, IIRC. The problem is that it's a decision
which has to be made for an entire compilation unit and can't be made
at the expression level. This is pretty annoying as it makes it hard
to decide on the best logic to use to enable optimisations and
inlining :-(

I think the best thing we could discuss right now is whether this is
the best way to fix the JIT costing problem.
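To make that easier to discuss, here's roughly the shape of the idea
in C. To be clear, this is my simplified illustration rather than code
from the patch; the struct layout and should_jit_node() are names I've
made up for this email:

#include "postgres.h"

#include "jit/jit.h"			/* for jit_above_cost */
#include "nodes/pathnodes.h"	/* for PlannerInfo */
#include "nodes/plannodes.h"	/* for Plan */

/*
 * Illustrative only: a context like this could carry PlannerInfo plus
 * the estimated call count down through createplan.c, so the
 * parameter only has to be added once.
 */
typedef struct CreatePlanContext
{
	PlannerInfo *root;			/* planner state as a field rather than
								 * a separate parameter */
	double		est_calls;		/* estimated # of times this node is
								 * executed, adjusted by rescanning
								 * nodes on the way down */
} CreatePlanContext;

/*
 * Per-node decision: scale the node's total cost by the estimated
 * number of calls and compare against jit_above_cost.
 */
static bool
should_jit_node(CreatePlanContext *context, Plan *plan)
{
	return plan->total_cost * context->est_calls > jit_above_cost;
}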
In [2] I did link to a complaint about the JIT costings. See [3]. The
OP there wanted to keep the plan private, but I did get to see it and
described the problem on the list.

Also, I don't happen to think the decision to JIT per plan node is
perfect, as a node's total cost can be high for reasons other than the
cost of evaluating expressions. Also, the number of times a given
expression is evaluated can vary quite a bit based on where the
expression is evaluated. For example, consider a foreign table scan
that does most of the filtering remotely but has a non-shippable expr
that needs to be evaluated locally. The foreign scan might be very
expensive, especially if lots of the filtering is done remotely by a
Seq Scan, yet not many rows might make it back to the local server to
benefit from JITting the non-shippable expression. A counter-example
is the join condition of a non-parameterized nested loop. Those get
evaluated n_outer_rows * n_inner_rows times.

I think the savings JIT gives us on evaluation of expressions are
going to be more closely tied to the number of times an expression is
evaluated than to the total cost of the node. However, it's likely
more complex for optimisations and inlining, as I imagine the size and
complexity of the comparison function matter too.

It would be good for us all to agree on exactly how we're going to fix
this problem before Melih gets in too deep fixing the finer details of
the patch. If anyone isn't convinced there's a problem with the JIT
costings, I can see if I can dig up other threads where this is being
complained about.

Does anyone want to disagree with the general idea of making the
compilation decision based on the total cost of the node? Or have a
better idea?

David

[1] https://postgr.es/m/CAKU4AWqqSAi%2B-1ZaFawY300WknH79J9dhx%3DpU5%2BbyAbShjUjCQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAApHDvpQJqLrNOSi8P1JLM8YE2C%2BksKFpSdZg%3Dq6sTbtQ-v%3Daw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/7736c40e-6db5-4e7a-8fe3-4b2ab8e22...@elevated-dev.com
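PS: to show why I don't think the "walk the finished plan" idea is
quite as simple as it sounds, here's a rough sketch of what such a
walk would need to do. Again, this is illustrative only;
jit_decide_walker() is made up for this email, and a real version
would also have to chase the SubPlans and initPlans hanging off each
node's expressions, which plain lefttree/righttree recursion doesn't
reach:

#include "postgres.h"

#include "jit/jit.h"			/* for jit_above_cost */
#include "nodes/plannodes.h"	/* for Plan, NestLoop */

static void
jit_decide_walker(Plan *plan, double est_calls)
{
	if (plan == NULL)
		return;

	if (plan->total_cost * est_calls > jit_above_cost)
	{
		/* mark this node's expressions for JIT compilation here */
	}

	/*
	 * Nodes such as Nested Loop, Memoize and Gather call their
	 * subnodes a different number of times than they are themselves
	 * called, so est_calls must be adjusted per node type on the way
	 * down.  Only Nested Loop is shown here as an example.
	 */
	switch (nodeTag(plan))
	{
		case T_NestLoop:
			jit_decide_walker(plan->lefttree, est_calls);
			/* the inner side is rescanned once per outer row */
			jit_decide_walker(plan->righttree,
							  est_calls * plan->lefttree->plan_rows);
			break;
		default:
			jit_decide_walker(plan->lefttree, est_calls);
			jit_decide_walker(plan->righttree, est_calls);
			break;
	}
}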