On Tue, 20 Feb 2024 at 05:26, Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:
> I doubt CreatePlanContext is a great way to achieve this. For one, it
> breaks the long-standing custom that PlannerInfo is the first parameter,
> usually followed by RelOptInfo, etc. CreatePlanContext is added to some
> functions (but not all), which makes it ... unpredictable.
I suggested this to Melih as a way to do this based on what Andy wrote
in [1]. I agree with Andy that it's not great to add est_calls to
every function in createplan.c. I feel that CreatePlanContext is a way
to take the hit of adding a parameter once and hopefully never having
to do it again to this degree. I wondered if PlannerInfo could be a
field in CreatePlanContext (roughly as in the sketch further down).

> FWIW it's not clear to me if/how this solves the problem with early
> create_plan() calls for subplans. Or is it still the same?
>
> Wouldn't it be simpler to just build the plan as we do now, and then
> have an expression_tree_walker that walks the complete plan top-down,
> inspects the nodes, enables JIT where appropriate and so on? That can
> have arbitrary context, no problem with that.

Why walk the entire plan tree again to do something we could do when
building it in the first place? Recursively walking trees isn't great
from a performance point of view. (The PS below sketches roughly what
such a walk would need to do.) It would be nice to avoid this if we
can find some other way to handle subplans.

I do have a few other reasons up my sleeve that subplan creation
should be delayed until later, so maybe we should fix that to unblock
those issues.

> Considering we decide JIT pretty late anyway (long after costing and
> other stuff that might affect the plan selection), the result should
> be exactly the same, without the extensive createplan.c disruption ...
>
> (usual caveat: I haven't tried, maybe there's something that means
> this can't work)

It's not like we can look at the top node's cost as a pre-check to
skip the recursive step for cheap plans, as it's perfectly valid for a
node closer to the root of the plan tree to have a lower total cost
than that node's subnodes, e.g. LIMIT 1.

> > - The number of estimated calls are especially useful where a node
> > is expected to be rescanned, such as Gather, Gather Merge, Memoize
> > and Nested Loop. Knowing the estimated number of calls for a node
> > allows us to rely on total cost multiplied by the estimated calls
> > instead of only total cost for the node.
>
> Not sure I follow. Why would be these nodes (Gather, Gather Merge, ...)
> more likely to be rescanned compared to other nodes?

I think Melih is listing nodes that can change the est_calls. Any node
can be rescanned, but only a subset of nodes can adjust the number of
times they call their subnode vs how many times they themselves are
called.

> > - For each node, the planner considers if the node should be JITed.
> > If the cost of the node * the number of estimated calls is greater
> > than jit_above_cost, it's decided to be JIT compiled. Note that this
> > changes the meaning of jit_above_cost, it's now a threshold for a
> > single plan node and not the whole query. Additionally, this change
> > in JIT consideration is only for JIT compilations. Inlining and
> > optimizations continue to be for the whole query and based on the
> > overall cost of the query.
>
> It's not clear to me why JIT compilation is decided for each node,
> while the inlining/optimization is decided for the plan as a whole.
> I'm not familiar with the JIT stuff, so maybe it's obvious to others ...

This is a problem with LLVM, IIRC. The problem is that it's a decision
which has to be made for an entire compilation unit and can't be made
at the expression level. This is pretty annoying as it makes it hard
to decide on the best logic to use to enable optimisations and
inlining :-(

I think the best thing we could discuss right now is whether this is
the best way to fix the JIT costing problem.
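To make that easier to discuss, here's roughly the shape of the idea
in C. To be clear, this is my simplified illustration rather than code
from the patch; the struct layout and should_jit_node() are names I've
made up for this email:

#include "postgres.h"

#include "jit/jit.h"			/* for jit_above_cost */
#include "nodes/pathnodes.h"	/* for PlannerInfo */
#include "nodes/plannodes.h"	/* for Plan */

/*
 * Illustrative only: a context like this could carry PlannerInfo plus
 * the estimated call count down through createplan.c, so the
 * parameter only has to be added once.
 */
typedef struct CreatePlanContext
{
	PlannerInfo *root;			/* planner state as a field rather than
								 * a separate parameter */
	double		est_calls;		/* estimated # of times this node is
								 * executed, adjusted by rescanning
								 * nodes on the way down */
} CreatePlanContext;

/*
 * Per-node decision: scale the node's total cost by the estimated
 * number of calls and compare against jit_above_cost.
 */
static bool
should_jit_node(CreatePlanContext *context, Plan *plan)
{
	return plan->total_cost * context->est_calls > jit_above_cost;
}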
In [2] I did link to a complaint about the JIT costings. See [3]. The
OP there wanted to keep the plan private, but I did get to see it and
described the problem on the list.

Also, I don't happen to think the decision to JIT per plan node is
perfect, as a node's total cost can be high for reasons other than the
cost of evaluating expressions. Also, the number of times a given
expression is evaluated can vary quite a bit based on where the
expression is evaluated. For example, consider a foreign table scan
that does most of the filtering remotely but has a non-shippable expr
that needs to be evaluated locally. The foreign scan might be very
expensive, especially if lots of the filtering is done remotely by a
Seq Scan, yet not many rows might make it back to the local server to
benefit from JITting the non-shippable expression. A counter-example
is the join condition of a non-parameterized nested loop. Those get
evaluated n_outer_rows * n_inner_rows times.

I think the savings JIT gives us on evaluation of expressions are
going to be more closely tied to the number of times an expression is
evaluated than to the total cost of the node. However, it's likely
more complex for optimisations and inlining, as I imagine the size and
complexity of the comparison function matter too.

It would be good for us all to agree on exactly how we're going to fix
this problem before Melih gets in too deep fixing the finer details of
the patch. If anyone isn't convinced there's a problem with the JIT
costings, I can see if I can dig up other threads where this is being
complained about.

Does anyone want to disagree with the general idea of making the
compilation decision based on the total cost of the node? Or have a
better idea?

David

[1] https://postgr.es/m/CAKU4AWqqSAi%2B-1ZaFawY300WknH79J9dhx%3DpU5%2BbyAbShjUjCQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAApHDvpQJqLrNOSi8P1JLM8YE2C%2BksKFpSdZg%3Dq6sTbtQ-v%3Daw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/7736c40e-6db5-4e7a-8fe3-4b2ab8e22...@elevated-dev.com
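PS: to show why I don't think the "walk the finished plan" idea is
quite as simple as it sounds, here's a rough sketch of what such a
walk would need to do. Again, this is illustrative only;
jit_decide_walker() is made up for this email, and a real version
would also have to chase the SubPlans and initPlans hanging off each
node's expressions, which plain lefttree/righttree recursion doesn't
reach:

#include "postgres.h"

#include "jit/jit.h"			/* for jit_above_cost */
#include "nodes/plannodes.h"	/* for Plan, NestLoop */

static void
jit_decide_walker(Plan *plan, double est_calls)
{
	if (plan == NULL)
		return;

	if (plan->total_cost * est_calls > jit_above_cost)
	{
		/* mark this node's expressions for JIT compilation here */
	}

	/*
	 * Nodes such as Nested Loop, Memoize and Gather call their
	 * subnodes a different number of times than they are themselves
	 * called, so est_calls must be adjusted per node type on the way
	 * down.  Only Nested Loop is shown here as an example.
	 */
	switch (nodeTag(plan))
	{
		case T_NestLoop:
			jit_decide_walker(plan->lefttree, est_calls);
			/* the inner side is rescanned once per outer row */
			jit_decide_walker(plan->righttree,
							  est_calls * plan->lefttree->plan_rows);
			break;
		default:
			jit_decide_walker(plan->lefttree, est_calls);
			jit_decide_walker(plan->righttree, est_calls);
			break;
	}
}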