askalt commented on issue #14342: URL: https://github.com/apache/datafusion/issues/14342#issuecomment-2664835375
@alamb Sorry for the delay on my side. I looked into the plan re-use question again, and here is what I noticed.

- There is also a problem with node state that is shared across partitions and initialized once per execution. For example, take a look at: https://github.com/apache/datafusion/blob/19fe44cf2f30cbdd63d4a4f52c74055163c6cc38/datafusion/physical-plan/src/joins/hash_join.rs#L340
  The join node keeps a future that collects the build side. The recursive query executor, for example, works around this by copying the physical plan of the recursive term on each iteration: https://github.com/apache/datafusion/blob/19fe44cf2f30cbdd63d4a4f52c74055163c6cc38/datafusion/physical-plan/src/recursive_query.rs#L371-L389

So there are 3 problems that need to be solved to reuse plans and make the reuse theoretically useful (physical placeholders):

1) How will `ExecutionPlan` resolve placeholders? As there is a single `params` set, we can keep it inside `TaskContext` and fill it in when execution begins. Each `*Exec` is responsible for resolving placeholders in its own physical expressions. For example:
   - We project `x + $1` from `t`.
   - `ProjectionExec`, at the `execute(self, task_ctx)` stage, takes the param values from `task_ctx` and rewrites its own expressions, so that already resolved expressions are passed to the stream.
   - Code to look at: https://github.com/tarantool/datafusion/blob/5eede25b64a333dd991f7d27f301c03993ce760d/datafusion/physical-plan/src/projection.rs#L214-L222
2) Where will metrics be stored?
3) Where will `plan state` (like the join build side future) be stored?

To answer these questions, I checked how PostgreSQL solves these problems. It has an additional tree whose nodes are called `PlanState`: https://github.com/postgres/postgres/blob/217919dd0954f54402e8d0a38cd203a740754077/src/include/nodes/execnodes.h#L1144-L1236

Each `PlanState` node keeps a pointer to the corresponding plan node (in our situation, `Arc<dyn ExecutionPlan>`) and optional runtime stats (`ExecutionPlanMetricsSet` in our situation).
We can create a similar structure that mirrors the shape of the `ExecutionPlan` tree and keeps `Arc`s to its nodes.

```
AggregateExecState  ------->  AggregateExec
        |                          |
ProjectionExecState ------->  ProjectionExec
        |                          |
       ...                        ...
```

This way we can keep `*ExecState` cheap and not think about the cost of cloning `*Exec`. When executing an `*Exec`, we first build the "state" tree and then pass pointers to the corresponding `*ExecState` nodes to the node's `execute(...)`. To collect metrics, we visit the state tree. This state tree can also hold any additional per-execution state.
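
To make the proposal more concrete, here is a minimal, std-only sketch of the idea. It is not DataFusion code: `PlanNode`, `ExecContext`, `NodeState` and `run_once` are hypothetical stand-ins for `Arc<dyn ExecutionPlan>`, `TaskContext`, `*ExecState` and the execution entry point, and rows are plain `i64` values rather than record batches. It only illustrates the three points above: placeholders are resolved from per-execution context, metrics live in the state tree, and the plan itself is never cloned or mutated.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical immutable plan node (stand-in for `Arc<dyn ExecutionPlan>`).
enum PlanNode {
    /// Leaf that produces a fixed set of rows.
    Values(Vec<i64>),
    /// Projects `x + $N` over its child; `param_index` is zero-based,
    /// so `$1` is stored as index 0.
    AddParam { param_index: usize, child: Arc<PlanNode> },
}

/// Hypothetical per-execution context (stand-in for `TaskContext`)
/// carrying the parameter values for this run.
struct ExecContext {
    params: Vec<i64>,
}

/// Hypothetical per-execution state node (stand-in for `*ExecState`).
/// It mirrors the plan shape, points at the shared plan node and owns
/// the metrics for one execution.
struct NodeState {
    plan: Arc<PlanNode>,
    output_rows: AtomicUsize,
    children: Vec<NodeState>,
}

impl NodeState {
    /// Builds a fresh state tree; only `Arc` pointers are cloned.
    fn build(plan: &Arc<PlanNode>) -> NodeState {
        let children = match &**plan {
            PlanNode::Values(_) => Vec::new(),
            PlanNode::AddParam { child, .. } => vec![NodeState::build(child)],
        };
        NodeState {
            plan: Arc::clone(plan),
            output_rows: AtomicUsize::new(0),
            children,
        }
    }

    /// Executes the node, resolving placeholders from `ctx` and
    /// recording metrics in the state tree instead of in the plan.
    fn execute(&self, ctx: &ExecContext) -> Result<Vec<i64>, String> {
        let rows: Vec<i64> = match &*self.plan {
            PlanNode::Values(rows) => rows.clone(),
            PlanNode::AddParam { param_index, .. } => {
                let value = ctx
                    .params
                    .get(*param_index)
                    .copied()
                    .ok_or_else(|| format!("missing parameter ${}", param_index + 1))?;
                let input = self.children[0].execute(ctx)?;
                input.into_iter().map(|x| x + value).collect()
            }
        };
        self.output_rows.fetch_add(rows.len(), Ordering::Relaxed);
        Ok(rows)
    }

    /// Collects metrics by visiting the state tree.
    fn total_output_rows(&self) -> usize {
        let own = self.output_rows.load(Ordering::Relaxed);
        own + self
            .children
            .iter()
            .map(NodeState::total_output_rows)
            .sum::<usize>()
    }
}

/// Runs the shared plan once with the given parameters and returns
/// the output rows plus the summed `output_rows` metric.
fn run_once(plan: &Arc<PlanNode>, params: Vec<i64>) -> Result<(Vec<i64>, usize), String> {
    let state = NodeState::build(plan);
    let rows = state.execute(&ExecContext { params })?;
    Ok((rows, state.total_output_rows()))
}
```

With a plan for `x + $1` over the rows `1, 2, 3`, calling `run_once(&plan, vec![10])` and then `run_once(&plan, vec![100])` reuses the same `Arc<PlanNode>` for both runs, while each run gets its own metrics because the state tree is rebuilt per execution. A join build-side future could live in the state node the same way, although that part is not modelled here.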
