skyzh commented on PR #14595: URL: https://github.com/apache/datafusion/pull/14595#issuecomment-2661674964
And ready for review again :)

After trying to understand what's happening in `decorrelate.rs`, I think we need a new code path to support the variety of logical plans produced by lateral joins. The key is that the decorrelation code should be aware of the joins while decorrelating, instead of first gathering the information and then generating a join at the top.

If we are going towards making the optimizer able to unnest any subquery and lateral join, then we will likely have a meta rule that recursively applies the following rules top-down:

* Convert join operators into LogicalApply if the right side of the join contains an outer column reference. This can also be done in the SQL->logical phase when we encounter a lateral join. For correlated filter predicates and scalar subqueries (exists/in), we can also convert them into the apply operator in the future.
* Have a set of rules like: push down apply->join, push down apply->aggregation, push down apply->filter, etc.
* Apply these rules top-down until no outer column reference is left in the plan tree. We can either use the Hyper unnesting rules (we implemented them in CMU-DB's [optd](https://github.com/cmu-db/optd-original/blob/main/optd-datafusion-repr/src/rules/subquery/depjoin_pushdown.rs) optimizer) or the SQL Server unnesting rules (which we've implemented in [risinglight](https://github.com/risinglightdb/risinglight/blob/f12ea232a502b1dbda37ddaa3e98c3b8d1e6439b/src/planner/rules/plan.rs#L204-L280)).

This meta unnesting rule is more powerful than what we have right now (decorrelate predicate subquery + scalar subquery unnesting rule), and we can eventually replace these two rules with it.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
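To make the meta rule from the comment above a bit more concrete, here is a minimal sketch of the top-down rewrite on a toy plan type. Everything in it is hypothetical and is not DataFusion's API: `Plan`, `has_outer_ref`, and `unnest` are made-up names, a correlated predicate is modelled as a plain `correlated` flag, and only inner (cross) apply semantics with a single level of correlation are assumed. Only the apply->filter rule is shown; apply->join and apply->aggregation are left out.

```rust
// Toy logical plan; `correlated` marks a predicate that references outer columns.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(&'static str),
    Filter { input: Box<Plan>, pred: &'static str, correlated: bool },
    Join { left: Box<Plan>, right: Box<Plan> },
    Apply { left: Box<Plan>, right: Box<Plan> },
}
use Plan::*;

// True if any predicate in the subtree still references outer columns.
fn has_outer_ref(p: &Plan) -> bool {
    match p {
        Scan(_) => false,
        Filter { input, correlated, .. } => *correlated || has_outer_ref(input),
        Join { left, right } | Apply { left, right } => {
            has_outer_ref(left) || has_outer_ref(right)
        }
    }
}

// Top-down rewrite (assumes inner apply semantics, one level of correlation).
fn unnest(plan: Plan) -> Plan {
    match plan {
        // Rule 1: a join whose right side is correlated becomes an apply.
        Join { left, right } if has_outer_ref(&right) => unnest(Apply { left, right }),
        // Rule 2: an apply over an uncorrelated right side is a plain join.
        Apply { left, right } if !has_outer_ref(&right) => unnest(Join { left, right }),
        // Rule 3: pull a filter above the apply; above the apply the left
        // side's columns are in scope, so the predicate is no longer "outer".
        Apply { left, right } => match *right {
            Filter { input, pred, .. } => Filter {
                input: Box::new(unnest(Apply { left, right: input })),
                pred,
                correlated: false,
            },
            // No rule in this sketch for other operators: keep the apply.
            other => Apply { left: Box::new(unnest(*left)), right: Box::new(other) },
        },
        // Uncorrelated operators: just recurse into the children.
        Join { left, right } => Join {
            left: Box::new(unnest(*left)),
            right: Box::new(unnest(*right)),
        },
        Filter { input, pred, correlated } => Filter {
            input: Box::new(unnest(*input)),
            pred,
            correlated,
        },
        scan @ Scan(_) => scan,
    }
}
```

For example, a lateral join of `t1` with `Filter(t2.a = t1.a)` over `t2` first becomes an apply, the filter is then pulled above it, and the remaining apply over a bare scan turns back into a join, leaving a filter on top of a plain join of `t1` and `t2`. Left lateral joins and aggregations need extra care that this sketch does not attempt.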
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org