jayzhan211 commented on PR #15281: URL: https://github.com/apache/datafusion/pull/15281#issuecomment-2769195062
``` [2025-04-01T12:19:41Z DEBUG datafusion_optimizer::utils] scalar_subquery_to_join: Projection: e.b, __scalar_sq_1.CASE WHEN max(e2.a) > Int64(10) THEN Utf8("a") ELSE Utf8("b") END AS CASE WHEN max(e2.a) > Int64(10) THEN Utf8("a") ELSE Utf8("b") END Left Join: Filter: CAST(__scalar_sq_1.b AS Int64) = CAST(e.b AS Int64) + Int64(1) SubqueryAlias: e TableScan: t SubqueryAlias: __scalar_sq_1 Projection: CASE WHEN max(e2.a) > Int32(10) THEN Utf8("a") ELSE Utf8("b") END AS CASE WHEN max(e2.a) > Int64(10) THEN Utf8("a") ELSE Utf8("b") END, e2.b Aggregate: groupBy=[[e2.b]], aggr=[[max(e2.a)]] SubqueryAlias: e2 TableScan: t2 ``` In `scalar_subquery_to_join` optimization, we have left join on `__scalar_sq_1` and `e`. I wonder could we join the plan first and then call aggregation on top of joined plan. The plan I expected is something like this ``` [2025-04-01T12:19:41Z DEBUG datafusion_optimizer::utils] scalar_subquery_to_join: Projection: CASE WHEN max(e2.a) > Int32(10) THEN Utf8("a") ELSE Utf8("b") END AS CASE WHEN max(e2.a) > Int64(10) THEN Utf8("a") ELSE Utf8("b") END, e2.b Aggregate: groupBy=[[e2.b]], aggr=[[max(e2.a)]] Left Join: Filter: CAST(e2.b AS Int64) = CAST(e.b AS Int64) + Int64(1) SubqueryAlias: e TableScan: t SubqueryAlias: e2 TableScan: t2 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org