LiaCastaneda opened a new pull request, #22530: URL: https://github.com/apache/datafusion/pull/22530
## Which issue does this PR close?

Related to the discussion on #21240 and https://github.com/apache/datafusion/issues/21080#issuecomment-4543527331.

- Closes #.

PR #21240 introduced `ScalarSubqueryExec` / `ScalarSubqueryExpr` to execute uncorrelated scalar subqueries during physical execution. The two communicate via shared in-process state (a `slot` in `ExecutionProps`). This breaks distributed execution, which may split the plan across a network boundary between the producer (`ScalarSubqueryExec`) and the consumer expression (`ScalarSubqueryExpr`). See [datafusion-contrib/datafusion-distributed#460](https://github.com/datafusion-contrib/datafusion-distributed/issues/460) for more details.

## What changes are included in this PR?

Adds a new optimizer config option, `datafusion.optimizer.physical_uncorrelated_scalar_subquery` (default `true`).

- When `true` (the default), behavior is unchanged from current main.
- When `false`, all scalar subqueries are rewritten to left joins by `ScalarSubqueryToJoin` and `ScalarSubqueryExec` is never constructed, restoring the previous behavior.

## Are these changes tested?

Yes. All existing tests pass, and this PR adds `uncorrelated_scalar_subquery_rewritten_when_flag_off` to cover the case where the flag is off.

## Are there any user-facing changes?

Yes, a new config option: `datafusion.optimizer.physical_uncorrelated_scalar_subquery`. It only changes how the query is executed, not its results.
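
As a rough illustration of the flag's effect described in this PR, the decision can be modeled as below. This is a minimal standalone sketch, not DataFusion code: `ScalarSubqueryStrategy`, `OptimizerOptions`, and `choose_strategy` are hypothetical names invented for the example, and it assumes that correlated scalar subqueries keep going through the join rewrite regardless of the flag.

```rust
/// Hypothetical stand-in for the planner's choice; not a DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScalarSubqueryStrategy {
    /// Evaluate the subquery during physical execution (`ScalarSubqueryExec`).
    PhysicalExec,
    /// Rewrite the subquery to a left join (`ScalarSubqueryToJoin`).
    LeftJoinRewrite,
}

/// Hypothetical model of the new option; the field mirrors the last segment of
/// `datafusion.optimizer.physical_uncorrelated_scalar_subquery`.
#[derive(Debug, Clone, Copy)]
struct OptimizerOptions {
    physical_uncorrelated_scalar_subquery: bool,
}

impl Default for OptimizerOptions {
    /// The PR states the default is `true`, preserving current behavior.
    fn default() -> Self {
        Self {
            physical_uncorrelated_scalar_subquery: true,
        }
    }
}

/// Picks a strategy for one scalar subquery.
/// Assumption: only uncorrelated subqueries are eligible for physical execution.
fn choose_strategy(opts: &OptimizerOptions, is_correlated: bool) -> ScalarSubqueryStrategy {
    if !is_correlated && opts.physical_uncorrelated_scalar_subquery {
        ScalarSubqueryStrategy::PhysicalExec
    } else {
        ScalarSubqueryStrategy::LeftJoinRewrite
    }
}
```

With the flag off, every scalar subquery in this model takes the join rewrite, so no producer/consumer pair sharing in-process state is ever created. Assuming the option is exposed like other `datafusion.optimizer.*` settings, a distributed engine would turn it off in its session configuration, for example with `SET datafusion.optimizer.physical_uncorrelated_scalar_subquery = false`.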
