Acfboy commented on issue #18816: URL: https://github.com/apache/datafusion/issues/18816#issuecomment-3949811476
Hi @niebayes, after tracing the logical plan through each optimizer pass using your reproduction example, I found that the issues you observed are caused by the `OptimizeProjections` rule not being applied to your custom node. This prevents column `b` from being eliminated and fails to clean up the redundant projections generated by the alternating rewrites of CSE and `FilterPushdown`.

The reason `OptimizeProjections` is bypassed is that `DummyPlan` does not implement the `necessary_children_exprs` method. In DataFusion, the default implementation of this method returns `None`, which signals to the optimizer that it cannot determine which columns are required from the child. As a result, the optimizer takes a conservative approach and skips projection pruning for that node:

https://github.com/apache/datafusion/blob/b6d46a63824f003117297848d8d83b659ac2e759/datafusion/optimizer/src/optimize_projections/mod.rs#L331-L337

If you provide an implementation for your custom node, such as:

```rust
fn necessary_children_exprs(
    &self,
    _output_columns: &[usize],
) -> Option<Vec<Vec<usize>>> {
    // One inner Vec per child: only column 0 of the single child is required.
    Some(vec![vec![0]])
}
```

the optimizer will produce the expected result. Therefore, this appears to be an implementation requirement for `UserDefinedLogicalNode` rather than a bug in DataFusion itself.
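To make the `None` versus `Some` contract concrete, here is a minimal toy model of the conservative fallback described above. It is a sketch under stated assumptions, not DataFusion's code: `ToyNode`, `OpaqueNode`, `PrunableNode`, and `kept_child_columns` are hypothetical names, and the real rule does considerably more than this. Only the shape of `necessary_children_exprs` is borrowed from the comment.

```rust
// Toy model only: every type and function here is a hypothetical stand-in,
// not part of DataFusion's API.

/// Stand-in for a user-defined logical node with a single child.
trait ToyNode {
    /// Number of columns the child produces.
    fn child_width(&self) -> usize;

    /// Same shape as the method discussed above: for the requested output
    /// columns, return the needed child column indices (one Vec per child).
    /// The default returns `None`, meaning "requirements unknown".
    fn necessary_children_exprs(&self, _output_columns: &[usize]) -> Option<Vec<Vec<usize>>> {
        None
    }
}

/// Relies on the default, like a node that does not override the method.
struct OpaqueNode;

impl ToyNode for OpaqueNode {
    fn child_width(&self) -> usize {
        2
    }
}

/// Overrides the method and declares that only child column 0 is needed.
struct PrunableNode;

impl ToyNode for PrunableNode {
    fn child_width(&self) -> usize {
        2
    }

    fn necessary_children_exprs(&self, _output_columns: &[usize]) -> Option<Vec<Vec<usize>>> {
        Some(vec![vec![0]])
    }
}

/// Child columns a toy pruning pass would keep for the node's only child.
/// `None` forces the conservative choice: keep every child column.
fn kept_child_columns(node: &dyn ToyNode, output_columns: &[usize]) -> Vec<usize> {
    match node.necessary_children_exprs(output_columns) {
        Some(per_child) => per_child.into_iter().next().unwrap_or_default(),
        None => (0..node.child_width()).collect(),
    }
}

fn main() {
    // No override: nothing can be pruned, so both child columns survive.
    println!("{:?}", kept_child_columns(&OpaqueNode, &[0]));
    // With the override, only child column 0 is kept.
    println!("{:?}", kept_child_columns(&PrunableNode, &[0]));
}
```

Note that returning a fixed `vec![0]` ignores `_output_columns`; that matches the snippet in the comment, but a node whose required inputs depend on the requested outputs would presumably need to compute the indices from that argument.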
