Acfboy commented on issue #18816:
URL: https://github.com/apache/datafusion/issues/18816#issuecomment-3949811476

   Hi @niebayes,
   
   After tracing the logical plan through each optimizer pass using your 
reproduction example, I found that the issues you observed are caused by the 
`OptimizeProjections` rule not being applied to your custom node. As a result, 
column `b` is never eliminated, and the redundant projections generated by the 
alternating rewrites of CSE and `FilterPushdown` are never cleaned up.
   
   The reason `OptimizeProjections` is bypassed is that `DummyPlan` does not 
implement the `necessary_children_exprs` method. In DataFusion, the default 
implementation of this method returns `None`, which signals to the optimizer 
that it cannot determine which columns are required from the child. As a 
result, the optimizer takes a conservative approach and skips projection 
pruning for that node:
   
   
https://github.com/apache/datafusion/blob/b6d46a63824f003117297848d8d83b659ac2e759/datafusion/optimizer/src/optimize_projections/mod.rs#L331-L337
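
   To make the fallback concrete, here is a self-contained toy model of that 
   behavior. It does not use DataFusion's actual types: `ToyNode`, `OpaqueNode`, 
   `FirstColumnNode`, and `columns_to_keep` are hypothetical names invented for 
   illustration, and only the method name and signature mirror the real trait.

   ```rust
   /// Toy stand-in for a user-defined logical node (hypothetical, not DataFusion's trait).
   trait ToyNode {
       /// Mirrors the shape of `necessary_children_exprs`: for each child, the
       /// indices of the child columns needed to produce `output_columns`.
       /// The default `None` means "unknown".
       fn necessary_children_exprs(&self, _output_columns: &[usize]) -> Option<Vec<Vec<usize>>> {
           None
       }
   }

   /// Relies on the default, like `DummyPlan` in the reproduction.
   struct OpaqueNode;
   impl ToyNode for OpaqueNode {}

   /// Declares that it only needs column 0 of its single child.
   struct FirstColumnNode;
   impl ToyNode for FirstColumnNode {
       fn necessary_children_exprs(&self, _output_columns: &[usize]) -> Option<Vec<Vec<usize>>> {
           Some(vec![vec![0]])
       }
   }

   /// Hypothetical pruning step: decide which columns of the single child to keep.
   /// On `None` it conservatively keeps every child column.
   fn columns_to_keep(node: &dyn ToyNode, output_columns: &[usize], child_width: usize) -> Vec<usize> {
       match node.necessary_children_exprs(output_columns) {
           Some(per_child) => per_child.into_iter().next().unwrap_or_default(),
           None => (0..child_width).collect(),
       }
   }
   ```

   With a two-column child, `columns_to_keep(&OpaqueNode, &[0], 2)` keeps both 
   columns, while `columns_to_keep(&FirstColumnNode, &[0], 2)` keeps only column 0.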
   
   If you provide an implementation for your custom node, such as:
   ```rust
   fn necessary_children_exprs(
       &self,
       _output_columns: &[usize],
   ) -> Option<Vec<Vec<usize>>> {
       // One inner Vec per child: keep only column index 0 of the single child.
       Some(vec![vec![0]])
   }
   ```
   the optimizer will produce the expected result.
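
   The hard-coded `vec![0]` fits the reproduction. For a node whose output 
   columns map one-to-one onto its single child's columns, a less brittle 
   variant might forward the requested indices instead. The sketch below is 
   standalone Rust with a hypothetical `PassThroughNode`; in real code the 
   method would live in the `UserDefinedLogicalNode` impl, and the one-to-one 
   mapping is an assumption about the node, not something DataFusion guarantees.

   ```rust
   /// Hypothetical node whose output column `i` is exactly child column `i`.
   struct PassThroughNode;

   impl PassThroughNode {
       /// Same signature as the trait method; written as an inherent method
       /// here so the sketch compiles without DataFusion.
       fn necessary_children_exprs(&self, output_columns: &[usize]) -> Option<Vec<Vec<usize>>> {
           // Single child: it must supply exactly the columns requested of this node.
           Some(vec![output_columns.to_vec()])
       }
   }
   ```

   If the node also reads child columns internally (for example in its own 
   expressions), those indices would need to be added to the returned list.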
   
   Therefore, this appears to be an implementation requirement for 
`UserDefinedLogicalNode` rather than a bug in DataFusion itself.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


