adriangb commented on code in PR #16642:
URL: https://github.com/apache/datafusion/pull/16642#discussion_r2182655294

##########
datafusion/physical-plan/src/filter_pushdown.rs:
##########
@@ -317,24 +152,76 @@ impl<T> FilterPushdownPropagation<T> {
 }
 
 #[derive(Debug, Clone)]
-struct ChildFilterDescription {
+pub struct ChildFilterDescription {
     /// Description of which parent filters can be pushed down into this node.
     /// Since we need to transmit filter pushdown results back to this node's parent
     /// we need to track each parent filter for each child, even those that are unsupported / won't be pushed down.
     /// We do this using a [`PredicateSupport`] which simplifies manipulating supported/unsupported filters.
-    parent_filters: PredicateSupports,
+    pub(crate) parent_filters: Vec<PredicateSupport>,
     /// Description of which filters this node is pushing down to its children.
     /// Since this is not transmitted back to the parents we can have variable sized inner arrays
     /// instead of having to track supported/unsupported.
-    self_filters: Vec<Arc<dyn PhysicalExpr>>,
+    pub(crate) self_filters: Vec<Arc<dyn PhysicalExpr>>,
 }
 
 impl ChildFilterDescription {
-    fn new() -> Self {
-        Self {
-            parent_filters: PredicateSupports::new(vec![]),
-            self_filters: vec![],
+    /// Build a child filter description by analyzing which parent filters can be pushed to a specific child.
+    ///
+    /// See [`FilterDescription::from_children`] for more details
+    pub fn from_child(
+        parent_filters: Vec<Arc<dyn PhysicalExpr>>,
+        child: &Arc<dyn crate::ExecutionPlan>,
+    ) -> Result<Self> {
+        let child_schema = child.schema();
+
+        // Get column names from child schema for quick lookup
+        let child_column_names: HashSet<&str> = child_schema
+            .fields()
+            .iter()
+            .map(|f| f.name().as_str())
+            .collect();

Review Comment:
   I agree that performance of optimizer rules and planning is a concern but I think that needs to be solved at a higher level (e.g. caching of plan trees or subtrees).