adriangb commented on code in PR #16732: URL: https://github.com/apache/datafusion/pull/16732#discussion_r2196297976
########## datafusion/physical-plan/src/filter.rs: ########## @@ -555,15 +552,12 @@ impl ExecutionPlan for FilterExec { }; Some(Arc::new(new) as _) }; - // Mark all parent filters as supported since we absorbed them - let supported_filters = child_pushdown_result - .parent_filters - .into_iter() - .map(|f| PredicateSupport::Supported(f.into_inner())) - .collect(); Ok(FilterPushdownPropagation { - filters: supported_filters, + filters: vec![ + PredicateSupportDiscriminant::Supported; + child_pushdown_result.parent_filters.len() + ], Review Comment: I think this is now nicer to make 😄 ########## datafusion/physical-plan/src/filter_pushdown.rs: ########## @@ -112,21 +209,16 @@ impl PredicateSupport { /// [`ExecutionPlan::handle_child_pushdown_result`]: crate::ExecutionPlan::handle_child_pushdown_result #[derive(Debug, Clone)] pub struct ChildPushdownResult { - /// The combined result of pushing down each parent filter into each child. - /// For example, given the fitlers `[a, b]` and children `[1, 2, 3]` the matrix of responses: - /// - // | filter | child 1 | child 2 | child 3 | result | - // |--------|-------------|-----------|-----------|-------------| - // | a | Supported | Supported | Supported | Supported | - // | b | Unsupported | Supported | Supported | Unsupported | - /// - /// That is: if any child marks a filter as unsupported or if the filter was not pushed - /// down into any child then the result is unsupported. - /// If at least one children and all children that received the filter mark it as supported - /// then the result is supported. - pub parent_filters: Vec<PredicateSupport>, + /// The parent filters that were pushed down as received by the current node when [`ExecutionPlan::gather_filters_for_pushdown`](crate::ExecutionPlan::handle_child_pushdown_result) was called. + /// Note that this may *not* be the same as the filters that were passed to the children as the current node may have modified them + /// (e.g. by reassigning column indices) when it returned them from [`ExecutionPlan::gather_filters_for_pushdown`](crate::ExecutionPlan::handle_child_pushdown_result) in a [`FilterDescription`]. + /// Attached to each filter is a [`PredicateSupportDiscriminant`] *per child* that indicates whether the filter was supported or unsupported by each child. + /// To get combined results see [`ChildFitlerPushdownResult::any`] and [`ChildFitlerPushdownResult::all`]. + pub parent_filters: Vec<ChildFitlerPushdownResult>, Review Comment: I think part of the reason I made that assumption in the original design is was that I didn't want a `Vec<Vec<PredicateSupport>>`, in part because you could technically have different lengths or different filters. This new structure makes that much more sound. ########## datafusion/physical-optimizer/src/filter_pushdown.rs: ########## @@ -434,70 +435,81 @@ impl PhysicalOptimizerRule for FilterPushdown { } } -/// Support state of each predicate for the children of the node. -/// These predicates are coming from the parent node. -#[derive(Debug, Clone, Copy, PartialEq, Eq)] -enum ParentPredicateStates { - NoChildren, - Unsupported, - Supported, -} Review Comment: I was able to get rid of this ugly helper enum ########## datafusion/physical-plan/src/repartition/mod.rs: ########## @@ -821,9 +821,7 @@ impl ExecutionPlan for RepartitionExec { child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>> { - Ok(FilterPushdownPropagation::transparent( - child_pushdown_result, - )) + Ok(FilterPushdownPropagation::all(child_pushdown_result)) Review Comment: Note that `all` is much more explicit and clear than `transparent`. We could even name it something more verbose? ########## datafusion/physical-plan/src/filter_pushdown.rs: ########## @@ -112,21 +209,16 @@ impl PredicateSupport { /// [`ExecutionPlan::handle_child_pushdown_result`]: crate::ExecutionPlan::handle_child_pushdown_result #[derive(Debug, Clone)] pub struct ChildPushdownResult { - /// The combined result of pushing down each parent filter into each child. - /// For example, given the fitlers `[a, b]` and children `[1, 2, 3]` the matrix of responses: - /// - // | filter | child 1 | child 2 | child 3 | result | - // |--------|-------------|-----------|-----------|-------------| - // | a | Supported | Supported | Supported | Supported | - // | b | Unsupported | Supported | Supported | Unsupported | - /// Review Comment: This is the assumption that was fundamentally flawed ########## datafusion/physical-plan/src/coalesce_batches.rs: ########## @@ -243,9 +243,7 @@ impl ExecutionPlan for CoalesceBatchesExec { child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>> { - Ok(FilterPushdownPropagation::transparent( - child_pushdown_result, - )) + Ok(FilterPushdownPropagation::all(child_pushdown_result)) Review Comment: For nodes with 1 child `all` and `any` are equivalent, but I think it's safer to pick `all` by default. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org