adriangb commented on code in PR #16732:
URL: https://github.com/apache/datafusion/pull/16732#discussion_r2196297976


##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -555,15 +552,12 @@ impl ExecutionPlan for FilterExec {
             };
             Some(Arc::new(new) as _)
         };
-        // Mark all parent filters as supported since we absorbed them
-        let supported_filters = child_pushdown_result
-            .parent_filters
-            .into_iter()
-            .map(|f| PredicateSupport::Supported(f.into_inner()))
-            .collect();
 
         Ok(FilterPushdownPropagation {
-            filters: supported_filters,
+            filters: vec![
+                PredicateSupportDiscriminant::Supported;
+                child_pushdown_result.parent_filters.len()
+            ],

Review Comment:
   I think this is now nicer to make 😄 



##########
datafusion/physical-plan/src/filter_pushdown.rs:
##########
@@ -112,21 +209,16 @@ impl PredicateSupport {
 /// [`ExecutionPlan::handle_child_pushdown_result`]: 
crate::ExecutionPlan::handle_child_pushdown_result
 #[derive(Debug, Clone)]
 pub struct ChildPushdownResult {
-    /// The combined result of pushing down each parent filter into each child.
-    /// For example, given the fitlers `[a, b]` and children `[1, 2, 3]` the 
matrix of responses:
-    ///
-    // | filter | child 1     | child 2   | child 3   | result      |
-    // |--------|-------------|-----------|-----------|-------------|
-    // | a      | Supported   | Supported | Supported | Supported   |
-    // | b      | Unsupported | Supported | Supported | Unsupported |
-    ///
-    /// That is: if any child marks a filter as unsupported or if the filter 
was not pushed
-    /// down into any child then the result is unsupported.
-    /// If at least one children and all children that received the filter 
mark it as supported
-    /// then the result is supported.
-    pub parent_filters: Vec<PredicateSupport>,
+    /// The parent filters that were pushed down as received by the current 
node when 
[`ExecutionPlan::gather_filters_for_pushdown`](crate::ExecutionPlan::handle_child_pushdown_result)
 was called.
+    /// Note that this may *not* be the same as the filters that were passed 
to the children as the current node may have modified them
+    /// (e.g. by reassigning column indices) when it returned them from 
[`ExecutionPlan::gather_filters_for_pushdown`](crate::ExecutionPlan::handle_child_pushdown_result)
 in a [`FilterDescription`].
+    /// Attached to each filter is a [`PredicateSupportDiscriminant`] *per 
child* that indicates whether the filter was supported or unsupported by each 
child.
+    /// To get combined results see [`ChildFitlerPushdownResult::any`] and 
[`ChildFitlerPushdownResult::all`].
+    pub parent_filters: Vec<ChildFitlerPushdownResult>,

Review Comment:
   I think part of the reason I made that assumption in the original design is 
was that I didn't want a `Vec<Vec<PredicateSupport>>`, in part because you 
could technically have different lengths or different filters. This new 
structure makes that much more sound.



##########
datafusion/physical-optimizer/src/filter_pushdown.rs:
##########
@@ -434,70 +435,81 @@ impl PhysicalOptimizerRule for FilterPushdown {
     }
 }
 
-/// Support state of each predicate for the children of the node.
-/// These predicates are coming from the parent node.
-#[derive(Debug, Clone, Copy, PartialEq, Eq)]
-enum ParentPredicateStates {
-    NoChildren,
-    Unsupported,
-    Supported,
-}

Review Comment:
   I was able to get rid of this ugly helper enum



##########
datafusion/physical-plan/src/repartition/mod.rs:
##########
@@ -821,9 +821,7 @@ impl ExecutionPlan for RepartitionExec {
         child_pushdown_result: ChildPushdownResult,
         _config: &ConfigOptions,
     ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>> {
-        Ok(FilterPushdownPropagation::transparent(
-            child_pushdown_result,
-        ))
+        Ok(FilterPushdownPropagation::all(child_pushdown_result))

Review Comment:
   Note that `all` is much more explicit and clear than `transparent`. We could 
even name it something more verbose?



##########
datafusion/physical-plan/src/filter_pushdown.rs:
##########
@@ -112,21 +209,16 @@ impl PredicateSupport {
 /// [`ExecutionPlan::handle_child_pushdown_result`]: 
crate::ExecutionPlan::handle_child_pushdown_result
 #[derive(Debug, Clone)]
 pub struct ChildPushdownResult {
-    /// The combined result of pushing down each parent filter into each child.
-    /// For example, given the fitlers `[a, b]` and children `[1, 2, 3]` the 
matrix of responses:
-    ///
-    // | filter | child 1     | child 2   | child 3   | result      |
-    // |--------|-------------|-----------|-----------|-------------|
-    // | a      | Supported   | Supported | Supported | Supported   |
-    // | b      | Unsupported | Supported | Supported | Unsupported |
-    ///

Review Comment:
   This is the assumption that was fundamentally flawed



##########
datafusion/physical-plan/src/coalesce_batches.rs:
##########
@@ -243,9 +243,7 @@ impl ExecutionPlan for CoalesceBatchesExec {
         child_pushdown_result: ChildPushdownResult,
         _config: &ConfigOptions,
     ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>> {
-        Ok(FilterPushdownPropagation::transparent(
-            child_pushdown_result,
-        ))
+        Ok(FilterPushdownPropagation::all(child_pushdown_result))

Review Comment:
   For nodes with 1 child `all` and `any` are equivalent, but I think it's 
safer to pick `all` by default.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to