kosiew opened a new issue, #16188: URL: https://github.com/apache/datafusion/issues/16188
### Is your feature request related to a problem or challenge?

The current filter pushdown APIs in DataFusion (`FilterPushdownPropagation`, `PredicateSupports`, etc.) have grown organically but now appear convoluted and redundant. The complex layering of abstractions makes the filter pushdown mechanism difficult to understand, maintain, and extend.

Specific issues include:

- Multiple overlapping abstraction layers (`PredicateSupport`, `PredicateSupports`, `FilterDescription`, etc.)
- Redundant helper methods with inconsistent naming patterns (`.unsupported()`, `.transparent()`, `.with_filters()`, `.with_updated_node()`, `.new_with_supported_check()`, `.collect_supported()`, `.is_all_supported()`, etc.)
- Complex mental model requiring developers to track multiple states and transformations
- Lack of clear documentation about the conceptual model and flow
- Inconsistent naming conventions (e.g., `all_supported` creates new objects while `make_supported` transforms existing ones)

These issues increase the learning curve for new contributors and make maintenance more challenging for all developers.

### Describe the solution you'd like

Redesign the filter pushdown APIs with a focus on simplicity, consistency, and clarity:

1. Reduce abstraction layers: Consolidate the multiple wrappers into fewer, more focused data structures.
2. Consistent API patterns: Use clear naming conventions:
   - `with_*` for non-mutating methods that return new objects
   - `mark_*` for transformations
   - `collect_*` for extraction methods
3. Simplified core data structures:

```rust
/// A predicate with its support status for pushdown
enum PredicateWithSupport {
    Supported(Arc<dyn PhysicalExpr>),
    Unsupported(Arc<dyn PhysicalExpr>),
}

/// Collection of predicates with clearly defined operations
struct Predicates {
    // Core operations that are intuitive to use
    // ...
}

/// Clear result type for pushdown operations
struct FilterPushdownResult<T> {
    pushed_predicates: Vec<Arc<dyn PhysicalExpr>>,
    retained_predicates: Vec<Arc<dyn PhysicalExpr>>,
    updated_plan: Option<T>,
}
```

4. More declarative approach: Let execution plan nodes declare which predicates they support rather than relying on complex negotiation.
5. Better documentation: Add clear documentation about the mental model, flow, and expected usage patterns.
6. Test coverage: Ensure robust test coverage for the new APIs to prevent regressions.

This redesign should aim to reduce cognitive load for developers while maintaining all current functionality. It should also make future extensions to the filter pushdown system more straightforward.

### Describe alternatives you've considered

_No response_

### Additional context

_No response_
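To make the proposed naming conventions concrete, here is a minimal sketch of how `with_*`, `mark_*`, and `collect_*` methods might fit together with the proposed types. It is illustrative only and rests on assumptions: the `PhysicalExpr` trait below is a local stand-in for DataFusion's trait, and `ColumnPredicate`, `with_predicate`, `mark_supported_if`, `collect_result`, and `demo` are hypothetical names, not existing DataFusion APIs.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Local stand-in for DataFusion's `PhysicalExpr` trait (assumption: the real
/// trait has many more methods; a display name is enough for this sketch).
trait PhysicalExpr: Debug {
    fn name(&self) -> String;
}

/// Hypothetical toy expression, used only for illustration.
#[derive(Debug)]
struct ColumnPredicate(&'static str);

impl PhysicalExpr for ColumnPredicate {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

type ExprRef = Arc<dyn PhysicalExpr>;

/// A predicate with its support status for pushdown.
#[derive(Debug, Clone)]
enum PredicateWithSupport {
    Supported(ExprRef),
    Unsupported(ExprRef),
}

/// Collection of predicates; every predicate starts out unsupported.
#[derive(Debug, Clone, Default)]
struct Predicates {
    inner: Vec<PredicateWithSupport>,
}

/// Result type for pushdown operations, as proposed in the issue.
#[derive(Debug)]
struct FilterPushdownResult<T> {
    pushed_predicates: Vec<ExprRef>,
    retained_predicates: Vec<ExprRef>,
    updated_plan: Option<T>,
}

impl Predicates {
    /// `with_*`: non-mutating, returns a new collection with one more predicate.
    fn with_predicate(&self, expr: ExprRef) -> Self {
        let mut inner = self.inner.clone();
        inner.push(PredicateWithSupport::Unsupported(expr));
        Self { inner }
    }

    /// `mark_*`: transformation; the node declares which predicates it supports.
    fn mark_supported_if(self, supports: impl Fn(&ExprRef) -> bool) -> Self {
        let inner = self
            .inner
            .into_iter()
            .map(|p| match p {
                PredicateWithSupport::Unsupported(e) if supports(&e) => {
                    PredicateWithSupport::Supported(e)
                }
                other => other,
            })
            .collect();
        Self { inner }
    }

    /// `collect_*`: extraction; splits predicates into pushed and retained.
    fn collect_result<T>(self, updated_plan: Option<T>) -> FilterPushdownResult<T> {
        let mut pushed_predicates = Vec::new();
        let mut retained_predicates = Vec::new();
        for p in self.inner {
            match p {
                PredicateWithSupport::Supported(e) => pushed_predicates.push(e),
                PredicateWithSupport::Unsupported(e) => retained_predicates.push(e),
            }
        }
        FilterPushdownResult { pushed_predicates, retained_predicates, updated_plan }
    }
}

/// A hypothetical node that can only absorb filters on partition columns.
fn demo() -> FilterPushdownResult<&'static str> {
    Predicates::default()
        .with_predicate(Arc::new(ColumnPredicate("part_date = '2024-01-01'")))
        .with_predicate(Arc::new(ColumnPredicate("value > 10")))
        .mark_supported_if(|e| e.name().starts_with("part_"))
        .collect_result(Some("scan with partition filter"))
}
```

In this sketch the node's only responsibility is the closure passed to `mark_supported_if`, which is one possible reading of the "declarative" approach in item 4; the actual design may look quite different.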