jayzhan211 opened a new issue, #14618:
URL: https://github.com/apache/datafusion/issues/14618

   ### Is your feature request related to a problem or challenge?
   
   The steps in the logical layer are SQL->LogicalPlan->Analyzer->Optimizer.
   
   These five rules currently run in the `Analyzer`:
   ```rust
               Arc::new(InlineTableScan::new()),
               // Every rule that will generate [Expr::Wildcard] should be placed in front of [ExpandWildcardRule].
               Arc::new(ExpandWildcardRule::new()),
               // [Expr::Wildcard] should be expanded before [TypeCoercion]
               Arc::new(ResolveGroupingFunction::new()),
               Arc::new(TypeCoercion::new()),
               Arc::new(CountWildcardRule::new()),
   ```
   
   The Analyzer's role is unclear to me. I don't see why we need two kinds of "optimization" pass after the plan has been built. We only need one pass while the plan is being built and one after it is complete. **I claim that these rules can live either in the SQL->LogicalPlan building stage or in the optimizer.**
   
   > Comments of `Analyzer`
   ```
   /// [`AnalyzerRule`]s transform [`LogicalPlan`]s in some way to make
   /// the plan valid prior to the rest of the DataFusion optimization process.
   ///
   /// `AnalyzerRule`s are different than an [`OptimizerRule`](crate::OptimizerRule)s
   /// which must preserve the semantics of the `LogicalPlan`, while computing
   /// results in a more optimal way.
   ///
   /// For example, an `AnalyzerRule` may resolve [`Expr`](datafusion_expr::Expr)s into more specific
   /// forms such as a subquery reference, or do type coercion to ensure the types
   /// of operands are correct.
   ```
   
   If a rule MUST be executed, it should be applied during the plan creation 
stage, not after the plan is completed. However, if the rule is OPTIONAL for 
plan completion, it should be applied in the optimizer.
   
   I propose removing the concept of the Analyzer and integrating it into the 
SQL → LogicalPlan stage. Specifically, TypeCoercion should be applied before 
the plan is finalized (#14380).
   
   Before TypeCoercion can move into the builder, ExpandWildcardRule needs to be relocated first. The remaining three rules can be moved into either the builder or the optimizer.
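
To illustrate why wildcard expansion has to move before type coercion can, here is a minimal toy sketch. All of the types and functions in it (`DataType`, `Expr`, `expand_wildcard`, `coerce_to_int64`, `plan_projection`) are hypothetical stand-ins and are not DataFusion's API. The coercion pass cannot assign a type to an unexpanded `*`, so it only succeeds once expansion has run.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// `*` in a SELECT list; carries no type information by itself.
    Wildcard,
    /// A resolved column with a known type.
    Column { name: String, data_type: DataType },
    /// An explicit cast inserted by coercion.
    Cast { expr: Box<Expr>, to: DataType },
}

/// Replace every `Wildcard` with the columns of the input schema.
fn expand_wildcard(exprs: Vec<Expr>, schema: &[(String, DataType)]) -> Vec<Expr> {
    exprs
        .into_iter()
        .flat_map(|e| match e {
            Expr::Wildcard => schema
                .iter()
                .map(|(name, data_type)| Expr::Column {
                    name: name.clone(),
                    data_type: data_type.clone(),
                })
                .collect::<Vec<_>>(),
            other => vec![other],
        })
        .collect()
}

/// Widen every `Int32` column to `Int64`.
/// Fails on `Wildcard` because its type is unknown at this point.
fn coerce_to_int64(exprs: Vec<Expr>) -> Result<Vec<Expr>, String> {
    exprs
        .into_iter()
        .map(|e| match e {
            Expr::Wildcard => Err("cannot coerce an unexpanded wildcard".to_string()),
            Expr::Column { data_type: DataType::Int32, name } => Ok(Expr::Cast {
                expr: Box::new(Expr::Column { name, data_type: DataType::Int32 }),
                to: DataType::Int64,
            }),
            other => Ok(other),
        })
        .collect()
}

/// Run both passes in the required order: expansion first, coercion second.
fn plan_projection(
    exprs: Vec<Expr>,
    schema: &[(String, DataType)],
) -> Result<Vec<Expr>, String> {
    coerce_to_int64(expand_wildcard(exprs, schema))
}
```

In this sketch, calling `coerce_to_int64(vec![Expr::Wildcard])` directly returns an error, while `plan_projection` on the same input succeeds because the wildcard has already been replaced by typed columns.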
   
   ### Describe the solution you'd like
   
   ## Requirement
   Rules in the Analyzer are optional: users can choose whether to apply them, and they can add custom rules. This flexibility should be preserved, so each rule must remain optional and customizable after it is moved out of the Analyzer.
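
One possible shape for this requirement, sketched with hypothetical types only: a plan builder that owns an ordered list of rules which users can extend or opt out of. `Plan`, `BuildRule`, `Tag`, and `PlanBuilder` are invented for illustration and do not exist in DataFusion; the real design is still open in this issue.

```rust
use std::sync::Arc;

// Hypothetical types for illustration; none of these exist in DataFusion.

/// Stand-in for a logical plan: it only records which rules touched it.
#[derive(Debug, Default, Clone, PartialEq)]
struct Plan {
    applied: Vec<String>,
}

/// A rule that runs while the plan is being built.
trait BuildRule {
    fn name(&self) -> &str;
    fn apply(&self, plan: Plan) -> Plan;
}

/// Trivial rule that tags the plan with its own name.
struct Tag(&'static str);

impl BuildRule for Tag {
    fn name(&self) -> &str {
        self.0
    }

    fn apply(&self, mut plan: Plan) -> Plan {
        plan.applied.push(self.0.to_string());
        plan
    }
}

/// Builder that owns an ordered, user-editable list of rules.
struct PlanBuilder {
    rules: Vec<Arc<dyn BuildRule>>,
}

impl PlanBuilder {
    /// Default rules, ordered so that wildcard expansion precedes type coercion.
    fn new() -> Self {
        let rules: Vec<Arc<dyn BuildRule>> = vec![
            Arc::new(Tag("expand_wildcard")),
            Arc::new(Tag("type_coercion")),
        ];
        Self { rules }
    }

    /// Append a custom rule after the existing ones.
    fn with_rule(mut self, rule: Arc<dyn BuildRule>) -> Self {
        self.rules.push(rule);
        self
    }

    /// Opt out of a rule by name.
    fn without_rule(mut self, name: &str) -> Self {
        self.rules.retain(|r| r.name() != name);
        self
    }

    /// Apply the remaining rules in order, starting from an empty plan.
    fn build(&self) -> Plan {
        self.rules
            .iter()
            .fold(Plan::default(), |plan, rule| rule.apply(plan))
    }
}
```

With this sketch, `PlanBuilder::new().without_rule("type_coercion").with_rule(Arc::new(Tag("my_rule"))).build()` applies only `expand_wildcard` and `my_rule`, which is the kind of opt-out and extension the Analyzer allows today.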
   
   ## Tasks
   - [ ] Move `ExpandWildcardRule` into the SQL->LogicalPlan stage
   - [ ] #14296 We could make TypeCoercion customizable before moving it out of the Analyzer.
   - [ ] Move `TypeCoercion` into the SQL->LogicalPlan stage
   - [ ] Revisit #14380
   - [ ] Investigate the other 3 rules and remove `Analyzer` internally.
   - [ ] Find a way to migrate existing custom analyzer rules
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_



