wutiangan opened a new issue #4223:
URL: https://github.com/apache/incubator-doris/issues/4223


   ## Hierarchy of query planner layer
   
   The hierarchy of Good Query Planner Layer is as follows:
   
   - Analysis phase: identify the namespace and  check the meta data; build the 
queryblock, generate the hierarchical relationship of queryblock and logic 
PlanNode tree.
   
   - Single node planner: rewrite based on rules and cost model (include 
condition push down, materialized view selection, projection selection, join 
reorder); generate  physical plan node tree.
   
   - Distributed planner
   
      generate distributed  physical plan fragment  tree(include generate 
TupleId  / SlotId , bind the Exprs to SlotId,  generate distributed aggregation 
operator and join operator, and determine the join type of join Node (shuffler 
join or broadcast join)).
   
   but in fact, Excessive coupling between analysis and optimization occur in 
Doris. SQL rewriting and join reorder should be completed in the single node 
planner stage, and the binding of TupleId  and  distributed computing of 
aggregation should be completed in the distributed execution planner stage. 
However, the above work is currently coupled in the analysis phase of Doris.
   
   ## Problems
   
   1、Adding rules is difficult
   
   In the stage of Single Node Plan, the optimizer of Doris does not abstract 
the framework that can add various rules and uses the cost estimation model to 
guide the execution plan. Therefore, in the process of supporting TPCDS SQL, 
Doris exposed that adding rules was troublesome, which brought a lot of 
maintenance costs, and the added rules could only partially solve the problem. 
For example, having subquery cannot support multiple subqueries
   
   The current rule of Doris can only execute rules in serial mode. A good 
optimizer can dynamically adjust the order of rules, such as whether you 
execute project or filter first.
   
   2、The performance of multi table join is poor
   
   When there are multiple join tables, Doris has some problems, such as 
improper selection of table join order and improper selection of join type( 
broadcast join or shuffle join), resulting in very slow query speed or unable 
to return data. The current join model of Doris adopts the way of left deep 
tree, which is not the optimal join in fact.
   
   3、difficult to support more materialized views
   
   For support materialized views, we need to rewrite the SQL in the analysis 
phase.  Now  we only rewrite the select item. If the filter and join operators 
are supported in the future, it will be more difficult to rewrite. Moreover, 
after rewriting SQL, the projection made in the single node planner stage will 
lead to errors in selecting materialized views . We need a framework like kylin 
that can support materialized views.
   
   ## Solutions
   
   In the face of the above problems, we need to reactor Doris optimizer to 
deal with complex SQL scenarios, and the new optimizer will be easier to add 
various optimization rules.  The new optimizer will provide us with the ability 
of ad hoc and better query performance.
   
   - 1、Calcite
   
   Calcite is a relatively mature scheme. Many OLAP databases are in use,such 
as druid, kylin, hive. However, it is difficult to modify the framework to 
adapt to our query engine. It is difficult for us to modify the general 
framework and give full play to the best query ability. In addition, Calcite's 
own search process is time-consuming.
   
   - 2、Self-developed optimizer
   
   Optimizer is the core of OLAP database, such as Tidb and spark are 
self-developed optimizer. If self-developed, we need to implement the rule 
framework and search algorithm of the optimizer, which is a time-consuming and 
high-tech work. In the long run, self-developed optimizer is our core 
competitiveness.
   
   In the short term, it is easier to achieve results by using Calcite. In the 
long run, self-developed optimizer is a better choice, which is worth our 
investment.
   
   ## The scheme of Introducing Calcite
   
   Calcite is introduced as a jar package and will not invade our code too much.
   
   - First Stage:       
   
   ```
     Parse(Doris) -> Parser/Analyze/optimzer(Calcite) -> 
RelNodeToPlanNodeConverter(Doris) -> DistributedPlan(Doris)
   ```
   
   In the first stage, the QueryStmt is parsed by Doris syntax analysis, and 
then the Doris optimizer or Calcite optimizer can be determined according to 
the type of QueryStmt and the session Variables. After Calcite generates 
RelNode,  convert the  RelNode into Doris' PlanNode, and then execute the 
distributed execution planner phase of Doris.
   
   - Second Stage:
   
   ```
    Parse/Analyze(Doris) -> QueryBlock(Doris) --> 
QueryBlockToRelNodeConverter(Doris) --> optimzer(Calcite) --> 
DistributedPlan(Doris)
   ```
   
   The difference between the second stage and the first stage is that in the 
second stage, Doris 's own parser is used, and then the AST is generated into 
query block, which is then convert to Calcite's RelNode. The advantage is that 
Doris can use its own dialect, we just use the optimizer framework of Calcite.  
             
   
   ## Key process
   
   ### 1、Doris Syntax Parse
   
   - Using Doris' syntax Parser to generate QueryStmt
   - If "enable_calcite"  Session Variable is true and  the QueryStmt is a 
Select statement,  and We use Calcite as the optimizer.
   - Otherwise we use default Doris optimizer.
   
   ### 2、Calcite Parser/Analyze/optimizer
   
   Using Calcite to go through the stages of Syntax Parse, semantic analysis 
and optimization, at last  the single physical plan node(RelNode) tree is 
generated.
   
   ### 3、Single Node Planner Converter
   
   Traverse the optimal RelNode tree generated by Calcite and convert it into 
Doris single node plan tree.
   
   In this step, you need to generate the tuple descriptor and bind the 
Exprs/conjuncts. Also the output column schema for Mysql will be generated.
   
   ### 4、Distribute Planner
   
   Distribute planner mainly transforms a single node plan tree into a 
distributed execution plan(segments in Doris).
   
   Some of Doris' s previous optimizations in the distribute planner stage will 
be disabled, such as colocation join which  should be completed in the single 
node planner phase. In addition, we need to make distributed transformation of 
AggregateNode and UnionNode.
   
   ## Main work
   
   1、Add custom rule
   
   The rules of Calcite are limited. Calcite only provides a framework, and 
users need to implement most of the custom rules.
   
   2、Construction of TupleDescriptor and  bind the Exprs to slotId(heavy 
workload).
   
   3、Implementation of each node (heavy workload)
   
   4、Meta data interaction between Doris and Calcite.
   
   5、Statistical information collection and interaction.
   
   6、QueryBlock.
   
   7、Convert AST  to RelNode.
   
   8、Selection of join method based cost model(mainly choose shuffler join or 
broadcast join), which should be completed in Distribute planner.
   
   9、 Planner compatibility: for example, the intersect/except Node of Doris 
are implement  to separate PlanNode, while Calcite convert intersect/except to 
JoinNode. In the complex subquery, we introduce AssertNode, but Calcite 
completes the corresponding work through Predicate of JoinNode.
   
   10、Compatibility of functions and dialects.
   
   ## Compatibility
   
   1、The Calcite  optimizer is introduced as a jar package. we only adds code, 
and the original code logic remains unchanged
   
   2、If semantic analysis fails, go back to the default optimizer of Doris.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to