HappenLee opened a new issue #6238:
URL: https://github.com/apache/incubator-doris/issues/6238


   ## Motivation
   At present, the underlying storage in Doris is column storage. But query 
execution needs to be transferred to the query layer for execution by 
row-to-column first. Such an implementation maybe cause the performance problem。
   
   * 1. Column-to-row loss.
   * 2. Can not get better CPU performance without vectorized execution.
   
   Currently, vectorized execution has been commonly adopted in mainstream MPP 
databases, which has a significant effect on improving CPU utilization. In this 
paper, we investigate the knowledge of vectorization and give a detailed design 
for implementing vectorization on columnar storage.
   
   ## What is vectorization execution
   
   Organizing data in this batched, columnar fashion is the primary 
prerequisite for using SIMD CPU instructions, which operate on a vector of data 
at a time. Using SIMD instructions is an eventual goal. 
   
   Now suppose there is a table **People**, the data content is as follows:
   |Id|Name|Age|
   |:----: |:----: |:----: |
   |101|Ivan|22|
   115|Peggy|37
   114|Victor|45
   113|Eve|25
   112|Walter|19
   109|Trudy|31
   108|Bob|27
   105|Zoe|29
   104|Charlie|42
   102|Alice|35
   
   Execute the following query on the table:`SELECT Id, Name, Age, (Age - 30) * 
50 AS Bonus FROM People WHERE Age > 30`
   
   #### Volcano model
   
![image.png](https://upload-images.jianshu.io/upload_images/8552201-675e7939d434aaab.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   
   For traditional relational databases, the way it returns data is on a 
per-row basis. The problems are: one function call per line, which interrupts 
CPU flow and is not conducive to branch prediction; high instruction and data 
cache misses; unfriendly compiler, not conducive to loop expansion, and not 
conducive to using CPU acceleration instructions such as SIMD.
   
   #### Vectorization Execution Engine
   
   
![image.png](https://upload-images.jianshu.io/upload_images/8552201-74329d0444b6d2f9.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   
   But for columnar databases, you can actually do it in the form of columns, 
which can reduce the overall cache failure rate. This is equivalent to a 
vectorized execution to achieve such an effect, which is a unique optimization 
technique for columnar databases to reduce CPU consumption and improve the 
overall CPU utilization.
   
   ##### It can bring us the following benefits:
   * 1. Reduced dummy function calls for branch prediction. 
   
   * 2. Reduces interpretation overhead by getting a batch of results at a time.
   
   * 3. Easy for compiler to do loop pipelining and SIMD optimization.
       
   * 4. Friendly to CPU L1 and L2 Cache.
   
   ## How Vectorization Execution Engine For Doris
   
   ### The key idea and challenge
   
   * Design a new memory structure to replace the original `RowBatch` and 
`Tuple` structure
   * Rewrite all operators to support vectorization/columnar computation
   
   **ClickHouse is an excellent implementation of the vectorized execution 
engine database, so here we have borrowed a lot from its excellent 
implementation in terms of data structure and function implementation. We are 
based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community 
and developers.**
   
   ### 1. Data Structure
   
   We have implemented all types natively supported by Doris based on 
**Clickhouse's implementation of Column and Block** including:
   
   * tinyint
   * smallint
   * int
   * bigint
   * largeint
   * boolean
   * float
   * double
   * decimal
   * date
   * datetime
   * HLL
   * bitmap
   
   ### 2. Function and AggregateFunction
   
   We have implemented more than 80% of Doris's native support functions based 
on Clickhouse's implementation of Function and AggregateFunction interfaces, 
and have completed a large number of SIMD support.
   
   And for a more efficient SIMD optimization, the type of NULLABLE for each 
function is identified as follows.
   
   
![image.png](https://upload-images.jianshu.io/upload_images/8552201-1535c8756d14b447.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   
   The specific function implementation support can be found in the following 
link:[HERE](https://github.com/doris-vectorized/doris-vectorized/issues/49)
   
   ### 3. Operator And Execution System Integration
   
   ##### Operator
   
   We already implement 
   
   * VSortNode
   * VCrossJoinNode
   * VAggregateNode
   * VOlapScanNode
   
   After the data is read from the storage layer, it is transformed from 
`RowCursor` to `Block` structures, i.e., from row storage to column storage. 
Starting from **VOlapScanNode**, the data is organized in the form of columns.
   
   
![image.png](https://upload-images.jianshu.io/upload_images/8552201-d90cfbfcaae32426.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   
   
   ##### Execution System Integration
   
   The current execution logic of the vectorized execution engine is 
independent of the current Doris execution engine and is self-contained, which 
makes it very difficult to implement in practice.
   
   * **MemTracker**
   * **NewProfile**
   * **DCHCK Replace Exception**
   
   ### 4. Vectorization Execution Engine Query Plan 
   
   For Enable the Vectorization Execution Engine, set the session variable 
`enable_vectorized_engine=true`. The Query Plan is same as origin Doris, **only 
the ExecNode will change to VExecNode**:
   
   
![image.png](https://upload-images.jianshu.io/upload_images/8552201-0eb367cf840b4a5e.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   
   
   
   #### 5. How to decide a query should use Vectorization Execution Engine
   
   **Without CrossJoin, the vectorized execution engine usually outperforms 
Doris' original execution engine.**
   
   So when you have a large wide table analysis needs, you can use the 
vectorized execution engine. For example, the following analysis scenario of 
[SSB-FLAT](https://clickhouse.tech/docs/zh/getting-started/example-datasets/star-schema/).
   
   
   #### 6. Others
   
   * Add a session variable `enable_vectorized_engine`, the default value is 
`true`.
   * From the above, it is clear that the larger the Batch Size at execution, 
the more pronounced the vectorization effect. Our test results show that the 
vectorization execution engine works best with `Batch Size = 4096`.
   
   ## The limit for current vectorization execution engine
   
   * 1. Do not support **VHashJoinNode**, WIP。
   * 2. Do not support **VAnalyticEvalNode**, WIP。
   * 3. Cross Join is slower than Origin Execution Execute, Because here are 
many memcopy and filter data is Inefficient.
   * 4. Now, the filter condition in VOlapScanNode push down to the Storage 
Engine do not delete in VSCanner, may cause some performance issues.
   * 5. **After we have implemented the vectorized query at the query layer, we 
find that the storage layer becomes the main bottleneck for the query.** We 
should replace the original `RowCursor and RowBlock` structures with the 
`Column and Block` structures. This is a difficult but challenging task and 
everyone is welcome to participate.
   
   ## Origin VS Vectorization Execution
   
   #### Test Data:
   
   SSB 60kw lineorder table
   
   #### Cluser Info:
   
   1 BEs, BE is Physical Machine and has 48CPU, 96GMEM.
   
   #### Test Result
   
   Here we only show the result of  `count(*), min() group by, sum(distinct)`。
   
   
![image.png](https://upload-images.jianshu.io/upload_images/8552201-0a81dd5487ff75c6.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to