huangmengbin opened a new issue #6021:
URL: https://github.com/apache/incubator-doris/issues/6021


   **Is your feature request related to a problem? Please describe.**
   - After range-partitioning a table by `time`, suppose there is a partition `p20210520` covering the range `['2021-05-20 00:00:00', '2021-05-21 00:00:00')`.
   ```sql
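   -- 1. Partition selected explicitly
   SELECT MAX(xxx) FROM xx PARTITION(p20210520);
   -- 2. Partition selected only through the WHERE predicates
   SELECT MAX(xxx) FROM xx
   WHERE `time` >= '2021-05-20 00:00:00' AND `time` < '2021-05-21 00:00:00';
   -- 3. Both the explicit partition and the WHERE predicates
   SELECT MAX(xxx) FROM xx PARTITION(p20210520)
   WHERE `time` >= '2021-05-20 00:00:00' AND `time` < '2021-05-21 00:00:00';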
   ```
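   - In theory, the three queries are logically equivalent. In practice, however, queries 2 and 3 are usually slower than query 1.
   - This is probably because the predicates in the WHERE clause are still passed to the BE, causing the BE to perform unnecessary scanning and computation.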
   
   
   **Describe the solution you'd like**
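   - Take all partitions selected by the partition pruner and compute their overall lower and upper bounds as a single range. This range is a superset of all selected partitions.
   - For multi-column partitioning, additionally compute the last columnIndex. Requirement: for every literal at or before this index, the lower and upper bounds must be equal.
   - Traverse the list of WHERE expressions and remove every expression whose column number does not exceed 1+columnIndex and whose range fully covers the range above.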
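
The steps above can be sketched for the single-column case as follows. This is only an illustration, not the code planned for the PR: `Partition`, `Predicate`, `hull`, `covers` and `remove_redundant` are hypothetical names, only the four comparison operators are handled, and the multi-column columnIndex rule is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Partition:
    """Hypothetical range partition covering [lower, upper)."""
    name: str
    lower: datetime  # inclusive
    upper: datetime  # exclusive


@dataclass(frozen=True)
class Predicate:
    """Hypothetical comparison predicate: <column> <op> <value>."""
    column: str
    op: str  # one of ">=", ">", "<", "<="
    value: Any


def hull(selected: List[Partition]) -> Tuple[datetime, datetime]:
    """Smallest [lo, hi) range containing every selected partition."""
    return min(p.lower for p in selected), max(p.upper for p in selected)


def covers(pred: Predicate, lo: datetime, hi: datetime) -> bool:
    """True if every value v with lo <= v < hi satisfies pred."""
    if pred.op == ">=":
        return pred.value <= lo
    if pred.op == ">":
        return pred.value < lo
    if pred.op in ("<", "<="):
        # For "<=" this is conservative: value >= hi is sufficient,
        # though not strictly necessary.
        return pred.value >= hi
    return False  # unknown operator: never drop the predicate


def remove_redundant(predicates: List[Predicate],
                     selected: List[Partition],
                     partition_column: str) -> List[Predicate]:
    """Drop partition-column predicates implied by the selected partitions."""
    if not selected:
        return list(predicates)
    lo, hi = hull(selected)
    return [
        p for p in predicates
        if not (p.column == partition_column and covers(p, lo, hi))
    ]


if __name__ == "__main__":
    p20210520 = Partition("p20210520",
                          datetime(2021, 5, 20), datetime(2021, 5, 21))
    where = [
        Predicate("time", ">=", datetime(2021, 5, 20)),
        Predicate("time", "<", datetime(2021, 5, 21)),
    ]
    # Both predicates cover the whole partition, so none remain.
    print(remove_redundant(where, [p20210520], "time"))
```

Because the range is a superset of the selected partitions, a predicate that covers the whole range is true for every row the scan can return, so dropping it does not change the result. A predicate that covers only part of the range, such as `time >= '2021-05-20 12:00:00'`, is kept.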
   
   
   **Additional context**
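   - The feature has already been implemented. A PR is expected next week, provided that the bug that makes the Range partition pruner imprecise is fixed first: [issue 5989](https://github.com/apache/incubator-doris/issues/5989)
   - Once the new feature is added, queries 2 and 3 are expected to match the performance of query 1.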
   

