[GitHub] [incubator-doris] HappenLee commented on issue #7774: [Enhancement][Vectorized] Speed up column filtering via SIMD

2022-01-17 Thread GitBox


HappenLee commented on issue #7774:
URL: 
https://github.com/apache/incubator-doris/issues/7774#issuecomment-1014247052


   nice job!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yangzhg opened a new pull request #7776: [improvment] (fe) add retry at be heartbeat, avoid show be is down when be in high load

2022-01-17 Thread GitBox


yangzhg opened a new pull request #7776:
URL: https://github.com/apache/incubator-doris/pull/7776


   # Proposed changes
   
   1. add retry at be heartbeat, avoid show be is down when be in high load
   2. remove some unused code
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman closed issue #7662: [Feature] Support general hints int select stmt

2022-01-17 Thread GitBox


morningman closed issue #7662:
URL: https://github.com/apache/incubator-doris/issues/7662


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee merged pull request #7775: [Vectorized][Improvement] Speed up column filtering via SIMD

2022-01-17 Thread GitBox


HappenLee merged pull request #7775:
URL: https://github.com/apache/incubator-doris/pull/7775


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch vectorized updated (d2f2210 -> 57bdde6)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a change to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from d2f2210  [Vectorized][feature](planner)(executor) Support grouping 
sets rollup cube (#7601)
 add 57bdde6  [Vectorized][Improvement] Speed up column filtering via SIMD 
(#7775)

No new revisions were added by this update.

Summary of changes:
 be/src/vec/columns/column_decimal.cpp | 25 +
 be/src/vec/columns/column_vector.cpp  | 28 ++--
 be/src/vec/columns/columns_common.cpp | 20 +---
 be/src/vec/columns/columns_common.h   | 30 ++
 4 files changed, 74 insertions(+), 29 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee closed issue #7774: [Enhancement][Vectorized] Speed up column filtering via SIMD

2022-01-17 Thread GitBox


HappenLee closed issue #7774:
URL: https://github.com/apache/incubator-doris/issues/7774


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee opened a new issue #7777: [Vectorized][Bug] Bug of repeated node resize and compile of grouping set code

2022-01-17 Thread GitBox


HappenLee opened a new issue #:
URL: https://github.com/apache/incubator-doris/issues/


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Version
   
   vectorized
   
   ### What's Wrong?
   
   core dump
   
   ### What You Expected?
   
   normal execute
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee opened a new pull request #7778: [Vectorized][Bug] Fix bug of repeated node resize and compile failed

2022-01-17 Thread GitBox


HappenLee opened a new pull request #7778:
URL: https://github.com/apache/incubator-doris/pull/7778


   # Proposed changes
   
   Issue Number: close # 
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (No Need)
   3. Has document been added or modified: (No Need)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (Yes)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee merged pull request #7751: [vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747

2022-01-17 Thread GitBox


HappenLee merged pull request #7751:
URL: https://github.com/apache/incubator-doris/pull/7751


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch vectorized updated (57bdde6 -> 778fa8d)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a change to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from 57bdde6  [Vectorized][Improvement] Speed up column filtering via SIMD 
(#7775)
 add 778fa8d  [Vectorized](improving) (exec) optimize VDataStreamSender's 
send() performance #7747 (#7751)

No new revisions were added by this update.

Summary of changes:
 be/src/vec/columns/column.h |  4 ++
 be/src/vec/columns/column_complex.h | 10 -
 be/src/vec/columns/column_const.h   |  4 ++
 be/src/vec/columns/column_decimal.h | 10 +
 be/src/vec/columns/column_dummy.h   |  4 ++
 be/src/vec/columns/column_nullable.cpp  |  6 +++
 be/src/vec/columns/column_nullable.h|  1 +
 be/src/vec/columns/column_string.cpp|  6 +++
 be/src/vec/columns/column_string.h  |  2 +
 be/src/vec/columns/column_vector.cpp| 10 +
 be/src/vec/columns/column_vector.h  |  2 +
 be/src/vec/columns/predicate_column.h   |  6 ++-
 be/src/vec/core/block.cpp   | 13 +-
 be/src/vec/core/block.h |  2 +
 be/src/vec/sink/vdata_stream_sender.cpp | 79 ++---
 be/src/vec/sink/vdata_stream_sender.h   | 36 ++-
 16 files changed, 164 insertions(+), 31 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] zenoyang opened a new issue #7779: [Bug][Vectorized] Fix compile error and warning

2022-01-17 Thread GitBox


zenoyang opened a new issue #7779:
URL: https://github.com/apache/incubator-doris/issues/7779


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Version
   
   vectorized branch
   
   ### What's Wrong?
   
   current compile error and warning
   
   ### What You Expected?
   
   no
   
   ### How to Reproduce?
   
   fix
   
   ### Anything Else?
   
   no
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] zenoyang opened a new pull request #7780: [Vectorized](compile) Fix compile error and warning

2022-01-17 Thread GitBox


zenoyang opened a new pull request #7780:
URL: https://github.com/apache/incubator-doris/pull/7780


   # Proposed changes
   
   Issue Number: #7779 
   Fix compile error and warning
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman opened a new issue #7781: [Roadmap] Support automatic table structure transformation and pseudo data generation

2022-01-17 Thread GitBox


morningman opened a new issue #7781:
URL: https://github.com/apache/incubator-doris/issues/7781


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   _No response_
   
   ### Use case
   
   Use Case:
   
   1. Automate or guide users in converting table build statements from other 
databases to Doris table build statements.
   
   2. Support generating pseudo data based on table structure for easy testing
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman edited a comment on issue #7502: Doris Roadmap 2022

2022-01-17 Thread GitBox


morningman edited a comment on issue #7502:
URL: 
https://github.com/apache/incubator-doris/issues/7502#issuecomment-1001839293


   The following is the Roadmap for the Doris community in 2022.
   The plan includes all aspects of code features, documentation, community 
building, etc. that are to be developed, have already been developed, and have 
been completed but require ongoing optimization.
   
   > The plan is currently under discussion, so if you have comments or 
suggestions on any aspect of the plan or beyond, please feel free to leave a 
comment or send an email to d...@doris.apache.org.
   
   > We will gradually create issues or jira for each direction of the plan to 
describe and track the progress in detail. Developers who wish to contribute 
are also welcome to create issues directly and associate with them (just leave 
a comment)
   
   > The directions marked (**Good First Issue**) in the plan are more 
independent modules, which are more suitable for newbie tasks or developers who 
are new to Doris. If you are interested in the relevant direction, please 
contact us at d...@doris.apache.org or under this issue, and we will provide 
detailed guidance, help and discussion.
   
   > The directions marked with (**Q1**) are the current work to be completed 
in the first quarter of 2022. We will update the schedule and progress of other 
directions gradually.
   
   > The marked (**Done & Optimizing**) directions are the directions that are 
currently completed but need continuous optimization. Such as ease of use, 
feature additions, and documentation additions.
   
   > We encourage developers to discuss anything in the dev mailing list, to 
subscribe to the mailing list please refer to [How to 
subscribe](http://doris.incubator.apache.org/master/en/community/subscribe-mail-list.html).
   
   ## Features
   
   - [ ] #7571 
   
   - [ ] Extensible new query optimizer framework
   - [ ] Statistical information collection and utilization
   - [ ] Standard test set support and performance enhancements
   + TPC-DS feature pass rate 100%
   + TPC-H performance enhancements
   
   - [ ] #7572
   
   - [ ] Pipeline execution engine
   - [ ] Algorithm Concurrency Control and Resource Control
   
   - [ ] #7573
   
   - [ ] #7570
   - [ ] Map
   - [ ] Struct
   
   - [ ] #7574
   
   Provides Schemaless semantics for fast analysis of semi-structured data.
   
   - [ ] Json
   
   - [ ] #7575 (Q1)
   
   Supports cold data storage to object storage at partition granularity 
with remote access capabilities and local Cache acceleration.
   
   - [ ] #7503
   
   Doris' current "materialized view" is more of a "materialized index" 
concept. Doris will later implement a true Materialized View to support full 
and incremental construction of single and multi-table views. 
   
   - [ ] #7576
   
   Provide Kudu-like data update support.
   
   - [ ] #7577
   
   - [ ] WindowFunnel
   
   - [ ] #7578
   
   Support for the new UDF framework has solved the problems of high 
writing difficulty, poor isolation, and poor compatibility with existing C++ 
frameworks.
   
   - [ ] UDF
   - [ ] UDAF
   - [ ] UDTF
   
   - [ ] #7579 (**Good First Issue**)
   
   - [ ] #7552
   - [ ] #7650 
   
   - [ ] Add more resource limits
   
   - [ ] #7129 
   
   ## Performance Optimization
   
   - [ ] #7580 (Q1)
   
   - [ ] Query layer vectorization
   - [ ] Storage level vectorization
   - [ ] Vectorization function supplementation
   - [ ] Query layer storage layer arithmetic unification
   - [ ] Import Vectorization
   
   - [ ] Json Parsing Optimization (**Good First Issue**)
   
   - [ ] #7551
   
   - [ ] #7743 
   
   Optimize the performance of compaction task. And try to refactor the 
compaction logic. For example, only one replica do the compaction and sync to 
other replicas.
   
   ## Stability and Observability
   
   - [ ] #7553 (Q1)
   
   Solve the problems of inaccurate memory prediction and OOM, and improve 
memory observability by global + thread + task level memory management.
   
   - [ ] #7581
   
   Provides fine-grained IO speed limit, priority scheduling, etc. through 
global IO management.
   
   - [ ] #7582
   
   Introduces OpenTelemetry to enhance system internal state observability 
and unify monitoring data format.
   
   ## Testing
   
   - [ ] #7583
   
   - [ ] FE
   
   Refine the FE single test framework to support multi-node simulation 
testing of features.
   
   - [ ] BE
   
   Provide testing framework to simplify the difficulty of writing 
complex unit tests (e.g. data builds) for BE.
   
   - [ ] #7584
   
   Provide Case collection or submission framework for refining and 
accumulating regression test sets.
   
   - [ ] #7585
   
   Provide a Benchmark te

[GitHub] [incubator-doris] yiguolei commented on pull request #7780: [Vectorized](compile) Fix compile error and warning

2022-01-17 Thread GitBox


yiguolei commented on pull request #7780:
URL: https://github.com/apache/incubator-doris/pull/7780#issuecomment-1014330999


   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] spaces-X opened a new issue #7782: [Proposal][Feature] enable Quantile pre-aggregation

2022-01-17 Thread GitBox


spaces-X opened a new issue #7782:
URL: https://github.com/apache/incubator-doris/issues/7782


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   In current doris-0.15 or elder version, the quantile value is calculated by 
the detailed data from duplicated module, whose latency is unfriendly under the 
large scale of data.
   
   Proposed to enable `quantile pre-aggregation` to reduce query latency, 
already implemented in ClickHouse as follows.
   
   ```
   SELECT quantileState(number) AS st  -- st is a quantileState generated by 0~9
   FROM numbers(10)
   
   Query id: cbde1c1b-e20a-430d-b34a-67c9833be6af
   
   ┌─st─┐
   │
   6364136223846793005 0 123459 │
   └┘
   
   1 rows in set. Elapsed: 0.002 sec.
   
   ---
   
   SELECT quantileMerge(0.8)(st) -- use quantileMerge function to calculate 
quantile
   FROM
   (
   SELECT quantileState(number) AS st
   FROM numbers(10)
   )
   
   Query id: 1c25beb5-f6c5-4f32-a6ce-7bbd6d0429ef
   
   ┌─quantileMerge(0.8)(st)─┐
   │7.2 │
   └┘
   ```
   
   
   Referring to the existing **HLL and bitmap** implementations, the 
**intermediate state** of the quantile function can be **stored** by TDigest 
serialization in stream-load step. 
   
   The changes are roughly as follows.
   
   
   1. A new column named `quantilestate` and corresponding agg function 
`quantile_union` , `quantile_cal` are supposed to added.
  - quantile_union: add a value into quantilestate
  - quantile_cal(float: percentage):  calculate the quantile of percentage  
by quantilestate
  - to_quantile(float: value): transfer value to quantilestate
   
   2. Support for `QuantileState` in query and load step.
  
   3. Refactor  `PercentileApproxState` and `TDigest`
   
   
   
   
   
   
   
   
   
   
   ### Use case
   
   create table sql like:
   ```
   CREATE TABLE `QuantileState_Test` (
 `keys` bigint(20) NULL COMMENT "keys",
 `quantile_value` quantilestate quantile_union NOT NULL COMMENT "qualite 
calue"
   ) ENGINE=OLAP
   AGGREGATE KEY(`brand_id`, `dt`, `poi_type`)
   COMMENT "bitmap load 测试#OWNER#lihuigang"
   PARTITION BY RANGE(`dt`) (
  xxx
   )
   DISTRIBUTED BY HASH(`keys`) BUCKETS 3
   PROPERTIES (
  xxx
   );
   ```
   
   stream load cmd like:
   
   ```
   curl --location-trusted -u root -H "columns: k1, k2, 
v1=to_quantilestate(v1)" -T testData 
http://host:port/api/testDb/testTbl/_stream_load
   
   ```
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785863571



##
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##
@@ -7257,4 +7259,46 @@ public static boolean isStoredTableNamesLowerCase() {
 public static boolean isTableNamesCaseInsensitive() {
 return GlobalVariable.lowerCaseTableNames == 2;
 }
+
+public void compactTable(AdminCompactTableStmt stmt) throws DdlException {
+String dbName = stmt.getDbName();
+String tableName = stmt.getTblName();
+
+String type = stmt.getCompactionType();
+if (type == null || (!type.equals("base") && 
!type.equals("cumulative"))) {

Review comment:
   > This check should be done in analysis phase
   
   OK, thank you.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785864111



##
File path: be/src/agent/task_worker_pool.cpp
##
@@ -1650,4 +1654,73 @@ void TaskWorkerPool::_random_sleep(int second) {
 sleep(rnd.Uniform(second) + 1);
 }
 
+void TaskWorkerPool::_submit_table_compaction_worker_thread_callback() {
+while (_is_work) {
+TAgentTaskRequest agent_task_req;
+TCompactionReq compaction_req;
+
+{
+lock_guard worker_thread_lock(_worker_thread_lock);
+while (_is_work && _tasks.empty()) {
+_worker_thread_condition_variable.wait();
+}
+if (!_is_work) {
+return;
+}
+
+agent_task_req = _tasks.front();
+compaction_req = agent_task_req.compaction_req;
+_tasks.pop_front();
+}
+
+LOG(INFO) << "get compaction task. signature:" << 
agent_task_req.signature
+  << ", compaction type:" << compaction_req.type;
+
+CompactionType compaction_type;
+if (compaction_req.type == "base") {
+compaction_type = CompactionType::BASE_COMPACTION;
+} else {
+compaction_type = CompactionType::CUMULATIVE_COMPACTION;
+}
+
+TabletSharedPtr tablet_ptr = 
StorageEngine::instance()->tablet_manager()->get_tablet(
+compaction_req.tablet_id, compaction_req.schema_hash);
+if (tablet_ptr != nullptr) {
+auto data_dir = tablet_ptr->data_dir();
+if (!tablet_ptr->can_do_compaction(data_dir->path_hash(), 
compaction_type)) {
+LOG(WARNING) << "can not do compaction: " << 
tablet_ptr->tablet_id()
+ << ", compaction type: " << compaction_type;
+_remove_task_info(agent_task_req.task_type, 
agent_task_req.signature);
+continue;
+}
+
+if (compaction_type == CompactionType::BASE_COMPACTION) {

Review comment:
   > Is it necessary to check lock here?
   
   OK.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785864622



##
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##
@@ -7257,4 +7259,46 @@ public static boolean isStoredTableNamesLowerCase() {
 public static boolean isTableNamesCaseInsensitive() {
 return GlobalVariable.lowerCaseTableNames == 2;
 }
+
+public void compactTable(AdminCompactTableStmt stmt) throws DdlException {
+String dbName = stmt.getDbName();
+String tableName = stmt.getTblName();
+
+String type = stmt.getCompactionType();
+if (type == null || (!type.equals("base") && 
!type.equals("cumulative"))) {
+throw new DdlException("compaction type should be [BASE] or 
[CUMULATIVE]");
+}
+
+Database db = this.getDbOrDdlException(dbName);
+OlapTable olapTable = db.getOlapTableOrDdlException(tableName);
+
+olapTable.writeLock();

Review comment:
   > readLock is enough
   
   OK, thank you.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785865075



##
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##
@@ -7257,4 +7259,46 @@ public static boolean isStoredTableNamesLowerCase() {
 public static boolean isTableNamesCaseInsensitive() {
 return GlobalVariable.lowerCaseTableNames == 2;
 }
+
+public void compactTable(AdminCompactTableStmt stmt) throws DdlException {
+String dbName = stmt.getDbName();
+String tableName = stmt.getTblName();
+
+String type = stmt.getCompactionType();
+if (type == null || (!type.equals("base") && 
!type.equals("cumulative"))) {
+throw new DdlException("compaction type should be [BASE] or 
[CUMULATIVE]");
+}
+
+Database db = this.getDbOrDdlException(dbName);
+OlapTable olapTable = db.getOlapTableOrDdlException(tableName);
+
+olapTable.writeLock();
+try {
+AgentBatchTask batchTask = new AgentBatchTask();
+List partitionNames = stmt.getPartitions();
+LOG.info("Table compaction. database: {}, table: {}, partition: 
{}, type: {}", dbName, tableName,
+Joiner.on(", ").join(partitionNames), type);
+for (String parName : partitionNames) {
+Partition partition = olapTable.getPartition(parName);
+if (partition == null) {
+throw new DdlException("partition[" + parName + "] not 
exist in table[" + tableName + "]");
+}
+
+for (MaterializedIndex idx : 
partition.getMaterializedIndices(IndexExtState.VISIBLE)) {
+for (Tablet tablet : idx.getTablets()) {
+for (Replica replica : tablet.getReplicas()) {
+CompactionTask compactionTask = new 
CompactionTask(replica.getBackendId(), db.getId(),
+olapTable.getId(), partition.getId(), 
idx.getId(), tablet.getId(),
+
olapTable.getSchemaHashByIndexId(idx.getId()), type, 5000);
+batchTask.addTask(compactionTask);
+}
+}
+} // indices
+}
+// send task immediately
+AgentTaskExecutor.submit(batchTask);

Review comment:
   > submit task outside the lock
   
   Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785865358



##
File path: gensrc/thrift/AgentService.thrift
##
@@ -160,6 +160,13 @@ struct TCloneReq {
 10: optional i32 timeout_s;
 }
 
+struct TCompactionReq {
+1: required Types.TTabletId tablet_id

Review comment:
   > use optional for all fields
   
   Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785865665



##
File path: docs/en/sql-reference/sql-statements/Administration/ADMIN COMPACT.md
##
@@ -0,0 +1,52 @@
+---

Review comment:
   > New doc need to be added to the `docs/.vuepress/sidebar/en.js` and 
`zh-CN.js`
   
   Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually

2022-01-17 Thread GitBox


weizuo93 commented on a change in pull request #7521:
URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785866544



##
File path: be/src/olap/olap_server.cpp
##
@@ -551,4 +537,51 @@ void 
StorageEngine::_pop_tablet_from_submitted_compaction(TabletSharedPtr tablet
 }
 }
 
+Status StorageEngine::_submit_compaction_task(TabletSharedPtr tablet, 
CompactionType compaction_type) {
+bool already_exist = _push_tablet_into_submitted_compaction(tablet, 
compaction_type);
+if (already_exist) {
+return Status::InternalError(strings::Substitute(

Review comment:
   > Is `Status::AlreadyExist` more appropriate?
   
   OK, thank you.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] caiconghui commented on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode

2022-01-17 Thread GitBox


caiconghui commented on pull request #7773:
URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014419489


   > After this pr is merged, will the thrift data of different versions cannot 
be parsed during the upgrade or in the mixed version cluster?
   
   The only thing I'm not sure about is if List and binary are compatible
   but according to 
https://stackoverflow.com/questions/40886279/apache-thrift-difference-between-byte-and-binary-types
   It seems that it is ok
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] caiconghui edited a comment on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode

2022-01-17 Thread GitBox


caiconghui edited a comment on pull request #7773:
URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014419489


   > After this pr is merged, will the thrift data of different versions cannot 
be parsed during the upgrade or in the mixed version cluster?
   
   The only thing I'm not sure about is if `List` and binary are 
compatible
   but according to 
https://stackoverflow.com/questions/40886279/apache-thrift-difference-between-byte-and-binary-types
   It seems that it is ok
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7780: [Vectorized](compile) Fix compile error and warning

2022-01-17 Thread GitBox


github-actions[bot] commented on pull request #7780:
URL: https://github.com/apache/incubator-doris/pull/7780#issuecomment-1014435956






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yangzhg commented on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode

2022-01-17 Thread GitBox


yangzhg commented on pull request #7773:
URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014436369


   Have you test this in a cluster that have new and old nodes ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] caiconghui commented on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode

2022-01-17 Thread GitBox


caiconghui commented on pull request #7773:
URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014440058


   > Have you test this in a cluster that have new and old nodes ?
   
   
   
   > Have you test this in a cluster that have new and old nodes ?
   
   follower with new version,master with older version,I test forward function, 
but not test list
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] caiconghui edited a comment on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode

2022-01-17 Thread GitBox


caiconghui edited a comment on pull request #7773:
URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014440058


   > Have you test this in a cluster that have new and old nodes ?
   
   
   
   > Have you test this in a cluster that have new and old nodes ?
   
   follower with new version,master with older version,I test forward function, 
but not test `list`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] caiconghui edited a comment on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode

2022-01-17 Thread GitBox


caiconghui edited a comment on pull request #7773:
URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014440058


   > Have you test this in a cluster that have new and old nodes ?
   
   follower with new version,master with older version,I test forward function, 
but not test `list`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee merged pull request #7778: [Vectorized][Bug] Fix bug of repeated node resize and compile failed

2022-01-17 Thread GitBox


HappenLee merged pull request #7778:
URL: https://github.com/apache/incubator-doris/pull/7778


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee closed issue #7777: [Vectorized][Bug] Bug of repeated node resize and compile of grouping set code

2022-01-17 Thread GitBox


HappenLee closed issue #:
URL: https://github.com/apache/incubator-doris/issues/


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch vectorized updated (778fa8d -> c86d691)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a change to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from 778fa8d  [Vectorized](improving) (exec) optimize VDataStreamSender's 
send() performance #7747 (#7751)
 add c86d691  [Vectorized][Bug] Fix bug of repeated node resize and compile 
failed (#7778)

No new revisions were added by this update.

Summary of changes:
 be/src/vec/exec/vrepeat_node.cpp   | 13 ++---
 be/src/vec/functions/simple_function_factory.h |  2 +-
 build.sh   |  2 +-
 3 files changed, 8 insertions(+), 9 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch vectorized updated (c86d691 -> 66b3b1d)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a change to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from c86d691  [Vectorized][Bug] Fix bug of repeated node resize and compile 
failed (#7778)
 add 66b3b1d  [Vectorized](compile) Fix compile error and warning (#7780)

No new revisions were added by this update.

Summary of changes:
 be/src/vec/columns/column.h| 1 +
 be/src/vec/functions/function.h| 3 +++
 be/src/vec/functions/function_grouping.cpp | 4 ++--
 3 files changed, 6 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee merged pull request #7780: [Vectorized](compile) Fix compile error and warning

2022-01-17 Thread GitBox


HappenLee merged pull request #7780:
URL: https://github.com/apache/incubator-doris/pull/7780


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee closed pull request #7763: Vectorized

2022-01-17 Thread GitBox


HappenLee closed pull request #7763:
URL: https://github.com/apache/incubator-doris/pull/7763


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch vectorized updated (66b3b1d -> 8a1a612)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a change to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


omit 66b3b1d  [Vectorized](compile) Fix compile error and warning (#7780)
omit c86d691  [Vectorized][Bug] Fix bug of repeated node resize and compile 
failed (#7778)
omit 778fa8d  [Vectorized](improving) (exec) optimize VDataStreamSender's 
send() performance #7747 (#7751)
omit 57bdde6  [Vectorized][Improvement] Speed up column filtering via SIMD 
(#7775)
omit d2f2210  [Vectorized][feature](planner)(executor) Support grouping 
sets rollup cube (#7601)
omit 685f452  [Vectorized][Improvement] Enhancement unit test for 
vectorized function (#7750)
omit ee90c94  [Vectorization] Support SegmentIterator vectorization (#7613)
omit 24787ed  [Vectorized][Function] Support function 
stddev/variance/stddev_samp/variance_samp (#7734)
omit fc05698  [Vectorized] Rebase code from master
omit e9056d6  [Vectorized][Bug] Bitmap/HLL type no support cast to 
varchar/char (#7737)
omit 2af5181  [Vectorized][Feature] upport function conv  (#7693)
omit b79496b  [Vectorized][Bug] Fix get wrong result when select random 
column && fix get wrong has_null_tag (#7728)
omit 28fb8c7  [Vectorized][Enhancement] use simd to speed up coalesce and 
if_not_null function (#7722)
omit 2c38a50  [Vectorized][Enhancement] fix some bug & improve some code 
(#7714)
omit 27d3898  [Vectorized][Bug] fix 'negative' function ut run fail && fix 
testIsBucketShuffleJoin run fail && fix some compile fail (#7688)
omit 3e45025  [Vectorized] (olap) Optimize BlockReader's performance (#7642)
omit 0dd1662  [Feature][Vectorized] Support String in vec exe engine (#7670)
omit a051b33  [Vectorized] [Function] Support do not fold constant at 
vectorized (#7668)
omit 952f0e3  [Vectorized] Support bloom filter predicate on vectorized 
engine storage layer (#7557)
omit 77e0212  [vectorized] [block] Add new method get_data_type to avoid 
unnecessary copy  by the method get_data_type (#7600)
omit 01d9434  [Vectorized][Feature] support 
money_format/ucase/character_length (#7649)
omit 9432587  [Vectorizd] [Function] Add string type vec support at 
doris_builtins_functions[D (#7661)
omit ead467c  [Bug] Fix function nulllable not match and largetint cast 
failed (#7659)
omit 3425e8a  [Function][Vec] add function coalesce (#7632)
omit bdeb6b7  [Vectorized][Feature] fix core dump when using function 
override and function alias at the same time && support substr(str,int) 
override (#7640)
omit c4623f2  [Bug] Fix bug of cast expr nullable and ifnull function 
(#7626)
omit fb945cd  [Refactor] Cow refactor: giveup using boost (#7567)
omit 326f0d7  [Vectorized][Function] Support function  and (#7618)
omit 3339878  [Bug] Change parser string to int (#7595)
omit 2d31421  [Bug] Fix bug of concat function and fold const expr (#7608)
omit 204a35d  [Function] Fix error about rank/dense_rank/row_number return 
always not nullable (#7561)
omit 54bd985  [Bug] Fix negative function error result and sort node eos 
(#7555)
omit 07abe49  [Vectorized Exec Engine] Support Vectorized Exec Engine In 
Doris
 add b51121f  [chore](github-action) Add label auto for pull requests 
(#7663)
 add d1a994e  [fix](cpu-resource)(resource-tag) Allow set 
cpu_resource_limit to -1 and fix resource tag bug(#6830)
 add 3da4425  [fix](github-action) fix the action of 
set-label-based-on-pr-title (#7757)
 add 10709f3  [fix](github-action) fix the action of 
set-label-based-on-pr-title (#7758)
 add d03151b  [chore](be) Add -Werror (#7744)
 add 902ab93  [fix](session-variable) fix bug that checkpoint may overwrite 
the global variables (#7526)
 add 6188ab2  [docs](faq) add multiple FE WEB UI login issues (#7654)
 add f381782  [fix] fix malloc and free mismatch issue  (#7702)
 add fe80d14  [style] replace Chinese comments with English comments (#7732)
 add 5c4055a  [style] Translate Chinese to English in be_olap_field.h 
(#7738)
 add e7d65e4  [style] translate code annotations into english (#7752)
 add a6ff1bd  Flink / Spark connector compilation problem (#7725)
 add be43316  [docs] add doc for community feedback and fix CI (#7759)
 add 4a3cbf5  [fix](show-load) fix show load with the same column name in 
Where Clause (#7523)
 add 5b0f11b  [feature](mysql-compatibility)(function)  add `WEEKDAY` 
function (#7673)
 add 8b7d7e4  [improvement] create/drop index support if [not] exist (#7748)
 add 5f8d912  [improvement](routine-load) Reduce the probability that the 
routine load task rpc timeout (#7754)
 add 36d6d23  [refactor] remove duplicate if that will never be used (#7761)
 add 88a3d08  [fix] fix NPE in SysVariableDesc::equal (#7766)
 add 5c7863c  [improvement](fe-unit-test) Fix port in use when the cluster 
starts in UT. (#

[incubator-doris] 02/33: [Bug] Fix negative function error result and sort node eos (#7555)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit a33b2ac2d1027e7c3c00c2a0d36276dd1b54df33
Author: HappenLee 
AuthorDate: Fri Dec 31 00:37:54 2021 -0600

[Bug] Fix negative function error result and sort node eos (#7555)

Co-authored-by: lihaopeng 
---
 be/src/vec/exec/vsort_node.cpp  | 1 +
 be/src/vec/functions/math.cpp   | 9 +
 be/test/vec/function/function_math_test.cpp | 2 +-
 3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/be/src/vec/exec/vsort_node.cpp b/be/src/vec/exec/vsort_node.cpp
index 79af7c8..734af91 100644
--- a/be/src/vec/exec/vsort_node.cpp
+++ b/be/src/vec/exec/vsort_node.cpp
@@ -84,6 +84,7 @@ Status VSortNode::get_next(RuntimeState* state, Block* block, 
bool* eos) {
 _sorted_blocks[0].skip_num_rows(_offset);
 }
 block->swap(_sorted_blocks[0]);
+*eos = true;
 } else {
 RETURN_IF_ERROR(merge_sort_read(state, block, eos));
 }
diff --git a/be/src/vec/functions/math.cpp b/be/src/vec/functions/math.cpp
index af48277..57d6c48 100644
--- a/be/src/vec/functions/math.cpp
+++ b/be/src/vec/functions/math.cpp
@@ -258,14 +258,7 @@ struct NegativeImpl {
 using ResultType = A;
 
 static inline ResultType apply(A a) {
-if constexpr (IsDecimalNumber)
-return a > 0 ? A(-a) : a;
-else if constexpr (std::is_integral_v && std::is_signed_v)
-return a > 0 ? static_cast(~a) + 1 : a;
-else if constexpr (std::is_integral_v && std::is_unsigned_v)
-return static_cast(-a);
-else if constexpr (std::is_floating_point_v)
-return static_cast(-std::abs(a));
+return -a;
 }
 };
 
diff --git a/be/test/vec/function/function_math_test.cpp 
b/be/test/vec/function/function_math_test.cpp
index f56ab7d..0413abd 100644
--- a/be/test/vec/function/function_math_test.cpp
+++ b/be/test/vec/function/function_math_test.cpp
@@ -296,7 +296,7 @@ TEST(MathFunctionTest, negative_test) {
 {
 std::vector input_types = {vectorized::TypeIndex::Float64};
 
-DataSet data_set = {{{0.0123}, -0.0123}, {{90.45}, -90.45}, {{0.0}, 
0.0}, {{-60.0}, -60.0}};
+DataSet data_set = {{{0.0123}, -0.0123}, {{90.45}, -90.45}, {{0.0}, 
0.0}, {{-60.0}, 60.0}};
 
 vectorized::check_function(func_name, input_types,
   
data_set);

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 05/33: [Bug] Change parser string to int (#7595)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 45286430724bfecaf73407f15c427aae6bd417e3
Author: Pxl <952130...@qq.com>
AuthorDate: Wed Jan 5 16:13:38 2022 +0800

[Bug] Change parser string to int (#7595)
---
 be/src/util/string_parser.hpp | 20 +
 be/src/vec/io/io_helper.h | 52 ++-
 2 files changed, 27 insertions(+), 45 deletions(-)

diff --git a/be/src/util/string_parser.hpp b/be/src/util/string_parser.hpp
index 0354343..cc1110c 100644
--- a/be/src/util/string_parser.hpp
+++ b/be/src/util/string_parser.hpp
@@ -573,6 +573,26 @@ T StringParser::numeric_limits(bool negative) {
 }
 
 template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 3;
+}
+
+template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 5;
+}
+
+template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 10;
+}
+
+template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 20;
+}
+
+template<>
 inline int StringParser::StringParseTraits::max_ascii_len() {
 return 3;
 }
diff --git a/be/src/vec/io/io_helper.h b/be/src/vec/io/io_helper.h
index fc232d7..fb9371f 100644
--- a/be/src/vec/io/io_helper.h
+++ b/be/src/vec/io/io_helper.h
@@ -126,7 +126,7 @@ inline void write_string_binary(const StringRef& s, 
BufferWritable& buf) {
 }
 
 inline void write_string_binary(const char* s, BufferWritable& buf) {
-write_string_binary(StringRef{s}, buf);
+write_string_binary(StringRef {s}, buf);
 }
 
 template 
@@ -288,53 +288,15 @@ bool read_float_text_fast_impl(T& x, ReadBuffer& in) {
 
 template 
 bool read_int_text_impl(T& x, ReadBuffer& buf) {
-bool negative = false;
-std::make_unsigned_t res = 0;
-if (buf.eof()) {
-return false;
-}
+StringParser::ParseResult result;
+x = StringParser::string_to_int(buf.position(), buf.count(), &result);
 
-while (!buf.eof()) {
-switch (*buf.position()) {
-case '+':
-break;
-case '-':
-if (std::is_signed_v)
-negative = true;
-else {
-return false;
-}
-break;
-case '0':
-[[fallthrough]];
-case '1':
-[[fallthrough]];
-case '2':
-[[fallthrough]];
-case '3':
-[[fallthrough]];
-case '4':
-[[fallthrough]];
-case '5':
-[[fallthrough]];
-case '6':
-[[fallthrough]];
-case '7':
-[[fallthrough]];
-case '8':
-[[fallthrough]];
-case '9':
-res *= 10;
-res += *buf.position() - '0';
-break;
-default:
-x = negative ? -res : res;
-return true;
-}
-++buf.position();
+if (UNLIKELY(result != StringParser::PARSE_SUCCESS)) {
+return false;
 }
 
-x = negative ? -res : res;
+// only to match the is_all_read() check to prevent return null
+buf.position() = buf.end();
 return true;
 }
 

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 08/33: [Bug] Fix bug of cast expr nullable and ifnull function (#7626)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 24e1d64a6b26be2f97477d39d87caed8ee0f5f8e
Author: HappenLee 
AuthorDate: Wed Jan 5 23:56:22 2022 -0600

[Bug] Fix bug of cast expr nullable and ifnull function (#7626)

Co-authored-by: lihaopeng 
---
 be/src/runtime/descriptors.h   |  1 -
 be/src/runtime/fold_constant_executor.cpp  |  8 ++--
 be/src/vec/exec/join/vhash_join_node.cpp   |  2 +-
 be/src/vec/exec/vunion_node.cpp|  2 +
 be/src/vec/functions/function.cpp  |  2 +-
 .../function_date_or_datetime_computation.h|  4 +-
 be/src/vec/functions/function_ifnull.h | 43 --
 be/src/vec/sink/vtabet_sink.cpp|  1 +
 .../java/org/apache/doris/analysis/CastExpr.java   |  7 
 .../apache/doris/analysis/FunctionCallExpr.java|  2 +-
 .../apache/doris/rewrite/FoldConstantsRule.java|  9 -
 11 files changed, 51 insertions(+), 30 deletions(-)

diff --git a/be/src/runtime/descriptors.h b/be/src/runtime/descriptors.h
index 97f9712..ad43209 100644
--- a/be/src/runtime/descriptors.h
+++ b/be/src/runtime/descriptors.h
@@ -381,7 +381,6 @@ public:
 int get_row_size() const;
 
 int num_materialized_slots() const {
-DCHECK(_num_materialized_slots != 0);
 return _num_materialized_slots;
 }
 
diff --git a/be/src/runtime/fold_constant_executor.cpp 
b/be/src/runtime/fold_constant_executor.cpp
index 9781c2f..f093c04 100644
--- a/be/src/runtime/fold_constant_executor.cpp
+++ b/be/src/runtime/fold_constant_executor.cpp
@@ -127,9 +127,11 @@ Status FoldConstantExecutor::fold_constant_vexpr(
 }
 
 vectorized::Block tmp_block;
+tmp_block.insert({vectorized::ColumnUInt8::create(1),
+std::make_shared(), ""});
 int result_column = -1;
 // calc vexpr
-ctx->execute(&tmp_block, &result_column);
+RETURN_IF_ERROR(ctx->execute(&tmp_block, &result_column));
 DCHECK(result_column != -1);
 PrimitiveType root_type = ctx->root()->type().type;
 // covert to thrift type
@@ -139,7 +141,7 @@ Status FoldConstantExecutor::fold_constant_vexpr(
 PExprResult expr_result;
 string result;
 const auto& column_ptr = 
tmp_block.get_by_position(result_column).column;
-if (column_ptr->is_nullable() && column_ptr->is_null_at(0)) {
+if (column_ptr->is_null_at(0)) {
 expr_result.set_success(false);
 } else {
 expr_result.set_success(true);
@@ -194,7 +196,7 @@ Status FoldConstantExecutor::_init(const TQueryGlobals& 
query_globals) {
 
 template 
 Status FoldConstantExecutor::_prepare_and_open(Context* ctx) {
-ctx->prepare(_runtime_state.get(), RowDescriptor(), _mem_tracker);
+RETURN_IF_ERROR(ctx->prepare(_runtime_state.get(), RowDescriptor(), 
_mem_tracker));
 return ctx->open(_runtime_state.get());
 }
 
diff --git a/be/src/vec/exec/join/vhash_join_node.cpp 
b/be/src/vec/exec/join/vhash_join_node.cpp
index 7606783..4533cae 100644
--- a/be/src/vec/exec/join/vhash_join_node.cpp
+++ b/be/src/vec/exec/join/vhash_join_node.cpp
@@ -133,7 +133,7 @@ struct ProcessRuntimeFilterBuild {
 
 RETURN_IF_ERROR(runtime_filter_slots->init(state, 
hash_table_ctx.hash_table.get_size()));
 
-if (!runtime_filter_slots->empty()) {
+if (!runtime_filter_slots->empty() && 
!_join_node->_inserted_rows.empty()) {
 {
 SCOPED_TIMER(_join_node->_push_compute_timer);
 runtime_filter_slots->insert(_join_node->_inserted_rows);
diff --git a/be/src/vec/exec/vunion_node.cpp b/be/src/vec/exec/vunion_node.cpp
index 122eafa..1fa4da4 100644
--- a/be/src/vec/exec/vunion_node.cpp
+++ b/be/src/vec/exec/vunion_node.cpp
@@ -181,6 +181,8 @@ Status VUnionNode::get_next_const(RuntimeState* state, 
Block* block) {
 
MutableBlock(Block(VectorizedUtils::create_columns_with_type_and_name(row_desc(;
 for (; _const_expr_list_idx < _const_expr_lists.size(); 
++_const_expr_list_idx) {
 Block tmp_block;
+tmp_block.insert({vectorized::ColumnUInt8::create(1),
+std::make_shared(), ""});
 int const_expr_lists_size = 
_const_expr_lists[_const_expr_list_idx].size();
 std::vector result_list(const_expr_lists_size);
 for (size_t i = 0; i < const_expr_lists_size; ++i) {
diff --git a/be/src/vec/functions/function.cpp 
b/be/src/vec/functions/function.cpp
index d3a910e..9ed3edb 100644
--- a/be/src/vec/functions/function.cpp
+++ b/be/src/vec/functions/function.cpp
@@ -66,7 +66,7 @@ ColumnPtr wrap_in_nullable(const ColumnPtr& src, const Block& 
block, const Colum
 
null_map_column->clone_resized(null_map_

[incubator-doris] 16/33: [Vectorized] [Function] Support do not fold constant at vectorized (#7668)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit acb63c749cff8b323cf9efca8cf47ddb171aecd0
Author: Pxl <952130...@qq.com>
AuthorDate: Mon Jan 10 10:52:44 2022 +0800

[Vectorized] [Function] Support do not fold constant at vectorized (#7668)
---
 .../org/apache/doris/analysis/ArithmeticExpr.java  | 30 ++
 .../java/org/apache/doris/rewrite/FEFunctions.java | 64 --
 2 files changed, 88 insertions(+), 6 deletions(-)

diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java 
b/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java
index 50012b7..79a1ffa 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java
@@ -267,6 +267,33 @@ public class ArithmeticExpr extends Expr {
 }
 }
 
+private boolean castIfHaveSameType(Type t1, Type t2, Type target) throws 
AnalysisException {
+if (t1 == target || t2 == target) {
+castChild(target, 0);
+castChild(target, 1);
+return true;
+}
+return false;
+}
+
+private void castUpperInteger(Type t1, Type t2) throws AnalysisException {
+if (!t1.isIntegerType() || !t2.isIntegerType()) {
+return;
+}
+if (castIfHaveSameType(t1, t2, Type.BIGINT)) {
+return;
+}
+if (castIfHaveSameType(t1, t2, Type.INT)) {
+return;
+}
+if (castIfHaveSameType(t1, t2, Type.SMALLINT)) {
+return;
+}
+if (castIfHaveSameType(t1, t2, Type.TINYINT)) {
+return;
+}
+}
+
 @Override
 public void analyzeImpl(Analyzer analyzer) throws AnalysisException {
 if (VectorizedUtil.isVectorized()) {
@@ -320,6 +347,9 @@ public class ArithmeticExpr extends Expr {
 if (t1.isDecimalV2() || t2.isDecimalV2()) {
 castBinaryOp(findCommonType(t1, t2));
 }
+if (isConstant()) {
+castUpperInteger(t1, t2);
+}
 case MOD:
 if (t1.isDecimalV2() || t2.isDecimalV2()) {
 castBinaryOp(findCommonType(t1, t2));
diff --git a/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java 
b/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java
index 26ca3f7..0bcbfb6 100755
--- a/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java
@@ -350,12 +350,30 @@ public class FEFunctions {
  * Arithmetic function
  */
 
-@FEFunction(name = "add", argTypes = { "BIGINT", "BIGINT" }, returnType = 
"BIGINT")
+@FEFunction(name = "add", argTypes = { "TINYINT", "TINYINT" }, returnType 
= "SMALLINT")
+public static IntLiteral addTinyint(LiteralExpr first, LiteralExpr second) 
throws AnalysisException {
+long result = Math.addExact(first.getLongValue(), 
second.getLongValue());
+return new IntLiteral(result, Type.SMALLINT);
+}
+
+@FEFunction(name = "add", argTypes = { "SMALLINT", "SMALLINT" }, 
returnType = "INT")
+public static IntLiteral addSmallint(LiteralExpr first, LiteralExpr 
second) throws AnalysisException {
+long result = Math.addExact(first.getLongValue(), 
second.getLongValue());
+return new IntLiteral(result, Type.INT);
+}
+
+@FEFunction(name = "add", argTypes = { "INT", "INT" }, returnType = 
"BIGINT")
 public static IntLiteral addInt(LiteralExpr first, LiteralExpr second) 
throws AnalysisException {
 long result = Math.addExact(first.getLongValue(), 
second.getLongValue());
 return new IntLiteral(result, Type.BIGINT);
 }
 
+@FEFunction(name = "add", argTypes = { "BIGINT", "BIGINT" }, returnType = 
"BIGINT")
+public static IntLiteral addBigint(LiteralExpr first, LiteralExpr second) 
throws AnalysisException {
+long result = Math.addExact(first.getLongValue(), 
second.getLongValue());
+return new IntLiteral(result, Type.BIGINT);
+}
+
 @FEFunction(name = "add", argTypes = { "DOUBLE", "DOUBLE" }, returnType = 
"DOUBLE")
 public static FloatLiteral addDouble(LiteralExpr first, LiteralExpr 
second) throws AnalysisException {
 double result = first.getDoubleValue() + second.getDoubleValue();
@@ -379,12 +397,30 @@ public class FEFunctions {
 return new LargeIntLiteral(result.toString());
 }
 
-@FEFunction(name = "subtract", argTypes = { "BIGINT", "BIGINT" }, 
returnType = "BIGINT")
+@FEFunction(name = "subtract", argTypes = { "TINYINT", "TINYINT" }, 
returnType = "SMALLINT")
+public static IntLiteral subtractTinyint(LiteralExpr first, LiteralExpr 
second) th

[incubator-doris] 12/33: [Vectorizd] [Function] Add string type vec support at doris_builtins_functions[D (#7661)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 345c119510602110275a698e64f76f7e2f058065
Author: Pxl <952130...@qq.com>
AuthorDate: Fri Jan 7 14:57:20 2022 +0800

[Vectorizd] [Function] Add string type vec support at 
doris_builtins_functions[D (#7661)
---
 gensrc/script/doris_builtins_functions.py | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gensrc/script/doris_builtins_functions.py 
b/gensrc/script/doris_builtins_functions.py
index 1399c97..5d96751 100755
--- a/gensrc/script/doris_builtins_functions.py
+++ b/gensrc/script/doris_builtins_functions.py
@@ -1045,26 +1045,26 @@ visible_functions = [
 '_ZN5doris15StringFunctions17parse_url_prepareEPN9doris_udf'
 '15FunctionContextENS2_18FunctionStateScopeE',
 '_ZN5doris15StringFunctions15parse_url_closeEPN9doris_udf'
-'15FunctionContextENS2_18FunctionStateScopeE', '', ''],
+'15FunctionContextENS2_18FunctionStateScopeE', 'vec', ''],
 [['parse_url'], 'STRING', ['STRING', 'STRING', 'STRING'],
 '_ZN5doris15StringFunctions13parse_url_keyEPN9doris_udf'
 '15FunctionContextERKNS1_9StringValES6_S6_',
 '_ZN5doris15StringFunctions17parse_url_prepareEPN9doris_udf'
 '15FunctionContextENS2_18FunctionStateScopeE',
 '_ZN5doris15StringFunctions15parse_url_closeEPN9doris_udf'
-'15FunctionContextENS2_18FunctionStateScopeE', '', ''],
+'15FunctionContextENS2_18FunctionStateScopeE', 'vec', ''],
 [['money_format'], 'STRING', ['BIGINT'],
 
'_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_9BigIntValE',
-'', '', '', ''],
+'', '', 'vec', ''],
 [['money_format'], 'STRING', ['LARGEINT'],
 
'_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_11LargeIntValE',
-'', '', '', ''],
+'', '', 'vec', ''],
 [['money_format'], 'STRING', ['DOUBLE'],
 
'_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_9DoubleValE',
-'', '', '', ''],
+'', '', 'vec', ''],
 [['money_format'], 'STRING', ['DECIMALV2'],
 
'_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_12DecimalV2ValE',
-'', '', '', ''],
+'', '', 'vec', ''],
 [['split_part'], 'STRING', ['STRING', 'STRING', 'INT'],
 
'_ZN5doris15StringFunctions10split_partEPN9doris_udf15FunctionContextERKNS1_9StringValES6_RKNS1_6IntValE',
 '', '', 'vec', 'ALWAYS_NULLABLE'],
@@ -1276,7 +1276,7 @@ visible_functions = [
 '15FunctionContextERKNS1_9StringValES6_', '', '', 'vec', 
'ALWAYS_NULLABLE'],
 [['aes_decrypt'], 'STRING', ['STRING', 'STRING'],
 '_ZN5doris19EncryptionFunctions11aes_decryptEPN9doris_udf'
-'15FunctionContextERKNS1_9StringValES6_', '', '', '', ''],
+'15FunctionContextERKNS1_9StringValES6_', '', '', 'vec', ''],
 [['aes_encrypt'], 'STRING', ['STRING', 'STRING', 'STRING', 'STRING'],
 '_ZN5doris19EncryptionFunctions11aes_encryptEPN9doris_udf'
 '15FunctionContextERKNS1_9StringValES6_S6_S6_', '', '', '', ''],
@@ -1300,7 +1300,7 @@ visible_functions = [
 '15FunctionContextERKNS1_9StringValE', '', '', 'vec', 
'ALWAYS_NULLABLE'],
 [['to_base64'], 'STRING', ['STRING'],
 '_ZN5doris19EncryptionFunctions9to_base64EPN9doris_udf'
-'15FunctionContextERKNS1_9StringValE', '', '', '', 'ALWAYS_NULLABLE'],
+'15FunctionContextERKNS1_9StringValE', '', '', 'vec', 
'ALWAYS_NULLABLE'],
 [['to_base64'], 'VARCHAR', ['VARCHAR'],
 '_ZN5doris19EncryptionFunctions9to_base64EPN9doris_udf'
 '15FunctionContextERKNS1_9StringValE', '', '', 'vec', 
'ALWAYS_NULLABLE'],

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 23/33: [Vectorized][Feature] upport function conv (#7693)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit ffdc9fc9be28ae94671916f06e4fd5f219707e93
Author: Pxl <952130...@qq.com>
AuthorDate: Wed Jan 12 17:08:09 2022 +0800

[Vectorized][Feature] upport function conv  (#7693)

* support function conv()

* add document
---
 be/src/exprs/math_functions.h  |  14 +-
 be/src/vec/CMakeLists.txt  |   1 +
 be/src/vec/data_types/data_type_bitmap.h   |   2 +
 be/src/vec/data_types/data_type_date.h |   2 +-
 be/src/vec/data_types/data_type_date_time.h|  40 ++---
 be/src/vec/data_types/data_type_decimal.h  |   2 +-
 be/src/vec/data_types/data_type_number_base.h  |   5 +-
 be/src/vec/data_types/data_type_string.h   |   5 +-
 be/src/vec/functions/function_conv.cpp | 163 +
 be/src/vec/functions/simple_function_factory.h |   2 +
 docs/.vuepress/sidebar/en.js   |   5 +
 docs/.vuepress/sidebar/zh-CN.js|   5 +
 .../sql-functions/math-functions/conv.md   |  60 
 .../sql-functions/math-functions/conv.md   |  60 
 gensrc/script/doris_builtins_functions.py  |   6 +-
 15 files changed, 339 insertions(+), 33 deletions(-)

diff --git a/be/src/exprs/math_functions.h b/be/src/exprs/math_functions.h
index 15d8749..9d55ed6 100644
--- a/be/src/exprs/math_functions.h
+++ b/be/src/exprs/math_functions.h
@@ -50,7 +50,8 @@ public:
 static doris_udf::IntVal abs(doris_udf::FunctionContext*, const 
doris_udf::SmallIntVal&);
 static doris_udf::SmallIntVal abs(doris_udf::FunctionContext*, const 
doris_udf::TinyIntVal&);
 
-static doris_udf::TinyIntVal sign(doris_udf::FunctionContext* ctx, const 
doris_udf::DoubleVal& v);
+static doris_udf::TinyIntVal sign(doris_udf::FunctionContext* ctx,
+  const doris_udf::DoubleVal& v);
 
 static doris_udf::DoubleVal sin(doris_udf::FunctionContext*, const 
doris_udf::DoubleVal&);
 static doris_udf::DoubleVal asin(doris_udf::FunctionContext*, const 
doris_udf::DoubleVal&);
@@ -182,11 +183,6 @@ public:
 
 static double my_double_round(double value, int64_t dec, bool 
dec_unsigned, bool truncate);
 
-private:
-static const int32_t MIN_BASE = 2;
-static const int32_t MAX_BASE = 36;
-static const char* _s_alphanumeric_chars;
-
 // Converts src_num in decimal to dest_base,
 // and fills expr_val.string_val with the result.
 static doris_udf::StringVal decimal_to_base(doris_udf::FunctionContext* 
ctx, int64_t src_num,
@@ -207,6 +203,12 @@ private:
 // Returns false otherwise, indicating some other error condition.
 static bool handle_parse_result(int8_t dest_base, int64_t* num,
 StringParser::ParseResult parse_res);
+
+static const int32_t MIN_BASE = 2;
+static const int32_t MAX_BASE = 36;
+
+private:
+static const char* _s_alphanumeric_chars;
 };
 
 } // namespace doris
diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt
index 01c69eb..aa302ce 100644
--- a/be/src/vec/CMakeLists.txt
+++ b/be/src/vec/CMakeLists.txt
@@ -108,6 +108,7 @@ set(VEC_FILES
   functions/functions_logical.cpp
   functions/function_case.cpp
   functions/function_cast.cpp
+  functions/function_conv.cpp
   functions/function_string.cpp
   functions/function_timestamp.cpp
   functions/function_utility.cpp
diff --git a/be/src/vec/data_types/data_type_bitmap.h 
b/be/src/vec/data_types/data_type_bitmap.h
index 692d6fc..69f5540 100644
--- a/be/src/vec/data_types/data_type_bitmap.h
+++ b/be/src/vec/data_types/data_type_bitmap.h
@@ -18,6 +18,7 @@
 #pragma once
 #include "util/bitmap_value.h"
 #include "vec/columns/column.h"
+#include "vec/columns/column_complex.h"
 #include "vec/core/types.h"
 #include "vec/data_types/data_type.h"
 
@@ -27,6 +28,7 @@ public:
 DataTypeBitMap() = default;
 ~DataTypeBitMap() override = default;
 
+using ColumnType = ColumnBitmap;
 using FieldType = BitmapValue;
 
 std::string do_get_name() const override { return get_family_name(); }
diff --git a/be/src/vec/data_types/data_type_date.h 
b/be/src/vec/data_types/data_type_date.h
index b3aa90c..b5d148b 100644
--- a/be/src/vec/data_types/data_type_date.h
+++ b/be/src/vec/data_types/data_type_date.h
@@ -34,7 +34,7 @@ public:
 
 bool equals(const IDataType& rhs) const override;
 std::string to_string(const IColumn& column, size_t row_num) const;
-void to_string(const IColumn &column, size_t row_num, BufferWritable 
&ostr) const override;
+void to_string(const IColumn& column, size_t row_num, BufferWritable& 
ostr) const override;
 
 static void cast_to_date(Int64& x);
 };
diff --git a/be/src/vec/data_types/data_type_date_time.h 
b/be/src/vec/data_types/data_type_date_time.h
inde

[incubator-doris] 18/33: [Vectorized] (olap) Optimize BlockReader's performance (#7642)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 8b8210433eeeb0fbfc20b39520188d2e23892767
Author: thinker 
AuthorDate: Mon Jan 10 20:28:21 2022 +0800

[Vectorized] (olap) Optimize BlockReader's performance (#7642)

Co-authored-by: zuochunwei 
---
 be/src/vec/olap/block_reader.cpp | 51 +---
 be/src/vec/olap/block_reader.h   |  9 +++
 2 files changed, 24 insertions(+), 36 deletions(-)

diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp
index 8e2d4b2..ef3ba3a 100644
--- a/be/src/vec/olap/block_reader.cpp
+++ b/be/src/vec/olap/block_reader.cpp
@@ -25,15 +25,8 @@
 #include "runtime/mem_tracker.h"
 #include "vec/olap/vcollect_iterator.h"
 
-using std::nothrow;
-using std::set;
-using std::vector;
-
 namespace doris::vectorized {
 
-BlockReader::BlockReader()
-: _collect_iter(new VCollectIterator()), _next_row {nullptr, -1, 
false} {}
-
 BlockReader::~BlockReader() {
 for (int i = 0; i < _agg_functions.size(); ++i) {
 AggregateFunctionPtr function = _agg_functions[i];
@@ -45,7 +38,7 @@ BlockReader::~BlockReader() {
 
 OLAPStatus BlockReader::_init_collect_iter(const ReaderParams& read_params,
std::vector* 
valid_rs_readers) {
-_collect_iter->init(this);
+_vcollect_iter.init(this);
 std::vector rs_readers;
 auto res = _capture_rs_readers(read_params, &rs_readers);
 if (res != OLAP_SUCCESS) {
@@ -59,7 +52,7 @@ OLAPStatus BlockReader::_init_collect_iter(const 
ReaderParams& read_params,
 
 for (auto& rs_reader : rs_readers) {
 RETURN_NOT_OK(rs_reader->init(&_reader_context));
-OLAPStatus res = _collect_iter->add_child(rs_reader);
+OLAPStatus res = _vcollect_iter.add_child(rs_reader);
 if (res != OLAP_SUCCESS && res != OLAP_ERR_DATA_EOF) {
 LOG(WARNING) << "failed to add child to iterator, err=" << res;
 return res;
@@ -69,9 +62,9 @@ OLAPStatus BlockReader::_init_collect_iter(const 
ReaderParams& read_params,
 }
 }
 
-_collect_iter->build_heap(*valid_rs_readers);
-if (_collect_iter->is_merge()) {
-auto status = _collect_iter->current_row(&_next_row);
+_vcollect_iter.build_heap(*valid_rs_readers);
+if (_vcollect_iter.is_merge()) {
+auto status = _vcollect_iter.current_row(&_next_row);
 _eof = status == OLAP_ERR_DATA_EOF;
 }
 
@@ -85,8 +78,9 @@ void BlockReader::_init_agg_state() {
 _stored_has_null_tag.resize(_stored_data_columns.size());
 _stored_has_string_tag.resize(_stored_data_columns.size());
 
+auto& tablet_schema = tablet()->tablet_schema();
 for (auto idx : _agg_columns_idx) {
-FieldAggregationMethod agg_method = 
tablet()->tablet_schema().column(idx).aggregation();
+FieldAggregationMethod agg_method = 
tablet_schema.column(idx).aggregation();
 std::string agg_name =
 TabletColumn::get_string_by_aggregation_type(agg_method) + 
agg_reader_suffix;
 std::transform(agg_name.begin(), agg_name.end(), agg_name.begin(),
@@ -159,6 +153,7 @@ OLAPStatus BlockReader::init(const ReaderParams& 
read_params) {
 break;
 case KeysType::AGG_KEYS:
 _next_block_func = &BlockReader::_agg_key_next_block;
+_init_agg_state();
 break;
 default:
 DCHECK(false) << "No next row function for type:" << 
tablet()->keys_type();
@@ -170,7 +165,7 @@ OLAPStatus BlockReader::init(const ReaderParams& 
read_params) {
 
 OLAPStatus BlockReader::_direct_next_block(Block* block, MemPool* mem_pool, 
ObjectPool* agg_pool,
bool* eof) {
-auto res = _collect_iter->next(block);
+auto res = _vcollect_iter.next(block);
 if (UNLIKELY(res != OLAP_SUCCESS && res != OLAP_ERR_DATA_EOF)) {
 return res;
 }
@@ -190,11 +185,6 @@ OLAPStatus BlockReader::_agg_key_next_block(Block* block, 
MemPool* mem_pool, Obj
 return OLAP_SUCCESS;
 }
 
-if (!_agg_inited) {
-_init_agg_state();
-_agg_inited = true;
-}
-
 auto target_block_row = 0;
 auto target_columns = block->mutate_columns();
 
@@ -203,7 +193,7 @@ OLAPStatus BlockReader::_agg_key_next_block(Block* block, 
MemPool* mem_pool, Obj
 _append_agg_data(target_columns);
 
 while (true) {
-auto res = _collect_iter->next(&_next_row);
+auto res = _vcollect_iter.next(&_next_row);
 if (UNLIKELY(res == OLAP_ERR_DATA_EOF)) {
 *eof = true;
 break;
@@ -251,7 +241,7 @@ OLAPStatus BlockReader::_unique_key_next_block(Block* 
block, MemPool* mem_pool,
 // the version is in reverse order, the first row is the highest 
version,
 // in UNIQUE_KEY highest version is the final result, there is no need 
to
 // merge the lower

[incubator-doris] 22/33: [Vectorized][Bug] Fix get wrong result when select random column && fix get wrong has_null_tag (#7728)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit a9d9c02a2fefc16c0eccc572626fac8b58a64d70
Author: Pxl <952130...@qq.com>
AuthorDate: Wed Jan 12 15:55:58 2022 +0800

[Vectorized][Bug] Fix get wrong result when select random column && fix get 
wrong has_null_tag (#7728)
---
 be/src/vec/columns/column.h  |  3 +++
 be/src/vec/columns/column_nullable.h |  5 +++--
 be/src/vec/core/block.cpp| 29 +
 be/src/vec/olap/block_reader.cpp | 23 +++
 be/src/vec/olap/block_reader.h   |  4 ++--
 5 files changed, 40 insertions(+), 24 deletions(-)

diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h
index 88f9a3a..5e41028 100644
--- a/be/src/vec/columns/column.h
+++ b/be/src/vec/columns/column.h
@@ -336,6 +336,9 @@ public:
 // true iff column has null element
 virtual bool has_null() const { return false; }
 
+// true iff column has null element [0,size)
+virtual bool has_null(size_t size) const { return false; }
+
 /// It's a special kind of column, that contain single value, but is not a 
ColumnConst.
 virtual bool is_dummy() const { return false; }
 
diff --git a/be/src/vec/columns/column_nullable.h 
b/be/src/vec/columns/column_nullable.h
index 9641788..8f4 100644
--- a/be/src/vec/columns/column_nullable.h
+++ b/be/src/vec/columns/column_nullable.h
@@ -176,8 +176,9 @@ public:
 /// Check that size of null map equals to size of nested column.
 void check_consistency() const;
 
-bool has_null() const override {
-size_t size = get_null_map_data().size();
+bool has_null() const override { return 
has_null(get_null_map_data().size()); }
+
+bool has_null(size_t size) const override {
 const UInt8* null_pos = get_null_map_data().data();
 const UInt8* null_pos_end = get_null_map_data().data() + size;
 #ifdef __SSE2__
diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp
index 2aff751..d200a46 100644
--- a/be/src/vec/core/block.cpp
+++ b/be/src/vec/core/block.cpp
@@ -21,18 +21,18 @@
 #include "vec/core/block.h"
 
 #include 
+#include 
+
 #include 
 #include 
 #include 
-#include 
 
 #include "common/status.h"
 #include "gen_cpp/data.pb.h"
 #include "runtime/descriptors.h"
+#include "runtime/row_batch.h"
 #include "runtime/tuple.h"
 #include "runtime/tuple_row.h"
-#include "runtime/row_batch.h"
-
 #include "vec/columns/column_const.h"
 #include "vec/columns/column_nullable.h"
 #include "vec/columns/column_vector.h"
@@ -692,8 +692,10 @@ Status Block::filter_block(Block* block, int 
filter_column_id, int column_to_kee
 if (auto* nullable_column = 
check_and_get_column(*filter_column)) {
 ColumnPtr nested_column = nullable_column->get_nested_column_ptr();
 
-MutableColumnPtr mutable_holder = nested_column->use_count() == 1 ?
-nested_column->assume_mutable() : 
nested_column->clone_resized(nested_column->size());
+MutableColumnPtr mutable_holder =
+nested_column->use_count() == 1
+? nested_column->assume_mutable()
+: nested_column->clone_resized(nested_column->size());
 
 ColumnUInt8* concrete_column = 
typeid_cast(mutable_holder.get());
 if (!concrete_column) {
@@ -769,8 +771,8 @@ void Block::serialize(RowBatch* output_batch, const 
RowDescriptor& row_desc) {
 }
 }
 
-doris::Tuple* Block::deep_copy_tuple(const doris::TupleDescriptor& desc, 
MemPool* pool,
-int row, int column_offset, bool padding_char) {
+doris::Tuple* Block::deep_copy_tuple(const doris::TupleDescriptor& desc, 
MemPool* pool, int row,
+ int column_offset, bool padding_char) {
 auto dst = 
reinterpret_cast(pool->allocate(desc.byte_size()));
 
 for (int i = 0; i < desc.slots().size(); ++i) {
@@ -787,8 +789,9 @@ doris::Tuple* Block::deep_copy_tuple(const 
doris::TupleDescriptor& desc, MemPool
 
 if (!slot_desc->type().is_string_type() && 
!slot_desc->type().is_date_type()) {
 memcpy((void*)dst->get_slot(slot_desc->tuple_offset()), 
data_ref.data, data_ref.size);
-} else if (slot_desc->type().is_string_type() && slot_desc->type() != 
TYPE_OBJECT){
-memcpy((void*)dst->get_slot(slot_desc->tuple_offset()), (const 
void*)(&data_ref), sizeof(data_ref));
+} else if (slot_desc->type().is_string_type() && slot_desc->type() != 
TYPE_OBJECT) {
+memcpy((void*)dst->get_slot(slot_desc->tuple_offset()), (const 
void*)(&data_ref),
+   sizeof(data_ref));
 // Copy the content of string
 if (padding_char && slot_desc->type() == TYPE_CHAR) {
 // serialize the content of string
@@ -800,7 +803,8 @@ doris::Tuple* Block::deep_copy_tuple(const 
doris::TupleDescriptor& desc, MemPool
   

[incubator-doris] 13/33: [Vectorized][Feature] support money_format/ucase/character_length (#7649)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit cc451e7126235327872f1d043ce41d09460fbd79
Author: Pxl <952130...@qq.com>
AuthorDate: Fri Jan 7 15:04:55 2022 +0800

[Vectorized][Feature] support money_format/ucase/character_length (#7649)
---
 be/src/vec/functions/function_string.cpp  |  12 +++-
 be/src/vec/functions/function_string.h| 110 +-
 gensrc/script/doris_builtins_functions.py |   8 +--
 3 files changed, 120 insertions(+), 10 deletions(-)

diff --git a/be/src/vec/functions/function_string.cpp 
b/be/src/vec/functions/function_string.cpp
index 34f1c6b..bdeb7e5 100644
--- a/be/src/vec/functions/function_string.cpp
+++ b/be/src/vec/functions/function_string.cpp
@@ -272,7 +272,7 @@ struct HexStringName {
 
 struct HexStringImpl {
 static DataTypes get_variadic_argument_types() {
-return {std::make_shared()};
+return {std::make_shared()};
 }
 
 static Status vector(const ColumnString::Chars& data, const 
ColumnString::Offsets& offsets,
@@ -774,8 +774,8 @@ void register_function_string(SimpleFunctionFactory& 
factory) {
 factory.register_function();
 factory.register_function();
 factory.register_function();
-factory.register_function>();
-factory.register_function>();
+factory.register_function>();
+factory.register_function>();
 factory.register_function();
 factory.register_function();
 factory.register_function();
@@ -792,12 +792,18 @@ void register_function_string(SimpleFunctionFactory& 
factory) {
 factory.register_function();
 factory.register_function();
 factory.register_function();
+factory.register_function>();
+factory.register_function>();
+factory.register_function>();
+factory.register_function>();
 
 factory.register_alias(FunctionLeft::name, "strleft");
 factory.register_alias(FunctionRight::name, "strright");
 factory.register_alias(SubstringUtil::name, "substr");
 factory.register_alias(FunctionToLower::name, "lcase");
+factory.register_alias(FunctionToUpper::name, "ucase");
 factory.register_alias(FunctionStringMd5sum::name, "md5");
+factory.register_alias(FunctionStringUTF8Length::name, "character_length");
 }
 
 } // namespace doris::vectorized
diff --git a/be/src/vec/functions/function_string.h 
b/be/src/vec/functions/function_string.h
index 3f3e538..af58062 100644
--- a/be/src/vec/functions/function_string.h
+++ b/be/src/vec/functions/function_string.h
@@ -24,9 +24,12 @@
 #include 
 
 #include "exprs/anyval_util.h"
+#include "exprs/math_functions.h"
+#include "exprs/string_functions.h"
 #include "runtime/string_value.hpp"
 #include "util/md5.h"
 #include "util/url_parser.h"
+#include "vec/columns/column_decimal.h"
 #include "vec/columns/column_nullable.h"
 #include "vec/columns/column_string.h"
 #include "vec/columns/columns_number.h"
@@ -211,7 +214,7 @@ public:
 }
 };
 
-struct Substr3Imp {
+struct Substr3Impl {
 static DataTypes get_variadic_argument_types() {
 return {std::make_shared(), 
std::make_shared(),
 std::make_shared()};
@@ -225,7 +228,7 @@ struct Substr3Imp {
 }
 };
 
-struct Substr2Imp {
+struct Substr2Impl {
 static DataTypes get_variadic_argument_types() {
 return {std::make_shared(), 
std::make_shared()};
 }
@@ -558,7 +561,7 @@ public:
 }
 return Status::OK();
 }
-}; // namespace doris::vectorized
+};
 
 class FunctionStringRepeat : public IFunction {
 public:
@@ -1038,4 +1041,105 @@ public:
 }
 };
 
+template 
+class FunctionMoneyFormat : public IFunction {
+public:
+static constexpr auto name = "money_format";
+static FunctionPtr create() { return 
std::make_shared>(); }
+String get_name() const override { return name; }
+
+DataTypePtr get_return_type_impl(const DataTypes& arguments) const 
override {
+return std::make_shared();
+}
+DataTypes get_variadic_argument_types_impl() const override {
+return Impl::get_variadic_argument_types();
+}
+size_t get_number_of_arguments() const override { return 1; }
+
+bool use_default_implementation_for_constants() const override { return 
true; }
+
+Status execute_impl(FunctionContext* context, Block& block, const 
ColumnNumbers& arguments,
+size_t result, size_t input_rows_count) override {
+auto res_column = ColumnString::create();
+ColumnPtr argument_column = block.get_by_position(arguments[0]).column;
+
+auto result_column = assert_cast(res_column.get());
+auto data_column = assert_cast(argument_column.get());
+
+Impl::execute(context, result_column, data_column, input_rows_count);
+
+block.replace_by_position(result, std::move(res_column));
+return Status::OK();
+}
+};
+
+struct MoneyFormatDoubleImpl {
+us

[incubator-doris] 07/33: [Refactor] Cow refactor: giveup using boost (#7567)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit a261538948f61f9b38c4e4d906bd9254bbae07b4
Author: thinker 
AuthorDate: Thu Jan 6 00:12:17 2022 +0800

[Refactor] Cow refactor: giveup using boost (#7567)

Co-authored-by: zuochunwei 
---
 be/src/vec/common/cow.h | 172 ++--
 be/src/vec/exec/vanalytic_eval_node.cpp |   2 +-
 be/src/vec/functions/function_cast.h|   2 +-
 3 files changed, 146 insertions(+), 30 deletions(-)

diff --git a/be/src/vec/common/cow.h b/be/src/vec/common/cow.h
index dbee2b9..08edb89 100644
--- a/be/src/vec/common/cow.h
+++ b/be/src/vec/common/cow.h
@@ -24,6 +24,7 @@
 #include 
 #include 
 
+
 /** Copy-on-write shared ptr.
   * Allows to work with shared immutable objects and sometimes unshare and 
mutate you own unique copy.
   *
@@ -92,36 +93,158 @@
   *   to use std::unique_ptr for it somehow.
   */
 template 
-class COW : public boost::intrusive_ref_counter {
-private:
+class COW {
+std::atomic_uint ref_counter;
+
+protected:
+COW() : ref_counter(0) {}
+
+COW(COW const&) : ref_counter(0) {}
+
+COW& operator=(COW const&) {
+return *this;
+}
+
+unsigned int use_count() const {
+return ref_counter.load();
+}
+
+void add_ref() {
+++ref_counter;
+}
+
+void release_ref() {
+if (--ref_counter == 0) {
+delete static_cast(this);
+}
+}
+
 Derived* derived() { return static_cast(this); }
+
 const Derived* derived() const { return static_cast(this); 
}
 
 template 
-class IntrusivePtr : public boost::intrusive_ptr {
+class intrusive_ptr {
 public:
-using boost::intrusive_ptr::intrusive_ptr;
+intrusive_ptr() : t(nullptr) {}
+
+intrusive_ptr(T* t, bool add_ref=true) : t(t) {
+if (t && add_ref) ((std::remove_const_t*)t)->add_ref();
+}
+
+template 
+intrusive_ptr(intrusive_ptr const& rhs) : t(rhs.get()) {
+if (t) ((std::remove_const_t*)t)->add_ref();
+}
+
+intrusive_ptr(intrusive_ptr const& rhs) : t(rhs.get()) {
+if (t) ((std::remove_const_t*)t)->add_ref();
+}
+
+~intrusive_ptr() {
+if (t) ((std::remove_const_t*)t)->release_ref();
+}
+
+template 
+intrusive_ptr& operator=(intrusive_ptr const& rhs) {
+intrusive_ptr(rhs).swap(*this);
+return *this;
+}
+
+intrusive_ptr(intrusive_ptr&& rhs) : t(rhs.t) {
+rhs.t = nullptr;
+}
+
+intrusive_ptr& operator=(intrusive_ptr&& rhs) {
+intrusive_ptr(static_cast(rhs)).swap(*this);
+return *this;
+}
+
+template friend class intrusive_ptr;
+
+template
+intrusive_ptr(intrusive_ptr&& rhs) : t(rhs.t) {
+rhs.t = nullptr;
+}
+
+template
+intrusive_ptr& operator=(intrusive_ptr&& rhs) {
+intrusive_ptr(static_cast&&>(rhs)).swap(*this);
+return *this;
+}
+
+intrusive_ptr& operator=(intrusive_ptr const& rhs) {
+intrusive_ptr(rhs).swap(*this);
+return *this;
+}
+
+intrusive_ptr& operator=(T* rhs) {
+intrusive_ptr(rhs).swap(*this);
+return *this;
+}
+
+void reset() {
+intrusive_ptr().swap(*this);
+}
+
+void reset(T* rhs) {
+intrusive_ptr(rhs).swap(*this);
+}
+
+void reset(T* rhs, bool add_ref) {
+intrusive_ptr(rhs, add_ref).swap(*this);
+}
+
+T* get() const {
+return t;
+}
+
+T* detach() {
+T* ret = t;
+t = nullptr;
+return ret;
+}
+
+void swap(intrusive_ptr& rhs) {
+T* tmp = t;
+t = rhs.t;
+rhs.t = tmp;
+}
+
+T& operator*() const& {
+return *t;
+}
 
-T& operator*() const& { return boost::intrusive_ptr::operator*(); }
 T&& operator*() const&& {
-return const_cast::type&&>(
-*boost::intrusive_ptr::get());
+return const_cast&&>(*t);
+}
+
+T* operator->() const {
+return t;
+}
+
+operator bool() const {
+return t != nullptr;
+}
+
+operator T*() const {
+return t;
 }
+
+private:
+T* t;
 };
 
 protected:
 template 
-class mutable_ptr : public IntrusivePtr {
+class mutable_ptr : public intrusive_ptr {
 private:
-using Base = IntrusivePtr;
+using Base = intrusive_ptr;
 
-template 
-friend class COW;
-template 
-friend class COWHelper;
+template  friend class COW;
+templ

[incubator-doris] 30/33: [Vectorized][Improvement] Speed up column filtering via SIMD (#7775)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 0a68fc3138057d536cae16e0c4052469cdd2c76e
Author: Zeno Yang 
AuthorDate: Mon Jan 17 16:54:40 2022 +0800

[Vectorized][Improvement] Speed up column filtering via SIMD (#7775)
---
 be/src/vec/columns/column_decimal.cpp | 25 +
 be/src/vec/columns/column_vector.cpp  | 28 ++--
 be/src/vec/columns/columns_common.cpp | 20 +---
 be/src/vec/columns/columns_common.h   | 30 ++
 4 files changed, 74 insertions(+), 29 deletions(-)

diff --git a/be/src/vec/columns/column_decimal.cpp 
b/be/src/vec/columns/column_decimal.cpp
index 8e5ae12..5cc5853 100644
--- a/be/src/vec/columns/column_decimal.cpp
+++ b/be/src/vec/columns/column_decimal.cpp
@@ -162,6 +162,31 @@ ColumnPtr ColumnDecimal::filter(const IColumn::Filter& 
filt, ssize_t result_s
 const UInt8* filt_end = filt_pos + size;
 const T* data_pos = data.data();
 
+/** A slightly more optimized version.
+* Based on the assumption that often pieces of consecutive values
+*  completely pass or do not pass the filter.
+* Therefore, we will optimistically check the parts of `SIMD_BYTES` 
values.
+*/
+static constexpr size_t SIMD_BYTES = 32;
+const UInt8* filt_end_sse = filt_pos + size / SIMD_BYTES * SIMD_BYTES;
+
+while (filt_pos < filt_end_sse) {
+uint32_t mask = bytes32_mask_to_bits32_mask(filt_pos);
+
+if (0x == mask) {
+res_data.insert(data_pos, data_pos + SIMD_BYTES);
+} else {
+while (mask) {
+const size_t idx = __builtin_ctzll(mask);
+res_data.push_back(data_pos[idx]);
+mask = mask & (mask - 1);
+}
+}
+
+filt_pos += SIMD_BYTES;
+data_pos += SIMD_BYTES;
+}
+
 while (filt_pos < filt_end) {
 if (*filt_pos) res_data.push_back(*data_pos);
 
diff --git a/be/src/vec/columns/column_vector.cpp 
b/be/src/vec/columns/column_vector.cpp
index f6627d0..017ae29 100644
--- a/be/src/vec/columns/column_vector.cpp
+++ b/be/src/vec/columns/column_vector.cpp
@@ -26,6 +26,8 @@
 #include 
 #include 
 
+#include "runtime/datetime_value.h"
+#include "vec/columns/columns_common.h"
 #include "vec/common/arena.h"
 #include "vec/common/bit_cast.h"
 #include "vec/common/exception.h"
@@ -33,12 +35,6 @@
 #include "vec/common/sip_hash.h"
 #include "vec/common/unaligned.h"
 
-#include "runtime/datetime_value.h"
-
-#ifdef __SSE2__
-#include 
-#endif
-
 namespace doris::vectorized {
 
 template 
@@ -237,34 +233,30 @@ ColumnPtr ColumnVector::filter(const IColumn::Filter& 
filt, ssize_t result_si
 const UInt8* filt_end = filt_pos + size;
 const T* data_pos = data.data();
 
-#ifdef __SSE2__
 /** A slightly more optimized version.
 * Based on the assumption that often pieces of consecutive values
 *  completely pass or do not pass the filter.
 * Therefore, we will optimistically check the parts of `SIMD_BYTES` 
values.
 */
-
-static constexpr size_t SIMD_BYTES = 16;
-const __m128i zero16 = _mm_setzero_si128();
+static constexpr size_t SIMD_BYTES = 32;
 const UInt8* filt_end_sse = filt_pos + size / SIMD_BYTES * SIMD_BYTES;
 
 while (filt_pos < filt_end_sse) {
-int mask = _mm_movemask_epi8(_mm_cmpgt_epi8(
-_mm_loadu_si128(reinterpret_cast(filt_pos)), 
zero16));
+uint32_t mask = bytes32_mask_to_bits32_mask(filt_pos);
 
-if (0 == mask) {
-/// Nothing is inserted.
-} else if (0x == mask) {
+if (0x == mask) {
 res_data.insert(data_pos, data_pos + SIMD_BYTES);
 } else {
-for (size_t i = 0; i < SIMD_BYTES; ++i)
-if (filt_pos[i]) res_data.push_back(data_pos[i]);
+while (mask) {
+const size_t idx = __builtin_ctzll(mask);
+res_data.push_back(data_pos[idx]);
+mask = mask & (mask - 1);
+}
 }
 
 filt_pos += SIMD_BYTES;
 data_pos += SIMD_BYTES;
 }
-#endif
 
 while (filt_pos < filt_end) {
 if (*filt_pos) res_data.push_back(*data_pos);
diff --git a/be/src/vec/columns/columns_common.cpp 
b/be/src/vec/columns/columns_common.cpp
index 02d650a..3045c5b 100644
--- a/be/src/vec/columns/columns_common.cpp
+++ b/be/src/vec/columns/columns_common.cpp
@@ -24,6 +24,7 @@
 
 #include "vec/columns/column.h"
 #include "vec/columns/column_vector.h"
+#include "vec/columns/columns_common.h"
 #include "vec/common/typeid_cast.h"
 
 namespace doris::vectorized {
@@ -173,18 +174,13 @@ void filter_arrays_impl_generic(const PaddedPODArray& 
src_elems,
 memcpy(&res_elems[elems_size_old], &src_elems[arr_offset], arr_size * 
sizeof(T));
 };
 
-#ifdef

[incubator-doris] 25/33: [Vectorized] Rebase code from master

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 20619e795832ca6947a32787f7f8937f5c6d0411
Author: lihaopeng 
AuthorDate: Thu Jan 13 17:27:07 2022 +0800

[Vectorized] Rebase code from master
---
 be/src/exec/olap_scanner.cpp  |  2 +-
 be/src/exec/olap_scanner.h|  4 ++--
 be/src/vec/exec/join/vhash_join_node.cpp  |  2 +-
 be/src/vec/exec/volap_scanner.cpp |  7 ---
 be/src/vec/exec/volap_scanner.h   |  6 +-
 be/src/vec/functions/function_binary_arithmetic.h |  2 +-
 be/src/vec/olap/block_reader.cpp  |  2 +-
 be/src/vec/olap/block_reader.h|  4 ++--
 be/src/vec/olap/vcollect_iterator.cpp |  6 +++---
 be/src/vec/olap/vcollect_iterator.h   | 14 +++---
 be/src/vec/runtime/vdatetime_value.cpp|  2 +-
 be/src/vec/runtime/vdatetime_value.h  |  2 +-
 be/test/vec/core/block_test.cpp   |  2 +-
 13 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/be/src/exec/olap_scanner.cpp b/be/src/exec/olap_scanner.cpp
index 34336fa..2e05c5d 100644
--- a/be/src/exec/olap_scanner.cpp
+++ b/be/src/exec/olap_scanner.cpp
@@ -176,7 +176,7 @@ Status OlapScanner::_init_tablet_reader_params(
  _tablet_reader_params.rs_readers[1]->rowset()->start_version() == 
2 &&
  
!_tablet_reader_params.rs_readers[1]->rowset()->rowset_meta()->is_segments_overlapping());
 
-_params.origin_return_columns = &_return_columns;
+_tablet_reader_params.origin_return_columns = &_return_columns;
 if (_aggregation || single_version) {
 _tablet_reader_params.return_columns = _return_columns;
 _tablet_reader_params.direct_mode = true;
diff --git a/be/src/exec/olap_scanner.h b/be/src/exec/olap_scanner.h
index f234925..0c684d9 100644
--- a/be/src/exec/olap_scanner.h
+++ b/be/src/exec/olap_scanner.h
@@ -58,7 +58,7 @@ public:
 
 Status open();
 
-Status get_batch(RuntimeState* state, RowBatch* batch, bool* eof);
+virtual Status get_batch(RuntimeState* state, RowBatch* batch, bool* eof);
 
 Status close(RuntimeState* state);
 
@@ -103,7 +103,7 @@ protected:
 // Update profile that need to be reported in realtime.
 void _update_realtime_counter();
 
-virtual void set_tablet_reader() { _tablet_reader.reset(new TupleReader); }
+virtual void set_tablet_reader() { _tablet_reader = 
std::make_unique(); }
 
 protected:
 RuntimeState* _runtime_state;
diff --git a/be/src/vec/exec/join/vhash_join_node.cpp 
b/be/src/vec/exec/join/vhash_join_node.cpp
index 4533cae..9563ebf 100644
--- a/be/src/vec/exec/join/vhash_join_node.cpp
+++ b/be/src/vec/exec/join/vhash_join_node.cpp
@@ -590,7 +590,7 @@ Status HashJoinNode::init(const TPlanNode& tnode, 
RuntimeState* state) {
 
 for (const auto& filter_desc : _runtime_filter_descs) {
 
RETURN_IF_ERROR(state->runtime_filter_mgr()->regist_filter(RuntimeFilterRole::PRODUCER,
-   
filter_desc));
+   
filter_desc, state->query_options()));
 }
 
 return Status::OK();
diff --git a/be/src/vec/exec/volap_scanner.cpp 
b/be/src/vec/exec/volap_scanner.cpp
index 64a51a1..1b4bb02 100644
--- a/be/src/vec/exec/volap_scanner.cpp
+++ b/be/src/vec/exec/volap_scanner.cpp
@@ -17,6 +17,8 @@
 
 #include "vec/exec/volap_scanner.h"
 
+#include 
+
 #include "vec/columns/column_complex.h"
 #include "vec/columns/column_nullable.h"
 #include "vec/columns/column_string.h"
@@ -25,14 +27,13 @@
 #include "vec/core/block.h"
 #include "vec/exec/volap_scan_node.h"
 #include "vec/exprs/vexpr_context.h"
-#include "vec/olap/block_reader.h"
 #include "vec/runtime/vdatetime_value.h"
+
 namespace doris::vectorized {
 
 VOlapScanner::VOlapScanner(RuntimeState* runtime_state, VOlapScanNode* parent, 
bool aggregation,
bool need_agg_finalize, const TPaloScanRange& 
scan_range)
 : OlapScanner(runtime_state, parent, aggregation, need_agg_finalize, 
scan_range) {
-_reader.reset(new BlockReader);
 }
 
 Status VOlapScanner::get_block(RuntimeState* state, vectorized::Block* block, 
bool* eof) {
@@ -50,7 +51,7 @@ Status VOlapScanner::get_block(RuntimeState* state, 
vectorized::Block* block, bo
 
 do {
 // Read one block from block reader
-auto res = _reader->next_block_with_aggregation(block, nullptr, 
nullptr, eof);
+auto res = _tablet_reader->next_block_with_aggregation(block, nullptr, 
nullptr, eof);
 if (res != OLAP_SUCCESS) {
 std::stringstream ss;
 ss << "Internal Error: read storage fail. res=" << res
diff --git a/be/src/vec/exec/volap_scanner.h b/be/src/vec/exec/volap_scanner.h
index 3c66f4d..5efaf9d 100644

[incubator-doris] 33/33: [Vectorized](compile) Fix compile error and warning (#7780)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 8a1a6126b4c387eaa678f4a65e773d795e021f0a
Author: Zeno Yang 
AuthorDate: Mon Jan 17 20:27:13 2022 +0800

[Vectorized](compile) Fix compile error and warning (#7780)
---
 be/src/vec/columns/column.h| 1 +
 be/src/vec/functions/function.h| 3 +++
 be/src/vec/functions/function_grouping.cpp | 4 ++--
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h
index d58979d..71b86ae 100644
--- a/be/src/vec/columns/column.h
+++ b/be/src/vec/columns/column.h
@@ -220,6 +220,7 @@ public:
  */
 virtual Ptr filter_by_selector(const uint16_t* sel, size_t sel_size, Ptr* 
ptr = nullptr) {
 LOG(FATAL) << "column not support filter_by_selector";
+__builtin_unreachable();
 };
 
 /// Permutes elements using specified permutation. Is used in sortings.
diff --git a/be/src/vec/functions/function.h b/be/src/vec/functions/function.h
index bc76a10..cf00c08 100644
--- a/be/src/vec/functions/function.h
+++ b/be/src/vec/functions/function.h
@@ -404,6 +404,7 @@ public:
  const ColumnNumbers& 
/*arguments*/,
  size_t /*result*/) const final {
 LOG(FATAL) << "prepare is not implemented for IFunction";
+__builtin_unreachable();
 }
 
 Status prepare(FunctionContext* context, 
FunctionContext::FunctionStateScope scope) override {
@@ -412,10 +413,12 @@ public:
 
 [[noreturn]] const DataTypes& get_argument_types() const final {
 LOG(FATAL) << "get_argument_types is not implemented for IFunction";
+__builtin_unreachable();
 }
 
 [[noreturn]] const DataTypePtr& get_return_type() const final {
 LOG(FATAL) << "get_return_type is not implemented for IFunction";
+__builtin_unreachable();
 }
 
 protected:
diff --git a/be/src/vec/functions/function_grouping.cpp 
b/be/src/vec/functions/function_grouping.cpp
index 763dec2..07872ee 100644
--- a/be/src/vec/functions/function_grouping.cpp
+++ b/be/src/vec/functions/function_grouping.cpp
@@ -15,11 +15,11 @@
 // specific language governing permissions and limitations
 // under the License.
 
-#include "function_grouping.h"
+#include "vec/functions/function_grouping.h"
 
 namespace doris::vectorized {
 void register_function_grouping(SimpleFunctionFactory& factory) {
 factory.register_function();
 factory.register_function();
 }
-}
+} // namespace doris::vectorized

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 09/33: [Vectorized][Feature] fix core dump when using function override and function alias at the same time && support substr(str, int) override (#7640)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 41b165cbf10b39ec41f63958e2332aa1d5b8fc5d
Author: Pxl <952130...@qq.com>
AuthorDate: Thu Jan 6 18:57:05 2022 +0800

[Vectorized][Feature] fix core dump when using function override and 
function alias at the same time && support substr(str,int) override (#7640)
---
 be/src/vec/functions/function_string.cpp   |  19 +++--
 be/src/vec/functions/function_string.h | 109 -
 be/src/vec/functions/function_timestamp.cpp|  11 +--
 be/src/vec/functions/simple_function_factory.h |  10 ++-
 gensrc/script/doris_builtins_functions.py  |   4 +-
 5 files changed, 111 insertions(+), 42 deletions(-)

diff --git a/be/src/vec/functions/function_string.cpp 
b/be/src/vec/functions/function_string.cpp
index 73e2413..34f1c6b 100644
--- a/be/src/vec/functions/function_string.cpp
+++ b/be/src/vec/functions/function_string.cpp
@@ -293,7 +293,7 @@ struct HexStringImpl {
 dst_data_ptr++;
 offset++;
 } else {
-VStringFunctions::hex_encode(source, srclen, 
reinterpret_cast(dst_data_ptr));
+VStringFunctions::hex_encode(source, srclen, 
reinterpret_cast(dst_data_ptr));
 dst_data_ptr[srclen * 2] = '\0';
 dst_data_ptr += (srclen * 2 + 1);
 offset += (srclen * 2 + 1);
@@ -513,9 +513,9 @@ struct AesEncryptImpl {
 int cipher_len = l_size + 16;
 char p[cipher_len];
 
-int outlen =
-EncryptionUtil::encrypt(AES_128_ECB, (unsigned 
char*)l_raw, l_size,
- (unsigned char*)r_raw, r_size, NULL, 
true, (unsigned char*)p);
+int outlen = EncryptionUtil::encrypt(AES_128_ECB, (unsigned 
char*)l_raw, l_size,
+ (unsigned char*)r_raw, 
r_size, NULL, true,
+ (unsigned char*)p);
 if (outlen < 0) {
 StringOP::push_null_string(i, res_data, res_offsets, 
null_map_data);
 } else {
@@ -553,9 +553,9 @@ struct AesDecryptImpl {
 int cipher_len = l_size;
 char p[cipher_len];
 
-int outlen =
-EncryptionUtil::decrypt(AES_128_ECB, (unsigned 
char*)l_raw, l_size,
- (unsigned char*)r_raw, r_size, NULL, 
true, (unsigned char*)p);
+int outlen = EncryptionUtil::decrypt(AES_128_ECB, (unsigned 
char*)l_raw, l_size,
+ (unsigned char*)r_raw, 
r_size, NULL, true,
+ (unsigned char*)p);
 if (outlen < 0) {
 StringOP::push_null_string(i, res_data, res_offsets, 
null_map_data);
 } else {
@@ -774,7 +774,8 @@ void register_function_string(SimpleFunctionFactory& 
factory) {
 factory.register_function();
 factory.register_function();
 factory.register_function();
-factory.register_function();
+factory.register_function>();
+factory.register_function>();
 factory.register_function();
 factory.register_function();
 factory.register_function();
@@ -794,7 +795,7 @@ void register_function_string(SimpleFunctionFactory& 
factory) {
 
 factory.register_alias(FunctionLeft::name, "strleft");
 factory.register_alias(FunctionRight::name, "strright");
-factory.register_alias(FunctionSubstring::name, "substr");
+factory.register_alias(SubstringUtil::name, "substr");
 factory.register_alias(FunctionToLower::name, "lcase");
 factory.register_alias(FunctionStringMd5sum::name, "md5");
 }
diff --git a/be/src/vec/functions/function_string.h 
b/be/src/vec/functions/function_string.h
index efeef41..3f3e538 100644
--- a/be/src/vec/functions/function_string.h
+++ b/be/src/vec/functions/function_string.h
@@ -88,25 +88,9 @@ struct StringOP {
 }
 };
 
-class FunctionSubstring : public IFunction {
-public:
+struct SubstringUtil {
 static constexpr auto name = "substring";
-static FunctionPtr create() { return 
std::make_shared(); }
-String get_name() const override { return name; }
-size_t get_number_of_arguments() const override { return 3; }
-
-DataTypePtr get_return_type_impl(const DataTypes& arguments) const 
override {
-return make_nullable(std::make_shared());
-}
-
-bool use_default_implementation_for_nulls() const override { return false; 
}
-bool use_default_implementation_for_constants() const override { return 
true; }
 
-Status execute_impl(FunctionContext* context, Block& block, const 
ColumnNumbers& arguments,
-size_t result, size_t input_rows_count) override {
-substring_execute(block, arguments, result, input_rows_count);
-return S

[incubator-doris] 06/33: [Vectorized][Function] Support function and (#7618)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 5779e9b580d8c4b80e9f3f677974d1b05ee4215a
Author: Pxl <952130...@qq.com>
AuthorDate: Wed Jan 5 20:11:07 2022 +0800

[Vectorized][Function] Support function  and (#7618)
---
 be/src/vec/CMakeLists.txt  |   1 +
 be/src/vec/functions/function_utility.cpp  | 118 +
 be/src/vec/functions/math.cpp  |   1 +
 be/src/vec/functions/simple_function_factory.h |   2 +
 gensrc/script/doris_builtins_functions.py  |   4 +-
 5 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt
index 1738819..71efde5 100644
--- a/be/src/vec/CMakeLists.txt
+++ b/be/src/vec/CMakeLists.txt
@@ -110,6 +110,7 @@ set(VEC_FILES
   functions/function_cast.cpp
   functions/function_string.cpp
   functions/function_timestamp.cpp
+  functions/function_utility.cpp
   functions/comparison_equal_for_null.cpp
   functions/function_json.cpp
   functions/hll_cardinality.cpp
diff --git a/be/src/vec/functions/function_utility.cpp 
b/be/src/vec/functions/function_utility.cpp
new file mode 100644
index 000..6c7da89
--- /dev/null
+++ b/be/src/vec/functions/function_utility.cpp
@@ -0,0 +1,118 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "util/monotime.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/data_types/data_type_string.h"
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+class FunctionSleep : public IFunction {
+public:
+static constexpr auto name = "sleep";
+static FunctionPtr create() { return std::make_shared(); }
+
+String get_name() const override { return name; }
+
+size_t get_number_of_arguments() const override { return 1; }
+
+DataTypePtr get_return_type_impl(const DataTypes& arguments) const 
override {
+if (arguments[0].get()->is_nullable()) {
+return make_nullable(std::make_shared());
+}
+return std::make_shared();
+}
+
+bool use_default_implementation_for_constants() const override { return 
true; }
+bool use_default_implementation_for_nulls() const override { return false; 
}
+
+Status execute_impl(FunctionContext* context, Block& block, const 
ColumnNumbers& arguments,
+size_t result, size_t input_rows_count) override {
+ColumnPtr argument_column =
+
block.get_by_position(arguments[0]).column->convert_to_full_column_if_const();
+
+auto res_column = ColumnUInt8::create();
+
+if (auto* nullable_column = 
check_and_get_column(*argument_column)) {
+auto null_map_column = ColumnUInt8::create();
+
+auto nested_column = nullable_column->get_nested_column_ptr();
+auto data_column = assert_cast*>(nested_column.get());
+
+for (int i = 0; i < input_rows_count; i++) {
+if (nullable_column->is_null_at(i)) {
+res_column->insert(0);
+null_map_column->insert(1);
+} else {
+int seconds = data_column->get_data()[i];
+SleepFor(MonoDelta::FromSeconds(seconds));
+res_column->insert(1);
+null_map_column->insert(0);
+}
+}
+
+block.replace_by_position(result, 
ColumnNullable::create(std::move(res_column),
+ 
std::move(null_map_column)));
+} else {
+auto data_column = assert_cast*>(argument_column.get());
+
+for (int i = 0; i < input_rows_count; i++) {
+int seconds = data_column->get_element(i);
+SleepFor(MonoDelta::FromSeconds(seconds));
+res_column->insert(1);
+}
+
+block.replace_by_position(result, std::move(res_column));
+}
+return Status::OK();
+}
+};
+
+class FunctionVersion : public IFunction {
+public:
+static constexpr auto name = "version";
+
+

[incubator-doris] 04/33: [Bug] Fix bug of concat function and fold const expr (#7608)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 5f52c04d62be933c6b1f377db862989a19574778
Author: HappenLee 
AuthorDate: Tue Jan 4 06:45:32 2022 -0600

[Bug] Fix bug of concat function and fold const expr (#7608)

Co-authored-by: lihaopeng 
---
 be/src/exec/exec_node.cpp |  2 +-
 be/src/runtime/fold_constant_executor.cpp | 10 +++---
 be/src/runtime/fold_constant_executor.h   |  2 +-
 be/src/vec/exec/join/vhash_join_node.cpp  |  9 -
 be/src/vec/exec/vcross_join_node.cpp  |  1 -
 be/src/vec/functions/function_string.h| 20 
 6 files changed, 17 insertions(+), 27 deletions(-)

diff --git a/be/src/exec/exec_node.cpp b/be/src/exec/exec_node.cpp
index b18160e..4582f89 100644
--- a/be/src/exec/exec_node.cpp
+++ b/be/src/exec/exec_node.cpp
@@ -756,7 +756,7 @@ void ExecNode::reached_limit(vectorized::Block* block, 
bool* eos) {
 }
 
 _num_rows_returned += block->rows();
-if (*eos) COUNTER_SET(_rows_returned_counter, _num_rows_returned);
+COUNTER_SET(_rows_returned_counter, _num_rows_returned);
 }
 
 /*
diff --git a/be/src/runtime/fold_constant_executor.cpp 
b/be/src/runtime/fold_constant_executor.cpp
index cd6d5ff..9781c2f 100644
--- a/be/src/runtime/fold_constant_executor.cpp
+++ b/be/src/runtime/fold_constant_executor.cpp
@@ -82,7 +82,7 @@ Status FoldConstantExecutor::fold_constant_expr(
 expr_result.set_success(false);
 } else {
 expr_result.set_success(true);
-result = _get_result(src, ctx->root()->type().type);
+result = _get_result(src, 0, ctx->root()->type().type);
 }
 
 expr_result.set_content(std::move(result));
@@ -143,7 +143,8 @@ Status FoldConstantExecutor::fold_constant_vexpr(
 expr_result.set_success(false);
 } else {
 expr_result.set_success(true);
-result = _get_result((void *) 
column_ptr->get_data_at(0).data, ctx->root()->type().type);
+auto string_ref = column_ptr->get_data_at(0);
+result = _get_result((void*)string_ref.data, 
string_ref.size, ctx->root()->type().type);
 }
 
 expr_result.set_content(std::move(result));
@@ -198,7 +199,7 @@ Status FoldConstantExecutor::_prepare_and_open(Context* 
ctx) {
 }
 
 template 
-string FoldConstantExecutor::_get_result(void* src, PrimitiveType slot_type){
+string FoldConstantExecutor::_get_result(void* src, size_t size, PrimitiveType 
slot_type){
 switch (slot_type) {
 case TYPE_BOOLEAN: {
 bool val = *reinterpret_cast(src);
@@ -237,6 +238,9 @@ string FoldConstantExecutor::_get_result(void* src, 
PrimitiveType slot_type){
 case TYPE_STRING:
 case TYPE_HLL:
 case TYPE_OBJECT: {
+if constexpr (is_vec) {
+return std::string((char*)src, size);
+}
 return (reinterpret_cast(src))->to_string();
 }
 case TYPE_DATE:
diff --git a/be/src/runtime/fold_constant_executor.h 
b/be/src/runtime/fold_constant_executor.h
index c7c5a38..84c52f7 100644
--- a/be/src/runtime/fold_constant_executor.h
+++ b/be/src/runtime/fold_constant_executor.h
@@ -47,7 +47,7 @@ private:
 Status _prepare_and_open(Context* ctx);
 
 template 
-std::string _get_result(void* src, PrimitiveType slot_type);
+std::string _get_result(void* src, size_t size, PrimitiveType slot_type);
 
 std::unique_ptr _runtime_state;
 std::shared_ptr _mem_tracker;
diff --git a/be/src/vec/exec/join/vhash_join_node.cpp 
b/be/src/vec/exec/join/vhash_join_node.cpp
index 62dfffa..7606783 100644
--- a/be/src/vec/exec/join/vhash_join_node.cpp
+++ b/be/src/vec/exec/join/vhash_join_node.cpp
@@ -124,7 +124,7 @@ struct ProcessRuntimeFilterBuild {
 ProcessRuntimeFilterBuild(HashJoinNode* join_node) : _join_node(join_node) 
{}
 
 Status operator()(RuntimeState* state, HashTableContext& hash_table_ctx) {
-if (_join_node->_runtime_filter_descs.empty() || 
_join_node->_inserted_rows.empty()) {
+if (_join_node->_runtime_filter_descs.empty()) {
 return Status::OK();
 }
 VRuntimeFilterSlots* runtime_filter_slots =
@@ -162,7 +162,6 @@ struct ProcessHashTableProbe {
   _probe_block(join_node->_probe_block),
   _probe_index(join_node->_probe_index),
   _probe_raw_ptrs(join_node->_probe_columns),
-  _arena(join_node->_arena),
   _rows_returned_counter(join_node->_rows_returned_counter) {}
 
 // Only process the join with no other join conjunt, because of no other 
join conjunt
@@ -198,7 +197,7 @@ struct ProcessHashTableProbe {
_arena)) {nullptr, 
false}
 : key_getter.find_key(hash_table_ctx.hash_table, 
_probe_inde

[incubator-doris] 03/33: [Function] Fix error about rank/dense_rank/row_number return always not nullable (#7561)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 9431a93dd623f76618c31921ea1f650ce6444a6a
Author: zhangstar333 <87313068+zhangstar...@users.noreply.github.com>
AuthorDate: Tue Jan 4 11:05:32 2022 +0800

[Function] Fix error about rank/dense_rank/row_number return always not 
nullable (#7561)
---
 .../src/main/java/org/apache/doris/catalog/AggregateFunction.java   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java 
b/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java
index 4e85ca3..82e4035 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java
@@ -49,7 +49,7 @@ public class AggregateFunction extends Function {
 private static final Logger LOG = 
LogManager.getLogger(AggregateFunction.class);
 
 public static ImmutableSet 
NOT_NULLABLE_AGGREGATE_FUNCTION_NAME_SET =
-ImmutableSet.of(FunctionSet.COUNT, "ndv", 
FunctionSet.BITMAP_UNION_INT, FunctionSet.BITMAP_UNION_COUNT, 
"ndv_no_finalize");
+ImmutableSet.of("row_number", "rank", "dense_rank", 
FunctionSet.COUNT, "ndv", FunctionSet.BITMAP_UNION_INT, 
FunctionSet.BITMAP_UNION_COUNT, "ndv_no_finalize");
 
 // Set if different from retType_, null otherwise.
 private Type intermediateType;

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 17/33: [Feature][Vectorized] Support String in vec exe engine (#7670)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 1fc1c7005dccabdadeb0a5d2cd0cf0aa4871a3a8
Author: HappenLee 
AuthorDate: Mon Jan 10 20:27:45 2022 +0800

[Feature][Vectorized] Support String in vec exe engine (#7670)

Co-authored-by: lihaopeng 
---
 be/src/olap/olap_define.h  |  3 ++-
 be/src/olap/row_block2.cpp | 26 --
 be/src/olap/row_block2.h   |  2 +-
 be/src/olap/rowset/beta_rowset_reader.cpp  |  6 -
 be/src/vec/exec/vunion_node.cpp|  9 
 .../apache/doris/rewrite/FoldConstantsRule.java|  7 --
 6 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/be/src/olap/olap_define.h b/be/src/olap/olap_define.h
index c2d4b7f..a9ac731 100644
--- a/be/src/olap/olap_define.h
+++ b/be/src/olap/olap_define.h
@@ -384,7 +384,8 @@ enum OLAPStatus {
 OLAP_ERR_ROWSET_LOAD_FAILED = -3109,
 OLAP_ERR_ROWSET_READER_INIT = -3110,
 OLAP_ERR_ROWSET_READ_FAILED = -3111,
-OLAP_ERR_ROWSET_INVALID_STATE_TRANSITION = -3112
+OLAP_ERR_ROWSET_INVALID_STATE_TRANSITION = -3112,
+OLAP_ERR_STRING_OVERFLOW_IN_VEC_ENGINE = -3113
 };
 
 enum ColumnFamilyIndex {
diff --git a/be/src/olap/row_block2.cpp b/be/src/olap/row_block2.cpp
index 26b58ca..877f6a2 100644
--- a/be/src/olap/row_block2.cpp
+++ b/be/src/olap/row_block2.cpp
@@ -95,7 +95,9 @@ Status RowBlockV2::convert_to_row_block(RowCursor* helper, 
RowBlock* dst) {
 return Status::OK();
 }
 
-void RowBlockV2::_copy_data_to_column(int cid, 
doris::vectorized::MutableColumnPtr& origin_column) {
+Status RowBlockV2::_copy_data_to_column(int cid, 
doris::vectorized::MutableColumnPtr& origin_column) {
+constexpr auto MAX_SIZE_OF_VEC_STRING = 1024l * 1024;
+
 auto* column = origin_column.get();
 bool nullable_mark_array[_selected_size];
 
@@ -170,6 +172,24 @@ void RowBlockV2::_copy_data_to_column(int cid, 
doris::vectorized::MutableColumnP
 }
 break;
 }
+case OLAP_FIELD_TYPE_STRING: {
+auto column_string = assert_cast(column);
+
+for (uint16_t j = 0; j < _selected_size; ++j) {
+if (!nullable_mark_array[j]) {
+uint16_t row_idx = _selection_vector[j];
+auto slice = reinterpret_cast(column_block(cid).cell_ptr(row_idx));
+if (LIKELY(slice->size <= MAX_SIZE_OF_VEC_STRING)) {
+column_string->insert_data(slice->data, slice->size);
+} else {
+return Status::NotSupported("Not support string len over 
than 1MB in vec engine.");
+}
+} else {
+column_string->insert_default();
+}
+}
+break;
+}
 case OLAP_FIELD_TYPE_CHAR: {
 auto column_string = assert_cast(column);
 
@@ -286,13 +306,15 @@ void RowBlockV2::_copy_data_to_column(int cid, 
doris::vectorized::MutableColumnP
 DCHECK(false) << "Invalid type in RowBlockV2:" << 
_schema.column(cid)->type();
 }
 }
+
+return Status::OK();
 }
 
 Status RowBlockV2::convert_to_vec_block(vectorized::Block* block) {
 for (int i = 0; i < _schema.column_ids().size(); ++i) {
 auto cid = _schema.column_ids()[i];
 auto column = 
(*std::move(block->get_by_position(i).column)).assume_mutable();
-_copy_data_to_column(cid, column);
+RETURN_IF_ERROR(_copy_data_to_column(cid, column));
 }
 _pool->clear();
 return Status::OK();
diff --git a/be/src/olap/row_block2.h b/be/src/olap/row_block2.h
index cdbf428..b98ab95 100644
--- a/be/src/olap/row_block2.h
+++ b/be/src/olap/row_block2.h
@@ -109,7 +109,7 @@ public:
 std::string debug_string();
 
 private:
-void _copy_data_to_column(int cid, vectorized::MutableColumnPtr& 
mutable_column_ptr);
+Status _copy_data_to_column(int cid, vectorized::MutableColumnPtr& 
mutable_column_ptr);
 
 Schema _schema;
 size_t _capacity;
diff --git a/be/src/olap/rowset/beta_rowset_reader.cpp 
b/be/src/olap/rowset/beta_rowset_reader.cpp
index 459f3ca..4d35f2f 100644
--- a/be/src/olap/rowset/beta_rowset_reader.cpp
+++ b/be/src/olap/rowset/beta_rowset_reader.cpp
@@ -204,7 +204,11 @@ OLAPStatus BetaRowsetReader::next_block(vectorized::Block* 
block) {
 
 {
 SCOPED_RAW_TIMER(&_stats->block_convert_ns);
-_input_block->convert_to_vec_block(block);
+auto s = _input_block->convert_to_vec_block(block);
+if (UNLIKELY(!s.ok())) {
+LOG(WARNING) << "failed to read next block: " << s.to_string();
+return OLAP_ERR_STRING_OVERFLOW_IN_VEC_ENGINE;
+}
 }
 is_first = false;
 } while (block->rows() < _context->runtime_state->batch_size()); // here 
we should keep block.rows() < batch_size
diff --git a/be/

[incubator-doris] 14/33: [vectorized] [block] Add new method get_data_type to avoid unnecessary copy by the method get_data_type (#7600)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit e4619d98a18952a24e2e3583d3ab82da2d0a1ba8
Author: thinker 
AuthorDate: Fri Jan 7 15:37:35 2022 +0800

[vectorized] [block] Add new method get_data_type to avoid unnecessary copy 
 by the method get_data_type (#7600)

Co-authored-by: zuochunwei 
---
 be/src/vec/core/block.cpp| 4 ++--
 be/src/vec/core/block.h  | 7 ++-
 be/src/vec/olap/block_reader.cpp | 4 ++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp
index a5e445d..2aff751 100644
--- a/be/src/vec/core/block.cpp
+++ b/be/src/vec/core/block.cpp
@@ -51,7 +51,7 @@
 
 namespace doris::vectorized {
 
-inline DataTypePtr get_data_type(const PColumn& pcolumn) {
+inline DataTypePtr create_data_type(const PColumn& pcolumn) {
 switch (pcolumn.type()) {
 case PColumn::UINT8: {
 return std::make_shared();
@@ -176,7 +176,7 @@ Block::Block(const ColumnsWithTypeAndName& data_) : data 
{data_} {
 
 Block::Block(const PBlock& pblock) {
 for (const auto& pcolumn : pblock.columns()) {
-DataTypePtr type = get_data_type(pcolumn);
+DataTypePtr type = create_data_type(pcolumn);
 MutableColumnPtr data_column;
 if (pcolumn.is_null_size() > 0) {
 data_column =
diff --git a/be/src/vec/core/block.h b/be/src/vec/core/block.h
index addecf7..1c435ee 100644
--- a/be/src/vec/core/block.h
+++ b/be/src/vec/core/block.h
@@ -130,6 +130,11 @@ public:
 Names get_names() const;
 DataTypes get_data_types() const;
 
+DataTypePtr get_data_type(size_t index) const { 
+CHECK(index < data.size());
+return data[index].type; 
+}
+
 /// Returns number of rows from first column in block, not equal to 
nullptr. If no columns, returns 0.
 size_t rows() const;
 
@@ -204,7 +209,7 @@ public:
 static Status filter_block(Block* block, int filter_conlumn_id, int 
column_to_keep);
 
 static inline void erase_useless_column(Block* block, int column_to_keep) {
-for (size_t i = block->columns() - 1; i >= column_to_keep; --i) {
+for (int i = block->columns() - 1; i >= column_to_keep; --i) {
 block->erase(i);
 }
 }
diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp
index 769e27e..8e2d4b2 100644
--- a/be/src/vec/olap/block_reader.cpp
+++ b/be/src/vec/olap/block_reader.cpp
@@ -94,11 +94,11 @@ void BlockReader::_init_agg_state() {
 
 // create aggregate function
 DataTypes argument_types;
-argument_types.push_back(_next_row.block->get_data_types()[idx]);
+argument_types.push_back(_next_row.block->get_data_type(idx));
 Array params;
 AggregateFunctionPtr function = 
AggregateFunctionSimpleFactory::instance().get(
 agg_name, argument_types, params,
-_next_row.block->get_data_types()[idx]->is_nullable());
+_next_row.block->get_data_type(idx)->is_nullable());
 DCHECK(function != nullptr);
 _agg_functions.push_back(function);
 

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 10/33: [Function][Vec] add function coalesce (#7632)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit ebdc0bac985b488fda6f1a29d1c09ceedfe5315a
Author: zhangstar333 <87313068+zhangstar...@users.noreply.github.com>
AuthorDate: Thu Jan 6 19:33:25 2022 +0800

[Function][Vec] add function coalesce (#7632)
---
 be/src/vec/CMakeLists.txt  |   1 +
 be/src/vec/functions/function_coalesce.cpp | 143 +
 be/src/vec/functions/simple_function_factory.h |   2 +
 docs/.vuepress/sidebar/en.js   |   1 +
 docs/.vuepress/sidebar/zh-CN.js|   1 +
 .../sql-functions/string-functions/coalesce.md |  62 +
 .../sql-functions/string-functions/coalesce.md |  63 +
 gensrc/script/doris_builtins_functions.py  |  28 ++--
 8 files changed, 287 insertions(+), 14 deletions(-)

diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt
index 71efde5..01c69eb 100644
--- a/be/src/vec/CMakeLists.txt
+++ b/be/src/vec/CMakeLists.txt
@@ -133,6 +133,7 @@ set(VEC_FILES
   functions/function_ifnull.cpp
   functions/nullif.cpp
   functions/random.cpp
+  functions/function_coalesce.cpp
   functions/function_date_or_datetime_computation.cpp
   functions/function_date_or_datetime_to_string.cpp
   functions/function_datetime_string_to_string.cpp
diff --git a/be/src/vec/functions/function_coalesce.cpp 
b/be/src/vec/functions/function_coalesce.cpp
new file mode 100644
index 000..65d544c
--- /dev/null
+++ b/be/src/vec/functions/function_coalesce.cpp
@@ -0,0 +1,143 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "udf/udf.h"
+#include "vec/data_types/data_type_nothing.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/data_types/get_least_supertype.h"
+#include "vec/functions/function_helpers.h"
+#include "vec/functions/simple_function_factory.h"
+#include "vec/utils/util.hpp"
+
+namespace doris::vectorized {
+class FunctionCoalesce : public IFunction {
+public:
+static constexpr auto name = "coalesce";
+
+static FunctionPtr create() { return std::make_shared(); 
}
+
+String get_name() const override { return name; }
+
+bool use_default_implementation_for_constants() const override { return 
false; }
+
+bool use_default_implementation_for_nulls() const override { return false; 
}
+
+bool is_variadic() const override { return true; }
+
+size_t get_number_of_arguments() const override { return 0; }
+
+DataTypePtr get_return_type_impl(const DataTypes& arguments) const 
override {
+for (const auto& arg : arguments) {
+if (!arg->is_nullable()) {
+return arg;
+}
+}
+return arguments[0];
+}
+
+Status execute_impl(FunctionContext* context, Block& block, const 
ColumnNumbers& arguments,
+size_t result, size_t input_rows_count) override {
+DCHECK_GE(arguments.size(), 1);
+ColumnNumbers filtered_args;
+filtered_args.reserve(arguments.size());
+for (const auto& arg : arguments) {
+const auto& type = block.get_by_position(arg).type;
+if (type->only_null()) {
+continue;
+}
+filtered_args.push_back(arg);
+if (!type->is_nullable()) {
+break;
+}
+}
+
+size_t remaining_rows = input_rows_count;
+size_t argument_size = filtered_args.size();
+std::vector record_idx(input_rows_count, -1); //used to save 
column idx
+MutableColumnPtr result_column;
+
+DataTypePtr type = block.get_by_position(result).type;
+if (!type->is_nullable()) {
+result_column = type->create_column();
+} else {
+result_column = remove_nullable(type)->create_column();
+}
+
+result_column->reserve(input_rows_count);
+auto return_type = std::make_shared();
+auto null_map = ColumnUInt8::create(input_rows_count, 1);
+auto& null_map_data = null_map->get_data();
+ColumnPtr argument_columns[argument_size];
+
+  

[incubator-doris] 11/33: [Bug] Fix function nulllable not match and largetint cast failed (#7659)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit da43e38a71ac7b5394e15d1d52ea055f50684f83
Author: HappenLee 
AuthorDate: Thu Jan 6 21:52:51 2022 -0600

[Bug] Fix function nulllable not match and largetint cast failed (#7659)

Co-authored-by: lihaopeng 
---
 be/src/vec/common/cow.h |  8 
 be/src/vec/core/block.cpp   |  6 --
 be/src/vec/data_types/data_type_number_base.cpp |  2 +-
 be/src/vec/functions/date_time_transforms.h |  8 +---
 be/src/vec/io/io_helper.h   | 18 --
 gensrc/script/doris_builtins_functions.py   | 10 +-
 6 files changed, 23 insertions(+), 29 deletions(-)

diff --git a/be/src/vec/common/cow.h b/be/src/vec/common/cow.h
index 08edb89..58ae14d 100644
--- a/be/src/vec/common/cow.h
+++ b/be/src/vec/common/cow.h
@@ -105,10 +105,6 @@ protected:
 return *this;
 }
 
-unsigned int use_count() const {
-return ref_counter.load();
-}
-
 void add_ref() {
 ++ref_counter;
 }
@@ -265,6 +261,10 @@ protected:
 public:
 using MutablePtr = mutable_ptr;
 
+unsigned int use_count() const {
+return ref_counter.load();
+}
+
 protected:
 template 
 class immutable_ptr : public intrusive_ptr {
diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp
index b52257d..a5e445d 100644
--- a/be/src/vec/core/block.cpp
+++ b/be/src/vec/core/block.cpp
@@ -646,7 +646,8 @@ void Block::clear_column_data(int column_size) noexcept {
 }
 }
 for (auto& d : data) {
-(*std::move(d.column)).mutate()->clear();
+DCHECK(d.column->use_count() == 1);
+(*std::move(d.column)).assume_mutable()->clear();
 }
 }
 
@@ -691,7 +692,8 @@ Status Block::filter_block(Block* block, int 
filter_column_id, int column_to_kee
 if (auto* nullable_column = 
check_and_get_column(*filter_column)) {
 ColumnPtr nested_column = nullable_column->get_nested_column_ptr();
 
-MutableColumnPtr mutable_holder = (*std::move(nested_column)).mutate();
+MutableColumnPtr mutable_holder = nested_column->use_count() == 1 ?
+nested_column->assume_mutable() : 
nested_column->clone_resized(nested_column->size());
 
 ColumnUInt8* concrete_column = 
typeid_cast(mutable_holder.get());
 if (!concrete_column) {
diff --git a/be/src/vec/data_types/data_type_number_base.cpp 
b/be/src/vec/data_types/data_type_number_base.cpp
index ee94a37..01a4248 100644
--- a/be/src/vec/data_types/data_type_number_base.cpp
+++ b/be/src/vec/data_types/data_type_number_base.cpp
@@ -35,7 +35,7 @@ namespace doris::vectorized {
 template 
 void DataTypeNumberBase::to_string(const IColumn& column, size_t row_num,
   BufferWritable& ostr) const {
-if constexpr (std::is_same::value || std::is_same::value) {
+if constexpr (std::is_same::value) {
 std::string hex = int128_to_string(
 assert_cast&>(*column.convert_to_full_column_if_const().get())
 .get_data()[row_num]);
diff --git a/be/src/vec/functions/date_time_transforms.h 
b/be/src/vec/functions/date_time_transforms.h
index a34b53d..eaab918 100644
--- a/be/src/vec/functions/date_time_transforms.h
+++ b/be/src/vec/functions/date_time_transforms.h
@@ -56,6 +56,7 @@ TIME_FUNCTION_IMPL(WeekOfYearImpl, weekofyear, 
week(mysql_week_mode(3)));
 TIME_FUNCTION_IMPL(DayOfYearImpl, dayofyear, day_of_year());
 TIME_FUNCTION_IMPL(DayOfMonthImpl, dayofmonth, day());
 TIME_FUNCTION_IMPL(DayOfWeekImpl, dayofweek, day_of_week());
+// TODO: the method should be always not nullable
 TIME_FUNCTION_IMPL(ToDaysImpl, to_days, daynr());
 TIME_FUNCTION_IMPL(ToYearWeekImpl, yearweek, year_week(mysql_week_mode(0)));
 struct ToDateImpl {
@@ -92,7 +93,7 @@ struct DayNameImpl {
 res_data[offset - 1] = 0;
 } else {
 auto len = strlen(day_name);
-memcpy_small_allow_read_write_overflow15(&res_data[offset], 
day_name, len);
+memcpy(&res_data[offset], day_name, len);
 offset += len + 1;
 res_data[offset - 1] = 0;
 }
@@ -113,8 +114,8 @@ struct MonthNameImpl {
 res_data[offset - 1] = 0;
 } else {
 auto len = strlen(month_name);
-memcpy_small_allow_read_write_overflow15(&res_data[offset], 
month_name, len);
-offset += len + 1;
+memcpy(&res_data[offset], month_name, len);
+offset += (len + 1);
 res_data[offset - 1] = 0;
 }
 return offset;
@@ -148,6 +149,7 @@ struct DateFormatImpl {
 }
 };
 
+// TODO: This function should be depend on argments not always nullable
 struct FromUnixTimeImpl {
 using FromType = Int32;
 
diff --git a/be/src/vec/io/io_helper.h b/be/src/v

[incubator-doris] 15/33: [Vectorized] Support bloom filter predicate on vectorized engine storage layer (#7557)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit ae9d0cff0ad6aa59919eaa138307d619fffa9aeb
Author: Zeno Yang 
AuthorDate: Sat Jan 8 01:00:43 2022 +0800

[Vectorized] Support bloom filter predicate on vectorized engine storage 
layer (#7557)
---
 be/src/olap/bloom_filter_predicate.h   |  40 +-
 .../olap/bloom_filter_column_predicate_test.cpp|  36 ++
 be/test/olap/null_predicate_test.cpp   | 144 +
 3 files changed, 219 insertions(+), 1 deletion(-)

diff --git a/be/src/olap/bloom_filter_predicate.h 
b/be/src/olap/bloom_filter_predicate.h
index b3dcbbb..ff3201c 100644
--- a/be/src/olap/bloom_filter_predicate.h
+++ b/be/src/olap/bloom_filter_predicate.h
@@ -27,6 +27,10 @@
 #include "olap/field.h"
 #include "runtime/string_value.hpp"
 #include "runtime/vectorized_row_batch.h"
+#include "vec/columns/column_nullable.h"
+#include "vec/columns/column_vector.h"
+#include "vec/columns/predicate_column.h"
+#include "vec/utils/util.hpp"
 
 namespace doris {
 
@@ -59,12 +63,14 @@ public:
 return Status::OK();
 }
 
+void evaluate(vectorized::IColumn& column, uint16_t* sel, uint16_t* size) 
const override;
+
 private:
 std::shared_ptr _filter;
 SpecificFilter* _specific_filter; // owned by _filter
 };
 
-// blomm filter column predicate do not support in segment v1
+// bloom filter column predicate do not support in segment v1
 template 
 void BloomFilterColumnPredicate::evaluate(VectorizedRowBatch* batch) 
const {
 uint16_t n = batch->size();
@@ -99,6 +105,38 @@ void 
BloomFilterColumnPredicate::evaluate(ColumnBlock* block, uint16_t* se
 *size = new_size;
 }
 
+template 
+void BloomFilterColumnPredicate::evaluate(vectorized::IColumn& column, 
uint16_t* sel,
+uint16_t* size) const {
+uint16_t new_size = 0;
+using T = typename PrimitiveTypeTraits::CppType;
+
+if (column.is_nullable()) {
+auto* nullable_col = 
vectorized::check_and_get_column(column);
+auto& null_map_data = nullable_col->get_null_map_column().get_data();
+auto* pred_col = 
vectorized::check_and_get_column>(
+nullable_col->get_nested_column());
+auto& pred_col_data = pred_col->get_data();
+for (uint16_t i = 0; i < *size; i++) {
+uint16_t idx = sel[i];
+sel[new_size] = idx;
+const auto* cell_value = reinterpret_cast(&(pred_col_data[idx]));
+new_size += (!null_map_data[idx]) && 
_specific_filter->find_olap_engine(cell_value);
+}
+} else {
+auto* pred_col =
+
vectorized::check_and_get_column>(column);
+auto& pred_col_data = pred_col->get_data();
+for (uint16_t i = 0; i < *size; i++) {
+uint16_t idx = sel[i];
+sel[new_size] = idx;
+const auto* cell_value = reinterpret_cast(&(pred_col_data[idx]));
+new_size += _specific_filter->find_olap_engine(cell_value);
+}
+}
+*size = new_size;
+}
+
 class BloomFilterColumnPredicateFactory {
 public:
 static ColumnPredicate* create_column_predicate(
diff --git a/be/test/olap/bloom_filter_column_predicate_test.cpp 
b/be/test/olap/bloom_filter_column_predicate_test.cpp
index 164c51d..24abea1 100644
--- a/be/test/olap/bloom_filter_column_predicate_test.cpp
+++ b/be/test/olap/bloom_filter_column_predicate_test.cpp
@@ -28,6 +28,11 @@
 #include "runtime/string_value.hpp"
 #include "runtime/vectorized_row_batch.h"
 #include "util/logging.h"
+#include "vec/columns/column_nullable.h"
+#include "vec/columns/predicate_column.h"
+#include "vec/core/block.h"
+
+using namespace doris::vectorized;
 
 namespace doris {
 
@@ -172,6 +177,37 @@ TEST_F(TestBloomFilterColumnPredicate, FLOAT_COLUMN) {
 ASSERT_EQ(select_size, 1);
 
ASSERT_FLOAT_EQ(*(float*)col_block.cell(_row_block->selection_vector()[0]).cell_ptr(),
 5.1);
 
+// for vectorized::Block no null
+auto pred_col = PredicateColumnType::create();
+pred_col->reserve(size);
+for (int i = 0; i < size; ++i) {
+*(col_data + i) = i + 0.1f;
+pred_col->insert_data(reinterpret_cast(col_data + i), 0);
+}
+_row_block->clear();
+select_size = _row_block->selected_size();
+pred->evaluate(*pred_col, _row_block->selection_vector(), &select_size);
+ASSERT_EQ(select_size, 3);
+
ASSERT_FLOAT_EQ((float)pred_col->get_data()[_row_block->selection_vector()[0]], 
4.1);
+
ASSERT_FLOAT_EQ((float)pred_col->get_data()[_row_block->selection_vector()[1]], 
5.1);
+
ASSERT_FLOAT_EQ((float)pred_col->get_data()[_row_block->selection_vector()[2]], 
6.1);
+
+// for vectorized::Block has nulls
+auto null_map = ColumnUInt8::create(size, 0);
+auto& null_map_data = null_map->get_data();
+for (int i = 0; i < size; ++i) {
+nu

[incubator-doris] 26/33: [Vectorized][Function] Support function stddev/variance/stddev_samp/variance_samp (#7734)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 990723c15b346f0838314e53c8138e574176a5b9
Author: zhangstar333 <87313068+zhangstar...@users.noreply.github.com>
AuthorDate: Thu Jan 13 20:27:34 2022 +0800

[Vectorized][Function] Support function 
stddev/variance/stddev_samp/variance_samp (#7734)
---
 be/src/exprs/aggregate_functions.cpp   |   2 +-
 be/src/vec/CMakeLists.txt  |   1 +
 .../vec/aggregate_functions/aggregate_function.h   |   5 +
 .../aggregate_functions/aggregate_function_null.h  |   9 +-
 .../aggregate_function_simple_factory.cpp  |   6 +-
 .../aggregate_function_simple_factory.h|  20 +-
 .../aggregate_function_stddev.cpp  | 101 
 .../aggregate_function_stddev.h| 285 +
 .../java/org/apache/doris/catalog/FunctionSet.java |  68 +
 9 files changed, 486 insertions(+), 11 deletions(-)

diff --git a/be/src/exprs/aggregate_functions.cpp 
b/be/src/exprs/aggregate_functions.cpp
index a22c3ee..93166cf 100644
--- a/be/src/exprs/aggregate_functions.cpp
+++ b/be/src/exprs/aggregate_functions.cpp
@@ -1874,8 +1874,8 @@ static double compute_knuth_variance(const 
KnuthVarianceState& state, bool pop)
 static DecimalV2Value decimalv2_compute_knuth_variance(const 
DecimalV2KnuthVarianceState& state,
bool pop) {
 DecimalV2Value new_count = DecimalV2Value();
-new_count.assign_from_double(state.count);
 if (state.count == 1) return new_count;
+new_count.assign_from_double(state.count);
 DecimalV2Value new_m2 = DecimalV2Value::from_decimal_val(state.m2);
 if (pop)
 return new_m2 / new_count;
diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt
index aa302ce..9c4d947 100644
--- a/be/src/vec/CMakeLists.txt
+++ b/be/src/vec/CMakeLists.txt
@@ -31,6 +31,7 @@ set(VEC_FILES
   aggregate_functions/aggregate_function_bitmap.cpp
   aggregate_functions/aggregate_function_reader.cpp
   aggregate_functions/aggregate_function_window.cpp
+  aggregate_functions/aggregate_function_stddev.cpp
   aggregate_functions/aggregate_function_simple_factory.cpp
   columns/collator.cpp
   columns/column.cpp
diff --git a/be/src/vec/aggregate_functions/aggregate_function.h 
b/be/src/vec/aggregate_functions/aggregate_function.h
index 412382f..4c2ef36 100644
--- a/be/src/vec/aggregate_functions/aggregate_function.h
+++ b/be/src/vec/aggregate_functions/aggregate_function.h
@@ -114,6 +114,11 @@ public:
   */
 virtual bool is_state() const { return false; }
 
+/// if return false, during insert_result_into function, you colud get 
nullable result column, 
+/// so could insert to null value by yourself, rather than by 
AggregateFunctionNullBase;
+/// because you maybe be calculate a invalid value, but want to use null 
replace it;
+virtual bool insert_to_null_default() const { return true; }
+
 /** The inner loop that uses the function pointer is better than using the 
virtual function.
   * The reason is that in the case of virtual functions GCC 5.1.2 
generates code,
   *  which, at each iteration of the loop, reloads the function address 
(the offset value in the virtual function table) from memory to the register.
diff --git a/be/src/vec/aggregate_functions/aggregate_function_null.h 
b/be/src/vec/aggregate_functions/aggregate_function_null.h
index 4c61e2b..9458d7d 100644
--- a/be/src/vec/aggregate_functions/aggregate_function_null.h
+++ b/be/src/vec/aggregate_functions/aggregate_function_null.h
@@ -144,9 +144,12 @@ public:
 if constexpr (result_is_nullable) {
 ColumnNullable& to_concrete = assert_cast(to);
 if (get_flag(place)) {
-nested_function->insert_result_into(nested_place(place),
-
to_concrete.get_nested_column());
-to_concrete.get_null_map_data().push_back(0);
+if (nested_function->insert_to_null_default()) {
+nested_function->insert_result_into(nested_place(place), 
to_concrete.get_nested_column());
+to_concrete.get_null_map_data().push_back(0);
+} else {
+nested_function->insert_result_into(nested_place(place), 
to);  //want to insert into null value by self
+}
 } else {
 to_concrete.insert_default();
 }
diff --git 
a/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp 
b/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp
index 8a1995b..ba1b2ba 100644
--- a/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp
+++ b/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp
@@ -35,6 +35,7 @@ void 
register_aggregate_f

[incubator-doris] 19/33: [Vectorized][Bug] fix 'negative' function ut run fail && fix testIsBucketShuffleJoin run fail && fix some compile fail (#7688)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 9f16ac2363f2777d55e7538eda2daee46df643b0
Author: Pxl <952130...@qq.com>
AuthorDate: Tue Jan 11 10:48:02 2022 +0800

[Vectorized][Bug] fix 'negative' function ut run fail && fix 
testIsBucketShuffleJoin run fail && fix some compile fail (#7688)
---
 be/src/util/brpc_stub_cache.h  |  2 +-
 be/test/vec/function/function_hash_test.cpp| 36 +-
 be/test/vec/function/function_math_test.cpp| 33 
 .../org/apache/doris/planner/HashJoinNode.java |  2 +-
 .../java/org/apache/doris/qe/CoordinatorTest.java  |  9 --
 5 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/be/src/util/brpc_stub_cache.h b/be/src/util/brpc_stub_cache.h
index e944aac..21800f3 100644
--- a/be/src/util/brpc_stub_cache.h
+++ b/be/src/util/brpc_stub_cache.h
@@ -47,7 +47,7 @@ namespace doris {
 class BrpcStubCache {
 public:
 BrpcStubCache();
-~BrpcStubCache();
+virtual ~BrpcStubCache();
 
 inline std::shared_ptr get_stub(const 
butil::EndPoint& endpoint) {
 auto stub_ptr = _stub_map.find(endpoint);
diff --git a/be/test/vec/function/function_hash_test.cpp 
b/be/test/vec/function/function_hash_test.cpp
index 75a2ac5..d5d41b2 100644
--- a/be/test/vec/function/function_hash_test.cpp
+++ b/be/test/vec/function/function_hash_test.cpp
@@ -33,20 +33,21 @@ TEST(HashFunctionTest, murmur_hash_3_test) {
 {
 std::vector input_types = {vectorized::TypeIndex::String};
 
-DataSet data_set = {{{Null()}, Null()},
-{{std::string("hello")}, (int32_t) 1321743225}};
+DataSet data_set = {{{Null()}, Null()}, {{std::string("hello")}, 
(int32_t)1321743225}};
 
-vectorized::check_function(func_name, 
input_types, data_set);
+vectorized::check_function(func_name, 
input_types,
+data_set);
 };
 
 {
 std::vector input_types = {vectorized::TypeIndex::String,
  vectorized::TypeIndex::String};
 
-DataSet data_set = {{{std::string("hello"), std::string("world")}, 
(int32_t) 984713481},
+DataSet data_set = {{{std::string("hello"), std::string("world")}, 
(int32_t)984713481},
 {{std::string("hello"), Null()}, Null()}};
 
-vectorized::check_function(func_name, 
input_types, data_set);
+vectorized::check_function(func_name, 
input_types,
+data_set);
 };
 
 {
@@ -54,10 +55,12 @@ TEST(HashFunctionTest, murmur_hash_3_test) {
  vectorized::TypeIndex::String,
  vectorized::TypeIndex::String};
 
-DataSet data_set = {{{std::string("hello"), std::string("world"), 
std::string("!")}, (int32_t) -666935433},
+DataSet data_set = {{{std::string("hello"), std::string("world"), 
std::string("!")},
+ (int32_t)-666935433},
 {{std::string("hello"), std::string("world"), 
Null()}, Null()}};
 
-vectorized::check_function(func_name, 
input_types, data_set);
+vectorized::check_function(func_name, 
input_types,
+data_set);
 };
 }
 
@@ -68,19 +71,22 @@ TEST(HashFunctionTest, murmur_hash_2_test) {
 std::vector input_types = {vectorized::TypeIndex::String};
 
 DataSet data_set = {{{Null()}, Null()},
-{{std::string("hello")}, (uint64_t) 
2191231550387646743}};
+{{std::string("hello")}, 
(uint64_t)2191231550387646743ull}};
 
-vectorized::check_function(func_name, input_types, data_set);
+vectorized::check_function(func_name, input_types,
+ data_set);
 };
 
 {
 std::vector input_types = {vectorized::TypeIndex::String,
  vectorized::TypeIndex::String};
 
-DataSet data_set = {{{std::string("hello"), std::string("world")}, 
(uint64_t) 11978658642541747642l},
-{{std::string("hello"), Null()}, Null()}};
+DataSet data_set = {
+{{std::string("hello"), std::string("world")}, 
(uint64_t)11978658642541747642ull},
+{{std::string("hello"), Null()}, Null()}};
 
-vectorized::check_function(func_name, input_types, data_set);
+vectorized::check_function(func_name, input_types,
+ data_set);
 };
 
 {
@@ -88,10 +94,12 @@ TEST(HashFunctionTest, murmur_hash_2_test) {
   

[incubator-doris] 27/33: [Vectorization] Support SegmentIterator vectorization (#7613)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit e23e332f75f6d1b3891c9adcd7ba85f9a89f765e
Author: wangbo <506340...@qq.com>
AuthorDate: Fri Jan 14 11:44:27 2022 +0800

[Vectorization] Support SegmentIterator vectorization (#7613)
---
 be/src/olap/column_predicate.h |   2 +
 be/src/olap/comparison_predicate.cpp   |   2 +-
 be/src/olap/in_list_predicate.cpp  |  39 +++
 be/src/olap/in_list_predicate.h|   7 +-
 be/src/olap/rowset/segment_v2/binary_dict_page.cpp |  21 +-
 be/src/olap/rowset/segment_v2/binary_plain_page.h  |  34 +-
 be/src/olap/rowset/segment_v2/bitshuffle_page.h|  36 +-
 be/src/olap/rowset/segment_v2/segment_iterator.cpp | 362 -
 be/src/olap/rowset/segment_v2/segment_iterator.h   |  29 +-
 be/src/olap/schema.cpp |  66 
 be/src/olap/schema.h   |   4 +
 be/src/vec/columns/column.h|  13 +-
 be/src/vec/columns/column_complex.h|   2 +
 be/src/vec/columns/column_nullable.cpp |  15 +-
 be/src/vec/columns/column_nullable.h   |   6 +
 be/src/vec/columns/column_vector.h |  26 ++
 be/src/vec/columns/predicate_column.h  |  27 +-
 17 files changed, 661 insertions(+), 30 deletions(-)

diff --git a/be/src/olap/column_predicate.h b/be/src/olap/column_predicate.h
index 10b8a91..6b1aa23 100644
--- a/be/src/olap/column_predicate.h
+++ b/be/src/olap/column_predicate.h
@@ -69,6 +69,8 @@ public:
 virtual void evaluate_vec(vectorized::IColumn& column, uint16_t size, 
bool* flags) const {};
 uint32_t column_id() const { return _column_id; }
 
+virtual bool is_in_predicate() { return false; }
+
 protected:
 uint32_t _column_id;
 bool _opposite;
diff --git a/be/src/olap/comparison_predicate.cpp 
b/be/src/olap/comparison_predicate.cpp
index 598e7f3..a154a04 100644
--- a/be/src/olap/comparison_predicate.cpp
+++ b/be/src/olap/comparison_predicate.cpp
@@ -188,7 +188,7 @@ COMPARISON_PRED_COLUMN_EVALUATE(GreaterEqualPredicate, >=)
 void CLASS::evaluate_vec(vectorized::IColumn& column, uint16_t size, 
bool* flags) const {   \
 if (column.is_nullable()) {
   \
 auto* nullable_column = 
vectorized::check_and_get_column(column);   
  \
-auto& data_array = reinterpret_cast&>(nullable_column->get_nested_column()).get_data();
  \
+auto& data_array = reinterpret_cast&>(nullable_column->get_nested_column()).get_data();
  \
 auto& null_bitmap = reinterpret_cast&>(*(nullable_column->get_null_map_column_ptr())).get_data();
 \
 for (uint16_t i = 0; i < size; i++) {  
   \
 flags[i] = (data_array[i] OP _value) && (!null_bitmap[i]); 
   \
diff --git a/be/src/olap/in_list_predicate.cpp 
b/be/src/olap/in_list_predicate.cpp
index c167a17..a17e157 100644
--- a/be/src/olap/in_list_predicate.cpp
+++ b/be/src/olap/in_list_predicate.cpp
@@ -20,6 +20,8 @@
 #include "olap/field.h"
 #include "runtime/string_value.hpp"
 #include "runtime/vectorized_row_batch.h"
+#include "vec/columns/predicate_column.h"
+#include "vec/columns/column_nullable.h"
 
 namespace doris {
 
@@ -115,6 +117,43 @@ IN_LIST_PRED_EVALUATE(NotInListPredicate, ==)
 IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(InListPredicate, !=)
 IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(NotInListPredicate, ==)
 
+#define IN_LIST_PRED_COLUMN_EVALUATE(CLASS, OP)
\
+template   
\
+void CLASS::evaluate(vectorized::IColumn& column, uint16_t* sel, 
uint16_t* size) const { \
+uint16_t new_size = 0; 
\
+if (column.is_nullable()) {
\
+auto* nullable_column =
\
+
vectorized::check_and_get_column(column);   
   \
+auto& null_bitmap = reinterpret_cast&>(*(  \
+nullable_column->get_null_map_column_ptr())).get_data();   
\
+auto* nest_column_vector = vectorized::check_and_get_column
\
+
>(nullable_column->get_nested_column());  
   \
+auto& data_array = nest_

[incubator-doris] 29/33: [Vectorized][feature](planner)(executor) Support grouping sets rollup cube (#7601)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 8760e644f4086809f214f0d105f0ae4492024ca6
Author: anneji-dev <85534151+anneji-...@users.noreply.github.com>
AuthorDate: Mon Jan 17 14:09:15 2022 +0800

[Vectorized][feature](planner)(executor) Support grouping sets rollup cube 
(#7601)
---
 be/src/exec/exec_node.cpp  |   8 +-
 be/src/exec/repeat_node.h  |   2 +-
 be/src/vec/CMakeLists.txt  |   2 +
 be/src/vec/exec/vrepeat_node.cpp   | 245 +
 be/src/vec/exec/vrepeat_node.h |  56 +
 be/src/vec/functions/function_grouping.cpp |  25 +++
 be/src/vec/functions/function_grouping.h   |  90 
 be/src/vec/functions/simple_function_factory.h |   2 +
 .../apache/doris/planner/SingleNodePlanner.java|  10 +
 gensrc/script/doris_builtins_functions.py  |   4 +-
 10 files changed, 440 insertions(+), 4 deletions(-)

diff --git a/be/src/exec/exec_node.cpp b/be/src/exec/exec_node.cpp
index 4582f89..97c3259 100644
--- a/be/src/exec/exec_node.cpp
+++ b/be/src/exec/exec_node.cpp
@@ -82,6 +82,7 @@
 #include "vec/exprs/vexpr.h"
 #include "vec/exec/vempty_set_node.h"
 #include "vec/exec/vschema_scan_node.h"
+#include "vec/exec/vrepeat_node.h"
 namespace doris {
 
 const std::string ExecNode::ROW_THROUGHPUT_COUNTER = "RowsReturnedRate";
@@ -389,6 +390,7 @@ Status ExecNode::create_node(RuntimeState* state, 
ObjectPool* pool, const TPlanN
 case TPlanNodeType::SCHEMA_SCAN_NODE:
 case TPlanNodeType::ANALYTIC_EVAL_NODE:
 case TPlanNodeType::SELECT_NODE:
+case TPlanNodeType::REPEAT_NODE:
 break;
 default: {
 const auto& i = 
_TPlanNodeType_VALUES_TO_NAMES.find(tnode.node_type);
@@ -568,7 +570,11 @@ Status ExecNode::create_node(RuntimeState* state, 
ObjectPool* pool, const TPlanN
 return Status::OK();
 
 case TPlanNodeType::REPEAT_NODE:
-*node = pool->add(new RepeatNode(pool, tnode, descs));
+if (state->enable_vectorized_exec()) {
+*node = pool->add(new vectorized::VRepeatNode(pool, tnode, descs));
+} else {
+*node = pool->add(new RepeatNode(pool, tnode, descs));
+}
 return Status::OK();
 
 case TPlanNodeType::ASSERT_NUM_ROWS_NODE:
diff --git a/be/src/exec/repeat_node.h b/be/src/exec/repeat_node.h
index 01335d2..d9dce75 100644
--- a/be/src/exec/repeat_node.h
+++ b/be/src/exec/repeat_node.h
@@ -40,7 +40,7 @@ public:
 protected:
 virtual void debug_string(int indentation_level, std::stringstream* out) 
const override;
 
-private:
+protected:
 Status get_repeated_batch(RowBatch* child_row_batch, int repeat_id_idx, 
RowBatch* row_batch);
 
 // Slot id set used to indicate those slots need to set to null.
diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt
index 9c4d947..f737391 100644
--- a/be/src/vec/CMakeLists.txt
+++ b/be/src/vec/CMakeLists.txt
@@ -86,6 +86,7 @@ set(VEC_FILES
   exec/vempty_set_node.cpp
   exec/vanalytic_eval_node.cpp
   exec/vassert_num_rows_node.cpp
+  exec/vrepeat_node.cpp
   exec/join/vhash_join_node.cpp
   exprs/vectorized_agg_fn.cpp
   exprs/vectorized_fn_call.cpp
@@ -139,6 +140,7 @@ set(VEC_FILES
   functions/function_date_or_datetime_computation.cpp
   functions/function_date_or_datetime_to_string.cpp
   functions/function_datetime_string_to_string.cpp
+  functions/function_grouping.cpp
   olap/vgeneric_iterators.cpp
   olap/vcollect_iterator.cpp
   olap/block_reader.cpp
diff --git a/be/src/vec/exec/vrepeat_node.cpp b/be/src/vec/exec/vrepeat_node.cpp
new file mode 100644
index 000..dd8bb28
--- /dev/null
+++ b/be/src/vec/exec/vrepeat_node.cpp
@@ -0,0 +1,245 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/exec/vrepeat_node.h"
+#include "exprs/expr.h"
+#include "gutil/strings/join.h"
+#include "runtime/runtime_state.h"
+#include "util/runtime_profile.h"
+
+namespace doris::vectorized {
+VRepeatNode::VRepeatNode(ObjectPool* pool, const TPlanNode& tnode, const 
Des

[incubator-doris] 24/33: [Vectorized][Bug] Bitmap/HLL type no support cast to varchar/char (#7737)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 30de672b2e45a83a95706a1f1f100054b436f33c
Author: HappenLee 
AuthorDate: Thu Jan 13 13:34:29 2022 +0800

[Vectorized][Bug] Bitmap/HLL type no support cast to varchar/char (#7737)

Co-authored-by: lihaopeng 
---
 be/src/vec/data_types/data_type_bitmap.cpp | 9 +
 be/src/vec/data_types/data_type_bitmap.h   | 1 +
 be/src/vec/data_types/data_type_string.cpp | 8 
 be/src/vec/data_types/data_type_string.h   | 1 +
 4 files changed, 19 insertions(+)

diff --git a/be/src/vec/data_types/data_type_bitmap.cpp 
b/be/src/vec/data_types/data_type_bitmap.cpp
index 4daa72e..c6bc9f0 100644
--- a/be/src/vec/data_types/data_type_bitmap.cpp
+++ b/be/src/vec/data_types/data_type_bitmap.cpp
@@ -90,4 +90,13 @@ void DataTypeBitMap::deserialize_as_stream(BitmapValue& 
value, BufferReadable& b
 read_string_binary(ref, buf);
 value.deserialize(ref.data);
 }
+
+void DataTypeBitMap::to_string(const class doris::vectorized::IColumn& column, 
size_t row_num,
+doris::vectorized::BufferWritable& ostr) const {
+auto& data = const_cast(assert_cast(column).get_element(row_num));
+std::string result(data.getSizeInBytes(), '0');
+data.write((char*)result.data());
+
+ostr.write(result.data(), result.size());
+}
 } // namespace doris::vectorized
diff --git a/be/src/vec/data_types/data_type_bitmap.h 
b/be/src/vec/data_types/data_type_bitmap.h
index 69f5540..c2166fb 100644
--- a/be/src/vec/data_types/data_type_bitmap.h
+++ b/be/src/vec/data_types/data_type_bitmap.h
@@ -65,6 +65,7 @@ public:
 bool can_be_inside_low_cardinality() const override { return false; }
 
 std::string to_string(const IColumn& column, size_t row_num) const { 
return "BitMap()"; }
+void to_string(const IColumn &column, size_t row_num, BufferWritable 
&ostr) const override;
 
 [[noreturn]] virtual Field get_default() const {
 LOG(FATAL) << "Method get_default() is not implemented for data type " 
<< get_name();
diff --git a/be/src/vec/data_types/data_type_string.cpp 
b/be/src/vec/data_types/data_type_string.cpp
index f481721..86b0aac 100644
--- a/be/src/vec/data_types/data_type_string.cpp
+++ b/be/src/vec/data_types/data_type_string.cpp
@@ -58,6 +58,14 @@ std::string DataTypeString::to_string(const IColumn& column, 
size_t row_num) con
 return s.to_string();
 }
 
+void DataTypeString::to_string(const class doris::vectorized::IColumn & 
column, size_t row_num,
+class doris::vectorized::BufferWritable & ostr) const {
+const StringRef& s =
+assert_cast(*column.convert_to_full_column_if_const().get())
+.get_data_at(row_num);
+ostr.write(s.data, s.size);
+}
+
 Field DataTypeString::get_default() const {
 return String();
 }
diff --git a/be/src/vec/data_types/data_type_string.h 
b/be/src/vec/data_types/data_type_string.h
index 9d5b21b..8506473 100644
--- a/be/src/vec/data_types/data_type_string.h
+++ b/be/src/vec/data_types/data_type_string.h
@@ -55,6 +55,7 @@ public:
 bool can_be_inside_nullable() const override { return true; }
 bool can_be_inside_low_cardinality() const override { return true; }
 std::string to_string(const IColumn& column, size_t row_num) const;
+void to_string(const IColumn &column, size_t row_num, BufferWritable 
&ostr) const override;
 };
 
 } // namespace doris::vectorized

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 31/33: [Vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747 (#7751)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit f45d2a2144179bcc64b807a68f4a6befdc796383
Author: zuochunwei 
AuthorDate: Mon Jan 17 17:10:24 2022 +0800

[Vectorized](improving) (exec) optimize VDataStreamSender's send() 
performance #7747 (#7751)
---
 be/src/vec/columns/column.h |  4 ++
 be/src/vec/columns/column_complex.h | 10 -
 be/src/vec/columns/column_const.h   |  4 ++
 be/src/vec/columns/column_decimal.h | 10 +
 be/src/vec/columns/column_dummy.h   |  4 ++
 be/src/vec/columns/column_nullable.cpp  |  6 +++
 be/src/vec/columns/column_nullable.h|  1 +
 be/src/vec/columns/column_string.cpp|  6 +++
 be/src/vec/columns/column_string.h  |  2 +
 be/src/vec/columns/column_vector.cpp| 10 +
 be/src/vec/columns/column_vector.h  |  2 +
 be/src/vec/columns/predicate_column.h   |  6 ++-
 be/src/vec/core/block.cpp   | 13 +-
 be/src/vec/core/block.h |  2 +
 be/src/vec/sink/vdata_stream_sender.cpp | 79 ++---
 be/src/vec/sink/vdata_stream_sender.h   | 36 ++-
 16 files changed, 164 insertions(+), 31 deletions(-)

diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h
index a869a65..d58979d 100644
--- a/be/src/vec/columns/column.h
+++ b/be/src/vec/columns/column.h
@@ -160,6 +160,10 @@ public:
 virtual void insert_many_from(const IColumn& src, size_t position, size_t 
length) {
 for (size_t i = 0; i < length; ++i) insert_from(src, position);
 }
+ 
+/// Appends a batch elements from other column with the same type
+/// indices_begin + indices_end represent the row indices of column src
+virtual void insert_indices_from(const IColumn& src, const int* 
indices_begin, const int* indices_end) = 0;
 
 /// Appends data located in specified memory chunk if it is possible 
(throws an exception if it cannot be implemented).
 /// Is used to optimize some computations (in aggregation, for example).
diff --git a/be/src/vec/columns/column_complex.h 
b/be/src/vec/columns/column_complex.h
index 296f94b..18794d3 100644
--- a/be/src/vec/columns/column_complex.h
+++ b/be/src/vec/columns/column_complex.h
@@ -127,6 +127,14 @@ public:
 data.insert(data.end(), st, ed);
 }
 
+void insert_indices_from(const IColumn& src, const int* indices_begin, 
const int* indices_end) override {
+const Self& src_vec = assert_cast(src);
+data.reserve(size() + (indices_end - indices_begin));
+for (auto x = indices_begin; x != indices_end; ++x) {
+data.push_back(src_vec.get_element(*x));
+}
+}
+
 void pop_back(size_t n) { data.erase(data.end() - n, data.end()); }
 // it's impossable to use ComplexType as key , so we don't have to 
implemnt them
 [[noreturn]] StringRef serialize_value_into_arena(size_t n, Arena& arena,
@@ -286,4 +294,4 @@ ColumnPtr ColumnComplexType::replicate(const 
IColumn::Offsets& offsets) const
 }
 
 using ColumnBitmap = ColumnComplexType;
-} // namespace doris::vectorized
\ No newline at end of file
+} // namespace doris::vectorized
diff --git a/be/src/vec/columns/column_const.h 
b/be/src/vec/columns/column_const.h
index 703e226..e019c56 100644
--- a/be/src/vec/columns/column_const.h
+++ b/be/src/vec/columns/column_const.h
@@ -84,6 +84,10 @@ public:
 s += length;
 }
 
+void insert_indices_from(const IColumn& src, const int* indices_begin, 
const int* indices_end) override {
+s += (indices_end - indices_begin);
+}
+
 void insert(const Field&) override { ++s; }
 
 void insert_data(const char*, size_t) override { ++s; }
diff --git a/be/src/vec/columns/column_decimal.h 
b/be/src/vec/columns/column_decimal.h
index 46412cb..67f4fa9 100644
--- a/be/src/vec/columns/column_decimal.h
+++ b/be/src/vec/columns/column_decimal.h
@@ -26,6 +26,7 @@
 #include "vec/columns/column_impl.h"
 #include "vec/columns/column_vector_helper.h"
 #include "vec/common/typeid_cast.h"
+#include "vec/common/assert_cast.h"
 #include "vec/core/field.h"
 
 namespace doris::vectorized {
@@ -95,6 +96,15 @@ public:
 void insert_from(const IColumn& src, size_t n) override {
 data.push_back(static_cast(src).get_data()[n]);
 }
+
+void insert_indices_from(const IColumn& src, const int* indices_begin, 
const int* indices_end) override {
+const Self& src_vec = assert_cast(src);
+data.reserve(size() + (indices_end - indices_begin));
+for (auto x = indices_begin; x != indices_end; ++x) {
+data.push_back_without_reserve(src_vec.get_element(*x));
+}
+}
+
 void insert_data(const char* pos, size_t /*length*/) override;
 void insert_default() override { data.push_back(T()); }
 void insert(const Field& x) override {
diff --git a/be/src/vec/columns/column_dummy.h 
b/be/src/vec/col

[incubator-doris] 20/33: [Vectorized][Enhancement] fix some bug & improve some code (#7714)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit b20b5b7e4310a91c6cb42a8cc1ba4c6850bd3af2
Author: Pxl <952130...@qq.com>
AuthorDate: Wed Jan 12 09:57:10 2022 +0800

[Vectorized][Enhancement] fix some bug & improve some code (#7714)
---
 .../aggregate_function_reader.cpp  |  8 +++-
 be/src/vec/exec/volap_scan_node.cpp|  2 ++
 be/src/vec/olap/block_reader.cpp   | 22 ++
 be/src/vec/olap/block_reader.h |  4 +++-
 run-be-ut.sh   |  2 +-
 5 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/be/src/vec/aggregate_functions/aggregate_function_reader.cpp 
b/be/src/vec/aggregate_functions/aggregate_function_reader.cpp
index 9a24ac5..3594d51 100644
--- a/be/src/vec/aggregate_functions/aggregate_function_reader.cpp
+++ b/be/src/vec/aggregate_functions/aggregate_function_reader.cpp
@@ -23,9 +23,8 @@ namespace doris::vectorized {
 void register_aggregate_function_reader(AggregateFunctionSimpleFactory& 
factory) {
 // add a suffix to the function name here to distinguish special functions 
of agg reader
 auto register_function_reader = [&](const std::string& name,
-const AggregateFunctionCreator& 
creator,
-bool nullable = false) {
-factory.register_function(name + agg_reader_suffix, creator, nullable);
+const AggregateFunctionCreator& 
creator) {
+factory.register_function(name + agg_reader_suffix, creator, false);
 };
 
 register_function_reader("sum", create_aggregate_function_sum_reader);
@@ -38,8 +37,7 @@ void 
register_aggregate_function_reader(AggregateFunctionSimpleFactory& factory)
 
 void 
register_aggregate_function_reader_no_spread(AggregateFunctionSimpleFactory& 
factory) {
 auto register_function_reader = [&](const std::string& name,
-const AggregateFunctionCreator& 
creator,
-bool nullable = false) {
+const AggregateFunctionCreator& 
creator, bool nullable) {
 factory.register_function(name + agg_reader_suffix, creator, nullable);
 };
 
diff --git a/be/src/vec/exec/volap_scan_node.cpp 
b/be/src/vec/exec/volap_scan_node.cpp
index da7a204..b365c1d 100644
--- a/be/src/vec/exec/volap_scan_node.cpp
+++ b/be/src/vec/exec/volap_scan_node.cpp
@@ -259,6 +259,8 @@ void VOlapScanNode::scanner_thread(VOlapScanner* scanner) {
 }
 _scan_cpu_timer->update(cpu_watch.elapsed_time());
 _scanner_wait_worker_timer->update(wait_time);
+
+std::unique_lock l(_scan_blocks_lock);
 _running_thread--;
 
 // The transfer thead will wait for `_running_thread==0`, to make sure all 
scanner threads won't access class members.
diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp
index ef3ba3a..f7a1388 100644
--- a/be/src/vec/olap/block_reader.cpp
+++ b/be/src/vec/olap/block_reader.cpp
@@ -72,6 +72,10 @@ OLAPStatus BlockReader::_init_collect_iter(const 
ReaderParams& read_params,
 }
 
 void BlockReader::_init_agg_state() {
+if (_eof) {
+return;
+}
+
 _stored_data_block = 
_next_row.block->create_same_struct_block(_batch_size);
 _stored_data_columns = _stored_data_block->mutate_columns();
 
@@ -260,7 +264,8 @@ OLAPStatus BlockReader::_unique_key_next_block(Block* 
block, MemPool* mem_pool,
 void BlockReader::_insert_data_normal(MutableColumns& columns) {
 auto block = _next_row.block;
 for (auto idx : _normal_columns_idx) {
-
columns[_return_columns_loc[idx]]->insert_from(*block->get_by_position(idx).column,
 _next_row.row_pos);
+
columns[_return_columns_loc[idx]]->insert_from(*block->get_by_position(idx).column,
+   _next_row.row_pos);
 }
 }
 
@@ -270,7 +275,7 @@ void BlockReader::_append_agg_data(MutableColumns& columns) 
{
 
 // execute aggregate when have `batch_size` column or some ref invalid soon
 bool is_last = (_next_row.block->rows() == _next_row.row_pos + 1);
-if (_stored_row_ref.size() == _batch_size || is_last) {
+if (is_last || _stored_row_ref.size() == _batch_size) {
 _update_agg_data(columns);
 }
 }
@@ -301,11 +306,9 @@ void BlockReader::_update_agg_data(MutableColumns& 
columns) {
 }
 
 void BlockReader::_copy_agg_data() {
-phmap::flat_hash_map>> temp_ref_map;
-
 for (int i = 0; i < _stored_row_ref.size(); i++) {
 auto& ref = _stored_row_ref[i];
-temp_ref_map[ref.block].emplace_back(ref.row_pos, i);
+_temp_ref_map[ref.block].emplace_back(ref.row_pos, i);
 }
 
 for (auto idx : _agg_columns_idx) {
@@ -314,11 +317,11 @@ void BlockReader::_co

[incubator-doris] 32/33: [Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit b634ea62bbdef860a5bafffb078e6f128bfaaaff
Author: HappenLee 
AuthorDate: Mon Jan 17 20:11:26 2022 +0800

[Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778)

Co-authored-by: lihaopeng 
---
 be/src/vec/exec/vrepeat_node.cpp   | 13 ++---
 be/src/vec/functions/simple_function_factory.h |  2 +-
 build.sh   |  2 +-
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/be/src/vec/exec/vrepeat_node.cpp b/be/src/vec/exec/vrepeat_node.cpp
index dd8bb28..287aa6e 100644
--- a/be/src/vec/exec/vrepeat_node.cpp
+++ b/be/src/vec/exec/vrepeat_node.cpp
@@ -104,6 +104,7 @@ Status VRepeatNode::get_repeated_block(Block* child_block, 
int repeat_id_idx, Bl
 std::set& repeat_ids = _slot_id_set_list[repeat_id_idx];
 bool is_repeat_slot = _all_slot_ids.find(_output_slots[cur_col]->id()) 
!= _all_slot_ids.end();
 bool is_set_null_slot = repeat_ids.find(_output_slots[cur_col]->id()) 
== repeat_ids.end();
+const auto column_size = src_column.column->size();
 
 if (is_repeat_slot) {
 DCHECK(_output_slots[cur_col]->is_nullable());
@@ -113,21 +114,19 @@ Status VRepeatNode::get_repeated_block(Block* 
child_block, int repeat_id_idx, Bl
 
 // set slot null not in repeat_ids
 if (is_set_null_slot) {
-nullable_column->resize(src_column.column->size());
-for (size_t j = 0; j < src_column.column->size(); ++j) {
-nullable_column->insert_data(nullptr, 0);
-}
+nullable_column->resize(column_size);
+memset(nullable_column->get_null_map_data().data(), 1, 
sizeof(UInt8) * column_size);
 } else {
 if (!src_column.type->is_nullable()) {
-for (size_t j = 0; j < src_column.column->size(); ++j) {
+for (size_t j = 0; j < column_size; ++j) {
 null_map.push_back(0);
 }
 column_ptr = &nullable_column->get_nested_column();
 }
-column_ptr->insert_range_from(*src_column.column, 0, 
src_column.column->size());
+column_ptr->insert_range_from(*src_column.column, 0, 
column_size);
 }
 } else {
-columns[cur_col]->insert_range_from(*src_column.column, 0, 
src_column.column->size());
+columns[cur_col]->insert_range_from(*src_column.column, 0, 
column_size);
 }
 cur_col++;
 }
diff --git a/be/src/vec/functions/simple_function_factory.h 
b/be/src/vec/functions/simple_function_factory.h
index 45420cb..5b9f0fd 100644
--- a/be/src/vec/functions/simple_function_factory.h
+++ b/be/src/vec/functions/simple_function_factory.h
@@ -66,7 +66,7 @@ void register_function_like(SimpleFunctionFactory& factory);
 void register_function_regexp(SimpleFunctionFactory& factory);
 void register_function_random(SimpleFunctionFactory& factory);
 void register_function_coalesce(SimpleFunctionFactory& factory);
-+void register_function_grouping(SimpleFunctionFactory& factory);
+void register_function_grouping(SimpleFunctionFactory& factory);
 
 class SimpleFunctionFactory {
 using Creator = std::function;
diff --git a/build.sh b/build.sh
index d842b07..a5ccaae 100755
--- a/build.sh
+++ b/build.sh
@@ -104,7 +104,7 @@ fi
 
 eval set -- "$OPTS"
 
-PARALLEL=$[$(nproc)+1]
+PARALLEL=$[$(nproc)/4+1]
 BUILD_BE=
 BUILD_FE=
 BUILD_UI=

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] 21/33: [Vectorized][Enhancement] use simd to speed up coalesce and if_not_null function (#7722)

2022-01-17 Thread lihaopeng
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit f3ce1cba69aa27f8ae0af63451bab41d5048c367
Author: HappenLee 
AuthorDate: Wed Jan 12 13:20:48 2022 +0800

[Vectorized][Enhancement] use simd to speed up coalesce and if_not_null 
function (#7722)

Co-authored-by: lihaopeng 
---
 be/src/vec/functions/function_coalesce.cpp | 210 -
 be/src/vec/functions/is_not_null.cpp   |   4 +-
 be/src/vec/functions/simple_function_factory.h |   5 +-
 be/test/vec/function/function_string_test.cpp  |  42 +
 4 files changed, 214 insertions(+), 47 deletions(-)

diff --git a/be/src/vec/functions/function_coalesce.cpp 
b/be/src/vec/functions/function_coalesce.cpp
index 65d544c..99b6110 100644
--- a/be/src/vec/functions/function_coalesce.cpp
+++ b/be/src/vec/functions/function_coalesce.cpp
@@ -28,6 +28,8 @@ class FunctionCoalesce : public IFunction {
 public:
 static constexpr auto name = "coalesce";
 
+mutable FunctionBasePtr func_is_not_null;
+
 static FunctionPtr create() { return std::make_shared(); 
}
 
 String get_name() const override { return name; }
@@ -41,47 +43,70 @@ public:
 size_t get_number_of_arguments() const override { return 0; }
 
 DataTypePtr get_return_type_impl(const DataTypes& arguments) const 
override {
+DataTypePtr res;
 for (const auto& arg : arguments) {
 if (!arg->is_nullable()) {
-return arg;
+res = arg;
+break;
 }
 }
-return arguments[0];
+
+res = res ? res : arguments[0];
+
+const ColumnsWithTypeAndName is_not_null_col{
+{nullptr, make_nullable(res), ""}
+};
+func_is_not_null = SimpleFunctionFactory::instance().
+get_function("is_not_null_pred", is_not_null_col, 
std::make_shared());
+
+return res;
 }
 
 Status execute_impl(FunctionContext* context, Block& block, const 
ColumnNumbers& arguments,
 size_t result, size_t input_rows_count) override {
 DCHECK_GE(arguments.size(), 1);
+DataTypePtr result_type = block.get_by_position(result).type;
 ColumnNumbers filtered_args;
 filtered_args.reserve(arguments.size());
-for (const auto& arg : arguments) {
-const auto& type = block.get_by_position(arg).type;
-if (type->only_null()) {
-continue;
-}
-filtered_args.push_back(arg);
-if (!type->is_nullable()) {
-break;
+
+for (size_t i = 0; i < arguments.size(); ++i) {
+const auto& arg_type = block.get_by_position(arguments[i]).type;
+filtered_args.push_back(arguments[i]);
+if (!arg_type->is_nullable()) {
+if (i == 0) { //if the first column not null, return it's 
directly
+block.get_by_position(result).column = 
block.get_by_position(arguments[0]).column;
+return Status::OK();
+} else {
+break;
+}
 }
 }
 
 size_t remaining_rows = input_rows_count;
 size_t argument_size = filtered_args.size();
-std::vector record_idx(input_rows_count, -1); //used to save 
column idx
+std::vector record_idx(input_rows_count, 0); //used to save 
column idx, record the result data of each row from which column
+std::vector filled_flags(input_rows_count, 0); //used to save 
filled flag, in order to check current row whether have filled data
+
 MutableColumnPtr result_column;
+if (!result_type->is_nullable()) {
+result_column = result_type->create_column();
+} else {
+result_column = remove_nullable(result_type)->create_column();
+}
 
-DataTypePtr type = block.get_by_position(result).type;
-if (!type->is_nullable()) {
-result_column = type->create_column();
+// because now the string types does not support random position 
writing,
+// so insert into result data have two methods, one is for string 
types, one is for others type remaining
+bool is_string_result = result_column->is_column_string();
+if (is_string_result) {
+result_column->reserve(input_rows_count);
 } else {
-result_column = remove_nullable(type)->create_column();
+result_column->resize(input_rows_count);
 }
 
-result_column->reserve(input_rows_count);
 auto return_type = std::make_shared();
-auto null_map = ColumnUInt8::create(input_rows_count, 1);
-auto& null_map_data = null_map->get_data();
-ColumnPtr argument_columns[argument_size];
+auto null_map = ColumnUInt8::create(input_rows_count, 1);  //if 
n

[GitHub] [incubator-doris] zhengshengjun opened a new issue #7783: [Bug] Consider backend status when more than one backends exists in same host

2022-01-17 Thread GitBox


zhengshengjun opened a new issue #7783:
URL: https://github.com/apache/incubator-doris/issues/7783


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Version
   
   master
   
   ### What's Wrong?
   
   I have more than one BEs exist in same host. If one of them are dead, stream 
load process would fail sometimes. Because FE select one at random not alive 
one. This will cause 'No backend alive.' ERROR during stream load process.
   
   ### What You Expected?
   
   Choose alive BE when more than one BE exists in same host, so that stream 
load process will not fail when there are both alive and dead BE in the same 
host.
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] zhengshengjun opened a new pull request #7784: [Bug] Consider backend status when more than one backends exists in same host #7783

2022-01-17 Thread GitBox


zhengshengjun opened a new pull request #7784:
URL: https://github.com/apache/incubator-doris/pull/7784


   …ame host
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] HappenLee opened a new pull request #7785: [Vectorized] Support Vectorized Exec Engine In Doris

2022-01-17 Thread GitBox


HappenLee opened a new pull request #7785:
URL: https://github.com/apache/incubator-doris/pull/7785


   # Proposed changes
   
   Issue Number: close #6238
   
   Co-authored-by: HappenLee 
   Co-authored-by: stdpain <34912776+stdp...@users.noreply.github.com>
   Co-authored-by: Zhengguo Yang 
   Co-authored-by: wangbo <506340...@qq.com>
   Co-authored-by: emmymiao87 <522274...@qq.com>
   Co-authored-by: Pxl <952130...@qq.com>
   Co-authored-by: zhangstar333 
<87313068+zhangstar...@users.noreply.github.com>
   Co-authored-by: thinker 
   Co-authored-by: Zeno Yang <1521564...@qq.com>
   Co-authored-by: Wang Shuo 
   Co-authored-by: zhoubintao <35688959+zbtzbt...@users.noreply.github.com>
   Co-authored-by: Gabriel 
   Co-authored-by: xinghuayu007 <1450306...@qq.com>
   Co-authored-by: weizuo93 
   Co-authored-by: yiguolei 
   Co-authored-by: anneji-dev <85534151+anneji-...@users.noreply.github.com>
   Co-authored-by: awakeljw <993007...@qq.com>
   Co-authored-by: taberylyang 
<95272637+taberyly...@users.noreply.github.com>
   Co-authored-by: Cui Kaifeng <48012748+azuren...@users.noreply.github.com>
   
   
   ## Problem Summary:
   
   ### 1. Some code from clickhouse
   
   **ClickHouse is an excellent implementation of the vectorized execution 
engine database, so here we have borrowed a lot from its excellent 
implementation in terms of data structure and function implementation. We are 
based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community 
and developers.**
   
   we add all code about Clickhouse Title: 
   // This file is copied from
   // 
https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
   // and modified by Doris
   
   ### 2. Support exec node and query:
   * vaggregation_node
   * vanalytic_eval_node
   * vassert_num_rows_node
   * vblocking_join_node
   * vcross_join_node
   * vempty_set_node
   * ves_http_scan_node
   * vexcept_node
   * vexchange_node
   * vintersect_node
   * vmysql_scan_node
   * vodbc_scan_node
   * volap_scan_node
   * vrepeat_node
   * vschema_scan_node
   * vselect_node
   * vset_operation_node
   * vsort_node
   * vunion_node
   * vhash_join_node
   
   You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.
   
   ### 3. Data Model
   
   Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader 
Vectorized. Segment Vec is working in process.
   
   ### 4. How to use
   
   1. Set the environment variable `set enable_vectorized_engine = true; 
`(required)
   
   2. Set the environment variable `set batch_size = 4096; ` (recommended)
   
   
   ### 5. Some diff from origin exec engine
   
   https://github.com/doris-vectorized/doris-vectorized/issues/294
   
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (Yes)
   3. Has document been added or modified: (No)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (Yes)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman commented on issue #7580: [Roadmap] Support vectorized query engine

2022-01-17 Thread GitBox


morningman commented on issue #7580:
URL: 
https://github.com/apache/incubator-doris/issues/7580#issuecomment-1014511789


   Related #6238


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 opened a new pull request #7786: Add Amazon S3 support

2022-01-17 Thread GitBox


hf200012 opened a new pull request #7786:
URL: https://github.com/apache/incubator-doris/pull/7786


   Add Amazon S3  support
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 closed pull request #7786: [Doc]Add Amazon S3 support

2022-01-17 Thread GitBox


hf200012 closed pull request #7786:
URL: https://github.com/apache/incubator-doris/pull/7786


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 commented on pull request #7786: [Doc]Add Amazon S3 support

2022-01-17 Thread GitBox


hf200012 commented on pull request #7786:
URL: https://github.com/apache/incubator-doris/pull/7786#issuecomment-1014552582


   Resubmit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 opened a new pull request #7787: [Doc]Documentation corrections

2022-01-17 Thread GitBox


hf200012 opened a new pull request #7787:
URL: https://github.com/apache/incubator-doris/pull/7787


   Documentation corrections
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman opened a new pull request #7788: [bix](bitmap-index) Fix bug that bitmap index may return wrong result.

2022-01-17 Thread GitBox


morningman opened a new pull request #7788:
URL: https://github.com/apache/incubator-doris/pull/7788


   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Fix the following bugs.
   
   1. `column1` created a bitmap index.
   2. `column1` has a lot index items in the bitmap index, and the index page 
is divided into two levels.
   3. `column1`'s value range is `[1000, 1000]`.
   4. the query condition is `column1 > 0`
   5. the empty result will be returned, while the expected value should be 
000 rows.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (No)
   3. Has document been added or modified: (No Need)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7785: [feature][vectorized] Support Vectorized Exec Engine In Doris

2022-01-17 Thread GitBox


github-actions[bot] commented on pull request #7785:
URL: https://github.com/apache/incubator-doris/pull/7785#issuecomment-1014699158






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman closed issue #6238: [Proposal] Vectorization Execution Engine optimization for Doris

2022-01-17 Thread GitBox


morningman closed issue #6238:
URL: https://github.com/apache/incubator-doris/issues/6238


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman merged pull request #7785: [feature](vectorization) Support Vectorized Exec Engine In Doris

2022-01-17 Thread GitBox


morningman merged pull request #7785:
URL: https://github.com/apache/incubator-doris/pull/7785


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] zuochunwei closed pull request #7145: no static_cast

2022-01-17 Thread GitBox


zuochunwei closed pull request #7145:
URL: https://github.com/apache/incubator-doris/pull/7145


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] zuochunwei closed pull request #7695: [vectorized](optimization)(aggregate) improving aggregate count & sum performance

2022-01-17 Thread GitBox


zuochunwei closed pull request #7695:
URL: https://github.com/apache/incubator-doris/pull/7695


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yangzhg commented on a change in pull request #7788: [bix](bitmap-index) Fix bug that bitmap index may return wrong result.

2022-01-17 Thread GitBox


yangzhg commented on a change in pull request #7788:
URL: https://github.com/apache/incubator-doris/pull/7788#discussion_r786370400



##
File path: be/src/olap/rowset/segment_v2/ordinal_page_index.h
##
@@ -69,6 +69,9 @@ class OrdinalIndexReader {
 // load and parse the index page into memory
 Status load(bool use_page_cache, bool kept_in_memory);
 
+// the returned iter points to the largest element which is less than 
`ordinal`,
+// or points to the first element if all elements are greater than 
`ordinal`,
+// or points to "end" if all elementss are smaller than `ordinal`.

Review comment:
   ```suggestion
   // or points to "end" if all elements are smaller than `ordinal`.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch master updated: [improvement](broker) add some properties that can be set in the broker conf file (#7499)

2022-01-17 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 946fa29  [improvement](broker) add some properties that can be set in 
the broker conf file (#7499)
946fa29 is described below

commit 946fa2960d8ada5839b542b50d0192f37f2a5f65
Author: Henry2SS <45096548+henry...@users.noreply.github.com>
AuthorDate: Tue Jan 18 10:24:54 2022 +0800

[improvement](broker) add some properties that can be set in the broker 
conf file (#7499)
---
 .../conf/apache_hdfs_broker.conf   | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf 
b/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf
index 5780687..92a30ac 100644
--- a/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf
+++ b/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf
@@ -15,8 +15,26 @@
 # specific language governing permissions and limitations
 # under the License.
 
+#
+## To see all Broker configurations,
+## see 
fs_brokers/apache_hdfs_broker/src/main/java/org/apache/doris/broker/hdfs/BrokerConfig.java
+#
+
+# INFO, WARNING, ERROR, FATAL
+# sys_log_level = INFO
+
 # the thrift rpc port
-broker_ipc_port=8000
+broker_ipc_port = 8000
 
 # client session will be deleted if not receive ping after this time
-client_expire_seconds=300
+client_expire_seconds = 300
+
+# Advanced configurations
+# sys_log_dir = ${BROKER_HOME}/log
+# sys_log_roll_num = 30
+# sys_log_roll_mode = SIZE-MB-1024
+# sys_log_verbose_modules = org.apache.doris
+# audit_log_dir = ${BROKER_HOME}/log
+# audit_log_roll_num = 10
+# audit_log_roll_mode = TIME-DAY
+# audit_log_modules =

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman closed issue #7498: [Enhancement] [Broker] Add theproperties that can be set to config file of Broker

2022-01-17 Thread GitBox


morningman closed issue #7498:
URL: https://github.com/apache/incubator-doris/issues/7498


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman merged pull request #7499: [Broker] Add some properties that can be set in the broker conf file

2022-01-17 Thread GitBox


morningman merged pull request #7499:
URL: https://github.com/apache/incubator-doris/pull/7499


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch master updated: [improvement](colocation) Add a new config to delay the relocation of colocation group (#7656)

2022-01-17 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 3494c89  [improvement](colocation) Add a new config to delay the 
relocation of colocation group (#7656)
3494c89 is described below

commit 3494c8973b91d725cff1c4c20e87bcb1e6f4f300
Author: Mingyu Chen 
AuthorDate: Tue Jan 18 10:26:36 2022 +0800

[improvement](colocation) Add a new config to delay the relocation of 
colocation group (#7656)

1. Add a new FE config `colocate_group_relocate_delay_second`

The relocation of a colocation group may involve a large number of 
tablets moving within the cluster.
Therefore, we should use a more conservative strategy to avoid 
relocation of colocation groups as much as possible.
Relocation usually occurs after a BE node goes offline or goes down.
This config is used to delay the determination of BE node 
unavailability.
The default is 30 minutes, i.e., if a BE node recovers within 30 
minutes, relocation of the colocation group
will not be triggered.

2. Change the priority of colocate tablet repair and balance task from HIGH 
to NORMAL

3. Add a new FE config allow_replica_on_same_host

If set to true, when creating table, Doris will allow to locate 
replicas of a tablet
on same host. And also the tablet repair and balance will be disabled.
This is only for local test, so that we can deploy multi BE on same 
host and create table
with multi replicas.
---
 docs/en/administrator-guide/config/fe_config.md| 22 ++
 .../operation/tablet-repair-and-balance.md | 88 ++
 docs/zh-CN/administrator-guide/config/fe_config.md | 25 +-
 .../operation/tablet-repair-and-balance.md | 86 +
 .../main/java/org/apache/doris/catalog/Tablet.java | 47 ++--
 .../clone/ColocateTableCheckerAndBalancer.java | 28 +++
 .../java/org/apache/doris/clone/TabletChecker.java |  9 +--
 .../org/apache/doris/clone/TabletScheduler.java|  7 +-
 .../main/java/org/apache/doris/common/Config.java  | 21 ++
 .../main/java/org/apache/doris/system/Backend.java | 26 ---
 .../org/apache/doris/system/SystemInfoService.java | 32 
 .../java/org/apache/doris/catalog/BackendTest.java | 14 ++--
 .../clone/ColocateTableCheckerAndBalancerTest.java | 18 ++---
 .../doris/clone/TabletRepairAndBalanceTest.java|  1 +
 14 files changed, 338 insertions(+), 86 deletions(-)

diff --git a/docs/en/administrator-guide/config/fe_config.md 
b/docs/en/administrator-guide/config/fe_config.md
index bf6a8d2..69d611a 100644
--- a/docs/en/administrator-guide/config/fe_config.md
+++ b/docs/en/administrator-guide/config/fe_config.md
@@ -2099,3 +2099,25 @@ Default: true
 IsMutable:true
 MasterOnly: true
 If set to true, the replica with slower compaction will be automatically 
detected and migrated to other machines. The detection condition is that the 
version difference between the fastest and slowest replica exceeds 100, and the 
difference exceeds 30% of the fastest replica
+
+### colocate_group_relocate_delay_second
+
+Default: 1800
+
+Dynamically configured: true
+
+Only for Master FE: true
+
+The relocation of a colocation group may involve a large number of tablets 
moving within the cluster. Therefore, we should use a more conservative 
strategy to avoid relocation of colocation groups as much as possible.
+Reloaction usually occurs after a BE node goes offline or goes down. This 
parameter is used to delay the determination of BE node unavailability. The 
default is 30 minutes, i.e., if a BE node recovers within 30 minutes, 
relocation of the colocation group will not be triggered.
+
+### allow_replica_on_same_host
+
+Default: false
+
+Dynamically configured: false
+
+Only for Master FE: false
+
+Whether to allow multiple replicas of the same tablet to be distributed on the 
same host. This parameter is mainly used for local testing, to facilitate 
building multiple BEs to test certain multi-replica situations. Do not use it 
for non-test environments.
+
diff --git a/docs/en/administrator-guide/operation/tablet-repair-and-balance.md 
b/docs/en/administrator-guide/operation/tablet-repair-and-balance.md
index 1593cec..e924e62 100644
--- a/docs/en/administrator-guide/operation/tablet-repair-and-balance.md
+++ b/docs/en/administrator-guide/operation/tablet-repair-and-balance.md
@@ -684,3 +684,91 @@ The following parameters do not support modification for 
the time being, just fo
 * In some cases, the default replica repair and balancing strategy may cause 
the network to be full (mostly in the case of gigabit network cards and a large 
number of disks per BE). At this point, some parameters need to be adjusted to 
reduce the number of simultaneous balancing and re

[GitHub] [incubator-doris] morningman merged pull request #7656: [improvement](colocation) Add a new config to delay the relocation of colocation group

2022-01-17 Thread GitBox


morningman merged pull request #7656:
URL: https://github.com/apache/incubator-doris/pull/7656


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman commented on pull request #7098: Support remote storage, step1: use a struct instead of string for parameter path, add basic remote method

2022-01-17 Thread GitBox


morningman commented on pull request #7098:
URL: https://github.com/apache/incubator-doris/pull/7098#issuecomment-1015026898


   link to #7575 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman commented on pull request #7529: Support remote storage, step2, only for be: hot data trans to cold data. clean cold data when drop table

2022-01-17 Thread GitBox


morningman commented on pull request #7529:
URL: https://github.com/apache/incubator-doris/pull/7529#issuecomment-1015026994


   link to #7575 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] yiguolei commented on pull request #7529: Support remote storage, step2, only for be: hot data trans to cold data. clean cold data when drop table

2022-01-17 Thread GitBox


yiguolei commented on pull request #7529:
URL: https://github.com/apache/incubator-doris/pull/7529#issuecomment-1015028794


   I have two questions:
   1. Shoud  set the partition to freeze state to avoid insert data to cold 
partitions? 
   2. How to deal with schema change for the data in S3?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7787: [Doc]Documentation corrections

2022-01-17 Thread GitBox


github-actions[bot] commented on pull request #7787:
URL: https://github.com/apache/incubator-doris/pull/7787#issuecomment-1015033545






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] EmmyMiao87 commented on pull request #7787: [Doc]Documentation corrections

2022-01-17 Thread GitBox


EmmyMiao87 commented on pull request #7787:
URL: https://github.com/apache/incubator-doris/pull/7787#issuecomment-1015034341


   BTW, now our pr title is written in a fixed format and can be automatically 
labeled.
   For example, []()  (#pr). The label will be  
automatically


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 commented on issue #7502: Doris Roadmap 2022

2022-01-17 Thread GitBox


hf200012 commented on issue #7502:
URL: 
https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015047945


   #7680 Data export function supports exporting to db, kafka, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 closed issue #7676: [Feature] Doris supports multi-table Join materialized views

2022-01-17 Thread GitBox


hf200012 closed issue #7676:
URL: https://github.com/apache/incubator-doris/issues/7676


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] hf200012 commented on issue #7502: Doris Roadmap 2022

2022-01-17 Thread GitBox


hf200012 commented on issue #7502:
URL: 
https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015048834


   #7678 max_by, min_by aggregate function support 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] morningman merged pull request #7772: [fix](lateral-view) Fix some lateral view bugs

2022-01-17 Thread GitBox


morningman merged pull request #7772:
URL: https://github.com/apache/incubator-doris/pull/7772


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[incubator-doris] branch master updated (3494c89 -> efb4e18)

2022-01-17 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.


from 3494c89  [improvement](colocation) Add a new config to delay the 
relocation of colocation group (#7656)
 add efb4e18  [fix](lateral-view) Fix some lateral view bugs (#7772)

No new revisions were added by this update.

Summary of changes:
 be/src/exec/table_function_node.cpp| 23 ++
 be/src/exec/table_function_node.h  |  2 ++
 be/src/runtime/plan_fragment_executor.cpp  |  2 +-
 .../apache/doris/planner/TableFunctionNode.java| 17 
 4 files changed, 35 insertions(+), 9 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] huligong1234 commented on issue #7502: Doris Roadmap 2022

2022-01-17 Thread GitBox


huligong1234 commented on issue #7502:
URL: 
https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015068122


   looking forward to support decimal data type for create table as select 
statement.  (detailMessage = Unsupported type 'DECIMAL(9,0)' in create table as 
select statement)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [incubator-doris] huligong1234 edited a comment on issue #7502: Doris Roadmap 2022

2022-01-17 Thread GitBox


huligong1234 edited a comment on issue #7502:
URL: 
https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015068122


   support decimal data type for create table as select statement.  
(detailMessage = Unsupported type 'DECIMAL(9,0)' in create table as select 
statement)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



  1   2   >