[GitHub] [incubator-doris] HappenLee commented on issue #7774: [Enhancement][Vectorized] Speed up column filtering via SIMD
HappenLee commented on issue #7774: URL: https://github.com/apache/incubator-doris/issues/7774#issuecomment-1014247052 nice job! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] yangzhg opened a new pull request #7776: [improvment] (fe) add retry at be heartbeat, avoid show be is down when be in high load
yangzhg opened a new pull request #7776: URL: https://github.com/apache/incubator-doris/pull/7776
# Proposed changes
1. Add a retry to the BE heartbeat so that a BE under high load is not reported as down.
2. Remove some unused code.
## Problem Summary:
Describe the overview of changes.
## Checklist(Required)
1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)
## Further comments
If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
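For readers following along, the retry idea in item 1 can be sketched roughly as below. This is only an illustration under assumptions, not the code from the pull request: `HeartbeatRetrySketch`, `isAlive`, and the `maxAttempts` parameter are hypothetical names, and the real FE heartbeat path is not shown in this message.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: report a backend as down only after several
// consecutive heartbeat attempts have failed.
final class HeartbeatRetrySketch {
    /**
     * @param heartbeat   one heartbeat attempt; returns true on success (assumed shape)
     * @param maxAttempts total number of attempts before giving up (assumed parameter)
     * @return true if any attempt succeeded
     */
    static boolean isAlive(BooleanSupplier heartbeat, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (heartbeat.getAsBoolean()) {
                return true; // one success is enough to keep the backend marked alive
            }
        }
        return false; // every attempt failed
    }
}
```

With a single attempt, one slow response from a loaded backend is enough to mark it down; with a few attempts, only repeated failures are.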
[GitHub] [incubator-doris] morningman closed issue #7662: [Feature] Support general hints int select stmt
morningman closed issue #7662: URL: https://github.com/apache/incubator-doris/issues/7662
[GitHub] [incubator-doris] HappenLee merged pull request #7775: [Vectorized][Improvement] Speed up column filtering via SIMD
HappenLee merged pull request #7775: URL: https://github.com/apache/incubator-doris/pull/7775
[incubator-doris] branch vectorized updated (d2f2210 -> 57bdde6)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a change to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.
from d2f2210 [Vectorized][feature](planner)(executor) Support grouping sets rollup cube (#7601)
add 57bdde6 [Vectorized][Improvement] Speed up column filtering via SIMD (#7775)
No new revisions were added by this update.
Summary of changes:
 be/src/vec/columns/column_decimal.cpp | 25 +
 be/src/vec/columns/column_vector.cpp  | 28 ++--
 be/src/vec/columns/columns_common.cpp | 20 +---
 be/src/vec/columns/columns_common.h   | 30 ++
 4 files changed, 74 insertions(+), 29 deletions(-)
[GitHub] [incubator-doris] HappenLee closed issue #7774: [Enhancement][Vectorized] Speed up column filtering via SIMD
HappenLee closed issue #7774: URL: https://github.com/apache/incubator-doris/issues/7774
[GitHub] [incubator-doris] HappenLee opened a new issue #7777: [Vectorized][Bug] Bug of repeated node resize and compile of grouping set code
HappenLee opened a new issue #: URL: https://github.com/apache/incubator-doris/issues/
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
### Version
vectorized
### What's Wrong?
core dump
### What You Expected?
normal execute
### How to Reproduce?
_No response_
### Anything Else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [incubator-doris] HappenLee opened a new pull request #7778: [Vectorized][Bug] Fix bug of repeated node resize and compile failed
HappenLee opened a new pull request #7778: URL: https://github.com/apache/incubator-doris/pull/7778
# Proposed changes
Issue Number: close #
## Problem Summary:
Describe the overview of changes.
## Checklist(Required)
1. Does it affect the original behavior: (No)
2. Has unit tests been added: (No Need)
3. Has document been added or modified: (No Need)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
## Further comments
If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] HappenLee merged pull request #7751: [vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747
HappenLee merged pull request #7751: URL: https://github.com/apache/incubator-doris/pull/7751
[incubator-doris] branch vectorized updated (57bdde6 -> 778fa8d)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a change to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.
from 57bdde6 [Vectorized][Improvement] Speed up column filtering via SIMD (#7775)
add 778fa8d [Vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747 (#7751)
No new revisions were added by this update.
Summary of changes:
 be/src/vec/columns/column.h             |  4 ++
 be/src/vec/columns/column_complex.h     | 10 -
 be/src/vec/columns/column_const.h       |  4 ++
 be/src/vec/columns/column_decimal.h     | 10 +
 be/src/vec/columns/column_dummy.h       |  4 ++
 be/src/vec/columns/column_nullable.cpp  |  6 +++
 be/src/vec/columns/column_nullable.h    |  1 +
 be/src/vec/columns/column_string.cpp    |  6 +++
 be/src/vec/columns/column_string.h      |  2 +
 be/src/vec/columns/column_vector.cpp    | 10 +
 be/src/vec/columns/column_vector.h      |  2 +
 be/src/vec/columns/predicate_column.h   |  6 ++-
 be/src/vec/core/block.cpp               | 13 +-
 be/src/vec/core/block.h                 |  2 +
 be/src/vec/sink/vdata_stream_sender.cpp | 79 ++---
 be/src/vec/sink/vdata_stream_sender.h   | 36 ++-
 16 files changed, 164 insertions(+), 31 deletions(-)
[GitHub] [incubator-doris] zenoyang opened a new issue #7779: [Bug][Vectorized] Fix compile error and warning
zenoyang opened a new issue #7779: URL: https://github.com/apache/incubator-doris/issues/7779
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
### Version
vectorized branch
### What's Wrong?
current compile error and warning
### What You Expected?
no
### How to Reproduce?
fix
### Anything Else?
no
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [incubator-doris] zenoyang opened a new pull request #7780: [Vectorized](compile) Fix compile error and warning
zenoyang opened a new pull request #7780: URL: https://github.com/apache/incubator-doris/pull/7780
# Proposed changes
Issue Number: #7779
Fix compile error and warning
## Problem Summary:
Describe the overview of changes.
## Checklist(Required)
1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)
## Further comments
If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] morningman opened a new issue #7781: [Roadmap] Support automatic table structure transformation and pseudo data generation
morningman opened a new issue #7781: URL: https://github.com/apache/incubator-doris/issues/7781
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
### Description
_No response_
### Use case
Use Case:
1. Automate, or guide users through, converting table creation statements from other databases into Doris table creation statements.
2. Support generating pseudo data based on the table structure, for easy testing.
### Related issues
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [incubator-doris] morningman edited a comment on issue #7502: Doris Roadmap 2022
morningman edited a comment on issue #7502: URL: https://github.com/apache/incubator-doris/issues/7502#issuecomment-1001839293
The following is the Roadmap for the Doris community in 2022. The plan includes all aspects of code features, documentation, community building, etc. that are to be developed, have already been developed, and have been completed but require ongoing optimization.
> The plan is currently under discussion, so if you have comments or suggestions on any aspect of the plan or beyond, please feel free to leave a comment or send an email to d...@doris.apache.org.
> We will gradually create issues or jira for each direction of the plan to describe and track the progress in detail. Developers who wish to contribute are also welcome to create issues directly and associate with them (just leave a comment)
> The directions marked (**Good First Issue**) in the plan are more independent modules, which are more suitable for newbie tasks or developers who are new to Doris. If you are interested in the relevant direction, please contact us at d...@doris.apache.org or under this issue, and we will provide detailed guidance, help and discussion.
> The directions marked with (**Q1**) are the current work to be completed in the first quarter of 2022. We will update the schedule and progress of other directions gradually.
> The marked (**Done & Optimizing**) directions are the directions that are currently completed but need continuous optimization. Such as ease of use, feature additions, and documentation additions.
> We encourage developers to discuss anything in the dev mailing list, to subscribe to the mailing list please refer to [How to subscribe](http://doris.incubator.apache.org/master/en/community/subscribe-mail-list.html).
## Features
- [ ] #7571
- [ ] Extensible new query optimizer framework
- [ ] Statistical information collection and utilization
- [ ] Standard test set support and performance enhancements
    + TPC-DS feature pass rate 100%
    + TPC-H performance enhancements
- [ ] #7572
- [ ] Pipeline execution engine
- [ ] Algorithm Concurrency Control and Resource Control
- [ ] #7573
- [ ] #7570
- [ ] Map
- [ ] Struct
- [ ] #7574 Provides Schemaless semantics for fast analysis of semi-structured data.
- [ ] Json
- [ ] #7575 (Q1) Supports cold data storage to object storage at partition granularity with remote access capabilities and local Cache acceleration.
- [ ] #7503 Doris' current "materialized view" is more of a "materialized index" concept. Doris will later implement a true Materialized View to support full and incremental construction of single and multi-table views.
- [ ] #7576 Provide Kudu-like data update support.
- [ ] #7577
- [ ] WindowFunnel
- [ ] #7578 Support for the new UDF framework has solved the problems of high writing difficulty, poor isolation, and poor compatibility with existing C++ frameworks.
- [ ] UDF
- [ ] UDAF
- [ ] UDTF
- [ ] #7579 (**Good First Issue**)
- [ ] #7552
- [ ] #7650
- [ ] Add more resource limits
- [ ] #7129
## Performance Optimization
- [ ] #7580 (Q1)
- [ ] Query layer vectorization
- [ ] Storage level vectorization
- [ ] Vectorization function supplementation
- [ ] Query layer storage layer arithmetic unification
- [ ] Import Vectorization
- [ ] Json Parsing Optimization (**Good First Issue**)
- [ ] #7551
- [ ] #7743 Optimize the performance of compaction task. And try to refactor the compaction logic. For example, only one replica do the compaction and sync to other replicas.
## Stability and Observability
- [ ] #7553 (Q1) Solve the problems of inaccurate memory prediction and OOM, and improve memory observability by global + thread + task level memory management.
- [ ] #7581 Provides fine-grained IO speed limit, priority scheduling, etc. through global IO management.
- [ ] #7582 Introduces OpenTelemetry to enhance system internal state observability and unify monitoring data format.
## Testing
- [ ] #7583
- [ ] FE Refine the FE single test framework to support multi-node simulation testing of features.
- [ ] BE Provide testing framework to simplify the difficulty of writing complex unit tests (e.g. data builds) for BE.
- [ ] #7584 Provide Case collection or submission framework for refining and accumulating regression test sets.
- [ ] #7585 Provide a Benchmark te
[GitHub] [incubator-doris] yiguolei commented on pull request #7780: [Vectorized](compile) Fix compile error and warning
yiguolei commented on pull request #7780: URL: https://github.com/apache/incubator-doris/pull/7780#issuecomment-1014330999 LGTM
[GitHub] [incubator-doris] spaces-X opened a new issue #7782: [Proposal][Feature] enable Quantile pre-aggregation
spaces-X opened a new issue #7782: URL: https://github.com/apache/incubator-doris/issues/7782
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
### Description
In doris-0.15 and older versions, a quantile value is calculated from the detail data of the duplicate model, so query latency is poor at large data scale. I propose enabling `quantile pre-aggregation` to reduce query latency. ClickHouse already implements it as follows.
```
SELECT quantileState(number) AS st -- st is a quantileState generated by 0~9
FROM numbers(10)

Query id: cbde1c1b-e20a-430d-b34a-67c9833be6af

┌─st─┐
│ 6364136223846793005 0 123459 │
└┘

1 rows in set. Elapsed: 0.002 sec.
---
SELECT quantileMerge(0.8)(st) -- use quantileMerge function to calculate quantile
FROM ( SELECT quantileState(number) AS st FROM numbers(10) )

Query id: 1c25beb5-f6c5-4f32-a6ce-7bbd6d0429ef

┌─quantileMerge(0.8)(st)─┐
│7.2 │
└┘
```
Referring to the existing **HLL and bitmap** implementations, the **intermediate state** of the quantile function can be **stored** via TDigest serialization during the stream-load step. The changes are roughly as follows.
1. Add a new column type named `quantilestate` and the corresponding aggregate functions `quantile_union` and `quantile_cal`.
- quantile_union: add a value into quantilestate
- quantile_cal(float: percentage): calculate the quantile of percentage by quantilestate
- to_quantile(float: value): transfer value to quantilestate
2. Support `QuantileState` in the query and load steps.
3. Refactor `PercentileApproxState` and `TDigest`
### Use case
A CREATE TABLE statement would look like:
```
CREATE TABLE `QuantileState_Test` (
  `keys` bigint(20) NULL COMMENT "keys",
  `quantile_value` quantilestate quantile_union NOT NULL COMMENT "quantile value"
) ENGINE=OLAP
AGGREGATE KEY(`brand_id`, `dt`, `poi_type`)
COMMENT "bitmap load test#OWNER#lihuigang"
PARTITION BY RANGE(`dt`) ( xxx )
DISTRIBUTED BY HASH(`keys`) BUCKETS 3
PROPERTIES ( xxx );
```
A stream load command would look like:
```
curl --location-trusted -u root -H "columns: k1, k2, v1=to_quantilestate(v1)" -T testData http://host:port/api/testDb/testTbl/_stream_load
```
### Related issues
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
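The state/merge idea in the proposal above can be sketched roughly as follows. This is a toy under stated assumptions, not the proposed Doris implementation: it keeps every value exactly instead of using TDigest, and `QuantileStateSketch`, `toQuantileState`, `union`, and `quantile` are hypothetical names that only loosely mirror `to_quantile`, `quantile_union`, and `quantile_cal`. The quantile is taken by linear interpolation between neighboring sorted values, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a mergeable quantile state. A real implementation
// would store a compact TDigest; here the raw values are kept for clarity.
final class QuantileStateSketch {
    private final List<Double> values = new ArrayList<>();

    /** Wrap a single value into a state (loosely mirrors to_quantile). */
    static QuantileStateSketch toQuantileState(double value) {
        QuantileStateSketch state = new QuantileStateSketch();
        state.values.add(value);
        return state;
    }

    /** Merge two states into a new one (loosely mirrors quantile_union). */
    QuantileStateSketch union(QuantileStateSketch other) {
        QuantileStateSketch merged = new QuantileStateSketch();
        merged.values.addAll(this.values);
        merged.values.addAll(other.values);
        return merged;
    }

    /** Read a quantile from the state (loosely mirrors quantile_cal); p is in [0, 1]. */
    double quantile(double p) {
        if (values.isEmpty()) {
            throw new IllegalStateException("empty quantile state");
        }
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        double pos = p * (sorted.size() - 1);          // fractional rank
        int lo = (int) Math.floor(pos);
        int hi = Math.min(lo + 1, sorted.size() - 1);
        double frac = pos - lo;
        // Linear interpolation between the two neighboring values.
        return sorted.get(lo) + frac * (sorted.get(hi) - sorted.get(lo));
    }
}
```

The point of the design is that states built during load can be merged at query time without going back to the detail rows; swapping the value list for a TDigest trades exactness for a bounded state size.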
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785863571
## File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java ##
@@ -7257,4 +7259,46 @@ public static boolean isStoredTableNamesLowerCase() {
     public static boolean isTableNamesCaseInsensitive() {
         return GlobalVariable.lowerCaseTableNames == 2;
     }
+
+    public void compactTable(AdminCompactTableStmt stmt) throws DdlException {
+        String dbName = stmt.getDbName();
+        String tableName = stmt.getTblName();
+
+        String type = stmt.getCompactionType();
+        if (type == null || (!type.equals("base") && !type.equals("cumulative"))) {
Review comment:
> This check should be done in analysis phase
OK, thank you.
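To illustrate the reviewer's point, the same check could live in a validation step that runs before execution. The sketch below is an assumption-laden illustration: `CompactTypeAnalysisSketch` and `analyzeCompactionType` are hypothetical names, and `IllegalArgumentException` stands in for whatever exception the statement's analysis step actually throws.

```java
// Hypothetical sketch: validate the compaction type while the statement is
// analyzed, so the execution path can assume the value is already valid.
final class CompactTypeAnalysisSketch {
    static String analyzeCompactionType(String type) {
        // Same condition as in the quoted hunk, moved to the analysis step.
        if (type == null || (!type.equals("base") && !type.equals("cumulative"))) {
            throw new IllegalArgumentException("compaction type should be [BASE] or [CUMULATIVE]");
        }
        return type;
    }
}
```

Failing during analysis rejects a bad statement before any catalog lookups or locks are involved.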
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785864111
## File path: be/src/agent/task_worker_pool.cpp ##
@@ -1650,4 +1654,73 @@ void TaskWorkerPool::_random_sleep(int second) {
     sleep(rnd.Uniform(second) + 1);
 }
+void TaskWorkerPool::_submit_table_compaction_worker_thread_callback() {
+    while (_is_work) {
+        TAgentTaskRequest agent_task_req;
+        TCompactionReq compaction_req;
+
+        {
+            lock_guard worker_thread_lock(_worker_thread_lock);
+            while (_is_work && _tasks.empty()) {
+                _worker_thread_condition_variable.wait();
+            }
+            if (!_is_work) {
+                return;
+            }
+
+            agent_task_req = _tasks.front();
+            compaction_req = agent_task_req.compaction_req;
+            _tasks.pop_front();
+        }
+
+        LOG(INFO) << "get compaction task. signature:" << agent_task_req.signature
+                  << ", compaction type:" << compaction_req.type;
+
+        CompactionType compaction_type;
+        if (compaction_req.type == "base") {
+            compaction_type = CompactionType::BASE_COMPACTION;
+        } else {
+            compaction_type = CompactionType::CUMULATIVE_COMPACTION;
+        }
+
+        TabletSharedPtr tablet_ptr = StorageEngine::instance()->tablet_manager()->get_tablet(
+                compaction_req.tablet_id, compaction_req.schema_hash);
+        if (tablet_ptr != nullptr) {
+            auto data_dir = tablet_ptr->data_dir();
+            if (!tablet_ptr->can_do_compaction(data_dir->path_hash(), compaction_type)) {
+                LOG(WARNING) << "can not do compaction: " << tablet_ptr->tablet_id()
+                             << ", compaction type: " << compaction_type;
+                _remove_task_info(agent_task_req.task_type, agent_task_req.signature);
+                continue;
+            }
+
+            if (compaction_type == CompactionType::BASE_COMPACTION) {
Review comment:
> Is it necessary to check lock here?
OK.
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785864622
## File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java ##
@@ -7257,4 +7259,46 @@ public static boolean isStoredTableNamesLowerCase() {
     public static boolean isTableNamesCaseInsensitive() {
         return GlobalVariable.lowerCaseTableNames == 2;
     }
+
+    public void compactTable(AdminCompactTableStmt stmt) throws DdlException {
+        String dbName = stmt.getDbName();
+        String tableName = stmt.getTblName();
+
+        String type = stmt.getCompactionType();
+        if (type == null || (!type.equals("base") && !type.equals("cumulative"))) {
+            throw new DdlException("compaction type should be [BASE] or [CUMULATIVE]");
+        }
+
+        Database db = this.getDbOrDdlException(dbName);
+        OlapTable olapTable = db.getOlapTableOrDdlException(tableName);
+
+        olapTable.writeLock();
Review comment:
> readLock is enough
OK, thank you.
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785865075
## File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java ##
@@ -7257,4 +7259,46 @@ public static boolean isStoredTableNamesLowerCase() {
     public static boolean isTableNamesCaseInsensitive() {
         return GlobalVariable.lowerCaseTableNames == 2;
     }
+
+    public void compactTable(AdminCompactTableStmt stmt) throws DdlException {
+        String dbName = stmt.getDbName();
+        String tableName = stmt.getTblName();
+
+        String type = stmt.getCompactionType();
+        if (type == null || (!type.equals("base") && !type.equals("cumulative"))) {
+            throw new DdlException("compaction type should be [BASE] or [CUMULATIVE]");
+        }
+
+        Database db = this.getDbOrDdlException(dbName);
+        OlapTable olapTable = db.getOlapTableOrDdlException(tableName);
+
+        olapTable.writeLock();
+        try {
+            AgentBatchTask batchTask = new AgentBatchTask();
+            List partitionNames = stmt.getPartitions();
+            LOG.info("Table compaction. database: {}, table: {}, partition: {}, type: {}", dbName, tableName,
+                    Joiner.on(", ").join(partitionNames), type);
+            for (String parName : partitionNames) {
+                Partition partition = olapTable.getPartition(parName);
+                if (partition == null) {
+                    throw new DdlException("partition[" + parName + "] not exist in table[" + tableName + "]");
+                }
+
+                for (MaterializedIndex idx : partition.getMaterializedIndices(IndexExtState.VISIBLE)) {
+                    for (Tablet tablet : idx.getTablets()) {
+                        for (Replica replica : tablet.getReplicas()) {
+                            CompactionTask compactionTask = new CompactionTask(replica.getBackendId(), db.getId(),
+                                    olapTable.getId(), partition.getId(), idx.getId(), tablet.getId(),
+                                    olapTable.getSchemaHashByIndexId(idx.getId()), type, 5000);
+                            batchTask.addTask(compactionTask);
+                        }
+                    }
+                } // indices
+            }
+            // send task immediately
+            AgentTaskExecutor.submit(batchTask);
Review comment:
> submit task outside the lock
Done.
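The two lock-related review comments on this method (a read lock is enough; submit outside the lock) can be sketched together as below. This is an illustration under assumptions rather than the revised patch: `CompactSubmitSketch`, the `replicaIds` list, and the `submitter` callback are hypothetical stand-ins for the table metadata and `AgentTaskExecutor.submit`, and a plain `ReentrantReadWriteLock` stands in for the table's own lock methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical sketch: build the batch under a read lock, then hand it off
// only after the lock has been released.
final class CompactSubmitSketch {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Long> replicaIds; // stand-in for table/partition metadata

    CompactSubmitSketch(List<Long> replicaIds) {
        this.replicaIds = new ArrayList<>(replicaIds);
    }

    void compact(Consumer<List<String>> submitter) {
        List<String> batch = new ArrayList<>();
        lock.readLock().lock(); // metadata is only read here, so a read lock suffices
        try {
            for (long id : replicaIds) {
                batch.add("compaction-task-" + id);
            }
        } finally {
            lock.readLock().unlock();
        }
        // Submission may be slow, so it happens with no lock held.
        submitter.accept(batch);
    }
}
```

Holding only a read lock lets other readers proceed, and submitting after `unlock()` keeps a slow or blocking submit from stalling writers on the table.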
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785865358
## File path: gensrc/thrift/AgentService.thrift ##
@@ -160,6 +160,13 @@ struct TCloneReq {
     10: optional i32 timeout_s;
 }
+struct TCompactionReq {
+    1: required Types.TTabletId tablet_id
Review comment:
> use optional for all fields
Done.
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785865665
## File path: docs/en/sql-reference/sql-statements/Administration/ADMIN COMPACT.md ##
@@ -0,0 +1,52 @@
+---
Review comment:
> New doc need to be added to the `docs/.vuepress/sidebar/en.js` and `zh-CN.js`
Done.
[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #7521: [Feature] Support triggering compaction for a specific partition manually
weizuo93 commented on a change in pull request #7521: URL: https://github.com/apache/incubator-doris/pull/7521#discussion_r785866544
## File path: be/src/olap/olap_server.cpp ##
@@ -551,4 +537,51 @@ void StorageEngine::_pop_tablet_from_submitted_compaction(TabletSharedPtr tablet
     }
 }
+Status StorageEngine::_submit_compaction_task(TabletSharedPtr tablet, CompactionType compaction_type) {
+    bool already_exist = _push_tablet_into_submitted_compaction(tablet, compaction_type);
+    if (already_exist) {
+        return Status::InternalError(strings::Substitute(
Review comment:
> Is `Status::AlreadyExist` more appropriate?
OK, thank you.
[GitHub] [incubator-doris] caiconghui commented on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode
caiconghui commented on pull request #7773: URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014419489 > After this PR is merged, will Thrift data from different versions fail to parse during an upgrade or in a mixed-version cluster? The only thing I'm not sure about is whether List and binary are compatible, but according to https://stackoverflow.com/questions/40886279/apache-thrift-difference-between-byte-and-binary-types it seems to be OK.
[GitHub] [incubator-doris] caiconghui edited a comment on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode
caiconghui edited a comment on pull request #7773: URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014419489 > After this PR is merged, will Thrift data from different versions fail to parse during an upgrade or in a mixed-version cluster? The only thing I'm not sure about is whether `List` and binary are compatible, but according to https://stackoverflow.com/questions/40886279/apache-thrift-difference-between-byte-and-binary-types it seems to be OK.
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7780: [Vectorized](compile) Fix compile error and warning
github-actions[bot] commented on pull request #7780: URL: https://github.com/apache/incubator-doris/pull/7780#issuecomment-1014435956
[GitHub] [incubator-doris] yangzhg commented on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode
yangzhg commented on pull request #7773: URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014436369 Have you tested this in a cluster that has both new and old nodes?
[GitHub] [incubator-doris] caiconghui commented on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode
caiconghui commented on pull request #7773: URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014440058 > Have you tested this in a cluster that has both new and old nodes? Follower on the new version, master on the older version. I tested the forward function, but did not test list.
[GitHub] [incubator-doris] caiconghui edited a comment on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode
caiconghui edited a comment on pull request #7773: URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014440058 > Have you tested this in a cluster that has both new and old nodes? Follower on the new version, master on the older version. I tested the forward function, but did not test `list`.
[GitHub] [incubator-doris] caiconghui edited a comment on pull request #7773: [fix][chore](thrift) Fix warning when generate cpp code by thrift IDL file and use strict mode
caiconghui edited a comment on pull request #7773: URL: https://github.com/apache/incubator-doris/pull/7773#issuecomment-1014440058 > Have you tested this in a cluster that has both new and old nodes? Follower on the new version, master on the older version. I tested the forward function, but did not test `list`.
[GitHub] [incubator-doris] HappenLee merged pull request #7778: [Vectorized][Bug] Fix bug of repeated node resize and compile failed
HappenLee merged pull request #7778: URL: https://github.com/apache/incubator-doris/pull/7778
[GitHub] [incubator-doris] HappenLee closed issue #7777: [Vectorized][Bug] Bug of repeated node resize and compile of grouping set code
HappenLee closed issue #: URL: https://github.com/apache/incubator-doris/issues/
[incubator-doris] branch vectorized updated (778fa8d -> c86d691)
This is an automated email from the ASF dual-hosted git repository.
lihaopeng pushed a change to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.
from 778fa8d [Vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747 (#7751)
add c86d691 [Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778)
No new revisions were added by this update.
Summary of changes:
be/src/vec/exec/vrepeat_node.cpp | 13 ++---
be/src/vec/functions/simple_function_factory.h | 2 +-
build.sh | 2 +-
3 files changed, 8 insertions(+), 9 deletions(-)
[incubator-doris] branch vectorized updated (c86d691 -> 66b3b1d)
lihaopeng pushed a change to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.
from c86d691 [Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778)
add 66b3b1d [Vectorized](compile) Fix compile error and warning (#7780)
No new revisions were added by this update.
Summary of changes:
be/src/vec/columns/column.h| 1 +
be/src/vec/functions/function.h| 3 +++
be/src/vec/functions/function_grouping.cpp | 4 ++--
3 files changed, 6 insertions(+), 2 deletions(-)
[GitHub] [incubator-doris] HappenLee merged pull request #7780: [Vectorized](compile) Fix compile error and warning
HappenLee merged pull request #7780: URL: https://github.com/apache/incubator-doris/pull/7780
[GitHub] [incubator-doris] HappenLee closed pull request #7763: Vectorized
HappenLee closed pull request #7763: URL: https://github.com/apache/incubator-doris/pull/7763
[incubator-doris] branch vectorized updated (66b3b1d -> 8a1a612)
lihaopeng pushed a change to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.
omit 66b3b1d [Vectorized](compile) Fix compile error and warning (#7780)
omit c86d691 [Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778)
omit 778fa8d [Vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747 (#7751)
omit 57bdde6 [Vectorized][Improvement] Speed up column filtering via SIMD (#7775)
omit d2f2210 [Vectorized][feature](planner)(executor) Support grouping sets rollup cube (#7601)
omit 685f452 [Vectorized][Improvement] Enhancement unit test for vectorized function (#7750)
omit ee90c94 [Vectorization] Support SegmentIterator vectorization (#7613)
omit 24787ed [Vectorized][Function] Support function stddev/variance/stddev_samp/variance_samp (#7734)
omit fc05698 [Vectorized] Rebase code from master
omit e9056d6 [Vectorized][Bug] Bitmap/HLL type no support cast to varchar/char (#7737)
omit 2af5181 [Vectorized][Feature] upport function conv (#7693)
omit b79496b [Vectorized][Bug] Fix get wrong result when select random column && fix get wrong has_null_tag (#7728)
omit 28fb8c7 [Vectorized][Enhancement] use simd to speed up coalesce and if_not_null function (#7722)
omit 2c38a50 [Vectorized][Enhancement] fix some bug & improve some code (#7714)
omit 27d3898 [Vectorized][Bug] fix 'negative' function ut run fail && fix testIsBucketShuffleJoin run fail && fix some compile fail (#7688)
omit 3e45025 [Vectorized] (olap) Optimize BlockReader's performance (#7642)
omit 0dd1662 [Feature][Vectorized] Support String in vec exe engine (#7670)
omit a051b33 [Vectorized] [Function] Support do not fold constant at vectorized (#7668)
omit 952f0e3 [Vectorized] Support bloom filter predicate on vectorized engine storage layer (#7557)
omit 77e0212 [vectorized] [block] Add new method get_data_type to avoid unnecessary copy by the method get_data_type (#7600)
omit 01d9434 [Vectorized][Feature] support money_format/ucase/character_length (#7649)
omit 9432587 [Vectorizd] [Function] Add string type vec support at doris_builtins_functions[D (#7661)
omit ead467c [Bug] Fix function nulllable not match and largetint cast failed (#7659)
omit 3425e8a [Function][Vec] add function coalesce (#7632)
omit bdeb6b7 [Vectorized][Feature] fix core dump when using function override and function alias at the same time && support substr(str,int) override (#7640)
omit c4623f2 [Bug] Fix bug of cast expr nullable and ifnull function (#7626)
omit fb945cd [Refactor] Cow refactor: giveup using boost (#7567)
omit 326f0d7 [Vectorized][Function] Support function and (#7618)
omit 3339878 [Bug] Change parser string to int (#7595)
omit 2d31421 [Bug] Fix bug of concat function and fold const expr (#7608)
omit 204a35d [Function] Fix error about rank/dense_rank/row_number return always not nullable (#7561)
omit 54bd985 [Bug] Fix negative function error result and sort node eos (#7555)
omit 07abe49 [Vectorized Exec Engine] Support Vectorized Exec Engine In Doris
add b51121f [chore](github-action) Add label auto for pull requests (#7663)
add d1a994e [fix](cpu-resource)(resource-tag) Allow set cpu_resource_limit to -1 and fix resource tag bug(#6830)
add 3da4425 [fix](github-action) fix the action of set-label-based-on-pr-title (#7757)
add 10709f3 [fix](github-action) fix the action of set-label-based-on-pr-title (#7758)
add d03151b [chore](be) Add -Werror (#7744)
add 902ab93 [fix](session-variable) fix bug that checkpoint may overwrite the global variables (#7526)
add 6188ab2 [docs](faq) add multiple FE WEB UI login issues (#7654)
add f381782 [fix] fix malloc and free mismatch issue (#7702)
add fe80d14 [style] replace Chinese comments with English comments (#7732)
add 5c4055a [style] Translate Chinese to English in be_olap_field.h (#7738)
add e7d65e4 [style] translate code annotations into english (#7752)
add a6ff1bd Flink / Spark connector compilation problem (#7725)
add be43316 [docs] add doc for community feedback and fix CI (#7759)
add 4a3cbf5 [fix](show-load) fix show load with the same column name in Where Clause (#7523)
add 5b0f11b [feature](mysql-compatibility)(function) add `WEEKDAY` function (#7673)
add 8b7d7e4 [improvement] create/drop index support if [not] exist (#7748)
add 5f8d912 [improvement](routine-load) Reduce the probability that the routine load task rpc timeout (#7754)
add 36d6d23 [refactor] remove duplicate if that will never be used (#7761)
add 88a3d08 [fix] fix NPE in SysVariableDesc::equal (#7766)
add 5c7863c [improvement](fe-unit-test) Fix port in use when the cluster starts in UT. (#
[incubator-doris] 02/33: [Bug] Fix negative function error result and sort node eos (#7555)
lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git
commit a33b2ac2d1027e7c3c00c2a0d36276dd1b54df33
Author: HappenLee
AuthorDate: Fri Dec 31 00:37:54 2021 -0600
[Bug] Fix negative function error result and sort node eos (#7555)
Co-authored-by: lihaopeng
---
be/src/vec/exec/vsort_node.cpp | 1 +
be/src/vec/functions/math.cpp | 9 +
be/test/vec/function/function_math_test.cpp | 2 +-
3 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/be/src/vec/exec/vsort_node.cpp b/be/src/vec/exec/vsort_node.cpp
index 79af7c8..734af91 100644
--- a/be/src/vec/exec/vsort_node.cpp
+++ b/be/src/vec/exec/vsort_node.cpp
@@ -84,6 +84,7 @@ Status VSortNode::get_next(RuntimeState* state, Block* block, bool* eos) {
 _sorted_blocks[0].skip_num_rows(_offset);
 }
 block->swap(_sorted_blocks[0]);
+*eos = true;
 } else {
 RETURN_IF_ERROR(merge_sort_read(state, block, eos));
 }
diff --git a/be/src/vec/functions/math.cpp b/be/src/vec/functions/math.cpp
index af48277..57d6c48 100644
--- a/be/src/vec/functions/math.cpp
+++ b/be/src/vec/functions/math.cpp
@@ -258,14 +258,7 @@ struct NegativeImpl {
 using ResultType = A;
 static inline ResultType apply(A a) {
-if constexpr (IsDecimalNumber)
-return a > 0 ? A(-a) : a;
-else if constexpr (std::is_integral_v && std::is_signed_v)
-return a > 0 ? static_cast(~a) + 1 : a;
-else if constexpr (std::is_integral_v && std::is_unsigned_v)
-return static_cast(-a);
-else if constexpr (std::is_floating_point_v)
-return static_cast(-std::abs(a));
+return -a;
 }
 };
diff --git a/be/test/vec/function/function_math_test.cpp b/be/test/vec/function/function_math_test.cpp
index f56ab7d..0413abd 100644
--- a/be/test/vec/function/function_math_test.cpp
+++ b/be/test/vec/function/function_math_test.cpp
@@ -296,7 +296,7 @@ TEST(MathFunctionTest, negative_test) {
 {
 std::vector input_types = {vectorized::TypeIndex::Float64};
-DataSet data_set = {{{0.0123}, -0.0123}, {{90.45}, -90.45}, {{0.0}, 0.0}, {{-60.0}, -60.0}};
+DataSet data_set = {{{0.0123}, -0.0123}, {{90.45}, -90.45}, {{0.0}, 0.0}, {{-60.0}, 60.0}};
 vectorized::check_function(func_name, input_types, data_set);
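The behavioural change in the `math.cpp` hunk can be illustrated with a small standalone sketch. It assumes the removed floating-point branch computed `-std::abs(a)`, as the diff text shows; the function names `negative_before` and `negative_after` are hypothetical and exist only for this comparison.

```cpp
#include <cmath>

// Mirrors the removed floating-point branch: the result is never positive,
// so a negative input such as -60.0 comes back unchanged.
inline double negative_before(double a) {
    return -std::abs(a);
}

// Mirrors the replacement: plain arithmetic negation for any numeric type.
template <typename A>
inline A negative_after(A a) {
    return -a;
}
```

With these definitions, `negative_before(-60.0)` yields `-60.0`, whereas `negative_after(-60.0)` yields `60.0`, which matches the expectation the updated unit test encodes.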
[incubator-doris] 05/33: [Bug] Change parser string to int (#7595)
lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git
commit 45286430724bfecaf73407f15c427aae6bd417e3
Author: Pxl <952130...@qq.com>
AuthorDate: Wed Jan 5 16:13:38 2022 +0800
[Bug] Change parser string to int (#7595)
---
be/src/util/string_parser.hpp | 20 +
be/src/vec/io/io_helper.h | 52 ++-
2 files changed, 27 insertions(+), 45 deletions(-)
diff --git a/be/src/util/string_parser.hpp b/be/src/util/string_parser.hpp
index 0354343..cc1110c 100644
--- a/be/src/util/string_parser.hpp
+++ b/be/src/util/string_parser.hpp
@@ -573,6 +573,26 @@ T StringParser::numeric_limits(bool negative) {
 }
 template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 3;
+}
+
+template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 5;
+}
+
+template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 10;
+}
+
+template<>
+inline int StringParser::StringParseTraits::max_ascii_len() {
+return 20;
+}
+
+template<>
 inline int StringParser::StringParseTraits::max_ascii_len() {
 return 3;
 }
diff --git a/be/src/vec/io/io_helper.h b/be/src/vec/io/io_helper.h
index fc232d7..fb9371f 100644
--- a/be/src/vec/io/io_helper.h
+++ b/be/src/vec/io/io_helper.h
@@ -126,7 +126,7 @@ inline void write_string_binary(const StringRef& s, BufferWritable& buf) {
 }
 inline void write_string_binary(const char* s, BufferWritable& buf) {
-write_string_binary(StringRef{s}, buf);
+write_string_binary(StringRef {s}, buf);
 }
 template
@@ -288,53 +288,15 @@ bool read_float_text_fast_impl(T& x, ReadBuffer& in) {
 template
 bool read_int_text_impl(T& x, ReadBuffer& buf) {
-bool negative = false;
-std::make_unsigned_t res = 0;
-if (buf.eof()) {
-return false;
-}
+StringParser::ParseResult result;
+x = StringParser::string_to_int(buf.position(), buf.count(), &result);
-while (!buf.eof()) {
-switch (*buf.position()) {
-case '+':
-break;
-case '-':
-if (std::is_signed_v)
-negative = true;
-else {
-return false;
-}
-break;
-case '0':
-[[fallthrough]];
-case '1':
-[[fallthrough]];
-case '2':
-[[fallthrough]];
-case '3':
-[[fallthrough]];
-case '4':
-[[fallthrough]];
-case '5':
-[[fallthrough]];
-case '6':
-[[fallthrough]];
-case '7':
-[[fallthrough]];
-case '8':
-[[fallthrough]];
-case '9':
-res *= 10;
-res += *buf.position() - '0';
-break;
-default:
-x = negative ? -res : res;
-return true;
-}
-++buf.position();
+if (UNLIKELY(result != StringParser::PARSE_SUCCESS)) {
+return false;
 }
-x = negative ? -res : res;
+// only to match the is_all_read() check to prevent return null
+buf.position() = buf.end();
 return true;
 }
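The idea behind the `io_helper.h` change, replacing a hand-written digit loop that has no overflow check with a checked parser that reports success or failure, can be sketched with the standard library. This is only an approximation under stated assumptions: `parse_int_checked` is a hypothetical helper, and `std::from_chars` stands in for Doris's `StringParser::string_to_int`, whose exact rules (for example around whitespace or a leading `+`) may differ.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper, not Doris code: parse the whole of `text` into `out`.
// Returns false on empty input, non-numeric characters, trailing garbage,
// or a value that does not fit in T.
template <typename T>
bool parse_int_checked(std::string_view text, T& out) {
    const char* first = text.data();
    const char* last = first + text.size();
    T value{};
    // from_chars reports std::errc::result_out_of_range on overflow and
    // std::errc::invalid_argument when no digits could be consumed.
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) {
        return false;
    }
    out = value;
    return true;
}
```

The `ptr != last` check plays a role similar to the "all input consumed" condition mentioned in the commit's comment: a prefix that parses cleanly is still rejected if anything follows it.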
[incubator-doris] 08/33: [Bug] Fix bug of cast expr nullable and ifnull function (#7626)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 24e1d64a6b26be2f97477d39d87caed8ee0f5f8e Author: HappenLee AuthorDate: Wed Jan 5 23:56:22 2022 -0600 [Bug] Fix bug of cast expr nullable and ifnull function (#7626) Co-authored-by: lihaopeng --- be/src/runtime/descriptors.h | 1 - be/src/runtime/fold_constant_executor.cpp | 8 ++-- be/src/vec/exec/join/vhash_join_node.cpp | 2 +- be/src/vec/exec/vunion_node.cpp| 2 + be/src/vec/functions/function.cpp | 2 +- .../function_date_or_datetime_computation.h| 4 +- be/src/vec/functions/function_ifnull.h | 43 -- be/src/vec/sink/vtabet_sink.cpp| 1 + .../java/org/apache/doris/analysis/CastExpr.java | 7 .../apache/doris/analysis/FunctionCallExpr.java| 2 +- .../apache/doris/rewrite/FoldConstantsRule.java| 9 - 11 files changed, 51 insertions(+), 30 deletions(-) diff --git a/be/src/runtime/descriptors.h b/be/src/runtime/descriptors.h index 97f9712..ad43209 100644 --- a/be/src/runtime/descriptors.h +++ b/be/src/runtime/descriptors.h @@ -381,7 +381,6 @@ public: int get_row_size() const; int num_materialized_slots() const { -DCHECK(_num_materialized_slots != 0); return _num_materialized_slots; } diff --git a/be/src/runtime/fold_constant_executor.cpp b/be/src/runtime/fold_constant_executor.cpp index 9781c2f..f093c04 100644 --- a/be/src/runtime/fold_constant_executor.cpp +++ b/be/src/runtime/fold_constant_executor.cpp @@ -127,9 +127,11 @@ Status FoldConstantExecutor::fold_constant_vexpr( } vectorized::Block tmp_block; +tmp_block.insert({vectorized::ColumnUInt8::create(1), +std::make_shared(), ""}); int result_column = -1; // calc vexpr -ctx->execute(&tmp_block, &result_column); +RETURN_IF_ERROR(ctx->execute(&tmp_block, &result_column)); DCHECK(result_column != -1); PrimitiveType root_type = ctx->root()->type().type; // covert to thrift type @@ -139,7 +141,7 @@ Status 
FoldConstantExecutor::fold_constant_vexpr( PExprResult expr_result; string result; const auto& column_ptr = tmp_block.get_by_position(result_column).column; -if (column_ptr->is_nullable() && column_ptr->is_null_at(0)) { +if (column_ptr->is_null_at(0)) { expr_result.set_success(false); } else { expr_result.set_success(true); @@ -194,7 +196,7 @@ Status FoldConstantExecutor::_init(const TQueryGlobals& query_globals) { template Status FoldConstantExecutor::_prepare_and_open(Context* ctx) { -ctx->prepare(_runtime_state.get(), RowDescriptor(), _mem_tracker); +RETURN_IF_ERROR(ctx->prepare(_runtime_state.get(), RowDescriptor(), _mem_tracker)); return ctx->open(_runtime_state.get()); } diff --git a/be/src/vec/exec/join/vhash_join_node.cpp b/be/src/vec/exec/join/vhash_join_node.cpp index 7606783..4533cae 100644 --- a/be/src/vec/exec/join/vhash_join_node.cpp +++ b/be/src/vec/exec/join/vhash_join_node.cpp @@ -133,7 +133,7 @@ struct ProcessRuntimeFilterBuild { RETURN_IF_ERROR(runtime_filter_slots->init(state, hash_table_ctx.hash_table.get_size())); -if (!runtime_filter_slots->empty()) { +if (!runtime_filter_slots->empty() && !_join_node->_inserted_rows.empty()) { { SCOPED_TIMER(_join_node->_push_compute_timer); runtime_filter_slots->insert(_join_node->_inserted_rows); diff --git a/be/src/vec/exec/vunion_node.cpp b/be/src/vec/exec/vunion_node.cpp index 122eafa..1fa4da4 100644 --- a/be/src/vec/exec/vunion_node.cpp +++ b/be/src/vec/exec/vunion_node.cpp @@ -181,6 +181,8 @@ Status VUnionNode::get_next_const(RuntimeState* state, Block* block) { MutableBlock(Block(VectorizedUtils::create_columns_with_type_and_name(row_desc(; for (; _const_expr_list_idx < _const_expr_lists.size(); ++_const_expr_list_idx) { Block tmp_block; +tmp_block.insert({vectorized::ColumnUInt8::create(1), +std::make_shared(), ""}); int const_expr_lists_size = _const_expr_lists[_const_expr_list_idx].size(); std::vector result_list(const_expr_lists_size); for (size_t i = 0; i < const_expr_lists_size; ++i) { diff 
--git a/be/src/vec/functions/function.cpp b/be/src/vec/functions/function.cpp index d3a910e..9ed3edb 100644 --- a/be/src/vec/functions/function.cpp +++ b/be/src/vec/functions/function.cpp @@ -66,7 +66,7 @@ ColumnPtr wrap_in_nullable(const ColumnPtr& src, const Block& block, const Colum null_map_column->clone_resized(null_map_
[incubator-doris] 16/33: [Vectorized] [Function] Support do not fold constant at vectorized (#7668)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit acb63c749cff8b323cf9efca8cf47ddb171aecd0 Author: Pxl <952130...@qq.com> AuthorDate: Mon Jan 10 10:52:44 2022 +0800 [Vectorized] [Function] Support do not fold constant at vectorized (#7668) --- .../org/apache/doris/analysis/ArithmeticExpr.java | 30 ++ .../java/org/apache/doris/rewrite/FEFunctions.java | 64 -- 2 files changed, 88 insertions(+), 6 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java index 50012b7..79a1ffa 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/ArithmeticExpr.java @@ -267,6 +267,33 @@ public class ArithmeticExpr extends Expr { } } +private boolean castIfHaveSameType(Type t1, Type t2, Type target) throws AnalysisException { +if (t1 == target || t2 == target) { +castChild(target, 0); +castChild(target, 1); +return true; +} +return false; +} + +private void castUpperInteger(Type t1, Type t2) throws AnalysisException { +if (!t1.isIntegerType() || !t2.isIntegerType()) { +return; +} +if (castIfHaveSameType(t1, t2, Type.BIGINT)) { +return; +} +if (castIfHaveSameType(t1, t2, Type.INT)) { +return; +} +if (castIfHaveSameType(t1, t2, Type.SMALLINT)) { +return; +} +if (castIfHaveSameType(t1, t2, Type.TINYINT)) { +return; +} +} + @Override public void analyzeImpl(Analyzer analyzer) throws AnalysisException { if (VectorizedUtil.isVectorized()) { @@ -320,6 +347,9 @@ public class ArithmeticExpr extends Expr { if (t1.isDecimalV2() || t2.isDecimalV2()) { castBinaryOp(findCommonType(t1, t2)); } +if (isConstant()) { +castUpperInteger(t1, t2); +} case MOD: if (t1.isDecimalV2() || t2.isDecimalV2()) { castBinaryOp(findCommonType(t1, t2)); diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java b/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java index 26ca3f7..0bcbfb6 100755 --- a/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java +++ b/fe/fe-core/src/main/java/org/apache/doris/rewrite/FEFunctions.java @@ -350,12 +350,30 @@ public class FEFunctions { * Arithmetic function */ -@FEFunction(name = "add", argTypes = { "BIGINT", "BIGINT" }, returnType = "BIGINT") +@FEFunction(name = "add", argTypes = { "TINYINT", "TINYINT" }, returnType = "SMALLINT") +public static IntLiteral addTinyint(LiteralExpr first, LiteralExpr second) throws AnalysisException { +long result = Math.addExact(first.getLongValue(), second.getLongValue()); +return new IntLiteral(result, Type.SMALLINT); +} + +@FEFunction(name = "add", argTypes = { "SMALLINT", "SMALLINT" }, returnType = "INT") +public static IntLiteral addSmallint(LiteralExpr first, LiteralExpr second) throws AnalysisException { +long result = Math.addExact(first.getLongValue(), second.getLongValue()); +return new IntLiteral(result, Type.INT); +} + +@FEFunction(name = "add", argTypes = { "INT", "INT" }, returnType = "BIGINT") public static IntLiteral addInt(LiteralExpr first, LiteralExpr second) throws AnalysisException { long result = Math.addExact(first.getLongValue(), second.getLongValue()); return new IntLiteral(result, Type.BIGINT); } +@FEFunction(name = "add", argTypes = { "BIGINT", "BIGINT" }, returnType = "BIGINT") +public static IntLiteral addBigint(LiteralExpr first, LiteralExpr second) throws AnalysisException { +long result = Math.addExact(first.getLongValue(), second.getLongValue()); +return new IntLiteral(result, Type.BIGINT); +} + @FEFunction(name = "add", argTypes = { "DOUBLE", "DOUBLE" }, returnType = "DOUBLE") public static FloatLiteral addDouble(LiteralExpr first, LiteralExpr second) throws AnalysisException { double result = first.getDoubleValue() + second.getDoubleValue(); @@ -379,12 +397,30 @@ public 
class FEFunctions { return new LargeIntLiteral(result.toString()); } -@FEFunction(name = "subtract", argTypes = { "BIGINT", "BIGINT" }, returnType = "BIGINT") +@FEFunction(name = "subtract", argTypes = { "TINYINT", "TINYINT" }, returnType = "SMALLINT") +public static IntLiteral subtractTinyint(LiteralExpr first, LiteralExpr second) th
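The `FEFunctions.java` hunk registers `add` overloads whose return type is one step wider than the argument type (TINYINT to SMALLINT, SMALLINT to INT, INT to BIGINT), while BIGINT stays BIGINT and relies on `Math.addExact` to detect overflow. A rough C++ rendering of that idea follows; the function names are hypothetical and this is not a translation of the FE code.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical sketch: adding two 8-bit values into a 16-bit result,
// so the sum can never overflow the result type.
constexpr std::int16_t add_tinyint(std::int8_t a, std::int8_t b) {
    return static_cast<std::int16_t>(static_cast<std::int16_t>(a) +
                                     static_cast<std::int16_t>(b));
}

// For the widest type there is nothing to widen into, so overflow is
// reported explicitly (std::nullopt) instead of silently wrapping.
inline std::optional<std::int64_t> add_bigint(std::int64_t a, std::int64_t b) {
    constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    constexpr std::int64_t kMin = std::numeric_limits<std::int64_t>::min();
    if ((b > 0 && a > kMax - b) || (b < 0 && a < kMin - b)) {
        return std::nullopt;
    }
    return a + b;
}
```

Widening the result keeps constant folding of small integer literals from producing a value that no longer fits its declared type; only the 64-bit case needs a runtime overflow check.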
[incubator-doris] 12/33: [Vectorizd] [Function] Add string type vec support at doris_builtins_functions[D (#7661)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 345c119510602110275a698e64f76f7e2f058065 Author: Pxl <952130...@qq.com> AuthorDate: Fri Jan 7 14:57:20 2022 +0800 [Vectorizd] [Function] Add string type vec support at doris_builtins_functions[D (#7661) --- gensrc/script/doris_builtins_functions.py | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/gensrc/script/doris_builtins_functions.py b/gensrc/script/doris_builtins_functions.py index 1399c97..5d96751 100755 --- a/gensrc/script/doris_builtins_functions.py +++ b/gensrc/script/doris_builtins_functions.py @@ -1045,26 +1045,26 @@ visible_functions = [ '_ZN5doris15StringFunctions17parse_url_prepareEPN9doris_udf' '15FunctionContextENS2_18FunctionStateScopeE', '_ZN5doris15StringFunctions15parse_url_closeEPN9doris_udf' -'15FunctionContextENS2_18FunctionStateScopeE', '', ''], +'15FunctionContextENS2_18FunctionStateScopeE', 'vec', ''], [['parse_url'], 'STRING', ['STRING', 'STRING', 'STRING'], '_ZN5doris15StringFunctions13parse_url_keyEPN9doris_udf' '15FunctionContextERKNS1_9StringValES6_S6_', '_ZN5doris15StringFunctions17parse_url_prepareEPN9doris_udf' '15FunctionContextENS2_18FunctionStateScopeE', '_ZN5doris15StringFunctions15parse_url_closeEPN9doris_udf' -'15FunctionContextENS2_18FunctionStateScopeE', '', ''], +'15FunctionContextENS2_18FunctionStateScopeE', 'vec', ''], [['money_format'], 'STRING', ['BIGINT'], '_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_9BigIntValE', -'', '', '', ''], +'', '', 'vec', ''], [['money_format'], 'STRING', ['LARGEINT'], '_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_11LargeIntValE', -'', '', '', ''], +'', '', 'vec', ''], [['money_format'], 'STRING', ['DOUBLE'], '_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_9DoubleValE', -'', '', '', ''], +'', 
'', 'vec', ''], [['money_format'], 'STRING', ['DECIMALV2'], '_ZN5doris15StringFunctions12money_formatEPN9doris_udf15FunctionContextERKNS1_12DecimalV2ValE', -'', '', '', ''], +'', '', 'vec', ''], [['split_part'], 'STRING', ['STRING', 'STRING', 'INT'], '_ZN5doris15StringFunctions10split_partEPN9doris_udf15FunctionContextERKNS1_9StringValES6_RKNS1_6IntValE', '', '', 'vec', 'ALWAYS_NULLABLE'], @@ -1276,7 +1276,7 @@ visible_functions = [ '15FunctionContextERKNS1_9StringValES6_', '', '', 'vec', 'ALWAYS_NULLABLE'], [['aes_decrypt'], 'STRING', ['STRING', 'STRING'], '_ZN5doris19EncryptionFunctions11aes_decryptEPN9doris_udf' -'15FunctionContextERKNS1_9StringValES6_', '', '', '', ''], +'15FunctionContextERKNS1_9StringValES6_', '', '', 'vec', ''], [['aes_encrypt'], 'STRING', ['STRING', 'STRING', 'STRING', 'STRING'], '_ZN5doris19EncryptionFunctions11aes_encryptEPN9doris_udf' '15FunctionContextERKNS1_9StringValES6_S6_S6_', '', '', '', ''], @@ -1300,7 +1300,7 @@ visible_functions = [ '15FunctionContextERKNS1_9StringValE', '', '', 'vec', 'ALWAYS_NULLABLE'], [['to_base64'], 'STRING', ['STRING'], '_ZN5doris19EncryptionFunctions9to_base64EPN9doris_udf' -'15FunctionContextERKNS1_9StringValE', '', '', '', 'ALWAYS_NULLABLE'], +'15FunctionContextERKNS1_9StringValE', '', '', 'vec', 'ALWAYS_NULLABLE'], [['to_base64'], 'VARCHAR', ['VARCHAR'], '_ZN5doris19EncryptionFunctions9to_base64EPN9doris_udf' '15FunctionContextERKNS1_9StringValE', '', '', 'vec', 'ALWAYS_NULLABLE'], - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[incubator-doris] 23/33: [Vectorized][Feature] support function conv (#7693)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit ffdc9fc9be28ae94671916f06e4fd5f219707e93 Author: Pxl <952130...@qq.com> AuthorDate: Wed Jan 12 17:08:09 2022 +0800 [Vectorized][Feature] upport function conv (#7693) * support function conv() * add document --- be/src/exprs/math_functions.h | 14 +- be/src/vec/CMakeLists.txt | 1 + be/src/vec/data_types/data_type_bitmap.h | 2 + be/src/vec/data_types/data_type_date.h | 2 +- be/src/vec/data_types/data_type_date_time.h| 40 ++--- be/src/vec/data_types/data_type_decimal.h | 2 +- be/src/vec/data_types/data_type_number_base.h | 5 +- be/src/vec/data_types/data_type_string.h | 5 +- be/src/vec/functions/function_conv.cpp | 163 + be/src/vec/functions/simple_function_factory.h | 2 + docs/.vuepress/sidebar/en.js | 5 + docs/.vuepress/sidebar/zh-CN.js| 5 + .../sql-functions/math-functions/conv.md | 60 .../sql-functions/math-functions/conv.md | 60 gensrc/script/doris_builtins_functions.py | 6 +- 15 files changed, 339 insertions(+), 33 deletions(-) diff --git a/be/src/exprs/math_functions.h b/be/src/exprs/math_functions.h index 15d8749..9d55ed6 100644 --- a/be/src/exprs/math_functions.h +++ b/be/src/exprs/math_functions.h @@ -50,7 +50,8 @@ public: static doris_udf::IntVal abs(doris_udf::FunctionContext*, const doris_udf::SmallIntVal&); static doris_udf::SmallIntVal abs(doris_udf::FunctionContext*, const doris_udf::TinyIntVal&); -static doris_udf::TinyIntVal sign(doris_udf::FunctionContext* ctx, const doris_udf::DoubleVal& v); +static doris_udf::TinyIntVal sign(doris_udf::FunctionContext* ctx, + const doris_udf::DoubleVal& v); static doris_udf::DoubleVal sin(doris_udf::FunctionContext*, const doris_udf::DoubleVal&); static doris_udf::DoubleVal asin(doris_udf::FunctionContext*, const doris_udf::DoubleVal&); @@ -182,11 +183,6 @@ public: static double my_double_round(double value, int64_t dec, bool 
dec_unsigned, bool truncate); -private: -static const int32_t MIN_BASE = 2; -static const int32_t MAX_BASE = 36; -static const char* _s_alphanumeric_chars; - // Converts src_num in decimal to dest_base, // and fills expr_val.string_val with the result. static doris_udf::StringVal decimal_to_base(doris_udf::FunctionContext* ctx, int64_t src_num, @@ -207,6 +203,12 @@ private: // Returns false otherwise, indicating some other error condition. static bool handle_parse_result(int8_t dest_base, int64_t* num, StringParser::ParseResult parse_res); + +static const int32_t MIN_BASE = 2; +static const int32_t MAX_BASE = 36; + +private: +static const char* _s_alphanumeric_chars; }; } // namespace doris diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt index 01c69eb..aa302ce 100644 --- a/be/src/vec/CMakeLists.txt +++ b/be/src/vec/CMakeLists.txt @@ -108,6 +108,7 @@ set(VEC_FILES functions/functions_logical.cpp functions/function_case.cpp functions/function_cast.cpp + functions/function_conv.cpp functions/function_string.cpp functions/function_timestamp.cpp functions/function_utility.cpp diff --git a/be/src/vec/data_types/data_type_bitmap.h b/be/src/vec/data_types/data_type_bitmap.h index 692d6fc..69f5540 100644 --- a/be/src/vec/data_types/data_type_bitmap.h +++ b/be/src/vec/data_types/data_type_bitmap.h @@ -18,6 +18,7 @@ #pragma once #include "util/bitmap_value.h" #include "vec/columns/column.h" +#include "vec/columns/column_complex.h" #include "vec/core/types.h" #include "vec/data_types/data_type.h" @@ -27,6 +28,7 @@ public: DataTypeBitMap() = default; ~DataTypeBitMap() override = default; +using ColumnType = ColumnBitmap; using FieldType = BitmapValue; std::string do_get_name() const override { return get_family_name(); } diff --git a/be/src/vec/data_types/data_type_date.h b/be/src/vec/data_types/data_type_date.h index b3aa90c..b5d148b 100644 --- a/be/src/vec/data_types/data_type_date.h +++ b/be/src/vec/data_types/data_type_date.h @@ -34,7 +34,7 @@ public: 
bool equals(const IDataType& rhs) const override; std::string to_string(const IColumn& column, size_t row_num) const; -void to_string(const IColumn &column, size_t row_num, BufferWritable &ostr) const override; +void to_string(const IColumn& column, size_t row_num, BufferWritable& ostr) const override; static void cast_to_date(Int64& x); }; diff --git a/be/src/vec/data_types/data_type_date_time.h b/be/src/vec/data_types/data_type_date_time.h inde
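The header diff above makes `MIN_BASE = 2` and `MAX_BASE = 36` public and keeps the comment "Converts src_num in decimal to dest_base". As a rough illustration of what such a conversion involves, here is a minimal sketch. It is not the Doris implementation: `to_base`, `kMinBase`, `kMaxBase` and the digit table are hypothetical names, only non-negative inputs are handled, and the parsing half of `conv()` is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Assumed bounds, mirroring MIN_BASE / MAX_BASE from the diff.
constexpr int kMinBase = 2;
constexpr int kMaxBase = 36;

// Hypothetical helper: renders a non-negative integer in the given base.
inline std::string to_base(uint64_t value, int base) {
    assert(base >= kMinBase && base <= kMaxBase);
    static const char* digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    if (value == 0) return "0";
    std::string out;
    while (value > 0) {
        // Peel off the least significant digit first.
        out.push_back(digits[value % base]);
        value /= base;
    }
    // Digits were produced in reverse order.
    std::reverse(out.begin(), out.end());
    return out;
}
```

Base 36 is the natural upper bound here because ten digits plus twenty-six letters give exactly 36 symbols.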
[incubator-doris] 18/33: [Vectorized] (olap) Optimize BlockReader's performance (#7642)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 8b8210433eeeb0fbfc20b39520188d2e23892767 Author: thinker AuthorDate: Mon Jan 10 20:28:21 2022 +0800 [Vectorized] (olap) Optimize BlockReader's performance (#7642) Co-authored-by: zuochunwei --- be/src/vec/olap/block_reader.cpp | 51 +--- be/src/vec/olap/block_reader.h | 9 +++ 2 files changed, 24 insertions(+), 36 deletions(-) diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp index 8e2d4b2..ef3ba3a 100644 --- a/be/src/vec/olap/block_reader.cpp +++ b/be/src/vec/olap/block_reader.cpp @@ -25,15 +25,8 @@ #include "runtime/mem_tracker.h" #include "vec/olap/vcollect_iterator.h" -using std::nothrow; -using std::set; -using std::vector; - namespace doris::vectorized { -BlockReader::BlockReader() -: _collect_iter(new VCollectIterator()), _next_row {nullptr, -1, false} {} - BlockReader::~BlockReader() { for (int i = 0; i < _agg_functions.size(); ++i) { AggregateFunctionPtr function = _agg_functions[i]; @@ -45,7 +38,7 @@ BlockReader::~BlockReader() { OLAPStatus BlockReader::_init_collect_iter(const ReaderParams& read_params, std::vector* valid_rs_readers) { -_collect_iter->init(this); +_vcollect_iter.init(this); std::vector rs_readers; auto res = _capture_rs_readers(read_params, &rs_readers); if (res != OLAP_SUCCESS) { @@ -59,7 +52,7 @@ OLAPStatus BlockReader::_init_collect_iter(const ReaderParams& read_params, for (auto& rs_reader : rs_readers) { RETURN_NOT_OK(rs_reader->init(&_reader_context)); -OLAPStatus res = _collect_iter->add_child(rs_reader); +OLAPStatus res = _vcollect_iter.add_child(rs_reader); if (res != OLAP_SUCCESS && res != OLAP_ERR_DATA_EOF) { LOG(WARNING) << "failed to add child to iterator, err=" << res; return res; @@ -69,9 +62,9 @@ OLAPStatus BlockReader::_init_collect_iter(const ReaderParams& read_params, } } 
-_collect_iter->build_heap(*valid_rs_readers); -if (_collect_iter->is_merge()) { -auto status = _collect_iter->current_row(&_next_row); +_vcollect_iter.build_heap(*valid_rs_readers); +if (_vcollect_iter.is_merge()) { +auto status = _vcollect_iter.current_row(&_next_row); _eof = status == OLAP_ERR_DATA_EOF; } @@ -85,8 +78,9 @@ void BlockReader::_init_agg_state() { _stored_has_null_tag.resize(_stored_data_columns.size()); _stored_has_string_tag.resize(_stored_data_columns.size()); +auto& tablet_schema = tablet()->tablet_schema(); for (auto idx : _agg_columns_idx) { -FieldAggregationMethod agg_method = tablet()->tablet_schema().column(idx).aggregation(); +FieldAggregationMethod agg_method = tablet_schema.column(idx).aggregation(); std::string agg_name = TabletColumn::get_string_by_aggregation_type(agg_method) + agg_reader_suffix; std::transform(agg_name.begin(), agg_name.end(), agg_name.begin(), @@ -159,6 +153,7 @@ OLAPStatus BlockReader::init(const ReaderParams& read_params) { break; case KeysType::AGG_KEYS: _next_block_func = &BlockReader::_agg_key_next_block; +_init_agg_state(); break; default: DCHECK(false) << "No next row function for type:" << tablet()->keys_type(); @@ -170,7 +165,7 @@ OLAPStatus BlockReader::init(const ReaderParams& read_params) { OLAPStatus BlockReader::_direct_next_block(Block* block, MemPool* mem_pool, ObjectPool* agg_pool, bool* eof) { -auto res = _collect_iter->next(block); +auto res = _vcollect_iter.next(block); if (UNLIKELY(res != OLAP_SUCCESS && res != OLAP_ERR_DATA_EOF)) { return res; } @@ -190,11 +185,6 @@ OLAPStatus BlockReader::_agg_key_next_block(Block* block, MemPool* mem_pool, Obj return OLAP_SUCCESS; } -if (!_agg_inited) { -_init_agg_state(); -_agg_inited = true; -} - auto target_block_row = 0; auto target_columns = block->mutate_columns(); @@ -203,7 +193,7 @@ OLAPStatus BlockReader::_agg_key_next_block(Block* block, MemPool* mem_pool, Obj _append_agg_data(target_columns); while (true) { -auto res = 
_collect_iter->next(&_next_row); +auto res = _vcollect_iter.next(&_next_row); if (UNLIKELY(res == OLAP_ERR_DATA_EOF)) { *eof = true; break; @@ -251,7 +241,7 @@ OLAPStatus BlockReader::_unique_key_next_block(Block* block, MemPool* mem_pool, // the version is in reverse order, the first row is the highest version, // in UNIQUE_KEY highest version is the final result, there is no need to // merge the lower
[incubator-doris] 22/33: [Vectorized][Bug] Fix get wrong result when select random column && fix get wrong has_null_tag (#7728)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit a9d9c02a2fefc16c0eccc572626fac8b58a64d70 Author: Pxl <952130...@qq.com> AuthorDate: Wed Jan 12 15:55:58 2022 +0800 [Vectorized][Bug] Fix get wrong result when select random column && fix get wrong has_null_tag (#7728) --- be/src/vec/columns/column.h | 3 +++ be/src/vec/columns/column_nullable.h | 5 +++-- be/src/vec/core/block.cpp| 29 + be/src/vec/olap/block_reader.cpp | 23 +++ be/src/vec/olap/block_reader.h | 4 ++-- 5 files changed, 40 insertions(+), 24 deletions(-) diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h index 88f9a3a..5e41028 100644 --- a/be/src/vec/columns/column.h +++ b/be/src/vec/columns/column.h @@ -336,6 +336,9 @@ public: // true iff column has null element virtual bool has_null() const { return false; } +// true iff column has null element [0,size) +virtual bool has_null(size_t size) const { return false; } + /// It's a special kind of column, that contain single value, but is not a ColumnConst. virtual bool is_dummy() const { return false; } diff --git a/be/src/vec/columns/column_nullable.h b/be/src/vec/columns/column_nullable.h index 9641788..8f4 100644 --- a/be/src/vec/columns/column_nullable.h +++ b/be/src/vec/columns/column_nullable.h @@ -176,8 +176,9 @@ public: /// Check that size of null map equals to size of nested column. 
void check_consistency() const; -bool has_null() const override { -size_t size = get_null_map_data().size(); +bool has_null() const override { return has_null(get_null_map_data().size()); } + +bool has_null(size_t size) const override { const UInt8* null_pos = get_null_map_data().data(); const UInt8* null_pos_end = get_null_map_data().data() + size; #ifdef __SSE2__ diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp index 2aff751..d200a46 100644 --- a/be/src/vec/core/block.cpp +++ b/be/src/vec/core/block.cpp @@ -21,18 +21,18 @@ #include "vec/core/block.h" #include +#include + #include #include #include -#include #include "common/status.h" #include "gen_cpp/data.pb.h" #include "runtime/descriptors.h" +#include "runtime/row_batch.h" #include "runtime/tuple.h" #include "runtime/tuple_row.h" -#include "runtime/row_batch.h" - #include "vec/columns/column_const.h" #include "vec/columns/column_nullable.h" #include "vec/columns/column_vector.h" @@ -692,8 +692,10 @@ Status Block::filter_block(Block* block, int filter_column_id, int column_to_kee if (auto* nullable_column = check_and_get_column(*filter_column)) { ColumnPtr nested_column = nullable_column->get_nested_column_ptr(); -MutableColumnPtr mutable_holder = nested_column->use_count() == 1 ? -nested_column->assume_mutable() : nested_column->clone_resized(nested_column->size()); +MutableColumnPtr mutable_holder = +nested_column->use_count() == 1 +? 
nested_column->assume_mutable() +: nested_column->clone_resized(nested_column->size()); ColumnUInt8* concrete_column = typeid_cast(mutable_holder.get()); if (!concrete_column) { @@ -769,8 +771,8 @@ void Block::serialize(RowBatch* output_batch, const RowDescriptor& row_desc) { } } -doris::Tuple* Block::deep_copy_tuple(const doris::TupleDescriptor& desc, MemPool* pool, -int row, int column_offset, bool padding_char) { +doris::Tuple* Block::deep_copy_tuple(const doris::TupleDescriptor& desc, MemPool* pool, int row, + int column_offset, bool padding_char) { auto dst = reinterpret_cast(pool->allocate(desc.byte_size())); for (int i = 0; i < desc.slots().size(); ++i) { @@ -787,8 +789,9 @@ doris::Tuple* Block::deep_copy_tuple(const doris::TupleDescriptor& desc, MemPool if (!slot_desc->type().is_string_type() && !slot_desc->type().is_date_type()) { memcpy((void*)dst->get_slot(slot_desc->tuple_offset()), data_ref.data, data_ref.size); -} else if (slot_desc->type().is_string_type() && slot_desc->type() != TYPE_OBJECT){ -memcpy((void*)dst->get_slot(slot_desc->tuple_offset()), (const void*)(&data_ref), sizeof(data_ref)); +} else if (slot_desc->type().is_string_type() && slot_desc->type() != TYPE_OBJECT) { +memcpy((void*)dst->get_slot(slot_desc->tuple_offset()), (const void*)(&data_ref), + sizeof(data_ref)); // Copy the content of string if (padding_char && slot_desc->type() == TYPE_CHAR) { // serialize the content of string @@ -800,7 +803,8 @@ doris::Tuple* Block::deep_copy_tuple(const doris::TupleDescriptor& desc, MemPool
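The column diff above adds a `has_null(size_t size)` overload that inspects only the first `size` entries of the null map. The sketch below shows why a bounded scan can answer differently from a whole-map scan. `has_null_in_prefix` is a hypothetical, scalar free function; the source version is a member function with an SSE2 path that this excerpt shows only in part.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true iff any of the first `size` null-map
// entries is non-zero. Entries at index >= size are ignored.
inline bool has_null_in_prefix(const std::vector<uint8_t>& null_map, size_t size) {
    assert(size <= null_map.size());
    for (size_t i = 0; i < size; ++i) {
        if (null_map[i] != 0) return true;
    }
    return false;
}
```

With a null map of `{0, 0, 0, 1}`, a scan of the first three entries reports no nulls, while a scan of all four does.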
[incubator-doris] 13/33: [Vectorized][Feature] support money_format/ucase/character_length (#7649)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit cc451e7126235327872f1d043ce41d09460fbd79 Author: Pxl <952130...@qq.com> AuthorDate: Fri Jan 7 15:04:55 2022 +0800 [Vectorized][Feature] support money_format/ucase/character_length (#7649) --- be/src/vec/functions/function_string.cpp | 12 +++- be/src/vec/functions/function_string.h| 110 +- gensrc/script/doris_builtins_functions.py | 8 +-- 3 files changed, 120 insertions(+), 10 deletions(-) diff --git a/be/src/vec/functions/function_string.cpp b/be/src/vec/functions/function_string.cpp index 34f1c6b..bdeb7e5 100644 --- a/be/src/vec/functions/function_string.cpp +++ b/be/src/vec/functions/function_string.cpp @@ -272,7 +272,7 @@ struct HexStringName { struct HexStringImpl { static DataTypes get_variadic_argument_types() { -return {std::make_shared()}; +return {std::make_shared()}; } static Status vector(const ColumnString::Chars& data, const ColumnString::Offsets& offsets, @@ -774,8 +774,8 @@ void register_function_string(SimpleFunctionFactory& factory) { factory.register_function(); factory.register_function(); factory.register_function(); -factory.register_function>(); -factory.register_function>(); +factory.register_function>(); +factory.register_function>(); factory.register_function(); factory.register_function(); factory.register_function(); @@ -792,12 +792,18 @@ void register_function_string(SimpleFunctionFactory& factory) { factory.register_function(); factory.register_function(); factory.register_function(); +factory.register_function>(); +factory.register_function>(); +factory.register_function>(); +factory.register_function>(); factory.register_alias(FunctionLeft::name, "strleft"); factory.register_alias(FunctionRight::name, "strright"); factory.register_alias(SubstringUtil::name, "substr"); factory.register_alias(FunctionToLower::name, "lcase"); 
+factory.register_alias(FunctionToUpper::name, "ucase"); factory.register_alias(FunctionStringMd5sum::name, "md5"); +factory.register_alias(FunctionStringUTF8Length::name, "character_length"); } } // namespace doris::vectorized diff --git a/be/src/vec/functions/function_string.h b/be/src/vec/functions/function_string.h index 3f3e538..af58062 100644 --- a/be/src/vec/functions/function_string.h +++ b/be/src/vec/functions/function_string.h @@ -24,9 +24,12 @@ #include #include "exprs/anyval_util.h" +#include "exprs/math_functions.h" +#include "exprs/string_functions.h" #include "runtime/string_value.hpp" #include "util/md5.h" #include "util/url_parser.h" +#include "vec/columns/column_decimal.h" #include "vec/columns/column_nullable.h" #include "vec/columns/column_string.h" #include "vec/columns/columns_number.h" @@ -211,7 +214,7 @@ public: } }; -struct Substr3Imp { +struct Substr3Impl { static DataTypes get_variadic_argument_types() { return {std::make_shared(), std::make_shared(), std::make_shared()}; @@ -225,7 +228,7 @@ struct Substr3Imp { } }; -struct Substr2Imp { +struct Substr2Impl { static DataTypes get_variadic_argument_types() { return {std::make_shared(), std::make_shared()}; } @@ -558,7 +561,7 @@ public: } return Status::OK(); } -}; // namespace doris::vectorized +}; class FunctionStringRepeat : public IFunction { public: @@ -1038,4 +1041,105 @@ public: } }; +template +class FunctionMoneyFormat : public IFunction { +public: +static constexpr auto name = "money_format"; +static FunctionPtr create() { return std::make_shared>(); } +String get_name() const override { return name; } + +DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { +return std::make_shared(); +} +DataTypes get_variadic_argument_types_impl() const override { +return Impl::get_variadic_argument_types(); +} +size_t get_number_of_arguments() const override { return 1; } + +bool use_default_implementation_for_constants() const override { return true; } + +Status 
execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, +size_t result, size_t input_rows_count) override { +auto res_column = ColumnString::create(); +ColumnPtr argument_column = block.get_by_position(arguments[0]).column; + +auto result_column = assert_cast(res_column.get()); +auto data_column = assert_cast(argument_column.get()); + +Impl::execute(context, result_column, data_column, input_rows_count); + +block.replace_by_position(result, std::move(res_column)); +return Status::OK(); +} +}; + +struct MoneyFormatDoubleImpl { +us
[incubator-doris] 07/33: [Refactor] Cow refactor: giveup using boost (#7567)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit a261538948f61f9b38c4e4d906bd9254bbae07b4 Author: thinker AuthorDate: Thu Jan 6 00:12:17 2022 +0800 [Refactor] Cow refactor: giveup using boost (#7567) Co-authored-by: zuochunwei --- be/src/vec/common/cow.h | 172 ++-- be/src/vec/exec/vanalytic_eval_node.cpp | 2 +- be/src/vec/functions/function_cast.h| 2 +- 3 files changed, 146 insertions(+), 30 deletions(-) diff --git a/be/src/vec/common/cow.h b/be/src/vec/common/cow.h index dbee2b9..08edb89 100644 --- a/be/src/vec/common/cow.h +++ b/be/src/vec/common/cow.h @@ -24,6 +24,7 @@ #include #include + /** Copy-on-write shared ptr. * Allows to work with shared immutable objects and sometimes unshare and mutate you own unique copy. * @@ -92,36 +93,158 @@ * to use std::unique_ptr for it somehow. */ template -class COW : public boost::intrusive_ref_counter { -private: +class COW { +std::atomic_uint ref_counter; + +protected: +COW() : ref_counter(0) {} + +COW(COW const&) : ref_counter(0) {} + +COW& operator=(COW const&) { +return *this; +} + +unsigned int use_count() const { +return ref_counter.load(); +} + +void add_ref() { +++ref_counter; +} + +void release_ref() { +if (--ref_counter == 0) { +delete static_cast(this); +} +} + Derived* derived() { return static_cast(this); } + const Derived* derived() const { return static_cast(this); } template -class IntrusivePtr : public boost::intrusive_ptr { +class intrusive_ptr { public: -using boost::intrusive_ptr::intrusive_ptr; +intrusive_ptr() : t(nullptr) {} + +intrusive_ptr(T* t, bool add_ref=true) : t(t) { +if (t && add_ref) ((std::remove_const_t*)t)->add_ref(); +} + +template +intrusive_ptr(intrusive_ptr const& rhs) : t(rhs.get()) { +if (t) ((std::remove_const_t*)t)->add_ref(); +} + +intrusive_ptr(intrusive_ptr const& rhs) : t(rhs.get()) { +if (t) ((std::remove_const_t*)t)->add_ref(); +} + 
+~intrusive_ptr() { +if (t) ((std::remove_const_t*)t)->release_ref(); +} + +template +intrusive_ptr& operator=(intrusive_ptr const& rhs) { +intrusive_ptr(rhs).swap(*this); +return *this; +} + +intrusive_ptr(intrusive_ptr&& rhs) : t(rhs.t) { +rhs.t = nullptr; +} + +intrusive_ptr& operator=(intrusive_ptr&& rhs) { +intrusive_ptr(static_cast(rhs)).swap(*this); +return *this; +} + +template friend class intrusive_ptr; + +template +intrusive_ptr(intrusive_ptr&& rhs) : t(rhs.t) { +rhs.t = nullptr; +} + +template +intrusive_ptr& operator=(intrusive_ptr&& rhs) { +intrusive_ptr(static_cast&&>(rhs)).swap(*this); +return *this; +} + +intrusive_ptr& operator=(intrusive_ptr const& rhs) { +intrusive_ptr(rhs).swap(*this); +return *this; +} + +intrusive_ptr& operator=(T* rhs) { +intrusive_ptr(rhs).swap(*this); +return *this; +} + +void reset() { +intrusive_ptr().swap(*this); +} + +void reset(T* rhs) { +intrusive_ptr(rhs).swap(*this); +} + +void reset(T* rhs, bool add_ref) { +intrusive_ptr(rhs, add_ref).swap(*this); +} + +T* get() const { +return t; +} + +T* detach() { +T* ret = t; +t = nullptr; +return ret; +} + +void swap(intrusive_ptr& rhs) { +T* tmp = t; +t = rhs.t; +rhs.t = tmp; +} + +T& operator*() const& { +return *t; +} -T& operator*() const& { return boost::intrusive_ptr::operator*(); } T&& operator*() const&& { -return const_cast::type&&>( -*boost::intrusive_ptr::get()); +return const_cast&&>(*t); +} + +T* operator->() const { +return t; +} + +operator bool() const { +return t != nullptr; +} + +operator T*() const { +return t; } + +private: +T* t; }; protected: template -class mutable_ptr : public IntrusivePtr { +class mutable_ptr : public intrusive_ptr { private: -using Base = IntrusivePtr; +using Base = intrusive_ptr; -template -friend class COW; -template -friend class COWHelper; +template friend class COW; +templ
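The `cow.h` diff above replaces `boost::intrusive_ref_counter` and `boost::intrusive_ptr` with a hand-written atomic counter and pointer wrapper. The core idea, with the reference count stored inside the object and adjusted by the smart pointer, can be sketched as follows. This is a simplified illustration, not the code from the commit: `RefCounted`, `IntrusivePtr` and `Node` are hypothetical names, and the converting constructors, `detach()` and `reset()` from the diff are omitted.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical base class: the counter lives inside the object itself.
template <typename Derived>
class RefCounted {
public:
    unsigned use_count() const { return refs_.load(); }
    void add_ref() { ++refs_; }
    void release_ref() {
        // The last owner destroys the object through its most-derived type.
        if (--refs_ == 0) delete static_cast<Derived*>(this);
    }

protected:
    RefCounted() = default;
    ~RefCounted() = default;

private:
    std::atomic_uint refs_{0};
};

// Hypothetical pointer wrapper: copies bump the count, moves steal the pointer.
template <typename T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    explicit IntrusivePtr(T* p) : p_(p) {
        if (p_) p_->add_ref();
    }
    IntrusivePtr(const IntrusivePtr& rhs) : p_(rhs.p_) {
        if (p_) p_->add_ref();
    }
    IntrusivePtr(IntrusivePtr&& rhs) noexcept : p_(rhs.p_) { rhs.p_ = nullptr; }
    ~IntrusivePtr() {
        if (p_) p_->release_ref();
    }
    // Copy-and-swap covers both copy and move assignment.
    IntrusivePtr& operator=(IntrusivePtr rhs) noexcept {
        swap(rhs);
        return *this;
    }
    void swap(IntrusivePtr& rhs) noexcept { std::swap(p_, rhs.p_); }
    T* get() const { return p_; }
    T* operator->() const { return p_; }
    explicit operator bool() const { return p_ != nullptr; }

private:
    T* p_ = nullptr;
};

// Example payload type for the sketch.
struct Node : RefCounted<Node> {
    int value = 0;
};
```

Keeping the count in the object means a raw pointer can be wrapped again at any time without a separate control block, which is what lets a copy-on-write column check `use_count() == 1` before mutating in place.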
[incubator-doris] 30/33: [Vectorized][Improvement] Speed up column filtering via SIMD (#7775)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 0a68fc3138057d536cae16e0c4052469cdd2c76e Author: Zeno Yang AuthorDate: Mon Jan 17 16:54:40 2022 +0800 [Vectorized][Improvement] Speed up column filtering via SIMD (#7775) --- be/src/vec/columns/column_decimal.cpp | 25 + be/src/vec/columns/column_vector.cpp | 28 ++-- be/src/vec/columns/columns_common.cpp | 20 +--- be/src/vec/columns/columns_common.h | 30 ++ 4 files changed, 74 insertions(+), 29 deletions(-) diff --git a/be/src/vec/columns/column_decimal.cpp b/be/src/vec/columns/column_decimal.cpp index 8e5ae12..5cc5853 100644 --- a/be/src/vec/columns/column_decimal.cpp +++ b/be/src/vec/columns/column_decimal.cpp @@ -162,6 +162,31 @@ ColumnPtr ColumnDecimal::filter(const IColumn::Filter& filt, ssize_t result_s const UInt8* filt_end = filt_pos + size; const T* data_pos = data.data(); +/** A slightly more optimized version. +* Based on the assumption that often pieces of consecutive values +* completely pass or do not pass the filter. +* Therefore, we will optimistically check the parts of `SIMD_BYTES` values. 
+*/ +static constexpr size_t SIMD_BYTES = 32; +const UInt8* filt_end_sse = filt_pos + size / SIMD_BYTES * SIMD_BYTES; + +while (filt_pos < filt_end_sse) { +uint32_t mask = bytes32_mask_to_bits32_mask(filt_pos); + +if (0x == mask) { +res_data.insert(data_pos, data_pos + SIMD_BYTES); +} else { +while (mask) { +const size_t idx = __builtin_ctzll(mask); +res_data.push_back(data_pos[idx]); +mask = mask & (mask - 1); +} +} + +filt_pos += SIMD_BYTES; +data_pos += SIMD_BYTES; +} + while (filt_pos < filt_end) { if (*filt_pos) res_data.push_back(*data_pos); diff --git a/be/src/vec/columns/column_vector.cpp b/be/src/vec/columns/column_vector.cpp index f6627d0..017ae29 100644 --- a/be/src/vec/columns/column_vector.cpp +++ b/be/src/vec/columns/column_vector.cpp @@ -26,6 +26,8 @@ #include #include +#include "runtime/datetime_value.h" +#include "vec/columns/columns_common.h" #include "vec/common/arena.h" #include "vec/common/bit_cast.h" #include "vec/common/exception.h" @@ -33,12 +35,6 @@ #include "vec/common/sip_hash.h" #include "vec/common/unaligned.h" -#include "runtime/datetime_value.h" - -#ifdef __SSE2__ -#include -#endif - namespace doris::vectorized { template @@ -237,34 +233,30 @@ ColumnPtr ColumnVector::filter(const IColumn::Filter& filt, ssize_t result_si const UInt8* filt_end = filt_pos + size; const T* data_pos = data.data(); -#ifdef __SSE2__ /** A slightly more optimized version. * Based on the assumption that often pieces of consecutive values * completely pass or do not pass the filter. * Therefore, we will optimistically check the parts of `SIMD_BYTES` values. 
*/ - -static constexpr size_t SIMD_BYTES = 16; -const __m128i zero16 = _mm_setzero_si128(); +static constexpr size_t SIMD_BYTES = 32; const UInt8* filt_end_sse = filt_pos + size / SIMD_BYTES * SIMD_BYTES; while (filt_pos < filt_end_sse) { -int mask = _mm_movemask_epi8(_mm_cmpgt_epi8( -_mm_loadu_si128(reinterpret_cast(filt_pos)), zero16)); +uint32_t mask = bytes32_mask_to_bits32_mask(filt_pos); -if (0 == mask) { -/// Nothing is inserted. -} else if (0x == mask) { +if (0x == mask) { res_data.insert(data_pos, data_pos + SIMD_BYTES); } else { -for (size_t i = 0; i < SIMD_BYTES; ++i) -if (filt_pos[i]) res_data.push_back(data_pos[i]); +while (mask) { +const size_t idx = __builtin_ctzll(mask); +res_data.push_back(data_pos[idx]); +mask = mask & (mask - 1); +} } filt_pos += SIMD_BYTES; data_pos += SIMD_BYTES; } -#endif while (filt_pos < filt_end) { if (*filt_pos) res_data.push_back(*data_pos); diff --git a/be/src/vec/columns/columns_common.cpp b/be/src/vec/columns/columns_common.cpp index 02d650a..3045c5b 100644 --- a/be/src/vec/columns/columns_common.cpp +++ b/be/src/vec/columns/columns_common.cpp @@ -24,6 +24,7 @@ #include "vec/columns/column.h" #include "vec/columns/column_vector.h" +#include "vec/columns/columns_common.h" #include "vec/common/typeid_cast.h" namespace doris::vectorized { @@ -173,18 +174,13 @@ void filter_arrays_impl_generic(const PaddedPODArray& src_elems, memcpy(&res_elems[elems_size_old], &src_elems[arr_offset], arr_size * sizeof(T)); }; -#ifdef
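The filter loop in the diff above examines 32 filter bytes per step: it packs them into a 32-bit mask, copies the whole run when every bit is set, and otherwise visits only the set bits, using count-trailing-zeros to find each one and `mask & (mask - 1)` to clear it. A minimal portable sketch of that idea follows. It is not the Doris code: `pack_filter_bytes` is a hypothetical scalar stand-in for `bytes32_mask_to_bits32_mask` (whose body is cut off in this excerpt), the all-ones test against `UINT32_MAX` is an assumption, and `__builtin_ctz` requires GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in: bit i of the result is set iff filt[i] != 0.
// A real SIMD version would do this with a compare plus movemask.
inline uint32_t pack_filter_bytes(const uint8_t* filt) {
    uint32_t mask = 0;
    for (size_t i = 0; i < 32; ++i) {
        mask |= static_cast<uint32_t>(filt[i] != 0) << i;
    }
    return mask;
}

template <typename T>
std::vector<T> filter_sketch(const std::vector<T>& data, const std::vector<uint8_t>& filt) {
    assert(data.size() == filt.size());
    constexpr size_t kBlock = 32;
    std::vector<T> out;
    const size_t n = data.size();
    const size_t blocked = n / kBlock * kBlock;
    size_t pos = 0;
    for (; pos < blocked; pos += kBlock) {
        uint32_t mask = pack_filter_bytes(filt.data() + pos);
        if (mask == UINT32_MAX) {
            // Every row in the block passes: copy it in one go.
            out.insert(out.end(), data.begin() + pos, data.begin() + pos + kBlock);
        } else {
            // Visit only the set bits; an all-zero mask skips the loop.
            while (mask != 0) {
                const int idx = __builtin_ctz(mask);
                out.push_back(data[pos + idx]);
                mask &= mask - 1;  // clear the lowest set bit
            }
        }
    }
    // Scalar tail for the remaining rows that do not fill a block.
    for (; pos < n; ++pos) {
        if (filt[pos]) out.push_back(data[pos]);
    }
    return out;
}
```

The cost of the partial-block branch grows with the number of selected rows rather than with the block size, which is presumably why the diff drops the old per-byte inner loop.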
[incubator-doris] 25/33: [Vectorized] Rebase code from master
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 20619e795832ca6947a32787f7f8937f5c6d0411 Author: lihaopeng AuthorDate: Thu Jan 13 17:27:07 2022 +0800 [Vectorized] Rebase code from master --- be/src/exec/olap_scanner.cpp | 2 +- be/src/exec/olap_scanner.h| 4 ++-- be/src/vec/exec/join/vhash_join_node.cpp | 2 +- be/src/vec/exec/volap_scanner.cpp | 7 --- be/src/vec/exec/volap_scanner.h | 6 +- be/src/vec/functions/function_binary_arithmetic.h | 2 +- be/src/vec/olap/block_reader.cpp | 2 +- be/src/vec/olap/block_reader.h| 4 ++-- be/src/vec/olap/vcollect_iterator.cpp | 6 +++--- be/src/vec/olap/vcollect_iterator.h | 14 +++--- be/src/vec/runtime/vdatetime_value.cpp| 2 +- be/src/vec/runtime/vdatetime_value.h | 2 +- be/test/vec/core/block_test.cpp | 2 +- 13 files changed, 30 insertions(+), 25 deletions(-) diff --git a/be/src/exec/olap_scanner.cpp b/be/src/exec/olap_scanner.cpp index 34336fa..2e05c5d 100644 --- a/be/src/exec/olap_scanner.cpp +++ b/be/src/exec/olap_scanner.cpp @@ -176,7 +176,7 @@ Status OlapScanner::_init_tablet_reader_params( _tablet_reader_params.rs_readers[1]->rowset()->start_version() == 2 && !_tablet_reader_params.rs_readers[1]->rowset()->rowset_meta()->is_segments_overlapping()); -_params.origin_return_columns = &_return_columns; +_tablet_reader_params.origin_return_columns = &_return_columns; if (_aggregation || single_version) { _tablet_reader_params.return_columns = _return_columns; _tablet_reader_params.direct_mode = true; diff --git a/be/src/exec/olap_scanner.h b/be/src/exec/olap_scanner.h index f234925..0c684d9 100644 --- a/be/src/exec/olap_scanner.h +++ b/be/src/exec/olap_scanner.h @@ -58,7 +58,7 @@ public: Status open(); -Status get_batch(RuntimeState* state, RowBatch* batch, bool* eof); +virtual Status get_batch(RuntimeState* state, RowBatch* batch, bool* eof); Status close(RuntimeState* state); @@ -103,7 
+103,7 @@ protected: // Update profile that need to be reported in realtime. void _update_realtime_counter(); -virtual void set_tablet_reader() { _tablet_reader.reset(new TupleReader); } +virtual void set_tablet_reader() { _tablet_reader = std::make_unique(); } protected: RuntimeState* _runtime_state; diff --git a/be/src/vec/exec/join/vhash_join_node.cpp b/be/src/vec/exec/join/vhash_join_node.cpp index 4533cae..9563ebf 100644 --- a/be/src/vec/exec/join/vhash_join_node.cpp +++ b/be/src/vec/exec/join/vhash_join_node.cpp @@ -590,7 +590,7 @@ Status HashJoinNode::init(const TPlanNode& tnode, RuntimeState* state) { for (const auto& filter_desc : _runtime_filter_descs) { RETURN_IF_ERROR(state->runtime_filter_mgr()->regist_filter(RuntimeFilterRole::PRODUCER, - filter_desc)); + filter_desc, state->query_options())); } return Status::OK(); diff --git a/be/src/vec/exec/volap_scanner.cpp b/be/src/vec/exec/volap_scanner.cpp index 64a51a1..1b4bb02 100644 --- a/be/src/vec/exec/volap_scanner.cpp +++ b/be/src/vec/exec/volap_scanner.cpp @@ -17,6 +17,8 @@ #include "vec/exec/volap_scanner.h" +#include + #include "vec/columns/column_complex.h" #include "vec/columns/column_nullable.h" #include "vec/columns/column_string.h" @@ -25,14 +27,13 @@ #include "vec/core/block.h" #include "vec/exec/volap_scan_node.h" #include "vec/exprs/vexpr_context.h" -#include "vec/olap/block_reader.h" #include "vec/runtime/vdatetime_value.h" + namespace doris::vectorized { VOlapScanner::VOlapScanner(RuntimeState* runtime_state, VOlapScanNode* parent, bool aggregation, bool need_agg_finalize, const TPaloScanRange& scan_range) : OlapScanner(runtime_state, parent, aggregation, need_agg_finalize, scan_range) { -_reader.reset(new BlockReader); } Status VOlapScanner::get_block(RuntimeState* state, vectorized::Block* block, bool* eof) { @@ -50,7 +51,7 @@ Status VOlapScanner::get_block(RuntimeState* state, vectorized::Block* block, bo do { // Read one block from block reader -auto res = 
_reader->next_block_with_aggregation(block, nullptr, nullptr, eof); +auto res = _tablet_reader->next_block_with_aggregation(block, nullptr, nullptr, eof); if (res != OLAP_SUCCESS) { std::stringstream ss; ss << "Internal Error: read storage fail. res=" << res diff --git a/be/src/vec/exec/volap_scanner.h b/be/src/vec/exec/volap_scanner.h index 3c66f4d..5efaf9d 100644
[incubator-doris] 33/33: [Vectorized](compile) Fix compile error and warning (#7780)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 8a1a6126b4c387eaa678f4a65e773d795e021f0a Author: Zeno Yang AuthorDate: Mon Jan 17 20:27:13 2022 +0800 [Vectorized](compile) Fix compile error and warning (#7780) --- be/src/vec/columns/column.h| 1 + be/src/vec/functions/function.h| 3 +++ be/src/vec/functions/function_grouping.cpp | 4 ++-- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h index d58979d..71b86ae 100644 --- a/be/src/vec/columns/column.h +++ b/be/src/vec/columns/column.h @@ -220,6 +220,7 @@ public: */ virtual Ptr filter_by_selector(const uint16_t* sel, size_t sel_size, Ptr* ptr = nullptr) { LOG(FATAL) << "column not support filter_by_selector"; +__builtin_unreachable(); }; /// Permutes elements using specified permutation. Is used in sortings. diff --git a/be/src/vec/functions/function.h b/be/src/vec/functions/function.h index bc76a10..cf00c08 100644 --- a/be/src/vec/functions/function.h +++ b/be/src/vec/functions/function.h @@ -404,6 +404,7 @@ public: const ColumnNumbers& /*arguments*/, size_t /*result*/) const final { LOG(FATAL) << "prepare is not implemented for IFunction"; +__builtin_unreachable(); } Status prepare(FunctionContext* context, FunctionContext::FunctionStateScope scope) override { @@ -412,10 +413,12 @@ public: [[noreturn]] const DataTypes& get_argument_types() const final { LOG(FATAL) << "get_argument_types is not implemented for IFunction"; +__builtin_unreachable(); } [[noreturn]] const DataTypePtr& get_return_type() const final { LOG(FATAL) << "get_return_type is not implemented for IFunction"; +__builtin_unreachable(); } protected: diff --git a/be/src/vec/functions/function_grouping.cpp b/be/src/vec/functions/function_grouping.cpp index 763dec2..07872ee 100644 --- a/be/src/vec/functions/function_grouping.cpp +++ 
b/be/src/vec/functions/function_grouping.cpp @@ -15,11 +15,11 @@ // specific language governing permissions and limitations // under the License. -#include "function_grouping.h" +#include "vec/functions/function_grouping.h" namespace doris::vectorized { void register_function_grouping(SimpleFunctionFactory& factory) { factory.register_function(); factory.register_function(); } -} +} // namespace doris::vectorized
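Note on the change in #7780: a minimal sketch of why a non-void function that ends in a fatal log may still need `__builtin_unreachable()`. `FatalLog`, `ColumnBase` and `CountingColumn` are hypothetical stand-ins, not Doris or glog classes; the assumption is that the logging temporary aborts in its destructor without being marked `[[noreturn]]`, so the compiler still sees a path that falls off the end of the function. `__builtin_unreachable()` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <iostream>

// Hypothetical stand-in for a LOG(FATAL)-style stream: it aborts when the
// temporary is destroyed, but nothing in its type says "never returns".
class FatalLog {
public:
    template <typename T>
    FatalLog& operator<<(const T& value) {
        std::cerr << value;
        return *this;
    }
    ~FatalLog() {
        std::cerr << std::endl;
        std::abort();
    }
};

struct ColumnBase {
    virtual ~ColumnBase() = default;

    // Default implementation: the operation is unsupported. Without the hint
    // below, the compiler sees a non-void function with no return statement.
    virtual std::size_t filter_by_selector(const std::uint16_t* /*sel*/,
                                           std::size_t /*sel_size*/) {
        FatalLog() << "column not support filter_by_selector";
        __builtin_unreachable(); // GCC/Clang: control never gets here
    }
};

// A subclass that does support the operation; here it only reports how many
// rows were selected.
struct CountingColumn : ColumnBase {
    std::size_t filter_by_selector(const std::uint16_t* sel,
                                   std::size_t sel_size) override {
        assert(sel != nullptr || sel_size == 0);
        return sel_size;
    }
};
```

Calling `filter_by_selector` on a `CountingColumn` returns the selection size; calling it on a plain `ColumnBase` would abort the process.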
[incubator-doris] 09/33: [Vectorized][Feature] fix core dump when using function override and function alias at the same time && support substr(str, int) override (#7640)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 41b165cbf10b39ec41f63958e2332aa1d5b8fc5d Author: Pxl <952130...@qq.com> AuthorDate: Thu Jan 6 18:57:05 2022 +0800 [Vectorized][Feature] fix core dump when using function override and function alias at the same time && support substr(str,int) override (#7640) --- be/src/vec/functions/function_string.cpp | 19 +++-- be/src/vec/functions/function_string.h | 109 - be/src/vec/functions/function_timestamp.cpp| 11 +-- be/src/vec/functions/simple_function_factory.h | 10 ++- gensrc/script/doris_builtins_functions.py | 4 +- 5 files changed, 111 insertions(+), 42 deletions(-) diff --git a/be/src/vec/functions/function_string.cpp b/be/src/vec/functions/function_string.cpp index 73e2413..34f1c6b 100644 --- a/be/src/vec/functions/function_string.cpp +++ b/be/src/vec/functions/function_string.cpp @@ -293,7 +293,7 @@ struct HexStringImpl { dst_data_ptr++; offset++; } else { -VStringFunctions::hex_encode(source, srclen, reinterpret_cast(dst_data_ptr)); +VStringFunctions::hex_encode(source, srclen, reinterpret_cast(dst_data_ptr)); dst_data_ptr[srclen * 2] = '\0'; dst_data_ptr += (srclen * 2 + 1); offset += (srclen * 2 + 1); @@ -513,9 +513,9 @@ struct AesEncryptImpl { int cipher_len = l_size + 16; char p[cipher_len]; -int outlen = -EncryptionUtil::encrypt(AES_128_ECB, (unsigned char*)l_raw, l_size, - (unsigned char*)r_raw, r_size, NULL, true, (unsigned char*)p); +int outlen = EncryptionUtil::encrypt(AES_128_ECB, (unsigned char*)l_raw, l_size, + (unsigned char*)r_raw, r_size, NULL, true, + (unsigned char*)p); if (outlen < 0) { StringOP::push_null_string(i, res_data, res_offsets, null_map_data); } else { @@ -553,9 +553,9 @@ struct AesDecryptImpl { int cipher_len = l_size; char p[cipher_len]; -int outlen = -EncryptionUtil::decrypt(AES_128_ECB, (unsigned char*)l_raw, l_size, - (unsigned char*)r_raw, 
r_size, NULL, true, (unsigned char*)p); +int outlen = EncryptionUtil::decrypt(AES_128_ECB, (unsigned char*)l_raw, l_size, + (unsigned char*)r_raw, r_size, NULL, true, + (unsigned char*)p); if (outlen < 0) { StringOP::push_null_string(i, res_data, res_offsets, null_map_data); } else { @@ -774,7 +774,8 @@ void register_function_string(SimpleFunctionFactory& factory) { factory.register_function(); factory.register_function(); factory.register_function(); -factory.register_function(); +factory.register_function>(); +factory.register_function>(); factory.register_function(); factory.register_function(); factory.register_function(); @@ -794,7 +795,7 @@ void register_function_string(SimpleFunctionFactory& factory) { factory.register_alias(FunctionLeft::name, "strleft"); factory.register_alias(FunctionRight::name, "strright"); -factory.register_alias(FunctionSubstring::name, "substr"); +factory.register_alias(SubstringUtil::name, "substr"); factory.register_alias(FunctionToLower::name, "lcase"); factory.register_alias(FunctionStringMd5sum::name, "md5"); } diff --git a/be/src/vec/functions/function_string.h b/be/src/vec/functions/function_string.h index efeef41..3f3e538 100644 --- a/be/src/vec/functions/function_string.h +++ b/be/src/vec/functions/function_string.h @@ -88,25 +88,9 @@ struct StringOP { } }; -class FunctionSubstring : public IFunction { -public: +struct SubstringUtil { static constexpr auto name = "substring"; -static FunctionPtr create() { return std::make_shared(); } -String get_name() const override { return name; } -size_t get_number_of_arguments() const override { return 3; } - -DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { -return make_nullable(std::make_shared()); -} - -bool use_default_implementation_for_nulls() const override { return false; } -bool use_default_implementation_for_constants() const override { return true; } -Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, 
-size_t result, size_t input_rows_count) override { -substring_execute(block, arguments, result, input_rows_count); -return S
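Note on #7640: a rough sketch of how a two-argument `substr(str, pos)` can share its logic with the three-argument form. This is not the code from the commit. `substring_bytes` is a hypothetical helper, it works on bytes rather than characters, and the MySQL-style rules used here (1-based position, negative position counted from the end, position 0 yields an empty string) are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Three-argument form: take `len` bytes starting at 1-based position `pos`.
// A negative `pos` counts back from the end of the string.
std::string substring_bytes(const std::string& s, int pos, int len) {
    const int n = static_cast<int>(s.size());
    if (pos == 0 || len <= 0) {
        return "";
    }
    const int start = pos > 0 ? pos - 1 : n + pos;
    if (start < 0 || start >= n) {
        return "";
    }
    // std::string::substr clamps the count to the end of the string.
    return s.substr(static_cast<std::size_t>(start), static_cast<std::size_t>(len));
}

// Two-argument form: everything from `pos` to the end, expressed by passing
// the full string length as the count.
std::string substring_bytes(const std::string& s, int pos) {
    return substring_bytes(s, pos, static_cast<int>(s.size()));
}
```

Keeping the shared logic in one helper means both overloads can be registered under the same name and alias without duplicating the slicing code.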
[incubator-doris] 06/33: [Vectorized][Function] Support function and (#7618)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 5779e9b580d8c4b80e9f3f677974d1b05ee4215a Author: Pxl <952130...@qq.com> AuthorDate: Wed Jan 5 20:11:07 2022 +0800 [Vectorized][Function] Support function and (#7618) --- be/src/vec/CMakeLists.txt | 1 + be/src/vec/functions/function_utility.cpp | 118 + be/src/vec/functions/math.cpp | 1 + be/src/vec/functions/simple_function_factory.h | 2 + gensrc/script/doris_builtins_functions.py | 4 +- 5 files changed, 124 insertions(+), 2 deletions(-) diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt index 1738819..71efde5 100644 --- a/be/src/vec/CMakeLists.txt +++ b/be/src/vec/CMakeLists.txt @@ -110,6 +110,7 @@ set(VEC_FILES functions/function_cast.cpp functions/function_string.cpp functions/function_timestamp.cpp + functions/function_utility.cpp functions/comparison_equal_for_null.cpp functions/function_json.cpp functions/hll_cardinality.cpp diff --git a/be/src/vec/functions/function_utility.cpp b/be/src/vec/functions/function_utility.cpp new file mode 100644 index 000..6c7da89 --- /dev/null +++ b/be/src/vec/functions/function_utility.cpp @@ -0,0 +1,118 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. 
See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "util/monotime.h" +#include "vec/data_types/data_type_number.h" +#include "vec/data_types/data_type_string.h" +#include "vec/functions/simple_function_factory.h" + +namespace doris::vectorized { + +class FunctionSleep : public IFunction { +public: +static constexpr auto name = "sleep"; +static FunctionPtr create() { return std::make_shared(); } + +String get_name() const override { return name; } + +size_t get_number_of_arguments() const override { return 1; } + +DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { +if (arguments[0].get()->is_nullable()) { +return make_nullable(std::make_shared()); +} +return std::make_shared(); +} + +bool use_default_implementation_for_constants() const override { return true; } +bool use_default_implementation_for_nulls() const override { return false; } + +Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, +size_t result, size_t input_rows_count) override { +ColumnPtr argument_column = + block.get_by_position(arguments[0]).column->convert_to_full_column_if_const(); + +auto res_column = ColumnUInt8::create(); + +if (auto* nullable_column = check_and_get_column(*argument_column)) { +auto null_map_column = ColumnUInt8::create(); + +auto nested_column = nullable_column->get_nested_column_ptr(); +auto data_column = assert_cast*>(nested_column.get()); + +for (int i = 0; i < input_rows_count; i++) { +if (nullable_column->is_null_at(i)) { +res_column->insert(0); +null_map_column->insert(1); +} else { +int seconds = data_column->get_data()[i]; +SleepFor(MonoDelta::FromSeconds(seconds)); +res_column->insert(1); +null_map_column->insert(0); +} +} + +block.replace_by_position(result, ColumnNullable::create(std::move(res_column), + std::move(null_map_column))); +} else { +auto data_column = assert_cast*>(argument_column.get()); + +for (int i = 0; i < 
input_rows_count; i++) { +int seconds = data_column->get_element(i); +SleepFor(MonoDelta::FromSeconds(seconds)); +res_column->insert(1); +} + +block.replace_by_position(result, std::move(res_column)); +} +return Status::OK(); +} +}; + +class FunctionVersion : public IFunction { +public: +static constexpr auto name = "version"; + +
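Note on #7618: the nullable branch of the sleep function above writes a value and a null flag for every row. A simplified sketch of that row loop follows, using `std::optional` in place of Doris' nullable column and `std::this_thread::sleep_for` in place of `SleepFor`; `NullableUInt8Column` and `sleep_rows` are hypothetical names. It needs C++17.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <optional>
#include <thread>
#include <vector>

// Result data and null map kept side by side, one entry per input row.
struct NullableUInt8Column {
    std::vector<std::uint8_t> values;
    std::vector<std::uint8_t> null_map; // 1 means the row is NULL
};

// For each row: a NULL input gives a NULL output; otherwise sleep for the
// requested number of seconds and report 1.
NullableUInt8Column sleep_rows(const std::vector<std::optional<int>>& seconds) {
    NullableUInt8Column out;
    out.values.reserve(seconds.size());
    out.null_map.reserve(seconds.size());
    for (const auto& s : seconds) {
        if (!s.has_value()) {
            out.values.push_back(0);
            out.null_map.push_back(1);
        } else {
            std::this_thread::sleep_for(std::chrono::seconds(*s));
            out.values.push_back(1);
            out.null_map.push_back(0);
        }
    }
    return out;
}
```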
[incubator-doris] 04/33: [Bug] Fix bug of concat function and fold const expr (#7608)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 5f52c04d62be933c6b1f377db862989a19574778 Author: HappenLee AuthorDate: Tue Jan 4 06:45:32 2022 -0600 [Bug] Fix bug of concat function and fold const expr (#7608) Co-authored-by: lihaopeng --- be/src/exec/exec_node.cpp | 2 +- be/src/runtime/fold_constant_executor.cpp | 10 +++--- be/src/runtime/fold_constant_executor.h | 2 +- be/src/vec/exec/join/vhash_join_node.cpp | 9 - be/src/vec/exec/vcross_join_node.cpp | 1 - be/src/vec/functions/function_string.h| 20 6 files changed, 17 insertions(+), 27 deletions(-) diff --git a/be/src/exec/exec_node.cpp b/be/src/exec/exec_node.cpp index b18160e..4582f89 100644 --- a/be/src/exec/exec_node.cpp +++ b/be/src/exec/exec_node.cpp @@ -756,7 +756,7 @@ void ExecNode::reached_limit(vectorized::Block* block, bool* eos) { } _num_rows_returned += block->rows(); -if (*eos) COUNTER_SET(_rows_returned_counter, _num_rows_returned); +COUNTER_SET(_rows_returned_counter, _num_rows_returned); } /* diff --git a/be/src/runtime/fold_constant_executor.cpp b/be/src/runtime/fold_constant_executor.cpp index cd6d5ff..9781c2f 100644 --- a/be/src/runtime/fold_constant_executor.cpp +++ b/be/src/runtime/fold_constant_executor.cpp @@ -82,7 +82,7 @@ Status FoldConstantExecutor::fold_constant_expr( expr_result.set_success(false); } else { expr_result.set_success(true); -result = _get_result(src, ctx->root()->type().type); +result = _get_result(src, 0, ctx->root()->type().type); } expr_result.set_content(std::move(result)); @@ -143,7 +143,8 @@ Status FoldConstantExecutor::fold_constant_vexpr( expr_result.set_success(false); } else { expr_result.set_success(true); -result = _get_result((void *) column_ptr->get_data_at(0).data, ctx->root()->type().type); +auto string_ref = column_ptr->get_data_at(0); +result = _get_result((void*)string_ref.data, string_ref.size, 
ctx->root()->type().type); } expr_result.set_content(std::move(result)); @@ -198,7 +199,7 @@ Status FoldConstantExecutor::_prepare_and_open(Context* ctx) { } template -string FoldConstantExecutor::_get_result(void* src, PrimitiveType slot_type){ +string FoldConstantExecutor::_get_result(void* src, size_t size, PrimitiveType slot_type){ switch (slot_type) { case TYPE_BOOLEAN: { bool val = *reinterpret_cast(src); @@ -237,6 +238,9 @@ string FoldConstantExecutor::_get_result(void* src, PrimitiveType slot_type){ case TYPE_STRING: case TYPE_HLL: case TYPE_OBJECT: { +if constexpr (is_vec) { +return std::string((char*)src, size); +} return (reinterpret_cast(src))->to_string(); } case TYPE_DATE: diff --git a/be/src/runtime/fold_constant_executor.h b/be/src/runtime/fold_constant_executor.h index c7c5a38..84c52f7 100644 --- a/be/src/runtime/fold_constant_executor.h +++ b/be/src/runtime/fold_constant_executor.h @@ -47,7 +47,7 @@ private: Status _prepare_and_open(Context* ctx); template -std::string _get_result(void* src, PrimitiveType slot_type); +std::string _get_result(void* src, size_t size, PrimitiveType slot_type); std::unique_ptr _runtime_state; std::shared_ptr _mem_tracker; diff --git a/be/src/vec/exec/join/vhash_join_node.cpp b/be/src/vec/exec/join/vhash_join_node.cpp index 62dfffa..7606783 100644 --- a/be/src/vec/exec/join/vhash_join_node.cpp +++ b/be/src/vec/exec/join/vhash_join_node.cpp @@ -124,7 +124,7 @@ struct ProcessRuntimeFilterBuild { ProcessRuntimeFilterBuild(HashJoinNode* join_node) : _join_node(join_node) {} Status operator()(RuntimeState* state, HashTableContext& hash_table_ctx) { -if (_join_node->_runtime_filter_descs.empty() || _join_node->_inserted_rows.empty()) { +if (_join_node->_runtime_filter_descs.empty()) { return Status::OK(); } VRuntimeFilterSlots* runtime_filter_slots = @@ -162,7 +162,6 @@ struct ProcessHashTableProbe { _probe_block(join_node->_probe_block), _probe_index(join_node->_probe_index), _probe_raw_ptrs(join_node->_probe_columns), - 
_arena(join_node->_arena), _rows_returned_counter(join_node->_rows_returned_counter) {} // Only process the join with no other join conjunt, because of no other join conjunt @@ -198,7 +197,7 @@ struct ProcessHashTableProbe { _arena)) {nullptr, false} : key_getter.find_key(hash_table_ctx.hash_table, _probe_inde
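Note on #7608: in the vectorized path the constant-folding code now passes the size along with the raw pointer and builds the result with `std::string((char*)src, size)`. A small sketch of why the length has to travel with the pointer is below. `StringRef` here is a simplified local struct, and both helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified (pointer, size) view of string data, as a column cell might
// expose it. The bytes are not required to be NUL-terminated.
struct StringRef {
    const char* data;
    std::size_t size;
};

// Uses the explicit length: every byte in the range is copied.
std::string to_result_string(const StringRef& ref) {
    return std::string(ref.data, ref.size);
}

// Relies on a terminator instead: stops at the first NUL byte, and would read
// out of bounds if the data had none.
std::string to_result_string_cstr(const StringRef& ref) {
    return std::string(ref.data);
}
```

With data such as `"ab\0cd"` and size 5, the first helper returns all five bytes while the second returns only `"ab"`.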
[incubator-doris] 03/33: [Function] Fix error about rank/dense_rank/row_number return always not nullable (#7561)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 9431a93dd623f76618c31921ea1f650ce6444a6a Author: zhangstar333 <87313068+zhangstar...@users.noreply.github.com> AuthorDate: Tue Jan 4 11:05:32 2022 +0800 [Function] Fix error about rank/dense_rank/row_number return always not nullable (#7561) --- .../src/main/java/org/apache/doris/catalog/AggregateFunction.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java index 4e85ca3..82e4035 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java +++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/AggregateFunction.java @@ -49,7 +49,7 @@ public class AggregateFunction extends Function { private static final Logger LOG = LogManager.getLogger(AggregateFunction.class); public static ImmutableSet NOT_NULLABLE_AGGREGATE_FUNCTION_NAME_SET = -ImmutableSet.of(FunctionSet.COUNT, "ndv", FunctionSet.BITMAP_UNION_INT, FunctionSet.BITMAP_UNION_COUNT, "ndv_no_finalize"); +ImmutableSet.of("row_number", "rank", "dense_rank", FunctionSet.COUNT, "ndv", FunctionSet.BITMAP_UNION_INT, FunctionSet.BITMAP_UNION_COUNT, "ndv_no_finalize"); // Set if different from retType_, null otherwise. private Type intermediateType; - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org

[incubator-doris] 17/33: [Feature][Vectorized] Support String in vec exe engine (#7670)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 1fc1c7005dccabdadeb0a5d2cd0cf0aa4871a3a8 Author: HappenLee AuthorDate: Mon Jan 10 20:27:45 2022 +0800 [Feature][Vectorized] Support String in vec exe engine (#7670) Co-authored-by: lihaopeng --- be/src/olap/olap_define.h | 3 ++- be/src/olap/row_block2.cpp | 26 -- be/src/olap/row_block2.h | 2 +- be/src/olap/rowset/beta_rowset_reader.cpp | 6 - be/src/vec/exec/vunion_node.cpp| 9 .../apache/doris/rewrite/FoldConstantsRule.java| 7 -- 6 files changed, 41 insertions(+), 12 deletions(-) diff --git a/be/src/olap/olap_define.h b/be/src/olap/olap_define.h index c2d4b7f..a9ac731 100644 --- a/be/src/olap/olap_define.h +++ b/be/src/olap/olap_define.h @@ -384,7 +384,8 @@ enum OLAPStatus { OLAP_ERR_ROWSET_LOAD_FAILED = -3109, OLAP_ERR_ROWSET_READER_INIT = -3110, OLAP_ERR_ROWSET_READ_FAILED = -3111, -OLAP_ERR_ROWSET_INVALID_STATE_TRANSITION = -3112 +OLAP_ERR_ROWSET_INVALID_STATE_TRANSITION = -3112, +OLAP_ERR_STRING_OVERFLOW_IN_VEC_ENGINE = -3113 }; enum ColumnFamilyIndex { diff --git a/be/src/olap/row_block2.cpp b/be/src/olap/row_block2.cpp index 26b58ca..877f6a2 100644 --- a/be/src/olap/row_block2.cpp +++ b/be/src/olap/row_block2.cpp @@ -95,7 +95,9 @@ Status RowBlockV2::convert_to_row_block(RowCursor* helper, RowBlock* dst) { return Status::OK(); } -void RowBlockV2::_copy_data_to_column(int cid, doris::vectorized::MutableColumnPtr& origin_column) { +Status RowBlockV2::_copy_data_to_column(int cid, doris::vectorized::MutableColumnPtr& origin_column) { +constexpr auto MAX_SIZE_OF_VEC_STRING = 1024l * 1024; + auto* column = origin_column.get(); bool nullable_mark_array[_selected_size]; @@ -170,6 +172,24 @@ void RowBlockV2::_copy_data_to_column(int cid, doris::vectorized::MutableColumnP } break; } +case OLAP_FIELD_TYPE_STRING: { +auto column_string = assert_cast(column); + +for (uint16_t j = 0; j 
< _selected_size; ++j) { +if (!nullable_mark_array[j]) { +uint16_t row_idx = _selection_vector[j]; +auto slice = reinterpret_cast(column_block(cid).cell_ptr(row_idx)); +if (LIKELY(slice->size <= MAX_SIZE_OF_VEC_STRING)) { +column_string->insert_data(slice->data, slice->size); +} else { +return Status::NotSupported("Not support string len over than 1MB in vec engine."); +} +} else { +column_string->insert_default(); +} +} +break; +} case OLAP_FIELD_TYPE_CHAR: { auto column_string = assert_cast(column); @@ -286,13 +306,15 @@ void RowBlockV2::_copy_data_to_column(int cid, doris::vectorized::MutableColumnP DCHECK(false) << "Invalid type in RowBlockV2:" << _schema.column(cid)->type(); } } + +return Status::OK(); } Status RowBlockV2::convert_to_vec_block(vectorized::Block* block) { for (int i = 0; i < _schema.column_ids().size(); ++i) { auto cid = _schema.column_ids()[i]; auto column = (*std::move(block->get_by_position(i).column)).assume_mutable(); -_copy_data_to_column(cid, column); +RETURN_IF_ERROR(_copy_data_to_column(cid, column)); } _pool->clear(); return Status::OK(); diff --git a/be/src/olap/row_block2.h b/be/src/olap/row_block2.h index cdbf428..b98ab95 100644 --- a/be/src/olap/row_block2.h +++ b/be/src/olap/row_block2.h @@ -109,7 +109,7 @@ public: std::string debug_string(); private: -void _copy_data_to_column(int cid, vectorized::MutableColumnPtr& mutable_column_ptr); +Status _copy_data_to_column(int cid, vectorized::MutableColumnPtr& mutable_column_ptr); Schema _schema; size_t _capacity; diff --git a/be/src/olap/rowset/beta_rowset_reader.cpp b/be/src/olap/rowset/beta_rowset_reader.cpp index 459f3ca..4d35f2f 100644 --- a/be/src/olap/rowset/beta_rowset_reader.cpp +++ b/be/src/olap/rowset/beta_rowset_reader.cpp @@ -204,7 +204,11 @@ OLAPStatus BetaRowsetReader::next_block(vectorized::Block* block) { { SCOPED_RAW_TIMER(&_stats->block_convert_ns); -_input_block->convert_to_vec_block(block); +auto s = _input_block->convert_to_vec_block(block); +if (UNLIKELY(!s.ok())) 
{ +LOG(WARNING) << "failed to read next block: " << s.to_string(); +return OLAP_ERR_STRING_OVERFLOW_IN_VEC_ENGINE; +} } is_first = false; } while (block->rows() < _context->runtime_state->batch_size()); // here we should keep block.rows() < batch_size diff --git a/be/
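Note on #7670: `_copy_data_to_column` changes from `void` to `Status` so that an oversized string can be reported to the caller instead of being silently copied. A minimal sketch of that pattern, with a hypothetical `Status` struct and `copy_strings` helper standing in for the Doris types; the limit is a parameter here, whereas the diff uses a 1 MB constant.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, much reduced status type: ok flag plus a message.
struct Status {
    bool ok_flag;
    std::string message;

    static Status OK() { return Status{true, ""}; }
    static Status NotSupported(std::string msg) { return Status{false, std::move(msg)}; }
    bool ok() const { return ok_flag; }
};

// Append every source string to `dst`, stopping with an error at the first
// string longer than `max_size` bytes.
Status copy_strings(const std::vector<std::string>& src, std::size_t max_size,
                    std::vector<std::string>* dst) {
    for (const auto& s : src) {
        if (s.size() > max_size) {
            return Status::NotSupported("string length exceeds the configured limit");
        }
        dst->push_back(s);
    }
    return Status::OK();
}
```

The caller then checks the status and maps a failure to its own error code, in the same way that `next_block` in the diff returns `OLAP_ERR_STRING_OVERFLOW_IN_VEC_ENGINE`.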
[incubator-doris] 14/33: [vectorized] [block] Add new method get_data_type to avoid unnecessary copy by the method get_data_type (#7600)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit e4619d98a18952a24e2e3583d3ab82da2d0a1ba8 Author: thinker AuthorDate: Fri Jan 7 15:37:35 2022 +0800 [vectorized] [block] Add new method get_data_type to avoid unnecessary copy by the method get_data_type (#7600) Co-authored-by: zuochunwei --- be/src/vec/core/block.cpp| 4 ++-- be/src/vec/core/block.h | 7 ++- be/src/vec/olap/block_reader.cpp | 4 ++-- 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp index a5e445d..2aff751 100644 --- a/be/src/vec/core/block.cpp +++ b/be/src/vec/core/block.cpp @@ -51,7 +51,7 @@ namespace doris::vectorized { -inline DataTypePtr get_data_type(const PColumn& pcolumn) { +inline DataTypePtr create_data_type(const PColumn& pcolumn) { switch (pcolumn.type()) { case PColumn::UINT8: { return std::make_shared(); @@ -176,7 +176,7 @@ Block::Block(const ColumnsWithTypeAndName& data_) : data {data_} { Block::Block(const PBlock& pblock) { for (const auto& pcolumn : pblock.columns()) { -DataTypePtr type = get_data_type(pcolumn); +DataTypePtr type = create_data_type(pcolumn); MutableColumnPtr data_column; if (pcolumn.is_null_size() > 0) { data_column = diff --git a/be/src/vec/core/block.h b/be/src/vec/core/block.h index addecf7..1c435ee 100644 --- a/be/src/vec/core/block.h +++ b/be/src/vec/core/block.h @@ -130,6 +130,11 @@ public: Names get_names() const; DataTypes get_data_types() const; +DataTypePtr get_data_type(size_t index) const { +CHECK(index < data.size()); +return data[index].type; +} + /// Returns number of rows from first column in block, not equal to nullptr. If no columns, returns 0. 
size_t rows() const; @@ -204,7 +209,7 @@ public: static Status filter_block(Block* block, int filter_conlumn_id, int column_to_keep); static inline void erase_useless_column(Block* block, int column_to_keep) { -for (size_t i = block->columns() - 1; i >= column_to_keep; --i) { +for (int i = block->columns() - 1; i >= column_to_keep; --i) { block->erase(i); } } diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp index 769e27e..8e2d4b2 100644 --- a/be/src/vec/olap/block_reader.cpp +++ b/be/src/vec/olap/block_reader.cpp @@ -94,11 +94,11 @@ void BlockReader::_init_agg_state() { // create aggregate function DataTypes argument_types; -argument_types.push_back(_next_row.block->get_data_types()[idx]); +argument_types.push_back(_next_row.block->get_data_type(idx)); Array params; AggregateFunctionPtr function = AggregateFunctionSimpleFactory::instance().get( agg_name, argument_types, params, -_next_row.block->get_data_types()[idx]->is_nullable()); +_next_row.block->get_data_type(idx)->is_nullable()); DCHECK(function != nullptr); _agg_functions.push_back(function);
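Note on #7600: besides adding `get_data_type(index)`, the diff changes the loop index in `erase_useless_column` from `size_t` to `int`. The commit message does not give a reason. One plausible reading is that a descending loop with an unsigned index never terminates when `column_to_keep` is 0, because `i >= 0` is always true for an unsigned value and `--i` wraps around. A sketch with a hypothetical helper over a plain vector:

```cpp
#include <cassert>
#include <vector>

// Drop every element at index >= column_to_keep, walking from the back.
// The signed index lets the loop end cleanly when column_to_keep is 0 or
// when the vector is already empty (size - 1 becomes -1, not a huge value).
void erase_useless_columns(std::vector<int>* columns, int column_to_keep) {
    assert(column_to_keep >= 0);
    for (int i = static_cast<int>(columns->size()) - 1; i >= column_to_keep; --i) {
        columns->erase(columns->begin() + i);
    }
}
```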
[incubator-doris] 10/33: [Function][Vec] add function coalesce (#7632)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit ebdc0bac985b488fda6f1a29d1c09ceedfe5315a Author: zhangstar333 <87313068+zhangstar...@users.noreply.github.com> AuthorDate: Thu Jan 6 19:33:25 2022 +0800 [Function][Vec] add function coalesce (#7632) --- be/src/vec/CMakeLists.txt | 1 + be/src/vec/functions/function_coalesce.cpp | 143 + be/src/vec/functions/simple_function_factory.h | 2 + docs/.vuepress/sidebar/en.js | 1 + docs/.vuepress/sidebar/zh-CN.js| 1 + .../sql-functions/string-functions/coalesce.md | 62 + .../sql-functions/string-functions/coalesce.md | 63 + gensrc/script/doris_builtins_functions.py | 28 ++-- 8 files changed, 287 insertions(+), 14 deletions(-) diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt index 71efde5..01c69eb 100644 --- a/be/src/vec/CMakeLists.txt +++ b/be/src/vec/CMakeLists.txt @@ -133,6 +133,7 @@ set(VEC_FILES functions/function_ifnull.cpp functions/nullif.cpp functions/random.cpp + functions/function_coalesce.cpp functions/function_date_or_datetime_computation.cpp functions/function_date_or_datetime_to_string.cpp functions/function_datetime_string_to_string.cpp diff --git a/be/src/vec/functions/function_coalesce.cpp b/be/src/vec/functions/function_coalesce.cpp new file mode 100644 index 000..65d544c --- /dev/null +++ b/be/src/vec/functions/function_coalesce.cpp @@ -0,0 +1,143 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "udf/udf.h" +#include "vec/data_types/data_type_nothing.h" +#include "vec/data_types/data_type_number.h" +#include "vec/data_types/get_least_supertype.h" +#include "vec/functions/function_helpers.h" +#include "vec/functions/simple_function_factory.h" +#include "vec/utils/util.hpp" + +namespace doris::vectorized { +class FunctionCoalesce : public IFunction { +public: +static constexpr auto name = "coalesce"; + +static FunctionPtr create() { return std::make_shared(); } + +String get_name() const override { return name; } + +bool use_default_implementation_for_constants() const override { return false; } + +bool use_default_implementation_for_nulls() const override { return false; } + +bool is_variadic() const override { return true; } + +size_t get_number_of_arguments() const override { return 0; } + +DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { +for (const auto& arg : arguments) { +if (!arg->is_nullable()) { +return arg; +} +} +return arguments[0]; +} + +Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, +size_t result, size_t input_rows_count) override { +DCHECK_GE(arguments.size(), 1); +ColumnNumbers filtered_args; +filtered_args.reserve(arguments.size()); +for (const auto& arg : arguments) { +const auto& type = block.get_by_position(arg).type; +if (type->only_null()) { +continue; +} +filtered_args.push_back(arg); +if (!type->is_nullable()) { +break; +} +} + +size_t remaining_rows = input_rows_count; +size_t argument_size = filtered_args.size(); +std::vector 
record_idx(input_rows_count, -1); //used to save column idx +MutableColumnPtr result_column; + +DataTypePtr type = block.get_by_position(result).type; +if (!type->is_nullable()) { +result_column = type->create_column(); +} else { +result_column = remove_nullable(type)->create_column(); +} + +result_column->reserve(input_rows_count); +auto return_type = std::make_shared(); +auto null_map = ColumnUInt8::create(input_rows_count, 1); +auto& null_map_data = null_map->get_data(); +ColumnPtr argument_columns[argument_size]; + +
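Note on #7632: `coalesce` returns, for each row, the first argument that is not NULL. The sketch below shows that row-wise rule on `std::optional` columns, including the early exit once no NULL rows remain (the role `remaining_rows` appears to play above). It is an illustration under those assumptions, not the implementation from the commit; `NullableIntColumn` and `coalesce` are hypothetical names, and C++17 is required.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using NullableIntColumn = std::vector<std::optional<int>>;

// For every row, pick the value from the first argument column that is not
// NULL in that row. Rows that are NULL in all arguments stay NULL.
NullableIntColumn coalesce(const std::vector<NullableIntColumn>& args, std::size_t rows) {
    NullableIntColumn result(rows, std::nullopt);
    std::size_t remaining = rows; // rows that still have no value
    for (const auto& col : args) {
        assert(col.size() == rows);
        for (std::size_t r = 0; r < rows && remaining > 0; ++r) {
            if (!result[r].has_value() && col[r].has_value()) {
                result[r] = col[r];
                --remaining;
            }
        }
        if (remaining == 0) {
            break; // later arguments cannot change anything
        }
    }
    return result;
}
```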
[incubator-doris] 11/33: [Bug] Fix function nulllable not match and largetint cast failed (#7659)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit da43e38a71ac7b5394e15d1d52ea055f50684f83 Author: HappenLee AuthorDate: Thu Jan 6 21:52:51 2022 -0600 [Bug] Fix function nulllable not match and largetint cast failed (#7659) Co-authored-by: lihaopeng --- be/src/vec/common/cow.h | 8 be/src/vec/core/block.cpp | 6 -- be/src/vec/data_types/data_type_number_base.cpp | 2 +- be/src/vec/functions/date_time_transforms.h | 8 +--- be/src/vec/io/io_helper.h | 18 -- gensrc/script/doris_builtins_functions.py | 10 +- 6 files changed, 23 insertions(+), 29 deletions(-) diff --git a/be/src/vec/common/cow.h b/be/src/vec/common/cow.h index 08edb89..58ae14d 100644 --- a/be/src/vec/common/cow.h +++ b/be/src/vec/common/cow.h @@ -105,10 +105,6 @@ protected: return *this; } -unsigned int use_count() const { -return ref_counter.load(); -} - void add_ref() { ++ref_counter; } @@ -265,6 +261,10 @@ protected: public: using MutablePtr = mutable_ptr; +unsigned int use_count() const { +return ref_counter.load(); +} + protected: template class immutable_ptr : public intrusive_ptr { diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp index b52257d..a5e445d 100644 --- a/be/src/vec/core/block.cpp +++ b/be/src/vec/core/block.cpp @@ -646,7 +646,8 @@ void Block::clear_column_data(int column_size) noexcept { } } for (auto& d : data) { -(*std::move(d.column)).mutate()->clear(); +DCHECK(d.column->use_count() == 1); +(*std::move(d.column)).assume_mutable()->clear(); } } @@ -691,7 +692,8 @@ Status Block::filter_block(Block* block, int filter_column_id, int column_to_kee if (auto* nullable_column = check_and_get_column(*filter_column)) { ColumnPtr nested_column = nullable_column->get_nested_column_ptr(); -MutableColumnPtr mutable_holder = (*std::move(nested_column)).mutate(); +MutableColumnPtr mutable_holder = nested_column->use_count() == 1 ? 
+nested_column->assume_mutable() : nested_column->clone_resized(nested_column->size()); ColumnUInt8* concrete_column = typeid_cast(mutable_holder.get()); if (!concrete_column) { diff --git a/be/src/vec/data_types/data_type_number_base.cpp b/be/src/vec/data_types/data_type_number_base.cpp index ee94a37..01a4248 100644 --- a/be/src/vec/data_types/data_type_number_base.cpp +++ b/be/src/vec/data_types/data_type_number_base.cpp @@ -35,7 +35,7 @@ namespace doris::vectorized { template void DataTypeNumberBase::to_string(const IColumn& column, size_t row_num, BufferWritable& ostr) const { -if constexpr (std::is_same::value || std::is_same::value) { +if constexpr (std::is_same::value) { std::string hex = int128_to_string( assert_cast&>(*column.convert_to_full_column_if_const().get()) .get_data()[row_num]); diff --git a/be/src/vec/functions/date_time_transforms.h b/be/src/vec/functions/date_time_transforms.h index a34b53d..eaab918 100644 --- a/be/src/vec/functions/date_time_transforms.h +++ b/be/src/vec/functions/date_time_transforms.h @@ -56,6 +56,7 @@ TIME_FUNCTION_IMPL(WeekOfYearImpl, weekofyear, week(mysql_week_mode(3))); TIME_FUNCTION_IMPL(DayOfYearImpl, dayofyear, day_of_year()); TIME_FUNCTION_IMPL(DayOfMonthImpl, dayofmonth, day()); TIME_FUNCTION_IMPL(DayOfWeekImpl, dayofweek, day_of_week()); +// TODO: the method should be always not nullable TIME_FUNCTION_IMPL(ToDaysImpl, to_days, daynr()); TIME_FUNCTION_IMPL(ToYearWeekImpl, yearweek, year_week(mysql_week_mode(0))); struct ToDateImpl { @@ -92,7 +93,7 @@ struct DayNameImpl { res_data[offset - 1] = 0; } else { auto len = strlen(day_name); -memcpy_small_allow_read_write_overflow15(&res_data[offset], day_name, len); +memcpy(&res_data[offset], day_name, len); offset += len + 1; res_data[offset - 1] = 0; } @@ -113,8 +114,8 @@ struct MonthNameImpl { res_data[offset - 1] = 0; } else { auto len = strlen(month_name); -memcpy_small_allow_read_write_overflow15(&res_data[offset], month_name, len); -offset += len + 1; 
+memcpy(&res_data[offset], month_name, len); +offset += (len + 1); res_data[offset - 1] = 0; } return offset; @@ -148,6 +149,7 @@ struct DateFormatImpl { } }; +// TODO: This function should be depend on argments not always nullable struct FromUnixTimeImpl { using FromType = Int32; diff --git a/be/src/vec/io/io_helper.h b/be/src/v
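Note on #7659: `filter_block` now reuses the nested column only when `use_count() == 1` and clones it otherwise. The same copy-on-write decision can be sketched with `std::shared_ptr` standing in for Doris' intrusive COW pointer; `Column`, `ColumnPtr`, `MutableColumnPtr` and `to_mutable` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using Column = std::vector<int>;
using ColumnPtr = std::shared_ptr<const Column>;
using MutableColumnPtr = std::shared_ptr<Column>;

// Return a column that is safe to modify. If the caller holds the only
// reference, the storage is reused; otherwise a private copy is made so other
// holders never observe the modification.
// Assumes the object was originally created as a non-const Column, and that
// no other thread is taking new references at the same time.
MutableColumnPtr to_mutable(ColumnPtr&& column) {
    if (column.use_count() == 1) {
        MutableColumnPtr result = std::const_pointer_cast<Column>(column);
        column.reset(); // hand ownership over to the mutable pointer
        return result;
    }
    return std::make_shared<Column>(*column);
}
```

The design choice is the usual copy-on-write trade: the check costs one counter read, and the copy is only paid when the data is actually shared.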
[incubator-doris] 15/33: [Vectorized] Support bloom filter predicate on vectorized engine storage layer (#7557)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit ae9d0cff0ad6aa59919eaa138307d619fffa9aeb Author: Zeno Yang AuthorDate: Sat Jan 8 01:00:43 2022 +0800 [Vectorized] Support bloom filter predicate on vectorized engine storage layer (#7557) --- be/src/olap/bloom_filter_predicate.h | 40 +- .../olap/bloom_filter_column_predicate_test.cpp| 36 ++ be/test/olap/null_predicate_test.cpp | 144 + 3 files changed, 219 insertions(+), 1 deletion(-) diff --git a/be/src/olap/bloom_filter_predicate.h b/be/src/olap/bloom_filter_predicate.h index b3dcbbb..ff3201c 100644 --- a/be/src/olap/bloom_filter_predicate.h +++ b/be/src/olap/bloom_filter_predicate.h @@ -27,6 +27,10 @@ #include "olap/field.h" #include "runtime/string_value.hpp" #include "runtime/vectorized_row_batch.h" +#include "vec/columns/column_nullable.h" +#include "vec/columns/column_vector.h" +#include "vec/columns/predicate_column.h" +#include "vec/utils/util.hpp" namespace doris { @@ -59,12 +63,14 @@ public: return Status::OK(); } +void evaluate(vectorized::IColumn& column, uint16_t* sel, uint16_t* size) const override; + private: std::shared_ptr _filter; SpecificFilter* _specific_filter; // owned by _filter }; -// blomm filter column predicate do not support in segment v1 +// bloom filter column predicate do not support in segment v1 template void BloomFilterColumnPredicate::evaluate(VectorizedRowBatch* batch) const { uint16_t n = batch->size(); @@ -99,6 +105,38 @@ void BloomFilterColumnPredicate::evaluate(ColumnBlock* block, uint16_t* se *size = new_size; } +template +void BloomFilterColumnPredicate::evaluate(vectorized::IColumn& column, uint16_t* sel, +uint16_t* size) const { +uint16_t new_size = 0; +using T = typename PrimitiveTypeTraits::CppType; + +if (column.is_nullable()) { +auto* nullable_col = vectorized::check_and_get_column(column); +auto& null_map_data = 
nullable_col->get_null_map_column().get_data(); +auto* pred_col = vectorized::check_and_get_column>( +nullable_col->get_nested_column()); +auto& pred_col_data = pred_col->get_data(); +for (uint16_t i = 0; i < *size; i++) { +uint16_t idx = sel[i]; +sel[new_size] = idx; +const auto* cell_value = reinterpret_cast(&(pred_col_data[idx])); +new_size += (!null_map_data[idx]) && _specific_filter->find_olap_engine(cell_value); +} +} else { +auto* pred_col = + vectorized::check_and_get_column>(column); +auto& pred_col_data = pred_col->get_data(); +for (uint16_t i = 0; i < *size; i++) { +uint16_t idx = sel[i]; +sel[new_size] = idx; +const auto* cell_value = reinterpret_cast(&(pred_col_data[idx])); +new_size += _specific_filter->find_olap_engine(cell_value); +} +} +*size = new_size; +} + class BloomFilterColumnPredicateFactory { public: static ColumnPredicate* create_column_predicate( diff --git a/be/test/olap/bloom_filter_column_predicate_test.cpp b/be/test/olap/bloom_filter_column_predicate_test.cpp index 164c51d..24abea1 100644 --- a/be/test/olap/bloom_filter_column_predicate_test.cpp +++ b/be/test/olap/bloom_filter_column_predicate_test.cpp @@ -28,6 +28,11 @@ #include "runtime/string_value.hpp" #include "runtime/vectorized_row_batch.h" #include "util/logging.h" +#include "vec/columns/column_nullable.h" +#include "vec/columns/predicate_column.h" +#include "vec/core/block.h" + +using namespace doris::vectorized; namespace doris { @@ -172,6 +177,37 @@ TEST_F(TestBloomFilterColumnPredicate, FLOAT_COLUMN) { ASSERT_EQ(select_size, 1); ASSERT_FLOAT_EQ(*(float*)col_block.cell(_row_block->selection_vector()[0]).cell_ptr(), 5.1); +// for vectorized::Block no null +auto pred_col = PredicateColumnType::create(); +pred_col->reserve(size); +for (int i = 0; i < size; ++i) { +*(col_data + i) = i + 0.1f; +pred_col->insert_data(reinterpret_cast(col_data + i), 0); +} +_row_block->clear(); +select_size = _row_block->selected_size(); +pred->evaluate(*pred_col, _row_block->selection_vector(), 
&select_size); +ASSERT_EQ(select_size, 3); + ASSERT_FLOAT_EQ((float)pred_col->get_data()[_row_block->selection_vector()[0]], 4.1); + ASSERT_FLOAT_EQ((float)pred_col->get_data()[_row_block->selection_vector()[1]], 5.1); + ASSERT_FLOAT_EQ((float)pred_col->get_data()[_row_block->selection_vector()[2]], 6.1); + +// for vectorized::Block has nulls +auto null_map = ColumnUInt8::create(size, 0); +auto& null_map_data = null_map->get_data(); +for (int i = 0; i < size; ++i) { +nu
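The vectorized `evaluate` overload in this commit compacts the selection vector in place without branching on the filter result: it always writes the candidate index to `sel[new_size]`, then advances `new_size` by the boolean outcome. The following is a minimal sketch of that pattern, not the Doris implementation. `ToyFilter` and `filter_selection` are hypothetical names, and an exact set stands in for the bloom filter (so, unlike a real bloom filter, it never reports false positives).

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the bloom filter: an exact membership set.
struct ToyFilter {
    std::unordered_set<int32_t> values;
    bool find(int32_t v) const { return values.count(v) != 0; }
};

// Compacts `sel` in place so it keeps only row indices whose value is
// non-null and passes the filter. Returns the new selected size.
// An empty `null_map` means the column is not nullable.
uint16_t filter_selection(const std::vector<int32_t>& data,
                          const std::vector<uint8_t>& null_map,
                          const ToyFilter& filter,
                          uint16_t* sel, uint16_t size) {
    uint16_t new_size = 0;
    for (uint16_t i = 0; i < size; ++i) {
        uint16_t idx = sel[i];
        assert(idx < data.size());
        // Write unconditionally; the slot is overwritten by the next
        // candidate unless new_size advances past it.
        sel[new_size] = idx;
        bool not_null = null_map.empty() || null_map[idx] == 0;
        new_size += not_null && filter.find(data[idx]);
    }
    return new_size;
}
```

Because `new_size` never exceeds `i`, the write to `sel[new_size]` cannot clobber an index that has not been read yet.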
[incubator-doris] 26/33: [Vectorized][Function] Support function stddev/variance/stddev_samp/variance_samp (#7734)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 990723c15b346f0838314e53c8138e574176a5b9 Author: zhangstar333 <87313068+zhangstar...@users.noreply.github.com> AuthorDate: Thu Jan 13 20:27:34 2022 +0800 [Vectorized][Function] Support function stddev/variance/stddev_samp/variance_samp (#7734) --- be/src/exprs/aggregate_functions.cpp | 2 +- be/src/vec/CMakeLists.txt | 1 + .../vec/aggregate_functions/aggregate_function.h | 5 + .../aggregate_functions/aggregate_function_null.h | 9 +- .../aggregate_function_simple_factory.cpp | 6 +- .../aggregate_function_simple_factory.h| 20 +- .../aggregate_function_stddev.cpp | 101 .../aggregate_function_stddev.h| 285 + .../java/org/apache/doris/catalog/FunctionSet.java | 68 + 9 files changed, 486 insertions(+), 11 deletions(-) diff --git a/be/src/exprs/aggregate_functions.cpp b/be/src/exprs/aggregate_functions.cpp index a22c3ee..93166cf 100644 --- a/be/src/exprs/aggregate_functions.cpp +++ b/be/src/exprs/aggregate_functions.cpp @@ -1874,8 +1874,8 @@ static double compute_knuth_variance(const KnuthVarianceState& state, bool pop) static DecimalV2Value decimalv2_compute_knuth_variance(const DecimalV2KnuthVarianceState& state, bool pop) { DecimalV2Value new_count = DecimalV2Value(); -new_count.assign_from_double(state.count); if (state.count == 1) return new_count; +new_count.assign_from_double(state.count); DecimalV2Value new_m2 = DecimalV2Value::from_decimal_val(state.m2); if (pop) return new_m2 / new_count; diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt index aa302ce..9c4d947 100644 --- a/be/src/vec/CMakeLists.txt +++ b/be/src/vec/CMakeLists.txt @@ -31,6 +31,7 @@ set(VEC_FILES aggregate_functions/aggregate_function_bitmap.cpp aggregate_functions/aggregate_function_reader.cpp aggregate_functions/aggregate_function_window.cpp + 
aggregate_functions/aggregate_function_stddev.cpp aggregate_functions/aggregate_function_simple_factory.cpp columns/collator.cpp columns/column.cpp diff --git a/be/src/vec/aggregate_functions/aggregate_function.h b/be/src/vec/aggregate_functions/aggregate_function.h index 412382f..4c2ef36 100644 --- a/be/src/vec/aggregate_functions/aggregate_function.h +++ b/be/src/vec/aggregate_functions/aggregate_function.h @@ -114,6 +114,11 @@ public: */ virtual bool is_state() const { return false; } +/// if return false, during insert_result_into function, you colud get nullable result column, +/// so could insert to null value by yourself, rather than by AggregateFunctionNullBase; +/// because you maybe be calculate a invalid value, but want to use null replace it; +virtual bool insert_to_null_default() const { return true; } + /** The inner loop that uses the function pointer is better than using the virtual function. * The reason is that in the case of virtual functions GCC 5.1.2 generates code, * which, at each iteration of the loop, reloads the function address (the offset value in the virtual function table) from memory to the register. 
diff --git a/be/src/vec/aggregate_functions/aggregate_function_null.h b/be/src/vec/aggregate_functions/aggregate_function_null.h index 4c61e2b..9458d7d 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_null.h +++ b/be/src/vec/aggregate_functions/aggregate_function_null.h @@ -144,9 +144,12 @@ public: if constexpr (result_is_nullable) { ColumnNullable& to_concrete = assert_cast(to); if (get_flag(place)) { -nested_function->insert_result_into(nested_place(place), - to_concrete.get_nested_column()); -to_concrete.get_null_map_data().push_back(0); +if (nested_function->insert_to_null_default()) { +nested_function->insert_result_into(nested_place(place), to_concrete.get_nested_column()); +to_concrete.get_null_map_data().push_back(0); +} else { +nested_function->insert_result_into(nested_place(place), to); //want to insert into null value by self +} } else { to_concrete.insert_default(); } diff --git a/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp b/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp index 8a1995b..ba1b2ba 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp +++ b/be/src/vec/aggregate_functions/aggregate_function_simple_factory.cpp @@ -35,6 +35,7 @@ void register_aggregate_f
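The `compute_knuth_variance` helpers touched by this commit finalize a running `(count, m2)` state. A sketch of the usual Knuth/Welford update and finalization is given below for orientation. It uses `double` rather than `DecimalV2Value`, the names `VarianceState`, `update`, `finalize_variance` and `finalize_stddev` are hypothetical, and treating an empty state as "no result" is an assumption of the sketch. The early return for `count == 1` mirrors the reordered check in the diff, assuming a default-constructed `DecimalV2Value` is zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Running state for Knuth/Welford variance (hypothetical layout).
struct VarianceState {
    double mean = 0.0;
    double m2 = 0.0;   // sum of squared deviations from the running mean
    int64_t count = 0;
};

// Folds one value into the state in O(1).
void update(VarianceState& s, double x) {
    ++s.count;
    double delta = x - s.mean;
    s.mean += delta / static_cast<double>(s.count);
    s.m2 += delta * (x - s.mean);
}

// pop == true: population variance (divide by n).
// pop == false: sample variance (divide by n - 1).
// Returns false when there is no input (assumption: caller maps this to NULL).
bool finalize_variance(const VarianceState& s, bool pop, double* out) {
    assert(out != nullptr);
    if (s.count == 0) return false;
    if (s.count == 1) {  // a single value has zero spread
        *out = 0.0;
        return true;
    }
    double divisor = static_cast<double>(pop ? s.count : s.count - 1);
    *out = s.m2 / divisor;
    return true;
}

bool finalize_stddev(const VarianceState& s, bool pop, double* out) {
    double var = 0.0;
    if (!finalize_variance(s, pop, &var)) return false;
    *out = std::sqrt(var);
    return true;
}
```

Checking `count == 1` before touching the divisor is the point of the one-line reorder in `decimalv2_compute_knuth_variance`: the returned value is then a zero rather than the count itself.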
[incubator-doris] 19/33: [Vectorized][Bug] fix 'negative' function ut run fail && fix testIsBucketShuffleJoin run fail && fix some compile fail (#7688)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 9f16ac2363f2777d55e7538eda2daee46df643b0 Author: Pxl <952130...@qq.com> AuthorDate: Tue Jan 11 10:48:02 2022 +0800 [Vectorized][Bug] fix 'negative' function ut run fail && fix testIsBucketShuffleJoin run fail && fix some compile fail (#7688) --- be/src/util/brpc_stub_cache.h | 2 +- be/test/vec/function/function_hash_test.cpp| 36 +- be/test/vec/function/function_math_test.cpp| 33 .../org/apache/doris/planner/HashJoinNode.java | 2 +- .../java/org/apache/doris/qe/CoordinatorTest.java | 9 -- 5 files changed, 44 insertions(+), 38 deletions(-) diff --git a/be/src/util/brpc_stub_cache.h b/be/src/util/brpc_stub_cache.h index e944aac..21800f3 100644 --- a/be/src/util/brpc_stub_cache.h +++ b/be/src/util/brpc_stub_cache.h @@ -47,7 +47,7 @@ namespace doris { class BrpcStubCache { public: BrpcStubCache(); -~BrpcStubCache(); +virtual ~BrpcStubCache(); inline std::shared_ptr get_stub(const butil::EndPoint& endpoint) { auto stub_ptr = _stub_map.find(endpoint); diff --git a/be/test/vec/function/function_hash_test.cpp b/be/test/vec/function/function_hash_test.cpp index 75a2ac5..d5d41b2 100644 --- a/be/test/vec/function/function_hash_test.cpp +++ b/be/test/vec/function/function_hash_test.cpp @@ -33,20 +33,21 @@ TEST(HashFunctionTest, murmur_hash_3_test) { { std::vector input_types = {vectorized::TypeIndex::String}; -DataSet data_set = {{{Null()}, Null()}, -{{std::string("hello")}, (int32_t) 1321743225}}; +DataSet data_set = {{{Null()}, Null()}, {{std::string("hello")}, (int32_t)1321743225}}; -vectorized::check_function(func_name, input_types, data_set); +vectorized::check_function(func_name, input_types, +data_set); }; { std::vector input_types = {vectorized::TypeIndex::String, vectorized::TypeIndex::String}; -DataSet data_set = {{{std::string("hello"), std::string("world")}, (int32_t) 
984713481}, +DataSet data_set = {{{std::string("hello"), std::string("world")}, (int32_t)984713481}, {{std::string("hello"), Null()}, Null()}}; -vectorized::check_function(func_name, input_types, data_set); +vectorized::check_function(func_name, input_types, +data_set); }; { @@ -54,10 +55,12 @@ TEST(HashFunctionTest, murmur_hash_3_test) { vectorized::TypeIndex::String, vectorized::TypeIndex::String}; -DataSet data_set = {{{std::string("hello"), std::string("world"), std::string("!")}, (int32_t) -666935433}, +DataSet data_set = {{{std::string("hello"), std::string("world"), std::string("!")}, + (int32_t)-666935433}, {{std::string("hello"), std::string("world"), Null()}, Null()}}; -vectorized::check_function(func_name, input_types, data_set); +vectorized::check_function(func_name, input_types, +data_set); }; } @@ -68,19 +71,22 @@ TEST(HashFunctionTest, murmur_hash_2_test) { std::vector input_types = {vectorized::TypeIndex::String}; DataSet data_set = {{{Null()}, Null()}, -{{std::string("hello")}, (uint64_t) 2191231550387646743}}; +{{std::string("hello")}, (uint64_t)2191231550387646743ull}}; -vectorized::check_function(func_name, input_types, data_set); +vectorized::check_function(func_name, input_types, + data_set); }; { std::vector input_types = {vectorized::TypeIndex::String, vectorized::TypeIndex::String}; -DataSet data_set = {{{std::string("hello"), std::string("world")}, (uint64_t) 11978658642541747642l}, -{{std::string("hello"), Null()}, Null()}}; +DataSet data_set = { +{{std::string("hello"), std::string("world")}, (uint64_t)11978658642541747642ull}, +{{std::string("hello"), Null()}, Null()}}; -vectorized::check_function(func_name, input_types, data_set); +vectorized::check_function(func_name, input_types, + data_set); }; { @@ -88,10 +94,12 @@ TEST(HashFunctionTest, murmur_hash_2_test) {
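One of the changes in this test diff replaces the literal suffix `l` with `ull` on the expected 64-bit hash. A decimal literal with an `l` suffix has to fit in a signed type, and 11978658642541747642 is larger than the signed 64-bit maximum, so depending on the compiler the old spelling produces a warning or an error. The short sketch below illustrates the range issue; `fits_in_int64` and `kHelloWorldHash` are hypothetical names, not part of the Doris tests.

```cpp
#include <cstdint>
#include <limits>

// Expected value copied from murmur_hash_2_test in the diff above.
constexpr uint64_t kHelloWorldHash = 11978658642541747642ull;

// Hypothetical helper: true when `v` is representable as int64_t.
constexpr bool fits_in_int64(uint64_t v) {
    return v <= static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
}

// The value only fits in an unsigned 64-bit type, hence the `ull` suffix.
static_assert(!fits_in_int64(kHelloWorldHash),
              "expected hash needs an unsigned 64-bit literal");
```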
[incubator-doris] 27/33: [Vectorization] Support SegmentIterator vectorization (#7613)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit e23e332f75f6d1b3891c9adcd7ba85f9a89f765e Author: wangbo <506340...@qq.com> AuthorDate: Fri Jan 14 11:44:27 2022 +0800 [Vectorization] Support SegmentIterator vectorization (#7613) --- be/src/olap/column_predicate.h | 2 + be/src/olap/comparison_predicate.cpp | 2 +- be/src/olap/in_list_predicate.cpp | 39 +++ be/src/olap/in_list_predicate.h| 7 +- be/src/olap/rowset/segment_v2/binary_dict_page.cpp | 21 +- be/src/olap/rowset/segment_v2/binary_plain_page.h | 34 +- be/src/olap/rowset/segment_v2/bitshuffle_page.h| 36 +- be/src/olap/rowset/segment_v2/segment_iterator.cpp | 362 - be/src/olap/rowset/segment_v2/segment_iterator.h | 29 +- be/src/olap/schema.cpp | 66 be/src/olap/schema.h | 4 + be/src/vec/columns/column.h| 13 +- be/src/vec/columns/column_complex.h| 2 + be/src/vec/columns/column_nullable.cpp | 15 +- be/src/vec/columns/column_nullable.h | 6 + be/src/vec/columns/column_vector.h | 26 ++ be/src/vec/columns/predicate_column.h | 27 +- 17 files changed, 661 insertions(+), 30 deletions(-) diff --git a/be/src/olap/column_predicate.h b/be/src/olap/column_predicate.h index 10b8a91..6b1aa23 100644 --- a/be/src/olap/column_predicate.h +++ b/be/src/olap/column_predicate.h @@ -69,6 +69,8 @@ public: virtual void evaluate_vec(vectorized::IColumn& column, uint16_t size, bool* flags) const {}; uint32_t column_id() const { return _column_id; } +virtual bool is_in_predicate() { return false; } + protected: uint32_t _column_id; bool _opposite; diff --git a/be/src/olap/comparison_predicate.cpp b/be/src/olap/comparison_predicate.cpp index 598e7f3..a154a04 100644 --- a/be/src/olap/comparison_predicate.cpp +++ b/be/src/olap/comparison_predicate.cpp @@ -188,7 +188,7 @@ COMPARISON_PRED_COLUMN_EVALUATE(GreaterEqualPredicate, >=) void CLASS::evaluate_vec(vectorized::IColumn& column, uint16_t size, bool* 
flags) const { \ if (column.is_nullable()) { \ auto* nullable_column = vectorized::check_and_get_column(column); \ -auto& data_array = reinterpret_cast&>(nullable_column->get_nested_column()).get_data(); \ +auto& data_array = reinterpret_cast&>(nullable_column->get_nested_column()).get_data(); \ auto& null_bitmap = reinterpret_cast&>(*(nullable_column->get_null_map_column_ptr())).get_data(); \ for (uint16_t i = 0; i < size; i++) { \ flags[i] = (data_array[i] OP _value) && (!null_bitmap[i]); \ diff --git a/be/src/olap/in_list_predicate.cpp b/be/src/olap/in_list_predicate.cpp index c167a17..a17e157 100644 --- a/be/src/olap/in_list_predicate.cpp +++ b/be/src/olap/in_list_predicate.cpp @@ -20,6 +20,8 @@ #include "olap/field.h" #include "runtime/string_value.hpp" #include "runtime/vectorized_row_batch.h" +#include "vec/columns/predicate_column.h" +#include "vec/columns/column_nullable.h" namespace doris { @@ -115,6 +117,43 @@ IN_LIST_PRED_EVALUATE(NotInListPredicate, ==) IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(InListPredicate, !=) IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(NotInListPredicate, ==) +#define IN_LIST_PRED_COLUMN_EVALUATE(CLASS, OP) \ +template \ +void CLASS::evaluate(vectorized::IColumn& column, uint16_t* sel, uint16_t* size) const { \ +uint16_t new_size = 0; \ +if (column.is_nullable()) { \ +auto* nullable_column = \ + vectorized::check_and_get_column(column); \ +auto& null_bitmap = reinterpret_cast&>(*( \ +nullable_column->get_null_map_column_ptr())).get_data(); \ +auto* nest_column_vector = vectorized::check_and_get_column \ + >(nullable_column->get_nested_column()); \ +auto& data_array = nest_
[incubator-doris] 29/33: [Vectorized][feature](planner)(executor) Support grouping sets rollup cube (#7601)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 8760e644f4086809f214f0d105f0ae4492024ca6 Author: anneji-dev <85534151+anneji-...@users.noreply.github.com> AuthorDate: Mon Jan 17 14:09:15 2022 +0800 [Vectorized][feature](planner)(executor) Support grouping sets rollup cube (#7601) --- be/src/exec/exec_node.cpp | 8 +- be/src/exec/repeat_node.h | 2 +- be/src/vec/CMakeLists.txt | 2 + be/src/vec/exec/vrepeat_node.cpp | 245 + be/src/vec/exec/vrepeat_node.h | 56 + be/src/vec/functions/function_grouping.cpp | 25 +++ be/src/vec/functions/function_grouping.h | 90 be/src/vec/functions/simple_function_factory.h | 2 + .../apache/doris/planner/SingleNodePlanner.java| 10 + gensrc/script/doris_builtins_functions.py | 4 +- 10 files changed, 440 insertions(+), 4 deletions(-) diff --git a/be/src/exec/exec_node.cpp b/be/src/exec/exec_node.cpp index 4582f89..97c3259 100644 --- a/be/src/exec/exec_node.cpp +++ b/be/src/exec/exec_node.cpp @@ -82,6 +82,7 @@ #include "vec/exprs/vexpr.h" #include "vec/exec/vempty_set_node.h" #include "vec/exec/vschema_scan_node.h" +#include "vec/exec/vrepeat_node.h" namespace doris { const std::string ExecNode::ROW_THROUGHPUT_COUNTER = "RowsReturnedRate"; @@ -389,6 +390,7 @@ Status ExecNode::create_node(RuntimeState* state, ObjectPool* pool, const TPlanN case TPlanNodeType::SCHEMA_SCAN_NODE: case TPlanNodeType::ANALYTIC_EVAL_NODE: case TPlanNodeType::SELECT_NODE: +case TPlanNodeType::REPEAT_NODE: break; default: { const auto& i = _TPlanNodeType_VALUES_TO_NAMES.find(tnode.node_type); @@ -568,7 +570,11 @@ Status ExecNode::create_node(RuntimeState* state, ObjectPool* pool, const TPlanN return Status::OK(); case TPlanNodeType::REPEAT_NODE: -*node = pool->add(new RepeatNode(pool, tnode, descs)); +if (state->enable_vectorized_exec()) { +*node = pool->add(new vectorized::VRepeatNode(pool, tnode, descs)); +} else { +*node = 
pool->add(new RepeatNode(pool, tnode, descs)); +} return Status::OK(); case TPlanNodeType::ASSERT_NUM_ROWS_NODE: diff --git a/be/src/exec/repeat_node.h b/be/src/exec/repeat_node.h index 01335d2..d9dce75 100644 --- a/be/src/exec/repeat_node.h +++ b/be/src/exec/repeat_node.h @@ -40,7 +40,7 @@ public: protected: virtual void debug_string(int indentation_level, std::stringstream* out) const override; -private: +protected: Status get_repeated_batch(RowBatch* child_row_batch, int repeat_id_idx, RowBatch* row_batch); // Slot id set used to indicate those slots need to set to null. diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt index 9c4d947..f737391 100644 --- a/be/src/vec/CMakeLists.txt +++ b/be/src/vec/CMakeLists.txt @@ -86,6 +86,7 @@ set(VEC_FILES exec/vempty_set_node.cpp exec/vanalytic_eval_node.cpp exec/vassert_num_rows_node.cpp + exec/vrepeat_node.cpp exec/join/vhash_join_node.cpp exprs/vectorized_agg_fn.cpp exprs/vectorized_fn_call.cpp @@ -139,6 +140,7 @@ set(VEC_FILES functions/function_date_or_datetime_computation.cpp functions/function_date_or_datetime_to_string.cpp functions/function_datetime_string_to_string.cpp + functions/function_grouping.cpp olap/vgeneric_iterators.cpp olap/vcollect_iterator.cpp olap/block_reader.cpp diff --git a/be/src/vec/exec/vrepeat_node.cpp b/be/src/vec/exec/vrepeat_node.cpp new file mode 100644 index 000..dd8bb28 --- /dev/null +++ b/be/src/vec/exec/vrepeat_node.cpp @@ -0,0 +1,245 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "vec/exec/vrepeat_node.h" +#include "exprs/expr.h" +#include "gutil/strings/join.h" +#include "runtime/runtime_state.h" +#include "util/runtime_profile.h" + +namespace doris::vectorized { +VRepeatNode::VRepeatNode(ObjectPool* pool, const TPlanNode& tnode, const Des
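For readers unfamiliar with what a repeat node does for GROUPING SETS, ROLLUP and CUBE, the idea can be sketched as follows: every input row is emitted once per grouping set, with the columns outside that set turned into NULL. This is a simplified illustration under stated assumptions, not the `VRepeatNode` code. `Cell`, `Row` and `repeat_rows` are hypothetical names, every column is treated as a grouping column, and the grouping-id columns that the real node presumably appends are omitted.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// A nullable integer cell (hypothetical, stands in for a nullable column value).
struct Cell {
    int value;
    bool is_null;
};
typedef std::vector<Cell> Row;

// Emits each input row once per grouping set; columns whose index is not in
// the current set are marked NULL.
std::vector<Row> repeat_rows(const std::vector<Row>& input,
                             const std::vector<std::set<std::size_t> >& grouping_sets) {
    std::vector<Row> out;
    out.reserve(input.size() * grouping_sets.size());
    for (const std::set<std::size_t>& keep : grouping_sets) {
        for (const Row& row : input) {
            Row copy = row;
            for (std::size_t c = 0; c < copy.size(); ++c) {
                if (keep.count(c) == 0) copy[c].is_null = true;
            }
            out.push_back(copy);
        }
    }
    return out;
}
```

With grouping sets `{0, 1}`, `{0}` and `{}` (the shape ROLLUP over two columns expands to), one input row becomes three output rows.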
[incubator-doris] 24/33: [Vectorized][Bug] Bitmap/HLL type no support cast to varchar/char (#7737)
This is an automated email from the ASF dual-hosted git repository.

lihaopeng pushed a commit to branch vectorized
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit 30de672b2e45a83a95706a1f1f100054b436f33c
Author: HappenLee
AuthorDate: Thu Jan 13 13:34:29 2022 +0800

    [Vectorized][Bug] Bitmap/HLL type no support cast to varchar/char (#7737)

    Co-authored-by: lihaopeng
---
 be/src/vec/data_types/data_type_bitmap.cpp | 9 +
 be/src/vec/data_types/data_type_bitmap.h   | 1 +
 be/src/vec/data_types/data_type_string.cpp | 8
 be/src/vec/data_types/data_type_string.h   | 1 +
 4 files changed, 19 insertions(+)

diff --git a/be/src/vec/data_types/data_type_bitmap.cpp b/be/src/vec/data_types/data_type_bitmap.cpp
index 4daa72e..c6bc9f0 100644
--- a/be/src/vec/data_types/data_type_bitmap.cpp
+++ b/be/src/vec/data_types/data_type_bitmap.cpp
@@ -90,4 +90,13 @@ void DataTypeBitMap::deserialize_as_stream(BitmapValue& value, BufferReadable& b
     read_string_binary(ref, buf);
     value.deserialize(ref.data);
 }
+
+void DataTypeBitMap::to_string(const class doris::vectorized::IColumn& column, size_t row_num,
+        doris::vectorized::BufferWritable& ostr) const {
+    auto& data = const_cast(assert_cast(column).get_element(row_num));
+    std::string result(data.getSizeInBytes(), '0');
+    data.write((char*)result.data());
+
+    ostr.write(result.data(), result.size());
+}
 } // namespace doris::vectorized
diff --git a/be/src/vec/data_types/data_type_bitmap.h b/be/src/vec/data_types/data_type_bitmap.h
index 69f5540..c2166fb 100644
--- a/be/src/vec/data_types/data_type_bitmap.h
+++ b/be/src/vec/data_types/data_type_bitmap.h
@@ -65,6 +65,7 @@ public:
     bool can_be_inside_low_cardinality() const override { return false; }
     std::string to_string(const IColumn& column, size_t row_num) const { return "BitMap()"; }
+    void to_string(const IColumn &column, size_t row_num, BufferWritable &ostr) const override;
     [[noreturn]] virtual Field get_default() const {
         LOG(FATAL) << "Method get_default() is not implemented for data type " << get_name();
diff --git a/be/src/vec/data_types/data_type_string.cpp b/be/src/vec/data_types/data_type_string.cpp
index f481721..86b0aac 100644
--- a/be/src/vec/data_types/data_type_string.cpp
+++ b/be/src/vec/data_types/data_type_string.cpp
@@ -58,6 +58,14 @@ std::string DataTypeString::to_string(const IColumn& column, size_t row_num) con
     return s.to_string();
 }
+void DataTypeString::to_string(const class doris::vectorized::IColumn & column, size_t row_num,
+        class doris::vectorized::BufferWritable & ostr) const {
+    const StringRef& s =
+            assert_cast(*column.convert_to_full_column_if_const().get())
+                    .get_data_at(row_num);
+    ostr.write(s.data, s.size);
+}
+
 Field DataTypeString::get_default() const {
     return String();
 }
diff --git a/be/src/vec/data_types/data_type_string.h b/be/src/vec/data_types/data_type_string.h
index 9d5b21b..8506473 100644
--- a/be/src/vec/data_types/data_type_string.h
+++ b/be/src/vec/data_types/data_type_string.h
@@ -55,6 +55,7 @@ public:
     bool can_be_inside_nullable() const override { return true; }
     bool can_be_inside_low_cardinality() const override { return true; }
     std::string to_string(const IColumn& column, size_t row_num) const;
+    void to_string(const IColumn &column, size_t row_num, BufferWritable &ostr) const override;
 };
 } // namespace doris::vectorized
[incubator-doris] 31/33: [Vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747 (#7751)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit f45d2a2144179bcc64b807a68f4a6befdc796383 Author: zuochunwei AuthorDate: Mon Jan 17 17:10:24 2022 +0800 [Vectorized](improving) (exec) optimize VDataStreamSender's send() performance #7747 (#7751) --- be/src/vec/columns/column.h | 4 ++ be/src/vec/columns/column_complex.h | 10 - be/src/vec/columns/column_const.h | 4 ++ be/src/vec/columns/column_decimal.h | 10 + be/src/vec/columns/column_dummy.h | 4 ++ be/src/vec/columns/column_nullable.cpp | 6 +++ be/src/vec/columns/column_nullable.h| 1 + be/src/vec/columns/column_string.cpp| 6 +++ be/src/vec/columns/column_string.h | 2 + be/src/vec/columns/column_vector.cpp| 10 + be/src/vec/columns/column_vector.h | 2 + be/src/vec/columns/predicate_column.h | 6 ++- be/src/vec/core/block.cpp | 13 +- be/src/vec/core/block.h | 2 + be/src/vec/sink/vdata_stream_sender.cpp | 79 ++--- be/src/vec/sink/vdata_stream_sender.h | 36 ++- 16 files changed, 164 insertions(+), 31 deletions(-) diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h index a869a65..d58979d 100644 --- a/be/src/vec/columns/column.h +++ b/be/src/vec/columns/column.h @@ -160,6 +160,10 @@ public: virtual void insert_many_from(const IColumn& src, size_t position, size_t length) { for (size_t i = 0; i < length; ++i) insert_from(src, position); } + +/// Appends a batch elements from other column with the same type +/// indices_begin + indices_end represent the row indices of column src +virtual void insert_indices_from(const IColumn& src, const int* indices_begin, const int* indices_end) = 0; /// Appends data located in specified memory chunk if it is possible (throws an exception if it cannot be implemented). /// Is used to optimize some computations (in aggregation, for example). 
diff --git a/be/src/vec/columns/column_complex.h b/be/src/vec/columns/column_complex.h index 296f94b..18794d3 100644 --- a/be/src/vec/columns/column_complex.h +++ b/be/src/vec/columns/column_complex.h @@ -127,6 +127,14 @@ public: data.insert(data.end(), st, ed); } +void insert_indices_from(const IColumn& src, const int* indices_begin, const int* indices_end) override { +const Self& src_vec = assert_cast(src); +data.reserve(size() + (indices_end - indices_begin)); +for (auto x = indices_begin; x != indices_end; ++x) { +data.push_back(src_vec.get_element(*x)); +} +} + void pop_back(size_t n) { data.erase(data.end() - n, data.end()); } // it's impossable to use ComplexType as key , so we don't have to implemnt them [[noreturn]] StringRef serialize_value_into_arena(size_t n, Arena& arena, @@ -286,4 +294,4 @@ ColumnPtr ColumnComplexType::replicate(const IColumn::Offsets& offsets) const } using ColumnBitmap = ColumnComplexType; -} // namespace doris::vectorized \ No newline at end of file +} // namespace doris::vectorized diff --git a/be/src/vec/columns/column_const.h b/be/src/vec/columns/column_const.h index 703e226..e019c56 100644 --- a/be/src/vec/columns/column_const.h +++ b/be/src/vec/columns/column_const.h @@ -84,6 +84,10 @@ public: s += length; } +void insert_indices_from(const IColumn& src, const int* indices_begin, const int* indices_end) override { +s += (indices_end - indices_begin); +} + void insert(const Field&) override { ++s; } void insert_data(const char*, size_t) override { ++s; } diff --git a/be/src/vec/columns/column_decimal.h b/be/src/vec/columns/column_decimal.h index 46412cb..67f4fa9 100644 --- a/be/src/vec/columns/column_decimal.h +++ b/be/src/vec/columns/column_decimal.h @@ -26,6 +26,7 @@ #include "vec/columns/column_impl.h" #include "vec/columns/column_vector_helper.h" #include "vec/common/typeid_cast.h" +#include "vec/common/assert_cast.h" #include "vec/core/field.h" namespace doris::vectorized { @@ -95,6 +96,15 @@ public: void insert_from(const 
IColumn& src, size_t n) override { data.push_back(static_cast(src).get_data()[n]); } + +void insert_indices_from(const IColumn& src, const int* indices_begin, const int* indices_end) override { +const Self& src_vec = assert_cast(src); +data.reserve(size() + (indices_end - indices_begin)); +for (auto x = indices_begin; x != indices_end; ++x) { +data.push_back_without_reserve(src_vec.get_element(*x)); +} +} + void insert_data(const char* pos, size_t /*length*/) override; void insert_default() override { data.push_back(T()); } void insert(const Field& x) override { diff --git a/be/src/vec/columns/column_dummy.h b/be/src/vec/col
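The new `insert_indices_from` virtual in this commit appends the rows of a source column selected by an index range, reserving capacity once up front instead of growing per row. A plausible use, not confirmed by the excerpt, is gathering the rows destined for one channel in a single call. The sketch below shows the gather pattern on a toy column; `ToyColumn` is a hypothetical type and not a Doris class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column holding plain ints.
struct ToyColumn {
    std::vector<int> data;

    // Appends src.data[i] for every i in [indices_begin, indices_end).
    // Indices may repeat and need not be sorted.
    void insert_indices_from(const ToyColumn& src, const int* indices_begin,
                             const int* indices_end) {
        // Reserve once so the loop never reallocates.
        data.reserve(data.size() +
                     static_cast<std::size_t>(indices_end - indices_begin));
        for (const int* x = indices_begin; x != indices_end; ++x) {
            assert(*x >= 0 && static_cast<std::size_t>(*x) < src.data.size());
            data.push_back(src.data[static_cast<std::size_t>(*x)]);
        }
    }
};
```

The `ColumnConst` override in the diff shows the other extreme: for a constant column only the row count changes, so the indices themselves are never read.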
[incubator-doris] 20/33: [Vectorized][Enhancement] fix some bug & improve some code (#7714)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit b20b5b7e4310a91c6cb42a8cc1ba4c6850bd3af2 Author: Pxl <952130...@qq.com> AuthorDate: Wed Jan 12 09:57:10 2022 +0800 [Vectorized][Enhancement] fix some bug & improve some code (#7714) --- .../aggregate_function_reader.cpp | 8 +++- be/src/vec/exec/volap_scan_node.cpp| 2 ++ be/src/vec/olap/block_reader.cpp | 22 ++ be/src/vec/olap/block_reader.h | 4 +++- run-be-ut.sh | 2 +- 5 files changed, 23 insertions(+), 15 deletions(-) diff --git a/be/src/vec/aggregate_functions/aggregate_function_reader.cpp b/be/src/vec/aggregate_functions/aggregate_function_reader.cpp index 9a24ac5..3594d51 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_reader.cpp +++ b/be/src/vec/aggregate_functions/aggregate_function_reader.cpp @@ -23,9 +23,8 @@ namespace doris::vectorized { void register_aggregate_function_reader(AggregateFunctionSimpleFactory& factory) { // add a suffix to the function name here to distinguish special functions of agg reader auto register_function_reader = [&](const std::string& name, -const AggregateFunctionCreator& creator, -bool nullable = false) { -factory.register_function(name + agg_reader_suffix, creator, nullable); +const AggregateFunctionCreator& creator) { +factory.register_function(name + agg_reader_suffix, creator, false); }; register_function_reader("sum", create_aggregate_function_sum_reader); @@ -38,8 +37,7 @@ void register_aggregate_function_reader(AggregateFunctionSimpleFactory& factory) void register_aggregate_function_reader_no_spread(AggregateFunctionSimpleFactory& factory) { auto register_function_reader = [&](const std::string& name, -const AggregateFunctionCreator& creator, -bool nullable = false) { +const AggregateFunctionCreator& creator, bool nullable) { factory.register_function(name + agg_reader_suffix, creator, nullable); }; diff --git 
a/be/src/vec/exec/volap_scan_node.cpp b/be/src/vec/exec/volap_scan_node.cpp index da7a204..b365c1d 100644 --- a/be/src/vec/exec/volap_scan_node.cpp +++ b/be/src/vec/exec/volap_scan_node.cpp @@ -259,6 +259,8 @@ void VOlapScanNode::scanner_thread(VOlapScanner* scanner) { } _scan_cpu_timer->update(cpu_watch.elapsed_time()); _scanner_wait_worker_timer->update(wait_time); + +std::unique_lock l(_scan_blocks_lock); _running_thread--; // The transfer thead will wait for `_running_thread==0`, to make sure all scanner threads won't access class members. diff --git a/be/src/vec/olap/block_reader.cpp b/be/src/vec/olap/block_reader.cpp index ef3ba3a..f7a1388 100644 --- a/be/src/vec/olap/block_reader.cpp +++ b/be/src/vec/olap/block_reader.cpp @@ -72,6 +72,10 @@ OLAPStatus BlockReader::_init_collect_iter(const ReaderParams& read_params, } void BlockReader::_init_agg_state() { +if (_eof) { +return; +} + _stored_data_block = _next_row.block->create_same_struct_block(_batch_size); _stored_data_columns = _stored_data_block->mutate_columns(); @@ -260,7 +264,8 @@ OLAPStatus BlockReader::_unique_key_next_block(Block* block, MemPool* mem_pool, void BlockReader::_insert_data_normal(MutableColumns& columns) { auto block = _next_row.block; for (auto idx : _normal_columns_idx) { - columns[_return_columns_loc[idx]]->insert_from(*block->get_by_position(idx).column, _next_row.row_pos); + columns[_return_columns_loc[idx]]->insert_from(*block->get_by_position(idx).column, + _next_row.row_pos); } } @@ -270,7 +275,7 @@ void BlockReader::_append_agg_data(MutableColumns& columns) { // execute aggregate when have `batch_size` column or some ref invalid soon bool is_last = (_next_row.block->rows() == _next_row.row_pos + 1); -if (_stored_row_ref.size() == _batch_size || is_last) { +if (is_last || _stored_row_ref.size() == _batch_size) { _update_agg_data(columns); } } @@ -301,11 +306,9 @@ void BlockReader::_update_agg_data(MutableColumns& columns) { } void BlockReader::_copy_agg_data() { 
-phmap::flat_hash_map>> temp_ref_map; - for (int i = 0; i < _stored_row_ref.size(); i++) { auto& ref = _stored_row_ref[i]; -temp_ref_map[ref.block].emplace_back(ref.row_pos, i); +_temp_ref_map[ref.block].emplace_back(ref.row_pos, i); } for (auto idx : _agg_columns_idx) { @@ -314,11 +317,11 @@ void BlockReader::_co
[incubator-doris] 32/33: [Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit b634ea62bbdef860a5bafffb078e6f128bfaaaff Author: HappenLee AuthorDate: Mon Jan 17 20:11:26 2022 +0800 [Vectorized][Bug] Fix bug of repeated node resize and compile failed (#7778) Co-authored-by: lihaopeng --- be/src/vec/exec/vrepeat_node.cpp | 13 ++--- be/src/vec/functions/simple_function_factory.h | 2 +- build.sh | 2 +- 3 files changed, 8 insertions(+), 9 deletions(-) diff --git a/be/src/vec/exec/vrepeat_node.cpp b/be/src/vec/exec/vrepeat_node.cpp index dd8bb28..287aa6e 100644 --- a/be/src/vec/exec/vrepeat_node.cpp +++ b/be/src/vec/exec/vrepeat_node.cpp @@ -104,6 +104,7 @@ Status VRepeatNode::get_repeated_block(Block* child_block, int repeat_id_idx, Bl std::set& repeat_ids = _slot_id_set_list[repeat_id_idx]; bool is_repeat_slot = _all_slot_ids.find(_output_slots[cur_col]->id()) != _all_slot_ids.end(); bool is_set_null_slot = repeat_ids.find(_output_slots[cur_col]->id()) == repeat_ids.end(); +const auto column_size = src_column.column->size(); if (is_repeat_slot) { DCHECK(_output_slots[cur_col]->is_nullable()); @@ -113,21 +114,19 @@ Status VRepeatNode::get_repeated_block(Block* child_block, int repeat_id_idx, Bl // set slot null not in repeat_ids if (is_set_null_slot) { -nullable_column->resize(src_column.column->size()); -for (size_t j = 0; j < src_column.column->size(); ++j) { -nullable_column->insert_data(nullptr, 0); -} +nullable_column->resize(column_size); +memset(nullable_column->get_null_map_data().data(), 1, sizeof(UInt8) * column_size); } else { if (!src_column.type->is_nullable()) { -for (size_t j = 0; j < src_column.column->size(); ++j) { +for (size_t j = 0; j < column_size; ++j) { null_map.push_back(0); } column_ptr = &nullable_column->get_nested_column(); } -column_ptr->insert_range_from(*src_column.column, 0, src_column.column->size()); 
+column_ptr->insert_range_from(*src_column.column, 0, column_size); } } else { -columns[cur_col]->insert_range_from(*src_column.column, 0, src_column.column->size()); +columns[cur_col]->insert_range_from(*src_column.column, 0, column_size); } cur_col++; } diff --git a/be/src/vec/functions/simple_function_factory.h b/be/src/vec/functions/simple_function_factory.h index 45420cb..5b9f0fd 100644 --- a/be/src/vec/functions/simple_function_factory.h +++ b/be/src/vec/functions/simple_function_factory.h @@ -66,7 +66,7 @@ void register_function_like(SimpleFunctionFactory& factory); void register_function_regexp(SimpleFunctionFactory& factory); void register_function_random(SimpleFunctionFactory& factory); void register_function_coalesce(SimpleFunctionFactory& factory); -+void register_function_grouping(SimpleFunctionFactory& factory); +void register_function_grouping(SimpleFunctionFactory& factory); class SimpleFunctionFactory { using Creator = std::function; diff --git a/build.sh b/build.sh index d842b07..a5ccaae 100755 --- a/build.sh +++ b/build.sh @@ -104,7 +104,7 @@ fi eval set -- "$OPTS" -PARALLEL=$[$(nproc)+1] +PARALLEL=$[$(nproc)/4+1] BUILD_BE= BUILD_FE= BUILD_UI= - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
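The `vrepeat_node.cpp` hunk above replaces a per-row `insert_data(nullptr, 0)` loop with one `resize` followed by a `memset` over the null map. The following is a minimal standalone sketch of that idea, not Doris code: it assumes the null map is a contiguous array of one-byte flags where 1 means NULL, uses a plain `std::vector` as a stand-in for the column types, and `mark_all_null` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for a nullable column's null map: one byte per row, 1 == NULL.
// Assumption: this mirrors the layout the diff relies on; it is not Doris code.
using NullMap = std::vector<std::uint8_t>;

// Mark `row_count` rows as NULL with one bulk write instead of a per-row loop.
inline void mark_all_null(NullMap& null_map, std::size_t row_count) {
    null_map.resize(row_count);
    if (row_count > 0) {
        // memset writes the byte value 1 into every flag of the map.
        std::memset(null_map.data(), 1, sizeof(std::uint8_t) * row_count);
    }
    assert(null_map.size() == row_count);
}
```

A single bulk write avoids one function call per row, which is presumably why the commit prefers it for columns whose every row is NULL.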
[incubator-doris] 21/33: [Vectorized][Enhancement] use simd to speed up coalesce and if_not_null function (#7722)
This is an automated email from the ASF dual-hosted git repository. lihaopeng pushed a commit to branch vectorized in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit f3ce1cba69aa27f8ae0af63451bab41d5048c367 Author: HappenLee AuthorDate: Wed Jan 12 13:20:48 2022 +0800 [Vectorized][Enhancement] use simd to speed up coalesce and if_not_null function (#7722) Co-authored-by: lihaopeng --- be/src/vec/functions/function_coalesce.cpp | 210 - be/src/vec/functions/is_not_null.cpp | 4 +- be/src/vec/functions/simple_function_factory.h | 5 +- be/test/vec/function/function_string_test.cpp | 42 + 4 files changed, 214 insertions(+), 47 deletions(-) diff --git a/be/src/vec/functions/function_coalesce.cpp b/be/src/vec/functions/function_coalesce.cpp index 65d544c..99b6110 100644 --- a/be/src/vec/functions/function_coalesce.cpp +++ b/be/src/vec/functions/function_coalesce.cpp @@ -28,6 +28,8 @@ class FunctionCoalesce : public IFunction { public: static constexpr auto name = "coalesce"; +mutable FunctionBasePtr func_is_not_null; + static FunctionPtr create() { return std::make_shared(); } String get_name() const override { return name; } @@ -41,47 +43,70 @@ public: size_t get_number_of_arguments() const override { return 0; } DataTypePtr get_return_type_impl(const DataTypes& arguments) const override { +DataTypePtr res; for (const auto& arg : arguments) { if (!arg->is_nullable()) { -return arg; +res = arg; +break; } } -return arguments[0]; + +res = res ? res : arguments[0]; + +const ColumnsWithTypeAndName is_not_null_col{ +{nullptr, make_nullable(res), ""} +}; +func_is_not_null = SimpleFunctionFactory::instance(). 
+get_function("is_not_null_pred", is_not_null_col, std::make_shared()); + +return res; } Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments, size_t result, size_t input_rows_count) override { DCHECK_GE(arguments.size(), 1); +DataTypePtr result_type = block.get_by_position(result).type; ColumnNumbers filtered_args; filtered_args.reserve(arguments.size()); -for (const auto& arg : arguments) { -const auto& type = block.get_by_position(arg).type; -if (type->only_null()) { -continue; -} -filtered_args.push_back(arg); -if (!type->is_nullable()) { -break; + +for (size_t i = 0; i < arguments.size(); ++i) { +const auto& arg_type = block.get_by_position(arguments[i]).type; +filtered_args.push_back(arguments[i]); +if (!arg_type->is_nullable()) { +if (i == 0) { //if the first column not null, return it's directly +block.get_by_position(result).column = block.get_by_position(arguments[0]).column; +return Status::OK(); +} else { +break; +} } } size_t remaining_rows = input_rows_count; size_t argument_size = filtered_args.size(); -std::vector record_idx(input_rows_count, -1); //used to save column idx +std::vector record_idx(input_rows_count, 0); //used to save column idx, record the result data of each row from which column +std::vector filled_flags(input_rows_count, 0); //used to save filled flag, in order to check current row whether have filled data + MutableColumnPtr result_column; +if (!result_type->is_nullable()) { +result_column = result_type->create_column(); +} else { +result_column = remove_nullable(result_type)->create_column(); +} -DataTypePtr type = block.get_by_position(result).type; -if (!type->is_nullable()) { -result_column = type->create_column(); +// because now the string types does not support random position writing, +// so insert into result data have two methods, one is for string types, one is for others type remaining +bool is_string_result = result_column->is_column_string(); +if (is_string_result) { 
+result_column->reserve(input_rows_count); } else { -result_column = remove_nullable(type)->create_column(); +result_column->resize(input_rows_count); } -result_column->reserve(input_rows_count); auto return_type = std::make_shared(); -auto null_map = ColumnUInt8::create(input_rows_count, 1); -auto& null_map_data = null_map->get_data(); -ColumnPtr argument_columns[argument_size]; +auto null_map = ColumnUInt8::create(input_rows_count, 1); //if n
[GitHub] [incubator-doris] zhengshengjun opened a new issue #7783: [Bug] Consider backend status when more than one backends exists in same host
zhengshengjun opened a new issue #7783: URL: https://github.com/apache/incubator-doris/issues/7783 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Version master ### What's Wrong? I have more than one BE on the same host. If one of them is dead, the stream load process sometimes fails, because FE selects one of the BEs on that host at random and may pick one that is not alive. This causes a 'No backend alive.' error during the stream load process. ### What You Expected? FE should choose an alive BE when more than one BE exists on the same host, so that the stream load process does not fail when there are both alive and dead BEs on that host. ### How to Reproduce? _No response_ ### Anything Else? _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
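The behavior requested in issue #7783 can be sketched as follows. This is a hedged illustration only: `BackendInfo` and `pick_alive_backend_on_host` are hypothetical names, the real FE is written in Java, and the actual fix may look quite different. The sketch filters by host and liveness first, then picks at random among the survivors.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <random>
#include <string>
#include <vector>

// Hypothetical, simplified view of a backend; not the FE's real Backend class.
struct BackendInfo {
    long id;
    std::string host;
    bool alive;
};

// Pick a random backend on `host`, considering only those that are alive.
// Returns std::nullopt when no alive backend exists on that host.
template <typename Rng>
std::optional<BackendInfo> pick_alive_backend_on_host(
        const std::vector<BackendInfo>& backends, const std::string& host, Rng& rng) {
    std::vector<const BackendInfo*> candidates;
    for (const auto& be : backends) {
        if (be.host == host && be.alive) {
            candidates.push_back(&be);
        }
    }
    if (candidates.empty()) {
        return std::nullopt;
    }
    std::uniform_int_distribution<std::size_t> dist(0, candidates.size() - 1);
    const BackendInfo* picked = candidates[dist(rng)];
    assert(picked->alive);  // invariant: dead backends were filtered out above
    return *picked;
}
```

Filtering before the random draw means a dead BE can never be returned, so the caller only sees a failure when every BE on the host is down.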
[GitHub] [incubator-doris] zhengshengjun opened a new pull request #7784: [Bug] Consider backend status when more than one backends exists in same host #7783
zhengshengjun opened a new pull request #7784: URL: https://github.com/apache/incubator-doris/pull/7784 …ame host # Proposed changes Issue Number: close #xxx ## Problem Summary: Describe the overview of changes. ## Checklist(Required) 1. Does it affect the original behavior: (Yes/No/I Don't know) 2. Has unit tests been added: (Yes/No/No Need) 3. Has document been added or modified: (Yes/No/No Need) 4. Does it need to update dependencies: (Yes/No) 5. Are there any changes that cannot be rolled back: (Yes/No) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] HappenLee opened a new pull request #7785: [Vectorized] Support Vectorized Exec Engine In Doris
HappenLee opened a new pull request #7785: URL: https://github.com/apache/incubator-doris/pull/7785 # Proposed changes Issue Number: close #6238 Co-authored-by: HappenLee Co-authored-by: stdpain <34912776+stdp...@users.noreply.github.com> Co-authored-by: Zhengguo Yang Co-authored-by: wangbo <506340...@qq.com> Co-authored-by: emmymiao87 <522274...@qq.com> Co-authored-by: Pxl <952130...@qq.com> Co-authored-by: zhangstar333 <87313068+zhangstar...@users.noreply.github.com> Co-authored-by: thinker Co-authored-by: Zeno Yang <1521564...@qq.com> Co-authored-by: Wang Shuo Co-authored-by: zhoubintao <35688959+zbtzbt...@users.noreply.github.com> Co-authored-by: Gabriel Co-authored-by: xinghuayu007 <1450306...@qq.com> Co-authored-by: weizuo93 Co-authored-by: yiguolei Co-authored-by: anneji-dev <85534151+anneji-...@users.noreply.github.com> Co-authored-by: awakeljw <993007...@qq.com> Co-authored-by: taberylyang <95272637+taberyly...@users.noreply.github.com> Co-authored-by: Cui Kaifeng <48012748+azuren...@users.noreply.github.com> ## Problem Summary: ### 1. Some code from ClickHouse **ClickHouse is an excellent implementation of a vectorized execution engine, so we have borrowed a lot from it in terms of data structures and function implementations. Our code is based on ClickHouse v19.16.2.2, and we would like to thank the ClickHouse community and developers.** Every file taken from ClickHouse carries the following header: // This file is copied from // https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h // and modified by Doris ### 2. 
Supported exec nodes and queries: * vaggregation_node * vanalytic_eval_node * vassert_num_rows_node * vblocking_join_node * vcross_join_node * vempty_set_node * ves_http_scan_node * vexcept_node * vexchange_node * vintersect_node * vmysql_scan_node * vodbc_scan_node * volap_scan_node * vrepeat_node * vschema_scan_node * vselect_node * vset_operation_node * vsort_node * vunion_node * vhash_join_node The vectorized exec engine can run the SSB and TPC-H standard query sets and 70% of the TPC-DS standard query set. ### 3. Data Model The vectorized exec engine supports **Dup/Agg/Unq** tables and a vectorized Block Reader. Segment vectorization is a work in progress. ### 4. How to use 1. Set the session variable `set enable_vectorized_engine = true;` (required) 2. Set the session variable `set batch_size = 4096;` (recommended) ### 5. Some differences from the original exec engine https://github.com/doris-vectorized/doris-vectorized/issues/294 ## Checklist(Required) 1. Does it affect the original behavior: (No) 2. Has unit tests been added: (Yes) 3. Has document been added or modified: (No) 4. Does it need to update dependencies: (No) 5. Are there any changes that cannot be rolled back: (Yes) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] morningman commented on issue #7580: [Roadmap] Support vectorized query engine
morningman commented on issue #7580: URL: https://github.com/apache/incubator-doris/issues/7580#issuecomment-1014511789 Related #6238
[GitHub] [incubator-doris] hf200012 opened a new pull request #7786: Add Amazon S3 support
hf200012 opened a new pull request #7786: URL: https://github.com/apache/incubator-doris/pull/7786 Add Amazon S3 support # Proposed changes Issue Number: close #xxx ## Problem Summary: Describe the overview of changes. ## Checklist(Required) 1. Does it affect the original behavior: (Yes/No/I Don't know) 2. Has unit tests been added: (Yes/No/No Need) 3. Has document been added or modified: (Yes/No/No Need) 4. Does it need to update dependencies: (Yes/No) 5. Are there any changes that cannot be rolled back: (Yes/No) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] hf200012 closed pull request #7786: [Doc]Add Amazon S3 support
hf200012 closed pull request #7786: URL: https://github.com/apache/incubator-doris/pull/7786
[GitHub] [incubator-doris] hf200012 commented on pull request #7786: [Doc]Add Amazon S3 support
hf200012 commented on pull request #7786: URL: https://github.com/apache/incubator-doris/pull/7786#issuecomment-1014552582 Resubmit
[GitHub] [incubator-doris] hf200012 opened a new pull request #7787: [Doc]Documentation corrections
hf200012 opened a new pull request #7787: URL: https://github.com/apache/incubator-doris/pull/7787 Documentation corrections # Proposed changes Issue Number: close #xxx ## Problem Summary: Describe the overview of changes. ## Checklist(Required) 1. Does it affect the original behavior: (Yes/No/I Don't know) 2. Has unit tests been added: (Yes/No/No Need) 3. Has document been added or modified: (Yes/No/No Need) 4. Does it need to update dependencies: (Yes/No) 5. Are there any changes that cannot be rolled back: (Yes/No) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] morningman opened a new pull request #7788: [bix](bitmap-index) Fix bug that bitmap index may return wrong result.
morningman opened a new pull request #7788: URL: https://github.com/apache/incubator-doris/pull/7788 # Proposed changes Issue Number: close #xxx ## Problem Summary: Fix a bug that occurs in the following scenario. 1. A bitmap index is created on `column1`. 2. `column1` has a lot of index items in the bitmap index, and the index page is divided into two levels. 3. `column1`'s value range is `[1000, 1000]`. 4. The query condition is `column1 > 0`. 5. An empty result is returned, while the expected value should be 000 rows. ## Checklist(Required) 1. Does it affect the original behavior: (No) 2. Has unit tests been added: (No) 3. Has document been added or modified: (No Need) 4. Does it need to update dependencies: (No) 5. Are there any changes that cannot be rolled back: (No) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7785: [feature][vectorized] Support Vectorized Exec Engine In Doris
github-actions[bot] commented on pull request #7785: URL: https://github.com/apache/incubator-doris/pull/7785#issuecomment-1014699158
[GitHub] [incubator-doris] morningman closed issue #6238: [Proposal] Vectorization Execution Engine optimization for Doris
morningman closed issue #6238: URL: https://github.com/apache/incubator-doris/issues/6238
[GitHub] [incubator-doris] morningman merged pull request #7785: [feature](vectorization) Support Vectorized Exec Engine In Doris
morningman merged pull request #7785: URL: https://github.com/apache/incubator-doris/pull/7785
[GitHub] [incubator-doris] zuochunwei closed pull request #7145: no static_cast
zuochunwei closed pull request #7145: URL: https://github.com/apache/incubator-doris/pull/7145
[GitHub] [incubator-doris] zuochunwei closed pull request #7695: [vectorized](optimization)(aggregate) improving aggregate count & sum performance
zuochunwei closed pull request #7695: URL: https://github.com/apache/incubator-doris/pull/7695
[GitHub] [incubator-doris] yangzhg commented on a change in pull request #7788: [bix](bitmap-index) Fix bug that bitmap index may return wrong result.
yangzhg commented on a change in pull request #7788: URL: https://github.com/apache/incubator-doris/pull/7788#discussion_r786370400 ## File path: be/src/olap/rowset/segment_v2/ordinal_page_index.h ## @@ -69,6 +69,9 @@ class OrdinalIndexReader { // load and parse the index page into memory Status load(bool use_page_cache, bool kept_in_memory); +// the returned iter points to the largest element which is less than `ordinal`, +// or points to the first element if all elements are greater than `ordinal`, +// or points to "end" if all elementss are smaller than `ordinal`. Review comment: ```suggestion // or points to "end" if all elements are smaller than `ordinal`. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
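The three cases in the quoted comment overlap (if every element is smaller than `ordinal`, a "largest element less than `ordinal`" also exists), so the sketch below is only one literal reading, in which the "end" case takes precedence. It is not the Doris implementation: `seek_before` is a hypothetical name, a sorted `std::vector` stands in for the index, and the returned position `elements.size()` plays the role of "end".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a position in `elements` (sorted ascending):
//   - elements.size() ("end") if every element is smaller than `ordinal`,
//   - 0 if no element is smaller than `ordinal`,
//   - otherwise the position of the largest element smaller than `ordinal`.
inline std::size_t seek_before(const std::vector<std::uint64_t>& elements,
                               std::uint64_t ordinal) {
    assert(std::is_sorted(elements.begin(), elements.end()));
    // First element that is >= ordinal.
    auto it = std::lower_bound(elements.begin(), elements.end(), ordinal);
    if (it == elements.end()) {
        return elements.size();  // all elements < ordinal (or empty input)
    }
    if (it == elements.begin()) {
        return 0;  // nothing is smaller than ordinal: fall back to the first element
    }
    // The element just before `it` is the largest one that is < ordinal.
    return static_cast<std::size_t>(it - elements.begin()) - 1;
}
```

With elements `{10, 20, 30}`, an ordinal of 25 maps to position 1 (the element 20), an ordinal of 5 maps to position 0, and an ordinal of 31 maps to "end".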
[incubator-doris] branch master updated: [improvement](broker) add some properties that can be set in the broker conf file (#7499)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git The following commit(s) were added to refs/heads/master by this push: new 946fa29 [improvement](broker) add some properties that can be set in the broker conf file (#7499) 946fa29 is described below commit 946fa2960d8ada5839b542b50d0192f37f2a5f65 Author: Henry2SS <45096548+henry...@users.noreply.github.com> AuthorDate: Tue Jan 18 10:24:54 2022 +0800 [improvement](broker) add some properties that can be set in the broker conf file (#7499) --- .../conf/apache_hdfs_broker.conf | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf b/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf index 5780687..92a30ac 100644 --- a/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf +++ b/fs_brokers/apache_hdfs_broker/conf/apache_hdfs_broker.conf @@ -15,8 +15,26 @@ # specific language governing permissions and limitations # under the License. +# +## To see all Broker configurations, +## see fs_brokers/apache_hdfs_broker/src/main/java/org/apache/doris/broker/hdfs/BrokerConfig.java +# + +# INFO, WARNING, ERROR, FATAL +# sys_log_level = INFO + # the thrift rpc port -broker_ipc_port=8000 +broker_ipc_port = 8000 # client session will be deleted if not receive ping after this time -client_expire_seconds=300 +client_expire_seconds = 300 + +# Advanced configurations +# sys_log_dir = ${BROKER_HOME}/log +# sys_log_roll_num = 30 +# sys_log_roll_mode = SIZE-MB-1024 +# sys_log_verbose_modules = org.apache.doris +# audit_log_dir = ${BROKER_HOME}/log +# audit_log_roll_num = 10 +# audit_log_roll_mode = TIME-DAY +# audit_log_modules = - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] morningman closed issue #7498: [Enhancement] [Broker] Add theproperties that can be set to config file of Broker
morningman closed issue #7498: URL: https://github.com/apache/incubator-doris/issues/7498
[GitHub] [incubator-doris] morningman merged pull request #7499: [Broker] Add some properties that can be set in the broker conf file
morningman merged pull request #7499: URL: https://github.com/apache/incubator-doris/pull/7499
[incubator-doris] branch master updated: [improvement](colocation) Add a new config to delay the relocation of colocation group (#7656)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git The following commit(s) were added to refs/heads/master by this push: new 3494c89 [improvement](colocation) Add a new config to delay the relocation of colocation group (#7656) 3494c89 is described below commit 3494c8973b91d725cff1c4c20e87bcb1e6f4f300 Author: Mingyu Chen AuthorDate: Tue Jan 18 10:26:36 2022 +0800 [improvement](colocation) Add a new config to delay the relocation of colocation group (#7656) 1. Add a new FE config `colocate_group_relocate_delay_second` The relocation of a colocation group may involve a large number of tablets moving within the cluster. Therefore, we should use a more conservative strategy to avoid relocation of colocation groups as much as possible. Relocation usually occurs after a BE node goes offline or goes down. This config is used to delay the determination of BE node unavailability. The default is 30 minutes, i.e., if a BE node recovers within 30 minutes, relocation of the colocation group will not be triggered. 2. Change the priority of colocate tablet repair and balance task from HIGH to NORMAL 3. Add a new FE config allow_replica_on_same_host If set to true, when creating table, Doris will allow to locate replicas of a tablet on same host. And also the tablet repair and balance will be disabled. This is only for local test, so that we can deploy multi BE on same host and create table with multi replicas. 
--- docs/en/administrator-guide/config/fe_config.md| 22 ++ .../operation/tablet-repair-and-balance.md | 88 ++ docs/zh-CN/administrator-guide/config/fe_config.md | 25 +- .../operation/tablet-repair-and-balance.md | 86 + .../main/java/org/apache/doris/catalog/Tablet.java | 47 ++-- .../clone/ColocateTableCheckerAndBalancer.java | 28 +++ .../java/org/apache/doris/clone/TabletChecker.java | 9 +-- .../org/apache/doris/clone/TabletScheduler.java| 7 +- .../main/java/org/apache/doris/common/Config.java | 21 ++ .../main/java/org/apache/doris/system/Backend.java | 26 --- .../org/apache/doris/system/SystemInfoService.java | 32 .../java/org/apache/doris/catalog/BackendTest.java | 14 ++-- .../clone/ColocateTableCheckerAndBalancerTest.java | 18 ++--- .../doris/clone/TabletRepairAndBalanceTest.java| 1 + 14 files changed, 338 insertions(+), 86 deletions(-) diff --git a/docs/en/administrator-guide/config/fe_config.md b/docs/en/administrator-guide/config/fe_config.md index bf6a8d2..69d611a 100644 --- a/docs/en/administrator-guide/config/fe_config.md +++ b/docs/en/administrator-guide/config/fe_config.md @@ -2099,3 +2099,25 @@ Default: true IsMutable:true MasterOnly: true If set to true, the replica with slower compaction will be automatically detected and migrated to other machines. The detection condition is that the version difference between the fastest and slowest replica exceeds 100, and the difference exceeds 30% of the fastest replica + +### colocate_group_relocate_delay_second + +Default: 1800 + +Dynamically configured: true + +Only for Master FE: true + +The relocation of a colocation group may involve a large number of tablets moving within the cluster. Therefore, we should use a more conservative strategy to avoid relocation of colocation groups as much as possible. +Reloaction usually occurs after a BE node goes offline or goes down. This parameter is used to delay the determination of BE node unavailability. 
The default is 30 minutes, i.e., if a BE node recovers within 30 minutes, relocation of the colocation group will not be triggered. + +### allow_replica_on_same_host + +Default: false + +Dynamically configured: false + +Only for Master FE: false + +Whether to allow multiple replicas of the same tablet to be distributed on the same host. This parameter is mainly used for local testing, to facilitate building multiple BEs to test certain multi-replica situations. Do not use it for non-test environments. + diff --git a/docs/en/administrator-guide/operation/tablet-repair-and-balance.md b/docs/en/administrator-guide/operation/tablet-repair-and-balance.md index 1593cec..e924e62 100644 --- a/docs/en/administrator-guide/operation/tablet-repair-and-balance.md +++ b/docs/en/administrator-guide/operation/tablet-repair-and-balance.md @@ -684,3 +684,91 @@ The following parameters do not support modification for the time being, just fo * In some cases, the default replica repair and balancing strategy may cause the network to be full (mostly in the case of gigabit network cards and a large number of disks per BE). At this point, some parameters need to be adjusted to reduce the number of simultaneous balancing and re
[GitHub] [incubator-doris] morningman merged pull request #7656: [improvement](colocation) Add a new config to delay the relocation of colocation group
morningman merged pull request #7656: URL: https://github.com/apache/incubator-doris/pull/7656
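The commit message for #7656 describes `colocate_group_relocate_delay_second` as delaying the determination that a BE node is unavailable (default 1800 seconds). A minimal sketch of such a check follows, under stated assumptions: `BackendState`, its fields, and `unavailable_for_colocation` are hypothetical names, the real logic lives in the Java FE, and the exact comparison used there is not shown in the excerpt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of a backend's liveness; not the FE's Backend class.
struct BackendState {
    bool alive;                  // current heartbeat status
    std::int64_t last_alive_ms;  // assumed: timestamp when the BE was last seen alive
};

// A BE counts as unavailable for colocation purposes only after it has been
// down for longer than the configured delay, so a node that recovers within
// the delay does not trigger relocation of the colocation group.
inline bool unavailable_for_colocation(const BackendState& be,
                                       std::int64_t now_ms,
                                       std::int64_t delay_seconds) {
    assert(delay_seconds >= 0);
    if (be.alive) {
        return false;
    }
    return now_ms - be.last_alive_ms > delay_seconds * 1000;
}
```

With the default of 1800 seconds, a BE that has been down for 10 minutes is still treated as available by this check, while one that has been down for 31 minutes is not.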
[GitHub] [incubator-doris] morningman commented on pull request #7098: Support remote storage, step1: use a struct instead of string for parameter path, add basic remote method
morningman commented on pull request #7098: URL: https://github.com/apache/incubator-doris/pull/7098#issuecomment-1015026898 link to #7575
[GitHub] [incubator-doris] morningman commented on pull request #7529: Support remote storage, step2, only for be: hot data trans to cold data. clean cold data when drop table
morningman commented on pull request #7529: URL: https://github.com/apache/incubator-doris/pull/7529#issuecomment-1015026994 link to #7575
[GitHub] [incubator-doris] yiguolei commented on pull request #7529: Support remote storage, step2, only for be: hot data trans to cold data. clean cold data when drop table
yiguolei commented on pull request #7529: URL: https://github.com/apache/incubator-doris/pull/7529#issuecomment-1015028794 I have two questions: 1. Should the partition be set to a frozen state to avoid inserting data into cold partitions? 2. How should schema changes be handled for data in S3?
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7787: [Doc]Documentation corrections
github-actions[bot] commented on pull request #7787: URL: https://github.com/apache/incubator-doris/pull/7787#issuecomment-1015033545
[GitHub] [incubator-doris] EmmyMiao87 commented on pull request #7787: [Doc]Documentation corrections
EmmyMiao87 commented on pull request #7787: URL: https://github.com/apache/incubator-doris/pull/7787#issuecomment-1015034341 BTW, our PR titles now follow a fixed format and can be labeled automatically. For example, []() (#pr). The label will be automatically
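The placeholders in the example title format above appear to have been lost in the archive. As a rough illustration only, the sketch below assumes the shape `[type](scope) summary (#pr)`, inferred from titles elsewhere in this digest such as `[fix](lateral-view) Fix some lateral view bugs (#7772)`. The names `TITLE_RE` and `parse_title` are hypothetical and are not part of the Doris labeling tooling, whose actual rules are not shown here.

```python
import re

# Assumed title shape, inferred from titles in this digest, e.g.
# "[fix](lateral-view) Fix some lateral view bugs (#7772)".
# This is NOT the project's real labeling rule, only an illustration.
TITLE_RE = re.compile(
    r"^\[(?P<type>[^\]]+)\]"      # [type], e.g. [fix] or [improvement]
    r"(?:\((?P<scope>[^)]+)\))?"  # optional (scope), e.g. (lateral-view)
    r"\s*(?P<summary>.*?)"        # free-text summary
    r"(?:\s*\(#(?P<pr>\d+)\))?$"  # optional trailing (#pr)
)


def parse_title(title):
    """Return the title's parts as a dict, or None if it does not match."""
    match = TITLE_RE.match(title.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_title("[fix](lateral-view) Fix some lateral view bugs (#7772)"))
    print(parse_title("[Doc]Documentation corrections"))
```

A labeler built on this idea could map the `type` and `scope` fields to labels. Titles that do not start with a bracketed type, such as "Support remote storage, step1: ...", return `None` and would be left unlabeled.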
[GitHub] [incubator-doris] hf200012 commented on issue #7502: Doris Roadmap 2022
hf200012 commented on issue #7502: URL: https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015047945 #7680 Data export function supports exporting to db, kafka, etc.
[GitHub] [incubator-doris] hf200012 closed issue #7676: [Feature] Doris supports multi-table Join materialized views
hf200012 closed issue #7676: URL: https://github.com/apache/incubator-doris/issues/7676
[GitHub] [incubator-doris] hf200012 commented on issue #7502: Doris Roadmap 2022
hf200012 commented on issue #7502: URL: https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015048834 #7678 max_by, min_by aggregate function support
[GitHub] [incubator-doris] morningman merged pull request #7772: [fix](lateral-view) Fix some lateral view bugs
morningman merged pull request #7772: URL: https://github.com/apache/incubator-doris/pull/7772
[incubator-doris] branch master updated (3494c89 -> efb4e18)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.

    from 3494c89 [improvement](colocation) Add a new config to delay the relocation of colocation group (#7656)
     add efb4e18 [fix](lateral-view) Fix some lateral view bugs (#7772)

No new revisions were added by this update.

Summary of changes:
 be/src/exec/table_function_node.cpp | 23 ++
 be/src/exec/table_function_node.h | 2 ++
 be/src/runtime/plan_fragment_executor.cpp | 2 +-
 .../apache/doris/planner/TableFunctionNode.java | 17
 4 files changed, 35 insertions(+), 9 deletions(-)
[GitHub] [incubator-doris] huligong1234 commented on issue #7502: Doris Roadmap 2022
huligong1234 commented on issue #7502: URL: https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015068122 Looking forward to support for the decimal data type in create table as select statements. (detailMessage = Unsupported type 'DECIMAL(9,0)' in create table as select statement)
[GitHub] [incubator-doris] huligong1234 edited a comment on issue #7502: Doris Roadmap 2022
huligong1234 edited a comment on issue #7502: URL: https://github.com/apache/incubator-doris/issues/7502#issuecomment-1015068122 support decimal data type for create table as select statement. (detailMessage = Unsupported type 'DECIMAL(9,0)' in create table as select statement)