[GitHub] [doris] anjia0532 commented on issue #6004: [Bug] Connect to Fe to add Be node failed.
anjia0532 commented on issue #6004: URL: https://github.com/apache/doris/issues/6004#issuecomment-1212792839 You need to register the BE with the FE first. The error message is `Be is empty`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] 924060929 commented on a diff in pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
924060929 commented on code in PR #11717: URL: https://github.com/apache/doris/pull/11717#discussion_r944172706

## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/Memo.java:

@@ -169,7 +169,8 @@ private Pair rewriteGroupExpression(
             GroupExpression groupExpression, Group target, LogicalProperties logicalProperties) {
         boolean newGroupExpressionGenerated = true;
         GroupExpression existedGroupExpression = groupExpressions.get(groupExpression);
-        if (existedGroupExpression != null) {
+        if (existedGroupExpression != null
+                && (target == null || target.equals(existedGroupExpression.getOwnerGroup()))) {

Review Comment:
Looks like a TODO. We can merge this PR and complete the fix later.
[GitHub] [doris] xy720 opened a new pull request, #11727: [chore] Add badges for jenkins on home page
xy720 opened a new pull request, #11727: URL: https://github.com/apache/doris/pull/11727

# Proposed changes

Issue Number: #11725

## Problem summary

Add daily test result badges for Jenkins to the Doris home page.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [x] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #11721: [Vectorized](compaction) filter delete data in base compaction
github-actions[bot] commented on PR #11721: URL: https://github.com/apache/doris/pull/11721#issuecomment-1212805774 PR approved by at least one committer and no changes requested.
[GitHub] [doris] jacktengg opened a new pull request, #11728: [bugfix](odbc) return error if convert unicode failed
jacktengg opened a new pull request, #11728: URL: https://github.com/apache/doris/pull/11728

# Proposed changes

Issue Number: close #xxx

## Problem summary

When inserting into an external ODBC table, if the data contains Unicode characters whose code point value is bigger than max int16, `utf8_ucs2_cvt.from_bytes` throws a `std::range_error` exception and causes a BE coredump.

This is a temporary solution that only avoids the BE coredump. A final solution that supports Unicode characters bigger than max int16 should be provided later.

How to reproduce:

```
-- mysql table
CREATE TABLE t1 (
    room_id bigint,
    space_name varchar(300)
);

-- doris table
CREATE EXTERNAL RESOURCE `mysql_resource`
PROPERTIES (
    "host" = "127.0.0.1",
    "port" = "3306",
    "user" = "root",
    "password" = "",
    "database" = "test",
    "driver" = "MySQL ODBC 5.3",
    "odbc_type" = "mysql",
    "type" = "odbc_catalog"
);

CREATE EXTERNAL TABLE `t1` (
    `room_id` bigint(20) NULL COMMENT "ID",
    `space_name` text NULL COMMENT "name"
) ENGINE=ODBC
COMMENT "ODBC"
PROPERTIES (
    "odbc_catalog_resource" = "mysql_resource",
    "database" = "test",
    "table" = "t1"
);

CREATE TABLE `t2` (
    `room_id` bigint(20) NULL COMMENT "ID",
    `space_name` text NULL COMMENT "name"
) ENGINE=OLAP
UNIQUE KEY(`room_id`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`room_id`) BUCKETS 2
PROPERTIES (
    "replication_allocation" = "tag.location.default: 1",
    "in_memory" = "false",
    "storage_format" = "V2"
);

insert into `t2`(`room_id`,`space_name`) values (1, "a🐷b");
insert into t1 select * from t2;
```

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
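
The failure mode described in #11728 can be illustrated with a small pre-check. This is only a sketch, not the PR's actual change: it assumes the converter targets UCS-2, which holds one 16-bit unit per character and so cannot represent code points above U+FFFF. In UTF-8 those code points are exactly the ones encoded in four bytes, so scanning for a four-byte lead byte is enough to detect them before conversion. The function name `fits_in_ucs2` is hypothetical.

```cpp
#include <string>

// Hypothetical helper (not from the PR): returns true when every code point in
// `utf8` is at most U+FFFF and therefore fits in a single 16-bit UCS-2 unit.
// Assumption: `utf8` is well-formed UTF-8; malformed input is not diagnosed.
bool fits_in_ucs2(const std::string& utf8) {
    for (unsigned char byte : utf8) {
        // A lead byte of the form 11110xxx starts a four-byte sequence,
        // which encodes a code point above U+FFFF.
        if ((byte & 0xF8) == 0xF0) {
            return false;
        }
    }
    return true;
}
```

With a check like this, a caller could return an error for the row containing `"a🐷b"` instead of letting the conversion throw. Catching `std::range_error` around the conversion call is the other obvious option; which one the PR uses is not stated in this message.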
[GitHub] [doris] github-actions[bot] commented on pull request #11726: [doc](mysql2doris)add mysql to doris documentation
github-actions[bot] commented on PR #11726: URL: https://github.com/apache/doris/pull/11726#issuecomment-1212809644 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11726: [doc](mysql2doris)add mysql to doris documentation
github-actions[bot] commented on PR #11726: URL: https://github.com/apache/doris/pull/11726#issuecomment-1212809666 PR approved by anyone and no changes requested.
[GitHub] [doris] englefly commented on a diff in pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
englefly commented on code in PR #11717: URL: https://github.com/apache/doris/pull/11717#discussion_r944186424

## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/Memo.java:

@@ -169,7 +169,34 @@ private Pair rewriteGroupExpression(
             GroupExpression groupExpression, Group target, LogicalProperties logicalProperties) {
         boolean newGroupExpressionGenerated = true;
         GroupExpression existedGroupExpression = groupExpressions.get(groupExpression);
-        if (existedGroupExpression != null) {
+        /*
+         * here we need to handle one situation that original target is not the same with
+         * existedGroupExpression.getOwnerGroup(). In this case, if we change target to
+         * existedGroupExpression.getOwnerGroup(), we could not rewrite plan as we expected and the plan
+         * will not be changed anymore.
+         * Think below example:
+         * We have a plan like this:
+         * Original (Group 2 is root):
+         * Group2: Project(outside)
+         * Group1: |---Project(inside)
+         * Group0:     |---UnboundRelation
+         *
+         * and we want to rewrite group 2 by Project(inside, GroupPlan(group 0))
+         *
+         * After rewriting we should get (Group 2 is root):
+         * Group2: Project(inside)
+         * Group0: |---UnboundRelation
+         *
+         * Group1: Project(inside)
+         *
+         * After rewriting, Group 1's GroupExpression is not in GroupExpressionsMap anymore and Group 1 is unreachable.
+         * Merge Group 1 into Group 2 is better, but in consideration of there is others way to let a Group take into
+         * unreachable. There's no need to complicate to add a merge step. Instead, we need to have a clear step to
+         * remove unreachable groups and GroupExpressions after rewrite.
+         * TODO: add a clear groups function to memo.
+         */
+        if (existedGroupExpression != null
+                && (target == null || target.equals(existedGroupExpression.getOwnerGroup()))) {
             target = existedGroupExpression.getOwnerGroup();

Review Comment:
`groups.remove(target);` then `target = existedGroupExpression.getOwnerGroup();` ?
[GitHub] [doris] englefly commented on a diff in pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
englefly commented on code in PR #11717: URL: https://github.com/apache/doris/pull/11717#discussion_r944188694

## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/Memo.java:

@@ -169,7 +169,34 @@ private Pair rewriteGroupExpression(
             GroupExpression groupExpression, Group target, LogicalProperties logicalProperties) {
         boolean newGroupExpressionGenerated = true;
         GroupExpression existedGroupExpression = groupExpressions.get(groupExpression);
-        if (existedGroupExpression != null) {
+        /*
+         * here we need to handle one situation that original target is not the same with
+         * existedGroupExpression.getOwnerGroup(). In this case, if we change target to
+         * existedGroupExpression.getOwnerGroup(), we could not rewrite plan as we expected and the plan
+         * will not be changed anymore.
+         * Think below example:
+         * We have a plan like this:
+         * Original (Group 2 is root):
+         * Group2: Project(outside)
+         * Group1: |---Project(inside)
+         * Group0:     |---UnboundRelation
+         *
+         * and we want to rewrite group 2 by Project(inside, GroupPlan(group 0))
+         *
+         * After rewriting we should get (Group 2 is root):
+         * Group2: Project(inside)
+         * Group0: |---UnboundRelation
+         *
+         * Group1: Project(inside)
+         *
+         * After rewriting, Group 1's GroupExpression is not in GroupExpressionsMap anymore and Group 1 is unreachable.
+         * Merge Group 1 into Group 2 is better, but in consideration of there is others way to let a Group take into
+         * unreachable. There's no need to complicate to add a merge step. Instead, we need to have a clear step to
+         * remove unreachable groups and GroupExpressions after rewrite.
+         * TODO: add a clear groups function to memo.
+         */
+        if (existedGroupExpression != null
+                && (target == null || target.equals(existedGroupExpression.getOwnerGroup()))) {
             target = existedGroupExpression.getOwnerGroup();

Review Comment:
If target is unreachable, it should be removed with `groups.remove(target)`, right? But if this target is a child of another GroupExpression, it is still reachable.
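
The reachability question raised in this review thread can be made concrete with a small sketch. It is not the Nereids `Memo` code (which is Java); it is a simplified model, written in C++ for illustration, in which each group is just an id mapped to the ids of its child groups. The names `GroupId`, `ChildMap`, `reachable_groups`, and `remove_unreachable` are all hypothetical. The idea matches the TODO in the diff: rather than removing `target` eagerly, sweep once from the root and drop whatever can no longer be reached.

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical simplified memo: each group id maps to the ids of its child groups.
using GroupId = int;
using ChildMap = std::unordered_map<GroupId, std::vector<GroupId>>;

// Collects every group reachable from `root` by following child links (iterative DFS).
std::unordered_set<GroupId> reachable_groups(const ChildMap& children, GroupId root) {
    std::unordered_set<GroupId> seen;
    std::vector<GroupId> stack{root};
    while (!stack.empty()) {
        GroupId current = stack.back();
        stack.pop_back();
        if (!seen.insert(current).second) {
            continue;  // already visited
        }
        auto it = children.find(current);
        if (it == children.end()) {
            continue;  // unknown group: treat as a leaf
        }
        for (GroupId child : it->second) {
            stack.push_back(child);
        }
    }
    return seen;
}

// Erases every group that is not reachable from `root`; returns how many were erased.
std::size_t remove_unreachable(ChildMap& children, GroupId root) {
    const std::unordered_set<GroupId> live = reachable_groups(children, root);
    std::size_t removed = 0;
    for (auto it = children.begin(); it != children.end();) {
        if (live.count(it->first) == 0) {
            it = children.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

In the example from the code comment, after the rewrite Group 2 and Group 1 both point at Group 0 and Group 2 is the root, so the sweep removes only Group 1. If some other group still listed Group 1 as a child, the sweep would keep it, which is the case the second review comment worries about.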
[doris-website] branch master updated: label
This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git

The following commit(s) were added to refs/heads/master by this push:
     new e0d2fd5b765 label
e0d2fd5b765 is described below

commit e0d2fd5b7657bec6b097c9bfb76d5046d898d318
Author: jiafeng.zhang
AuthorDate: Fri Aug 12 15:33:23 2022 +0800

    label
---
 docusaurus.config.js | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docusaurus.config.js b/docusaurus.config.js
index 9027a9d45a5..1d171156c6f 100644
--- a/docusaurus.config.js
+++ b/docusaurus.config.js
@@ -114,7 +114,7 @@ const config = {
           lastVersion: 'current',
           versions: {
             current: {
-              label: '1,1',
+              label: '1.1',
               path: '',
             },
             '1.0': {
[GitHub] [doris] github-actions[bot] commented on pull request #11728: [bugfix](odbc) return error if convert unicode failed
github-actions[bot] commented on PR #11728: URL: https://github.com/apache/doris/pull/11728#issuecomment-1212819135 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11728: [bugfix](odbc) return error if convert unicode failed
github-actions[bot] commented on PR #11728: URL: https://github.com/apache/doris/pull/11728#issuecomment-1212819169 PR approved by anyone and no changes requested.
[GitHub] [doris] yzbin commented on issue #9291: [Bug] The decimal type is zero after the decimal point will be erased.
yzbin commented on issue #9291: URL: https://github.com/apache/doris/issues/9291#issuecomment-1212819805 I have the same problem
[GitHub] [doris] github-actions[bot] commented on pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
github-actions[bot] commented on PR #11717: URL: https://github.com/apache/doris/pull/11717#issuecomment-1212823524 PR approved by anyone and no changes requested.
[GitHub] [doris] Gabriel39 opened a new pull request, #11729: [Bug](date function) Fix bug for date format %T
Gabriel39 opened a new pull request, #11729: URL: https://github.com/apache/doris/pull/11729

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris] branch master updated: [tools](mysql2doris)add mysql to doris documentation #11726
yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new f30553e0fe [tools](mysql2doris)add mysql to doris documentation #11726
f30553e0fe is described below

commit f30553e0fe5af94fd87241ca2406ad6b5dda5dac
Author: caoliang-web <71004656+caoliang-...@users.noreply.github.com>
AuthorDate: Fri Aug 12 15:44:35 2022 +0800

    [tools](mysql2doris)add mysql to doris documentation #11726
---
 docs/en/docs/ecosystem/mysql-to-doris.md    | 103 +++
 docs/zh-CN/docs/ecosystem/mysql-to-doris.md | 104
 2 files changed, 207 insertions(+)

diff --git a/docs/en/docs/ecosystem/mysql-to-doris.md b/docs/en/docs/ecosystem/mysql-to-doris.md
new file mode 100644
index 00..51605d72d9
--- /dev/null
+++ b/docs/en/docs/ecosystem/mysql-to-doris.md
@@ -0,0 +1,103 @@
+---
+{
+
+"title": "Mysql to Doris",
+"language": "en"
+
+}
+---
+
+
+
+# Mysql to Doris
+
+mysql to doris is mainly suitable for automating the creation of doris odbc tables, mainly implemented with shell scripts
+
+## manual
+
+mysql to doris code [here](https://github.com/apache/doris/tree/master/extension/mysql_to_doris)
+
+### Directory Structure
+
+```text
+├── mysql_to_doris
+│   ├── conf
+│   │   ├── doris.conf
+│   │   ├── mysql.conf
+│   │   └── tables
+│   ├── all_tables.sh
+│   │
+└── └── user_define_tables.sh
+```
+
+1. all_tables.sh
+
+   This script mainly reads all the tables under the mysql specified library and automatically creates the Doris odbc external table
+
+2. user_define_tables.sh
+
+   This script is mainly used for users to customize certain tables under the specified mysql library to automatically create Doris odbc external tables
+
+3. conf
+
+   Configuration file, `doris.conf` is mainly used to configure doris related, `mysql.conf` is mainly used to configure mysql related, `tables` is mainly used to configure user-defined mysql library tables
+
+### full
+
+1. Download using mysql to doris [here](https://github.com/apache/doris/tree/master/extension/mysql_to_doris)
+2. Configuration related files
+
+   ```shell
+   #doris.conf
+   master_host=
+   master_port=
+   doris_password=
+
+   #mysql.conf
+   mysql_host=
+   mysql_password=
+   ```
+
+   | Configuration item | illustrate |
+   | -- | --- |
+   | master_host| Doris FE master node IP |
+   | master_port| Doris FE query_port port |
+   | doris_password | Doris Password (default root user) |
+   | mysql_host | Mysql IP |
+   | mysql_password | Mysql Password (default root user) |
+
+3. Execute the `all_tables.sh` script
+
+```
+sh all_tables.sh mysql_db_name doris_db_name
+```
+After successful execution, the files directory will be generated, and the directory will contain `tables` (table name) and `tables.sql` (doris odbc table creation statement)
+
+### custom
+
+1. Modify the `conf/tables` file to add the name of the odbc table that needs to be created
+2. To configure mysql and doris related information, refer to step 2 of full creation
+3. Execute the `user_define_tables.sh` script
+
+```
+sh user_define_tables.sh mysql_db_name doris_db_name
+```
+
+After successful execution, the user_files directory will be generated, and the directory will contain `tables.sql` (doris odbc table creation statement)
diff --git a/docs/zh-CN/docs/ecosystem/mysql-to-doris.md b/docs/zh-CN/docs/ecosystem/mysql-to-doris.md
new file mode 100644
index 00..cc89126eb2
--- /dev/null
+++ b/docs/zh-CN/docs/ecosystem/mysql-to-doris.md
@@ -0,0 +1,104 @@
+---
+{
+
+"title": "Mysql to Doris",
+"language": "zh-CN"
+
+}
+---
+
+
+
+# Mysql to Doris
+
+mysql to doris is mainly used to automate the creation of Doris ODBC tables and is implemented mainly with shell scripts
+
+## User manual
+
+mysql to doris code is [here](https://github.com/apache/doris/tree/master/extension/mysql_to_doris)
+
+### Directory structure
+
+```text
+├── mysql_to_doris
+│   ├── conf
+│   │   ├── doris.conf
+│   │   ├── mysql.conf
+│   │   └── tables
+│   ├── all_tables.sh
+│   │
+└── └── user_define_tables.sh
+```
+
+1. all_tables.sh
+
+   This script reads all tables in the specified mysql database and automatically creates Doris ODBC external tables
+
+2. user_define_tables.sh
+
+   This script lets users pick certain tables in the specified mysql database and automatically creates Doris ODBC external tables for them
+
+3. conf
+
+   Configuration files: `doris.conf` holds the Doris settings, `mysql.conf` holds the mysql settings, and `tables` lists the user-defined mysql tables
+
+### Full
+
+1. Download mysql to doris [here](https://github.com/apache/doris/tree/master/extension/mysql_to_doris)
+2. Configure the related files
+
+   ```shell
+   #doris.conf
+   master_host=
+   master_port=
+   doris_password=
+
+   #mysql.conf
+   mysql_host=
+   mysql_password=
+   ```
+
+   | Configuration item | Description |
+   | -- | --- |
+   | master_host| Doris FE master node IP |
+   | master_port| Doris FE query_port port |
+   | doris_password | Doris password (default root user) |
+   | mysql_host |
[GitHub] [doris] yiguolei merged pull request #11726: [doc](mysql2doris)add mysql to doris documentation
yiguolei merged PR #11726: URL: https://github.com/apache/doris/pull/11726
[GitHub] [doris] pengxiangyu opened a new pull request, #11730: [fix](core)fix bug for status not init
pengxiangyu opened a new pull request, #11730: URL: https://github.com/apache/doris/pull/11730

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris-website] branch master updated: spark load example
jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git

The following commit(s) were added to refs/heads/master by this push:
     new 0ffdbb2dcdf spark load example
0ffdbb2dcdf is described below

commit 0ffdbb2dcdfb9d5e5d0e41be0d7a48584740e6ed
Author: jiafeng.zhang
AuthorDate: Fri Aug 12 15:47:54 2022 +0800

    spark load example
---
 .../import/import-way/spark-load-manual.md | 68 +
 .../import/import-way/spark-load-manual.md | 70 ++
 2 files changed, 138 insertions(+)

diff --git a/docs/data-operate/import/import-way/spark-load-manual.md b/docs/data-operate/import/import-way/spark-load-manual.md
index a1c314ec377..a1ebce9837b 100644
--- a/docs/data-operate/import/import-way/spark-load-manual.md
+++ b/docs/data-operate/import/import-way/spark-load-manual.md
@@ -487,6 +487,73 @@ PROPERTIES
 ```
 
+Example 4: Import data from hive partitioned table
+
+```sql
+-- hive create table statement
+create table test_partition(
+id int,
+name string,
+age int
+)
+partitioned by (dt string)
+row format delimited fields terminated by ','
+stored as textfile;
+
+-- doris create table statement
+CREATE TABLE IF NOT EXISTS test_partition_04
+(
+dt date,
+id int,
+name string,
+age int
+)
+UNIQUE KEY(`dt`, `id`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 1
+PROPERTIES (
+"replication_allocation" = "tag.location.default: 1"
+);
+-- spark load
+CREATE EXTERNAL RESOURCE "spark_resource"
+PROPERTIES
+(
+"type" = "spark",
+"spark.master" = "yarn",
+"spark.submit.deployMode" = "cluster",
+"spark.executor.memory" = "1g",
+"spark.yarn.queue" = "default",
+"spark.hadoop.yarn.resourcemanager.address" = "localhost:50056",
+"spark.hadoop.fs.defaultFS" = "hdfs://localhost:9000",
+"working_dir" = "hdfs://localhost:9000/tmp/doris",
+"broker" = "broker_01"
+);
+LOAD LABEL demo.test_hive_partition_table_18
+(
+DATA INFILE("hdfs://localhost:9000/user/hive/warehouse/demo.db/test/dt=2022-08-01/*")
+INTO TABLE test_partition_04
+COLUMNS TERMINATED BY ","
+FORMAT AS "csv"
+(id,name,age)
+COLUMNS FROM PATH AS (`dt`)
+SET
+(
+dt=dt,
+id=id,
+name=name,
+age=age
+)
+)
+WITH RESOURCE 'spark_resource'
+(
+"spark.executor.memory" = "1g",
+"spark.shuffle.compress" = "true"
+)
+PROPERTIES
+(
+"timeout" = "3600"
+);
+
+
 You can view the details syntax about creating load by input `help spark load`. This paper mainly introduces the parameter meaning and precautions in the creation and load syntax of spark load.
 
 **Label**
@@ -651,6 +718,7 @@ The most suitable scenario to use spark load is that the raw data is in the file
 
 ## FAQ
 
+* Spark load does not yet support the import of Doris table fields that are of type String. If your table fields are of type String, please change them to type varchar, otherwise the import will fail, prompting `type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel`
 * When using spark load, the `HADOOP_CONF_DIR` environment variable is no set in the `spark-env.sh`.
 
 If the `HADOOP_CONF_DIR` environment variable is not set, the error `When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment` will be reported.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md
index 6933709791e..443bac0f99f 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md
@@ -449,6 +449,75 @@ PROPERTIES
 );
 ```
 
+Example 4: Import data from a hive partitioned table
+
+```sql
+-- hive create table statement
+create table test_partition(
+    id int,
+    name string,
+    age int
+)
+partitioned by (dt string)
+row format delimited fields terminated by ','
+stored as textfile;
+
+-- doris create table statement
+CREATE TABLE IF NOT EXISTS test_partition_04
+(
+    dt date,
+    id int,
+    name string,
+    age int
+)
+UNIQUE KEY(`dt`, `id`)
+DISTRIBUTED BY HASH(`id`) BUCKETS 1
+PROPERTIES (
+    "replication_allocation" = "tag.location.default: 1"
+);
+-- spark load statement
+CREATE EXTERNAL RESOURCE "spark_resource"
+PROPERTIES
+(
+"type" = "spark",
+"spark.master" = "yarn",
+"spark.submit.deployMode" = "cluster",
+"spark.executor.memory" = "1g",
+"spark.yarn.queue" = "default",
+"spark.hadoop.yarn.resourcemanager.address" = "localhost:50056",
+"spark.hadoop.fs.defaultFS" = "hdfs://localhost:9000",
+"working_dir" = "hdfs://localhost:9000/tmp/doris",
+"broker" = "broker_01"
+);
+LOAD LABEL demo.test_hive_partition_table_18
+(
+DATA INFILE("hdfs://localhost:9000/user/hive/warehouse/demo.db/test/dt=2022-08-01/*")
+INTO TABL
[GitHub] [doris] yiguolei commented on a diff in pull request #11730: [fix](core)fix bug for status not init
yiguolei commented on code in PR #11730: URL: https://github.com/apache/doris/pull/11730#discussion_r944203361

## be/src/vec/exec/vintersect_node.cpp:

@@ -45,7 +45,7 @@ Status VIntersectNode::open(RuntimeState* state) {
     START_AND_SCOPE_SPAN(state->get_tracer(), span, "VIntersectNode::open");
     RETURN_IF_ERROR(VSetOperationNode::open(state));
     bool eos = false;
-    Status st;
+    Status st = Status::OK();

Review Comment:
Maybe not. Add a default constructor and set st to ok?
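
The reviewer's suggestion, giving the status type a default constructor that yields an OK state, can be sketched as follows. This is a stand-in written for illustration only and is not Doris's actual `Status` class; the members `code_` and `msg_` and the `InternalError` factory shown here are assumptions. The point is that default member initializers make `Status st;` well defined, so call sites do not need `= Status::OK()`.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a status type; not Doris's real Status class.
class Status {
public:
    // Default construction yields a well-defined OK state via the
    // default member initializers below.
    Status() = default;

    static Status OK() { return Status(); }
    static Status InternalError(std::string msg) { return Status(1, std::move(msg)); }

    bool ok() const { return code_ == 0; }
    const std::string& message() const { return msg_; }

private:
    Status(int code, std::string msg) : code_(code), msg_(std::move(msg)) {}

    int code_ = 0;     // 0 means OK in this sketch
    std::string msg_;  // empty for OK
};
```

With this shape, `Status st;` and `Status st = Status::OK();` behave identically, which is what the review comment asks for.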
[GitHub] [doris] hf200012 opened a new pull request, #11732: [improvement](doc)Import data example from hive partition table
hf200012 opened a new pull request, #11732: URL: https://github.com/apache/doris/pull/11732

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [x] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [x] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] yixiutt opened a new pull request, #11731: [feature-wip](unique-key-merge-on-write) add regression case
yixiutt opened a new pull request, #11731: URL: https://github.com/apache/doris/pull/11731 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
github-actions[bot] commented on PR #11717: URL: https://github.com/apache/doris/pull/11717#issuecomment-1212854797 PR approved by at least one committer and no changes requested.
[GitHub] [doris] 924060929 commented on a diff in pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
924060929 commented on code in PR #11717: URL: https://github.com/apache/doris/pull/11717#discussion_r944225534 ## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/Memo.java: ## @@ -169,7 +169,34 @@ private Pair rewriteGroupExpression( GroupExpression groupExpression, Group target, LogicalProperties logicalProperties) { boolean newGroupExpressionGenerated = true; GroupExpression existedGroupExpression = groupExpressions.get(groupExpression); -if (existedGroupExpression != null) { +/* + * here we need to handle one situation that original target is not the same with + * existedGroupExpression.getOwnerGroup(). In this case, if we change target to + * existedGroupExpression.getOwnerGroup(), we could not rewrite plan as we expected and the plan + * will not be changed anymore. + * Think below example: + * We have a plan like this: + * Original (Group 2 is root): + * Group2: Project(outside) + * Group1: |---Project(inside) + * Group0: |---UnboundRelation + * + * and we want to rewrite group 2 by Project(inside, GroupPlan(group 0)) + * + * After rewriting we should get (Group 2 is root): + * Group2: Project(inside) + * Group0: |---UnboundRelation + * + * Group1: Project(inside) + * + * After rewriting, Group 1's GroupExpression is not in GroupExpressionsMap anymore and Group 1 is unreachable. + * Merge Group 1 into Group 2 is better, but in consideration of there is others way to let a Group take into + * unreachable. There's no need to complicate to add a merge step. Instead, we need to have a clear step to + * remove unreachable groups and GroupExpressions after rewrite. + * TODO: add a clear groups function to memo. + */ +if (existedGroupExpression != null +&& (target == null || target.equals(existedGroupExpression.getOwnerGroup()))) { target = existedGroupExpression.getOwnerGroup(); Review Comment: fix it in next pr
[GitHub] [doris] 924060929 merged pull request #11717: [fix](Nereids)memo copy in cannot rewrite current plan with its child
924060929 merged PR #11717: URL: https://github.com/apache/doris/pull/11717
[doris] branch master updated (f30553e0fe -> 887de4b465)
This is an automated email from the ASF dual-hosted git repository. huajianlan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from f30553e0fe [tools](mysql2doris)add mysql to doris documentation #11726 add 887de4b465 [fix](Nereids)memo copy in cannot rewrite current plan with its child (#11717) No new revisions were added by this update. Summary of changes: .../java/org/apache/doris/nereids/memo/Memo.java | 31 - .../org/apache/doris/nereids/memo/MemoTest.java| 52 +- 2 files changed, 79 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] github-actions[bot] commented on pull request #11731: [feature-wip](unique-key-merge-on-write) add regression case
github-actions[bot] commented on PR #11731: URL: https://github.com/apache/doris/pull/11731#issuecomment-1212859200 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11731: [feature-wip](unique-key-merge-on-write) add regression case
github-actions[bot] commented on PR #11731: URL: https://github.com/apache/doris/pull/11731#issuecomment-1212859218 PR approved by anyone and no changes requested.
[GitHub] [doris] yangzhg opened a new issue, #11734: [Bug] backup restore raise `Storage backend not initialized.` error
yangzhg opened a new issue, #11734: URL: https://github.com/apache/doris/issues/11734 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Version master ### What's Wrong? backup restore raise `Storage backend not initialized.` error ### What You Expected? success ### How to Reproduce? backup a table to broker repository ### Anything Else? _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] carlvinhust2012 commented on a diff in pull request #11732: [improvement](doc)Import data example from hive partition table
carlvinhust2012 commented on code in PR #11732: URL: https://github.com/apache/doris/pull/11732#discussion_r944243705 ## docs/en/docs/data-operate/import/import-way/spark-load-manual.md: ## @@ -647,6 +716,7 @@ The most suitable scenario to use spark load is that the raw data is in the file ## FAQ +* Spark load does not yet support the import of Doris table fields that are of type String. If your table fields are of type String, please change them to type varchar, otherwise the import will fail, prompting `type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel` Review Comment: ```suggestion * Spark load does not yet support the import of Doris table fields that are of type String. If your table fields are of type String, please change them to type varchar, otherwise the import will fail, prompting `type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel` ``` I think this passage would read better as follows: Spark load does not support importing Doris table fields of type String. If a table field is of type String, please change it to type varchar. Otherwise, the import will fail and prompt `type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel`
[GitHub] [doris] cun8cun8 opened a new pull request, #11735: "Spark load" supports string data type
cun8cun8 opened a new pull request, #11735: URL: https://github.com/apache/doris/pull/11735 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] cun8cun8 closed pull request #11735: "Spark load" supports string data type
cun8cun8 closed pull request #11735: "Spark load" supports string data type URL: https://github.com/apache/doris/pull/11735
[GitHub] [doris] yangzhg opened a new pull request, #11736: [bug](backup) fix backup restore raise `Storage backend not initialized.` error
yangzhg opened a new pull request, #11736: URL: https://github.com/apache/doris/pull/11736 # Proposed changes Issue Number: close #11734 ## Problem summary fix backup restore raise `Storage backend not initialized.` error ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [x] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #11727: [chore](regression) Add badges for jenkins on home page
github-actions[bot] commented on PR #11727: URL: https://github.com/apache/doris/pull/11727#issuecomment-1212887329 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11727: [chore](regression) Add badges for jenkins on home page
github-actions[bot] commented on PR #11727: URL: https://github.com/apache/doris/pull/11727#issuecomment-1212887374 PR approved by anyone and no changes requested.
[GitHub] [doris] cun8cun8 opened a new pull request, #11737: "Spark load" supports string data type
cun8cun8 opened a new pull request, #11737: URL: https://github.com/apache/doris/pull/11737 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris] branch master updated: [enhancement](Nereids)refactor sort plan in nereids (#11673)
This is an automated email from the ASF dual-hosted git repository. huajianlan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new ed47a3bb6d [enhancement](Nereids)refactor sort plan in nereids (#11673) ed47a3bb6d is described below commit ed47a3bb6dcc325de66d50ca4a42dec2298fa1a3 Author: morrySnow <101034200+morrys...@users.noreply.github.com> AuthorDate: Fri Aug 12 17:08:23 2022 +0800 [enhancement](Nereids)refactor sort plan in nereids (#11673) 1. rename PhysicalHeapSort to PhysicalQuickSort 2. add LogicalTopN and PhysicalTopN 3. add implementation rule for LogicalTopN 4. add a interface Sort for both logical and physical sort 5. add a interface TopN for both logical and physical top-n 6. add a AbstractPhysicalSort as super class of PhysicalQuickSort and PhysicalTopN --- .../apache/doris/nereids/cost/CostCalculator.java | 17 ++- .../glue/translator/PhysicalPlanTranslator.java| 51 ++--- .../apache/doris/nereids/properties/OrderSpec.java | 4 +- .../org/apache/doris/nereids/rules/RuleSet.java| 6 +- .../org/apache/doris/nereids/rules/RuleType.java | 3 +- ...rt.java => LogicalSortToPhysicalQuickSort.java} | 8 +- ...eapSort.java => LogicalTopNToPhysicalTopN.java} | 18 ++-- .../doris/nereids/stats/StatsCalculator.java | 22 +++- .../apache/doris/nereids/trees/plans/PlanType.java | 4 +- .../plans/algebra/Sort.java} | 21 ++-- .../plans/algebra/TopN.java} | 22 ++-- .../nereids/trees/plans/logical/LogicalSort.java | 19 +--- .../logical/{LogicalSort.java => LogicalTopN.java} | 48 - ...PhysicalJoin.java => AbstractPhysicalJoin.java} | 6 +- ...icalHeapSort.java => AbstractPhysicalSort.java} | 50 ++--- .../trees/plans/physical/PhysicalHashJoin.java | 2 +- .../plans/physical/PhysicalNestedLoopJoin.java | 2 +- ...hysicalHeapSort.java => PhysicalQuickSort.java} | 46 ++-- .../{PhysicalHeapSort.java => PhysicalTopN.java} | 52 + 
.../nereids/trees/plans/visitor/PlanVisitor.java | 19 +++- .../org/apache/doris/nereids/util/JoinUtils.java | 10 +- .../rules/implementation/ImplementationTest.java | 120 + .../LogicalLimitToPhysicalLimitTest.java | 46 .../LogicalProjectToPhysicalProjectTest.java | 82 -- .../doris/nereids/stats/StatsCalculatorTest.java | 42 ++-- .../doris/nereids/trees/plans/PlanEqualsTest.java | 10 +- 26 files changed, 379 insertions(+), 351 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostCalculator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostCalculator.java index ccb99debde..d90970f7f4 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostCalculator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostCalculator.java @@ -24,10 +24,11 @@ import org.apache.doris.nereids.trees.plans.Plan; import org.apache.doris.nereids.trees.plans.physical.PhysicalAggregate; import org.apache.doris.nereids.trees.plans.physical.PhysicalDistribution; import org.apache.doris.nereids.trees.plans.physical.PhysicalHashJoin; -import org.apache.doris.nereids.trees.plans.physical.PhysicalHeapSort; import org.apache.doris.nereids.trees.plans.physical.PhysicalNestedLoopJoin; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapScan; import org.apache.doris.nereids.trees.plans.physical.PhysicalProject; +import org.apache.doris.nereids.trees.plans.physical.PhysicalQuickSort; +import org.apache.doris.nereids.trees.plans.physical.PhysicalTopN; import org.apache.doris.nereids.trees.plans.visitor.PlanVisitor; import org.apache.doris.statistics.StatsDeriveResult; @@ -79,7 +80,19 @@ public class CostCalculator { } @Override -public CostEstimate visitPhysicalHeapSort(PhysicalHeapSort physicalHeapSort, PlanContext context) { +public CostEstimate visitPhysicalQuickSort(PhysicalQuickSort physicalQuickSort, PlanContext context) { +// TODO: consider two-phase sort and enforcer. 
+StatsDeriveResult statistics = context.getStatisticsWithCheck(); +StatsDeriveResult childStatistics = context.getChildStatistics(0); + +return new CostEstimate( +childStatistics.computeSize(), +statistics.computeSize(), +childStatistics.computeSize()); +} + +@Override +public CostEstimate visitPhysicalTopN(PhysicalTopN topN, PlanContext context) { // TODO: consider two-phase sort and enforcer. StatsDeriveResult statistics = context.getStatisticsWithCheck(); StatsDeriveResult childStatistics = context.getChildStatistics(0); diff --git a/fe/fe-core/src/main/java/org/apache/doris/ne
[GitHub] [doris] 924060929 merged pull request #11673: [enhancement](Nereids)refactor sort plan in nereids
924060929 merged PR #11673: URL: https://github.com/apache/doris/pull/11673
[GitHub] [doris] yiguolei merged pull request #11728: [bugfix](odbc) return error if convert unicode failed
yiguolei merged PR #11728: URL: https://github.com/apache/doris/pull/11728
[doris] branch dev-1.1.2 updated: [bugfix](odbc) return error if convert unicode failed (#11728)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch dev-1.1.2 in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/dev-1.1.2 by this push: new 03d5f017b0 [bugfix](odbc) return error if convert unicode failed (#11728) 03d5f017b0 is described below commit 03d5f017b0dbce60e223d7804ec8e50e42a0cd63 Author: TengJianPing <18241664+jackte...@users.noreply.github.com> AuthorDate: Fri Aug 12 17:28:48 2022 +0800 [bugfix](odbc) return error if convert unicode failed (#11728) * [bugfix](odbc) return error if convert unicode failed --- be/src/exec/odbc_connector.cpp | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/be/src/exec/odbc_connector.cpp b/be/src/exec/odbc_connector.cpp index 2fd296dfee..1de8337e5d 100644 --- a/be/src/exec/odbc_connector.cpp +++ b/be/src/exec/odbc_connector.cpp @@ -48,9 +48,14 @@ static constexpr uint32_t BIG_COLUMN_SIZE_BUFFER = 65535; // Default max buffer size use in insert to: 50MB, normally a batch is smaller than the size static constexpr uint32_t INSERT_BUFFER_SIZE = 1024l * 1024 * 50; -static std::u16string utf8_to_wstring(const std::string& str) { +static doris::Status utf8_to_wstring(const std::string& str, std::u16string& out) { std::wstring_convert, char16_t> utf8_ucs2_cvt; -return utf8_ucs2_cvt.from_bytes(str); +try { +out = utf8_ucs2_cvt.from_bytes(str); +} catch (std::range_error& e) { +return doris::Status::InternalError("UNICODE out of supported range"); +} +return doris::Status::OK(); } namespace doris { @@ -128,7 +133,8 @@ Status ODBCConnector::query() { "alloc statement"); // Translate utf8 string to utf16 to use unicode encoding -auto wquery = utf8_to_wstring(_sql_str); +std::u16string wquery; +RETURN_IF_ERROR(utf8_to_wstring(_sql_str, wquery)); ODBC_DISPOSE(_stmt, SQL_HANDLE_STMT, SQLExecDirectW(_stmt, (SQLWCHAR*)(wquery.c_str()), SQL_NTS), "exec direct"); @@ -309,9 +315,10 @@ Status 
ODBCConnector::append(const std::string& table_name, RowBatch* batch, } } // Translate utf8 string to utf16 to use unicode encodeing -insert_stmt = utf8_to_wstring( +RETURN_IF_ERROR(utf8_to_wstring( std::string(_insert_stmt_buffer.data(), -_insert_stmt_buffer.data() + _insert_stmt_buffer.size())); +_insert_stmt_buffer.data() + _insert_stmt_buffer.size()), +insert_stmt)); } { @@ -494,9 +501,10 @@ Status ODBCConnector::append(const std::string& table_name, vectorized::Block* b } } // Translate utf8 string to utf16 to use unicode encodeing -insert_stmt = utf8_to_wstring( +RETURN_IF_ERROR(utf8_to_wstring( std::string(_insert_stmt_buffer.data(), -_insert_stmt_buffer.data() + _insert_stmt_buffer.size())); +_insert_stmt_buffer.data() + _insert_stmt_buffer.size()), +insert_stmt)); } { - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
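The pattern in the commit above (report a failed UTF-8 to UTF-16 conversion through a status return and an out-parameter, rather than letting `std::range_error` propagate) can be sketched in isolation as follows. This is a rough illustration under stated assumptions: `MiniStatus` and `utf8_to_utf16` are hypothetical stand-ins rather than the Doris types, and the `std::codecvt_utf8_utf16<char16_t>` facet is an assumption, since the archive lost the template arguments of `std::wstring_convert` in the diff.

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a status type: an ok flag plus a message.
class MiniStatus {
public:
    static MiniStatus OK() { return MiniStatus(true, ""); }
    static MiniStatus InternalError(std::string msg) {
        return MiniStatus(false, std::move(msg));
    }
    bool ok() const { return ok_; }
    const std::string& message() const { return msg_; }

private:
    MiniStatus(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
    bool ok_;
    std::string msg_;
};

// Converts UTF-8 to UTF-16. On success the result is written to `out`.
// On invalid input, from_bytes throws std::range_error (no error string was
// passed to the converter), which is translated into an error status here.
// The facet choice is an assumption; std::wstring_convert is deprecated
// since C++17 but still available in common standard libraries.
inline MiniStatus utf8_to_utf16(const std::string& in, std::u16string& out) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> cvt;
    try {
        out = cvt.from_bytes(in);
    } catch (const std::range_error&) {
        return MiniStatus::InternalError("UNICODE out of supported range");
    }
    return MiniStatus::OK();
}
```

A caller then checks the returned status before using `out`, which is the role `RETURN_IF_ERROR` plays at the three call sites in the diff.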
[doris] branch master updated: [bugfix](odbc) return error if convert unicode failed (#11728)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 58822c7b55 [bugfix](odbc) return error if convert unicode failed (#11728) 58822c7b55 is described below commit 58822c7b5547d202c13352055df49fbc1105cf3e Author: TengJianPing <18241664+jackte...@users.noreply.github.com> AuthorDate: Fri Aug 12 17:28:48 2022 +0800 [bugfix](odbc) return error if convert unicode failed (#11728) * [bugfix](odbc) return error if convert unicode failed --- be/src/exec/odbc_connector.cpp | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/be/src/exec/odbc_connector.cpp b/be/src/exec/odbc_connector.cpp index 66920bf0f9..5ca74080df 100644 --- a/be/src/exec/odbc_connector.cpp +++ b/be/src/exec/odbc_connector.cpp @@ -48,9 +48,14 @@ static constexpr uint32_t BIG_COLUMN_SIZE_BUFFER = 65535; // Default max buffer size use in insert to: 50MB, normally a batch is smaller than the size static constexpr uint32_t INSERT_BUFFER_SIZE = 1024l * 1024 * 50; -static std::u16string utf8_to_wstring(const std::string& str) { +static doris::Status utf8_to_wstring(const std::string& str, std::u16string& out) { std::wstring_convert, char16_t> utf8_ucs2_cvt; -return utf8_ucs2_cvt.from_bytes(str); +try { +out = utf8_ucs2_cvt.from_bytes(str); +} catch (std::range_error& e) { +return doris::Status::InternalError("UNICODE out of supported range"); +} +return doris::Status::OK(); } namespace doris { @@ -128,7 +133,8 @@ Status ODBCConnector::query() { "alloc statement"); // Translate utf8 string to utf16 to use unicode encoding -auto wquery = utf8_to_wstring(_sql_str); +std::u16string wquery; +RETURN_IF_ERROR(utf8_to_wstring(_sql_str, wquery)); ODBC_DISPOSE(_stmt, SQL_HANDLE_STMT, SQLExecDirectW(_stmt, (SQLWCHAR*)(wquery.c_str()), SQL_NTS), "exec direct"); @@ -307,9 +313,10 @@ Status 
ODBCConnector::append(const std::string& table_name, RowBatch* batch, } } // Translate utf8 string to utf16 to use unicode encodeing -insert_stmt = utf8_to_wstring( +RETURN_IF_ERROR(utf8_to_wstring( std::string(_insert_stmt_buffer.data(), -_insert_stmt_buffer.data() + _insert_stmt_buffer.size())); +_insert_stmt_buffer.data() + _insert_stmt_buffer.size()), +insert_stmt)); } { @@ -492,9 +499,10 @@ Status ODBCConnector::append(const std::string& table_name, vectorized::Block* b } } // Translate utf8 string to utf16 to use unicode encodeing -insert_stmt = utf8_to_wstring( +RETURN_IF_ERROR(utf8_to_wstring( std::string(_insert_stmt_buffer.data(), -_insert_stmt_buffer.data() + _insert_stmt_buffer.size())); +_insert_stmt_buffer.data() + _insert_stmt_buffer.size()), +insert_stmt)); } { - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[doris-website] branch master updated: add MULTI-LOAD
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git The following commit(s) were added to refs/heads/master by this push: new bf9a6823244 add MULTI-LOAD bf9a6823244 is described below commit bf9a68232445505afc4429193d17104bac6601c0 Author: jiafeng.zhang AuthorDate: Fri Aug 12 17:33:24 2022 +0800 add MULTI-LOAD add MULTI-LOAD --- sidebars.json | 1 + 1 file changed, 1 insertion(+) diff --git a/sidebars.json b/sidebars.json index eb6ff0be824..21f6820e28c 100644 --- a/sidebars.json +++ b/sidebars.json @@ -611,6 +611,7 @@ "sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD", "sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD", "sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD", + "sql-manual/sql-reference/Data-Manipulation-Statements/Load/MULTI-LOAD", "sql-manual/sql-reference/Data-Manipulation-Statements/Load/PAUSE-ROUTINE-LOAD", "sql-manual/sql-reference/Data-Manipulation-Statements/Load/RESUME-ROUTINE-LOAD", "sql-manual/sql-reference/Data-Manipulation-Statements/Load/STOP-ROUTINE-LOAD", - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] xinyiZzz opened a new issue, #11738: [Enhancement] Improve query memory tracking accuracy
xinyiZzz opened a new issue, #11738: URL: https://github.com/apache/doris/issues/11738 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description Based on the memory counted for a single query in the mem tracker, it is impossible to estimate how many queries the cluster can carry, because a query may actually use only 300M of physical memory while the mem tracker counts 900M. ### Solution _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] yiguolei opened a new pull request, #11739: Fix sc bug
yiguolei opened a new pull request, #11739: URL: https://github.com/apache/doris/pull/11739 # Proposed changes If a tablet has a delete condition on string columns, schema change may core dump because the reader is destructed early while the delete predicate still depends on the reader's mempool. ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
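The lifetime problem described in PR #11739 above (a predicate holding string data owned by a reader's memory pool, with the reader destroyed first) can be illustrated with a small sketch. Everything here is hypothetical: `StringPool`, `Reader`, `DeletePredicate`, and `make_predicate` are invented names, and sharing ownership of the pool is only one possible way to tie the lifetimes together. The message above does not say how the PR itself fixes the bug.

```cpp
#include <deque>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical pool that owns string storage and hands out views into it.
class StringPool {
public:
    std::string_view intern(const std::string& s) {
        storage_.push_back(s);
        return storage_.back();
    }

private:
    // std::deque does not relocate existing elements on push_back,
    // so views into earlier entries stay valid as the pool grows.
    std::deque<std::string> storage_;
};

// Hypothetical reader that creates the pool its predicates draw from.
struct Reader {
    std::shared_ptr<StringPool> pool = std::make_shared<StringPool>();
};

// Hypothetical delete predicate on a string column. It co-owns the pool,
// so the bytes behind value_ outlive the reader that created the pool.
class DeletePredicate {
public:
    DeletePredicate(std::shared_ptr<StringPool> pool, const std::string& value)
        : pool_(std::move(pool)), value_(pool_->intern(value)) {}

    bool matches(std::string_view cell) const { return cell == value_; }

private:
    std::shared_ptr<StringPool> pool_;  // declared first, so initialized first
    std::string_view value_;            // points into *pool_
};

// The reader goes out of scope at the end of this function. Had the
// predicate kept only a view into a pool owned solely by the reader,
// that view would dangle once the function returns.
inline DeletePredicate make_predicate(const std::string& value) {
    Reader reader;
    DeletePredicate pred(reader.pool, value);
    return pred;
}
```

The other obvious option is to keep the reader itself alive for as long as the predicate is in use; which of the two is cheaper depends on what else the reader holds.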
[GitHub] [doris] yiguolei merged pull request #11730: [fix](core)fix bug for status not init
yiguolei merged PR #11730: URL: https://github.com/apache/doris/pull/11730
[doris] branch master updated: [fix](core)fix bug for status not init(#11730)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 1c4927eac3 [fix](core)fix bug for status not init(#11730) 1c4927eac3 is described below commit 1c4927eac30d53e57f1ed43e57c1adc5262c0b64 Author: pengxiangyu AuthorDate: Fri Aug 12 17:42:37 2022 +0800 [fix](core)fix bug for status not init(#11730) --- be/src/vec/exec/file_arrow_scanner.cpp | 2 +- be/src/vec/exec/varrow_scanner.cpp | 2 +- be/src/vec/exec/vexcept_node.cpp | 2 +- be/src/vec/exec/vintersect_node.cpp| 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/be/src/vec/exec/file_arrow_scanner.cpp b/be/src/vec/exec/file_arrow_scanner.cpp index c8ce603f89..998d40ea6f 100644 --- a/be/src/vec/exec/file_arrow_scanner.cpp +++ b/be/src/vec/exec/file_arrow_scanner.cpp @@ -127,7 +127,7 @@ Status FileArrowScanner::_next_arrow_batch() { Status FileArrowScanner::_init_arrow_batch_if_necessary() { // 1. init batch if first time // 2. reset reader if end of file -Status status; +Status status = Status::OK(); if (_scanner_eof) { return Status::EndOfFile("EOF"); } diff --git a/be/src/vec/exec/varrow_scanner.cpp b/be/src/vec/exec/varrow_scanner.cpp index 200e467810..1e5597f9a0 100644 --- a/be/src/vec/exec/varrow_scanner.cpp +++ b/be/src/vec/exec/varrow_scanner.cpp @@ -140,7 +140,7 @@ Status VArrowScanner::_next_arrow_batch() { Status VArrowScanner::_init_arrow_batch_if_necessary() { // 1. init batch if first time // 2. 
reset reader if end of file -Status status; +Status status = Status::OK(); if (_scanner_eof) { return Status::EndOfFile("EOF"); } diff --git a/be/src/vec/exec/vexcept_node.cpp b/be/src/vec/exec/vexcept_node.cpp index 1a97359921..ee38702fbf 100644 --- a/be/src/vec/exec/vexcept_node.cpp +++ b/be/src/vec/exec/vexcept_node.cpp @@ -45,7 +45,7 @@ Status VExceptNode::open(RuntimeState* state) { START_AND_SCOPE_SPAN(state->get_tracer(), span, "VExceptNode::open"); RETURN_IF_ERROR(VSetOperationNode::open(state)); bool eos = false; -Status st; +Status st = Status::OK(); for (int i = 1; i < _children.size(); ++i) { if (i > 1) { refresh_hash_table(); diff --git a/be/src/vec/exec/vintersect_node.cpp b/be/src/vec/exec/vintersect_node.cpp index f8f083ced1..5fcc5f10fa 100644 --- a/be/src/vec/exec/vintersect_node.cpp +++ b/be/src/vec/exec/vintersect_node.cpp @@ -45,7 +45,7 @@ Status VIntersectNode::open(RuntimeState* state) { START_AND_SCOPE_SPAN(state->get_tracer(), span, "VIntersectNode::open"); RETURN_IF_ERROR(VSetOperationNode::open(state)); bool eos = false; -Status st; +Status st = Status::OK(); for (int i = 1; i < _children.size(); ++i) { if (i > 1) { - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] xinyiZzz opened a new pull request, #11740: [enhancement](memtracker) Optimize query memory accuracy
xinyiZzz opened a new pull request, #11740: URL: https://github.com/apache/doris/pull/11740 # Proposed changes Issue Number: close #11738 ## Problem summary ### motivation Make the value of the query mem tracker consistent with the physical memory actually used by the query. ### problem causes Currently, only the virtual memory used by the query can be tracked, through the tcmalloc hook. When memory is not fully used after it has been allocated, the recorded virtual memory is larger than the physical memory. At present, this is mainly because PODArray does not memset the memory to 0 when allocating it, and blocks allocated through PODArray in places such as VOlapScanNode::_free_blocks are usually kept for memory reuse and are not fully used. ### Fix The query mem tracker only records the peak memory used by PODArray and MemPool. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
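The fix described in #11740 (record only the peak memory a buffer has actually used, not what it reserved) can be illustrated with a small C++ sketch. This is not the Doris implementation: `SketchTracker` and `PeakTrackedBuffer` are hypothetical names invented here, and the backing `std::vector` zero-fills its storage, which PODArray does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a query mem tracker: a running byte count.
struct SketchTracker {
    int64_t consumed = 0;
    void consume(int64_t bytes) { consumed += bytes; }
};

// Hypothetical buffer that reserves capacity up front but reports to the
// tracker only the high-water mark of bytes actually written.
// Note: std::vector zero-fills its storage; that differs from PODArray and
// is only a convenience for this sketch.
class PeakTrackedBuffer {
public:
    PeakTrackedBuffer(size_t capacity, SketchTracker* tracker)
            : _data(capacity), _tracker(tracker) {
        assert(tracker != nullptr);
    }

    // Append n bytes and report any growth of the peak to the tracker.
    void append(const char* src, size_t n) {
        assert(_used + n <= _data.size());
        std::memcpy(_data.data() + _used, src, n);
        _used += n;
        if (_used > _peak) {
            _tracker->consume(static_cast<int64_t>(_used - _peak));
            _peak = _used;
        }
    }

    // Reuse the buffer: the contents are dropped, the recorded peak stays.
    void clear() { _used = 0; }

    size_t capacity() const { return _data.size(); }
    size_t peak() const { return _peak; }

private:
    std::vector<char> _data;
    SketchTracker* _tracker;
    size_t _used = 0;
    size_t _peak = 0;
};
```

With a 900-byte buffer that only ever holds 300 bytes, the tracker in this sketch ends up at 300 rather than 900, which mirrors the 300M versus 900M gap reported in #11738.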
[GitHub] [doris] ChPi commented on pull request #11565: [fix] (docs) Fix Data type: string length and parameter description
ChPi commented on PR #11565: URL: https://github.com/apache/doris/pull/11565#issuecomment-1212935917 @morningman Users who run Doris as their only data warehouse store fairly large values, and have done so since version 0.15. Please do not hard-code the limit in the future; keep the size configurable. The documentation can point out the problems that overly large data causes and let users set the limit according to their own cluster and business situation.
[GitHub] [doris] daikon12 opened a new pull request, #11741: Update udf c make files
daikon12 opened a new pull request, #11741: URL: https://github.com/apache/doris/pull/11741 # Proposed changes 1. thirdparty/include --> ../thirdparty/include 2. thirdparty/lib/libDorisUdf.a --> ../thirdparty/lib/libDorisUdf.a 3. ${BUILD_DIR}/src/udf_samples --> src/udf_samples 4. ${BUILD_DIR}/src/udf_samples --> src/udf_samples ## Problem summary When I tried to compile a UDF that I wrote by following the official doc, `make` failed with an error. After careful inspection, I found that several paths in the doc may need to be adjusted. ## Checklist(Required) 1. Does it affect the original behavior: NO 2. Has unit tests been added: NO 3. Has document been added or modified: YES 4. Does it need to update dependencies: NO 5. Are there any changes that cannot be rolled back: NO
[GitHub] [doris] yiguolei commented on a diff in pull request #11740: [enhancement](memtracker) Optimize query memory accuracy
yiguolei commented on code in PR #11740: URL: https://github.com/apache/doris/pull/11740#discussion_r944312491 ## be/src/vec/common/pod_array.h: ## @@ -111,15 +112,26 @@ class PODArrayBase : private boost::noncopyable, return byte_size(num_elements) + pad_right + pad_left; } +inline void reset_peak() { +if (c_end - c_end_peak > 1024) { Review Comment: add unlikely here
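For readers unfamiliar with the "add unlikely" request, the following is a minimal sketch of a branch hint on a rarely taken condition. `SKETCH_UNLIKELY` and `reset_peak_sketch` are hypothetical names; the quoted diff shows only the condition of `reset_peak()`, so the body here (return the growth and advance the peak) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Branch hint: on GCC/Clang, __builtin_expect tells the compiler the
// condition is expected to be false; elsewhere the macro is a no-op.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(expr) __builtin_expect(!!(expr), 0)
#else
#define SKETCH_UNLIKELY(expr) (expr)
#endif

// Hypothetical analogue of reset_peak(): offsets stand in for the c_end and
// c_end_peak pointers. Returns the number of newly used bytes to report,
// or 0 when the growth does not exceed the 1024-byte threshold.
inline std::ptrdiff_t reset_peak_sketch(std::ptrdiff_t end, std::ptrdiff_t& end_peak) {
    assert(end >= 0 && end_peak >= 0);
    if (SKETCH_UNLIKELY(end - end_peak > 1024)) {
        std::ptrdiff_t delta = end - end_peak;
        end_peak = end;
        return delta;
    }
    return 0;
}
```

The hint does not change behavior; it only lets the compiler lay out the common path (no peak update) as the fall-through case.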
[GitHub] [doris] javagjChen commented on issue #11706: Getting Started Tasks for New Contributors
javagjChen commented on issue #11706: URL: https://github.com/apache/doris/issues/11706#issuecomment-1212952827 [WeOpen Star]I would like to help
[GitHub] [doris] github-actions[bot] commented on pull request #11739: [bugfix](schema change) when there is a string column with delete predicate, the schema change may core
github-actions[bot] commented on PR #11739: URL: https://github.com/apache/doris/pull/11739#issuecomment-1212964995 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11739: [bugfix](schema change) when there is a string column with delete predicate, the schema change may core
github-actions[bot] commented on PR #11739: URL: https://github.com/apache/doris/pull/11739#issuecomment-1212965041 PR approved by anyone and no changes requested.
[GitHub] [doris] ZHbamboo commented on issue #11706: Getting Started Tasks for New Contributors
ZHbamboo commented on issue #11706: URL: https://github.com/apache/doris/issues/11706#issuecomment-1212976199 [WeOpen Star] I would like to help DOCS & BLOGS Analysis of Doris SQL Principles https://mp.weixin.qq.com/s/v1jI1MxEHPT5czCWd0kRxw
[GitHub] [doris] GoGoWen opened a new pull request, #11742: enhance load from parquet or orc file
GoGoWen opened a new pull request, #11742: URL: https://github.com/apache/doris/pull/11742 # Proposed changes enhance loading from parquet or orc file, when given column not exist in file, set to null default instead of fails with "Invalid Column" ## Problem summary 1, create table: CREATE TABLE `t22` ( `name` bigint(20) NOT NULL COMMENT "", `id` bigint(20) NOT NULL COMMENT "", `id2` bigint(20) NULL COMMENT "", `impressions` bigint(20) SUM NULL DEFAULT "0" COMMENT "用户总展现", `click` double SUM NULL DEFAULT "0" COMMENT "用户总点击", `cost` bigint(20) SUM NULL DEFAULT "0" COMMENT "用户总消费" ) ENGINE=OLAP AGGREGATE KEY(`name`, `id`, `id2`) COMMENT "OLAP" PARTITION BY RANGE(`name`) (PARTITION p201901 VALUES [("1"), ("100"))) DISTRIBUTED BY HASH(`id`) BUCKETS 16 PROPERTIES ( "replication_allocation" = "tag.location.default: 3", "in_memory" = "false", "storage_format" = "V2" ) 2, try to load data in parquet or orc with broker load 2.1 data like below: name,id,click,impressions,cost 1,111,,1,11 1,11,,,1 22,222,,2,2 3,33,333,,3 4,44,444,,4 5,55,555,,5 2.2 broker load without columns like below: LOAD LABEL label1 (DATA INFILE (hdfs://filepath) into table 't22' format as "parquet" with broker broker_name () 3, the result should like below instead of failed with "Invalid Column with ", the column id2 is NULL as it not exist in file. +--+--+--+-+---++ | name | id | id2 | impressions | click | cost | +--+--+--+-+---++ | 22 | 222 | NULL | 2 | | 2 | | 11 | 111 | NULL | 1 | | 11 | |5 | 55 | NULL | | 555 | 5 | |4 | 44 | NULL | | 444 | 4 | |3 | 33 | NULL | | 333 | 3 | |1 | 11 | NULL | | NULL | 1 | +--+--+--+-+---++ Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [Y ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. 
Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
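The behavior proposed in #11742 (a table column that is absent from the parquet or orc file is loaded as NULL instead of failing with "Invalid Column") can be sketched as a simple projection. This is only an illustration under stated assumptions: `FileRow` and `project_row` are hypothetical names, values are plain integers, and the real loader works on columnar batches rather than one row at a time.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical row as read from the file: column name -> value.
using FileRow = std::unordered_map<std::string, int64_t>;

// For each table column, take the value from the file when the file has that
// column; otherwise emit std::nullopt, which stands for SQL NULL here.
std::vector<std::optional<int64_t>> project_row(
        const std::vector<std::string>& table_columns, const FileRow& file_row) {
    assert(!table_columns.empty());
    std::vector<std::optional<int64_t>> out;
    out.reserve(table_columns.size());
    for (const auto& name : table_columns) {
        auto it = file_row.find(name);
        if (it == file_row.end()) {
            out.push_back(std::nullopt);  // column missing in the file -> NULL
        } else {
            out.push_back(it->second);
        }
    }
    return out;
}
```

In the example from the pull request, the file has no `id2` column, so projecting onto (`name`, `id`, `id2`) would yield NULL in the third position.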
[GitHub] [doris] 924060929 commented on a diff in pull request #11731: [feature-wip](unique-key-merge-on-write) add regression case
924060929 commented on code in PR #11731: URL: https://github.com/apache/doris/pull/11731#discussion_r944343559 ## regression-test/suites/primary_key/test_primary_key_simple_case.groovy: ## @@ -0,0 +1,113 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +import org.codehaus.groovy.runtime.IOGroovyMethods + +suite("test_primary_key_simple_case") { +def tableName = "primary_key_simple_case" + +try { Review Comment: you can remove this try-catch block, and use the onFinish action like this: ```groovy def tableName = "primary_key_simple_case" onFinish { try_sql("DROP TABLE IF EXISTS ${tableName}") } sql """create table ...""" ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris-website] hf200012 merged pull request #39: [doc](mysql2doris)Add mysql to doris documentation
hf200012 merged PR #39: URL: https://github.com/apache/doris-website/pull/39
[doris-website] branch master updated: Add mysql to doris documentation (#39)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git The following commit(s) were added to refs/heads/master by this push: new 1307d6371d2 Add mysql to doris documentation (#39) 1307d6371d2 is described below commit 1307d6371d2169bb2cfcf575272734d8b26ef51d Author: caoliang-web <71004656+caoliang-...@users.noreply.github.com> AuthorDate: Fri Aug 12 18:49:01 2022 +0800 Add mysql to doris documentation (#39) Add mysql to doris documentation --- docs/ecosystem/mysql-to-doris.md | 103 .../current/ecosystem/mysql-to-doris.md| 104 + sidebars.json | 1 + 3 files changed, 208 insertions(+) diff --git a/docs/ecosystem/mysql-to-doris.md b/docs/ecosystem/mysql-to-doris.md new file mode 100644 index 000..51605d72d91 --- /dev/null +++ b/docs/ecosystem/mysql-to-doris.md @@ -0,0 +1,103 @@ +--- +{ + +"title": "Mysql to Doris", +"language": "en" + +} +--- + + + +# Mysql to Doris + +mysql to doris is mainly suitable for automating the creation of doris odbc tables, mainly implemented with shell scripts + +## manual + +mysql to doris code [here](https://github.com/apache/doris/tree/master/extension/mysql_to_doris) + +### Directory Structure + +```text +├── mysql_to_doris +│ ├── conf +│ │ ├── doris.conf +│ │ ├── mysql.conf +│ │ └── tables +│ ├── all_tables.sh +│ │ +└── └── user_define_tables.sh +``` + +1. all_tables.sh + + This script mainly reads all the tables under the mysql specified library and automatically creates the Doris odbc external table + +2. user_define_tables.sh + + This script is mainly used for users to customize certain tables under the specified mysql library to automatically create Doris odbc external tables + +3. conf + + Configuration file, `doris.conf` is mainly used to configure doris related, `mysql.conf` is mainly used to configure mysql related, `tables` is mainly used to configure user-defined mysql library tables + +### full + +1. 
Download using mysql to doris [here](https://github.com/apache/doris/tree/master/extension/mysql_to_doris) +2. Configuration related files + + ```shell + #doris.conf + master_host= + master_port= + doris_password= + + #mysql.conf + mysql_host= + mysql_password= + ``` + + | Configuration item | illustrate | + | -- | --- | + | master_host| Doris FE master node IP | + | master_port| Doris FE query_port port | + | doris_password | Doris Password (default root user) | + | mysql_host | Mysql IP | + | mysql_password | Mysql Password (default root user) | + +3. Execute the `all_tables.sh` script + +``` +sh all_tables.sh mysql_db_name doris_db_name +``` +After successful execution, the files directory will be generated, and the directory will contain `tables` (table name) and `tables.sql` (doris odbc table creation statement) + +### custom + +1. Modify the `conf/tables` file to add the name of the odbc table that needs to be created +2. To configure mysql and doris related information, refer to step 2 of full creation +3. 
Execute the `user_define_tables.sh` script + +``` +sh user_define_tables.sh mysql_db_name doris_db_name +``` + +After successful execution, the user_files directory will be generated, and the directory will contain `tables.sql` (doris odbc table creation statement) diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/mysql-to-doris.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/mysql-to-doris.md new file mode 100644 index 000..5b924a1b510 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/mysql-to-doris.md @@ -0,0 +1,104 @@ +--- +{ + +"title": "Mysql to Doris", +"language": "zh-CN" + +} +--- + + + +# Mysql to Doris + +mysql to doris 主要适用于自动化创建doris odbc 表,主要用shell脚本实现 + +## 使用手册 + +mysql to doris 代码[这里](https://github.com/apache/doris/tree/master/extension/mysql_to_doris) + +### 目录结构 + +```text +├── mysql_to_doris +│ ├── conf +│ │ ├── doris.conf +│ │ ├── mysql.conf +│ │ └── tables +│ ├── all_tables.sh +│ │ +└── └── user_define_tables.sh +``` + +1. all_tables.sh + + 这个脚本主要是读取mysql指定库下的所有表,自动创建Doris odbc外表 + +2. user_define_tables.sh + + 这个脚本主要用于用户自定义指定mysql库下某几张表,自动创建Doris odbc外表 + +3. conf + + 配置文件,`doris.conf`主要是配置doris相关的,`mysql.conf`主要配置mysql相关的,`tables`主要用于配置用户自定义mysql库的表 + +### 全量 + +1. 下载使用mysql to doris[这里](https://github.com/apache/doris/tree/master/extension/mysql_to_doris) +2. 配置相关文件 + + ```shell + #doris.conf + master_host= + master_port= + doris_password= + + #mysql.conf + mysql_host= + mysql_password= + ``` + + | 配置项 | 说明| + | -- | ---
[GitHub] [doris] 924060929 commented on a diff in pull request #11731: [feature-wip](unique-key-merge-on-write) add regression case
924060929 commented on code in PR #11731: URL: https://github.com/apache/doris/pull/11731#discussion_r944343559 ## regression-test/suites/primary_key/test_primary_key_simple_case.groovy: ## @@ -0,0 +1,113 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +import org.codehaus.groovy.runtime.IOGroovyMethods + +suite("test_primary_key_simple_case") { +def tableName = "primary_key_simple_case" + +try { Review Comment: you can remove this try-catch block, and use the onFinish action like this: ```groovy def tableName = "primary_key_simple_case" onFinish { try_sql("DROP TABLE IF EXISTS ${tableName}") } sql """create table ...""" ``` example: `doris/regression-test/suites/demo/event_action.groovy` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #11708: [refactor](date) Use uint32 as predicate type for date type
BiteTheDDDDt commented on code in PR #11708: URL: https://github.com/apache/doris/pull/11708#discussion_r944352947 ## be/src/olap/comparison_predicate.h: ## @@ -95,19 +127,39 @@ class ComparisonPredicateBase : public ColumnPredicate { continue; } uint16_t idx = sel[i]; -const T* cell_value = reinterpret_cast(block->cell(idx).cell_ptr()); -auto result = (!block->cell(idx).is_null() && _operator(*cell_value, _value)); -flags[i] = flags[i] & (_opposite ? !result : result); +const T* cell_value = nullptr; +if constexpr (Type == TYPE_DATE) { +T tmp_uint32_value = 0; +memcpy((char*)(&tmp_uint32_value), block->cell(idx).cell_ptr(), + sizeof(uint24_t)); +cell_value = reinterpret_cast(&tmp_uint32_value); +auto result = (!block->cell(idx).is_null() && _operator(*cell_value, _value)); +flags[i] = flags[i] & (_opposite ? !result : result); +} else { +cell_value = reinterpret_cast(block->cell(idx).cell_ptr()); +auto result = (!block->cell(idx).is_null() && _operator(*cell_value, _value)); +flags[i] = flags[i] & (_opposite ? !result : result); +} } } else { for (uint16_t i = 0; i < size; ++i) { if (flags[i]) { continue; } uint16_t idx = sel[i]; -const T* cell_value = reinterpret_cast(block->cell(idx).cell_ptr()); -auto result = _operator(*cell_value, _value); -flags[i] = flags[i] & (_opposite ? !result : result); +const T* cell_value = nullptr; +if constexpr (Type == TYPE_DATE) { +T tmp_uint32_value = 0; +memcpy((char*)(&tmp_uint32_value), block->cell(idx).cell_ptr(), + sizeof(uint24_t)); +cell_value = reinterpret_cast(&tmp_uint32_value); +auto result = _operator(*cell_value, _value); Review Comment: Why not just pass tmp_uint32_value to the operator?
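The reviewer's suggestion can be sketched as follows: copy the 3-byte date cell into a zeroed 32-bit integer and compare that value directly, without taking a pointer to the temporary and dereferencing it again. `load_uint24` and `date_less_than` are hypothetical helpers, not Doris functions, and the sketch assumes a little-endian host, as the `memcpy` into the low bytes in the quoted diff also does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy a 3-byte packed value into the low bytes of a zeroed uint32_t.
// Assumes a little-endian host, so the top byte stays 0.
inline uint32_t load_uint24(const void* cell_ptr) {
    assert(cell_ptr != nullptr);
    uint32_t value = 0;
    std::memcpy(&value, cell_ptr, 3);
    return value;
}

// Hypothetical predicate: the loaded value is passed straight to the
// comparison, with no intermediate pointer to the temporary.
inline bool date_less_than(const void* cell_ptr, uint32_t rhs) {
    return load_uint24(cell_ptr) < rhs;
}
```

Passing the value avoids the `reinterpret_cast` of the temporary's address, which is the extra step the review comment questions.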
[GitHub] [doris] Gabriel39 opened a new pull request, #11743: [feature](compaction) allow to set disable_auto_compaction for tables
Gabriel39 opened a new pull request, #11743: URL: https://github.com/apache/doris/pull/11743 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] xiaokang commented on a diff in pull request #11579: [Feature](NGram BloomFilter Index) add new ngram bloom filter index to speed up like query
xiaokang commented on code in PR #11579: URL: https://github.com/apache/doris/pull/11579#discussion_r944356288 ## fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java: ## @@ -66,6 +93,19 @@ public void analyze() throws AnalysisException { if (columns.size() != distinct.size()) { throw new AnalysisException("columns of index has duplicated."); } +if (arguments != null && !arguments.isEmpty()) { +throw new AnalysisException("bimap index do not need arguments."); +} +} else if (indexType == IndexType.NGRAM_BF) { +if (columns == null || columns.size() != 1) { Review Comment: common check for indexName can be moved to the outer scope. ## fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java: ## @@ -142,6 +217,30 @@ public void checkColumn(Column column, KeysType keysType) throws AnalysisExcepti "BITMAP index only used in columns of DUP_KEYS/UNIQUE_KEYS table or key columns of" + " AGG_KEYS table. invalid column: " + indexColName); } +} else if (indexType == IndexType.NGRAM_BF) { +String indexColName = column.getName(); +PrimitiveType colType = column.getDataType(); +if (colType != PrimitiveType.CHAR && colType != PrimitiveType.VARCHAR) { Review Comment: colType.isStringType() may be better, if String is also supported. 
## docs/zh-CN/docs/data-table/index/ngram-bloomfilter-index.md: ## @@ -0,0 +1,79 @@ +--- +{ +"title": "NGram BloomFilter索引", +"language": "zh-CN" +} +--- + + + +# Doris NGram BloomFilter索引及使用使用场景 + +为了提升like的查询性能,增加了NGram BloomFilter索引,其实现主要参照了ClickHouse的ngrambf。 + +## NGram BloomFilter创建 + +表创建时指定: + +```sql +CREATE TABLE `table3` ( + `siteid` int(11) NULL DEFAULT "10" COMMENT "", + `citycode` smallint(6) NULL COMMENT "", + `username` varchar(32) NULL DEFAULT "" COMMENT "", + INDEX idx_ngrambf (`username`) USING NGRAM_BF (3,256) COMMENT 'username ngram_bf index' +) ENGINE=OLAP +AGGREGATE KEY(`siteid`, `citycode`, `username`) COMMENT "OLAP" +DISTRIBUTED BY HASH(`siteid`) BUCKETS 10 +PROPERTIES ( +"replication_num" = "1" +); + +-- 其中(3,256),分别表示ngram的个数和bloomfilter的字节数。 +``` + +## 查看NGram BloomFilter索引 + +查看我们在表上建立的NGram BloomFilter索引是使用: + +```sql +show index from example_db.table3; +``` + +## 删除NGram BloomFilter索引 + + +```sql +alter table example_db.table3 drop index idx_ngrambf; +``` + +## 修改NGram BloomFilter索引 + +为已有列新增NGram BloomFilter索引: + +```sql +alter table example_db.table3 add index idx_ngrambf(username) using NGRAM_BF(3, 256) comment 'username ngram_bf index' +``` + +## **Doris NGram BloomFilter使用注意事项** + +1. NGram BloomFilter只支持字符串列 +2. NGram BloomFilter索引和BloomFilter索引为互斥关系,即同一个列只能设置两者中的一个 Review Comment: Can we support normal BloomFilter ability in NgramBloomFilter? It may be achived by adding the whole filed value as a token to the bloom filter. 
## fe/fe-core/src/main/cup/sql_parser.cup: ## @@ -518,7 +518,9 @@ nonterminal ColumnDef.DefaultValue opt_default_value; nonterminal Boolean opt_if_exists, opt_if_not_exists; nonterminal Boolean opt_external; nonterminal Boolean opt_force; -nonterminal IndexDef.IndexType opt_index_type; +nonterminal IndexDef.IndexType index_type; Review Comment: can still be opt_index_type if bitmap index is kept as default ## be/src/olap/rowset/segment_v2/column_writer.cpp: ## @@ -296,8 +296,13 @@ Status ScalarColumnWriter::init() { RETURN_IF_ERROR( BitmapIndexWriter::create(get_field()->type_info(), &_bitmap_index_builder)); } + if (_opts.need_bloom_filter) { -RETURN_IF_ERROR(BloomFilterIndexWriter::create( +if (_opts.is_ngram_bf_index) +RETURN_IF_ERROR(BloomFilterIndexWriter::create( Review Comment: using NGramBloomFilterIndexWriterImpl directly may be more simple and intuitive ## docs/zh-CN/docs/data-table/index/ngram-bloomfilter-index.md: ## @@ -0,0 +1,79 @@ +--- +{ +"title": "NGram BloomFilter索引", +"language": "zh-CN" +} +--- + + + +# Doris NGram BloomFilter索引及使用使用场景 + +为了提升like的查询性能,增加了NGram BloomFilter索引,其实现主要参照了ClickHouse的ngrambf。 + +## NGram BloomFilter创建 + +表创建时指定: + +```sql +CREATE TABLE `table3` ( + `siteid` int(11) NULL DEFAULT "10" COMMENT "", + `citycode` smallint(6) NULL COMMENT "", + `username` varchar(32) NULL DEFAULT "" COMMENT "", + INDEX idx_ngrambf (`username`) USING NGRAM_BF (3,256) COMMENT 'username ngram_bf index' +) ENGINE=OLAP +AGGREGATE KEY(`siteid`, `citycode`, `username`) COMMENT "OLAP" +DISTRIBUTED BY HASH(`siteid`) BUCKETS 10 +PROPERTIES ( +"replication_num" = "1" +); + +-- 其中(3,256),分别表示ngram的个数和bloomfilter的字节数。 Review Comment: It's helpful for users to provide some suggestion for how to determin the size according to data distribution and error r
[GitHub] [doris] github-actions[bot] commented on pull request #11708: [refactor](date) Use uint32 as predicate type for date type
github-actions[bot] commented on PR #11708: URL: https://github.com/apache/doris/pull/11708#issuecomment-1213002563 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11708: [refactor](date) Use uint32 as predicate type for date type
github-actions[bot] commented on PR #11708: URL: https://github.com/apache/doris/pull/11708#issuecomment-1213002595 PR approved by anyone and no changes requested.
[GitHub] [doris] morrySnow commented on a diff in pull request #11589: [Feature](nereids)support view and nested view
morrySnow commented on code in PR #11589: URL: https://github.com/apache/doris/pull/11589#discussion_r944365812 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Alias.java: ## @@ -35,7 +35,7 @@ public class Alias extends NamedExpression implements UnaryExpression { private final ExprId exprId; private final String name; -private final List qualifier; +private List qualifier; Review Comment: add back final ``` ImmutableList.copyOf(Objects.requireNonNull(qualifier, "qualifier can not be null")); ``` ## fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java: ## @@ -398,6 +399,12 @@ public PlanFragment visitPhysicalNestedLoopJoin(PhysicalNestedLoopJoin project, PlanTranslatorContext context) { PlanFragment inputFragment = project.child(0).accept(this, context); + +project.getProjects().stream().filter(p -> p instanceof Alias).forEach(p -> { Review Comment: ```suggestion project.getProjects().stream().filter(Alias.class::isInstance).forEach(p -> { ``` ## fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/MergeConsecutiveProjectJob.java: ## @@ -0,0 +1,44 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +package org.apache.doris.nereids.jobs.batch; + +import org.apache.doris.nereids.CascadesContext; +import org.apache.doris.nereids.rules.rewrite.logical.MergeConsecutiveProjects; + +import com.google.common.collect.ImmutableList; + +/** + * Merge consecutive project rules. + */ +public class MergeConsecutiveProjectJob extends BatchRulesJob { + +/** + * Execute the merge consecutive job. + * @param ctx planner context for execute job + */ +public MergeConsecutiveProjectJob(CascadesContext ctx) { +//TODO: eliminate consecutive projects for view +super(ctx); +rulesJob.addAll(ImmutableList.of( +bottomUpBatch(ImmutableList.of( +new MergeConsecutiveProjects() +) +) Review Comment: ```suggestion bottomUpBatch(ImmutableList.of(new MergeConsecutiveProjects())) ``` ## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/BindSlotReference.java: ## @@ -269,38 +269,40 @@ private BoundStar bindQualifiedStar(List qualifierStar, Void context) { private List bindSlot(UnboundSlot unboundSlot, List boundSlots) { return boundSlots.stream().filter(boundSlot -> { List nameParts = unboundSlot.getNameParts(); -switch (nameParts.size()) { -case 1: -// Unbound slot name is `column` -return nameParts.get(0).equalsIgnoreCase(boundSlot.getName()); -case 2: -// Unbound slot name is `table`.`column` -List qualifier = boundSlot.getQualifier(); -String name = boundSlot.getName(); -switch (qualifier.size()) { -case 2: -// qualifier is `db`.`table` -return nameParts.get(0).equalsIgnoreCase(qualifier.get(1)) -&& nameParts.get(1).equalsIgnoreCase(name); -case 1: -// qualifier is `table` -return nameParts.get(0).equalsIgnoreCase(qualifier.get(0)) -&& nameParts.get(1).equalsIgnoreCase(name); -case 0: -// has no qualifiers -return nameParts.get(1).equalsIgnoreCase(name); -default: -throw new AnalysisException("Not supported qualifier: " -+ StringUtils.join(qualifier, ".")); -} -default: -throw new AnalysisException("Not supported name: " -
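The review comments on PR #11589 above ask for two things: keep `qualifier` a `final` field initialised from an immutable, null-checked copy, and filter with `Alias.class::isInstance` instead of a lambda around `instanceof`. A minimal, self-contained Java sketch of both ideas follows. The types `Expr`, `SlotSketch`, and `AliasSketch` and the method `withQualifier` are hypothetical stand-ins, and the JDK's `List.copyOf` (Java 10+) is used in place of Guava's `ImmutableList.copyOf` so the block compiles without extra dependencies.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical stand-in for an expression type. */
interface Expr {}

/** Hypothetical non-alias expression, used only to show the filter. */
final class SlotSketch implements Expr {}

/** Hypothetical stand-in for Alias with an immutable qualifier. */
final class AliasSketch implements Expr {
    private final String name;
    private final List<String> qualifier; // stays final, as the review requests

    AliasSketch(String name, List<String> qualifier) {
        this.name = Objects.requireNonNull(name, "name can not be null");
        // List.copyOf stands in for Guava's ImmutableList.copyOf in this sketch.
        this.qualifier = List.copyOf(
                Objects.requireNonNull(qualifier, "qualifier can not be null"));
    }

    /** Instead of mutating the field, build a new object with the new qualifier. */
    AliasSketch withQualifier(List<String> newQualifier) {
        return new AliasSketch(name, newQualifier);
    }

    List<String> getQualifier() {
        return qualifier;
    }

    /** Select aliases with method references rather than an instanceof lambda. */
    static List<AliasSketch> aliasesOf(List<Expr> projects) {
        return projects.stream()
                .filter(AliasSketch.class::isInstance)
                .map(AliasSketch.class::cast)
                .collect(Collectors.toList());
    }
}
```

Returning a copy from `withQualifier` is one possible way to keep the field `final` while still letting a caller attach a different qualifier; the PR may have resolved the comment differently.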
[doris] branch master updated: [bugfix](schema change) when there is a string column with delete predicate, the schema change may core (#11739)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 408dbf840b [bugfix](schema change) when there is a string column with delete predicate, the schema change may core (#11739) 408dbf840b is described below commit 408dbf840bb67c6284f509b6c417b69324f23aaa Author: yiguolei <676222...@qq.com> AuthorDate: Fri Aug 12 19:29:22 2022 +0800 [bugfix](schema change) when there is a string column with delete predicate, the schema change may core (#11739) * [bugfix](schema change) when there is a string column with delete predicate, the schema change may core Co-authored-by: yiguolei --- be/src/olap/schema_change.cpp | 2 +- .../test_schema_change_with_delete.out | 19 ++ .../test_schema_change_with_delete.groovy | 68 ++ 3 files changed, 88 insertions(+), 1 deletion(-) diff --git a/be/src/olap/schema_change.cpp b/be/src/olap/schema_change.cpp index 28272d7572..0bccf7ea2d 100644 --- a/be/src/olap/schema_change.cpp +++ b/be/src/olap/schema_change.cpp @@ -1738,6 +1738,7 @@ Status SchemaChangeHandler::_do_process_alter_tablet_v2(const TAlterTabletReqV2& } std::vector versions_to_be_changed; +vectorized::BlockReader reader; std::vector rs_readers; // delete handlers for new tablet DeleteHandler delete_handler; @@ -1849,7 +1850,6 @@ Status SchemaChangeHandler::_do_process_alter_tablet_v2(const TAlterTabletReqV2& break; } -vectorized::BlockReader reader; TabletReader::ReaderParams reader_params; reader_params.tablet = base_tablet; reader_params.reader_type = READER_ALTER_TABLE; diff --git a/regression-test/data/schema_change/test_schema_change_with_delete.out b/regression-test/data/schema_change/test_schema_change_with_delete.out new file mode 100644 index 00..e74aa7504b --- /dev/null +++ b/regression-test/data/schema_change/test_schema_change_with_delete.out @@ -0,0 +1,19 @@ +-- This file is 
automatically generated. You should know what you did if you want to edit this +-- !sql -- +2 2 2 bbb +3 3 3 ccc + +-- !sql -- +2 2 2 bbb +3 3 3 ccc + +-- !sql -- +2 2 2 bbb +3 3 3 ccc +4 4 efg ddd + +-- !sql -- +2 2 2 bbb +3 3 3 ccc +4 4 efg ddd + diff --git a/regression-test/suites/schema_change/test_schema_change_with_delete.groovy b/regression-test/suites/schema_change/test_schema_change_with_delete.groovy new file mode 100644 index 00..607f991f46 --- /dev/null +++ b/regression-test/suites/schema_change/test_schema_change_with_delete.groovy @@ -0,0 +1,68 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +// Test schema change for a table, the table has a delete predicate on string column +suite("test_schema_change_with_delete") { + +def tbName = "test_schema_change_with_delete" +def getJobState = { tableName -> + def jobStateResult = sql """ SHOW ALTER TABLE COLUMN WHERE IndexName='${tableName}' ORDER BY createtime DESC LIMIT 1 """ + return jobStateResult[0][9] + } + + sql """ DROP TABLE IF EXISTS ${tbName} FORCE""" + // Create table and disable light weight schema change + sql """ +CREATE TABLE IF NOT EXISTS ${tbName} +( +event_day int, +siteid INT , +citycode int, +username VARCHAR(32) DEFAULT '' +) +DUPLICATE KEY(event_day,siteid) +DISTRIBUTED BY HASH(event_day) BUCKETS 1 +PROPERTIES("replication_num" = "1", "light_schema_change" = "true"); + """ + sql """ insert into ${tbName} values(1, 1, 1, 'aaa');""" + sql """ insert into ${tbName} values(2, 2, 2, 'bbb');""" + sql """ delete from ${tbName} where username='aaa';""" + sql """ insert into ${tbName} values(3, 3, 3, 'ccc');""" + +
[GitHub] [doris] yiguolei merged pull request #11739: [bugfix](schema change) when there is a string column with delete predicate, the schema change may core
yiguolei merged PR #11739: URL: https://github.com/apache/doris/pull/11739
[GitHub] [doris] yiguolei merged pull request #11729: [Bug](date function) Fix bug for date format %T
yiguolei merged PR #11729: URL: https://github.com/apache/doris/pull/11729
[doris] branch master updated: [Bug](date function) Fix bug for date format %T (#11729)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new abd2eb4fa1 [Bug](date function) Fix bug for date format %T (#11729) abd2eb4fa1 is described below commit abd2eb4fa148a3c88d9a78b29a2289d161c73b67 Author: Gabriel AuthorDate: Fri Aug 12 19:29:58 2022 +0800 [Bug](date function) Fix bug for date format %T (#11729) * [Bug](date function) Fix bug for date format %T --- be/src/vec/runtime/vdatetime_value.cpp | 99 ++ be/src/vec/runtime/vdatetime_value.h | 20 +++-- .../datetime_functions/test_date_function.out | 3 + .../datetime_functions/test_date_function.groovy | 1 + 4 files changed, 81 insertions(+), 42 deletions(-) diff --git a/be/src/vec/runtime/vdatetime_value.cpp b/be/src/vec/runtime/vdatetime_value.cpp index 315851c454..04dd72b8d4 100644 --- a/be/src/vec/runtime/vdatetime_value.cpp +++ b/be/src/vec/runtime/vdatetime_value.cpp @@ -48,7 +48,11 @@ uint8_t mysql_week_mode(uint32_t mode) { bool VecDateTimeValue::check_range(uint32_t year, uint32_t month, uint32_t day, uint32_t hour, uint32_t minute, uint32_t second, uint16_t type) { bool time = hour > (type == TIME_TIME ? 
TIME_MAX_HOUR : 23) || minute > 59 || second > 59; -return time || check_date(year, month, day); +if (type == TIME_TIME) { +return time; +} else { +return time || check_date(year, month, day); +} } bool VecDateTimeValue::check_date(uint32_t year, uint32_t month, uint32_t day) { @@ -1316,22 +1320,32 @@ bool VecDateTimeValue::from_date_format_str(const char* format, int format_len, val = tmp; date_part_used = true; break; -case 'r': -if (!from_date_format_str("%I:%i:%S %p", 11, val, val_end - val, &tmp)) { +case 'r': { +VecDateTimeValue tmp_val; +if (!tmp_val.from_date_format_str("%I:%i:%S %p", 11, val, val_end - val, &tmp)) { return false; } +this->_hour = tmp_val._hour; +this->_minute = tmp_val._minute; +this->_second = tmp_val._second; val = tmp; time_part_used = true; already_set_time_part = true; break; -case 'T': -if (!from_date_format_str("%H:%i:%S", 8, val, val_end - val, &tmp)) { +} +case 'T': { +VecDateTimeValue tmp_val; +if (!tmp_val.from_date_format_str("%H:%i:%S", 8, val, val_end - val, &tmp)) { return false; } +this->_hour = tmp_val._hour; +this->_minute = tmp_val._minute; +this->_second = tmp_val._second; time_part_used = true; already_set_time_part = true; val = tmp; break; +} case '.': while (val < val_end && ispunct(*val)) { val++; @@ -1679,13 +1693,19 @@ std::size_t hash_value(VecDateTimeValue const& value) { template bool DateV2Value::is_invalid(uint32_t year, uint32_t month, uint32_t day, uint8_t hour, -uint8_t minute, uint8_t second, uint32_t microsecond) { +uint8_t minute, uint8_t second, uint32_t microsecond, +bool only_time_part) { if (hour > 24 || minute >= 60 || second >= 60 || microsecond > 99) { return true; } +if (only_time_part) { +return false; +} +if (year < MIN_YEAR || year > MAX_YEAR) { +return true; +} if (month == 2 && day == 29 && doris::is_leap(year)) return false; -if (year < MIN_YEAR || year > MAX_YEAR || month == 0 || month > 12 || -day > s_days_in_month[month] || day == 0) { +if (month == 0 || month > 12 || day > 
s_days_in_month[month] || day == 0) { return true; } return false; @@ -2061,22 +2081,41 @@ bool DateV2Value::from_date_format_str(const char* format, int format_len, co val = tmp; date_part_used = true; break; -case 'r': -if (!from_date_format_str("%I:%i:%S %p", 11, val, val_end - val, &tmp)) { +case 'r': { +if constexpr (is_datetime) { +DateV2Value tmp_val; +if (!tmp_val.from_date_format_str("%I:%i:%S %p", 11, val, val_end - val, + &tmp)) { +return false; +} +this->date_v2_value_.hour_ = tmp_val.hour(); +
[GitHub] [doris] yiguolei commented on a diff in pull request #11740: [enhancement](memtracker) Optimize query memory accuracy
yiguolei commented on code in PR #11740: URL: https://github.com/apache/doris/pull/11740#discussion_r944376360 ## be/src/runtime/memory/chunk_allocator.cpp: ## @@ -147,7 +147,7 @@ ChunkAllocator::ChunkAllocator(size_t reserve_limit) } Status ChunkAllocator::allocate(size_t size, Chunk* chunk) { -DCHECK(BitUtil::RoundUpToPowerOfTwo(size) == size); +DCHECK((size & (size - 1)) == 0); Review Comment: Use CHECK here, not DCHECK.
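The diff in the comment above swaps a round-up comparison for the bit test `(size & (size - 1)) == 0`. A small Java sketch of both forms is given below for comparison; the class and method names are hypothetical, and `roundUpToPowerOfTwo` is only an assumed analogue of `BitUtil::RoundUpToPowerOfTwo`, not a port of it. One detail worth noting is that the bare bit test also accepts zero, so a stricter variant is included.

```java
/** Hypothetical helpers illustrating the power-of-two checks discussed in the review. */
final class PowerOfTwoSketch {
    /** Bit-trick form from the diff: clearing the lowest set bit must leave nothing. */
    static boolean lowBitTrick(long size) {
        return (size & (size - 1)) == 0;
    }

    /** Stricter variant that also rejects zero and negative sizes. */
    static boolean isPowerOfTwo(long size) {
        return size > 0 && (size & (size - 1)) == 0;
    }

    /** Smallest power of two that is >= size; intended for 1 <= size <= 2^62. */
    static long roundUpToPowerOfTwo(long size) {
        long high = Long.highestOneBit(size);
        return high == size ? size : high << 1;
    }
}
```

For sizes in the intended range, `roundUpToPowerOfTwo(size) == size` and `isPowerOfTwo(size)` agree; the bit test simply avoids computing the rounded value.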
[GitHub] [doris] github-actions[bot] commented on pull request #11732: [improvement](doc)Import data example from hive partition table
github-actions[bot] commented on PR #11732: URL: https://github.com/apache/doris/pull/11732#issuecomment-1213019463 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11732: [improvement](doc)Import data example from hive partition table
github-actions[bot] commented on PR #11732: URL: https://github.com/apache/doris/pull/11732#issuecomment-1213019497 PR approved by anyone and no changes requested.
[GitHub] [doris] hf200012 merged pull request #11732: [improvement](doc)Import data example from hive partition table
hf200012 merged PR #11732: URL: https://github.com/apache/doris/pull/11732
[doris] branch master updated: [improvement](doc)Import data example from hive partition table (#11732)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 2827ced1f6 [improvement](doc)Import data example from hive partition table (#11732) 2827ced1f6 is described below commit 2827ced1f6a3da76c4e47ff3a20eb71360732aba Author: jiafeng.zhang AuthorDate: Fri Aug 12 19:38:45 2022 +0800 [improvement](doc)Import data example from hive partition table (#11732) Import data example from hive partition table --- .../import/import-way/spark-load-manual.md | 70 ++ .../import/import-way/spark-load-manual.md | 70 ++ 2 files changed, 140 insertions(+) diff --git a/docs/en/docs/data-operate/import/import-way/spark-load-manual.md b/docs/en/docs/data-operate/import/import-way/spark-load-manual.md index 49dfc5628b..1d1d47e78f 100644 --- a/docs/en/docs/data-operate/import/import-way/spark-load-manual.md +++ b/docs/en/docs/data-operate/import/import-way/spark-load-manual.md @@ -483,6 +483,75 @@ PROPERTIES ``` +Example 4: Import data from hive partitioned table + +```sql +-- hive create table statement +create table test_partition( +id int, +name string, +age int +) +partitioned by (dt string) +row format delimited fields terminated by ',' +stored as textfile; + +-- doris create table statement +CREATE TABLE IF NOT EXISTS test_partition_04 +( +dt date, +id int, +name string, +age int +) +UNIQUE KEY(`dt`, `id`) +DISTRIBUTED BY HASH(`id`) BUCKETS 1 +PROPERTIES ( +"replication_allocation" = "tag.location.default: 1" +); +-- spark load +CREATE EXTERNAL RESOURCE "spark_resource" +PROPERTIES +( +"type" = "spark", +"spark.master" = "yarn", +"spark.submit.deployMode" = "cluster", +"spark.executor.memory" = "1g", +"spark.yarn.queue" = "default", +"spark.hadoop.yarn.resourcemanager.address" = "localhost:50056", +"spark.hadoop.fs.defaultFS" = "hdfs://localhost:9000", +"working_dir" = 
"hdfs://localhost:9000/tmp/doris", +"broker" = "broker_01" +); +LOAD LABEL demo.test_hive_partition_table_18 +( +DATA INFILE("hdfs://localhost:9000/user/hive/warehouse/demo.db/test/dt=2022-08-01/*") +INTO TABLE test_partition_04 +COLUMNS TERMINATED BY "," +FORMAT AS "csv" +(id,name,age) +COLUMNS FROM PATH AS (`dt`) +SET +( +dt=dt, +id=id, +name=name, +age=age +) +) +WITH RESOURCE 'spark_resource' +( +"spark.executor.memory" = "1g", +"spark.shuffle.compress" = "true" +) +PROPERTIES +( +"timeout" = "3600" +); + + + + You can view the details syntax about creating load by input `help spark load`. This paper mainly introduces the parameter meaning and precautions in the creation and load syntax of spark load. **Label** @@ -647,6 +716,7 @@ The most suitable scenario to use spark load is that the raw data is in the file ## FAQ +* Spark load does not yet support the import of Doris table fields that are of type String. If your table fields are of type String, please change them to type varchar, otherwise the import will fail, prompting `type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel` * When using spark load, the `HADOOP_CONF_DIR` environment variable is no set in the `spark-env.sh`. If the `HADOOP_CONF_DIR` environment variable is not set, the error `When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment` will be reported. 
diff --git a/docs/zh-CN/docs/data-operate/import/import-way/spark-load-manual.md b/docs/zh-CN/docs/data-operate/import/import-way/spark-load-manual.md index d8bc642296..c33fbf96fa 100644 --- a/docs/zh-CN/docs/data-operate/import/import-way/spark-load-manual.md +++ b/docs/zh-CN/docs/data-operate/import/import-way/spark-load-manual.md @@ -449,6 +449,75 @@ PROPERTIES ); ``` +Example 4: Import data from a hive partitioned table + +```sql +--hive create table statement +create table test_partition( + id int, + name string, + age int +) +partitioned by (dt string) +row format delimited fields terminated by ',' +stored as textfile; + +--doris create table statement +CREATE TABLE IF NOT EXISTS test_partition_04 +( + dt date, + id int, + name string, + age int +) +UNIQUE KEY(`dt`, `id`) +DISTRIBUTED BY HASH(`id`) BUCKETS 1 +PROPERTIES ( + "replication_allocation" = "tag.location.default: 1" +); +--spark load statement +CREATE EXTERNAL RESOURCE "spark_resource" +PROPERTIES +( +"type" = "spark", +"spark.master" = "yarn", +"spark.submit.deployMode" = "cluster", +"spark.executor.memory" = "1g", +"spark.yarn.queue" = "default", +"spark.hadoop.yarn.resourcemanager.address" = "localhost:50056", +"spark.hadoop.fs.defaultFS" = "hdfs://localhost:9000", +"working_dir" = "hdfs://localhost:9000/tmp/doris", +"broker" = "broker_01" +); +LOAD LABEL demo.test_hive_partition_table_18 +( +DATA INFILE("hdfs://localhost:9000/user/hive/warehou
[doris] branch dev-1.1.2 updated: [improvement](profile) add json profile and add session context (#11279)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch dev-1.1.2 in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/dev-1.1.2 by this push: new 8e1018d1c8 [improvement](profile) add json profile and add session context (#11279) 8e1018d1c8 is described below commit 8e1018d1c8f0b646d8526f4b525a4078ac32cebc Author: miswujian <38979663+miswuj...@users.noreply.github.com> AuthorDate: Thu Jul 28 15:48:00 2022 +0800 [improvement](profile) add json profile and add session context (#11279) 1. Add a new session variable "session_context" 2. Support exporting the profile in JSON format --- .../org/apache/doris/analysis/SchemaTableType.java | 1 + .../org/apache/doris/catalog/PrimitiveType.java| 3 +- .../doris/common/profile/ProfileTreeNode.java | 1 - .../doris/common/profile/ProfileTreePrinter.java | 8 +- .../apache/doris/common/util/ProfileManager.java | 17 ++- .../apache/doris/common/util/RuntimeProfile.java | 2 +- .../apache/doris/httpv2/rest/MetaInfoAction.java | 15 +-- .../httpv2/rest/manager/QueryProfileAction.java| 148 + .../java/org/apache/doris/mysql/MysqlCommand.java | 1 + .../java/org/apache/doris/qe/SessionVariable.java | 42 +- .../java/org/apache/doris/qe/StmtExecutor.java | 1 + .../doris/common/util/RuntimeProfileTest.java | 4 +- 12 files changed, 164 insertions(+), 79 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SchemaTableType.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SchemaTableType.java index ff3e29d6f6..32fa75c7a8 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SchemaTableType.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SchemaTableType.java @@ -68,6 +68,7 @@ public enum SchemaTableType { SCH_INVALID("NULL", "NULL", TSchemaTableType.SCH_INVALID); private static final String dbName = "INFORMATION_SCHEMA"; private static SelectList fullSelectLists; + static { fullSelectLists = new 
SelectList(); fullSelectLists.addItem(SelectListItem.createStarItem(null)); diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java index cdc0f56980..b61e573f9e 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java +++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/PrimitiveType.java @@ -70,6 +70,7 @@ public enum PrimitiveType { private static final int DECIMAL_INDEX_LEN = 12; private static ImmutableSetMultimap implicitCastMap; + static { ImmutableSetMultimap.Builder builder = ImmutableSetMultimap.builder(); // Nulltype @@ -746,7 +747,7 @@ public enum PrimitiveType { case DATETIME: { if (isTimeType) { return MysqlColType.MYSQL_TYPE_TIME; -} else { +} else { return MysqlColType.MYSQL_TYPE_DATETIME; } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreeNode.java b/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreeNode.java index 10d71a58bc..9d77bca46f 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreeNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreeNode.java @@ -137,7 +137,6 @@ public class ProfileTreeNode extends TreeNode { return sb.toString(); } - public JSONObject debugStringInJson(ProfileTreePrinter.PrintLevel level, String nodeLevel) { JSONObject jsonObject = new JSONObject(); jsonObject.put("id", nodeLevel); diff --git a/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreePrinter.java b/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreePrinter.java index 0028d080ed..c09a764d73 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreePrinter.java +++ b/fe/fe-core/src/main/java/org/apache/doris/common/profile/ProfileTreePrinter.java @@ -20,16 +20,14 @@ package org.apache.doris.common.profile; import hu.webarticum.treeprinter.BorderTreeNodeDecorator; import 
hu.webarticum.treeprinter.SimpleTreeNode; import hu.webarticum.treeprinter.TraditionalTreePrinter; - import org.apache.commons.lang3.StringUtils; import org.json.simple.JSONArray; import org.json.simple.JSONObject; public class ProfileTreePrinter { -public static enum PrintLevel { -FRAGMENT, -INSTANCE +public enum PrintLevel { +FRAGMENT, INSTANCE } // Fragment tree only print the entire query plan tree with node name @@ -57,7 +55,6 @@ public class ProfileTreePrinter { return node; } - public static JSONObject printFragmentTreeInJson(ProfileTreeNo
[GitHub] [doris] morningman merged pull request #11580: [fix](storage-policy) fix bug that missing field when refreshing storage policy
morningman merged PR #11580: URL: https://github.com/apache/doris/pull/11580
[doris] branch master updated: [fix](storage-policy) fix bug that missing field when refreshing storage policy (#11580)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 854b4b1b47 [fix](storage-policy) fix bug that missing field when refreshing storage policy (#11580) 854b4b1b47 is described below commit 854b4b1b4768f6a60b749b2a16a40caf1b82b127 Author: Mingyu Chen AuthorDate: Fri Aug 12 20:11:54 2022 +0800 [fix](storage-policy) fix bug that missing field when refreshing storage policy (#11580) 1. Change all required fields to optional. Although they are all "required", using `required` is not recommended because it is hard to change in the future. 2. Fix a missing field bug --- .../main/java/org/apache/doris/policy/StoragePolicy.java | 1 - .../java/org/apache/doris/service/FrontendServiceImpl.java | 2 +- gensrc/thrift/AgentService.thrift | 14 +++--- 3 files changed, 8 insertions(+), 9 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/policy/StoragePolicy.java b/fe/fe-core/src/main/java/org/apache/doris/policy/StoragePolicy.java index 89f899ef2b..1200745c52 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/policy/StoragePolicy.java +++ b/fe/fe-core/src/main/java/org/apache/doris/policy/StoragePolicy.java @@ -373,7 +373,6 @@ public class StoragePolicy extends Policy { storageResource = alterStorageResource; } - md5Checksum = calcPropertiesMd5(); notifyUpdate(); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java b/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java index 4625cc2b7d..c66e2279e0 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java +++ b/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java @@ -1046,7 +1046,7 @@ public class FrontendServiceImpl implements FrontendService.Iface { result.addToResultEntrys(rEntry); } ); -if (policyList.size() == 
0) { +if (!result.isSetResultEntrys()) { result.setResultEntrys(new ArrayList<>()); } diff --git a/gensrc/thrift/AgentService.thrift b/gensrc/thrift/AgentService.thrift index 913b8aad96..156b3e70a7 100644 --- a/gensrc/thrift/AgentService.thrift +++ b/gensrc/thrift/AgentService.thrift @@ -67,16 +67,16 @@ struct TS3StorageParam { } struct TGetStoragePolicy { -1: required string policy_name -2: required i64 cooldown_datetime -3: required i64 cooldown_ttl -4: required TS3StorageParam s3_storage_param -5: required string md5_checksum +1: optional string policy_name +2: optional i64 cooldown_datetime +3: optional i64 cooldown_ttl +4: optional TS3StorageParam s3_storage_param +5: optional string md5_checksum } struct TGetStoragePolicyResult { -1: required Status.TStatus status -2: required list result_entrys +1: optional Status.TStatus status +2: optional list result_entrys } enum TCompressionType {
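The `FrontendServiceImpl` change above replaces a guard on the input list (`policyList.size() == 0`) with a guard on the result field itself (`!result.isSetResultEntrys()`). The quoted diff does not show the exact failure path, so the Java sketch below only illustrates one plausible reason the second form is more robust: if the loop can skip entries, a non-empty input may still leave the result list unset. Everything in it is hypothetical, including `ResultSketch`, `fill`, the `checkResultField` switch, and the `s3_` prefix filter; the `entries == null` convention merely imitates an unset Thrift field.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a generated Thrift result with an optional list field. */
final class ResultSketch {
    private List<String> entries; // null imitates "field not set"

    boolean isSetEntries() {
        return entries != null;
    }

    void addToEntries(String entry) {
        if (entries == null) {
            entries = new ArrayList<>();
        }
        entries.add(entry);
    }

    void setEntries(List<String> newEntries) {
        entries = newEntries;
    }

    List<String> getEntries() {
        return entries;
    }

    /**
     * Builds a result from the given policy names.
     * checkResultField = false mimics the old guard (tests the input list);
     * checkResultField = true mimics the new guard (tests the result field).
     */
    static ResultSketch fill(List<String> policies, boolean checkResultField) {
        ResultSketch result = new ResultSketch();
        for (String policy : policies) {
            if (policy.startsWith("s3_")) { // hypothetical filter that may skip entries
                result.addToEntries(policy);
            }
        }
        boolean needsDefault = checkResultField ? !result.isSetEntries() : policies.isEmpty();
        if (needsDefault) {
            result.setEntries(new ArrayList<>());
        }
        return result;
    }
}
```

With a non-empty input in which every entry is skipped, the old-style guard leaves the list unset while the new-style guard initialises it to an empty list.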
[GitHub] [doris] morningman closed issue #11581: [Bug] fail to refresh storge policy. host=172.19.0.11, port=9222, code=OK, reason=No more data to read.
morningman closed issue #11581: [Bug] fail to refresh storge policy. host=172.19.0.11, port=9222, code=OK, reason=No more data to read. URL: https://github.com/apache/doris/issues/11581
[GitHub] [doris] adonis0147 opened a new pull request, #11744: [chore](workflow) Add shellcheck to check shell scripts
adonis0147 opened a new pull request, #11744: URL: https://github.com/apache/doris/pull/11744 # Proposed changes Add a workflow (powered by **ShellCheck**) to check our shell scripts. ## Problem summary There are some pitfalls in shell scripting. Therefore we need a checker to analyze our shell scripts. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] starocean999 opened a new pull request, #11745: [FIX](function)fix max_by function bug
starocean999 opened a new pull request, #11745: URL: https://github.com/apache/doris/pull/11745 # Proposed changes Issue Number: close #xxx ## Problem summary This PR does the same thing as https://github.com/apache/doris/pull/10650. Because the code base is so different, it is easier to make the changes based on dev-1.1.2 than to cherry-pick. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No
[GitHub] [doris] ByteYue opened a new pull request, #11746: (BugFix)[FE](eliminate duplicate query id in fe.audit.log)
ByteYue opened a new pull request, #11746: URL: https://github.com/apache/doris/pull/11746 # Proposed changes Issue Number: close #xxx ## Problem summary The original logic in ConnectProcessor.java might result in duplicate query ids for different query statements in fe.audit.log. This PR solves this by assigning a new query id for a wrong query. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] No 2. Has unit tests been added: - [ ] No 3. Has document been added or modified: - [ ] No 4. Does it need to update dependencies: - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] No
[GitHub] [doris] bin41215 opened a new pull request, #11747: [minor](be) Identify physical memory and virtual memory usage separately.
bin41215 opened a new pull request, #11747: URL: https://github.com/apache/doris/pull/11747 # Proposed changes In the check_sys_mem_info method of mem_tracker_limiter.h, turn virtual memory to physical memory. ## Problem summary The information about memory is confusing. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No
[GitHub] [doris] Lchangliang opened a new pull request, #11748: [Bugfix](schema change) fix memory exceeded when schema change
Lchangliang opened a new pull request, #11748: URL: https://github.com/apache/doris/pull/11748 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No
[GitHub] [doris] hf200012 opened a new pull request, #11749: [doc](typo)Add doc sidebars
hf200012 opened a new pull request, #11749: URL: https://github.com/apache/doris/pull/11749 Add the docs sidebars.json file, which holds the complete structure of the documentation, so that only one sidebars.json file needs to be maintained for the Chinese and English documents. This prepares for publishing the master documentation on the official website and building it every day. # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No
[GitHub] [doris] bin41215 commented on pull request #11658: [minor](be)turn virtual memory to physical memory
bin41215 commented on PR #11658: URL: https://github.com/apache/doris/pull/11658#issuecomment-1213115912 > LGTM, run `sh build-support/clang-format.sh`, and resubmit @bin41215 @xinyiZzz Sorry, my repo project was deleted by mistake, so I reapplied a pr again. Please review this : https://github.com/apache/doris/pull/11747
[GitHub] [doris] cambyzju commented on pull request #11406: [feature-wip](array-type) add the array_join function
cambyzju commented on PR #11406: URL: https://github.com/apache/doris/pull/11406#issuecomment-1213169098 LGTM
[GitHub] [doris] cambyzju commented on pull request #11703: [enhancement](array-type) support export files in 'select into outfile'
cambyzju commented on PR #11703: URL: https://github.com/apache/doris/pull/11703#issuecomment-1213173556 LGTM
[GitHub] [doris] cambyzju commented on pull request #11585: [fix](array-type) Fix incorrect in function-set for array type
cambyzju commented on PR #11585: URL: https://github.com/apache/doris/pull/11585#issuecomment-1213174782 LGTM
[GitHub] [doris] cambyzju commented on pull request #11649: [Fix] fix `cast(array as array<>)` causes be core dump
cambyzju commented on PR #11649: URL: https://github.com/apache/doris/pull/11649#issuecomment-1213178092 LGTM
[GitHub] [doris] cambyzju commented on pull request #11602: [fix](array-type) disable cast function to array type on origin exec engine.
cambyzju commented on PR #11602: URL: https://github.com/apache/doris/pull/11602#issuecomment-1213180103 LGTM
[GitHub] [doris] Chovyyyyyy commented on issue #11706: Getting Started Tasks for New Contributors
Chovyy commented on issue #11706: URL: https://github.com/apache/doris/issues/11706#issuecomment-1213223958 [WeOpen Star] I would like to help IMPROVEMENT Recover DDL https://github.com/apache/doris/issues/8421
[GitHub] [doris] xinyiZzz commented on a diff in pull request #11740: [enhancement](memtracker) Optimize query memory accuracy
xinyiZzz commented on code in PR #11740: URL: https://github.com/apache/doris/pull/11740#discussion_r944670632 ## be/src/runtime/memory/chunk_allocator.cpp: ## @@ -147,7 +147,7 @@ ChunkAllocator::ChunkAllocator(size_t reserve_limit) } Status ChunkAllocator::allocate(size_t size, Chunk* chunk) { -DCHECK(BitUtil::RoundUpToPowerOfTwo(size) == size); +DCHECK((size & (size - 1)) == 0); Review Comment: done
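The two forms of the power-of-two check in the chunk_allocator.cpp hunk can be compared with a short sketch. It is written in Java rather than C++ and uses a conventional bit-smearing round-up as a stand-in; `PowerOfTwoCheck` and both method names are hypothetical, and the round-up is not claimed to match Doris's `BitUtil::RoundUpToPowerOfTwo` implementation.

```java
// Hypothetical helper comparing the two checks from the review hunk.
final class PowerOfTwoCheck {
    // New form: clearing the lowest set bit leaves zero when at most one bit is set.
    // Note that this also holds for size == 0.
    static boolean bitTrick(long size) {
        return (size & (size - 1)) == 0;
    }

    // Stand-in round-up (assumed, not Doris's BitUtil): smear the highest set bit
    // of (v - 1) into all lower positions, then add one.
    static long roundUpToPowerOfTwo(long v) {
        v--;
        v |= v >>> 1;
        v |= v >>> 2;
        v |= v >>> 4;
        v |= v >>> 8;
        v |= v >>> 16;
        v |= v >>> 32;
        return v + 1;
    }

    // Old form: a value is a power of two when rounding up does not change it.
    static boolean roundUpCheck(long size) {
        return roundUpToPowerOfTwo(size) == size;
    }
}
```

With this stand-in the two checks agree for small non-negative sizes, and the bit trick avoids the chain of shifts.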
[GitHub] [doris] xinyiZzz commented on a diff in pull request #11740: [enhancement](memtracker) Optimize query memory accuracy
xinyiZzz commented on code in PR #11740: URL: https://github.com/apache/doris/pull/11740#discussion_r944676056 ## be/src/vec/common/pod_array.h: ## @@ -111,15 +112,26 @@ class PODArrayBase : private boost::noncopyable, return byte_size(num_elements) + pad_right + pad_left; } +inline void reset_peak() { +if (c_end - c_end_peak > 1024) { Review Comment: done
[GitHub] [doris] xinyiZzz closed pull request #11658: [minor](be)turn virtual memory to physical memory
xinyiZzz closed pull request #11658: [minor](be)turn virtual memory to physical memory URL: https://github.com/apache/doris/pull/11658
[GitHub] [doris] xinyiZzz commented on pull request #11658: [minor](be)turn virtual memory to physical memory
xinyiZzz commented on PR #11658: URL: https://github.com/apache/doris/pull/11658#issuecomment-1213370190 > > LGTM, run `sh build-support/clang-format.sh`, and resubmit @bin41215 > > @xinyiZzz Sorry, my repo project was deleted by mistake, so I reapplied a pr again. Please review this : #11747 OK
[GitHub] [doris] xinyiZzz commented on pull request #11747: [minor](be) Identify physical memory and virtual memory usage separately.
xinyiZzz commented on PR #11747: URL: https://github.com/apache/doris/pull/11747#issuecomment-1213369943 LGTM, same #11658
[GitHub] [doris] xinyiZzz opened a new issue, #11750: [Enhancement] Too much cache leads to less memory available for queries
xinyiZzz opened a new issue, #11750: URL: https://github.com/apache/doris/issues/11750 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description The main cache, PageCache and ChunkAllocator ### Solution _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] xinyiZzz opened a new pull request, #11751: [enhancement](memory) Fix too much cache leads to less memory available for queries
xinyiZzz opened a new pull request, #11751: URL: https://github.com/apache/doris/pull/11751 # Proposed changes Issue Number: close #11750 ## Problem summary Disable the Chunk Allocator in the vectorized Allocator; this reduces the amount of memory held in caches. For highly concurrent queries, using the Chunk Allocator with the vectorized Allocator can reduce the impact of the gperftools tcmalloc central lock. Jemalloc and Google tcmalloc have a core cache, so the Chunk Allocator may no longer be needed after gperftools tcmalloc is replaced. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No
[GitHub] [doris] hqx871 commented on a diff in pull request #11579: [Feature](NGram BloomFilter Index) add new ngram bloom filter index to speed up like query
hqx871 commented on code in PR #11579: URL: https://github.com/apache/doris/pull/11579#discussion_r944998861 ## fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java: ## @@ -142,6 +217,30 @@ public void checkColumn(Column column, KeysType keysType) throws AnalysisExcepti "BITMAP index only used in columns of DUP_KEYS/UNIQUE_KEYS table or key columns of" + " AGG_KEYS table. invalid column: " + indexColName); } +} else if (indexType == IndexType.NGRAM_BF) { +String indexColName = column.getName(); +PrimitiveType colType = column.getDataType(); +if (colType != PrimitiveType.CHAR && colType != PrimitiveType.VARCHAR) { Review Comment: You are right, but I will change it to colType.isCharFamily(), because isStringType also includes the HLL type.
[GitHub] [doris] hqx871 commented on a diff in pull request #11579: [Feature](NGram BloomFilter Index) add new ngram bloom filter index to speed up like query
hqx871 commented on code in PR #11579: URL: https://github.com/apache/doris/pull/11579#discussion_r944998969 ## fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java: ## @@ -142,6 +217,30 @@ public void checkColumn(Column column, KeysType keysType) throws AnalysisExcepti "BITMAP index only used in columns of DUP_KEYS/UNIQUE_KEYS table or key columns of" + " AGG_KEYS table. invalid column: " + indexColName); } +} else if (indexType == IndexType.NGRAM_BF) { +String indexColName = column.getName(); +PrimitiveType colType = column.getDataType(); +if (colType != PrimitiveType.CHAR && colType != PrimitiveType.VARCHAR) { +throw new AnalysisException(colType + " is not supported in ngram_bf index. " ++ "invalid column: " + indexColName); +} else if ((keysType == KeysType.AGG_KEYS && !column.isKey())) { +throw new AnalysisException( +"ngram_bf index only used in columns of DUP_KEYS/UNIQUE_KEYS table or key columns of" ++ " AGG_KEYS table. invalid column: " + indexColName); +} +if (arguments == null || arguments.size() != 2) { +throw new AnalysisException("ngram should have ngram size and bloom filter size arguments"); +} +Expr ngramSize = arguments.get(0); +if (!(ngramSize instanceof IntLiteral && ((IntLiteral) ngramSize).getLongValue() < 256 +&& ((IntLiteral) ngramSize).getLongValue() >= 1)) { +throw new AnalysisException("ngram size should be integer and less than 256"); +} +Expr bfSize = arguments.get(1); +if (!(bfSize instanceof IntLiteral && ((IntLiteral) bfSize).getLongValue() < 65536 Review Comment: get
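The ngram size check quoted in the hunk above accepts only an integer literal in the range 1 to 255. A minimal standalone sketch of that range check follows; it uses a boxed `Long` as a stand-in for Doris's `IntLiteral` and `IllegalArgumentException` in place of `AnalysisException`, and `NgramBfArgs` and `checkNgramSize` are hypothetical names. The bloom filter size check is left out because the quoted hunk is cut off before its lower bound.

```java
// Hypothetical sketch of the ngram size validation; not the Doris IndexDef code.
final class NgramBfArgs {
    // Accepts only an integer value with 1 <= value < 256, as in the quoted condition.
    static long checkNgramSize(Object arg) {
        if (!(arg instanceof Long)) {
            throw new IllegalArgumentException("ngram size should be an integer");
        }
        long value = (Long) arg;
        if (value < 1 || value >= 256) {
            throw new IllegalArgumentException(
                    "ngram size should be at least 1 and less than 256, got " + value);
        }
        return value;
    }
}
```

The error message in this sketch names both bounds, whereas the message in the quoted hunk mentions only the upper one.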