[GitHub] [incubator-doris] xiaokang commented on pull request #8451: [improvement](memory) fix olap table scan and sink memory usage problem
xiaokang commented on pull request #8451: URL: https://github.com/apache/incubator-doris/pull/8451#issuecomment-1066054539 @morningman volap_scan_node.cpp is done. The test result is almost the same as non-vectorized version. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [incubator-doris] zhannngchen opened a new pull request #8458: [UT] add unit tests for min/max function, and cleaned up some unused …
zhannngchen opened a new pull request #8458:
URL: https://github.com/apache/incubator-doris/pull/8458

# Proposed changes

Add unit tests for min/max function, with some code cleanup.

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8458: [UT] add unit tests for min/max function, and cleaned up some unused …
github-actions[bot] commented on pull request #8458: URL: https://github.com/apache/incubator-doris/pull/8458#issuecomment-1066080355 PR approved by anyone and no changes requested.
[GitHub] [incubator-doris] dataroaring commented on issue #8382: [Bug] variance is different with trino
dataroaring commented on issue #8382: URL: https://github.com/apache/incubator-doris/issues/8382#issuecomment-1066080832 https://github.com/apache/incubator-doris/blob/master/regression-test/suites/aggregate/aggregate.groovy can reproduce.
[GitHub] [incubator-doris] zbtzbtzbt commented on issue #8435: [Enhancement] The bitmap_hash function can be implemented using murmur_hash3_128
zbtzbtzbt commented on issue #8435: URL: https://github.com/apache/incubator-doris/issues/8435#issuecomment-1066083773 I think this modification will be incompatible with old data @syb853553110 https://doris.apache.org/zh-CN/sql-reference/sql-functions/bitmap-functions/bitmap_hash.html#description
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8457: [fix][routine-load] fix bug that routine load cannot cancel task when append_data return error
github-actions[bot] commented on pull request #8457: URL: https://github.com/apache/incubator-doris/pull/8457#issuecomment-1066084147
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8451: [improvement](memory) fix olap table scan and sink memory usage problem
github-actions[bot] commented on pull request #8451: URL: https://github.com/apache/incubator-doris/pull/8451#issuecomment-1066084713
[GitHub] [incubator-doris] morningman merged pull request #8369: [docs] Update documentation configuration parameter `sink.batch.bytes…
morningman merged pull request #8369: URL: https://github.com/apache/incubator-doris/pull/8369
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8369: [docs] Update documentation configuration parameter `sink.batch.bytes…
github-actions[bot] commented on pull request #8369: URL: https://github.com/apache/incubator-doris/pull/8369#issuecomment-1066095170 PR approved by at least one committer and no changes requested.
[incubator-doris] branch master updated: [doc] Update documentation configuration parameter `sink.batch.bytes` in flink-doris-connector (#8369)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 392a977  [doc] Update documentation configuration parameter `sink.batch.bytes` in flink-doris-connector (#8369)
392a977 is described below

commit 392a9774af584a230f041a756eca293a79b89460
Author: Jiangqiao Xu <96433131+bridgedr...@users.noreply.github.com>
AuthorDate: Sun Mar 13 20:53:50 2022 +0800

    [doc] Update documentation configuration parameter `sink.batch.bytes` in flink-doris-connector (#8369)
---
 docs/en/extending-doris/flink-doris-connector.md    | 2 +-
 docs/zh-CN/extending-doris/flink-doris-connector.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/en/extending-doris/flink-doris-connector.md b/docs/en/extending-doris/flink-doris-connector.md
index c52ea50..823cdbc 100644
--- a/docs/en/extending-doris/flink-doris-connector.md
+++ b/docs/en/extending-doris/flink-doris-connector.md
@@ -302,7 +302,7 @@ outputFormat.close();
 | sink.batch.interval | 10s | The flush interval, after which the asynchronous thread will write the data in the cache to BE. The default value is 10 second, and the time units are ms, s, min, h, and d. Set to 0 to turn off periodic writing. |
 | sink.properties.* | -- | The stream load parameters. eg: sink.properties.column_separator' = ',' Setting 'sink.properties.escape_delimiters' = 'true' if you want to use a control char as a separator, so that such as '\\x01' will translate to binary 0x01 Support JSON format import, you need to enable both 'sink.properties.format' ='json' and 'sink.properties.strip_outer_array' ='true'|
 | sink.enable-delete | true | Whether to enable deletion. This option requires Doris table to enable batch delete function (0.15+ version is enabled by default), and only supports Uniq model.|
-
+| sink.batch.bytes| 10485760 | Maximum bytes of batch in a single write to BE. When the data size in batch exceeds this threshold, cache data is written to BE. The default value is 10MB |
 ## Doris & Flink Column Type Mapping
diff --git a/docs/zh-CN/extending-doris/flink-doris-connector.md b/docs/zh-CN/extending-doris/flink-doris-connector.md
index 7549fcb..fd3aca7 100644
--- a/docs/zh-CN/extending-doris/flink-doris-connector.md
+++ b/docs/zh-CN/extending-doris/flink-doris-connector.md
@@ -306,7 +306,7 @@ outputFormat.close();
 | sink.batch.interval | 10s | flush 间隔时间,超过该时间后异步线程将 缓存中数据写入BE。 默认值为10秒,支持时间单位ms、s、min、h和d。设置为0表示关闭定期写入。 |
 | sink.properties.* | -- | Stream load 的导入参数例如:'sink.properties.column_separator' = ', '定义列分隔符'sink.properties.escape_delimiters' = 'true'特殊字符作为分隔符,'\\x01'会被转换为二进制的0x01 'sink.properties.format' = 'json''sink.properties.strip_outer_array' = 'true' JSON格式导入|
 | sink.enable-delete | true | 是否启用删除。此选项需要Doris表开启批量删除功能(0.15+版本默认开启),只支持Uniq模型。|
-
+| sink.batch.bytes | 10485760 | 单次写BE的最大数据量,当每个 batch 中记录的数据量超过该阈值时,会将缓存数据写入 BE。默认值为 10MB|
 ## Doris 和 Flink 列类型映射关系
 | Doris Type | Flink Type |
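The `sink.batch.bytes` row documented in this commit describes a size-triggered flush: once the buffered batch exceeds the threshold, the cached data is written out. A minimal sketch of that rule follows, written in C++ to match the backend code quoted elsewhere in this digest. `BatchBuffer` and all of its members are hypothetical names for illustration only; this is not the connector's actual implementation, and the real sink also flushes on `sink.batch.interval`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration of a byte-threshold flush; not the connector's code.
class BatchBuffer {
public:
    explicit BatchBuffer(std::size_t max_batch_bytes) : max_batch_bytes_(max_batch_bytes) {}

    // Buffers one record and flushes once the buffered size exceeds the threshold.
    void add(std::string record) {
        buffered_bytes_ += record.size();
        records_.push_back(std::move(record));
        if (buffered_bytes_ > max_batch_bytes_) {
            flush();
        }
    }

    // Stands in for "write the cached data to BE"; here it only counts flushes.
    void flush() {
        if (records_.empty()) {
            return;
        }
        ++flush_count_;
        records_.clear();
        buffered_bytes_ = 0;
    }

    std::size_t flush_count() const { return flush_count_; }
    std::size_t buffered_bytes() const { return buffered_bytes_; }

private:
    std::size_t max_batch_bytes_;
    std::size_t buffered_bytes_ = 0;
    std::size_t flush_count_ = 0;
    std::vector<std::string> records_;
};
```

With the documented default, the threshold would be constructed as `BatchBuffer(10485760)`. The comparison is strict because the documentation says "exceeds"; whether the real connector flushes at exactly the threshold is not stated in the source.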
[incubator-doris] branch master updated: [improvement](VHashJoin) add probe timer (#8233)
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 705989d  [improvement](VHashJoin) add probe timer (#8233)
705989d is described below

commit 705989d23916ce115b6ed269221f7be377b74a24
Author: awakeljw <993007...@qq.com>
AuthorDate: Sun Mar 13 20:54:44 2022 +0800

    [improvement](VHashJoin) add probe timer (#8233)
---
 be/src/vec/exec/join/vhash_join_node.cpp | 217 ++-
 be/src/vec/exec/join/vhash_join_node.h   |   3 +
 2 files changed, 127 insertions(+), 93 deletions(-)

diff --git a/be/src/vec/exec/join/vhash_join_node.cpp b/be/src/vec/exec/join/vhash_join_node.cpp
index c33bcb2..a1af769 100644
--- a/be/src/vec/exec/join/vhash_join_node.cpp
+++ b/be/src/vec/exec/join/vhash_join_node.cpp
@@ -166,8 +166,56 @@ struct ProcessHashTableProbe {
               _items_counts(join_node->_items_counts),
               _build_block_offsets(join_node->_build_block_offsets),
               _build_block_rows(join_node->_build_block_rows),
-              _rows_returned_counter(join_node->_rows_returned_counter) {}
+              _rows_returned_counter(join_node->_rows_returned_counter),
+              _search_hashtable_timer(join_node->_search_hashtable_timer),
+              _build_side_output_timer(join_node->_build_side_output_timer),
+              _probe_side_output_timer(join_node->_probe_side_output_timer) {}
+
+    // output build side result column
+    void build_side_output_column(MutableColumns& mcol, int column_offset, int column_length, int size) {
+        constexpr auto is_semi_anti_join = JoinOpType::value == TJoinOp::RIGHT_ANTI_JOIN ||
+                                           JoinOpType::value == TJoinOp::RIGHT_SEMI_JOIN ||
+                                           JoinOpType::value == TJoinOp::LEFT_ANTI_JOIN ||
+                                           JoinOpType::value == TJoinOp::LEFT_SEMI_JOIN;
+        constexpr auto probe_all = JoinOpType::value == TJoinOp::LEFT_OUTER_JOIN ||
+                                   JoinOpType::value == TJoinOp::FULL_OUTER_JOIN;
+
+        if constexpr (!is_semi_anti_join) {
+            if (_build_blocks.size() == 1) {
+                for (int i = 0; i < column_length; i++) {
+                    auto& column = *_build_blocks[0].get_by_position(i).column;
+                    mcol[i + column_offset]->insert_indices_from(column,
+                            _build_block_rows.data(), _build_block_rows.data() + size);
+                }
+            } else {
+                for (int i = 0; i < column_length; i++) {
+                    for (int j = 0; j < size; j++) {
+                        if constexpr (probe_all) {
+                            if (_build_block_offsets[j] == -1) {
+                                DCHECK(mcol[i + column_offset]->is_nullable());
+                                assert_cast(mcol[i + column_offset].get())->insert_join_null_data();
+                            } else {
+                                auto& column = *_build_blocks[_build_block_offsets[j]].get_by_position(i).column;
+                                mcol[i + column_offset]->insert_from(column, _build_block_rows[j]);
+                            }
+                        } else {
+                            auto& column = *_build_blocks[_build_block_offsets[j]].get_by_position(i).column;
+                            mcol[i + column_offset]->insert_from(column, _build_block_rows[j]);
+                        }
+                    }
+                }
+            }
+        }
+    }
+
+    // output probe side result column
+    void probe_side_output_column(MutableColumns& mcol, int column_length, int size) {
+        for (int i = 0; i < column_length; ++i) {
+            auto& column = _probe_block.get_by_position(i).column;
+            column->replicate(&_items_counts[0], size, *mcol[i]);
+        }
+    }
     // Only process the join with no other join conjunt, because of no other join conjunt
     // the output block struct is same with mutable block. we can do more opt on it and simplify
     // the logic of probe
@@ -198,116 +246,93 @@ struct ProcessHashTableProbe {
         constexpr auto is_right_semi_anti_join = JoinOpType::value == TJoinOp::RIGHT_ANTI_JOIN ||
                                                  JoinOpType::value == TJoinOp::RIGHT_SEMI_JOIN;
-        constexpr auto is_semi_anti_join = is_right_semi_anti_join ||
-                                           JoinOpType::value == TJoinOp::LEFT_ANTI_JOIN ||
-                                           JoinOpType::value == TJoinOp::LEFT_SEMI_JOIN;
-
         constexpr auto probe_all = JoinOpType::value == TJoinOp::LEFT_OUTER_JOIN ||
                                    JoinOpType::value == TJoinOp::FULL_OUTER_JOIN;
-        for (; _probe_index < _p
[GitHub] [incubator-doris] morningman merged pull request #8233: [Vectorized][HashJoin] add probe timer
morningman merged pull request #8233: URL: https://github.com/apache/incubator-doris/pull/8233
[GitHub] [incubator-doris] morningman opened a new issue #8459: [Bug] BE crash when doing left outer join with vec engine
morningman opened a new issue #8459:
URL: https://github.com/apache/incubator-doris/issues/8459

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

dev-1.0.0

### What's Wrong?

Works
```
*** Aborted at 1647150517 (unix time) try "date -d @1647150517" if you are using GNU date ***
PC: @ 0x7fb284945a9c doris::vectorized::ColumnVector<>::insert_from()
*** SIGSEGV (@0x0) received by PID 747 (TID 0x7fb2571bd700) from PID 0; stack trace: ***
    @ 0x7fb285fce812 google::(anonymous namespace)::FailureSignalHandler()
    @ 0x7fb281d0a920 (unknown)
    @ 0x7fb284945a9c doris::vectorized::ColumnVector<>::insert_from()
    @ 0x7fb284928ea7 doris::vectorized::ColumnNullable::insert_from()
    @ 0x7fb285b60e10 doris::vectorized::BlockReader::_agg_key_next_block()
    @ 0x7fb284cfe21d doris::vectorized::VOlapScanner::get_block()
    @ 0x7fb284cf3d62 doris::vectorized::VOlapScanNode::scanner_thread()
    @ 0x7fb28435344a doris::PriorityWorkStealingThreadPool::work_thread()
    @ 0x7fb2881077b0 execute_native_thread_routine
    @ 0x7fb281ac2851 start_thread
    @ 0x7fb281dbf67d clone
    @0x0 (unknown)
```

### What You Expected?

Works well

### How to Reproduce?

Probably because the right table is an agg table, and the column is NOT NULL. But the right table's output slot should be nullable.

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
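The diagnosis in issue #8459 is that the right side of a left outer join must produce a nullable output slot even when the underlying column is declared NOT NULL, because probe rows without a match have to emit NULL. The sketch below illustrates that idea only. `NullableIntColumn` and `output_right_side` are hypothetical names, not Doris classes, and the `-1` sentinel mirrors the "no match" convention visible in the vectorized join diff quoted in this digest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a nullable column: values plus a parallel null map.
struct NullableIntColumn {
    std::vector<int32_t> values;
    std::vector<uint8_t> null_map;  // 1 = NULL, 0 = not NULL

    void insert_value(int32_t v) {
        values.push_back(v);
        null_map.push_back(0);
    }
    // A NULL still occupies a slot in `values` so both vectors stay aligned.
    void insert_null() {
        values.push_back(0);
        null_map.push_back(1);
    }
    bool is_null_at(std::size_t row) const { return null_map[row] != 0; }
};

// Builds the right-side output of a left outer join.
// build_rows[i] is the matching row index in `right`, or -1 when the probe row has no match.
NullableIntColumn output_right_side(const std::vector<int32_t>& right,
                                    const std::vector<int>& build_rows) {
    NullableIntColumn out;
    for (int row : build_rows) {
        if (row == -1) {
            out.insert_null();  // only possible because the output column is nullable
        } else {
            out.insert_value(right[static_cast<std::size_t>(row)]);
        }
    }
    return out;
}
```

If the output column had no null map, the `-1` case would have nowhere to record the NULL, which is consistent with the issue's guess about why the NOT NULL column led to the crash.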
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8458: [UT] add unit tests for min/max function, and cleaned up some unused …
github-actions[bot] commented on pull request #8458: URL: https://github.com/apache/incubator-doris/pull/8458#issuecomment-1066110440 PR approved by at least one committer and no changes requested.
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8456: [chore](dependency) fix build thirdparty errors
github-actions[bot] commented on pull request #8456: URL: https://github.com/apache/incubator-doris/pull/8456#issuecomment-1066110771 PR approved by at least one committer and no changes requested.
[GitHub] [incubator-doris] morningman merged pull request #8456: [chore](dependency) fix build thirdparty errors
morningman merged pull request #8456: URL: https://github.com/apache/incubator-doris/pull/8456
[incubator-doris] branch master updated (705989d -> a4b710c)
morningman pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.

    from 705989d  [improvement](VHashJoin) add probe timer (#8233)
     add a4b710c  [chore](dependency) fix build thirdparty errors (#8456)

No new revisions were added by this update.

Summary of changes:
 docs/.vuepress/sidebar/en.js                       |  3 +-
 docs/.vuepress/sidebar/zh-CN.js                    |  3 +-
 .../sql-functions/bitwise-functions/bit_length.md  | 55 --
 .../sql-functions/bitwise-functions/bit_length.md  | 55 --
 .../doris/load/routineload/ScheduleRule.java       | 13 +
 thirdparty/download-thirdparty.sh                  | 15 +-
 6 files changed, 17 insertions(+), 127 deletions(-)
 delete mode 100644 docs/en/sql-reference/sql-functions/bitwise-functions/bit_length.md
 delete mode 100644 docs/zh-CN/sql-reference/sql-functions/bitwise-functions/bit_length.md
[GitHub] [incubator-doris] morningman merged pull request #8451: [improvement](memory) fix olap table scan and sink memory usage problem
morningman merged pull request #8451: URL: https://github.com/apache/incubator-doris/pull/8451
[incubator-doris] branch master updated (a4b710c -> e807e8b)
morningman pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.

    from a4b710c  [chore](dependency) fix build thirdparty errors (#8456)
     add e807e8b  [improvement](memory) fix olap table scan and sink memory usage problem (#8451)

No new revisions were added by this update.

Summary of changes:
 be/src/common/config.h              |  4 +++-
 be/src/exec/olap_scan_node.cpp      | 24 +++
 be/src/exec/olap_scan_node.h        |  6 +
 be/src/exec/olap_scanner.cpp        | 10
 be/src/exec/tablet_sink.cpp         | 15 ++--
 be/src/exec/tablet_sink.h           |  3 +++
 be/src/vec/exec/volap_scan_node.cpp | 48 +++--
 be/src/vec/exec/volap_scanner.cpp   |  8 ++-
 8 files changed, 98 insertions(+), 20 deletions(-)
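The commit message for #8451, quoted in the dev-1.0.0 mail later in this digest, attributes the memory growth to unbounded queues in OlapScanNode and NodeChannel. The diff there derives a byte budget as `mem_limit / 20` and only schedules more scanner work while the queued bytes stay below half of that budget. The sketch below shows that bookkeeping in isolation. `ByteBoundedQueue` and its members are hypothetical names, `std::string` stands in for a row batch, and the two constants are copied from the quoted diff rather than from any documented contract.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>

// Hypothetical illustration of byte accounting on a producer/consumer queue.
class ByteBoundedQueue {
public:
    // Budget is 1/20 of the memory limit, as in the quoted OlapScanNode::init change.
    explicit ByteBoundedQueue(int64_t mem_limit_bytes)
            : max_queue_bytes_(mem_limit_bytes / 20) {}

    // Enqueues a batch and adds its size to the running byte count.
    void push(std::string batch) {
        queued_bytes_ += static_cast<int64_t>(batch.size());
        batches_.push_back(std::move(batch));
    }

    // Dequeues the oldest batch and subtracts its size; returns false when empty.
    bool pop(std::string* out) {
        if (batches_.empty()) {
            return false;
        }
        *out = std::move(batches_.front());
        batches_.pop_front();
        queued_bytes_ -= static_cast<int64_t>(out->size());
        return true;
    }

    // Producers are only scheduled while the queue holds less than half the budget.
    bool can_schedule_more() const { return queued_bytes_ < max_queue_bytes_ / 2; }

    int64_t queued_bytes() const { return queued_bytes_; }

private:
    int64_t max_queue_bytes_;
    int64_t queued_bytes_ = 0;
    std::deque<std::string> batches_;
};
```

The point of the design is that the byte counter is updated at every push, pop, and clear, so the scheduling check reflects actual buffered data rather than a batch count. Locking is omitted here; the real scan node accesses its queues from several threads.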
[GitHub] [incubator-doris] dataroaring closed pull request #8433: add loggger to Suite to log in cases
dataroaring closed pull request #8433: URL: https://github.com/apache/incubator-doris/pull/8433
[GitHub] [incubator-doris] dataroaring opened a new pull request #8460: let framework support sql cases
dataroaring opened a new pull request #8460:
URL: https://github.com/apache/incubator-doris/pull/8460

We generate groovy files from sql cases and run the generated groovy files. This way, we only need to add sql cases, and the framework handles the rest.

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[incubator-doris] branch dev-1.0.0 updated (32da525 -> 701fd4f)
morningman pushed a change to branch dev-1.0.0
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.

    from 32da525  [fix] BE crash when reporting tablet (#8453)
     new eb322f5  [improvement](vectorized) Support BetweenPredicate enable fold const expr (#8450)
     new ca05846  [improvement](memory) fix olap table scan and sink memory usage problem (#8451)
     new 701fd4f  [chore](dependency) fix build thirdparty errors (#8456)

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 be/src/common/config.h                             |  4 +-
 be/src/exec/olap_scan_node.cpp                     | 22 +++--
 be/src/exec/olap_scan_node.h                       |  6 +++
 be/src/exec/olap_scanner.cpp                       | 10 ++--
 be/src/exec/tablet_sink.cpp                        |  9 +++-
 be/src/exec/tablet_sink.h                          |  3 ++
 be/src/runtime/mysql_result_writer.cpp             |  6 ++-
 be/src/vec/columns/column.h                        |  4 +-
 be/src/vec/columns/column_nullable.cpp             |  2 +-
 be/src/vec/columns/column_nullable.h               |  2 +-
 be/src/vec/columns/column_vector.cpp               |  4 +-
 be/src/vec/exec/volap_scan_node.cpp                | 51
 be/src/vec/exec/volap_scanner.cpp                  |  8 +++-
 be/src/vec/exprs/vtuple_is_null_predicate.cpp      |  6 +--
 docs/.vuepress/sidebar/en.js                       |  3 +-
 docs/.vuepress/sidebar/zh-CN.js                    |  3 +-
 .../sql-functions/bitwise-functions/bit_length.md  | 55 --
 .../sql-functions/bitwise-functions/bit_length.md  | 55 --
 .../doris/load/routineload/ScheduleRule.java       | 13 +
 .../apache/doris/rewrite/FoldConstantsRule.java    |  6 ++-
 thirdparty/download-thirdparty.sh                  | 15 +-
 21 files changed, 127 insertions(+), 160 deletions(-)
 delete mode 100644 docs/en/sql-reference/sql-functions/bitwise-functions/bit_length.md
 delete mode 100644 docs/zh-CN/sql-reference/sql-functions/bitwise-functions/bit_length.md
[incubator-doris] 01/03: [improvement](vectorized) Support BetweenPredicate enable fold const expr (#8450)
morningman pushed a commit to branch dev-1.0.0
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git

commit eb322f542cb37fa49d264b887bfe45fe7499046e
Author: HappenLee
AuthorDate: Sun Mar 13 09:36:24 2022 +0800

    [improvement](vectorized) Support BetweenPredicate enable fold const expr (#8450)
---
 be/src/runtime/mysql_result_writer.cpp                             | 6 --
 be/src/vec/columns/column.h                                        | 4 +++-
 be/src/vec/columns/column_nullable.cpp                             | 2 +-
 be/src/vec/columns/column_nullable.h                               | 2 +-
 be/src/vec/columns/column_vector.cpp                               | 4 ++--
 be/src/vec/exprs/vtuple_is_null_predicate.cpp                      | 6 ++
 .../src/main/java/org/apache/doris/rewrite/FoldConstantsRule.java  | 6 +-
 7 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/be/src/runtime/mysql_result_writer.cpp b/be/src/runtime/mysql_result_writer.cpp
index eaf1bd7..2a7de6c 100644
--- a/be/src/runtime/mysql_result_writer.cpp
+++ b/be/src/runtime/mysql_result_writer.cpp
@@ -159,8 +159,10 @@ int MysqlResultWriter::_add_row_value(int index, const TypeDescriptor& type, voi
     case TYPE_DECIMALV2: {
         DecimalV2Value decimal_val(reinterpret_cast(item)->value);
-        int output_scale = _output_expr_ctxs[index]->root()->output_scale();
-        buf_ret = _row_buffer->push_decimal(decimal_val, output_scale);
+        // TODO: Support decimal output_scale after we support FE can sure
+        // accuracy of output_scale
+        // int output_scale = _output_expr_ctxs[index]->root()->output_scale();
+        buf_ret = _row_buffer->push_decimal(decimal_val, -1);
         break;
     }
diff --git a/be/src/vec/columns/column.h b/be/src/vec/columns/column.h
index fdfd85b..07989f4 100644
--- a/be/src/vec/columns/column.h
+++ b/be/src/vec/columns/column.h
@@ -34,6 +34,8 @@ namespace doris::vectorized {
 class Arena;
 class Field;
+// TODO: Remove the trickly hint, after FE support better way to remove function tuple_is_null
+constexpr uint8_t JOIN_NULL_HINT = 2;
 /// Declares interface to store columns in memory.
 class IColumn : public COW {
@@ -164,7 +166,7 @@ public:
     /// indices_begin + indices_end represent the row indices of column src
     /// Warning:
     /// if *indices == -1 means the row is null, only use in outer join, do not use in any other place
-    /// insert -1 in null map to hint the null is produced by outer join
+    /// insert JOIN_NULL_HINT in null map to hint the null is produced by outer join
     virtual void insert_indices_from(const IColumn& src, const int* indices_begin, const int* indices_end) = 0;
     /// Appends data located in specified memory chunk if it is possible (throws an exception if it cannot be implemented).
diff --git a/be/src/vec/columns/column_nullable.cpp b/be/src/vec/columns/column_nullable.cpp
index 9877903..69634ef 100644
--- a/be/src/vec/columns/column_nullable.cpp
+++ b/be/src/vec/columns/column_nullable.cpp
@@ -114,7 +114,7 @@ StringRef ColumnNullable::serialize_value_into_arena(size_t n, Arena& arena,
 void ColumnNullable::insert_join_null_data() {
     get_nested_column().insert_default();
-    get_null_map_data().push_back(-1);
+    get_null_map_data().push_back(JOIN_NULL_HINT);
 }
 const char* ColumnNullable::deserialize_and_insert_from_arena(const char* pos) {
diff --git a/be/src/vec/columns/column_nullable.h b/be/src/vec/columns/column_nullable.h
index 030ca13..1a792f7 100644
--- a/be/src/vec/columns/column_nullable.h
+++ b/be/src/vec/columns/column_nullable.h
@@ -80,7 +80,7 @@ public:
     /// Will insert null value if pos=nullptr
     void insert_data(const char* pos, size_t length) override;
-    /// -1 in null map means null is generated by join, only use in tuple is null
+    /// JOIN_NULL_HINT in null map means null is generated by join, only use in tuple is null
     void insert_join_null_data();
     StringRef serialize_value_into_arena(size_t n, Arena& arena, char const*& begin) const override;
diff --git a/be/src/vec/columns/column_vector.cpp b/be/src/vec/columns/column_vector.cpp
index 3188a93..dfe1bce 100644
--- a/be/src/vec/columns/column_vector.cpp
+++ b/be/src/vec/columns/column_vector.cpp
@@ -231,8 +231,8 @@ void ColumnVector::insert_indices_from(const IColumn& src, const int* indices
             // Now Uint8 use to identify null and non null
             // 1. nullable column : offset == -1 means is null at the here, set true here
             // 2. real data column : offset == -1 what at is meaningless
-            // 3. -1 only use in outer join to hint the null is produced by outer join
-            data[origin_size + i] = (offset == -1) ? UInt8(-1) : src_vec.get_element(offset);
+            // 3. JOIN_NULL_HINT only use in outer join to
[incubator-doris] 02/03: [improvement](memory) fix olap table scan and sink memory usage problem (#8451)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch dev-1.0.0 in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit ca058465f861f1df95e25d1a3b009e71c5bdf2ea Author: Kang AuthorDate: Sun Mar 13 22:12:15 2022 +0800 [improvement](memory) fix olap table scan and sink memory usage problem (#8451) Due to unlimited queue in OlapScanNode and NodeChannel, memory usage can be very large for reading and writing large table, e.g 'insert into tableB select * from tableA'. --- be/src/common/config.h | 4 ++- be/src/exec/olap_scan_node.cpp | 22 +--- be/src/exec/olap_scan_node.h| 6 + be/src/exec/olap_scanner.cpp| 10 be/src/exec/tablet_sink.cpp | 9 +-- be/src/exec/tablet_sink.h | 3 +++ be/src/vec/exec/volap_scan_node.cpp | 51 +++-- be/src/vec/exec/volap_scanner.cpp | 8 +- 8 files changed, 92 insertions(+), 21 deletions(-) diff --git a/be/src/common/config.h b/be/src/common/config.h index 8f3b0b7..26bd081 100644 --- a/be/src/common/config.h +++ b/be/src/common/config.h @@ -167,8 +167,10 @@ CONF_mInt64(thrift_client_retry_interval_ms, "1000"); CONF_mInt32(doris_scan_range_row_count, "524288"); // size of scanner queue between scanner thread and compute thread CONF_mInt32(doris_scanner_queue_size, "1024"); -// single read execute fragment row size +// single read execute fragment row number CONF_mInt32(doris_scanner_row_num, "16384"); +// single read execute fragment row bytes +CONF_mInt32(doris_scanner_row_bytes, "10485760"); // number of max scan keys CONF_mInt32(doris_max_scan_key_num, "1024"); // the max number of push down values of a single column. 
diff --git a/be/src/exec/olap_scan_node.cpp b/be/src/exec/olap_scan_node.cpp index af26ae1..19ec140 100644 --- a/be/src/exec/olap_scan_node.cpp +++ b/be/src/exec/olap_scan_node.cpp @@ -77,6 +77,8 @@ Status OlapScanNode::init(const TPlanNode& tnode, RuntimeState* state) { _max_pushdown_conditions_per_column = config::max_pushdown_conditions_per_column; } +_max_scanner_queue_size_bytes = query_options.mem_limit / 20; //TODO: session variable percent + /// TODO: could one filter used in the different scan_node ? int filter_size = _runtime_filter_descs.size(); _runtime_filter_ctxs.resize(filter_size); @@ -306,6 +308,7 @@ Status OlapScanNode::get_next(RuntimeState* state, RowBatch* row_batch, bool* eo materialized_batch = _materialized_row_batches.front(); DCHECK(materialized_batch != nullptr); _materialized_row_batches.pop_front(); +_materialized_row_batches_bytes -= materialized_batch->tuple_data_pool()->total_reserved_bytes(); } } @@ -394,12 +397,14 @@ Status OlapScanNode::close(RuntimeState* state) { } _materialized_row_batches.clear(); +_materialized_row_batches_bytes = 0; for (auto row_batch : _scan_row_batches) { delete row_batch; } _scan_row_batches.clear(); +_scan_row_batches_bytes = 0; // OlapScanNode terminate by exception // so that initiative close the Scanner @@ -1371,6 +1376,7 @@ void OlapScanNode::transfer_thread(RuntimeState* state) { int max_thread = _max_materialized_row_batches; if (config::doris_scanner_row_num > state->batch_size()) { max_thread /= config::doris_scanner_row_num / state->batch_size(); +if (max_thread <= 0) max_thread = 1; } // read from scanner while (LIKELY(status.ok())) { @@ -1393,7 +1399,7 @@ void OlapScanNode::transfer_thread(RuntimeState* state) { if (state->fragment_mem_tracker() != nullptr) { mem_consume = state->fragment_mem_tracker()->consumption(); } -if (mem_consume < (mem_limit * 6) / 10) { +if (mem_consume < (mem_limit * 6) / 10 && _scan_row_batches_bytes < _max_scanner_queue_size_bytes / 2) { thread_slot_num = 
max_thread - assigned_thread_num; } else { // Memory already exceed @@ -1473,6 +1479,7 @@ void OlapScanNode::transfer_thread(RuntimeState* state) { if (LIKELY(!_scan_row_batches.empty())) { scan_batch = _scan_row_batches.front(); _scan_row_batches.pop_front(); +_scan_row_batches_bytes -= scan_batch->tuple_data_pool()->total_reserved_bytes(); // delete scan_batch if transfer thread should be stopped // because scan_batch wouldn't be useful anymore @@ -1573,10 +1580,12 @@ void OlapScanNode::scanner_thread(OlapScanner* scanner) { // need yield this thread when we do enough work. However, OlapStorage read // data in pre-aggregate mode, then we can't use storage returned data to // judge if we need to yield. So we record all raw data read in this round -// sc
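The commit above limits the scanner queues by bytes as well as by batch count: it derives `_max_scanner_queue_size_bytes` from `query_options.mem_limit / 20` and updates `_scan_row_batches_bytes` whenever a batch is popped. A minimal sketch of that idea follows. It is not the Doris code; `ByteBoundedQueue`, `try_push` and `pop` are hypothetical names, and batches are modeled as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch: a queue that tracks how many bytes it holds and
// refuses new batches once a byte budget has been reached.
class ByteBoundedQueue {
public:
    explicit ByteBoundedQueue(std::size_t max_bytes) : _max_bytes(max_bytes) {}

    // Returns false when the budget is already used up, so the producer
    // can back off instead of growing the queue without limit.
    bool try_push(std::string batch) {
        if (_bytes >= _max_bytes) return false;
        _bytes += batch.size();
        _items.push_back(std::move(batch));
        return true;
    }

    // Removes the oldest batch and gives its bytes back to the budget.
    std::optional<std::string> pop() {
        if (_items.empty()) return std::nullopt;
        std::string batch = std::move(_items.front());
        _items.pop_front();
        assert(_bytes >= batch.size());
        _bytes -= batch.size();
        return batch;
    }

    std::size_t bytes() const { return _bytes; }

private:
    std::size_t _max_bytes;
    std::size_t _bytes = 0;
    std::deque<std::string> _items;
};
```

Because the check happens before the push, the queue can overshoot the budget by at most one batch. That keeps the sketch simple and is usually acceptable for a soft limit.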
[incubator-doris] 03/03: [chore](dependency) fix build thirdparty errors (#8456)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch dev-1.0.0 in repository https://gitbox.apache.org/repos/asf/incubator-doris.git commit 701fd4f7f5bbf422f817be6a917e3ca19f294ae0 Author: Mingyu Chen AuthorDate: Sun Mar 13 22:11:24 2022 +0800 [chore](dependency) fix build thirdparty errors (#8456) 1. the patch for aws-c-cal-0.4.5 does not need anymore 2. remove duplicate bit_length document 3. add some debug log for routine load --- docs/.vuepress/sidebar/en.js | 3 +- docs/.vuepress/sidebar/zh-CN.js| 3 +- .../sql-functions/bitwise-functions/bit_length.md | 55 -- .../sql-functions/bitwise-functions/bit_length.md | 55 -- .../doris/load/routineload/ScheduleRule.java | 13 + thirdparty/download-thirdparty.sh | 15 +- 6 files changed, 17 insertions(+), 127 deletions(-) diff --git a/docs/.vuepress/sidebar/en.js b/docs/.vuepress/sidebar/en.js index 941be11..b24039d 100644 --- a/docs/.vuepress/sidebar/en.js +++ b/docs/.vuepress/sidebar/en.js @@ -449,8 +449,7 @@ module.exports = [ "bitand", "bitor", "bitxor", - "bitnot", - "bit_length" + "bitnot" ], }, { diff --git a/docs/.vuepress/sidebar/zh-CN.js b/docs/.vuepress/sidebar/zh-CN.js index 582407b..f05b3b8 100644 --- a/docs/.vuepress/sidebar/zh-CN.js +++ b/docs/.vuepress/sidebar/zh-CN.js @@ -453,8 +453,7 @@ module.exports = [ "bitand", "bitor", "bitxor", - "bitnot", - "bit_length" + "bitnot" ], }, { diff --git a/docs/en/sql-reference/sql-functions/bitwise-functions/bit_length.md b/docs/en/sql-reference/sql-functions/bitwise-functions/bit_length.md deleted file mode 100644 index 9f56a1f..000 --- a/docs/en/sql-reference/sql-functions/bitwise-functions/bit_length.md +++ /dev/null @@ -1,55 +0,0 @@ -{ -"title": "bit_length", -"language": "en" -} - - - -# bit_length -## description -### Syntax - -`INT bit_length(VARCHAR str)` - -Return length of argument in bits. 
- -## example - -``` -MySQL> select bit_length("doris"); -+-+ -| bit_length('doris') | -+-+ -| 40 | -+-+ - -MySQL [test]> select bit_length("hello world"); -+---+ -| bit_length('hello world') | -+---+ -|88 | -+---+ -``` - -## keyword - -bit_length diff --git a/docs/zh-CN/sql-reference/sql-functions/bitwise-functions/bit_length.md b/docs/zh-CN/sql-reference/sql-functions/bitwise-functions/bit_length.md deleted file mode 100644 index c0005fa..000 --- a/docs/zh-CN/sql-reference/sql-functions/bitwise-functions/bit_length.md +++ /dev/null @@ -1,55 +0,0 @@ -{ -"title": "bit_length", -"language": "zh-CN" -} - - - -# bit_length -## description -### Syntax - -`INT bit_length(VARCHAR str)` - -返回字符串的bit位数 - -## example - -``` -MySQL> select bit_length("doris"); -+-+ -| bit_length('doris') | -+-+ -| 40 | -+-+ - -MySQL [test]> select bit_length("hello world"); -+---+ -| bit_length('hello world') | -+---+ -|88 | -+---+ -``` - -## keyword - -bit_length diff --git a/fe/fe-core/src/main/java/org/apache/doris/load/routineload/ScheduleRule.java b/fe/fe-core/src/main/java/org/apache/doris/load/routineload/ScheduleRule.java index eaa52e2..c72ee6c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/load/routineload/ScheduleRule.java +++ b/fe/fe-core/src/main/java/org/apache/doris/load/routineload/ScheduleRule.java @@ -21,10 +21,14 @@ import org.apache.doris.common.Config; import org.apache.doris.common.InternalErrorCode; import org.apache.doris.system.SystemInfoService; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + /** * ScheduleRule: RoutineLoad PAUSED -> NEED_SCHEDULE */ public class ScheduleRule { +private static final Logger LOG = LogManager.getLogger(ScheduleRule.class); private static int deadBeCount(String clusterName) { SystemInfoService systemInfoService = Catalog.getCurrentSystemInfo(); @@ -43,17 +47,26 @@ public class ScheduleRule { return false; } if (jobRoutine.autoResumeLock) {//only manual resume for unlock +LOG.debug("routine 
load job {}'s autoResumeLock is true, skip", jobRoutine.id); return false; } /* * Handle all backends are down. */ +LOG.debug("try to auto reschedule routine load {}, firstResumeTimestamp: {}, autoResumeCoun
[incubator-doris] annotated tag 1.0.0-preview updated (e6478e8 -> 014414a)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a change to annotated tag 1.0.0-preview in repository https://gitbox.apache.org/repos/asf/incubator-doris.git. *** WARNING: tag 1.0.0-preview was modified! *** from e6478e8 (commit) to 014414a (tag) tagging e6478e8229430d3ce7bce5282fc233b9511c303a (commit) by morningman on Sun Mar 13 22:42:39 2022 +0800 - Log - 1.0.0-preview --- No new revisions were added by this update. Summary of changes:
[GitHub] [incubator-doris] HappenLee opened a new pull request #8461: [Bug][Vectorized] Agg/Unique not null column outer join coredump
HappenLee opened a new pull request #8461: URL: https://github.com/apache/incubator-doris/pull/8461 # Proposed changes Issue Number: close #8459 ## Problem Summary: Describe the overview of changes. ## Checklist(Required) 1. Does it affect the original behavior: (No) 2. Has unit tests been added: (No Need) 3. Has document been added or modified: (No Need) 4. Does it need to update dependencies: (No) 5. Are there any changes that cannot be rolled back: (Yes) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] xiaokang commented on pull request #8322: [refactor] Impl of MemTracker, and related use
xiaokang commented on pull request #8322: URL: https://github.com/apache/incubator-doris/pull/8322#issuecomment-1066228733 @xinyiZzz When I test for #8451 , I encounter a memory limit problem. The problem is that, after the long query, as specified in the test steps of #8451 , is finished, a simple query 'select count() from tableA' will raise memory limit error. I guess it's related to this pr, since the problem is not present before I merge the new MemTracker code. The following is mysql client error message. > ERROR 1105 (HY000): errCode = 2, detailMessage = Memory exceed limit. fragment=4f5f114582e7429d-a630eec6e0e45384, details=New partitioned aggregation, while getting next from child 0., on backend=[172.16.44.107](http://172.16.44.107/). Memory left in process limit=8589934592.00 GB. current tracker I0312 10:58:07.931836 2320 plan_fragment_executor.cpp:76] PlanFragmentExecutor::prepare|pthread_id=140354754955008|backend_num=1|instance_id=35895a325c6943dc -872eced6dfcb8c91|query_id=35895a325c6943dc-872eced6dfcb8c90 I0312 10:58:07.936765 2203 fragment_mgr.cpp:459] PlanFragmentExecutor::_exec_actual|pthread_id=140355728508672|instance_id=35895a325c6943dc-872eced6dfcb8c91| query_id=35895a325c6943dc-872eced6dfcb8c90 I0312 10:58:07.936780 2203 plan_fragment_executor.cpp:213] PlanFragmentExecutor::open, using query memory limit: 7.59 GB|mem_limit=8147483648|instance_id=358 95a325c6943dc-872eced6dfcb8c91|query_id=35895a325c6943dc-872eced6dfcb8c90 W0312 10:58:07.936826 2203 status.h:260] warning: Status msg truncated, OK: Memory exceed limit. fragment=35895a325c6943dc-872eced6dfcb8c91, details=New part itioned aggregation, while getting next from child 0., on backend=[172.16.44.107](http://172.16.44.107/). Memory left in process limit=8589934592.00 GB. current tracker . If query, can change the limit by session variable exec_mem_limit. precise_code:1 W0312 10:58:07.943392 2203 mem_tracker.cpp:290] Memory exceed limit. 
fragment=35895a325c6943dc-872eced6dfcb8c91, details=New partitioned aggregation, while g etting next from child 0., on backend=[172.16.44.107](http://172.16.44.107/). Memory left in process limit=8589934592.00 GB. current tracker . If query, can change the limit by session variable exec_mem_limit. MemTracker log_usage Label: queryId=35895a325c6943dc-872eced6dfcb8c90, Limit: 7.59 GB, Total: 19.00 KB, Peak: 19.00 KB, Exceeded: false MemTracker log_usage Label: RuntimeState:instance:35895a325c6943dc-872eced6dfcb8c92, Limit: 7.59 GB, Total: 1.00 KB, Peak: 1.00 KB, Exceeded: false MemTracker log_usage Label: RuntimeFilterMgr, Limit: -1.00 B, Total: 0, Peak: 0, Exceeded: false MemTracker log_usage Label: RuntimeState:instance:35895a325c6943dc-872eced6dfcb8c91, Limit: 7.59 GB, Total: 18.00 KB, Peak: 18.00 KB, Exceeded: false MemTracker log_usage Label: RuntimeFilterMgr, Limit: -1.00 B, Total: 0, Peak: 0, Exceeded: false MemTracker log_usage Label: ExecNode:AGGREGATION_NODE (id=1), Limit: -1.00 B, Total: 1.00 KB, Peak: 1.00 KB, Exceeded: false MemTracker log_usage Label: DataStreamSender:35895a325c6943dc-872eced6dfcb8c91, Limit: -1.00 B, Total: 16.00 KB, Peak: 16.00 KB, Exceeded: false W0312 10:58:07.944548 2203 fragment_mgr.cpp:231] Got error while opening fragment 35895a325c6943dc-872eced6dfcb8c91: Memory limit exceeded: Memory exceed lim it. fragment=35895a325c6943dc-872eced6dfcb8c91, details=New partitioned aggregation, while getting next from child 0., on backend=[172.16.44.107](http://172.16.44.107/). Memory left i n process limit=8589934592.00 GB. current tracker
[GitHub] [incubator-doris-flink-connector] zhqu1148980644 opened a new pull request #19: Update README.md
zhqu1148980644 opened a new pull request #19: URL: https://github.com/apache/incubator-doris-flink-connector/pull/19 # Proposed changes Issue Number: close #xxx ## Problem Summary: Describe the overview of changes. ## Checklist(Required) 1. Does it affect the original behavior: (Yes/No/I Don't know) 2. Has unit tests been added: (Yes/No/No Need) 3. Has document been added or modified: (Yes/No/No Need) 4. Does it need to update dependencies: (Yes/No) 5. Are there any changes that cannot be rolled back: (Yes/No) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [incubator-doris] yangzhg commented on a change in pull request #8439: [refactor] use c++ 14 deprecated instead of comment, this detect usage of deprecated var or func at compile time
yangzhg commented on a change in pull request #8439: URL: https://github.com/apache/incubator-doris/pull/8439#discussion_r825546528 ## File path: be/src/agent/task_worker_pool.h ## @@ -51,12 +51,12 @@ class TaskWorkerPool { REALTIME_PUSH, PUBLISH_VERSION, // Deprecated -CLEAR_ALTER_TASK, +CLEAR_ALTER_TASK [[deprecated]], Review comment: Removing it directly may change the enum values.
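The review comment above can be illustrated with a small sketch. The enums below are hypothetical and much shorter than the real one in `TaskWorkerPool`; `NEXT_TASK` is a made-up name standing in for whatever enumerators follow the deprecated one. Note that attributes on enumerators are a C++17 feature.

```cpp
// Hypothetical enum: the deprecated entry stays in place, so every
// enumerator after it keeps its numeric value.
enum class TaskType {
    REALTIME_PUSH,                    // 0
    PUBLISH_VERSION,                  // 1
    CLEAR_ALTER_TASK [[deprecated]],  // 2: unused, but still occupies its slot
    NEXT_TASK                         // 3 (hypothetical name)
};

// The same enum with the deprecated entry deleted: NEXT_TASK silently
// becomes 2, which breaks any stored or transmitted integer values.
enum class TaskTypeAfterRemoval {
    REALTIME_PUSH,    // 0
    PUBLISH_VERSION,  // 1
    NEXT_TASK         // 2
};

static_assert(static_cast<int>(TaskType::NEXT_TASK) == 3,
              "marking an entry deprecated keeps later values stable");
static_assert(static_cast<int>(TaskTypeAfterRemoval::NEXT_TASK) == 2,
              "deleting an entry shifts later values");
```

With the attribute in place, a compiler that supports it warns wherever `TaskType::CLEAR_ALTER_TASK` is still referenced, which is the compile-time detection the pull request title describes.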
[GitHub] [incubator-doris] anjia0532 commented on issue #7587: [Roadmap] Doris on K8S
anjia0532 commented on issue #7587: URL: https://github.com/apache/incubator-doris/issues/7587#issuecomment-1066263940 @liangyongz [Fixed Pod IPs for Kubernetes applications with Kruise](https://segmentfault.com/a/119040707667)
[GitHub] [incubator-doris] caiconghui merged pull request #8457: [fix][routine-load] fix bug that routine load cannot cancel task when append_data return error
caiconghui merged pull request #8457: URL: https://github.com/apache/incubator-doris/pull/8457
[incubator-doris] branch master updated: [fix][routine-load] fix bug that routine load cannot cancel task when append_data return error (#8457)
This is an automated email from the ASF dual-hosted git repository. caiconghui pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git The following commit(s) were added to refs/heads/master by this push: new 991dc7f [fix][routine-load] fix bug that routine load cannot cancel task when append_data return error (#8457) 991dc7f is described below commit 991dc7fc5cf53e359ea907d2c9d88f2916499a93 Author: caiconghui <55968745+caicong...@users.noreply.github.com> AuthorDate: Mon Mar 14 10:18:14 2022 +0800 [fix][routine-load] fix bug that routine load cannot cancel task when append_data return error (#8457) --- be/src/runtime/routine_load/data_consumer_group.cpp | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/be/src/runtime/routine_load/data_consumer_group.cpp b/be/src/runtime/routine_load/data_consumer_group.cpp index 5f6c789..7242fbe 100644 --- a/be/src/runtime/routine_load/data_consumer_group.cpp +++ b/be/src/runtime/routine_load/data_consumer_group.cpp @@ -116,7 +116,6 @@ Status KafkaDataConsumerGroup::start_all(StreamLoadContext* ctx) { MonotonicStopWatch watch; watch.start(); -Status st; bool eos = false; while (true) { if (eos || left_time <= 0 || left_rows <= 0 || left_bytes <= 0) { @@ -140,12 +139,10 @@ Status KafkaDataConsumerGroup::start_all(StreamLoadContext* ctx) { // waiting all threads finished _thread_pool.shutdown(); _thread_pool.join(); - if (!result_st.ok()) { -// some of consumers encounter errors, cancel this task +kafka_pipe->cancel(result_st.get_error_msg()); return result_st; } - kafka_pipe->finish(); ctx->kafka_info->cmt_offset = std::move(cmt_offset); ctx->receive_bytes = ctx->max_batch_size - left_bytes; @@ -159,9 +156,8 @@ Status KafkaDataConsumerGroup::start_all(StreamLoadContext* ctx) { << ", partition: " << msg->partition() << ", offset: " << msg->offset() << ", len: " << msg->len(); -(kafka_pipe.get()->*append_data)(static_cast(msg->payload()), +Status st = 
(kafka_pipe.get()->*append_data)(static_cast(msg->payload()), static_cast(msg->len())); - if (st.ok()) { left_rows--; left_bytes -= msg->len(); @@ -172,6 +168,12 @@ Status KafkaDataConsumerGroup::start_all(StreamLoadContext* ctx) { // failed to append this msg, we must stop LOG(WARNING) << "failed to append msg to pipe. grp: " << _grp_id; eos = true; +{ +std::unique_lock lock(_mutex); +if (result_st.ok()) { +result_st = st; +} +} } delete msg; } else {
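The fix above captures the status returned by `append_data` and stores it in `result_st` under `_mutex`, but only if no earlier error has been recorded, so that the later `!result_st.ok()` check can cancel the pipe. A minimal sketch of that "keep the first error" pattern follows. `SimpleStatus` and `FirstErrorRecorder` are hypothetical stand-ins, not Doris types.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for a status object: an empty message means OK.
struct SimpleStatus {
    std::string msg;
    bool ok() const { return msg.empty(); }
};

// Keeps the first error reported by any thread; later errors are ignored
// so the original cause of the failure is not overwritten.
class FirstErrorRecorder {
public:
    void record(const SimpleStatus& st) {
        if (st.ok()) return;  // nothing to record
        std::lock_guard<std::mutex> lock(_mutex);
        if (_result.ok()) {
            _result = st;
        }
    }

    // Returns a copy so callers never read the shared state unlocked.
    SimpleStatus result() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _result;
    }

private:
    mutable std::mutex _mutex;
    SimpleStatus _result;
};
```

The lock matters because, in the diff, both the consumer threads and the thread that appends to the pipe may write the shared status.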
[GitHub] [incubator-doris-flink-connector] bridgeDream commented on pull request #18: [improvement] (before 1.13)Support set max bytes in each batch to avoid congestion
bridgeDream commented on pull request #18: URL: https://github.com/apache/incubator-doris-flink-connector/pull/18#issuecomment-1066275553 > I will close #13 @bridgeDream OK, can you approve this PR for the branch before 1.13, @hf200012?
[GitHub] [incubator-doris] hf200012 commented on issue #7587: [Roadmap] Doris on K8S
hf200012 commented on issue #7587: URL: https://github.com/apache/incubator-doris/issues/7587#issuecomment-1066280183 > https://github.com/liangyongz/doris-on-k8s The temporary solution I am currently using is hostNetwork. This approach is limited. > > Will you solve this problem by using Pod+SVC instead of hostNetwork? I am also doing research in this area, so we can work on it together. My WeChat is 35926237; let's set up a group chat to discuss.
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8461: [fix](vectorized) Agg/Unique not null column outer join coredump
github-actions[bot] commented on pull request #8461: URL: https://github.com/apache/incubator-doris/pull/8461#issuecomment-1066284455
[GitHub] [incubator-doris] morningman commented on pull request #8461: [fix](vectorized) Agg/Unique not null column outer join coredump
morningman commented on pull request #8461: URL: https://github.com/apache/incubator-doris/pull/8461#issuecomment-1066286993 merge it for quick test
[GitHub] [incubator-doris] morningman merged pull request #8461: [fix](vectorized) Agg/Unique not null column outer join coredump
morningman merged pull request #8461: URL: https://github.com/apache/incubator-doris/pull/8461
[incubator-doris] branch master updated (991dc7f -> 41a15cc)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-doris.git. from 991dc7f [fix][routine-load] fix bug that routine load cannot cancel task when append_data return error (#8457) add 41a15cc [fix](vectorized) Agg/Unique not null column outer join coredump (#8461) No new revisions were added by this update. Summary of changes: be/src/exec/olap_scanner.cpp | 4 be/src/exec/olap_scanner.h| 1 + be/src/olap/reader.cpp| 2 ++ be/src/olap/reader.h | 4 be/src/olap/tablet_schema.cpp | 7 +-- be/src/olap/tablet_schema.h | 3 ++- be/src/vec/olap/vcollect_iterator.cpp | 2 +- 7 files changed, 19 insertions(+), 4 deletions(-)
[GitHub] [incubator-doris] morningman closed issue #8459: [Bug] BE crash when doing left outer join with vec engine
morningman closed issue #8459: URL: https://github.com/apache/incubator-doris/issues/8459
[GitHub] [incubator-doris] BiteTheDDDDt commented on a change in pull request #8448: [Feature][Vectorized] support lateral view
BiteTheDDDDt commented on a change in pull request #8448: URL: https://github.com/apache/incubator-doris/pull/8448#discussion_r825562764 ## File path: be/src/vec/exec/vrepeat_node.cpp ## @@ -181,13 +187,9 @@ Status VRepeatNode::get_next(RuntimeState* state, Block* block, bool* eos) { // current child block has finished its repeat, get child's next block if (_child_block->rows() == 0) { -if (_child_eos) { Review comment: This is unnecessary code.
[incubator-doris] branch array-type updated: [feature-wip](array-type) Add codes and UT for array_contains and array_position functions (#8401)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch array-type in repository https://gitbox.apache.org/repos/asf/incubator-doris.git The following commit(s) were added to refs/heads/array-type by this push: new 706f7ff [feature-wip](array-type) Add codes and UT for array_contains and array_position functions (#8401) 706f7ff is described below commit 706f7ff898b2e2599a29e36e63730d945bf53a5d Author: camby <104178...@qq.com> AuthorDate: Mon Mar 14 11:11:56 2022 +0800 [feature-wip](array-type) Add codes and UT for array_contains and array_position functions (#8401)

array_contains function usage example:

1. create table with ARRAY column, and insert some data:
```
> select * from array_test;
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    1 |    2 | [1, 2] |
|    2 |    3 | NULL   |
|    4 | NULL | []     |
|    3 | NULL | NULL   |
+------+------+--------+
```
2. enable vectorized:
```
> set enable_vectorized_engine=true;
```
3. select with array_contains:
```
> select k1,array_contains(k3,1) from array_test;
+------+-------------------------+
| k1   | array_contains(`k3`, 1) |
+------+-------------------------+
|    3 |                    NULL |
|    1 |                       1 |
|    2 |                    NULL |
|    4 |                       0 |
+------+-------------------------+
```
4. also we can use array_contains in where condition
```
> select * from array_test where array_contains(k3,1);
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    1 |    2 | [1, 2] |
+------+------+--------+
```
5. array_position usage example
```
> select k1,k3,array_position(k3,2) from array_test;
+------+--------+-------------------------+
| k1   | k3     | array_position(`k3`, 2) |
+------+--------+-------------------------+
|    3 | NULL   |                    NULL |
|    1 | [1, 2] |                       2 |
|    2 | NULL   |                    NULL |
|    4 | []     |                       0 |
+------+--------+-------------------------+
```

--- be/src/vec/CMakeLists.txt | 2 + .../vec/functions/array/function_array_index.cpp | 31 ++ be/src/vec/functions/array/function_array_index.h | 196 +++ .../functions/array/function_array_register.cpp| 31 ++ be/src/vec/functions/simple_function_factory.h | 2 + be/src/vec/olap/vgeneric_iterators.cpp | 3 - be/test/vec/exec/vgeneric_iterators_test.cpp | 3 - be/test/vec/function/CMakeLists.txt| 1 + be/test/vec/function/function_array_index_test.cpp | 127 +++ be/test/vec/function/function_test_util.h | 384 ++--- .../java/org/apache/doris/catalog/ArrayType.java | 4 + gensrc/script/doris_builtins_functions.py | 37 ++ 12 files changed, 608 insertions(+), 213 deletions(-) diff --git a/be/src/vec/CMakeLists.txt b/be/src/vec/CMakeLists.txt index 0024fd0..fc81438 100644 --- a/be/src/vec/CMakeLists.txt +++ b/be/src/vec/CMakeLists.txt @@ -106,6 +106,8 @@ set(VEC_FILES exprs/vcast_expr.cpp exprs/vcase_expr.cpp exprs/vinfo_func.cpp + functions/array/function_array_index.cpp + functions/array/function_array_register.cpp functions/math.cpp functions/function_bitmap.cpp functions/function_bitmap_variadic.cpp diff --git a/be/src/vec/functions/array/function_array_index.cpp b/be/src/vec/functions/array/function_array_index.cpp new file mode 100644 index 000..474500e --- /dev/null +++ b/be/src/vec/functions/array/function_array_index.cpp @@ -0,0 +1,31 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "vec/functions/array/function_array_index.h" +#include "vec/functions/simple_function_factory.h" + +namespace doris::vectorized { + +struct NameArrayContains { static constexpr auto name = "array_contains"; }; +struct NameArrayPosition { static constexpr auto name = "array_position"; }; + +void register_function_array_index(SimpleFunctionFactor
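The usage examples in the commit message above suggest the following semantics: a NULL array yields NULL, `array_position` is 1-based and returns 0 when the value is absent, and `array_contains` returns 1 or 0. The sketch below models only that observed behavior on integer arrays, with `std::optional` standing in for SQL NULL. It is an illustration, not the vectorized implementation in `function_array_index.h`, and it assumes arrays without NULL elements, which the examples do not cover.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// std::nullopt models a SQL NULL array; an empty vector models [].
using IntArray = std::optional<std::vector<int>>;

// 1-based position of the first match, 0 if absent, NULL for a NULL array.
std::optional<std::size_t> array_position(const IntArray& arr, int target) {
    if (!arr.has_value()) return std::nullopt;
    for (std::size_t i = 0; i < arr->size(); ++i) {
        if ((*arr)[i] == target) return i + 1;
    }
    return std::size_t{0};
}

// true/false for a non-NULL array, NULL for a NULL array.
std::optional<bool> array_contains(const IntArray& arr, int target) {
    const std::optional<std::size_t> pos = array_position(arr, target);
    if (!pos.has_value()) return std::nullopt;
    return *pos != 0;
}
```

Expressing `array_contains` through `array_position` mirrors the fact that the commit registers both functions from the same `function_array_index` source files, although the actual code sharing there may differ.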
[GitHub] [incubator-doris] morningman merged pull request #8401: [feature][array-type]add array_contains and array_position functions
morningman merged pull request #8401: URL: https://github.com/apache/incubator-doris/pull/8401
[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #8408: [Benchmark] Add TPC-H benchmark tools
EmmyMiao87 commented on a change in pull request #8408: URL: https://github.com/apache/incubator-doris/pull/8408#discussion_r825566731 ## File path: tools/tpch-tools/README.md ## @@ -0,0 +1,34 @@ + + +## Usage + +These scripts are used to make tpc-h test. +follow the steps below: + +### 1. build tpc-h dbgen tool. +./build-tpch-dbgen.sh +### 2. generate tpc-h data. use -h for more infomations. +./gen-tpch-data.sh -s 1 +### 3. create tpc-h tables. modify `doris-cluster.conf` to specify doris info, then run script below. +./create-tpch-tables.sh +### 4. load tpc-h data. use -h for help. +./load-tpch-data.sh +### 5. run tpc-h queries. +./run-tpch-queries.sh Review comment: In fact, the TPC-H query set differs slightly at different data volumes, so the document should state which dataset size the provided queries are written for.
[GitHub] [incubator-doris] zenoyang commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
zenoyang commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825582068 ## File path: be/src/vec/columns/column_dictionary.h ## @@ -0,0 +1,381 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include +#include + +#include "gutil/hash/string_hash.h" +#include "olap/decimal12.h" +#include "olap/uint24.h" +#include "runtime/string_value.h" +#include "util/slice.h" +#include "vec/columns/column.h" +#include "vec/columns/column_decimal.h" +#include "vec/columns/column_impl.h" +#include "vec/columns/column_string.h" +#include "vec/columns/column_vector.h" +#include "vec/columns/predicate_column.h" +#include "vec/core/types.h" + +namespace doris::vectorized { + +/** + * For low cardinality string columns, using ColumnDictionary can reducememory + * usage and improve query efficiency. + * For equal predicate comparisons, convert the predicate constant to encodings + * according to the dictionary, so that encoding comparisons are used instead + * of string comparisons to improve performance. 
+ * For range comparison predicates, it is necessary to sort the dictionary + * contents, convert the encoding column, and then compare the encoding directly. + * If the read data page contains plain-encoded data pages, the dictionary + * columns are converted into PredicateColumn for processing. + * Currently ColumnDictionary is only used for storage layer. + */ +template +class ColumnDictionary final : public COWHelper> { +private: +friend class COWHelper; + +ColumnDictionary() {} +ColumnDictionary(const size_t n) : codes(n) {} +ColumnDictionary(const ColumnDictionary& src) : codes(src.codes.begin(), src.codes.end()) {} + +public: +using Self = ColumnDictionary; +using value_type = T; +using Container = PaddedPODArray; +using DictContainer = PaddedPODArray; + +bool is_numeric() const override { return false; } + +bool is_predicate_column() const override { return false; } + +bool is_column_dictionary() const override { return true; } + +size_t size() const override { return codes.size(); } + +[[noreturn]] StringRef get_data_at(size_t n) const override { +LOG(FATAL) << "get_data_at not supported in ColumnDictionary"; +} + +void insert_from(const IColumn& src, size_t n) override { +LOG(FATAL) << "insert_from not supported in ColumnDictionary"; +} + +void insert_range_from(const IColumn& src, size_t start, size_t length) override { +LOG(FATAL) << "insert_range_from not supported in ColumnDictionary"; +} + +void insert_indices_from(const IColumn& src, const int* indices_begin, + const int* indices_end) override { +LOG(FATAL) << "insert_indices_from not supported in ColumnDictionary"; +} + +void pop_back(size_t n) override { LOG(FATAL) << "pop_back not supported in ColumnDictionary"; } + +void update_hash_with_value(size_t n, SipHash& hash) const override { +LOG(FATAL) << "update_hash_with_value not supported in ColumnDictionary"; +} + +void insert_data(const char* pos, size_t /*length*/) override { +codes.push_back(unaligned_load(pos)); +} + +void insert_data(const T 
value) { codes.push_back(value); } + +void insert_default() override { codes.push_back(T()); } + +void clear() override { codes.clear(); } + +// TODO: Make dict memory usage more precise +size_t byte_size() const override { return codes.size() * sizeof(codes[0]); } + +size_t allocated_bytes() const override { return byte_size(); } + +void protect() override {} + +void get_permutation(bool reverse, size_t limit, int nan_direction_hint, + IColumn::Permutation& res) const override { +LOG(FATAL) << "get_permutation not supported in ColumnDictionary"; +} + +void reserve(size_t n) override { codes.reserve(n); } + +[[noreturn]] const char* get_family_name() const override { +LOG(FATAL) << "get_family_name not supported in ColumnDictionary"; +} + +[[noreturn]] MutableColumnPtr clone_resized(size_t size) const override { +LOG(FATAL) << "clo
[GitHub] [incubator-doris] zenoyang commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
zenoyang commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825586828 ## File path: be/src/olap/comparison_predicate.cpp ## @@ -145,28 +146,68 @@ COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(LessEqualPredicate, <=) COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterPredicate, >) COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterEqualPredicate, >=) -#define COMPARISON_PRED_COLUMN_EVALUATE(CLASS, OP) \ +#define COMPARISON_PRED_COLUMN_EVALUATE(CLASS, OP, IS_RANGE) \ template \ void CLASS::evaluate(vectorized::IColumn& column, uint16_t* sel, uint16_t* size) const { \ uint16_t new_size = 0; \ if (column.is_nullable()) { \ -auto* nullable_column = \ +auto* nullable_col = \ vectorized::check_and_get_column(column); \ -auto& null_bitmap = reinterpret_cast&>(\ - *(nullable_column->get_null_map_column_ptr())) \ +auto& null_bitmap = reinterpret_cast( \ +nullable_col->get_null_map_column()) \ .get_data(); \ -auto* nest_column_vector = \ - vectorized::check_and_get_column>( \ -nullable_column->get_nested_column()); \ -auto& data_array = nest_column_vector->get_data(); \ -for (uint16_t i = 0; i < *size; i++) { \ -uint16_t idx = sel[i]; \ -sel[new_size] = idx; \ -const type& cell_value = reinterpret_cast(data_array[idx]); \ -bool ret = !null_bitmap[idx] && (cell_value OP _value); \ -new_size += _opposite ? 
!ret : ret; \ +auto& nested_col = nullable_col->get_nested_column(); \ +if (nested_col.is_column_dictionary()) { \ +if constexpr (std::is_same_v) { \ +auto* nested_col_ptr = vectorized::check_and_get_column< \ + vectorized::ColumnDictionary>(nested_col); \ +auto code = nested_col_ptr->find_code(_value); \ +if (code < 0 && IS_RANGE) { \ +code = nested_col_ptr->find_bound_code(_value, 0 OP 1, 1 OP 1 ); \ +} \ +auto& data_array = nested_col_ptr->get_data(); \ +for (uint16_t i = 0; i < *size; i++) { \ +uint16_t idx = sel[i]; \ +sel[new_size] = idx; \ +const auto& cell_value = \ +reinterpret_cast(data_array[idx]); \ +bool ret = !null_bitmap[idx] && (cell_value OP code); \ +new_size += _opposite ? !ret : ret; \ +} \ +} \ +} else { \ +auto* nested_col_ptr = \ + vectorized::check_and_get_column>( \ +nested_col); \ +
[GitHub] [incubator-doris] zenoyang commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
zenoyang commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825587115 ## File path: be/src/vec/columns/column_dictionary.h ## @@ -0,0 +1,381 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include +#include + +#include "gutil/hash/string_hash.h" +#include "olap/decimal12.h" +#include "olap/uint24.h" +#include "runtime/string_value.h" +#include "util/slice.h" +#include "vec/columns/column.h" +#include "vec/columns/column_decimal.h" +#include "vec/columns/column_impl.h" +#include "vec/columns/column_string.h" +#include "vec/columns/column_vector.h" +#include "vec/columns/predicate_column.h" +#include "vec/core/types.h" + +namespace doris::vectorized { + +/** + * For low cardinality string columns, using ColumnDictionary can reducememory Review comment: done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
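The class comment quoted in this thread describes two ideas: an equality predicate is translated into a dictionary code once and then compared against the per-row codes, and a range predicate needs a sorted dictionary so that code order matches string order. The following is a minimal sketch of that idea only, not the code under review. DictColumnSketch, lower_bound_code, select_equal and select_less are hypothetical names, and the sketch assumes the dictionary is kept sorted and unique at build time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a dictionary-encoded string column.
// Assumption: the dictionary is sorted and unique, so code order == string order.
struct DictColumnSketch {
    std::vector<std::string> dict; // sorted distinct values
    std::vector<int32_t> codes;    // one code per row, index into dict

    static DictColumnSketch build(const std::vector<std::string>& rows) {
        DictColumnSketch col;
        col.dict = rows;
        std::sort(col.dict.begin(), col.dict.end());
        col.dict.erase(std::unique(col.dict.begin(), col.dict.end()), col.dict.end());
        col.codes.reserve(rows.size());
        for (const auto& r : rows) {
            auto it = std::lower_bound(col.dict.begin(), col.dict.end(), r);
            assert(it != col.dict.end() && *it == r); // every row value is in the dictionary
            col.codes.push_back(static_cast<int32_t>(it - col.dict.begin()));
        }
        return col;
    }

    // Code of an exact match, or -1 when the value is not in the dictionary.
    int32_t find_code(const std::string& value) const {
        auto it = std::lower_bound(dict.begin(), dict.end(), value);
        if (it == dict.end() || *it != value) return -1;
        return static_cast<int32_t>(it - dict.begin());
    }

    // Number of dictionary entries strictly less than value.
    int32_t lower_bound_code(const std::string& value) const {
        auto it = std::lower_bound(dict.begin(), dict.end(), value);
        return static_cast<int32_t>(it - dict.begin());
    }
};

// Equality: one string lookup, then integer comparisons per row.
inline std::vector<size_t> select_equal(const DictColumnSketch& col, const std::string& v) {
    std::vector<size_t> out;
    const int32_t code = col.find_code(v);
    if (code < 0) return out; // value absent: no row can match
    for (size_t i = 0; i < col.codes.size(); ++i) {
        if (col.codes[i] == code) out.push_back(i);
    }
    return out;
}

// Range (<): a row's string is < v exactly when its code is < lower_bound_code(v).
inline std::vector<size_t> select_less(const DictColumnSketch& col, const std::string& v) {
    std::vector<size_t> out;
    const int32_t bound = col.lower_bound_code(v);
    for (size_t i = 0; i < col.codes.size(); ++i) {
        if (col.codes[i] < bound) out.push_back(i);
    }
    return out;
}
```

The range case works even when the constant is not in the dictionary, which is presumably why the PR distinguishes an exact-match lookup from a bound lookup.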
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825596165 ## File path: be/src/vec/columns/column_dictionary.h ## @@ -0,0 +1,381 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include +#include + +#include "gutil/hash/string_hash.h" +#include "olap/decimal12.h" +#include "olap/uint24.h" +#include "runtime/string_value.h" +#include "util/slice.h" +#include "vec/columns/column.h" +#include "vec/columns/column_decimal.h" +#include "vec/columns/column_impl.h" +#include "vec/columns/column_string.h" +#include "vec/columns/column_vector.h" +#include "vec/columns/predicate_column.h" +#include "vec/core/types.h" + +namespace doris::vectorized { + +/** + * For low cardinality string columns, using ColumnDictionary can reducememory + * usage and improve query efficiency. + * For equal predicate comparisons, convert the predicate constant to encodings + * according to the dictionary, so that encoding comparisons are used instead + * of string comparisons to improve performance. 
+ * For range comparison predicates, it is necessary to sort the dictionary + * contents, convert the encoding column, and then compare the encoding directly. + * If the read data page contains plain-encoded data pages, the dictionary + * columns are converted into PredicateColumn for processing. + * Currently ColumnDictionary is only used for storage layer. + */ +template +class ColumnDictionary final : public COWHelper> { +private: +friend class COWHelper; + +ColumnDictionary() {} +ColumnDictionary(const size_t n) : codes(n) {} +ColumnDictionary(const ColumnDictionary& src) : codes(src.codes.begin(), src.codes.end()) {} + +public: +using Self = ColumnDictionary; +using value_type = T; +using Container = PaddedPODArray; +using DictContainer = PaddedPODArray; + +bool is_numeric() const override { return false; } + +bool is_predicate_column() const override { return false; } + +bool is_column_dictionary() const override { return true; } + +size_t size() const override { return codes.size(); } + +[[noreturn]] StringRef get_data_at(size_t n) const override { +LOG(FATAL) << "get_data_at not supported in ColumnDictionary"; +} + +void insert_from(const IColumn& src, size_t n) override { +LOG(FATAL) << "insert_from not supported in ColumnDictionary"; +} + +void insert_range_from(const IColumn& src, size_t start, size_t length) override { +LOG(FATAL) << "insert_range_from not supported in ColumnDictionary"; +} + +void insert_indices_from(const IColumn& src, const int* indices_begin, + const int* indices_end) override { +LOG(FATAL) << "insert_indices_from not supported in ColumnDictionary"; +} + +void pop_back(size_t n) override { LOG(FATAL) << "pop_back not supported in ColumnDictionary"; } + +void update_hash_with_value(size_t n, SipHash& hash) const override { +LOG(FATAL) << "update_hash_with_value not supported in ColumnDictionary"; +} + +void insert_data(const char* pos, size_t /*length*/) override { +codes.push_back(unaligned_load(pos)); +} + +void insert_data(const T 
value) { codes.push_back(value); } + +void insert_default() override { codes.push_back(T()); } + +void clear() override { codes.clear(); } + +// TODO: Make dict memory usage more precise +size_t byte_size() const override { return codes.size() * sizeof(codes[0]); } + +size_t allocated_bytes() const override { return byte_size(); } + +void protect() override {} + +void get_permutation(bool reverse, size_t limit, int nan_direction_hint, + IColumn::Permutation& res) const override { +LOG(FATAL) << "get_permutation not supported in ColumnDictionary"; +} + +void reserve(size_t n) override { codes.reserve(n); } + +[[noreturn]] const char* get_family_name() const override { +LOG(FATAL) << "get_family_name not supported in ColumnDictionary"; +} + +[[noreturn]] MutableColumnPtr clone_resized(size_t size) const override { +LOG(FATAL) << "clone
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825596739 ## File path: be/src/vec/columns/column_dictionary.h ## @@ -0,0 +1,381 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include +#include + +#include "gutil/hash/string_hash.h" +#include "olap/decimal12.h" +#include "olap/uint24.h" +#include "runtime/string_value.h" +#include "util/slice.h" +#include "vec/columns/column.h" +#include "vec/columns/column_decimal.h" +#include "vec/columns/column_impl.h" +#include "vec/columns/column_string.h" +#include "vec/columns/column_vector.h" +#include "vec/columns/predicate_column.h" +#include "vec/core/types.h" + +namespace doris::vectorized { + +/** + * For low cardinality string columns, using ColumnDictionary can reducememory + * usage and improve query efficiency. + * For equal predicate comparisons, convert the predicate constant to encodings + * according to the dictionary, so that encoding comparisons are used instead + * of string comparisons to improve performance. 
+ * For range comparison predicates, it is necessary to sort the dictionary + * contents, convert the encoding column, and then compare the encoding directly. + * If the read data page contains plain-encoded data pages, the dictionary + * columns are converted into PredicateColumn for processing. + * Currently ColumnDictionary is only used for storage layer. + */ +template +class ColumnDictionary final : public COWHelper> { +private: +friend class COWHelper; + +ColumnDictionary() {} +ColumnDictionary(const size_t n) : codes(n) {} +ColumnDictionary(const ColumnDictionary& src) : codes(src.codes.begin(), src.codes.end()) {} + +public: +using Self = ColumnDictionary; +using value_type = T; +using Container = PaddedPODArray; +using DictContainer = PaddedPODArray; + +bool is_numeric() const override { return false; } + +bool is_predicate_column() const override { return false; } + +bool is_column_dictionary() const override { return true; } + +size_t size() const override { return codes.size(); } + +[[noreturn]] StringRef get_data_at(size_t n) const override { +LOG(FATAL) << "get_data_at not supported in ColumnDictionary"; +} + +void insert_from(const IColumn& src, size_t n) override { +LOG(FATAL) << "insert_from not supported in ColumnDictionary"; +} + +void insert_range_from(const IColumn& src, size_t start, size_t length) override { +LOG(FATAL) << "insert_range_from not supported in ColumnDictionary"; +} + +void insert_indices_from(const IColumn& src, const int* indices_begin, + const int* indices_end) override { +LOG(FATAL) << "insert_indices_from not supported in ColumnDictionary"; +} + +void pop_back(size_t n) override { LOG(FATAL) << "pop_back not supported in ColumnDictionary"; } + +void update_hash_with_value(size_t n, SipHash& hash) const override { +LOG(FATAL) << "update_hash_with_value not supported in ColumnDictionary"; +} + +void insert_data(const char* pos, size_t /*length*/) override { +codes.push_back(unaligned_load(pos)); +} + +void insert_data(const T 
value) { codes.push_back(value); } + +void insert_default() override { codes.push_back(T()); } + +void clear() override { codes.clear(); } + +// TODO: Make dict memory usage more precise +size_t byte_size() const override { return codes.size() * sizeof(codes[0]); } + +size_t allocated_bytes() const override { return byte_size(); } + +void protect() override {} + +void get_permutation(bool reverse, size_t limit, int nan_direction_hint, + IColumn::Permutation& res) const override { +LOG(FATAL) << "get_permutation not supported in ColumnDictionary"; +} + +void reserve(size_t n) override { codes.reserve(n); } + +[[noreturn]] const char* get_family_name() const override { +LOG(FATAL) << "get_family_name not supported in ColumnDictionary"; +} + +[[noreturn]] MutableColumnPtr clone_resized(size_t size) const override { +LOG(FATAL) << "clone
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825598921 ## File path: be/src/vec/columns/column_dictionary.h ## @@ -0,0 +1,381 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include +#include + +#include "gutil/hash/string_hash.h" +#include "olap/decimal12.h" +#include "olap/uint24.h" +#include "runtime/string_value.h" +#include "util/slice.h" +#include "vec/columns/column.h" +#include "vec/columns/column_decimal.h" +#include "vec/columns/column_impl.h" +#include "vec/columns/column_string.h" +#include "vec/columns/column_vector.h" +#include "vec/columns/predicate_column.h" +#include "vec/core/types.h" + +namespace doris::vectorized { + +/** + * For low cardinality string columns, using ColumnDictionary can reducememory + * usage and improve query efficiency. + * For equal predicate comparisons, convert the predicate constant to encodings + * according to the dictionary, so that encoding comparisons are used instead + * of string comparisons to improve performance. 
+ * For range comparison predicates, it is necessary to sort the dictionary + * contents, convert the encoding column, and then compare the encoding directly. + * If the read data page contains plain-encoded data pages, the dictionary + * columns are converted into PredicateColumn for processing. + * Currently ColumnDictionary is only used for storage layer. + */ +template +class ColumnDictionary final : public COWHelper> { +private: +friend class COWHelper; + +ColumnDictionary() {} +ColumnDictionary(const size_t n) : codes(n) {} +ColumnDictionary(const ColumnDictionary& src) : codes(src.codes.begin(), src.codes.end()) {} + +public: +using Self = ColumnDictionary; +using value_type = T; +using Container = PaddedPODArray; +using DictContainer = PaddedPODArray; + +bool is_numeric() const override { return false; } + +bool is_predicate_column() const override { return false; } + +bool is_column_dictionary() const override { return true; } + +size_t size() const override { return codes.size(); } + +[[noreturn]] StringRef get_data_at(size_t n) const override { +LOG(FATAL) << "get_data_at not supported in ColumnDictionary"; +} + +void insert_from(const IColumn& src, size_t n) override { +LOG(FATAL) << "insert_from not supported in ColumnDictionary"; +} + +void insert_range_from(const IColumn& src, size_t start, size_t length) override { +LOG(FATAL) << "insert_range_from not supported in ColumnDictionary"; +} + +void insert_indices_from(const IColumn& src, const int* indices_begin, + const int* indices_end) override { +LOG(FATAL) << "insert_indices_from not supported in ColumnDictionary"; +} + +void pop_back(size_t n) override { LOG(FATAL) << "pop_back not supported in ColumnDictionary"; } + +void update_hash_with_value(size_t n, SipHash& hash) const override { +LOG(FATAL) << "update_hash_with_value not supported in ColumnDictionary"; +} + +void insert_data(const char* pos, size_t /*length*/) override { +codes.push_back(unaligned_load(pos)); +} + +void insert_data(const T 
value) { codes.push_back(value); } + +void insert_default() override { codes.push_back(T()); } + +void clear() override { codes.clear(); } + +// TODO: Make dict memory usage more precise +size_t byte_size() const override { return codes.size() * sizeof(codes[0]); } + +size_t allocated_bytes() const override { return byte_size(); } + +void protect() override {} + +void get_permutation(bool reverse, size_t limit, int nan_direction_hint, + IColumn::Permutation& res) const override { +LOG(FATAL) << "get_permutation not supported in ColumnDictionary"; +} + +void reserve(size_t n) override { codes.reserve(n); } + +[[noreturn]] const char* get_family_name() const override { +LOG(FATAL) << "get_family_name not supported in ColumnDictionary"; +} + +[[noreturn]] MutableColumnPtr clone_resized(size_t size) const override { +LOG(FATAL) << "clone
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318:
URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825601591

## File path: be/src/runtime/string_value.h ##
@@ -22,9 +22,53 @@
 #include "udf/udf.h"
 #include "util/hash_util.hpp"
+#include "util/cpu_info.h"
+#include "vec/common/string_ref.h"
+#ifdef __SSE4_2__
+#include "util/sse_util.hpp"
+#endif
 namespace doris {
+// Compare two strings using sse4.2 intrinsics if they are available. This code assumes
+// that the trivial cases are already handled (i.e. one string is empty).
+// Returns:
+// < 0 if s1 < s2
+// 0 if s1 == s2
+// > 0 if s1 > s2
+// The SSE code path is just under 2x faster than the non-sse code path.
+// - s1/n1: ptr/len for the first string
+// - s2/n2: ptr/len for the second string
+// - len: min(n1, n2) - this can be more cheaply passed in by the caller
+static inline int string_compare(const char* s1, int64_t n1, const char* s2, int64_t n2,

Review comment: Why move it here?
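For readers following the contract in the quoted comment, here is a minimal portable sketch of a comparison with the same return convention. It is not the function under review: it has no SSE4.2 path, the name string_compare_sketch is hypothetical, and the fifth parameter len is an assumption taken from the comment's description of len as min(n1, n2), since the quoted signature is cut off.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Portable sketch with the documented contract:
//   < 0 if s1 < s2, 0 if s1 == s2, > 0 if s1 > s2.
// Assumption: len == min(n1, n2), supplied by the caller.
inline int string_compare_sketch(const char* s1, int64_t n1, const char* s2, int64_t n2,
                                 int64_t len) {
    assert(len >= 0 && len <= n1 && len <= n2);
    // Compare the common prefix byte by byte (as unsigned char).
    const int result = std::memcmp(s1, s2, static_cast<size_t>(len));
    if (result != 0) return result;
    // Common prefix is equal: the shorter string sorts first.
    return (n1 > n2) - (n1 < n2);
}
```

Only the sign of the result is meaningful here, which matches the convention stated in the comment.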
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318:
URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825602217

## File path: be/src/olap/rowset/segment_v2/segment_iterator.cpp ##
@@ -856,6 +857,18 @@ void SegmentIterator::_evaluate_short_circuit_predicate(uint16_t* vec_sel_rowid_
 for (auto column_predicate : _short_cir_eval_predicate) {
 auto column_id = column_predicate->column_id();
 auto& short_cir_column = _current_return_columns[column_id];
+auto* col_ptr = short_cir_column.get();

Review comment: Please add a TODO for a code refactor here.
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318:
URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825604376

## File path: be/src/olap/in_list_predicate.cpp ##
@@ -122,21 +123,56 @@ IN_LIST_PRED_COLUMN_BLOCK_EVALUATE(NotInListPredicate, ==)
 void CLASS::evaluate(vectorized::IColumn& column, uint16_t* sel, uint16_t* size) const { \
 uint16_t new_size = 0; \
 if (column.is_nullable()) { \
-auto* nullable_column = \
- vectorized::check_and_get_column(column); \
-auto& null_bitmap = reinterpret_cast&>(*( \
-nullable_column->get_null_map_column_ptr())).get_data(); \
-auto* nest_column_vector = vectorized::check_and_get_column \
- >(nullable_column->get_nested_column()); \
-auto& data_array = nest_column_vector->get_data(); \
-for (uint16_t i = 0; i < *size; i++) { \
-uint16_t idx = sel[i]; \
-sel[new_size] = idx; \
-const type& cell_value = reinterpret_cast(data_array[idx]); \
-bool ret = !null_bitmap[idx] && (_values.find(cell_value) OP _values.end()); \
-new_size += _opposite ? !ret : ret; \
+auto* nullable_col = \
+ vectorized::check_and_get_column(column); \
+auto& null_bitmap = reinterpret_cast( \
+ nullable_col->get_null_map_column()).get_data(); \
+auto& nested_col = nullable_col->get_nested_column(); \
+if (nested_col.is_column_dictionary()) { \

Review comment: Too many branches; this needs a TODO for a code refactor here.
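The loop in the quoted macro filters a selection vector in place: each candidate row index is written to the output slot unconditionally, and the output cursor advances only when the row passes the predicate. A minimal standalone sketch of that pattern follows. It mirrors the removed loop for a plain integer column and is not the Doris macro itself; filter_in_list and its parameter types are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper: compact `sel` in place so that it keeps only rows that are
// non-null and whose value is in `values` (or the opposite when `opposite` is set).
// On return, *size holds the number of surviving row indices.
inline void filter_in_list(const std::vector<int32_t>& data,
                           const std::vector<uint8_t>& null_map,
                           const std::unordered_set<int32_t>& values, bool opposite,
                           uint16_t* sel, uint16_t* size) {
    assert(data.size() == null_map.size());
    uint16_t new_size = 0;
    for (uint16_t i = 0; i < *size; ++i) {
        const uint16_t idx = sel[i];
        sel[new_size] = idx; // always write; the slot is overwritten if the row is dropped
        const bool ret = !null_map[idx] && values.find(data[idx]) != values.end();
        const bool keep = opposite ? !ret : ret;
        new_size = static_cast<uint16_t>(new_size + (keep ? 1 : 0));
    }
    *size = new_size;
}
```

Because new_size never exceeds i, the write to sel[new_size] never clobbers an index that has not been read yet, so no second buffer is needed.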
[GitHub] [incubator-doris] syb853553110 commented on issue #8435: [Enhancement] The bitmap_hash function can be implemented using murmur_hash3_128
syb853553110 commented on issue #8435:
URL: https://github.com/apache/incubator-doris/issues/8435#issuecomment-1066405091

> I think this modification will be incompatible with old data @syb853553110

https://doris.apache.org/zh-CN/sql-reference/sql-functions/bitmap-functions/bitmap_hash.html#description
Is it possible to add a method? For example bitmap_hash128
[GitHub] [incubator-doris] wangbo commented on a change in pull request #8318: [improvement](storage) Low cardinality string optimization in storage layer
wangbo commented on a change in pull request #8318: URL: https://github.com/apache/incubator-doris/pull/8318#discussion_r825609607 ## File path: be/src/olap/comparison_predicate.cpp ## @@ -145,28 +146,68 @@ COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(LessEqualPredicate, <=) COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterPredicate, >) COMPARISON_PRED_COLUMN_BLOCK_EVALUATE(GreaterEqualPredicate, >=) -#define COMPARISON_PRED_COLUMN_EVALUATE(CLASS, OP) \ +#define COMPARISON_PRED_COLUMN_EVALUATE(CLASS, OP, IS_RANGE) \ template \ void CLASS::evaluate(vectorized::IColumn& column, uint16_t* sel, uint16_t* size) const { \ uint16_t new_size = 0; \ if (column.is_nullable()) { \ -auto* nullable_column = \ +auto* nullable_col = \ vectorized::check_and_get_column(column); \ -auto& null_bitmap = reinterpret_cast&>(\ - *(nullable_column->get_null_map_column_ptr())) \ +auto& null_bitmap = reinterpret_cast( \ +nullable_col->get_null_map_column()) \ .get_data(); \ -auto* nest_column_vector = \ - vectorized::check_and_get_column>( \ -nullable_column->get_nested_column()); \ -auto& data_array = nest_column_vector->get_data(); \ -for (uint16_t i = 0; i < *size; i++) { \ -uint16_t idx = sel[i]; \ -sel[new_size] = idx; \ -const type& cell_value = reinterpret_cast(data_array[idx]); \ -bool ret = !null_bitmap[idx] && (cell_value OP _value); \ -new_size += _opposite ? 
!ret : ret; \ +auto& nested_col = nullable_col->get_nested_column(); \ +if (nested_col.is_column_dictionary()) { \ +if constexpr (std::is_same_v) { \ +auto* nested_col_ptr = vectorized::check_and_get_column< \ + vectorized::ColumnDictionary>(nested_col); \ +auto code = nested_col_ptr->find_code(_value); \ +if (code < 0 && IS_RANGE) { \ +code = nested_col_ptr->find_bound_code(_value, 0 OP 1, 1 OP 1 ); \ +} \ +auto& data_array = nested_col_ptr->get_data(); \ +for (uint16_t i = 0; i < *size; i++) { \ +uint16_t idx = sel[i]; \ +sel[new_size] = idx; \ +const auto& cell_value = \ +reinterpret_cast(data_array[idx]); \ +bool ret = !null_bitmap[idx] && (cell_value OP code); \ +new_size += _opposite ? !ret : ret; \ +} \ +} \ +} else { \ +auto* nested_col_ptr = \ + vectorized::check_and_get_column>( \ +nested_col); \ +
[GitHub] [incubator-doris] github-actions[bot] commented on pull request #8202: [improvment] show export support label like
github-actions[bot] commented on pull request #8202: URL: https://github.com/apache/incubator-doris/pull/8202#issuecomment-1066426017
[incubator-doris] branch dev-1.0.1 created (now 9f6dabf)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch dev-1.0.1
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git.

      at 9f6dabf [fix][cherry-pick] fix compilation bug with cherry-pick

This branch includes the following new commits:

     new 6d9cd70 [chore](dependency) upgrade-grpc-version (#8218)
     new 343b851 [refactor](fe) Remove version hash on FE side (#8099)
     new 3f1dbd7 [improvement](olap) using placement-new to avoid dynamic mallocing for ParsedPage (#8172)
     new 6faac21 [Improvement] Add minimum fe meta version check (#8203)
     new 0245d71 [refactor] change mysql server version to avoid some cve issues (#8223)
     new accff9d [Enhancement](routine_load) Support show routine load statement with like predicate (#8188)
     new c04c9e1 [Feature](create_table) Support create table with random distribution to avoid data skew (#8041)
     new 3ac2b90 [feature](iceberg) Step3: Support query iceberg external table (#8179)
     new fa8e124 [typo](doc)fix some confusing doc content (#8239)
     new cf6582c [chore] Support aarch64 target with ldb_toolchain (#8249)
     new 3378b40 [fix](be-ut) fix unit test bug for tablet_info_test (#8253)
     new 8081a9d [refactor](fe) Remove old fe meta version (#8246)
     new 66c9bc2 [community] add more collaborators in .asf.yaml (#8029) (#8252)
     new 6268818 [doc] Modify document of compilation on ARM64 (#8254)
     new f39a58d [typo] fix listdb description error (#8257)
     new 1aba49e [docs] fix document date-time-functions typo (#8053)
     new fb7edf1 [feature][show-transaction] Support view transactions info for specified status by `SHOW TRANSACTION` stmt (#8156)
     new beaad22 [improvement] Upgrade MySQL version to 5.7.37 to reduce unnecessary CVE issues (#8247)
     new 0bcdba3 Revert "[chore](dependency) upgrade-grpc-version (#8218)" (#8250)
     new c0de629 [chore] make options of build.sh and run-be-ut.sh work (#8271)
     new bee31bd [feature-wip][array-type] Refactor type info for nested array. (#8279)
     new b375e4f [fix](ut) query stmt test error (#8303)
     new 6cfa843 [fix] (rpc-udf) Fixed the problem that the query could not be interrupted (#8248)
     new 836d8ce [fix](fe-ut) Fix FE unit test (#8293)
     new 925d3f6 [refactor] remove pusher.cpp and related mock test code (#8288)
     new f01312a [refactor] remove types_test (#8289)
     new a22f286 [improvement][fix](grouping-set)(tablet-repair) optimize compaction too slow replica process, (#8123)
     new ae45eed [improvement](restore) allow query on part of partitions when others are in RESTORE (#8245)
     new f6fee5b [improvement](routine-load) Support routine load task succeed with empty data consumed (#8256)
     new 4d7fb6c [Feature] Support Changing the bucketing mode of the table from Hash Distribution to Random Distribution (#8259)
     new b1b52fe [Enhancement] Support Skipping compaction lower replica where select queryable replica for better scan performance (#8146)
     new f256b88 [typo]update spark build doc (#8333)
     new 62da121 [typo](comment) Translate the code comments of gensrc (#8308)
     new 377f2b3 [docs] fix invalid links in docker-dev document (#8313)
     new 564e4a0 [improvement][website] The expansion of sidebar is off by default (#8314)
     new 55f5f57 [typo] translate the comments of schema_change.h (#8321)
     new 3865303 [fix](ut) fix be ut fragment_mgr_test compile failed (#8344)
     new 02aab7b [fix](planner) Convert format in RewriteFromUnixTimeRule (#8235)
     new 4a17e0c [doc] Add sync job fe configuration item description (#8349)
     new cd423c3 format fe config title , add link for tablet_rebalancer_type (#8346)
     new bb41500 [docs]update http port doc to be more intuitive (#8343)
     new eb198c3 support doriswriter build in macos (#8330)
     new 83da0cf [community] Modify doris connector release doc (#8275)
     new 08b0d3b [typo]fix some typo in fe_config (#8325)
     new 9adfd37 [license] Organize third-party dependent licenses for bianry releases (#8350)
     new 2714d0f [optimize] optimze tablet read, avoid to create too much scanner for small tablet (#8096)
new 6f8a026 [doc] Translate Chinese comment to English (#8340) new 3c6fe9d [improvement](regression-test) add aggregation tests from trino to doris (#8375) new 440c95a [fix](replica) handle replica version missing info to avoid -214 error (#8209) new 722236d [refactor] remove agent status (#8273) new 85c33a0 [docs] add document conditional-functions (#8339) new 1c71a59 [refactor] remove old schema change code on BE (#8342) new b63fc67 [doc] Update BROKER LOAD.md (#8361) new d796075 [doc] update substring.md (#8398) new dfd50da [typo] translate the comments of delete_handler.cpp (#8402) new ecefc73 [improvement](vectorized) Merge block in scanner to speed u