[GitHub] [doris] hello-stephen commented on pull request #16365: [refactor](Nereids) remove trick datatype code in Expression
hello-stephen commented on PR #16365: URL: https://github.com/apache/doris/pull/16365#issuecomment-1413307136 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 33.76 seconds load time: 491 seconds storage size: 17170926637 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202080816_clickbench_pr_89485.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] yiguolei opened a new issue, #16366: [Enhancement] Doris query layer should be exception safe
yiguolei opened a new issue, #16366: URL: https://github.com/apache/doris/issues/16366 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues. ### Description _No response_ ### Solution _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] yiguolei opened a new pull request, #16367: [enhancement](stream receiver) make data stream receiver exception safe.
yiguolei opened a new pull request, #16367: URL: https://github.com/apache/doris/pull/16367 # Proposed changes Issue Number: close #xxx ## Problem summary part of https://github.com/apache/doris/issues/16366 ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] jackwener opened a new pull request, #16368: [enhance](Nereids): polish code
jackwener opened a new pull request, #16368: URL: https://github.com/apache/doris/pull/16368 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] dataroaring merged pull request #16355: [Feature-WIP](inverted index) support array type for inverted index reader
dataroaring merged PR #16355: URL: https://github.com/apache/doris/pull/16355
[doris] branch master updated: [Feature-WIP](inverted index) support array type for inverted index reader (#16355)
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new bb179b77f7 [Feature-WIP](inverted index) support array type for inverted index reader (#16355)
bb179b77f7 is described below

commit bb179b77f75d2b0471eb7b3b75ad783d21596194
Author: YueW <45946325+tany...@users.noreply.github.com>
AuthorDate: Thu Feb 2 16:14:14 2023 +0800

    [Feature-WIP](inverted index) support array type for inverted index reader (#16355)
---
 be/src/vec/exec/scan/vscan_node.cpp| 20 ++-
 .../main/java/org/apache/doris/catalog/Type.java | 10
 .../java/org/apache/doris/analysis/IndexDef.java | 4 ++
 .../org/apache/doris/analysis/MatchPredicate.java | 69 +++--
 .../data/inverted_index_p0/test_array_index.out| 58 ++
 .../inverted_index_p0/test_array_index.groovy | 70 ++
 6 files changed, 197 insertions(+), 34 deletions(-)

diff --git a/be/src/vec/exec/scan/vscan_node.cpp b/be/src/vec/exec/scan/vscan_node.cpp
index 198e7ab0c7..d0fc12f37a 100644
--- a/be/src/vec/exec/scan/vscan_node.cpp
+++ b/be/src/vec/exec/scan/vscan_node.cpp
@@ -49,6 +49,17 @@ static bool ignore_cast(SlotDescriptor* slot, VExpr* expr) {
     if (slot->type().is_string_type() && expr->type().is_string_type()) {
         return true;
     }
+    if (slot->type().is_array_type()) {
+        if (slot->type().children[0].type == expr->type().type) {
+            return true;
+        }
+        if (slot->type().children[0].is_date_type() && expr->type().is_date_type()) {
+            return true;
+        }
+        if (slot->type().children[0].is_string_type() && expr->type().is_string_type()) {
+            return true;
+        }
+    }
     return false;
 }

@@ -391,7 +402,14 @@ Status VScanNode::_normalize_conjuncts() {
     std::vector slots = _output_tuple_desc->slots();

     for (int slot_idx = 0; slot_idx < slots.size(); ++slot_idx) {
-        switch (slots[slot_idx]->type().type) {
+        auto type = slots[slot_idx]->type().type;
+        if (slots[slot_idx]->type().type == TYPE_ARRAY) {
+            type = slots[slot_idx]->type().children[0].type;
+            if (type == TYPE_ARRAY) {
+                continue;
+            }
+        }
+        switch (type) {
 #define M(NAME) \
     case TYPE_##NAME: { \
         ColumnValueRange range(slots[slot_idx]->col_name(), \
diff --git a/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java b/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java
index e6c2e3a4cd..ef3ec7c834 100644
--- a/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java
+++ b/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java
@@ -109,6 +109,7 @@ public abstract class Type {
     private static final Logger LOG = LogManager.getLogger(Type.class);

     private static final ArrayList integerTypes;
+    private static final ArrayList stringTypes;
     private static final ArrayList numericTypes;
     private static final ArrayList numericDateTimeTypes;
     private static final ArrayList supportedTypes;
@@ -123,6 +124,11 @@ public abstract class Type {
         integerTypes.add(BIGINT);
         integerTypes.add(LARGEINT);

+        stringTypes = Lists.newArrayList();
+        stringTypes.add(CHAR);
+        stringTypes.add(VARCHAR);
+        stringTypes.add(STRING);
+
         numericTypes = Lists.newArrayList();
         numericTypes.addAll(integerTypes);
         numericTypes.add(FLOAT);
@@ -207,6 +213,10 @@ public abstract class Type {
         return integerTypes;
     }

+    public static ArrayList getStringTypes() {
+        return stringTypes;
+    }
+
     public static ArrayList getNumericTypes() {
         return numericTypes;
     }
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java
index ed03dbd84e..d1c21b5d37 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java
@@ -17,6 +17,7 @@

 package org.apache.doris.analysis;

+import org.apache.doris.catalog.ArrayType;
 import org.apache.doris.catalog.Column;
 import org.apache.doris.catalog.KeysType;
 import org.apache.doris.catalog.PrimitiveType;
@@ -176,6 +177,9 @@ public class IndexDef {
                 || indexType == IndexType.NGRAM_BF) {
             String indexColName = column.getName();
             PrimitiveType colType = column.getDataType();
+            if (indexType == IndexType.INVERTED && colType.isArrayType()) {
+                colType = ((ArrayType) column.getType()).getItemType().getP
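The `ignore_cast` hunk in the commit above lets an array slot skip the cast when its element type is compatible with the expression type. The following is a minimal sketch of that rule only. `TypeTag` and `TypeDesc` are hypothetical, simplified stand-ins and not Doris's `TypeDescriptor`, and only the rules visible in the hunk are modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the real type descriptors.
enum class TypeTag { Int, Date, DateTime, Char, Varchar, String, Array };

struct TypeDesc {
    TypeTag tag;
    std::vector<TypeDesc> children; // element type for Array, empty otherwise

    bool is_date_type() const { return tag == TypeTag::Date || tag == TypeTag::DateTime; }
    bool is_string_type() const {
        return tag == TypeTag::Char || tag == TypeTag::Varchar || tag == TypeTag::String;
    }
    bool is_array_type() const { return tag == TypeTag::Array; }
};

// Returns true when the cast between slot type and expression type can be skipped.
bool ignore_cast(const TypeDesc& slot, const TypeDesc& expr) {
    if (slot.is_string_type() && expr.is_string_type()) {
        return true;
    }
    if (slot.is_array_type()) {
        assert(!slot.children.empty());
        const TypeDesc& elem = slot.children[0];
        // Compare against the array's element type, not the array itself.
        if (elem.tag == expr.tag) return true;
        if (elem.is_date_type() && expr.is_date_type()) return true;
        if (elem.is_string_type() && expr.is_string_type()) return true;
    }
    return false;
}
```

With these stand-ins, an `Array<Varchar>` slot compared with a `String` expression skips the cast, while the same slot compared with an `Int` expression does not.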
[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15966: [Feature](map)support complex struct for doris
github-actions[bot] commented on code in PR #15966: URL: https://github.com/apache/doris/pull/15966#discussion_r1094172591 ## be/src/vec/data_types/data_type_factory.cpp: ## @@ -169,6 +179,12 @@ DataTypePtr DataTypeFactory::create_data_type(const TypeDescriptor& col_desc, bo } nested = std::make_shared(dataTypes, names); break; +case TYPE_MAP: +DCHECK(col_desc.children.size() == 2); +nested = std::make_shared( +create_data_type(col_desc.children[0], col_desc.contains_nulls[0]), +create_data_type(col_desc.children[1], col_desc.contains_nulls[1])); +break; } case INVALID_TYPE: Review Comment: warning: 'case' statement not in switch statement [clang-diagnostic-error] ```cpp case INVALID_TYPE: ^ ``` ## be/src/vec/data_types/data_type_factory.cpp: ## @@ -169,6 +179,12 @@ } nested = std::make_shared(dataTypes, names); break; +case TYPE_MAP: +DCHECK(col_desc.children.size() == 2); +nested = std::make_shared( +create_data_type(col_desc.children[0], col_desc.contains_nulls[0]), +create_data_type(col_desc.children[1], col_desc.contains_nulls[1])); +break; } case INVALID_TYPE: default: Review Comment: warning: 'default' statement not in switch statement [clang-diagnostic-error] ```cpp default: ^ ```
[GitHub] [doris] yiguolei merged pull request #16349: [fix](join) crash caused by canceling query (Cherry-pick from #16311)
yiguolei merged PR #16349: URL: https://github.com/apache/doris/pull/16349
[GitHub] [doris] github-actions[bot] commented on pull request #16367: [enhancement](stream receiver) make data stream receiver exception safe.
github-actions[bot] commented on PR #16367: URL: https://github.com/apache/doris/pull/16367#issuecomment-1413317797 clang-tidy review says "All clean, LGTM! :+1:"
[doris] branch branch-1.2-lts updated: [fix](join) crash caused by canceling query (#16311) (#16349)
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/branch-1.2-lts by this push:
     new 495d37d337 [fix](join) crash caused by canceling query (#16311) (#16349)
495d37d337 is described below

commit 495d37d33761f70565fac6978ee97a1df09e01a1
Author: Jerry Hu
AuthorDate: Thu Feb 2 16:17:17 2023 +0800

    [fix](join) crash caused by canceling query (#16311) (#16349)

    If the query was canceled, the status in shared context may be `OK` with other fields not set.
---
 be/src/vec/exec/join/vhash_join_node.cpp| 3 ++-
 be/src/vec/runtime/shared_hash_table_controller.cpp | 11 +++
 be/src/vec/runtime/shared_hash_table_controller.h | 1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/be/src/vec/exec/join/vhash_join_node.cpp b/be/src/vec/exec/join/vhash_join_node.cpp
index 9f967a8190..9408622f78 100644
--- a/be/src/vec/exec/join/vhash_join_node.cpp
+++ b/be/src/vec/exec/join/vhash_join_node.cpp
@@ -1081,7 +1081,8 @@ std::vector HashJoinNode::_convert_block_to_null(Block& block) {

 HashJoinNode::~HashJoinNode() {
     if (_shared_hashtable_controller && _should_build_hash_table) {
-        _shared_hashtable_controller->signal(id());
+        // signal at here is abnormal
+        _shared_hashtable_controller->signal(id(), Status::Cancelled("signaled in destructor"));
     }
 }

diff --git a/be/src/vec/runtime/shared_hash_table_controller.cpp b/be/src/vec/runtime/shared_hash_table_controller.cpp
index e9e125a168..e558798644 100644
--- a/be/src/vec/runtime/shared_hash_table_controller.cpp
+++ b/be/src/vec/runtime/shared_hash_table_controller.cpp
@@ -42,6 +42,17 @@ SharedHashTableContextPtr SharedHashTableController::get_context(int my_node_id)
     return _shared_contexts[my_node_id];
 }

+void SharedHashTableController::signal(int my_node_id, Status status) {
+    std::lock_guard lock(_mutex);
+    auto it = _shared_contexts.find(my_node_id);
+    if (it != _shared_contexts.cend()) {
+        it->second->signaled = true;
+        it->second->status = status;
+        _shared_contexts.erase(it);
+    }
+    _cv.notify_all();
+}
+
 void SharedHashTableController::signal(int my_node_id) {
     std::lock_guard lock(_mutex);
     auto it = _shared_contexts.find(my_node_id);
diff --git a/be/src/vec/runtime/shared_hash_table_controller.h b/be/src/vec/runtime/shared_hash_table_controller.h
index e2c54f533d..1b058dcebe 100644
--- a/be/src/vec/runtime/shared_hash_table_controller.h
+++ b/be/src/vec/runtime/shared_hash_table_controller.h
@@ -67,6 +67,7 @@ public:
     TUniqueId get_builder_fragment_instance_id(int my_node_id);
     SharedHashTableContextPtr get_context(int my_node_id);
     void signal(int my_node_id);
+    void signal(int my_node_id, Status status);
     Status wait_for_signal(RuntimeState* state, const SharedHashTableContextPtr& context);
     bool should_build_hash_table(const TUniqueId& fragment_instance_id, int my_node_id);
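The fix above makes the destructor path record an explicit non-OK status before waking waiters, so a waiter cannot mistake a cancelled build for a successful one. A minimal sketch of that pattern follows. `Status`, `SharedContext`, `Controller`, and `get_or_create` are hypothetical stand-ins written for this illustration, not the Doris classes, and the sketch assumes waiters hold a shared pointer to the context so it stays alive after being erased from the map.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a status type with an OK and a cancelled state.
struct Status {
    bool ok = true;
    std::string msg;
    static Status OK() { return {}; }
    static Status Cancelled(std::string m) { return {false, std::move(m)}; }
};

struct SharedContext {
    bool signaled = false;
    Status status;
};

class Controller {
public:
    std::shared_ptr<SharedContext> get_or_create(int node_id) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto& ctx = _contexts[node_id];
        if (!ctx) ctx = std::make_shared<SharedContext>();
        return ctx;
    }

    // Mark the context as signaled, record why, and wake every waiter.
    void signal(int node_id, Status status) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _contexts.find(node_id);
        if (it != _contexts.cend()) {
            it->second->signaled = true;
            it->second->status = std::move(status);
            _contexts.erase(it);
        }
        _cv.notify_all();
    }

    // Block until the context is signaled, then report the recorded status.
    Status wait_for_signal(const std::shared_ptr<SharedContext>& ctx) {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [&] { return ctx->signaled; });
        return ctx->status;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    std::map<int, std::shared_ptr<SharedContext>> _contexts;
};
```

A waiter that receives a non-OK status from `wait_for_signal` can return early instead of reading fields of the shared context that were never set.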
[GitHub] [doris] weizhengte opened a new pull request, #16369: [Improvement](statistics) optimise histogram keyword
weizhengte opened a new pull request, #16369: URL: https://github.com/apache/doris/pull/16369 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] Kikyou1997 opened a new pull request, #16370: [fix](planner) Doris returns empty sets when select from a inline view
Kikyou1997 opened a new pull request, #16370: URL: https://github.com/apache/doris/pull/16370 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15966: [Feature](map)support complex struct for doris
github-actions[bot] commented on code in PR #15966: URL: https://github.com/apache/doris/pull/15966#discussion_r1094185768 ## be/src/vec/data_types/data_type_factory.cpp: ## @@ -169,7 +179,12 @@ } nested = std::make_shared(dataTypes, names); break; -} +case TYPE_MAP: +DCHECK(col_desc.children.size() == 2); +nested = std::make_shared( +create_data_type(col_desc.children[0], col_desc.contains_nulls[0]), +create_data_type(col_desc.children[1], col_desc.contains_nulls[1])); +break; case INVALID_TYPE: default: Review Comment: warning: cannot jump from switch statement to this case label [clang-diagnostic-error] ```cpp default: ^ ``` **be/src/vec/data_types/data_type_factory.cpp:172:** jump bypasses variable initialization ```cpp Strings names; ^ ``` **be/src/vec/data_types/data_type_factory.cpp:171:** jump bypasses variable initialization ```cpp DataTypes dataTypes; ^ ``` **be/src/vec/data_types/data_type_factory.cpp:169:** jump bypasses variable initialization ```cpp size_t child_size = col_desc.children.size(); ^ ``` ## be/src/vec/data_types/data_type_factory.cpp: ## @@ -169,7 +179,12 @@ DataTypePtr DataTypeFactory::create_data_type(const TypeDescriptor& col_desc, bo } nested = std::make_shared(dataTypes, names); break; -} +case TYPE_MAP: +DCHECK(col_desc.children.size() == 2); +nested = std::make_shared( +create_data_type(col_desc.children[0], col_desc.contains_nulls[0]), +create_data_type(col_desc.children[1], col_desc.contains_nulls[1])); +break; case INVALID_TYPE: Review Comment: warning: cannot jump from switch statement to this case label [clang-diagnostic-error] ```cpp case INVALID_TYPE: ^ ``` **be/src/vec/data_types/data_type_factory.cpp:172:** jump bypasses variable initialization ```cpp Strings names; ^ ``` **be/src/vec/data_types/data_type_factory.cpp:171:** jump bypasses variable initialization ```cpp DataTypes dataTypes; ^ ``` **be/src/vec/data_types/data_type_factory.cpp:169:** jump bypasses variable initialization ```cpp size_t child_size = 
col_desc.children.size(); ^ ```
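Both clang diagnostics reported on PR #15966 come from how braces scope a `case` body: a stray `}` closes the switch early, and a missing `}` leaves initialized locals in scope for later labels. A minimal sketch of the usual remedy, giving each case that declares locals its own braced block, is shown below. `Kind` and `describe` are hypothetical names used only for this illustration and are not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Struct, Map, Invalid };

std::string describe(Kind kind, const std::vector<std::string>& children) {
    std::string nested;
    switch (kind) {
    case Kind::Struct: {
        // The braces end the lifetime of these locals before the next label,
        // so no later case jumps past their initialization.
        std::size_t child_size = children.size();
        std::string names;
        for (std::size_t i = 0; i < child_size; ++i) {
            names += (i ? "," : "") + children[i];
        }
        nested = "struct<" + names + ">";
        break;
    }
    case Kind::Map: {
        assert(children.size() == 2); // key type and value type
        nested = "map<" + children[0] + "," + children[1] + ">";
        break;
    }
    case Kind::Invalid:
    default:
        nested = "invalid";
        break;
    }
    return nested;
}
```

Keeping the closing brace of one case directly before the next `case` label makes both mistakes easy to spot in review.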
[GitHub] [doris] Tanya-W opened a new pull request, #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.
Tanya-W opened a new pull request, #16371: URL: https://github.com/apache/doris/pull/16371 # Proposed changes Issue Number: Step 5 of [DSIP-023: Add inverted index for full text search](https://cwiki.apache.org/confluence/display/DORIS/DSIP-023%3A+Add+inverted+index+for+full+text+search?src=contextnavpagetreemode): implementation of add/drop inverted index. Dependency PRs: https://github.com/apache/doris/pull/14211 https://github.com/apache/doris/pull/15823 https://github.com/apache/doris/pull/14207 https://github.com/apache/doris/pull/15821 ## Problem summary 1. Support creating multiple inverted indexes at the same time. 2. When executing an alter inverted index operation, only FE's meta is updated; BE's meta does not need to be modified, and reads/writes are based on FE's meta. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on a diff in pull request #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.
github-actions[bot] commented on code in PR #16371: URL: https://github.com/apache/doris/pull/16371#discussion_r1094196033 ## be/src/olap/schema_change.cpp: ## @@ -586,6 +592,248 @@ Status VSchemaChangeWithSorting::_external_sorting(vector& src_ return Status::OK(); } +SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex( +const std::vector& alter_inverted_indexs, +const TabletSchemaSPtr& tablet_schema) +: SchemaChange(), Review Comment: warning: initializer for base class 'doris::SchemaChange' is redundant [readability-redundant-member-init] ```suggestion : , _alter_inverted_indexs(alter_inverted_indexs), _tablet_schema(tablet_schema) { ``` ## be/src/olap/schema_change.cpp: ## @@ -586,6 +592,248 @@ return Status::OK(); } +SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex( +const std::vector& alter_inverted_indexs, +const TabletSchemaSPtr& tablet_schema) +: SchemaChange(), + _alter_inverted_indexs(alter_inverted_indexs), + _tablet_schema(tablet_schema) {} + +SchemaChangeForInvertedIndex::~SchemaChangeForInvertedIndex() { +VLOG_NOTICE << "~SchemaChangeForInvertedIndex()"; +_inverted_index_builders.clear(); +_index_metas.clear(); +} + +Status SchemaChangeForInvertedIndex::process(RowsetReaderSharedPtr rowset_reader, + RowsetWriter* rowset_writer, + TabletSharedPtr new_tablet, + TabletSharedPtr base_tablet, + TabletSchemaSPtr base_tablet_schema) { +Status res = Status::OK(); +if (rowset_reader->rowset()->empty() || rowset_reader->rowset()->num_rows() == 0) { +return Status::OK(); +} + +std::vector return_columns; +for (auto& inverted_index : _alter_inverted_indexs) { +DCHECK_EQ(inverted_index.columns.size(), 1); +auto column_name = inverted_index.columns[0]; +auto idx = _tablet_schema->field_index(column_name); +return_columns.emplace_back(idx); +} + +// create inverted index writer +auto rowset_meta = rowset_reader->rowset()->rowset_meta(); +std::string segment_dir = base_tablet->tablet_path(); +auto fs = rowset_meta->fs(); +for (auto i = 0; i < 
rowset_meta->num_segments(); ++i) { +std::string segment_filename = +fmt::format("{}_{}.dat", rowset_meta->rowset_id().to_string(), i); +for (auto& inverted_index : _alter_inverted_indexs) { +DCHECK_EQ(inverted_index.columns.size(), 1); +auto column_name = inverted_index.columns[0]; +auto column = _tablet_schema->column(column_name); +auto index_id = inverted_index.index_id; + +std::unique_ptr field(FieldFactory::create(column)); +_index_metas.emplace_back(new TabletIndex()); +_index_metas.back()->init_from_thrift(inverted_index, *_tablet_schema); +std::unique_ptr inverted_index_builder; +try { +RETURN_IF_ERROR(segment_v2::InvertedIndexColumnWriter::create( +field.get(), &inverted_index_builder, segment_filename, segment_dir, +_index_metas.back().get(), fs)); +} catch (const std::exception& e) { +LOG(WARNING) << "CLuceneError occured: " << e.what(); +return Status::Error(); +} + +if (inverted_index_builder) { +std::string writer_sign = fmt::format("{}_{}", i, index_id); +_inverted_index_builders.insert( +std::make_pair(writer_sign, std::move(inverted_index_builder))); +} +} +} + +SegmentCacheHandle segment_cache_handle; +// load segments +RETURN_NOT_OK(SegmentLoader::instance()->load_segments( +std::static_pointer_cast(rowset_reader->rowset()), &segment_cache_handle, +false)); + +// create iterator for each segment +StorageReadOptions read_options; +OlapReaderStatistics stats; +read_options.stats = &stats; +read_options.tablet_schema = _tablet_schema; +std::unique_ptr schema = +std::make_unique(_tablet_schema->columns(), return_columns); +for (auto& seg_ptr : segment_cache_handle.get_segments()) { +std::unique_ptr iter; +res = seg_ptr->new_iterator(*schema, read_options, &iter); +if (!res.ok()) { +LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() + << "]: " << res.to_string(); +return Status::Error(); +} + +std::shared_ptr block = + std::make_shared(_tablet_schema->create_block(return_columns)); +do { +block->clear_column_data(); +res = 
iter->next_batch(block.g
[GitHub] [doris] BiteTheDDDDt merged pull request #16357: [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable
BiteTheDDDDt merged PR #16357: URL: https://github.com/apache/doris/pull/16357
[doris] branch master updated: [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable
This is an automated email from the ASF dual-hosted git repository.

panxiaolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 68d2067f51 [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable
68d2067f51 is described below

commit 68d2067f518c555908e17178df2f3fe91a00ae3d
Author: Kang
AuthorDate: Thu Feb 2 16:42:58 2023 +0800

    [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable

    change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable
---
 .../test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out | 9 -
 .../test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy | 4 +++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out b/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out
index 2a9437547a..70f2bed42a 100644
--- a/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out
+++ b/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out
@@ -4,10 +4,17 @@
 1
 1

+-- !select_k1 --
+1
+2
+2
+3
+3
+
 -- !select_star --
 1 1 a
-2 2 bb
 2 2 b
+2 2 bb
 3 3 c
 3 3 c

diff --git a/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy b/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy
index f7b133e426..710e52fbc4 100644
--- a/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy
+++ b/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy
@@ -69,7 +69,9 @@ suite ("test_dup_mv_bitmap_hash") {
     sql "insert into d_table select 2,2,'bb';"
     sql "insert into d_table select 3,3,'c';"

-    qt_select_star "select * from d_table order by k1;"
+    qt_select_k1 "select k1 from d_table order by k1;"
+
+    qt_select_star "select * from d_table order by k1,k2,k3;"

     explain {
         sql("select k1,bitmap_union_count(bitmap_hash(k3)) from d_table group by k1;")
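The test became stable because rows `2 2 b` and `2 2 bb` tie on `k1`, so `order by k1` alone leaves their relative order unspecified, while `order by k1,k2,k3` breaks the tie. The same idea can be sketched in C++ as an analogy; `Row` and `order_by_all_keys` are hypothetical names and this is not how the regression framework compares results.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

struct Row {
    int k1;
    int k2;
    std::string k3;
};

// Sort on every column so rows that tie on k1 still have one well-defined order.
std::vector<Row> order_by_all_keys(std::vector<Row> rows) {
    std::sort(rows.begin(), rows.end(), [](const Row& a, const Row& b) {
        return std::tie(a.k1, a.k2, a.k3) < std::tie(b.k1, b.k2, b.k3);
    });
    return rows;
}
```

Comparing only `a.k1 < b.k1` with `std::sort` would give no guarantee about the relative order of the two `k1 == 2` rows, which mirrors the unstable expected output the commit fixes.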
[GitHub] [doris] github-actions[bot] commented on a diff in pull request #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.
github-actions[bot] commented on code in PR #16371: URL: https://github.com/apache/doris/pull/16371#discussion_r1094198749 ## be/src/olap/schema_change.cpp: ## @@ -586,6 +592,248 @@ Status VSchemaChangeWithSorting::_external_sorting(vector& src_ return Status::OK(); } +SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex( +const std::vector& alter_inverted_indexs, +const TabletSchemaSPtr& tablet_schema) +: SchemaChange(), Review Comment: warning: initializer for base class 'doris::SchemaChange' is redundant [readability-redundant-member-init] ```suggestion : , ``` ## be/src/olap/schema_change.h: ## @@ -172,11 +177,42 @@ class VSchemaChangeWithSorting : public SchemaChange { std::unique_ptr _mem_tracker; }; +class SchemaChangeForInvertedIndex : public SchemaChange { +public: +explicit SchemaChangeForInvertedIndex(const std::vector& alter_inverted_indexs, + const TabletSchemaSPtr& tablet_schema); +virtual ~SchemaChangeForInvertedIndex(); Review Comment: warning: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override] ```suggestion ~SchemaChangeForInvertedIndex() override; ``` ## be/src/olap/schema_change.cpp: ## @@ -586,6 +592,248 @@ return Status::OK(); } +SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex( +const std::vector& alter_inverted_indexs, +const TabletSchemaSPtr& tablet_schema) +: SchemaChange(), + _alter_inverted_indexs(alter_inverted_indexs), + _tablet_schema(tablet_schema) {} + +SchemaChangeForInvertedIndex::~SchemaChangeForInvertedIndex() { +VLOG_NOTICE << "~SchemaChangeForInvertedIndex()"; +_inverted_index_builders.clear(); +_index_metas.clear(); +} + +Status SchemaChangeForInvertedIndex::process(RowsetReaderSharedPtr rowset_reader, + RowsetWriter* rowset_writer, + TabletSharedPtr new_tablet, + TabletSharedPtr base_tablet, + TabletSchemaSPtr base_tablet_schema) { +Status res = Status::OK(); +if (rowset_reader->rowset()->empty() || rowset_reader->rowset()->num_rows() == 0) { +return Status::OK(); +} + 
+std::vector return_columns; +for (auto& inverted_index : _alter_inverted_indexs) { +DCHECK_EQ(inverted_index.columns.size(), 1); +auto column_name = inverted_index.columns[0]; +auto idx = _tablet_schema->field_index(column_name); +return_columns.emplace_back(idx); +} + +// create inverted index writer +auto rowset_meta = rowset_reader->rowset()->rowset_meta(); +std::string segment_dir = base_tablet->tablet_path(); +auto fs = rowset_meta->fs(); +for (auto i = 0; i < rowset_meta->num_segments(); ++i) { +std::string segment_filename = +fmt::format("{}_{}.dat", rowset_meta->rowset_id().to_string(), i); +for (auto& inverted_index : _alter_inverted_indexs) { +DCHECK_EQ(inverted_index.columns.size(), 1); +auto column_name = inverted_index.columns[0]; +auto column = _tablet_schema->column(column_name); +auto index_id = inverted_index.index_id; + +std::unique_ptr field(FieldFactory::create(column)); +_index_metas.emplace_back(new TabletIndex()); +_index_metas.back()->init_from_thrift(inverted_index, *_tablet_schema); +std::unique_ptr inverted_index_builder; +try { +RETURN_IF_ERROR(segment_v2::InvertedIndexColumnWriter::create( +field.get(), &inverted_index_builder, segment_filename, segment_dir, +_index_metas.back().get(), fs)); +} catch (const std::exception& e) { +LOG(WARNING) << "CLuceneError occured: " << e.what(); +return Status::Error(); +} + +if (inverted_index_builder) { +std::string writer_sign = fmt::format("{}_{}", i, index_id); +_inverted_index_builders.insert( +std::make_pair(writer_sign, std::move(inverted_index_builder))); +} +} +} + +SegmentCacheHandle segment_cache_handle; +// load segments +RETURN_NOT_OK(SegmentLoader::instance()->load_segments( +std::static_pointer_cast(rowset_reader->rowset()), &segment_cache_handle, +false)); + +// create iterator for each segment +StorageReadOptions read_options; +OlapReaderStatistics stats; +read_options.stats = &stats; +read_options.tablet_schema = _tablet_schema; +std::unique_ptr schema = 
+std::make_unique(_tablet_schema->columns(), return_columns); +
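The two clang-tidy warnings quoted in the review above (readability-redundant-member-init and modernize-use-override) can be illustrated with a minimal sketch. The class names below are hypothetical stand-ins, not the Doris definitions; the sketch only assumes a base class with a default constructor and a virtual destructor.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a base class such as SchemaChange.
class SchemaChangeBase {
public:
    SchemaChangeBase() = default;
    virtual ~SchemaChangeBase() = default;
    virtual std::string name() const { return "base"; }
};

// Hypothetical stand-in for a derived class such as SchemaChangeForInvertedIndex.
class InvertedIndexChange : public SchemaChangeBase {
public:
    // readability-redundant-member-init: the initializer list does not mention
    // SchemaChangeBase(), because the base default constructor runs anyway.
    explicit InvertedIndexChange(std::string column) : _column(std::move(column)) {}

    // modernize-use-override: `override` replaces a repeated `virtual`, so the
    // compiler rejects the declaration if it stops matching the base class.
    ~InvertedIndexChange() override = default;
    std::string name() const override { return "inverted_index:" + _column; }

private:
    std::string _column;
};
```

Calling `name()` through a `const SchemaChangeBase&` that refers to an `InvertedIndexChange` dispatches to the derived implementation, which is the behavior `override` lets the compiler verify.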
[GitHub] [doris] wsjz opened a new pull request, #16372: [fix](iceberg) fix iceberg catalog rest access
wsjz opened a new pull request, #16372: URL: https://github.com/apache/doris/pull/16372 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] github-actions[bot] commented on pull request #16372: [fix](iceberg) fix iceberg catalog rest access
github-actions[bot] commented on PR #16372: URL: https://github.com/apache/doris/pull/16372#issuecomment-1413352710 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] hello-stephen commented on pull request #16358: [Improve](row-store) check light schema change must enabled
hello-stephen commented on PR #16358: URL: https://github.com/apache/doris/pull/16358#issuecomment-1413352729 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 34.72 seconds load time: 486 seconds storage size: 17170849132 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202084816_clickbench_pr_89361.html
[GitHub] [doris] morrySnow merged pull request #16312: [fix](Nereids): fix bugs in test join5
morrySnow merged PR #16312: URL: https://github.com/apache/doris/pull/16312
[doris] branch master updated: [fix](Nereids) fix bugs in test join5 (#16312)
This is an automated email from the ASF dual-hosted git repository. morrysnow pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 398da44e46 [fix](Nereids) fix bugs in test join5 (#16312) 398da44e46 is described below commit 398da44e469170ca8a79904e9b7697f77301c943 Author: 谢健 AuthorDate: Thu Feb 2 16:51:45 2023 +0800 [fix](Nereids) fix bugs in test join5 (#16312) make bucket-shuffle-join in PhysicalPlanTranlator when property of left child is not enforced --- .../glue/translator/PhysicalPlanTranslator.java| 6 ++- .../nereids/properties/DistributionSpecHash.java | 3 ++ .../suites/nereids_p0/join/test_join5.groovy | 60 +++--- 3 files changed, 39 insertions(+), 30 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index 581c2418a3..b51bb7a700 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -1553,6 +1553,7 @@ public class PhysicalPlanTranslator extends DefaultPlanVisitor, List> onClauseUsedSlots = JoinUtils.getOnClauseUsedSlots(physicalHashJoin); List rightPartitionExprIds = Lists.newArrayList(leftDistributionSpec.getOrderedShuffledColumns()); for (int i = 0; i < leftDistributionSpec.getOrderedShuffledColumns().size(); i++) { @@ -1572,11 +1573,14 @@ public class PhysicalPlanTranslator extends DefaultPlanVisitor
[GitHub] [doris] github-actions[bot] commented on pull request #16323: [fix](Nereids) result order in group-by-costant case is not stable
github-actions[bot] commented on PR #16323: URL: https://github.com/apache/doris/pull/16323#issuecomment-1413358952 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #16323: [fix](test) result order in group-by-costant case is not stable
github-actions[bot] commented on PR #16323: URL: https://github.com/apache/doris/pull/16323#issuecomment-1413359007 PR approved by anyone and no changes requested.
[GitHub] [doris] morrySnow merged pull request #16323: [fix](test) result order in group-by-costant case is not stable
morrySnow merged PR #16323: URL: https://github.com/apache/doris/pull/16323
[doris] branch master updated (398da44e46 -> 09abd32957)
This is an automated email from the ASF dual-hosted git repository. morrysnow pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from 398da44e46 [fix](Nereids) fix bugs in test join5 (#16312) add 09abd32957 [fix](test) result order in group-by-costant case is not stable (#16323) No new revisions were added by this update. Summary of changes: .../data/nereids_syntax_p0/group_by_constant.out | 2 +- .../nereids_syntax_p0/group_by_constant.groovy | 44 +++--- 2 files changed, 23 insertions(+), 23 deletions(-)
[GitHub] [doris] github-actions[bot] commented on pull request #15663: [Improvement](topn) order by key topn query optimization
github-actions[bot] commented on PR #15663: URL: https://github.com/apache/doris/pull/15663#issuecomment-1413373172 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] hello-stephen opened a new pull request, #16373: [regression](fix) 1. fix broker load test case and add orc test 2. se…
hello-stephen opened a new pull request, #16373: URL: https://github.com/apache/doris/pull/16373 …t enableBrokerLoad=true in pipeline add a load test for the orc file and let it run in the TeamCity pipeline. # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [x] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] platoneko opened a new pull request, #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets
platoneko opened a new pull request, #16374: URL: https://github.com/apache/doris/pull/16374 # Proposed changes Issue Number: close #10986 ## Problem summary Fix core and support reclaiming rowsets in multiple resources in a tablet. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets
github-actions[bot] commented on PR #16374: URL: https://github.com/apache/doris/pull/16374#issuecomment-1413396934 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] morningman merged pull request #16271: [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog
morningman merged PR #16271: URL: https://github.com/apache/doris/pull/16271
[doris] branch master updated: [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 557159d3ce [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271) 557159d3ce is described below commit 557159d3ceff022903839e45ab07d94c922d244d Author: Tiewei Fang <43782773+bepppo...@users.noreply.github.com> AuthorDate: Thu Feb 2 17:31:33 2023 +0800 [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271) --- be/src/exec/table_connector.cpp| 9 +++-- .../docker-compose/mysql/init/03-create-table.sql | 5 +++ .../docker-compose/oracle/init/03-create-table.sql | 6 .../postgresql/init/02-create-table.sql| 6 .../java/org/apache/doris/analysis/InsertStmt.java | 39 -- .../doris/transaction/DatabaseTransactionMgr.java | 3 +- .../doris/transaction/GlobalTransactionMgr.java| 5 +-- .../jdbc_catalog_p0/test_mysql_jdbc_catalog.out| 13 .../jdbc_catalog_p0/test_oracle_jdbc_catalog.out | 13 .../data/jdbc_catalog_p0/test_pg_jdbc_catalog.out | 13 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.groovy | 23 ++--- .../test_oracle_jdbc_catalog.groovy| 14 +++- .../jdbc_catalog_p0/test_pg_jdbc_catalog.groovy| 13 13 files changed, 140 insertions(+), 22 deletions(-) diff --git a/be/src/exec/table_connector.cpp b/be/src/exec/table_connector.cpp index 12dc3acdc2..30b01b1d03 100644 --- a/be/src/exec/table_connector.cpp +++ b/be/src/exec/table_connector.cpp @@ -226,8 +226,13 @@ Status TableConnector::convert_column_data(const vectorized::ColumnPtr& column_p case TYPE_VARCHAR: case TYPE_CHAR: case TYPE_STRING: { -// here need check the ' is used, now for pg array string must be " -fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size)); +// TODO(zhangstar333): check array data type of postgresql +// for oracle/pg database string must be ' +if (table_type == 
TOdbcTableType::ORACLE || table_type == TOdbcTableType::POSTGRESQL) { +fmt::format_to(_insert_stmt_buffer, "'{}'", fmt::basic_string_view(item, size)); +} else { +fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size)); +} break; } case TYPE_ARRAY: { diff --git a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql index 02c257cbc8..8fb1aebc4b 100644 --- a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql +++ b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql @@ -223,4 +223,9 @@ create table doris_test.ex_tb20 ( decimal_unsigned_long decimal(65, 5) unsigned ) engine=innodb charset=utf8; +create table doris_test.test_insert ( +`id` varchar(128) NULL, +`name` varchar(128) NULL, +`age` int NULL +) engine=innodb charset=utf8; diff --git a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql index d5dd8cf1c6..d2d8d6af7e 100644 --- a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql +++ b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql @@ -78,3 +78,9 @@ t4 timestamp, t5 interval year(3) to month, t6 interval day(3) to second(6) ); + +create table doris_test.test_insert( +id varchar2(128), +name varchar2(128), +age number(5) +); diff --git a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql index b721da297a..6ace3b20cb 100644 --- a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql +++ b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql @@ -150,3 +150,9 @@ CREATE TABLE catalog_pg_test.test12 ( ID INT NOT NULL, uuid_value uuid ); + +CREATE TABLE catalog_pg_test.test_insert ( + id varchar(128), + name varchar(128), + age int +); diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java index 44140b24e9..891fe3349b 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java @@ -31,6 +31,8 @@ import org.apache.doris.catalog.Partition; import org.apache.doris.catalog.PartitionType; import org.apache.doris.catalog.Table; import org.apache.doris.catalog.TableIf; +import org.apache.doris.catalog.external.JdbcExternalDatabase; +import org.apache.doris.catalog.external.JdbcExternalTable; import org.apache.doris.common.AnalysisException; import org.apache.doris.common.DdlExce
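The table_connector.cpp hunk in the commit above picks the quote character for string values by target database: single quotes for Oracle and PostgreSQL, double quotes otherwise. In standard SQL, which both of those databases follow here, double quotes delimit identifiers and single quotes delimit string literals. A minimal sketch of that selection follows; the `TableType` enum and `quote_string_value` function are hypothetical stand-ins for illustration (not `TOdbcTableType` or any Doris API), and, like the visible hunk, the sketch does not escape quote characters inside the value.

```cpp
#include <string>

// Hypothetical enum for illustration; lists only backends named in the commit.
enum class TableType { MYSQL, ORACLE, POSTGRESQL };

// Wraps `value` in the quote character chosen per backend:
// single quotes for Oracle/PostgreSQL, double quotes for everything else.
// Assumption: embedded quote characters are NOT escaped here.
std::string quote_string_value(const std::string& value, TableType table_type) {
    const bool single = table_type == TableType::ORACLE ||
                        table_type == TableType::POSTGRESQL;
    const char quote = single ? '\'' : '"';
    std::string out;
    out.reserve(value.size() + 2);
    out.push_back(quote);
    out.append(value);
    out.push_back(quote);
    return out;
}
```

For example, `quote_string_value("abc", TableType::ORACLE)` yields `'abc'`, while the same call with `TableType::MYSQL` yields `"abc"`.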
[doris] branch master updated (557159d3ce -> cb6875b5a4)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from 557159d3ce [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271) add cb6875b5a4 [improvement](multi-catalog) use date/datetimev2 as default col type for catalog table (#16304) No new revisions were added by this update. Summary of changes: be/src/runtime/buffer_control_block.cpp| 2 +- be/src/vec/exec/scan/vscanner.cpp | 7 ++- .../community/developer-guide/regression-testing.md| 4 ++-- .../doris/catalog/HiveMetaStoreClientHelper.java | 4 ++-- .../apache/doris/external/elasticsearch/EsUtil.java| 2 +- .../org/apache/doris/external/jdbc/JdbcClient.java | 18 +- 6 files changed, 21 insertions(+), 16 deletions(-)
[GitHub] [doris] morningman merged pull request #16304: [improvement](multi-catalog) use date/datetimev2 as default col type for catalog table
morningman merged PR #16304: URL: https://github.com/apache/doris/pull/16304
[GitHub] [doris] github-actions[bot] commented on pull request #16299: [fix](cooldown) Fix bugs in cooldown single replica files
github-actions[bot] commented on PR #16299: URL: https://github.com/apache/doris/pull/16299#issuecomment-1413422083 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #16299: [fix](cooldown) Fix bugs in cooldown single replica files
github-actions[bot] commented on PR #16299: URL: https://github.com/apache/doris/pull/16299#issuecomment-1413422169 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #15971: [Feature](Nereids) Support order and limit in subquery
github-actions[bot] commented on PR #15971: URL: https://github.com/apache/doris/pull/15971#issuecomment-1413428296 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #15971: [Feature](Nereids) Support order and limit in subquery
github-actions[bot] commented on PR #15971: URL: https://github.com/apache/doris/pull/15971#issuecomment-1413428382 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #15966: [Feature](map) add map type to doris
github-actions[bot] commented on PR #15966: URL: https://github.com/apache/doris/pull/15966#issuecomment-1413438340 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] BePPPower opened a new pull request, #16375: [Enhencement](LineReader) rename NewPlainTextLineReader/NewPlainBinaryLineReader to PlainTextLineReader/PlainBinaryLineReader
BePPPower opened a new pull request, #16375: URL: https://github.com/apache/doris/pull/16375 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [x] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #16375: [Enhencement](LineReader) rename NewPlainTextLineReader/NewPlainBinaryLineReader to PlainTextLineReader/PlainBinaryLineReader
github-actions[bot] commented on PR #16375: URL: https://github.com/apache/doris/pull/16375#issuecomment-1413467981 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] morrySnow merged pull request #15971: [Feature](Nereids) Support order and limit in subquery
morrySnow merged PR #15971: URL: https://github.com/apache/doris/pull/15971
[doris] branch master updated: [Feature](Nereids) Support order and limit in subquery (#15971)
This is an automated email from the ASF dual-hosted git repository. morrysnow pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new e31913faca [Feature](Nereids) Support order and limit in subquery (#15971) e31913faca is described below commit e31913faca12206b5cbaf914b815e1a8b10cb275 Author: zhengshiJ <32082872+zhengs...@users.noreply.github.com> AuthorDate: Thu Feb 2 18:17:30 2023 +0800 [Feature](Nereids) Support order and limit in subquery (#15971) 1.Compatible with the old optimizer, the sort and limit in the subquery will not take effect, just delete it directly. ``` select * from sub_query_correlated_subquery1 where sub_query_correlated_subquery1.k1 > (select sum(sub_query_correlated_subquery3.k3) a from sub_query_correlated_subquery3 where sub_query_correlated_subquery3.v2 = sub_query_correlated_subquery1.k2 order by a limit 1); ``` 2.Adjust the unnesting position of the subquery to ensure that the conjunct in the filter has been optimized, and then unnesting Support: ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k1 = i1.k1) AND (k2 = 1)) ) > 0); ``` The reason why the above can be supported is that conjunction will be performed, which can be converted into the following ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2 or k2 = 1)) ) > 0); ``` Not Support: ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k2 = i1.k1) AND (k2 = 1)) ) > 0); ``` --- .../apache/doris/nereids/analyzer/UnboundSlot.java | 2 +- .../batch/EliminateSpecificPlanUnderApplyJob.java | 42 .../jobs/batch/NereidsRewriteJobExecutor.java | 9 ++-- 
.../org/apache/doris/nereids/rules/RuleType.java | 2 + .../nereids/rules/analysis/CheckAfterRewrite.java | 4 +- .../nereids/rules/analysis/SubExprAnalyzer.java| 34 + .../rewrite/logical/EliminateLimitUnderApply.java | 43 + .../rewrite/logical/EliminateSortUnderApply.java | 56 ++ .../nereids/rules/analysis/CheckRowPolicyTest.java | 2 +- .../nereids_syntax_p0/sub_query_correlated.out | 43 + .../sub_query_diff_old_optimize.out| 20 ++-- .../nereids_syntax_p0/sub_query_correlated.groovy | 36 -- .../sub_query_diff_old_optimize.groovy | 29 ++- 13 files changed, 277 insertions(+), 45 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java index 09eb1c94f5..66c5e43f70 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java @@ -69,7 +69,7 @@ public class UnboundSlot extends Slot implements Unbound, PropagateNullable { @Override public String toString() { -return "'" + getName(); +return "'" + getName() + "'"; } @Override diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/EliminateSpecificPlanUnderApplyJob.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/EliminateSpecificPlanUnderApplyJob.java new file mode 100644 index 00..2b8f7b25e0 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/EliminateSpecificPlanUnderApplyJob.java @@ -0,0 +1,42 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.jobs.batch; + +import org.apache.doris.nereids.CascadesContext; +import org.apache.doris.nereids.rules.rewrite.logical.EliminateLimitUnderApply; +import org.apache.doris.nereids.rules.rewrite.logical.EliminateSortUnderApply; + +import com.google.com
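A note on the "Support" and "Not Support" cases in the #15971 commit message above: the supported query works because both OR branches share the correlated conjunct `k1 = i1.k1`, so that conjunct can be factored out of the disjunction. The following is a minimal C++ sketch of that factoring check only. It is not the Nereids implementation; `Conjuncts` and `common_conjuncts` are hypothetical names, and predicates are modelled as plain strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: each OR branch is a set of AND-ed predicates (as text).
using Conjuncts = std::set<std::string>;

// Returns the predicates that occur in every OR branch, i.e. the part that
// can be pulled out as  common AND (rest_1 OR rest_2 OR ...).
Conjuncts common_conjuncts(const std::vector<Conjuncts>& branches) {
    if (branches.empty()) {
        return {};
    }
    Conjuncts common = branches.front();
    for (std::size_t i = 1; i < branches.size(); ++i) {
        Conjuncts next;
        // std::set iterates in sorted order, as set_intersection requires.
        std::set_intersection(common.begin(), common.end(),
                              branches[i].begin(), branches[i].end(),
                              std::inserter(next, next.begin()));
        common = std::move(next);
    }
    return common;
}
```

With the branches from the supported query, `{k1 = i1.k1, k2 = 2}` and `{k1 = i1.k1, k2 = 1}`, the common set is `{k1 = i1.k1}`. With the unsupported query the second branch uses `k2 = i1.k1` instead, so the common set is empty and there is nothing to factor out.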
[GitHub] [doris] xy720 opened a new pull request, #16376: [chore](regression-test) Remove array config in regression test
xy720 opened a new pull request, #16376: URL: https://github.com/apache/doris/pull/16376 # Proposed changes The FE config "enable_array_type" is no longer used; this commit removes it from the regression test. ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [x] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] github-actions[bot] commented on pull request #15837: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function
github-actions[bot] commented on PR #15837: URL: https://github.com/apache/doris/pull/15837#issuecomment-1413522154 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #15837: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function
github-actions[bot] commented on PR #15837: URL: https://github.com/apache/doris/pull/15837#issuecomment-1413522206 PR approved by anyone and no changes requested.
[GitHub] [doris] hello-stephen commented on pull request #16359: [Enhancement](Stmt)ShowPartitionsStmt support forward to master
hello-stephen commented on PR #16359: URL: https://github.com/apache/doris/pull/16359#issuecomment-1413524245 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 34.41 seconds load time: 518 seconds storage size: 17122376961 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202103753_clickbench_pr_89440.html
[GitHub] [doris] github-actions[bot] commented on pull request #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets
github-actions[bot] commented on PR #16374: URL: https://github.com/apache/doris/pull/16374#issuecomment-1413546730 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] BiteTheDDDDt merged pull request #15837: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function
BiteTheDDDDt merged PR #15837: URL: https://github.com/apache/doris/pull/15837
[doris] branch master updated: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837)
This is an automated email from the ASF dual-hosted git repository. panxiaolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 0d5b115993 [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837) 0d5b115993 is described below commit 0d5b1159930cc37edad3324aaffcf66855022d5c Author: Pxl AuthorDate: Thu Feb 2 18:57:39 2023 +0800 [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837) support duplicate base column for diffrent aggregate function --- .gitignore | 2 + be/src/olap/rowset/segment_v2/segment_writer.cpp | 5 +- be/src/olap/schema_change.cpp | 14 +- be/src/vec/exprs/vslot_ref.cpp | 6 +- .../doris/alter/MaterializedViewHandler.java | 12 +- .../doris/analysis/CreateMaterializedViewStmt.java | 157 -- .../main/java/org/apache/doris/analysis/Expr.java | 41 - .../java/org/apache/doris/analysis/InsertStmt.java | 10 +- .../org/apache/doris/analysis/LiteralExpr.java | 6 + .../org/apache/doris/analysis/MVColumnItem.java| 10 +- .../doris/analysis/MVColumnOneChildPattern.java| 6 +- .../java/org/apache/doris/analysis/QueryStmt.java | 6 + .../java/org/apache/doris/analysis/SelectStmt.java | 5 +- .../java/org/apache/doris/analysis/SlotRef.java| 12 +- .../main/java/org/apache/doris/catalog/Column.java | 2 +- .../java/org/apache/doris/catalog/FunctionSet.java | 3 + .../doris/catalog/MaterializedIndexMeta.java | 1 + .../java/org/apache/doris/common/FeNameFormat.java | 4 + .../doris/planner/MaterializedViewSelector.java| 46 -- .../org/apache/doris/planner/OlapScanNode.java | 5 +- .../org/apache/doris/planner/RollupSelector.java | 13 +- .../apache/doris/planner/SingleNodePlanner.java| 30 +++- .../java/org/apache/doris/qe/StmtExecutor.java | 15 ++ .../doris/rewrite/mvrewrite/CountFieldToSum.java | 41 +++-- .../doris/rewrite/mvrewrite/ExprToSlotRefRule.java | 179 ++--- 
.../doris/rewrite/mvrewrite/MVExprEquivalent.java | 48 ++ .../doris/rewrite/mvrewrite/SlotRefEqualRule.java | 11 +- .../analysis/CreateMaterializedViewStmtTest.java | 106 +++- .../analysis/MVColumnOneChildPatternTest.java | 11 +- .../doris/nereids/rules/mv/SelectMvIndexTest.java | 1 + .../planner/MaterializedViewFunctionTest.java | 25 +-- .../planner/MaterializedViewSelectorTest.java | 4 +- .../agg_have_dup_base/agg_have_dup_base.out| 25 +++ .../materialized_view_p0/k1ap2spa/k1ap2spa.out | 13 ++ .../test_dup_group_by_mv_abs.out | 19 +++ .../test_dup_group_by_mv_plus.out | 19 +++ .../agg_have_dup_base.groovy} | 36 ++--- .../k1ap2spa.groovy} | 36 + .../test_dup_group_by_mv_abs.groovy} | 36 + .../test_dup_group_by_mv_plus.groovy} | 36 + .../test_dup_mv_abs/test_dup_mv_abs.groovy | 4 + .../test_dup_mv_bin/test_dup_mv_bin.groovy | 4 + .../test_dup_mv_plus/test_dup_mv_plus.groovy | 4 + .../test_materialized_view_nereids.groovy | 2 + 44 files changed, 651 insertions(+), 420 deletions(-) diff --git a/.gitignore b/.gitignore index 01c6c35993..5ba2f22e45 100644 --- a/.gitignore +++ b/.gitignore @@ -95,3 +95,5 @@ tools/**/tpch-data/ # be-ut data_test + +/conf/log4j2-spring.xml diff --git a/be/src/olap/rowset/segment_v2/segment_writer.cpp b/be/src/olap/rowset/segment_v2/segment_writer.cpp index 517a0c9ec3..3da1dbef56 100644 --- a/be/src/olap/rowset/segment_v2/segment_writer.cpp +++ b/be/src/olap/rowset/segment_v2/segment_writer.cpp @@ -208,7 +208,10 @@ Status SegmentWriter::init(const std::vector& col_ids, bool has_key) { Status SegmentWriter::append_block(const vectorized::Block* block, size_t row_pos, size_t num_rows) { -assert(block->columns() == _column_writers.size()); +CHECK(block->columns() == _column_writers.size()) +<< ", block->columns()=" << block->columns() +<< ", _column_writers.size()=" << _column_writers.size(); + _olap_data_convertor->set_source_content(block, row_pos, num_rows); // find all row pos for short key indexes diff --git 
a/be/src/olap/schema_change.cpp b/be/src/olap/schema_change.cpp index 1f62169414..63a9a211eb 100644 --- a/be/src/olap/schema_change.cpp +++ b/be/src/olap/schema_change.cpp @@ -22,7 +22,6 @@ #include "gutil/integral_types.h" #include "olap/merger.h" #include "olap/olap_common.h" -#include "olap/row_cursor.h" #include "olap/rowset/segment_v2/column_reader.h" #include "olap/storage_engine.h" #include "olap/tablet.h" @@ -39,8 +38,6 @@ #incl
[GitHub] [doris] hello-stephen commented on pull request #16361: [fix](scan) coredump caused by null of _scanner_ctx
hello-stephen commented on PR #16361: URL: https://github.com/apache/doris/pull/16361#issuecomment-1413550386 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 33.9 seconds load time: 511 seconds storage size: 17123154634 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202105836_clickbench_pr_89453.html
[GitHub] [doris] github-actions[bot] commented on pull request #16363: [fix](nereids)the order exprs in sort node should be slotRef in its tupleDesc
github-actions[bot] commented on PR #16363: URL: https://github.com/apache/doris/pull/16363#issuecomment-1413558392 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #16363: [fix](nereids)the order exprs in sort node should be slotRef in its tupleDesc
github-actions[bot] commented on PR #16363: URL: https://github.com/apache/doris/pull/16363#issuecomment-1413558466 PR approved by anyone and no changes requested.
[GitHub] [doris] hello-stephen commented on pull request #16363: [fix](nereids)the order exprs in sort node should be slotRef in its tupleDesc
hello-stephen commented on PR #16363: URL: https://github.com/apache/doris/pull/16363#issuecomment-1413587230 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 34.67 seconds load time: 485 seconds storage size: 17170743013 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202112733_clickbench_pr_89615.html
[GitHub] [doris] dataroaring merged pull request #16299: [fix](cooldown) Fix bugs in cooldown single replica files
dataroaring merged PR #16299: URL: https://github.com/apache/doris/pull/16299
[doris] branch master updated (0d5b115993 -> 6ee0dbfb23)
This is an automated email from the ASF dual-hosted git repository. dataroaring pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from 0d5b115993 [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837) add 6ee0dbfb23 [fix](cooldown) Fix bugs in cooldown single replica files (#16299) No new revisions were added by this update. Summary of changes: be/src/agent/task_worker_pool.cpp | 34 +++-- be/src/olap/base_tablet.h | 4 +- be/src/olap/snapshot_manager.cpp | 7 - be/src/olap/tablet.cpp| 276 +++--- be/src/olap/tablet.h | 53 +--- be/src/olap/tablet_manager.cpp| 2 - be/src/olap/tablet_meta.cpp | 18 +-- be/src/olap/tablet_meta.h | 16 +-- be/src/olap/version_graph.h | 4 + be/test/olap/tablet_test.cpp | 2 +- 10 files changed, 199 insertions(+), 217 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] github-actions[bot] commented on pull request #16367: [enhancement](stream receiver) make data stream receiver exception safe.
github-actions[bot] commented on PR #16367: URL: https://github.com/apache/doris/pull/16367#issuecomment-1413593626 clang-tidy review says "All clean, LGTM! :+1:"
[doris] 02/06: [Refactor](function) opt the exec of function with null column (#16256)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git commit 38c9fe7f8d7ec7e91c6821cd66c8d366e7d98bf0 Author: HappenLee AuthorDate: Wed Feb 1 15:56:31 2023 +0800 [Refactor](function) opt the exec of function with null column (#16256) --- be/src/vec/exprs/vectorized_fn_call.cpp | 2 + be/src/vec/functions/function.cpp | 14 ++-- be/src/vec/functions/function_cast.h | 11 ++- be/src/vec/functions/function_helpers.cpp | 123 ++ be/src/vec/functions/function_helpers.h | 26 +++ 5 files changed, 86 insertions(+), 90 deletions(-) diff --git a/be/src/vec/exprs/vectorized_fn_call.cpp b/be/src/vec/exprs/vectorized_fn_call.cpp index 3999599716..a0e07d31f9 100644 --- a/be/src/vec/exprs/vectorized_fn_call.cpp +++ b/be/src/vec/exprs/vectorized_fn_call.cpp @@ -51,6 +51,8 @@ doris::Status VectorizedFnCall::prepare(doris::RuntimeState* state, argument_template.reserve(_children.size()); std::vector child_expr_name; for (auto child : _children) { +// TODO: rethink we really create column here. 
maybe only need nullptr just to +// get the function auto column = child->data_type()->create_column(); argument_template.emplace_back(std::move(column), child->data_type(), child->expr_name()); child_expr_name.emplace_back(child->expr_name()); diff --git a/be/src/vec/functions/function.cpp b/be/src/vec/functions/function.cpp index 41f3141c06..662a2a58af 100644 --- a/be/src/vec/functions/function.cpp +++ b/be/src/vec/functions/function.cpp @@ -217,11 +217,12 @@ Status PreparedFunctionImpl::default_implementation_for_nulls( } if (null_presence.has_nullable) { -Block temporary_block = create_block_with_nested_columns(block, args, result); +auto [temporary_block, new_args, new_result] = +create_block_with_nested_columns(block, args, result); RETURN_IF_ERROR(execute_without_low_cardinality_columns( -context, temporary_block, args, result, temporary_block.rows(), dry_run)); +context, temporary_block, new_args, new_result, temporary_block.rows(), dry_run)); block.get_by_position(result).column = - wrap_in_nullable(temporary_block.get_by_position(result).column, block, args, + wrap_in_nullable(temporary_block.get_by_position(new_result).column, block, args, result, input_rows_count); *executed = true; return Status::OK(); @@ -295,10 +296,9 @@ DataTypePtr FunctionBuilderImpl::get_return_type_without_low_cardinality( } if (null_presence.has_nullable) { ColumnNumbers numbers(arguments.size()); -for (size_t i = 0; i < arguments.size(); i++) { -numbers[i] = i; -} -Block nested_block = create_block_with_nested_columns(Block(arguments), numbers); +std::iota(numbers.begin(), numbers.end(), 0); +auto [nested_block, _] = +create_block_with_nested_columns(Block(arguments), numbers, false); auto return_type = get_return_type_impl( ColumnsWithTypeAndName(nested_block.begin(), nested_block.end())); return make_nullable(return_type); diff --git a/be/src/vec/functions/function_cast.h b/be/src/vec/functions/function_cast.h index e3baaecdd2..a6817134ea 100644 --- 
a/be/src/vec/functions/function_cast.h +++ b/be/src/vec/functions/function_cast.h @@ -1592,7 +1592,9 @@ private: Block tmp_block; size_t tmp_res_index = 0; if (source_is_nullable) { -tmp_block = create_block_with_nested_columns_only_args(block, arguments); +auto [t_block, tmp_args] = +create_block_with_nested_columns(block, arguments, true); +tmp_block = std::move(t_block); tmp_res_index = tmp_block.columns(); tmp_block.insert({nullptr, nested_type, ""}); @@ -1624,7 +1626,8 @@ private: return [wrapper, skip_not_null_check](FunctionContext* context, Block& block, const ColumnNumbers& arguments, const size_t result, size_t input_rows_count) { -Block tmp_block = create_block_with_nested_columns(block, arguments, result); +auto [tmp_block, tmp_args, tmp_res] = +create_block_with_nested_columns(block, arguments, result); /// Check that all values are not-NULL. /// Check can be skipped in case if LowCardinality dictionary is transformed. @@ -1640,8 +1643,8 @@ private: } }
[doris] 04/06: [fix](multi-catalog) remove the eof check among parquet columns (#16302)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git commit 29e6480bc8be1fc882c3e7a1f28b7164a3b71c97 Author: Ashin Gau AuthorDate: Thu Feb 2 09:22:09 2023 +0800 [fix](multi-catalog) remove the eof check among parquet columns (#16302) Reading a parquet file failed with: ``` ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]Read parquet file xxx failed, reason = [CORRUPTION]The number of rows are not equal among parquet columns ``` This error may be thrown when reading non-predicate columns in lazy-read mode. For example: a row group with 1000 rows has two non-predicate columns. Column A has one page; column B has two pages of 500 rows each. The read range of `ParquetColumnReader` is [0, 400), and the rows in [0, 450) are all filtered out by predicate columns. Column A can therefore skip the first page and reach EOF, while column B can also skip the first page but does not reach EOF. 
--- be/src/vec/exec/format/parquet/vparquet_group_reader.cpp | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp b/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp index 34b478114e..5b1c0fd828 100644 --- a/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp +++ b/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp @@ -134,7 +134,6 @@ Status RowGroupReader::_read_column_data(Block* block, const std::vectorget_by_name(read_col); auto& column_ptr = column_with_type_and_name.column; @@ -150,15 +149,13 @@ Status RowGroupReader::_read_column_data(Block* block, const std::vector 0 && (has_eof ^ col_eof)) { -return Status::Corruption("The number of rows are not equal among parquet columns"); -} if (batch_read_rows > 0 && batch_read_rows != col_read_rows) { return Status::Corruption("Can't read the same number of rows among parquet columns"); } batch_read_rows = col_read_rows; -has_eof = col_eof; -col_idx++; +if (col_eof) { +has_eof = true; +} } *read_rows = batch_read_rows; *batch_eof = has_eof; - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
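The effect of the #16302 change above can be sketched in isolation: per-column EOF flags are no longer required to agree, and the batch is at EOF as soon as any column reports EOF, while the row-count check is kept. The C++ below is a simplified model of that loop, assuming hypothetical `ColumnRead`, `BatchResult` and `merge_column_reads` names; it uses `std::optional` in place of Doris's `Status` and is not the actual `RowGroupReader` code.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical per-column outcome of one batch read.
struct ColumnRead {
    std::size_t rows_read;
    bool eof;
};

// Hypothetical combined outcome for the whole batch.
struct BatchResult {
    std::size_t rows_read;
    bool eof;
};

// Combines per-column results. Returns std::nullopt (standing in for a
// Corruption status) only when columns disagree on the number of rows read.
// Differing EOF flags are tolerated: any column at EOF marks the batch EOF.
std::optional<BatchResult> merge_column_reads(const std::vector<ColumnRead>& columns) {
    std::size_t batch_rows = 0;
    bool has_eof = false;
    for (const ColumnRead& col : columns) {
        if (batch_rows > 0 && batch_rows != col.rows_read) {
            return std::nullopt;
        }
        batch_rows = col.rows_read;
        if (col.eof) {
            has_eof = true;
        }
    }
    return BatchResult{batch_rows, has_eof};
}
```

In the scenario from the commit message, column A reports EOF and column B does not. Under the removed XOR check that combination was treated as corruption; in this sketch it yields a valid batch flagged as EOF.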
[doris] 03/06: [Enhance] use fast_float::from_chars to do str cast to float/double to avoid lose precision (#16190)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git commit a1dcec461cd17f734ec7cff349328a1dcf96f55d Author: HappenLee AuthorDate: Wed Feb 1 23:53:34 2023 +0800 [Enhance] use fast_float::from_chars to do str cast to float/double to avoid lose precision (#16190) --- be/src/util/string_parser.hpp | 146 ++-- be/test/util/string_parser_test.cpp | 5 +- 2 files changed, 41 insertions(+), 110 deletions(-) diff --git a/be/src/util/string_parser.hpp b/be/src/util/string_parser.hpp index 653f0dac14..02006b7c7d 100644 --- a/be/src/util/string_parser.hpp +++ b/be/src/util/string_parser.hpp @@ -20,6 +20,8 @@ #pragma once +#include + #include #include #include @@ -111,13 +113,7 @@ public: template static inline T string_to_float(const char* s, int len, ParseResult* result) { -T ans = string_to_float_internal(s, len, result); -if (LIKELY(*result == PARSE_SUCCESS)) { -return ans; -} - -int i = skip_leading_whitespace(s, len); -return string_to_float_internal(s + i, len - i, result); +return string_to_float_internal(s, len, result); } // Parses a string for 'true' or 'false', case insensitive. 
@@ -425,118 +421,54 @@ inline T StringParser::string_to_int_no_overflow(const char* s, int len, ParseRe template inline T StringParser::string_to_float_internal(const char* s, int len, ParseResult* result) { -if (UNLIKELY(len <= 0)) { +int i = 0; +// skip leading spaces +for (; i < len; ++i) { +if (!is_whitespace(s[i])) { +break; +} +} + +// skip back spaces +int j = len - 1; +for (; j >= i; j--) { +if (!is_whitespace(s[j])) { +break; +} +} + +// skip leading '+', from_chars can handle '-' +if (i < len && s[i] == '+') { +i++; +} +if (UNLIKELY(i > j)) { *result = PARSE_FAILURE; return 0; } // Use double here to not lose precision while accumulating the result double val = 0; -bool negative = false; -int i = 0; -double divide = 1; -bool decimal = false; -int64_t remainder = 0; -// The number of 'significant figures' we've encountered so far (i.e., digits excluding -// leading 0s). This technically shouldn't count trailing 0s either, but for us it -// doesn't matter if we count them based on the implementation below. -int sig_figs = 0; - -switch (*s) { -case '-': -negative = true; -case '+': -i = 1; -} - -int first = i; -for (; i < len; ++i) { -if (LIKELY(s[i] >= '0' && s[i] <= '9')) { -if (s[i] != '0' || sig_figs > 0) { -++sig_figs; -} -if (decimal) { -// According to the IEEE floating-point spec, a double has up to 15-17 -// significant decimal digits (see -// http://en.wikipedia.org/wiki/Double-precision_floating-point_format). We stop -// processing digits after we've already seen at least 18 sig figs to avoid -// overflowing 'remainder' (we stop after 18 instead of 17 to get the rounding -// right). 
-if (sig_figs <= 18) { -remainder = remainder * 10 + s[i] - '0'; -divide *= 10; +auto res = fast_float::from_chars(s + i, s + j + 1, val); + +if (res.ec == std::errc() && res.ptr == s + j + 1) { +if (abs(val) == std::numeric_limits::infinity()) { +auto contain_inf = false; +for (int k = i; k < j + 1; k++) { +if (s[k] == 'i' || s[k] == 'I') { +contain_inf = true; +break; } -} else { -val = val * 10 + s[i] - '0'; -} -} else if (s[i] == '.') { -decimal = true; -} else if (s[i] == 'e' || s[i] == 'E') { -break; -} else if (s[i] == 'i' || s[i] == 'I') { -if (len > i + 2 && (s[i + 1] == 'n' || s[i + 1] == 'N') && -(s[i + 2] == 'f' || s[i + 2] == 'F')) { -// Note: Hive writes inf as Infinity, at least for text. We'll be a little loose -// here and interpret any column with inf as a prefix as infinity rather than -// checking every remaining byte. -*result = PARSE_SUCCESS; -return negative ? -INFINITY : INFINITY; -} else { -// Starts with 'i', but isn't inf... -*result = PARSE_FAILURE; -return 0; -} -} else if (s[i] == 'n' || s[i] == 'N') { -if (len > i + 2 && (s[i + 1] == 'a' || s[i + 1] == 'A') && -(s[i + 2] == 'n' |
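The new `string_to_float_internal` in the #16190 diff above follows a simple shape: trim whitespace on both sides, skip a leading '+', hand the remaining range to a library parser, and accept the result only if the parser consumed the whole range. Below is a minimal standalone C++ sketch of that shape. It is an approximation under stated assumptions: `parse_trimmed_double` is a hypothetical name, `std::strtod` is swapped in for `fast_float::from_chars` so the example needs no third-party header, `std::isspace` stands in for Doris's `is_whitespace`, and the infinity handling from the patch is not reproduced. Note that `std::strtod` is more permissive than `from_chars` (it accepts hexadecimal floats, for instance).

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: returns the parsed value, or std::nullopt on failure.
std::optional<double> parse_trimmed_double(const std::string& text) {
    std::size_t begin = 0;
    std::size_t end = text.size();
    // Skip leading and trailing whitespace.
    while (begin < end && std::isspace(static_cast<unsigned char>(text[begin]))) {
        ++begin;
    }
    while (end > begin && std::isspace(static_cast<unsigned char>(text[end - 1]))) {
        --end;
    }
    // Skip a single leading '+'; a leading '-' is left for the parser.
    if (begin < end && text[begin] == '+') {
        ++begin;
    }
    if (begin >= end) {
        return std::nullopt;
    }
    // strtod would silently skip whitespace after the sign, so reject it here.
    if (std::isspace(static_cast<unsigned char>(text[begin]))) {
        return std::nullopt;
    }
    const std::string body = text.substr(begin, end - begin);
    char* stop = nullptr;
    const double value = std::strtod(body.c_str(), &stop);
    // Succeed only if the parser consumed every remaining character.
    if (stop != body.c_str() + body.size()) {
        return std::nullopt;
    }
    return value;
}
```

Requiring full consumption is what makes inputs such as `1.5abc` fail instead of silently parsing as `1.5`.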
[doris] branch branch-1.2-lts updated (495d37d337 -> be7c4d267e)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a change to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git from 495d37d337 [fix](join) crash caused by canceling query (#16311) (#16349) new e48404e1c0 [fix](planner) create view generate wrong sql when sql contains multi count distinct (#16092) new 38c9fe7f8d [Refactor](function) opt the exec of function with null column (#16256) new a1dcec461c [Enhance] use fast_float::from_chars to do str cast to float/double to avoid lose precision (#16190) new 29e6480bc8 [fix](multi-catalog) remove the eof check among parquet columns (#16302) new df7200f8ae [test](regression) add tvf regression to test the remove of eof check (#16342) new be7c4d267e [branch1.2] fix compile bug after cherry-pick The 6 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. 
Summary of changes: be/src/util/string_parser.hpp | 146 ++--- .../exec/format/parquet/vparquet_group_reader.cpp | 9 +- be/src/vec/exprs/vectorized_fn_call.cpp| 2 + be/src/vec/functions/function.cpp | 14 +- be/src/vec/functions/function_cast.h | 11 +- be/src/vec/functions/function_helpers.cpp | 119 - be/src/vec/functions/function_helpers.h| 26 ++-- be/test/util/string_parser_test.cpp| 5 +- .../org/apache/doris/analysis/BaseViewStmt.java| 1 + .../apache/doris/analysis/FunctionCallExpr.java| 1 + .../java/org/apache/doris/analysis/SelectStmt.java | 8 +- regression-test/conf/regression-conf.groovy| 29 .../external_table_emr_p2/hive/test_tvf_p2.out | 32 + .../suites/ddl_p0/test_create_view.groovy | 72 ++ .../external_table_emr_p2/hive/test_tvf_p2.groovy | 31 ++--- 15 files changed, 276 insertions(+), 230 deletions(-) create mode 100644 regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out create mode 100644 regression-test/suites/ddl_p0/test_create_view.groovy copy be/src/runtime/tuple_row.cpp => regression-test/suites/external_table_emr_p2/hive/test_tvf_p2.groovy (55%) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[doris] 01/06: [fix](planner) create view generate wrong sql when sql contains multi count distinct (#16092)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git commit e48404e1c08f4887506181a4b6dd577cc4dc534e Author: morrySnow <101034200+morrys...@users.noreply.github.com> AuthorDate: Tue Jan 31 23:42:53 2023 +0800 [fix](planner) create view generate wrong sql when sql contains multi count distinct (#16092) If sql in create view has more than one count distinct, and write column name explicitly. We will generate sql contains function multi_count_distinct. It cannot be analyzed and all query containing this view will fail. --- .../org/apache/doris/analysis/BaseViewStmt.java| 1 + .../java/org/apache/doris/analysis/SelectStmt.java | 8 ++- .../suites/ddl_p0/test_create_view.groovy | 72 ++ 3 files changed, 80 insertions(+), 1 deletion(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java index 477e440f5e..8114448f0d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java @@ -117,6 +117,7 @@ public class BaseViewStmt extends DdlStmt { Analyzer tmpAnalyzer = new Analyzer(analyzer); List colNames = cols.stream().map(c -> c.getColName()).collect(Collectors.toList()); +cloneStmt.setNeedToSql(true); cloneStmt.substituteSelectList(tmpAnalyzer, colNames); try (ToSqlContext toSqlContext = ToSqlContext.getOrNewThreadLocalContext()) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java index 1aae7c324f..8f47a9ed04 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java @@ -1898,7 +1898,7 @@ public class SelectStmt extends QueryStmt { if (i != 0) { strBuilder.append(", "); 
} -if (needToSql) { +if (needToSql && CollectionUtils.isNotEmpty(originalExpr)) { strBuilder.append(originalExpr.get(i).toSql()); } else { strBuilder.append(resultExprs.get(i).toSql()); @@ -2072,6 +2072,9 @@ public class SelectStmt extends QueryStmt { // Resolve and replace non-InlineViewRef table refs with a BaseTableRef or ViewRef. TableRef tblRef = fromClause.get(i); tblRef = analyzer.resolveTableRef(tblRef); +if (tblRef instanceof InlineViewRef) { +((InlineViewRef) tblRef).setNeedToSql(needToSql); +} Preconditions.checkNotNull(tblRef); fromClause.set(i, tblRef); tblRef.setLeftTblRef(leftTblRef); @@ -2101,6 +2104,9 @@ public class SelectStmt extends QueryStmt { resultExprs.add(item.getExpr()); } } +if (needToSql) { +originalExpr = Expr.cloneList(resultExprs); +} // substitute group by if (groupByClause != null) { boolean aliasFirst = false; diff --git a/regression-test/suites/ddl_p0/test_create_view.groovy b/regression-test/suites/ddl_p0/test_create_view.groovy new file mode 100644 index 00..4c401017ee --- /dev/null +++ b/regression-test/suites/ddl_p0/test_create_view.groovy @@ -0,0 +1,72 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +suite("test_create_view") { + +sql """DROP TABLE IF EXISTS count_distinct""" +sql """ +CREATE TABLE IF NOT EXISTS count_distinct +( +RQ DATE NOT NULL COMMENT "日期", +v1 VARCHAR(100) NOT NULL COMMENT "字段1", +v2 VARCHAR(100) NOT NULL COMMENT "字段2", +v3 VARCHAR(100) REPLACE_IF_NOT_NULL COMMENT "字段3" +) +AGGREGATE KEY(RQ,v1,v2) +PARTITION BY RANGE(RQ) +( +PARTITION p20220908 VALUES LESS THAN ('2022-09-09') +) +DISTRIBUTED BY HASH(v1,v2) BUCKE
[doris] 05/06: [test](regression) add tvf regression to test the remove of eof check (#16342)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git commit df7200f8aeacead5b42ec9b74b528dab2a806675 Author: Ashin Gau AuthorDate: Thu Feb 2 10:06:36 2023 +0800 [test](regression) add tvf regression to test the remove of eof check (#16342) Add regression test for #16302. This regression test fails if an EOF check is added for non-predicate columns. --- regression-test/conf/regression-conf.groovy| 29 .../external_table_emr_p2/hive/test_tvf_p2.out | 32 ++ .../external_table_emr_p2/hive/test_tvf_p2.groovy | 30 3 files changed, 91 insertions(+) diff --git a/regression-test/conf/regression-conf.groovy b/regression-test/conf/regression-conf.groovy index 544d4b12c1..6779873340 100644 --- a/regression-test/conf/regression-conf.groovy +++ b/regression-test/conf/regression-conf.groovy @@ -94,6 +94,35 @@ es_8_port=39200 cacheDataPath = "/tmp" +//hive catalog test config for bigdata +enableExternalHiveTest = false +extHiveHmsHost = "***.**.**.**" +extHiveHmsPort = 7004 +extHdfsPort = 4007 +extHiveHmsUser = "" +extHiveHmsPassword= "***" + +//mysql jdbc connector test config for bigdata +enableExternalMysqlTest = false +extMysqlHost = "***.**.**.**" +extMysqlPort = 3306 +extMysqlUser = "" +extMysqlPassword = "***" + +//postgresql jdbc connector test config for bigdata +enableExternalPgTest = false +extPgHost = "***.**.**.*" +extPgPort = 5432 +extPgUser = "" +extPgPassword = "***" + +// elasticsearch external test config for bigdata +enableExternalEsTest = false +extEsHost = "***" +extEsPort = 9200 +extEsUser = "***" +extEsPassword = "***" + s3Endpoint = "cos.ap-hongkong.myqcloud.com" s3BucketName = "doris-build-hk-1308700295" s3Region = "ap-hongkong" diff --git a/regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out b/regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out new file mode 100644 index 00..7f94b13974 --- /dev/null +++ 
b/regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out @@ -0,0 +1,32 @@ +-- This file is automatically generated. You should know what you did if you want to edit this +-- !eof_check -- +2451718\N 9242\N \N 2886\N 4 250 1374252 18 \N \N \N 0 15131435\N \N 0 \N 158878 +\N \N 14846 1945858 \N 1015\N 4 581 2383831 \N \N 5 1 0 110 \N \N \N 0 110 \N -213 +\N 50835 25618 1166535 \N 1748\N 4 \N 2880907 7 \N 17 \N \N 115 \N 125 1 \N 115 \N \N +245219545280 29385 1298621 1649018 1815\N 4 \N 3379765 24 73 \N \N 0 261717703399\N 0 \N 2826\N +245148853117 31945 \N 8644\N \N 4 783 4877135 100 \N \N \N \N \N 3450\N \N \N 565 581 -2885 +\N 53900 35887 702626 \N 2568\N 4 \N 2381514 \N \N \N 0 \N \N \N \N 1 \N 19 20 -357 +\N 53985 38881 760602 289764 \N \N 4 227 3377513 68 75 \N \N \N 5833\N \N \N \N 524 \N -4588 +\N \N 51685 1833943 \N \N \N 4 \N 1879197 \N \N \N \N 0 46 163 \N \N 0 \N 49 -116 +\N \N 62073 \N 287578 \N \N 4 990 1626478990 91 \N \N 0 63818247\N \N 0 6381\N \N +\N 34914 64259 167395 897626 \N \N 4 327 1937905815 \N \N 51 \N \N 1480\N \N \N \N \N -707 +\N 70509 100949 \N \N \N \N 4 185 2381361 35 1 \N \N 0 \N 41 \N \N 0 \N 82 33 +245248974165 103575 \N 1359778 \N \N 4 \N 2383538 1 \N 23 0 \N 0 15 23 \N \N 0 \N -14 +2451253\N 111502 246668 \N \N \N 4 \N 2881367 \N \N \N 21 0 \N \N 121874 0 \N 999 -49 +2451093\N 121339 \N \N \N \N 4 894 1937908811 92 \N \N \N \N \N 1364 9 \N 305 314 \N +2452592
[doris] 06/06: [branch1.2] fix compile bug after cherry-pick
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts in repository https://gitbox.apache.org/repos/asf/doris.git

commit be7c4d267e09d0b6f7ed6930684f270d1324f91d
Author: morningman
AuthorDate: Thu Feb 2 18:11:55 2023 +0800

[branch1.2] fix compile bug after cherry-pick
---
be/src/vec/functions/function_helpers.cpp | 8
.../src/main/java/org/apache/doris/analysis/FunctionCallExpr.java | 1 +
2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/be/src/vec/functions/function_helpers.cpp b/be/src/vec/functions/function_helpers.cpp
index c77f3c5ab7..c42d0dc0ea 100644
--- a/be/src/vec/functions/function_helpers.cpp
+++ b/be/src/vec/functions/function_helpers.cpp
@@ -79,14 +79,6 @@ std::tuple create_block_with_nested_columns(const Block& b } }
-// TODO: only support match function, rethink the logic
-for (const auto& ctn : block) {
-if (ctn.name.size() > BeConsts::BLOCK_TEMP_COLUMN_PREFIX.size() &&
-starts_with(ctn.name, BeConsts::BLOCK_TEMP_COLUMN_PREFIX)) {
-res.insert(ctn);
-}
-}
-
 return {res, res_args}; }

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
index 526d2fd9f6..3aff4d3517 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
@@ -835,6 +835,7 @@ public class FunctionCallExpr extends Expr { if (!getChild(1).isConstant()) { throw new AnalysisException(fnName + "function's second argument should be constant"); }
+throw new AnalysisException(fnName + "not support on vectorized engine now.");
 } if ((fnName.getFunction().equalsIgnoreCase("HLL_UNION_AGG")
[GitHub] [doris] hello-stephen commented on pull request #16364: [Bug](CURRENT_TIMESTAMP) Fix wrong default value after schema change
hello-stephen commented on PR #16364: URL: https://github.com/apache/doris/pull/16364#issuecomment-141368 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 34.07 seconds load time: 490 seconds storage size: 17171763885 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202113838_clickbench_pr_89475.html
[GitHub] [doris] YangShaw commented on a diff in pull request #14397: [feature](nereids)support window function
YangShaw commented on code in PR #14397: URL: https://github.com/apache/doris/pull/14397#discussion_r1094410796 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Window.java: ## @@ -0,0 +1,176 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.expressions; + +import org.apache.doris.nereids.exceptions.UnboundException; +import org.apache.doris.nereids.properties.OrderKey; +import org.apache.doris.nereids.trees.expressions.functions.PropagateNullable; +import org.apache.doris.nereids.trees.expressions.shape.UnaryExpression; +import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.base.Preconditions; +import com.google.common.collect.Lists; + +import java.util.List; +import java.util.Objects; +import java.util.Optional; +import java.util.stream.Collectors; + +/** + * represents window function. WindowFunction of this window is saved as Window's child, + * which is an UnboundFunction at first and will be analyzed as relevant BoundFunction + * (can be a WindowFunction or AggregateFunction) after BindFunction. 
+ */ +public class Window extends Expression implements UnaryExpression, PropagateNullable { Review Comment: so stupid what I have ever done.. ## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/logical/NormalizeWindow.java: ## @@ -0,0 +1,165 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +package org.apache.doris.nereids.rules.rewrite.logical; + +import org.apache.doris.nereids.properties.OrderKey; +import org.apache.doris.nereids.rules.Rule; +import org.apache.doris.nereids.rules.RuleType; +import org.apache.doris.nereids.rules.rewrite.OneRewriteRuleFactory; +import org.apache.doris.nereids.trees.expressions.Alias; +import org.apache.doris.nereids.trees.expressions.Expression; +import org.apache.doris.nereids.trees.expressions.NamedExpression; +import org.apache.doris.nereids.trees.expressions.Slot; +import org.apache.doris.nereids.trees.expressions.SlotReference; +import org.apache.doris.nereids.trees.expressions.Window; +import org.apache.doris.nereids.trees.plans.Plan; +import org.apache.doris.nereids.trees.plans.logical.LogicalProject; +import org.apache.doris.nereids.trees.plans.logical.LogicalWindow; +import org.apache.doris.nereids.util.ExpressionUtils; + +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableSet; +import com.google.common.collect.Lists; +import com.google.common.collect.Sets; + +import java.util.List; +import java.util.Optional; +import java.util.Set; +import java.util.stream.Collectors; + +/** + * NormalizeWindow: generate bottomProject for expressions within Window, and topProject for origin output of SQL + * e.g. SELECT k1#1, k2#2, SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY k5#5) FROM t + * + * Original Plan: + * LogicalWindow( + * outputs:[k1#1, k2#2, Alias(SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY k5#5)#6], + * windowExpressions:[] + * ) + * + * After Normalize: + * LogicalProject(k1#1, k2#2, Alias(SlotReference#7)#6) + * +-- LogicalWindow( + * outputs:[k1#1, k2#2, Alias(SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY k5#5)#6], + * windowExpressions:[Alias(SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY k5#5)#6] + * ) + * +-- LogicalProject(k1#1, k2#2, k3#3, k4#4, k5#5) + * + */ +publ
[GitHub] [doris] YangShaw commented on a diff in pull request #14397: [feature](nereids)support window function
YangShaw commented on code in PR #14397: URL: https://github.com/apache/doris/pull/14397#discussion_r1094411155 ## fe/fe-core/src/main/java/org/apache/doris/nereids/parser/LogicalPlanBuilder.java: ## @@ -1437,12 +1530,19 @@ private LogicalPlan withProjection(LogicalPlan input, SelectColumnClauseContext expressions, input, isDistinct); } else { List projects = getNamedExpressions(selectCtx.namedExpressionSeq()); -return new LogicalProject<>(projects, ImmutableList.of(), input, isDistinct); +if (containsWindowExpressions(projects)) { +return new LogicalWindow<>(projects, input); +} +return new LogicalProject<>(projects, Collections.emptyList(), input, isDistinct); Review Comment: nice idea~
[GitHub] [doris] github-actions[bot] commented on pull request #16073: [feature](Load)Add cluster_token auth for stream load to avoid double auth in mysql load
github-actions[bot] commented on PR #16073: URL: https://github.com/apache/doris/pull/16073#issuecomment-1413617156 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] yixiutt opened a new pull request, #16377: [fix](vertical compaction) fix uint32_t init value
yixiutt opened a new pull request, #16377: URL: https://github.com/apache/doris/pull/16377

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] xiaokang commented on a diff in pull request #16263: [Improve](row-store) support row cache
xiaokang commented on code in PR #16263: URL: https://github.com/apache/doris/pull/16263#discussion_r109442

## be/src/common/config.h: ##
@@ -246,6 +246,7 @@ CONF_mBool(row_nums_check, "true"); // modify them upon necessity CONF_Int32(min_file_descriptor_number, "6"); CONF_Int64(index_stream_cache_capacity, "10737418240");
+CONF_String(row_cache_mem_limit, "20%");
Review Comment: A default of 20% is a little large compared to page_cache.

## be/src/vec/jsonb/serialize.cpp: ##
@@ -319,4 +319,22 @@ void JsonbSerializeUtil::jsonb_to_block(const TupleDescriptor& desc, } }
+// single row
+void JsonbSerializeUtil::jsonb_to_block(const TupleDescriptor& desc, const Slice& data,
Review Comment: The single-row jsonb_to_block can be reused by the multi-row jsonb_to_block.

## be/src/olap/rowset/segment_v2/segment_writer.cpp: ##
@@ -252,8 +253,13 @@ Status SegmentWriter::append_block(const vectorized::Block* block, size_t row_po if (_tablet_schema->keys_type() == UNIQUE_KEYS && _opts.enable_unique_key_merge_on_write) { // create primary indexes for (size_t pos = 0; pos < num_rows; pos++) {
-RETURN_IF_ERROR(
- _primary_key_index_builder->add_item(_full_encode_keys(key_columns, pos)));
+const std::string& key = _full_encode_keys(key_columns, pos);
+RETURN_IF_ERROR(_primary_key_index_builder->add_item(key));
+if (!config::disable_storage_row_cache && _tablet_schema->store_row_column() &&
+_opts.is_direct_write) {
+// invalidate cache
+RowCache::instance()->erase({_opts.rowset_ctx->tablet_id, key});
Review Comment: We could insert the new value into the cache here.
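The last review comment on PR #16263 contrasts two ways of keeping a row cache consistent on write: erasing the entry, as the diff does, or inserting the freshly written row. The difference can be sketched roughly as below. `ToyRowCache` and its methods are hypothetical names used only for illustration and are not the Doris `RowCache` API; the only detail taken from the diff is that entries are keyed by (tablet_id, encoded key).

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy cache, keyed by (tablet_id, encoded primary key).
class ToyRowCache {
public:
    using Key = std::pair<int64_t, std::string>;

    // Returns true and fills *row when the key is cached.
    bool lookup(const Key& key, std::string* row) const {
        auto it = _rows.find(key);
        if (it == _rows.end()) {
            return false;
        }
        *row = it->second;
        return true;
    }

    // Invalidate on write: drop the stale entry; the next read misses
    // and has to reload the row from storage.
    void erase(const Key& key) { _rows.erase(key); }

    // Write-through: replace the entry with the row just written,
    // so the next read is served from the cache.
    void insert(const Key& key, std::string row) { _rows[key] = std::move(row); }

private:
    std::map<Key, std::string> _rows;
};
```

Erasing is the simpler choice because the writer does not need the serialized row at hand; inserting saves one cache miss per updated key at the cost of doing the serialization on the write path.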
[GitHub] [doris] github-actions[bot] commented on pull request #16377: [fix](vertical compaction) fix uint32_t init value
github-actions[bot] commented on PR #16377: URL: https://github.com/apache/doris/pull/16377#issuecomment-1413628647 clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] zhannngchen commented on a diff in pull request #16377: [fix](vertical compaction) fix uint32_t init value
zhannngchen commented on code in PR #16377: URL: https://github.com/apache/doris/pull/16377#discussion_r109053

## be/src/vec/olap/vertical_merge_iterator.h: ##
@@ -190,14 +190,14 @@ class VerticalMergeIteratorContext { size_t _ori_return_cols = 0; // segment order, used to compare key
-uint32_t _order = -1;
+uint32_t _order = 0;
-uint32_t _seq_col_idx = -1;
Review Comment: Would it be better to change `_seq_col_idx` to int32_t, to align with the type in the schema?
[GitHub] [doris] zhannngchen commented on a diff in pull request #16377: [fix](vertical compaction) fix uint32_t init value
zhannngchen commented on code in PR #16377: URL: https://github.com/apache/doris/pull/16377#discussion_r109691

## be/src/vec/olap/vertical_merge_iterator.h: ##
@@ -190,14 +190,14 @@ class VerticalMergeIteratorContext { size_t _ori_return_cols = 0; // segment order, used to compare key
-uint32_t _order = -1;
+uint32_t _order = 0;
-uint32_t _seq_col_idx = -1;
Review Comment: Would it be better to change `_seq_col_idx` to int32_t, to align with the type in the schema?
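For readers following the thread on PR #16377, the concern with `uint32_t x = -1` is that the value is converted modulo 2^32 and ends up as the largest unsigned value, not as -1. The sketch below shows this under stated assumptions: the struct and function names are hypothetical stand-ins, not code from `vertical_merge_iterator.h`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the original member initialiser.
struct UnsignedIndex {
    uint32_t seq_col_idx = -1;  // converted modulo 2^32, becomes 4294967295
};

// Hypothetical stand-in for the reviewer's suggestion.
struct SignedIndex {
    int32_t seq_col_idx = -1;   // stays -1, usable as a "not set" sentinel
};

// True when the unsigned default holds the wrapped maximum value.
inline bool unsigned_default_wraps() {
    UnsignedIndex u;
    return u.seq_col_idx == std::numeric_limits<uint32_t>::max();
}

// Widening the unsigned default shows the value other code would see.
inline int64_t widened_unsigned_default() {
    UnsignedIndex u;
    return static_cast<int64_t>(u.seq_col_idx);
}

// With a signed field, the usual "negative means unset" test works.
inline bool signed_default_is_unset() {
    SignedIndex s;
    return s.seq_col_idx < 0;
}
```

With the unsigned field, any comparison against a signed -1 coming from elsewhere depends on implicit conversions, which is presumably why the reviewer suggests matching the signed type used in the schema.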
[GitHub] [doris] github-actions[bot] commented on pull request #16073: [feature](Load)Add cluster_token auth for stream load to avoid double auth in mysql load
github-actions[bot] commented on PR #16073: URL: https://github.com/apache/doris/pull/16073#issuecomment-1413684972 clang-tidy review says "All clean, LGTM! :+1:"
[doris] branch master updated: [refactor](row-store) make row store column a hidden column in meta (#16251)
This is an automated email from the ASF dual-hosted git repository. dataroaring pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 1d8265c5a3 [refactor](row-store) make row store column a hidden column in meta (#16251) 1d8265c5a3 is described below commit 1d8265c5a3df818fbf87a12192ee419a7593d851 Author: lihangyu <15605149...@163.com> AuthorDate: Thu Feb 2 20:56:13 2023 +0800 [refactor](row-store) make row store column a hidden column in meta (#16251) This could simplify the storage engine logic and make the code more readable, and we could analyze the length of the hidden `__DORIS_ROW_STORE_COL__` column, etc. --- be/src/common/consts.h | 3 +- be/src/olap/compaction.cpp | 5 be/src/olap/memtable.cpp | 32 be/src/olap/memtable.h | 4 +++ be/src/olap/rowset/beta_rowset_writer.cpp | 34 -- be/src/olap/rowset/beta_rowset_writer.h| 1 - be/src/olap/rowset/segment_v2/segment.cpp | 13 - be/src/olap/rowset/segment_v2/segment.h| 1 - be/src/olap/rowset/segment_v2/segment_writer.cpp | 29 -- be/src/olap/rowset/segment_v2/segment_writer.h | 2 -- be/src/olap/rowset/vertical_beta_rowset_writer.cpp | 3 -- be/src/olap/schema.h | 2 +- be/src/olap/tablet.cpp | 8 ++--- be/src/olap/tablet_schema.cpp | 17 +++ be/src/olap/tablet_schema.h| 3 +- be/src/vec/jsonb/serialize.cpp | 4 +++ .../java/org/apache/doris/analysis/ColumnDef.java | 6 .../org/apache/doris/analysis/CreateTableStmt.java | 7 - .../main/java/org/apache/doris/catalog/Column.java | 6 19 files changed, 73 insertions(+), 107 deletions(-) diff --git a/be/src/common/consts.h b/be/src/common/consts.h index f6c7ece8e0..bf7a2e6013 100644 --- a/be/src/common/consts.h +++ b/be/src/common/consts.h @@ -26,11 +26,10 @@ const std::string CSV_WITH_NAMES = "csv_with_names"; const std::string CSV_WITH_NAMES_AND_TYPES = "csv_with_names_and_types"; const std::string BLOCK_TEMP_COLUMN_PREFIX = "__TEMP__"; const std::string ROWID_COL = "__DORIS_ROWID_COL__"; 
-const std::string SOURCE_COL = "__DORIS_SOURCE_COL__"; +const std::string ROW_STORE_COL = "__DORIS_ROW_STORE_COL__"; constexpr int MAX_DECIMAL32_PRECISION = 9; constexpr int MAX_DECIMAL64_PRECISION = 18; constexpr int MAX_DECIMAL128_PRECISION = 38; -constexpr int SOURCE_COL_UNIQUE_ID = INT32_MAX; } // namespace BeConsts } // namespace doris diff --git a/be/src/olap/compaction.cpp b/be/src/olap/compaction.cpp index eb7d2521bf..76c7fc3374 100644 --- a/be/src/olap/compaction.cpp +++ b/be/src/olap/compaction.cpp @@ -276,11 +276,6 @@ Status Compaction::do_compaction_impl(int64_t permits) { stats.rowid_conversion = &_rowid_conversion; } -if (_cur_tablet_schema->store_row_column()) { -// table with row column not support vertical compaction now -vertical_compaction = false; -} - if (use_vectorized_compaction) { if (vertical_compaction) { res = Merger::vertical_merge_rowsets(_tablet, compaction_type(), _cur_tablet_schema, diff --git a/be/src/olap/memtable.cpp b/be/src/olap/memtable.cpp index c56683a9c8..9ec3cb0fda 100644 --- a/be/src/olap/memtable.cpp +++ b/be/src/olap/memtable.cpp @@ -27,6 +27,7 @@ #include "vec/aggregate_functions/aggregate_function_reader.h" #include "vec/aggregate_functions/aggregate_function_simple_factory.h" #include "vec/core/field.h" +#include "vec/jsonb/serialize.h" namespace doris { using namespace ErrorCode; @@ -356,6 +357,10 @@ Status MemTable::_do_flush(int64_t& duration_ns) { SCOPED_RAW_TIMER(&duration_ns); _collect_vskiplist_results(); vectorized::Block block = _output_mutable_block.to_block(); +if (_tablet_schema->store_row_column()) { +// convert block to row store format +serialize_block_to_row_column(block); +} RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size)); return Status::OK(); } @@ -364,4 +369,31 @@ Status MemTable::close() { return flush(); } +void MemTable::serialize_block_to_row_column(vectorized::Block& block) { +if (block.rows() == 0) { +return; +} +MonotonicStopWatch watch; +watch.start(); +// find 
row column id +int row_column_id = 0; +for (int i = 0; i < _tablet_schema->num_columns(); ++i) { +if (_tablet_schema->column(i).is_row_store_column()) { +row_column_id = i; +break; +} +} +vectorized::ColumnString* row_store_column = + static_cast(block.get_by_position(row_column_id) +
[GitHub] [doris] dataroaring merged pull request #16251: [refactor](row-store) make row store column a hidden column in meta
dataroaring merged PR #16251: URL: https://github.com/apache/doris/pull/16251
[GitHub] [doris] hello-stephen commented on pull request #16368: [enhance](Nereids): polish code
hello-stephen commented on PR #16368: URL: https://github.com/apache/doris/pull/16368#issuecomment-1413700547 TeamCity pipeline, clickbench performance test result: the sum of best hot time: 33.68 seconds load time: 494 seconds storage size: 17170879622 Bytes https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202125543_clickbench_pr_89635.html
[GitHub] [doris] BiteTheDDDDt opened a new pull request, #16378: [Feature](Materialized-View) support multiple slot on one column in materialized view
BiteTheDDDDt opened a new pull request, #16378: URL: https://github.com/apache/doris/pull/16378

# Proposed changes

support multiple slot on one column in materialized view

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] dataroaring merged pull request #16358: [Improve](row-store) check light schema change must enabled
dataroaring merged PR #16358: URL: https://github.com/apache/doris/pull/16358
[doris] branch master updated: [Improve](row-store) check light schema change enabled (#16358)
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 13f74088fa [Improve](row-store) check light schema change enabled (#16358)
13f74088fa is described below

commit 13f74088fad806de6145ff6081a7056863d35d06
Author: lihangyu <15605149...@163.com>
AuthorDate: Thu Feb 2 20:57:18 2023 +0800

    [Improve](row-store) check light schema change enabled (#16358)
---
 .../org/apache/doris/datasource/InternalCatalog.java |  4
 .../suites/point_query_p0/test_point_query.groovy    | 18 +-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
index c8214ec7d4..31455d814c 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
@@ -1949,6 +1949,10 @@ public class InternalCatalog implements CatalogIf {
         boolean storeRowColumn = false;
         try {
             storeRowColumn = PropertyAnalyzer.analyzeStoreRowColumn(properties);
+            if (storeRowColumn && !enableLightSchemaChange) {
+                throw new DdlException(
+                        "Row store column rely on light schema change, enable light schema change first");
+            }
         } catch (AnalysisException e) {
             throw new DdlException(e.getMessage());
         }
diff --git a/regression-test/suites/point_query_p0/test_point_query.groovy b/regression-test/suites/point_query_p0/test_point_query.groovy
index 6806650dfb..5d36a4eb31 100644
--- a/regression-test/suites/point_query_p0/test_point_query.groovy
+++ b/regression-test/suites/point_query_p0/test_point_query.groovy
@@ -24,6 +24,22 @@ suite("test_point_query") {
     def url = context.config.jdbcUrl + "&useServerPrepStmts=true"
     def result1 = connect(user=user, password=password, url=url) {
         sql """DROP TABLE IF EXISTS ${tableName}"""
+        test {
+            // abnormal case
+            sql """
+                  CREATE TABLE IF NOT EXISTS ${tableName} (
+                    `k1` int NULL COMMENT ""
+                  ) ENGINE=OLAP
+                  UNIQUE KEY(`k1`)
+                  DISTRIBUTED BY HASH(`k1`) BUCKETS 1
+                  PROPERTIES (
+                  "replication_allocation" = "tag.location.default: 1",
+                  "store_row_column" = "true",
+                  "light_schema_change" = "false"
+                  )
+              """
+            exception "errCode = 2, detailMessage = Row store column rely on light schema change, enable light schema change first"
+        }
         sql """
               CREATE TABLE IF NOT EXISTS ${tableName} (
                 `k1` int(11) NULL COMMENT "",
@@ -123,4 +139,4 @@ suite("test_point_query") {
         qt_sql """execute stmt2 using (1231, 119291.11, 'ddd')"""
         qt_sql """execute stmt2 using (1237, 120939.11130, 'addd')"""
     }
-}
\ No newline at end of file
+}
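Editor's sketch (not part of the commit): the rule added above rejects a table whose properties request a row-store column while light schema change is disabled. The C++ below restates that check in isolation, in C++ only so that it matches the other sketches in this digest. The helper names and the assumed default of "true" for light_schema_change are hypothetical; the real check lives in the Java method shown in the diff.

```cpp
#include <cassert>
#include <map>
#include <string>

using Properties = std::map<std::string, std::string>;

// Hypothetical helper: read a boolean table property, falling back to a default.
bool read_bool_property(const Properties& props, const std::string& key, bool default_value) {
    auto it = props.find(key);
    if (it == props.end()) {
        return default_value;
    }
    return it->second == "true";
}

// Returns false for the combination the commit rejects:
// store_row_column = true together with light_schema_change = false.
// The default used for light_schema_change is an assumption of this sketch.
bool row_store_properties_valid(const Properties& props) {
    const bool store_row_column = read_bool_property(props, "store_row_column", false);
    const bool light_schema_change = read_bool_property(props, "light_schema_change", true);
    return !(store_row_column && !light_schema_change);
}
```

With the properties from the regression test ("store_row_column" = "true", "light_schema_change" = "false") the function returns false, which corresponds to the DdlException raised in the Java code.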
[doris] branch branch-1.2-lts updated (be7c4d267e -> fb5420c262)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

    from be7c4d267e [branch1.2] fix compile bug after cherry-pick
     new 231952a57d [fix](load) sequence column do not compare correctly in memtable (#16211)
     new d4b8629c77 [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)
     new fb5420c262 [improvement](multi-catalog) increase default batch_size to 4064 (#16326)

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 be/src/exec/table_connector.cpp                    |   9 +-
 be/src/olap/memtable.cpp                           |   5 +-
 be/src/vec/exec/format/csv/csv_reader.cpp          |   2 +-
 be/src/vec/exec/format/generic_reader.h            |   2 +
 be/src/vec/exec/format/json/new_json_reader.cpp    |   2 +-
 be/src/vec/exec/format/orc/vorc_reader.cpp         |   2 +-
 be/src/vec/exec/format/parquet/vparquet_reader.cpp |   2 +-
 be/test/olap/delta_writer_test.cpp                 | 115 ++---
 .../docker-compose/mysql/init/03-create-table.sql  |   5 +
 .../docker-compose/oracle/init/03-create-table.sql |   6 ++
 .../postgresql/init/02-create-table.sql            |   6 ++
 .../java/org/apache/doris/analysis/InsertStmt.java |  39 +--
 .../java/org/apache/doris/qe/SessionVariable.java  |   4 +-
 .../doris/transaction/DatabaseTransactionMgr.java  |   3 +-
 .../doris/transaction/GlobalTransactionMgr.java    |   5 +-
 .../unique/test_unique_table_new_sequence.out      |   8 +-
 .../unique/test_unique_table_sequence.out          |   6 +-
 .../data/data_model_p0/unique/unique_key_data1.csv |   1 +
 .../data/data_model_p0/unique/unique_key_data2.csv |   3 +-
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.out    |  13 +++
 .../jdbc_catalog_p0/test_oracle_jdbc_catalog.out   |  13 +++
 .../data/jdbc_catalog_p0/test_pg_jdbc_catalog.out  |  13 +++
 .../unique/test_unique_table_new_sequence.groovy   |   8 +-
 .../unique/test_unique_table_sequence.groovy       |   8 +-
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.groovy |  23 -
 .../test_oracle_jdbc_catalog.groovy                |  14 ++-
 .../jdbc_catalog_p0/test_pg_jdbc_catalog.groovy    |  13 +++
 27 files changed, 248 insertions(+), 82 deletions(-)
[doris] 01/03: [fix](load) sequence column do not compare correctly in memtable (#16211)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 231952a57d9ce300588a23f7e2ee0614fc58259b
Author: zhannngchen <48427519+zhannngc...@users.noreply.github.com>
AuthorDate: Thu Feb 2 11:00:23 2023 +0800

    [fix](load) sequence column do not compare correctly in memtable (#16211)
---
 be/src/olap/memtable.cpp                           |   5 +-
 be/test/olap/delta_writer_test.cpp                 | 115 ++---
 .../unique/test_unique_table_new_sequence.out      |   8 +-
 .../unique/test_unique_table_sequence.out          |   6 +-
 .../data/data_model_p0/unique/unique_key_data1.csv |   1 +
 .../data/data_model_p0/unique/unique_key_data2.csv |   3 +-
 .../unique/test_unique_table_new_sequence.groovy   |   8 +-
 .../unique/test_unique_table_sequence.groovy       |   8 +-
 8 files changed, 100 insertions(+), 54 deletions(-)

diff --git a/be/src/olap/memtable.cpp b/be/src/olap/memtable.cpp
index adc57dfee7..2fdf41e158 100644
--- a/be/src/olap/memtable.cpp
+++ b/be/src/olap/memtable.cpp
@@ -321,8 +321,9 @@ void MemTable::_replace_row(const ContiguousRow& src_row, TableKey row_in_skipli
 void MemTable::_aggregate_two_row_in_block(RowInBlock* new_row, RowInBlock* row_in_skiplist) {
     if (_tablet_schema->has_sequence_col()) {
         auto sequence_idx = _tablet_schema->sequence_col_idx();
-        auto res = _input_mutable_block.compare_at(row_in_skiplist->_row_pos, new_row->_row_pos,
-                                                   sequence_idx, _input_mutable_block, -1);
+        DCHECK_LT(sequence_idx, _input_mutable_block.columns());
+        auto col_ptr = _input_mutable_block.mutable_columns()[sequence_idx].get();
+        auto res = col_ptr->compare_at(row_in_skiplist->_row_pos, new_row->_row_pos, *col_ptr, -1);
         // dst sequence column larger than src, don't need to update
         if (res > 0) {
             return;
diff --git a/be/test/olap/delta_writer_test.cpp b/be/test/olap/delta_writer_test.cpp
index 16051d1adc..b3aa765c2f 100644
--- a/be/test/olap/delta_writer_test.cpp
+++ b/be/test/olap/delta_writer_test.cpp
@@ -29,6 +29,7 @@
 #include "gen_cpp/internal_service.pb.h"
 #include "olap/field.h"
 #include "olap/options.h"
+#include "olap/rowset/beta_rowset.h"
 #include "olap/storage_engine.h"
 #include "olap/tablet.h"
 #include "olap/tablet_meta_manager.h"
@@ -247,7 +248,7 @@ static void create_tablet_request_with_sequence_col(int64_t tablet_id, int32_t s
     request->tablet_schema.short_key_column_count = 2;
     request->tablet_schema.keys_type = TKeysType::UNIQUE_KEYS;
     request->tablet_schema.storage_type = TStorageType::COLUMN;
-    request->tablet_schema.__set_sequence_col_idx(2);
+    request->tablet_schema.__set_sequence_col_idx(4);
     request->__set_storage_format(TStorageFormat::V2);
 
     TColumn k1;
@@ -262,13 +263,6 @@ static void create_tablet_request_with_sequence_col(int64_t tablet_id, int32_t s
     k2.column_type.type = TPrimitiveType::SMALLINT;
     request->tablet_schema.columns.push_back(k2);
 
-    TColumn sequence_col;
-    sequence_col.column_name = SEQUENCE_COL;
-    sequence_col.__set_is_key(false);
-    sequence_col.column_type.type = TPrimitiveType::INT;
-    sequence_col.__set_aggregation_type(TAggregationType::REPLACE);
-    request->tablet_schema.columns.push_back(sequence_col);
-
     TColumn v1;
     v1.column_name = "v1";
     v1.__set_is_key(false);
@@ -282,6 +276,13 @@ static void create_tablet_request_with_sequence_col(int64_t tablet_id, int32_t s
     v2.column_type.type = TPrimitiveType::DATEV2;
     v2.__set_aggregation_type(TAggregationType::REPLACE);
     request->tablet_schema.columns.push_back(v2);
+
+    TColumn sequence_col;
+    sequence_col.column_name = SEQUENCE_COL;
+    sequence_col.__set_is_key(false);
+    sequence_col.column_type.type = TPrimitiveType::INT;
+    sequence_col.__set_aggregation_type(TAggregationType::REPLACE);
+    request->tablet_schema.columns.push_back(sequence_col);
 }
 
 static TDescriptorTable create_descriptor_tablet() {
@@ -346,15 +347,15 @@ static TDescriptorTable create_descriptor_tablet_with_sequence_col() {
             TSlotDescriptorBuilder().type(TYPE_TINYINT).column_name("k1").column_pos(0).build());
     tuple_builder.add_slot(
             TSlotDescriptorBuilder().type(TYPE_SMALLINT).column_name("k2").column_pos(1).build());
+    tuple_builder.add_slot(
+            TSlotDescriptorBuilder().type(TYPE_DATETIME).column_name("v1").column_pos(2).build());
+    tuple_builder.add_slot(
+            TSlotDescriptorBuilder().type(TYPE_DATEV2).column_name("v2").column_pos(3).build());
     tuple_builder.add_slot(TSlotDescriptorBuilder()
                                    .type(TYPE_INT)
                                    .column_name(SEQUENCE_COL)
-                                   .column_pos(2)
+                                   .column_p
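Editor's sketch (not part of the commit): the memtable hunk above fetches the sequence column out of the block and compares two row positions inside that one column, then keeps the existing row when its sequence value is larger. The standalone C++ below models only that decision. IntColumn and should_replace are hypothetical stand-ins, not Doris types, and the plain assert stands in for DCHECK_LT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one column of a block; only compare_at is modelled.
struct IntColumn {
    std::vector<int> values;

    // Returns a negative, zero, or positive value for values[lhs] vs other.values[rhs].
    int compare_at(std::size_t lhs, std::size_t rhs, const IntColumn& other) const {
        const int a = values[lhs];
        const int b = other.values[rhs];
        return (a > b) - (a < b);
    }
};

// Returns true when the new row should overwrite the row already in the skiplist.
// Follows the diff: if the existing ("dst") sequence value is larger, no update is needed.
bool should_replace(const std::vector<IntColumn>& columns, std::size_t sequence_idx,
                    std::size_t existing_row, std::size_t new_row) {
    assert(sequence_idx < columns.size());  // same bounds check as the DCHECK_LT in the patch
    const IntColumn& seq = columns[sequence_idx];
    const int res = seq.compare_at(existing_row, new_row, seq);
    return res <= 0;
}
```

Passing the column itself as both sides of the comparison mirrors the `col_ptr->compare_at(..., *col_ptr, -1)` call in the patch, since both rows live in the same input block.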
[doris] 02/03: [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit d4b8629c77a80f8a1cf95561375ba453337ae406
Author: Tiewei Fang <43782773+bepppo...@users.noreply.github.com>
AuthorDate: Thu Feb 2 17:31:33 2023 +0800

    [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)
---
 be/src/exec/table_connector.cpp                    |  9 +++--
 .../docker-compose/mysql/init/03-create-table.sql  |  5 +++
 .../docker-compose/oracle/init/03-create-table.sql |  6
 .../postgresql/init/02-create-table.sql            |  6
 .../java/org/apache/doris/analysis/InsertStmt.java | 39 --
 .../doris/transaction/DatabaseTransactionMgr.java  |  3 +-
 .../doris/transaction/GlobalTransactionMgr.java    |  5 +--
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.out    | 13
 .../jdbc_catalog_p0/test_oracle_jdbc_catalog.out   | 13
 .../data/jdbc_catalog_p0/test_pg_jdbc_catalog.out  | 13
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.groovy | 23 ++---
 .../test_oracle_jdbc_catalog.groovy                | 14 +++-
 .../jdbc_catalog_p0/test_pg_jdbc_catalog.groovy    | 13
 13 files changed, 140 insertions(+), 22 deletions(-)

diff --git a/be/src/exec/table_connector.cpp b/be/src/exec/table_connector.cpp
index f2c3ff8101..6c310e4a60 100644
--- a/be/src/exec/table_connector.cpp
+++ b/be/src/exec/table_connector.cpp
@@ -336,8 +336,13 @@ Status TableConnector::convert_column_data(const vectorized::ColumnPtr& column_p
     case TYPE_VARCHAR:
     case TYPE_CHAR:
     case TYPE_STRING: {
-        // here need check the ' is used, now for pg array string must be "
-        fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size));
+        // TODO(zhangstar333): check array data type of postgresql
+        // for oracle/pg database string must be '
+        if (table_type == TOdbcTableType::ORACLE || table_type == TOdbcTableType::POSTGRESQL) {
+            fmt::format_to(_insert_stmt_buffer, "'{}'", fmt::basic_string_view(item, size));
+        } else {
+            fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size));
+        }
         break;
     }
     case TYPE_ARRAY: {
diff --git a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
index 6c8371e7c7..1847551d0e 100644
--- a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
+++ b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
@@ -223,4 +223,9 @@ create table doris_test.ex_tb20 (
   decimal_unsigned_long decimal(65, 5) unsigned
 ) engine=innodb charset=utf8;
 
+create table doris_test.test_insert (
+    `id` varchar(128) NULL,
+    `name` varchar(128) NULL,
+    `age` int NULL
+) engine=innodb charset=utf8;
 
diff --git a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
index d5dd8cf1c6..d2d8d6af7e 100644
--- a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
+++ b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
@@ -78,3 +78,9 @@ t4 timestamp,
 t5 interval year(3) to month,
 t6 interval day(3) to second(6)
 );
+
+create table doris_test.test_insert(
+id varchar2(128),
+name varchar2(128),
+age number(5)
+);
diff --git a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
index d2dbac7695..93a307f882 100644
--- a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
+++ b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
@@ -143,3 +143,9 @@ CREATE TABLE catalog_pg_test.test12 (
 ID INT NOT NULL,
 uuid_value uuid
 );
+
+CREATE TABLE catalog_pg_test.test_insert (
+ id varchar(128),
+ name varchar(128),
+ age int
+);
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
index 44140b24e9..891fe3349b 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
@@ -31,6 +31,8 @@ import org.apache.doris.catalog.Partition;
 import org.apache.doris.catalog.PartitionType;
 import org.apache.doris.catalog.Table;
 import org.apache.doris.catalog.TableIf;
+import org.apache.doris.catalog.external.JdbcExternalDatabase;
+import org.apache.doris.catalog.external.JdbcExternalTable;
 import org.apache.doris.common.AnalysisException;
 import org.apache.doris.common.DdlException;
 import org.apache.doris.common.ErrorCode;
@@ -39,6 +41,8 @@ import org.apache.doris.common.Pair;
 import org.apache.doris.common.UserException;
 import org.apache.doris.common.util.DebugUtil
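Editor's sketch (not part of the commit): the table_connector.cpp hunk above chooses the quote character for string values in the generated INSERT statement by target database, using single quotes for Oracle and PostgreSQL and double quotes otherwise. The C++ below isolates that choice using only the standard library. TableType and quote_string_value are hypothetical names standing in for TOdbcTableType and the fmt::format_to calls. Like the diff, the sketch does not escape quote characters embedded in the value.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for TOdbcTableType; only the values needed here.
enum class TableType { MYSQL, ORACLE, POSTGRESQL };

// Wraps a string value in the quote character the target database expects.
// No escaping of embedded quotes is attempted, matching the diff above.
std::string quote_string_value(TableType table_type, std::string_view item) {
    const bool single_quoted =
            table_type == TableType::ORACLE || table_type == TableType::POSTGRESQL;
    const char quote = single_quoted ? '\'' : '"';

    std::string out;
    out.reserve(item.size() + 2);
    out.push_back(quote);
    out.append(item);
    out.push_back(quote);
    return out;
}
```

For example, quote_string_value(TableType::POSTGRESQL, "abc") yields 'abc', while the MYSQL case yields "abc".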
[doris] 03/03: [improvement](multi-catalog) increase default batch_size to 4064 (#16326)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit fb5420c26276f5a34511d76497a3a2a1ce7ffe57
Author: Ashin Gau
AuthorDate: Thu Feb 2 11:51:09 2023 +0800

    [improvement](multi-catalog) increase default batch_size to 4064 (#16326)

    The performance of ClickBench Q30 is affected by batch_size:

    | batch_size | 1024 | 4096 | 20480 |
    | -- | -- | -- | -- |
    | Q30 query time | 2.27 | 1.08 | 0.62 |

    Because aggregation operator will create a new result block for each batch block, and Q30 has 90 columns, which is time-consuming.
    Larger batch_size will decrease the number of aggregation blocks, so the larger batch_size will improve performance.

    Doris internal reader will read at least 4064 rows even if batch_size < 4064, so this PR keep the process of reading external table the same as internal table.
---
 be/src/vec/exec/format/csv/csv_reader.cpp                         | 2 +-
 be/src/vec/exec/format/generic_reader.h                           | 2 ++
 be/src/vec/exec/format/json/new_json_reader.cpp                   | 2 +-
 be/src/vec/exec/format/orc/vorc_reader.cpp                        | 2 +-
 be/src/vec/exec/format/parquet/vparquet_reader.cpp                | 2 +-
 fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java | 4 ++--
 6 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/be/src/vec/exec/format/csv/csv_reader.cpp b/be/src/vec/exec/format/csv/csv_reader.cpp
index d811866d13..c7099b24c7 100644
--- a/be/src/vec/exec/format/csv/csv_reader.cpp
+++ b/be/src/vec/exec/format/csv/csv_reader.cpp
@@ -188,7 +188,7 @@ Status CsvReader::get_next_block(Block* block, size_t* read_rows, bool* eof) {
         return Status::OK();
     }
 
-    const int batch_size = _state->batch_size();
+    const int batch_size = std::max(_state->batch_size(), (int)_MIN_BATCH_SIZE);
     size_t rows = 0;
     auto columns = block->mutate_columns();
     while (rows < batch_size && !_line_reader_eof) {
diff --git a/be/src/vec/exec/format/generic_reader.h b/be/src/vec/exec/format/generic_reader.h
index 30e93aacd8..9f4cfd00ee 100644
--- a/be/src/vec/exec/format/generic_reader.h
+++ b/be/src/vec/exec/format/generic_reader.h
@@ -60,6 +60,8 @@ public:
     }
 
 protected:
+    const size_t _MIN_BATCH_SIZE = 4064; // 4094 - 32(padding)
+
     /// Whether the underlying FileReader has filled the partition&missing columns
     bool _fill_all_columns = false;
 };
diff --git a/be/src/vec/exec/format/json/new_json_reader.cpp b/be/src/vec/exec/format/json/new_json_reader.cpp
index 68a3f089e5..0ed5a0aeb0 100644
--- a/be/src/vec/exec/format/json/new_json_reader.cpp
+++ b/be/src/vec/exec/format/json/new_json_reader.cpp
@@ -105,7 +105,7 @@ Status NewJsonReader::get_next_block(Block* block, size_t* read_rows, bool* eof)
         return Status::OK();
     }
 
-    const int batch_size = _state->batch_size();
+    const int batch_size = std::max(_state->batch_size(), (int)_MIN_BATCH_SIZE);
     auto columns = block->mutate_columns();
 
     while (columns[0]->size() < batch_size && !_reader_eof) {
diff --git a/be/src/vec/exec/format/orc/vorc_reader.cpp b/be/src/vec/exec/format/orc/vorc_reader.cpp
index c295712491..f313cb60f0 100644
--- a/be/src/vec/exec/format/orc/vorc_reader.cpp
+++ b/be/src/vec/exec/format/orc/vorc_reader.cpp
@@ -72,7 +72,7 @@ OrcReader::OrcReader(RuntimeProfile* profile, const TFileScanRangeParams& params
         : _profile(profile),
           _scan_params(params),
           _scan_range(range),
-          _batch_size(batch_size),
+          _batch_size(std::max(batch_size, _MIN_BATCH_SIZE)),
           _range_start_offset(range.start_offset),
           _range_size(range.size),
           _ctz(ctz),
diff --git a/be/src/vec/exec/format/parquet/vparquet_reader.cpp b/be/src/vec/exec/format/parquet/vparquet_reader.cpp
index 7881eebe2d..cfc904d607 100644
--- a/be/src/vec/exec/format/parquet/vparquet_reader.cpp
+++ b/be/src/vec/exec/format/parquet/vparquet_reader.cpp
@@ -36,7 +36,7 @@ ParquetReader::ParquetReader(RuntimeProfile* profile, const TFileScanRangeParams
         : _profile(profile),
           _scan_params(params),
           _scan_range(range),
-          _batch_size(batch_size),
+          _batch_size(std::max(batch_size, _MIN_BATCH_SIZE)),
           _range_start_offset(range.start_offset),
           _range_size(range.size),
           _ctz(ctz) {
diff --git a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
index bfaefd8ac5..db99c90e26 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
@@ -384,9 +384,9 @@ public class SessionVariable implements Serializable, Writable {
     @VariableMgr.VarAttr(name = CODEGEN_LEVEL)
     public in
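Editor's sketch (not part of the commit): the readers above clamp the session batch_size to a floor of 4064 rows, and the commit message argues that fewer, larger blocks mean fewer per-block result allocations in the aggregation operator. The C++ below shows the clamp and the resulting block count for a given row total. The names kMinBatchSize, effective_batch_size and block_count are hypothetical; only the value 4064 and the std::max clamp come from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Floor taken from the diff (_MIN_BATCH_SIZE); the constant name is this sketch's own.
constexpr std::size_t kMinBatchSize = 4064;

// Same clamp as std::max(batch_size, _MIN_BATCH_SIZE) in the readers.
std::size_t effective_batch_size(std::size_t requested) {
    return std::max(requested, kMinBatchSize);
}

// Number of blocks needed to cover total_rows, rounding up.
std::size_t block_count(std::size_t total_rows, std::size_t batch_size) {
    assert(batch_size > 0);
    return (total_rows + batch_size - 1) / batch_size;
}
```

For 100000 rows, a requested batch_size of 1024 would produce 98 blocks without the clamp and 25 blocks with it, which is the reduction in aggregation blocks the commit message describes.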
[GitHub] [doris] xy720 opened a new pull request, #16379: [feature](struct-type/map-type) Add switch for struct and map type for creating table
xy720 opened a new pull request, #16379:
URL: https://github.com/apache/doris/pull/16379

# Proposed changes

Issue Number: #14917

Add switches to forbid users from creating tables with struct or map columns.

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [x] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [x] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] hello-stephen commented on pull request #16369: [Improvement](statistics) optimise histogram keyword
hello-stephen commented on PR #16369:
URL: https://github.com/apache/doris/pull/16369#issuecomment-1413730263

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 33.46 seconds
load time: 486 seconds
storage size: 17170706881 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202131525_clickbench_pr_89593.html
[GitHub] [doris] eldenmoon opened a new pull request, #16380: [Improve](point query) support retry different backends in PointQuery…
eldenmoon opened a new pull request, #16380:
URL: https://github.com/apache/doris/pull/16380

…Executor

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] hello-stephen commented on pull request #16370: [fix](planner) Doris returns empty sets when select from a inline view
hello-stephen commented on PR #16370:
URL: https://github.com/apache/doris/pull/16370#issuecomment-1413755038

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.07 seconds
load time: 494 seconds
storage size: 17122767970 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/2023020211_clickbench_pr_89609.html
[GitHub] [doris] hello-stephen commented on pull request #16372: [fix](iceberg) fix iceberg catalog rest access
hello-stephen commented on PR #16372:
URL: https://github.com/apache/doris/pull/16372#issuecomment-1413782919

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 35.08 seconds
load time: 498 seconds
storage size: 17123367964 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202135323_clickbench_pr_89569.html
[GitHub] [doris] github-actions[bot] commented on pull request #16089: [enhance](cooldown)accelerate cooldown task produce efficiency
github-actions[bot] commented on PR #16089:
URL: https://github.com/apache/doris/pull/16089#issuecomment-1413803974

clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] hello-stephen commented on pull request #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.
hello-stephen commented on PR #16371:
URL: https://github.com/apache/doris/pull/16371#issuecomment-1413808938

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.59 seconds
load time: 505 seconds
storage size: 17171308895 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202141208_clickbench_pr_89574.html
[GitHub] [doris] HappenLee commented on a diff in pull request #16337: [improvement](jdbc) refator jdbc of copy result set by batch
HappenLee commented on code in PR #16337:
URL: https://github.com/apache/doris/pull/16337#discussion_r1094664749

## fe/java-udf/src/main/java/org/apache/doris/udf/JdbcExecutor.java: ##

@@ -325,277 +325,233 @@ private void init(String driverUrl, String sql, int batchSize, String driverClas
     public void copyBatchBooleanResult(Object columnObj, boolean isNullable, int numRows, long nullMapAddr,
             long columnAddr) {
         Boolean[] column = (Boolean[]) columnObj;
-        byte[] columnData = new byte[numRows];
         if (isNullable) {
-            byte[] nullMap = new byte[numRows];
             for (int i = 0; i < numRows; i++) {
                 if (column[i] == null) {
-                    nullMap[i] = 1;
+                    UdfUtils.UNSAFE.putByte(nullMapAddr + i, (byte) 1);

Review Comment:
should always putByte. if column[i] != null, put 0
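Editor's sketch (not part of the review thread): the comment above asks that the null map be written for every row, 0 for non-null and 1 for null, instead of only writing 1 for nulls and relying on the target memory already being zeroed. The C++ below illustrates that pattern with plain buffers. The function name and the use of std::optional<bool> are this sketch's own; the PR itself is Java and writes through UdfUtils.UNSAFE.putByte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Copies a nullable boolean column into raw output buffers.
// Every null_map slot is written, so stale bytes in the buffer cannot leak through.
void copy_boolean_batch(const std::vector<std::optional<bool>>& column,
                        std::uint8_t* null_map, std::uint8_t* data) {
    assert(null_map != nullptr && data != nullptr);
    for (std::size_t i = 0; i < column.size(); ++i) {
        const bool is_null = !column[i].has_value();
        null_map[i] = is_null ? 1 : 0;  // always written, for null and non-null rows alike
        data[i] = (!is_null && *column[i]) ? 1 : 0;
    }
}
```

If the buffers are pre-filled with a non-zero pattern, the unconditional write still yields a correct null map, which is the failure mode the reviewer is guarding against.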
[GitHub] [doris] hello-stephen commented on pull request #16375: [Enhencement](LineReader) rename NewPlainTextLineReader/NewPlainBinaryLineReader to PlainTextLineReader/PlainBinaryLineReader
hello-stephen commented on PR #16375:
URL: https://github.com/apache/doris/pull/16375#issuecomment-1413920053

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.57 seconds
load time: 498 seconds
storage size: 17122682039 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202152411_clickbench_pr_89644.html
[GitHub] [doris] hello-stephen commented on pull request #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets
hello-stephen commented on PR #16374:
URL: https://github.com/apache/doris/pull/16374#issuecomment-1413925976

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.34 seconds
load time: 494 seconds
storage size: 17170830266 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202152815_clickbench_pr_89672.html
[GitHub] [doris] github-actions[bot] commented on pull request #16123: [enhancement-wip](BE http)Support BE http service with brpc
github-actions[bot] commented on PR #16123:
URL: https://github.com/apache/doris/pull/16123#issuecomment-1413945568

clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] github-actions[bot] commented on pull request #16123: [enhancement-wip](BE http)Support BE http service with brpc
github-actions[bot] commented on PR #16123:
URL: https://github.com/apache/doris/pull/16123#issuecomment-1413970256

clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] github-actions[bot] commented on pull request #16123: [enhancement-wip](BE http)Support BE http service with brpc
github-actions[bot] commented on PR #16123:
URL: https://github.com/apache/doris/pull/16123#issuecomment-1413990933

clang-tidy review says "All clean, LGTM! :+1:"
[GitHub] [doris] hello-stephen commented on pull request #16258: [feature](cooldown)Add cooldown delete
hello-stephen commented on PR #16258:
URL: https://github.com/apache/doris/pull/16258#issuecomment-1414002975

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.25 seconds
load time: 496 seconds
storage size: 17122695512 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202161630_clickbench_pr_89683.html