[GitHub] [doris] starocean999 commented on pull request #11212: [fix] the nullable info is lost in ifnull expr
starocean999 commented on PR #11212:
URL: https://github.com/apache/doris/pull/11212#issuecomment-1198946020

> Do you have any case to reproduce the error?
>
> ```java
> if (children.get(0).isNullable()) {
>     return children.get(1).isNullable();
> }
> return false;
> ```
>
> The nullable here needs to be consistent with `FE`. I think we should make more changes to `ifnull`. This is the modification of `get_return_type_impl`; we also need to implement similar logic in `execute_impl`.

Thanks, the BE has been modified and is now consistent with the FE.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
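The rule in the quoted snippet amounts to: the result of `ifnull(a, b)` is nullable only when both arguments are nullable. A minimal standalone sketch of that rule follows; the class and method names (`IfNullNullability`, `resultNullable`) are hypothetical and are not taken from the Doris code base.

```java
public class IfNullNullability {
    // Hypothetical helper mirroring the quoted snippet: the second argument's
    // nullability only matters when the first argument can be NULL.
    public static boolean resultNullable(boolean firstNullable, boolean secondNullable) {
        if (firstNullable) {
            return secondNullable;
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        // Print the full truth table for the two nullability flags.
        for (boolean first : flags) {
            for (boolean second : flags) {
                System.out.println("first nullable=" + first
                        + ", second nullable=" + second
                        + " -> result nullable=" + resultNullable(first, second));
            }
        }
    }
}
```

Under this reading, keeping FE and BE consistent means both sides must report the same flag for the same pair of argument types.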
[GitHub] [doris-flink-connector] JNSimba merged pull request #51: [docs] Fix broken link for doris connector docs
JNSimba merged PR #51: URL: https://github.com/apache/doris-flink-connector/pull/51
[doris-flink-connector] branch master updated: [docs] Fix broken link for doris connector docs (#51)
This is an automated email from the ASF dual-hosted git repository.

diwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-flink-connector.git

The following commit(s) were added to refs/heads/master by this push:
     new f19c2b3 [docs] Fix broken link for doris connector docs (#51)
f19c2b3 is described below

commit f19c2b3e5fe4c141ebc788d9bad2be193d2235de
Author: Paul Lin
AuthorDate: Fri Jul 29 15:02:40 2022 +0800

    [docs] Fix broken link for doris connector docs (#51)
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a4521ab..dd69234 100644
--- a/README.md
+++ b/README.md
@@ -28,7 +28,7 @@ Flink Doris Connector now support flink version from 1.11 to 1.14.

 If you wish to contribute or use a connector from flink 1.13 (and earlier), please use the [branch-for-flink-before-1.13](https://github.com/apache/doris-flink-connector/tree/branch-for-flink-before-1.13)

-More information about compilation and usage, please visit [Flink Doris Connector](https://doris.apache.org/docs/ecosystem/flink-doris-connector.html)
+More information about compilation and usage, please visit [Flink Doris Connector](https://doris.apache.org/docs/ecosystem/flink-doris-connector/)

 ## License
[GitHub] [doris] wangbo commented on a diff in pull request #11257: [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq
wangbo commented on code in PR #11257: URL: https://github.com/apache/doris/pull/11257#discussion_r932930559 ## be/src/vec/aggregate_functions/aggregate_function_uniq.h: ## @@ -109,18 +102,96 @@ class AggregateFunctionUniq final detail::OneAdder::add(this->data(place), *columns[0], row_num); } +static ALWAYS_INLINE const KeyType* get_keys(std::vector& keys_container, + const IColumn& column, size_t batch_size) { +if constexpr (std::is_same_v) { +keys_container.resize(batch_size); +for (size_t i = 0; i != batch_size; ++i) { +StringRef value = column.get_data_at(i); +keys_container[i] = Data::get_key(value); +} +return keys_container.data(); +} else { +using ColumnType = +std::conditional_t, ColumnDecimal, ColumnVector>; +return assert_cast(column).get_data().data(); +} +} + +void add_batch(size_t batch_size, AggregateDataPtr* places, size_t place_offset, + const IColumn** columns, Arena* arena) const override { +std::vector keys_container; +const KeyType* keys = get_keys(keys_container, *columns[0], batch_size); + +std::vector array_of_data_set(batch_size); + +for (size_t i = 0; i != batch_size; ++i) { +array_of_data_set[i] = &(this->data(places[i] + place_offset).set); +} + +for (size_t i = 0; i != batch_size; ++i) { +if (i + HASH_MAP_PREFETCH_DIST < batch_size) { +array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch( +keys[i + HASH_MAP_PREFETCH_DIST]); +} + +array_of_data_set[i]->insert(keys[i]); +} +} + void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena*) const override { -this->data(place).set.merge(this->data(rhs).set); +auto& rhs_set = this->data(rhs).set; +if (rhs_set.size() == 0) return; + +auto& set = this->data(place).set; +set.rehash(set.size() + rhs_set.size()); + +for (auto elem : rhs_set) { +set.insert(elem); Review Comment: phmap has merge method, is that same with insert one by one? -- This is an automated message from the Apache Git Service. 
[doris-website] branch master updated: create table fix
jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git

The following commit(s) were added to refs/heads/master by this push:
     new b50fb0afd20 create table fix
b50fb0afd20 is described below

commit b50fb0afd200cc3c60fcb53ef11f6beaf812a5df
Author: jiafeng.zhang
AuthorDate: Fri Jul 29 15:02:21 2022 +0800

    create table fix
---
 .../sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md | 4 ++--
 .../sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md b/docs/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md
index 5c216b10b58..f87c3407db7 100644
--- a/docs/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md
+++ b/docs/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md
@@ -363,8 +363,8 @@ distribution_info
 PARTITION BY RANGE(k1)
 (
 PARTITION p1 VALUES LESS THAN ("2020-02-01"),
-PARTITION p1 VALUES LESS THAN ("2020-03-01"),
-PARTITION p1 VALUES LESS THAN ("2020-04-01")
+PARTITION p2 VALUES LESS THAN ("2020-03-01"),
+PARTITION p3 VALUES LESS THAN ("2020-04-01")
 )
 DISTRIBUTED BY HASH(k1) BUCKETS 32
 PROPERTIES (
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md
index 588eb6f01c0..812ba9f6c80 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md
@@ -369,8 +369,8 @@ distribution_info
 PARTITION BY RANGE(k1)
 (
 PARTITION p1 VALUES LESS THAN ("2020-02-01"),
-PARTITION p1 VALUES LESS THAN ("2020-03-01"),
-PARTITION p1 VALUES LESS THAN ("2020-04-01")
+PARTITION p2 VALUES LESS THAN ("2020-03-01"),
+PARTITION p3 VALUES LESS THAN ("2020-04-01")
 )
 DISTRIBUTED BY HASH(k1) BUCKETS 32
 PROPERTIES (
[GitHub] [doris] wangbo commented on a diff in pull request #11257: [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq
wangbo commented on code in PR #11257: URL: https://github.com/apache/doris/pull/11257#discussion_r932931508 ## be/src/vec/aggregate_functions/aggregate_function_uniq.h: ## @@ -109,18 +102,96 @@ class AggregateFunctionUniq final detail::OneAdder::add(this->data(place), *columns[0], row_num); } +static ALWAYS_INLINE const KeyType* get_keys(std::vector& keys_container, + const IColumn& column, size_t batch_size) { +if constexpr (std::is_same_v) { +keys_container.resize(batch_size); +for (size_t i = 0; i != batch_size; ++i) { +StringRef value = column.get_data_at(i); +keys_container[i] = Data::get_key(value); +} +return keys_container.data(); +} else { +using ColumnType = +std::conditional_t, ColumnDecimal, ColumnVector>; +return assert_cast(column).get_data().data(); +} +} + +void add_batch(size_t batch_size, AggregateDataPtr* places, size_t place_offset, + const IColumn** columns, Arena* arena) const override { +std::vector keys_container; +const KeyType* keys = get_keys(keys_container, *columns[0], batch_size); + +std::vector array_of_data_set(batch_size); + +for (size_t i = 0; i != batch_size; ++i) { +array_of_data_set[i] = &(this->data(places[i] + place_offset).set); +} + +for (size_t i = 0; i != batch_size; ++i) { +if (i + HASH_MAP_PREFETCH_DIST < batch_size) { +array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch( +keys[i + HASH_MAP_PREFETCH_DIST]); +} + +array_of_data_set[i]->insert(keys[i]); +} +} + void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena*) const override { -this->data(place).set.merge(this->data(rhs).set); +auto& rhs_set = this->data(rhs).set; +if (rhs_set.size() == 0) return; + +auto& set = this->data(place).set; +set.rehash(set.size() + rhs_set.size()); + +for (auto elem : rhs_set) { +set.insert(elem); +} +} + +void add_batch_single_place(size_t batch_size, AggregateDataPtr place, const IColumn** columns, +Arena* arena) const override { +std::vector keys_container; +const KeyType* keys = get_keys(keys_container, 
*columns[0], batch_size); +auto& set = this->data(place).set; + +for (size_t i = 0; i != batch_size; ++i) { +if (i + HASH_MAP_PREFETCH_DIST < batch_size) { +set.prefetch(keys[i + HASH_MAP_PREFETCH_DIST]); +} +set.insert(keys[i]); +} } void serialize(ConstAggregateDataPtr __restrict place, BufferWritable& buf) const override { -this->data(place).set.write(buf); +auto& set = this->data(place).set; +write_var_uint(set.size(), buf); +for (const auto& elem : set) { +write_pod_binary(elem, buf); Review Comment: How about adding a TODO here? After phmap is included in BE's code, we can serialize the phmap set by copying it.
[GitHub] [doris] github-actions[bot] commented on pull request #11328: [doc]Stream load doc fix
github-actions[bot] commented on PR #11328: URL: https://github.com/apache/doris/pull/11328#issuecomment-1198957822 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11328: [doc]Stream load doc fix
github-actions[bot] commented on PR #11328: URL: https://github.com/apache/doris/pull/11328#issuecomment-1198957854 PR approved by anyone and no changes requested.
[GitHub] [doris] hf200012 closed pull request #11100: Bump terser from 4.8.0 to 4.8.1 in /docs
hf200012 closed pull request #11100: Bump terser from 4.8.0 to 4.8.1 in /docs URL: https://github.com/apache/doris/pull/11100
[GitHub] [doris] dependabot[bot] commented on pull request #11100: Bump terser from 4.8.0 to 4.8.1 in /docs
dependabot[bot] commented on PR #11100: URL: https://github.com/apache/doris/pull/11100#issuecomment-1198959655 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting `@dependabot ignore this major version` or `@dependabot ignore this minor version`. If you change your mind, just re-open this PR and I'll resolve any conflicts on it.
[GitHub] [doris] xy720 opened a new pull request, #11329: Add regresstion-test for array_slice function
xy720 opened a new pull request, #11329:
URL: https://github.com/apache/doris/pull/11329

# Proposed changes

Issue Number: close #xxx

In #11200, we fixed the bug in passing constant arguments to array functions, and in #11054 we added support for array slice. This PR adds the regression test for the `array_slice` function with constant arguments.

## Problem Summary:

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [x] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] yiguolei merged pull request #11324: [Improvement](vectorized) Remove row-based conjuncts on vectorized nodes
yiguolei merged PR #11324: URL: https://github.com/apache/doris/pull/11324
[doris] branch master updated (4f8e66c4b3 -> 3fe7b21ac8)
yiguolei pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

    from 4f8e66c4b3 [doc][fix]Modify docs/community to keep the same as the website site directory structure (#11327)
     add 3fe7b21ac8 [Improvement](vectorized) Remove row-based conjuncts on vectorized nodes (#11324)

No new revisions were added by this update.

Summary of changes:
 be/src/vec/exec/join/vhash_join_node.cpp                        | 3 ---
 fe/fe-core/src/main/java/org/apache/doris/planner/PlanNode.java | 7 +--
 2 files changed, 5 insertions(+), 5 deletions(-)
[GitHub] [doris] jackwener opened a new pull request, #11330: [refactor]: refactor UT of Nereids
jackwener opened a new pull request, #11330:
URL: https://github.com/apache/doris/pull/11330

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Extract the plan constructor.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] zhengshiJ commented on a diff in pull request #11264: [feature](nereids)add InPredicate in expressions
zhengshiJ commented on code in PR #11264: URL: https://github.com/apache/doris/pull/11264#discussion_r930974172 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/InPredicate.java: ## @@ -0,0 +1,82 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.expressions; + +import org.apache.doris.nereids.exceptions.UnboundException; +import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor; +import org.apache.doris.nereids.types.BooleanType; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableList.Builder; + +import java.util.List; +import java.util.Objects; +import java.util.stream.Collectors; + +/** + * In predicate expression. 
+ */ +public class InPredicate extends Expression { + +private Expression compareExpr; +private List optionsList; + +public InPredicate(Expression compareExpr, List optionsList) { +super(new Builder().add(compareExpr).addAll(optionsList).build().toArray(new Expression[0])); +this.compareExpr = compareExpr; +this.optionsList = ImmutableList.copyOf(Objects.requireNonNull(optionsList, "In list cannot be null")); +} + +public R accept(ExpressionVisitor visitor, C context) { +return visitor.visitInPredicate(this, context); +} + +@Override +public DataType getDataType() throws UnboundException { +return BooleanType.INSTANCE; +} + +@Override +public boolean nullable() throws UnboundException { +return optionsList.stream().map(Expression::nullable) +.reduce((a, b) -> a || b).get(); +} + +@Override +public String toString() { +return compareExpr + " IN " + optionsList.stream() +.map(Expression::toString) +.collect(Collectors.joining(",", "(", ")")); +} + +@Override +public String toSql() { +return compareExpr.toSql() + " IN " + optionsList.stream() +.map(Expression::toSql) +.collect(Collectors.joining(",", "(", ")")); +} + Review Comment: You also need to add `equals()` and `hashCode()` here. ## fe/fe-core/src/main/java/org/apache/doris/nereids/parser/LogicalPlanBuilder.java: ## @@ -782,8 +783,10 @@ private Expression withPredicate(Expression valueExpression, PredicateContext ct break; case DorisParser.IN: if (ctx.query() == null) { -//TODO: InPredicate Review Comment: Comments can be deleted
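The first review comment above asks for equality methods on `InPredicate`. A minimal sketch of what that could look like, assuming equality is defined by the compared expression plus the option list in order. `SimpleInPredicate` and its `String` fields are hypothetical stand-ins for the Nereids `Expression` types, not the code that was eventually merged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical, simplified stand-in for InPredicate: plain strings replace
// the Nereids Expression type so the example is self-contained.
public final class SimpleInPredicate {
    private final String compareExpr;
    private final List<String> options;

    public SimpleInPredicate(String compareExpr, List<String> options) {
        this.compareExpr = Objects.requireNonNull(compareExpr, "compareExpr cannot be null");
        // Defensive copy so later changes to the caller's list cannot alter equality.
        this.options = Collections.unmodifiableList(
                new ArrayList<>(Objects.requireNonNull(options, "In list cannot be null")));
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        SimpleInPredicate that = (SimpleInPredicate) o;
        return compareExpr.equals(that.compareExpr) && options.equals(that.options);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals() so equal objects hash equally.
        return Objects.hash(compareExpr, options);
    }
}
```

Note that this sketch treats the option list as ordered, so `k1 IN (1,2)` and `k1 IN (2,1)` compare as different; whether that is the desired semantics for the real class is a design decision the thread does not settle.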
[GitHub] [doris] YangShaw commented on a diff in pull request #11264: [feature](nereids)add InPredicate in expressions
YangShaw commented on code in PR #11264: URL: https://github.com/apache/doris/pull/11264#discussion_r932961499 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/InPredicate.java: ## @@ -0,0 +1,82 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.expressions; + +import org.apache.doris.nereids.exceptions.UnboundException; +import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor; +import org.apache.doris.nereids.types.BooleanType; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableList.Builder; + +import java.util.List; +import java.util.Objects; +import java.util.stream.Collectors; + +/** + * In predicate expression. 
+ */ +public class InPredicate extends Expression { + +private Expression compareExpr; +private List optionsList; + +public InPredicate(Expression compareExpr, List optionsList) { +super(new Builder().add(compareExpr).addAll(optionsList).build().toArray(new Expression[0])); +this.compareExpr = compareExpr; +this.optionsList = ImmutableList.copyOf(Objects.requireNonNull(optionsList, "In list cannot be null")); +} + +public R accept(ExpressionVisitor visitor, C context) { +return visitor.visitInPredicate(this, context); +} + +@Override +public DataType getDataType() throws UnboundException { +return BooleanType.INSTANCE; +} + +@Override +public boolean nullable() throws UnboundException { +return optionsList.stream().map(Expression::nullable) +.reduce((a, b) -> a || b).get(); +} + +@Override +public String toString() { +return compareExpr + " IN " + optionsList.stream() +.map(Expression::toString) +.collect(Collectors.joining(",", "(", ")")); +} + +@Override +public String toSql() { +return compareExpr.toSql() + " IN " + optionsList.stream() +.map(Expression::toSql) +.collect(Collectors.joining(",", "(", ")")); +} + Review Comment: done
[GitHub] [doris] mrhhsg commented on a diff in pull request #11257: [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq
mrhhsg commented on code in PR #11257: URL: https://github.com/apache/doris/pull/11257#discussion_r932965068 ## be/src/vec/aggregate_functions/aggregate_function_uniq.h: ## @@ -109,18 +102,96 @@ class AggregateFunctionUniq final detail::OneAdder::add(this->data(place), *columns[0], row_num); } +static ALWAYS_INLINE const KeyType* get_keys(std::vector& keys_container, + const IColumn& column, size_t batch_size) { +if constexpr (std::is_same_v) { +keys_container.resize(batch_size); +for (size_t i = 0; i != batch_size; ++i) { +StringRef value = column.get_data_at(i); +keys_container[i] = Data::get_key(value); +} +return keys_container.data(); +} else { +using ColumnType = +std::conditional_t, ColumnDecimal, ColumnVector>; +return assert_cast(column).get_data().data(); +} +} + +void add_batch(size_t batch_size, AggregateDataPtr* places, size_t place_offset, + const IColumn** columns, Arena* arena) const override { +std::vector keys_container; +const KeyType* keys = get_keys(keys_container, *columns[0], batch_size); + +std::vector array_of_data_set(batch_size); + +for (size_t i = 0; i != batch_size; ++i) { +array_of_data_set[i] = &(this->data(places[i] + place_offset).set); +} + +for (size_t i = 0; i != batch_size; ++i) { +if (i + HASH_MAP_PREFETCH_DIST < batch_size) { +array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch( +keys[i + HASH_MAP_PREFETCH_DIST]); +} + +array_of_data_set[i]->insert(keys[i]); +} +} + void merge(AggregateDataPtr __restrict place, ConstAggregateDataPtr rhs, Arena*) const override { -this->data(place).set.merge(this->data(rhs).set); +auto& rhs_set = this->data(rhs).set; +if (rhs_set.size() == 0) return; + +auto& set = this->data(place).set; +set.rehash(set.size() + rhs_set.size()); + +for (auto elem : rhs_set) { +set.insert(elem); Review Comment: Yes, `phmap::merge` inserts elements one by one and it requires the src(rhs) is not constant. -- This is an automated message from the Apache Git Service. 
[GitHub] [doris] github-actions[bot] commented on pull request #11294: [Doc] Add alter table comment doc
github-actions[bot] commented on PR #11294: URL: https://github.com/apache/doris/pull/11294#issuecomment-1198992417 PR approved by at least one committer and no changes requested.
[GitHub] [doris] hello-stephen commented on pull request #11325: [Improvement] start|stop script files improvements
hello-stephen commented on PR #11325: URL: https://github.com/apache/doris/pull/11325#issuecomment-1198996724 May I ask what problem occurs when the script is a soft link?
[GitHub] [doris] github-actions[bot] commented on pull request #10997: [Doc] Resolve Historical Conflict Documents
github-actions[bot] commented on PR #10997: URL: https://github.com/apache/doris/pull/10997#issuecomment-1198998326 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10997: [Doc] Resolve Historical Conflict Documents
github-actions[bot] commented on PR #10997: URL: https://github.com/apache/doris/pull/10997#issuecomment-1198998360 PR approved by anyone and no changes requested.
[GitHub] [doris] wangshuo128 commented on a diff in pull request #11299: [enhancement](nereids) Normalize expressions before performing plan rewriting
wangshuo128 commented on code in PR #11299:
URL: https://github.com/apache/doris/pull/11299#discussion_r932980621

## fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/logical/PushDownPredicateTest.java: ##
@@ -242,6 +243,7 @@ public void pushDownPredicateIntoScanTest4() {
 }

 private Memo rewrite(Plan plan) {
-return PlanRewriter.topDownRewriteMemo(plan, new ConnectContext(), new PushPredicateThroughJoin());
+Plan normalizedPlan = PlanRewriter.topDownRewrite(plan, new ConnectContext(), new NormalizeExpressions());

Review Comment:
OK, never mind. I was thinking of avoiding copying into and out of `memo` multiple times. Let's refactor this when we have more similar cases in the future.
[GitHub] [doris] smallhibiscus opened a new issue, #11331: [Feature] The poseexplode function support like hive.
smallhibiscus opened a new issue, #11331: URL: https://github.com/apache/doris/issues/11331 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description Doris needs a `posexplode` function like the one in Hive. For the Hive implementation, see https://blog.csdn.net/dzysunshine/article/details/101110467 ### Use case ### Related issues _No response_ ### Are you willing to submit PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] zhengshiJ opened a new pull request, #11332: [feature](nereids) add scalar subquery expression
zhengshiJ opened a new pull request, #11332: URL: https://github.com/apache/doris/pull/11332 # Proposed changes Issue Number: ## Problem Summary: scalar subquery: A subquery that will return only one row and one column. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
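The "one row and one column" contract in the problem summary can be made concrete with a minimal sketch. It is not code from the PR: `ScalarResultSketch` and `scalarValue` are hypothetical names, the result set is assumed to be already materialized as a list of rows, and treating an empty result as SQL NULL is an assumption rather than something the PR states.

```java
import java.util.List;

/** Hypothetical helper for illustration only; it is not part of Doris. */
final class ScalarResultSketch {
    private ScalarResultSketch() {}

    /**
     * Returns the single value of a materialized subquery result.
     * Assumption: an empty result maps to null (SQL NULL).
     */
    static Object scalarValue(int columnCount, List<List<Object>> rows) {
        // A scalar subquery must expose exactly one output column.
        if (columnCount != 1) {
            throw new IllegalArgumentException(
                    "scalar subquery must return exactly one column, got " + columnCount);
        }
        // More than one row cannot be collapsed into a single value.
        if (rows.size() > 1) {
            throw new IllegalStateException(
                    "scalar subquery must return at most one row, got " + rows.size());
        }
        return rows.isEmpty() ? null : rows.get(0).get(0);
    }

    public static void main(String[] args) {
        List<List<Object>> oneRow = List.of(List.<Object>of(42));
        System.out.println(scalarValue(1, oneRow));
    }
}
```

The column check can run at analysis time because it depends only on the plan's output, while the row check can only happen once the subquery has produced data.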
[GitHub] [doris] wolfboys commented on pull request #11325: [Improvement] start|stop script files improvements
wolfboys commented on PR #11325: URL: https://github.com/apache/doris/pull/11325#issuecomment-1199009911 > May I ask what is the problem when the script is a soft link? see https://github.com/apache/doris/pull/10918
[GitHub] [doris] github-actions[bot] commented on pull request #11257: [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq
github-actions[bot] commented on PR #11257: URL: https://github.com/apache/doris/pull/11257#issuecomment-1199009969 PR approved by at least one committer and no changes requested.
[GitHub] [doris] starocean999 opened a new pull request, #11333: [Vectorized][Function] add orthogonal bitmap agg functions (#10126)
starocean999 opened a new pull request, #11333: URL: https://github.com/apache/doris/pull/11333 * [Vectorized][Function] add orthogonal bitmap agg functions save some file about orthogonal bitmap function add some file to rebase update functions file * refactor union_count function refactor orthogonal union count functions * remove bool is_variadic # Proposed changes Issue Number: close #xxx ## Problem Summary: ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [x] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris] branch master updated (3fe7b21ac8 -> 512ff192bf)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from 3fe7b21ac8 [Improvement](vectorized) Remove row-based conjuncts on vectorized nodes (#11324) add 512ff192bf [Doc] Resolve Historical Conflict Documents (#10997) No new revisions were added by this update. Summary of changes: .../admin-manual/cluster-management/upgrade.md | 7 + .../year.md => string-functions/substr.md} | 32 +++--- .../admin-manual/cluster-management/upgrade.md | 6 .../sql-functions/string-functions/substr.md} | 28 ++- 4 files changed, 44 insertions(+), 29 deletions(-) copy docs/en/docs/sql-manual/sql-functions/{date-time-functions/year.md => string-functions/substr.md} (61%) copy docs/{en/docs/sql-manual/sql-functions/date-time-functions/hour.md => zh-CN/docs/sql-manual/sql-functions/string-functions/substr.md} (62%) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] 924060929 commented on a diff in pull request #11332: [feature](nereids) add scalar subquery expression
924060929 commented on code in PR #11332: URL: https://github.com/apache/doris/pull/11332#discussion_r932998072 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/ScalarSubquery.java: ## @@ -0,0 +1,64 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.expressions; + +import org.apache.doris.nereids.exceptions.UnboundException; +import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor; +import org.apache.doris.nereids.trees.plans.logical.LogicalPlan; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.base.Preconditions; + +import java.util.List; +import java.util.Objects; + +/** + * A subquery that will return only one row and one column. 
+ */ +public class ScalarSubquery extends SubqueryExpr { +public ScalarSubquery(LogicalPlan subquery) { +super(Objects.requireNonNull(subquery, "subquery can not be null")); +} + +@Override +public DataType getDataType() throws UnboundException { +Preconditions.checkArgument(queryPlan.getOutput().size() == 1); +return queryPlan.getOutput().get(0).getDataType(); +} + +@Override +public String toSql() { +return " (SCALARSUBQUERY) " + super.toSql(); +} + +@Override +public String toString() { +return " (SCALARSUBQUERY) " + super.toString(); +} + +public <R, C> R accept(ExpressionVisitor<R, C> visitor, C context) { +return visitor.visitScalarSubquery(this, context); +} + +@Override +public Expression withChildren(List<Expression> children) { +Preconditions.checkArgument(children.size() == 1); +Preconditions.checkArgument(children.get(0) instanceof ScalarSubquery); +return new ScalarSubquery(((SubqueryExpr) children.get(0)).getQueryPlan()); Review Comment: I think ScalarSubquery's children can be any plan, not only SubqueryExpr
[GitHub] [doris] hf200012 merged pull request #10997: [Doc] Resolve Historical Conflict Documents
hf200012 merged PR #10997: URL: https://github.com/apache/doris/pull/10997
[GitHub] [doris] 924060929 commented on a diff in pull request #11332: [feature](nereids) add scalar subquery expression
924060929 commented on code in PR #11332: URL: https://github.com/apache/doris/pull/11332#discussion_r933003489 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/ScalarSubquery.java: ## @@ -0,0 +1,64 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.expressions; + +import org.apache.doris.nereids.exceptions.UnboundException; +import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor; +import org.apache.doris.nereids.trees.plans.logical.LogicalPlan; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.base.Preconditions; + +import java.util.List; +import java.util.Objects; + +/** + * A subquery that will return only one row and one column. 
+ */ +public class ScalarSubquery extends SubqueryExpr { +public ScalarSubquery(LogicalPlan subquery) { +super(Objects.requireNonNull(subquery, "subquery can not be null")); +} + +@Override +public DataType getDataType() throws UnboundException { +Preconditions.checkArgument(queryPlan.getOutput().size() == 1); +return queryPlan.getOutput().get(0).getDataType(); +} + +@Override +public String toSql() { +return " (SCALARSUBQUERY) " + super.toSql(); +} + +@Override +public String toString() { +return " (SCALARSUBQUERY) " + super.toString(); +} + +public <R, C> R accept(ExpressionVisitor<R, C> visitor, C context) { +return visitor.visitScalarSubquery(this, context); +} + +@Override +public Expression withChildren(List<Expression> children) { +Preconditions.checkArgument(children.size() == 1); +Preconditions.checkArgument(children.get(0) instanceof ScalarSubquery); +return new ScalarSubquery(((SubqueryExpr) children.get(0)).getQueryPlan()); Review Comment: InSubquery's children can also be any plan.
[GitHub] [doris] hf200012 commented on pull request #11101: [doc] fix some docs issue
hf200012 commented on PR #11101: URL: https://github.com/apache/doris/pull/11101#issuecomment-1199023958 @mychaow rebase
[GitHub] [doris] github-actions[bot] commented on pull request #11212: [fix] the nullable info is lost in ifnull expr
github-actions[bot] commented on PR #11212: URL: https://github.com/apache/doris/pull/11212#issuecomment-1199025816 PR approved by at least one committer and no changes requested.
[GitHub] [doris-spark-connector] caoliang-web opened a new pull request, #46: [doc]Modify click link invalid problem
caoliang-web opened a new pull request, #46: URL: https://github.com/apache/doris-spark-connector/pull/46 # Proposed changes Issue Number: close #xxx ## Problem Summary: Describe the overview of changes. ## Checklist(Required) 1. Does it affect the original behavior: (Yes/No/I Don't know) 2. Has unit tests been added: (Yes/No/No Need) 3. Has document been added or modified: (Yes/No/No Need) 4. Does it need to update dependencies: (Yes/No) 5. Are there any changes that cannot be rolled back: (Yes/No) ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] hf200012 merged pull request #11328: [doc]Stream load doc fix
hf200012 merged PR #11328: URL: https://github.com/apache/doris/pull/11328
[doris] branch master updated: [doc]Stream load doc fix (#11328)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 303963cfdd [doc]Stream load doc fix (#11328) 303963cfdd is described below commit 303963cfddee877e93839e598c6858d0fbcf2636 Author: jiafeng.zhang AuthorDate: Fri Jul 29 16:46:44 2022 +0800 [doc]Stream load doc fix (#11328) Stream load doc fix --- .../join-optimization/doris-join-optimization.md | 222 .../import/import-way/stream-load-manual.md| 9 +- .../Show-Statements/SHOW-STREAM-LOAD.md| 15 +- .../join-optimization/doris-join-optimization.md | 226 + .../import/import-way/stream-load-manual.md| 8 +- .../Show-Statements/SHOW-STREAM-LOAD.md| 15 +- 6 files changed, 479 insertions(+), 16 deletions(-) diff --git a/docs/en/advanced/join-optimization/doris-join-optimization.md b/docs/en/advanced/join-optimization/doris-join-optimization.md new file mode 100644 index 00..da17f8e699 --- /dev/null +++ b/docs/en/advanced/join-optimization/doris-join-optimization.md @@ -0,0 +1,222 @@ +--- +{ +"title": "Doris Join optimization principle", +"language": "en" +} + + +--- + + + +# Doris Join optimization principle + +Doris supports two physical operators, one is **Hash Join**, and the other is **Nest Loop Join**. + +- Hash Join: Create a hash table on the right table based on the equivalent join column, and the left table uses the hash table to perform join calculations in a streaming manner. Its limitation is that it can only be applied to equivalent joins. +- Nest Loop Join: With two for loops, it is very intuitive. Then it is applicable to unequal-valued joins, such as: greater than or less than or the need to find a Cartesian product. It is a general join operator, but has poor performance. + +As a distributed MPP database, data shuffle needs to be performed during the Join process. 
Data needs to be split and scheduled to ensure that the final Join result is correct. As a simple example, assume that the relationship S and R are joined, and N represents the number of nodes participating in the join calculation; T represents the number of tuples in the relationship. + + + +## Doris Shuffle way + +1. Doris supports 4 Shuffle methods + + 1. BroadCast Join + + It requires the full data of the right table to be sent to the left table, that is, each node participating in Join has the full data of the right table, that is, T(R). + + Its applicable scenarios are more general, and it can support Hash Join and Nest loop Join at the same time, and its network overhead is N * T(R). + +  + + The data in the left table is not moved, and the data in the right table is sent to the scanning node of the data in the left table. + +2. Shuffle Join + + When Hash Join is performed, the corresponding Hash value can be calculated through the Join column, and Hash bucketing can be performed. + + Its network overhead is: T(R) + T(N), but it can only support Hash Join, because it also calculates buckets according to the conditions of Join. + +  + + The left and right table data are sent to different partition nodes according to the partition, and the calculated demerits are sent. + +3. Bucket Shuffle Join + + Doris's table data itself is bucketed by Hash calculation, so you can use the properties of the bucketed columns of the table itself to shuffle the Join data. If two tables need to be joined, and the Join column is the bucket column of the left table, then the data in the left table can actually be calculated by sending the data into the buckets of the left table without moving the data in the right table. + + Its network overhead is: T(R) is equivalent to only Shuffle the data in the right table. 
+ +  + + The data in the left table does not move, and the data in the right table is sent to the node that scans the table in the left table according to the result of the partition calculation. + +4. Colocation + + It is similar to Bucket Shuffle Join, which means that the data has been shuffled according to the preset Join column scenario when data is imported. Then the join calculation can be performed directly without considering the Shuffle problem of the data during the actual query. + +  + + The data has been pre-partitioned, and the Join calculation is performed directly locally + +### Comparison of four Shuffle methods + +| Shuffle Mode | Network Overhead | Physical Operators | Applicable Scenarios | +| -- | -
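The network overheads quoted in the join-optimization doc above can be compared with a small cost model. This is only a sketch under stated assumptions, not Doris code: `ShuffleCostSketch`, `Mode`, and `networkCost` are hypothetical names, costs are counted in tuples sent over the network, the Shuffle Join cost is assumed to be T(S) + T(R) because both tables are redistributed (the doc text writes it as T(R) + T(N)), and Colocation is assumed to cost 0 because the doc says no shuffle is needed at query time.

```java
/** Hypothetical cost model for illustration only; it is not part of Doris. */
final class ShuffleCostSketch {
    private ShuffleCostSketch() {}

    enum Mode { BROADCAST, SHUFFLE, BUCKET_SHUFFLE, COLOCATION }

    /**
     * Tuples sent over the network when joining left table S with right table R.
     *
     * @param nodes     N, the number of nodes taking part in the join
     * @param leftRows  T(S), tuples in the left table
     * @param rightRows T(R), tuples in the right table
     */
    static long networkCost(Mode mode, long nodes, long leftRows, long rightRows) {
        switch (mode) {
            case BROADCAST:
                // The full right table is sent to every participating node.
                return nodes * rightRows;
            case SHUFFLE:
                // Assumption: both tables are hash-partitioned on the join column.
                return leftRows + rightRows;
            case BUCKET_SHUFFLE:
                // Only the right table moves, into the left table's buckets.
                return rightRows;
            case COLOCATION:
                // Assumption: data was co-partitioned at import time, so nothing moves.
                return 0L;
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + networkCost(mode, 3, 1_000_000, 1_000));
        }
    }
}
```

Under this model Broadcast is cheaper than Shuffle only while N * T(R) stays below T(S) + T(R), which is why it suits joins with a small right table.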
[GitHub] [doris] hf200012 merged pull request #11294: [Doc] Add alter table comment doc
hf200012 merged PR #11294: URL: https://github.com/apache/doris/pull/11294
[GitHub] [doris] hf200012 closed issue #11293: [Enhancement] Support alter table and column comment
hf200012 closed issue #11293: [Enhancement] Support alter table and column comment URL: https://github.com/apache/doris/issues/11293
[doris] branch master updated: [Doc] Add alter table comment doc (#11294)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 337a2fe2b9 [Doc] Add alter table comment doc (#11294) 337a2fe2b9 is described below commit 337a2fe2b9fc78822d879829a04ba4072047a743 Author: Stalary AuthorDate: Fri Jul 29 16:47:16 2022 +0800 [Doc] Add alter table comment doc (#11294) Add alter table comment doc --- .../Alter/ALTER-TABLE-COMMENT.md | 80 ++ .../Alter/ALTER-TABLE-COMMENT.md | 80 ++ 2 files changed, 160 insertions(+) diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COMMENT.md b/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COMMENT.md new file mode 100644 index 00..4957ae2503 --- /dev/null +++ b/docs/en/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COMMENT.md @@ -0,0 +1,80 @@ +--- +{ +"title": "ALTER-TABLE-COMMENT", +"language": "en" +} +--- + + + +## ALTER-TABLE-COMMENT + +### Name + +ALTER TABLE COMMENT + +### Description + +This statement is used to modify the comment of an existing table. The operation is synchronous, and the command returns to indicate completion. + +grammar: + +```sql +ALTER TABLE [database.]table alter_clause; +``` + +1. Modify table comment + +grammar: + +```sql +MODIFY COMMENT "new table comment"; +``` + +2. Modify column comment + +grammar: + +```sql +MODIFY COLUMN col1 COMMENT "new column comment"; +``` + +### Example + +1. Change the table1's comment to table1_comment + +```sql +ALTER TABLE table1 MODIFY COMMENT "table1_comment"; +``` + +2. 
Change the table1's col1 comment to table1_comment + +```sql +ALTER TABLE table1 MODIFY COLUMN col1 COMMENT "table1_col1_comment"; +``` + +### Keywords + +```text +ALTER, TABLE, COMMENT, ALTER TABLE +``` + +### Best Practice + diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COMMENT.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COMMENT.md new file mode 100644 index 00..29047ee86f --- /dev/null +++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COMMENT.md @@ -0,0 +1,80 @@ +--- +{ +"title": "ALTER-TABLE-COMMENT", +"language": "zh-CN" +} +--- + + + +## ALTER-TABLE-COMMENT + +### Name + +ALTER TABLE COMMENT + +### Description + +该语句用于对已有 table 的 comment 进行修改。这个操作是同步的,命令返回表示执行完毕。 + +语法: + +```sql +ALTER TABLE [database.]table alter_clause; +``` + +1. 修改表注释 + +语法: + +```sql +MODIFY COMMENT "new table comment"; +``` + +2. 修改列注释 + + 语法: + +```sql +MODIFY COLUMN col1 COMMENT "new column comment"; +``` + +### Example + +1. 将名为 table1 的 comment 修改为 table1_comment + +```sql +ALTER TABLE table1 MODIFY COMMENT "table1_comment"; +``` + +2. 将名为 table1 的 col1 列的 comment 修改为 table1_col1_comment + +```sql +ALTER TABLE table1 MODIFY COLUMN col1 COMMENT "table1_col1_comment"; +``` + +### Keywords + +```text +ALTER, TABLE, COMMENT, ALTER TABLE +``` + +### Best Practice + - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[doris] branch master updated: [refactor](be)remove redundant code in column writer (#10915)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 934fe77c06 [refactor](be)remove redundant code in column writer (#10915) 934fe77c06 is described below commit 934fe77c0604b70245f209a8b9661ae18079b2c8 Author: AlexYue AuthorDate: Fri Jul 29 16:48:59 2022 +0800 [refactor](be)remove redundant code in column writer (#10915) --- be/src/olap/rowset/segment_v2/column_writer.cpp | 31 +++-- be/src/olap/rowset/segment_v2/column_writer.h | 20 2 files changed, 13 insertions(+), 38 deletions(-) diff --git a/be/src/olap/rowset/segment_v2/column_writer.cpp b/be/src/olap/rowset/segment_v2/column_writer.cpp index cabc27605f..1d7bee6271 100644 --- a/be/src/olap/rowset/segment_v2/column_writer.cpp +++ b/be/src/olap/rowset/segment_v2/column_writer.cpp @@ -336,20 +336,20 @@ Status ScalarColumnWriter::append_data(const uint8_t** ptr, size_t num_rows) { return Status::OK(); } -Status ScalarColumnWriter::append_data_in_current_page(const uint8_t** ptr, size_t* num_written) { -RETURN_IF_ERROR(_page_builder->add(*ptr, num_written)); +Status ScalarColumnWriter::append_data_in_current_page(const uint8_t* data, size_t* num_written) { +RETURN_IF_ERROR(_page_builder->add(data, num_written)); if (_opts.need_zone_map) { -_zone_map_index_builder->add_values(*ptr, *num_written); +_zone_map_index_builder->add_values(data, *num_written); } if (_opts.need_bitmap_index) { -_bitmap_index_builder->add_values(*ptr, *num_written); +_bitmap_index_builder->add_values(data, *num_written); } if (_opts.need_bloom_filter) { -_bloom_filter_index_builder->add_values(*ptr, *num_written); +_bloom_filter_index_builder->add_values(data, *num_written); } _next_rowid += *num_written; -*ptr += get_field()->size() * (*num_written); + // we must write null bits after write data, because we don't // know how many rows can be written 
into current page if (is_nullable()) { @@ -358,22 +358,9 @@ Status ScalarColumnWriter::append_data_in_current_page(const uint8_t** ptr, size return Status::OK(); } -Status ScalarColumnWriter::append_data_in_current_page(const uint8_t* ptr, size_t* num_written) { -RETURN_IF_ERROR(_page_builder->add(ptr, num_written)); -if (_opts.need_zone_map) { -_zone_map_index_builder->add_values(ptr, *num_written); -} -if (_opts.need_bitmap_index) { -_bitmap_index_builder->add_values(ptr, *num_written); -} -if (_opts.need_bloom_filter) { -_bloom_filter_index_builder->add_values(ptr, *num_written); -} - -_next_rowid += *num_written; -if (is_nullable()) { -_null_bitmap_builder->add_run(false, *num_written); -} +Status ScalarColumnWriter::append_data_in_current_page(const uint8_t** data, size_t* num_written) { +RETURN_IF_ERROR(append_data_in_current_page(*data, num_written)); +*data += get_field()->size() * (*num_written); return Status::OK(); } diff --git a/be/src/olap/rowset/segment_v2/column_writer.h b/be/src/olap/rowset/segment_v2/column_writer.h index a5c2442a24..2c3ea79781 100644 --- a/be/src/olap/rowset/segment_v2/column_writer.h +++ b/be/src/olap/rowset/segment_v2/column_writer.h @@ -133,12 +133,6 @@ public: // used for append not null data. virtual Status append_data(const uint8_t** ptr, size_t num_rows) = 0; -// used for append not null data. When page is full, will append data not reach num_rows. -virtual Status append_data_in_current_page(const uint8_t** ptr, size_t* num_rows) = 0; - -// used for append not null data. When page is full, will append data not reach num_rows. 
-virtual Status append_data_in_current_page(const uint8_t* ptr, size_t* num_rows) = 0; - bool is_nullable() const { return _is_nullable; } Field* get_field() const { return _field.get(); } @@ -188,8 +182,10 @@ public: } Status append_data(const uint8_t** ptr, size_t num_rows) override; -Status append_data_in_current_page(const uint8_t** ptr, size_t* num_rows) override; -Status append_data_in_current_page(const uint8_t* ptr, size_t* num_rows) override; +// used for append not null data. When page is full, will append data not reach num_rows. +Status append_data_in_current_page(const uint8_t** ptr, size_t* num_written); + +Status append_data_in_current_page(const uint8_t* ptr, size_t* num_written); private: std::unique_ptr _page_builder; @@ -267,14 +263,6 @@ public: Status init() override; Status append_data(const uint8_t** ptr, size_t num_rows) override; -Status append_data_in_current_page(const uint8_t** ptr, size_t* num_rows) override { -return Status::NotSupported( -"array writ
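The refactoring in the diff above keeps one overload that does the real work and turns the pointer-to-pointer overload into a thin wrapper that delegates and then advances the caller's pointer. A rough Java analogue of that pattern follows; it is a sketch, not a port of the C++ code. `PageAppendSketch`, `Cursor`, and the fixed page capacity are hypothetical, and a mutable offset stands in for the `const uint8_t**` argument.

```java
/** Hypothetical analogue of the delegating overloads; it is not Doris code. */
final class PageAppendSketch {
    /** Mutable read position, standing in for the C++ pointer-to-pointer. */
    static final class Cursor {
        int offset;

        Cursor(int offset) {
            this.offset = offset;
        }
    }

    private final int fieldSize;        // bytes per value
    private final int pageCapacityRows; // assumed fixed number of rows per page
    private int rowsInPage;

    PageAppendSketch(int fieldSize, int pageCapacityRows) {
        this.fieldSize = fieldSize;
        this.pageCapacityRows = pageCapacityRows;
    }

    /** Core overload: appends from a fixed offset and returns the rows written. */
    int appendInCurrentPage(byte[] data, int offset, int requestedRows) {
        int room = pageCapacityRows - rowsInPage;
        int available = (data.length - offset) / fieldSize;
        // A full page accepts fewer rows than requested, possibly none.
        int written = Math.max(0, Math.min(requestedRows, Math.min(room, available)));
        rowsInPage += written;
        return written;
    }

    /** Cursor overload: delegates to the core overload, then advances the cursor. */
    int appendInCurrentPage(byte[] data, Cursor cursor, int requestedRows) {
        int written = appendInCurrentPage(data, cursor.offset, requestedRows);
        cursor.offset += fieldSize * written;
        return written;
    }

    public static void main(String[] args) {
        PageAppendSketch page = new PageAppendSketch(4, 3);
        Cursor cursor = new Cursor(0);
        int written = page.appendInCurrentPage(new byte[20], cursor, 5);
        System.out.println(written + " rows written, cursor at byte " + cursor.offset);
    }
}
```

Keeping the bookkeeping in a single overload means the index builders and row counters are updated in exactly one place, which is the duplication the commit removes.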
[GitHub] [doris] yiguolei merged pull request #10915: [refactor](be)remove redundant code in column writer
yiguolei merged PR #10915: URL: https://github.com/apache/doris/pull/10915
[GitHub] [doris] yiguolei merged pull request #10864: [tracing] Support opentelemtry collector.
yiguolei merged PR #10864: URL: https://github.com/apache/doris/pull/10864
[doris] branch master updated: [tracing] Support opentelemtry collector. (#10864)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new d3c88471ad [tracing] Support opentelemtry collector. (#10864) d3c88471ad is described below commit d3c88471ad566eef5b7aef62b7d51cc24ba84fbb Author: luozenglin <37725793+luozeng...@users.noreply.github.com> AuthorDate: Fri Jul 29 16:49:40 2022 +0800 [tracing] Support opentelemtry collector. (#10864) * [tracing] Support opentelemtry collector. 1. support for exporting traces to multiple distributed tracing system via collector; 2. support using collector to process traces. --- be/CMakeLists.txt | 16 +++ be/src/common/config.h | 16 +++ be/src/service/doris_main.cpp | 2 +- be/src/util/telemetry/telemetry.cpp| 23 +++- be/src/util/telemetry/telemetry.h | 2 +- docs/en/docs/admin-manual/tracing.md | 139 ++- docs/zh-CN/docs/admin-manual/tracing.md| 152 +++-- fe/fe-core/pom.xml | 3 +- .../main/java/org/apache/doris/common/Config.java | 18 +++ .../apache/doris/common/telemetry/Telemetry.java | 24 ++-- 10 files changed, 358 insertions(+), 37 deletions(-) diff --git a/be/CMakeLists.txt b/be/CMakeLists.txt index 7f11c4cb9c..f47dfd978f 100644 --- a/be/CMakeLists.txt +++ b/be/CMakeLists.txt @@ -352,6 +352,18 @@ set_target_properties(opentelemetry_trace PROPERTIES IMPORTED_LOCATION ${THIRDPA add_library(opentelemetry_http_client_curl STATIC IMPORTED) set_target_properties(opentelemetry_http_client_curl PROPERTIES IMPORTED_LOCATION ${THIRDPARTY_DIR}/lib64/libopentelemetry_http_client_curl.a) +add_library(opentelemetry_exporter_otlp_http STATIC IMPORTED) +set_target_properties(opentelemetry_exporter_otlp_http PROPERTIES IMPORTED_LOCATION ${THIRDPARTY_DIR}/lib64/libopentelemetry_exporter_otlp_http.a) + +add_library(opentelemetry_exporter_otlp_http_client STATIC IMPORTED) +set_target_properties(opentelemetry_exporter_otlp_http_client PROPERTIES 
IMPORTED_LOCATION ${THIRDPARTY_DIR}/lib64/libopentelemetry_exporter_otlp_http_client.a) + +add_library(opentelemetry_otlp_recordable STATIC IMPORTED) +set_target_properties(opentelemetry_otlp_recordable PROPERTIES IMPORTED_LOCATION ${THIRDPARTY_DIR}/lib64/libopentelemetry_otlp_recordable.a) + +add_library(opentelemetry_proto STATIC IMPORTED) +set_target_properties(opentelemetry_proto PROPERTIES IMPORTED_LOCATION ${THIRDPARTY_DIR}/lib64/libopentelemetry_proto.a) + add_library(xml2 STATIC IMPORTED) set_target_properties(xml2 PROPERTIES IMPORTED_LOCATION ${THIRDPARTY_DIR}/lib64/libxml2.a) @@ -678,6 +690,10 @@ set(COMMON_THIRDPARTY opentelemetry_exporter_ostream_span opentelemetry_trace opentelemetry_http_client_curl +opentelemetry_exporter_otlp_http +opentelemetry_exporter_otlp_http_client +opentelemetry_otlp_recordable +opentelemetry_proto ${AWS_LIBS} # put this after lz4 to avoid using lz4 lib in librdkafka librdkafka_cpp diff --git a/be/src/common/config.h b/be/src/common/config.h index 06c8bebf2c..36cfeb684d 100644 --- a/be/src/common/config.h +++ b/be/src/common/config.h @@ -732,9 +732,25 @@ CONF_String(function_service_protocol, "h2:grpc"); // use which load balancer to select server to connect CONF_String(rpc_load_balancer, "rr"); +// Enable tracing +// If this configuration is enabled, you should also specify the trace_export_url. CONF_Bool(enable_tracing, "false"); +// Enable opentelemtry collector +CONF_Bool(enable_otel_collector, "false"); + +// Current support for exporting traces: +// zipkin: Export traces directly to zipkin, which is used to enable the tracing feature quickly. +// collector: The collector can be used to receive and process traces and support export to a variety of +// third-party systems. +CONF_mString(trace_exporter, "zipkin"); +CONF_Validator(trace_exporter, [](const std::string& config) -> bool { +return config == "zipkin" || config == "collector"; +}); + // The endpoint to export spans to. 
+// export to zipkin like: http://127.0.0.1:9411/api/v2/spans +// export to collector like: http://127.0.0.1:4318/v1/traces CONF_String(trace_export_url, "http://127.0.0.1:9411/api/v2/spans"); // The maximum buffer/queue size to collect span. After the size is reached, spans are dropped. diff --git a/be/src/service/doris_main.cpp b/be/src/service/doris_main.cpp index a78bb801b1..d62230a165 100644 --- a/be/src/service/doris_main.cpp +++ b/be/src/service/doris_main.cpp @@ -397,7 +397,7 @@ int main(int argc, char** argv) { // SHOULD be called after exec env is initialized. EXIT_IF_ERROR(engine->start_bg_threads()); -doris::telemetry::initTracer(); +doris::telemetry::i
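The `trace_exporter` validator in the diff above accepts only two values, and the config comments pair each value with a differently shaped endpoint. A minimal standalone sketch of that check follows. `is_valid_trace_exporter` and `example_export_url` are hypothetical helper names that are not part of Doris; the URLs are the examples quoted in the `config.h` comments, not verified defaults for any deployment.

```cpp
#include <cassert>
#include <string>

// Mirrors the CONF_Validator lambda shown in the diff: only "zipkin" and
// "collector" are accepted values for trace_exporter.
// (Hypothetical free function; the diff registers this logic as a lambda.)
inline bool is_valid_trace_exporter(const std::string& config) {
    return config == "zipkin" || config == "collector";
}

// Returns the example endpoint quoted in the config.h comments for each
// exporter. Hypothetical helper, for illustration only.
inline std::string example_export_url(const std::string& exporter) {
    assert(is_valid_trace_exporter(exporter));
    if (exporter == "zipkin") {
        return "http://127.0.0.1:9411/api/v2/spans";
    }
    return "http://127.0.0.1:4318/v1/traces";
}
```

Under these assumptions, switching `trace_exporter` from `zipkin` to `collector` would also mean pointing `trace_export_url` at the collector-style endpoint rather than the Zipkin one.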
[GitHub] [doris] BiteTheDDDDt opened a new issue, #11334: [Bug] core dump on compaction
BiteThet opened a new issue, #11334:
URL: https://github.com/apache/doris/issues/11334

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

master

### What's Wrong?

```cpp
*** Query id: 0-0 ***
*** Aborted at 1659076474 (unix time) try "date -d @1659076474" if you are using GNU date ***
*** Current BE git commitID: fd0e2d3 ***
*** SIGABRT unkown detail explain (@0x1f58bea) received by PID 35818 (TID 0x7f8e62d4f700) from PID 35818; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris-release/be/src/common/signal_handler.h:420
 1# 0x7F8ED2661920 in /lib64/libc.so.6
 2# __GI_raise in /lib64/libc.so.6
 3# abort in /lib64/libc.so.6
 4# 0x559A41D0F289 in /home/disk1/palo-qa/teamcity/local/9137/PALO-BE/be/lib/doris_be
 5# 0x559A41D0489D at src/logging.cc:1650
 6# google::LogMessage::SendToLog() at src/logging.cc:1607
 7# google::LogMessage::Flush() at src/logging.cc:1477
 8# google::LogMessageFatal::~LogMessageFatal() at src/logging.cc:2227
 9# doris::vectorized::IColumn::insert_many_fix_len_data(char const*, unsigned long) at /root/doris-release/be/src/vec/columns/column.h:189
10# doris::segment_v2::BitShufflePageDecoder<(doris::FieldType)15>::read_by_rowids(unsigned int const*, unsigned long, unsigned long*, COW::mutable_ptr&) at /root/doris-release/be/src/olap/rowset/segment_v2/bitshuffle_page.h:423
11# doris::segment_v2::FileColumnIterator::read_by_rowids(unsigned int const*, unsigned long, COW::mutable_ptr&) at /root/doris-release/be/src/olap/rowset/segment_v2/column_reader.cpp:781
12# doris::segment_v2::SegmentIterator::_read_columns_by_rowids(std::vector >&, std::vector >&, unsigned short*, unsigned long, std::vector::mutable_ptr, std::allocator::mutable_ptr > >*) at /root/doris-release/be/src/olap/rowset/segment_v2/segment_iterator.cpp:1015
13# doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*) at /root/doris-release/be/src/olap/rowset/segment_v2/segment_iterator.cpp:1121
14# doris::BetaRowsetReader::next_block(doris::vectorized::Block*) at /root/doris-release/be/src/olap/rowset/beta_rowset_reader.cpp:205
15# doris::vectorized::VCollectIterator::Level0Iterator::_refresh_current_row() at /root/doris-release/be/src/vec/olap/vcollect_iterator.cpp:209
16# doris::vectorized::VCollectIterator::Level0Iterator::init() at /root/doris-release/be/src/vec/olap/vcollect_iterator.cpp:194
17# doris::vectorized::VCollectIterator::build_heap(std::vector, std::allocator > >&) at /root/doris-release/be/src/vec/olap/vcollect_iterator.cpp:72
18# doris::vectorized::BlockReader::_init_collect_iter(doris::TabletReader::ReaderParams const&, std::vector, std::allocator > >*) at /root/doris-release/be/src/vec/olap/block_reader.cpp:63
19# doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) at /root/doris-release/be/src/vec/olap/block_reader.cpp:135
20# doris::Merger::vmerge_rowsets(std::shared_ptr, doris::ReaderType, doris::TabletSchema const*, std::vector, std::allocator > > const&, doris::RowsetWriter*, doris::Merger::Statistics*) at /root/doris-release/be/src/olap/merger.cpp:114
21# doris::Compaction::do_compaction_impl(long) at /root/doris-release/be/src/olap/compaction.cpp:170
22# doris::Compaction::do_compaction(long) at /root/doris-release/be/src/olap/compaction.cpp:122
23# doris::BaseCompaction::execute_compact_impl() at /root/doris-release/be/src/olap/base_compaction.cpp:70
24# doris::Compaction::execute_compact() at /root/doris-release/be/src/olap/compaction.cpp:60
25# doris::Tablet::execute_compaction(doris::CompactionType) at /root/doris-release/be/src/olap/tablet.cpp:1571
26# std::_Function_handler, doris::CompactionType)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:291
27# doris::ThreadPool::dispatch_thread() at /root/doris-release/be/src/util/threadpool.cpp:548
28# doris::Thread::supervise_thread(void*) at /root/doris-release/be/src/util/thread.cpp:409
29# start_thread in /lib64/libpthread.so.0
30# clone in /lib64/libc.so.6
```

### What You Expected?

fix

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] yiguolei merged pull request #11257: [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq
yiguolei merged PR #11257: URL: https://github.com/apache/doris/pull/11257
[doris] branch master updated: [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq (#11257)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new a7199fb98e [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq (#11257) a7199fb98e is described below commit a7199fb98e18b925664b38460b667d04cbee8e01 Author: Jerry Hu AuthorDate: Fri Jul 29 16:55:22 2022 +0800 [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq (#11257) --- .../vec/aggregate_functions/aggregate_function.h | 15 +++ .../aggregate_function_nothing.h | 3 + .../aggregate_functions/aggregate_function_null.h | 12 ++ .../aggregate_functions/aggregate_function_uniq.h | 137 - be/src/vec/exec/vaggregation_node.cpp | 12 +- 5 files changed, 136 insertions(+), 43 deletions(-) diff --git a/be/src/vec/aggregate_functions/aggregate_function.h b/be/src/vec/aggregate_functions/aggregate_function.h index 677c189002..c7c7fc38ca 100644 --- a/be/src/vec/aggregate_functions/aggregate_function.h +++ b/be/src/vec/aggregate_functions/aggregate_function.h @@ -107,6 +107,10 @@ public: virtual void deserialize_vec(AggregateDataPtr places, ColumnString* column, Arena* arena, size_t num_rows) const = 0; +/// Deserializes state and merge it with current aggregation function. +virtual void deserialize_and_merge(AggregateDataPtr __restrict place, BufferReadable& buf, + Arena* arena) const = 0; + /// Returns true if a function requires Arena to handle own states (see add(), merge(), deserialize()). 
virtual bool allocates_memory_in_arena() const { return false; } @@ -253,6 +257,17 @@ public: size_t align_of_data() const override { return alignof(Data); } void reset(AggregateDataPtr place) const override {} + +void deserialize_and_merge(AggregateDataPtr __restrict place, BufferReadable& buf, + Arena* arena) const override { +Data deserialized_data; +AggregateDataPtr deserialized_place = (AggregateDataPtr)&deserialized_data; + +auto derived = static_cast(this); +derived->create(deserialized_place); +derived->deserialize(deserialized_place, buf, arena); +derived->merge(place, deserialized_place, arena); +} }; using AggregateFunctionPtr = std::shared_ptr; diff --git a/be/src/vec/aggregate_functions/aggregate_function_nothing.h b/be/src/vec/aggregate_functions/aggregate_function_nothing.h index c0ae740be4..64af14a6cf 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_nothing.h +++ b/be/src/vec/aggregate_functions/aggregate_function_nothing.h @@ -64,6 +64,9 @@ public: void insert_result_into(ConstAggregateDataPtr, IColumn& to) const override { to.insert_default(); } + +void deserialize_and_merge(AggregateDataPtr __restrict place, BufferReadable& buf, + Arena* arena) const override {} }; } // namespace doris::vectorized diff --git a/be/src/vec/aggregate_functions/aggregate_function_null.h b/be/src/vec/aggregate_functions/aggregate_function_null.h index 5b804b82a7..89960bc9f0 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_null.h +++ b/be/src/vec/aggregate_functions/aggregate_function_null.h @@ -151,6 +151,18 @@ public: } } +void deserialize_and_merge(AggregateDataPtr __restrict place, BufferReadable& buf, + Arena* arena) const override { +bool flag = true; +if (result_is_nullable) { +read_binary(flag, buf); +} +if (flag) { +set_flag(place); +nested_function->deserialize_and_merge(nested_place(place), buf, arena); +} +} + void insert_result_into(ConstAggregateDataPtr __restrict place, IColumn& to) const override { if constexpr 
(result_is_nullable) { ColumnNullable& to_concrete = assert_cast(to); diff --git a/be/src/vec/aggregate_functions/aggregate_function_uniq.h b/be/src/vec/aggregate_functions/aggregate_function_uniq.h index c717307c72..988e9bdb01 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_uniq.h +++ b/be/src/vec/aggregate_functions/aggregate_function_uniq.h @@ -20,6 +20,8 @@ #pragma once +#include + #include #include "gutil/hash/city.h" @@ -34,29 +36,26 @@ namespace doris::vectorized { +// Here is an empirical value. +static constexpr size_t HASH_MAP_PREFETCH_DIST = 16; + /// uniqExact template struct AggregateFunctionUniqExactData { -using Key = T; - -/// When creating, the hash table must be small. -using Set = HashSet, HashTableGrower<4>, -HashTableAllocatorWithStackMemory>; - -Set set; - -static String get_name() { return "uniqExact"; } -}; - -/// For
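The default `deserialize_and_merge` added in this commit builds a temporary state, deserializes into it, and then merges it into the target. A rough sketch of that pattern is below, next to an assumed shortcut in which a set-backed state inserts elements straight from the buffer. The rest of the `aggregate_function_uniq.h` diff is cut off above, so the shortcut is an illustration of the idea and not the code from the commit. `UniqState`, `Buffer`, `serialize`, and `deserialize_then_merge` are hypothetical names, and `std::unordered_set` stands in for `phmap::flat_hash_set` to keep the sketch dependency-free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Stand-in for the aggregation state. The commit uses phmap::flat_hash_set;
// std::unordered_set is substituted here (assumption, for portability).
using UniqState = std::unordered_set<std::uint64_t>;

// Stand-in for a serialized state: element count followed by the elements.
using Buffer = std::vector<std::uint64_t>;

inline Buffer serialize(const UniqState& state) {
    Buffer buf;
    buf.push_back(state.size());
    buf.insert(buf.end(), state.begin(), state.end());
    return buf;
}

// Generic path, following the default implementation in the diff:
// create a temporary state, deserialize into it, then merge it.
inline void deserialize_then_merge(UniqState& place, const Buffer& buf) {
    assert(!buf.empty() && buf[0] == buf.size() - 1);
    UniqState tmp(buf.begin() + 1, buf.end());
    place.insert(tmp.begin(), tmp.end());
}

// Assumed specialised path: insert directly from the buffer and skip the
// temporary set entirely.
inline void deserialize_and_merge(UniqState& place, const Buffer& buf) {
    assert(!buf.empty() && buf[0] == buf.size() - 1);
    for (std::size_t i = 1; i < buf.size(); ++i) {
        place.insert(buf[i]);
    }
}
```

Both functions leave `place` holding the union of the two sets; the second avoids allocating and rehashing an intermediate container, which is the kind of saving a combined entry point makes possible.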
[GitHub] [doris] zhengshiJ commented on a diff in pull request #11332: [feature](nereids) add scalar subquery expression
zhengshiJ commented on code in PR #11332: URL: https://github.com/apache/doris/pull/11332#discussion_r933023032 ## fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/ScalarSubquery.java: ## @@ -0,0 +1,64 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.expressions; + +import org.apache.doris.nereids.exceptions.UnboundException; +import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor; +import org.apache.doris.nereids.trees.plans.logical.LogicalPlan; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.base.Preconditions; + +import java.util.List; +import java.util.Objects; + +/** + * A subquery that will return only one row and one column. 
+ */ +public class ScalarSubquery extends SubqueryExpr { +public ScalarSubquery(LogicalPlan subquery) { +super(Objects.requireNonNull(subquery, "subquery can not be null")); +} + +@Override +public DataType getDataType() throws UnboundException { +Preconditions.checkArgument(queryPlan.getOutput().size() == 1); +return queryPlan.getOutput().get(0).getDataType(); +} + +@Override +public String toSql() { +return " (SCALARSUBQUERY) " + super.toSql(); +} + +@Override +public String toString() { +return " (SCALARSUBQUERY) " + super.toString(); +} + +public R accept(ExpressionVisitor visitor, C context) { +return visitor.visitScalarSubquery(this, context); +} + +@Override +public Expression withChildren(List children) { +Preconditions.checkArgument(children.size() == 1); +Preconditions.checkArgument(children.get(0) instanceof ScalarSubquery); +return new ScalarSubquery(((SubqueryExpr) children.get(0)).getQueryPlan()); Review Comment: done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] hf200012 merged pull request #11061: [Doc] update doc json import with read json by line
hf200012 merged PR #11061: URL: https://github.com/apache/doris/pull/11061
[doris] branch master updated (a7199fb98e -> ea2fac597e)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from a7199fb98e [improvement]Use phmap::flat_hash_set in AggregateFunctionUniq (#11257) add ea2fac597e update json import with read json by line (#11061) No new revisions were added by this update. Summary of changes: .../import/import-way/load-json-format.md | 70 -- .../import/import-way/load-json-format.md | 68 - 2 files changed, 105 insertions(+), 33 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] yiguolei commented on a diff in pull request #11131: [Improvement] support tablet schema cache
yiguolei commented on code in PR #11131:
URL: https://github.com/apache/doris/pull/11131#discussion_r933031064

## be/src/olap/schema_change.cpp: ##
@@ -2365,7 +2364,7 @@ Status SchemaChangeHandler::_parse_request(
 }
 const TabletSchema& ref_tablet_schema = *base_tablet_schema;
-const TabletSchema& new_tablet_schema = new_tablet->tablet_schema();
+const TabletSchema& new_tablet_schema = *new_tablet->tablet_schema();

Review Comment: use copy_from here
[GitHub] [doris] github-actions[bot] commented on pull request #11264: [feature](nereids)add InPredicate in expressions
github-actions[bot] commented on PR #11264: URL: https://github.com/apache/doris/pull/11264#issuecomment-1199048314 PR approved by anyone and no changes requested.
[GitHub] [doris] starocean999 opened a new pull request, #11335: [FIX]DCHECK error of array functions
starocean999 opened a new pull request, #11335:
URL: https://github.com/apache/doris/pull/11335

# Proposed changes

Issue Number: close (https://github.com/apache/doris/issues/11317)

## Problem Summary:

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] yiguolei commented on a diff in pull request #11131: [Improvement] support tablet schema cache
yiguolei commented on code in PR #11131:
URL: https://github.com/apache/doris/pull/11131#discussion_r933031064

## be/src/olap/schema_change.cpp: ##
@@ -2365,7 +2364,7 @@ Status SchemaChangeHandler::_parse_request(
 }
 const TabletSchema& ref_tablet_schema = *base_tablet_schema;
-const TabletSchema& new_tablet_schema = new_tablet->tablet_schema();
+const TabletSchema& new_tablet_schema = *new_tablet->tablet_schema();

Review Comment: use copy_from here if you will modify the tablet schema. If you will not modify tablet schema, then just use share ptr here.
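The distinction drawn in the review comment (copy when the schema will be modified, share when it is only read) can be sketched as follows. `Schema` is a hypothetical, heavily simplified stand-in for `TabletSchema`; the `copy_from` name is borrowed from the comment, but its real signature is not shown in this thread, and `column_count` and `with_extra_column` are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a cached tablet schema.
struct Schema {
    std::vector<std::string> columns;

    // Deep-copies another schema into this one (assumed behaviour).
    void copy_from(const Schema& other) { columns = other.columns; }
};

using SchemaSPtr = std::shared_ptr<Schema>;

// Read-only use: share the cached object, no copy is made.
inline std::size_t column_count(const SchemaSPtr& cached) {
    return cached->columns.size();
}

// Mutating use: work on a private copy so the cached schema stays intact
// for every other holder of the shared pointer.
inline SchemaSPtr with_extra_column(const SchemaSPtr& cached, const std::string& name) {
    auto copy = std::make_shared<Schema>();
    copy->copy_from(*cached);
    copy->columns.push_back(name);
    return copy;
}
```

With a schema cache, several tablets may hold the same pointer, so an in-place change through a shared reference would silently affect all of them; copying first keeps the change local.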
[GitHub] [doris-spark-connector] JNSimba merged pull request #46: [doc]Modify click link invalid problem
JNSimba merged PR #46: URL: https://github.com/apache/doris-spark-connector/pull/46
[doris-spark-connector] branch master updated: [doc] modify click link invalid problem (#46)
This is an automated email from the ASF dual-hosted git repository. diwu pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-spark-connector.git The following commit(s) were added to refs/heads/master by this push: new fbdeafd [doc] modify click link invalid problem (#46) fbdeafd is described below commit fbdeafd05dbc5e4a1c886346deecd1a2f6926fe1 Author: caoliang-web <71004656+caoliang-...@users.noreply.github.com> AuthorDate: Fri Jul 29 17:17:51 2022 +0800 [doc] modify click link invalid problem (#46) --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 4269c1f..b93eecf 100644 --- a/README.md +++ b/README.md @@ -24,7 +24,7 @@ under the License. ### Spark Doris Connector -More information about compilation and usage, please visit [Spark Doris Connector](https://doris.apache.org/docs/ecosystem/spark-doris-connector.html) +More information about compilation and usage, please visit [Spark Doris Connector](https://doris.apache.org/docs/ecosystem/spark-doris-connector) ## License - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] hf200012 merged pull request #11075: [Doc] update flink connector faq
hf200012 merged PR #11075: URL: https://github.com/apache/doris/pull/11075
[doris] branch master updated: update flink connecotr problem (#11075)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 4cbfd7822a update flink connecotr problem (#11075) 4cbfd7822a is described below commit 4cbfd7822a1b1d2f45e27973ea865d500a0d1285 Author: wudi <676366...@qq.com> AuthorDate: Fri Jul 29 17:22:59 2022 +0800 update flink connecotr problem (#11075) update flink connecotr problem --- docs/en/docs/ecosystem/flink-doris-connector.md| 9 +++-- docs/zh-CN/docs/ecosystem/flink-doris-connector.md | 11 --- 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/docs/en/docs/ecosystem/flink-doris-connector.md b/docs/en/docs/ecosystem/flink-doris-connector.md index 66896544ef..1a504cdba1 100644 --- a/docs/en/docs/ecosystem/flink-doris-connector.md +++ b/docs/en/docs/ecosystem/flink-doris-connector.md @@ -365,7 +365,7 @@ source.sinkTo(builder.build()); | doris.read.field| --| N | List of column names in the Doris table, separated by commas | | doris.filter.query | --| N | Filter expression of the query, which is transparently transmitted to Doris. Doris uses this expression to complete source-side data filtering. | | sink.label-prefix | -- | Y | The label prefix used by stream load imports. In the 2pc scenario, global uniqueness is required to ensure the EOS semantics of Flink. | -| sink.properties.* | -- | N | The stream load parameters. eg: sink.properties.column_separator' = ',' Setting 'sink.properties.escape_delimiters' = 'true' if you want to use a control char as a separator, so that such as '\\x01' will translate to binary 0x01 Support JSON format import, you need to enable both 'sink.properties.format' ='json' and 'sink.properties.strip_outer_array' ='true'| +| sink.properties.* | -- | N | The stream load parameters. 
eg: `sink.properties.column_separator' = ','` Setting `'sink.properties.escape_delimiters' = 'true'` if you want to use a control char as a separator, so that such as '\\x01' will translate to binary 0x01 Support JSON format import, you need to enable both `'sink.properties.format' ='json'` and `'sink.properties.read_json_by_line' = 'true'` | | sink.enable-delete | true | N | Whether to enable deletion. This option requires Doris table to enable batch delete function (0.15+ version is enabled by default), and only supports Uniq model.| | sink.enable-2pc | true | N| Whether to enable two-phase commit (2pc), the default is true, to ensure Exactly-Once semantics. For two-phase commit, please refer to [here](../data-operate/import/import-way/stream-load-manual.md). | | sink.max-retries | 1 | N| In the 2pc scenario, the number of retries after the commit phase fails. | @@ -450,7 +450,7 @@ The most suitable scenario for using Flink Doris Connector is to synchronize sou ### common problem -1. Bitmap type write +**1. Bitmap type write** ```sql CREATE TABLE bitmap_sink ( @@ -468,3 +468,8 @@ WITH ( 'sink.properties.columns' = 'dt,page,user_id,user_id=to_bitmap(user_id)' ) + +**2. errCode = 2, detailMessage = Label [label_0_1] has already been used, relate to txn [19650]** + +In the Exactly-Once scenario, the Flink Job must be restarted from the latest Checkpoint/Savepoint, otherwise the above error will be reported. +When Exactly-Once is not required, it can also be solved by turning off 2PC commits (`sink.enable-2pc=false`) or changing to a different `sink.label-prefix`. 
\ No newline at end of file diff --git a/docs/zh-CN/docs/ecosystem/flink-doris-connector.md b/docs/zh-CN/docs/ecosystem/flink-doris-connector.md index 83e8102447..55ba3bc4cf 100644 --- a/docs/zh-CN/docs/ecosystem/flink-doris-connector.md +++ b/docs/zh-CN/docs/ecosystem/flink-doris-connector.md @@ -364,8 +364,8 @@ source.sinkTo(builder.build()); | doris.deserialize.queue.size | 64 | N| Internal processing queue for asynchronous Arrow format conversion; takes effect when doris.deserialize.arrow.async is true | | doris.read.field | -- | N| List of column names to read from the Doris table, with multiple columns separated by commas
[GitHub] [doris] github-actions[bot] commented on pull request #11299: [enhancement](nereids) Normalize expressions before performing plan rewriting
github-actions[bot] commented on PR #11299: URL: https://github.com/apache/doris/pull/11299#issuecomment-1199069226 PR approved by anyone and no changes requested.
[GitHub] [doris] jackwener opened a new pull request, #11336: [fix](ci): add checkout to fix PR-title-checker
jackwener opened a new pull request, #11336:
URL: https://github.com/apache/doris/pull/11336

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

fix github-action PR-title-checker

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [x] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] hello-stephen opened a new pull request, #11337: [opt] unify stop script
hello-stephen opened a new pull request, #11337:
URL: https://github.com/apache/doris/pull/11337

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

unify stop script, be stop more friendly

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [x] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] jackwener commented on a diff in pull request #11337: [opt] unify stop script
jackwener commented on code in PR #11337:
URL: https://github.com/apache/doris/pull/11337#discussion_r933059835

## bin/stop_be.sh: ##

@@ -68,11 +68,18 @@ if [ -f $pidfile ]; then
         exit 1
     fi

-# kill
+#kill pid process and check it

Review Comment:
Don't remove the blank after `#`.
[GitHub] [doris] SaintBacchus opened a new pull request, #11338: [feature-wip][multi-catalog][WIP]Support use catalog.db and show databases from catalog stmt
SaintBacchus opened a new pull request, #11338:
URL: https://github.com/apache/doris/pull/11338

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Support the `use catalog.db` and `show databases from catalog` statements. Presto supports these statements.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [ ] No
[GitHub] [doris] ByteYue closed issue #10914: [refactor] remove redundant code in column writer
ByteYue closed issue #10914: [refactor] remove redundant code in column writer
URL: https://github.com/apache/doris/issues/10914
[GitHub] [doris] adonis0147 opened a new pull request, #11339: [enhancement](workflow) Use ccache to speed the BE UT (Clang) up
adonis0147 opened a new pull request, #11339:
URL: https://github.com/apache/doris/pull/11339

# Proposed changes

~~Issue Number: close #xxx~~

## Problem Summary:

We can use [ccache-action](https://github.com/hendrikmuhs/ccache-action) to speed up the BE UT (Clang) workflow.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [x] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [x] No
[GitHub] [doris] yiguolei merged pull request #11274: [sample]Optimize flink oracle cdc, add flink read es to doris sample code
yiguolei merged PR #11274:
URL: https://github.com/apache/doris/pull/11274
[doris] branch master updated: Optimize flink oracle cdc, add flink read es to doris sample code (#11274)
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new cb7eb725fe Optimize flink oracle cdc, add flink read es to doris sample code (#11274)
cb7eb725fe is described below

commit cb7eb725fe2fb197200a1bcc19fd180f566af907
Author: caoliang-web <71004656+caoliang-...@users.noreply.github.com>
AuthorDate: Fri Jul 29 18:11:18 2022 +0800

    Optimize flink oracle cdc, add flink read es to doris sample code (#11274)
---
 samples/doris-demo/flink-demo/pom.xml | 16 ++
 .../doris/demo/flink/cdc/FlinkOracleCdcDemo.java | 4 -
 .../flink/elasticsearch/ElasticsearchInput.java | 248 +
 .../flink/elasticsearch/FlinkReadEs2Doris.java | 100 +
 4 files changed, 364 insertions(+), 4 deletions(-)

diff --git a/samples/doris-demo/flink-demo/pom.xml b/samples/doris-demo/flink-demo/pom.xml
index 0f662acc41..586a86c68d 100644
--- a/samples/doris-demo/flink-demo/pom.xml
+++ b/samples/doris-demo/flink-demo/pom.xml
@@ -118,6 +118,22 @@ under the License.
 flink-connector-oracle-cdc
 2.1.1
+
+
+org.apache.flink
+flink-connector-elasticsearch7_${scala.binary.version}
+${flink.version}
+
+
+org.apache.httpcomponents
+httpclient
+4.5.13
+
+
+org.apache.httpcomponents
+httpcore
+4.4.12
+
diff --git a/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/cdc/FlinkOracleCdcDemo.java b/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/cdc/FlinkOracleCdcDemo.java
index 5a1045a029..92350a14f3 100644
--- a/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/cdc/FlinkOracleCdcDemo.java
+++ b/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/cdc/FlinkOracleCdcDemo.java
@@ -104,10 +104,6 @@ public class FlinkOracleCdcDemo {

         LogicalType[] types={new IntType(),new VarCharType(),new VarCharType(), new DoubleType()};

-        Properties pro = new Properties();
-        pro.setProperty("format", "json");
-        pro.setProperty("strip_outer_array", "false");
-
         map.addSink(
                 DorisSink.sink(
                         fields,
diff --git a/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/elasticsearch/ElasticsearchInput.java b/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/elasticsearch/ElasticsearchInput.java
new file mode 100644
index 00..33c9567b53
--- /dev/null
+++ b/samples/doris-demo/flink-demo/src/main/java/org/apache/doris/demo/flink/elasticsearch/ElasticsearchInput.java
@@ -0,0 +1,248 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+package org.apache.doris.demo.flink.elasticsearch;
+
+import org.apache.commons.collections.map.CaseInsensitiveMap;
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
+import org.apache.flink.api.common.io.RichInputFormat;
+import org.apache.flink.api.common.io.statistics.BaseStatistics;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
+import org.apache.flink.api.java.typeutils.RowTypeInfo;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.io.GenericInputSplit;
+import org.apache.flink.core.io.InputSplit;
+import org.apache.flink.core.io.InputSplitAssigner;
+import org.apache.flink.streaming.connectors.elasticsearch7.RestClientFactory;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Preconditions;
+import org.apache.http.HttpHost;
+import org.elasticsearch.action.search.ClearScrollRequest;
+import org.elasticsearch.action.search.SearchRequest;
+import org.elasticsearch.action.search.SearchResponse;
+import org.elasticsearch.action.search.SearchScrollRequest;
+import org.elasticsearch.client.RequestOptions;
+import org.ela
[GitHub] [doris] hello-stephen commented on pull request #11325: [Improvement] start|stop script files improvements
hello-stephen commented on PR #11325:
URL: https://github.com/apache/doris/pull/11325#issuecomment-1199111829

> > May I ask what is the problem when the script is a soft link?
>
> see #10918

Thanks. I tried your new script and found that when I use `ln -s bin/start_fe.sh xxx.sh` to create a soft link in another directory and then run `xxx.sh`, it works properly. The old script does not work in this situation.

LGTM
[GitHub] [doris] stalary opened a new pull request, #11340: [Feature] doe support array
stalary opened a new pull request, #11340:
URL: https://github.com/apache/doris/pull/11340

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

doe support array

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes
    - [ ] No
[GitHub] [doris-spark-connector] caoliang-web commented on a diff in pull request #45: [Fix] add tips for Mac OS users in case meeting getopt error
caoliang-web commented on code in PR #45:
URL: https://github.com/apache/doris-spark-connector/pull/45#discussion_r933095406

## spark-doris-connector/build.sh: ##

@@ -45,6 +45,17 @@ usage() {
   exit 1
 }

+# we use GNU enhanced version getopt command here for long option names, rather than the original version
+# determine the version of the getopt command before using
+getopt -T > /dev/null
+if [ $? -ne 4 ]; then
+  echo "
+The GNU version of getopt command is required.
+On Mac OS, you can use Homebrew to install gnu-getopt: brew install gnu-getopt, then set gnu-getopt as default getopt. Refernence: https://stackoverflow.com/questions/12152077/how-can-i-make-bash-deal-with-long-param-using-getopt-command-in-mac
+  "
+  exit 1
+fi
+

Review Comment:
When build.sh is executed, the message "The GNU version of getopt command is required." appears and the script exits. After installing gnu-getopt, the problem remains, and running build.sh in a Linux environment produces no response.
[GitHub] [doris] morningman commented on a diff in pull request #11292: [feature-wip][multi-catalog]Support prefetch for orc file format
morningman commented on code in PR #11292:
URL: https://github.com/apache/doris/pull/11292#discussion_r933095117

## be/src/exec/arrow/parquet_reader.h: ##

@@ -86,8 +85,9 @@ class ParquetReaderWrap final : public ArrowReaderWrap {
                          int32_t* wbtyes);

 private:
-    void prefetch_batch();
     Status read_next_batch();
+    void readBatches(arrow::RecordBatchVector& batches, int current_group) override;

Review Comment:
```suggestion
    void read_batches(arrow::RecordBatchVector& batches, int current_group) override;
```

## be/src/exec/arrow/orc_reader.h: ##

@@ -40,11 +40,12 @@ class ORCReaderWrap final : public ArrowReaderWrap {
                 const std::vector& tuple_slot_descs,
                 const std::vector& conjunct_ctxs,
                 const std::string& timezone) override;
-    Status next_batch(std::shared_ptr* batch, bool* eof) override;

 private:
     Status _next_stripe_reader(bool* eof);
     Status _seek_start_stripe();
+    void readBatches(arrow::RecordBatchVector& batches, int current_group) override;

Review Comment:
```suggestion
    void read_batches(arrow::RecordBatchVector& batches, int current_group) override;
```

## be/src/exec/arrow/parquet_reader.h: ##

@@ -86,8 +85,9 @@ class ParquetReaderWrap final : public ArrowReaderWrap {
                          int32_t* wbtyes);

 private:
-    void prefetch_batch();
     Status read_next_batch();
+    void readBatches(arrow::RecordBatchVector& batches, int current_group) override;
+    bool filterRowGroup(int current_group) override;

Review Comment:
```suggestion
    bool filter_row_group(int current_group) override;
```

## be/src/exec/arrow/orc_reader.h: ##

@@ -40,11 +40,12 @@ class ORCReaderWrap final : public ArrowReaderWrap {
                 const std::vector& tuple_slot_descs,
                 const std::vector& conjunct_ctxs,
                 const std::string& timezone) override;
-    Status next_batch(std::shared_ptr* batch, bool* eof) override;

 private:
     Status _next_stripe_reader(bool* eof);
     Status _seek_start_stripe();
+    void readBatches(arrow::RecordBatchVector& batches, int current_group) override;
+    bool filterRowGroup(int current_group) override;

Review Comment:
```suggestion
    bool filter_row_group(int current_group) override;
```
[GitHub] [doris] morningman commented on a diff in pull request #11338: [feature-wip][multi-catalog][WIP]Support use catalog.db and show databases from catalog stmt
morningman commented on code in PR #11338:
URL: https://github.com/apache/doris/pull/11338#discussion_r933099724

## fe/fe-core/src/main/java/org/apache/doris/analysis/UseStmt.java: ##

@@ -35,19 +35,35 @@
  */
 public class UseStmt extends StatementBase {
     private static final Logger LOG = LogManager.getLogger(UseStmt.class);
+    private String catalogName;
     private String database;

     public UseStmt(String db) {
         database = db;
     }

+    public UseStmt(String catalogName, String db) {

Review Comment:
Is there no need to modify the use stmt in `sql_parser.cup`?

## fe/fe-core/src/main/cup/sql_parser.cup: ##

@@ -2738,6 +2742,10 @@ show_param ::=
     {:
         RESULT = new ShowDbStmt(parser.wild, parser.where);
     :}
+    | KW_SCHEMAS KW_FROM STRING_LITERAL:catalogName

Review Comment:
Use `ident` instead of `STRING_LITERAL`, so that we don't need to use `""`.
[GitHub] [doris] SaintBacchus commented on a diff in pull request #11338: [feature-wip][multi-catalog][WIP]Support use catalog.db and show databases from catalog stmt
SaintBacchus commented on code in PR #11338:
URL: https://github.com/apache/doris/pull/11338#discussion_r933103466

## fe/fe-core/src/main/cup/sql_parser.cup: ##

@@ -2738,6 +2742,10 @@ show_param ::=
     {:
         RESULT = new ShowDbStmt(parser.wild, parser.where);
     :}
+    | KW_SCHEMAS KW_FROM STRING_LITERAL:catalogName

Review Comment:
Yes, I'm modifying it.
[GitHub] [doris] SaintBacchus commented on a diff in pull request #11338: [feature-wip][multi-catalog][WIP]Support use catalog.db and show databases from catalog stmt
SaintBacchus commented on code in PR #11338:
URL: https://github.com/apache/doris/pull/11338#discussion_r933104103

## fe/fe-core/src/main/java/org/apache/doris/analysis/UseStmt.java: ##

@@ -35,19 +35,35 @@
  */
 public class UseStmt extends StatementBase {
     private static final Logger LOG = LogManager.getLogger(UseStmt.class);
+    private String catalogName;
     private String database;

     public UseStmt(String db) {
         database = db;
     }

+    public UseStmt(String catalogName, String db) {

Review Comment:
Still working on it.
[GitHub] [doris] morningman merged pull request #11260: [feature-wip](multi-catalog)(fix) partition value error when a block contains multiple splits
morningman merged PR #11260:
URL: https://github.com/apache/doris/pull/11260
[doris] branch master updated: [feature-wip](multi-catalog)(fix) partition value error when a block contains multiple splits (#11260)
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 84ce2a1e98 [feature-wip](multi-catalog)(fix) partition value error when a block contains multiple splits (#11260)
84ce2a1e98 is described below

commit 84ce2a1e983a118feae614166d1d1d905deeb4bb
Author: Ashin Gau
AuthorDate: Fri Jul 29 18:48:59 2022 +0800

    [feature-wip](multi-catalog)(fix) partition value error when a block contains multiple splits (#11260)

    `FileArrowScanner::get_next` returns a block when it is full, so a block may contain multiple splits
    for small files or cross two splits for large files. However, a block can only be filled with the
    partition values from one file. Different splits may come from different files, which causes errors
    in the embedded partition values.
---
 be/src/vec/exec/file_arrow_scanner.cpp | 2 +-
 be/src/vec/exec/file_scanner.cpp | 5 +
 be/src/vec/exec/file_scanner.h | 2 +-
 be/src/vec/exec/file_text_scanner.cpp | 5 +
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/be/src/vec/exec/file_arrow_scanner.cpp b/be/src/vec/exec/file_arrow_scanner.cpp
index 30511b2392..e6c4fa7597 100644
--- a/be/src/vec/exec/file_arrow_scanner.cpp
+++ b/be/src/vec/exec/file_arrow_scanner.cpp
@@ -194,7 +194,7 @@ Status FileArrowScanner::_append_batch_to_block(Block* block) {
     }
     _rows += num_elements;
     _arrow_batch_cur_idx += num_elements;
-    return Status::OK();
+    return _fill_columns_from_path(block, num_elements);
 }

 void VFileParquetScanner::_update_profile(std::shared_ptr& statistics) {
diff --git a/be/src/vec/exec/file_scanner.cpp b/be/src/vec/exec/file_scanner.cpp
index a0f473ffc9..bb1ba21924 100644
--- a/be/src/vec/exec/file_scanner.cpp
+++ b/be/src/vec/exec/file_scanner.cpp
@@ -164,7 +164,6 @@ Status FileScanner::_filter_block(vectorized::Block* _block) {
 Status FileScanner::finalize_block(vectorized::Block* _block, bool* eof) {
     *eof = _scanner_eof;
     _read_row_counter += _block->rows();
-    RETURN_IF_ERROR(_fill_columns_from_path(_block));
     if (LIKELY(_rows > 0)) {
         RETURN_IF_ERROR(_filter_block(_block));
     }
@@ -172,11 +171,9 @@ Status FileScanner::finalize_block(vectorized::Block* _block, bool* eof) {
     return Status::OK();
 }

-Status FileScanner::_fill_columns_from_path(vectorized::Block* _block) {
+Status FileScanner::_fill_columns_from_path(vectorized::Block* _block, size_t rows) {
     const TFileRangeDesc& range = _ranges.at(_next_range - 1);
     if (range.__isset.columns_from_path && !_partition_slot_descs.empty()) {
-        size_t rows = _rows;
-
         for (const auto& slot_desc : _partition_slot_descs) {
             if (slot_desc == nullptr) continue;
             auto it = _partition_slot_index_map.find(slot_desc->id());
diff --git a/be/src/vec/exec/file_scanner.h b/be/src/vec/exec/file_scanner.h
index 16e75aefc0..df4c1d4ef6 100644
--- a/be/src/vec/exec/file_scanner.h
+++ b/be/src/vec/exec/file_scanner.h
@@ -55,6 +55,7 @@ protected:
     virtual void _init_profiles(RuntimeProfile* profile) = 0;

     Status finalize_block(vectorized::Block* dest_block, bool* eof);
+    Status _fill_columns_from_path(vectorized::Block* output_block, size_t rows);
     Status init_block(vectorized::Block* block);

     std::unique_ptr _text_converter;
@@ -106,7 +107,6 @@ protected:
 private:
     Status _init_expr_ctxes();
     Status _filter_block(vectorized::Block* output_block);
-    Status _fill_columns_from_path(vectorized::Block* output_block);
 };

 } // namespace doris::vectorized
diff --git a/be/src/vec/exec/file_text_scanner.cpp b/be/src/vec/exec/file_text_scanner.cpp
index 593b78867f..02da0bca2c 100644
--- a/be/src/vec/exec/file_text_scanner.cpp
+++ b/be/src/vec/exec/file_text_scanner.cpp
@@ -91,6 +91,7 @@ Status FileTextScanner::get_next(Block* block, bool* eof) {

     const int batch_size = _state->batch_size();

+    int current_rows = _rows;
     while (_rows < batch_size && !_scanner_eof) {
         if (_cur_line_reader == nullptr || _cur_line_reader_eof) {
             RETURN_IF_ERROR(_open_next_reader());
@@ -114,6 +115,10 @@ Status FileTextScanner::get_next(Block* block, bool* eof) {
             COUNTER_UPDATE(_rows_read_counter, 1);
             RETURN_IF_ERROR(_fill_file_columns(Slice(ptr, size), block));
         }
+        if (_cur_line_reader_eof) {
+            RETURN_IF_ERROR(_fill_columns_from_path(block, _rows - current_rows));
+            current_rows = _rows;
+        }
     }

     return finalize_block(block, eof);
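The idea behind this commit can be illustrated with a small standalone sketch. It is not the Doris code: `Row`, `Split`, `fill_columns_from_path`, and `scan_block` are hypothetical names chosen for illustration, and a block is modeled as a plain vector of rows. The point it tries to show is that the partition value is stamped onto the newly appended rows as soon as each split has been read, rather than once for the whole block when the block is finalized.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are Doris types.
struct Row {
    int value;
    std::string partition; // partition value derived from the file path
};

struct Split {
    std::string partition;   // partition value of the file this split belongs to
    std::vector<int> values; // data read from the file itself
};

// Stamp the partition value onto the last `rows` rows of the block.
void fill_columns_from_path(std::vector<Row>& block, const std::string& partition,
                            std::size_t rows) {
    assert(rows <= block.size());
    for (std::size_t i = block.size() - rows; i < block.size(); ++i) {
        block[i].partition = partition;
    }
}

// Append whole splits to one block until it holds at least `batch_size` rows.
// The partition column is filled right after each split is appended, so rows
// that come from different files keep their own partition value.
std::vector<Row> scan_block(const std::vector<Split>& splits, std::size_t batch_size) {
    std::vector<Row> block;
    for (const Split& split : splits) {
        if (block.size() >= batch_size) break;
        for (int v : split.values) {
            block.push_back(Row{v, ""});
        }
        fill_columns_from_path(block, split.partition, split.values.size());
    }
    return block;
}
```

If the fill happened only once at the end, every row in the block would receive the partition value of the last split, which is the kind of error the commit message describes.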
[GitHub] [doris] freesinger commented on a diff in pull request #10322: [feature](JSON datatype)Support JSON datatype
freesinger commented on code in PR #10322:
URL: https://github.com/apache/doris/pull/10322#discussion_r933111884

## be/src/vec/core/field.h: ##

@@ -572,13 +671,21 @@ class Field {
         create(reinterpret_cast(data), size);
     }

+    void create_json(const unsigned char* data, size_t size) {
+        new (&storage) JsonField(reinterpret_cast(data), size);
+        which = Types::JSON;
+    }
+
     ALWAYS_INLINE void destroy() {
         if (which < Types::MIN_NON_POD) return;

         switch (which) {
         case Types::String:
             destroy();
             break;
+        case Types::JSON:
+            destroy();

Review Comment:
```C++
template <typename T>
void destroy() {
    T* MAY_ALIAS ptr = reinterpret_cast<T*>(&storage);
    ptr->~T();
}
```
[Line707](https://github.com/freesinger/incubator-doris/blob/878d392fb89991c9e80f9fff0936fe3ce97350de/be/src/vec/core/field.h#L707)
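The pattern discussed in this review, constructing a value into raw storage with placement new and later destroying it through a typed helper, can be sketched in isolation. The class below is a hypothetical miniature written for illustration (`MiniField`, `assign_string`, and `destroy_as` are assumed names, not Doris APIs), and it uses `std::string` in place of the real field types.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical, heavily simplified tagged-union field; not the Doris Field class.
class MiniField {
public:
    enum class Which { Null, Int64, String };

    MiniField() = default;
    MiniField(const MiniField&) = delete;
    MiniField& operator=(const MiniField&) = delete;
    ~MiniField() { destroy(); }

    void assign_int(long long v) {
        destroy();
        new (&storage_) long long(v); // placement new into the raw buffer
        which_ = Which::Int64;
    }

    void assign_string(const char* data, std::size_t size) {
        destroy();
        new (&storage_) std::string(data, size);
        which_ = Which::String;
    }

    Which which() const { return which_; }

    // Only valid while which() == Which::String.
    const std::string& as_string() const {
        return *reinterpret_cast<const std::string*>(&storage_);
    }

private:
    // Dispatch on the tag; only non-trivial types need an explicit destructor call.
    void destroy() {
        if (which_ == Which::String) {
            destroy_as<std::string>();
        }
        which_ = Which::Null;
    }

    // Typed helper: reinterpret the buffer as T and run T's destructor.
    template <typename T>
    void destroy_as() {
        T* ptr = reinterpret_cast<T*>(&storage_);
        ptr->~T();
    }

    static_assert(sizeof(std::string) >= sizeof(long long), "buffer must fit both types");
    alignas(std::max_align_t) unsigned char storage_[sizeof(std::string)];
    Which which_ = Which::Null;
};
```

In a layout like this, every non-trivial type stored in the buffer needs its own case in the tag dispatch; a missing case would skip the destructor and leak whatever the value owns.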
[GitHub] [doris] freesinger commented on a diff in pull request #10322: [feature](JSON datatype)Support JSON datatype
freesinger commented on code in PR #10322:
URL: https://github.com/apache/doris/pull/10322#discussion_r933120201

## be/src/vec/columns/column_json.h: ##

@@ -0,0 +1,299 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include
+#include
+
+#include "vec/columns/column.h"
+#include "vec/columns/column_impl.h"
+#include "vec/common/assert_cast.h"
+#include "vec/common/memcmp_small.h"
+#include "vec/common/memcpy_small.h"
+#include "vec/common/pod_array.h"
+#include "vec/common/sip_hash.h"
+#include "vec/core/field.h"
+
+namespace doris::vectorized {
+class ColumnJson final : public COWHelper {
+public:
+    using Char = UInt8;
+    using Chars = PaddedPODArray;
+
+private:
+    friend class COWHelper;
+
+    Offsets offsets;
+
+    Chars chars;
+
+    size_t ALWAYS_INLINE offset_at(ssize_t i) const { return offsets[i - 1]; }
+
+    size_t ALWAYS_INLINE size_at(ssize_t i) const { return offsets[i] - offsets[i - 1]; }
+
+    template
+    struct less;
+
+    template
+    struct lessWithCollation;
+
+    ColumnJson() = default;
+
+    ColumnJson(const ColumnJson& src)
+            : offsets(src.offsets.begin(), src.offsets.end()),
+              chars(src.chars.begin(), src.chars.end()) {}
+
+public:
+    const char* get_family_name() const override { return "JSON"; }
+
+    size_t size() const override { return offsets.size(); }
+
+    size_t byte_size() const override { return chars.size() + offsets.size() * sizeof(offsets[0]); }
+
+    size_t allocated_bytes() const override {
+        return chars.allocated_bytes() + offsets.allocated_bytes();
+    }
+
+    void protect() override;
+
+    MutableColumnPtr clone_resized(size_t to_size) const override;
+
+    Field operator[](size_t n) const override {
+        assert(n < size());
+        return Field(&chars[offset_at(n)], size_at(n) - 1);
+    }
+
+    void get(size_t n, Field& res) const override {
+        assert(n < size());
+        res.assign_json(&chars[offset_at(n)], size_at(n) - 1);
+    }
+
+    StringRef get_data_at(size_t n) const override {
+        assert(n < size());
+        return StringRef(&chars[offset_at(n)], size_at(n) - 1);
+    }
+
+/// Suppress gcc 7.3.1 warning: '*((void*)& +8)' may be used uninitialized in this function
+#if !__clang__
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
+
+    void insert(const Field& x) override {
+        const JsonField& s = doris::vectorized::get(x);
+
+        const size_t old_size = chars.size();
+        const size_t size_to_append = s.get_size() + 1;
+        const size_t new_size = old_size + size_to_append;
+
+        chars.resize(new_size);
+        memcpy(chars.data() + old_size, s.get_value(), size_to_append);
+        offsets.push_back(new_size);
+    }
+
+#if !__clang__
+#pragma GCC diagnostic pop
+#endif
+
+    void insert_from(const IColumn& src_, size_t n) override {
+        const ColumnJson& src = assert_cast(src_);
+        const size_t size_to_append =
+                src.offsets[n] - src.offsets[n - 1]; /// -1th index is Ok, see PaddedPODArray.
+
+        if (size_to_append == 1) {
+            /// shortcut for empty json
+            chars.push_back(0);
+            offsets.push_back(chars.size());
+        } else {
+            const size_t old_size = chars.size();
+            const size_t offset = src.offsets[n - 1];
+            const size_t new_size = old_size + size_to_append;
+
+            chars.resize(new_size);
+            memcpy_small_allow_read_write_overflow15(chars.data() + old_size, &src.chars[offset],
+                                                     size_to_append);
+            offsets.push_back(new_size);
+        }
+    }
+
+    void insert_data(const char* pos, size_t length) override {
+        const size_t old_size = chars.size();
+        const size_t new_size = old_size + length + 1;
+
+        chars.resize(new_size);
+        if (length) memcpy(chars.data() + old_size, pos, length);
+        chars[old_size + length] = 0;
+        offsets.push_back(new_size);
+    }
+
+    void insert_many_binary_data(char* data_array, uint32_t* len_array,
+                                 uint32_t* start_offset_array,
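The storage layout used by the quoted class, one contiguous `chars` buffer plus an `offsets` array holding the end position of each value, may be easier to follow in a reduced form. The sketch below is a hypothetical miniature (`MiniJsonColumn` is an assumed name, not the Doris class) built on `std::vector`. Unlike the quoted code, which relies on `PaddedPODArray` tolerating index `-1`, it handles the first element with an explicit check.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of the offsets + chars layout; not the Doris class.
class MiniJsonColumn {
public:
    // Append one value plus a terminating zero byte and record its end offset.
    void insert_data(const char* pos, std::size_t length) {
        chars_.insert(chars_.end(), pos, pos + length);
        chars_.push_back('\0');
        offsets_.push_back(chars_.size());
    }

    // Number of values stored in the column.
    std::size_t size() const { return offsets_.size(); }

    // Value i starts where value i - 1 ends (0 for the first value).
    std::size_t offset_at(std::size_t i) const { return i == 0 ? 0 : offsets_[i - 1]; }

    // Stored size of value i, including the terminating zero byte.
    std::size_t size_at(std::size_t i) const { return offsets_[i] - offset_at(i); }

    // Copy of value i without the terminating zero byte.
    std::string get_data_at(std::size_t i) const {
        return std::string(chars_.data() + offset_at(i), size_at(i) - 1);
    }

private:
    std::vector<char> chars_;          // all values back to back, each zero-terminated
    std::vector<std::size_t> offsets_; // offsets_[i] is the end of value i in chars_
};
```

With this layout an empty value still occupies one byte (the terminator), which is why the quoted `insert_from` treats `size_to_append == 1` as the empty case.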
[GitHub] [doris] FreeOnePlus opened a new issue, #11341: [Feature] Manually clean up LoadLabel
FreeOnePlus opened a new issue, #11341:
URL: https://github.com/apache/doris/issues/11341

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

LOAD LABEL tasks are all pulled in batches. A label name cannot be repeated, so a second pull with the same label does not work. Is it possible to clean up the record of such a pull task, or to reuse a label?

### Use case

_No response_

### Related issues

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] dataroaring merged pull request #11006: (performance)[scanner] Isolate local and remote queries using different scanner…
dataroaring merged PR #11006: URL: https://github.com/apache/doris/pull/11006
[doris] branch master updated: (performance)[scanner] Isolate local and remote queries using different scanner… (#11006)
This is an automated email from the ASF dual-hosted git repository. dataroaring pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new d6f937cb01 (performance)[scanner] Isolate local and remote queries using different scanner… (#11006) d6f937cb01 is described below commit d6f937cb01821c452abd3ebb096a9c9428fc01c7 Author: Luwei <814383...@qq.com> AuthorDate: Fri Jul 29 19:14:46 2022 +0800 (performance)[scanner] Isolate local and remote queries using different scanner… (#11006) --- be/src/common/config.h | 5 + be/src/exec/olap_scan_node.cpp | 12 +++- be/src/exec/olap_scanner.cpp| 17 + be/src/exec/olap_scanner.h | 2 ++ be/src/olap/tablet.h| 2 ++ be/src/runtime/exec_env.h | 2 ++ be/src/runtime/exec_env_init.cpp| 5 + be/src/vec/exec/volap_scan_node.cpp | 12 +++- be/src/vec/exec/volap_scanner.cpp | 17 + be/src/vec/exec/volap_scanner.h | 2 ++ 10 files changed, 74 insertions(+), 2 deletions(-) diff --git a/be/src/common/config.h b/be/src/common/config.h index 36cfeb684d..6f7342bb2f 100644 --- a/be/src/common/config.h +++ b/be/src/common/config.h @@ -805,6 +805,11 @@ CONF_Int32(s3_transfer_executor_pool_size, "2"); CONF_Bool(enable_time_lut, "true"); +// number of s3 scanner thread pool size +CONF_Int32(doris_remote_scanner_thread_pool_thread_num, "16"); +// number of s3 scanner thread pool queue size +CONF_Int32(doris_remote_scanner_thread_pool_queue_size, "10240"); + #ifdef BE_TEST // test s3 CONF_String(test_s3_resource, "resource"); diff --git a/be/src/exec/olap_scan_node.cpp b/be/src/exec/olap_scan_node.cpp index ce3486a56d..cabf14ae62 100644 --- a/be/src/exec/olap_scan_node.cpp +++ b/be/src/exec/olap_scan_node.cpp @@ -1502,6 +1502,7 @@ void OlapScanNode::transfer_thread(RuntimeState* state) { * 4. 
Regularly increase the priority of the remaining tasks in the queue to avoid starvation for large queries */ PriorityThreadPool* thread_pool = state->exec_env()->scan_thread_pool(); +PriorityThreadPool* remote_thread_pool = state->exec_env()->remote_scan_thread_pool(); _total_assign_num = 0; _nice = 18 + std::max(0, 2 - (int)_olap_scanners.size() / 5); std::list olap_scanners; @@ -1580,8 +1581,17 @@ void OlapScanNode::transfer_thread(RuntimeState* state) { task.priority = _nice; task.queue_id = state->exec_env()->store_path_to_index((*iter)->scan_disk()); (*iter)->start_wait_worker_timer(); + +TabletStorageType type = (*iter)->get_storage_type(); +bool ret = false; COUNTER_UPDATE(_scanner_sched_counter, 1); -if (thread_pool->offer(task)) { +if (type == TabletStorageType::STORAGE_TYPE_LOCAL) { +ret = thread_pool->offer(task); +} else { +ret = remote_thread_pool->offer(task); +} + +if (ret) { olap_scanners.erase(iter++); } else { LOG(FATAL) << "Failed to assign scanner task to thread pool!"; diff --git a/be/src/exec/olap_scanner.cpp b/be/src/exec/olap_scanner.cpp index e49b322652..97d289c3f4 100644 --- a/be/src/exec/olap_scanner.cpp +++ b/be/src/exec/olap_scanner.cpp @@ -127,6 +127,23 @@ Status OlapScanner::prepare( return Status::OK(); } +TabletStorageType OlapScanner::get_storage_type() { +int local_reader = 0; +for (const auto& reader : _tablet_reader_params.rs_readers) { +if (reader->rowset()->rowset_meta()->resource_id().empty()) { +local_reader++; +} +} +int total_reader = _tablet_reader_params.rs_readers.size(); + +if (local_reader == total_reader) { +return TabletStorageType::STORAGE_TYPE_LOCAL; +} else if (local_reader == 0) { +return TabletStorageType::STORAGE_TYPE_REMOTE; +} +return TabletStorageType::STORAGE_TYPE_REMOTE_AND_LOCAL; +} + Status OlapScanner::open() { auto span = _runtime_state->get_tracer()->StartSpan("OlapScanner::open"); auto scope = opentelemetry::trace::Scope {span}; diff --git a/be/src/exec/olap_scanner.h b/be/src/exec/olap_scanner.h 
index 44fab43dc6..e95e31d106 100644 --- a/be/src/exec/olap_scanner.h +++ b/be/src/exec/olap_scanner.h @@ -88,6 +88,8 @@ public: const std::vector& get_query_slots() const { return _query_slots; } +TabletStorageType get_storage_type(); + protected: Status _init_tablet_reader_params( const std::vector& key_ranges, const std::vector& filters, diff --git a/be/src/olap/tablet.h b/be/src/olap/tablet.h index accc157d6a..adacb11eb2 100644 --- a/be/src/olap/tablet.h +++ b/be/src/olap/tablet.h @@ -55,6 +55,8 @@ struct Rows
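The routing rule introduced by this commit can be sketched in isolation. The following is a minimal, self-contained model of the logic in `OlapScanner::get_storage_type()` and the pool choice in `transfer_thread()`; it is not the Doris code itself. The names `StorageType`, `classify_storage`, and `uses_local_pool` are hypothetical, and each string stands in for a rowset's `resource_id` (empty meaning local storage), as the diff suggests.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the TabletStorageType enum used in the diff.
enum class StorageType { Local, Remote, RemoteAndLocal };

// Each entry models one rowset's resource_id; an empty id means the rowset
// is stored locally (assumption taken from the diff's resource_id().empty()).
StorageType classify_storage(const std::vector<std::string>& resource_ids) {
    std::size_t local = 0;
    for (const auto& id : resource_ids) {
        if (id.empty()) {
            ++local;
        }
    }
    // Same order of checks as the diff: "all local" is tested first, so a
    // scanner with no readers at all is classified as local.
    if (local == resource_ids.size()) {
        return StorageType::Local;
    }
    if (local == 0) {
        return StorageType::Remote;
    }
    return StorageType::RemoteAndLocal;
}

// Only purely local scanners are offered to the local pool; remote and
// mixed scanners both go to the remote pool.
bool uses_local_pool(StorageType type) {
    return type == StorageType::Local;
}
```

Under this model a mixed scanner is routed to the remote pool, which appears to match the `else` branch in the diff: a slow remote read is kept off the threads that serve local queries.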
[GitHub] [doris] dataroaring merged pull request #11263: [Doc] update FAQ about ODBC
dataroaring merged PR #11263: URL: https://github.com/apache/doris/pull/11263
[doris] branch master updated: [Doc] update FAQ about ODBC (#11263)
This is an automated email from the ASF dual-hosted git repository. dataroaring pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new a7e7df658c [Doc] update FAQ about ODBC (#11263) a7e7df658c is described below commit a7e7df658cc9a1d438fa2a266a8928b37ebd1032 Author: TengJianPing <18241664+jackte...@users.noreply.github.com> AuthorDate: Fri Jul 29 19:18:28 2022 +0800 [Doc] update FAQ about ODBC (#11263) --- docs/en/docs/faq/install-faq.md| 4 +++- docs/zh-CN/docs/faq/install-faq.md | 5 +++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/docs/en/docs/faq/install-faq.md b/docs/en/docs/faq/install-faq.md index fcc98aa567..311b0cf8d1 100644 --- a/docs/en/docs/faq/install-faq.md +++ b/docs/en/docs/faq/install-faq.md @@ -289,7 +289,9 @@ In doris 1.0 onwards, openssl has been upgraded to 1.1 and is built into the dor ``` ERROR 1105 (HY000): errCode = 2, detailMessage = driver connect Error: HY000 [MySQL][ODBC 8.0(w) Driver]SSL connection error: Failed to set ciphers to use (2026) ``` -The solution is to use the `Connector/ODBC 8.0.28` version of ODBC Connector and select `Linux - Generic` in the operating system, this version of ODBC Driver uses openssl version 1,1. For details, see the [ODBC exterior documentation](../ecosystem/external-table/odbc-of-doris.md) +The solution is to use the `Connector/ODBC 8.0.28` version of ODBC Connector and select `Linux - Generic` in the operating system, this version of ODBC Driver uses openssl version 1.1. Or use a lower version of ODBC connector, e.g. [Connector/ODBC 5.3.14](https://dev.mysql.com/downloads/connector/odbc/5.3.html). For details, see the [ODBC exterior documentation](../ecosystem/external-table/odbc-of-doris.md). 
+ + You can verify the version of openssl used by MySQL ODBC Driver by ``` ldd /path/to/libmyodbc8w.so |grep libssl.so diff --git a/docs/zh-CN/docs/faq/install-faq.md b/docs/zh-CN/docs/faq/install-faq.md index f1e688ddf6..f3adee6030 100644 --- a/docs/zh-CN/docs/faq/install-faq.md +++ b/docs/zh-CN/docs/faq/install-faq.md @@ -285,9 +285,10 @@ cp fe-core/target/generated-sources/cup/org/apache/doris/analysis/action_table.d ``` ERROR 1105 (HY000): errCode = 2, detailMessage = driver connect Error: HY000 [MySQL][ODBC 8.0(w) Driver]SSL connection error: Failed to set ciphers to use (2026) ``` -解决方式是使用`Connector/ODBC 8.0.28` 版本的 ODBC Connector, 并且选择 在操作系统处选择 `Linux - Generic`, 这个版本的ODBC Driver 使用 openssl 1.1 版本。具体使用方式见 [ODBC外表使用文档](../ecosystem/external-table/odbc-of-doris.md) +解决方式是使用`Connector/ODBC 8.0.28` 版本的 ODBC Connector, 并且在操作系统处选择 `Linux - Generic`, 这个版本的ODBC Driver 使用 openssl 1.1 版本。或者使用低版本的ODBC Connector,比如[Connector/ODBC 5.3.14](https://dev.mysql.com/downloads/connector/odbc/5.3.html)。具体使用方式见 [ODBC外表使用文档](../ecosystem/external-table/odbc-of-doris.md)。 + 可以通过如下方式验证 MySQL ODBC Driver 使用的openssl 版本 ``` ldd /path/to/libmyodbc8w.so |grep libssl.so ``` -如果输出包含 `libssl.so.10` 则使用过程中可能出现问题, 如果包含`libssl.so.1.1` 则与doris 1.0 兼容 +如果输出包含 `libssl.so.10` 则使用过程中可能出现问题, 如果包含`libssl.so.1.1` 则与doris 1.0 兼容。 - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] github-actions[bot] commented on pull request #11336: [fix](ci): add checkout to fix PR-title-checker
github-actions[bot] commented on PR #11336: URL: https://github.com/apache/doris/pull/11336#issuecomment-1199163110 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11336: [fix](ci): add checkout to fix PR-title-checker
github-actions[bot] commented on PR #11336: URL: https://github.com/apache/doris/pull/11336#issuecomment-1199163137 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11223: [doc](unique key) add suggestion for replace_if_not_null
github-actions[bot] commented on PR #11223: URL: https://github.com/apache/doris/pull/11223#issuecomment-1199166777 PR approved by at least one committer and no changes requested.
[GitHub] [doris] freesinger commented on a diff in pull request #10322: [feature](JSON datatype)Support JSON datatype
freesinger commented on code in PR #10322: URL: https://github.com/apache/doris/pull/10322#discussion_r933135588 ## be/src/vec/olap/olap_data_convertor.cpp: ## @@ -577,6 +580,77 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorDate::convert_to_olap() { } } +// class OlapBlockDataConvertor::OlapColumnDataConvertorJson +void OlapBlockDataConvertor::OlapColumnDataConvertorJson::set_source_column( +const ColumnWithTypeAndName& typed_column, size_t row_pos, size_t num_rows) { + OlapBlockDataConvertor::OlapColumnDataConvertorBase::set_source_column(typed_column, row_pos, + num_rows); +_slice.resize(num_rows); +} + +const void* OlapBlockDataConvertor::OlapColumnDataConvertorJson::get_data() const { +return _slice.data(); +} + +const void* OlapBlockDataConvertor::OlapColumnDataConvertorJson::get_data_at(size_t offset) const { +assert(offset < _num_rows && _num_rows == _slice.size()); +UInt8 null_flag = 0; +if (_nullmap) { +null_flag = _nullmap[offset]; Review Comment: no need for _row_pos here
[GitHub] [doris] morrySnow opened a new pull request, #11342: [refactor](Nereids)split rewrite and insert into memo to 2 functions
morrySnow opened a new pull request, #11342: URL: https://github.com/apache/doris/pull/11342 ## Problem Summary: Split rewrite and insert into memo to 2 functions to make the code easy to read. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [x] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [x] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] morrySnow commented on pull request #11342: [refactor](Nereids)split rewrite and insert into memo to 2 functions
morrySnow commented on PR #11342: URL: https://github.com/apache/doris/pull/11342#issuecomment-1199175298 @924060929 @englefly PTAL
[GitHub] [doris] wsjz opened a new pull request, #11343: (feature-wip)[parquet-reader] file reader
wsjz opened a new pull request, #11343: URL: https://github.com/apache/doris/pull/11343 # Proposed changes Issue Number: close #xxx ## Problem Summary: ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #11337: [opt] unify stop script
github-actions[bot] commented on PR #11337: URL: https://github.com/apache/doris/pull/11337#issuecomment-1199186641 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11337: [opt] unify stop script
github-actions[bot] commented on PR #11337: URL: https://github.com/apache/doris/pull/11337#issuecomment-1199186663 PR approved by anyone and no changes requested.
[GitHub] [doris] dataroaring merged pull request #11223: [doc](unique key) add suggestion for replace_if_not_null
dataroaring merged PR #11223: URL: https://github.com/apache/doris/pull/11223
[doris] branch master updated: [doc](unique key) add suggestion for replace_if_not_null (#11223)
This is an automated email from the ASF dual-hosted git repository. dataroaring pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new e7635e54ee [doc](unique key) add suggestion for replace_if_not_null (#11223) e7635e54ee is described below commit e7635e54eeeb6276e4b490f877aff3929db4190f Author: zhannngchen <48427519+zhannngc...@users.noreply.github.com> AuthorDate: Fri Jul 29 20:02:18 2022 +0800 [doc](unique key) add suggestion for replace_if_not_null (#11223) --- docs/en/docs/data-table/data-model.md| 1 + docs/zh-CN/docs/data-table/data-model.md | 1 + 2 files changed, 2 insertions(+) diff --git a/docs/en/docs/data-table/data-model.md b/docs/en/docs/data-table/data-model.md index a9c9ec0e2a..c1f2084084 100644 --- a/docs/en/docs/data-table/data-model.md +++ b/docs/en/docs/data-table/data-model.md @@ -450,4 +450,5 @@ Because the data model was established when the table was built, and **could not 1. Aggregate model can greatly reduce the amount of data scanned and the amount of query computation by pre-aggregation. It is very suitable for report query scenarios with fixed patterns. But this model is not very friendly for count (*) queries. At the same time, because the aggregation method on the Value column is fixed, semantic correctness should be considered in other types of aggregation queries. 2. Uniq model guarantees the uniqueness of primary key for scenarios requiring unique primary key constraints. However, the query advantage brought by pre-aggregation such as ROLLUP cannot be exploited (because the essence is REPLACE, there is no such aggregation as SUM). + - \[Note\] The Unique model only supports the entire row update. 
If the user needs unique key with partial update (such as loading multiple source tables into one doris table), you can consider using the Aggregate model, setting the aggregate type of the non-primary key columns to REPLACE_IF_NOT_NULL. For detail, please refer to [CREATE TABLE Manual](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md) 3. Duplicate is suitable for ad-hoc queries of any dimension. Although it is also impossible to take advantage of the pre-aggregation feature, it is not constrained by the aggregation model and can take advantage of the queue-store model (only reading related columns, but not all Key columns). diff --git a/docs/zh-CN/docs/data-table/data-model.md b/docs/zh-CN/docs/data-table/data-model.md index 4e3fc609cb..90f8df7b86 100644 --- a/docs/zh-CN/docs/data-table/data-model.md +++ b/docs/zh-CN/docs/data-table/data-model.md @@ -459,4 +459,5 @@ Duplicate 模型没有聚合模型的这个局限性。因为该模型不涉及 1. Aggregate 模型可以通过预聚合,极大地降低聚合查询时所需扫描的数据量和查询的计算量,非常适合有固定模式的报表类查询场景。但是该模型对 count(*) 查询很不友好。同时因为固定了 Value 列上的聚合方式,在进行其他类型的聚合查询时,需要考虑语意正确性。 2. Unique 模型针对需要唯一主键约束的场景,可以保证主键唯一性约束。但是无法利用 ROLLUP 等预聚合带来的查询优势(因为本质是 REPLACE,没有 SUM 这种聚合方式)。 + - 【注意】Unique 模型仅支持整行更新,如果用户既需要唯一主键约束,又需要更新部分列(例如将多张源表导入到一张 doris 表的情形),则可以考虑使用 Aggregate 模型,同时将非主键列的聚合类型设置为 REPLACE_IF_NOT_NULL。具体的用法可以参考[语法手册](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md) 3. Duplicate 适合任意维度的 Ad-hoc 查询。虽然同样无法利用预聚合的特性,但是不受聚合模型的约束,可以发挥列存模型的优势(只读取相关列,而不需要读取所有 Key 列)。 - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
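The partial-update suggestion in this doc change can be illustrated with a small model. The sketch below is a simplified illustration of `REPLACE_IF_NOT_NULL` semantics as described above (an incoming non-null value replaces the stored one, an incoming null leaves it unchanged); it is not Doris code, and the `Row` layout and the `replace_if_not_null` and `merge_rows` helpers are hypothetical names chosen for the example.

```cpp
#include <optional>
#include <string>

// Hypothetical row: one key column and two nullable value columns, as if
// each value column were loaded from a different source table.
struct Row {
    int key;
    std::optional<std::string> name;
    std::optional<int> score;
};

// Keep the stored value unless the incoming value is non-null.
template <typename T>
std::optional<T> replace_if_not_null(const std::optional<T>& current,
                                     const std::optional<T>& incoming) {
    return incoming.has_value() ? incoming : current;
}

// Merge two rows that share the same key, column by column.
Row merge_rows(const Row& current, const Row& incoming) {
    return Row{current.key,
               replace_if_not_null(current.name, incoming.name),
               replace_if_not_null(current.score, incoming.score)};
}
```

For example, loading `{1, "alice", null}` from one source and `{1, null, 90}` from another would, under this model, leave a single row with both `name` and `score` filled in, which is the partial-update behavior that a whole-row replace cannot provide.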
[GitHub] [doris] github-actions[bot] commented on pull request #11308: [Bug][Function] core dump on sum(distinct)
github-actions[bot] commented on PR #11308: URL: https://github.com/apache/doris/pull/11308#issuecomment-1199197481 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #11308: [Bug][Function] core dump on sum(distinct)
github-actions[bot] commented on PR #11308: URL: https://github.com/apache/doris/pull/11308#issuecomment-1199197447 PR approved by at least one committer and no changes requested.
[GitHub] [doris] luozenglin commented on a diff in pull request #11316: [fix] Fix the query result error caused by the grouping sets statemen…
luozenglin commented on code in PR #11316: URL: https://github.com/apache/doris/pull/11316#discussion_r933206295 ## be/src/exec/repeat_node.h: ## @@ -52,12 +55,17 @@ class RepeatNode : public ExecNode { std::vector> _grouping_list; // Tuple id used for output, it has new slots. TupleId _output_tuple_id; -const TupleDescriptor* _tuple_desc; +const TupleDescriptor* _output_tuple_desc; std::unique_ptr _child_row_batch; bool _child_eos; int _repeat_id_idx; RuntimeState* _runtime_state; + +// Exprs used to evaluate input rows +std::vector _probe_exprs; Review Comment: done
[GitHub] [doris] zhannngchen commented on a diff in pull request #11283: [feature-wip](unique-key-merge-on-write) Add support for tablet migration, DSIP-018[5/3]
zhannngchen commented on code in PR #11283: URL: https://github.com/apache/doris/pull/11283#discussion_r933224897 ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -161,6 +161,10 @@ Status BetaRowsetWriter::add_rowset(RowsetSharedPtr rowset) { _total_data_size += rowset->rowset_meta()->data_disk_size(); _total_index_size += rowset->rowset_meta()->index_disk_size(); _num_segment += rowset->num_segments(); +// append key_bounds to current rowset +rowset->get_segments_key_bounds(&_segments_encoded_key_bounds); +// append key_bounds to current rowset +rowset->get_segments_key_bounds(&_segments_encoded_key_bounds); Review Comment: Fixed
[doris] branch master updated: [opt] unify stop script (#11337)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 8483660fe7 [opt] unify stop script (#11337) 8483660fe7 is described below commit 8483660fe78dc901565ac7864bac669769c5b05c Author: Dongyang Li AuthorDate: Fri Jul 29 21:04:03 2022 +0800 [opt] unify stop script (#11337) --- bin/stop_be.sh | 21 ++--- bin/stop_fe.sh | 10 +- 2 files changed, 19 insertions(+), 12 deletions(-) diff --git a/bin/stop_be.sh b/bin/stop_be.sh index 1ea07d8b2e..f46c0d4702 100755 --- a/bin/stop_be.sh +++ b/bin/stop_be.sh @@ -49,30 +49,37 @@ pidfile=$PID_DIR/be.pid if [ -f $pidfile ]; then pid=$(cat $pidfile) -#check if pid valid +# check if pid valid if test -z "$pid"; then echo "ERROR: invalid pid." exit 1 fi -#check if pid process exist +# check if pid process exist if ! kill -0 $pid; then echo "ERROR: be process $pid does not exist." exit 1 fi pidcomm=$(ps -p $pid -o comm=) -#check if pid process is backend process +# check if pid process is backend process if [ "doris_be"x != "$pidcomm"x ]; then echo "ERROR: pid process may not be be. " exit 1 fi -# kill +# kill pid process and check it if kill -${signum} $pid >/dev/null 2>&1; then -echo "stop $pidcomm, and remove pid file. " -rm $pidfile -exit 0 +while true; do +if kill -0 $pid >/dev/null; then +echo "waiting be to stop, pid: $pid" +sleep 2 +else +echo "stop $pidcomm, and remove pid file. " +if [ -f $pidfile ]; then rm $pidfile; fi +exit 0 +fi +done else echo "ERROR: failed to stop $pid" exit 1 diff --git a/bin/stop_fe.sh b/bin/stop_fe.sh index 4910387111..4b35e6edc7 100755 --- a/bin/stop_fe.sh +++ b/bin/stop_fe.sh @@ -44,29 +44,29 @@ pidfile=$PID_DIR/fe.pid if [ -f $pidfile ]; then pid=$(cat $pidfile) -#check if pid valid +# check if pid valid if test -z "$pid"; then echo "ERROR: invalid pid." 
exit 1 fi -#check if pid process exist +# check if pid process exist if ! kill -0 $pid; then echo "ERROR: fe process $pid does not exist." exit 1 fi pidcomm=$(ps -p $pid -o comm=) -#check if pid process is frontend process +# check if pid process is frontend process if [ "java"x != "$pidcomm"x ]; then echo "ERROR: pid process may not be fe. " exit 1 fi -#kill pid process and check it +# kill pid process and check it if kill $pid >/dev/null 2>&1; then while true; do -if ps -p $pid >/dev/null; then +if kill -0 $pid >/dev/null; then echo "waiting fe to stop, pid: $pid" sleep 2 else - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] yiguolei merged pull request #11337: [opt] unify stop script
yiguolei merged PR #11337: URL: https://github.com/apache/doris/pull/11337
[GitHub] [doris] 924060929 merged pull request #11332: [feature](nereids) add scalar subquery expression
924060929 merged PR #11332: URL: https://github.com/apache/doris/pull/11332