[GitHub] [doris] zhannngchen commented on a diff in pull request #10548: [WIP](unique-key-merge-on-write) Add delete bitmap for DSIP-018
zhannngchen commented on code in PR #10548:
URL: https://github.com/apache/doris/pull/10548#discussion_r916508357

## be/src/olap/tablet_meta.h: ##

@@ -214,9 +217,120 @@ class TabletMeta {
 RowsetTypePB _preferred_rowset_type = BETA_ROWSET;
 std::string _remote_storage_name;
 StorageMediumPB _storage_medium = StorageMediumPB::HDD;
+std::unique_ptr _delete_bitmap;
 std::shared_mutex _meta_lock;
 };
+/**
+ * Wraps multiple bitmaps for recording rows (row id) that are deleted or
+ * overwritten.
+ *
+ * RowsetId and SegmentId are for locating segment, Version here is a single
+ * uint32_t means that at which "version" of the load causes the delete or
+ * overwrite.
+ *
+ * The start and end version of a load is the same, it's ok and straightforward
+ * to use a single uint32_t.
+ *
+ * e.g.
+ * There is a key "key1" in rowset id 1, version [1,1], segment id 1, row id 1.
+ * A new load also contains "key1", the rowset id 2, version [2,2], segment id 1
+ * the delete bitmap will be `{1,1,2} -> 1`, which means the "row id 1" in
+ * "rowset id 1, segment id 1" is deleted/overitten by some loads at "version 2"
+ */
+class DeleteBitmap {
+public:
+mutable std::shared_mutex lock;
+using SegmentId = uint32_t;
+using Version = uint32_t;
+using BitmapKey = std::tuple;
+std::map delete_bitmap; // Ordered map
+
+DeleteBitmap();
+
+/**
+ * Copy c-tor for making delete bitmap snapshot on read path
+ */
+DeleteBitmap(const DeleteBitmap& r);
+DeleteBitmap& operator=(const DeleteBitmap& r);
+/**
+ * Move c-tor for making delete bitmap snapshot on read path
+ */
+DeleteBitmap(DeleteBitmap&& r);
+DeleteBitmap& operator=(DeleteBitmap&& r);
+
+/**
+ * Makes a snapshot of delete bimap, read lock will be acquired in this
+ * process
+ */
+DeleteBitmap snapshot() const;
+
+/**
+ * Marks the specific row deleted
+ */
+void add(const BitmapKey& bitmap, uint32_t row_id);

Review Comment:
We should rename every `BitmapKey` parameter from `bitmap` to `bmk`. The current name is confusing, since it is a key, not a bitmap.

## be/src/olap/tablet_meta.cpp: ##

@@ -710,4 +742,106 @@ bool operator!=(const TabletMeta& a, const TabletMeta& b) {
 return !(a == b);
 }
+DeleteBitmap::DeleteBitmap() {
+}
+
+DeleteBitmap::DeleteBitmap(const DeleteBitmap& o) {
+delete_bitmap = o.delete_bitmap; // just copy data
+}
+
+DeleteBitmap& DeleteBitmap::operator=(const DeleteBitmap& o) {
+delete_bitmap = o.delete_bitmap; // just copy data
+return *this;
+}
+
+DeleteBitmap::DeleteBitmap(DeleteBitmap&& o) {
+delete_bitmap = std::move(o.delete_bitmap);
+}
+
+DeleteBitmap& DeleteBitmap::operator=(DeleteBitmap&& o) {
+delete_bitmap = std::move(o.delete_bitmap);
+return *this;
+}
+
+DeleteBitmap DeleteBitmap::snapshot() const {
+std::shared_lock l(lock);
+return DeleteBitmap(*this);
+}
+
+void DeleteBitmap::add(const BitmapKey& bitmap, uint32_t row_id) {
+std::lock_guard l(lock);

Review Comment:
> It looks like the granularity of the lock is tablet level.
>
> I wonder will there be a performance penalty for this part of serial update when loading concurrently that need to update multiple versions of delete-bitmaps?

We can't update the delete bitmap concurrently, at least in the current design. A load first looks up a row key before it updates the delete bitmap, and before that lookup it MUST see the data of all previous versions. If we can't guarantee that, data inconsistency will happen. e.g.
- current rowset layout: [0-5][6-6][7-7]
- in-flight load jobs: load1, load2, load3
- load1 is committed first, load2 second, load3 third. We CANNOT update the delete bitmap at this point, because load2 may overwrite some data in load1, but it can't see load1 yet.
- load2 is published first, with version [9-9] (which is determined at the commit stage), load1 is published second, with version [8-8], and load3 is published third, with version [10-10]. This publish sequence is not acceptable, since load2 didn't update the delete bitmap on rowset [8-8] correctly.
- So we have to guarantee that load1 is published first, then load2, then load3.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
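The ordering requirement in the review comment on PR #10548 (load1, then load2, then load3, regardless of which one is ready first) can be illustrated with a small sequencer. This is only a minimal sketch under stated assumptions: `PublishSequencer` and `on_ready` are hypothetical names invented for this example, not part of Doris, and the sketch models only "publish strictly in version order", not the delete bitmap update itself.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper (not a Doris class): releases versions for publishing
// strictly in increasing order, even if they become ready out of order.
class PublishSequencer {
public:
    explicit PublishSequencer(uint32_t last_published)
            : _last_published(last_published) {}

    // Marks `version` as ready. Returns every version that may be published
    // now, in order. A version is held back until all smaller ones are out.
    std::vector<uint32_t> on_ready(uint32_t version) {
        assert(version > _last_published); // already-published versions are invalid
        _pending.insert(version);
        std::vector<uint32_t> publishable;
        while (!_pending.empty() && *_pending.begin() == _last_published + 1) {
            ++_last_published;
            publishable.push_back(_last_published);
            _pending.erase(_pending.begin());
        }
        return publishable;
    }

private:
    uint32_t _last_published;
    std::set<uint32_t> _pending; // ordered, so begin() is the smallest ready version
};
```

With the rowset layout from the comment (`[0-5][6-6][7-7]`, so the last published version is 7), a call `on_ready(9)` for load2 releases nothing, `on_ready(8)` for load1 then releases 8 followed by 9, and `on_ready(10)` for load3 releases 10.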
[GitHub] [doris] zhannngchen commented on pull request #10548: [WIP](unique-key-merge-on-write) Add delete bitmap for DSIP-018
zhannngchen commented on PR #10548: URL: https://github.com/apache/doris/pull/10548#issuecomment-1178629366 @compiletheworld the formatter failed, pls fix it.
[GitHub] [doris] xiaokang opened a new pull request, #10694: Key topn opt2
xiaokang opened a new pull request, #10694:
URL: https://github.com/apache/doris/pull/10694

# Proposed changes

Issue Number: close #10646

## Problem Summary:

Describe the overview of changes.

BE part for https://github.com/apache/doris/issues/10646. The FE part is https://github.com/apache/doris/pull/10647.

There is a common query pattern for finding the latest time-series data, e.g.
SELECT * from t_log WHERE t>t1 AND t [...]
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10392: [Enhancement][Vectorized] Use SIMD to skip batches of null data in nu…
BiteTheDDDDt commented on code in PR #10392:
URL: https://github.com/apache/doris/pull/10392#discussion_r916537248

## be/src/vec/aggregate_functions/aggregate_function_null.h: ##

@@ -197,9 +200,66 @@ class AggregateFunctionNullUnary final
 }
 }
+void add_not_nullable(AggregateDataPtr __restrict place, const IColumn** columns,
+ size_t row_num, Arena* arena) {
+const ColumnNullable* column = assert_cast(columns[0]);
+this->set_flag(place);
+const IColumn* nested_column = &column->get_nested_column();
+this->nested_function->add(this->nested_place(place), &nested_column, row_num, arena);
+}
+
+void add_batch(size_t batch_size, AggregateDataPtr* places, size_t place_offset,
+ const IColumn** columns, Arena* arena) const override {
+int processed_records_num = 0;
+
+// we can use column->has_null() to judge whether whole batch of data is null and skip batch,
+// but it's maybe too coarse-grained.
+#ifdef __AVX2__
+const ColumnNullable* column = assert_cast(columns[0]);
+// The overhead introduced is negligible here, just an extra memory read from NullMap
+const NullMap& null_map_data = column->get_null_map_data();
+
+// NullMap use uint8_t type to indicate values is null or not, 1 indicates null, 0 versus.
+// It's important to keep consistent with element type size in NullMap
+constexpr int simd_batch_size = 256 / (8 * sizeof(uint8_t));
+__m256i all0 = _mm256_setzero_si256();
+auto to_read_null_map_position = reinterpret_cast(null_map_data.data());
+
+while (processed_records_num + simd_batch_size < batch_size) {
+to_read_null_map_position = to_read_null_map_position + processed_records_num;
+// load unaligned data from null_map, 1 means value is null, 0 versus
+__m256i f =
+_mm256_loadu_si256(reinterpret_cast(to_read_null_map_position));
+int mask = _mm256_movemask_epi8(_mm256_cmpgt_epi8(f, all0));
+// all data is null
+if (mask == 0x) {
+} else if (mask == 0) { // all data is not null
+for (size_t i = processed_records_num; i < processed_records_num + simd_batch_size;
+ i++) {
+add_not_nullable(places[i] + place_offset, columns, i, arena);
+}
+} else {
+// data is partly null
+for (size_t i = processed_records_num; i < processed_records_num + simd_batch_size;

Review Comment:
Maybe we can compute the `lowbit` of the mask to find the offsets of the non-null rows.
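The `lowbit` idea from the review comment on PR #10392 can be sketched as follows. In the quoted diff a set bit in `mask` marks a null row, so the non-null rows are the set bits of the complement; repeatedly taking and clearing the lowest set bit visits exactly those rows. This is a minimal, portable sketch under those assumptions: `lowest_set_bit_index` and `non_null_offsets` are hypothetical helper names, and real code would more likely use a hardware count-trailing-zeros instruction than the shift loop shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Index of the lowest set bit of m (portable stand-in for a ctz instruction).
// Precondition: m != 0.
inline int lowest_set_bit_index(uint32_t m) {
    assert(m != 0);
    int idx = 0;
    while ((m & 1u) == 0) {
        m >>= 1;
        ++idx;
    }
    return idx;
}

// null_mask: bit i == 1 means row i of a 32-row batch is null.
// Returns the in-batch offsets of the non-null rows in increasing order.
std::vector<int> non_null_offsets(uint32_t null_mask) {
    std::vector<int> offsets;
    uint32_t not_null = ~null_mask; // set bits now mark non-null rows
    while (not_null != 0) {
        offsets.push_back(lowest_set_bit_index(not_null));
        not_null &= not_null - 1; // clear the lowest set bit
    }
    return offsets;
}
```

The loop body runs once per non-null row instead of once per row, which is the point of the suggestion when a batch is mostly null.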
[GitHub] [doris] morrySnow commented on a diff in pull request #10672: [refactor](nereids) Refine some code snippets
morrySnow commented on code in PR #10672:
URL: https://github.com/apache/doris/pull/10672#discussion_r916542491

## fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java: ##

@@ -89,51 +87,36 @@ public static Expression or(List expressions) {
 * Use AND/OR to combine expressions together.
 */
 public static Expression combine(NodeType op, List expressions) {
- Objects.requireNonNull(expressions, "expressions is null");
 if (expressions.size() == 0) {
-if (op == NodeType.AND) {
-return new Literal(true);
-}
-if (op == NodeType.OR) {
-return new Literal(false);
-}
-}
-
-if (expressions.size() == 1) {
+return new Literal(op == NodeType.AND);
+} else if (expressions.size() == 1) {
 return expressions.get(0);
 }
-List distinctExpressions = Lists.newArrayList(new LinkedHashSet<>(expressions));
-if (op == NodeType.AND) {
-if (distinctExpressions.contains(Literal.FALSE_LITERAL)) {
-return Literal.FALSE_LITERAL;
+Expression shortCircuit = (op == NodeType.AND ? Literal.FALSE_LITERAL : Literal.TRUE_LITERAL);
+Expression skip = (op == NodeType.AND ? Literal.TRUE_LITERAL : Literal.FALSE_LITERAL);
+LinkedHashSet distinctExpressions = Sets.newLinkedHashSetWithExpectedSize(expressions.size());
+for (Expression expression : expressions) {
+if (expression.equals(shortCircuit)) {
+return shortCircuit;
+} else if (!expression.equals(skip)) {
+distinctExpressions.add(expression);
 }
-distinctExpressions = distinctExpressions.stream().filter(p -> !p.equals(Literal.TRUE_LITERAL))
-.collect(Collectors.toList());
 }
-if (op == NodeType.OR) {
-if (distinctExpressions.contains(Literal.TRUE_LITERAL)) {
-return Literal.TRUE_LITERAL;
+List result = Lists.newArrayListWithCapacity(distinctExpressions.size() / 2 + 1);

Review Comment:
Maybe we could use the stream `reduce` API to do this without recursion.

## fe/fe-core/src/main/java/org/apache/doris/nereids/util/Utils.java: ##

@@ -28,10 +28,7 @@ public class Utils {
 * @return quoted string
 */
 public static String quoteIfNeeded(String part) {
-if (part.matches("[a-zA-Z0-9_]+") && !part.matches("\\d+")) {
-return part;
-} else {
-return part.replace("`", "``");
-}
+return part.matches("\\w*[\\w&&[^\\d]]+\\w*")

Review Comment:
This pattern is not intuitive. It would be better to add a comment explaining that it matches every legal string except pure-digit strings.

## fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java: ##

@@ -89,51 +87,36 @@ public static Expression or(List expressions) {
 * Use AND/OR to combine expressions together.
 */
 public static Expression combine(NodeType op, List expressions) {
- Objects.requireNonNull(expressions, "expressions is null");
 if (expressions.size() == 0) {
-if (op == NodeType.AND) {
-return new Literal(true);
-}
-if (op == NodeType.OR) {
-return new Literal(false);
-}
-}
-
-if (expressions.size() == 1) {
+return new Literal(op == NodeType.AND);

Review Comment:
If you do that, you need to add a check at the top of this function that `NodeType` is either `AND` or `OR`.
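To make the `quoteIfNeeded` pattern from the review on PR #10672 easier to read, the sketch below compares the old two-step check with a single pattern of the same shape. It is written in C++ rather than Java, so two things are assumptions of this example: `std::regex` has no `&&` class intersection, so `[\w&&[^\d]]` is spelled out as `[A-Za-z_]`, and `\w` is taken to mean `[A-Za-z0-9_]` (Java's default, non-Unicode meaning). The function names are hypothetical.

```cpp
#include <cassert>
#include <initializer_list>
#include <regex>
#include <string>

// Old two-step check: only word characters, and not a pure digit string.
bool matches_two_step(const std::string& part) {
    static const std::regex word_chars("[a-zA-Z0-9_]+");
    static const std::regex digits_only("[0-9]+");
    return std::regex_match(part, word_chars) && !std::regex_match(part, digits_only);
}

// Single pattern: word characters with at least one non-digit word character.
bool matches_single_pattern(const std::string& part) {
    static const std::regex pattern("[A-Za-z0-9_]*[A-Za-z_]+[A-Za-z0-9_]*");
    return std::regex_match(part, pattern);
}

// Usage sketch: both checks agree on a few representative inputs.
inline void check_agreement() {
    for (const char* s : {"abc", "a1", "1a", "123", "", "a-b", "_"}) {
        assert(matches_single_pattern(s) == matches_two_step(s));
    }
}
```

Both forms accept `abc`, `a1`, `1a` and `_`, and reject `123`, the empty string and `a-b`, which is the "every legal string except pure-digit strings" reading the reviewer asks to have documented.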
[GitHub] [doris] zhengshiJ commented on a diff in pull request #10678: [feature](nereides) support sort translator
zhengshiJ commented on code in PR #10678:
URL: https://github.com/apache/doris/pull/10678#discussion_r916543059

## fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java: ##

@@ -117,6 +119,24 @@ public void setMaterializedTupleInfo(
 }
 }
+/**
+ * Sets tupleInfo.
+ * Just for Nereids.
+ */
+public void setTupleInfo(

Review Comment:
done
[GitHub] [doris] adonis0147 commented on a diff in pull request #10672: [refactor](nereids) Refine some code snippets
adonis0147 commented on code in PR #10672:
URL: https://github.com/apache/doris/pull/10672#discussion_r916569297

## fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java: ##

@@ -89,51 +87,36 @@ public static Expression or(List expressions) {
 * Use AND/OR to combine expressions together.
 */
 public static Expression combine(NodeType op, List expressions) {
- Objects.requireNonNull(expressions, "expressions is null");
 if (expressions.size() == 0) {
-if (op == NodeType.AND) {
-return new Literal(true);
-}
-if (op == NodeType.OR) {
-return new Literal(false);
-}
-}
-
-if (expressions.size() == 1) {
+return new Literal(op == NodeType.AND);
+} else if (expressions.size() == 1) {
 return expressions.get(0);
 }
-List distinctExpressions = Lists.newArrayList(new LinkedHashSet<>(expressions));
-if (op == NodeType.AND) {
-if (distinctExpressions.contains(Literal.FALSE_LITERAL)) {
-return Literal.FALSE_LITERAL;
+Expression shortCircuit = (op == NodeType.AND ? Literal.FALSE_LITERAL : Literal.TRUE_LITERAL);
+Expression skip = (op == NodeType.AND ? Literal.TRUE_LITERAL : Literal.FALSE_LITERAL);
+LinkedHashSet distinctExpressions = Sets.newLinkedHashSetWithExpectedSize(expressions.size());
+for (Expression expression : expressions) {
+if (expression.equals(shortCircuit)) {
+return shortCircuit;
+} else if (!expression.equals(skip)) {
+distinctExpressions.add(expression);
 }
-distinctExpressions = distinctExpressions.stream().filter(p -> !p.equals(Literal.TRUE_LITERAL))
-.collect(Collectors.toList());
 }
-if (op == NodeType.OR) {
-if (distinctExpressions.contains(Literal.TRUE_LITERAL)) {
-return Literal.TRUE_LITERAL;
+List result = Lists.newArrayListWithCapacity(distinctExpressions.size() / 2 + 1);

Review Comment:
The output would not be the same as the original one if we used the stream `reduce` API.

recursion: `(A AND B) AND (C AND D)`
reduce: `((A AND B) AND C) AND D`

If the inconsistency is acceptable, I will use the `reduce` approach to simplify it.
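The difference adonis0147 describes between the two tree shapes can be reproduced with plain strings. The sketch below is an illustration only, written in C++ instead of the PR's Java: it assumes the recursive version pairs up neighbouring expressions level by level (which the `size() / 2 + 1` capacity in the diff suggests but does not prove), and `combine_balanced` and `combine_reduce` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Pairwise combination: joins neighbours level by level, giving a balanced tree.
std::string combine_balanced(std::vector<std::string> exprs, const std::string& op) {
    assert(!exprs.empty());
    while (exprs.size() > 1) {
        std::vector<std::string> next;
        next.reserve(exprs.size() / 2 + 1);
        for (std::size_t i = 0; i + 1 < exprs.size(); i += 2) {
            next.push_back("(" + exprs[i] + " " + op + " " + exprs[i + 1] + ")");
        }
        if (exprs.size() % 2 == 1) {
            next.push_back(exprs.back()); // odd element is carried to the next level
        }
        exprs = std::move(next);
    }
    return exprs.front();
}

// Left fold, the shape a stream-style reduce produces: a left-deep tree.
std::string combine_reduce(const std::vector<std::string>& exprs, const std::string& op) {
    assert(!exprs.empty());
    return std::accumulate(exprs.begin() + 1, exprs.end(), exprs.front(),
                           [&op](const std::string& acc, const std::string& e) {
                               return "(" + acc + " " + op + " " + e + ")";
                           });
}
```

For the inputs `A`, `B`, `C`, `D` and the operator `AND`, the first function yields the balanced form and the second the left-deep form from the comment. Both are logically equivalent; only the tree depth and the printed shape differ.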
[GitHub] [doris] yiguolei merged pull request #10671: [docs] fix keywords in sql-functions help documents
yiguolei merged PR #10671: URL: https://github.com/apache/doris/pull/10671
[doris] branch master updated: [docs]fix keywords in sql-functions help documents (#10671)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new d127cfeea2 [docs]fix keywords in sql-functions help documents (#10671) d127cfeea2 is described below commit d127cfeea28e7eb57f5c8b149eae636dd9335b64 Author: carlvinhust2012 AuthorDate: Fri Jul 8 16:22:47 2022 +0800 [docs]fix keywords in sql-functions help documents (#10671) Co-authored-by: hucheng01 --- docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md | 2 +- docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md | 2 +- docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md | 2 +- docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md | 2 +- docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md | 2 +- .../docs/sql-manual/sql-functions/table-functions/explode-json-array.md | 2 +- .../en/docs/sql-manual/sql-functions/table-functions/explode-numbers.md | 2 +- docs/en/docs/sql-manual/sql-functions/table-functions/explode-split.md | 2 +- .../docs/sql-manual/sql-functions/array-functions/arrays_overlap.md | 2 +- docs/zh-CN/docs/sql-manual/sql-functions/json-functions/json_array.md | 2 +- docs/zh-CN/docs/sql-manual/sql-functions/json-functions/json_object.md | 2 +- docs/zh-CN/docs/sql-manual/sql-functions/json-functions/json_quote.md | 2 +- .../docs/sql-manual/sql-functions/table-functions/explode-bitmap.md | 2 +- .../docs/sql-manual/sql-functions/table-functions/explode-json-array.md | 2 +- .../docs/sql-manual/sql-functions/table-functions/explode-numbers.md| 2 +- .../docs/sql-manual/sql-functions/table-functions/explode-split.md | 2 +- 16 files changed, 16 insertions(+), 16 deletions(-) diff --git a/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md b/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md index 
5cd3d30e36..ca0b5f01ba 100644 --- a/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md +++ b/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md @@ -63,4 +63,4 @@ mysql> select c_left,c_right,arrays_overlap(c_left,c_right) from array_test; ### keywords -ARRAYS_OVERLAP +ARRAY,ARRAYS,OVERLAP,ARRAYS_OVERLAP diff --git a/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md b/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md index 14e99263ec..38eaf68a12 100644 --- a/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md +++ b/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md @@ -67,4 +67,4 @@ MySQL> select json_array("a", null, "c"); +--+ ``` ### keywords -json_array +json,array,json_array diff --git a/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md b/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md index d21daf3b47..0576e4e4e2 100644 --- a/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md +++ b/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md @@ -68,4 +68,4 @@ MySQL> select json_object('username',null); +-+ ``` ### keywords -json_object +json,object,json_object diff --git a/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md b/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md index 6cc120a166..ff54b2e92f 100644 --- a/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md +++ b/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md @@ -67,4 +67,4 @@ MySQL> select json_quote("\n\b\r\t"); ++ ``` ### keywords -json_quote +json,quote,json_quote diff --git a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md index 99c2627a0b..482dd406d0 100644 --- a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md +++ 
b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md @@ -154,4 +154,4 @@ lateral view explode_split("a,b", ",") tmp2 as e2 order by k1, e1, e2; ### keywords -explode_bitmap \ No newline at end of file +explode,bitmap,explode_bitmap \ No newline at end of file diff --git a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md index d3721ca04f..ab0f0b5b83 100644 --- a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md +++ b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md @@ -283,4 +283,4 @@ mysql> select k1, e1 from example1 lateral view explode_json_array_string('{"a": ### keywords -explode_json_array \ No newline at end of file +explode,json,a
[GitHub] [doris] Gabriel39 opened a new pull request, #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code
Gabriel39 opened a new pull request, #10695:
URL: https://github.com/apache/doris/pull/10695

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #10549: [Bug] Fix array functions arguments mismatch
github-actions[bot] commented on PR #10549: URL: https://github.com/apache/doris/pull/10549#issuecomment-1178734320 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10549: [Bug] Fix array functions arguments mismatch
github-actions[bot] commented on PR #10549: URL: https://github.com/apache/doris/pull/10549#issuecomment-1178734359 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10670: [fix](optimizer) join reorder may cause column non-existence problem
github-actions[bot] commented on PR #10670: URL: https://github.com/apache/doris/pull/10670#issuecomment-1178736075 PR approved by at least one committer and no changes requested.
[GitHub] [doris] icedrugs89 opened a new issue, #10696: [Bug] BrokerLoad import job fails with type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = Broker list path exception. path=hdfs:xxx
[GitHub] [doris] github-actions[bot] commented on pull request #10692: [refactor]broker rpc timeout configuration parameterization
github-actions[bot] commented on PR #10692: URL: https://github.com/apache/doris/pull/10692#issuecomment-1178745074 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10692: [refactor]broker rpc timeout configuration parameterization
github-actions[bot] commented on PR #10692: URL: https://github.com/apache/doris/pull/10692#issuecomment-1178745020 PR approved by at least one committer and no changes requested.
[GitHub] [doris] Henry2SS commented on a diff in pull request #10492: [feature-wip] support avro format in routine load and stream load
Henry2SS commented on code in PR #10492:
URL: https://github.com/apache/doris/pull/10492#discussion_r916626276

## be/src/common/config.h: ##

@@ -763,6 +763,9 @@ CONF_Int32(quick_compaction_batch_size, "10");
 // do compaction min rowsets
 CONF_Int32(quick_compaction_min_rowsets, "10");
+// Avro schema file path, set default to "${DORIS_HOME}/conf/avro_schema.json"
+CONF_String(avro_schema_file_path, "${DORIS_HOME}/conf/avro_schema.json");

Review Comment:
> The old query engine with `tuple` and `RowBatch` interface will be deprecated in the future. So you'd better implement this feature in vec engine, with `block` interface.

Hi, the vectorized `VAvroScanner` is supported now.
[GitHub] [doris] icedrugs89 commented on issue #10696: [Bug] BrokerLoad import job fails with type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = Broker list path exception. path=hdfs:xxx
icedrugs89 commented on issue #10696:
URL: https://github.com/apache/doris/issues/10696#issuecomment-1178755340

1. The import succeeds when a specific year/month/day partition is specified, so we suspected that the table's partition directory is too large and verified this as follows:
1) Installed the HDFS client environment on the broker machine and used the wildcard * to access all partition files of the Hive table; this took about 56 seconds.
2) Checked the socket code in the corresponding thrift library: fetching the File status may time out because the number of Hive partitions is too large and the call takes too long.
3) Checked thrift-0.9.3/lib/java/src/org/apache/thrift/transport/TSocket.java: the default SocketTimeout for reads and writes is too short, less than 50 seconds.
4) The community's technical team modified the FE parameters in fe/fe-core/src/main/java/org/apache/doris/common/ClientPool.java, adding static int brokerTimeoutMs = 1;
5) Built a test version, restarted the FE and verified again; the problem is fixed.
[GitHub] [doris] kpfly commented on pull request #10629: [Enhancement] Improve TCMalloc Hook consume MemTracker performance
kpfly commented on PR #10629:
URL: https://github.com/apache/doris/pull/10629#issuecomment-1178764456

Before this patch, the tcmalloc hook could cause about a 30% performance loss when loading JSON data. How about after this patch?
[GitHub] [doris] jackwener commented on pull request #10624: [enhancement]: remove redundant field.
jackwener commented on PR #10624:
URL: https://github.com/apache/doris/pull/10624#issuecomment-1178764697

In addition, this PR will not affect upgrades, because the deleted field is optional.
[GitHub] [doris] morningman closed issue #10674: [Bug] join reorder may cause column non-existence problem
morningman closed issue #10674: [Bug] join reorder may cause column non-existence problem URL: https://github.com/apache/doris/issues/10674
[GitHub] [doris] morningman merged pull request #10670: [fix](optimizer) join reorder may cause column non-existence problem
morningman merged PR #10670: URL: https://github.com/apache/doris/pull/10670
[doris] branch master updated: [fix](optimizer) join reorder may cause column non-existence problem (#10670)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 036cba [fix](optimizer) join reorder may cause column non-existence problem (#10670) 036cba is described below commit 036cba7ec4325f0f097eacf9e2b26b4d7fa8 Author: yinzhijian <373141...@qq.com> AuthorDate: Fri Jul 8 17:28:32 2022 +0800 [fix](optimizer) join reorder may cause column non-existence problem (#10670) for example: select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b; If t3 is a large table, it will be placed first after the reorderTable, and the problem that t2.b does not exist will occur in reanalyzing. --- .../java/org/apache/doris/analysis/SelectStmt.java | 13 - .../java/org/apache/doris/planner/QueryPlanTest.java | 18 ++ 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java index 881e7a2947..05ffa486c5 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java @@ -820,7 +820,18 @@ public class SelectStmt extends QueryStmt { List candidateEqJoinPredicates = analyzer.getEqJoinConjunctsExcludeAuxPredicates(tid); for (Expr candidateEqJoinPredicate : candidateEqJoinPredicates) { List candidateTupleList = Lists.newArrayList(); - Expr.getIds(Lists.newArrayList(candidateEqJoinPredicate), candidateTupleList, null); +List candidateEqJoinPredicateList = Lists.newArrayList(candidateEqJoinPredicate); +// If a large table or view has joinClause is ranked first, +// and the joinClause is not judged here, +// the column in joinClause may not be found during reanalyzing. 
+// for example: +// select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b; +// If t3 is a large table, it will be placed first after the reorderTable, +// and the problem that t2.b does not exist will occur in reanalyzing +if (candidateTableRef.getOnClause() != null) { + candidateEqJoinPredicateList.add(candidateTableRef.getOnClause()); +} +Expr.getIds(candidateEqJoinPredicateList, candidateTupleList, null); int count = candidateTupleList.size(); for (TupleId tupleId : candidateTupleList) { if (validTupleId.contains(tupleId) || tid.equals(tupleId)) { diff --git a/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java b/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java index 4163112aa7..b8499836df 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java @@ -2169,4 +2169,22 @@ public class QueryPlanTest extends TestWithFeService { Assert.assertFalse(explainString.contains("CROSS JOIN")); } + +@Test +public void testDefaultJoinReorderWithView() throws Exception { +connectContext.setDatabase("default_cluster:test"); +createTable("CREATE TABLE t_1 (col1 varchar, col2 varchar, col3 int)\n" + "DISTRIBUTED BY HASH(col3)\n" ++ "BUCKETS 3\n" + "PROPERTIES(\n" + " \"replication_num\"=\"1\"\n" + ");"); +createTable("CREATE TABLE t_2 (col1 varchar, col2 varchar, col3 int)\n" + "DISTRIBUTED BY HASH(col3)\n" ++ "BUCKETS 3\n" + "PROPERTIES(\n" + " \"replication_num\"=\"1\"\n" + ");"); +createView("CREATE VIEW v_1 as select col1 from t_1;"); +createView("CREATE VIEW v_2 as select x.col2 from (select t_2.col2, 1 + 1 from t_2) x;"); + +String sql = "explain select t_1.col2, v_1.col1 from t_1 inner join t_2 on t_1.col1 = t_2.col1 inner join v_1 " ++ "on v_1.col1 = t_2.col2 inner join v_2 on v_2.col2 = t_2.col1"; +String explainString = getSQLPlanOrErrorMsg(sql); +System.out.println(explainString); +// errCode = 2, detailMessage = 
Unknown column 'col2' in 't_2' +Assert.assertFalse(explainString.contains("errCode")); +} }
[GitHub] [doris] cambyzju commented on pull request #10673: [feature-wip](array-type) explode support more sub types
cambyzju commented on PR #10673: URL: https://github.com/apache/doris/pull/10673#issuecomment-1178768701 rebase to trigger P0 regression again.
[GitHub] [doris] k-i-d-d commented on pull request #10629: [Enhancement] Improve TCMalloc Hook consume MemTracker performance
k-i-d-d commented on PR #10629: URL: https://github.com/apache/doris/pull/10629#issuecomment-1178772483
> Before this patch, when loading JSON data, the tcmalloc hook may cause about a 30% performance loss. How about after this patch?

10%. Apart from loading large JSON, other loads usually have a loss of less than 2%.
[GitHub] [doris] kpfly commented on pull request #10629: [Enhancement] Improve TCMalloc Hook consume MemTracker performance
kpfly commented on PR #10629: URL: https://github.com/apache/doris/pull/10629#issuecomment-1178775502 > nice
[doris] branch master updated: [BugFix] Column datas doesn't match nullmap when vectorization load (#10684)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new 35a282fd61 [BugFix] Column datas doesn't match nullmap when vectorization load (#10684) 35a282fd61 is described below commit 35a282fd6112b36e858d6739e73aa30a1e9b1d64 Author: Lightman <31928846+lchangli...@users.noreply.github.com> AuthorDate: Fri Jul 8 17:39:44 2022 +0800 [BugFix] Column datas doesn't match nullmap when vectorization load (#10684) * block column doesn't match nullmap * remove _nullmap+_row_pos in convertor_to_olap --- be/src/vec/olap/olap_data_convertor.cpp | 46 - be/src/vec/olap/olap_data_convertor.h | 2 +- 2 files changed, 23 insertions(+), 25 deletions(-) diff --git a/be/src/vec/olap/olap_data_convertor.cpp b/be/src/vec/olap/olap_data_convertor.cpp index 72a55e545b..03a1a208cd 100644 --- a/be/src/vec/olap/olap_data_convertor.cpp +++ b/be/src/vec/olap/olap_data_convertor.cpp @@ -138,6 +138,7 @@ void OlapBlockDataConvertor::OlapColumnDataConvertorBase::set_source_column( auto nullable_column = assert_cast(_typed_column.column.get()); _nullmap = nullable_column->get_null_map_data().data(); +_nullmap += row_pos; } } @@ -194,7 +195,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap() size_t total_size = 0; if (_nullmap) { -const UInt8* nullmap_cur = _nullmap + _row_pos; +const UInt8* nullmap_cur = _nullmap; while (bitmap_value_cur != bitmap_value_end) { if (!*nullmap_cur) { total_size += bitmap_value_cur->getSizeInBytes(); @@ -215,7 +216,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap() char* raw_data = _raw_data.data(); Slice* slice = _slice.data(); if (_nullmap) { -const UInt8* nullmap_cur = _nullmap + _row_pos; +const UInt8* nullmap_cur = _nullmap; while (bitmap_value_cur != bitmap_value_end) { if (!*nullmap_cur) { slice_size = 
bitmap_value_cur->getSizeInBytes(); @@ -233,7 +234,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap() ++nullmap_cur; ++bitmap_value_cur; } -assert(nullmap_cur == _nullmap + _row_pos + _num_rows && slice == _slice.get_end_ptr()); +assert(nullmap_cur == _nullmap + _num_rows && slice == _slice.get_end_ptr()); } else { while (bitmap_value_cur != bitmap_value_end) { slice_size = bitmap_value_cur->getSizeInBytes(); @@ -254,8 +255,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap() Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() { assert(_typed_column.column); const vectorized::ColumnHLL* column_hll = nullptr; -const UInt8* nullmap = get_nullmap(); -if (nullmap) { +if (_nullmap) { auto nullable_column = assert_cast(_typed_column.column.get()); column_hll = assert_cast( @@ -270,8 +270,8 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() { HyperLogLog* hll_value_end = hll_value_cur + _num_rows; size_t total_size = 0; -if (nullmap) { -const UInt8* nullmap_cur = nullmap + _row_pos; +if (_nullmap) { +const UInt8* nullmap_cur = _nullmap; while (hll_value_cur != hll_value_end) { if (!*nullmap_cur) { total_size += hll_value_cur->max_serialized_size(); @@ -292,8 +292,8 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() { Slice* slice = _slice.data(); hll_value_cur = hll_value; -if (nullmap) { -const UInt8* nullmap_cur = nullmap + _row_pos; +if (_nullmap) { +const UInt8* nullmap_cur = _nullmap; while (hll_value_cur != hll_value_end) { if (!*nullmap_cur) { slice_size = hll_value_cur->serialize((uint8_t*)raw_data); @@ -310,7 +310,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() { ++nullmap_cur; ++hll_value_cur; } -assert(nullmap_cur == nullmap + _row_pos + _num_rows && slice == _slice.get_end_ptr()); +assert(nullmap_cur == _nullmap + _num_rows && slice == _slice.get_end_ptr()); } else { while 
(hll_value_cur != hll_value_end) { slice_size = hll_value_cur->serialize((uint8_t*)raw_data); @@ -372,7 +372,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorChar::convert_to_olap() { } for (size_t i = 0; i < _num_rows; i++) { -if (!_nullmap || !_nullmap[i + _row_pos]) { +if (!_nullmap || !_nullmap[i]) { _slice[i] = column_string->
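The diff above moves the `row_pos` offset into `set_source_column`, so the individual `convert_to_olap()` loops index the null map from zero. A minimal sketch of that pattern follows; `ConvertorSketch` and its members are hypothetical stand-ins, not the Doris classes, and the sketch only illustrates applying the offset once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column convertor; not the Doris class.
struct ConvertorSketch {
    const uint8_t* nullmap = nullptr; // already offset to the first row of the slice
    size_t num_rows = 0;

    // Apply the row offset exactly once, when the source column is attached.
    void set_source(const std::vector<uint8_t>& full_nullmap, size_t row_pos, size_t rows) {
        assert(row_pos + rows <= full_nullmap.size());
        nullmap = full_nullmap.data() + row_pos;
        num_rows = rows;
    }

    // Consumers index from 0 and never need to know row_pos.
    size_t count_non_null() const {
        size_t n = 0;
        for (size_t i = 0; i < num_rows; ++i) {
            if (!nullmap[i]) {
                ++n;
            }
        }
        return n;
    }
};
```

With the offset stored in one place, a loop cannot forget it or add it twice, which appears to be the kind of mismatch the commit title describes.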
[GitHub] [doris] yiguolei merged pull request #10684: [BugFix] Column datas doesn't match nullmap when vectorization load
yiguolei merged PR #10684: URL: https://github.com/apache/doris/pull/10684
[GitHub] [doris] yiguolei merged pull request #10660: [Doc] add flink-doris-connector 1.1.0 doc
yiguolei merged PR #10660: URL: https://github.com/apache/doris/pull/10660
[doris] branch dev-1.0.1 updated: [fix](optimizer) join reorder may cause column non-existence problem (#10670)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch dev-1.0.1 in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/dev-1.0.1 by this push: new c60ed8f18a [fix](optimizer) join reorder may cause column non-existence problem (#10670) c60ed8f18a is described below commit c60ed8f18aac93fe0a515a807387502829760e48 Author: yinzhijian <373141...@qq.com> AuthorDate: Fri Jul 8 17:28:32 2022 +0800 [fix](optimizer) join reorder may cause column non-existence problem (#10670) for example: select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b; If t3 is a large table, it will be placed first after the reorderTable, and the problem that t2.b does not exist will occur in reanalyzing. --- .../src/main/java/org/apache/doris/analysis/SelectStmt.java | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java index 84710d14c3..955f9650b4 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java @@ -817,7 +817,18 @@ public class SelectStmt extends QueryStmt { List candidateEqJoinPredicates = analyzer.getEqJoinConjunctsExcludeAuxPredicates(tid); for (Expr candidateEqJoinPredicate : candidateEqJoinPredicates) { List candidateTupleList = Lists.newArrayList(); - Expr.getIds(Lists.newArrayList(candidateEqJoinPredicate), candidateTupleList, null); +List candidateEqJoinPredicateList = Lists.newArrayList(candidateEqJoinPredicate); +// If a large table or view has joinClause is ranked first, +// and the joinClause is not judged here, +// the column in joinClause may not be found during reanalyzing. 
+// for example: +// select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b; +// If t3 is a large table, it will be placed first after the reorderTable, +// and the problem that t2.b does not exist will occur in reanalyzing +if (candidateTableRef.getOnClause() != null) { + candidateEqJoinPredicateList.add(candidateTableRef.getOnClause()); +} +Expr.getIds(candidateEqJoinPredicateList, candidateTupleList, null); int count = candidateTupleList.size(); for (TupleId tupleId : candidateTupleList) { if (validTupleId.contains(tupleId) || tid.equals(tupleId)) {
[doris] branch master updated (35a282fd61 -> 2b1d8ac28a)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git

from 35a282fd61 [BugFix] Column datas doesn't match nullmap when vectorization load (#10684)
add 2b1d8ac28a [Doc] add flink-doris-connector 1.1.0 doc (#10660)

No new revisions were added by this update.

Summary of changes:
docs/en/docs/ecosystem/flink-doris-connector.md | 343 ++--
docs/zh-CN/docs/ecosystem/flink-doris-connector.md | 346 -
2 files changed, 297 insertions(+), 392 deletions(-)
[GitHub] [doris] whutpencil opened a new issue, #10697: [Enhancement] The broker load Kerberos update mechanism has defects, resulting in import errors
whutpencil opened a new issue, #10697: URL: https://github.com/apache/doris/issues/10697

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

In our scenario, several broker load jobs fail to import every day with a GSS authentication failure, and the errors can last as long as 5 minutes.

### Solution

There is no need for a scheduled task that updates the token regularly. Instead, check whether the cached filesystem has expired each time it is obtained, and update it if it has.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
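The proposal in this issue (check expiry on every lookup instead of refreshing on a timer) can be sketched as follows. This is only an illustration under stated assumptions: `LazyExpiryCache`, `CachedHandle`, and the `generation` counter are hypothetical names, the sketch is written in C++ rather than the broker's own language, and the caller passes in the current time to keep the logic easy to test.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical cached handle; stands in for a cached filesystem object.
struct CachedHandle {
    int generation = 0;           // incremented each time the handle is re-created
    Clock::time_point expires_at; // point in time after which the handle is stale
};

class LazyExpiryCache {
public:
    explicit LazyExpiryCache(std::chrono::seconds ttl) : _ttl(ttl) {}

    // Returns a valid handle, re-creating it when it is missing or expired.
    const CachedHandle& get(const std::string& key, Clock::time_point now) {
        auto it = _entries.find(key);
        if (it == _entries.end() || now >= it->second.expires_at) {
            CachedHandle fresh;
            fresh.generation = (it == _entries.end()) ? 1 : it->second.generation + 1;
            fresh.expires_at = now + _ttl;
            _entries[key] = fresh;
            it = _entries.find(key);
        }
        return it->second;
    }

private:
    std::chrono::seconds _ttl;
    std::map<std::string, CachedHandle> _entries;
};
```

Because the check happens on the access path, a caller can never receive a handle that is already past its expiry, whereas a periodic refresher leaves a window between expiry and the next scheduled run.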
[GitHub] [doris] hf200012 opened a new pull request, #10698: [Doc]broker load rpc timeout problem FQA
hf200012 opened a new pull request, #10698: URL: https://github.com/apache/doris/pull/10698

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] whutpencil opened a new pull request, #10699: [enhancement] Improve the availability of broker load
whutpencil opened a new pull request, #10699: URL: https://github.com/apache/doris/pull/10699

# Proposed changes

Issue Number: https://github.com/apache/doris/issues/10697

## Problem Summary:

See issue.

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] mrhhsg opened a new pull request, #10700: [improvement]pre-serialize aggregation keys
mrhhsg opened a new pull request, #10700: URL: https://github.com/apache/doris/pull/10700

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Test with ssb-flat 100g with the SQL:

```sql
select count() from (
    SELECT C_CITY, SUM(LO_REVENUE) AS revenue
    FROM lineorder_flat
    GROUP BY C_CITY, S_CITY) a;
```

||non-pre serialize|pre serialize|
|-|-|-|
|profile|https://user-images.githubusercontent.com/1179834/177964945-8803ad98-923b-4468-848d-2dd83c31ebb8.png|https://user-images.githubusercontent.com/1179834/177964545-899a4045-179c-47ea-8ec7-18fefc1d7e71.png|

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #10698: [Doc]broker load rpc timeout problem FQA
github-actions[bot] commented on PR #10698: URL: https://github.com/apache/doris/pull/10698#issuecomment-1178780755 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10698: [Doc]broker load rpc timeout problem FQA
github-actions[bot] commented on PR #10698: URL: https://github.com/apache/doris/pull/10698#issuecomment-1178780723 PR approved by at least one committer and no changes requested.
[GitHub] [doris] Gabriel39 opened a new pull request, #10701: [reafactor](predicate) refactor predicates in scan node
Gabriel39 opened a new pull request, #10701: URL: https://github.com/apache/doris/pull/10701

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] jackwener commented on a diff in pull request #10694: optimize topn query if order by columns is prefix of sort keys of table
jackwener commented on code in PR #10694: URL: https://github.com/apache/doris/pull/10694#discussion_r916590916

## be/src/olap/reader.cpp: ##
@@ -197,11 +197,17 @@ Status TabletReader::_capture_rs_readers(const ReaderParams& read_params,
// it's ok for rowset to return unordered result
need_ordered_result = false;
}
+
+if (read_params.read_orderby_key) {
+need_ordered_result = true;

Review Comment: need_ordered_result = read_params.read_orderby_key

## be/src/olap/rowset/rowset_reader_context.h: ##
@@ -34,6 +34,9 @@ struct RowsetReaderContext {
const TabletSchema* tablet_schema = nullptr;
// whether rowset should return ordered rows.
bool need_ordered_result = true;
+//

Review Comment: Forgot to comment?

## be/src/vec/olap/vcollect_iterator.h: ##
@@ -102,12 +102,14 @@ class VCollectIterator {
// if row cursors equal, compare data version.
class LevelIteratorComparator {
public:
-LevelIteratorComparator(int sequence = -1) : _sequence(sequence) {}
+LevelIteratorComparator(int sequence, bool is_reverse) :
+_sequence(sequence), _is_reverse(is_reverse) {}
bool operator()(LevelIterator* lhs, LevelIterator* rhs);
private:
int _sequence;
+bool _is_reverse = false;

Review Comment: Add a comment explaining its function?
[GitHub] [doris] morningman opened a new pull request, #10702: [refactor] Rename Catalog to Env
morningman opened a new pull request, #10702: URL: https://github.com/apache/doris/pull/10702

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Change the Catalog class name to Env. The rename was done automatically by the IDE. No functional changes or bug fixes are involved.

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (No Need)
3. Has document been added or modified: (No Need)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] yiguolei commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
yiguolei commented on code in PR #10700: URL: https://github.com/apache/doris/pull/10700#discussion_r916669026

## be/src/vec/columns/column.h: ##
@@ -246,6 +246,14 @@ class IColumn : public COW {
/// Returns pointer to the position after the read data.
virtual const char* deserialize_and_insert_from_arena(const char* pos) = 0;
+virtual size_t get_max_row_byte_size() const { return 0; }

Review Comment: Add some comments for the new method so that other people can read the code more easily.
[GitHub] [doris] mrhhsg commented on a diff in pull request #10701: [reafactor](predicate) refactor predicates in scan node
mrhhsg commented on code in PR #10701: URL: https://github.com/apache/doris/pull/10701#discussion_r916669902

## be/src/exec/olap_common.h: ##
@@ -44,25 +45,26 @@ std::string cast_to_string(int8_t);
/**
* @brief Column's value range
**/
-template
+template
class ColumnValueRange {
public:
-typedef typename std::set::iterator iterator_type;
+using CppType = typename doris::PrimitiveTypeTraits::CppType;
+using IteratorType = typename std::set::iterator;
ColumnValueRange();
-ColumnValueRange(std::string col_name, PrimitiveType type);
+ColumnValueRange(std::string col_name, doris::PrimitiveType type);
-ColumnValueRange(std::string col_name, PrimitiveType type, const T& min, const T& max,
- bool contain_null);
+ColumnValueRange(std::string col_name, doris::PrimitiveType type, const CppType& min,

Review Comment: Maybe we can remove the arg `type` because it is the same as the template arg `doris::PrimitiveType primitive_type`?

## be/src/exec/olap_common.cpp: ##
@@ -42,17 +42,32 @@ std::string cast_to_string(int8_t value) {
}
template <>
-void ColumnValueRange::convert_to_fixed_value() {
+void ColumnValueRange::convert_to_fixed_value() {

Review Comment: namespace `doris` is redundant.
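The first review comment rests on the fact that a non-type template parameter already fixes the primitive type at compile time, so a runtime `type` argument carries no extra information. A minimal sketch of that idea is below; `PrimitiveTypeSketch`, `TypeTraitsSketch`, and `ValueRangeSketch` are hypothetical names chosen for illustration and do not reproduce the Doris declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical enum and traits; they only mirror the shape of the idea.
enum class PrimitiveTypeSketch { INT, BIGINT, VARCHAR };

template <PrimitiveTypeSketch T>
struct TypeTraitsSketch;
template <>
struct TypeTraitsSketch<PrimitiveTypeSketch::INT> { using CppType = int32_t; };
template <>
struct TypeTraitsSketch<PrimitiveTypeSketch::BIGINT> { using CppType = int64_t; };
template <>
struct TypeTraitsSketch<PrimitiveTypeSketch::VARCHAR> { using CppType = std::string; };

template <PrimitiveTypeSketch primitive_type>
class ValueRangeSketch {
public:
    // The C++ value type is derived from the template argument.
    using CppType = typename TypeTraitsSketch<primitive_type>::CppType;

    // No runtime `type` parameter: the template argument already encodes it.
    ValueRangeSketch(std::string col_name, const CppType& min, const CppType& max)
            : _col_name(std::move(col_name)), _low(min), _high(max) {}

    static constexpr PrimitiveTypeSketch type() { return primitive_type; }

    // Inclusive range check using only operator<.
    bool contains(const CppType& v) const { return !(v < _low) && !(_high < v); }

private:
    std::string _col_name;
    CppType _low;
    CppType _high;
};

static_assert(std::is_same<ValueRangeSketch<PrimitiveTypeSketch::BIGINT>::CppType, int64_t>::value,
              "BIGINT maps to int64_t in this sketch");
```

Dropping the runtime argument also removes the possibility of constructing a range whose stored type disagrees with its template argument.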
[GitHub] [doris] yiguolei commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
yiguolei commented on code in PR #10700: URL: https://github.com/apache/doris/pull/10700#discussion_r916681902

## be/src/vec/exec/vaggregation_node.h: ##
@@ -50,13 +50,42 @@ struct AggregationMethodSerialized {
Data data;
Iterator iterator;
bool inited = false;
+std::vector keys;
+AggregationMethodSerialized()
+: _serialized_key_buffer_size(0),
+ _serialized_key_buffer(nullptr),
+ _mem_pool(new MemPool) {}
-AggregationMethodSerialized() = default;
+using State = ColumnsHashing::HashMethodSerialized;
template explicit AggregationMethodSerialized(const Other& other) : data(other.data) {}
-using State = ColumnsHashing::HashMethodSerialized;
+void serialize_keys(const ColumnRawPtrs& key_columns, const size_t num_rows) {
+size_t max_one_row_byte_size = 0;
+for (const auto& column : key_columns) {
+max_one_row_byte_size +=
+std::max(max_one_row_byte_size, column->get_max_row_byte_size());

Review Comment: max_one_row_byte_size += column->get_max_row_byte_size() ?
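The difference between the form in the diff and the form the reviewer suggests can be seen with a small sketch. The function names below are hypothetical, and the per-column sizes are made-up inputs rather than values from Doris.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Form suggested in the review: the row size is the sum of the per-column maxima.
size_t sum_of_max_sizes(const std::vector<size_t>& per_column_max) {
    size_t total = 0;
    for (size_t s : per_column_max) {
        total += s;
    }
    return total;
}

// Form in the diff: each step adds max(running total, column size).
size_t accumulate_with_max(const std::vector<size_t>& per_column_max) {
    size_t total = 0;
    for (size_t s : per_column_max) {
        total += std::max(total, s);
    }
    return total;
}
```

For per-column sizes `{16, 8, 4}` the plain sum is 28, while the `std::max` form yields 64, because the running total dominates each later term and roughly doubles on every step. In this sketch the second form is never smaller than the plain sum, so a buffer sized from it would be large enough but could be substantially over-allocated.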
[GitHub] [doris] github-actions[bot] commented on pull request #10678: [feature](nereides) support sort translator
github-actions[bot] commented on PR #10678: URL: https://github.com/apache/doris/pull/10678#issuecomment-1178821588 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10678: [feature](nereides) support sort translator
github-actions[bot] commented on PR #10678: URL: https://github.com/apache/doris/pull/10678#issuecomment-1178821541 PR approved by at least one committer and no changes requested.
[GitHub] [doris] wangsha0 opened a new issue, #10703: [Feature] About supporting the format file of EC when using BrokerLoad to sync data from hdfs to doris
wangsha0 opened a new issue, #10703: URL: https://github.com/apache/doris/issues/10703

### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description
The HDFS file uses an EC (erasure coding) policy, and Broker Load fails with this error:
```
JobId: 31095590
Label: xx
State: CANCELLED
Progress: ETL:N/A; LOAD:N/A
Type: BROKER
EtlInfo: NULL
TaskInfo: cluster:N/A; timeout(s):14400; max_filter_ratio:0.01
ErrorMsg: type:LOAD_RUN_FAIL; msg:ParseError : Bad read of hdfs://nsxx
CreateTime: 2022-07-08 15:19:28
EtlStartTime: 2022-07-08 15:19:30
EtlFinishTime: 2022-07-08 15:19:30
LoadStartTime: 2022-07-08 15:19:30
LoadFinishTime: 2022-07-08 15:20:41
URL: NULL
JobDetails: {xx
```
Running `hdfs ec -getPolicy -path hdfs://ns10xx` afterwards returns `RS-6-3-1024k`.

### Use case
RS-6-3-1024k

### Related issues
RS-6-3-1024k

### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] EmmyMiao87 commented on a diff in pull request #10659: [enhancement](nereids) make aggregate works
EmmyMiao87 commented on code in PR #10659: URL: https://github.com/apache/doris/pull/10659#discussion_r916687709 ## fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundFunction.java: ## @@ -52,6 +54,14 @@ public List getArguments() { return children(); } +@Override +public String sql() throws UnboundException { Review Comment: The three words toString, toSql, toDigest seem to be unified
[GitHub] [doris] EmmyMiao87 commented on a diff in pull request #10659: [enhancement](nereids) make aggregate works
EmmyMiao87 commented on code in PR #10659: URL: https://github.com/apache/doris/pull/10659#discussion_r916689097 ## fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java: ## @@ -263,23 +313,18 @@ public PlanFragment visitPhysicalHashJoin( // NOTICE: We must visit from right to left, to ensure the last fragment is root fragment PlanFragment rightFragment = visit(hashJoin.child(1), context); PlanFragment leftFragment = visit(hashJoin.child(0), context); -PhysicalHashJoin physicalHashJoin = hashJoin.getOperator(); - -//Expression predicateExpr = physicalHashJoin.getCondition().get(); -//List eqExprList = Utils.getEqConjuncts(hashJoin.child(0).getOutput(), -//hashJoin.child(1).getOutput(), predicateExpr); -JoinType joinType = physicalHashJoin.getJoinType(); - PlanNode leftFragmentPlanRoot = leftFragment.getPlanRoot(); PlanNode rightFragmentPlanRoot = rightFragment.getPlanRoot(); +PhysicalHashJoin physicalHashJoin = hashJoin.getOperator(); +JoinType joinType = physicalHashJoin.getJoinType(); if (joinType.equals(JoinType.CROSS_JOIN) Review Comment: Then if we encounter a `PhysicalHashJoin` whose `JoinType` is cross join, an error should be reported directly here.
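The fail-fast check the reviewer asks for could look roughly like the sketch below. `SketchJoinType` and `HashJoinGuard` are hypothetical stand-ins for the real Doris types, and the choice of `IllegalStateException` is an assumption; the actual translator may use a different exception type or message.

```java
// Hypothetical stand-in for the planner's join type enum; not Doris code.
enum SketchJoinType { INNER_JOIN, LEFT_OUTER_JOIN, CROSS_JOIN }

final class HashJoinGuard {
    // Rejects a cross join up front instead of translating it as a hash join.
    static void checkSupported(SketchJoinType joinType) {
        if (joinType == SketchJoinType.CROSS_JOIN) {
            throw new IllegalStateException(
                    "PhysicalHashJoin cannot have join type CROSS_JOIN");
        }
    }
}
```

Calling such a guard at the top of the visit method would surface a mis-planned cross join immediately, rather than letting it fall through to a branch that silently handles it.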
[GitHub] [doris] github-actions[bot] commented on pull request #10512: [feature] (vectorization)parquet push down support
github-actions[bot] commented on PR #10512: URL: https://github.com/apache/doris/pull/10512#issuecomment-1178840627 PR approved by at least one committer and no changes requested.
[doris-spark-connector] branch branch-1.1.0 created (now 2e38c12)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a change to branch branch-1.1.0 in repository https://gitbox.apache.org/repos/asf/doris-spark-connector.git at 2e38c12 Remove disclaimer (#42) No new revisions were added by this update.
[GitHub] [doris] morrySnow commented on a diff in pull request #10657: [feature](Nereids): enforcer job.
morrySnow commented on code in PR #10657: URL: https://github.com/apache/doris/pull/10657#discussion_r916409080 ## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/Group.java: ## @@ -135,6 +137,35 @@ public void setCostLowerBound(double costLowerBound) { this.costLowerBound = costLowerBound; } +/** + * Set or update lowestCostPlans: properties --> new Pair<>(cost, expression) + */ +public void setBestPlan(GroupExpression expression, double cost, PhysicalProperties properties) { Review Comment: ```suggestion public void updateBestPlan(GroupExpression expression, double cost, PhysicalProperties properties) { ``` furthermore, this setter, function use plan as name, but getter function use expression instead ## fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/JobContext.java: ## @@ -26,7 +26,7 @@ public class JobContext { private final PlannerContext plannerContext; private final PhysicalProperties requiredProperties; -private final double costUpperBound; +private double costUpperBound; Review Comment: if we need update cost upper bound, the better way is generate a new `JobContext` with new upper bound. If we only use one `JobContext` in all job of cascades, you need a stack to save cost status carefully ## fe/fe-core/src/main/java/org/apache/doris/nereids/operators/plans/physical/PhysicalDistribution.java: ## @@ -0,0 +1,42 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.operators.plans.physical; + +import org.apache.doris.nereids.operators.OperatorType; +import org.apache.doris.nereids.properties.DistributionSpec; +import org.apache.doris.nereids.trees.expressions.Expression; + +import java.util.List; + +/** + * Enforcer operator. + */ +public class PhysicalDistribution extends PhysicalUnaryOperator { + +protected DistributionSpec distributionSpec; + + +public PhysicalDistribution(DistributionSpec spec) { +super(OperatorType.PHYSICAL_DISTRIBUTION); +} + +@Override +public List getExpressions() { +return null; Review Comment: maybe we need to return an empty list ## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/GroupExpression.java: ## @@ -44,6 +44,8 @@ public class GroupExpression { // Mapping from output properties to the corresponding best cost, statistics, and child properties. private final Map>> lowestCostTable; +// Each physical group expression maintains mapping incoming requests to the corresponding child requests. +private final Map requestPropertiesMap; Review Comment: value need a list? 
## fe/fe-core/src/main/java/org/apache/doris/nereids/memo/GroupExpression.java: ## @@ -61,6 +63,14 @@ public GroupExpression(Operator op, List children) { this.ruleMasks = new BitSet(RuleType.SENTINEL.ordinal()); this.statDerived = false; this.lowestCostTable = Maps.newHashMap(); +this.requestPropertiesMap = Maps.newHashMap(); +} + +// TODO: rename +public PhysicalProperties getPropertyFromMap(PhysicalProperties requiredPropertySet) { Review Comment: ```suggestion public PhysicalProperties getPropertyFromMap(PhysicalProperties requiredProperties) { ``` ## fe/fe-core/src/main/java/org/apache/doris/nereids/properties/OrderKey.java: ## @@ -42,6 +42,15 @@ public OrderKey(Expression expr, boolean isAsc, boolean nullFirst) { this.nullFirst = nullFirst; } +/** + * Whether other `OrderKey` is satisfied the current `OrderKey`. + * + * @param other another OrderKey. + */ +public boolean matches(OrderKey other) { +return expr.equals(other.expr) && isAsc == other.isAsc && nullFirst == other.nullFirst; Review Comment: we need semantic equal -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us..
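The review comment on `JobContext` above suggests creating a new context when the cost upper bound changes instead of making the field mutable. A minimal sketch of that idea follows; `SketchJobContext` and `withCostUpperBound` are hypothetical names, and the real class also carries a planner context and required properties that are omitted here.

```java
// Hypothetical, simplified stand-in for JobContext; only the cost bound is modeled.
final class SketchJobContext {
    private final double costUpperBound;

    SketchJobContext(double costUpperBound) {
        this.costUpperBound = costUpperBound;
    }

    double getCostUpperBound() {
        return costUpperBound;
    }

    // Returns a new context with the tighter bound; this instance is unchanged.
    SketchJobContext withCostUpperBound(double newBound) {
        return new SketchJobContext(newBound);
    }
}
```

Because the parent context keeps its original bound, a job that derives a child context does not need a separate stack to restore the previous value once the child job finishes.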
[GitHub] [doris] morrySnow commented on a diff in pull request #10657: [feature](Nereids): enforcer job.
morrySnow commented on code in PR #10657: URL: https://github.com/apache/doris/pull/10657#discussion_r916707968 ## fe/fe-core/src/main/java/org/apache/doris/nereids/operators/OperatorType.java: ## @@ -23,9 +23,9 @@ * 1. ANY: match any operator * 2. MULTI: match multiple operators * 3. FIXED: the leaf node of pattern tree, which can be matched by a single operator - * but this operator cannot be used in rules + * but this operator cannot be used in rules * 4. MULTI_FIXED: the leaf node of pattern tree, which can be matched by multiple operators, - *but these operators cannot be used in rules + * but these operators cannot be used in rules Review Comment: maintain the indentation
[GitHub] [doris] wangbo commented on a diff in pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code
wangbo commented on code in PR #10695: URL: https://github.com/apache/doris/pull/10695#discussion_r916723622 ## be/src/olap/rowset/segment_v2/column_reader.cpp: ## @@ -960,13 +960,6 @@ void DefaultValueColumnIterator::insert_default_data(const TypeInfo* type_info, dst->insert_many_data(data_ptr, data_len, n); break; } -case OLAP_FIELD_TYPE_DATEV2: { -assert(type_size == sizeof(FieldTypeTraits::CppType)); //uint32_t - -int128 = *((FieldTypeTraits::CppType*)mem_value); -dst->insert_many_data(data_ptr, data_len, n); -break; Review Comment: Why was this removed?
[GitHub] [doris] EmmyMiao87 merged pull request #10678: [feature](nereides) support sort translator
EmmyMiao87 merged PR #10678: URL: https://github.com/apache/doris/pull/10678
[doris] branch master updated: [feature](nereides) support sort translator (#10678)
This is an automated email from the ASF dual-hosted git repository. lingmiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new e6da00bb26 [feature](nereides) support sort translator (#10678) e6da00bb26 is described below commit e6da00bb261ef6a04207336e318e64b2558193c3 Author: zhengshiJ <32082872+zhengs...@users.noreply.github.com> AuthorDate: Fri Jul 8 19:22:48 2022 +0800 [feature](nereides) support sort translator (#10678) Physical sort: * 1. Build sortInfo *There are two types of slotRef: *one is generated by the previous node, collectively called old. *the other is newly generated by the sort node, collectively called new. *Filling of sortInfo related data structures, *a. ordering use newSlotRef. *b. sortTupleSlotExprs use oldSlotRef. * 2. Create sortNode * 3. Create mergeFragment TODO: 1.Currently, columns that do not exist in select but exist in order by cannot be parsed. 
eg: select key from table order by value; 2.For the combination of Literal and slotRefrance in select, there is a problem with parsing, eg: select key ,(10-value) from table; --- .../java/org/apache/doris/analysis/SortInfo.java | 4 ++ .../org/apache/doris/nereids/NereidsPlanner.java | 6 +-- .../glue/translator/PhysicalPlanTranslator.java| 56 +++--- .../glue/translator/PlanTranslatorContext.java | 13 - .../java/org/apache/doris/planner/PlanNode.java| 7 +++ .../java/org/apache/doris/planner/SortNode.java| 23 + .../java/org/apache/doris/qe/StmtExecutor.java | 1 - 7 files changed, 98 insertions(+), 12 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java index 05763e8d07..63394db4cd 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java @@ -141,6 +141,10 @@ public class SortInfo { this.sortTupleSlotExprs = sortTupleSlotExprs; } +public void setSortTupleDesc(TupleDescriptor tupleDesc) { +sortTupleDesc = tupleDesc; +} + public TupleDescriptor getSortTupleDescriptor() { return sortTupleDesc; } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java index bc7aed0bf1..23fafa518d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java @@ -25,7 +25,6 @@ import org.apache.doris.analysis.StatementBase; import org.apache.doris.analysis.TupleDescriptor; import org.apache.doris.analysis.TupleId; import org.apache.doris.common.AnalysisException; -import org.apache.doris.common.Id; import org.apache.doris.common.UserException; import org.apache.doris.nereids.glue.LogicalPlanAdapter; import org.apache.doris.nereids.glue.translator.PhysicalPlanTranslator; @@ -39,7 +38,6 @@ import 
org.apache.doris.nereids.memo.GroupExpression; import org.apache.doris.nereids.memo.Memo; import org.apache.doris.nereids.properties.PhysicalProperties; import org.apache.doris.nereids.trees.expressions.NamedExpression; -import org.apache.doris.nereids.trees.expressions.Slot; import org.apache.doris.nereids.trees.plans.Plan; import org.apache.doris.nereids.trees.plans.logical.LogicalPlan; import org.apache.doris.nereids.trees.plans.physical.PhysicalPlan; @@ -104,8 +102,8 @@ public class NereidsPlanner extends Planner { outputCandidates.put(slotDescriptor.getId().asInt(), slotRef); } } -physicalPlan.getOutput().stream().map(Slot::getExprId) -.map(Id::asInt).forEach(i -> outputExprs.add(outputCandidates.get(i))); +physicalPlan.getOutput().stream() +.forEach(i -> outputExprs.add(planTranslatorContext.findExpr(i))); root.setOutputExprs(outputExprs); root.getPlanRoot().convertToVectoriezd(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index 511ac61fb4..0dd6b9e859 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -204,6 +204,26 @@ public class PhysicalPlanTranslator extends PlanOperatorVisitor sort, PlanTranslatorContext context) { @@ -211,24 +231,35 @@ public class PhysicalPlanTranslator extends PlanOperatorVisitor execOrderingExprList = Lists.newArr
[GitHub] [doris] github-actions[bot] commented on pull request #10699: [enhancement] Improve the availability of broker load
github-actions[bot] commented on PR #10699: URL: https://github.com/apache/doris/pull/10699#issuecomment-1178875878 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10699: [enhancement] Improve the availability of broker load
github-actions[bot] commented on PR #10699: URL: https://github.com/apache/doris/pull/10699#issuecomment-1178875855 PR approved by at least one committer and no changes requested.
[GitHub] [doris] zhengshubin opened a new issue, #10704: [Feature] Use Function to set the default value when add new column
zhengshubin opened a new issue, #10704: URL: https://github.com/apache/doris/issues/10704

### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description
Currently, if I want to add a DATE column whose default value is derived from an existing DATETIME column, I must rebuild the table. If a function could be used to set the default value when adding a new column, it would improve work efficiency.

### Use case
The SQL statement would look like this: ALTER TABLE t1 ADD COLUMN dt DATE DEFAULT toDate(oldColumn)

### Related issues
_No response_

### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] github-actions[bot] commented on pull request #10650: [Bug][Function] pass intermediate argument list to be
github-actions[bot] commented on PR #10650: URL: https://github.com/apache/doris/pull/10650#issuecomment-1178894215 PR approved by anyone and no changes requested.
[GitHub] [doris] Gabriel39 commented on a diff in pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code
Gabriel39 commented on code in PR #10695: URL: https://github.com/apache/doris/pull/10695#discussion_r916741114 ## be/src/olap/rowset/segment_v2/column_reader.cpp: ## @@ -960,13 +960,6 @@ void DefaultValueColumnIterator::insert_default_data(const TypeInfo* type_info, dst->insert_many_data(data_ptr, data_len, n); break; } -case OLAP_FIELD_TYPE_DATEV2: { -assert(type_size == sizeof(FieldTypeTraits::CppType)); //uint32_t - -int128 = *((FieldTypeTraits::CppType*)mem_value); -dst->insert_many_data(data_ptr, data_len, n); -break; Review Comment: Because this is the same as the default behavior.
[GitHub] [doris] github-actions[bot] commented on pull request #10655: [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2]
github-actions[bot] commented on PR #10655: URL: https://github.com/apache/doris/pull/10655#issuecomment-1178901442 PR approved by at least one committer and no changes requested.
[GitHub] [doris-flink-connector] hf200012 opened a new pull request, #44: remove DISCLAIMER
hf200012 opened a new pull request, #44: URL: https://github.com/apache/doris-flink-connector/pull/44

remove DISCLAIMER

# Proposed changes
Issue Number: close #xxx

## Problem Summary:
Describe the overview of changes.

## Checklist(Required)
1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments
If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris-flink-connector] hf200012 closed pull request #44: remove DISCLAIMER
hf200012 closed pull request #44: remove DISCLAIMER URL: https://github.com/apache/doris-flink-connector/pull/44
[GitHub] [doris-flink-connector] hf200012 opened a new pull request, #45: remove DISCLAIMER
hf200012 opened a new pull request, #45: URL: https://github.com/apache/doris-flink-connector/pull/45

remove DISCLAIMER

# Proposed changes
Issue Number: close #xxx

## Problem Summary:
Describe the overview of changes.

## Checklist(Required)
1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments
If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris-flink-connector] branch master updated: remove DISCLAIMER (#45)
This is an automated email from the ASF dual-hosted git repository. jiafengzheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-flink-connector.git

The following commit(s) were added to refs/heads/master by this push:
new e1e2f13 remove DISCLAIMER (#45)
e1e2f13 is described below

commit e1e2f133b1457828fbc5b4f9d126bc362f102fa1
Author: jiafeng.zhang
AuthorDate: Fri Jul 8 20:00:39 2022 +0800

remove DISCLAIMER (#45)

remove DISCLAIMER
---
DISCLAIMER | 12
1 file changed, 12 deletions(-)

diff --git a/DISCLAIMER b/DISCLAIMER
deleted file mode 100644
index 2769edd..000
--- a/DISCLAIMER
+++ /dev/null
@@ -1,12 +0,0 @@
-Apache Doris is an effort undergoing incubation at The
-Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC.
-
-Incubation is required of all newly accepted
-projects until a further review indicates that the
-infrastructure, communications, and decision making process have
-stabilized in a manner consistent with other successful ASF
-projects.
-
-While incubation status is not necessarily a reflection
-of the completeness or stability of the code, it does indicate
-that the project has yet to be fully endorsed by the ASF.
[GitHub] [doris-flink-connector] hf200012 merged pull request #45: remove DISCLAIMER
hf200012 merged PR #45: URL: https://github.com/apache/doris-flink-connector/pull/45
[GitHub] [doris] yangzhg opened a new issue, #10705: [Bug] Fe crash by bdbje
yangzhg opened a new issue, #10705: URL: https://github.com/apache/doris/issues/10705

### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version
master

### What's Wrong?
2022-07-05 23:27:27,685 WARN (replayer|79) [Catalog.replayJournal():2506] replay journal cost too much time: 1001 replayedJournalId: 1160191
2022-07-05 23:27:27,687 INFO (replayer|79) [Catalog.replayJournal():2478] replayed journal id is 1160191, replay to journal id is 1160192
2022-07-05 23:27:28,687 WARN (replayer|79) [BDBJournalCursor.next():148] Catch an exception when get next JournalEntity. key:1160192
com.sleepycat.je.LockTimeoutException: (JE 7.3.7) Lock expired. Locker 497486454 -1_replayer_ReplicaThreadLocker: waited for lock on database=1150001 LockAddr:1459556016 LSN=0x43/0x6648b type=READ grant=WAIT_NEW timeoutMillis=1000 startTime=1657034847687 endTime=1657034848687 Owners: [604669736 -1495664_ReplayThread_ReplayTxn" type="WRITE"/>] Waiters: []
  at com.sleepycat.je.txn.LockManager.makeTimeoutException(LockManager.java:1117) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.txn.LockManager.waitForLock(LockManager.java:606) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.txn.LockManager.lock(LockManager.java:345) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.txn.BasicLocker.lockInternal(BasicLocker.java:124) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.rep.txn.ReplicaThreadLocker.lockInternal(ReplicaThreadLocker.java:63) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.txn.Locker.lock(Locker.java:499) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:3585) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:3316) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.dbi.CursorImpl.lockLNAndCheckDefunct(CursorImpl.java:2138) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.dbi.CursorImpl.searchExact(CursorImpl.java:1950) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.Cursor.searchExact(Cursor.java:4194) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.Cursor.searchNoDups(Cursor.java:4055) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.Cursor.search(Cursor.java:3857) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.Cursor.getInternal(Cursor.java:1284) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.Database.get(Database.java:1271) ~[je-7.3.7.jar:7.3.7]
  at com.sleepycat.je.Database.get(Database.java:1330) ~[je-7.3.7.jar:7.3.7]
  at org.apache.doris.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:108) [palo-fe.jar:0.15-SNAPSHOT]
  at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2488) [palo-fe.jar:0.15-SNAPSHOT]
  at org.apache.doris.catalog.Catalog$3.runOneCycle(Catalog.java:2277) [palo-fe.jar:0.15-SNAPSHOT]
  at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:0.15-SNAPSHOT]

### What You Expected?
normal

### How to Reproduce?
_No response_

### Anything Else?
_No response_

### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] github-actions[bot] commented on pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code
github-actions[bot] commented on PR #10695: URL: https://github.com/apache/doris/pull/10695#issuecomment-1178918184 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code
github-actions[bot] commented on PR #10695: URL: https://github.com/apache/doris/pull/10695#issuecomment-1178918203 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10698: [Doc]broker load rpc timeout problem FQA
github-actions[bot] commented on PR #10698: URL: https://github.com/apache/doris/pull/10698#issuecomment-1178919064 PR approved by at least one committer and no changes requested.
[GitHub] [doris] liaoxin01 opened a new pull request, #10706: [feature-wip](unique-key-merge-on-write) add bloom filter index for primary key, DSIP-018[1.2]
liaoxin01 opened a new pull request, #10706:
URL: https://github.com/apache/doris/pull/10706

# Proposed changes

Add a Bloom filter index for the primary key. This patch is step 1.2 of the schedule. For details, see DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] morningman commented on pull request #10702: [refactor] Rename Catalog to Env
morningman commented on PR #10702: URL: https://github.com/apache/doris/pull/10702#issuecomment-1178939953 > Why do we need this modification? As we discussed in dev@doris: https://lists.apache.org/thread/tr2fgydon657wvoy8vf1ccr8z9xos693
[doris] annotated tag 1.1.0-rc04 updated (a6eb47ac08 -> 113293c989)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a change to annotated tag 1.1.0-rc04 in repository https://gitbox.apache.org/repos/asf/doris.git *** WARNING: tag 1.1.0-rc04 was modified! *** from a6eb47ac08 (commit) to 113293c989 (tag) tagging a6eb47ac0875ed51291ed7b1cd990d40f7d901de (commit) by morningman on Fri Jul 8 20:41:15 2022 +0800 - Log - 1.1.0-rc04 --- No new revisions were added by this update. Summary of changes: - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] yiguolei merged pull request #10691: [refactor] update stop_be.sh to avoid error message
yiguolei merged PR #10691: URL: https://github.com/apache/doris/pull/10691
[doris] branch master updated: [refactor] update stop_be.sh to avoid error message (#10691)
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
new 6f29a8ac0d [refactor] update stop_be.sh to avoid error message (#10691)
6f29a8ac0d is described below

commit 6f29a8ac0d3de9dcb7082681fdc6700b1952ea95
Author: minghong
AuthorDate: Fri Jul 8 20:49:00 2022 +0800

[refactor] update stop_be.sh to avoid error message (#10691)

* update stop_be.sh to avoid error message
* update stop_be.sh
---
bin/stop_be.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bin/stop_be.sh b/bin/stop_be.sh
index 9d65d73307..e189f91112 100755
--- a/bin/stop_be.sh
+++ b/bin/stop_be.sh
@@ -23,7 +23,7 @@ export DORIS_HOME=`cd "$curdir/.."; pwd`
export PID_DIR=`cd "$curdir"; pwd`
signum=9
-if [ $1 = "--grace" ]; then
+if [[ $1 = "--grace" ]]; then
signum=15
fi

- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] yiguolei closed issue #10641: [Bug] Core dump when aggregate function with no group by
yiguolei closed issue #10641: [Bug] Core dump when aggregate function with no group by URL: https://github.com/apache/doris/issues/10641
[GitHub] [doris] yiguolei merged pull request #10650: [Bug][Function] pass intermediate argument list to be
yiguolei merged PR #10650: URL: https://github.com/apache/doris/pull/10650
[doris] branch master updated: [Bug][Function] pass intermediate argument list to be (#10650)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new f58a071605 [Bug][Function] pass intermediate argument list to be (#10650) f58a071605 is described below commit f58a071605a1aaa7a68a99cdd2f098a5868787e4 Author: Pxl <952130...@qq.com> AuthorDate: Fri Jul 8 20:50:05 2022 +0800 [Bug][Function] pass intermediate argument list to be (#10650) --- .../aggregate_function_orthogonal_bitmap.cpp | 2 -- .../aggregate_function_topn.cpp| 5 + .../aggregate_functions/aggregate_function_topn.h | 8 be/src/vec/data_types/data_type_factory.hpp| 4 be/src/vec/exprs/vectorized_agg_fn.cpp | 22 ++ be/src/vec/exprs/vectorized_agg_fn.h | 2 +- .../org/apache/doris/analysis/AggregateInfo.java | 15 --- .../apache/doris/analysis/FunctionCallExpr.java| 20 +++- .../org/apache/doris/analysis/FunctionParams.java | 15 +++ gensrc/thrift/Exprs.thrift | 1 + gensrc/thrift/Types.thrift | 1 + 11 files changed, 52 insertions(+), 43 deletions(-) diff --git a/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp b/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp index 470a6c8388..9794a72090 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp +++ b/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp @@ -34,8 +34,6 @@ AggregateFunctionPtr create_aggregate_function_orthogonal(const std::string& nam LOG(WARNING) << "Incorrect number of arguments for aggregate function " << name; return nullptr; } else if (argument_types.size() == 1) { -// only used at AGGREGATE (merge finalize) for variadic function -// and for orthogonal_bitmap_union_count function return std::make_shared>>(argument_types); } else { const IDataType& argument_type = *argument_types[1]; diff --git a/be/src/vec/aggregate_functions/aggregate_function_topn.cpp 
b/be/src/vec/aggregate_functions/aggregate_function_topn.cpp index 04df93ce67..19f52fbff8 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_topn.cpp +++ b/be/src/vec/aggregate_functions/aggregate_function_topn.cpp @@ -23,10 +23,7 @@ AggregateFunctionPtr create_aggregate_function_topn(const std::string& name, const DataTypes& argument_types, const Array& parameters, const bool result_is_nullable) { -if (argument_types.size() == 1) { -return AggregateFunctionPtr( -new AggregateFunctionTopN(argument_types)); -} else if (argument_types.size() == 2) { +if (argument_types.size() == 2) { return AggregateFunctionPtr( new AggregateFunctionTopN>( argument_types)); diff --git a/be/src/vec/aggregate_functions/aggregate_function_topn.h b/be/src/vec/aggregate_functions/aggregate_function_topn.h index 97ac5c7cba..ae9fdf322d 100644 --- a/be/src/vec/aggregate_functions/aggregate_function_topn.h +++ b/be/src/vec/aggregate_functions/aggregate_function_topn.h @@ -168,14 +168,6 @@ struct StringDataImplTopN { } }; -struct AggregateFunctionTopNImplEmpty { -// only used at AGGREGATE (merge finalize) -static void add(AggregateFunctionTopNData& __restrict place, const IColumn** columns, -size_t row_num) { -LOG(FATAL) << "AggregateFunctionTopNImplEmpty do not support add()"; -} -}; - template struct AggregateFunctionTopNImplInt { static void add(AggregateFunctionTopNData& __restrict place, const IColumn** columns, diff --git a/be/src/vec/data_types/data_type_factory.hpp b/be/src/vec/data_types/data_type_factory.hpp index 08dc6a9f31..59740debd3 100644 --- a/be/src/vec/data_types/data_type_factory.hpp +++ b/be/src/vec/data_types/data_type_factory.hpp @@ -102,6 +102,10 @@ public: DataTypePtr create_data_type(const arrow::DataType* type, bool is_nullable); +DataTypePtr create_data_type(const TTypeDesc& raw_type) { +return create_data_type(TypeDescriptor::from_thrift(raw_type), raw_type.is_nullable); +} + private: DataTypePtr _create_primitive_data_type(const FieldType& type) const; 
diff --git a/be/src/vec/exprs/vectorized_agg_fn.cpp b/be/src/vec/exprs/vectorized_agg_fn.cpp index ad7066a9a4..b7e14817f1 100644 --- a/be/src/vec/exprs/vectorized_agg_fn.cpp +++ b/be/src/vec/exprs/vectorized_agg_fn.cpp @@ -33,7 +33,6 @@ AggFnEvaluator::AggFnEvaluator(const TExprNode& desc) : _fn(desc.fn), _is_merge(desc.agg_expr.is_merge_agg),
[GitHub] [doris] yiguolei merged pull request #10577: [enhancement](regression-test) add real data path for regression test.
yiguolei merged PR #10577: URL: https://github.com/apache/doris/pull/10577
[doris] branch master updated (f58a071605 -> 2b2bf017f8)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/doris.git from f58a071605 [Bug][Function] pass intermediate argument list to be (#10650) add 2b2bf017f8 [enhancement](regression-test) add real data path for regression test. (#10577) No new revisions were added by this update. Summary of changes: regression-test/conf/regression-conf.groovy| 1 + .../org/apache/doris/regression/Config.groovy | 13 ++-- .../apache/doris/regression/ConfigOptions.groovy | 9 ++ .../org/apache/doris/regression/suite/Suite.groovy | 5 .../doris/regression/suite/SuiteContext.groovy | 35 ++ 5 files changed, 61 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10701: [refactor](predicate) refactor predicates in scan node
BiteThet commented on code in PR #10701:
URL: https://github.com/apache/doris/pull/10701#discussion_r916809110

## be/src/vec/exec/volap_scan_node.cpp: ##
@@ -937,7 +944,7 @@ std::pair VOlapScanNode::should_push_down_eq_predicate(doris::SlotD return result_pair; } -template +template Status VOlapScanNode::change_fixed_value_range(ColumnValueRange& temp_range, PrimitiveType type, void* value, const ChangeFixedValueRangeFunc& func) {

Review Comment:
Is `T` always equal to `type`?
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
BiteThet commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916821559

## be/src/vec/common/columns_hashing.h: ##
@@ -111,29 +111,48 @@ struct HashMethodString : public columns_hashing_impl::HashMethodBase< * That is, for example, for strings, it contains first the serialized length of the string, and then the bytes. * Therefore, when aggregating by several strings, there is no ambiguity. */ -template +template struct HashMethodSerialized -: public columns_hashing_impl::HashMethodBase, Value, - Mapped, false> { -using Self = HashMethodSerialized; +: public columns_hashing_impl::HashMethodBase< + HashMethodSerialized, Value, Mapped, false> { +using Self = HashMethodSerialized; using Base = columns_hashing_impl::HashMethodBase; +using KeyHolderType = +std::conditional_t; ColumnRawPtrs key_columns; size_t keys_size; +const StringRef* keys; HashMethodSerialized(const ColumnRawPtrs& key_columns_, const Sizes& /*key_sizes*/, const HashMethodContextPtr&) : key_columns(key_columns_), keys_size(key_columns_.size()) {} +void set_serialized_keys(StringRef* keys_) { keys = keys_; }

Review Comment:
Maybe we can add const here.
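The header comment quoted in this hunk says that a serialized key stores the string length first and then the bytes, so aggregating by several strings is unambiguous. A minimal sketch of why that matters is below. The helper names `serialize_row` and `concat_row` are hypothetical and are not taken from the PR, and the fixed 8-byte length prefix is an assumption made only for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): serialize one row of string keys
// as [length][bytes] pairs so the combined key keeps its field boundaries.
std::string serialize_row(const std::vector<std::string>& keys) {
    std::string out;
    for (const std::string& k : keys) {
        const std::uint64_t len = k.size();
        // Fixed-width length prefix first (assumed 8 bytes, host byte order)...
        out.append(reinterpret_cast<const char*>(&len), sizeof(len));
        // ...then the raw bytes of the key.
        out.append(k);
    }
    return out;
}

// Naive concatenation, shown only for contrast: the boundaries are lost,
// so ("ab", "c") and ("a", "bc") collapse into the same key "abc".
std::string concat_row(const std::vector<std::string>& keys) {
    std::string out;
    for (const std::string& k : keys) {
        out += k;
    }
    return out;
}
```

With the length prefix, the rows `("ab", "c")` and `("a", "bc")` produce different serialized keys, whereas plain concatenation would put them in the same aggregation group.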
[GitHub] [doris] morningman merged pull request #10655: [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2]
morningman merged PR #10655: URL: https://github.com/apache/doris/pull/10655
[doris] branch master updated: [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)
This is an automated email from the ASF dual-hosted git repository. morningman pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/master by this push: new feeef7e4da [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655) feeef7e4da is described below commit feeef7e4dab86c87a15ba724964627a9768ca682 Author: zhannngchen <48427519+zhannngc...@users.noreply.github.com> AuthorDate: Fri Jul 8 21:39:13 2022 +0800 [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655) Add interfaces for segment key bounds, key bounds will be used to speed up point lookup on the primary key index of each segment. For the detail, see DSIP-018:https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model KeyBounds will be updated by BetaRowsetWriter, will be used to construct a RowsetTree(based on IntervalTree, will be added through next patch) --- be/src/olap/rowset/rowset.h | 5 + be/src/olap/rowset/rowset_meta.h | 13 + gensrc/proto/olap_file.proto | 8 3 files changed, 26 insertions(+) diff --git a/be/src/olap/rowset/rowset.h b/be/src/olap/rowset/rowset.h index 158848be89..ec2a39652b 100644 --- a/be/src/olap/rowset/rowset.h +++ b/be/src/olap/rowset/rowset.h @@ -258,6 +258,11 @@ public: } } +virtual Status get_segments_key_bounds(std::vector* segments_key_bounds) { +_rowset_meta->get_segments_key_bounds(segments_key_bounds); +return Status::OK(); +} + protected: friend class RowsetFactory; diff --git a/be/src/olap/rowset/rowset_meta.h b/be/src/olap/rowset/rowset_meta.h index e4153e2345..c91fe0469d 100644 --- a/be/src/olap/rowset/rowset_meta.h +++ b/be/src/olap/rowset/rowset_meta.h @@ -298,6 +298,19 @@ public: return score; } +void get_segments_key_bounds(std::vector* segments_key_bounds) const { +for (const KeyBoundsPB& key_range : 
_rowset_meta_pb.segments_key_bounds()) { +segments_key_bounds->push_back(key_range); +} +} + +void set_segments_key_bounds(const std::vector& segments_key_bounds) { +for (const KeyBoundsPB& key_bounds : segments_key_bounds) { +KeyBoundsPB* new_key_bounds = _rowset_meta_pb.add_segments_key_bounds(); +*new_key_bounds = key_bounds; +} +} + const AlphaRowsetExtraMetaPB& alpha_rowset_extra_meta_pb() const { return _rowset_meta_pb.alpha_rowset_extra_meta_pb(); } diff --git a/gensrc/proto/olap_file.proto b/gensrc/proto/olap_file.proto index 0d484a292d..4385d5803d 100644 --- a/gensrc/proto/olap_file.proto +++ b/gensrc/proto/olap_file.proto @@ -53,6 +53,11 @@ enum SegmentsOverlapPB { NONOVERLAPPING = 2; } +message KeyBoundsPB { +required bytes min_key = 1; +required bytes max_key = 2; +} + message RowsetMetaPB { required int64 rowset_id = 1; optional int64 partition_id = 2; @@ -99,6 +104,9 @@ message RowsetMetaPB { optional int64 oldest_write_timestamp = 25 [default = -1]; // latest write time optional int64 newest_write_timestamp = 26 [default = -1]; +// the encoded segment min/max key of segments in this rowset, +// only used in unique key data model with primary_key_index support. +repeated KeyBoundsPB segments_key_bounds = 27; // spare field id for future use optional AlphaRowsetExtraMetaPB alpha_rowset_extra_meta_pb = 50; // to indicate whether the data between the segments overlap - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
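The commit message above says the key bounds will be used to speed up point lookups on the primary key index of each segment. A minimal sketch of that idea follows, assuming that encoded keys compare bytewise in the same order as the logical keys. `KeyBounds` is a hypothetical stand-in for `KeyBoundsPB`, `candidate_segments` is an invented name, and the linear scan is a simplification: the commit message says the real lookup will go through a RowsetTree based on an interval tree, which is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for KeyBoundsPB: the encoded min/max key of one segment.
struct KeyBounds {
    std::string min_key;
    std::string max_key;
};

// Returns the indexes of segments whose [min_key, max_key] range may contain
// `key`. Segments outside the range can be skipped without touching their
// primary key index. Assumption: bytewise order of encoded keys matches the
// logical key order.
std::vector<std::size_t> candidate_segments(const std::vector<KeyBounds>& bounds,
                                            const std::string& key) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        if (bounds[i].min_key <= key && key <= bounds[i].max_key) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

Only the returned segments need an index probe; overlapping segments can yield more than one candidate, and a key outside every range needs no probe at all.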
[GitHub] [doris] Gabriel39 opened a new issue, #10707: [Bug] InPredicate core dump in runtime filer
Gabriel39 opened a new issue, #10707:
URL: https://github.com/apache/doris/issues/10707

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

An InPredicate with no child is a predicate that is always false. This can occur for runtime filters, but it causes a core dump in an InPredicate DCHECK.

### What's Wrong?

An InPredicate with no child is a predicate that is always false. This can occur for runtime filters, but it causes a core dump in an InPredicate DCHECK.

### What You Expected?

Works well.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
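The issue above states that an IN predicate with no child is always false. A minimal sketch of treating the empty set as a valid input instead of asserting on it is shown below. The function `eval_in` and its integer-only signature are hypothetical and do not reflect the Doris vectorized InPredicate interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical evaluation of `x IN (in_set)` over an int column.
// Returns one byte per row: 1 if the row matches, 0 otherwise.
std::vector<std::uint8_t> eval_in(const std::vector<int>& column,
                                  const std::unordered_set<int>& in_set) {
    std::vector<std::uint8_t> result(column.size(), 0);
    if (in_set.empty()) {
        // An empty IN list matches nothing, so every row is false.
        // This is a normal early return, not an error condition.
        return result;
    }
    for (std::size_t i = 0; i < column.size(); ++i) {
        result[i] = in_set.count(column[i]) ? 1 : 0;
    }
    return result;
}
```

In this sketch the empty case needs no special assertion: the general loop would also produce all zeros, and the early return only makes the "always false" meaning explicit.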
[GitHub] [doris] weizuo93 opened a new issue, #10708: [Feature] Add interface to check tablet segment lost
weizuo93 opened a new issue, #10708:
URL: https://github.com/apache/doris/issues/10708

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

Some exceptions may cause a segment to be lost on a BE node while the metadata still shows the tablet as normal. FE does not detect this abnormal replica, so it cannot be repaired automatically. When a query arrives, it fails with the exception `failed to initialize storage reader`. I think we should be able to check for lost tablet segments.

### Use case

_No response_

### Related issues

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] Gabriel39 opened a new pull request, #10709: [BUG] fix DCHECK failed for vectorized InPredicate
Gabriel39 opened a new pull request, #10709:
URL: https://github.com/apache/doris/pull/10709

# Proposed changes

Issue Number: close #10707

## Problem Summary:

Describe the overview of changes.

## Checklist(Required)

1. Does it affect the original behavior: (Yes/No/I Don't know)
2. Has unit tests been added: (Yes/No/No Need)
3. Has document been added or modified: (Yes/No/No Need)
4. Does it need to update dependencies: (Yes/No)
5. Are there any changes that cannot be rolled back: (Yes/No)

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #10709: [BUG] fix DCHECK failed for vectorized InPredicate
github-actions[bot] commented on PR #10709: URL: https://github.com/apache/doris/pull/10709#issuecomment-1179020003 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10709: [BUG] fix DCHECK failed for vectorized InPredicate
github-actions[bot] commented on PR #10709: URL: https://github.com/apache/doris/pull/10709#issuecomment-1179020063 PR approved by anyone and no changes requested.
[GitHub] [doris] jackwener opened a new pull request, #10710: [improve](planner): split output expr to multiple line.
jackwener opened a new pull request, #10710:
URL: https://github.com/apache/doris/pull/10710

# Proposed changes

Issue Number: close #xxx

## Problem Summary:

Describe the overview of changes.

Split the output exprs across multiple lines, one expr per line.

```
+-------------------------------------+
| Explain String                      |
+-------------------------------------+
| PLAN FRAGMENT 0                     |
| OUTPUT EXPRS:                       |
| `user_id`                           |
| `default_cluster:test`.`tbl`.`date` |
| `city`                              |
| `default_cluster:test`.`tbl`.`age`  |
+-------------------------------------+
```

## Checklist(Required)

1. Does it affect the original behavior: Yes
2. Has unit tests been added: No need
3. Has document been added or modified: No need
4. Does it need to update dependencies: No
5. Are there any changes that cannot be rolled back: No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] weizuo93 opened a new pull request, #10711: [Feature] Add interface to check tablet segment lost
weizuo93 opened a new pull request, #10711:
URL: https://github.com/apache/doris/pull/10711

# Proposed changes

Issue Number: close #10708

## Problem Summary:

Some exceptions may cause a segment to be lost on a BE node while the metadata still shows the tablet as normal. FE does not detect this abnormal replica, so it cannot be repaired automatically. When a query arrives, it fails with the exception `failed to initialize storage reader`. I think we should be able to check for lost tablet segments.

This patch adds an interface to check for lost tablet segments.

```
curl -X GET http://be_host:webserver_port/api/check_tablet_segment_existence
```

The interface returns all tablets on the current BE node that have lost a segment.

```
{
    msg: "Succeed to check all tablet segment",
    num: 3,
    bad_tablets: [
        11190,
        11210,
        11216
    ],
    host: "172.3.0.101"
}
```

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (No Need)
3. Has document been added or modified: (Yes)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (No)
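The check described in this PR can be sketched roughly as follows: for every tablet, compare the segment files that the metadata expects against what actually exists, and report the tablets with anything missing. Everything in this sketch is an assumption made for illustration, including the `TabletSegments` type, the `find_bad_tablets` name, and the injected `exists` callback. It does not show how the BE actually stores or enumerates segments.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical input: tablet id -> segment file paths its metadata expects.
using TabletSegments = std::map<std::int64_t, std::vector<std::string>>;

// Returns the ids of tablets that have at least one expected segment missing.
// `exists` is injected so the same logic can run against a real filesystem
// or against a fake in a unit test.
std::vector<std::int64_t> find_bad_tablets(
        const TabletSegments& tablets,
        const std::function<bool(const std::string&)>& exists) {
    std::vector<std::int64_t> bad;
    for (const auto& entry : tablets) {
        for (const std::string& path : entry.second) {
            if (!exists(path)) {
                bad.push_back(entry.first);
                break;  // one missing segment is enough to flag the tablet
            }
        }
    }
    return bad;
}
```

The result corresponds to the `bad_tablets` array in the response shown above, and its size corresponds to `num`.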
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
BiteThet commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916847546

## be/src/vec/columns/column_nullable.cpp: ##
@@ -134,6 +134,24 @@ const char* ColumnNullable::deserialize_and_insert_from_arena(const char* pos) { return pos; } +size_t ColumnNullable::get_max_row_byte_size() const { +constexpr auto flag_size = sizeof(get_null_map_data()[0]);

Review Comment:
Maybe we can just use NullMap::T
[GitHub] [doris] morningman commented on a diff in pull request #10620: [Enhancement][multi-catalog]Impl parallel for file scanner to improve the scanner performance
morningman commented on code in PR #10620:
URL: https://github.com/apache/doris/pull/10620#discussion_r916803679

## fe/fe-core/src/main/java/org/apache/doris/common/Config.java: ##
@@ -1654,6 +1654,12 @@ public class Config extends ConfigBase { @ConfField(mutable = false, masterOnly = true) public static boolean enable_multi_catalog = false; // 1 min +@ConfField(mutable = true, masterOnly = true)

Review Comment:
Both `file_scan_node_spilt_size` and `file_scan_node_spilt_num` are NOT `masterOnly` configs.

## fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java: ##
@@ -134,6 +135,30 @@ public int numBackends() { } } +private static class FileSpiltStrategy {

Review Comment:
```suggestion
private static class FileSplitStrategy {
```
And fix all the other `split` typos.

## fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java: ##
@@ -340,6 +380,7 @@ protected void toThrift(TPlanNode planNode) { @Override public List getScanRangeLocations(long maxScanRangeLength) { +LOG.info("There is {} scanRangeLocations for execution.", scanRangeLocations.size());

Review Comment:
```suggestion
LOG.debug("There is {} scanRangeLocations for execution.", scanRangeLocations.size());
```
[GitHub] [doris] github-actions[bot] commented on pull request #10710: [improve](planner): split output expr to multiple line.
github-actions[bot] commented on PR #10710: URL: https://github.com/apache/doris/pull/10710#issuecomment-1179030775 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #10710: [improve](planner): split output expr to multiple line.
github-actions[bot] commented on PR #10710: URL: https://github.com/apache/doris/pull/10710#issuecomment-1179030745 PR approved by at least one committer and no changes requested.
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
BiteThet commented on code in PR #10700: URL: https://github.com/apache/doris/pull/10700#discussion_r916864491 ## be/src/vec/columns/column_nullable.cpp: ## @@ -134,6 +134,24 @@ const char* ColumnNullable::deserialize_and_insert_from_arena(const char* pos) { return pos; } +size_t ColumnNullable::get_max_row_byte_size() const { +constexpr auto flag_size = sizeof(get_null_map_data()[0]); +return flag_size + get_nested_column().get_max_row_byte_size(); +} + +void ColumnNullable::serialize_vec(std::vector& keys, size_t num_rows, + size_t max_row_byte_size) const { +const auto& arr = get_null_map_data(); +static constexpr auto s = sizeof(arr[0]); +for (size_t i = 0; i < num_rows; ++i) { +auto* val = const_cast(keys[i].data + keys[i].size); +*val = (arr[i] ? 1 : 0); Review Comment: Can we just use `*val = arr[i]`?
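To illustrate the suggestion above, here is a minimal sketch of writing the null flag by direct copy. It assumes the null map only ever stores 0 or 1, in which case the direct copy and `arr[i] ? 1 : 0` produce the same byte. `KeySlot` and `append_null_flags` are hypothetical names for illustration and are not part of the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one key buffer: `data` points at preallocated
// storage and `size` counts the bytes written so far.
struct KeySlot {
    char* data;
    std::size_t size;
};

// Appends one null-flag byte per row to the matching key slot.
// Assumption: every slot has at least one spare byte, and the null map
// holds only 0 or 1, so no `? 1 : 0` normalization is needed.
inline void append_null_flags(const std::vector<std::uint8_t>& null_map,
                              std::vector<KeySlot>& keys) {
    assert(null_map.size() == keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i) {
        char* val = keys[i].data + keys[i].size;
        *val = static_cast<char>(null_map[i]); // direct copy of the flag
        keys[i].size += sizeof(null_map[0]);   // advance by one flag byte
    }
}
```

If the null map could ever hold values other than 0 and 1, the two forms would differ, and the normalizing form would be the safer choice.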
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
BiteThet commented on code in PR #10700: URL: https://github.com/apache/doris/pull/10700#discussion_r916867517 ## be/src/vec/exec/vaggregation_node.cpp: ## @@ -1034,6 +1049,12 @@ Status AggregationNode::_merge_with_serialized_key(Block* block) { using HashMethodType = std::decay_t; using AggState = typename HashMethodType::State; AggState state(key_columns, _probe_key_sz, nullptr); +if constexpr (ColumnsHashing::IsPreSerializedKeysHashMethodTraits< + AggState>::value) { +SCOPED_TIMER(_serialize_key_timer); Review Comment: Maybe we can abstract this duplicated code into a shared helper.
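One way to read the abstraction suggested above is a small template helper that wraps the repeated `if constexpr` check. The sketch below is only an illustration under assumptions: `IsPreSerialized`, `PlainState`, `SerializedState`, `Method`, and `pre_serialize_if_needed` are hypothetical stand-ins, not the Doris types, and the timer from the diff is omitted. It requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical trait: true only for states that want pre-serialized keys.
template <typename State>
struct IsPreSerialized : std::false_type {};

struct PlainState {};
struct SerializedState {};

template <>
struct IsPreSerialized<SerializedState> : std::true_type {};

// Hypothetical aggregation method that records how many rows it serialized.
struct Method {
    std::size_t serialized_rows = 0;
    void serialize_keys(std::size_t num_rows) { serialized_rows += num_rows; }
};

// Shared helper: each call site invokes this instead of repeating the
// `if constexpr` block. Returns true when keys were serialized.
template <typename State, typename M>
bool pre_serialize_if_needed(M& method, std::size_t num_rows) {
    if constexpr (IsPreSerialized<State>::value) {
        method.serialize_keys(num_rows);
        return true;
    } else {
        (void)method;
        (void)num_rows;
        return false;
    }
}
```

Because the branch is resolved at compile time, states that do not need pre-serialization pay no runtime cost for the call.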
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
BiteThet commented on code in PR #10700: URL: https://github.com/apache/doris/pull/10700#discussion_r916872115 ## be/src/vec/exec/vaggregation_node.h: ## @@ -50,13 +50,41 @@ struct AggregationMethodSerialized { Data data; Iterator iterator; bool inited = false; +std::vector keys; +AggregationMethodSerialized() +: _serialized_key_buffer_size(0), + _serialized_key_buffer(nullptr), + _mem_pool(new MemPool) {} -AggregationMethodSerialized() = default; +using State = ColumnsHashing::HashMethodSerialized; template explicit AggregationMethodSerialized(const Other& other) : data(other.data) {} -using State = ColumnsHashing::HashMethodSerialized; +void serialize_keys(const ColumnRawPtrs& key_columns, const size_t num_rows) { +size_t max_one_row_byte_size = 0; +for (const auto& column : key_columns) { +max_one_row_byte_size += column->get_max_row_byte_size(); Review Comment: Should we consider the case where a string column has a few very long strings? That could greatly increase memory allocation.
[GitHub] [doris] morningman commented on a diff in pull request #10620: [Enhancement][multi-catalog]Impl parallel for file scanner to improve the scanner performance
morningman commented on code in PR #10620: URL: https://github.com/apache/doris/pull/10620#discussion_r916877906 ## fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java: ## @@ -311,6 +350,7 @@ private TFileRangeDesc createFileRangeDesc( // set hdfs params for hdfs file type. if (scanProvider.getTableFileType() == TFileType.FILE_HDFS) { THdfsParams tHdfsParams = BrokerUtil.generateHdfsParam(scanProvider.getTableProperties()); +tHdfsParams.addToHdfsConf(new THdfsConf("dfs.client.read.shortcircuit", "false")); Review Comment: Why do we need to disable this feature?
[GitHub] [doris] yiguolei commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys
yiguolei commented on code in PR #10700: URL: https://github.com/apache/doris/pull/10700#discussion_r916899062 ## be/src/vec/exec/vaggregation_node.h: ## @@ -50,13 +50,41 @@ struct AggregationMethodSerialized { Data data; Iterator iterator; bool inited = false; +std::vector keys; +AggregationMethodSerialized() +: _serialized_key_buffer_size(0), + _serialized_key_buffer(nullptr), + _mem_pool(new MemPool) {} -AggregationMethodSerialized() = default; +using State = ColumnsHashing::HashMethodSerialized; template explicit AggregationMethodSerialized(const Other& other) : data(other.data) {} -using State = ColumnsHashing::HashMethodSerialized; +void serialize_keys(const ColumnRawPtrs& key_columns, const size_t num_rows) { +size_t max_one_row_byte_size = 0; +for (const auto& column : key_columns) { +max_one_row_byte_size += column->get_max_row_byte_size(); Review Comment: Maybe not; the memory is allocated block by block.
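The trade-off discussed in this thread can be made concrete with a rough sketch. It assumes the key buffer for one block is sized as the summed per-column maximum row size times `num_rows`; the diff excerpt does not show that step, so treat it as an assumption. `max_string_byte_size` and `reserved_key_bytes` are hypothetical helpers, not Doris functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: the largest value in a string column, in bytes.
inline std::size_t max_string_byte_size(const std::vector<std::string>& values) {
    std::size_t max_size = 0;
    for (const auto& v : values) {
        max_size = std::max(max_size, v.size());
    }
    return max_size;
}

// Hypothetical: bytes reserved for one block if every row gets the summed
// per-column maximum (assumed sizing rule, see the lead-in above).
inline std::size_t reserved_key_bytes(const std::vector<std::size_t>& max_row_byte_sizes,
                                      std::size_t num_rows) {
    std::size_t max_one_row_byte_size = 0;
    for (std::size_t s : max_row_byte_sizes) {
        max_one_row_byte_size += s;
    }
    return max_one_row_byte_size * num_rows;
}
```

Under this assumption, a block of 1000 rows with 8-byte strings and a 4-byte second column reserves 12,000 bytes, while a single 10,000-byte string in the same block raises that to 10,004,000 bytes. Since the bound is computed per block, the inflation would be limited to blocks that actually contain the long value.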