[GitHub] [doris] zhannngchen commented on a diff in pull request #10548: [WIP](unique-key-merge-on-write) Add delete bitmap for DSIP-018

2022-07-08 Thread GitBox


zhannngchen commented on code in PR #10548:
URL: https://github.com/apache/doris/pull/10548#discussion_r916508357


##
be/src/olap/tablet_meta.h:
##
@@ -214,9 +217,120 @@ class TabletMeta {
 RowsetTypePB _preferred_rowset_type = BETA_ROWSET;
 std::string _remote_storage_name;
 StorageMediumPB _storage_medium = StorageMediumPB::HDD;
+std::unique_ptr _delete_bitmap;
 std::shared_mutex _meta_lock;
 };
 
+/**
+ * Wraps multiple bitmaps for recording rows (row id) that are deleted or
+ * overwritten.
+ *
+ * RowsetId and SegmentId are for locating segment, Version here is a single
+ * uint32_t means that at which "version" of the load causes the delete or
+ * overwrite.
+ *
+ * The start and end version of a load is the same, it's ok and straightforward
+ * to use a single uint32_t.
+ *
+ * e.g.
+ * There is a key "key1" in rowset id 1, version [1,1], segment id 1, row id 1.
+ * A new load also contains "key1", the rowset id 2, version [2,2], segment id 1
+ * the delete bitmap will be `{1,1,2} -> 1`, which means the "row id 1" in
+ * "rowset id 1, segment id 1" is deleted/overwritten by some loads at "version 2"
+ */
+class DeleteBitmap {
+public:
+mutable std::shared_mutex lock;
+using SegmentId = uint32_t;
+using Version = uint32_t;
+using BitmapKey = std::tuple;
+std::map delete_bitmap; // Ordered map
+
+DeleteBitmap();
+
+/**
+ * Copy c-tor for making delete bitmap snapshot on read path
+ */
+DeleteBitmap(const DeleteBitmap& r);
+DeleteBitmap& operator=(const DeleteBitmap& r);
+/**
+ * Move c-tor for making delete bitmap snapshot on read path
+ */
+DeleteBitmap(DeleteBitmap&& r);
+DeleteBitmap& operator=(DeleteBitmap&& r);
+
+/**
+ * Makes a snapshot of the delete bitmap; a read lock is acquired during this
+ * process
+ */
+DeleteBitmap snapshot() const;
+
+/**
+ * Marks the specific row deleted
+ */
+void add(const BitmapKey& bitmap, uint32_t row_id);

Review Comment:
   We should rename every `BitmapKey` parameter from `bitmap` to `bmk`. The 
current name is confusing because the parameter is not a bitmap, it's just a key.
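To make the key-versus-bitmap distinction concrete, here is a minimal sketch of the structure described in the doc comment above. It is not the PR's implementation: `DeleteBitmapSketch` is a hypothetical name, `RowsetId` is assumed to be a plain `uint32_t`, and `std::set<uint32_t>` stands in for whatever bitmap type the PR actually uses (the template arguments are not visible in the quoted diff).

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <set>
#include <shared_mutex>
#include <tuple>

// Hypothetical stand-in types; the real RowsetId and bitmap types are not
// visible in the quoted diff.
using RowsetId = uint32_t;
using SegmentId = uint32_t;
using Version = uint32_t;
using BitmapKey = std::tuple<RowsetId, SegmentId, Version>;

class DeleteBitmapSketch {
public:
    // Marks row_id as deleted under the key bmk (takes the exclusive lock).
    void add(const BitmapKey& bmk, uint32_t row_id) {
        std::lock_guard<std::shared_mutex> l(lock_);
        rows_[bmk].insert(row_id);
    }

    // Returns true if row_id is marked deleted under exactly this key.
    bool contains(const BitmapKey& bmk, uint32_t row_id) const {
        std::shared_lock<std::shared_mutex> l(lock_);
        auto it = rows_.find(bmk);
        return it != rows_.end() && it->second.count(row_id) > 0;
    }

    // Copies the data under a shared lock so a reader gets a stable view.
    std::map<BitmapKey, std::set<uint32_t>> snapshot() const {
        std::shared_lock<std::shared_mutex> l(lock_);
        return rows_;
    }

private:
    mutable std::shared_mutex lock_;
    // Ordered by (rowset id, segment id, version).
    std::map<BitmapKey, std::set<uint32_t>> rows_;
};
```

With this shape, the example from the doc comment becomes `add(BitmapKey{1, 1, 2}, 1)`: the tuple only locates the set of deleted rows, which is why `bmk` reads better than `bitmap`.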



##
be/src/olap/tablet_meta.cpp:
##
@@ -710,4 +742,106 @@ bool operator!=(const TabletMeta& a, const TabletMeta& b) 
{
 return !(a == b);
 }
 
+DeleteBitmap::DeleteBitmap() {
+}
+
+DeleteBitmap::DeleteBitmap(const DeleteBitmap& o) {
+delete_bitmap = o.delete_bitmap; // just copy data
+}
+
+DeleteBitmap& DeleteBitmap::operator=(const DeleteBitmap& o) {
+delete_bitmap = o.delete_bitmap; // just copy data
+return *this;
+}
+
+DeleteBitmap::DeleteBitmap(DeleteBitmap&& o) {
+delete_bitmap = std::move(o.delete_bitmap);
+}
+
+DeleteBitmap& DeleteBitmap::operator=(DeleteBitmap&& o) {
+delete_bitmap = std::move(o.delete_bitmap);
+return *this;
+}
+
+DeleteBitmap DeleteBitmap::snapshot() const {
+std::shared_lock l(lock);
+return DeleteBitmap(*this);
+}
+
+void DeleteBitmap::add(const BitmapKey& bitmap, uint32_t row_id) {
+std::lock_guard l(lock);

Review Comment:
   > It looks like the granularity of the lock is tablet level.
   > 
   > I wonder whether this serial update will cause a performance penalty when 
concurrent loads need to update multiple versions of delete bitmaps?
   
   We can't update the delete bitmap concurrently, at least in the current design.
   A load first looks up a row key and then updates the delete bitmap, but 
before the lookup it MUST see the data of all previous versions. If we can't 
guarantee that, data inconsistency will occur.
   
   e.g. 
   - current rowset layout: [0-5][6-6][7-7]
   - in-flight load jobs: load1, load2, load3
   - load1 is committed first, load2 second, load3 third. We CANNOT update the 
delete bitmap at this time, because load2 may overwrite some data in load1, but 
it can't see load1 yet.
   - load2 is published first, with version [9-9] (which is determined at the 
commit stage), load1 is published second, with version [8-8], and load3 is 
published third, with version [10-10]. This publish sequence is not acceptable, 
since load2 didn't update the delete bitmap on rowset [8-8] correctly.
   - So we have to guarantee that load1 is published first, then load2, then load3.
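The ordering constraint in the example above can be sketched as a small sequencer that holds back a load until every earlier version has been published. This is only an illustration under assumptions: `PublishSequencer` and `on_ready` are hypothetical names, and the real publish path in Doris is not shown in this thread.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: releases loads for publishing strictly in version order.
class PublishSequencer {
public:
    explicit PublishSequencer(uint32_t next_version) : next_version_(next_version) {}

    // Called when a committed load is ready to publish. Returns the loads that
    // may be published now, in version order. A load whose version is ahead of
    // next_version_ is buffered until all earlier versions have been published.
    std::vector<std::string> on_ready(uint32_t version, const std::string& load) {
        pending_[version] = load;
        std::vector<std::string> publishable;
        auto it = pending_.find(next_version_);
        while (it != pending_.end()) {
            publishable.push_back(it->second);
            pending_.erase(it);
            ++next_version_;
            it = pending_.find(next_version_);
        }
        return publishable;
    }

private:
    uint32_t next_version_;
    std::map<uint32_t, std::string> pending_;
};
```

Starting from `PublishSequencer seq(8)`, a call `seq.on_ready(9, "load2")` returns nothing, and the later `seq.on_ready(8, "load1")` releases load1 followed by load2, so load2 always sees rowset [8-8] before it updates the delete bitmap.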



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [doris] zhannngchen commented on pull request #10548: [WIP](unique-key-merge-on-write) Add delete bitmap for DSIP-018

2022-07-08 Thread GitBox


zhannngchen commented on PR #10548:
URL: https://github.com/apache/doris/pull/10548#issuecomment-1178629366

   @compiletheworld the formatter check failed, please fix it.





[GitHub] [doris] xiaokang opened a new pull request, #10694: Key topn opt2

2022-07-08 Thread GitBox


xiaokang opened a new pull request, #10694:
URL: https://github.com/apache/doris/pull/10694

   # Proposed changes
   
   Issue Number: close #10646
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   BE part for https://github.com/apache/doris/issues/10646. The FE part is 
https://github.com/apache/doris/pull/10647.
   
   There is a common query pattern for finding the latest time series data,
e.g. SELECT * from t_log WHERE t>t1 AND t
   





[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10392: [Enhancement][Vectorized] Use SIMD to skip batches of null data in nu…

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10392:
URL: https://github.com/apache/doris/pull/10392#discussion_r916537248


##
be/src/vec/aggregate_functions/aggregate_function_null.h:
##
@@ -197,9 +200,66 @@ class AggregateFunctionNullUnary final
 }
 }
 
+void add_not_nullable(AggregateDataPtr __restrict place, const IColumn** 
columns,
+  size_t row_num, Arena* arena) {
+const ColumnNullable* column = assert_cast(columns[0]);
+this->set_flag(place);
+const IColumn* nested_column = &column->get_nested_column();
+this->nested_function->add(this->nested_place(place), &nested_column, 
row_num, arena);
+}
+
+void add_batch(size_t batch_size, AggregateDataPtr* places, size_t 
place_offset,
+   const IColumn** columns, Arena* arena) const override {
+int processed_records_num = 0;
+
+// we can use column->has_null() to judge whether whole batch of data 
is null and skip batch,
+// but it's maybe too coarse-grained.
+#ifdef __AVX2__
+const ColumnNullable* column = assert_cast(columns[0]);
+// The overhead introduced is negligible here, just an extra memory 
read from NullMap
+const NullMap& null_map_data = column->get_null_map_data();
+
+// NullMap use uint8_t type to indicate values is null or not, 1 
indicates null, 0 versus.
+// It's important to keep consistent with element type size in NullMap
+constexpr int simd_batch_size = 256 / (8 * sizeof(uint8_t));
+__m256i all0 = _mm256_setzero_si256();
+auto to_read_null_map_position = reinterpret_cast(null_map_data.data());
+
+while (processed_records_num + simd_batch_size < batch_size) {
+to_read_null_map_position = to_read_null_map_position + 
processed_records_num;
+// load unaligned data from null_map, 1 means value is null, 0 
versus
+__m256i f =
+_mm256_loadu_si256(reinterpret_cast(to_read_null_map_position));
+int mask = _mm256_movemask_epi8(_mm256_cmpgt_epi8(f, all0));
+// all data is null
+if (mask == 0x) {
+} else if (mask == 0) { // all data is not null
+for (size_t i = processed_records_num; i < 
processed_records_num + simd_batch_size;
+ i++) {
+add_not_nullable(places[i] + place_offset, columns, i, 
arena);
+}
+} else {
+// data is partly null
+for (size_t i = processed_records_num; i < 
processed_records_num + simd_batch_size;

Review Comment:
   Maybe we can calculate the `lowbit` of the mask to find the offsets of the 
non-null rows.
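One way to read this suggestion is the sketch below. It assumes, as in the quoted diff, that bit `i` of the movemask result is set when row `i` of the 32-row batch is null. `not_null_offsets` is a hypothetical helper, and `__builtin_ctz` assumes GCC or Clang.

```cpp
#include <cstdint>
#include <vector>

// Returns the offsets (0..31) of the rows that are NOT null, given a mask in
// which bit i is set when row i is null.
std::vector<int> not_null_offsets(uint32_t null_mask) {
    std::vector<int> offsets;
    uint32_t not_null = ~null_mask;
    while (not_null != 0) {
        // lowbit: isolate the lowest set bit (same as x & -x in two's complement).
        uint32_t lowbit = not_null & (~not_null + 1);
        // Count trailing zeros to turn the isolated bit into a row offset.
        offsets.push_back(__builtin_ctz(lowbit));
        // Clear that bit and continue with the next non-null row.
        not_null ^= lowbit;
    }
    return offsets;
}
```

The loop runs once per non-null row instead of once per row in the batch, so a mostly-null batch would skip the per-row null check almost entirely.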






[GitHub] [doris] morrySnow commented on a diff in pull request #10672: [refactor](nereids) Refine some code snippets

2022-07-08 Thread GitBox


morrySnow commented on code in PR #10672:
URL: https://github.com/apache/doris/pull/10672#discussion_r916542491


##
fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java:
##
@@ -89,51 +87,36 @@ public static Expression or(List expressions) {
  * Use AND/OR to combine expressions together.
  */
 public static Expression combine(NodeType op, List 
expressions) {
-
 Objects.requireNonNull(expressions, "expressions is null");
 
 if (expressions.size() == 0) {
-if (op == NodeType.AND) {
-return new Literal(true);
-}
-if (op == NodeType.OR) {
-return new Literal(false);
-}
-}
-
-if (expressions.size() == 1) {
+return new Literal(op == NodeType.AND);
+} else if (expressions.size() == 1) {
 return expressions.get(0);
 }
 
-List distinctExpressions = Lists.newArrayList(new 
LinkedHashSet<>(expressions));
-if (op == NodeType.AND) {
-if (distinctExpressions.contains(Literal.FALSE_LITERAL)) {
-return Literal.FALSE_LITERAL;
+Expression shortCircuit = (op == NodeType.AND ? Literal.FALSE_LITERAL 
: Literal.TRUE_LITERAL);
+Expression skip = (op == NodeType.AND ? Literal.TRUE_LITERAL : 
Literal.FALSE_LITERAL);
+LinkedHashSet distinctExpressions = 
Sets.newLinkedHashSetWithExpectedSize(expressions.size());
+for (Expression expression : expressions) {
+if (expression.equals(shortCircuit)) {
+return shortCircuit;
+} else if (!expression.equals(skip)) {
+distinctExpressions.add(expression);
 }
-distinctExpressions = distinctExpressions.stream().filter(p -> 
!p.equals(Literal.TRUE_LITERAL))
-.collect(Collectors.toList());
 }
 
-if (op == NodeType.OR) {
-if (distinctExpressions.contains(Literal.TRUE_LITERAL)) {
-return Literal.TRUE_LITERAL;
+List result = 
Lists.newArrayListWithCapacity(distinctExpressions.size() / 2 + 1);

Review Comment:
   Maybe we could use the stream `reduce` API to do this without recursion.



##
fe/fe-core/src/main/java/org/apache/doris/nereids/util/Utils.java:
##
@@ -28,10 +28,7 @@ public class Utils {
  * @return quoted string
  */
 public static String quoteIfNeeded(String part) {
-if (part.matches("[a-zA-Z0-9_]+") && !part.matches("\\d+")) {
-return part;
-} else {
-return part.replace("`", "``");
-}
+return part.matches("\\w*[\\w&&[^\\d]]+\\w*")

Review Comment:
   This pattern is not intuitive. It would be better to add a comment explaining 
that the pattern matches every legal string except a pure-digit string.
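As an illustration of what the pattern accepts, here is a C++ rendering of the same rule. This is a sketch, not the Java code under review: `std::regex` has no `&&` character-class intersection, so `[\w&&[^\d]]` is spelled out as `[A-Za-z_]`, and `is_plain_identifier` is a hypothetical name.

```cpp
#include <regex>
#include <string>

// True when part consists only of word characters [A-Za-z0-9_] and contains
// at least one character that is not a digit, i.e. it is not a pure-digit string.
bool is_plain_identifier(const std::string& part) {
    static const std::regex pattern("[A-Za-z0-9_]*[A-Za-z_][A-Za-z0-9_]*");
    return std::regex_match(part, pattern);
}
```

Under this rule `abc_1` and `1a2` match, while `123`, `a-b`, and the empty string do not.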



##
fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java:
##
@@ -89,51 +87,36 @@ public static Expression or(List expressions) {
  * Use AND/OR to combine expressions together.
  */
 public static Expression combine(NodeType op, List 
expressions) {
-
 Objects.requireNonNull(expressions, "expressions is null");
 
 if (expressions.size() == 0) {
-if (op == NodeType.AND) {
-return new Literal(true);
-}
-if (op == NodeType.OR) {
-return new Literal(false);
-}
-}
-
-if (expressions.size() == 1) {
+return new Literal(op == NodeType.AND);

Review Comment:
   If you do that, you need to add a check at the top of this function to verify 
that `NodeType` is either `AND` or `OR`.
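The concern is that `op == NodeType.AND` evaluates to false for every operator other than `AND`, not only for `OR`. A minimal sketch of the guard, with a hypothetical three-value `NodeType` enum and a hypothetical function name standing in for the Java code:

```cpp
#include <stdexcept>

// Hypothetical subset of node types, for illustration only.
enum class NodeType { AND, OR, NOT };

// Returns the value an empty expression list combines to: true for AND,
// false for OR. Any other operator is rejected up front instead of being
// silently treated like OR.
bool empty_combine_value(NodeType op) {
    if (op != NodeType::AND && op != NodeType::OR) {
        throw std::invalid_argument("combine() expects AND or OR");
    }
    return op == NodeType::AND;
}
```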






[GitHub] [doris] zhengshiJ commented on a diff in pull request #10678: [feature](nereides) support sort translator

2022-07-08 Thread GitBox


zhengshiJ commented on code in PR #10678:
URL: https://github.com/apache/doris/pull/10678#discussion_r916543059


##
fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java:
##
@@ -117,6 +119,24 @@ public void setMaterializedTupleInfo(
 }
 }
 
+/**
+ * Sets tupleInfo.
+ * Just for Nereids.
+ */
+public void setTupleInfo(

Review Comment:
   done






[GitHub] [doris] adonis0147 commented on a diff in pull request #10672: [refactor](nereids) Refine some code snippets

2022-07-08 Thread GitBox


adonis0147 commented on code in PR #10672:
URL: https://github.com/apache/doris/pull/10672#discussion_r916569297


##
fe/fe-core/src/main/java/org/apache/doris/nereids/util/ExpressionUtils.java:
##
@@ -89,51 +87,36 @@ public static Expression or(List expressions) {
  * Use AND/OR to combine expressions together.
  */
 public static Expression combine(NodeType op, List 
expressions) {
-
 Objects.requireNonNull(expressions, "expressions is null");
 
 if (expressions.size() == 0) {
-if (op == NodeType.AND) {
-return new Literal(true);
-}
-if (op == NodeType.OR) {
-return new Literal(false);
-}
-}
-
-if (expressions.size() == 1) {
+return new Literal(op == NodeType.AND);
+} else if (expressions.size() == 1) {
 return expressions.get(0);
 }
 
-List distinctExpressions = Lists.newArrayList(new 
LinkedHashSet<>(expressions));
-if (op == NodeType.AND) {
-if (distinctExpressions.contains(Literal.FALSE_LITERAL)) {
-return Literal.FALSE_LITERAL;
+Expression shortCircuit = (op == NodeType.AND ? Literal.FALSE_LITERAL 
: Literal.TRUE_LITERAL);
+Expression skip = (op == NodeType.AND ? Literal.TRUE_LITERAL : 
Literal.FALSE_LITERAL);
+LinkedHashSet distinctExpressions = 
Sets.newLinkedHashSetWithExpectedSize(expressions.size());
+for (Expression expression : expressions) {
+if (expression.equals(shortCircuit)) {
+return shortCircuit;
+} else if (!expression.equals(skip)) {
+distinctExpressions.add(expression);
 }
-distinctExpressions = distinctExpressions.stream().filter(p -> 
!p.equals(Literal.TRUE_LITERAL))
-.collect(Collectors.toList());
 }
 
-if (op == NodeType.OR) {
-if (distinctExpressions.contains(Literal.TRUE_LITERAL)) {
-return Literal.TRUE_LITERAL;
+List result = 
Lists.newArrayListWithCapacity(distinctExpressions.size() / 2 + 1);

Review Comment:
   The output would not be the same as the original one if we used the stream 
`reduce` API.
   
   recursion: `(A AND B) AND (C AND D)`
   reduce: `((A AND B) AND C) AND D`
   
   If the inconsistency is acceptable, I will use the `reduce` approach to 
simplify it.
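The two shapes can be reproduced with a small sketch that builds the expression as a string. It assumes the recursive version combines adjacent pairs level by level, which is what the `size() / 2 + 1` capacity in the diff suggests. The function names are hypothetical, and every binary node is wrapped in parentheses so the tree shape is visible.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Renders one binary node, always parenthesized.
std::string join2(const std::string& l, const std::string& op, const std::string& r) {
    return "(" + l + " " + op + " " + r + ")";
}

// Pairwise combination: adjacent expressions are merged level by level,
// giving a balanced tree.
std::string combine_pairwise(std::vector<std::string> exprs, const std::string& op) {
    if (exprs.empty()) return "";
    while (exprs.size() > 1) {
        std::vector<std::string> next;
        next.reserve(exprs.size() / 2 + 1);
        for (std::size_t i = 0; i + 1 < exprs.size(); i += 2) {
            next.push_back(join2(exprs[i], op, exprs[i + 1]));
        }
        if (exprs.size() % 2 == 1) next.push_back(exprs.back());
        exprs = std::move(next);
    }
    return exprs.front();
}

// Left fold (what a reduce would produce): a left-deep tree.
std::string combine_fold(const std::vector<std::string>& exprs, const std::string& op) {
    if (exprs.empty()) return "";
    return std::accumulate(exprs.begin() + 1, exprs.end(), exprs.front(),
                           [&op](const std::string& acc, const std::string& e) {
                               return join2(acc, op, e);
                           });
}
```

For the inputs A, B, C, D, the pairwise version yields a tree of depth 2 and the fold yields depth 3. Both are logically equivalent for AND/OR, so the difference matters only where the tree shape itself is compared or printed.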






[GitHub] [doris] yiguolei merged pull request #10671: [docs] fix keywords in sql-functions help documents

2022-07-08 Thread GitBox


yiguolei merged PR #10671:
URL: https://github.com/apache/doris/pull/10671





[doris] branch master updated: [docs]fix keywords in sql-functions help documents (#10671)

2022-07-08 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new d127cfeea2 [docs]fix keywords in sql-functions help documents (#10671)
d127cfeea2 is described below

commit d127cfeea28e7eb57f5c8b149eae636dd9335b64
Author: carlvinhust2012 
AuthorDate: Fri Jul 8 16:22:47 2022 +0800

[docs]fix keywords in sql-functions help documents (#10671)

Co-authored-by: hucheng01 
---
 docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md | 2 +-
 docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md  | 2 +-
 docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md | 2 +-
 docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md  | 2 +-
 docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md | 2 +-
 .../docs/sql-manual/sql-functions/table-functions/explode-json-array.md | 2 +-
 .../en/docs/sql-manual/sql-functions/table-functions/explode-numbers.md | 2 +-
 docs/en/docs/sql-manual/sql-functions/table-functions/explode-split.md  | 2 +-
 .../docs/sql-manual/sql-functions/array-functions/arrays_overlap.md | 2 +-
 docs/zh-CN/docs/sql-manual/sql-functions/json-functions/json_array.md   | 2 +-
 docs/zh-CN/docs/sql-manual/sql-functions/json-functions/json_object.md  | 2 +-
 docs/zh-CN/docs/sql-manual/sql-functions/json-functions/json_quote.md   | 2 +-
 .../docs/sql-manual/sql-functions/table-functions/explode-bitmap.md | 2 +-
 .../docs/sql-manual/sql-functions/table-functions/explode-json-array.md | 2 +-
 .../docs/sql-manual/sql-functions/table-functions/explode-numbers.md| 2 +-
 .../docs/sql-manual/sql-functions/table-functions/explode-split.md  | 2 +-
 16 files changed, 16 insertions(+), 16 deletions(-)

diff --git 
a/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md 
b/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md
index 5cd3d30e36..ca0b5f01ba 100644
--- a/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md
+++ b/docs/en/docs/sql-manual/sql-functions/array-functions/arrays_overlap.md
@@ -63,4 +63,4 @@ mysql> select c_left,c_right,arrays_overlap(c_left,c_right) 
from array_test;
 
 ### keywords
 
-ARRAYS_OVERLAP
+ARRAY,ARRAYS,OVERLAP,ARRAYS_OVERLAP
diff --git a/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md 
b/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md
index 14e99263ec..38eaf68a12 100644
--- a/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md
+++ b/docs/en/docs/sql-manual/sql-functions/json-functions/json_array.md
@@ -67,4 +67,4 @@ MySQL> select json_array("a", null, "c");
 +--+
 ```
 ### keywords
-json_array
+json,array,json_array
diff --git 
a/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md 
b/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md
index d21daf3b47..0576e4e4e2 100644
--- a/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md
+++ b/docs/en/docs/sql-manual/sql-functions/json-functions/json_object.md
@@ -68,4 +68,4 @@ MySQL> select json_object('username',null);
 +-+
 ```
 ### keywords
-json_object
+json,object,json_object
diff --git a/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md 
b/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md
index 6cc120a166..ff54b2e92f 100644
--- a/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md
+++ b/docs/en/docs/sql-manual/sql-functions/json-functions/json_quote.md
@@ -67,4 +67,4 @@ MySQL> select json_quote("\n\b\r\t");
 ++
 ```
 ### keywords
-json_quote
+json,quote,json_quote
diff --git 
a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md 
b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md
index 99c2627a0b..482dd406d0 100644
--- a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md
+++ b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-bitmap.md
@@ -154,4 +154,4 @@ lateral view explode_split("a,b", ",") tmp2 as e2 order by 
k1, e1, e2;
 
 ### keywords
 
-explode_bitmap
\ No newline at end of file
+explode,bitmap,explode_bitmap
\ No newline at end of file
diff --git 
a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md 
b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md
index d3721ca04f..ab0f0b5b83 100644
--- 
a/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md
+++ 
b/docs/en/docs/sql-manual/sql-functions/table-functions/explode-json-array.md
@@ -283,4 +283,4 @@ mysql> select k1, e1 from example1 lateral view 
explode_json_array_string('{"a":
 
 ### keywords
 
-explode_json_array
\ No newline at end of file
+explode,json,a

[GitHub] [doris] Gabriel39 opened a new pull request, #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code

2022-07-08 Thread GitBox


Gabriel39 opened a new pull request, #10695:
URL: https://github.com/apache/doris/pull/10695

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   





[GitHub] [doris] github-actions[bot] commented on pull request #10549: [Bug] Fix array functions arguments mismatch

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10549:
URL: https://github.com/apache/doris/pull/10549#issuecomment-1178734320

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10549: [Bug] Fix array functions arguments mismatch

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10549:
URL: https://github.com/apache/doris/pull/10549#issuecomment-1178734359

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10670: [fix](optimizer) join reorder may cause column non-existence problem

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10670:
URL: https://github.com/apache/doris/pull/10670#issuecomment-1178736075

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] icedrugs89 opened a new issue, #10696: [Bug] BrokerLoad import job fails with type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = Broker list path exception. path=hdfs:xxx

2022-07-08 Thread GitBox


[GitHub] [doris] github-actions[bot] commented on pull request #10692: [refactor]broker rpc timeout configuration parameterization

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10692:
URL: https://github.com/apache/doris/pull/10692#issuecomment-1178745074

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10692: [refactor]broker rpc timeout configuration parameterization

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10692:
URL: https://github.com/apache/doris/pull/10692#issuecomment-1178745020

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] Henry2SS commented on a diff in pull request #10492: [feature-wip] support avro format in routine load and stream load

2022-07-08 Thread GitBox


Henry2SS commented on code in PR #10492:
URL: https://github.com/apache/doris/pull/10492#discussion_r916626276


##
be/src/common/config.h:
##
@@ -763,6 +763,9 @@ CONF_Int32(quick_compaction_batch_size, "10");
 // do compaction min rowsets
 CONF_Int32(quick_compaction_min_rowsets, "10");
 
+// Avro schema file path, set default to "${DORIS_HOME}/conf/avro_schema.json"
+CONF_String(avro_schema_file_path, "${DORIS_HOME}/conf/avro_schema.json");

Review Comment:
   > The old query engine with `tuple` and `RowBatch` interface will be 
deprecated in the future. So you'd better implement this feature in vec engine, 
with `block` interface.
   
   Hi, the vectorized `VAvroScanner` is supported now.






[GitHub] [doris] icedrugs89 commented on issue #10696: [Bug] BrokerLoad import job fails with type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = Broker list path exception. path=hdfs:xxx

2022-07-08 Thread GitBox


icedrugs89 commented on issue #10696:
URL: https://github.com/apache/doris/issues/10696#issuecomment-1178755340

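   1. The import succeeds when a specific year/month/day partition is given, so 
we suspected that the table has too many partition directories and verified this 
as follows:
   1) Installed the HDFS client environment on the broker machine and used the 
wildcard * to access all partition files of the Hive table; this took about 56 
seconds.
   2) Inspected the socket code in the corresponding Thrift library: fetching the 
File status may take too long and time out because the Hive table has too many 
partitions.
   3) In thrift-0.9.3/lib/java/src/org/apache/thrift/transport/TSocket.java, the 
default SocketTimeout for reads and writes is too short, less than 50 seconds.
   
![image](https://user-images.githubusercontent.com/100941547/177959022-6b10291d-4846-4b94-80b9-44c534da1b8a.png)
   
   4) The community's technical team changed the FE parameter in 
fe/fe-core/src/main/java/org/apache/doris/common/ClientPool.java, adding static 
int brokerTimeoutMs = 1;
   5) Built a test version, restarted the FE, and verified again: the problem is 
fixed.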





[GitHub] [doris] kpfly commented on pull request #10629: [Enhancement] Improve TCMalloc Hook consume MemTracker performance

2022-07-08 Thread GitBox


kpfly commented on PR #10629:
URL: https://github.com/apache/doris/pull/10629#issuecomment-1178764456

   Before this patch, when loading JSON data, the TCMalloc hook could cause about 
a 30% performance loss.  
   How does it perform after this patch?





[GitHub] [doris] jackwener commented on pull request #10624: [enhancement]: remove redundant field.

2022-07-08 Thread GitBox


jackwener commented on PR #10624:
URL: https://github.com/apache/doris/pull/10624#issuecomment-1178764697

   In addition, this PR will not affect upgrades, because the deleted field is 
optional.





[GitHub] [doris] morningman closed issue #10674: [Bug] join reorder may cause column non-existence problem

2022-07-08 Thread GitBox


morningman closed issue #10674: [Bug] join reorder may cause column 
non-existence problem
URL: https://github.com/apache/doris/issues/10674





[GitHub] [doris] morningman merged pull request #10670: [fix](optimizer) join reorder may cause column non-existence problem

2022-07-08 Thread GitBox


morningman merged PR #10670:
URL: https://github.com/apache/doris/pull/10670





[doris] branch master updated: [fix](optimizer) join reorder may cause column non-existence problem (#10670)

2022-07-08 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 036cba [fix](optimizer) join reorder may cause column 
non-existence problem (#10670)
036cba is described below

commit 036cba7ec4325f0f097eacf9e2b26b4d7fa8
Author: yinzhijian <373141...@qq.com>
AuthorDate: Fri Jul 8 17:28:32 2022 +0800

[fix](optimizer) join reorder may cause column non-existence problem 
(#10670)

for example:
select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b;
If t3 is a large table, it will be placed first after the reorderTable,
and the problem that t2.b does not exist will occur in reanalyzing.
---
 .../java/org/apache/doris/analysis/SelectStmt.java | 13 -
 .../java/org/apache/doris/planner/QueryPlanTest.java   | 18 ++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
index 881e7a2947..05ffa486c5 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
@@ -820,7 +820,18 @@ public class SelectStmt extends QueryStmt {
                 List candidateEqJoinPredicates = analyzer.getEqJoinConjunctsExcludeAuxPredicates(tid);
                 for (Expr candidateEqJoinPredicate : candidateEqJoinPredicates) {
                     List candidateTupleList = Lists.newArrayList();
-                    Expr.getIds(Lists.newArrayList(candidateEqJoinPredicate), candidateTupleList, null);
+                    List candidateEqJoinPredicateList = Lists.newArrayList(candidateEqJoinPredicate);
+                    // If a large table or view has joinClause is ranked first,
+                    // and the joinClause is not judged here,
+                    // the column in joinClause may not be found during reanalyzing.
+                    // for example:
+                    // select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b;
+                    // If t3 is a large table, it will be placed first after the reorderTable,
+                    // and the problem that t2.b does not exist will occur in reanalyzing
+                    if (candidateTableRef.getOnClause() != null) {
+                        candidateEqJoinPredicateList.add(candidateTableRef.getOnClause());
+                    }
+                    Expr.getIds(candidateEqJoinPredicateList, candidateTupleList, null);
                     int count = candidateTupleList.size();
                     for (TupleId tupleId : candidateTupleList) {
                         if (validTupleId.contains(tupleId) || tid.equals(tupleId)) {
diff --git a/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java b/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java
index 4163112aa7..b8499836df 100644
--- a/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java
+++ b/fe/fe-core/src/test/java/org/apache/doris/planner/QueryPlanTest.java
@@ -2169,4 +2169,22 @@ public class QueryPlanTest extends TestWithFeService {
         Assert.assertFalse(explainString.contains("CROSS JOIN"));
 
     }
+
+    @Test
+    public void testDefaultJoinReorderWithView() throws Exception {
+        connectContext.setDatabase("default_cluster:test");
+        createTable("CREATE TABLE t_1 (col1 varchar, col2 varchar, col3 int)\n" + "DISTRIBUTED BY HASH(col3)\n"
+                + "BUCKETS 3\n" + "PROPERTIES(\n" + " \"replication_num\"=\"1\"\n" + ");");
+        createTable("CREATE TABLE t_2 (col1 varchar, col2 varchar, col3 int)\n" + "DISTRIBUTED BY HASH(col3)\n"
+                + "BUCKETS 3\n" + "PROPERTIES(\n" + " \"replication_num\"=\"1\"\n" + ");");
+        createView("CREATE VIEW v_1 as select col1 from t_1;");
+        createView("CREATE VIEW v_2 as select x.col2 from (select t_2.col2, 1 + 1 from t_2) x;");
+
+        String sql = "explain select t_1.col2, v_1.col1 from t_1 inner join t_2 on t_1.col1 = t_2.col1 inner join v_1 "
+                + "on v_1.col1 = t_2.col2 inner join v_2 on v_2.col2 = t_2.col1";
+        String explainString = getSQLPlanOrErrorMsg(sql);
+        System.out.println(explainString);
+        // errCode = 2, detailMessage = Unknown column 'col2' in 't_2'
+        Assert.assertFalse(explainString.contains("errCode"));
+    }
 }
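
The check added by this commit can be pictured with a small, simplified model. The sketch below is not the Doris implementation: `JoinReorderSketch`, `canPlaceNext` and `tables` are hypothetical names, and each predicate is reduced to the set of table names it references. It assumes that a candidate table may be placed next only when every table mentioned by one of its equality predicates, and (with the fix) by its own ON clause, has already been placed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JoinReorderSketch {
    /**
     * Returns true if `candidate` may be placed next, given the tables already placed.
     * Each predicate is modelled only by the set of table names it references
     * (a simplification made for this sketch).
     */
    static boolean canPlaceNext(String candidate,
                                List<Set<String>> eqJoinPredicates,
                                Set<String> onClauseTables, // null when there is no ON clause
                                Set<String> placed) {
        for (Set<String> predicateTables : eqJoinPredicates) {
            Set<String> referenced = new HashSet<>(predicateTables);
            if (onClauseTables != null) {
                // The idea of the fix: the candidate's own ON clause must be resolvable too.
                referenced.addAll(onClauseTables);
            }
            referenced.remove(candidate);
            if (placed.containsAll(referenced)) {
                return true;
            }
        }
        return false;
    }

    /** Small helper to build a set of table names. */
    static Set<String> tables(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        // Hypothetical situation: t3 is joined ON t3.c = t2.b, while a predicate
        // that mentions only t1 and t3 is also available to the reorder step.
        List<Set<String>> predicates = List.of(tables("t1", "t3"));
        Set<String> onClause = tables("t3", "t2");

        // Ignoring the ON clause, t3 looks placeable right after t1.
        System.out.println(canPlaceNext("t3", predicates, null, tables("t1")));
        // With the ON clause included, t3 has to wait because t2 is not placed yet.
        System.out.println(canPlaceNext("t3", predicates, onClause, tables("t1")));
        // Once t2 is placed, t3 can follow.
        System.out.println(canPlaceNext("t3", predicates, onClause, tables("t1", "t2")));
    }
}
```

In this model the second call is the case the commit message describes: without the ON clause in the check, a table could be ordered ahead of a table its ON clause depends on, and reanalysis would then fail to resolve the column.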





[GitHub] [doris] cambyzju commented on pull request #10673: [feature-wip](array-type) explode support more sub types

2022-07-08 Thread GitBox


cambyzju commented on PR #10673:
URL: https://github.com/apache/doris/pull/10673#issuecomment-1178768701

   Rebased to trigger the P0 regression again.





[GitHub] [doris] k-i-d-d commented on pull request #10629: [Enhancement] Improve TCMalloc Hook consume MemTracker performance

2022-07-08 Thread GitBox


k-i-d-d commented on PR #10629:
URL: https://github.com/apache/doris/pull/10629#issuecomment-1178772483

   > before this patch,when load JSON data, the tcmalloc hook may bring about a 
30% performance loss. how about after this patch?
   
   10%.
   Apart from loading large JSON, other loads usually have a loss of 
less than 2%.





[GitHub] [doris] kpfly commented on pull request #10629: [Enhancement] Improve TCMalloc Hook consume MemTracker performance

2022-07-08 Thread GitBox


kpfly commented on PR #10629:
URL: https://github.com/apache/doris/pull/10629#issuecomment-1178775502

   > 
   nice
   





[doris] branch master updated: [BugFix] Column datas doesn't match nullmap when vectorization load (#10684)

2022-07-08 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 35a282fd61 [BugFix] Column datas doesn't match nullmap when 
vectorization load (#10684)
35a282fd61 is described below

commit 35a282fd6112b36e858d6739e73aa30a1e9b1d64
Author: Lightman <31928846+lchangli...@users.noreply.github.com>
AuthorDate: Fri Jul 8 17:39:44 2022 +0800

[BugFix] Column datas doesn't match nullmap when vectorization load (#10684)

* block column doesn't match nullmap

* remove _nullmap+_row_pos in convertor_to_olap
---
 be/src/vec/olap/olap_data_convertor.cpp | 46 -
 be/src/vec/olap/olap_data_convertor.h   |  2 +-
 2 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/be/src/vec/olap/olap_data_convertor.cpp b/be/src/vec/olap/olap_data_convertor.cpp
index 72a55e545b..03a1a208cd 100644
--- a/be/src/vec/olap/olap_data_convertor.cpp
+++ b/be/src/vec/olap/olap_data_convertor.cpp
@@ -138,6 +138,7 @@ void OlapBlockDataConvertor::OlapColumnDataConvertorBase::set_source_column(
         auto nullable_column =
                 assert_cast(_typed_column.column.get());
         _nullmap = nullable_column->get_null_map_data().data();
+        _nullmap += row_pos;
     }
 }
 
@@ -194,7 +195,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap()
 
     size_t total_size = 0;
     if (_nullmap) {
-        const UInt8* nullmap_cur = _nullmap + _row_pos;
+        const UInt8* nullmap_cur = _nullmap;
         while (bitmap_value_cur != bitmap_value_end) {
             if (!*nullmap_cur) {
                 total_size += bitmap_value_cur->getSizeInBytes();
@@ -215,7 +216,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap()
     char* raw_data = _raw_data.data();
     Slice* slice = _slice.data();
     if (_nullmap) {
-        const UInt8* nullmap_cur = _nullmap + _row_pos;
+        const UInt8* nullmap_cur = _nullmap;
         while (bitmap_value_cur != bitmap_value_end) {
             if (!*nullmap_cur) {
                 slice_size = bitmap_value_cur->getSizeInBytes();
@@ -233,7 +234,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap()
             ++nullmap_cur;
             ++bitmap_value_cur;
         }
-        assert(nullmap_cur == _nullmap + _row_pos + _num_rows && slice == _slice.get_end_ptr());
+        assert(nullmap_cur == _nullmap + _num_rows && slice == _slice.get_end_ptr());
     } else {
         while (bitmap_value_cur != bitmap_value_end) {
             slice_size = bitmap_value_cur->getSizeInBytes();
@@ -254,8 +255,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorBitMap::convert_to_olap()
 Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() {
     assert(_typed_column.column);
     const vectorized::ColumnHLL* column_hll = nullptr;
-    const UInt8* nullmap = get_nullmap();
-    if (nullmap) {
+    if (_nullmap) {
         auto nullable_column =
                 assert_cast(_typed_column.column.get());
         column_hll = assert_cast(
@@ -270,8 +270,8 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() {
     HyperLogLog* hll_value_end = hll_value_cur + _num_rows;
 
     size_t total_size = 0;
-    if (nullmap) {
-        const UInt8* nullmap_cur = nullmap + _row_pos;
+    if (_nullmap) {
+        const UInt8* nullmap_cur = _nullmap;
         while (hll_value_cur != hll_value_end) {
             if (!*nullmap_cur) {
                 total_size += hll_value_cur->max_serialized_size();
@@ -292,8 +292,8 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() {
     Slice* slice = _slice.data();
 
     hll_value_cur = hll_value;
-    if (nullmap) {
-        const UInt8* nullmap_cur = nullmap + _row_pos;
+    if (_nullmap) {
+        const UInt8* nullmap_cur = _nullmap;
         while (hll_value_cur != hll_value_end) {
             if (!*nullmap_cur) {
                 slice_size = hll_value_cur->serialize((uint8_t*)raw_data);
@@ -310,7 +310,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorHLL::convert_to_olap() {
             ++nullmap_cur;
             ++hll_value_cur;
         }
-        assert(nullmap_cur == nullmap + _row_pos + _num_rows && slice == _slice.get_end_ptr());
+        assert(nullmap_cur == _nullmap + _num_rows && slice == _slice.get_end_ptr());
     } else {
         while (hll_value_cur != hll_value_end) {
             slice_size = hll_value_cur->serialize((uint8_t*)raw_data);
@@ -372,7 +372,7 @@ Status OlapBlockDataConvertor::OlapColumnDataConvertorChar::convert_to_olap() {
     }
 
     for (size_t i = 0; i < _num_rows; i++) {
-        if (!_nullmap || !_nullmap[i + _row_pos]) {
+        if (!_nullmap || !_nullmap[i]) {
             _slice[i] = column_string->

[GitHub] [doris] yiguolei merged pull request #10684: [BugFix] Column datas doesn't match nullmap when vectorization load

2022-07-08 Thread GitBox


yiguolei merged PR #10684:
URL: https://github.com/apache/doris/pull/10684





[GitHub] [doris] yiguolei merged pull request #10660: [Doc] add flink-doris-connector 1.1.0 doc

2022-07-08 Thread GitBox


yiguolei merged PR #10660:
URL: https://github.com/apache/doris/pull/10660





[doris] branch dev-1.0.1 updated: [fix](optimizer) join reorder may cause column non-existence problem (#10670)

2022-07-08 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch dev-1.0.1
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/dev-1.0.1 by this push:
 new c60ed8f18a [fix](optimizer) join reorder may cause column 
non-existence problem (#10670)
c60ed8f18a is described below

commit c60ed8f18aac93fe0a515a807387502829760e48
Author: yinzhijian <373141...@qq.com>
AuthorDate: Fri Jul 8 17:28:32 2022 +0800

[fix](optimizer) join reorder may cause column non-existence problem 
(#10670)

for example:
select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b;
If t3 is a large table, it will be placed first after the reorderTable,
and the problem that t2.b does not exist will occur in reanalyzing.
---
 .../src/main/java/org/apache/doris/analysis/SelectStmt.java | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
index 84710d14c3..955f9650b4 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
@@ -817,7 +817,18 @@ public class SelectStmt extends QueryStmt {
                 List candidateEqJoinPredicates = analyzer.getEqJoinConjunctsExcludeAuxPredicates(tid);
                 for (Expr candidateEqJoinPredicate : candidateEqJoinPredicates) {
                     List candidateTupleList = Lists.newArrayList();
-                    Expr.getIds(Lists.newArrayList(candidateEqJoinPredicate), candidateTupleList, null);
+                    List candidateEqJoinPredicateList = Lists.newArrayList(candidateEqJoinPredicate);
+                    // If a large table or view has joinClause is ranked first,
+                    // and the joinClause is not judged here,
+                    // the column in joinClause may not be found during reanalyzing.
+                    // for example:
+                    // select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b;
+                    // If t3 is a large table, it will be placed first after the reorderTable,
+                    // and the problem that t2.b does not exist will occur in reanalyzing
+                    if (candidateTableRef.getOnClause() != null) {
+                        candidateEqJoinPredicateList.add(candidateTableRef.getOnClause());
+                    }
+                    Expr.getIds(candidateEqJoinPredicateList, candidateTupleList, null);
                     int count = candidateTupleList.size();
                     for (TupleId tupleId : candidateTupleList) {
                         if (validTupleId.contains(tupleId) || tid.equals(tupleId)) {





[doris] branch master updated (35a282fd61 -> 2b1d8ac28a)

2022-07-08 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


from 35a282fd61 [BugFix] Column datas doesn't match nullmap when 
vectorization load (#10684)
 add 2b1d8ac28a [Doc] add flink-doris-connector 1.1.0 doc (#10660)

No new revisions were added by this update.

Summary of changes:
 docs/en/docs/ecosystem/flink-doris-connector.md| 343 ++--
 docs/zh-CN/docs/ecosystem/flink-doris-connector.md | 346 -
 2 files changed, 297 insertions(+), 392 deletions(-)





[GitHub] [doris] whutpencil opened a new issue, #10697: [Enhancement] The broker load Kerberos update mechanism has defects, resulting in import errors

2022-07-08 Thread GitBox


whutpencil opened a new issue, #10697:
URL: https://github.com/apache/doris/issues/10697

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   In our scenario, several broker load jobs fail to import every day with a 
GSS authentication failure, and the errors can continue for as long as 
5 minutes.
   
   
![image](https://user-images.githubusercontent.com/24907215/177964603-bcba1547-deb2-42e6-ba28-1ab327356317.png)
   
   
   ### Solution
   
   There is no need for a scheduled task that refreshes the token regularly. 
Instead, check whether the cached filesystem has expired each time it is 
obtained, and update it if it has.
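
The check-on-access idea in the Solution above could look roughly like the following. This is a minimal sketch under stated assumptions, not the broker's actual code: `LazyExpiryCache`, its `ttlMs` lifetime and the `loader` callback are all hypothetical, and the sketch does not touch Kerberos or Hadoop at all. The loader stands in for whatever re-creates the filesystem handle with fresh credentials.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache that validates an entry on every access instead of on a timer. */
public class LazyExpiryCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMs;

        Entry(V value, long expiresAtMs) {
            this.value = value;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMs;
    private final Function<K, V> loader;

    public LazyExpiryCache(long ttlMs, Function<K, V> loader) {
        this.ttlMs = ttlMs;
        this.loader = loader;
    }

    /**
     * Returns the cached value, reloading it first when it is missing or expired.
     * The current time is passed in so the behaviour is easy to test.
     */
    public synchronized V get(K key, long nowMs) {
        Entry<V> entry = entries.get(key);
        if (entry == null || nowMs >= entry.expiresAtMs) {
            entry = new Entry<>(loader.apply(key), nowMs + ttlMs);
            entries.put(key, entry);
        }
        return entry.value;
    }

    public static void main(String[] args) {
        int[] loads = {0};
        LazyExpiryCache<String, String> cache =
                new LazyExpiryCache<>(100, key -> key + "#" + (++loads[0]));

        System.out.println(cache.get("fs", 0));   // first access loads the value
        System.out.println(cache.get("fs", 99));  // still valid, served from the cache
        System.out.println(cache.get("fs", 100)); // expired, reloaded before being returned
    }
}
```

With this shape a caller never receives an entry that is already past its lifetime, which is the property a periodic refresh task cannot guarantee between two runs.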
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] hf200012 opened a new pull request, #10698: [Doc]broker load rpc timeout problem FQA

2022-07-08 Thread GitBox


hf200012 opened a new pull request, #10698:
URL: https://github.com/apache/doris/pull/10698

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   





[GitHub] [doris] whutpencil opened a new pull request, #10699: [enhancement] Improve the availability of broker load

2022-07-08 Thread GitBox


whutpencil opened a new pull request, #10699:
URL: https://github.com/apache/doris/pull/10699

   # Proposed changes
   
   Issue Number: https://github.com/apache/doris/issues/10697
   
   ## Problem Summary:
   
   See issue.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   





[GitHub] [doris] mrhhsg opened a new pull request, #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


mrhhsg opened a new pull request, #10700:
URL: https://github.com/apache/doris/pull/10700

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Tested on ssb-flat 100g with the SQL:
   ```sql
   select count() from (
       SELECT C_CITY, SUM(LO_REVENUE) AS revenue
       FROM lineorder_flat
       GROUP BY C_CITY, S_CITY
   ) a;
   ```
   
   ||non-pre serialize|pre serialize|
   |-|-|-|
   |profile|![image](https://user-images.githubusercontent.com/1179834/177964945-8803ad98-923b-4468-848d-2dd83c31ebb8.png)|![image](https://user-images.githubusercontent.com/1179834/177964545-899a4045-179c-47ea-8ec7-18fefc1d7e71.png)|
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #10698: [Doc]broker load rpc timeout problem FQA

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10698:
URL: https://github.com/apache/doris/pull/10698#issuecomment-1178780755

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10698: [Doc]broker load rpc timeout problem FQA

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10698:
URL: https://github.com/apache/doris/pull/10698#issuecomment-1178780723

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] Gabriel39 opened a new pull request, #10701: [reafactor](predicate) refactor predicates in scan node

2022-07-08 Thread GitBox


Gabriel39 opened a new pull request, #10701:
URL: https://github.com/apache/doris/pull/10701

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   





[GitHub] [doris] jackwener commented on a diff in pull request #10694: optimize topn query if order by columns is prefix of sort keys of table

2022-07-08 Thread GitBox


jackwener commented on code in PR #10694:
URL: https://github.com/apache/doris/pull/10694#discussion_r916590916


##
be/src/olap/reader.cpp:
##
@@ -197,11 +197,17 @@ Status TabletReader::_capture_rs_readers(const ReaderParams& read_params,
 // it's ok for rowset to return unordered result
 need_ordered_result = false;
 }
+
+if (read_params.read_orderby_key) {
+need_ordered_result = true;

Review Comment:
   need_ordered_result = read_params.read_orderby_key



##
be/src/olap/rowset/rowset_reader_context.h:
##
@@ -34,6 +34,9 @@ struct RowsetReaderContext {
 const TabletSchema* tablet_schema = nullptr;
 // whether rowset should return ordered rows.
 bool need_ordered_result = true;
+//

Review Comment:
   Forgot to comment?



##
be/src/vec/olap/vcollect_iterator.h:
##
@@ -102,12 +102,14 @@ class VCollectIterator {
 // if row cursors equal, compare data version.
 class LevelIteratorComparator {
 public:
-LevelIteratorComparator(int sequence = -1) : _sequence(sequence) {}
+LevelIteratorComparator(int sequence, bool is_reverse) :
+_sequence(sequence), _is_reverse(is_reverse) {}
 
 bool operator()(LevelIterator* lhs, LevelIterator* rhs);
 
 private:
 int _sequence;
+bool _is_reverse = false;

Review Comment:
   Add a comment explaining its function?






[GitHub] [doris] morningman opened a new pull request, #10702: [refactor] Rename Catalog to Env

2022-07-08 Thread GitBox


morningman opened a new pull request, #10702:
URL: https://github.com/apache/doris/pull/10702

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Rename the Catalog class to Env.
   The rename was done automatically by the IDE.
   No functional changes or bug fixes are involved.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (No Need)
   3. Has document been added or modified: (No Need)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   
   





[GitHub] [doris] yiguolei commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


yiguolei commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916669026


##
be/src/vec/columns/column.h:
##
@@ -246,6 +246,14 @@ class IColumn : public COW {
 /// Returns pointer to the position after the read data.
 virtual const char* deserialize_and_insert_from_arena(const char* pos) = 0;
 
+virtual size_t get_max_row_byte_size() const { return 0; }

Review Comment:
   Add some comments for the new method, so that other people can read the code 
more easily.






[GitHub] [doris] mrhhsg commented on a diff in pull request #10701: [reafactor](predicate) refactor predicates in scan node

2022-07-08 Thread GitBox


mrhhsg commented on code in PR #10701:
URL: https://github.com/apache/doris/pull/10701#discussion_r916669902


##
be/src/exec/olap_common.h:
##
@@ -44,25 +45,26 @@ std::string cast_to_string(int8_t);
 /**
  * @brief Column's value range
  **/
-template 
+template 
 class ColumnValueRange {
 public:
-typedef typename std::set::iterator iterator_type;
+using CppType = typename 
doris::PrimitiveTypeTraits::CppType;
+using IteratorType = typename std::set::iterator;
 
 ColumnValueRange();
 
-ColumnValueRange(std::string col_name, PrimitiveType type);
+ColumnValueRange(std::string col_name, doris::PrimitiveType type);
 
-ColumnValueRange(std::string col_name, PrimitiveType type, const T& min, const T& max,
- bool contain_null);
+ColumnValueRange(std::string col_name, doris::PrimitiveType type, const CppType& min,

Review Comment:
   Maybe we can remove the arg `type` because it is the same as the template 
arg `doris::PrimitiveType primitive_type`?



##
be/src/exec/olap_common.cpp:
##
@@ -42,17 +42,32 @@ std::string cast_to_string(int8_t value) {
 }
 
 template <>
-void ColumnValueRange::convert_to_fixed_value() {
+void ColumnValueRange::convert_to_fixed_value() {

Review Comment:
   namespace `doris` is redundant.






[GitHub] [doris] yiguolei commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


yiguolei commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916681902


##
be/src/vec/exec/vaggregation_node.h:
##
@@ -50,13 +50,42 @@ struct AggregationMethodSerialized {
 Data data;
 Iterator iterator;
 bool inited = false;
+std::vector keys;
+AggregationMethodSerialized()
+: _serialized_key_buffer_size(0),
+  _serialized_key_buffer(nullptr),
+  _mem_pool(new MemPool) {}
 
-AggregationMethodSerialized() = default;
+using State = ColumnsHashing::HashMethodSerialized;
 
 template 
 explicit AggregationMethodSerialized(const Other& other) : data(other.data) {}
 
-using State = ColumnsHashing::HashMethodSerialized;
+void serialize_keys(const ColumnRawPtrs& key_columns, const size_t num_rows) {
+size_t max_one_row_byte_size = 0;
+for (const auto& column : key_columns) {
+max_one_row_byte_size +=
+std::max(max_one_row_byte_size, column->get_max_row_byte_size());

Review Comment:
   Should this simply be `max_one_row_byte_size += column->get_max_row_byte_size()`?
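
To see why the `std::max` term matters, here is a minimal Java sketch of the same arithmetic. The PR code is C++; `RowSizeSketch` and both method names are hypothetical and do not come from the PR. Adding `max(total, size)` to the running total at least doubles the total on each iteration, whereas a plain sum gives the per-row upper bound that the buffer presumably needs.

```java
import java.util.List;

final class RowSizeSketch {
    // Mirrors the loop in the diff: adds max(total, size) to the running total.
    static long accumulateWithMax(List<Long> columnSizes) {
        long total = 0;
        for (long size : columnSizes) {
            total += Math.max(total, size);
        }
        return total;
    }

    // The reviewer's suggestion: a plain sum of the per-column maximum sizes.
    static long accumulateSum(List<Long> columnSizes) {
        long total = 0;
        for (long size : columnSizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> sizes = List.of(8L, 4L, 4L);
        System.out.println(accumulateWithMax(sizes)); // 8, then 16, then 32
        System.out.println(accumulateSum(sizes));     // 8 + 4 + 4 = 16
    }
}
```

With three key columns the over-estimate is already twice the real bound, and it grows exponentially with the number of key columns.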






[GitHub] [doris] github-actions[bot] commented on pull request #10678: [feature](nereides) support sort translator

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10678:
URL: https://github.com/apache/doris/pull/10678#issuecomment-1178821588

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10678: [feature](nereides) support sort translator

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10678:
URL: https://github.com/apache/doris/pull/10678#issuecomment-1178821541

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] wangsha0 opened a new issue, #10703: [Feature] About supporting the format file of EC when using BrokerLoad to sync data from hdfs to doris

2022-07-08 Thread GitBox


wangsha0 opened a new issue, #10703:
URL: https://github.com/apache/doris/issues/10703

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   The HDFS file uses an erasure coding (EC) policy.
   Broker Load fails with the following error:
   
   ```
JobId: 31095590
Label: xx
State: CANCELLED
 Progress: ETL:N/A; LOAD:N/A
 Type: BROKER
  EtlInfo: NULL
 TaskInfo: cluster:N/A; timeout(s):14400; max_filter_ratio:0.01
 ErrorMsg: type:LOAD_RUN_FAIL; msg:ParseError : Bad read of hdfs://nsxx
   CreateTime: 2022-07-08 15:19:28
 EtlStartTime: 2022-07-08 15:19:30
EtlFinishTime: 2022-07-08 15:19:30
LoadStartTime: 2022-07-08 15:19:30
   LoadFinishTime: 2022-07-08 15:20:41
  URL: NULL
   JobDetails: {xx
   ```
   
   Checking the erasure coding policy of the path with:
   
   `hdfs ec -getPolicy -path hdfs://ns10xx`
   
   returns:
   `RS-6-3-1024k`
   
   ### Use case
   
   RS-6-3-1024k
   
   ### Related issues
   
   RS-6-3-1024k
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] EmmyMiao87 commented on a diff in pull request #10659: [enhancement](nereids) make aggregate works

2022-07-08 Thread GitBox


EmmyMiao87 commented on code in PR #10659:
URL: https://github.com/apache/doris/pull/10659#discussion_r916687709


##
fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundFunction.java:
##
@@ -52,6 +54,14 @@ public List getArguments() {
 return children();
 }
 
+@Override
+public String sql() throws UnboundException {

Review Comment:
   The three method names `toString`, `toSql`, and `toDigest` should probably follow one consistent convention.






[GitHub] [doris] EmmyMiao87 commented on a diff in pull request #10659: [enhancement](nereids) make aggregate works

2022-07-08 Thread GitBox


EmmyMiao87 commented on code in PR #10659:
URL: https://github.com/apache/doris/pull/10659#discussion_r916689097


##
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##
@@ -263,23 +313,18 @@ public PlanFragment visitPhysicalHashJoin(
 // NOTICE: We must visit from right to left, to ensure the last fragment is root fragment
 PlanFragment rightFragment = visit(hashJoin.child(1), context);
 PlanFragment leftFragment = visit(hashJoin.child(0), context);
-PhysicalHashJoin physicalHashJoin = hashJoin.getOperator();
-
-//Expression predicateExpr = physicalHashJoin.getCondition().get();
-//List eqExprList = Utils.getEqConjuncts(hashJoin.child(0).getOutput(),
-//hashJoin.child(1).getOutput(), predicateExpr);
-JoinType joinType = physicalHashJoin.getJoinType();
-
 PlanNode leftFragmentPlanRoot = leftFragment.getPlanRoot();
 PlanNode rightFragmentPlanRoot = rightFragment.getPlanRoot();
+PhysicalHashJoin physicalHashJoin = hashJoin.getOperator();
+JoinType joinType = physicalHashJoin.getJoinType();
 
 if (joinType.equals(JoinType.CROSS_JOIN)

Review Comment:
   Then if we encounter a `PhysicalHashJoin` whose `JoinType` is cross join, an 
error should be reported directly here.
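
The fail-fast check suggested above could look roughly like the following sketch. Everything in it is an assumption made for illustration: `JoinCheckSketch`, the reduced `JoinType` enum, and the choice of `IllegalStateException` are hypothetical and not taken from the Nereids code base.

```java
final class JoinCheckSketch {
    // Hypothetical subset of join types, for illustration only.
    enum JoinType { INNER_JOIN, LEFT_OUTER_JOIN, CROSS_JOIN }

    // Rejects a cross join immediately instead of translating it as a hash join.
    static JoinType requireHashJoinable(JoinType joinType) {
        if (joinType == JoinType.CROSS_JOIN) {
            throw new IllegalStateException(
                    "cross join cannot be translated as a hash join");
        }
        return joinType;
    }
}
```

Reporting the error at the translation step keeps an invalid plan from reaching the executor, where the failure would be harder to trace back.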






[GitHub] [doris] github-actions[bot] commented on pull request #10512: [feature] (vectorization)parquet push down support

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10512:
URL: https://github.com/apache/doris/pull/10512#issuecomment-1178840627

   PR approved by at least one committer and no changes requested.





[doris-spark-connector] branch branch-1.1.0 created (now 2e38c12)

2022-07-08 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch branch-1.1.0
in repository https://gitbox.apache.org/repos/asf/doris-spark-connector.git


  at 2e38c12  Remove disclaimer (#42)

No new revisions were added by this update.





[GitHub] [doris] morrySnow commented on a diff in pull request #10657: [feature](Nereids): enforcer job.

2022-07-08 Thread GitBox


morrySnow commented on code in PR #10657:
URL: https://github.com/apache/doris/pull/10657#discussion_r916409080


##
fe/fe-core/src/main/java/org/apache/doris/nereids/memo/Group.java:
##
@@ -135,6 +137,35 @@ public void setCostLowerBound(double costLowerBound) {
 this.costLowerBound = costLowerBound;
 }
 
+/**
+ * Set or update lowestCostPlans: properties --> new Pair<>(cost, expression)
+ */
+public void setBestPlan(GroupExpression expression, double cost, PhysicalProperties properties) {

Review Comment:
   ```suggestion
   public void updateBestPlan(GroupExpression expression, double cost, PhysicalProperties properties) {
   ```
   
   Furthermore, this setter uses `plan` in its name, while the getter uses 
`expression` instead.



##
fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/JobContext.java:
##
@@ -26,7 +26,7 @@
 public class JobContext {
 private final PlannerContext plannerContext;
 private final PhysicalProperties requiredProperties;
-private final double costUpperBound;
+private double costUpperBound;

Review Comment:
   If we need to update the cost upper bound, the better way is to generate a 
new `JobContext` with the new upper bound. If we use only one `JobContext` for 
all cascades jobs, we need a stack to save and restore the cost status carefully.
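
A minimal sketch of the "new context per bound" idea, assuming a simplified stand-in class. `JobContextSketch` and `withCostUpperBound` are hypothetical names; the real `JobContext` also carries a planner context and required properties, which are omitted here.

```java
// Hypothetical, simplified stand-in for JobContext; only the cost bound is modeled.
final class JobContextSketch {
    private final double costUpperBound;

    JobContextSketch(double costUpperBound) {
        this.costUpperBound = costUpperBound;
    }

    double getCostUpperBound() {
        return costUpperBound;
    }

    // Returns a new context instead of mutating this one, so the bound seen
    // by a parent job is unaffected by whatever a child job does.
    JobContextSketch withCostUpperBound(double newBound) {
        return new JobContextSketch(newBound);
    }
}
```

Because the field stays `final`, each job keeps the bound it was created with, and no explicit stack of saved bounds is needed.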



##
fe/fe-core/src/main/java/org/apache/doris/nereids/operators/plans/physical/PhysicalDistribution.java:
##
@@ -0,0 +1,42 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.operators.plans.physical;
+
+import org.apache.doris.nereids.operators.OperatorType;
+import org.apache.doris.nereids.properties.DistributionSpec;
+import org.apache.doris.nereids.trees.expressions.Expression;
+
+import java.util.List;
+
+/**
+ * Enforcer operator.
+ */
+public class PhysicalDistribution extends PhysicalUnaryOperator {
+
+protected DistributionSpec distributionSpec;
+
+
+public PhysicalDistribution(DistributionSpec spec) {
+super(OperatorType.PHYSICAL_DISTRIBUTION);
+}
+
+@Override
+public List getExpressions() {
+return null;

Review Comment:
   Maybe we should return an empty list here instead of `null`.
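
A minimal sketch of why an empty list is the safer return value. It uses hypothetical stand-in classes (`OperatorSketch`, `DistributionSketch`) with `String` elements rather than the real Nereids types: callers can iterate or call `size()` without a null check.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an operator that exposes its expressions.
abstract class OperatorSketch {
    abstract List<String> getExpressions();

    // Typical caller-side code: would throw NullPointerException on null.
    int countExpressions() {
        return getExpressions().size();
    }
}

// Hypothetical enforcer-like operator that has no expressions of its own.
final class DistributionSketch extends OperatorSketch {
    @Override
    List<String> getExpressions() {
        return Collections.emptyList(); // immutable, shared empty list
    }
}
```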



##
fe/fe-core/src/main/java/org/apache/doris/nereids/memo/GroupExpression.java:
##
@@ -44,6 +44,8 @@ public class GroupExpression {
 
 // Mapping from output properties to the corresponding best cost, statistics, and child properties.
 private final Map>> lowestCostTable;
+// Each physical group expression maintains mapping incoming requests to the corresponding child requests.
+private final Map requestPropertiesMap;

Review Comment:
   Does the value need to be a list?



##
fe/fe-core/src/main/java/org/apache/doris/nereids/memo/GroupExpression.java:
##
@@ -61,6 +63,14 @@ public GroupExpression(Operator op, List children) {
 this.ruleMasks = new BitSet(RuleType.SENTINEL.ordinal());
 this.statDerived = false;
 this.lowestCostTable = Maps.newHashMap();
+this.requestPropertiesMap = Maps.newHashMap();
+}
+
+// TODO: rename
+public PhysicalProperties getPropertyFromMap(PhysicalProperties requiredPropertySet) {

Review Comment:
   ```suggestion
   public PhysicalProperties getPropertyFromMap(PhysicalProperties requiredProperties) {
   ```
   



##
fe/fe-core/src/main/java/org/apache/doris/nereids/properties/OrderKey.java:
##
@@ -42,6 +42,15 @@ public OrderKey(Expression expr, boolean isAsc, boolean nullFirst) {
 this.nullFirst = nullFirst;
 }
 
+/**
+ * Whether other `OrderKey` is satisfied the current `OrderKey`.
+ *
+ * @param other another OrderKey.
+ */
+public boolean matches(OrderKey other) {
+return expr.equals(other.expr) && isAsc == other.isAsc && nullFirst == other.nullFirst;

Review Comment:
   We need a semantic equality check here.




[GitHub] [doris] morrySnow commented on a diff in pull request #10657: [feature](Nereids): enforcer job.

2022-07-08 Thread GitBox


morrySnow commented on code in PR #10657:
URL: https://github.com/apache/doris/pull/10657#discussion_r916707968


##
fe/fe-core/src/main/java/org/apache/doris/nereids/operators/OperatorType.java:
##
@@ -23,9 +23,9 @@
  * 1. ANY: match any operator
  * 2. MULTI: match multiple operators
  * 3. FIXED: the leaf node of pattern tree, which can be matched by a single operator
- * but this operator cannot be used in rules
+ * but this operator cannot be used in rules
  * 4. MULTI_FIXED: the leaf node of pattern tree, which can be matched by multiple operators,
- *but these operators cannot be used in rules
+ * but these operators cannot be used in rules

Review Comment:
   Please keep the original indentation.






[GitHub] [doris] wangbo commented on a diff in pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code

2022-07-08 Thread GitBox


wangbo commented on code in PR #10695:
URL: https://github.com/apache/doris/pull/10695#discussion_r916723622


##
be/src/olap/rowset/segment_v2/column_reader.cpp:
##
@@ -960,13 +960,6 @@ void DefaultValueColumnIterator::insert_default_data(const TypeInfo* type_info,
 dst->insert_many_data(data_ptr, data_len, n);
 break;
 }
-case OLAP_FIELD_TYPE_DATEV2: {
-assert(type_size == sizeof(FieldTypeTraits::CppType)); //uint32_t
-
-int128 = *((FieldTypeTraits::CppType*)mem_value);
-dst->insert_many_data(data_ptr, data_len, n);
-break;

Review Comment:
   Why is this case removed?






[GitHub] [doris] EmmyMiao87 merged pull request #10678: [feature](nereides) support sort translator

2022-07-08 Thread GitBox


EmmyMiao87 merged PR #10678:
URL: https://github.com/apache/doris/pull/10678





[doris] branch master updated: [feature](nereides) support sort translator (#10678)

2022-07-08 Thread lingmiao
This is an automated email from the ASF dual-hosted git repository.

lingmiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new e6da00bb26 [feature](nereides) support sort translator (#10678)
e6da00bb26 is described below

commit e6da00bb261ef6a04207336e318e64b2558193c3
Author: zhengshiJ <32082872+zhengs...@users.noreply.github.com>
AuthorDate: Fri Jul 8 19:22:48 2022 +0800

[feature](nereides) support sort translator (#10678)

Physical sort:
 * 1. Build sortInfo
 *There are two types of slotRef:
 *one is generated by the previous node, collectively called old.
 *the other is newly generated by the sort node, collectively called new.
 *Filling of sortInfo related data structures,
 *a. ordering use newSlotRef.
 *b. sortTupleSlotExprs use oldSlotRef.
 * 2. Create sortNode
 * 3. Create mergeFragment

TODO:
1. Currently, columns that do not exist in select but exist in order by cannot be parsed.
eg: select key from table order by value;

2. For the combination of Literal and slotRefrance in select, there is a problem with parsing,
eg: select key ,(10-value) from table;
---
 .../java/org/apache/doris/analysis/SortInfo.java   |  4 ++
 .../org/apache/doris/nereids/NereidsPlanner.java   |  6 +--
 .../glue/translator/PhysicalPlanTranslator.java| 56 +++---
 .../glue/translator/PlanTranslatorContext.java | 13 -
 .../java/org/apache/doris/planner/PlanNode.java|  7 +++
 .../java/org/apache/doris/planner/SortNode.java| 23 +
 .../java/org/apache/doris/qe/StmtExecutor.java |  1 -
 7 files changed, 98 insertions(+), 12 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java
index 05763e8d07..63394db4cd 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java
@@ -141,6 +141,10 @@ public class SortInfo {
 this.sortTupleSlotExprs = sortTupleSlotExprs;
 }
 
+public void setSortTupleDesc(TupleDescriptor tupleDesc) {
+sortTupleDesc = tupleDesc;
+}
+
 public TupleDescriptor getSortTupleDescriptor() {
 return sortTupleDesc;
 }
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
index bc7aed0bf1..23fafa518d 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java
@@ -25,7 +25,6 @@ import org.apache.doris.analysis.StatementBase;
 import org.apache.doris.analysis.TupleDescriptor;
 import org.apache.doris.analysis.TupleId;
 import org.apache.doris.common.AnalysisException;
-import org.apache.doris.common.Id;
 import org.apache.doris.common.UserException;
 import org.apache.doris.nereids.glue.LogicalPlanAdapter;
 import org.apache.doris.nereids.glue.translator.PhysicalPlanTranslator;
@@ -39,7 +38,6 @@ import org.apache.doris.nereids.memo.GroupExpression;
 import org.apache.doris.nereids.memo.Memo;
 import org.apache.doris.nereids.properties.PhysicalProperties;
 import org.apache.doris.nereids.trees.expressions.NamedExpression;
-import org.apache.doris.nereids.trees.expressions.Slot;
 import org.apache.doris.nereids.trees.plans.Plan;
 import org.apache.doris.nereids.trees.plans.logical.LogicalPlan;
 import org.apache.doris.nereids.trees.plans.physical.PhysicalPlan;
@@ -104,8 +102,8 @@ public class NereidsPlanner extends Planner {
 outputCandidates.put(slotDescriptor.getId().asInt(), slotRef);
 }
 }
-physicalPlan.getOutput().stream().map(Slot::getExprId)
-.map(Id::asInt).forEach(i -> outputExprs.add(outputCandidates.get(i)));
+physicalPlan.getOutput().stream()
+.forEach(i -> outputExprs.add(planTranslatorContext.findExpr(i)));
 root.setOutputExprs(outputExprs);
 root.getPlanRoot().convertToVectoriezd();
 
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
index 511ac61fb4..0dd6b9e859 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
@@ -204,6 +204,26 @@ public class PhysicalPlanTranslator extends 
PlanOperatorVisitor sort,
 PlanTranslatorContext context) {
@@ -211,24 +231,35 @@ public class PhysicalPlanTranslator extends 
PlanOperatorVisitor execOrderingExprList = Lists.newArr

[GitHub] [doris] github-actions[bot] commented on pull request #10699: [enhancement] Improve the availability of broker load

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10699:
URL: https://github.com/apache/doris/pull/10699#issuecomment-1178875878

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10699: [enhancement] Improve the availability of broker load

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10699:
URL: https://github.com/apache/doris/pull/10699#issuecomment-1178875855

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] zhengshubin opened a new issue, #10704: [Feature] Use Function to set the default value when add new column

2022-07-08 Thread GitBox


zhengshubin opened a new issue, #10704:
URL: https://github.com/apache/doris/issues/10704

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   Currently, to add a DATE column whose default value is derived from an 
existing DATETIME column, I have to rebuild the table. If a function could be 
used to set the default value when adding a new column, it would improve work 
efficiency.
   
   ### Use case
   
   The SQL statement would look like this:
   ALTER TABLE t1 ADD COLUMN dt DATE DEFAULT toDate(oldColumn)
   
   
   
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] github-actions[bot] commented on pull request #10650: [Bug][Function] pass intermediate argument list to be

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10650:
URL: https://github.com/apache/doris/pull/10650#issuecomment-1178894215

   PR approved by anyone and no changes requested.





[GitHub] [doris] Gabriel39 commented on a diff in pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code

2022-07-08 Thread GitBox


Gabriel39 commented on code in PR #10695:
URL: https://github.com/apache/doris/pull/10695#discussion_r916741114


##
be/src/olap/rowset/segment_v2/column_reader.cpp:
##
@@ -960,13 +960,6 @@ void DefaultValueColumnIterator::insert_default_data(const TypeInfo* type_info,
 dst->insert_many_data(data_ptr, data_len, n);
 break;
 }
-case OLAP_FIELD_TYPE_DATEV2: {
-assert(type_size == sizeof(FieldTypeTraits::CppType)); //uint32_t
-
-int128 = *((FieldTypeTraits::CppType*)mem_value);
-dst->insert_many_data(data_ptr, data_len, n);
-break;

Review Comment:
   Because this is the same as the default behavior.






[GitHub] [doris] github-actions[bot] commented on pull request #10655: [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2]

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10655:
URL: https://github.com/apache/doris/pull/10655#issuecomment-1178901442

   PR approved by at least one committer and no changes requested.





[GitHub] [doris-flink-connector] hf200012 opened a new pull request, #44: remove DISCLAIMER

2022-07-08 Thread GitBox


hf200012 opened a new pull request, #44:
URL: https://github.com/apache/doris-flink-connector/pull/44

   remove DISCLAIMER
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   





[GitHub] [doris-flink-connector] hf200012 closed pull request #44: remove DISCLAIMER

2022-07-08 Thread GitBox


hf200012 closed pull request #44: remove DISCLAIMER
URL: https://github.com/apache/doris-flink-connector/pull/44





[GitHub] [doris-flink-connector] hf200012 opened a new pull request, #45: remove DISCLAIMER

2022-07-08 Thread GitBox


hf200012 opened a new pull request, #45:
URL: https://github.com/apache/doris-flink-connector/pull/45

   remove DISCLAIMER
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   





[doris-flink-connector] branch master updated: remove DISCLAIMER (#45)

2022-07-08 Thread jiafengzheng
This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-flink-connector.git


The following commit(s) were added to refs/heads/master by this push:
 new e1e2f13  remove DISCLAIMER (#45)
e1e2f13 is described below

commit e1e2f133b1457828fbc5b4f9d126bc362f102fa1
Author: jiafeng.zhang 
AuthorDate: Fri Jul 8 20:00:39 2022 +0800

remove DISCLAIMER (#45)

remove DISCLAIMER
---
 DISCLAIMER | 12 
 1 file changed, 12 deletions(-)

diff --git a/DISCLAIMER b/DISCLAIMER
deleted file mode 100644
index 2769edd..000
--- a/DISCLAIMER
+++ /dev/null
@@ -1,12 +0,0 @@
-Apache Doris is an effort undergoing incubation at The
-Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC.
-
-Incubation is required of all newly accepted
-projects until a further review indicates that the
-infrastructure, communications, and decision making process have
-stabilized in a manner consistent with other successful ASF
-projects.
-
-While incubation status is not necessarily a reflection
-of the completeness or stability of the code, it does indicate
-that the project has yet to be fully endorsed by the ASF.





[GitHub] [doris-flink-connector] hf200012 merged pull request #45: remove DISCLAIMER

2022-07-08 Thread GitBox


hf200012 merged PR #45:
URL: https://github.com/apache/doris-flink-connector/pull/45





[GitHub] [doris] yangzhg opened a new issue, #10705: [Bug] Fe crash by bdbje

2022-07-08 Thread GitBox


yangzhg opened a new issue, #10705:
URL: https://github.com/apache/doris/issues/10705

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   master
   
   ### What's Wrong?
   
   2022-07-05 23:27:27,685 WARN (replayer|79) [Catalog.replayJournal():2506] replay journal cost too much time: 1001 replayedJournalId: 1160191
   2022-07-05 23:27:27,687 INFO (replayer|79) [Catalog.replayJournal():2478] replayed journal id is 1160191, replay to journal id is 1160192
   2022-07-05 23:27:28,687 WARN (replayer|79) [BDBJournalCursor.next():148] Catch an exception when get next JournalEntity. key:1160192
   com.sleepycat.je.LockTimeoutException: (JE 7.3.7) Lock expired. Locker 497486454 -1_replayer_ReplicaThreadLocker: waited for lock on database=1150001 LockAddr:1459556016 LSN=0x43/0x6648b type=READ grant=WAIT_NEW timeoutMillis=1000 startTime=1657034847687 endTime=1657034848687
   Owners: [<LockInfo locker="604669736 -1495664_ReplayThread_ReplayTxn" type="WRITE"/>]
   Waiters: []
   
   at com.sleepycat.je.txn.LockManager.makeTimeoutException(LockManager.java:1117) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.txn.LockManager.waitForLock(LockManager.java:606) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.txn.LockManager.lock(LockManager.java:345) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.txn.BasicLocker.lockInternal(BasicLocker.java:124) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.rep.txn.ReplicaThreadLocker.lockInternal(ReplicaThreadLocker.java:63) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.txn.Locker.lock(Locker.java:499) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:3585) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:3316) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.dbi.CursorImpl.lockLNAndCheckDefunct(CursorImpl.java:2138) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.dbi.CursorImpl.searchExact(CursorImpl.java:1950) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.Cursor.searchExact(Cursor.java:4194) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.Cursor.searchNoDups(Cursor.java:4055) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.Cursor.search(Cursor.java:3857) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.Cursor.getInternal(Cursor.java:1284) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.Database.get(Database.java:1271) ~[je-7.3.7.jar:7.3.7]
   at com.sleepycat.je.Database.get(Database.java:1330) ~[je-7.3.7.jar:7.3.7]
   at org.apache.doris.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:108) [palo-fe.jar:0.15-SNAPSHOT]
   at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2488) [palo-fe.jar:0.15-SNAPSHOT]
   at org.apache.doris.catalog.Catalog$3.runOneCycle(Catalog.java:2277) [palo-fe.jar:0.15-SNAPSHOT]
   at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:0.15-SNAPSHOT]
   
   ### What You Expected?
   
   normal
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] github-actions[bot] commented on pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10695:
URL: https://github.com/apache/doris/pull/10695#issuecomment-1178918184

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10695: [BUG](datev2) fix bloom filter for datev2 and remove redundant code

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10695:
URL: https://github.com/apache/doris/pull/10695#issuecomment-1178918203

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10698: [Doc]broker load rpc timeout problem FQA

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10698:
URL: https://github.com/apache/doris/pull/10698#issuecomment-1178919064

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] liaoxin01 opened a new pull request, #10706: [feature-wip](unique-key-merge-on-write) add bloom filter index for primary key, DSIP-018[1.2]

2022-07-08 Thread GitBox


liaoxin01 opened a new pull request, #10706:
URL: https://github.com/apache/doris/pull/10706

   # Proposed changes
   
   Add a Bloom filter index for the primary key. This patch is for step 1.2 in the DSIP-018 scheduling.
   For details, see DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (Yes)
   3. Has document been added or modified: (No)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   





[GitHub] [doris] morningman commented on pull request #10702: [refactor] Rename Catalog to Env

2022-07-08 Thread GitBox


morningman commented on PR #10702:
URL: https://github.com/apache/doris/pull/10702#issuecomment-1178939953

   > Why do we need this modification?
   
   As we discussed in dev@doris: https://lists.apache.org/thread/tr2fgydon657wvoy8vf1ccr8z9xos693





[doris] annotated tag 1.1.0-rc04 updated (a6eb47ac08 -> 113293c989)

2022-07-08 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to annotated tag 1.1.0-rc04
in repository https://gitbox.apache.org/repos/asf/doris.git


*** WARNING: tag 1.1.0-rc04 was modified! ***

from a6eb47ac08 (commit)
  to 113293c989 (tag)
 tagging a6eb47ac0875ed51291ed7b1cd990d40f7d901de (commit)
  by morningman
  on Fri Jul 8 20:41:15 2022 +0800

- Log -
1.1.0-rc04
---


No new revisions were added by this update.

Summary of changes:





[GitHub] [doris] yiguolei merged pull request #10691: [refactor] update stop_be.sh to avoid error message

2022-07-08 Thread GitBox


yiguolei merged PR #10691:
URL: https://github.com/apache/doris/pull/10691





[doris] branch master updated: [refactor] update stop_be.sh to avoid error message (#10691)

2022-07-08 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 6f29a8ac0d [refactor] update stop_be.sh to avoid error message (#10691)
6f29a8ac0d is described below

commit 6f29a8ac0d3de9dcb7082681fdc6700b1952ea95
Author: minghong 
AuthorDate: Fri Jul 8 20:49:00 2022 +0800

[refactor] update stop_be.sh to avoid error message (#10691)

* update stop_be.sh to avoid error message

* update stop_be.sh
---
 bin/stop_be.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bin/stop_be.sh b/bin/stop_be.sh
index 9d65d73307..e189f91112 100755
--- a/bin/stop_be.sh
+++ b/bin/stop_be.sh
@@ -23,7 +23,7 @@ export DORIS_HOME=`cd "$curdir/.."; pwd`
 export PID_DIR=`cd "$curdir"; pwd`
 
 signum=9
-if [ $1 = "--grace" ]; then
+if [[ $1 = "--grace" ]]; then
 signum=15
 fi
 





[GitHub] [doris] yiguolei closed issue #10641: [Bug] Core dump when aggregate function with no group by

2022-07-08 Thread GitBox


yiguolei closed issue #10641: [Bug] Core dump when aggregate function with no group by
URL: https://github.com/apache/doris/issues/10641





[GitHub] [doris] yiguolei merged pull request #10650: [Bug][Function] pass intermediate argument list to be

2022-07-08 Thread GitBox


yiguolei merged PR #10650:
URL: https://github.com/apache/doris/pull/10650





[doris] branch master updated: [Bug][Function] pass intermediate argument list to be (#10650)

2022-07-08 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new f58a071605 [Bug][Function] pass intermediate argument list to be (#10650)
f58a071605 is described below

commit f58a071605a1aaa7a68a99cdd2f098a5868787e4
Author: Pxl <952130...@qq.com>
AuthorDate: Fri Jul 8 20:50:05 2022 +0800

[Bug][Function] pass intermediate argument list to be (#10650)
---
 .../aggregate_function_orthogonal_bitmap.cpp   |  2 --
 .../aggregate_function_topn.cpp|  5 +
 .../aggregate_functions/aggregate_function_topn.h  |  8 
 be/src/vec/data_types/data_type_factory.hpp|  4 
 be/src/vec/exprs/vectorized_agg_fn.cpp | 22 ++
 be/src/vec/exprs/vectorized_agg_fn.h   |  2 +-
 .../org/apache/doris/analysis/AggregateInfo.java   | 15 ---
 .../apache/doris/analysis/FunctionCallExpr.java| 20 +++-
 .../org/apache/doris/analysis/FunctionParams.java  | 15 +++
 gensrc/thrift/Exprs.thrift |  1 +
 gensrc/thrift/Types.thrift |  1 +
 11 files changed, 52 insertions(+), 43 deletions(-)

diff --git a/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp b/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp
index 470a6c8388..9794a72090 100644
--- a/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp
+++ b/be/src/vec/aggregate_functions/aggregate_function_orthogonal_bitmap.cpp
@@ -34,8 +34,6 @@ AggregateFunctionPtr create_aggregate_function_orthogonal(const std::string& nam
 LOG(WARNING) << "Incorrect number of arguments for aggregate function " << name;
 return nullptr;
 } else if (argument_types.size() == 1) {
-// only used at AGGREGATE (merge finalize) for variadic function
-// and for orthogonal_bitmap_union_count function
 return std::make_shared>>(argument_types);
 } else {
 const IDataType& argument_type = *argument_types[1];
diff --git a/be/src/vec/aggregate_functions/aggregate_function_topn.cpp b/be/src/vec/aggregate_functions/aggregate_function_topn.cpp
index 04df93ce67..19f52fbff8 100644
--- a/be/src/vec/aggregate_functions/aggregate_function_topn.cpp
+++ b/be/src/vec/aggregate_functions/aggregate_function_topn.cpp
@@ -23,10 +23,7 @@ AggregateFunctionPtr create_aggregate_function_topn(const std::string& name,
 const DataTypes& argument_types,
 const Array& parameters,
 const bool result_is_nullable) {
-if (argument_types.size() == 1) {
-return AggregateFunctionPtr(
-new AggregateFunctionTopN(argument_types));
-} else if (argument_types.size() == 2) {
+if (argument_types.size() == 2) {
 return AggregateFunctionPtr(
 new AggregateFunctionTopN>(
 argument_types));
diff --git a/be/src/vec/aggregate_functions/aggregate_function_topn.h b/be/src/vec/aggregate_functions/aggregate_function_topn.h
index 97ac5c7cba..ae9fdf322d 100644
--- a/be/src/vec/aggregate_functions/aggregate_function_topn.h
+++ b/be/src/vec/aggregate_functions/aggregate_function_topn.h
@@ -168,14 +168,6 @@ struct StringDataImplTopN {
 }
 };
 
-struct AggregateFunctionTopNImplEmpty {
-// only used at AGGREGATE (merge finalize)
-static void add(AggregateFunctionTopNData& __restrict place, const IColumn** columns,
-size_t row_num) {
-LOG(FATAL) << "AggregateFunctionTopNImplEmpty do not support add()";
-}
-};
-
 template 
 struct AggregateFunctionTopNImplInt {
 static void add(AggregateFunctionTopNData& __restrict place, const IColumn** columns,
diff --git a/be/src/vec/data_types/data_type_factory.hpp b/be/src/vec/data_types/data_type_factory.hpp
index 08dc6a9f31..59740debd3 100644
--- a/be/src/vec/data_types/data_type_factory.hpp
+++ b/be/src/vec/data_types/data_type_factory.hpp
@@ -102,6 +102,10 @@ public:
 
 DataTypePtr create_data_type(const arrow::DataType* type, bool is_nullable);
 
+DataTypePtr create_data_type(const TTypeDesc& raw_type) {
+return create_data_type(TypeDescriptor::from_thrift(raw_type), raw_type.is_nullable);
+}
+
 private:
 DataTypePtr _create_primitive_data_type(const FieldType& type) const;
 
diff --git a/be/src/vec/exprs/vectorized_agg_fn.cpp b/be/src/vec/exprs/vectorized_agg_fn.cpp
index ad7066a9a4..b7e14817f1 100644
--- a/be/src/vec/exprs/vectorized_agg_fn.cpp
+++ b/be/src/vec/exprs/vectorized_agg_fn.cpp
@@ -33,7 +33,6 @@ AggFnEvaluator::AggFnEvaluator(const TExprNode& desc)
 : _fn(desc.fn),
   _is_merge(desc.agg_expr.is_merge_agg),

[GitHub] [doris] yiguolei merged pull request #10577: [enhancement](regression-test) add real data path for regression test.

2022-07-08 Thread GitBox


yiguolei merged PR #10577:
URL: https://github.com/apache/doris/pull/10577





[doris] branch master updated (f58a071605 -> 2b2bf017f8)

2022-07-08 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


from f58a071605 [Bug][Function] pass intermediate argument list to be (#10650)
 add 2b2bf017f8 [enhancement](regression-test) add real data path for regression test. (#10577)

No new revisions were added by this update.

Summary of changes:
 regression-test/conf/regression-conf.groovy|  1 +
 .../org/apache/doris/regression/Config.groovy  | 13 ++--
 .../apache/doris/regression/ConfigOptions.groovy   |  9 ++
 .../org/apache/doris/regression/suite/Suite.groovy |  5 
 .../doris/regression/suite/SuiteContext.groovy | 35 ++
 5 files changed, 61 insertions(+), 2 deletions(-)





[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10701: [refactor](predicate) refactor predicates in scan node

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10701:
URL: https://github.com/apache/doris/pull/10701#discussion_r916809110


##
be/src/vec/exec/volap_scan_node.cpp:
##
@@ -937,7 +944,7 @@ std::pair VOlapScanNode::should_push_down_eq_predicate(doris::SlotD
 return result_pair;
 }
 
-template 
+template 
 Status VOlapScanNode::change_fixed_value_range(ColumnValueRange& temp_range, PrimitiveType type,
 void* value, const ChangeFixedValueRangeFunc& func) {

Review Comment:
   Is `T` always equal to `type`?






[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916821559


##
be/src/vec/common/columns_hashing.h:
##
@@ -111,29 +111,48 @@ struct HashMethodString : public columns_hashing_impl::HashMethodBase<
   * That is, for example, for strings, it contains first the serialized length of the string, and then the bytes.
   * Therefore, when aggregating by several strings, there is no ambiguity.
   */
-template 
+template 
 struct HashMethodSerialized
-: public columns_hashing_impl::HashMethodBase, Value,
-  Mapped, false> {
-using Self = HashMethodSerialized;
+: public columns_hashing_impl::HashMethodBase<
+  HashMethodSerialized, Value, Mapped, false> {
+using Self = HashMethodSerialized;
 using Base = columns_hashing_impl::HashMethodBase;
+using KeyHolderType =
+std::conditional_t;
 
 ColumnRawPtrs key_columns;
 size_t keys_size;
+const StringRef* keys;
 
 HashMethodSerialized(const ColumnRawPtrs& key_columns_, const Sizes& /*key_sizes*/,
  const HashMethodContextPtr&)
 : key_columns(key_columns_), keys_size(key_columns_.size()) {}
 
+void set_serialized_keys(StringRef* keys_) { keys = keys_; }

Review Comment:
   Maybe we can add const here.
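
The suggestion above can be illustrated with a minimal sketch. It assumes the setter never mutates the keys it receives; `StringRef` and `SerializedKeysHolder` below are hypothetical stand-ins, not the Doris definitions, and the template parameters of the real `HashMethodSerialized` are omitted.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for Doris' StringRef: a non-owning view of bytes.
using StringRef = std::string_view;

// Simplified holder; the real HashMethodSerialized has template parameters
// and further members that are left out of this sketch.
struct SerializedKeysHolder {
    const StringRef* keys = nullptr;

    // Taking a pointer-to-const documents that the holder only reads the
    // keys, and lets callers pass read-only buffers without a const_cast.
    void set_serialized_keys(const StringRef* keys_) { keys = keys_; }
};

// Returns the length of the i-th serialized key seen through the holder.
inline std::size_t key_length(const SerializedKeysHolder& holder, std::size_t i) {
    return holder.keys[i].size();
}
```

Because the member is already declared as `const StringRef* keys`, widening the parameter to `const StringRef*` should not require any other change in this simplified form.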






[GitHub] [doris] morningman merged pull request #10655: [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2]

2022-07-08 Thread GitBox


morningman merged PR #10655:
URL: https://github.com/apache/doris/pull/10655





[doris] branch master updated: [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)

2022-07-08 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new feeef7e4da [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)
feeef7e4da is described below

commit feeef7e4dab86c87a15ba724964627a9768ca682
Author: zhannngchen <48427519+zhannngc...@users.noreply.github.com>
AuthorDate: Fri Jul 8 21:39:13 2022 +0800

[feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)

Add interfaces for segment key bounds. Key bounds will be used to speed up point lookups
on the primary key index of each segment.
For details, see DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model

KeyBounds will be updated by BetaRowsetWriter and used to construct a RowsetTree (based on IntervalTree,
to be added in the next patch).
---
 be/src/olap/rowset/rowset.h  |  5 +
 be/src/olap/rowset/rowset_meta.h | 13 +
 gensrc/proto/olap_file.proto |  8 
 3 files changed, 26 insertions(+)

diff --git a/be/src/olap/rowset/rowset.h b/be/src/olap/rowset/rowset.h
index 158848be89..ec2a39652b 100644
--- a/be/src/olap/rowset/rowset.h
+++ b/be/src/olap/rowset/rowset.h
@@ -258,6 +258,11 @@ public:
 }
 }
 
+virtual Status get_segments_key_bounds(std::vector<KeyBoundsPB>* segments_key_bounds) {
+_rowset_meta->get_segments_key_bounds(segments_key_bounds);
+return Status::OK();
+}
+
 protected:
 friend class RowsetFactory;
 
diff --git a/be/src/olap/rowset/rowset_meta.h b/be/src/olap/rowset/rowset_meta.h
index e4153e2345..c91fe0469d 100644
--- a/be/src/olap/rowset/rowset_meta.h
+++ b/be/src/olap/rowset/rowset_meta.h
@@ -298,6 +298,19 @@ public:
 return score;
 }
 
+void get_segments_key_bounds(std::vector<KeyBoundsPB>* segments_key_bounds) const {
+for (const KeyBoundsPB& key_range : _rowset_meta_pb.segments_key_bounds()) {
+segments_key_bounds->push_back(key_range);
+}
+}
+
+void set_segments_key_bounds(const std::vector<KeyBoundsPB>& segments_key_bounds) {
+for (const KeyBoundsPB& key_bounds : segments_key_bounds) {
+KeyBoundsPB* new_key_bounds = _rowset_meta_pb.add_segments_key_bounds();
+*new_key_bounds = key_bounds;
+}
+}
+
 const AlphaRowsetExtraMetaPB& alpha_rowset_extra_meta_pb() const {
 return _rowset_meta_pb.alpha_rowset_extra_meta_pb();
 }
diff --git a/gensrc/proto/olap_file.proto b/gensrc/proto/olap_file.proto
index 0d484a292d..4385d5803d 100644
--- a/gensrc/proto/olap_file.proto
+++ b/gensrc/proto/olap_file.proto
@@ -53,6 +53,11 @@ enum SegmentsOverlapPB {
 NONOVERLAPPING = 2;
 }
 
+message KeyBoundsPB {
+required bytes min_key = 1;
+required bytes max_key = 2;
+}
+
 message RowsetMetaPB {
 required int64 rowset_id = 1;
 optional int64 partition_id = 2;
@@ -99,6 +104,9 @@ message RowsetMetaPB {
 optional int64 oldest_write_timestamp = 25 [default = -1];
 // latest write time
 optional int64 newest_write_timestamp = 26 [default = -1];
+// the encoded segment min/max key of segments in this rowset,
+// only used in unique key data model with primary_key_index support.
+repeated KeyBoundsPB segments_key_bounds = 27;
 // spare field id for future use
 optional AlphaRowsetExtraMetaPB alpha_rowset_extra_meta_pb = 50;
 // to indicate whether the data between the segments overlap





[GitHub] [doris] Gabriel39 opened a new issue, #10707: [Bug] InPredicate core dump in runtime filter

2022-07-08 Thread GitBox


Gabriel39 opened a new issue, #10707:
URL: https://github.com/apache/doris/issues/10707

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   An InPredicate with no children is a predicate that is always false. This can occur for runtime filters, but it currently causes a core dump in the InPredicate DCHECK.
   
   ### What's Wrong?
   
   An InPredicate with no children is a predicate that is always false. This can occur for runtime filters, but it currently causes a core dump in the InPredicate DCHECK.
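
The expected behaviour described in this report can be sketched as follows: an IN predicate with an empty value set is treated as a valid "always false" filter instead of tripping an assertion. `InSetPredicate` is a hypothetical, simplified type over `int` values and is not the Doris class; how the actual fix handles the DCHECK is not shown in this message.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified IN predicate over int values.
struct InSetPredicate {
    std::unordered_set<int> values;

    // Returns one byte per row: 1 if the row matches, 0 otherwise.
    // An empty value set is a valid state that selects no rows.
    std::vector<std::uint8_t> evaluate(const std::vector<int>& column) const {
        std::vector<std::uint8_t> selected(column.size(), 0);
        if (values.empty()) {
            return selected;  // always false, no assertion on emptiness
        }
        for (std::size_t i = 0; i < column.size(); ++i) {
            selected[i] = values.count(column[i]) != 0 ? 1 : 0;
        }
        return selected;
    }
};
```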
   
   ### What You Expected?
   
   works well
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] weizuo93 opened a new issue, #10708: [Feature] Add interface to check tablet segment lost

2022-07-08 Thread GitBox


weizuo93 opened a new issue, #10708:
URL: https://github.com/apache/doris/issues/10708

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   Some exceptions may cause segments to be lost on a BE node while the metadata still shows the tablet as normal. Such an abnormal replica is not detected by the FE and cannot be repaired automatically. When a query arrives, it fails with the error `failed to initialize storage reader`.
   
   I think we should be able to check whether a tablet has lost segments.
   
   ### Use case
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] Gabriel39 opened a new pull request, #10709: [BUG] fix DCHECK failed for vectorized InPredicate

2022-07-08 Thread GitBox


Gabriel39 opened a new pull request, #10709:
URL: https://github.com/apache/doris/pull/10709

   # Proposed changes
   
   Issue Number: close #10707
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   





[GitHub] [doris] github-actions[bot] commented on pull request #10709: [BUG] fix DCHECK failed for vectorized InPredicate

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10709:
URL: https://github.com/apache/doris/pull/10709#issuecomment-1179020003

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10709: [BUG] fix DCHECK failed for vectorized InPredicate

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10709:
URL: https://github.com/apache/doris/pull/10709#issuecomment-1179020063

   PR approved by anyone and no changes requested.





[GitHub] [doris] jackwener opened a new pull request, #10710: [improve](planner): split output expr to multiple line.

2022-07-08 Thread GitBox


jackwener opened a new pull request, #10710:
URL: https://github.com/apache/doris/pull/10710

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   Split the output exprs across multiple lines.
   
   ```
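   +-----------------------------------------+
   | Explain String                          |
   +-----------------------------------------+
   | PLAN FRAGMENT 0                         |
   |   OUTPUT EXPRS:                         |
   |     `user_id`                           |
   |     `default_cluster:test`.`tbl`.`date` |
   |     `city`                              |
   |     `default_cluster:test`.`tbl`.`age`  |
   +-----------------------------------------+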
   ```
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: Yes
   2. Has unit tests been added: No need
   3. Has document been added or modified: No need
   4. Does it need to update dependencies: No
   5. Are there any changes that cannot be rolled back: No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   





[GitHub] [doris] weizuo93 opened a new pull request, #10711: [Feature] Add interface to check tablet segment lost

2022-07-08 Thread GitBox


weizuo93 opened a new pull request, #10711:
URL: https://github.com/apache/doris/pull/10711

   # Proposed changes
   
   Issue Number: close #10708 
   
   ## Problem Summary:
   
   A segment can be lost on a BE node because of an exception while the 
metadata still reports the tablet as normal. FE does not detect the abnormal 
replica, so it cannot be repaired automatically. When a query arrives, it 
fails with `failed to initialize storage reader`. I think we should be able to 
check for tablets with lost segments.
   
   This patch adds an interface to check for tablets with lost segments:
   ```
   curl -X GET http://be_host:webserver_port/api/check_tablet_segment_existence
   ```
   
   The interface returns all tablets on the current BE node that have lost 
segments:
   ```
   {
       msg: "Succeed to check all tablet segment",
       num: 3,
       bad_tablets: [
           11190,
           11210,
           11216
       ],
       host: "172.3.0.101"
   }
   ```
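
The kind of check described above can be sketched roughly as follows. This is not the PR's implementation: the `TabletSegments` map and the function name are hypothetical, and the sketch simply assumes the metadata can be turned into a list of expected segment file paths per tablet.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <map>
#include <system_error>
#include <vector>

// Hypothetical metadata view: tablet id -> segment file paths the metadata expects.
using TabletSegments = std::map<std::int64_t, std::vector<std::filesystem::path>>;

// Returns the ids of tablets with at least one expected segment file missing on disk.
std::vector<std::int64_t> find_tablets_with_lost_segments(const TabletSegments& tablets) {
    std::vector<std::int64_t> bad_tablets;
    for (const auto& [tablet_id, segment_paths] : tablets) {
        for (const auto& path : segment_paths) {
            std::error_code ec;  // non-throwing overload; an error is treated as "missing"
            if (!std::filesystem::exists(path, ec)) {
                bad_tablets.push_back(tablet_id);
                break;  // one missing segment is enough to flag the tablet
            }
        }
    }
    return bad_tablets;
}
```

Because `std::map` iterates in key order, the returned ids come out sorted, which matches the shape of the `bad_tablets` array in the sample response.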
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (No Need)
   3. Has document been added or modified: (Yes)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   





[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916847546


##
be/src/vec/columns/column_nullable.cpp:
##
@@ -134,6 +134,24 @@ const char* ColumnNullable::deserialize_and_insert_from_arena(const char* pos) {
 return pos;
 }
 
+size_t ColumnNullable::get_max_row_byte_size() const {
+constexpr auto flag_size = sizeof(get_null_map_data()[0]);

Review Comment:
   Maybe we can just use NullMap::T
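
A minimal sketch of the difference between the two spellings, under stated assumptions: `std::vector<std::uint8_t>` stands in for the real `NullMap` type, and `value_type` stands in for the `NullMap::T` alias named in the comment (this sketch does not confirm that `NullMap::T` exists in the codebase).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the real null map type; an assumption for illustration only.
using NullMap = std::vector<std::uint8_t>;

// Size of one null flag taken from the element type; no instance is needed.
constexpr std::size_t null_flag_size_from_type() {
    return sizeof(NullMap::value_type);
}

// Size of one null flag taken from an element expression, as in the diff.
// sizeof does not evaluate its operand, so an empty map is safe here.
std::size_t null_flag_size_from_expr(const NullMap& null_map) {
    return sizeof(null_map[0]);
}

static_assert(null_flag_size_from_type() == 1, "one byte per null flag");
```

Both forms yield the same value; the type-based form states the intent without depending on an object.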






[GitHub] [doris] morningman commented on a diff in pull request #10620: [Enhancement][multi-catalog]Impl parallel for file scanner to improve the scanner performance

2022-07-08 Thread GitBox


morningman commented on code in PR #10620:
URL: https://github.com/apache/doris/pull/10620#discussion_r916803679


##
fe/fe-core/src/main/java/org/apache/doris/common/Config.java:
##
@@ -1654,6 +1654,12 @@ public class Config extends ConfigBase {
 @ConfField(mutable = false, masterOnly = true)
 public static boolean enable_multi_catalog = false; // 1 min
 
+@ConfField(mutable = true, masterOnly = true)

Review Comment:
   Neither `file_scan_node_spilt_size` nor `file_scan_node_spilt_num` is a 
`masterOnly` config.



##
fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java:
##
@@ -134,6 +135,30 @@ public int numBackends() {
 }
 }
 
+private static class FileSpiltStrategy {

Review Comment:
   ```suggestion
   private static class FileSplitStrategy {
   ```
   
   Please also fix the other places where `split` is misspelled.
   



##
fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java:
##
@@ -340,6 +380,7 @@ protected void toThrift(TPlanNode planNode) {
 
 @Override
 public List getScanRangeLocations(long 
maxScanRangeLength) {
+LOG.info("There is {} scanRangeLocations for execution.", scanRangeLocations.size());

Review Comment:
   ```suggestion
   LOG.debug("There is {} scanRangeLocations for execution.", scanRangeLocations.size());
   ```






[GitHub] [doris] github-actions[bot] commented on pull request #10710: [improve](planner): split output expr to multiple line.

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10710:
URL: https://github.com/apache/doris/pull/10710#issuecomment-1179030775

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #10710: [improve](planner): split output expr to multiple line.

2022-07-08 Thread GitBox


github-actions[bot] commented on PR #10710:
URL: https://github.com/apache/doris/pull/10710#issuecomment-1179030745

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916864491


##
be/src/vec/columns/column_nullable.cpp:
##
@@ -134,6 +134,24 @@ const char* ColumnNullable::deserialize_and_insert_from_arena(const char* pos) {
 return pos;
 }
 
+size_t ColumnNullable::get_max_row_byte_size() const {
+constexpr auto flag_size = sizeof(get_null_map_data()[0]);
+return flag_size + get_nested_column().get_max_row_byte_size();
+}
+
+void ColumnNullable::serialize_vec(std::vector& keys, size_t 
num_rows,
+   size_t max_row_byte_size) const {
+const auto& arr = get_null_map_data();
+static constexpr auto s = sizeof(arr[0]);
+for (size_t i = 0; i < num_rows; ++i) {
+auto* val = const_cast(keys[i].data + keys[i].size);
+*val = (arr[i] ? 1 : 0);

Review Comment:
   Can we just use `*val = arr[i]`?
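
A small sketch of when the two forms agree. It assumes the null map holds one unsigned byte per row; the function names and the `std::vector` types are stand-ins, not the real column classes. The two forms produce identical bytes only if every stored flag is exactly 0 or 1, which this thread does not confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Writes one flag byte per row, normalizing any non-zero value to 1 (as in the diff).
std::vector<char> write_flags_normalized(const std::vector<std::uint8_t>& null_map) {
    std::vector<char> out;
    out.reserve(null_map.size());
    for (std::uint8_t flag : null_map) {
        out.push_back(static_cast<char>(flag ? 1 : 0));
    }
    return out;
}

// Writes the stored byte unchanged (the form suggested in the review comment).
std::vector<char> write_flags_raw(const std::vector<std::uint8_t>& null_map) {
    std::vector<char> out;
    out.reserve(null_map.size());
    for (std::uint8_t flag : null_map) {
        out.push_back(static_cast<char>(flag));
    }
    return out;
}
```

If the serialized keys are later compared or hashed byte by byte, a flag stored as 2 would make two otherwise equal null keys differ under the raw form, so the simplification is safe only when the 0/1 invariant holds.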






[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916867517


##
be/src/vec/exec/vaggregation_node.cpp:
##
@@ -1034,6 +1049,12 @@ Status AggregationNode::_merge_with_serialized_key(Block* block) {
 using HashMethodType = std::decay_t;
 using AggState = typename HashMethodType::State;
 AggState state(key_columns, _probe_key_sz, nullptr);
+if constexpr 
(ColumnsHashing::IsPreSerializedKeysHashMethodTraits<
+  AggState>::value) {
+SCOPED_TIMER(_serialize_key_timer);

Review Comment:
   Maybe we can factor this duplicated code into a shared helper.






[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


BiteTheDDDDt commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916872115


##
be/src/vec/exec/vaggregation_node.h:
##
@@ -50,13 +50,41 @@ struct AggregationMethodSerialized {
 Data data;
 Iterator iterator;
 bool inited = false;
+std::vector keys;
+AggregationMethodSerialized()
+: _serialized_key_buffer_size(0),
+  _serialized_key_buffer(nullptr),
+  _mem_pool(new MemPool) {}
 
-AggregationMethodSerialized() = default;
+using State = ColumnsHashing::HashMethodSerialized;
 
 template 
 explicit AggregationMethodSerialized(const Other& other) : 
data(other.data) {}
 
-using State = ColumnsHashing::HashMethodSerialized;
+void serialize_keys(const ColumnRawPtrs& key_columns, const size_t 
num_rows) {
+size_t max_one_row_byte_size = 0;
+for (const auto& column : key_columns) {
+max_one_row_byte_size += column->get_max_row_byte_size();

Review Comment:
   Should we consider the case where a string column has a few very long 
strings? That could greatly increase memory allocation.
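
The concern can be made concrete with a rough sketch. It assumes the serialized-key buffer is sized as the worst-case row size multiplied by the number of rows; the visible part of the diff does not show the allocation itself, so that sizing rule is an assumption, and `StringColumn` is a hypothetical stand-in for a string key column.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a string key column.
using StringColumn = std::vector<std::string>;

// Longest value in the column: the per-row upper bound for this column.
std::size_t max_row_byte_size(const StringColumn& column) {
    std::size_t max_size = 0;
    for (const std::string& value : column) {
        max_size = std::max(max_size, value.size());
    }
    return max_size;
}

// Bytes reserved if every row gets the worst-case size (assumed sizing rule).
std::size_t worst_case_buffer_size(const StringColumn& column) {
    return max_row_byte_size(column) * column.size();
}

// Bytes the values actually occupy.
std::size_t exact_buffer_size(const StringColumn& column) {
    std::size_t total = 0;
    for (const std::string& value : column) {
        total += value.size();
    }
    return total;
}
```

With 999 two-byte values and a single 1000-byte value, the worst-case rule reserves 1,000,000 bytes for 2,998 bytes of data. The overshoot is bounded by the number of rows processed at once.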






[GitHub] [doris] morningman commented on a diff in pull request #10620: [Enhancement][multi-catalog]Impl parallel for file scanner to improve the scanner performance

2022-07-08 Thread GitBox


morningman commented on code in PR #10620:
URL: https://github.com/apache/doris/pull/10620#discussion_r916877906


##
fe/fe-core/src/main/java/org/apache/doris/planner/external/ExternalFileScanNode.java:
##
@@ -311,6 +350,7 @@ private TFileRangeDesc createFileRangeDesc(
 // set hdfs params for hdfs file type.
 if (scanProvider.getTableFileType() == TFileType.FILE_HDFS) {
 THdfsParams tHdfsParams = 
BrokerUtil.generateHdfsParam(scanProvider.getTableProperties());
+tHdfsParams.addToHdfsConf(new 
THdfsConf("dfs.client.read.shortcircuit", "false"));

Review Comment:
   Why do we need to disable this feature?






[GitHub] [doris] yiguolei commented on a diff in pull request #10700: [improvement]pre-serialize aggregation keys

2022-07-08 Thread GitBox


yiguolei commented on code in PR #10700:
URL: https://github.com/apache/doris/pull/10700#discussion_r916899062


##
be/src/vec/exec/vaggregation_node.h:
##
@@ -50,13 +50,41 @@ struct AggregationMethodSerialized {
 Data data;
 Iterator iterator;
 bool inited = false;
+std::vector keys;
+AggregationMethodSerialized()
+: _serialized_key_buffer_size(0),
+  _serialized_key_buffer(nullptr),
+  _mem_pool(new MemPool) {}
 
-AggregationMethodSerialized() = default;
+using State = ColumnsHashing::HashMethodSerialized;
 
 template 
 explicit AggregationMethodSerialized(const Other& other) : 
data(other.data) {}
 
-using State = ColumnsHashing::HashMethodSerialized;
+void serialize_keys(const ColumnRawPtrs& key_columns, const size_t 
num_rows) {
+size_t max_one_row_byte_size = 0;
+for (const auto& column : key_columns) {
+max_one_row_byte_size += column->get_max_row_byte_size();

Review Comment:
   Maybe not; the memory is allocated block by block.





