[GitHub] [doris] hello-stephen commented on pull request #16365: [refactor](Nereids) remove trick datatype code in Expression

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16365:
URL: https://github.com/apache/doris/pull/16365#issuecomment-1413307136

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 33.76 seconds
load time: 491 seconds
storage size: 17170926637 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202080816_clickbench_pr_89485.html


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [doris] yiguolei opened a new issue, #16366: [Enhancement] Doris query layer should be exception safe

2023-02-02 Thread via GitHub


yiguolei opened a new issue, #16366:
URL: https://github.com/apache/doris/issues/16366

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Description
   
   _No response_
   
   ### Solution
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] yiguolei opened a new pull request, #16367: [enhancement](stream receiver) make data stream receiver exception safe.

2023-02-02 Thread via GitHub


yiguolei opened a new pull request, #16367:
URL: https://github.com/apache/doris/pull/16367

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   part of https://github.com/apache/doris/issues/16366
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] jackwener opened a new pull request, #16368: [enhance](Nereids): polish code

2023-02-02 Thread via GitHub


jackwener opened a new pull request, #16368:
URL: https://github.com/apache/doris/pull/16368

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] dataroaring merged pull request #16355: [Feature-WIP](inverted index) support array type for inverted index reader

2023-02-02 Thread via GitHub


dataroaring merged PR #16355:
URL: https://github.com/apache/doris/pull/16355





[doris] branch master updated: [Feature-WIP](inverted index) support array type for inverted index reader (#16355)

2023-02-02 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new bb179b77f7 [Feature-WIP](inverted index) support array type for inverted index reader (#16355)
bb179b77f7 is described below

commit bb179b77f75d2b0471eb7b3b75ad783d21596194
Author: YueW <45946325+tany...@users.noreply.github.com>
AuthorDate: Thu Feb 2 16:14:14 2023 +0800

[Feature-WIP](inverted index) support array type for inverted index reader (#16355)
---
 be/src/vec/exec/scan/vscan_node.cpp| 20 ++-
 .../main/java/org/apache/doris/catalog/Type.java   | 10 
 .../java/org/apache/doris/analysis/IndexDef.java   |  4 ++
 .../org/apache/doris/analysis/MatchPredicate.java  | 69 +++--
 .../data/inverted_index_p0/test_array_index.out| 58 ++
 .../inverted_index_p0/test_array_index.groovy  | 70 ++
 6 files changed, 197 insertions(+), 34 deletions(-)

diff --git a/be/src/vec/exec/scan/vscan_node.cpp 
b/be/src/vec/exec/scan/vscan_node.cpp
index 198e7ab0c7..d0fc12f37a 100644
--- a/be/src/vec/exec/scan/vscan_node.cpp
+++ b/be/src/vec/exec/scan/vscan_node.cpp
@@ -49,6 +49,17 @@ static bool ignore_cast(SlotDescriptor* slot, VExpr* expr) {
 if (slot->type().is_string_type() && expr->type().is_string_type()) {
 return true;
 }
+if (slot->type().is_array_type()) {
+if (slot->type().children[0].type == expr->type().type) {
+return true;
+}
+if (slot->type().children[0].is_date_type() && expr->type().is_date_type()) {
+return true;
+}
+if (slot->type().children[0].is_string_type() && expr->type().is_string_type()) {
+return true;
+}
+}
 return false;
 }
 
@@ -391,7 +402,14 @@ Status VScanNode::_normalize_conjuncts() {
 std::vector slots = _output_tuple_desc->slots();
 
 for (int slot_idx = 0; slot_idx < slots.size(); ++slot_idx) {
-switch (slots[slot_idx]->type().type) {
+auto type = slots[slot_idx]->type().type;
+if (slots[slot_idx]->type().type == TYPE_ARRAY) {
+type = slots[slot_idx]->type().children[0].type;
+if (type == TYPE_ARRAY) {
+continue;
+}
+}
+switch (type) {
 #define M(NAME) \
 case TYPE_##NAME: { \
 ColumnValueRange range(slots[slot_idx]->col_name(), \
diff --git a/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java 
b/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java
index e6c2e3a4cd..ef3ec7c834 100644
--- a/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java
+++ b/fe/fe-common/src/main/java/org/apache/doris/catalog/Type.java
@@ -109,6 +109,7 @@ public abstract class Type {
 
 private static final Logger LOG = LogManager.getLogger(Type.class);
 private static final ArrayList integerTypes;
+private static final ArrayList stringTypes;
 private static final ArrayList numericTypes;
 private static final ArrayList numericDateTimeTypes;
 private static final ArrayList supportedTypes;
@@ -123,6 +124,11 @@ public abstract class Type {
 integerTypes.add(BIGINT);
 integerTypes.add(LARGEINT);
 
+stringTypes = Lists.newArrayList();
+stringTypes.add(CHAR);
+stringTypes.add(VARCHAR);
+stringTypes.add(STRING);
+
 numericTypes = Lists.newArrayList();
 numericTypes.addAll(integerTypes);
 numericTypes.add(FLOAT);
@@ -207,6 +213,10 @@ public abstract class Type {
 return integerTypes;
 }
 
+public static ArrayList getStringTypes() {
+return stringTypes;
+}
+
 public static ArrayList getNumericTypes() {
 return numericTypes;
 }
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java 
b/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java
index ed03dbd84e..d1c21b5d37 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/IndexDef.java
@@ -17,6 +17,7 @@
 
 package org.apache.doris.analysis;
 
+import org.apache.doris.catalog.ArrayType;
 import org.apache.doris.catalog.Column;
 import org.apache.doris.catalog.KeysType;
 import org.apache.doris.catalog.PrimitiveType;
@@ -176,6 +177,9 @@ public class IndexDef {
 || indexType == IndexType.NGRAM_BF) {
 String indexColName = column.getName();
 PrimitiveType colType = column.getDataType();
+if (indexType == IndexType.INVERTED && colType.isArrayType()) {
+colType = ((ArrayType) 
column.getType()).getItemType().getP

[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15966: [Feature](map)support complex struct for doris

2023-02-02 Thread via GitHub


github-actions[bot] commented on code in PR #15966:
URL: https://github.com/apache/doris/pull/15966#discussion_r1094172591


##
be/src/vec/data_types/data_type_factory.cpp:
##
@@ -169,6 +179,12 @@ DataTypePtr DataTypeFactory::create_data_type(const 
TypeDescriptor& col_desc, bo
 }
 nested = std::make_shared(dataTypes, names);
 break;
+case TYPE_MAP:
+DCHECK(col_desc.children.size() == 2);
+nested = std::make_shared(
+create_data_type(col_desc.children[0], 
col_desc.contains_nulls[0]),
+create_data_type(col_desc.children[1], 
col_desc.contains_nulls[1]));
+break;
 }
 case INVALID_TYPE:

Review Comment:
   warning: 'case' statement not in switch statement [clang-diagnostic-error]
   ```cpp
   case INVALID_TYPE:
   ^
   ```
   



##
be/src/vec/data_types/data_type_factory.cpp:
##
@@ -169,6 +179,12 @@
 }
 nested = std::make_shared(dataTypes, names);
 break;
+case TYPE_MAP:
+DCHECK(col_desc.children.size() == 2);
+nested = std::make_shared(
+create_data_type(col_desc.children[0], 
col_desc.contains_nulls[0]),
+create_data_type(col_desc.children[1], 
col_desc.contains_nulls[1]));
+break;
 }
 case INVALID_TYPE:
 default:

Review Comment:
   warning: 'default' statement not in switch statement [clang-diagnostic-error]
   ```cpp
   default:
   ^
   ```
   






[GitHub] [doris] yiguolei merged pull request #16349: [fix](join) crash caused by canceling query (Cherry-pick from #16311)

2023-02-02 Thread via GitHub


yiguolei merged PR #16349:
URL: https://github.com/apache/doris/pull/16349





[GitHub] [doris] github-actions[bot] commented on pull request #16367: [enhancement](stream receiver) make data stream receiver exception safe.

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16367:
URL: https://github.com/apache/doris/pull/16367#issuecomment-1413317797

   clang-tidy review says "All clean, LGTM! :+1:"





[doris] branch branch-1.2-lts updated: [fix](join) crash caused by canceling query (#16311) (#16349)

2023-02-02 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/branch-1.2-lts by this push:
 new 495d37d337 [fix](join) crash caused by canceling query (#16311) (#16349)
495d37d337 is described below

commit 495d37d33761f70565fac6978ee97a1df09e01a1
Author: Jerry Hu 
AuthorDate: Thu Feb 2 16:17:17 2023 +0800

[fix](join) crash caused by canceling query (#16311) (#16349)

If the query was canceled,
the status in shared context may be `OK` with other fields not set.
---
 be/src/vec/exec/join/vhash_join_node.cpp|  3 ++-
 be/src/vec/runtime/shared_hash_table_controller.cpp | 11 +++
 be/src/vec/runtime/shared_hash_table_controller.h   |  1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/be/src/vec/exec/join/vhash_join_node.cpp 
b/be/src/vec/exec/join/vhash_join_node.cpp
index 9f967a8190..9408622f78 100644
--- a/be/src/vec/exec/join/vhash_join_node.cpp
+++ b/be/src/vec/exec/join/vhash_join_node.cpp
@@ -1081,7 +1081,8 @@ std::vector 
HashJoinNode::_convert_block_to_null(Block& block) {
 
 HashJoinNode::~HashJoinNode() {
 if (_shared_hashtable_controller && _should_build_hash_table) {
-_shared_hashtable_controller->signal(id());
+// signal at here is abnormal
+_shared_hashtable_controller->signal(id(), Status::Cancelled("signaled in destructor"));
 }
 }
 
diff --git a/be/src/vec/runtime/shared_hash_table_controller.cpp 
b/be/src/vec/runtime/shared_hash_table_controller.cpp
index e9e125a168..e558798644 100644
--- a/be/src/vec/runtime/shared_hash_table_controller.cpp
+++ b/be/src/vec/runtime/shared_hash_table_controller.cpp
@@ -42,6 +42,17 @@ SharedHashTableContextPtr 
SharedHashTableController::get_context(int my_node_id)
 return _shared_contexts[my_node_id];
 }
 
+void SharedHashTableController::signal(int my_node_id, Status status) {
+std::lock_guard lock(_mutex);
+auto it = _shared_contexts.find(my_node_id);
+if (it != _shared_contexts.cend()) {
+it->second->signaled = true;
+it->second->status = status;
+_shared_contexts.erase(it);
+}
+_cv.notify_all();
+}
+
 void SharedHashTableController::signal(int my_node_id) {
 std::lock_guard lock(_mutex);
 auto it = _shared_contexts.find(my_node_id);
diff --git a/be/src/vec/runtime/shared_hash_table_controller.h 
b/be/src/vec/runtime/shared_hash_table_controller.h
index e2c54f533d..1b058dcebe 100644
--- a/be/src/vec/runtime/shared_hash_table_controller.h
+++ b/be/src/vec/runtime/shared_hash_table_controller.h
@@ -67,6 +67,7 @@ public:
 TUniqueId get_builder_fragment_instance_id(int my_node_id);
 SharedHashTableContextPtr get_context(int my_node_id);
 void signal(int my_node_id);
+void signal(int my_node_id, Status status);
 Status wait_for_signal(RuntimeState* state, const 
SharedHashTableContextPtr& context);
 bool should_build_hash_table(const TUniqueId& fragment_instance_id, int 
my_node_id);
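
The pattern in this commit, recording a non-OK status on the shared context before waking the waiters, can be sketched in isolation as follows. This is a minimal illustration, not the Doris implementation: `Status`, `Context`, and `Controller` are simplified hypothetical stand-ins, and `ok` is a plain field here rather than the real `Status` API.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a status type; only what the sketch needs.
struct Status {
    bool ok;
    std::string msg;
    static Status OK() { return Status{true, ""}; }
    static Status Cancelled(std::string m) { return Status{false, std::move(m)}; }
};

// Hypothetical shared context that a waiter inspects after being woken.
struct Context {
    bool signaled = false;
    Status status = Status::OK();
};

class Controller {
public:
    std::shared_ptr<Context> get_context(int node_id) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto& ctx = _contexts[node_id];
        if (!ctx) ctx = std::make_shared<Context>();
        return ctx;
    }

    // Store the outcome under the lock before notifying, so a waiter never
    // observes signaled == true together with a default OK status.
    void signal(int node_id, Status status) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _contexts.find(node_id);
        if (it != _contexts.end()) {
            it->second->signaled = true;
            it->second->status = std::move(status);
            _contexts.erase(it);
        }
        _cv.notify_all();
    }

    // Block until the context has been signaled, then return its status.
    Status wait_for_signal(const std::shared_ptr<Context>& ctx) {
        assert(ctx != nullptr);
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [&] { return ctx->signaled; });
        return ctx->status;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    std::map<int, std::shared_ptr<Context>> _contexts;
};
```

With this shape, a waiter that is woken by a signal sent from a destructor sees a cancelled status and can stop, instead of reading fields that were never filled in.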
 





[GitHub] [doris] weizhengte opened a new pull request, #16369: [Improvement](statistics) optimise histogram keyword

2023-02-02 Thread via GitHub


weizhengte opened a new pull request, #16369:
URL: https://github.com/apache/doris/pull/16369

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Kikyou1997 opened a new pull request, #16370: [fix](planner) Doris returns empty sets when select from a inline view

2023-02-02 Thread via GitHub


Kikyou1997 opened a new pull request, #16370:
URL: https://github.com/apache/doris/pull/16370

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15966: [Feature](map)support complex struct for doris

2023-02-02 Thread via GitHub


github-actions[bot] commented on code in PR #15966:
URL: https://github.com/apache/doris/pull/15966#discussion_r1094185768


##
be/src/vec/data_types/data_type_factory.cpp:
##
@@ -169,7 +179,12 @@
 }
 nested = std::make_shared(dataTypes, names);
 break;
-}
+case TYPE_MAP:
+DCHECK(col_desc.children.size() == 2);
+nested = std::make_shared(
+create_data_type(col_desc.children[0], 
col_desc.contains_nulls[0]),
+create_data_type(col_desc.children[1], 
col_desc.contains_nulls[1]));
+break;
 case INVALID_TYPE:
 default:

Review Comment:
   warning: cannot jump from switch statement to this case label [clang-diagnostic-error]
   ```cpp
   default:
   ^
   ```
   **be/src/vec/data_types/data_type_factory.cpp:172:** jump bypasses variable 
initialization
   ```cpp
   Strings names;
   ^
   ```
   **be/src/vec/data_types/data_type_factory.cpp:171:** jump bypasses variable 
initialization
   ```cpp
   DataTypes dataTypes;
 ^
   ```
   **be/src/vec/data_types/data_type_factory.cpp:169:** jump bypasses variable 
initialization
   ```cpp
   size_t child_size = col_desc.children.size();
  ^
   ```
   



##
be/src/vec/data_types/data_type_factory.cpp:
##
@@ -169,7 +179,12 @@ DataTypePtr DataTypeFactory::create_data_type(const 
TypeDescriptor& col_desc, bo
 }
 nested = std::make_shared(dataTypes, names);
 break;
-}
+case TYPE_MAP:
+DCHECK(col_desc.children.size() == 2);
+nested = std::make_shared(
+create_data_type(col_desc.children[0], 
col_desc.contains_nulls[0]),
+create_data_type(col_desc.children[1], 
col_desc.contains_nulls[1]));
+break;
 case INVALID_TYPE:

Review Comment:
   warning: cannot jump from switch statement to this case label [clang-diagnostic-error]
   ```cpp
   case INVALID_TYPE:
   ^
   ```
   **be/src/vec/data_types/data_type_factory.cpp:172:** jump bypasses variable 
initialization
   ```cpp
   Strings names;
   ^
   ```
   **be/src/vec/data_types/data_type_factory.cpp:171:** jump bypasses variable 
initialization
   ```cpp
   DataTypes dataTypes;
 ^
   ```
   **be/src/vec/data_types/data_type_factory.cpp:169:** jump bypasses variable 
initialization
   ```cpp
   size_t child_size = col_desc.children.size();
  ^
   ```
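
   Both diagnostics above come from the same C++ rule: a jump to a later `case` or `default` label may not skip over the initialization of a variable that is still in scope. A common way to satisfy it is to give each case that declares locals its own brace-delimited scope and to close that scope before the next label. The sketch below shows the shape only; `TypeTag`, `TypeDesc`, and `describe` are hypothetical names and are not taken from `data_type_factory.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class TypeTag { STRUCT, MAP, INVALID };

// Hypothetical, simplified descriptor: a tag plus child descriptors.
struct TypeDesc {
    TypeTag tag;
    std::vector<TypeDesc> children;
};

std::string describe(const TypeDesc& desc) {
    std::string result;
    switch (desc.tag) {
    case TypeTag::STRUCT: {
        // Locals with initializers live only inside this scope.
        std::size_t child_size = desc.children.size();
        std::vector<std::string> names;
        for (std::size_t i = 0; i < child_size; ++i) {
            names.push_back(describe(desc.children[i]));
        }
        result = "struct<";
        for (std::size_t i = 0; i < names.size(); ++i) {
            result += (i ? "," : "") + names[i];
        }
        result += ">";
        break;
    } // The scope ends here, before the next label.
    case TypeTag::MAP: {
        assert(desc.children.size() == 2);
        result = "map<" + describe(desc.children[0]) + "," +
                 describe(desc.children[1]) + ">";
        break;
    }
    case TypeTag::INVALID:
    default:
        result = "invalid";
        break;
    }
    return result;
}
```

   If the closing brace of the first case is dropped, the later labels fall inside the scope of `child_size` and `names`, which is the situation the "jump bypasses variable initialization" notes describe.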
   






[GitHub] [doris] Tanya-W opened a new pull request, #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.

2023-02-02 Thread via GitHub


Tanya-W opened a new pull request, #16371:
URL: https://github.com/apache/doris/pull/16371

# Proposed changes
   Issue Number: Step5 of [DSIP-023: Add inverted index for full text 
search](https://cwiki.apache.org/confluence/display/DORIS/DSIP-023%3A+Add+inverted+index+for+full+text+search?src=contextnavpagetreemode)
   Implementation of add/drop for inverted indexes.
   
   dependency pr: https://github.com/apache/doris/pull/14211 
https://github.com/apache/doris/pull/15823 
https://github.com/apache/doris/pull/14207 
https://github.com/apache/doris/pull/15821
   
   ## Problem summary
   1. Support creating multiple inverted indexes at the same time.
   2. When executing an alter inverted index operation, only the FE's metadata is updated; the BE's metadata does not need to be modified, and reads and writes are based on the FE's metadata.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on a diff in pull request #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.

2023-02-02 Thread via GitHub


github-actions[bot] commented on code in PR #16371:
URL: https://github.com/apache/doris/pull/16371#discussion_r1094196033


##
be/src/olap/schema_change.cpp:
##
@@ -586,6 +592,248 @@ Status 
VSchemaChangeWithSorting::_external_sorting(vector& src_
 return Status::OK();
 }
 
+SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex(
+const std::vector& alter_inverted_indexs,
+const TabletSchemaSPtr& tablet_schema)
+: SchemaChange(),

Review Comment:
   warning: initializer for base class 'doris::SchemaChange' is redundant 
[readability-redundant-member-init]
   
   ```suggestion
   : , _alter_inverted_indexs(alter_inverted_indexs), 
_tablet_schema(tablet_schema) {
   ```
   



##
be/src/olap/schema_change.cpp:
##
@@ -586,6 +592,248 @@
 return Status::OK();
 }
 
+SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex(
+const std::vector& alter_inverted_indexs,
+const TabletSchemaSPtr& tablet_schema)
+: SchemaChange(),
+  _alter_inverted_indexs(alter_inverted_indexs),
+  _tablet_schema(tablet_schema) {}
+
+SchemaChangeForInvertedIndex::~SchemaChangeForInvertedIndex() {
+VLOG_NOTICE << "~SchemaChangeForInvertedIndex()";
+_inverted_index_builders.clear();
+_index_metas.clear();
+}
+
+Status SchemaChangeForInvertedIndex::process(RowsetReaderSharedPtr 
rowset_reader,
+ RowsetWriter* rowset_writer,
+ TabletSharedPtr new_tablet,
+ TabletSharedPtr base_tablet,
+ TabletSchemaSPtr 
base_tablet_schema) {
+Status res = Status::OK();
+if (rowset_reader->rowset()->empty() || 
rowset_reader->rowset()->num_rows() == 0) {
+return Status::OK();
+}
+
+std::vector return_columns;
+for (auto& inverted_index : _alter_inverted_indexs) {
+DCHECK_EQ(inverted_index.columns.size(), 1);
+auto column_name = inverted_index.columns[0];
+auto idx = _tablet_schema->field_index(column_name);
+return_columns.emplace_back(idx);
+}
+
+// create inverted index writer
+auto rowset_meta = rowset_reader->rowset()->rowset_meta();
+std::string segment_dir = base_tablet->tablet_path();
+auto fs = rowset_meta->fs();
+for (auto i = 0; i < rowset_meta->num_segments(); ++i) {
+std::string segment_filename =
+fmt::format("{}_{}.dat", rowset_meta->rowset_id().to_string(), 
i);
+for (auto& inverted_index : _alter_inverted_indexs) {
+DCHECK_EQ(inverted_index.columns.size(), 1);
+auto column_name = inverted_index.columns[0];
+auto column = _tablet_schema->column(column_name);
+auto index_id = inverted_index.index_id;
+
+std::unique_ptr field(FieldFactory::create(column));
+_index_metas.emplace_back(new TabletIndex());
+_index_metas.back()->init_from_thrift(inverted_index, 
*_tablet_schema);
+std::unique_ptr 
inverted_index_builder;
+try {
+RETURN_IF_ERROR(segment_v2::InvertedIndexColumnWriter::create(
+field.get(), &inverted_index_builder, 
segment_filename, segment_dir,
+_index_metas.back().get(), fs));
+} catch (const std::exception& e) {
+LOG(WARNING) << "CLuceneError occured: " << e.what();
+return Status::Error();
+}
+
+if (inverted_index_builder) {
+std::string writer_sign = fmt::format("{}_{}", i, index_id);
+_inverted_index_builders.insert(
+std::make_pair(writer_sign, 
std::move(inverted_index_builder)));
+}
+}
+}
+
+SegmentCacheHandle segment_cache_handle;
+// load segments
+RETURN_NOT_OK(SegmentLoader::instance()->load_segments(
+std::static_pointer_cast(rowset_reader->rowset()), 
&segment_cache_handle,
+false));
+
+// create iterator for each segment
+StorageReadOptions read_options;
+OlapReaderStatistics stats;
+read_options.stats = &stats;
+read_options.tablet_schema = _tablet_schema;
+std::unique_ptr schema =
+std::make_unique(_tablet_schema->columns(), 
return_columns);
+for (auto& seg_ptr : segment_cache_handle.get_segments()) {
+std::unique_ptr iter;
+res = seg_ptr->new_iterator(*schema, read_options, &iter);
+if (!res.ok()) {
+LOG(WARNING) << "failed to create iterator[" << seg_ptr->id()
+ << "]: " << res.to_string();
+return Status::Error();
+}
+
+std::shared_ptr block =
+
std::make_shared(_tablet_schema->create_block(return_columns));
+do {
+block->clear_column_data();
+res = iter->next_batch(block.g

[GitHub] [doris] BiteTheDDDDt merged pull request #16357: [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable

2023-02-02 Thread via GitHub


BiteTheDDDDt merged PR #16357:
URL: https://github.com/apache/doris/pull/16357





[doris] branch master updated: [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable

2023-02-02 Thread panxiaolei
This is an automated email from the ASF dual-hosted git repository.

panxiaolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 68d2067f51 [improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable
68d2067f51 is described below

commit 68d2067f518c555908e17178df2f3fe91a00ae3d
Author: Kang 
AuthorDate: Thu Feb 2 16:42:58 2023 +0800

[improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable

change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable
---
 .../test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out  | 9 -
 .../test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy   | 4 +++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out b/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out
index 2a9437547a..70f2bed42a 100644
--- a/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out
+++ b/regression-test/data/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.out
@@ -4,10 +4,17 @@
 1
 1
 
+-- !select_k1 --
+1
+2
+2
+3
+3
+
 -- !select_star --
 1  1   a
-2  2   bb
 2  2   b
+2  2   bb
 3  3   c
 3  3   c
 
diff --git a/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy b/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy
index f7b133e426..710e52fbc4 100644
--- a/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy
+++ b/regression-test/suites/materialized_view_p0/test_dup_mv_bitmap_hash/test_dup_mv_bitmap_hash.groovy
@@ -69,7 +69,9 @@ suite ("test_dup_mv_bitmap_hash") {
     sql "insert into d_table select 2,2,'bb';"
     sql "insert into d_table select 3,3,'c';"
 
-    qt_select_star "select * from d_table order by k1;"
+    qt_select_k1 "select k1 from d_table order by k1;"
+
+    qt_select_star "select * from d_table order by k1,k2,k3;"
 
     explain {
         sql("select k1,bitmap_union_count(bitmap_hash(k3)) from d_table group by k1;")
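
Why ordering by every column stabilizes the expected output can be illustrated with a small, self-contained C++ sketch. The `Row` struct and the two sort helpers are hypothetical names that only mirror the `d_table` rows used in the test; they are not part of Doris. The sketch assumes that rows tied on `k1` may arrive in any order, as they may from a distributed scan.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical row type mirroring the columns (k1, k2, k3) of d_table.
struct Row {
    int k1;
    int k2;
    std::string k3;
};

// Sort by k1 only. std::sort is not a stable sort, so rows that tie on k1
// (for example (2,2,'b') and (2,2,'bb')) may end up in either relative order.
inline void sort_by_k1(std::vector<Row>& rows) {
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a.k1 < b.k1; });
}

// Sort by (k1, k2, k3). Rows can only tie when they are identical in every
// column, so the sorted sequence no longer depends on the input order.
inline void sort_by_all_keys(std::vector<Row>& rows) {
    std::sort(rows.begin(), rows.end(), [](const Row& a, const Row& b) {
        return std::tie(a.k1, a.k2, a.k3) < std::tie(b.k1, b.k2, b.k3);
    });
}
```

With the full key, two differently ordered copies of the same five rows sort to the same sequence, which is what the updated `.out` file relies on.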





[GitHub] [doris] github-actions[bot] commented on a diff in pull request #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.

2023-02-02 Thread via GitHub


github-actions[bot] commented on code in PR #16371:
URL: https://github.com/apache/doris/pull/16371#discussion_r1094198749


##
be/src/olap/schema_change.cpp:
##
@@ -586,6 +592,248 @@ Status VSchemaChangeWithSorting::_external_sorting(vector& src_
     return Status::OK();
 }
 
+SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex(
+        const std::vector& alter_inverted_indexs,
+        const TabletSchemaSPtr& tablet_schema)
+        : SchemaChange(),

Review Comment:
   warning: initializer for base class 'doris::SchemaChange' is redundant 
[readability-redundant-member-init]
   
   ```suggestion
   : ,
   ```
   



##
be/src/olap/schema_change.h:
##
@@ -172,11 +177,42 @@ class VSchemaChangeWithSorting : public SchemaChange {
     std::unique_ptr _mem_tracker;
 };
 
+class SchemaChangeForInvertedIndex : public SchemaChange {
+public:
+    explicit SchemaChangeForInvertedIndex(const std::vector& alter_inverted_indexs,
+                                          const TabletSchemaSPtr& tablet_schema);
+    virtual ~SchemaChangeForInvertedIndex();

Review Comment:
   warning: prefer using 'override' or (rarely) 'final' instead of 'virtual' 
[modernize-use-override]
   
   ```suggestion
   ~SchemaChangeForInvertedIndex() override;
   ```
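
The two clang-tidy warnings above can be illustrated on a stripped-down class pair. `Base` and `Derived` below are hypothetical stand-ins for `SchemaChange` and `SchemaChangeForInvertedIndex`, not the Doris classes, and the sketch assumes the base class has an accessible default constructor and a virtual destructor.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a polymorphic base such as SchemaChange.
class Base {
public:
    Base() = default;
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

// Hypothetical stand-in for a derived class such as SchemaChangeForInvertedIndex.
class Derived : public Base {
public:
    // readability-redundant-member-init: the base is default-constructed anyway,
    // so the initializer list names only the members, with no "Base()" entry.
    explicit Derived(std::vector<int> ids) : _ids(std::move(ids)) {}

    // modernize-use-override: "override" already implies virtual and makes the
    // compiler check that a virtual member of the base is really overridden.
    ~Derived() override = default;
    std::string name() const override { return "Derived"; }

    std::size_t id_count() const { return _ids.size(); }

private:
    std::vector<int> _ids;
};
```

Applied to the constructor in the PR, the first fix would amount to dropping `SchemaChange()` from the initializer list and starting it with the first member.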
   



##
be/src/olap/schema_change.cpp:
##
@@ -586,6 +592,248 @@
     return Status::OK();
 }
 
+SchemaChangeForInvertedIndex::SchemaChangeForInvertedIndex(
+        const std::vector& alter_inverted_indexs,
+        const TabletSchemaSPtr& tablet_schema)
+        : SchemaChange(),
+          _alter_inverted_indexs(alter_inverted_indexs),
+          _tablet_schema(tablet_schema) {}
+
+SchemaChangeForInvertedIndex::~SchemaChangeForInvertedIndex() {
+    VLOG_NOTICE << "~SchemaChangeForInvertedIndex()";
+    _inverted_index_builders.clear();
+    _index_metas.clear();
+}
+
+Status SchemaChangeForInvertedIndex::process(RowsetReaderSharedPtr rowset_reader,
+                                             RowsetWriter* rowset_writer,
+                                             TabletSharedPtr new_tablet,
+                                             TabletSharedPtr base_tablet,
+                                             TabletSchemaSPtr base_tablet_schema) {
+    Status res = Status::OK();
+    if (rowset_reader->rowset()->empty() || rowset_reader->rowset()->num_rows() == 0) {
+        return Status::OK();
+    }
+
+    std::vector return_columns;
+    for (auto& inverted_index : _alter_inverted_indexs) {
+        DCHECK_EQ(inverted_index.columns.size(), 1);
+        auto column_name = inverted_index.columns[0];
+        auto idx = _tablet_schema->field_index(column_name);
+        return_columns.emplace_back(idx);
+    }
+
+    // create inverted index writer
+    auto rowset_meta = rowset_reader->rowset()->rowset_meta();
+    std::string segment_dir = base_tablet->tablet_path();
+    auto fs = rowset_meta->fs();
+    for (auto i = 0; i < rowset_meta->num_segments(); ++i) {
+        std::string segment_filename =
+                fmt::format("{}_{}.dat", rowset_meta->rowset_id().to_string(), i);
+        for (auto& inverted_index : _alter_inverted_indexs) {
+            DCHECK_EQ(inverted_index.columns.size(), 1);
+            auto column_name = inverted_index.columns[0];
+            auto column = _tablet_schema->column(column_name);
+            auto index_id = inverted_index.index_id;
+
+            std::unique_ptr field(FieldFactory::create(column));
+            _index_metas.emplace_back(new TabletIndex());
+            _index_metas.back()->init_from_thrift(inverted_index, *_tablet_schema);
+            std::unique_ptr inverted_index_builder;
+            try {
+                RETURN_IF_ERROR(segment_v2::InvertedIndexColumnWriter::create(
+                        field.get(), &inverted_index_builder, segment_filename, segment_dir,
+                        _index_metas.back().get(), fs));
+            } catch (const std::exception& e) {
+                LOG(WARNING) << "CLuceneError occured: " << e.what();
+                return Status::Error();
+            }
+
+            if (inverted_index_builder) {
+                std::string writer_sign = fmt::format("{}_{}", i, index_id);
+                _inverted_index_builders.insert(
+                        std::make_pair(writer_sign, std::move(inverted_index_builder)));
+            }
+        }
+    }
+
+    SegmentCacheHandle segment_cache_handle;
+    // load segments
+    RETURN_NOT_OK(SegmentLoader::instance()->load_segments(
+            std::static_pointer_cast(rowset_reader->rowset()), &segment_cache_handle,
+            false));
+
+    // create iterator for each segment
+    StorageReadOptions read_options;
+    OlapReaderStatistics stats;
+    read_options.stats = &stats;
+    read_options.tablet_schema = _tablet_schema;
+    std::unique_ptr schema =
+            std::make_unique(_tablet_schema->columns(), return_columns);
+

[GitHub] [doris] wsjz opened a new pull request, #16372: [fix](iceberg) fix iceberg catalog rest access

2023-02-02 Thread via GitHub


wsjz opened a new pull request, #16372:
URL: https://github.com/apache/doris/pull/16372

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #16372: [fix](iceberg) fix iceberg catalog rest access

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16372:
URL: https://github.com/apache/doris/pull/16372#issuecomment-1413352710

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] hello-stephen commented on pull request #16358: [Improve](row-store) check light schema change must enabled

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16358:
URL: https://github.com/apache/doris/pull/16358#issuecomment-1413352729

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.72 seconds
load time: 486 seconds
storage size: 17170849132 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202084816_clickbench_pr_89361.html





[GitHub] [doris] morrySnow merged pull request #16312: [fix](Nereids): fix bugs in test join5

2023-02-02 Thread via GitHub


morrySnow merged PR #16312:
URL: https://github.com/apache/doris/pull/16312





[doris] branch master updated: [fix](Nereids) fix bugs in test join5 (#16312)

2023-02-02 Thread morrysnow
This is an automated email from the ASF dual-hosted git repository.

morrysnow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 398da44e46 [fix](Nereids) fix bugs in test join5 (#16312)
398da44e46 is described below

commit 398da44e469170ca8a79904e9b7697f77301c943
Author: 谢健 
AuthorDate: Thu Feb 2 16:51:45 2023 +0800

[fix](Nereids) fix bugs in test join5 (#16312)

make bucket-shuffle-join in PhysicalPlanTranslator when the property of the left child is not enforced
---
 .../glue/translator/PhysicalPlanTranslator.java|  6 ++-
 .../nereids/properties/DistributionSpecHash.java   |  3 ++
 .../suites/nereids_p0/join/test_join5.groovy   | 60 +++---
 3 files changed, 39 insertions(+), 30 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
index 581c2418a3..b51bb7a700 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
@@ -1553,6 +1553,7 @@ public class PhysicalPlanTranslator extends 
DefaultPlanVisitor, List> onClauseUsedSlots = 
JoinUtils.getOnClauseUsedSlots(physicalHashJoin);
 List rightPartitionExprIds = 
Lists.newArrayList(leftDistributionSpec.getOrderedShuffledColumns());
 for (int i = 0; i < 
leftDistributionSpec.getOrderedShuffledColumns().size(); i++) {
@@ -1572,11 +1573,14 @@ public class PhysicalPlanTranslator extends 
DefaultPlanVisitor

[GitHub] [doris] github-actions[bot] commented on pull request #16323: [fix](Nereids) result order in group-by-costant case is not stable

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16323:
URL: https://github.com/apache/doris/pull/16323#issuecomment-1413358952

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #16323: [fix](test) result order in group-by-costant case is not stable

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16323:
URL: https://github.com/apache/doris/pull/16323#issuecomment-1413359007

   PR approved by anyone and no changes requested.





[GitHub] [doris] morrySnow merged pull request #16323: [fix](test) result order in group-by-costant case is not stable

2023-02-02 Thread via GitHub


morrySnow merged PR #16323:
URL: https://github.com/apache/doris/pull/16323





[doris] branch master updated (398da44e46 -> 09abd32957)

2023-02-02 Thread morrysnow
This is an automated email from the ASF dual-hosted git repository.

morrysnow pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


from 398da44e46 [fix](Nereids) fix bugs in test join5 (#16312)
 add 09abd32957 [fix](test) result order in group-by-costant case is not 
stable (#16323)

No new revisions were added by this update.

Summary of changes:
 .../data/nereids_syntax_p0/group_by_constant.out   |  2 +-
 .../nereids_syntax_p0/group_by_constant.groovy | 44 +++---
 2 files changed, 23 insertions(+), 23 deletions(-)





[GitHub] [doris] github-actions[bot] commented on pull request #15663: [Improvement](topn) order by key topn query optimization

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #15663:
URL: https://github.com/apache/doris/pull/15663#issuecomment-1413373172

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] hello-stephen opened a new pull request, #16373: [regression](fix) 1. fix broker load test case and add orc test 2. se…

2023-02-02 Thread via GitHub


hello-stephen opened a new pull request, #16373:
URL: https://github.com/apache/doris/pull/16373

   …t enableBrokerLoad=true in pipeline
   
   add a load test for the orc file and let it run in the TeamCity pipeline.
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] platoneko opened a new pull request, #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets

2023-02-02 Thread via GitHub


platoneko opened a new pull request, #16374:
URL: https://github.com/apache/doris/pull/16374

   # Proposed changes
   
   Issue Number: close #10986 
   
   ## Problem summary
   
   Fix core and support reclaiming rowsets in multiple resources in a tablet.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16374:
URL: https://github.com/apache/doris/pull/16374#issuecomment-1413396934

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] morningman merged pull request #16271: [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog

2023-02-02 Thread via GitHub


morningman merged PR #16271:
URL: https://github.com/apache/doris/pull/16271





[doris] branch master updated: [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 557159d3ce [feature](JdbcExternalCatalog) support insert data in 
JdbcExternalCatalog (#16271)
557159d3ce is described below

commit 557159d3ceff022903839e45ab07d94c922d244d
Author: Tiewei Fang <43782773+bepppo...@users.noreply.github.com>
AuthorDate: Thu Feb 2 17:31:33 2023 +0800

[feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog 
(#16271)
---
 be/src/exec/table_connector.cpp|  9 +++--
 .../docker-compose/mysql/init/03-create-table.sql  |  5 +++
 .../docker-compose/oracle/init/03-create-table.sql |  6 
 .../postgresql/init/02-create-table.sql|  6 
 .../java/org/apache/doris/analysis/InsertStmt.java | 39 --
 .../doris/transaction/DatabaseTransactionMgr.java  |  3 +-
 .../doris/transaction/GlobalTransactionMgr.java|  5 +--
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.out| 13 
 .../jdbc_catalog_p0/test_oracle_jdbc_catalog.out   | 13 
 .../data/jdbc_catalog_p0/test_pg_jdbc_catalog.out  | 13 
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.groovy | 23 ++---
 .../test_oracle_jdbc_catalog.groovy| 14 +++-
 .../jdbc_catalog_p0/test_pg_jdbc_catalog.groovy| 13 
 13 files changed, 140 insertions(+), 22 deletions(-)

diff --git a/be/src/exec/table_connector.cpp b/be/src/exec/table_connector.cpp
index 12dc3acdc2..30b01b1d03 100644
--- a/be/src/exec/table_connector.cpp
+++ b/be/src/exec/table_connector.cpp
@@ -226,8 +226,13 @@ Status TableConnector::convert_column_data(const vectorized::ColumnPtr& column_p
     case TYPE_VARCHAR:
     case TYPE_CHAR:
     case TYPE_STRING: {
-        // here need check the ' is used, now for pg array string must be "
-        fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size));
+        // TODO(zhangstar333): check array data type of postgresql
+        // for oracle/pg database string must be '
+        if (table_type == TOdbcTableType::ORACLE || table_type == TOdbcTableType::POSTGRESQL) {
+            fmt::format_to(_insert_stmt_buffer, "'{}'", fmt::basic_string_view(item, size));
+        } else {
+            fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size));
+        }
         break;
     }
     case TYPE_ARRAY: {
diff --git a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
index 02c257cbc8..8fb1aebc4b 100644
--- a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
+++ b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
@@ -223,4 +223,9 @@ create table doris_test.ex_tb20 (
 decimal_unsigned_long decimal(65, 5) unsigned
 ) engine=innodb charset=utf8;
 
+create table doris_test.test_insert (
+`id` varchar(128) NULL,
+`name` varchar(128) NULL,
+`age` int NULL
+) engine=innodb charset=utf8;
 
diff --git a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
index d5dd8cf1c6..d2d8d6af7e 100644
--- a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
+++ b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
@@ -78,3 +78,9 @@ t4 timestamp,
 t5 interval year(3) to month,
 t6 interval day(3) to second(6)
 );
+
+create table doris_test.test_insert(
+id varchar2(128),
+name varchar2(128),
+age number(5)
+);
diff --git a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
index b721da297a..6ace3b20cb 100644
--- a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
+++ b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
@@ -150,3 +150,9 @@ CREATE TABLE catalog_pg_test.test12 (
ID INT NOT NULL,
uuid_value uuid
 );
+
+CREATE TABLE catalog_pg_test.test_insert (
+   id varchar(128),
+   name varchar(128),
+   age int
+);
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
index 44140b24e9..891fe3349b 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
@@ -31,6 +31,8 @@ import org.apache.doris.catalog.Partition;
 import org.apache.doris.catalog.PartitionType;
 import org.apache.doris.catalog.Table;
 import org.apache.doris.catalog.TableIf;
+import org.apache.doris.catalog.external.JdbcExternalDatabase;
+import org.apache.doris.catalog.external.JdbcExternalTable;
 import org.apache.doris.common.AnalysisException;
 import org.apache.doris.common.DdlExce
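
The quoting rule that this commit introduces in `table_connector.cpp` can be sketched in isolation. In standard SQL, double quotes delimit identifiers, which is presumably why Oracle and PostgreSQL need single-quoted string values. The `DbType` enum and the `quote_string_literal` helper below are hypothetical names for illustration only; the real code branches on `TOdbcTableType` and writes into `_insert_stmt_buffer` with `fmt::format_to`. Like the patch, this sketch does not escape quote characters that occur inside the value.

```cpp
#include <string>

// Hypothetical stand-in for the TOdbcTableType values named in the patch.
enum class DbType { MYSQL, ORACLE, POSTGRESQL };

// Wrap a string value in the quote character the target database expects:
// single quotes for Oracle and PostgreSQL, double quotes otherwise.
// Embedded quote characters are NOT escaped here, mirroring the patch.
inline std::string quote_string_literal(DbType type, const std::string& value) {
    const char quote =
            (type == DbType::ORACLE || type == DbType::POSTGRESQL) ? '\'' : '"';
    std::string out;
    out.reserve(value.size() + 2);
    out.push_back(quote);
    out += value;
    out.push_back(quote);
    return out;
}
```

A production version would likely also need to escape or parameterize the value, since a quote character inside `value` would otherwise terminate the literal early.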

[doris] branch master updated (557159d3ce -> cb6875b5a4)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


from 557159d3ce [feature](JdbcExternalCatalog) support insert data in 
JdbcExternalCatalog (#16271)
 add cb6875b5a4 [improvement](multi-catalog) use date/datetimev2 as default 
col type for catalog table (#16304)

No new revisions were added by this update.

Summary of changes:
 be/src/runtime/buffer_control_block.cpp|  2 +-
 be/src/vec/exec/scan/vscanner.cpp  |  7 ++-
 .../community/developer-guide/regression-testing.md|  4 ++--
 .../doris/catalog/HiveMetaStoreClientHelper.java   |  4 ++--
 .../apache/doris/external/elasticsearch/EsUtil.java|  2 +-
 .../org/apache/doris/external/jdbc/JdbcClient.java | 18 +-
 6 files changed, 21 insertions(+), 16 deletions(-)





[GitHub] [doris] morningman merged pull request #16304: [improvement](multi-catalog) use date/datetimev2 as default col type for catalog table

2023-02-02 Thread via GitHub


morningman merged PR #16304:
URL: https://github.com/apache/doris/pull/16304





[GitHub] [doris] github-actions[bot] commented on pull request #16299: [fix](cooldown) Fix bugs in cooldown single replica files

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16299:
URL: https://github.com/apache/doris/pull/16299#issuecomment-1413422083

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #16299: [fix](cooldown) Fix bugs in cooldown single replica files

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16299:
URL: https://github.com/apache/doris/pull/16299#issuecomment-1413422169

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #15971: [Feature](Nereids) Support order and limit in subquery

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #15971:
URL: https://github.com/apache/doris/pull/15971#issuecomment-1413428296

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #15971: [Feature](Nereids) Support order and limit in subquery

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #15971:
URL: https://github.com/apache/doris/pull/15971#issuecomment-1413428382

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #15966: [Feature](map) add map type to doris

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #15966:
URL: https://github.com/apache/doris/pull/15966#issuecomment-1413438340

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] BePPPower opened a new pull request, #16375: [Enhencement](LineReader) rename NewPlainTextLineReader/NewPlainBinaryLineReader to PlainTextLineReader/PlainBinaryLineReader

2023-02-02 Thread via GitHub


BePPPower opened a new pull request, #16375:
URL: https://github.com/apache/doris/pull/16375

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #16375: [Enhencement](LineReader) rename NewPlainTextLineReader/NewPlainBinaryLineReader to PlainTextLineReader/PlainBinaryLineReader

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16375:
URL: https://github.com/apache/doris/pull/16375#issuecomment-1413467981

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] morrySnow merged pull request #15971: [Feature](Nereids) Support order and limit in subquery

2023-02-02 Thread via GitHub


morrySnow merged PR #15971:
URL: https://github.com/apache/doris/pull/15971





[doris] branch master updated: [Feature](Nereids) Support order and limit in subquery (#15971)

2023-02-02 Thread morrysnow
This is an automated email from the ASF dual-hosted git repository.

morrysnow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new e31913faca [Feature](Nereids) Support order and limit in subquery 
(#15971)
e31913faca is described below

commit e31913faca12206b5cbaf914b815e1a8b10cb275
Author: zhengshiJ <32082872+zhengs...@users.noreply.github.com>
AuthorDate: Thu Feb 2 18:17:30 2023 +0800

[Feature](Nereids) Support order and limit in subquery (#15971)

1. For compatibility with the old optimizer, a sort or limit inside a subquery 
does not take effect, so it is removed directly.
```
select * from sub_query_correlated_subquery1 where 
sub_query_correlated_subquery1.k1 > (select 
sum(sub_query_correlated_subquery3.k3) a from sub_query_correlated_subquery3 
where sub_query_correlated_subquery3.v2 = sub_query_correlated_subquery1.k2 
order by a limit 1);
```

2. Adjust where the subquery is unnested so that the conjuncts in the filter 
are optimized first, and only then unnest.

Supported:
```
SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT 
count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) 
or ((k1 = i1.k1) AND (k2 = 1)) )  > 0);
```
The query above is supported because the conjunct shared by both OR branches 
is extracted first, which rewrites it into the following:
```
SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT 
count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2 or 
k2 = 1)) )  > 0);
```

Not supported:
```
SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT 
count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) 
or ((k2 = i1.k1) AND (k2 = 1)) )  > 0);
```
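
The rewrite between the supported queries can be sketched as follows. This is an illustrative C++ model only, not the Nereids rule itself (which lives in the Java frontend): each OR branch is modelled as a set of predicate strings, and `Conjuncts` and `common_conjuncts` are hypothetical names. Predicates present in every branch can be hoisted out of the OR; when the intersection is empty, as in the unsupported query, nothing can be hoisted.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Each OR branch is modelled as a set of predicate strings joined by AND.
using Conjuncts = std::set<std::string>;

// Returns the predicates that appear in every branch. Such predicates can be
// hoisted out of the OR: (A AND B) OR (A AND C)  ==>  A AND (B OR C).
Conjuncts common_conjuncts(const std::vector<Conjuncts>& branches) {
    if (branches.empty()) {
        return {};
    }
    Conjuncts common = branches.front();
    for (std::size_t i = 1; i < branches.size(); ++i) {
        Conjuncts next;
        std::set_intersection(common.begin(), common.end(),
                              branches[i].begin(), branches[i].end(),
                              std::inserter(next, next.begin()));
        common = std::move(next);
    }
    return common;
}
```

For the supported query both branches contain `k1 = i1.k1`, so that predicate is the shared conjunct; in the unsupported query the second branch uses `k2 = i1.k1` instead and the intersection is empty.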
---
 .../apache/doris/nereids/analyzer/UnboundSlot.java |  2 +-
 .../batch/EliminateSpecificPlanUnderApplyJob.java  | 42 
 .../jobs/batch/NereidsRewriteJobExecutor.java  |  9 ++--
 .../org/apache/doris/nereids/rules/RuleType.java   |  2 +
 .../nereids/rules/analysis/CheckAfterRewrite.java  |  4 +-
 .../nereids/rules/analysis/SubExprAnalyzer.java| 34 +
 .../rewrite/logical/EliminateLimitUnderApply.java  | 43 +
 .../rewrite/logical/EliminateSortUnderApply.java   | 56 ++
 .../nereids/rules/analysis/CheckRowPolicyTest.java |  2 +-
 .../nereids_syntax_p0/sub_query_correlated.out | 43 +
 .../sub_query_diff_old_optimize.out| 20 ++--
 .../nereids_syntax_p0/sub_query_correlated.groovy  | 36 --
 .../sub_query_diff_old_optimize.groovy | 29 ++-
 13 files changed, 277 insertions(+), 45 deletions(-)

diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java 
b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java
index 09eb1c94f5..66c5e43f70 100644
--- 
a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java
+++ 
b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundSlot.java
@@ -69,7 +69,7 @@ public class UnboundSlot extends Slot implements Unbound, 
PropagateNullable {
 
 @Override
 public String toString() {
-return "'" + getName();
+return "'" + getName() + "'";
 }
 
 @Override
diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/EliminateSpecificPlanUnderApplyJob.java
 
b/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/EliminateSpecificPlanUnderApplyJob.java
new file mode 100644
index 00..2b8f7b25e0
--- /dev/null
+++ 
b/fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/batch/EliminateSpecificPlanUnderApplyJob.java
@@ -0,0 +1,42 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.jobs.batch;
+
+import org.apache.doris.nereids.CascadesContext;
+import org.apache.doris.nereids.rules.rewrite.logical.EliminateLimitUnderApply;
+import org.apache.doris.nereids.rules.rewrite.logical.EliminateSortUnderApply;
+
+import com.google.com

[GitHub] [doris] xy720 opened a new pull request, #16376: [chore](regression-test) Remove array config in regression test

2023-02-02 Thread via GitHub


xy720 opened a new pull request, #16376:
URL: https://github.com/apache/doris/pull/16376

   # Proposed changes
   
   The FE config `enable_array_type` is no longer used; this commit removes it 
from the regression tests.
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #15837: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #15837:
URL: https://github.com/apache/doris/pull/15837#issuecomment-1413522154

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #15837: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #15837:
URL: https://github.com/apache/doris/pull/15837#issuecomment-1413522206

   PR approved by anyone and no changes requested.





[GitHub] [doris] hello-stephen commented on pull request #16359: [Enhancement](Stmt)ShowPartitionsStmt support forward to master

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16359:
URL: https://github.com/apache/doris/pull/16359#issuecomment-1413524245

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.41 seconds
load time: 518 seconds
storage size: 17122376961 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202103753_clickbench_pr_89440.html





[GitHub] [doris] github-actions[bot] commented on pull request #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16374:
URL: https://github.com/apache/doris/pull/16374#issuecomment-1413546730

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] BiteTheDDDDt merged pull request #15837: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function

2023-02-02 Thread via GitHub


BiteTheDDDDt merged PR #15837:
URL: https://github.com/apache/doris/pull/15837





[doris] branch master updated: [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837)

2023-02-02 Thread panxiaolei
This is an automated email from the ASF dual-hosted git repository.

panxiaolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 0d5b115993 [Feature](Materialized-View) support duplicate base column 
for diffrent aggregate function (#15837)
0d5b115993 is described below

commit 0d5b1159930cc37edad3324aaffcf66855022d5c
Author: Pxl 
AuthorDate: Thu Feb 2 18:57:39 2023 +0800

[Feature](Materialized-View) support duplicate base column for diffrent 
aggregate function (#15837)

support duplicate base column for diffrent aggregate function
---
 .gitignore |   2 +
 be/src/olap/rowset/segment_v2/segment_writer.cpp   |   5 +-
 be/src/olap/schema_change.cpp  |  14 +-
 be/src/vec/exprs/vslot_ref.cpp |   6 +-
 .../doris/alter/MaterializedViewHandler.java   |  12 +-
 .../doris/analysis/CreateMaterializedViewStmt.java | 157 --
 .../main/java/org/apache/doris/analysis/Expr.java  |  41 -
 .../java/org/apache/doris/analysis/InsertStmt.java |  10 +-
 .../org/apache/doris/analysis/LiteralExpr.java |   6 +
 .../org/apache/doris/analysis/MVColumnItem.java|  10 +-
 .../doris/analysis/MVColumnOneChildPattern.java|   6 +-
 .../java/org/apache/doris/analysis/QueryStmt.java  |   6 +
 .../java/org/apache/doris/analysis/SelectStmt.java |   5 +-
 .../java/org/apache/doris/analysis/SlotRef.java|  12 +-
 .../main/java/org/apache/doris/catalog/Column.java |   2 +-
 .../java/org/apache/doris/catalog/FunctionSet.java |   3 +
 .../doris/catalog/MaterializedIndexMeta.java   |   1 +
 .../java/org/apache/doris/common/FeNameFormat.java |   4 +
 .../doris/planner/MaterializedViewSelector.java|  46 --
 .../org/apache/doris/planner/OlapScanNode.java |   5 +-
 .../org/apache/doris/planner/RollupSelector.java   |  13 +-
 .../apache/doris/planner/SingleNodePlanner.java|  30 +++-
 .../java/org/apache/doris/qe/StmtExecutor.java |  15 ++
 .../doris/rewrite/mvrewrite/CountFieldToSum.java   |  41 +++--
 .../doris/rewrite/mvrewrite/ExprToSlotRefRule.java | 179 ++---
 .../doris/rewrite/mvrewrite/MVExprEquivalent.java  |  48 ++
 .../doris/rewrite/mvrewrite/SlotRefEqualRule.java  |  11 +-
 .../analysis/CreateMaterializedViewStmtTest.java   | 106 +++-
 .../analysis/MVColumnOneChildPatternTest.java  |  11 +-
 .../doris/nereids/rules/mv/SelectMvIndexTest.java  |   1 +
 .../planner/MaterializedViewFunctionTest.java  |  25 +--
 .../planner/MaterializedViewSelectorTest.java  |   4 +-
 .../agg_have_dup_base/agg_have_dup_base.out|  25 +++
 .../materialized_view_p0/k1ap2spa/k1ap2spa.out |  13 ++
 .../test_dup_group_by_mv_abs.out   |  19 +++
 .../test_dup_group_by_mv_plus.out  |  19 +++
 .../agg_have_dup_base.groovy}  |  36 ++---
 .../k1ap2spa.groovy}   |  36 +
 .../test_dup_group_by_mv_abs.groovy}   |  36 +
 .../test_dup_group_by_mv_plus.groovy}  |  36 +
 .../test_dup_mv_abs/test_dup_mv_abs.groovy |   4 +
 .../test_dup_mv_bin/test_dup_mv_bin.groovy |   4 +
 .../test_dup_mv_plus/test_dup_mv_plus.groovy   |   4 +
 .../test_materialized_view_nereids.groovy  |   2 +
 44 files changed, 651 insertions(+), 420 deletions(-)

diff --git a/.gitignore b/.gitignore
index 01c6c35993..5ba2f22e45 100644
--- a/.gitignore
+++ b/.gitignore
@@ -95,3 +95,5 @@ tools/**/tpch-data/
 
 # be-ut
 data_test
+
+/conf/log4j2-spring.xml
diff --git a/be/src/olap/rowset/segment_v2/segment_writer.cpp 
b/be/src/olap/rowset/segment_v2/segment_writer.cpp
index 517a0c9ec3..3da1dbef56 100644
--- a/be/src/olap/rowset/segment_v2/segment_writer.cpp
+++ b/be/src/olap/rowset/segment_v2/segment_writer.cpp
@@ -208,7 +208,10 @@ Status SegmentWriter::init(const std::vector& 
col_ids, bool has_key) {
 
 Status SegmentWriter::append_block(const vectorized::Block* block, size_t 
row_pos,
size_t num_rows) {
-assert(block->columns() == _column_writers.size());
+CHECK(block->columns() == _column_writers.size())
+<< ", block->columns()=" << block->columns()
+<< ", _column_writers.size()=" << _column_writers.size();
+
 _olap_data_convertor->set_source_content(block, row_pos, num_rows);
 
 // find all row pos for short key indexes
diff --git a/be/src/olap/schema_change.cpp b/be/src/olap/schema_change.cpp
index 1f62169414..63a9a211eb 100644
--- a/be/src/olap/schema_change.cpp
+++ b/be/src/olap/schema_change.cpp
@@ -22,7 +22,6 @@
 #include "gutil/integral_types.h"
 #include "olap/merger.h"
 #include "olap/olap_common.h"
-#include "olap/row_cursor.h"
 #include "olap/rowset/segment_v2/column_reader.h"
 #include "olap/storage_engine.h"
 #include "olap/tablet.h"
@@ -39,8 +38,6 @@
 #incl

[GitHub] [doris] hello-stephen commented on pull request #16361: [fix](scan) coredump caused by null of _scanner_ctx

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16361:
URL: https://github.com/apache/doris/pull/16361#issuecomment-1413550386

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 33.9 seconds
load time: 511 seconds
storage size: 17123154634 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202105836_clickbench_pr_89453.html





[GitHub] [doris] github-actions[bot] commented on pull request #16363: [fix](nereids)the order exprs in sort node should be slotRef in its tupleDesc

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16363:
URL: https://github.com/apache/doris/pull/16363#issuecomment-1413558392

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #16363: [fix](nereids)the order exprs in sort node should be slotRef in its tupleDesc

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16363:
URL: https://github.com/apache/doris/pull/16363#issuecomment-1413558466

   PR approved by anyone and no changes requested.





[GitHub] [doris] hello-stephen commented on pull request #16363: [fix](nereids)the order exprs in sort node should be slotRef in its tupleDesc

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16363:
URL: https://github.com/apache/doris/pull/16363#issuecomment-1413587230

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.67 seconds
load time: 485 seconds
storage size: 17170743013 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202112733_clickbench_pr_89615.html





[GitHub] [doris] dataroaring merged pull request #16299: [fix](cooldown) Fix bugs in cooldown single replica files

2023-02-02 Thread via GitHub


dataroaring merged PR #16299:
URL: https://github.com/apache/doris/pull/16299





[doris] branch master updated (0d5b115993 -> 6ee0dbfb23)

2023-02-02 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


from 0d5b115993 [Feature](Materialized-View) support duplicate base column 
for diffrent aggregate function (#15837)
 add 6ee0dbfb23 [fix](cooldown) Fix bugs in cooldown single replica files 
(#16299)

No new revisions were added by this update.

Summary of changes:
 be/src/agent/task_worker_pool.cpp |  34 +++--
 be/src/olap/base_tablet.h |   4 +-
 be/src/olap/snapshot_manager.cpp  |   7 -
 be/src/olap/tablet.cpp| 276 +++---
 be/src/olap/tablet.h  |  53 +---
 be/src/olap/tablet_manager.cpp|   2 -
 be/src/olap/tablet_meta.cpp   |  18 +--
 be/src/olap/tablet_meta.h |  16 +--
 be/src/olap/version_graph.h   |   4 +
 be/test/olap/tablet_test.cpp  |   2 +-
 10 files changed, 199 insertions(+), 217 deletions(-)





[GitHub] [doris] github-actions[bot] commented on pull request #16367: [enhancement](stream receiver) make data stream receiver exception safe.

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16367:
URL: https://github.com/apache/doris/pull/16367#issuecomment-1413593626

   clang-tidy review says "All clean, LGTM! :+1:"





[doris] 02/06: [Refactor](function) opt the exec of function with null column (#16256)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 38c9fe7f8d7ec7e91c6821cd66c8d366e7d98bf0
Author: HappenLee 
AuthorDate: Wed Feb 1 15:56:31 2023 +0800

[Refactor](function) opt the exec of function with null column (#16256)
---
 be/src/vec/exprs/vectorized_fn_call.cpp   |   2 +
 be/src/vec/functions/function.cpp |  14 ++--
 be/src/vec/functions/function_cast.h  |  11 ++-
 be/src/vec/functions/function_helpers.cpp | 123 ++
 be/src/vec/functions/function_helpers.h   |  26 +++
 5 files changed, 86 insertions(+), 90 deletions(-)

diff --git a/be/src/vec/exprs/vectorized_fn_call.cpp 
b/be/src/vec/exprs/vectorized_fn_call.cpp
index 3999599716..a0e07d31f9 100644
--- a/be/src/vec/exprs/vectorized_fn_call.cpp
+++ b/be/src/vec/exprs/vectorized_fn_call.cpp
@@ -51,6 +51,8 @@ doris::Status VectorizedFnCall::prepare(doris::RuntimeState* 
state,
 argument_template.reserve(_children.size());
 std::vector child_expr_name;
 for (auto child : _children) {
+// TODO: rethink we really create column here. maybe only need nullptr 
just to
+// get the function
 auto column = child->data_type()->create_column();
 argument_template.emplace_back(std::move(column), child->data_type(), 
child->expr_name());
 child_expr_name.emplace_back(child->expr_name());
diff --git a/be/src/vec/functions/function.cpp 
b/be/src/vec/functions/function.cpp
index 41f3141c06..662a2a58af 100644
--- a/be/src/vec/functions/function.cpp
+++ b/be/src/vec/functions/function.cpp
@@ -217,11 +217,12 @@ Status 
PreparedFunctionImpl::default_implementation_for_nulls(
 }
 
 if (null_presence.has_nullable) {
-Block temporary_block = create_block_with_nested_columns(block, args, 
result);
+auto [temporary_block, new_args, new_result] =
+create_block_with_nested_columns(block, args, result);
 RETURN_IF_ERROR(execute_without_low_cardinality_columns(
-context, temporary_block, args, result, 
temporary_block.rows(), dry_run));
+context, temporary_block, new_args, new_result, 
temporary_block.rows(), dry_run));
 block.get_by_position(result).column =
-
wrap_in_nullable(temporary_block.get_by_position(result).column, block, args,
+
wrap_in_nullable(temporary_block.get_by_position(new_result).column, block, 
args,
  result, input_rows_count);
 *executed = true;
 return Status::OK();
@@ -295,10 +296,9 @@ DataTypePtr 
FunctionBuilderImpl::get_return_type_without_low_cardinality(
 }
 if (null_presence.has_nullable) {
 ColumnNumbers numbers(arguments.size());
-for (size_t i = 0; i < arguments.size(); i++) {
-numbers[i] = i;
-}
-Block nested_block = 
create_block_with_nested_columns(Block(arguments), numbers);
+std::iota(numbers.begin(), numbers.end(), 0);
+auto [nested_block, _] =
+create_block_with_nested_columns(Block(arguments), 
numbers, false);
 auto return_type = get_return_type_impl(
 ColumnsWithTypeAndName(nested_block.begin(), 
nested_block.end()));
 return make_nullable(return_type);
diff --git a/be/src/vec/functions/function_cast.h 
b/be/src/vec/functions/function_cast.h
index e3baaecdd2..a6817134ea 100644
--- a/be/src/vec/functions/function_cast.h
+++ b/be/src/vec/functions/function_cast.h
@@ -1592,7 +1592,9 @@ private:
 Block tmp_block;
 size_t tmp_res_index = 0;
 if (source_is_nullable) {
-tmp_block = 
create_block_with_nested_columns_only_args(block, arguments);
+auto [t_block, tmp_args] =
+create_block_with_nested_columns(block, arguments, 
true);
+tmp_block = std::move(t_block);
 tmp_res_index = tmp_block.columns();
 tmp_block.insert({nullptr, nested_type, ""});
 
@@ -1624,7 +1626,8 @@ private:
 return [wrapper, skip_not_null_check](FunctionContext* context, 
Block& block,
   const ColumnNumbers& 
arguments,
   const size_t result, size_t 
input_rows_count) {
-Block tmp_block = create_block_with_nested_columns(block, 
arguments, result);
+auto [tmp_block, tmp_args, tmp_res] =
+create_block_with_nested_columns(block, arguments, 
result);
 
 /// Check that all values are not-NULL.
 /// Check can be skipped in case if LowCardinality dictionary 
is transformed.
@@ -1640,8 +1643,8 @@ private:
 }
 }
 

[doris] 04/06: [fix](multi-catalog) remove the eof check among parquet columns (#16302)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 29e6480bc8be1fc882c3e7a1f28b7164a3b71c97
Author: Ashin Gau 
AuthorDate: Thu Feb 2 09:22:09 2023 +0800

[fix](multi-catalog) remove the eof check among parquet columns (#16302)

Read parquet file failed:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]Read 
parquet file xxx failed, reason = [CORRUPTION]The number of rows are not equal 
among parquet columns
```
This error may be thrown when reading non-predicate columns in lazy-read 
mode. For example:
A row group with 1000 rows has two non-predicate columns.
Column A has one page; column B has two pages of 500 rows each.
The read range of `ParquetColumnReader` is [0, 400), and the rows in 
[0, 450) are all filtered out by the predicate columns.
Column A can therefore skip the first page and reach EOF, while column B can 
also skip the first page but does not reach EOF.
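
A minimal sketch of the relaxed check, assuming a simplified model in which each column reports only a row count and an EOF flag for the batch. `ColumnRead`, `BatchResult`, and `combine_columns` are hypothetical names, not Doris types. Row counts must still agree across columns, but EOF flags may differ and are OR-ed together instead of being compared.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Result of reading one column for the current batch (hypothetical type).
struct ColumnRead {
    std::size_t rows;  // rows produced for this batch
    bool eof;          // whether this column's reader has consumed its last page
};

struct BatchResult {
    std::size_t rows;
    bool eof;
};

// Combines per-column results: a row-count mismatch is still an error
// (std::nullopt here stands in for the Corruption status), while EOF is
// reported as soon as any column has reached it.
std::optional<BatchResult> combine_columns(const std::vector<ColumnRead>& columns) {
    std::size_t batch_rows = 0;
    bool has_eof = false;
    for (const ColumnRead& col : columns) {
        if (batch_rows > 0 && batch_rows != col.rows) {
            return std::nullopt;
        }
        batch_rows = col.rows;
        if (col.eof) {
            has_eof = true;
        }
    }
    return BatchResult{batch_rows, has_eof};
}
```

With this rule, a batch in which column A reports EOF and column B does not is accepted as long as both report the same number of rows; the row counts used to exercise the sketch are illustrative only.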
---
 be/src/vec/exec/format/parquet/vparquet_group_reader.cpp | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp 
b/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp
index 34b478114e..5b1c0fd828 100644
--- a/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp
+++ b/be/src/vec/exec/format/parquet/vparquet_group_reader.cpp
@@ -134,7 +134,6 @@ Status RowGroupReader::_read_column_data(Block* block, 
const std::vectorget_by_name(read_col);
 auto& column_ptr = column_with_type_and_name.column;
@@ -150,15 +149,13 @@ Status RowGroupReader::_read_column_data(Block* block, 
const std::vector 0 && (has_eof ^ col_eof)) {
-return Status::Corruption("The number of rows are not equal among 
parquet columns");
-}
 if (batch_read_rows > 0 && batch_read_rows != col_read_rows) {
 return Status::Corruption("Can't read the same number of rows 
among parquet columns");
 }
 batch_read_rows = col_read_rows;
-has_eof = col_eof;
-col_idx++;
+if (col_eof) {
+has_eof = true;
+}
 }
 *read_rows = batch_read_rows;
 *batch_eof = has_eof;





[doris] 03/06: [Enhance] use fast_float::from_chars to do str cast to float/double to avoid lose precision (#16190)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit a1dcec461cd17f734ec7cff349328a1dcf96f55d
Author: HappenLee 
AuthorDate: Wed Feb 1 23:53:34 2023 +0800

[Enhance] use fast_float::from_chars to do str cast to float/double to 
avoid lose precision (#16190)
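
The parsing approach in this change can be sketched as below. This is not Doris's `StringParser`: `parse_double` is a hypothetical helper, and `std::from_chars` is used as a stand-in for `fast_float::from_chars` so the sketch needs no third-party header (it assumes a standard library with floating-point `std::from_chars`, such as GCC 11 or later). The sketch trims whitespace on both sides, drops one leading `+`, and accepts the input only if the whole remainder is consumed.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the parsed value, or std::nullopt on failure.
std::optional<double> parse_double(std::string_view s) {
    auto is_space = [](char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
    };
    // Trim leading and trailing ASCII whitespace.
    while (!s.empty() && is_space(s.front())) s.remove_prefix(1);
    while (!s.empty() && is_space(s.back())) s.remove_suffix(1);
    // from_chars accepts a leading '-' but not a leading '+', so skip it here.
    if (!s.empty() && s.front() == '+') s.remove_prefix(1);
    if (s.empty()) return std::nullopt;

    double value = 0;
    const char* first = s.data();
    const char* last = s.data() + s.size();
    auto res = std::from_chars(first, last, value);
    // Succeed only if parsing worked and every remaining character was consumed.
    if (res.ec != std::errc() || res.ptr != last) return std::nullopt;
    return value;
}
```

The diff additionally inspects infinite results to tell a literal `inf` apart from overflow; that step is omitted here.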
---
 be/src/util/string_parser.hpp   | 146 ++--
 be/test/util/string_parser_test.cpp |   5 +-
 2 files changed, 41 insertions(+), 110 deletions(-)

diff --git a/be/src/util/string_parser.hpp b/be/src/util/string_parser.hpp
index 653f0dac14..02006b7c7d 100644
--- a/be/src/util/string_parser.hpp
+++ b/be/src/util/string_parser.hpp
@@ -20,6 +20,8 @@
 
 #pragma once
 
+#include 
+
 #include 
 #include 
 #include 
@@ -111,13 +113,7 @@ public:
 
 template 
 static inline T string_to_float(const char* s, int len, ParseResult* 
result) {
-T ans = string_to_float_internal(s, len, result);
-if (LIKELY(*result == PARSE_SUCCESS)) {
-return ans;
-}
-
-int i = skip_leading_whitespace(s, len);
-return string_to_float_internal(s + i, len - i, result);
+return string_to_float_internal(s, len, result);
 }
 
 // Parses a string for 'true' or 'false', case insensitive.
@@ -425,118 +421,54 @@ inline T StringParser::string_to_int_no_overflow(const 
char* s, int len, ParseRe
 
 template 
 inline T StringParser::string_to_float_internal(const char* s, int len, 
ParseResult* result) {
-if (UNLIKELY(len <= 0)) {
+int i = 0;
+// skip leading spaces
+for (; i < len; ++i) {
+if (!is_whitespace(s[i])) {
+break;
+}
+}
+
+// skip back spaces
+int j = len - 1;
+for (; j >= i; j--) {
+if (!is_whitespace(s[j])) {
+break;
+}
+}
+
+// skip leading '+', from_chars can handle '-'
+if (i < len && s[i] == '+') {
+i++;
+}
+if (UNLIKELY(i > j)) {
 *result = PARSE_FAILURE;
 return 0;
 }
 
 // Use double here to not lose precision while accumulating the result
 double val = 0;
-bool negative = false;
-int i = 0;
-double divide = 1;
-bool decimal = false;
-int64_t remainder = 0;
-// The number of 'significant figures' we've encountered so far (i.e., 
digits excluding
-// leading 0s). This technically shouldn't count trailing 0s either, but 
for us it
-// doesn't matter if we count them based on the implementation below.
-int sig_figs = 0;
-
-switch (*s) {
-case '-':
-negative = true;
-case '+':
-i = 1;
-}
-
-int first = i;
-for (; i < len; ++i) {
-if (LIKELY(s[i] >= '0' && s[i] <= '9')) {
-if (s[i] != '0' || sig_figs > 0) {
-++sig_figs;
-}
-if (decimal) {
-// According to the IEEE floating-point spec, a double has up 
to 15-17
-// significant decimal digits (see
-// 
http://en.wikipedia.org/wiki/Double-precision_floating-point_format). We stop
-// processing digits after we've already seen at least 18 sig 
figs to avoid
-// overflowing 'remainder' (we stop after 18 instead of 17 to 
get the rounding
-// right).
-if (sig_figs <= 18) {
-remainder = remainder * 10 + s[i] - '0';
-divide *= 10;
+auto res = fast_float::from_chars(s + i, s + j + 1, val);
+
+if (res.ec == std::errc() && res.ptr == s + j + 1) {
+if (abs(val) == std::numeric_limits::infinity()) {
+auto contain_inf = false;
+for (int k = i; k < j + 1; k++) {
+if (s[k] == 'i' || s[k] == 'I') {
+contain_inf = true;
+break;
 }
-} else {
-val = val * 10 + s[i] - '0';
-}
-} else if (s[i] == '.') {
-decimal = true;
-} else if (s[i] == 'e' || s[i] == 'E') {
-break;
-} else if (s[i] == 'i' || s[i] == 'I') {
-if (len > i + 2 && (s[i + 1] == 'n' || s[i + 1] == 'N') &&
-(s[i + 2] == 'f' || s[i + 2] == 'F')) {
-// Note: Hive writes inf as Infinity, at least for text. We'll 
be a little loose
-// here and interpret any column with inf as a prefix as 
infinity rather than
-// checking every remaining byte.
-*result = PARSE_SUCCESS;
-return negative ? -INFINITY : INFINITY;
-} else {
-// Starts with 'i', but isn't inf...
-*result = PARSE_FAILURE;
-return 0;
-}
-} else if (s[i] == 'n' || s[i] == 'N') {
-if (len > i + 2 && (s[i + 1] == 'a' || s[i + 1] == 'A') &&
-(s[i + 2] == 'n' |
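
The added lines of the `string_to_float_internal` hunk above trim the input before parsing, and that step can be sketched on its own. This is a minimal illustration, not the Doris code: `Span`, `is_space` and `trim_for_float_parse` are hypothetical names, the whitespace set is an assumption, and the numeric conversion itself (done by `fast_float::from_chars` in the diff) is left out.

```cpp
// Sketch only: Span, is_space and trim_for_float_parse are hypothetical
// names used for illustration; they are not part of Doris.
struct Span {
    int begin; // index of the first character to hand to the parser
    int end;   // one past the last character; begin >= end means nothing to parse
};

// Assumed whitespace set (the usual ASCII whitespace characters).
constexpr bool is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

constexpr Span trim_for_float_parse(const char* s, int len) {
    // Skip leading whitespace.
    int i = 0;
    while (i < len && is_space(s[i])) {
        ++i;
    }
    // Skip trailing whitespace, never moving left of i.
    int j = len - 1;
    while (j >= i && is_space(s[j])) {
        --j;
    }
    // Skip a single leading '+'; a leading '-' is left for the parser.
    if (i <= j && s[i] == '+') {
        ++i;
    }
    return Span{i, j + 1};
}
```

An empty span (`begin >= end`) corresponds to the `PARSE_FAILURE` branch in the hunk; otherwise the range `[s + begin, s + end)` is what would be passed to the parser.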

[doris] branch branch-1.2-lts updated (495d37d337 -> be7c4d267e)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git


from 495d37d337 [fix](join) crash caused by canceling query (#16311) 
(#16349)
 new e48404e1c0 [fix](planner) create view generate wrong sql when sql 
contains multi count distinct (#16092)
 new 38c9fe7f8d [Refactor](function) opt the exec of function with null 
column (#16256)
 new a1dcec461c [Enhance] use fast_float::from_chars to do str cast to 
float/double to avoid lose precision (#16190)
 new 29e6480bc8 [fix](multi-catalog) remove the eof check among parquet 
columns (#16302)
 new df7200f8ae [test](regression) add tvf regression to test the remove of 
eof check (#16342)
 new be7c4d267e [branch1.2] fix compile bug after cherry-pick

The 6 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/util/string_parser.hpp  | 146 ++---
 .../exec/format/parquet/vparquet_group_reader.cpp  |   9 +-
 be/src/vec/exprs/vectorized_fn_call.cpp|   2 +
 be/src/vec/functions/function.cpp  |  14 +-
 be/src/vec/functions/function_cast.h   |  11 +-
 be/src/vec/functions/function_helpers.cpp  | 119 -
 be/src/vec/functions/function_helpers.h|  26 ++--
 be/test/util/string_parser_test.cpp|   5 +-
 .../org/apache/doris/analysis/BaseViewStmt.java|   1 +
 .../apache/doris/analysis/FunctionCallExpr.java|   1 +
 .../java/org/apache/doris/analysis/SelectStmt.java |   8 +-
 regression-test/conf/regression-conf.groovy|  29 
 .../external_table_emr_p2/hive/test_tvf_p2.out |  32 +
 .../suites/ddl_p0/test_create_view.groovy  |  72 ++
 .../external_table_emr_p2/hive/test_tvf_p2.groovy  |  31 ++---
 15 files changed, 276 insertions(+), 230 deletions(-)
 create mode 100644 
regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out
 create mode 100644 regression-test/suites/ddl_p0/test_create_view.groovy
 copy be/src/runtime/tuple_row.cpp => 
regression-test/suites/external_table_emr_p2/hive/test_tvf_p2.groovy (55%)





[doris] 01/06: [fix](planner) create view generate wrong sql when sql contains multi count distinct (#16092)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit e48404e1c08f4887506181a4b6dd577cc4dc534e
Author: morrySnow <101034200+morrys...@users.noreply.github.com>
AuthorDate: Tue Jan 31 23:42:53 2023 +0800

[fix](planner) create view generate wrong sql when sql contains multi count 
distinct (#16092)

If the SQL in a CREATE VIEW statement has more than one count distinct and 
names its columns explicitly, the generated SQL contains the function 
multi_count_distinct. That function cannot be analyzed, so every query that 
uses the view fails.
---
 .../org/apache/doris/analysis/BaseViewStmt.java|  1 +
 .../java/org/apache/doris/analysis/SelectStmt.java |  8 ++-
 .../suites/ddl_p0/test_create_view.groovy  | 72 ++
 3 files changed, 80 insertions(+), 1 deletion(-)

diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java 
b/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java
index 477e440f5e..8114448f0d 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/BaseViewStmt.java
@@ -117,6 +117,7 @@ public class BaseViewStmt extends DdlStmt {
 
 Analyzer tmpAnalyzer = new Analyzer(analyzer);
 List colNames = cols.stream().map(c -> 
c.getColName()).collect(Collectors.toList());
+cloneStmt.setNeedToSql(true);
 cloneStmt.substituteSelectList(tmpAnalyzer, colNames);
 
 try (ToSqlContext toSqlContext = 
ToSqlContext.getOrNewThreadLocalContext()) {
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java 
b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
index 1aae7c324f..8f47a9ed04 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SelectStmt.java
@@ -1898,7 +1898,7 @@ public class SelectStmt extends QueryStmt {
 if (i != 0) {
 strBuilder.append(", ");
 }
-if (needToSql) {
+if (needToSql && CollectionUtils.isNotEmpty(originalExpr)) {
 strBuilder.append(originalExpr.get(i).toSql());
 } else {
 strBuilder.append(resultExprs.get(i).toSql());
@@ -2072,6 +2072,9 @@ public class SelectStmt extends QueryStmt {
 // Resolve and replace non-InlineViewRef table refs with a 
BaseTableRef or ViewRef.
 TableRef tblRef = fromClause.get(i);
 tblRef = analyzer.resolveTableRef(tblRef);
+if (tblRef instanceof InlineViewRef) {
+((InlineViewRef) tblRef).setNeedToSql(needToSql);
+}
 Preconditions.checkNotNull(tblRef);
 fromClause.set(i, tblRef);
 tblRef.setLeftTblRef(leftTblRef);
@@ -2101,6 +2104,9 @@ public class SelectStmt extends QueryStmt {
 resultExprs.add(item.getExpr());
 }
 }
+if (needToSql) {
+originalExpr = Expr.cloneList(resultExprs);
+}
 // substitute group by
 if (groupByClause != null) {
 boolean aliasFirst = false;
diff --git a/regression-test/suites/ddl_p0/test_create_view.groovy 
b/regression-test/suites/ddl_p0/test_create_view.groovy
new file mode 100644
index 00..4c401017ee
--- /dev/null
+++ b/regression-test/suites/ddl_p0/test_create_view.groovy
@@ -0,0 +1,72 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite("test_create_view") {
+
+sql """DROP TABLE IF EXISTS count_distinct"""
+sql """
+CREATE TABLE IF NOT EXISTS count_distinct
+(
+RQ DATE NOT NULL  COMMENT "日期",
+v1 VARCHAR(100) NOT NULL  COMMENT "字段1",
+v2 VARCHAR(100) NOT NULL  COMMENT "字段2",
+v3 VARCHAR(100) REPLACE_IF_NOT_NULL  COMMENT "字段3"
+)
+AGGREGATE KEY(RQ,v1,v2)
+PARTITION BY RANGE(RQ)
+(
+PARTITION p20220908 VALUES LESS THAN ('2022-09-09')
+)
+DISTRIBUTED BY HASH(v1,v2) BUCKE

[doris] 05/06: [test](regression) add tvf regression to test the remove of eof check (#16342)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit df7200f8aeacead5b42ec9b74b528dab2a806675
Author: Ashin Gau 
AuthorDate: Thu Feb 2 10:06:36 2023 +0800

[test](regression) add tvf regression to test the remove of eof check 
(#16342)

Add regression test for #16302. This regression test fails if an EOF check 
is added for non-predicate columns.
---
 regression-test/conf/regression-conf.groovy| 29 
 .../external_table_emr_p2/hive/test_tvf_p2.out | 32 ++
 .../external_table_emr_p2/hive/test_tvf_p2.groovy  | 30 
 3 files changed, 91 insertions(+)

diff --git a/regression-test/conf/regression-conf.groovy 
b/regression-test/conf/regression-conf.groovy
index 544d4b12c1..6779873340 100644
--- a/regression-test/conf/regression-conf.groovy
+++ b/regression-test/conf/regression-conf.groovy
@@ -94,6 +94,35 @@ es_8_port=39200
 
 cacheDataPath = "/tmp"
 
+//hive  catalog test config for bigdata
+enableExternalHiveTest = false
+extHiveHmsHost = "***.**.**.**"
+extHiveHmsPort = 7004
+extHdfsPort = 4007
+extHiveHmsUser = ""
+extHiveHmsPassword= "***"
+
+//mysql jdbc connector test config for bigdata
+enableExternalMysqlTest = false
+extMysqlHost = "***.**.**.**"
+extMysqlPort = 3306
+extMysqlUser = ""
+extMysqlPassword = "***"
+
+//postgresql jdbc connector test config for bigdata
+enableExternalPgTest = false
+extPgHost = "***.**.**.*"
+extPgPort = 5432
+extPgUser = ""
+extPgPassword = "***"
+
+// elasticsearch external test config for bigdata
+enableExternalEsTest = false
+extEsHost = "***"
+extEsPort = 9200
+extEsUser = "***"
+extEsPassword = "***"
+
 s3Endpoint = "cos.ap-hongkong.myqcloud.com"
 s3BucketName = "doris-build-hk-1308700295"
 s3Region = "ap-hongkong"
diff --git a/regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out 
b/regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out
new file mode 100644
index 00..7f94b13974
--- /dev/null
+++ b/regression-test/data/external_table_emr_p2/hive/test_tvf_p2.out
@@ -0,0 +1,32 @@
+-- This file is automatically generated. You should know what you did if you 
want to edit this
+-- !eof_check --
+2451718\N  9242\N  \N  2886\N  4   250 
1374252 18  \N  \N  \N  0   15131435\N  \N  
0   \N  158878
+\N \N  14846   1945858 \N  1015\N  4   581 2383831 
\N  \N  5   1   0   110 \N  \N  \N  0   
110 \N  -213
+\N 50835   25618   1166535 \N  1748\N  4   \N  2880907 
7   \N  17  \N  \N  115 \N  125 1   \N  
115 \N  \N
+245219545280   29385   1298621 1649018 1815\N  4   \N  
3379765 24  73  \N  \N  0   261717703399\N  
0   \N  2826\N
+245148853117   31945   \N  8644\N  \N  4   783 
4877135 100 \N  \N  \N  \N  \N  3450\N  \N  
\N  565 581 -2885
+\N 53900   35887   702626  \N  2568\N  4   \N  2381514 
\N  \N  \N  0   \N  \N  \N  \N  1   \N  
19  20  -357
+\N 53985   38881   760602  289764  \N  \N  4   227 3377513 
68  75  \N  \N  \N  5833\N  \N  \N  \N  
524 \N  -4588
+\N \N  51685   1833943 \N  \N  \N  4   \N  1879197 
\N  \N  \N  \N  0   46  163 \N  \N  0   
\N  49  -116
+\N \N  62073   \N  287578  \N  \N  4   990 
1626478990  91  \N  \N  0   63818247\N  
\N  0   6381\N  \N
+\N 34914   64259   167395  897626  \N  \N  4   327 
1937905815  \N  \N  51  \N  \N  1480\N  
\N  \N  \N  \N  -707
+\N 70509   100949  \N  \N  \N  \N  4   185 2381361 
35  1   \N  \N  0   \N  41  \N  \N  0   
\N  82  33
+245248974165   103575  \N  1359778 \N  \N  4   \N  
2383538 1   \N  23  0   \N  0   15  23  \N  
\N  0   \N  -14
+2451253\N  111502  246668  \N  \N  \N  4   \N  
2881367 \N  \N  \N  21  0   \N  \N  121874  
0   \N  999 -49
+2451093\N  121339  \N  \N  \N  \N  4   894 
1937908811  92  \N  \N  \N  \N  \N  1364
9   \N  305 314 \N
+2452592 

[doris] 06/06: [branch1.2] fix compile bug after cherry-pick

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit be7c4d267e09d0b6f7ed6930684f270d1324f91d
Author: morningman 
AuthorDate: Thu Feb 2 18:11:55 2023 +0800

[branch1.2] fix compile bug after cherry-pick
---
 be/src/vec/functions/function_helpers.cpp | 8 
 .../src/main/java/org/apache/doris/analysis/FunctionCallExpr.java | 1 +
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/be/src/vec/functions/function_helpers.cpp 
b/be/src/vec/functions/function_helpers.cpp
index c77f3c5ab7..c42d0dc0ea 100644
--- a/be/src/vec/functions/function_helpers.cpp
+++ b/be/src/vec/functions/function_helpers.cpp
@@ -79,14 +79,6 @@ std::tuple 
create_block_with_nested_columns(const Block& b
 }
 }
 
-// TODO: only support match function, rethink the logic
-for (const auto& ctn : block) {
-if (ctn.name.size() > BeConsts::BLOCK_TEMP_COLUMN_PREFIX.size() &&
-starts_with(ctn.name, BeConsts::BLOCK_TEMP_COLUMN_PREFIX)) {
-res.insert(ctn);
-}
-}
-
 return {res, res_args};
 }
 
diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java 
b/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
index 526d2fd9f6..3aff4d3517 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/FunctionCallExpr.java
@@ -835,6 +835,7 @@ public class FunctionCallExpr extends Expr {
 if (!getChild(1).isConstant()) {
 throw new AnalysisException(fnName + "function's second 
argument should be constant");
 }
+throw new AnalysisException(fnName + "not support on vectorized 
engine now.");
 }
 
 if ((fnName.getFunction().equalsIgnoreCase("HLL_UNION_AGG")





[GitHub] [doris] hello-stephen commented on pull request #16364: [Bug](CURRENT_TIMESTAMP) Fix wrong default value after schema change

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16364:
URL: https://github.com/apache/doris/pull/16364#issuecomment-141368

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.07 seconds
load time: 490 seconds
storage size: 17171763885 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202113838_clickbench_pr_89475.html





[GitHub] [doris] YangShaw commented on a diff in pull request #14397: [feature](nereids)support window function

2023-02-02 Thread via GitHub


YangShaw commented on code in PR #14397:
URL: https://github.com/apache/doris/pull/14397#discussion_r1094410796


##
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/Window.java:
##
@@ -0,0 +1,176 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.trees.expressions;
+
+import org.apache.doris.nereids.exceptions.UnboundException;
+import org.apache.doris.nereids.properties.OrderKey;
+import org.apache.doris.nereids.trees.expressions.functions.PropagateNullable;
+import org.apache.doris.nereids.trees.expressions.shape.UnaryExpression;
+import org.apache.doris.nereids.trees.expressions.visitor.ExpressionVisitor;
+import org.apache.doris.nereids.types.DataType;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+
+import java.util.List;
+import java.util.Objects;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+/**
+ * represents window function. WindowFunction of this window is saved as 
Window's child,
+ * which is an UnboundFunction at first and will be analyzed as relevant 
BoundFunction
+ * (can be a WindowFunction or AggregateFunction) after BindFunction.
+ */
+public class Window extends Expression implements UnaryExpression, 
PropagateNullable {

Review Comment:
   so stupid what I have ever done..



##
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/logical/NormalizeWindow.java:
##
@@ -0,0 +1,165 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.rules.rewrite.logical;
+
+import org.apache.doris.nereids.properties.OrderKey;
+import org.apache.doris.nereids.rules.Rule;
+import org.apache.doris.nereids.rules.RuleType;
+import org.apache.doris.nereids.rules.rewrite.OneRewriteRuleFactory;
+import org.apache.doris.nereids.trees.expressions.Alias;
+import org.apache.doris.nereids.trees.expressions.Expression;
+import org.apache.doris.nereids.trees.expressions.NamedExpression;
+import org.apache.doris.nereids.trees.expressions.Slot;
+import org.apache.doris.nereids.trees.expressions.SlotReference;
+import org.apache.doris.nereids.trees.expressions.Window;
+import org.apache.doris.nereids.trees.plans.Plan;
+import org.apache.doris.nereids.trees.plans.logical.LogicalProject;
+import org.apache.doris.nereids.trees.plans.logical.LogicalWindow;
+import org.apache.doris.nereids.util.ExpressionUtils;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+
+import java.util.List;
+import java.util.Optional;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * NormalizeWindow: generate bottomProject for expressions within Window, and 
topProject for origin output of SQL
+ * e.g. SELECT k1#1, k2#2, SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY k5#5) 
FROM t
+ *
+ * Original Plan:
+ * LogicalWindow(
+ *   outputs:[k1#1, k2#2, Alias(SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY 
k5#5)#6],
+ *   windowExpressions:[]
+ *   )
+ *
+ * After Normalize:
+ * LogicalProject(k1#1, k2#2, Alias(SlotReference#7)#6)
+ * +-- LogicalWindow(
+ *   outputs:[k1#1, k2#2, Alias(SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY 
k5#5)#6],
+ *   windowExpressions:[Alias(SUM(k3#3) OVER (PARTITION BY k4#4 ORDER BY 
k5#5)#6]
+ *   )
+ *   +-- LogicalProject(k1#1, k2#2, k3#3, k4#4, k5#5)
+ *
+ */
+publ

[GitHub] [doris] YangShaw commented on a diff in pull request #14397: [feature](nereids)support window function

2023-02-02 Thread via GitHub


YangShaw commented on code in PR #14397:
URL: https://github.com/apache/doris/pull/14397#discussion_r1094411155


##
fe/fe-core/src/main/java/org/apache/doris/nereids/parser/LogicalPlanBuilder.java:
##
@@ -1437,12 +1530,19 @@ private LogicalPlan withProjection(LogicalPlan input, 
SelectColumnClauseContext
 expressions, input, isDistinct);
 } else {
 List projects = 
getNamedExpressions(selectCtx.namedExpressionSeq());
-return new LogicalProject<>(projects, ImmutableList.of(), 
input, isDistinct);
+if (containsWindowExpressions(projects)) {
+return new LogicalWindow<>(projects, input);
+}
+return new LogicalProject<>(projects, 
Collections.emptyList(), input, isDistinct);

Review Comment:
   nice idea~






[GitHub] [doris] github-actions[bot] commented on pull request #16073: [feature](Load)Add cluster_token auth for stream load to avoid double auth in mysql load

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16073:
URL: https://github.com/apache/doris/pull/16073#issuecomment-1413617156

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] yixiutt opened a new pull request, #16377: [fix](vertical compaction) fix uint32_t init value

2023-02-02 Thread via GitHub


yixiutt opened a new pull request, #16377:
URL: https://github.com/apache/doris/pull/16377

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] xiaokang commented on a diff in pull request #16263: [Improve](row-store) support row cache

2023-02-02 Thread via GitHub


xiaokang commented on code in PR #16263:
URL: https://github.com/apache/doris/pull/16263#discussion_r109442


##
be/src/common/config.h:
##
@@ -246,6 +246,7 @@ CONF_mBool(row_nums_check, "true");
 // modify them upon necessity
 CONF_Int32(min_file_descriptor_number, "6");
 CONF_Int64(index_stream_cache_capacity, "10737418240");
+CONF_String(row_cache_mem_limit, "20%");

Review Comment:
   default 20% is a little large compared to page_cache



##
be/src/vec/jsonb/serialize.cpp:
##
@@ -319,4 +319,22 @@ void JsonbSerializeUtil::jsonb_to_block(const 
TupleDescriptor& desc,
 }
 }
 
+// single row
+void JsonbSerializeUtil::jsonb_to_block(const TupleDescriptor& desc, const 
Slice& data,

Review Comment:
   jsonb_to_block for single row can be reused by jsonb_to_block for multiple 
rows



##
be/src/olap/rowset/segment_v2/segment_writer.cpp:
##
@@ -252,8 +253,13 @@ Status SegmentWriter::append_block(const 
vectorized::Block* block, size_t row_po
 if (_tablet_schema->keys_type() == UNIQUE_KEYS && 
_opts.enable_unique_key_merge_on_write) {
 // create primary indexes
 for (size_t pos = 0; pos < num_rows; pos++) {
-RETURN_IF_ERROR(
-
_primary_key_index_builder->add_item(_full_encode_keys(key_columns, pos)));
+const std::string& key = _full_encode_keys(key_columns, pos);
+RETURN_IF_ERROR(_primary_key_index_builder->add_item(key));
+if (!config::disable_storage_row_cache && 
_tablet_schema->store_row_column() &&
+_opts.is_direct_write) {
+// invalidate cache
+RowCache::instance()->erase({_opts.rowset_ctx->tablet_id, 
key});

Review Comment:
   can insert new value to cache






[GitHub] [doris] github-actions[bot] commented on pull request #16377: [fix](vertical compaction) fix uint32_t init value

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16377:
URL: https://github.com/apache/doris/pull/16377#issuecomment-1413628647

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] zhannngchen commented on a diff in pull request #16377: [fix](vertical compaction) fix uint32_t init value

2023-02-02 Thread via GitHub


zhannngchen commented on code in PR #16377:
URL: https://github.com/apache/doris/pull/16377#discussion_r109053


##
be/src/vec/olap/vertical_merge_iterator.h:
##
@@ -190,14 +190,14 @@ class VerticalMergeIteratorContext {
 size_t _ori_return_cols = 0;
 
 // segment order, used to compare key
-uint32_t _order = -1;
+uint32_t _order = 0;
 
-uint32_t _seq_col_idx = -1;

Review Comment:
   Would changing `_seq_col_idx` to int32_t be better? That would align with the type in the schema.
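
The reason `-1` is a questionable initializer for a `uint32_t` member can be shown in a few lines. This is a minimal sketch using only the standard library; `wrapped_minus_one` and `is_unset` are hypothetical helper names, not Doris functions.

```cpp
#include <cstdint>
#include <limits>

// Converting -1 to an unsigned 32-bit type wraps modulo 2^32,
// so the stored value is the largest uint32_t, not "minus one".
constexpr std::uint32_t wrapped_minus_one() {
    return static_cast<std::uint32_t>(-1);
}

// Hypothetical helper: with a signed index, "not set" can be tested as idx < 0.
// The same test on an unsigned index would always be false.
constexpr bool is_unset(std::int32_t idx) {
    return idx < 0;
}
```

With a signed `int32_t`, `-1` stays usable as a "not set" sentinel, which appears to be what the comment suggests for `_seq_col_idx`.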






[GitHub] [doris] zhannngchen commented on a diff in pull request #16377: [fix](vertical compaction) fix uint32_t init value

2023-02-02 Thread via GitHub


zhannngchen commented on code in PR #16377:
URL: https://github.com/apache/doris/pull/16377#discussion_r109691


##
be/src/vec/olap/vertical_merge_iterator.h:
##
@@ -190,14 +190,14 @@ class VerticalMergeIteratorContext {
 size_t _ori_return_cols = 0;
 
 // segment order, used to compare key
-uint32_t _order = -1;
+uint32_t _order = 0;
 
-uint32_t _seq_col_idx = -1;

Review Comment:
   Would changing `_seq_col_idx` to int32_t be better? That would align with the type in the schema.






[GitHub] [doris] github-actions[bot] commented on pull request #16073: [feature](Load)Add cluster_token auth for stream load to avoid double auth in mysql load

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16073:
URL: https://github.com/apache/doris/pull/16073#issuecomment-1413684972

   clang-tidy review says "All clean, LGTM! :+1:"





[doris] branch master updated: [refactor](row-store) make row store column a hidden column in meta (#16251)

2023-02-02 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 1d8265c5a3 [refactor](row-store) make row store column a hidden column 
in meta (#16251)
1d8265c5a3 is described below

commit 1d8265c5a3df818fbf87a12192ee419a7593d851
Author: lihangyu <15605149...@163.com>
AuthorDate: Thu Feb 2 20:56:13 2023 +0800

[refactor](row-store) make row store column a hidden column in meta (#16251)

This could simplify the storage engine logic and make the code more readable, 
and we could analyze the length of the hidden `__DORIS_ROW_STORE_COL__` 
column, etc.
---
 be/src/common/consts.h |  3 +-
 be/src/olap/compaction.cpp |  5 
 be/src/olap/memtable.cpp   | 32 
 be/src/olap/memtable.h |  4 +++
 be/src/olap/rowset/beta_rowset_writer.cpp  | 34 --
 be/src/olap/rowset/beta_rowset_writer.h|  1 -
 be/src/olap/rowset/segment_v2/segment.cpp  | 13 -
 be/src/olap/rowset/segment_v2/segment.h|  1 -
 be/src/olap/rowset/segment_v2/segment_writer.cpp   | 29 --
 be/src/olap/rowset/segment_v2/segment_writer.h |  2 --
 be/src/olap/rowset/vertical_beta_rowset_writer.cpp |  3 --
 be/src/olap/schema.h   |  2 +-
 be/src/olap/tablet.cpp |  8 ++---
 be/src/olap/tablet_schema.cpp  | 17 +++
 be/src/olap/tablet_schema.h|  3 +-
 be/src/vec/jsonb/serialize.cpp |  4 +++
 .../java/org/apache/doris/analysis/ColumnDef.java  |  6 
 .../org/apache/doris/analysis/CreateTableStmt.java |  7 -
 .../main/java/org/apache/doris/catalog/Column.java |  6 
 19 files changed, 73 insertions(+), 107 deletions(-)

diff --git a/be/src/common/consts.h b/be/src/common/consts.h
index f6c7ece8e0..bf7a2e6013 100644
--- a/be/src/common/consts.h
+++ b/be/src/common/consts.h
@@ -26,11 +26,10 @@ const std::string CSV_WITH_NAMES = "csv_with_names";
 const std::string CSV_WITH_NAMES_AND_TYPES = "csv_with_names_and_types";
 const std::string BLOCK_TEMP_COLUMN_PREFIX = "__TEMP__";
 const std::string ROWID_COL = "__DORIS_ROWID_COL__";
-const std::string SOURCE_COL = "__DORIS_SOURCE_COL__";
+const std::string ROW_STORE_COL = "__DORIS_ROW_STORE_COL__";
 
 constexpr int MAX_DECIMAL32_PRECISION = 9;
 constexpr int MAX_DECIMAL64_PRECISION = 18;
 constexpr int MAX_DECIMAL128_PRECISION = 38;
-constexpr int SOURCE_COL_UNIQUE_ID = INT32_MAX;
 } // namespace BeConsts
 } // namespace doris
diff --git a/be/src/olap/compaction.cpp b/be/src/olap/compaction.cpp
index eb7d2521bf..76c7fc3374 100644
--- a/be/src/olap/compaction.cpp
+++ b/be/src/olap/compaction.cpp
@@ -276,11 +276,6 @@ Status Compaction::do_compaction_impl(int64_t permits) {
 stats.rowid_conversion = &_rowid_conversion;
 }
 
-if (_cur_tablet_schema->store_row_column()) {
-// table with row column not support vertical compaction now
-vertical_compaction = false;
-}
-
 if (use_vectorized_compaction) {
 if (vertical_compaction) {
 res = Merger::vertical_merge_rowsets(_tablet, compaction_type(), _cur_tablet_schema,
diff --git a/be/src/olap/memtable.cpp b/be/src/olap/memtable.cpp
index c56683a9c8..9ec3cb0fda 100644
--- a/be/src/olap/memtable.cpp
+++ b/be/src/olap/memtable.cpp
@@ -27,6 +27,7 @@
 #include "vec/aggregate_functions/aggregate_function_reader.h"
 #include "vec/aggregate_functions/aggregate_function_simple_factory.h"
 #include "vec/core/field.h"
+#include "vec/jsonb/serialize.h"
 
 namespace doris {
 using namespace ErrorCode;
@@ -356,6 +357,10 @@ Status MemTable::_do_flush(int64_t& duration_ns) {
 SCOPED_RAW_TIMER(&duration_ns);
 _collect_vskiplist_results();
 vectorized::Block block = _output_mutable_block.to_block();
+if (_tablet_schema->store_row_column()) {
+// convert block to row store format
+serialize_block_to_row_column(block);
+}
 RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size));
 return Status::OK();
 }
@@ -364,4 +369,31 @@ Status MemTable::close() {
 return flush();
 }
 
+void MemTable::serialize_block_to_row_column(vectorized::Block& block) {
+if (block.rows() == 0) {
+return;
+}
+MonotonicStopWatch watch;
+watch.start();
+// find row column id
+int row_column_id = 0;
+for (int i = 0; i < _tablet_schema->num_columns(); ++i) {
+if (_tablet_schema->column(i).is_row_store_column()) {
+row_column_id = i;
+break;
+}
+}
+vectorized::ColumnString* row_store_column =
+
static_cast(block.get_by_position(row_column_id)
+  

[GitHub] [doris] dataroaring merged pull request #16251: [refactor](row-store) make row store column a hidden column in meta

2023-02-02 Thread via GitHub


dataroaring merged PR #16251:
URL: https://github.com/apache/doris/pull/16251





[GitHub] [doris] hello-stephen commented on pull request #16368: [enhance](Nereids): polish code

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16368:
URL: https://github.com/apache/doris/pull/16368#issuecomment-1413700547

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 33.68 seconds
load time: 494 seconds
storage size: 17170879622 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202125543_clickbench_pr_89635.html





[GitHub] [doris] BiteTheDDDDt opened a new pull request, #16378: [Feature](Materialized-View) support multiple slot on one column in materialized view

2023-02-02 Thread via GitHub


BiteTheDDDDt opened a new pull request, #16378:
URL: https://github.com/apache/doris/pull/16378

   # Proposed changes
   
   support multiple slot on one column in materialized view
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] dataroaring merged pull request #16358: [Improve](row-store) check light schema change must enabled

2023-02-02 Thread via GitHub


dataroaring merged PR #16358:
URL: https://github.com/apache/doris/pull/16358





[doris] branch master updated: [Improve](row-store) check light schema change enabled (#16358)

2023-02-02 Thread dataroaring
This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 13f74088fa [Improve](row-store) check light schema change enabled (#16358)
13f74088fa is described below

commit 13f74088fad806de6145ff6081a7056863d35d06
Author: lihangyu <15605149...@163.com>
AuthorDate: Thu Feb 2 20:57:18 2023 +0800

[Improve](row-store) check light schema change enabled (#16358)
---
 .../org/apache/doris/datasource/InternalCatalog.java   |  4 
 .../suites/point_query_p0/test_point_query.groovy  | 18 +-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
index c8214ec7d4..31455d814c 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java
@@ -1949,6 +1949,10 @@ public class InternalCatalog implements CatalogIf {
 boolean storeRowColumn = false;
 try {
 storeRowColumn = PropertyAnalyzer.analyzeStoreRowColumn(properties);
+if (storeRowColumn && !enableLightSchemaChange) {
+throw new DdlException(
+"Row store column rely on light schema change, enable light schema change first");
+}
 } catch (AnalysisException e) {
 throw new DdlException(e.getMessage());
 }
diff --git a/regression-test/suites/point_query_p0/test_point_query.groovy b/regression-test/suites/point_query_p0/test_point_query.groovy
index 6806650dfb..5d36a4eb31 100644
--- a/regression-test/suites/point_query_p0/test_point_query.groovy
+++ b/regression-test/suites/point_query_p0/test_point_query.groovy
@@ -24,6 +24,22 @@ suite("test_point_query") {
 def url = context.config.jdbcUrl + "&useServerPrepStmts=true"
 def result1 = connect(user=user, password=password, url=url) {
 sql """DROP TABLE IF EXISTS ${tableName}"""
+test {
+// abnormal case
+sql """
+  CREATE TABLE IF NOT EXISTS ${tableName} (
+`k1` int NULL COMMENT ""
+  ) ENGINE=OLAP
+  UNIQUE KEY(`k1`)
+  DISTRIBUTED BY HASH(`k1`) BUCKETS 1
+  PROPERTIES (
+  "replication_allocation" = "tag.location.default: 1",
+  "store_row_column" = "true",
+  "light_schema_change" = "false"
+  )
+  """
+exception "errCode = 2, detailMessage = Row store column rely on light schema change, enable light schema change first"
+}
 sql """
   CREATE TABLE IF NOT EXISTS ${tableName} (
 `k1` int(11) NULL COMMENT "",
@@ -123,4 +139,4 @@ suite("test_point_query") {
 qt_sql """execute stmt2 using (1231, 119291.11, 'ddd')"""
 qt_sql """execute stmt2 using (1237, 120939.11130, 'addd')"""
 }
-}
\ No newline at end of file
+}





[doris] branch branch-1.2-lts updated (be7c4d267e -> fb5420c262)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a change to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git


from be7c4d267e [branch1.2] fix compile bug after cherry-pick
 new 231952a57d [fix](load) sequence column do not compare correctly in memtable (#16211)
 new d4b8629c77 [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)
 new fb5420c262 [improvement](multi-catalog) increase default batch_size to 4064 (#16326)

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/exec/table_connector.cpp|   9 +-
 be/src/olap/memtable.cpp   |   5 +-
 be/src/vec/exec/format/csv/csv_reader.cpp  |   2 +-
 be/src/vec/exec/format/generic_reader.h|   2 +
 be/src/vec/exec/format/json/new_json_reader.cpp|   2 +-
 be/src/vec/exec/format/orc/vorc_reader.cpp |   2 +-
 be/src/vec/exec/format/parquet/vparquet_reader.cpp |   2 +-
 be/test/olap/delta_writer_test.cpp | 115 ++---
 .../docker-compose/mysql/init/03-create-table.sql  |   5 +
 .../docker-compose/oracle/init/03-create-table.sql |   6 ++
 .../postgresql/init/02-create-table.sql|   6 ++
 .../java/org/apache/doris/analysis/InsertStmt.java |  39 +--
 .../java/org/apache/doris/qe/SessionVariable.java  |   4 +-
 .../doris/transaction/DatabaseTransactionMgr.java  |   3 +-
 .../doris/transaction/GlobalTransactionMgr.java|   5 +-
 .../unique/test_unique_table_new_sequence.out  |   8 +-
 .../unique/test_unique_table_sequence.out  |   6 +-
 .../data/data_model_p0/unique/unique_key_data1.csv |   1 +
 .../data/data_model_p0/unique/unique_key_data2.csv |   3 +-
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.out|  13 +++
 .../jdbc_catalog_p0/test_oracle_jdbc_catalog.out   |  13 +++
 .../data/jdbc_catalog_p0/test_pg_jdbc_catalog.out  |  13 +++
 .../unique/test_unique_table_new_sequence.groovy   |   8 +-
 .../unique/test_unique_table_sequence.groovy   |   8 +-
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.groovy |  23 -
 .../test_oracle_jdbc_catalog.groovy|  14 ++-
 .../jdbc_catalog_p0/test_pg_jdbc_catalog.groovy|  13 +++
 27 files changed, 248 insertions(+), 82 deletions(-)





[doris] 01/03: [fix](load) sequence column do not compare correctly in memtable (#16211)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 231952a57d9ce300588a23f7e2ee0614fc58259b
Author: zhannngchen <48427519+zhannngc...@users.noreply.github.com>
AuthorDate: Thu Feb 2 11:00:23 2023 +0800

[fix](load) sequence column do not compare correctly in memtable (#16211)
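
The fix in this commit compares the sequence column of the stored row with that of the new row and skips the update when the stored value is larger. A minimal sketch of that rule follows, assuming a hypothetical `Row` struct and assuming that the new row otherwise replaces the stored one (the excerpt shows only the early return, not what happens afterwards). `compare_sequence` and `merge_row` are illustrative names, not Doris functions.

```cpp
#include <cstdint>

// Hypothetical row: a key, a payload, and a sequence value.
struct Row {
    int key;
    int value;
    std::int64_t sequence;
};

// Three-way comparison on the sequence value only, in the style of
// compare_at: negative, zero, or positive.
int compare_sequence(const Row& lhs, const Row& rhs) {
    if (lhs.sequence < rhs.sequence) return -1;
    if (lhs.sequence > rhs.sequence) return 1;
    return 0;
}

// Keeps the stored row when its sequence value is strictly larger.
// Assumption: in every other case the incoming row replaces it.
void merge_row(Row& stored, const Row& incoming) {
    if (compare_sequence(stored, incoming) > 0) {
        return;  // stored sequence is larger, no update needed
    }
    stored = incoming;
}
```

The comparison reads only the sequence field of each row, which corresponds to the committed change of calling `compare_at` on the sequence column itself.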
---
 be/src/olap/memtable.cpp   |   5 +-
 be/test/olap/delta_writer_test.cpp | 115 ++---
 .../unique/test_unique_table_new_sequence.out  |   8 +-
 .../unique/test_unique_table_sequence.out  |   6 +-
 .../data/data_model_p0/unique/unique_key_data1.csv |   1 +
 .../data/data_model_p0/unique/unique_key_data2.csv |   3 +-
 .../unique/test_unique_table_new_sequence.groovy   |   8 +-
 .../unique/test_unique_table_sequence.groovy   |   8 +-
 8 files changed, 100 insertions(+), 54 deletions(-)

diff --git a/be/src/olap/memtable.cpp b/be/src/olap/memtable.cpp
index adc57dfee7..2fdf41e158 100644
--- a/be/src/olap/memtable.cpp
+++ b/be/src/olap/memtable.cpp
@@ -321,8 +321,9 @@ void MemTable::_replace_row(const ContiguousRow& src_row, TableKey row_in_skipli
 void MemTable::_aggregate_two_row_in_block(RowInBlock* new_row, RowInBlock* row_in_skiplist) {
 if (_tablet_schema->has_sequence_col()) {
 auto sequence_idx = _tablet_schema->sequence_col_idx();
-auto res = _input_mutable_block.compare_at(row_in_skiplist->_row_pos, new_row->_row_pos,
-   sequence_idx, _input_mutable_block, -1);
+DCHECK_LT(sequence_idx, _input_mutable_block.columns());
+auto col_ptr = _input_mutable_block.mutable_columns()[sequence_idx].get();
+auto res = col_ptr->compare_at(row_in_skiplist->_row_pos, new_row->_row_pos, *col_ptr, -1);
 // dst sequence column larger than src, don't need to update
 if (res > 0) {
 return;
diff --git a/be/test/olap/delta_writer_test.cpp b/be/test/olap/delta_writer_test.cpp
index 16051d1adc..b3aa765c2f 100644
--- a/be/test/olap/delta_writer_test.cpp
+++ b/be/test/olap/delta_writer_test.cpp
@@ -29,6 +29,7 @@
 #include "gen_cpp/internal_service.pb.h"
 #include "olap/field.h"
 #include "olap/options.h"
+#include "olap/rowset/beta_rowset.h"
 #include "olap/storage_engine.h"
 #include "olap/tablet.h"
 #include "olap/tablet_meta_manager.h"
@@ -247,7 +248,7 @@ static void create_tablet_request_with_sequence_col(int64_t tablet_id, int32_t s
 request->tablet_schema.short_key_column_count = 2;
 request->tablet_schema.keys_type = TKeysType::UNIQUE_KEYS;
 request->tablet_schema.storage_type = TStorageType::COLUMN;
-request->tablet_schema.__set_sequence_col_idx(2);
+request->tablet_schema.__set_sequence_col_idx(4);
 request->__set_storage_format(TStorageFormat::V2);
 
 TColumn k1;
@@ -262,13 +263,6 @@ static void create_tablet_request_with_sequence_col(int64_t tablet_id, int32_t s
 k2.column_type.type = TPrimitiveType::SMALLINT;
 request->tablet_schema.columns.push_back(k2);
 
-TColumn sequence_col;
-sequence_col.column_name = SEQUENCE_COL;
-sequence_col.__set_is_key(false);
-sequence_col.column_type.type = TPrimitiveType::INT;
-sequence_col.__set_aggregation_type(TAggregationType::REPLACE);
-request->tablet_schema.columns.push_back(sequence_col);
-
 TColumn v1;
 v1.column_name = "v1";
 v1.__set_is_key(false);
@@ -282,6 +276,13 @@ static void create_tablet_request_with_sequence_col(int64_t tablet_id, int32_t s
 v2.column_type.type = TPrimitiveType::DATEV2;
 v2.__set_aggregation_type(TAggregationType::REPLACE);
 request->tablet_schema.columns.push_back(v2);
+
+TColumn sequence_col;
+sequence_col.column_name = SEQUENCE_COL;
+sequence_col.__set_is_key(false);
+sequence_col.column_type.type = TPrimitiveType::INT;
+sequence_col.__set_aggregation_type(TAggregationType::REPLACE);
+request->tablet_schema.columns.push_back(sequence_col);
 }
 
 static TDescriptorTable create_descriptor_tablet() {
@@ -346,15 +347,15 @@ static TDescriptorTable create_descriptor_tablet_with_sequence_col() {
 TSlotDescriptorBuilder().type(TYPE_TINYINT).column_name("k1").column_pos(0).build());
 tuple_builder.add_slot(
 TSlotDescriptorBuilder().type(TYPE_SMALLINT).column_name("k2").column_pos(1).build());
+tuple_builder.add_slot(
+TSlotDescriptorBuilder().type(TYPE_DATETIME).column_name("v1").column_pos(2).build());
+tuple_builder.add_slot(
+TSlotDescriptorBuilder().type(TYPE_DATEV2).column_name("v2").column_pos(3).build());
 tuple_builder.add_slot(TSlotDescriptorBuilder()
.type(TYPE_INT)
.column_name(SEQUENCE_COL)
-   .column_pos(2)
+   .column_p

[doris] 02/03: [feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit d4b8629c77a80f8a1cf95561375ba453337ae406
Author: Tiewei Fang <43782773+bepppo...@users.noreply.github.com>
AuthorDate: Thu Feb 2 17:31:33 2023 +0800

[feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271)
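
The `table_connector.cpp` change in this commit picks the quote character for string values by target database: single quotes for Oracle and PostgreSQL, double quotes otherwise. A minimal sketch of that selection follows, assuming a hypothetical `TableType` enum and helper name `quote_string_literal`; it uses `std::string` instead of the fmt buffer in the diff. Like the diff, it does not escape quote characters embedded in the value.

```cpp
#include <string>

// Hypothetical enum; it mirrors only the database names used in the diff.
enum class TableType { MYSQL, ORACLE, POSTGRESQL };

// Wraps value in the quote character chosen for the target database.
// Embedded quote characters are copied through unchanged.
std::string quote_string_literal(TableType type, const std::string& value) {
    const bool single_quoted = (type == TableType::ORACLE || type == TableType::POSTGRESQL);
    const char quote = single_quoted ? '\'' : '"';
    std::string out;
    out.reserve(value.size() + 2);
    out.push_back(quote);
    out.append(value);
    out.push_back(quote);
    return out;
}
```

For example, under these assumptions `quote_string_literal(TableType::ORACLE, "abc")` yields `'abc'`, while the same call with `TableType::MYSQL` yields `"abc"`.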
---
 be/src/exec/table_connector.cpp|  9 +++--
 .../docker-compose/mysql/init/03-create-table.sql  |  5 +++
 .../docker-compose/oracle/init/03-create-table.sql |  6 
 .../postgresql/init/02-create-table.sql|  6 
 .../java/org/apache/doris/analysis/InsertStmt.java | 39 --
 .../doris/transaction/DatabaseTransactionMgr.java  |  3 +-
 .../doris/transaction/GlobalTransactionMgr.java|  5 +--
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.out| 13 
 .../jdbc_catalog_p0/test_oracle_jdbc_catalog.out   | 13 
 .../data/jdbc_catalog_p0/test_pg_jdbc_catalog.out  | 13 
 .../jdbc_catalog_p0/test_mysql_jdbc_catalog.groovy | 23 ++---
 .../test_oracle_jdbc_catalog.groovy| 14 +++-
 .../jdbc_catalog_p0/test_pg_jdbc_catalog.groovy| 13 
 13 files changed, 140 insertions(+), 22 deletions(-)

diff --git a/be/src/exec/table_connector.cpp b/be/src/exec/table_connector.cpp
index f2c3ff8101..6c310e4a60 100644
--- a/be/src/exec/table_connector.cpp
+++ b/be/src/exec/table_connector.cpp
@@ -336,8 +336,13 @@ Status TableConnector::convert_column_data(const vectorized::ColumnPtr& column_p
 case TYPE_VARCHAR:
 case TYPE_CHAR:
 case TYPE_STRING: {
-// here need check the ' is used, now for pg array string must be "
-fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size));
+// TODO(zhangstar333): check array data type of postgresql
+// for oracle/pg database string must be '
+if (table_type == TOdbcTableType::ORACLE || table_type == TOdbcTableType::POSTGRESQL) {
+fmt::format_to(_insert_stmt_buffer, "'{}'", fmt::basic_string_view(item, size));
+} else {
+fmt::format_to(_insert_stmt_buffer, "\"{}\"", fmt::basic_string_view(item, size));
+}
 break;
 }
 case TYPE_ARRAY: {
diff --git a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
index 6c8371e7c7..1847551d0e 100644
--- a/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
+++ b/docker/thirdparties/docker-compose/mysql/init/03-create-table.sql
@@ -223,4 +223,9 @@ create table doris_test.ex_tb20 (
 decimal_unsigned_long decimal(65, 5) unsigned
 ) engine=innodb charset=utf8;
 
+create table doris_test.test_insert (
+`id` varchar(128) NULL,
+`name` varchar(128) NULL,
+`age` int NULL
+) engine=innodb charset=utf8;
 
diff --git a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
index d5dd8cf1c6..d2d8d6af7e 100644
--- a/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
+++ b/docker/thirdparties/docker-compose/oracle/init/03-create-table.sql
@@ -78,3 +78,9 @@ t4 timestamp,
 t5 interval year(3) to month,
 t6 interval day(3) to second(6)
 );
+
+create table doris_test.test_insert(
+id varchar2(128),
+name varchar2(128),
+age number(5)
+);
diff --git a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
index d2dbac7695..93a307f882 100644
--- a/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
+++ b/docker/thirdparties/docker-compose/postgresql/init/02-create-table.sql
@@ -143,3 +143,9 @@ CREATE TABLE catalog_pg_test.test12 (
ID INT NOT NULL,
uuid_value uuid
 );
+
+CREATE TABLE catalog_pg_test.test_insert (
+   id varchar(128),
+   name varchar(128),
+   age int
+);
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
index 44140b24e9..891fe3349b 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/InsertStmt.java
@@ -31,6 +31,8 @@ import org.apache.doris.catalog.Partition;
 import org.apache.doris.catalog.PartitionType;
 import org.apache.doris.catalog.Table;
 import org.apache.doris.catalog.TableIf;
+import org.apache.doris.catalog.external.JdbcExternalDatabase;
+import org.apache.doris.catalog.external.JdbcExternalTable;
 import org.apache.doris.common.AnalysisException;
 import org.apache.doris.common.DdlException;
 import org.apache.doris.common.ErrorCode;
@@ -39,6 +41,8 @@ import org.apache.doris.common.Pair;
 import org.apache.doris.common.UserException;
 import org.apache.doris.common.util.DebugUtil

[doris] 03/03: [improvement](multi-catalog) increase default batch_size to 4064 (#16326)

2023-02-02 Thread morningman
This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch branch-1.2-lts
in repository https://gitbox.apache.org/repos/asf/doris.git

commit fb5420c26276f5a34511d76497a3a2a1ce7ffe57
Author: Ashin Gau 
AuthorDate: Thu Feb 2 11:51:09 2023 +0800

[improvement](multi-catalog) increase default batch_size to 4064 (#16326)

The performance of ClickBench Q30 is affected by batch_size:
| batch_size | 1024 | 4096 | 20480 |
| -- | -- | -- | -- |
| Q30 query time | 2.27 | 1.08 | 0.62 |

The aggregation operator creates a new result block for each batch block, and
Q30 has 90 columns, so creating these blocks is time-consuming. A larger
batch_size decreases the number of aggregation blocks and therefore improves
performance.

The Doris internal reader reads at least 4064 rows even if batch_size < 4064,
so this PR makes reading external tables behave the same as reading
internal tables.
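
The readers touched by this commit raise the requested batch size to a floor of 4064 rows. A minimal sketch of that clamp follows; `kMinBatchSize` and `effective_batch_size` are illustrative names, not identifiers from the Doris code base, and the sketch works on `std::size_t` throughout rather than casting to `int` as some of the readers do.

```cpp
#include <algorithm>
#include <cstddef>

// Floor taken from the commit (4064 = 4096 - 32); the constant name is illustrative.
constexpr std::size_t kMinBatchSize = 4064;

// Returns the batch size a reader would use: the requested size, raised to
// kMinBatchSize when the request is smaller.
std::size_t effective_batch_size(std::size_t requested) {
    return std::max(requested, kMinBatchSize);
}
```

Under these assumptions a session `batch_size` of 1024 is raised to 4064, while 20480 passes through unchanged.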
---
 be/src/vec/exec/format/csv/csv_reader.cpp | 2 +-
 be/src/vec/exec/format/generic_reader.h   | 2 ++
 be/src/vec/exec/format/json/new_json_reader.cpp   | 2 +-
 be/src/vec/exec/format/orc/vorc_reader.cpp| 2 +-
 be/src/vec/exec/format/parquet/vparquet_reader.cpp| 2 +-
 fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java | 4 ++--
 6 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/be/src/vec/exec/format/csv/csv_reader.cpp b/be/src/vec/exec/format/csv/csv_reader.cpp
index d811866d13..c7099b24c7 100644
--- a/be/src/vec/exec/format/csv/csv_reader.cpp
+++ b/be/src/vec/exec/format/csv/csv_reader.cpp
@@ -188,7 +188,7 @@ Status CsvReader::get_next_block(Block* block, size_t* read_rows, bool* eof) {
 return Status::OK();
 }
 
-const int batch_size = _state->batch_size();
+const int batch_size = std::max(_state->batch_size(), (int)_MIN_BATCH_SIZE);
 size_t rows = 0;
 auto columns = block->mutate_columns();
 while (rows < batch_size && !_line_reader_eof) {
diff --git a/be/src/vec/exec/format/generic_reader.h b/be/src/vec/exec/format/generic_reader.h
index 30e93aacd8..9f4cfd00ee 100644
--- a/be/src/vec/exec/format/generic_reader.h
+++ b/be/src/vec/exec/format/generic_reader.h
@@ -60,6 +60,8 @@ public:
 }
 
 protected:
+const size_t _MIN_BATCH_SIZE = 4064; // 4094 - 32(padding)
+
 /// Whether the underlying FileReader has filled the partition&missing columns
 bool _fill_all_columns = false;
 };
diff --git a/be/src/vec/exec/format/json/new_json_reader.cpp b/be/src/vec/exec/format/json/new_json_reader.cpp
index 68a3f089e5..0ed5a0aeb0 100644
--- a/be/src/vec/exec/format/json/new_json_reader.cpp
+++ b/be/src/vec/exec/format/json/new_json_reader.cpp
@@ -105,7 +105,7 @@ Status NewJsonReader::get_next_block(Block* block, size_t* read_rows, bool* eof)
 return Status::OK();
 }
 
-const int batch_size = _state->batch_size();
+const int batch_size = std::max(_state->batch_size(), (int)_MIN_BATCH_SIZE);
 auto columns = block->mutate_columns();
 
 while (columns[0]->size() < batch_size && !_reader_eof) {
diff --git a/be/src/vec/exec/format/orc/vorc_reader.cpp b/be/src/vec/exec/format/orc/vorc_reader.cpp
index c295712491..f313cb60f0 100644
--- a/be/src/vec/exec/format/orc/vorc_reader.cpp
+++ b/be/src/vec/exec/format/orc/vorc_reader.cpp
@@ -72,7 +72,7 @@ OrcReader::OrcReader(RuntimeProfile* profile, const TFileScanRangeParams& params
 : _profile(profile),
   _scan_params(params),
   _scan_range(range),
-  _batch_size(batch_size),
+  _batch_size(std::max(batch_size, _MIN_BATCH_SIZE)),
   _range_start_offset(range.start_offset),
   _range_size(range.size),
   _ctz(ctz),
diff --git a/be/src/vec/exec/format/parquet/vparquet_reader.cpp b/be/src/vec/exec/format/parquet/vparquet_reader.cpp
index 7881eebe2d..cfc904d607 100644
--- a/be/src/vec/exec/format/parquet/vparquet_reader.cpp
+++ b/be/src/vec/exec/format/parquet/vparquet_reader.cpp
@@ -36,7 +36,7 @@ ParquetReader::ParquetReader(RuntimeProfile* profile, const TFileScanRangeParams
 : _profile(profile),
   _scan_params(params),
   _scan_range(range),
-  _batch_size(batch_size),
+  _batch_size(std::max(batch_size, _MIN_BATCH_SIZE)),
   _range_start_offset(range.start_offset),
   _range_size(range.size),
   _ctz(ctz) {
diff --git a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
index bfaefd8ac5..db99c90e26 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
@@ -384,9 +384,9 @@ public class SessionVariable implements Serializable, Writable {
 @VariableMgr.VarAttr(name = CODEGEN_LEVEL)
 public in

[GitHub] [doris] xy720 opened a new pull request, #16379: [feature](struct-type/map-type) Add switch for struct and map type for creating table

2023-02-02 Thread via GitHub


xy720 opened a new pull request, #16379:
URL: https://github.com/apache/doris/pull/16379

   # Proposed changes
   
   Issue Number: #14917
   
   Add switches to forbid users from creating tables with struct or map columns.
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [x] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [x] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   
   





[GitHub] [doris] hello-stephen commented on pull request #16369: [Improvement](statistics) optimise histogram keyword

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16369:
URL: https://github.com/apache/doris/pull/16369#issuecomment-1413730263

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 33.46 seconds
load time: 486 seconds
storage size: 17170706881 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202131525_clickbench_pr_89593.html





[GitHub] [doris] eldenmoon opened a new pull request, #16380: [Improve](point query) support retry different backends in PointQuery…

2023-02-02 Thread via GitHub


eldenmoon opened a new pull request, #16380:
URL: https://github.com/apache/doris/pull/16380

   …Executor
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   
   





[GitHub] [doris] hello-stephen commented on pull request #16370: [fix](planner) Doris returns empty sets when select from a inline view

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16370:
URL: https://github.com/apache/doris/pull/16370#issuecomment-1413755038

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.07 seconds
load time: 494 seconds
storage size: 17122767970 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/2023020211_clickbench_pr_89609.html





[GitHub] [doris] hello-stephen commented on pull request #16372: [fix](iceberg) fix iceberg catalog rest access

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16372:
URL: https://github.com/apache/doris/pull/16372#issuecomment-1413782919

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 35.08 seconds
load time: 498 seconds
storage size: 17123367964 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202135323_clickbench_pr_89569.html





[GitHub] [doris] github-actions[bot] commented on pull request #16089: [enhance](cooldown)accelerate cooldown task produce efficiency

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16089:
URL: https://github.com/apache/doris/pull/16089#issuecomment-1413803974

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] hello-stephen commented on pull request #16371: [Feature-WIP](inverted index) Implementation for alter inverted index.

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16371:
URL: https://github.com/apache/doris/pull/16371#issuecomment-1413808938

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.59 seconds
load time: 505 seconds
storage size: 17171308895 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202141208_clickbench_pr_89574.html





[GitHub] [doris] HappenLee commented on a diff in pull request #16337: [improvement](jdbc) refator jdbc of copy result set by batch

2023-02-02 Thread via GitHub


HappenLee commented on code in PR #16337:
URL: https://github.com/apache/doris/pull/16337#discussion_r1094664749


##
fe/java-udf/src/main/java/org/apache/doris/udf/JdbcExecutor.java:
##
@@ -325,277 +325,233 @@ private void init(String driverUrl, String sql, int 
batchSize, String driverClas
     public void copyBatchBooleanResult(Object columnObj, boolean isNullable, int numRows, long nullMapAddr,
             long columnAddr) {
         Boolean[] column = (Boolean[]) columnObj;
-        byte[] columnData = new byte[numRows];
         if (isNullable) {
-            byte[] nullMap = new byte[numRows];
             for (int i = 0; i < numRows; i++) {
                 if (column[i] == null) {
-                    nullMap[i] = 1;
+                    UdfUtils.UNSAFE.putByte(nullMapAddr + i, (byte) 1);

Review Comment:
   This should always call putByte: if column[i] != null, put 0.
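
   A minimal sketch of the point above, under stated assumptions: a plain `byte[]` stands in for the off-heap memory at `nullMapAddr`, and the class and method names (`NullMapSketch`, `fillNullMap`) are hypothetical, not part of the PR. The buffer is pre-filled with 1s to imitate stale bytes in reused memory; writing only the null slots would leave those stale bytes in place for non-null rows.

```java
import java.util.Arrays;

public class NullMapSketch {
    // Hypothetical stand-in for UdfUtils.UNSAFE.putByte(nullMapAddr + i, ...):
    // the null map is modelled as an ordinary byte[] for illustration only.
    static void fillNullMap(Boolean[] column, byte[] nullMap) {
        for (int i = 0; i < column.length; i++) {
            // Write every slot: 1 for null, 0 otherwise, so no stale byte survives.
            nullMap[i] = (byte) (column[i] == null ? 1 : 0);
        }
    }

    public static void main(String[] args) {
        Boolean[] column = {true, null, false};
        byte[] nullMap = new byte[column.length];
        Arrays.fill(nullMap, (byte) 1); // simulate leftover data in a reused buffer
        fillNullMap(column, nullMap);
        System.out.println(Arrays.toString(nullMap)); // prints [0, 1, 0]
    }
}
```

   If only the `column[i] == null` branch wrote to the map, the same input would leave all three slots at 1 in this simulation, which is presumably the failure mode the comment is guarding against.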






[GitHub] [doris] hello-stephen commented on pull request #16375: [Enhencement](LineReader) rename NewPlainTextLineReader/NewPlainBinaryLineReader to PlainTextLineReader/PlainBinaryLineReader

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16375:
URL: https://github.com/apache/doris/pull/16375#issuecomment-1413920053

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.57 seconds
load time: 498 seconds
storage size: 17122682039 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202152411_clickbench_pr_89644.html





[GitHub] [doris] hello-stephen commented on pull request #16374: [fix](cooldown) Fix core in remove_all_remote_rowsets

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16374:
URL: https://github.com/apache/doris/pull/16374#issuecomment-1413925976

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.34 seconds
load time: 494 seconds
storage size: 17170830266 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202152815_clickbench_pr_89672.html





[GitHub] [doris] github-actions[bot] commented on pull request #16123: [enhancement-wip](BE http)Support BE http service with brpc

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16123:
URL: https://github.com/apache/doris/pull/16123#issuecomment-1413945568

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] github-actions[bot] commented on pull request #16123: [enhancement-wip](BE http)Support BE http service with brpc

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16123:
URL: https://github.com/apache/doris/pull/16123#issuecomment-1413970256

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] github-actions[bot] commented on pull request #16123: [enhancement-wip](BE http)Support BE http service with brpc

2023-02-02 Thread via GitHub


github-actions[bot] commented on PR #16123:
URL: https://github.com/apache/doris/pull/16123#issuecomment-1413990933

   clang-tidy review says "All clean, LGTM! :+1:"





[GitHub] [doris] hello-stephen commented on pull request #16258: [feature](cooldown)Add cooldown delete

2023-02-02 Thread via GitHub


hello-stephen commented on PR #16258:
URL: https://github.com/apache/doris/pull/16258#issuecomment-1414002975

   TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.25 seconds
load time: 496 seconds
storage size: 17122695512 Bytes

https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230202161630_clickbench_pr_89683.html




