[GitHub] [doris] adonis0147 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column

2022-09-22 Thread GitBox


adonis0147 commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977277813


##
be/src/vec/columns/column_dictionary.h:
##
@@ -360,40 +362,58 @@ class ColumnDictionary final : public COWHelper> {
 if (code >= 0) {
 return code;
 }
-auto bound = std::upper_bound(_dict_data.begin(), 
_dict_data.end(), value) -
- _dict_data.begin();
+auto bound = std::upper_bound(_dict_data->begin(), 
_dict_data->end(), value) -
+ _dict_data->begin();
 return greater ? bound - greater + eq : bound - eq;
 }
 
 void find_codes(const phmap::flat_hash_set& values,
 std::vector& selected) const {
-size_t dict_word_num = _dict_data.size();
+size_t dict_word_num = _dict_data->size();
 selected.resize(dict_word_num);
 selected.assign(dict_word_num, false);
-for (const auto& value : values) {
-if (auto it = _inverted_index.find(value); it != 
_inverted_index.end()) {
-selected[it->second] = true;
+for (size_t i = 0; i < _dict_data->size(); i++) {
+if (values.find((*_dict_data)[i]) != values.end()) {
+selected[i] = true;
 }
 }
 }
 
 void clear() {
-_dict_data.clear();
-_inverted_index.clear();
-_code_convert_table.clear();
+_dict_data->clear();
 _hash_values.clear();
 }
 
 void clear_hash_values() { _hash_values.clear(); }
 
 void sort() {
-size_t dict_size = _dict_data.size();
-_code_convert_table.reserve(dict_size);
-std::sort(_dict_data.begin(), _dict_data.end(), _comparator);
+size_t dict_size = _dict_data->size();
+
+_perm.resize(dict_size);
+for (size_t i = 0; i < dict_size; ++i) {
+_perm[i] = i;
+}
+
+struct Comparator {
+public:
+Comparator(DictContainer& dict_data) : _dict_data(dict_data) {}
+bool operator()(const size_t a, const size_t b) const {
+return _comparator(_dict_data[a], _dict_data[b]);
+}
+
+private:
+StringValue::Comparator _comparator;
+DictContainer& _dict_data;
+};
+Comparator comparator(*_dict_data);
+std::sort(_perm.begin(), _perm.end(), comparator);

Review Comment:
   ```suggestion
           std::sort(_perm.begin(), _perm.end(),
                     [&dict_data = *_dict_data, &comparator = _comparator](const size_t a, const size_t b) {
                         return comparator(dict_data[a], dict_data[b]);
                     });
   ```
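
The idea behind this suggestion (sort an index permutation with a capturing lambda instead of a hand-written functor) can be sketched in isolation. This is a minimal sketch using `std::vector<std::string>` as a stand-in for the dictionary container; `sorted_permutation` is a hypothetical name, not part of the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns a permutation `perm` such that dict[perm[0]] <= dict[perm[1]] <= ...
// The dictionary itself is not reordered, so existing codes stay valid.
// (Hypothetical helper; std::string stands in for the real value type.)
std::vector<std::size_t> sorted_permutation(const std::vector<std::string>& dict) {
    std::vector<std::size_t> perm(dict.size());
    std::iota(perm.begin(), perm.end(), std::size_t{0}); // perm = 0, 1, 2, ...
    // The lambda captures the dictionary by reference and compares the
    // entries the indices point at, not the indices themselves.
    std::sort(perm.begin(), perm.end(),
              [&dict](std::size_t a, std::size_t b) { return dict[a] < dict[b]; });
    return perm;
}
```

The lambda keeps the comparison next to the `std::sort` call and avoids a separate struct that only exists to hold a reference.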



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org



[GitHub] [doris] yiguolei opened a new pull request, #12857: [bugfix](scanner) olap scanner compute is wrong

2022-09-22 Thread GitBox


yiguolei opened a new pull request, #12857:
URL: https://github.com/apache/doris/pull/12857

   # Proposed changes
   
   This issue was introduced by https://github.com/apache/doris/pull/8096: the 
operator precedence is wrong, so in some cases many scanners are created.
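
The message does not show the offending expression. As a purely illustrative sketch of how an operator-precedence slip can inflate a count, consider a ceiling division written without parentheses; the function names and the formula are assumptions, not the code from the PR.

```cpp
#include <cassert>
#include <cstddef>

// Intended result: ceil(total / per_scanner).
// Division binds tighter than + and -, so without parentheses only
// `1 / per_scanner` is divided (and that is 0 for per_scanner > 1).
std::size_t scanner_count_buggy(std::size_t total, std::size_t per_scanner) {
    return total + per_scanner - 1 / per_scanner;
}

// Parentheses make the whole numerator take part in the division.
std::size_t scanner_count_fixed(std::size_t total, std::size_t per_scanner) {
    assert(per_scanner > 0);
    return (total + per_scanner - 1) / per_scanner;
}
```

With `total = 10` and `per_scanner = 4`, the unparenthesized form yields 14 while the parenthesized form yields 3.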
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] adonis0147 commented on pull request #12691: [chore](thirdparty) Support third-party incremental build

2022-09-22 Thread GitBox


adonis0147 commented on PR #12691:
URL: https://github.com/apache/doris/pull/12691#issuecomment-1254625331

   > Hi, @adonis0147 Thanks for your feedback, using MD5 for the incremental 
build is a generic idea, however, there is another problem to resolve -- how to 
manage the MD5 list? It seems that we still need to update the MD5 list 
manually, can you point out how it works in detail?
   
   We already have the MD5 list in 
[thirdparty/vars.sh](https://github.com/apache/doris/blob/master/thirdparty/vars.sh).
 We update this file whenever we update the third-party libraries. Therefore, we 
can write the MD5 to a file at the end of each `build_xxx` function.
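
The stamp-file idea described above can be sketched as follows. This is a minimal sketch, assuming one `<package>.md5` stamp file per package in a directory of our choosing; the function names and file layout are hypothetical and are not taken from `build-thirdparty.sh`.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// True when no stamp exists for `package` or the recorded MD5 differs
// from the MD5 currently listed for that package.
bool needs_rebuild(const fs::path& stamp_dir, const std::string& package,
                   const std::string& expected_md5) {
    std::ifstream in(stamp_dir / (package + ".md5"));
    std::string recorded;
    return !(in >> recorded) || recorded != expected_md5;
}

// Called as the last step of a successful build, so a failed build
// never leaves a stamp behind.
void record_build(const fs::path& stamp_dir, const std::string& package,
                  const std::string& md5) {
    fs::create_directories(stamp_dir);
    std::ofstream out(stamp_dir / (package + ".md5"), std::ios::trunc);
    out << md5 << '\n';
}
```

Writing the stamp only at the very end means an interrupted build is retried on the next run.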
   
   > And, there is another case that sometimes Doris developers have to build 
**specific third-parties in a specific order** when some dependencies are 
updated and they require specific build order (one may rely on another, e.g. 
brpc relies on protobuf), it seems hard to resolve this problem by updating 
nothing but the `build-thirdparty.sh`?
   
   This problem is unavoidable either way (MD5 or version counter) if we want 
to support incremental installation. We should sort out the dependency tree in 
our build script first, because it is hard for a developer to work out the 
dependencies when they want to upgrade only a specific package.
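
Once the dependency tree is written down, a build order follows from a depth-first topological sort. The sketch below assumes a simple map from package to its direct dependencies; `DepGraph` and `build_order` are hypothetical names, and the only edge taken from this thread is that brpc relies on protobuf.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// package -> packages it depends on directly
using DepGraph = std::map<std::string, std::vector<std::string>>;

namespace detail {
inline void visit(const DepGraph& deps, const std::string& pkg,
                  std::set<std::string>& done, std::set<std::string>& in_progress,
                  std::vector<std::string>& order) {
    if (done.count(pkg)) return;
    if (!in_progress.insert(pkg).second) {
        throw std::runtime_error("dependency cycle at " + pkg);
    }
    auto it = deps.find(pkg);
    if (it != deps.end()) {
        for (const auto& dep : it->second) visit(deps, dep, done, in_progress, order);
    }
    in_progress.erase(pkg);
    done.insert(pkg);
    order.push_back(pkg); // every dependency is already in `order`
}
} // namespace detail

// Returns `target` and everything it depends on, dependencies first.
std::vector<std::string> build_order(const DepGraph& deps, const std::string& target) {
    std::set<std::string> done, in_progress;
    std::vector<std::string> order;
    detail::visit(deps, target, done, in_progress, order);
    return order;
}
```

Packages that do not appear as keys are treated as having no dependencies, so the graph only needs entries for packages that rely on something.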
   





[GitHub] [doris] sohardforaname opened a new pull request, #12858: [Improve](Nereids)Optimize planner

2022-09-22 Thread GitBox


sohardforaname opened a new pull request, #12858:
URL: https://github.com/apache/doris/pull/12858

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   optimize planner
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] hf200012 opened a new pull request, #12859: Replace jvm's garbage collector CMS with G1

2022-09-22 Thread GitBox


hf200012 opened a new pull request, #12859:
URL: https://github.com/apache/doris/pull/12859

   
   Replace the JVM's CMS garbage collector with G1.
   In our tests, overall performance is better than with CMS.
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column

2022-09-22 Thread GitBox


BiteThet commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977291889


##
be/src/vec/columns/column_dictionary.h:
##
@@ -192,11 +192,13 @@ class ColumnDictionary final : public COWHelper> {
 
 Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* 
col_ptr) override {
 auto* res_col = reinterpret_cast(col_ptr);
+res_col->get_offsets().reserve(sel_size);
+res_col->get_chars().reserve(_dict.avg_str_len() * sel_size);
 for (size_t i = 0; i < sel_size; i++) {
 uint16_t n = sel[i];
 auto& code = reinterpret_cast(_codes[n]);
 auto value = _dict.get_value(code);
-res_col->insert_data(value.ptr, value.len);
+res_col->insert_data_without_reserve(value.ptr, value.len);

Review Comment:
   I think `_dict.avg_str_len() * sel_size` may be less than the total length 
of the selected elements.
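
The concern can be reproduced with a small stand-alone sketch: when the selected rows mostly point at long dictionary entries, average length times row count falls short of the bytes actually needed. The helper names are hypothetical and `std::string` stands in for the dictionary values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Estimate used for reserve(): average dictionary string length times row count.
std::size_t estimated_bytes(const std::vector<std::string>& dict, std::size_t rows) {
    if (dict.empty()) return 0;
    std::size_t total = 0;
    for (const auto& s : dict) total += s.size();
    return total / dict.size() * rows;
}

// Bytes actually needed for the rows selected by `codes`.
std::size_t actual_bytes(const std::vector<std::string>& dict,
                         const std::vector<std::uint32_t>& codes) {
    std::size_t total = 0;
    for (auto c : codes) total += dict.at(c).size();
    return total;
}
```

For a dictionary with two short entries and one long entry, selecting only the long entry makes `actual_bytes` exceed `estimated_bytes`.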






[GitHub] [doris] ReganHoo commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend

2022-09-22 Thread GitBox


ReganHoo commented on issue #11024:
URL: https://github.com/apache/doris/issues/11024#issuecomment-1254640547

   >





[GitHub] [doris] ReganHoo closed issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend

2022-09-22 Thread GitBox


ReganHoo closed issue #11024: [Bug] cannot access the hive external table 
stored with s3 as the backend
URL: https://github.com/apache/doris/issues/11024





[GitHub] [doris] ReganHoo commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend

2022-09-22 Thread GitBox


ReganHoo commented on issue #11024:
URL: https://github.com/apache/doris/issues/11024#issuecomment-1254640922

   > I also encountered this issue. Did you fix it? @ReganHoo
   
   Upgrade Doris to version 1.1.2 to solve this problem.
   





[GitHub] [doris] hf200012 closed pull request #12859: Replace jvm's garbage collector CMS with G1

2022-09-22 Thread GitBox


hf200012 closed pull request #12859: Replace jvm's garbage collector CMS with G1
URL: https://github.com/apache/doris/pull/12859





[GitHub] [doris] yiguolei merged pull request #12846: [chore](build) add option to disable -frecord-gcc-switches

2022-09-22 Thread GitBox


yiguolei merged PR #12846:
URL: https://github.com/apache/doris/pull/12846





[doris] branch master updated: [chore](build) add option to disable -frecord-gcc-switches (#12846)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
 new 8fcd8ed8b3 [chore](build) add option to disable -frecord-gcc-switches 
(#12846)
8fcd8ed8b3 is described below

commit 8fcd8ed8b32868858437f8c973af6b70322176f2
Author: Zhengguo Yang 
AuthorDate: Thu Sep 22 15:38:14 2022 +0800

[chore](build) add option to disable -frecord-gcc-switches (#12846)
---
 be/CMakeLists.txt | 6 +-
 build.sh  | 4 
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/be/CMakeLists.txt b/be/CMakeLists.txt
index 2fc57ecf3c..094ebc4c3d 100644
--- a/be/CMakeLists.txt
+++ b/be/CMakeLists.txt
@@ -410,7 +410,7 @@ check_function_exists(sched_getcpu HAVE_SCHED_GETCPU)
 #  -pthread: enable multithreaded malloc
 #  -DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG: enable nanosecond precision for 
boost
 #  -fno-omit-frame-pointers: Keep frame pointer for functions in register
-set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -frecord-gcc-switches -Wall 
-Wno-sign-compare -pthread -Werror")
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall -Wno-sign-compare -pthread 
-Werror")
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -fstrict-aliasing 
-fno-omit-frame-pointer")
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -std=gnu++17 -D__STDC_FORMAT_MACROS")
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} 
-DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG")
@@ -418,6 +418,10 @@ set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} 
-DBOOST_SYSTEM_NO_DEPRECATED")
 # Enable the cpu and heap profile of brpc
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -DBRPC_ENABLE_CPU_PROFILER")
 
+if (RECORD_COMPILER_SWITCHES)
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -frecord-gcc-switches")
+endif()
+
 function(TRY_TO_CHANGE_LINKER LINKER_COMMAND LINKER_NAME)
 if (CUSTUM_LINKER_COMMAND STREQUAL "ld")
 execute_process(COMMAND ${CMAKE_C_COMPILER} -fuse-ld=${LINKER_COMMAND} 
-Wl,--version ERROR_QUIET OUTPUT_VARIABLE LD_VERSION)
diff --git a/build.sh b/build.sh
index 661cac059d..dc98cbce5b 100755
--- a/build.sh
+++ b/build.sh
@@ -270,6 +270,10 @@ if [[ -z "${USE_DWARF}" ]]; then
 USE_DWARF='OFF'
 fi
 
+if [[ -z "${RECORD_COMPILER_SWITCHES}" ]]; then
+RECORD_COMPILER_SWITCHES='OFF'
+fi
+
 echo "Get params:
 BUILD_FE-- ${BUILD_FE}
 BUILD_BE-- ${BUILD_BE}





[GitHub] [doris] Gabriel39 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column

2022-09-22 Thread GitBox


Gabriel39 commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977308937


##
be/src/vec/columns/column_dictionary.h:
##
@@ -192,11 +192,13 @@ class ColumnDictionary final : public COWHelper> {
 
 Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* 
col_ptr) override {
 auto* res_col = reinterpret_cast(col_ptr);
+res_col->get_offsets().reserve(sel_size);
+res_col->get_chars().reserve(_dict.avg_str_len() * sel_size);
 for (size_t i = 0; i < sel_size; i++) {
 uint16_t n = sel[i];
 auto& code = reinterpret_cast(_codes[n]);
 auto value = _dict.get_value(code);
-res_col->insert_data(value.ptr, value.len);
+res_col->insert_data_without_reserve(value.ptr, value.len);

Review Comment:
   If so, `chars` in ColumnString will still reserve a bigger memory block. 
`_dict.avg_str_len() * sel_size` is just a conservative estimate here. 
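
The point that a reservation is only a hint, and that the buffer still grows when the estimate is too small, can be sketched with `std::string` as an analogy. This is not the ColumnString implementation, and `append_with_estimate` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends all values into one contiguous buffer after reserving `estimate`
// bytes. Returns true if the buffer had to grow beyond that reservation.
bool append_with_estimate(const std::vector<std::string>& values,
                          std::size_t estimate, std::string& out) {
    out.clear();
    out.reserve(estimate);
    const std::size_t initial_capacity = out.capacity();
    for (const auto& v : values) {
        out.append(v); // grows the buffer automatically when needed
    }
    return out.capacity() > initial_capacity;
}
```

An undersized estimate costs extra reallocations rather than correctness, which is why a rough figure is acceptable here.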






[GitHub] [doris] xiaokang opened a new pull request, #12860: [bugfix])(function)return error instead of crash be for unsupported CAST

2022-09-22 Thread GitBox


xiaokang opened a new pull request, #12860:
URL: https://github.com/apache/doris/pull/12860

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   For an unsupported CAST, add create_unsupport_wrapper, which returns 
Status::InvalidArgument instead of calling LOG(FATAL), to avoid crashing the BE.
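
The pattern described here (hand back a wrapper that reports an error instead of aborting the process) can be sketched as below. `SimpleStatus`, `CastWrapper` and `make_unsupported_wrapper` are hypothetical stand-ins for illustration; they are not the Doris `Status` class or the PR's actual signatures.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical minimal status type; not the Doris `Status` class.
struct SimpleStatus {
    bool ok;
    std::string message;
    static SimpleStatus OK() { return {true, ""}; }
    static SimpleStatus InvalidArgument(std::string msg) {
        return {false, std::move(msg)};
    }
};

using CastWrapper = std::function<SimpleStatus()>;

// Instead of terminating when a cast is unsupported, return a wrapper
// that reports the problem when the cast is actually executed.
CastWrapper make_unsupported_wrapper(const std::string& from, const std::string& to) {
    return [from, to]() {
        return SimpleStatus::InvalidArgument("unsupported CAST from " + from + " to " + to);
    };
}
```

The caller can then propagate the failed status to the query and keep the process alive.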
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [x] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] jackwener commented on a diff in pull request #12858: [Improve](Nereids)Optimize planner

2022-09-22 Thread GitBox


jackwener commented on code in PR #12858:
URL: https://github.com/apache/doris/pull/12858#discussion_r977328537


##
fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostEstimate.java:
##
@@ -90,11 +90,27 @@ public static CostEstimate ofMemory(double memoryCost) {
 /**
  * Sums partial cost estimates of some (single) plan node.
  */
+@Deprecated

Review Comment:
   Don't rename it, just remove it.






[GitHub] [doris] github-actions[bot] commented on pull request #12857: [bugfix](scanner) olap scanner compute is wrong

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12857:
URL: https://github.com/apache/doris/pull/12857#issuecomment-1254672072

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12857: [bugfix](scanner) olap scanner compute is wrong

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12857:
URL: https://github.com/apache/doris/pull/12857#issuecomment-1254672127

   PR approved by anyone and no changes requested.





[GitHub] [doris] luozenglin opened a new issue, #12861: [Bug] data error when using select into outfile format as parquet

2022-09-22 Thread GitBox


luozenglin opened a new issue, #12861:
URL: https://github.com/apache/doris/issues/12861

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Version
   
   master
   
   ### What's Wrong?
   
   When I export the data using `select into outfile format as parquet` and 
then load it into a table with the same schema, the tinyint column becomes NULL.
   
   ```
set enable_vectorized_engine = false;
   
   CREATE TABLE `test_select_into_property_test_output_format_parquet_tb` (
 `k1` tinyint(4) NOT NULL,
 `k2` smallint(6) NOT NULL,
 `k3` int(11) NOT NULL,
 `k4` bigint(20) NOT NULL,
 `k5` datetime NOT NULL,
 `v1` date REPLACE NOT NULL,
 `v2` char(1) REPLACE NOT NULL,
 `v3` varchar(4096) REPLACE NOT NULL,
 `v4` float SUM NOT NULL,
 `v5` double SUM NOT NULL,
 `v6` decimal(20, 7) SUM NOT NULL
   ) ENGINE=OLAP
   AGGREGATE KEY(`k1`, `k2`, `k3`, `k4`, `k5`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`k1`) BUCKETS 15
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false"
   );
   
   mysql> select * from test_select_into_property_test_output_format_parquet_tb where k1 <= 5;
   +----+----+-----+------+---------------------+------------+----+-------------------------------+-----------+------------+-------------+
   | k1 | k2 | k3  | k4   | k5                  | v1         | v2 | v3                            | v4        | v5         | v6          |
   +----+----+-----+------+---------------------+------------+----+-------------------------------+-----------+------------+-------------+
   |  1 | 10 | 100 | 1000 | 2011-01-01 00:00:00 | 2010-01-01 | t  | ynqnzeowymt                   | 38.638844 | 180.998031 | 7395.231067 |
   |  2 | 20 | 200 | 2000 | 2012-01-01 00:00:00 | 2010-01-02 | f  | hfkfwlr                       | 506.04404 | 539.922834 | 2080.504502 |
   |  3 | 30 | 300 | 3000 | 2013-01-01 00:00:00 | 2010-01-03 | t  | uoclasp                       | 377.79321 | 577.044148 | 4605.253205 |
   |  4 | 40 | 400 | 4000 | 2014-01-01 00:00:00 | 2010-01-04 | n  | iswngzeodfhptjzgswsddt        | 871.35455 | 919.067864 | 7291.703724 |
   |  5 | 50 | 500 | 5000 | 2015-01-01 00:00:00 | 2010-01-05 | a  | sqodagzlyrmcelyxgcgcsfuxadcdt |  462.0679 | 929.660783 | 3903.906901 |
   +----+----+-----+------+---------------------+------------+----+-------------------------------+-----------+------------+-------------+
   
   select k1 k_0, k2 k_1, k3 k_2, k4 k_3, k5 k_4, v1 k_5, v2 k_6, v3 k_7, v4 
k_8, v5 k_9, v6 k_10 from 
test_select_into_property_test_output_format_parquet_tb INTO OUTFILE 
"hdfs://:9000/user/palo/test/data/export/test_select_into_property_test_output_format_parquet_db/label_21_04_47_49_475312_1042101013/label_21_04_47_49_475364_844373478"
 FORMAT AS parquet PROPERTIES 
("broker.name"="ahdfs","broker.username"="","broker.password"="", 
"schema" = 
"required,int32,k_0;required,int32,k_1;required,int32,k_2;required,int64,k_3;required,int64,k_4;required,int64,k_5;required,byte_array,k_6;required,byte_array,k_7;required,float,k_8;required,double,k_9;required,byte_array,k_10");
   
   
   CREATE TABLE `select_into_check_table` (
 `k_0` tinyint(4) NULL,
 `k_1` smallint(6) NULL,
 `k_2` int(11) NULL,
 `k_3` bigint(20) NULL,
 `k_4` datetime NULL,
 `k_5` date NULL,
 `k_6` char(1) NULL,
 `k_7` char(29) NULL,
 `k_8` float NULL,
 `k_9` double NULL,
 `k_10` decimal(27, 9) NULL
   ) ENGINE=OLAP
   DUPLICATE KEY(`k_0`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`k_0`) BUCKETS 13
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false"
   );
   
   
   LOAD LABEL 
test_select_into_property_test_output_format_parquet_db.label_21_04_47_50_543709_8920444695
 ( DATA INFILE(" 
hdfs:/:9000/user/palo/test/data/export/test_select_into_property_test_output_format_parquet_db/label_21_04_47_49_475312_1042101013/label_21_04_47_49_475364_8443734786915a56b133f4b71-a671fd00077a30b4_0.parquet")
 INTO TABLE `select_into_check_table` FORMAT AS "parquet") WITH BROKER "ahdfs" 
("username"="", "password"="");
   
   
   mysql> select * from select_into_check_table;
   
+--+--+--+---+-++--+---+---++-+
   | k_0  | k_1  | k_2  | k_3   | k_4 | k_5| k_6  | k_7 
  | k_8   | k_9| k_10|
   
+--+--+--+---+-++--+---+---++-+
   | NULL | NULL | 1000 |

[GitHub] [doris] zhannngchen opened a new pull request, #12862: [debug](test)a test pr for qa pipeline debug, will not merge

2022-09-22 Thread GitBox


zhannngchen opened a new pull request, #12862:
URL: https://github.com/apache/doris/pull/12862

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] mrhhsg opened a new pull request, #12863: [improvement](scan) merge scan keys based on the number of scanners

2022-09-22 Thread GitBox


mrhhsg opened a new pull request, #12863:
URL: https://github.com/apache/doris/pull/12863

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary
   
   A scanner that takes too many scan keys will cause performance degradation, 
so it's better to try to merge the scan keys.
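
The PR body does not describe the merging strategy. One possible approach, shown only as a sketch, is to coalesce neighbouring key ranges until their number fits a limit; `KeyRange`, `merge_scan_keys` and the integer keys are hypothetical, and the input is assumed to be sorted and non-overlapping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using KeyRange = std::pair<int, int>; // inclusive [begin, end]

// Coalesces neighbouring ranges until at most `max_ranges` remain.
// A merged range also covers the gaps between the originals, so the scan
// reads a superset of rows; filtering afterwards is assumed.
std::vector<KeyRange> merge_scan_keys(const std::vector<KeyRange>& ranges,
                                      std::size_t max_ranges) {
    assert(max_ranges > 0);
    if (ranges.size() <= max_ranges) return ranges;
    // Ceiling division: how many original ranges go into each merged one.
    const std::size_t per_group = (ranges.size() + max_ranges - 1) / max_ranges;
    std::vector<KeyRange> merged;
    for (std::size_t i = 0; i < ranges.size(); i += per_group) {
        const std::size_t last = std::min(i + per_group, ranges.size()) - 1;
        merged.emplace_back(ranges[i].first, ranges[last].second);
    }
    return merged;
}
```

The trade-off is fewer, wider ranges per scanner at the cost of reading some rows that are filtered out later.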
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Gabriel39 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column

2022-09-22 Thread GitBox


Gabriel39 commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977345927


##
be/src/vec/columns/column_dictionary.h:
##
@@ -360,40 +362,58 @@ class ColumnDictionary final : public COWHelper> {
 if (code >= 0) {
 return code;
 }
-auto bound = std::upper_bound(_dict_data.begin(), 
_dict_data.end(), value) -
- _dict_data.begin();
+auto bound = std::upper_bound(_dict_data->begin(), 
_dict_data->end(), value) -
+ _dict_data->begin();
 return greater ? bound - greater + eq : bound - eq;
 }
 
 void find_codes(const phmap::flat_hash_set& values,
 std::vector& selected) const {
-size_t dict_word_num = _dict_data.size();
+size_t dict_word_num = _dict_data->size();
 selected.resize(dict_word_num);
 selected.assign(dict_word_num, false);
-for (const auto& value : values) {
-if (auto it = _inverted_index.find(value); it != 
_inverted_index.end()) {
-selected[it->second] = true;
+for (size_t i = 0; i < _dict_data->size(); i++) {
+if (values.find((*_dict_data)[i]) != values.end()) {
+selected[i] = true;
 }
 }
 }
 
 void clear() {
-_dict_data.clear();
-_inverted_index.clear();
-_code_convert_table.clear();
+_dict_data->clear();
 _hash_values.clear();
 }
 
 void clear_hash_values() { _hash_values.clear(); }
 
 void sort() {
-size_t dict_size = _dict_data.size();
-_code_convert_table.reserve(dict_size);
-std::sort(_dict_data.begin(), _dict_data.end(), _comparator);
+size_t dict_size = _dict_data->size();
+
+_perm.resize(dict_size);
+for (size_t i = 0; i < dict_size; ++i) {
+_perm[i] = i;
+}
+
+struct Comparator {
+public:
+Comparator(DictContainer& dict_data) : _dict_data(dict_data) {}
+bool operator()(const size_t a, const size_t b) const {
+return _comparator(_dict_data[a], _dict_data[b]);
+}
+
+private:
+StringValue::Comparator _comparator;
+DictContainer& _dict_data;
+};
+Comparator comparator(*_dict_data);
+std::sort(_perm.begin(), _perm.end(), comparator);

Review Comment:
   Done, thanks for your suggestion!






[GitHub] [doris] luozenglin opened a new pull request, #12864: [fix](parquet) fix write error data as parquet format.

2022-09-22 Thread GitBox


luozenglin opened a new pull request, #12864:
URL: https://github.com/apache/doris/pull/12864

   Fix incorrect data conversion when writing tiny int and small int data to 
parquet files in non-vectorized engine.
   
   # Proposed changes
   
   Issue Number: close #12861
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] zhannngchen closed pull request #12853: [debug](test) a test pr for qa pipeline debug, will not merge

2022-09-22 Thread GitBox


zhannngchen closed pull request #12853: [debug](test) a test pr for qa pipeline 
debug, will not merge
URL: https://github.com/apache/doris/pull/12853





[GitHub] [doris] zhannngchen closed pull request #12855: [debug](test)a test pr for qa pipeline debug, will not merge

2022-09-22 Thread GitBox


zhannngchen closed pull request #12855: [debug](test)a test pr for qa pipeline 
debug, will not merge
URL: https://github.com/apache/doris/pull/12855





[GitHub] [doris] github-actions[bot] commented on pull request #12824: [fix](log)Audit log status is incorrect

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12824:
URL: https://github.com/apache/doris/pull/12824#issuecomment-1254695697

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12824: [fix](log)Audit log status is incorrect

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12824:
URL: https://github.com/apache/doris/pull/12824#issuecomment-1254695752

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12822: [fix](log)Audit log status is incorrect

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12822:
URL: https://github.com/apache/doris/pull/12822#issuecomment-1254697309

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12822: [fix](log)Audit log status is incorrect

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12822:
URL: https://github.com/apache/doris/pull/12822#issuecomment-1254697351

   PR approved by anyone and no changes requested.





[GitHub] [doris] dataroaring opened a new pull request, #12865: test_p0

2022-09-22 Thread GitBox


dataroaring opened a new pull request, #12865:
URL: https://github.com/apache/doris/pull/12865

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] freemandealer opened a new pull request, #12866: [enhancement](compaction) introduce segment compaction (#12609)

2022-09-22 Thread GitBox


freemandealer opened a new pull request, #12866:
URL: https://github.com/apache/doris/pull/12866

   Implement segmentwise compaction during rowset write to reduce the number of segments produced by load jobs; too many segments may otherwise cause OLAP_ERR_TOO_MANY_SEGMENTS (-238).
   
   Signed-off-by: freemandealer 
   
   # Proposed changes
   
   Issue Number: close #12609
   
   ## Problem summ
   
   ## Intro
   
   The default limit is 200 segments per rowset. Too many segments may fail the whole load process (OLAP_ERR_TOO_MANY_SEGMENTS -238). If we increase the limit, the load will succeed but the pressure is transferred to the subsequent rowsetwise compaction. Things get worse when the user issues a query (e.g. an insert into select stmt) right after the load job but before rowsetwise compaction: the query will suffer poor performance or may even end up with OOM.
   
   So we are introducing segmentwise compaction, which compacts data DURING the write process instead of waiting for rowsetwise compaction after the txn has been committed.
   
   
   
   ## Design
   
   ### Trigger
   
   Every time a rowset writer produces more than N (e.g. 10) segments, we trigger segment compaction. Note that only one segment compaction job runs for a single rowset at a time, to avoid a recursing/queuing nightmare.
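
A minimal sketch of such a trigger, assuming the writer counts flushed segments on a single thread and a worker thread reports completion. `SegCompactionTrigger` and its members are hypothetical names, not taken from the PR, and the sketch fires at "at least N" rather than "more than N":

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical per-rowset trigger state; names are illustrative only.
class SegCompactionTrigger {
public:
    explicit SegCompactionTrigger(std::size_t threshold) : _threshold(threshold) {}

    // Called by the writer after each segment is flushed. Returns true when
    // the caller should submit a compaction job: enough segments have
    // accumulated and no other job for this rowset is running.
    bool on_segment_flushed() {
        ++_pending;
        if (_pending < _threshold) return false;
        bool expected = false;
        // Only one job per rowset: the flag flips false -> true at most once
        // until the running job reports completion.
        if (!_running.compare_exchange_strong(expected, true)) return false;
        _pending = 0;
        return true;
    }

    // Called by the worker thread when its job finishes.
    void on_job_done() {
        assert(_running.load());  // a job must have been submitted
        _running.store(false);
    }

private:
    const std::size_t _threshold;
    std::size_t _pending = 0;  // segments flushed since the last submitted job
    std::atomic<bool> _running{false};
};
```

While a job is running, further flushes keep accumulating in `_pending`, so the next job is submitted on the first flush after `on_job_done()`.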
   
   ### Target Selection
   
   We collect segments during every trigger. We skip big segments whose row num > M (e.g. 1) because we get little benefit from compacting them compared with the effort. Hence, we only pick the "Longest Consecutive Small" segment group to do the actual compaction.
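
A minimal sketch of this selection step, assuming each segment is described only by its row count. `longest_small_run` and `max_small_rows` (the M above) are hypothetical names, not taken from the PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns the half-open index range [begin, end) of the longest run of
// consecutive segments whose row count is <= max_small_rows.
// Ties keep the earliest run; {0, 0} means there is no small segment.
std::pair<std::size_t, std::size_t> longest_small_run(
        const std::vector<std::uint64_t>& row_counts, std::uint64_t max_small_rows) {
    std::size_t best_begin = 0, best_end = 0;
    std::size_t run_begin = 0;
    for (std::size_t i = 0; i <= row_counts.size(); ++i) {
        const bool is_small = i < row_counts.size() && row_counts[i] <= max_small_rows;
        if (is_small) continue;
        // A big segment (or the end of the list) closes the run [run_begin, i).
        if (i - run_begin > best_end - best_begin) {
            best_begin = run_begin;
            best_end = i;
        }
        run_begin = i + 1;
    }
    assert(best_end <= row_counts.size());
    return {best_begin, best_end};
}
```

For row counts `{5, 5, 100, 5, 5, 5, 100}` and a limit of 10, the function picks indices 3 to 5, skipping the shorter run at the front.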
   
   ### Compaction Process
   
   A new thread pool is introduced to help do the job. We submit the above-mentioned "Longest Consecutive Small" segment group to the pool. Then the worker thread does the following:
   
   - build a MergeIterator from the target segments
   - create a new segment writer
   - for each block read from the MergeIterator, the writer appends it
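
The worker steps above can be sketched as a k-way merge. This is an illustration under simplifying assumptions, not the Doris MergeIterator: a segment is modeled as a sorted vector of integer keys, and the output vector plays the role of the new segment writer. `Segment` and `merge_segments` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a segment: rows already sorted by key.
using Segment = std::vector<int>;

// K-way merge of sorted segments into one sorted segment.
Segment merge_segments(const std::vector<Segment>& inputs) {
    // Heap entry: (key, input index, offset inside that input).
    using Entry = std::tuple<int, std::size_t, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::size_t total = 0;
    for (std::size_t s = 0; s < inputs.size(); ++s) {
        assert(std::is_sorted(inputs[s].begin(), inputs[s].end()));
        total += inputs[s].size();
        if (!inputs[s].empty()) heap.emplace(inputs[s][0], s, std::size_t{0});
    }
    Segment out;  // plays the role of the new segment writer
    out.reserve(total);
    while (!heap.empty()) {
        const Entry top = heap.top();
        heap.pop();
        out.push_back(std::get<0>(top));  // append the next row in key order
        const std::size_t s = std::get<1>(top);
        const std::size_t next = std::get<2>(top) + 1;
        if (next < inputs[s].size()) heap.emplace(inputs[s][next], s, next);
    }
    return out;
}
```

The real implementation works on blocks of rows rather than single keys, but the ordering logic follows the same pattern.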
   
   ### SegID handling
   
   SegID must remain consecutive after segment compaction. 
   
   If a rowset has small segments named seg_0, seg_1, seg_2, seg_3 and a big 
segment seg_4:
   
   - we create a segment named "seg_0-3" to save compacted data for seg_0, 
seg_1, seg_2 and seg_3
   - delete seg_0, seg_1, seg_2 and seg_3
   - rename seg_0-3 to seg_0
   - rename seg_4 to seg_1
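
The renaming steps above can be sketched as a plan of (old name, new name) pairs. This is a sketch under the assumption that the compacted originals have already been deleted; `rename_plan` and the file-name format are illustrative, not taken from the PR:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Builds the rename plan after segments [begin, end) were compacted into one
// temporary file "seg_<begin>-<end-1>". Renames are listed in the order they
// should be applied.
std::vector<std::pair<std::string, std::string>> rename_plan(
        std::size_t num_segments, std::size_t begin, std::size_t end) {
    std::vector<std::pair<std::string, std::string>> plan;
    if (begin >= end || end > num_segments) return plan;  // nothing compacted
    // The compacted file takes the first ID of the range it replaces.
    plan.emplace_back("seg_" + std::to_string(begin) + "-" + std::to_string(end - 1),
                      "seg_" + std::to_string(begin));
    // Every later segment shifts down so that IDs stay consecutive.
    const std::size_t shift = end - begin - 1;
    for (std::size_t id = end; id < num_segments && shift > 0; ++id) {
        assert(id >= shift);
        plan.emplace_back("seg_" + std::to_string(id), "seg_" + std::to_string(id - shift));
    }
    return plan;
}
```

Calling `rename_plan(5, 0, 4)` reproduces the example: `seg_0-3` becomes `seg_0`, and `seg_4` becomes `seg_1`.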
   
   It is worth noting that we should wait for in-flight segment compaction tasks to finish before building rowset meta and committing this txn.
   
   
   
   ## Test results
   
   ### The amount of data Doris can load
   
   First, we test the amount of data that we can successfully load into Doris with segment compaction disabled and enabled. Tests are based on TPCH. The table is created with 1 bucket and no parallelism. We trigger segment compaction every 10 segments produced by the rowset writer.
   
   | cases | data amount |
   | --- | --- |
   | Disable SegCompaction | 1.12 million rows, 18.67GB |
   | Enable SegCompaction | 11 million rows, 183GB |
   
   The result shows that the amount of data we can load into Doris improves 10 times after enabling segment compaction. The ratio corresponds to the triggering segment number.
   
   ### Impact on latency
   
   When segment compaction is disabled, a load job finishes in 1260s during the test, and the subsequent rowsetwise compaction costs 151s.
   
   The test results with segment compaction enabled, for different triggering segment numbers:
   
   | triggering segment number | Load Latency | RowsetCompaction Latency |
   | --- | --- | --- |
   | 5 (trigger every 5 segments) | 089s (-13%) | 242s (+60%) |
   | 10 | 1053s (-16%) | 166s (+9%) |
   | 20 | 960s (-23%) | 172s (+13%) |
   | 40 | 1320s (+4%) | 169s (+11%) |
   
   We loaded without segment compaction several times, and the latency varied from run to run within a range of (-25%, +25%). So we believe that segment compaction has little impact on the latency.
   
   In addition to the above costs, we wait for in-flight segment compaction tasks to finish before building rowset meta and publishing the data. The length of the wait depends on when the build takes place, but there is a theoretical range for it, and the range is related to the time each segment compaction task costs:
   
   | triggering segment number | Single SegCompaction Task Latency |
   | --- | --- |
   | 5 | 5s |
   | 10 | 9s |
   | 20 | 20s |
   | 40 | 60s |
   
   ### I

[GitHub] [doris] freemandealer closed pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)

2022-09-22 Thread GitBox


freemandealer closed pull request #12610: [WIP][Enhancement](compaction) 
segment compaction (#12609)
URL: https://github.com/apache/doris/pull/12610





[GitHub] [doris] freemandealer commented on pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)

2022-09-22 Thread GitBox


freemandealer commented on PR #12610:
URL: https://github.com/apache/doris/pull/12610#issuecomment-1254772650

   A brand-new PR with updated code, a detailed design, and test results is provided here: https://github.com/apache/doris/pull/12866





[GitHub] [doris] Gabriel39 opened a new pull request, #12867: [Improvement](predicate) Replace for-loop by memcpy

2022-09-22 Thread GitBox


Gabriel39 opened a new pull request, #12867:
URL: https://github.com/apache/doris/pull/12867

   # Proposed changes
   
   This PR replaces a for-loop with memcpy.
   I ran two experiments.
   
   Experiment 1
   
   Run ckbench q20 and generate a flame graph. Compare the proportion of the total time spent in this function.
   
   I got:
   for-loop:1.74%
   memcpy:0.013%
   
   Experiment 2
   
   Run `SELECT JavaEnable FROM hits`. More than 99 million (9900w+) rows are returned, and JavaEnable is a SMALLINT column.
   Compare the BlockLoadTime.
   I got: 
   for-loop:1s225ms
   memcpy:805.603ms
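
A minimal sketch of the two copy strategies being compared, assuming the data is a contiguous array of a trivially copyable type (SMALLINT maps to `std::int16_t` here). `append_loop` and `append_memcpy` are hypothetical helpers, not the functions changed by the PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Element-by-element copy, the general shape of loop being replaced.
void append_loop(std::vector<std::int16_t>& dst, const std::int16_t* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst.push_back(src[i]);
}

// Bulk copy: grow the destination once, then copy the whole range.
// Only valid for trivially copyable element types.
void append_memcpy(std::vector<std::int16_t>& dst, const std::int16_t* src, std::size_t n) {
    if (n == 0) return;  // never hand a possibly null pointer to memcpy
    assert(src != nullptr);
    const std::size_t old_size = dst.size();
    dst.resize(old_size + n);
    std::memcpy(dst.data() + old_size, src, n * sizeof(std::int16_t));
}
```

Both helpers leave `dst` with the same contents; the second does one resize and one bulk copy instead of a capacity check per element.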
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] morningman opened a new pull request, #12868: [draft] for testing p0, not merge

2022-09-22 Thread GitBox


morningman opened a new pull request, #12868:
URL: https://github.com/apache/doris/pull/12868

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Gabriel39 opened a new pull request, #12869: [Bug](date)(1.1-lts) Fix wrong type in TimestampArithmeticExpr

2022-09-22 Thread GitBox


Gabriel39 opened a new pull request, #12869:
URL: https://github.com/apache/doris/pull/12869

   # Proposed changes
   
   Cherry pick from #12727
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Gabriel39 opened a new pull request, #12870: [Bug](date)(1.1-lts) Fix wrong result produced by date function

2022-09-22 Thread GitBox


Gabriel39 opened a new pull request, #12870:
URL: https://github.com/apache/doris/pull/12870

   # Proposed changes
   
   Cherry pick from #12720
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Henry2SS opened a new issue, #12871: [Enhancement](rewrite) support Or to In rule

2022-09-22 Thread GitBox


Henry2SS opened a new issue, #12871:
URL: https://github.com/apache/doris/issues/12871

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   Support an Or-to-In rewrite rule.
   
   For example, the SQL `select * from test_tbl where a = 1 or a = 2 or a in (3, 4)` should be rewritten to `select * from test_tbl where a in (1,2,3,4)`.
   
   ### Solution
   
   Support an Or-to-In rewrite rule.
   
   For example, the SQL `select * from test_tbl where a = 1 or a = 2 or a in (3, 4)` should be rewritten to `select * from test_tbl where a in (1,2,3,4)`.
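
A minimal sketch of the rewrite idea, written in C++ for illustration only (the real rule would live in the planner's rewrite framework). It assumes every disjunct is either `column = literal` or `column IN (literals)` with integer literals; `Disjunct` and `or_to_in` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical disjunct: `column = v` carries one value,
// `column IN (...)` carries several.
struct Disjunct {
    std::string column;
    std::vector<int> values;
};

// Collapses OR-ed disjuncts over the same column into a single IN list,
// keeping first-seen order and dropping duplicate literals.
std::vector<Disjunct> or_to_in(const std::vector<Disjunct>& disjuncts) {
    std::vector<Disjunct> merged;
    for (const Disjunct& d : disjuncts) {
        assert(!d.values.empty());  // `a IN ()` is not a valid predicate
        auto it = std::find_if(merged.begin(), merged.end(),
                               [&](const Disjunct& m) { return m.column == d.column; });
        if (it == merged.end()) {
            merged.push_back(Disjunct{d.column, {}});
            it = merged.end() - 1;
        }
        for (int v : d.values) {
            if (std::find(it->values.begin(), it->values.end(), v) == it->values.end()) {
                it->values.push_back(v);
            }
        }
    }
    return merged;
}
```

For the example query, the three disjuncts on `a` collapse into one entry with the values 1, 2, 3, 4. Disjuncts on different columns stay separate and remain OR-ed together.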
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] Henry2SS opened a new pull request, #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems

2022-09-22 Thread GitBox


Henry2SS opened a new pull request, #12872:
URL: https://github.com/apache/doris/pull/12872

   # Proposed changes
   
   Issue Number: close #12871
   
   ## Problem summary
   
   1. Support the Or-to-In rewrite rule.
   2. Fix Expr clone problems: clone should create a new object; otherwise the result is always a shallow copy.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   3. Has unit tests been added:
   - [x] Yes
   - [ ] No
   - [ ] No Need
   4. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   5. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   6. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Henry2SS commented on pull request #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems

2022-09-22 Thread GitBox


Henry2SS commented on PR #12872:
URL: https://github.com/apache/doris/pull/12872#issuecomment-1254828517

   tested locally.





[GitHub] [doris] Gabriel39 opened a new pull request, #12873: [feature](outfile)(1.1-lts) support parquet writer

2022-09-22 Thread GitBox


Gabriel39 opened a new pull request, #12873:
URL: https://github.com/apache/doris/pull/12873

   # Proposed changes
   
   Cherry pick from #12492
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] caiconghui opened a new issue, #12874: [Bug] set enable_projection to false will cause select stmt analyze failed

2022-09-22 Thread GitBox


caiconghui opened a new issue, #12874:
URL: https://github.com/apache/doris/issues/12874

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Version
   
   master and lts
   
   ### What's Wrong?
   
   set enable_projection=false;
   select count() from (select a, b from table001 order by b limit 1) a
   
   Then an exception like the following is thrown:
   ERROR 1105 (HY000): errCode = 2, detailMessage = couldn't resolve slot descriptor 0
   
   ### What You Expected?
   
   The query should work.
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] Henry2SS commented on pull request #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems

2022-09-22 Thread GitBox


Henry2SS commented on PR #12872:
URL: https://github.com/apache/doris/pull/12872#issuecomment-1254835927

   1. fe unit-tests passed locally.
   2. compiled, and manual functional tests passed
   
   test results:
   
   
![image](https://user-images.githubusercontent.com/45096548/191724681-3f895152-23b0-459d-b046-6fcb7530515d.png)
   
   
![image](https://user-images.githubusercontent.com/45096548/191723800-848baf3b-f738-4eb8-bd6f-020c638d709a.png)
   





[GitHub] [doris] caiconghui commented on issue #12874: [Bug] set enable_projection to false will cause select stmt analyze failed

2022-09-22 Thread GitBox


caiconghui commented on issue #12874:
URL: https://github.com/apache/doris/issues/12874#issuecomment-1254836058

   mysql> show columns from baseall;
   +-------+----------------+------+-------+---------+---------+
   | Field | Type           | Null | Key   | Default | Extra   |
   +-------+----------------+------+-------+---------+---------+
   | k0    | BOOLEAN        | Yes  | true  | NULL    |         |
   | k1    | TINYINT        | Yes  | true  | NULL    |         |
   | k2    | SMALLINT       | Yes  | true  | NULL    |         |
   | k3    | INT            | Yes  | true  | NULL    |         |
   | k4    | BIGINT         | Yes  | true  | NULL    |         |
   | k5    | DECIMAL(9,3)   | Yes  | true  | NULL    |         |
   | k6    | CHAR(5)        | Yes  | true  | NULL    |         |
   | k10   | DATE           | Yes  | true  | NULL    |         |
   | k11   | DATETIME       | Yes  | true  | NULL    |         |
   | k7    | VARCHAR(20)    | Yes  | true  | NULL    |         |
   | k8    | DOUBLE         | Yes  | false | NULL    | MAX     |
   | k9    | FLOAT          | Yes  | false | NULL    | SUM     |
   | k12   | VARCHAR(65533) | Yes  | false | NULL    | REPLACE |
   | k13   | LARGEINT       | Yes  | false | NULL    | REPLACE |
   +-------+----------------+------+-------+---------+---------+
   
   
   select count() from (select k0, k1 from baseall order by k1 limit 1) a
   





[doris-website] branch master updated: add ADMIN-CLEAN-TRASH

2022-09-22 Thread jiafengzheng
This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
 new d569a99c79c add ADMIN-CLEAN-TRASH
d569a99c79c is described below

commit d569a99c79c9d8dfcb759ec3072fad8023400a01
Author: jiafeng.zhang 
AuthorDate: Thu Sep 22 18:36:31 2022 +0800

add ADMIN-CLEAN-TRASH

add ADMIN-CLEAN-TRASH
---
 sidebars.json | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sidebars.json b/sidebars.json
index feae10df74f..1ee53be255e 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -667,6 +667,7 @@
 "items": [
 "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REPAIR",
 "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CHECK-TABLET",
+"sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CLEAN-TRASH",
 "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET",
 "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REPAIR-TABLE",
 "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-SET-CONFIG",





[GitHub] [doris] zhannngchen opened a new pull request, #12875: [feature-wip](unique-key-merge-on-write) fix thread safe issue in BetaRowsetWriter

2022-09-22 Thread GitBox


zhannngchen opened a new pull request, #12875:
URL: https://github.com/apache/doris/pull/12875

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] morrySnow opened a new pull request, #12876: test bucket shuffle

2022-09-22 Thread GitBox


morrySnow opened a new pull request, #12876:
URL: https://github.com/apache/doris/pull/12876

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] nextdreamblue opened a new pull request, #12877: [fix](type) fix DECIMAL scale when cast function on fe

2022-09-22 Thread GitBox


nextdreamblue opened a new pull request, #12877:
URL: https://github.com/apache/doris/pull/12877

   # Proposed changes
   
   Issue Number: close #12717
   
   ## Problem summary
   
   Handle DECIMAL data according to the precision and scale of the DECIMAL type passed to cast.
   
   before:
   MySQL [test]> select cast('135.75999' as DECIMAL(10,3));
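   +------------------------------------+
   | CAST('135.75999' AS DECIMAL(10,3)) |
   +------------------------------------+
   |                          135.75999 |
   +------------------------------------+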
   1 row in set (0.00 sec)
   
   now:
   MySQL [stage]> select cast('135.75999' as DECIMAL(10,3));
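   +------------------------------------+
   | CAST('135.75999' AS DECIMAL(10,3)) |
   +------------------------------------+
   |                            135.759 |
   +------------------------------------+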
   1 row in set (0.01 sec)
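
The before/after output above is consistent with the extra fractional digits being dropped once the target scale is applied. A minimal sketch of that behavior on a decimal literal follows. It is written in C++ for illustration only; `truncate_to_scale` is a hypothetical helper, not the Doris FE code, and it assumes truncation rather than rounding because the PR description does not say which one is used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Doris): keep at most `scale` fractional
// digits of a plain decimal literal and drop the rest without rounding.
std::string truncate_to_scale(const std::string& literal, std::size_t scale) {
    const std::size_t dot = literal.find('.');
    if (dot == std::string::npos) {
        return literal; // no fractional part, nothing to cut
    }
    if (scale == 0) {
        return literal.substr(0, dot); // drop the point and all fractional digits
    }
    const std::size_t frac_len = literal.size() - dot - 1;
    if (frac_len <= scale) {
        return literal; // already within the requested scale
    }
    return literal.substr(0, dot + 1 + scale);
}
```

Called as `truncate_to_scale("135.75999", 3)`, the helper yields the same digits as the "now" output shown in the PR.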
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [x] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[doris] branch opt_perf updated: [bugfix](scanner) olap scanner compute is wrong

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch opt_perf
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/opt_perf by this push:
 new b65178b7a7 [bugfix](scanner) olap scanner compute is wrong
b65178b7a7 is described below

commit b65178b7a7df72efc7d1d275b4dc4116bb9413e2
Author: yiguolei 
AuthorDate: Thu Sep 22 15:06:06 2022 +0800

[bugfix](scanner) olap scanner compute is wrong
---
 be/src/exec/olap_scan_node.cpp  | 2 +-
 be/src/vec/exec/scan/new_olap_scan_node.cpp | 2 +-
 be/src/vec/exec/volap_scan_node.cpp | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/be/src/exec/olap_scan_node.cpp b/be/src/exec/olap_scan_node.cpp
index e49fdde6d1..d3b3a3aabd 100644
--- a/be/src/exec/olap_scan_node.cpp
+++ b/be/src/exec/olap_scan_node.cpp
@@ -921,7 +921,7 @@ Status OlapScanNode::start_scan_thread(RuntimeState* state) {
 int size_based_scanners_per_tablet = 1;
 if (config::doris_scan_range_max_mb > 0) {
 size_based_scanners_per_tablet = std::max(
-1, (int)tablet->tablet_footprint() / config::doris_scan_range_max_mb << 20);
+1, (int)(tablet->tablet_footprint() / (config::doris_scan_range_max_mb << 20)));
 }
 int ranges_per_scanner =
 std::max(1, (int)ranges->size() /
diff --git a/be/src/vec/exec/scan/new_olap_scan_node.cpp b/be/src/vec/exec/scan/new_olap_scan_node.cpp
index 973e6c23ee..8242abef77 100644
--- a/be/src/vec/exec/scan/new_olap_scan_node.cpp
+++ b/be/src/vec/exec/scan/new_olap_scan_node.cpp
@@ -290,7 +290,7 @@ Status NewOlapScanNode::_init_scanners(std::list* scanners) {
 
 if (config::doris_scan_range_max_mb > 0) {
 size_based_scanners_per_tablet = std::max(
-1, (int)tablet->tablet_footprint() / config::doris_scan_range_max_mb << 20);
+1, (int)(tablet->tablet_footprint() / (config::doris_scan_range_max_mb << 20)));
 }
 
 int ranges_per_scanner =
diff --git a/be/src/vec/exec/volap_scan_node.cpp b/be/src/vec/exec/volap_scan_node.cpp
index 8197c88dbd..ebe6ab90cd 100644
--- a/be/src/vec/exec/volap_scan_node.cpp
+++ b/be/src/vec/exec/volap_scan_node.cpp
@@ -912,7 +912,7 @@ Status VOlapScanNode::start_scan_thread(RuntimeState* state) {
 
 if (config::doris_scan_range_max_mb > 0) {
 size_based_scanners_per_tablet = std::max(
-1, (int)tablet->tablet_footprint() / config::doris_scan_range_max_mb << 20);
+1, (int)(tablet->tablet_footprint() / (config::doris_scan_range_max_mb << 20)));
 }
 
 int ranges_per_scanner =
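
The three hunks above all fix the same operator-precedence mistake: in C++ a cast binds tighter than `/`, and `/` binds tighter than `<<`, so the removed line computed `(((int)footprint) / max_mb) << 20` instead of dividing the footprint by the range size in bytes. The sketch below reproduces both forms side by side. The function names and the plain `int64_t`/`int` parameters are assumptions made for illustration and stand in for `tablet->tablet_footprint()` and `config::doris_scan_range_max_mb`; the widening of `max_mb` before the shift is also an addition of this sketch, not part of the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Old expression, kept verbatim in shape: parsed as
// (((int)footprint_bytes) / max_mb) << 20, which is not a scanner count.
inline int scanners_per_tablet_old(int64_t footprint_bytes, int max_mb) {
    assert(max_mb > 0);
    return std::max(1, (int)footprint_bytes / max_mb << 20);
}

// Fixed expression: convert the limit from MB to bytes first, then divide.
// The limit is widened to int64_t here so the shift cannot overflow int.
inline int scanners_per_tablet_fixed(int64_t footprint_bytes, int max_mb) {
    assert(max_mb > 0);
    const int64_t max_bytes = static_cast<int64_t>(max_mb) << 20;
    return std::max(1, static_cast<int>(footprint_bytes / max_bytes));
}
```

With a 2048-byte footprint and a 1024 MB limit, the old form returns a value in the millions while the fixed form returns 1, which is why the commit title calls the old computation wrong.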





[GitHub] [doris] zy-kkk opened a new pull request, #12878: [typo](docs)Optimized date function doc order and add partial function doc

2022-09-22 Thread GitBox


zy-kkk opened a new pull request, #12878:
URL: https://github.com/apache/doris/pull/12878

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Gabriel39 opened a new pull request, #12879: [Improvement](predicate) Replace for-loop by memcpy

2022-09-22 Thread GitBox


Gabriel39 opened a new pull request, #12879:
URL: https://github.com/apache/doris/pull/12879

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] Gabriel39 opened a new pull request, #12880: [Improvement](dict) optimize dictionary column

2022-09-22 Thread GitBox


Gabriel39 opened a new pull request, #12880:
URL: https://github.com/apache/doris/pull/12880

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] HappenLee opened a new pull request, #12881: [Opt](Vectorized) Support push down no grouping agg

2022-09-22 Thread GitBox


HappenLee opened a new pull request, #12881:
URL: https://github.com/apache/doris/pull/12881

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] BiteTheDDDDt opened a new pull request, #12882: [Chore](clang) support build with clang15

2022-09-22 Thread GitBox


BiteTheDDDDt opened a new pull request, #12882:
URL: https://github.com/apache/doris/pull/12882

   # Proposed changes
   
   1. remove some unused variables
   2. reformat the code with clang-format 15
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] yiguolei merged pull request #12881: [Opt](Vectorized) Support push down no grouping agg

2022-09-22 Thread GitBox


yiguolei merged PR #12881:
URL: https://github.com/apache/doris/pull/12881





[GitHub] [doris] zhangstar333 opened a new pull request, #12883: [Bug](jdbc) fix insert into date type to oracle using wrong type

2022-09-22 Thread GitBox


zhangstar333 opened a new pull request, #12883:
URL: https://github.com/apache/doris/pull/12883

   # Proposed changes
   When inserting into a DATE column in Oracle through JDBC,
   the to_date function should be used to convert the string to java.sql.Date.
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[doris] branch opt_perf updated: [Opt](Vectorized) Support push down no grouping agg (#12881)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch opt_perf
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/opt_perf by this push:
 new c5ec7601d4 [Opt](Vectorized) Support push down no grouping agg (#12881)
c5ec7601d4 is described below

commit c5ec7601d4a45493051292c7234997c622a46f36
Author: HappenLee 
AuthorDate: Thu Sep 22 19:46:21 2022 +0800

[Opt](Vectorized) Support push down no grouping agg (#12881)
---
 be/src/olap/iterators.h|   1 +
 be/src/olap/reader.h   |   1 +
 be/src/olap/rowset/beta_rowset_reader.cpp  |   1 +
 be/src/olap/rowset/rowset_reader_context.h |   1 +
 be/src/olap/rowset/segment_v2/column_reader.cpp|  42 ++
 be/src/olap/rowset/segment_v2/column_reader.h  |  12 ++
 be/src/olap/rowset/segment_v2/segment.cpp  |   8 +-
 be/src/olap/rowset/segment_v2/segment_iterator.cpp |   1 -
 be/src/olap/rowset/segment_v2/segment_iterator.h   |   4 +-
 be/src/vec/exec/scan/new_olap_scanner.cpp  |   9 +-
 be/src/vec/exec/volap_scanner.cpp  |   8 +-
 be/src/vec/olap/block_reader.cpp   |   1 +
 be/src/vec/olap/vgeneric_iterators.cpp |  79 ++
 be/src/vec/olap/vgeneric_iterators.h   |   6 +
 be/test/vec/exec/vgeneric_iterators_test.cpp   |   1 -
 .../org/apache/doris/catalog/PrimitiveType.java|   4 +
 .../org/apache/doris/planner/OlapScanNode.java |  11 ++
 .../apache/doris/planner/SingleNodePlanner.java| 160 +
 .../java/org/apache/doris/qe/SessionVariable.java  |  13 +-
 gensrc/thrift/PlanNodes.thrift |   8 ++
 20 files changed, 359 insertions(+), 12 deletions(-)

diff --git a/be/src/olap/iterators.h b/be/src/olap/iterators.h
index 22f081d0eb..4f12118c2c 100644
--- a/be/src/olap/iterators.h
+++ b/be/src/olap/iterators.h
@@ -77,6 +77,7 @@ public:
 std::vector column_predicates;
 std::unordered_map> col_id_to_predicates;
 std::unordered_map> col_id_to_del_predicates;
+TPushAggOp::type push_down_agg_type_opt = TPushAggOp::NONE;
 
 // REQUIRED (null is not allowed)
 OlapReaderStatistics* stats = nullptr;
diff --git a/be/src/olap/reader.h b/be/src/olap/reader.h
index 004e75c773..ae476e4fa2 100644
--- a/be/src/olap/reader.h
+++ b/be/src/olap/reader.h
@@ -91,6 +91,7 @@ public:
 // use only in vec exec engine
 std::vector* origin_return_columns = nullptr;
 std::unordered_set* tablet_columns_convert_to_null_set = nullptr;
+TPushAggOp::type push_down_agg_type_opt = TPushAggOp::NONE;
 
 // used for comapction to record row ids
 bool record_rowids = false;
diff --git a/be/src/olap/rowset/beta_rowset_reader.cpp b/be/src/olap/rowset/beta_rowset_reader.cpp
index df15b72f62..87893927d5 100644
--- a/be/src/olap/rowset/beta_rowset_reader.cpp
+++ b/be/src/olap/rowset/beta_rowset_reader.cpp
@@ -49,6 +49,7 @@ Status BetaRowsetReader::init(RowsetReaderContext* read_context) {
 // convert RowsetReaderContext to StorageReadOptions
 StorageReadOptions read_options;
 read_options.stats = _stats;
+read_options.push_down_agg_type_opt = _context->push_down_agg_type_opt;
 if (read_context->lower_bound_keys != nullptr) {
 for (int i = 0; i < read_context->lower_bound_keys->size(); ++i) {
 read_options.key_ranges.emplace_back(&read_context->lower_bound_keys->at(i),
diff --git a/be/src/olap/rowset/rowset_reader_context.h b/be/src/olap/rowset/rowset_reader_context.h
index de61117426..ce2fd4b721 100644
--- a/be/src/olap/rowset/rowset_reader_context.h
+++ b/be/src/olap/rowset/rowset_reader_context.h
@@ -41,6 +41,7 @@ struct RowsetReaderContext {
 std::vector* read_orderby_key_columns = nullptr;
 // projection columns: the set of columns rowset reader should return
 const std::vector* return_columns = nullptr;
+TPushAggOp::type push_down_agg_type_opt = TPushAggOp::NONE;
 // column name -> column predicate
 // adding column_name for predicate to make use of column selectivity
 const std::vector* predicates = nullptr;
diff --git a/be/src/olap/rowset/segment_v2/column_reader.cpp b/be/src/olap/rowset/segment_v2/column_reader.cpp
index d42358c5e7..451b8f3e91 100644
--- a/be/src/olap/rowset/segment_v2/column_reader.cpp
+++ b/be/src/olap/rowset/segment_v2/column_reader.cpp
@@ -171,6 +171,44 @@ Status ColumnReader::get_row_ranges_by_zone_map(
 return Status::OK();
 }
 
+Status ColumnReader::next_batch_of_zone_map(size_t* n, vectorized::MutableColumnPtr& dst) const {
+// TODO: this work to get min/max value seems should only do once
+FieldType type = _type_info->type();
+std::unique_ptr min_value(WrapperField::create_by_type(type, _meta.length()));
+std::unique_ptr max_value(WrapperField::create_by_type(type, _meta.length()));
+_parse_zone

[GitHub] [doris] yiguolei merged pull request #12880: [Improvement](dict) optimize dictionary column

2022-09-22 Thread GitBox


yiguolei merged PR #12880:
URL: https://github.com/apache/doris/pull/12880





[doris] branch opt_perf updated: [Improvement](dict) optimize dictionary column (#12880)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch opt_perf
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/opt_perf by this push:
 new eb68ee6560 [Improvement](dict) optimize dictionary column (#12880)
eb68ee6560 is described below

commit eb68ee6560b94a6844f4605398f997f878646936
Author: Gabriel 
AuthorDate: Thu Sep 22 19:46:38 2022 +0800

[Improvement](dict) optimize dictionary column (#12880)
---
 be/src/vec/columns/column_dictionary.h | 87 ++
 be/src/vec/columns/column_string.h | 11 +
 2 files changed, 58 insertions(+), 40 deletions(-)

diff --git a/be/src/vec/columns/column_dictionary.h b/be/src/vec/columns/column_dictionary.h
index d56265b757..93fbcb9a3e 100644
--- a/be/src/vec/columns/column_dictionary.h
+++ b/be/src/vec/columns/column_dictionary.h
@@ -192,11 +192,13 @@ public:
 
 Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* col_ptr) override {
 auto* res_col = reinterpret_cast(col_ptr);
+res_col->get_offsets().reserve(sel_size);
+res_col->get_chars().reserve(_dict.avg_str_len() * sel_size);
 for (size_t i = 0; i < sel_size; i++) {
 uint16_t n = sel[i];
 auto& code = reinterpret_cast(_codes[n]);
 auto value = _dict.get_value(code);
-res_col->insert_data(value.ptr, value.len);
+res_col->insert_data_without_reserve(value.ptr, value.len);
 }
 return Status::OK();
 }
@@ -281,42 +283,36 @@ public:
 
 class Dictionary {
 public:
-Dictionary() = default;
+Dictionary() : _dict_data(new DictContainer()), _total_str_len(0) {};
 
-void reserve(size_t n) {
-_dict_data.reserve(n);
-_inverted_index.reserve(n);
-}
+void reserve(size_t n) { _dict_data->reserve(n); }
 
 void insert_value(StringValue& value) {
-_dict_data.push_back_without_reserve(value);
-_inverted_index[value] = _inverted_index.size();
+_dict_data->push_back_without_reserve(value);
+_total_str_len += value.len;
 }
 
 int32_t find_code(const StringValue& value) const {
-auto it = _inverted_index.find(value);
-if (it != _inverted_index.end()) {
-return it->second;
+for (size_t i = 0; i < _dict_data->size(); i++) {
+if ((*_dict_data)[i] == value) {
+return i;
+}
 }
 return -2; // -1 is null code
 }
 
 T get_null_code() const { return -1; }
 
-inline StringValue& get_value(T code) {
-return code >= _dict_data.size() ? _null_value : _dict_data[code];
-}
+inline StringValue& get_value(T code) { return (*_dict_data)[code]; }
 
-inline const StringValue& get_value(T code) const {
-return code >= _dict_data.size() ? _null_value : _dict_data[code];
-}
+inline const StringValue& get_value(T code) const { return (*_dict_data)[code]; }
 
 // The function is only used in the runtime filter feature
 inline void generate_hash_values_for_runtime_filter(FieldType type) {
 if (_hash_values.empty()) {
-_hash_values.resize(_dict_data.size());
-for (size_t i = 0; i < _dict_data.size(); i++) {
-auto& sv = _dict_data[i];
+_hash_values.resize(_dict_data->size());
+for (size_t i = 0; i < _dict_data->size(); i++) {
+auto& sv = (*_dict_data)[i];
 // The char data is stored in the disk with the schema length,
 // and zeros are filled if the length is insufficient
 
@@ -360,40 +356,50 @@ public:
 if (code >= 0) {
 return code;
 }
-auto bound = std::upper_bound(_dict_data.begin(), _dict_data.end(), value) -
- _dict_data.begin();
+auto bound = std::upper_bound(_dict_data->begin(), _dict_data->end(), value) -
+ _dict_data->begin();
 return greater ? bound - greater + eq : bound - eq;
 }
 
 void find_codes(const phmap::flat_hash_set& values,
 std::vector& selected) const {
-size_t dict_word_num = _dict_data.size();
+size_t dict_word_num = _dict_data->size();
 selected.resize(dict_word_num);
 selected.assign(dict_word_num, false);
-for (const auto& value : values) {
-if (auto it = _inverted_index.find(value); it != _inverted_index.end()) {
-selected[it->second] = true;
+for (size_t i = 0; i < _dict_data->size(); i++) {
+if (values.find((*_dict_data)[i]) != values.end())

[GitHub] [doris] yiguolei merged pull request #12879: [Improvement](predicate) Replace for-loop by memcpy

2022-09-22 Thread GitBox


yiguolei merged PR #12879:
URL: https://github.com/apache/doris/pull/12879





[GitHub] [doris] mrhhsg opened a new pull request, #12884: [improvement](scan) merge scan keys based on the number of scanners

2022-09-22 Thread GitBox


mrhhsg opened a new pull request, #12884:
URL: https://github.com/apache/doris/pull/12884

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[doris] branch opt_perf updated: [Improvement](predicate) Replace for-loop by memcpy (#12879)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch opt_perf
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/opt_perf by this push:
 new 2b27aaa2fa [Improvement](predicate) Replace for-loop by memcpy (#12879)
2b27aaa2fa is described below

commit 2b27aaa2fa888ca2fe4634553034ea2f33e37ab4
Author: Gabriel 
AuthorDate: Thu Sep 22 19:46:52 2022 +0800

[Improvement](predicate) Replace for-loop by memcpy (#12879)
---
 be/src/vec/columns/predicate_column.h | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/be/src/vec/columns/predicate_column.h b/be/src/vec/columns/predicate_column.h
index fd99d4c04b..d5ad52b6ac 100644
--- a/be/src/vec/columns/predicate_column.h
+++ b/be/src/vec/columns/predicate_column.h
@@ -133,13 +133,9 @@ private:
 }
 }
 
-// note(wb): Write data one by one has a slight performance improvement than memcpy directly
 void insert_many_default_type(const char* data_ptr, size_t num) {
-T* input_val_ptr = (T*)data_ptr;
 T* res_val_ptr = (T*)data.get_end_ptr();
-for (int i = 0; i < num; i++) {
-res_val_ptr[i] = input_val_ptr[i];
-}
+memcpy(res_val_ptr, data_ptr, num * sizeof(T));
 res_val_ptr += num;
 data.set_end_ptr(res_val_ptr);
 }
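
The hunk above swaps an element-by-element copy for a single `memcpy` of `num * sizeof(T)` bytes. A minimal standalone sketch of the same idea is shown below. It assumes a `std::vector<T>` destination in place of the column's internal buffer (`data.get_end_ptr()` / `data.set_end_ptr()`), and `append_raw` is a hypothetical name. Unlike the commit, the sketch grows the destination itself, and it adds a `static_assert` because a byte-wise copy is only valid for trivially copyable types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: append `num` values of type T, stored contiguously at
// `data_ptr`, to `dst` with one bulk copy instead of a per-element loop.
template <typename T>
void append_raw(std::vector<T>& dst, const char* data_ptr, std::size_t num) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    if (num == 0) {
        return; // nothing to copy; also avoids memcpy on a possibly null pointer
    }
    assert(data_ptr != nullptr);
    const std::size_t old_size = dst.size();
    dst.resize(old_size + num); // make room first, then copy the raw bytes
    std::memcpy(dst.data() + old_size, data_ptr, num * sizeof(T));
}
```

Whether the bulk copy is actually faster than the loop depends on the compiler and on `T`; the removed comment in the diff claimed the opposite, so a benchmark on the target build is the only reliable guide.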





[GitHub] [doris] yiguolei merged pull request #12884: [improvement](scan) merge scan keys based on the number of scanners

2022-09-22 Thread GitBox


yiguolei merged PR #12884:
URL: https://github.com/apache/doris/pull/12884





[doris] branch opt_perf updated: [improvement](scan) merge scan keys based on the number of scanners (#12884)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch opt_perf
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/opt_perf by this push:
 new 3d2a73c028 [improvement](scan) merge scan keys based on the number of scanners (#12884)
3d2a73c028 is described below

commit 3d2a73c028802bfcdeeba0ff5851cfded6d548e4
Author: Jerry Hu 
AuthorDate: Thu Sep 22 20:10:42 2022 +0800

[improvement](scan) merge scan keys based on the number of scanners (#12884)
---
 be/src/exec/olap_common.cpp | 113 +++
 be/src/exec/olap_common.h   | 116 
 be/src/runtime/datetime_value.h |  21 +
 be/src/vec/exec/scan/new_olap_scan_node.cpp |  22 --
 be/src/vec/runtime/vdatetime_value.h|  11 +++
 5 files changed, 262 insertions(+), 21 deletions(-)

diff --git a/be/src/exec/olap_common.cpp b/be/src/exec/olap_common.cpp
index 8069c47a17..087a62928c 100644
--- a/be/src/exec/olap_common.cpp
+++ b/be/src/exec/olap_common.cpp
@@ -59,6 +59,42 @@ void 
ColumnValueRange::convert_to_fixed_value() {
 return;
 }
 
+template <>
+std::vector>
+ColumnValueRange::split(size_t count) {
+__builtin_unreachable();
+}
+
+template <>
+std::vector>
+ColumnValueRange::split(size_t count) {
+__builtin_unreachable();
+}
+
+template <>
+std::vector>
+ColumnValueRange::split(size_t count) {
+__builtin_unreachable();
+}
+
+template <>
+std::vector>
+ColumnValueRange::split(size_t count) {
+__builtin_unreachable();
+}
+
+template <>
+std::vector>
+ColumnValueRange::split(size_t count) {
+__builtin_unreachable();
+}
+
+template <>
+std::vector>
+ColumnValueRange::split(size_t count) {
+__builtin_unreachable();
+}
+
 Status 
OlapScanKeys::get_key_range(std::vector>* 
key_range) {
 key_range->clear();
 
@@ -74,6 +110,83 @@ Status 
OlapScanKeys::get_key_range(std::vector>*
 return Status::OK();
 }
 
+Status 
OlapScanKeys::extend_scan_splitted_keys(std::vector& 
ranges) {
+using namespace std;
+DCHECK(!_has_range_value);
+
+std::vector new_begin_keys;
+std::vector new_end_keys;
+for (size_t i = 0; i != ranges.size(); ++i) {
+std::visit(
+[&](auto&& range) {
+using RangeType = std::decay_t;
+using CppType = typename RangeType::CppType;
+auto begin_keys = _begin_scan_keys;
+auto end_keys = _end_scan_keys;
+if (begin_keys.empty()) {
+begin_keys.emplace_back();
+begin_keys.back().add_value(
+cast_to_string(
+range.get_range_min_value(), 
range.scale()),
+range.contain_null());
+end_keys.emplace_back();
+
end_keys.back().add_value(cast_to_string(
+range.get_range_max_value(), range.scale()));
+} else {
+for (int i = 0; i < begin_keys.size(); ++i) {
+begin_keys[i].add_value(
+cast_to_string(
+range.get_range_min_value(), 
range.scale()),
+range.contain_null());
+}
+
+for (int i = 0; i < end_keys.size(); ++i) {
+
end_keys[i].add_value(cast_to_string(
+range.get_range_max_value(), 
range.scale()));
+}
+}
+new_begin_keys.insert(new_begin_keys.end(), 
begin_keys.begin(),
+  begin_keys.end());
+new_end_keys.insert(new_end_keys.end(), end_keys.begin(), 
end_keys.end());
+},
+ranges[i]);
+}
+_begin_scan_keys = new_begin_keys;
+_end_scan_keys = new_end_keys;
+return Status::OK();
+}
+
+OlapScanKeys OlapScanKeys::merge(size_t to_ranges_count) {
+OlapScanKeys merged;
+merged.set_is_convertible(_is_convertible);
+merged.set_max_scan_key_num(_max_scan_key_num);
+bool exact_value = false;
+for (size_t i = 0; i != _column_ranges.size(); ++i) {
+std::visit(
+[&](auto&& range) {
+if (i == _index_of_max_size_range) {
+return;
+}
+merged.extend_scan_key(range, &exact_value);
+},
+_column_ranges[i]);
+}
+
+size_t size_of_ranges = std::max(size_t(1), merged.size());
+size_t split_to_count = (to_ranges_count + size_of_ranges - 1) / 
size_of_ranges;
+std::vector splitted = std::visit(
+[&](auto&& range) {
+aut

[GitHub] [doris] dutyu opened a new issue, #12885: [Enhancement] auditloader plugin always discard audit log when clsuter is busy

2022-09-22 Thread GitBox


dutyu opened a new issue, #12885:
URL: https://github.com/apache/doris/issues/12885

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   I've installed the auditloader plugin and found that when the cluster is busy 
   (users submit many SQL statements), the doris_audit_tbl__ table is always 
   missing some audit logs that I can find in fe.audit.log. I've reviewed the 
   code and found that AuditLoaderPlugin uses a LinkedBlockingDeque whose 
   capacity is 1; if users submit many SQL statements, the `AuditLoaderPlugin.exec` 
   method always fails because the queue is full. Using a configuration option to 
   control the capacity of the queue may be an elegant way to handle this problem.
   
   ### Solution
   
   Use a configuration to control the capacity of the queue.
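   
   As a rough illustration of why a capacity of 1 drops events under load and how a configurable capacity helps, here is a minimal sketch. It is written in C++ rather than the plugin's Java, the `BoundedQueue` class and its `try_push` method are hypothetical names, and thread synchronization is deliberately omitted, so it is not the AuditLoaderPlugin implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical bounded queue: events are rejected (dropped) once the queue
// holds `capacity` items. Synchronization is left out to keep the sketch short.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns false when the queue is full, i.e. the event would be discarded.
    bool try_push(std::string event) {
        if (items_.size() >= capacity_) {
            return false;
        }
        items_.push_back(std::move(event));
        return true;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::string> items_;
};
```

   With a capacity of 1, the second event offered before the consumer drains the queue is rejected; with a larger, configurable capacity the same burst is absorbed.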
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] liaoxin01 opened a new pull request, #12886: [feature-wip](unique-key-merge-on-write) unique key with merge on write table support schema change

2022-09-22 Thread GitBox


liaoxin01 opened a new pull request, #12886:
URL: https://github.com/apache/doris/pull/12886

   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] dutyu opened a new pull request, #12887: [enhancement](AuditLoaderPlugin): add audit queue capacity configurat…

2022-09-22 Thread GitBox


dutyu opened a new pull request, #12887:
URL: https://github.com/apache/doris/pull/12887

   …ion and improve performance for datetime format.
   
   # Proposed changes
   
   Ease the audit log discard problem for auditloader plugin.
   
   Issue Number: close #12885 
   
   ## Problem summary
   
   See  #12885 
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [*] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [*] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [*] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [*] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [*] No
   
   
   





[GitHub] [doris] mrhhsg opened a new pull request, #12888: [bugfix](scanner) remove invalid of '[[noreturn]]'

2022-09-22 Thread GitBox


mrhhsg opened a new pull request, #12888:
URL: https://github.com/apache/doris/pull/12888

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] superhanliu2 commented on a diff in pull request #12837: Update vars.sh

2022-09-22 Thread GitBox


superhanliu2 commented on code in PR #12837:
URL: https://github.com/apache/doris/pull/12837#discussion_r977643300


##
thirdparty/vars.sh:
##
@@ -288,7 +288,7 @@ JEMALLOC_SOURCE="jemalloc-5.2.1"
 JEMALLOC_MD5SUM="3d41fbf006e6ebffd489bdb304d009ae"
 
 # cctz
-CCTZ_DOWNLOAD="https://github.com/google/cctz/archive/v2.3.tar.gz";
+CCTZ_DOWNLOAD="https://codeload.github.com/google/cctz/tar.gz/refs/tags/v2.3";

Review Comment:
   I checked again today and found that the old value works too. It may have 
   been a network problem yesterday, sorry. Please close this PR. Thanks a lot.






[GitHub] [doris] superhanliu2 commented on a diff in pull request #12837: Update vars.sh

2022-09-22 Thread GitBox


superhanliu2 commented on code in PR #12837:
URL: https://github.com/apache/doris/pull/12837#discussion_r977643300


##
thirdparty/vars.sh:
##
@@ -288,7 +288,7 @@ JEMALLOC_SOURCE="jemalloc-5.2.1"
 JEMALLOC_MD5SUM="3d41fbf006e6ebffd489bdb304d009ae"
 
 # cctz
-CCTZ_DOWNLOAD="https://github.com/google/cctz/archive/v2.3.tar.gz";
+CCTZ_DOWNLOAD="https://codeload.github.com/google/cctz/tar.gz/refs/tags/v2.3";

Review Comment:
   I checked again today and found that the old value works too. It may have 
   been a network problem yesterday, sorry. Please close this PR. Thanks a lot. 
   @jackwener @adonis0147 






[GitHub] [doris] xinyiZzz opened a new pull request, #12889: [branch-1.1-lts](cherry-pick) Some fixes for mem tracker

2022-09-22 Thread GitBox


xinyiZzz opened a new pull request, #12889:
URL: https://github.com/apache/doris/pull/12889

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   cherry-pick:
   https://github.com/apache/doris/pull/12666
   https://github.com/apache/doris/pull/12339
   https://github.com/apache/doris/pull/12682
   https://github.com/apache/doris/pull/12688
   https://github.com/apache/doris/pull/12708
   https://github.com/apache/doris/pull/12782
   https://github.com/apache/doris/pull/12776
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] xinyiZzz merged pull request #12889: [branch-1.1-lts](cherry-pick) Some fixes for mem tracker

2022-09-22 Thread GitBox


xinyiZzz merged PR #12889:
URL: https://github.com/apache/doris/pull/12889





[doris] branch branch-1.1-lts updated: [branch-1.1-lts](cherry-pick) Some fixes for mem tracker (#12889)

2022-09-22 Thread zouxinyi
This is an automated email from the ASF dual-hosted git repository.

zouxinyi pushed a commit to branch branch-1.1-lts
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/branch-1.1-lts by this push:
 new d3006ddd12 [branch-1.1-lts](cherry-pick) Some fixes for mem tracker (#12889)
d3006ddd12 is described below

commit d3006ddd121f5c89dfea8f38f192de7d03fe5dd4
Author: Xinyi Zou 
AuthorDate: Thu Sep 22 21:47:45 2022 +0800

[branch-1.1-lts](cherry-pick) Some fixes for mem tracker (#12889)

* [fix][memtracker] remove gc and fix print

* [fix](memory) Fix BE OOM when load -238 fail

* [fix](memtracker) Process physical mem check does not include tc/jemalloc 
allocator cache (#12688)

The tcmalloc/jemalloc allocator cache is no longer counted as part of the 
process physical memory in the mem check.

This is because new/malloc triggers the mem hook even when the request is 
served from the tcmalloc/jemalloc allocator cache, in which case physical 
memory may not actually be allocated, so failing in the mem hook is not expected.

In addition:

The value of the tcmalloc/jemalloc allocator cache is tracked by a mem tracker 
whose parent is the process mem tracker; it is updated every 1s.
Change the default process mem_limit to 90%, expecting the mem tracker to 
effectively limit the memory usage of the process.

* Fix memory leak by calling  in mem hook (#12708)

After the consume mem tracker exceeds the mem limit in the mem hook, the 
boost stacktrace will be printed. A query/load will only be printed once, and 
the process tracker will only be printed once per second.

After the process memory reaches the upper limit, the boost stacktrace will 
be printed every second. The observed phenomena are as follows:

After query/load is canceled, the memory increases instantly;
tcmalloc profile total physical memory is less than perf process memory;
The process mem tracker is smaller than the perf process memory;

* [fix](memtracker) Fix thread mem tracker try consume accuracy #12782

* [Bugfix](mem) Fix memory limit check may overflow (#12776)

This bug is because the result of subtracting signed and unsigned numbers 
may overflow if it is negative.
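
The class of bug described in the last item can be reproduced in a few lines. The sketch below is an illustration only: `has_room_naive` and `has_room_checked` are hypothetical names and do not come from the Doris mem tracker. It assumes a signed 64-bit limit and unsigned 64-bit usage counters, so that the mixed subtraction is carried out in unsigned arithmetic and wraps around instead of going negative.

```cpp
#include <cassert>
#include <cstdint>

// Naive check (hypothetical): `limit - used` mixes int64_t and uint64_t, so the
// signed operand is converted to unsigned. When used > limit the result wraps
// to a huge positive value and the check passes by mistake.
inline bool has_room_naive(int64_t limit, uint64_t used, uint64_t want) {
    return limit - used >= want;
}

// Checked variant (hypothetical): compare before subtracting so the
// subtraction can never wrap.
inline bool has_room_checked(int64_t limit, uint64_t used, uint64_t want) {
    if (limit < 0) {
        return false;  // a negative limit leaves no room in this sketch
    }
    const uint64_t ulimit = static_cast<uint64_t>(limit);
    if (used > ulimit) {
        return false;  // already over the limit
    }
    return ulimit - used >= want;
}
```

For example, with a limit of 100 and 150 already used, the naive check reports that 10 more bytes fit, while the checked variant rejects the request.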

Co-authored-by: Zhengguo Yang 
---
 be/src/common/config.h   |   2 +-
 be/src/common/daemon.cpp |   1 +
 be/src/http/default_path_handlers.cpp|   5 +-
 be/src/runtime/exec_env.h|   9 ++
 be/src/runtime/exec_env_init.cpp |   1 +
 be/src/runtime/load_channel.cpp  |   9 +-
 be/src/runtime/load_channel.h|   2 +-
 be/src/runtime/load_channel_mgr.cpp  |  10 +-
 be/src/runtime/load_channel_mgr.h|   2 +-
 be/src/runtime/memory/mem_tracker.cpp|   9 +-
 be/src/runtime/memory/mem_tracker_limiter.cpp| 136 ---
 be/src/runtime/memory/mem_tracker_limiter.h  | 134 +++---
 be/src/runtime/memory/thread_mem_tracker_mgr.cpp |  11 +-
 be/src/runtime/memory/thread_mem_tracker_mgr.h   |  18 +--
 be/src/runtime/tablets_channel.cpp   |   4 +
 be/src/service/doris_main.cpp|  10 +-
 be/src/util/mem_info.cpp |  15 ++-
 be/src/util/mem_info.h   |  37 +-
 be/src/util/perf_counters.cpp|   6 +
 be/src/util/perf_counters.h  |   6 +-
 be/src/util/system_metrics.cpp   |   3 +-
 21 files changed, 235 insertions(+), 195 deletions(-)

diff --git a/be/src/common/config.h b/be/src/common/config.h
index 7f1921d496..106609ee05 100644
--- a/be/src/common/config.h
+++ b/be/src/common/config.h
@@ -68,7 +68,7 @@ CONF_Int64(tc_max_total_thread_cache_bytes, "1073741824");
 // defaults to bytes if no unit is given"
 // must larger than 0. and if larger than physical memory size,
 // it will be set to physical memory size.
-CONF_String(mem_limit, "80%");
+CONF_String(mem_limit, "90%");
 
 // the port heartbeat service used
 CONF_Int32(heartbeat_service_port, "9050");
diff --git a/be/src/common/daemon.cpp b/be/src/common/daemon.cpp
index ea628bb100..bb39bf13ef 100644
--- a/be/src/common/daemon.cpp
+++ b/be/src/common/daemon.cpp
@@ -68,6 +68,7 @@ namespace doris {
 bool k_doris_exit = false;
 
 void Daemon::tcmalloc_gc_thread() {
+// TODO All cache GC wish to be supported
 while (!_stop_background_threads_latch.wait_for(MonoDelta::FromSeconds(10))) {
 size_t used_size = 0;
 size_t free_size = 0;
diff --git a/be/src/http/default_path_handlers.cpp b/be/src/http/default_path_handlers.cpp
index c7cdcd2ad8..3efed02a09 100644
--- a/be/src/http/default_path_handlers.cpp
+++ b/be/src/http/default_path_handlers.cpp
@@ -32,6 +32,7 @@
 #include "runtime/mem_tracker.h"
 #include "runtime/memory/mem_tracker_limiter.h"
 #include "util/d

[GitHub] [doris] jackwener opened a new pull request, #12890: [fix](Nereids): fix Outer LAsscom and improve onConditon checker

2022-09-22 Thread GitBox


jackwener opened a new pull request, #12890:
URL: https://github.com/apache/doris/pull/12890

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   - fix Outer LAsscom: the rule currently forgets to check the onCondition for Outer LAsscom
   - improve the onCondition checker.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [x] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [x] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [x] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [x] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] morrySnow closed pull request #12876: test bucket shuffle

2022-09-22 Thread GitBox


morrySnow closed pull request #12876: test bucket shuffle
URL: https://github.com/apache/doris/pull/12876





[GitHub] [doris] morrySnow opened a new pull request, #12891: [enhancement](Nereids) plan bucket shuffle join on fragment without scan node

2022-09-22 Thread GitBox


morrySnow opened a new pull request, #12891:
URL: https://github.com/apache/doris/pull/12891

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)

2022-09-22 Thread GitBox


zhannngchen commented on code in PR #12866:
URL: https://github.com/apache/doris/pull/12866#discussion_r977657612


##
be/src/olap/olap_server.cpp:
##
@@ -700,6 +708,19 @@ Status 
StorageEngine::submit_quick_compaction_task(TabletSharedPtr tablet) {
 return Status::OK();
 }
 
+Status StorageEngine::_handle_seg_compaction(BetaRowsetWriter* writer,
+ SegCompactionCandidatesSharedPtr 
segments) {
+writer->do_segcompaction(segments);
+return Status::OK();
+}
+
+Status StorageEngine::submit_seg_compaction_task(BetaRowsetWriter* writer,
+ 
SegCompactionCandidatesSharedPtr segments) {
+_seg_compaction_thread_pool->submit_func(

Review Comment:
   ditto



##
be/src/olap/olap_server.cpp:
##
@@ -700,6 +708,19 @@ Status 
StorageEngine::submit_quick_compaction_task(TabletSharedPtr tablet) {
 return Status::OK();
 }
 
+Status StorageEngine::_handle_seg_compaction(BetaRowsetWriter* writer,
+ SegCompactionCandidatesSharedPtr 
segments) {
+writer->do_segcompaction(segments);

Review Comment:
   should be `return writer->do_segcompaction(segments);` ?



##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const 
vectorized::Block* block) {
 return _add_block(block, &_segment_writer);
 }
 
+vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader(
+SegCompactionCandidatesSharedPtr segments, std::shared_ptr 
schema,
+OlapReaderStatistics* stat) {
+StorageReadOptions read_options;
+read_options.stats = stat;
+read_options.use_page_cache = false;
+read_options.tablet_schema = _context.tablet_schema;
+std::vector> seg_iterators;
+for (auto& seg_ptr : *segments) {
+std::unique_ptr iter;
+auto s = seg_ptr->new_iterator(*schema, read_options, &iter);
+if (!s.ok()) {
+LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << 
"]: " << s.to_string();

Review Comment:
   should return here?



##
be/src/common/config.h:
##
@@ -875,6 +878,12 @@ CONF_Bool(enable_new_load_scan_node, "false");
 // Temp config. True to use new file scanner. Will remove after fully test.
 CONF_Bool(enable_new_file_scanner, "false");
 
+CONF_Bool(enable_segcompaction, "false"); // currently only support vectorized 
storage
+// Trigger segcompaction if the num of segments in a rowset exceeds this 
threshold.
+CONF_Int32(segcompaction_threshold_segment_num, "10");
+
+CONF_Int32(segcompaction_small_threshold, "100");

Review Comment:
use 1048576 instead.



##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const 
vectorized::Block* block) {
 return _add_block(block, &_segment_writer);
 }
 
+vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader(
+SegCompactionCandidatesSharedPtr segments, std::shared_ptr 
schema,
+OlapReaderStatistics* stat) {
+StorageReadOptions read_options;
+read_options.stats = stat;
+read_options.use_page_cache = false;
+read_options.tablet_schema = _context.tablet_schema;
+std::vector> seg_iterators;
+for (auto& seg_ptr : *segments) {
+std::unique_ptr iter;
+auto s = seg_ptr->new_iterator(*schema, read_options, &iter);
+if (!s.ok()) {
+LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << 
"]: " << s.to_string();
+}
+seg_iterators.push_back(std::move(iter));
+}
+std::vector iterators;
+for (auto& owned_it : seg_iterators) {
+// transfer ownership
+iterators.push_back(owned_it.release());
+}
+bool is_unique = (_context.tablet_schema->keys_type() == UNIQUE_KEYS);
+bool is_reverse = false;
+auto merge_itr = vectorized::new_merge_iterator(iterators, -1, is_unique, 
is_reverse, nullptr);
+merge_itr->init(read_options);
+
+return (vectorized::VMergeIterator*)merge_itr;
+}
+
+std::unique_ptr 
BetaRowsetWriter::create_segcompaction_writer(
+uint64_t begin, uint64_t end) {
+Status status;
+std::unique_ptr writer = nullptr;
+status = _create_segment_writer_for_segcompaction(&writer, begin, end);
+if (status != Status::OK()) {
+writer = nullptr;
+LOG(ERROR) << "failed to create segment writer for begin:" << begin << 
" end:" << end
+   << " path:" << writer->get_data_dir()->path();
+}
+if (writer->get_data_dir())

Review Comment:
   `if (writer != nullptr && writer->get_data_dir())`
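   
   A minimal sketch of the guard suggested here, assuming a simplified stand-in type: `FakeWriter`, `make_writer` and `describe_writer` are hypothetical names, not the Doris segment writer API. The point is that after a failed creation the pointer is null, so both the error path and the success path must avoid dereferencing it unconditionally.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a segment writer that knows its data directory.
struct FakeWriter {
    std::string path;
};

// Hypothetical factory: returns nullptr when creation fails.
inline std::unique_ptr<FakeWriter> make_writer(bool ok) {
    if (!ok) {
        return nullptr;
    }
    auto writer = std::make_unique<FakeWriter>();
    writer->path = "/data/seg";
    return writer;
}

// Build the log message without touching a null writer.
inline std::string describe_writer(const std::unique_ptr<FakeWriter>& writer) {
    if (writer == nullptr) {
        return "failed to create segment writer";
    }
    return "writer created, path:" + writer->path;
}
```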



##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const 
vectorized::Block* block) {
 return _add_block(block, &_segment_writer);
 }
 
+vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_read

[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)

2022-09-22 Thread GitBox


zhannngchen commented on code in PR #12866:
URL: https://github.com/apache/doris/pull/12866#discussion_r977716456


##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const 
vectorized::Block* block) {
 return _add_block(block, &_segment_writer);
 }
 
+vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader(
+SegCompactionCandidatesSharedPtr segments, std::shared_ptr 
schema,
+OlapReaderStatistics* stat) {
+StorageReadOptions read_options;
+read_options.stats = stat;
+read_options.use_page_cache = false;
+read_options.tablet_schema = _context.tablet_schema;
+std::vector> seg_iterators;
+for (auto& seg_ptr : *segments) {
+std::unique_ptr iter;
+auto s = seg_ptr->new_iterator(*schema, read_options, &iter);
+if (!s.ok()) {
+LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << 
"]: " << s.to_string();
+}
+seg_iterators.push_back(std::move(iter));
+}
+std::vector iterators;
+for (auto& owned_it : seg_iterators) {
+// transfer ownership
+iterators.push_back(owned_it.release());
+}
+bool is_unique = (_context.tablet_schema->keys_type() == UNIQUE_KEYS);
+bool is_reverse = false;
+auto merge_itr = vectorized::new_merge_iterator(iterators, -1, is_unique, 
is_reverse, nullptr);
+merge_itr->init(read_options);
+
+return (vectorized::VMergeIterator*)merge_itr;
+}
+
+std::unique_ptr 
BetaRowsetWriter::create_segcompaction_writer(
+uint64_t begin, uint64_t end) {
+Status status;
+std::unique_ptr writer = nullptr;
+status = _create_segment_writer_for_segcompaction(&writer, begin, end);
+if (status != Status::OK()) {
+writer = nullptr;
+LOG(ERROR) << "failed to create segment writer for begin:" << begin << 
" end:" << end
+   << " path:" << writer->get_data_dir()->path();
+}
+if (writer->get_data_dir())
+LOG(INFO) << "segcompaction segment writer created for begin:" << 
begin << " end:" << end
+  << " path:" << writer->get_data_dir()->path();
+return writer;
+}
+
+Status BetaRowsetWriter::delete_original_segments(uint32_t begin, uint32_t 
end) {
+auto fs = _rowset_meta->fs();
+if (!fs) {
+return Status::OLAPInternalError(OLAP_ERR_INIT_FAILED);
+}
+for (uint32_t i = begin; i <= end; ++i) {
+auto seg_path = BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id, i);
+// Even if an error is encountered, these files that have not been 
cleaned up
+// will be cleaned up by the GC background. So here we only print the 
error
+// message when we encounter an error.
+WARN_IF_ERROR(fs->delete_file(seg_path),
+  strings::Substitute("Failed to delete file=$0", 
seg_path));
+}
+return Status::OK();
+}
+
+void BetaRowsetWriter::rename_compacted_segments(int64_t begin, int64_t end) {
+int ret;
+auto src_seg_path = 
BetaRowset::local_segment_path_segcompacted(_context.tablet_path,
+
_context.rowset_id, begin, end);
+auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id,
+   _num_segcompacted++);
+ret = rename(src_seg_path.c_str(), dst_seg_path.c_str());
+DCHECK_EQ(ret, 0);
+}
+
+// todo: will rename only do the job? maybe need deep modification
+void BetaRowsetWriter::rename_compacted_segment_plain(uint64_t seg_id) {
+int ret;
+auto src_seg_path =
+BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id, seg_id);
+auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id,
+   _num_segcompacted++);
+LOG(INFO) << "segcompaction skip this segment. rename " << src_seg_path << 
" to "
+  << dst_seg_path;
+if (src_seg_path.compare(dst_seg_path) != 0) {
+CHECK_EQ(_segid_statistics_map.find(seg_id + 1) == 
_segid_statistics_map.end(), false);

Review Comment:
   DCHECK_EQ






[GitHub] [doris] HappenLee opened a new pull request, #12892: [config](vec) control num free block by be config

2022-09-22 Thread GitBox


HappenLee opened a new pull request, #12892:
URL: https://github.com/apache/doris/pull/12892

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #12867: [Improvement](predicate) Replace for-loop by memcpy

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12867:
URL: https://github.com/apache/doris/pull/12867#issuecomment-1255102655

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12867: [Improvement](predicate) Replace for-loop by memcpy

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12867:
URL: https://github.com/apache/doris/pull/12867#issuecomment-1255102723

   PR approved by anyone and no changes requested.





[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)

2022-09-22 Thread GitBox


zhannngchen commented on code in PR #12866:
URL: https://github.com/apache/doris/pull/12866#discussion_r977743775


##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const 
vectorized::Block* block) {
 return _add_block(block, &_segment_writer);
 }
 
+vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader(
+SegCompactionCandidatesSharedPtr segments, std::shared_ptr 
schema,
+OlapReaderStatistics* stat) {
+StorageReadOptions read_options;
+read_options.stats = stat;
+read_options.use_page_cache = false;
+read_options.tablet_schema = _context.tablet_schema;
+std::vector> seg_iterators;
+for (auto& seg_ptr : *segments) {
+std::unique_ptr iter;
+auto s = seg_ptr->new_iterator(*schema, read_options, &iter);
+if (!s.ok()) {
+LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << 
"]: " << s.to_string();
+}
+seg_iterators.push_back(std::move(iter));
+}
+std::vector iterators;
+for (auto& owned_it : seg_iterators) {
+// transfer ownership
+iterators.push_back(owned_it.release());
+}
+bool is_unique = (_context.tablet_schema->keys_type() == UNIQUE_KEYS);
+bool is_reverse = false;
+auto merge_itr = vectorized::new_merge_iterator(iterators, -1, is_unique, 
is_reverse, nullptr);
+merge_itr->init(read_options);
+
+return (vectorized::VMergeIterator*)merge_itr;
+}
+
+std::unique_ptr 
BetaRowsetWriter::create_segcompaction_writer(
+uint64_t begin, uint64_t end) {
+Status status;
+std::unique_ptr writer = nullptr;
+status = _create_segment_writer_for_segcompaction(&writer, begin, end);
+if (status != Status::OK()) {
+writer = nullptr;
+LOG(ERROR) << "failed to create segment writer for begin:" << begin << 
" end:" << end
+   << " path:" << writer->get_data_dir()->path();
+}
+if (writer->get_data_dir())
+LOG(INFO) << "segcompaction segment writer created for begin:" << 
begin << " end:" << end
+  << " path:" << writer->get_data_dir()->path();
+return writer;
+}
+
+Status BetaRowsetWriter::delete_original_segments(uint32_t begin, uint32_t 
end) {
+auto fs = _rowset_meta->fs();
+if (!fs) {
+return Status::OLAPInternalError(OLAP_ERR_INIT_FAILED);
+}
+for (uint32_t i = begin; i <= end; ++i) {
+auto seg_path = BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id, i);
+// Even if an error is encountered, these files that have not been 
cleaned up
+// will be cleaned up by the GC background. So here we only print the 
error
+// message when we encounter an error.
+WARN_IF_ERROR(fs->delete_file(seg_path),
+  strings::Substitute("Failed to delete file=$0", 
seg_path));
+}
+return Status::OK();
+}
+
+void BetaRowsetWriter::rename_compacted_segments(int64_t begin, int64_t end) {
+int ret;
+auto src_seg_path = 
BetaRowset::local_segment_path_segcompacted(_context.tablet_path,
+
_context.rowset_id, begin, end);
+auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id,
+   _num_segcompacted++);
+ret = rename(src_seg_path.c_str(), dst_seg_path.c_str());
+DCHECK_EQ(ret, 0);
+}
+
+// todo: will rename only do the job? maybe need deep modification
+void BetaRowsetWriter::rename_compacted_segment_plain(uint64_t seg_id) {
+int ret;
+auto src_seg_path =
+BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id, seg_id);
+auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, 
_context.rowset_id,
+   _num_segcompacted++);
+LOG(INFO) << "segcompaction skip this segment. rename " << src_seg_path << 
" to "
+  << dst_seg_path;
+if (src_seg_path.compare(dst_seg_path) != 0) {
+CHECK_EQ(_segid_statistics_map.find(seg_id + 1) == 
_segid_statistics_map.end(), false);
+CHECK_EQ(_segid_statistics_map.find(_num_segcompacted) == 
_segid_statistics_map.end(),
+ true);
+statistics org = _segid_statistics_map[seg_id + 1];
+_segid_statistics_map.emplace(_num_segcompacted, org);
+clear_statistics_for_deleting_segments(seg_id, seg_id);
+ret = rename(src_seg_path.c_str(), dst_seg_path.c_str());
+DCHECK_EQ(ret, 0);
+}
+}
+
+void BetaRowsetWriter::clear_statistics_for_deleting_segments(uint64_t begin, 
uint64_t end) {
+LOG(INFO) << "_segid_statistics_map clear record segid range from:" << 
begin + 1
+  << " to:" << end + 1;
+for (int i = begin; i <= end; ++i) {
+_segid_statistics_map.erase(i + 1);
+}
+}
+
+Status B

[GitHub] [doris] BePPPower commented on a diff in pull request #12848: [feature-wip](new-scan)Add new jdbc scanner and new jdbc scan node

2022-09-22 Thread GitBox


BePPPower commented on code in PR #12848:
URL: https://github.com/apache/doris/pull/12848#discussion_r977757888


##
be/src/vec/exec/scan/new_jdbc_scan_node.cpp:
##
@@ -0,0 +1,62 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/exec/scan/new_jdbc_scan_node.h"
+#ifdef LIBJVM
+
+#include "vec/exec/scan/new_jdbc_scanner.h"
+#include "vec/exec/scan/vscanner.h"
+namespace doris::vectorized {
+NewJdbcScanNode::NewJdbcScanNode(ObjectPool* pool, const TPlanNode& tnode,
+ const DescriptorTbl& descs)
+: VScanNode(pool, tnode, descs),
+  _table_name(tnode.jdbc_scan_node.table_name),
+  _tuple_id(tnode.jdbc_scan_node.tuple_id),
+  _query_string(tnode.jdbc_scan_node.query_string) {
+_output_tuple_id = tnode.jdbc_scan_node.tuple_id;
+}
+
+std::string NewJdbcScanNode::get_name() {
+return fmt::format("VNewJdbcScanNode({0})", _table_name);
+}
+
+Status NewJdbcScanNode::prepare(RuntimeState* state) {
+VLOG_CRITICAL << "VNewJdbcScanNode::Prepare";
+RETURN_IF_ERROR(VScanNode::prepare(state));
+SCOPED_CONSUME_MEM_TRACKER(mem_tracker());

Review Comment:
   This seems unnecessary, because `VScanNode::prepare` has already done 
`SCOPED_CONSUME_MEM_TRACKER(mem_tracker())`.






[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)

2022-09-22 Thread GitBox


zhannngchen commented on code in PR #12866:
URL: https://github.com/apache/doris/pull/12866#discussion_r977705741


##
be/src/olap/rowset/beta_rowset_writer.cpp:
##
@@ -309,12 +641,23 @@ Status BetaRowsetWriter::_create_segment_writer(
 DCHECK(file_writer != nullptr);
 segment_v2::SegmentWriterOptions writer_options;
 writer_options.enable_unique_key_merge_on_write = 
_context.enable_unique_key_merge_on_write;
-writer->reset(new segment_v2::SegmentWriter(file_writer.get(), 
_num_segment,
-_context.tablet_schema, 
_context.data_dir,
-_context.max_rows_per_segment, 
writer_options));
-{
-std::lock_guard l(_lock);
-_file_writers.push_back(std::move(file_writer));
+
+if (is_segcompaction) {
+writer->reset(new segment_v2::SegmentWriter(file_writer.get(), 
_num_segcompacted + 1,

Review Comment:
   Is the only difference between these two branches the `segment_id` parameter?
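
   If the two branches really do differ only in the segment id, one way to avoid the duplication is to pick the id first and construct the writer once. The sketch below is only an illustration: `WriterArgs`, `pick_segment_id`, and `make_writer_args` are hypothetical names, not Doris APIs, and it assumes the non-segcompaction path keeps passing `_num_segment` as in the removed lines.

```cpp
#include <cstdint>

// Hypothetical stand-in for the arguments handed to the segment writer.
struct WriterArgs {
    uint32_t segment_id;
    uint32_t max_rows_per_segment;
};

// Choose the segment id up front so the construction code is written once.
// Assumption: segcompaction uses `num_segcompacted + 1`,
// the normal path uses `num_segment`.
inline uint32_t pick_segment_id(bool is_segcompaction, uint32_t num_segment,
                                uint32_t num_segcompacted) {
    return is_segcompaction ? num_segcompacted + 1 : num_segment;
}

// Single construction site shared by both paths.
inline WriterArgs make_writer_args(bool is_segcompaction, uint32_t num_segment,
                                   uint32_t num_segcompacted, uint32_t max_rows) {
    return WriterArgs{pick_segment_id(is_segcompaction, num_segment, num_segcompacted),
                      max_rows};
}
```

   Whether the other constructor arguments are truly identical in both branches cannot be confirmed from the quoted hunk, so this only applies if they are.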






[GitHub] [doris] BePPPower commented on a diff in pull request #12848: [feature-wip](new-scan)Add new jdbc scanner and new jdbc scan node

2022-09-22 Thread GitBox


BePPPower commented on code in PR #12848:
URL: https://github.com/apache/doris/pull/12848#discussion_r977757888


##
be/src/vec/exec/scan/new_jdbc_scan_node.cpp:
##
@@ -0,0 +1,62 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/exec/scan/new_jdbc_scan_node.h"
+#ifdef LIBJVM
+
+#include "vec/exec/scan/new_jdbc_scanner.h"
+#include "vec/exec/scan/vscanner.h"
+namespace doris::vectorized {
+NewJdbcScanNode::NewJdbcScanNode(ObjectPool* pool, const TPlanNode& tnode,
+ const DescriptorTbl& descs)
+: VScanNode(pool, tnode, descs),
+  _table_name(tnode.jdbc_scan_node.table_name),
+  _tuple_id(tnode.jdbc_scan_node.tuple_id),
+  _query_string(tnode.jdbc_scan_node.query_string) {
+_output_tuple_id = tnode.jdbc_scan_node.tuple_id;
+}
+
+std::string NewJdbcScanNode::get_name() {
+return fmt::format("VNewJdbcScanNode({0})", _table_name);
+}
+
+Status NewJdbcScanNode::prepare(RuntimeState* state) {
+VLOG_CRITICAL << "VNewJdbcScanNode::Prepare";
+RETURN_IF_ERROR(VScanNode::prepare(state));
+SCOPED_CONSUME_MEM_TRACKER(mem_tracker());

Review Comment:
   This seems unnecessary? `VScanNode::prepare` has already done 
`SCOPED_CONSUME_MEM_TRACKER(mem_tracker())`.






[GitHub] [doris] morningman opened a new pull request, #12893: [improvement](load) support loading data with missing column

2022-09-22 Thread GitBox


morningman opened a new pull request, #12893:
URL: https://github.com/apache/doris/pull/12893

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   This PR is based on #11742 and adds Arrow reader support.
   If the table has 5 columns and the file has only 4,
   the load can still finish; the missing column is loaded as a default null column.
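
   As a rough illustration of the behavior described above (not the actual reader code), the sketch below fills every table column that the file does not provide with a null value. `fill_missing_columns` is a hypothetical helper, and `std::nullopt` stands in for a default NULL; it assumes the missing column is nullable.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Build one output row in table-column order. Columns that the file does not
// provide are filled with std::nullopt, standing in for a default NULL value.
inline std::vector<std::optional<std::string>> fill_missing_columns(
        const std::vector<std::string>& table_columns,
        const std::map<std::string, std::string>& file_row) {
    std::vector<std::optional<std::string>> out;
    out.reserve(table_columns.size());
    for (const auto& name : table_columns) {
        auto it = file_row.find(name);
        if (it != file_row.end()) {
            out.emplace_back(it->second);   // column present in the file
        } else {
            out.emplace_back(std::nullopt); // column missing: load NULL
        }
    }
    return out;
}
```

   With a 5-column table and a 4-column file row, the result has 5 entries and only the missing column is empty.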
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] englefly opened a new pull request, #12894: [feature](nereids) extract single table expression for push down

2022-09-22 Thread GitBox


englefly opened a new pull request, #12894:
URL: https://github.com/apache/doris/pull/12894

   # Proposed changes
   In TPC-H Q7, we have an expression like
   ```
   (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
   or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
   ```
   This expression implies
   `(n1.n_name = 'FRANCE' or n1.n_name = 'GERMANY')`
   The implied expression is logically redundant, but it can reduce the number of 
tuples output by scan(n1) if Nereids pushes it down.
   
   This PR introduces a rule to extract such expressions.
   NOTE:
   1. We only extract expressions on a single table.
   2. If the extracted expression cannot be pushed down (e.g. it is on the right 
table of a left outer join), we need another rule to remove all the useless 
expressions.
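
   The extraction idea can be sketched as follows. This is a minimal C++ model, not the Nereids rule itself (which lives in the Java frontend): `Pred`, `Conjunction`, `Disjunction`, and `extract_single_table` are hypothetical names, and only equality predicates are modeled. It relies on the fact that `(A1 and B1) or (A2 and B2)` implies `A1 or A2`.

```cpp
#include <optional>
#include <string>
#include <vector>

// A simple equality predicate `table.column = 'value'` (hypothetical model).
struct Pred {
    std::string table;
    std::string column;
    std::string value;
};

using Conjunction = std::vector<Pred>;        // predicates joined by AND
using Disjunction = std::vector<Conjunction>; // conjunctions joined by OR

// For `table`, keep from every OR-branch only the predicates on that table.
// If some branch has no predicate on `table`, that branch places no constraint
// on it, so nothing useful is implied and std::nullopt is returned.
inline std::optional<Disjunction> extract_single_table(const Disjunction& expr,
                                                       const std::string& table) {
    Disjunction implied;
    for (const auto& branch : expr) {
        Conjunction kept;
        for (const auto& p : branch) {
            if (p.table == table) kept.push_back(p);
        }
        if (kept.empty()) return std::nullopt;
        implied.push_back(kept);
    }
    if (implied.empty()) return std::nullopt;
    return implied;
}
```

   For the Q7 expression, extracting on `n1` yields two branches, `n_name = 'FRANCE'` and `n_name = 'GERMANY'`, while extracting on a table that some branch never mentions yields nothing.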
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] zhannngchen closed pull request #12862: [debug](test)a test pr for qa pipeline debug, will not merge

2022-09-22 Thread GitBox


zhannngchen closed pull request #12862: [debug](test)a test pr for qa pipeline 
debug, will not merge
URL: https://github.com/apache/doris/pull/12862





[GitHub] [doris] dinggege1024 opened a new issue, #12895: [Enhancement] spark load support ORC format table

2022-09-22 Thread GitBox


dinggege1024 opened a new issue, #12895:
URL: https://github.com/apache/doris/issues/12895

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Description
   
   So far, Doris Spark Load does not support ORC format files. I would like to 
help with this.
   
   Is there anything I need to pay attention to?
   
![image](https://user-images.githubusercontent.com/109070189/191785733-c88d9f6f-f029-4029-be8c-bb2f635bb08a.png)
   
   
   ### Solution
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   





[GitHub] [doris] sahilm-10 commented on issue #11706: Good First Issue

2022-09-22 Thread GitBox


sahilm-10 commented on issue #11706:
URL: https://github.com/apache/doris/issues/11706#issuecomment-1255198436

   @luzhijing I am interested in DOCS & BLOGS TRANSLATION. I am new to 
open source; if there are any posts still unassigned, please assign one to me. I 
want to contribute.





[GitHub] [doris] HappenLee commented on pull request #12892: [config](vec) control num free block by be config

2022-09-22 Thread GitBox


HappenLee commented on PR #12892:
URL: https://github.com/apache/doris/pull/12892#issuecomment-1255210036

   > please also modify free block in `src/vec/exec/scan/scanner_context.cpp`
   
   This is just a test PR on the `opt_perf` branch; if it proves effective, I will 
create a new PR to do this on the `master` branch.





[GitHub] [doris] wsjz opened a new pull request, #12896: [feature-wip](parquet-reader) refactor parquet_predicate

2022-09-22 Thread GitBox


wsjz opened a new pull request, #12896:
URL: https://github.com/apache/doris/pull/12896

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Describe your changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
   - [ ] Yes
   - [ ] No
   - [ ] I don't know
   2. Has unit tests been added:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   3. Has document been added or modified:
   - [ ] Yes
   - [ ] No
   - [ ] No Need
   4. Does it need to update dependencies:
   - [ ] Yes
   - [ ] No
   5. Are there any changes that cannot be rolled back:
   - [ ] Yes (If Yes, please explain WHY)
   - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   





[GitHub] [doris] github-actions[bot] commented on pull request #12838: [Bug](view) Show create view support comment

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12838:
URL: https://github.com/apache/doris/pull/12838#issuecomment-1255403424

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12838: [Bug](view) Show create view support comment

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12838:
URL: https://github.com/apache/doris/pull/12838#issuecomment-1255403458

   PR approved by anyone and no changes requested.





[GitHub] [doris] github-actions[bot] commented on pull request #12820: [fix](streamload&sink) release and allocate memory in the same tracker

2022-09-22 Thread GitBox


github-actions[bot] commented on PR #12820:
URL: https://github.com/apache/doris/pull/12820#issuecomment-1255513860

   PR approved by at least one committer and no changes requested.





[GitHub] [doris] github-actions[bot] closed pull request #8603: fix string default value bug

2022-09-22 Thread GitBox


github-actions[bot] closed pull request #8603: fix string default value bug
URL: https://github.com/apache/doris/pull/8603





[GitHub] [doris] yiguolei merged pull request #12870: [Bug](date)(1.1-lts) Fix wrong result produced by date function

2022-09-22 Thread GitBox


yiguolei merged PR #12870:
URL: https://github.com/apache/doris/pull/12870





[doris] branch branch-1.1-lts updated: [Bug](date) Fix wrong result produced by date function (#12870)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-1.1-lts
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/branch-1.1-lts by this push:
 new 97e51a11e0 [Bug](date) Fix wrong result produced by date function 
(#12870)
97e51a11e0 is described below

commit 97e51a11e068dc44e7390e9e66279d4858e188a0
Author: Gabriel 
AuthorDate: Fri Sep 23 08:50:36 2022 +0800

[Bug](date) Fix wrong result produced by date function (#12870)
---
 .../src/main/java/org/apache/doris/analysis/DateLiteral.java  | 6 +-
 .../src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java   | 8 
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java 
b/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java
index 9de0ae375d..5531936bd0 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java
@@ -387,7 +387,11 @@ public class DateLiteral extends LiteralExpr {
 
 @Override
 public long getLongValue() {
-return (year * 10000 + month * 100 + day) * 1000000L + hour * 10000 + minute * 100 + second;
+if (this.getType().isDate()) {
+return year * 10000 + month * 100 + day;
+} else {
+return (year * 10000 + month * 100 + day) * 1000000L + hour * 10000 + minute * 100 + second;
+}
+}
 }
 
 @Override
diff --git 
a/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java 
b/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java
index af93c5ceda..66d7920171 100644
--- a/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java
+++ b/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java
@@ -98,22 +98,22 @@ public class FEFunctionsTest {
 @Test
 public void dateAddTest() throws AnalysisException {
 DateLiteral actualResult = FEFunctions.dateAdd(new 
DateLiteral("2018-08-08", Type.DATE), new IntLiteral(1));
-DateLiteral expectedResult = new DateLiteral("2018-08-09 00:00:00", 
Type.DATETIME);
+DateLiteral expectedResult = new DateLiteral("2018-08-09", Type.DATE);
 Assert.assertEquals(expectedResult, actualResult);
 
 actualResult = FEFunctions.dateAdd(new DateLiteral("2018-08-08", 
Type.DATE), new IntLiteral(-1));
-expectedResult = new DateLiteral("2018-08-07 00:00:00", Type.DATETIME);
+expectedResult = new DateLiteral("2018-08-07", Type.DATE);
 Assert.assertEquals(expectedResult, actualResult);
 }
 
 @Test
 public void addDateTest() throws AnalysisException {
 DateLiteral actualResult = FEFunctions.addDate(new 
DateLiteral("2018-08-08", Type.DATE), new IntLiteral(1));
-DateLiteral expectedResult = new DateLiteral("2018-08-09 00:00:00", 
Type.DATETIME);
+DateLiteral expectedResult = new DateLiteral("2018-08-09", Type.DATE);
 Assert.assertEquals(expectedResult, actualResult);
 
 actualResult = FEFunctions.addDate(new DateLiteral("2018-08-08", 
Type.DATE), new IntLiteral(-1));
-expectedResult = new DateLiteral("2018-08-07 00:00:00", Type.DATETIME);
+expectedResult = new DateLiteral("2018-08-07", Type.DATE);
 Assert.assertEquals(expectedResult, actualResult);
 
 }
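
The `getLongValue()` change in this commit can be illustrated with a small sketch. It is written in C++ purely as an illustration of the Java logic and assumes the conventional decimal packing, `YYYYMMDD` for DATE and `YYYYMMDDHHMMSS` for DATETIME (multipliers 10000 and 1000000). `DateParts`, `pack_date`, and `pack_datetime` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical plain struct mirroring the fields used by getLongValue().
struct DateParts {
    int64_t year, month, day, hour, minute, second;
};

// DATE packs to YYYYMMDD.
inline int64_t pack_date(const DateParts& d) {
    return d.year * 10000 + d.month * 100 + d.day;
}

// DATETIME packs to YYYYMMDDHHMMSS.
inline int64_t pack_datetime(const DateParts& d) {
    return pack_date(d) * 1000000LL + d.hour * 10000 + d.minute * 100 + d.second;
}
```

Under that assumption, a DATE literal for 2018-08-09 packs to 20180809 instead of the DATETIME form 20180809000000, which matches the updated test expectations that compare against `Type.DATE` literals.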





[GitHub] [doris] yiguolei merged pull request #12869: [Bug](date)(1.1-lts) Fix wrong type in TimestampArithmeticExpr

2022-09-22 Thread GitBox


yiguolei merged PR #12869:
URL: https://github.com/apache/doris/pull/12869





[doris] branch branch-1.1-lts updated: [Bug](date) Fix wrong type in TimestampArithmeticExpr (#12869)

2022-09-22 Thread yiguolei
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-1.1-lts
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/branch-1.1-lts by this push:
 new 7b7e61d8c7 [Bug](date) Fix wrong type in TimestampArithmeticExpr (#12869)
7b7e61d8c7 is described below

commit 7b7e61d8c7e24f8f99595dc6f8c4f4b63ef4815b
Author: Gabriel 
AuthorDate: Fri Sep 23 08:51:31 2022 +0800

[Bug](date) Fix wrong type in TimestampArithmeticExpr (#12869)
---
 .../apache/doris/analysis/TimestampArithmeticExpr.java  | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java
index c04f16c6b8..26b0a82425 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java
@@ -213,8 +213,21 @@ public class TimestampArithmeticExpr extends Expr {
 (op == ArithmeticExpr.Operator.ADD) ? "ADD" : "SUB");
 }
 
-fn = getBuiltinFunction(analyzer, funcOpName.toLowerCase(),
-collectChildReturnTypes(), Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF);
+Type[] childrenTypes = collectChildReturnTypes();
+fn = getBuiltinFunction(funcOpName.toLowerCase(), childrenTypes,
+Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF);
+Preconditions.checkArgument(fn != null);
+Type[] argTypes = fn.getArgs();
+if (argTypes.length > 0) {
+// Implicitly cast all the children to match the function if necessary
+for (int i = 0; i < childrenTypes.length; ++i) {
+// For varargs, we must compare with the last type in callArgs.argTypes.
+int ix = Math.min(argTypes.length - 1, i);
+if (!childrenTypes[i].matchesType(argTypes[ix]) && !(
+childrenTypes[i].isDateType() && argTypes[ix].isDateType())) {
+uncheckedCastChild(argTypes[ix], i);
+}
+}
 LOG.debug("fn is {} name is {}", fn, funcOpName);
 }
 
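
The loop added in this hunk makes two decisions per child: which declared argument type to compare against (clamping the index so extra varargs children map to the last declared type), and whether a cast is needed (skipped when both sides are date types). A minimal sketch of that logic follows. `SimpleType`, `declaredIndexFor`, and `needsCast` are hypothetical names, plain enum equality stands in for Doris's `matchesType`, and treating only DATE and DATETIME as date types is an assumption made for illustration.

```java
final class VarargsCastSketch {
    // Hypothetical stand-in for Doris's Type; not the real class.
    enum SimpleType {
        DATE, DATETIME, INT, VARCHAR;

        // Assumption: only DATE and DATETIME count as date types here.
        boolean isDateType() {
            return this == DATE || this == DATETIME;
        }
    }

    // Children past the end of the declared list are compared with the
    // last declared type, as in Math.min(argTypes.length - 1, i).
    static int declaredIndexFor(int declaredCount, int childIndex) {
        return Math.min(declaredCount - 1, childIndex);
    }

    // A cast is needed when the types differ, unless both are date types.
    static boolean needsCast(SimpleType child, SimpleType declared) {
        return child != declared && !(child.isDateType() && declared.isDateType());
    }

    public static void main(String[] args) {
        SimpleType[] declared = {SimpleType.DATETIME, SimpleType.INT};
        SimpleType[] children = {SimpleType.DATE, SimpleType.VARCHAR};
        for (int i = 0; i < children.length; i++) {
            int ix = declaredIndexFor(declared.length, i);
            System.out.println(children[i] + " vs " + declared[ix]
                    + " -> cast needed: " + needsCast(children[i], declared[ix]));
        }
    }
}
```

In this sketch a DATE child compared against a declared DATETIME argument is left uncast, which matches the intent stated in the commit title: the child keeps its own date type instead of being widened.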





[GitHub] [doris] yiguolei merged pull request #12873: [feature](outfile)(1.1-lts) support parquet writer

2022-09-22 Thread GitBox


yiguolei merged PR #12873:
URL: https://github.com/apache/doris/pull/12873






