[GitHub] [doris] adonis0147 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
adonis0147 commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977277813

## be/src/vec/columns/column_dictionary.h:
##
@@ -360,40 +362,58 @@ class ColumnDictionary final : public COWHelper> {
 if (code >= 0) {
 return code;
 }
-auto bound = std::upper_bound(_dict_data.begin(), _dict_data.end(), value) -
- _dict_data.begin();
+auto bound = std::upper_bound(_dict_data->begin(), _dict_data->end(), value) -
+ _dict_data->begin();
 return greater ? bound - greater + eq : bound - eq;
 }

 void find_codes(const phmap::flat_hash_set& values, std::vector& selected) const {
-size_t dict_word_num = _dict_data.size();
+size_t dict_word_num = _dict_data->size();
 selected.resize(dict_word_num);
 selected.assign(dict_word_num, false);
-for (const auto& value : values) {
-if (auto it = _inverted_index.find(value); it != _inverted_index.end()) {
-selected[it->second] = true;
+for (size_t i = 0; i < _dict_data->size(); i++) {
+if (values.find((*_dict_data)[i]) != values.end()) {
+selected[i] = true;
 }
 }
 }

 void clear() {
-_dict_data.clear();
-_inverted_index.clear();
-_code_convert_table.clear();
+_dict_data->clear();
 _hash_values.clear();
 }

 void clear_hash_values() { _hash_values.clear(); }

 void sort() {
-size_t dict_size = _dict_data.size();
-_code_convert_table.reserve(dict_size);
-std::sort(_dict_data.begin(), _dict_data.end(), _comparator);
+size_t dict_size = _dict_data->size();
+
+_perm.resize(dict_size);
+for (size_t i = 0; i < dict_size; ++i) {
+_perm[i] = i;
+}
+
+struct Comparator {
+public:
+Comparator(DictContainer& dict_data) : _dict_data(dict_data) {}
+bool operator()(const size_t a, const size_t b) const {
+return _comparator(_dict_data[a], _dict_data[b]);
+}
+
+private:
+StringValue::Comparator _comparator;
+DictContainer& _dict_data;
+};
+Comparator comparator(*_dict_data);
+std::sort(_perm.begin(), _perm.end(), comparator);

Review Comment:
```suggestion
std::sort(_perm.begin(), _perm.end(), [&dict_data = *_dict_data, &comparator = _comparator](const size_t a, const size_t b) {
return comparator(dict_data[a], dict_data[b]);
});
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org
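Editor's note: the review suggestion above replaces a hand-written comparator struct with a lambda that sorts a permutation of indices rather than the dictionary itself. A minimal, self-contained sketch of that technique follows. It assumes a plain `std::vector<std::string>` in place of Doris's `DictContainer` and uses `operator<` in place of `StringValue::Comparator`; the helper name `sorted_permutation` is hypothetical and not part of the PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: returns the indices of `dict`, ordered by the strings
// they refer to. `dict` itself is left untouched.
std::vector<size_t> sorted_permutation(const std::vector<std::string>& dict) {
    std::vector<size_t> perm(dict.size());
    // Start from the identity permutation 0, 1, 2, ...
    std::iota(perm.begin(), perm.end(), size_t{0});
    // The lambda captures the container by reference and compares the
    // elements the two indices point at.
    std::sort(perm.begin(), perm.end(),
              [&dict](size_t a, size_t b) { return dict[a] < dict[b]; });
    return perm;
}
```

Because only the index vector is reordered, codes that already refer to positions in the dictionary stay valid, which appears to be the motivation for introducing `_perm` in the diff.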
[GitHub] [doris] yiguolei opened a new pull request, #12857: [bugfix](scanner) olap scanner compute is wrong
yiguolei opened a new pull request, #12857:
URL: https://github.com/apache/doris/pull/12857

# Proposed changes

This issue was introduced by https://github.com/apache/doris/pull/8096: the operator precedence is wrong, so in some cases many scanners are created.

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
- [ ] Yes
- [ ] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [ ] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [ ] No
- [ ] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [ ] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] adonis0147 commented on pull request #12691: [chore](thirdparty) Support third-party incremental build
adonis0147 commented on PR #12691:
URL: https://github.com/apache/doris/pull/12691#issuecomment-1254625331

> Hi, @adonis0147 Thanks for your feedback, using MD5 for the incremental build is a generic idea, however, there is another problem to resolve -- how to manage the MD5 list? It seems that we still need to update the MD5 list manually, can you point out how it works in detail?

We already have the MD5 list in [thirdparty/vars.sh](https://github.com/apache/doris/blob/master/thirdparty/vars.sh). We update this file whenever we update the third-party packages. Therefore, we can write the MD5 to a file at the end of each `build_xxx` function.

> And, there is another case that sometimes Doris developers have to build **specific third-parties in a specific order** when some dependencies are updated and they require specific build order (one may rely on another, e.g. brpc relies on protobuf), it seems hard to resolve this problem by updating nothing but the `build-thirdparty.sh`?

This problem is unavoidable with either approach (MD5 or version counter) if we want to support incremental installation. We should sort out the dependency tree in our build script first, because it is hard for a developer to work out the dependencies when they want to upgrade only one specific package.
[GitHub] [doris] sohardforaname opened a new pull request, #12858: [Improve](Nereids)Optimize planner
sohardforaname opened a new pull request, #12858:
URL: https://github.com/apache/doris/pull/12858

# Proposed changes

Issue Number: close #xxx

## Problem summary

optimize planner

## Checklist(Required)

1. Does it affect the original behavior:
- [ ] Yes
- [x] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [x] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [x] No
- [ ] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [x] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [x] No
[GitHub] [doris] hf200012 opened a new pull request, #12859: Replace jvm's garbage collector CMS with G1
hf200012 opened a new pull request, #12859:
URL: https://github.com/apache/doris/pull/12859

Replace the JVM's CMS garbage collector with G1. In our testing, overall performance was better than with CMS.

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
- [ ] Yes
- [x] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [x] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [x] No
- [ ] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [x] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [x] No
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
BiteTheDDDDt commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977291889

## be/src/vec/columns/column_dictionary.h:
##
@@ -192,11 +192,13 @@ class ColumnDictionary final : public COWHelper> {
 Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* col_ptr) override {
 auto* res_col = reinterpret_cast(col_ptr);
+res_col->get_offsets().reserve(sel_size);
+res_col->get_chars().reserve(_dict.avg_str_len() * sel_size);
 for (size_t i = 0; i < sel_size; i++) {
 uint16_t n = sel[i];
 auto& code = reinterpret_cast(_codes[n]);
 auto value = _dict.get_value(code);
-res_col->insert_data(value.ptr, value.len);
+res_col->insert_data_without_reserve(value.ptr, value.len);

Review Comment:
I think `_dict.avg_str_len() * sel_size` may be less than the total length of the selected elements.
[GitHub] [doris] ReganHoo commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
ReganHoo commented on issue #11024:
URL: https://github.com/apache/doris/issues/11024#issuecomment-1254640547

>
[GitHub] [doris] ReganHoo closed issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
ReganHoo closed issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
URL: https://github.com/apache/doris/issues/11024
[GitHub] [doris] ReganHoo commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
ReganHoo commented on issue #11024:
URL: https://github.com/apache/doris/issues/11024#issuecomment-1254640922

> I also encountered this issue. Did you fix it? @ReganHoo

Updating Doris to version 1.1.2 solves this problem.
[GitHub] [doris] hf200012 closed pull request #12859: Replace jvm's garbage collector CMS with G1
hf200012 closed pull request #12859: Replace jvm's garbage collector CMS with G1
URL: https://github.com/apache/doris/pull/12859
[GitHub] [doris] yiguolei merged pull request #12846: [chore](build) add optiuon to disable -frecord-gcc-switches
yiguolei merged PR #12846:
URL: https://github.com/apache/doris/pull/12846
[doris] branch master updated: [chore](build) add option to disable -frecord-gcc-switches (#12846)
This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
new 8fcd8ed8b3 [chore](build) add option to disable -frecord-gcc-switches (#12846)
8fcd8ed8b3 is described below

commit 8fcd8ed8b32868858437f8c973af6b70322176f2
Author: Zhengguo Yang
AuthorDate: Thu Sep 22 15:38:14 2022 +0800

[chore](build) add option to disable -frecord-gcc-switches (#12846)
---
be/CMakeLists.txt | 6 +-
build.sh | 4
2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/be/CMakeLists.txt b/be/CMakeLists.txt
index 2fc57ecf3c..094ebc4c3d 100644
--- a/be/CMakeLists.txt
+++ b/be/CMakeLists.txt
@@ -410,7 +410,7 @@ check_function_exists(sched_getcpu HAVE_SCHED_GETCPU)
 # -pthread: enable multithreaded malloc
 # -DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG: enable nanosecond precision for boost
 # -fno-omit-frame-pointers: Keep frame pointer for functions in register
-set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -frecord-gcc-switches -Wall -Wno-sign-compare -pthread -Werror")
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -Wall -Wno-sign-compare -pthread -Werror")
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -fstrict-aliasing -fno-omit-frame-pointer")
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -std=gnu++17 -D__STDC_FORMAT_MACROS")
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG")
@@ -418,6 +418,10 @@ set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -DBOOST_SYSTEM_NO_DEPRECATED")
 # Enable the cpu and heap profile of brpc
 set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -DBRPC_ENABLE_CPU_PROFILER")

+if (RECORD_COMPILER_SWITCHES)
+set(CXX_COMMON_FLAGS "${CXX_COMMON_FLAGS} -frecord-gcc-switches")
+endif()
+
 function(TRY_TO_CHANGE_LINKER LINKER_COMMAND LINKER_NAME)
 if (CUSTUM_LINKER_COMMAND STREQUAL "ld")
 execute_process(COMMAND ${CMAKE_C_COMPILER} -fuse-ld=${LINKER_COMMAND} -Wl,--version ERROR_QUIET OUTPUT_VARIABLE LD_VERSION)
diff --git a/build.sh b/build.sh
index 661cac059d..dc98cbce5b 100755
--- a/build.sh
+++ b/build.sh
@@ -270,6 +270,10 @@ if [[ -z "${USE_DWARF}" ]]; then
 USE_DWARF='OFF'
 fi

+if [[ -z "${RECORD_COMPILER_SWITCHES}" ]]; then
+RECORD_COMPILER_SWITCHES='OFF'
+fi
+
 echo "Get params:
 BUILD_FE-- ${BUILD_FE}
 BUILD_BE-- ${BUILD_BE}
[GitHub] [doris] Gabriel39 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
Gabriel39 commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977308937

## be/src/vec/columns/column_dictionary.h:
##
@@ -192,11 +192,13 @@ class ColumnDictionary final : public COWHelper> {
 Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* col_ptr) override {
 auto* res_col = reinterpret_cast(col_ptr);
+res_col->get_offsets().reserve(sel_size);
+res_col->get_chars().reserve(_dict.avg_str_len() * sel_size);
 for (size_t i = 0; i < sel_size; i++) {
 uint16_t n = sel[i];
 auto& code = reinterpret_cast(_codes[n]);
 auto value = _dict.get_value(code);
-res_col->insert_data(value.ptr, value.len);
+res_col->insert_data_without_reserve(value.ptr, value.len);

Review Comment:
If so, `chars` in ColumnString will still reserve a bigger memory block. `_dict.avg_str_len() * sel_size` is just a conservative estimate here.
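Editor's note: the exchange above is about reserving output capacity from an average-length estimate that may undershoot the real total. The sketch below illustrates the general idea with standard containers only. `StringColumnSketch` and `gather` are hypothetical names, and the sketch relies on `std::string::append` growing the buffer on demand; whether Doris's `insert_data_without_reserve` behaves the same way is not shown in this thread.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a string column: concatenated bytes plus the end
// offset of each value. This is not Doris's ColumnString.
struct StringColumnSketch {
    std::string chars;
    std::vector<size_t> offsets;
};

// Copies dict[sel[i]] into `out`, reserving capacity up front from an
// average-length estimate instead of growing on every insert.
void gather(const std::vector<std::string>& dict, const std::vector<size_t>& sel,
            StringColumnSketch& out) {
    size_t total = 0;
    for (const auto& s : dict) total += s.size();
    const size_t avg_len = dict.empty() ? 0 : total / dict.size();

    out.offsets.reserve(out.offsets.size() + sel.size());
    // The estimate can be smaller than the real total when long strings are
    // selected more often than short ones.
    out.chars.reserve(out.chars.size() + avg_len * sel.size());
    for (size_t idx : sel) {
        out.chars.append(dict[idx]);  // reallocates only if the estimate was too small
        out.offsets.push_back(out.chars.size());
    }
}
```

In this sketch an undershooting estimate costs extra reallocations but never correctness, which matches the "conservative estimate" framing in the reply.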
[GitHub] [doris] xiaokang opened a new pull request, #12860: [bugfix])(function)return error instead of crash be for unsupported CAST
xiaokang opened a new pull request, #12860:
URL: https://github.com/apache/doris/pull/12860

# Proposed changes

Issue Number: close #xxx

## Problem summary

For an unsupported CAST, add `create_unsupport_wrapper`, which returns `Status::InvalidArgument` instead of calling `LOG(FATAL)`, so that the BE does not crash.

## Checklist(Required)

1. Does it affect the original behavior:
- [x] Yes
- [ ] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [ ] No
- [x] No Need
3. Has document been added or modified:
- [ ] Yes
- [ ] No
- [x] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [x] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [x] No
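Editor's note: the PR above describes returning an error status from the wrapper for an unsupported CAST instead of aborting the process. A rough sketch of that pattern is shown below. The `Status` struct, the `CastWrapper` alias, and `make_unsupported_cast_wrapper` are all hypothetical stand-ins written for this illustration; they are not Doris's actual types or the code in the PR.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical minimal status type (NOT Doris's Status class).
struct Status {
    bool ok;
    std::string msg;
    static Status OK() { return {true, ""}; }
    static Status InvalidArgument(std::string m) { return {false, std::move(m)}; }
};

// Hypothetical wrapper signature: a callable that performs the cast.
using CastWrapper = std::function<Status()>;

// Instead of aborting when the cast is looked up, hand back a wrapper that
// reports the problem when it is executed, so the caller can fail the query.
CastWrapper make_unsupported_cast_wrapper(const std::string& from, const std::string& to) {
    return [from, to]() {
        return Status::InvalidArgument("unsupported CAST from " + from + " to " + to);
    };
}
```

The design point is that the error travels back through the normal return path, so a single bad query is rejected while the server process keeps running.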
[GitHub] [doris] jackwener commented on a diff in pull request #12858: [Improve](Nereids)Optimize planner
jackwener commented on code in PR #12858:
URL: https://github.com/apache/doris/pull/12858#discussion_r977328537

## fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostEstimate.java:
##
@@ -90,11 +90,27 @@ public static CostEstimate ofMemory(double memoryCost) {
 /**
 * Sums partial cost estimates of some (single) plan node.
 */
+@Deprecated

Review Comment:
Don't rename it; just remove it.
[GitHub] [doris] github-actions[bot] commented on pull request #12857: [bugfix](scanner) olap scanner compute is wrong
github-actions[bot] commented on PR #12857:
URL: https://github.com/apache/doris/pull/12857#issuecomment-1254672072

PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12857: [bugfix](scanner) olap scanner compute is wrong
github-actions[bot] commented on PR #12857:
URL: https://github.com/apache/doris/pull/12857#issuecomment-1254672127

PR approved by anyone and no changes requested.
[GitHub] [doris] luozenglin opened a new issue, #12861: [Bug] data error when using select into outfile format as parquet
luozenglin opened a new issue, #12861:
URL: https://github.com/apache/doris/issues/12861

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

master

### What's Wrong?

When I export the data using `select into outfile format as parquet` and then load it into a table with the same schema, the tinyint column becomes NULL.

```
set enable_vectorized_engine = false;

CREATE TABLE `test_select_into_property_test_output_format_parquet_tb` (
`k1` tinyint(4) NOT NULL,
`k2` smallint(6) NOT NULL,
`k3` int(11) NOT NULL,
`k4` bigint(20) NOT NULL,
`k5` datetime NOT NULL,
`v1` date REPLACE NOT NULL,
`v2` char(1) REPLACE NOT NULL,
`v3` varchar(4096) REPLACE NOT NULL,
`v4` float SUM NOT NULL,
`v5` double SUM NOT NULL,
`v6` decimal(20, 7) SUM NOT NULL
) ENGINE=OLAP
AGGREGATE KEY(`k1`, `k2`, `k3`, `k4`, `k5`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 15
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);

mysql> select * from test_select_into_property_test_output_format_parquet_tb where k1 <= 5;
+----+----+-----+------+---------------------+------------+----+-------------------------------+-----------+------------+-------------+
| k1 | k2 | k3  | k4   | k5                  | v1         | v2 | v3                            | v4        | v5         | v6          |
+----+----+-----+------+---------------------+------------+----+-------------------------------+-----------+------------+-------------+
| 1  | 10 | 100 | 1000 | 2011-01-01 00:00:00 | 2010-01-01 | t  | ynqnzeowymt                   | 38.638844 | 180.998031 | 7395.231067 |
| 2  | 20 | 200 | 2000 | 2012-01-01 00:00:00 | 2010-01-02 | f  | hfkfwlr                       | 506.04404 | 539.922834 | 2080.504502 |
| 3  | 30 | 300 | 3000 | 2013-01-01 00:00:00 | 2010-01-03 | t  | uoclasp                       | 377.79321 | 577.044148 | 4605.253205 |
| 4  | 40 | 400 | 4000 | 2014-01-01 00:00:00 | 2010-01-04 | n  | iswngzeodfhptjzgswsddt        | 871.35455 | 919.067864 | 7291.703724 |
| 5  | 50 | 500 | 5000 | 2015-01-01 00:00:00 | 2010-01-05 | a  | sqodagzlyrmcelyxgcgcsfuxadcdt | 462.0679  | 929.660783 | 3903.906901 |
+----+----+-----+------+---------------------+------------+----+-------------------------------+-----------+------------+-------------+

select k1 k_0, k2 k_1, k3 k_2, k4 k_3, k5 k_4, v1 k_5, v2 k_6, v3 k_7, v4 k_8, v5 k_9, v6 k_10
from test_select_into_property_test_output_format_parquet_tb
INTO OUTFILE "hdfs://:9000/user/palo/test/data/export/test_select_into_property_test_output_format_parquet_db/label_21_04_47_49_475312_1042101013/label_21_04_47_49_475364_844373478"
FORMAT AS parquet
PROPERTIES ("broker.name"="ahdfs","broker.username"="","broker.password"="",
"schema" = "required,int32,k_0;required,int32,k_1;required,int32,k_2;required,int64,k_3;required,int64,k_4;required,int64,k_5;required,byte_array,k_6;required,byte_array,k_7;required,float,k_8;required,double,k_9;required,byte_array,k_10");

CREATE TABLE `select_into_check_table` (
`k_0` tinyint(4) NULL,
`k_1` smallint(6) NULL,
`k_2` int(11) NULL,
`k_3` bigint(20) NULL,
`k_4` datetime NULL,
`k_5` date NULL,
`k_6` char(1) NULL,
`k_7` char(29) NULL,
`k_8` float NULL,
`k_9` double NULL,
`k_10` decimal(27, 9) NULL
) ENGINE=OLAP
DUPLICATE KEY(`k_0`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k_0`) BUCKETS 13
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);

LOAD LABEL test_select_into_property_test_output_format_parquet_db.label_21_04_47_50_543709_8920444695
(
DATA INFILE(" hdfs:/:9000/user/palo/test/data/export/test_select_into_property_test_output_format_parquet_db/label_21_04_47_49_475312_1042101013/label_21_04_47_49_475364_8443734786915a56b133f4b71-a671fd00077a30b4_0.parquet")
INTO TABLE `select_into_check_table`
FORMAT AS "parquet"
)
WITH BROKER "ahdfs" ("username"="", "password"="");

mysql> select * from select_into_check_table;
| k_0 | k_1 | k_2 | k_3 | k_4 | k_5| k_6 | k_7 | k_8 | k_9| k_10|
| NULL | NULL | 1000 |
[GitHub] [doris] zhannngchen opened a new pull request, #12862: [debug](test)a test pr for qa pipeline debug, will not merge
zhannngchen opened a new pull request, #12862:
URL: https://github.com/apache/doris/pull/12862

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
- [ ] Yes
- [ ] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [ ] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [ ] No
- [ ] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [ ] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [ ] No
[GitHub] [doris] mrhhsg opened a new pull request, #12863: [improvement](scan) merge scan keys based on the number of scanners
mrhhsg opened a new pull request, #12863:
URL: https://github.com/apache/doris/pull/12863

# Proposed changes

Issue Number: close #xxx

## Problem Summary

A scanner that takes too many scan keys degrades performance, so it is better to merge the scan keys where possible.

## Checklist(Required)

1. Does it affect the original behavior:
- [ ] Yes
- [x] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [x] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [x] No
- [ ] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [x] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [x] No
[GitHub] [doris] Gabriel39 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
Gabriel39 commented on code in PR #12852:
URL: https://github.com/apache/doris/pull/12852#discussion_r977345927

## be/src/vec/columns/column_dictionary.h:
##
@@ -360,40 +362,58 @@ class ColumnDictionary final : public COWHelper> {
 if (code >= 0) {
 return code;
 }
-auto bound = std::upper_bound(_dict_data.begin(), _dict_data.end(), value) -
- _dict_data.begin();
+auto bound = std::upper_bound(_dict_data->begin(), _dict_data->end(), value) -
+ _dict_data->begin();
 return greater ? bound - greater + eq : bound - eq;
 }

 void find_codes(const phmap::flat_hash_set& values, std::vector& selected) const {
-size_t dict_word_num = _dict_data.size();
+size_t dict_word_num = _dict_data->size();
 selected.resize(dict_word_num);
 selected.assign(dict_word_num, false);
-for (const auto& value : values) {
-if (auto it = _inverted_index.find(value); it != _inverted_index.end()) {
-selected[it->second] = true;
+for (size_t i = 0; i < _dict_data->size(); i++) {
+if (values.find((*_dict_data)[i]) != values.end()) {
+selected[i] = true;
 }
 }
 }

 void clear() {
-_dict_data.clear();
-_inverted_index.clear();
-_code_convert_table.clear();
+_dict_data->clear();
 _hash_values.clear();
 }

 void clear_hash_values() { _hash_values.clear(); }

 void sort() {
-size_t dict_size = _dict_data.size();
-_code_convert_table.reserve(dict_size);
-std::sort(_dict_data.begin(), _dict_data.end(), _comparator);
+size_t dict_size = _dict_data->size();
+
+_perm.resize(dict_size);
+for (size_t i = 0; i < dict_size; ++i) {
+_perm[i] = i;
+}
+
+struct Comparator {
+public:
+Comparator(DictContainer& dict_data) : _dict_data(dict_data) {}
+bool operator()(const size_t a, const size_t b) const {
+return _comparator(_dict_data[a], _dict_data[b]);
+}
+
+private:
+StringValue::Comparator _comparator;
+DictContainer& _dict_data;
+};
+Comparator comparator(*_dict_data);
+std::sort(_perm.begin(), _perm.end(), comparator);

Review Comment:
Done, thanks for your suggestion!
[GitHub] [doris] luozenglin opened a new pull request, #12864: [fix](parquet) fix write error data as parquet format.
luozenglin opened a new pull request, #12864:
URL: https://github.com/apache/doris/pull/12864

Fix incorrect data conversion when writing tinyint and smallint data to parquet files in the non-vectorized engine.

# Proposed changes

Issue Number: close #12861

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
- [ ] Yes
- [x] No
- [ ] I don't know
2. Has unit tests been added:
- [ ] Yes
- [x] No
- [ ] No Need
3. Has document been added or modified:
- [ ] Yes
- [ ] No
- [x] No Need
4. Does it need to update dependencies:
- [ ] Yes
- [x] No
5. Are there any changes that cannot be rolled back:
- [ ] Yes (If Yes, please explain WHY)
- [x] No
[GitHub] [doris] zhannngchen closed pull request #12853: [debug](test) a test pr for qa pipeline debug, will not merge
zhannngchen closed pull request #12853: [debug](test) a test pr for qa pipeline debug, will not merge
URL: https://github.com/apache/doris/pull/12853
[GitHub] [doris] zhannngchen closed pull request #12855: [debug](test)a test pr for qa pipeline debug, will not merge
zhannngchen closed pull request #12855: [debug](test)a test pr for qa pipeline debug, will not merge
URL: https://github.com/apache/doris/pull/12855
[GitHub] [doris] github-actions[bot] commented on pull request #12824: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12824:
URL: https://github.com/apache/doris/pull/12824#issuecomment-1254695697

PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12824: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12824:
URL: https://github.com/apache/doris/pull/12824#issuecomment-1254695752

PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12822: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12822:
URL: https://github.com/apache/doris/pull/12822#issuecomment-1254697309

PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12822: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12822:
URL: https://github.com/apache/doris/pull/12822#issuecomment-1254697351

PR approved by anyone and no changes requested.
[GitHub] [doris] dataroaring opened a new pull request, #12865: test_p0
dataroaring opened a new pull request, #12865:
URL: https://github.com/apache/doris/pull/12865
[GitHub] [doris] freemandealer opened a new pull request, #12866: [enhancement](compaction) introduce segment compaction (#12609)
freemandealer opened a new pull request, #12866:
URL: https://github.com/apache/doris/pull/12866

Implement segmentwise compaction during rowset write to reduce the number of segments produced by load jobs; too many segments may otherwise cause OLAP_ERR_TOO_MANY_SEGMENTS (-238).

Signed-off-by: freemandealer

# Proposed changes

Issue Number: close #12609

## Problem summary

## Intro

The default limit is 200 segments per rowset. Too many segments may fail the whole load process (OLAP_ERR_TOO_MANY_SEGMENTS -238). If we increase the limit, the load will succeed, but the pressure is transferred to the subsequent rowsetwise compaction. Things get worse when the user issues a query (e.g. an insert into select stmt) right after the load job but before rowsetwise compaction: the query will suffer severe performance degradation or may even end up with OOM.

So we are introducing segmentwise compaction, which compacts data DURING the write process instead of waiting for rowsetwise compaction after the txn has been committed.

## Design

### Trigger

Every time a rowset writer produces more than N (e.g. 10) segments, we trigger segment compaction. Note that only one segment compaction job runs for a single rowset at a time, to avoid a recursing/queuing nightmare.

### Target Selection

We collect segments during every trigger. We skip big segments whose row num > M (e.g. 1) because compacting them brings little benefit compared with the effort. Hence, we only pick the "Longest Consecutive Small" segment group for the actual compaction.

### Compaction Process

A new thread pool is introduced to do the job. We submit the above-mentioned "Longest Consecutive Small" segment group to the pool. Then the worker thread does the following:

- build a MergeIterator from the target segments
- create a new segment writer
- for each block read from the MergeIterator, the writer appends it

### SegID handling

SegID must remain consecutive after segment compaction.

If a rowset has small segments named seg_0, seg_1, seg_2, seg_3 and a big segment seg_4:

- we create a segment named "seg_0-3" to save the compacted data of seg_0, seg_1, seg_2 and seg_3
- delete seg_0, seg_1, seg_2 and seg_3
- rename seg_0-3 to seg_0
- rename seg_4 to seg_1

Note that we must wait for in-flight segment compaction tasks to finish before building the rowset meta and committing this txn.

## Test results

### The amount of data Doris can load

First, we test the amount of data that we can successfully load into Doris with segment compaction disabled and enabled. Tests are based on TPCH. The table is created with 1 bucket and no parallelism. We trigger segment compaction every 10 segments produced by the rowset writer.

| cases | data amount |
| - | - |
| Disable SegCompaction | 1.12 million rows, 18.67GB |
| Enable SegCompaction | 11 million rows, 183GB |

The result shows that the amount of data we can load into Doris improves about 10 times after enabling segment compaction. The ratio corresponds to the triggering segment number.

### Impact on latency

When segment compaction is disabled, a load job finishes in 1260s during the test, and the subsequent rowsetwise compaction costs 151s. The test results with segment compaction enabled, for different triggering segment numbers, are:

| triggering segment number | Load Latency | RowsetCompaction Latency |
| - | - | - |
| 5 (trigger every 5 segments) | 089s (-13%) | 242s (+60%) |
| 10 | 1053s (-16%) | 166s (+9%) |
| 20 | 960s (-23%) | 172s (+13%) |
| 40 | 1320s (+4%) | 169s (+11%) |

We ran the load without segment compaction several times, and the latency varied between runs within a range of (-25%, +25%). So we believe that segment compaction has little impact on the latency.

In addition to the above costs, we wait for in-flight segment compaction tasks to finish before building the rowset meta and publishing the data. The length of the wait depends on when the build takes place, but there is a theoretical range for it, and the range is related to the time each segment compaction task costs:

| triggering segment number | Single SegCompaction Task Latency |
| - | - |
| 5 | 5s |
| 10 | 9s |
| 20 | 20s |
| 40 | 60s |

### I
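The "Target Selection" step in the segment compaction proposal above (PR #12866) can be sketched as a single pass over the per-segment row counts. This is only an illustration under stated assumptions, not the code from the PR: `longest_consecutive_small`, `SegmentRange`, and the tie-breaking rule (the earliest run wins) are hypothetical, and `max_rows` plays the role of the threshold M.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open range [first, second) of segment indices picked for compaction.
using SegmentRange = std::pair<std::size_t, std::size_t>;

// Hypothetical helper: finds the longest run of consecutive segments whose
// row count is at most `max_rows`. Ties go to the earliest run (an assumption).
// Returns an empty range (first == second) when every segment is big.
SegmentRange longest_consecutive_small(const std::vector<std::uint64_t>& row_counts,
                                       std::uint64_t max_rows) {
    SegmentRange best{0, 0};
    std::size_t run_start = 0;
    // Iterate one past the end so the final run is closed like any other.
    for (std::size_t i = 0; i <= row_counts.size(); ++i) {
        const bool closes_run = (i == row_counts.size()) || row_counts[i] > max_rows;
        if (closes_run) {
            if (i - run_start > best.second - best.first) {
                best = {run_start, i};
            }
            run_start = i + 1;  // the next run starts after the big segment
        }
    }
    return best;
}
```

With the example from the "SegID handling" section (four small segments followed by one big segment), the function returns the range [0, 4), which corresponds to seg_0 through seg_3.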
[GitHub] [doris] freemandealer closed pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)
freemandealer closed pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)
URL: https://github.com/apache/doris/pull/12610
[GitHub] [doris] freemandealer commented on pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)
freemandealer commented on PR #12610:
URL: https://github.com/apache/doris/pull/12610#issuecomment-1254772650

A brand-new PR with updated code, a detailed design, and test results is available here: https://github.com/apache/doris/pull/12866
[GitHub] [doris] Gabriel39 opened a new pull request, #12867: [Improvement](predicate) Replace for-loop by memcpy
Gabriel39 opened a new pull request, #12867:
URL: https://github.com/apache/doris/pull/12867

# Proposed changes

This PR replaces a for-loop with memcpy. I ran two experiments.

Experiment 1

Run ckbench q20 and generate a flame graph. Compare the proportion of the total time spent in this function:

for-loop: 1.74%
memcpy: 0.013%

Experiment 2

Run `SELECT JavaEnable FROM hits`. More than 99 million (9900w+) rows are returned, and JavaEnable is a SMALLINT. Compare the BlockLoadTime:

for-loop: 1s225ms
memcpy: 805.603ms
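The two copy strategies compared in PR #12867 can be sketched as follows. This is a generic illustration, not the function changed in the PR: `append_with_loop` and `append_with_memcpy` are hypothetical names, and `std::int16_t` stands in for the SMALLINT column from the second experiment. `std::memcpy` is only valid here because the element type is trivially copyable.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Element-by-element copy: one push_back (with its capacity check) per value.
void append_with_loop(std::vector<std::int16_t>& dst, const std::int16_t* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst.push_back(src[i]);
    }
}

// Bulk copy: grow the destination once, then copy the whole range in one call.
void append_with_memcpy(std::vector<std::int16_t>& dst, const std::int16_t* src, std::size_t n) {
    if (n == 0) {
        return;  // avoid handing a possibly null pointer to memcpy
    }
    const std::size_t old_size = dst.size();
    dst.resize(old_size + n);
    std::memcpy(dst.data() + old_size, src, n * sizeof(std::int16_t));
}
```

Both functions leave `dst` with the same contents; the second one gives the standard library a single contiguous range to copy, which is the kind of change the PR measures.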
[GitHub] [doris] morningman opened a new pull request, #12868: [draft] for testing p0, not merge
morningman opened a new pull request, #12868:
URL: https://github.com/apache/doris/pull/12868
[GitHub] [doris] Gabriel39 opened a new pull request, #12869: [Bug](date)(1.1-lts) Fix wrong type in TimestampArithmeticExpr
Gabriel39 opened a new pull request, #12869:
URL: https://github.com/apache/doris/pull/12869

# Proposed changes

Cherry pick from #12727
[GitHub] [doris] Gabriel39 opened a new pull request, #12870: [Bug](date)(1.1-lts) Fix wrong result produced by date function
Gabriel39 opened a new pull request, #12870:
URL: https://github.com/apache/doris/pull/12870

# Proposed changes

Cherry pick from #12720
[GitHub] [doris] Henry2SS opened a new issue, #12871: [Enhancement](rewrite) support Or to In rule
Henry2SS opened a new issue, #12871:
URL: https://github.com/apache/doris/issues/12871

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

Support an OR-to-IN rewrite rule. For example, the SQL `select * from test_tbl where a = 1 or a = 2 or a in (3, 4)` should be rewritten to `select * from test_tbl where a in (1,2,3,4)`.

### Solution

Implement the OR-to-IN rewrite rule described above.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
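The rewrite requested in issue #12871 amounts to grouping the disjuncts of an OR by column and taking the union of their values. The sketch below shows only that grouping step on a toy data model; it is not the Doris FE rule. `Disjunct` and `merge_or_to_in` are hypothetical names, and values are limited to integers for brevity.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// One disjunct of the OR: `column = v` (a single value) or
// `column IN (v1, v2, ...)` (several values).
struct Disjunct {
    std::string column;
    std::vector<std::int64_t> values;
};

// Hypothetical helper: merges all disjuncts on the same column into one
// deduplicated value set, i.e. one `column IN (...)` predicate per column.
std::map<std::string, std::set<std::int64_t>> merge_or_to_in(
        const std::vector<Disjunct>& disjuncts) {
    std::map<std::string, std::set<std::int64_t>> merged;
    for (const auto& d : disjuncts) {
        merged[d.column].insert(d.values.begin(), d.values.end());
    }
    return merged;
}
```

For the example in the issue, the three disjuncts `a = 1`, `a = 2` and `a in (3, 4)` merge into a single entry for column `a` with the values 1, 2, 3, 4. Disjuncts on different columns stay separate, so the result is still a disjunction with one IN predicate per column.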
[GitHub] [doris] Henry2SS opened a new pull request, #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems
Henry2SS opened a new pull request, #12872:
URL: https://github.com/apache/doris/pull/12872

# Proposed changes

Issue Number: close #12871

## Problem summary

1. Support the OR-to-IN rewrite rule.
2. Fix Expr clone problems. Clone should create a new object; otherwise the result is always a shallow copy.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [x] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [x] No
[GitHub] [doris] Henry2SS commented on pull request #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems
Henry2SS commented on PR #12872:
URL: https://github.com/apache/doris/pull/12872#issuecomment-1254828517

tested locally.
[GitHub] [doris] Gabriel39 opened a new pull request, #12873: [feature](outfile)(1.1-lts) support parquet writer
Gabriel39 opened a new pull request, #12873:
URL: https://github.com/apache/doris/pull/12873

# Proposed changes

Cherry pick from #12492
[GitHub] [doris] caiconghui opened a new issue, #12874: [Bug] set enable_projection to false will cause select stmt analyze failed
caiconghui opened a new issue, #12874:
URL: https://github.com/apache/doris/issues/12874

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

master and lts

### What's Wrong?

Run the following statements:

set enable_projection=false;
select count() from (select a, b from table001 order by b limit 1) a

The query then throws an exception like the following:

ERROR 1105 (HY000): errCode = 2, detailMessage = couldn't resolve slot descriptor 0

### What You Expected?

The query should work.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] Henry2SS commented on pull request #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems
Henry2SS commented on PR #12872:
URL: https://github.com/apache/doris/pull/12872#issuecomment-1254835927

1. FE unit tests passed locally.
2. Compiled and manually tested; the function passed.

test results:
[GitHub] [doris] caiconghui commented on issue #12874: [Bug] set enable_projection to false will cause select stmt analyze failed
caiconghui commented on issue #12874:
URL: https://github.com/apache/doris/issues/12874#issuecomment-1254836058

mysql> show columns from baseall;
+-------+----------------+------+-------+---------+---------+
| Field | Type           | Null | Key   | Default | Extra   |
+-------+----------------+------+-------+---------+---------+
| k0    | BOOLEAN        | Yes  | true  | NULL    |         |
| k1    | TINYINT        | Yes  | true  | NULL    |         |
| k2    | SMALLINT       | Yes  | true  | NULL    |         |
| k3    | INT            | Yes  | true  | NULL    |         |
| k4    | BIGINT         | Yes  | true  | NULL    |         |
| k5    | DECIMAL(9,3)   | Yes  | true  | NULL    |         |
| k6    | CHAR(5)        | Yes  | true  | NULL    |         |
| k10   | DATE           | Yes  | true  | NULL    |         |
| k11   | DATETIME       | Yes  | true  | NULL    |         |
| k7    | VARCHAR(20)    | Yes  | true  | NULL    |         |
| k8    | DOUBLE         | Yes  | false | NULL    | MAX     |
| k9    | FLOAT          | Yes  | false | NULL    | SUM     |
| k12   | VARCHAR(65533) | Yes  | false | NULL    | REPLACE |
| k13   | LARGEINT       | Yes  | false | NULL    | REPLACE |
+-------+----------------+------+-------+---------+---------+

select count() from (select k0, k1 from baseall order by k1 limit 1) a
[doris-website] branch master updated: add ADMIN-CLEAN-TRASH
This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git

The following commit(s) were added to refs/heads/master by this push:
     new d569a99c79c add ADMIN-CLEAN-TRASH
d569a99c79c is described below

commit d569a99c79c9d8dfcb759ec3072fad8023400a01
Author: jiafeng.zhang
AuthorDate: Thu Sep 22 18:36:31 2022 +0800

    add ADMIN-CLEAN-TRASH

    add ADMIN-CLEAN-TRASH
---
 sidebars.json | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sidebars.json b/sidebars.json
index feae10df74f..1ee53be255e 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -667,6 +667,7 @@
                     "items": [
                         "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REPAIR",
                         "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CHECK-TABLET",
+                        "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CLEAN-TRASH",
                         "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET",
                         "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REPAIR-TABLE",
                         "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-SET-CONFIG",
[GitHub] [doris] zhannngchen opened a new pull request, #12875: [feature-wip](unique-key-merge-on-write) fix thread safe issue in BetaRowsetWriter
zhannngchen opened a new pull request, #12875:
URL: https://github.com/apache/doris/pull/12875
[GitHub] [doris] morrySnow opened a new pull request, #12876: test bucket shuffle
morrySnow opened a new pull request, #12876:
URL: https://github.com/apache/doris/pull/12876
[GitHub] [doris] nextdreamblue opened a new pull request, #12877: [fix](type) fix DECIMAL scale when cast function on fe
nextdreamblue opened a new pull request, #12877:
URL: https://github.com/apache/doris/pull/12877

# Proposed changes

Issue Number: close #12717

## Problem summary

Handle DECIMAL data according to the precision and scale of the DECIMAL type passed to cast.

before:

MySQL [test]> select cast('135.75999' as DECIMAL(10,3));
+------------------------------------+
| CAST('135.75999' AS DECIMAL(10,3)) |
+------------------------------------+
|                          135.75999 |
+------------------------------------+
1 row in set (0.00 sec)

now:

MySQL [stage]> select cast('135.75999' as DECIMAL(10,3));
+------------------------------------+
| CAST('135.75999' AS DECIMAL(10,3)) |
+------------------------------------+
|                            135.759 |
+------------------------------------+
1 row in set (0.01 sec)

## Checklist(Required)

1. Does it affect the original behavior:
    - [x] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [x] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [x] No
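The before/after output in PR #12877 shows the fractional part being cut to the declared scale (135.75999 becomes 135.759, so the digits are dropped rather than rounded). A minimal sketch of that behavior on a plain decimal literal is shown below. It is an illustration only, not the FE implementation: `truncate_to_scale` is a hypothetical helper, and it assumes a well-formed literal with at most one decimal point.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: keeps at most `scale` digits after the decimal point
// and drops the rest without rounding. Assumes `literal` is a well-formed
// decimal such as "135.75999".
std::string truncate_to_scale(const std::string& literal, std::size_t scale) {
    const std::size_t dot = literal.find('.');
    if (dot == std::string::npos) {
        return literal;  // no fractional part, nothing to cut
    }
    if (scale == 0) {
        return literal.substr(0, dot);  // drop the point as well
    }
    // substr clamps the length, so shorter fractions are returned unchanged.
    return literal.substr(0, dot + 1 + scale);
}
```

Called with "135.75999" and a scale of 3, the helper returns "135.759", matching the "now" output in the PR description.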
[doris] branch opt_perf updated: [bugfix](scanner) olap scanner compute is wrong
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch opt_perf in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/opt_perf by this push: new b65178b7a7 [bugfix](scanner) olap scanner compute is wrong b65178b7a7 is described below commit b65178b7a7df72efc7d1d275b4dc4116bb9413e2 Author: yiguolei AuthorDate: Thu Sep 22 15:06:06 2022 +0800 [bugfix](scanner) olap scanner compute is wrong --- be/src/exec/olap_scan_node.cpp | 2 +- be/src/vec/exec/scan/new_olap_scan_node.cpp | 2 +- be/src/vec/exec/volap_scan_node.cpp | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/be/src/exec/olap_scan_node.cpp b/be/src/exec/olap_scan_node.cpp index e49fdde6d1..d3b3a3aabd 100644 --- a/be/src/exec/olap_scan_node.cpp +++ b/be/src/exec/olap_scan_node.cpp @@ -921,7 +921,7 @@ Status OlapScanNode::start_scan_thread(RuntimeState* state) { int size_based_scanners_per_tablet = 1; if (config::doris_scan_range_max_mb > 0) { size_based_scanners_per_tablet = std::max( -1, (int)tablet->tablet_footprint() / config::doris_scan_range_max_mb << 20); +1, (int)(tablet->tablet_footprint() / (config::doris_scan_range_max_mb << 20))); } int ranges_per_scanner = std::max(1, (int)ranges->size() / diff --git a/be/src/vec/exec/scan/new_olap_scan_node.cpp b/be/src/vec/exec/scan/new_olap_scan_node.cpp index 973e6c23ee..8242abef77 100644 --- a/be/src/vec/exec/scan/new_olap_scan_node.cpp +++ b/be/src/vec/exec/scan/new_olap_scan_node.cpp @@ -290,7 +290,7 @@ Status NewOlapScanNode::_init_scanners(std::list* scanners) { if (config::doris_scan_range_max_mb > 0) { size_based_scanners_per_tablet = std::max( -1, (int)tablet->tablet_footprint() / config::doris_scan_range_max_mb << 20); +1, (int)(tablet->tablet_footprint() / (config::doris_scan_range_max_mb << 20))); } int ranges_per_scanner = diff --git a/be/src/vec/exec/volap_scan_node.cpp b/be/src/vec/exec/volap_scan_node.cpp index 
8197c88dbd..ebe6ab90cd 100644 --- a/be/src/vec/exec/volap_scan_node.cpp +++ b/be/src/vec/exec/volap_scan_node.cpp @@ -912,7 +912,7 @@ Status VOlapScanNode::start_scan_thread(RuntimeState* state) { if (config::doris_scan_range_max_mb > 0) { size_based_scanners_per_tablet = std::max( -1, (int)tablet->tablet_footprint() / config::doris_scan_range_max_mb << 20); +1, (int)(tablet->tablet_footprint() / (config::doris_scan_range_max_mb << 20))); } int ranges_per_scanner = - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
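The three hunks in this commit fix the same operator-precedence problem: in C++, a cast binds tighter than `/`, and `/` binds tighter than `<<`, so the old expression was parsed as `(((int)footprint) / max_mb) << 20` rather than as a division by `max_mb << 20`. A minimal sketch of the difference follows. The function names and plain-integer parameters are hypothetical stand-ins for `tablet->tablet_footprint()` and `config::doris_scan_range_max_mb`, and 64-bit arithmetic is used throughout so that the "buggy" variant stays well defined (the original shifted an `int`, which can overflow).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper mirroring the OLD expression's grouping:
//   footprint / max_mb << 20   ==   (footprint / max_mb) << 20
// 64-bit types are an assumption made here to avoid signed overflow.
inline std::int64_t scanners_per_tablet_buggy(std::int64_t footprint_bytes, int max_mb) {
    return std::max<std::int64_t>(1, footprint_bytes / max_mb << 20);
}

// Hypothetical helper mirroring the FIXED expression's grouping:
//   footprint / (max_mb << 20), i.e. footprint divided by the limit in bytes.
inline std::int64_t scanners_per_tablet_fixed(std::int64_t footprint_bytes, int max_mb) {
    const std::int64_t limit_bytes = static_cast<std::int64_t>(max_mb) << 20;
    return std::max<std::int64_t>(1, footprint_bytes / limit_bytes);
}
```

With a 1 GiB footprint and a 256 MB limit, the fixed grouping yields 4 scanners, while the old grouping yields a value in the trillions.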
[GitHub] [doris] zy-kkk opened a new pull request, #12878: [typo](docs)Optimized date function doc order and add partial function doc
zy-kkk opened a new pull request, #12878: URL: https://github.com/apache/doris/pull/12878 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] Gabriel39 opened a new pull request, #12879: [Improvement](predicate) Replace for-loop by memcpy
Gabriel39 opened a new pull request, #12879: URL: https://github.com/apache/doris/pull/12879 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] Gabriel39 opened a new pull request, #12880: [Improvement](dict) optimize dictionary column
Gabriel39 opened a new pull request, #12880: URL: https://github.com/apache/doris/pull/12880 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] HappenLee opened a new pull request, #12881: [Opt](Vectorized) Support push down no grouping agg
HappenLee opened a new pull request, #12881: URL: https://github.com/apache/doris/pull/12881 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] BiteTheDDDDt opened a new pull request, #12882: [Chore](clang) support build with clang15
BiteTheDDDDt opened a new pull request, #12882: URL: https://github.com/apache/doris/pull/12882 # Proposed changes 1. Remove some unused variables 2. Reformat with clang-format 15 ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] yiguolei merged pull request #12881: [Opt](Vectorized) Support push down no grouping agg
yiguolei merged PR #12881: URL: https://github.com/apache/doris/pull/12881
[GitHub] [doris] zhangstar333 opened a new pull request, #12883: [Bug](jdbc) fix insert into date type to oracle using wrong type
zhangstar333 opened a new pull request, #12883: URL: https://github.com/apache/doris/pull/12883 # Proposed changes When inserting into a date-type column in Oracle through JDBC, the to_date function should be used to convert the string to java.sql.Date. ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris] branch opt_perf updated: [Opt](Vectorized) Support push down no grouping agg (#12881)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch opt_perf in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/opt_perf by this push: new c5ec7601d4 [Opt](Vectorized) Support push down no grouping agg (#12881) c5ec7601d4 is described below commit c5ec7601d4a45493051292c7234997c622a46f36 Author: HappenLee AuthorDate: Thu Sep 22 19:46:21 2022 +0800 [Opt](Vectorized) Support push down no grouping agg (#12881) --- be/src/olap/iterators.h| 1 + be/src/olap/reader.h | 1 + be/src/olap/rowset/beta_rowset_reader.cpp | 1 + be/src/olap/rowset/rowset_reader_context.h | 1 + be/src/olap/rowset/segment_v2/column_reader.cpp| 42 ++ be/src/olap/rowset/segment_v2/column_reader.h | 12 ++ be/src/olap/rowset/segment_v2/segment.cpp | 8 +- be/src/olap/rowset/segment_v2/segment_iterator.cpp | 1 - be/src/olap/rowset/segment_v2/segment_iterator.h | 4 +- be/src/vec/exec/scan/new_olap_scanner.cpp | 9 +- be/src/vec/exec/volap_scanner.cpp | 8 +- be/src/vec/olap/block_reader.cpp | 1 + be/src/vec/olap/vgeneric_iterators.cpp | 79 ++ be/src/vec/olap/vgeneric_iterators.h | 6 + be/test/vec/exec/vgeneric_iterators_test.cpp | 1 - .../org/apache/doris/catalog/PrimitiveType.java| 4 + .../org/apache/doris/planner/OlapScanNode.java | 11 ++ .../apache/doris/planner/SingleNodePlanner.java| 160 + .../java/org/apache/doris/qe/SessionVariable.java | 13 +- gensrc/thrift/PlanNodes.thrift | 8 ++ 20 files changed, 359 insertions(+), 12 deletions(-) diff --git a/be/src/olap/iterators.h b/be/src/olap/iterators.h index 22f081d0eb..4f12118c2c 100644 --- a/be/src/olap/iterators.h +++ b/be/src/olap/iterators.h @@ -77,6 +77,7 @@ public: std::vector column_predicates; std::unordered_map> col_id_to_predicates; std::unordered_map> col_id_to_del_predicates; +TPushAggOp::type push_down_agg_type_opt = TPushAggOp::NONE; // REQUIRED (null is not allowed) OlapReaderStatistics* stats = nullptr; diff --git 
a/be/src/olap/reader.h b/be/src/olap/reader.h index 004e75c773..ae476e4fa2 100644 --- a/be/src/olap/reader.h +++ b/be/src/olap/reader.h @@ -91,6 +91,7 @@ public: // use only in vec exec engine std::vector* origin_return_columns = nullptr; std::unordered_set* tablet_columns_convert_to_null_set = nullptr; +TPushAggOp::type push_down_agg_type_opt = TPushAggOp::NONE; // used for comapction to record row ids bool record_rowids = false; diff --git a/be/src/olap/rowset/beta_rowset_reader.cpp b/be/src/olap/rowset/beta_rowset_reader.cpp index df15b72f62..87893927d5 100644 --- a/be/src/olap/rowset/beta_rowset_reader.cpp +++ b/be/src/olap/rowset/beta_rowset_reader.cpp @@ -49,6 +49,7 @@ Status BetaRowsetReader::init(RowsetReaderContext* read_context) { // convert RowsetReaderContext to StorageReadOptions StorageReadOptions read_options; read_options.stats = _stats; +read_options.push_down_agg_type_opt = _context->push_down_agg_type_opt; if (read_context->lower_bound_keys != nullptr) { for (int i = 0; i < read_context->lower_bound_keys->size(); ++i) { read_options.key_ranges.emplace_back(&read_context->lower_bound_keys->at(i), diff --git a/be/src/olap/rowset/rowset_reader_context.h b/be/src/olap/rowset/rowset_reader_context.h index de61117426..ce2fd4b721 100644 --- a/be/src/olap/rowset/rowset_reader_context.h +++ b/be/src/olap/rowset/rowset_reader_context.h @@ -41,6 +41,7 @@ struct RowsetReaderContext { std::vector* read_orderby_key_columns = nullptr; // projection columns: the set of columns rowset reader should return const std::vector* return_columns = nullptr; +TPushAggOp::type push_down_agg_type_opt = TPushAggOp::NONE; // column name -> column predicate // adding column_name for predicate to make use of column selectivity const std::vector* predicates = nullptr; diff --git a/be/src/olap/rowset/segment_v2/column_reader.cpp b/be/src/olap/rowset/segment_v2/column_reader.cpp index d42358c5e7..451b8f3e91 100644 --- a/be/src/olap/rowset/segment_v2/column_reader.cpp +++ 
b/be/src/olap/rowset/segment_v2/column_reader.cpp @@ -171,6 +171,44 @@ Status ColumnReader::get_row_ranges_by_zone_map( return Status::OK(); } +Status ColumnReader::next_batch_of_zone_map(size_t* n, vectorized::MutableColumnPtr& dst) const { +// TODO: this work to get min/max value seems should only do once +FieldType type = _type_info->type(); +std::unique_ptr min_value(WrapperField::create_by_type(type, _meta.length())); +std::unique_ptr max_value(WrapperField::create_by_type(type, _meta.length())); +_parse_zone
[GitHub] [doris] yiguolei merged pull request #12880: [Improvement](dict) optimize dictionary column
yiguolei merged PR #12880: URL: https://github.com/apache/doris/pull/12880
[doris] branch opt_perf updated: [Improvement](dict) optimize dictionary column (#12880)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch opt_perf in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/opt_perf by this push: new eb68ee6560 [Improvement](dict) optimize dictionary column (#12880) eb68ee6560 is described below commit eb68ee6560b94a6844f4605398f997f878646936 Author: Gabriel AuthorDate: Thu Sep 22 19:46:38 2022 +0800 [Improvement](dict) optimize dictionary column (#12880) --- be/src/vec/columns/column_dictionary.h | 87 ++ be/src/vec/columns/column_string.h | 11 + 2 files changed, 58 insertions(+), 40 deletions(-) diff --git a/be/src/vec/columns/column_dictionary.h b/be/src/vec/columns/column_dictionary.h index d56265b757..93fbcb9a3e 100644 --- a/be/src/vec/columns/column_dictionary.h +++ b/be/src/vec/columns/column_dictionary.h @@ -192,11 +192,13 @@ public: Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* col_ptr) override { auto* res_col = reinterpret_cast(col_ptr); +res_col->get_offsets().reserve(sel_size); +res_col->get_chars().reserve(_dict.avg_str_len() * sel_size); for (size_t i = 0; i < sel_size; i++) { uint16_t n = sel[i]; auto& code = reinterpret_cast(_codes[n]); auto value = _dict.get_value(code); -res_col->insert_data(value.ptr, value.len); +res_col->insert_data_without_reserve(value.ptr, value.len); } return Status::OK(); } @@ -281,42 +283,36 @@ public: class Dictionary { public: -Dictionary() = default; +Dictionary() : _dict_data(new DictContainer()), _total_str_len(0) {}; -void reserve(size_t n) { -_dict_data.reserve(n); -_inverted_index.reserve(n); -} +void reserve(size_t n) { _dict_data->reserve(n); } void insert_value(StringValue& value) { -_dict_data.push_back_without_reserve(value); -_inverted_index[value] = _inverted_index.size(); +_dict_data->push_back_without_reserve(value); +_total_str_len += value.len; } int32_t find_code(const StringValue& value) const { -auto it = 
_inverted_index.find(value); -if (it != _inverted_index.end()) { -return it->second; +for (size_t i = 0; i < _dict_data->size(); i++) { +if ((*_dict_data)[i] == value) { +return i; +} } return -2; // -1 is null code } T get_null_code() const { return -1; } -inline StringValue& get_value(T code) { -return code >= _dict_data.size() ? _null_value : _dict_data[code]; -} +inline StringValue& get_value(T code) { return (*_dict_data)[code]; } -inline const StringValue& get_value(T code) const { -return code >= _dict_data.size() ? _null_value : _dict_data[code]; -} +inline const StringValue& get_value(T code) const { return (*_dict_data)[code]; } // The function is only used in the runtime filter feature inline void generate_hash_values_for_runtime_filter(FieldType type) { if (_hash_values.empty()) { -_hash_values.resize(_dict_data.size()); -for (size_t i = 0; i < _dict_data.size(); i++) { -auto& sv = _dict_data[i]; +_hash_values.resize(_dict_data->size()); +for (size_t i = 0; i < _dict_data->size(); i++) { +auto& sv = (*_dict_data)[i]; // The char data is stored in the disk with the schema length, // and zeros are filled if the length is insufficient @@ -360,40 +356,50 @@ public: if (code >= 0) { return code; } -auto bound = std::upper_bound(_dict_data.begin(), _dict_data.end(), value) - - _dict_data.begin(); +auto bound = std::upper_bound(_dict_data->begin(), _dict_data->end(), value) - + _dict_data->begin(); return greater ? bound - greater + eq : bound - eq; } void find_codes(const phmap::flat_hash_set& values, std::vector& selected) const { -size_t dict_word_num = _dict_data.size(); +size_t dict_word_num = _dict_data->size(); selected.resize(dict_word_num); selected.assign(dict_word_num, false); -for (const auto& value : values) { -if (auto it = _inverted_index.find(value); it != _inverted_index.end()) { -selected[it->second] = true; +for (size_t i = 0; i < _dict_data->size(); i++) { +if (values.find((*_dict_data)[i]) != values.end())
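The diff above drops the `_inverted_index` hash map and instead scans `_dict_data` directly: `find_code` becomes a linear search, and `find_codes` walks the dictionary once and probes the set of wanted values. The following is a simplified sketch of that lookup pattern, not the Doris implementation: `std::string`, `std::vector`, and `std::unordered_set` are assumed stand-ins for `StringValue`, `DictContainer`, and `phmap::flat_hash_set`, and the free functions are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Linear search for a single value. Returns -2 when the value is absent,
// following the diff's convention that -1 is reserved for the null code.
inline std::int32_t find_code(const std::vector<std::string>& dict, const std::string& value) {
    for (std::size_t i = 0; i < dict.size(); ++i) {
        if (dict[i] == value) {
            return static_cast<std::int32_t>(i);
        }
    }
    return -2;
}

// Marks every dictionary code whose word appears in `values`.
// One pass over the dictionary, one hash probe per dictionary entry.
inline std::vector<bool> find_codes(const std::vector<std::string>& dict,
                                    const std::unordered_set<std::string>& values) {
    std::vector<bool> selected(dict.size(), false);
    for (std::size_t i = 0; i < dict.size(); ++i) {
        if (values.find(dict[i]) != values.end()) {
            selected[i] = true;
        }
    }
    return selected;
}
```

The trade-off is that a single lookup costs time proportional to the dictionary size, in exchange for not building or storing a second index; that is presumably acceptable when dictionaries are small, though the commit message does not say so explicitly.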
[GitHub] [doris] yiguolei merged pull request #12879: [Improvement](predicate) Replace for-loop by memcpy
yiguolei merged PR #12879: URL: https://github.com/apache/doris/pull/12879
[GitHub] [doris] mrhhsg opened a new pull request, #12884: [improvement](scan) merge scan keys based on the number of scanners
mrhhsg opened a new pull request, #12884: URL: https://github.com/apache/doris/pull/12884 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[doris] branch opt_perf updated: [Improvement](predicate) Replace for-loop by memcpy (#12879)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch opt_perf in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/opt_perf by this push: new 2b27aaa2fa [Improvement](predicate) Replace for-loop by memcpy (#12879) 2b27aaa2fa is described below commit 2b27aaa2fa888ca2fe4634553034ea2f33e37ab4 Author: Gabriel AuthorDate: Thu Sep 22 19:46:52 2022 +0800 [Improvement](predicate) Replace for-loop by memcpy (#12879) --- be/src/vec/columns/predicate_column.h | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/be/src/vec/columns/predicate_column.h b/be/src/vec/columns/predicate_column.h index fd99d4c04b..d5ad52b6ac 100644 --- a/be/src/vec/columns/predicate_column.h +++ b/be/src/vec/columns/predicate_column.h @@ -133,13 +133,9 @@ private: } } -// note(wb): Write data one by one has a slight performance improvement than memcpy directly void insert_many_default_type(const char* data_ptr, size_t num) { -T* input_val_ptr = (T*)data_ptr; T* res_val_ptr = (T*)data.get_end_ptr(); -for (int i = 0; i < num; i++) { -res_val_ptr[i] = input_val_ptr[i]; -} +memcpy(res_val_ptr, data_ptr, num * sizeof(T)); res_val_ptr += num; data.set_end_ptr(res_val_ptr); } - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
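The commit above replaces an element-by-element copy loop with a single `memcpy` of `num * sizeof(T)` bytes. A minimal sketch of the same idea is shown below. It assumes a `std::vector<T>` as the destination instead of Doris's own buffer with `get_end_ptr()` / `set_end_ptr()`, and `append_many` is a hypothetical name. Note that a bulk byte copy like this is only valid for trivially copyable `T`, which the sketch checks at compile time.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Appends `num` values of type T, stored contiguously at `data_ptr`, to `dst`.
// Hypothetical helper: the real code writes into a pre-reserved raw buffer.
template <typename T>
void append_many(std::vector<T>& dst, const char* data_ptr, std::size_t num) {
    static_assert(std::is_trivially_copyable_v<T>, "memcpy requires a trivially copyable type");
    if (num == 0) {
        return; // nothing to copy; also avoids passing a possibly null pointer to memcpy
    }
    const std::size_t old_size = dst.size();
    dst.resize(old_size + num);
    // One bulk copy instead of `for (i...) dst[old_size + i] = input[i];`
    std::memcpy(dst.data() + old_size, data_ptr, num * sizeof(T));
}
```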
[GitHub] [doris] yiguolei merged pull request #12884: [improvement](scan) merge scan keys based on the number of scanners
yiguolei merged PR #12884: URL: https://github.com/apache/doris/pull/12884
[doris] branch opt_perf updated: [improvement](scan) merge scan keys based on the number of scanners (#12884)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch opt_perf in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/opt_perf by this push: new 3d2a73c028 [improvement](scan) merge scan keys based on the number of scanners (#12884) 3d2a73c028 is described below commit 3d2a73c028802bfcdeeba0ff5851cfded6d548e4 Author: Jerry Hu AuthorDate: Thu Sep 22 20:10:42 2022 +0800 [improvement](scan) merge scan keys based on the number of scanners (#12884) --- be/src/exec/olap_common.cpp | 113 +++ be/src/exec/olap_common.h | 116 be/src/runtime/datetime_value.h | 21 + be/src/vec/exec/scan/new_olap_scan_node.cpp | 22 -- be/src/vec/runtime/vdatetime_value.h| 11 +++ 5 files changed, 262 insertions(+), 21 deletions(-) diff --git a/be/src/exec/olap_common.cpp b/be/src/exec/olap_common.cpp index 8069c47a17..087a62928c 100644 --- a/be/src/exec/olap_common.cpp +++ b/be/src/exec/olap_common.cpp @@ -59,6 +59,42 @@ void ColumnValueRange::convert_to_fixed_value() { return; } +template <> +std::vector> +ColumnValueRange::split(size_t count) { +__builtin_unreachable(); +} + +template <> +std::vector> +ColumnValueRange::split(size_t count) { +__builtin_unreachable(); +} + +template <> +std::vector> +ColumnValueRange::split(size_t count) { +__builtin_unreachable(); +} + +template <> +std::vector> +ColumnValueRange::split(size_t count) { +__builtin_unreachable(); +} + +template <> +std::vector> +ColumnValueRange::split(size_t count) { +__builtin_unreachable(); +} + +template <> +std::vector> +ColumnValueRange::split(size_t count) { +__builtin_unreachable(); +} + Status OlapScanKeys::get_key_range(std::vector>* key_range) { key_range->clear(); @@ -74,6 +110,83 @@ Status OlapScanKeys::get_key_range(std::vector>* return Status::OK(); } +Status OlapScanKeys::extend_scan_splitted_keys(std::vector& ranges) { +using namespace std; +DCHECK(!_has_range_value); + +std::vector new_begin_keys; +std::vector 
new_end_keys; +for (size_t i = 0; i != ranges.size(); ++i) { +std::visit( +[&](auto&& range) { +using RangeType = std::decay_t; +using CppType = typename RangeType::CppType; +auto begin_keys = _begin_scan_keys; +auto end_keys = _end_scan_keys; +if (begin_keys.empty()) { +begin_keys.emplace_back(); +begin_keys.back().add_value( +cast_to_string( +range.get_range_min_value(), range.scale()), +range.contain_null()); +end_keys.emplace_back(); + end_keys.back().add_value(cast_to_string( +range.get_range_max_value(), range.scale())); +} else { +for (int i = 0; i < begin_keys.size(); ++i) { +begin_keys[i].add_value( +cast_to_string( +range.get_range_min_value(), range.scale()), +range.contain_null()); +} + +for (int i = 0; i < end_keys.size(); ++i) { + end_keys[i].add_value(cast_to_string( +range.get_range_max_value(), range.scale())); +} +} +new_begin_keys.insert(new_begin_keys.end(), begin_keys.begin(), + begin_keys.end()); +new_end_keys.insert(new_end_keys.end(), end_keys.begin(), end_keys.end()); +}, +ranges[i]); +} +_begin_scan_keys = new_begin_keys; +_end_scan_keys = new_end_keys; +return Status::OK(); +} + +OlapScanKeys OlapScanKeys::merge(size_t to_ranges_count) { +OlapScanKeys merged; +merged.set_is_convertible(_is_convertible); +merged.set_max_scan_key_num(_max_scan_key_num); +bool exact_value = false; +for (size_t i = 0; i != _column_ranges.size(); ++i) { +std::visit( +[&](auto&& range) { +if (i == _index_of_max_size_range) { +return; +} +merged.extend_scan_key(range, &exact_value); +}, +_column_ranges[i]); +} + +size_t size_of_ranges = std::max(size_t(1), merged.size()); +size_t split_to_count = (to_ranges_count + size_of_ranges - 1) / size_of_ranges; +std::vector splitted = std::visit( +[&](auto&& range) { +aut
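In `OlapScanKeys::merge`, the line `split_to_count = (to_ranges_count + size_of_ranges - 1) / size_of_ranges` is an integer ceiling division: it computes how many pieces the largest range must be split into so that the merged keys multiplied by the pieces reach at least `to_ranges_count`. The arithmetic can be sketched in isolation as follows; `split_count` is a hypothetical name, and the clamp to 1 mirrors the `std::max(size_t(1), merged.size())` in the diff.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper isolating the arithmetic from the diff:
// ceil(to_ranges_count / max(1, merged_size)) using integer math only.
inline std::size_t split_count(std::size_t to_ranges_count, std::size_t merged_size) {
    const std::size_t size_of_ranges = std::max<std::size_t>(1, merged_size);
    return (to_ranges_count + size_of_ranges - 1) / size_of_ranges;
}
```

For example, a target of 10 ranges with 3 merged keys gives a split count of 4, since 3 × 4 = 12 ≥ 10 while 3 × 3 = 9 would fall short.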
[GitHub] [doris] dutyu opened a new issue, #12885: [Enhancement] auditloader plugin always discard audit log when cluster is busy
dutyu opened a new issue, #12885: URL: https://github.com/apache/doris/issues/12885 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description I installed the auditloader plugin and found that when the cluster is busy (users submit many SQL statements), the doris_audit_tbl__ table is always missing some audit log entries that I can find in fe.audit.log. I reviewed the code and found that AuditLoaderPlugin uses a LinkedBlockingDeque whose capacity is 1; if users submit many SQL statements, the `AuditLoaderPlugin.exec` method keeps failing because the queue is full. Making the queue capacity configurable would be an elegant way to handle this problem. ### Solution Use a configuration to control the capacity of the queue. ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
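The behaviour described in this issue, a bounded queue that rejects new entries when full, can be illustrated with a small sketch. This is a C++ analogue written for illustration only; the plugin itself is Java and uses `LinkedBlockingDeque`, and the class name `BoundedAuditQueue` and its methods are hypothetical. With a capacity of 1, every event that arrives before the consumer drains the previous one is dropped; a configurable capacity lets bursts be absorbed.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical bounded queue: try_push never blocks and reports failure
// when the queue already holds `capacity` events (the "discard" case).
class BoundedAuditQueue {
public:
    explicit BoundedAuditQueue(std::size_t capacity) : _capacity(capacity) {}

    bool try_push(std::string event) {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_events.size() >= _capacity) {
            return false; // queue full: the caller would discard this audit event
        }
        _events.push_back(std::move(event));
        return true;
    }

    std::optional<std::string> try_pop() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_events.empty()) {
            return std::nullopt;
        }
        std::string event = std::move(_events.front());
        _events.pop_front();
        return event;
    }

private:
    const std::size_t _capacity;
    std::mutex _mutex;
    std::deque<std::string> _events;
};
```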
[GitHub] [doris] liaoxin01 opened a new pull request, #12886: [feature-wip](unique-key-merge-on-write) unique key with merge on write table support schema change
liaoxin01 opened a new pull request, #12886: URL: https://github.com/apache/doris/pull/12886 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] dutyu opened a new pull request, #12887: [enhancement](AuditLoaderPlugin): add audit queue capacity configurat…
dutyu opened a new pull request, #12887: URL: https://github.com/apache/doris/pull/12887 …ion and improve performance for datetime format. # Proposed changes Ease the audit log discard problem for auditloader plugin. Issue Number: close #12885 ## Problem summary See #12885 ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [*] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [*] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [*] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [*] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [*] No
[GitHub] [doris] mrhhsg opened a new pull request, #12888: [bugfix](scanner) remove invalid of '[[noreturn]]'
mrhhsg opened a new pull request, #12888: URL: https://github.com/apache/doris/pull/12888 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] superhanliu2 commented on a diff in pull request #12837: Update vars.sh
superhanliu2 commented on code in PR #12837: URL: https://github.com/apache/doris/pull/12837#discussion_r977643300 ## thirdparty/vars.sh: ## @@ -288,7 +288,7 @@ JEMALLOC_SOURCE="jemalloc-5.2.1" JEMALLOC_MD5SUM="3d41fbf006e6ebffd489bdb304d009ae" # cctz -CCTZ_DOWNLOAD="https://github.com/google/cctz/archive/v2.3.tar.gz" +CCTZ_DOWNLOAD="https://codeload.github.com/google/cctz/tar.gz/refs/tags/v2.3" Review Comment: I checked again today and found that the old value works too. It may have been a network problem yesterday, sorry. Please close this PR. Thanks a lot.
[GitHub] [doris] superhanliu2 commented on a diff in pull request #12837: Update vars.sh
superhanliu2 commented on code in PR #12837: URL: https://github.com/apache/doris/pull/12837#discussion_r977643300 ## thirdparty/vars.sh: ## @@ -288,7 +288,7 @@ JEMALLOC_SOURCE="jemalloc-5.2.1" JEMALLOC_MD5SUM="3d41fbf006e6ebffd489bdb304d009ae" # cctz -CCTZ_DOWNLOAD="https://github.com/google/cctz/archive/v2.3.tar.gz" +CCTZ_DOWNLOAD="https://codeload.github.com/google/cctz/tar.gz/refs/tags/v2.3" Review Comment: I checked again today and found that the old value works too. It may have been a network problem yesterday, sorry. Please close this PR. Thanks a lot. @jackwener @adonis0147
[GitHub] [doris] xinyiZzz opened a new pull request, #12889: [branch-1.1-lts](cherry-pick) Some fixes for mem tracker
xinyiZzz opened a new pull request, #12889: URL: https://github.com/apache/doris/pull/12889 # Proposed changes Issue Number: close #xxx ## Problem summary cherry-pick: https://github.com/apache/doris/pull/12666 https://github.com/apache/doris/pull/12339 https://github.com/apache/doris/pull/12682 https://github.com/apache/doris/pull/12688 https://github.com/apache/doris/pull/12708 https://github.com/apache/doris/pull/12782 https://github.com/apache/doris/pull/12776 ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] xinyiZzz merged pull request #12889: [branch-1.1-lts](cherry-pick) Some fixes for mem tracker
xinyiZzz merged PR #12889: URL: https://github.com/apache/doris/pull/12889
[doris] branch branch-1.1-lts updated: [branch-1.1-lts](cherry-pick) Some fixes for mem tracker (#12889)
This is an automated email from the ASF dual-hosted git repository. zouxinyi pushed a commit to branch branch-1.1-lts in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/branch-1.1-lts by this push: new d3006ddd12 [branch-1.1-lts](cherry-pick) Some fixes for mem tracker (#12889) d3006ddd12 is described below commit d3006ddd121f5c89dfea8f38f192de7d03fe5dd4 Author: Xinyi Zou AuthorDate: Thu Sep 22 21:47:45 2022 +0800 [branch-1.1-lts](cherry-pick) Some fixes for mem tracker (#12889) * [fix][memtracker] remove gc and fix print * [fix](memory) Fix BE OOM when load -238 fail * [fix](memtracker) Process physical mem check does not include tc/jemalloc allocator cache (#12688) tcmalloc/jemalloc allocator cache does not participate in the mem check as part of the process physical memory. because new/malloc will trigger mem hook when using tcmalloc/jemalloc allocator cache, but it may not actually alloc physical memory, which is not expected in mem hook fail. in addition: The value of tcmalloc/jemalloc allocator cache is used as a mem tracker, the parent is the process mem tracker, which is updated every 1s. Modify the process default mem_limit to 90%. expect mem tracker to effectively limit the memory usage of the process. * Fix memory leak by calling in mem hook (#12708) After the consume mem tracker exceeds the mem limit in the mem hook, the boost stacktrace will be printed. A query/load will only be printed once, and the process tracker will only be printed once per second. After the process memory reaches the upper limit, the boost stacktrace will be printed every second. 
The observed phenomena are as follows: After query/load is canceled, the memory increases instantly; tcmalloc profile total physical memory is less than perf process memory; The process mem tracker is smaller than the perf process memory; * [fix](memtracker) Fix thread mem tracker try consume accuracy #12782 * [Bugfix](mem) Fix memory limit check may overflow (#12776) This bug is because the result of subtracting signed and unsigned numbers may overflow if it is negative. Co-authored-by: Zhengguo Yang --- be/src/common/config.h | 2 +- be/src/common/daemon.cpp | 1 + be/src/http/default_path_handlers.cpp| 5 +- be/src/runtime/exec_env.h| 9 ++ be/src/runtime/exec_env_init.cpp | 1 + be/src/runtime/load_channel.cpp | 9 +- be/src/runtime/load_channel.h| 2 +- be/src/runtime/load_channel_mgr.cpp | 10 +- be/src/runtime/load_channel_mgr.h| 2 +- be/src/runtime/memory/mem_tracker.cpp| 9 +- be/src/runtime/memory/mem_tracker_limiter.cpp| 136 --- be/src/runtime/memory/mem_tracker_limiter.h | 134 +++--- be/src/runtime/memory/thread_mem_tracker_mgr.cpp | 11 +- be/src/runtime/memory/thread_mem_tracker_mgr.h | 18 +-- be/src/runtime/tablets_channel.cpp | 4 + be/src/service/doris_main.cpp| 10 +- be/src/util/mem_info.cpp | 15 ++- be/src/util/mem_info.h | 37 +- be/src/util/perf_counters.cpp| 6 + be/src/util/perf_counters.h | 6 +- be/src/util/system_metrics.cpp | 3 +- 21 files changed, 235 insertions(+), 195 deletions(-) diff --git a/be/src/common/config.h b/be/src/common/config.h index 7f1921d496..106609ee05 100644 --- a/be/src/common/config.h +++ b/be/src/common/config.h @@ -68,7 +68,7 @@ CONF_Int64(tc_max_total_thread_cache_bytes, "1073741824"); // defaults to bytes if no unit is given" // must larger than 0. and if larger than physical memory size, // it will be set to physical memory size. 
-CONF_String(mem_limit, "80%"); +CONF_String(mem_limit, "90%"); // the port heartbeat service used CONF_Int32(heartbeat_service_port, "9050"); diff --git a/be/src/common/daemon.cpp b/be/src/common/daemon.cpp index ea628bb100..bb39bf13ef 100644 --- a/be/src/common/daemon.cpp +++ b/be/src/common/daemon.cpp @@ -68,6 +68,7 @@ namespace doris { bool k_doris_exit = false; void Daemon::tcmalloc_gc_thread() { +// TODO All cache GC wish to be supported while (!_stop_background_threads_latch.wait_for(MonoDelta::FromSeconds(10))) { size_t used_size = 0; size_t free_size = 0; diff --git a/be/src/http/default_path_handlers.cpp b/be/src/http/default_path_handlers.cpp index c7cdcd2ad8..3efed02a09 100644 --- a/be/src/http/default_path_handlers.cpp +++ b/be/src/http/default_path_handlers.cpp @@ -32,6 +32,7 @@ #include "runtime/mem_tracker.h" #include "runtime/memory/mem_tracker_limiter.h" #include "util/d
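The overflow mentioned in the commit message above (#12776: subtracting signed and unsigned numbers) can be illustrated in isolation. The following is only a minimal sketch, not the Doris code: `exceeds_limit_buggy` and `exceeds_limit_fixed` are hypothetical helpers, the parameter names are assumptions chosen for illustration, and the fixed variant assumes `used` fits in an `int64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; they are not part of Doris.
// Both answer: would consuming `bytes` more push usage past `limit`?

// Buggy: `limit - used` mixes int64_t and uint64_t, so the subtraction is
// carried out in uint64_t. When used > limit the result wraps around to a
// huge positive number and the check wrongly reports that room is left.
inline bool exceeds_limit_buggy(int64_t limit, uint64_t used, int64_t bytes) {
    uint64_t remaining = limit - used;  // wraps around when used > limit
    return static_cast<uint64_t>(bytes) > remaining;
}

// Fixed: convert to a signed type before subtracting, so a negative
// remainder stays negative (assumes `used` fits in int64_t).
inline bool exceeds_limit_fixed(int64_t limit, uint64_t used, int64_t bytes) {
    int64_t remaining = limit - static_cast<int64_t>(used);
    return bytes > remaining;
}
```

With `limit = 100` and `used = 150`, the buggy variant claims a 10-byte allocation still fits, while the fixed variant rejects it.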
[GitHub] [doris] jackwener opened a new pull request, #12890: [fix](Nereids): fix Outer LAsscom and improve onConditon checker
jackwener opened a new pull request, #12890: URL: https://github.com/apache/doris/pull/12890 # Proposed changes Issue Number: close #xxx ## Problem summary - fix Outer LAsscom: the current implementation forgot to check the onCondition for Outer LAsscom - improve the onCondition checker. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] morrySnow closed pull request #12876: test bucket shuffle
morrySnow closed pull request #12876: test bucket shuffle URL: https://github.com/apache/doris/pull/12876
[GitHub] [doris] morrySnow opened a new pull request, #12891: [enhancement](Nereids) plan bucket shuffle join on fragment without scan node
morrySnow opened a new pull request, #12891: URL: https://github.com/apache/doris/pull/12891 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)
zhannngchen commented on code in PR #12866: URL: https://github.com/apache/doris/pull/12866#discussion_r977657612 ## be/src/olap/olap_server.cpp: ## @@ -700,6 +708,19 @@ Status StorageEngine::submit_quick_compaction_task(TabletSharedPtr tablet) { return Status::OK(); } +Status StorageEngine::_handle_seg_compaction(BetaRowsetWriter* writer, + SegCompactionCandidatesSharedPtr segments) { +writer->do_segcompaction(segments); +return Status::OK(); +} + +Status StorageEngine::submit_seg_compaction_task(BetaRowsetWriter* writer, + SegCompactionCandidatesSharedPtr segments) { +_seg_compaction_thread_pool->submit_func( Review Comment: ditto ## be/src/olap/olap_server.cpp: ## @@ -700,6 +708,19 @@ Status StorageEngine::submit_quick_compaction_task(TabletSharedPtr tablet) { return Status::OK(); } +Status StorageEngine::_handle_seg_compaction(BetaRowsetWriter* writer, + SegCompactionCandidatesSharedPtr segments) { +writer->do_segcompaction(segments); Review Comment: should be `return writer->do_segcompaction(segments);` ? ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const vectorized::Block* block) { return _add_block(block, &_segment_writer); } +vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader( +SegCompactionCandidatesSharedPtr segments, std::shared_ptr schema, +OlapReaderStatistics* stat) { +StorageReadOptions read_options; +read_options.stats = stat; +read_options.use_page_cache = false; +read_options.tablet_schema = _context.tablet_schema; +std::vector> seg_iterators; +for (auto& seg_ptr : *segments) { +std::unique_ptr iter; +auto s = seg_ptr->new_iterator(*schema, read_options, &iter); +if (!s.ok()) { +LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << "]: " << s.to_string(); Review Comment: should return here? ## be/src/common/config.h: ## @@ -875,6 +878,12 @@ CONF_Bool(enable_new_load_scan_node, "false"); // Temp config. True to use new file scanner. 
Will remove after fully test. CONF_Bool(enable_new_file_scanner, "false"); +CONF_Bool(enable_segcompaction, "false"); // currently only support vectorized storage +// Trigger segcompaction if the num of segments in a rowset exceeds this threshold. +CONF_Int32(segcompaction_threshold_segment_num, "10"); + +CONF_Int32(segcompaction_small_threshold, "100"); Review Comment: use 1048576 instead. ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const vectorized::Block* block) { return _add_block(block, &_segment_writer); } +vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader( +SegCompactionCandidatesSharedPtr segments, std::shared_ptr schema, +OlapReaderStatistics* stat) { +StorageReadOptions read_options; +read_options.stats = stat; +read_options.use_page_cache = false; +read_options.tablet_schema = _context.tablet_schema; +std::vector> seg_iterators; +for (auto& seg_ptr : *segments) { +std::unique_ptr iter; +auto s = seg_ptr->new_iterator(*schema, read_options, &iter); +if (!s.ok()) { +LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << "]: " << s.to_string(); +} +seg_iterators.push_back(std::move(iter)); +} +std::vector iterators; +for (auto& owned_it : seg_iterators) { +// transfer ownership +iterators.push_back(owned_it.release()); +} +bool is_unique = (_context.tablet_schema->keys_type() == UNIQUE_KEYS); +bool is_reverse = false; +auto merge_itr = vectorized::new_merge_iterator(iterators, -1, is_unique, is_reverse, nullptr); +merge_itr->init(read_options); + +return (vectorized::VMergeIterator*)merge_itr; +} + +std::unique_ptr BetaRowsetWriter::create_segcompaction_writer( +uint64_t begin, uint64_t end) { +Status status; +std::unique_ptr writer = nullptr; +status = _create_segment_writer_for_segcompaction(&writer, begin, end); +if (status != Status::OK()) { +writer = nullptr; +LOG(ERROR) << "failed to create segment writer for begin:" << begin << " end:" << end + << " path:" << 
writer->get_data_dir()->path(); +} +if (writer->get_data_dir()) Review Comment: `if (writer != nullptr && writer->get_data_dir())` ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const vectorized::Block* block) { return _add_block(block, &_segment_writer); } +vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_read
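Two of the review comments in the message above concern error handling: dereferencing `writer` right after it was set to `nullptr`, and returning `Status::OK()` while discarding the result of `do_segcompaction`. The sketch below illustrates both points under simplified assumptions. `Status`, `SegmentWriter`, `create_writer`, `describe_writer`, and the free functions are hypothetical stand-ins invented for this example, not the real Doris types or signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins for illustration; these are NOT the real Doris types.
struct Status {
    bool success = true;
    std::string message;
    bool ok() const { return success; }
};

inline Status ok_status() { return Status{}; }

inline Status error_status(const std::string& msg) {
    Status s;
    s.success = false;
    s.message = msg;
    return s;
}

struct SegmentWriter {
    std::string data_dir;
};

// Hypothetical factory: hands back nullptr when creation fails.
inline std::unique_ptr<SegmentWriter> create_writer(bool simulate_failure) {
    if (simulate_failure) {
        return nullptr;
    }
    auto writer = std::make_unique<SegmentWriter>();
    writer->data_dir = "/tmp/example";
    return writer;
}

// Only touches the writer after the null check, so the failure path
// never dereferences a null pointer.
inline std::string describe_writer(const std::unique_ptr<SegmentWriter>& writer) {
    if (writer != nullptr && !writer->data_dir.empty()) {
        return "writer created, path: " + writer->data_dir;
    }
    return "failed to create writer";
}

// Stand-in for the work item; fails when there is no writer.
inline Status do_segcompaction(const SegmentWriter* writer) {
    if (writer == nullptr) {
        return error_status("writer is null");
    }
    return ok_status();
}

// Propagates the inner status instead of dropping it and returning OK.
inline Status handle_seg_compaction(const SegmentWriter* writer) {
    return do_segcompaction(writer);
}
```

Returning the inner status lets the caller see a failed compaction, which an unconditional OK would hide.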
[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)
zhannngchen commented on code in PR #12866: URL: https://github.com/apache/doris/pull/12866#discussion_r977716456 ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const vectorized::Block* block) { return _add_block(block, &_segment_writer); } +vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader( +SegCompactionCandidatesSharedPtr segments, std::shared_ptr schema, +OlapReaderStatistics* stat) { +StorageReadOptions read_options; +read_options.stats = stat; +read_options.use_page_cache = false; +read_options.tablet_schema = _context.tablet_schema; +std::vector> seg_iterators; +for (auto& seg_ptr : *segments) { +std::unique_ptr iter; +auto s = seg_ptr->new_iterator(*schema, read_options, &iter); +if (!s.ok()) { +LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << "]: " << s.to_string(); +} +seg_iterators.push_back(std::move(iter)); +} +std::vector iterators; +for (auto& owned_it : seg_iterators) { +// transfer ownership +iterators.push_back(owned_it.release()); +} +bool is_unique = (_context.tablet_schema->keys_type() == UNIQUE_KEYS); +bool is_reverse = false; +auto merge_itr = vectorized::new_merge_iterator(iterators, -1, is_unique, is_reverse, nullptr); +merge_itr->init(read_options); + +return (vectorized::VMergeIterator*)merge_itr; +} + +std::unique_ptr BetaRowsetWriter::create_segcompaction_writer( +uint64_t begin, uint64_t end) { +Status status; +std::unique_ptr writer = nullptr; +status = _create_segment_writer_for_segcompaction(&writer, begin, end); +if (status != Status::OK()) { +writer = nullptr; +LOG(ERROR) << "failed to create segment writer for begin:" << begin << " end:" << end + << " path:" << writer->get_data_dir()->path(); +} +if (writer->get_data_dir()) +LOG(INFO) << "segcompaction segment writer created for begin:" << begin << " end:" << end + << " path:" << writer->get_data_dir()->path(); +return writer; +} + +Status 
BetaRowsetWriter::delete_original_segments(uint32_t begin, uint32_t end) { +auto fs = _rowset_meta->fs(); +if (!fs) { +return Status::OLAPInternalError(OLAP_ERR_INIT_FAILED); +} +for (uint32_t i = begin; i <= end; ++i) { +auto seg_path = BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, i); +// Even if an error is encountered, these files that have not been cleaned up +// will be cleaned up by the GC background. So here we only print the error +// message when we encounter an error. +WARN_IF_ERROR(fs->delete_file(seg_path), + strings::Substitute("Failed to delete file=$0", seg_path)); +} +return Status::OK(); +} + +void BetaRowsetWriter::rename_compacted_segments(int64_t begin, int64_t end) { +int ret; +auto src_seg_path = BetaRowset::local_segment_path_segcompacted(_context.tablet_path, + _context.rowset_id, begin, end); +auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, + _num_segcompacted++); +ret = rename(src_seg_path.c_str(), dst_seg_path.c_str()); +DCHECK_EQ(ret, 0); +} + +// todo: will rename only do the job? maybe need deep modification +void BetaRowsetWriter::rename_compacted_segment_plain(uint64_t seg_id) { +int ret; +auto src_seg_path = +BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, seg_id); +auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, + _num_segcompacted++); +LOG(INFO) << "segcompaction skip this segment. rename " << src_seg_path << " to " + << dst_seg_path; +if (src_seg_path.compare(dst_seg_path) != 0) { +CHECK_EQ(_segid_statistics_map.find(seg_id + 1) == _segid_statistics_map.end(), false); Review Comment: DCHECK_EQ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [doris] HappenLee opened a new pull request, #12892: [config](vec) control num free block by be config
HappenLee opened a new pull request, #12892: URL: https://github.com/apache/doris/pull/12892 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #12867: [Improvement](predicate) Replace for-loop by memcpy
github-actions[bot] commented on PR #12867: URL: https://github.com/apache/doris/pull/12867#issuecomment-1255102655 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12867: [Improvement](predicate) Replace for-loop by memcpy
github-actions[bot] commented on PR #12867: URL: https://github.com/apache/doris/pull/12867#issuecomment-1255102723 PR approved by anyone and no changes requested.
[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)
zhannngchen commented on code in PR #12866: URL: https://github.com/apache/doris/pull/12866#discussion_r977743775 ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -102,6 +110,284 @@ Status BetaRowsetWriter::add_block(const vectorized::Block* block) { return _add_block(block, &_segment_writer); } +vectorized::VMergeIterator* BetaRowsetWriter::get_segcompaction_reader( +SegCompactionCandidatesSharedPtr segments, std::shared_ptr schema, +OlapReaderStatistics* stat) { +StorageReadOptions read_options; +read_options.stats = stat; +read_options.use_page_cache = false; +read_options.tablet_schema = _context.tablet_schema; +std::vector> seg_iterators; +for (auto& seg_ptr : *segments) { +std::unique_ptr iter; +auto s = seg_ptr->new_iterator(*schema, read_options, &iter); +if (!s.ok()) { +LOG(WARNING) << "failed to create iterator[" << seg_ptr->id() << "]: " << s.to_string(); +} +seg_iterators.push_back(std::move(iter)); +} +std::vector iterators; +for (auto& owned_it : seg_iterators) { +// transfer ownership +iterators.push_back(owned_it.release()); +} +bool is_unique = (_context.tablet_schema->keys_type() == UNIQUE_KEYS); +bool is_reverse = false; +auto merge_itr = vectorized::new_merge_iterator(iterators, -1, is_unique, is_reverse, nullptr); +merge_itr->init(read_options); + +return (vectorized::VMergeIterator*)merge_itr; +} + +std::unique_ptr BetaRowsetWriter::create_segcompaction_writer( +uint64_t begin, uint64_t end) { +Status status; +std::unique_ptr writer = nullptr; +status = _create_segment_writer_for_segcompaction(&writer, begin, end); +if (status != Status::OK()) { +writer = nullptr; +LOG(ERROR) << "failed to create segment writer for begin:" << begin << " end:" << end + << " path:" << writer->get_data_dir()->path(); +} +if (writer->get_data_dir()) +LOG(INFO) << "segcompaction segment writer created for begin:" << begin << " end:" << end + << " path:" << writer->get_data_dir()->path(); +return writer; +} + +Status 
BetaRowsetWriter::delete_original_segments(uint32_t begin, uint32_t end) { +auto fs = _rowset_meta->fs(); +if (!fs) { +return Status::OLAPInternalError(OLAP_ERR_INIT_FAILED); +} +for (uint32_t i = begin; i <= end; ++i) { +auto seg_path = BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, i); +// Even if an error is encountered, these files that have not been cleaned up +// will be cleaned up by the GC background. So here we only print the error +// message when we encounter an error. +WARN_IF_ERROR(fs->delete_file(seg_path), + strings::Substitute("Failed to delete file=$0", seg_path)); +} +return Status::OK(); +} + +void BetaRowsetWriter::rename_compacted_segments(int64_t begin, int64_t end) { +int ret; +auto src_seg_path = BetaRowset::local_segment_path_segcompacted(_context.tablet_path, + _context.rowset_id, begin, end); +auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, + _num_segcompacted++); +ret = rename(src_seg_path.c_str(), dst_seg_path.c_str()); +DCHECK_EQ(ret, 0); +} + +// todo: will rename only do the job? maybe need deep modification +void BetaRowsetWriter::rename_compacted_segment_plain(uint64_t seg_id) { +int ret; +auto src_seg_path = +BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, seg_id); +auto dst_seg_path = BetaRowset::local_segment_path(_context.tablet_path, _context.rowset_id, + _num_segcompacted++); +LOG(INFO) << "segcompaction skip this segment. 
rename " << src_seg_path << " to " + << dst_seg_path; +if (src_seg_path.compare(dst_seg_path) != 0) { +CHECK_EQ(_segid_statistics_map.find(seg_id + 1) == _segid_statistics_map.end(), false); +CHECK_EQ(_segid_statistics_map.find(_num_segcompacted) == _segid_statistics_map.end(), + true); +statistics org = _segid_statistics_map[seg_id + 1]; +_segid_statistics_map.emplace(_num_segcompacted, org); +clear_statistics_for_deleting_segments(seg_id, seg_id); +ret = rename(src_seg_path.c_str(), dst_seg_path.c_str()); +DCHECK_EQ(ret, 0); +} +} + +void BetaRowsetWriter::clear_statistics_for_deleting_segments(uint64_t begin, uint64_t end) { +LOG(INFO) << "_segid_statistics_map clear record segid range from:" << begin + 1 + << " to:" << end + 1; +for (int i = begin; i <= end; ++i) { +_segid_statistics_map.erase(i + 1); +} +} + +Status B
[GitHub] [doris] BePPPower commented on a diff in pull request #12848: [feature-wip](new-scan)Add new jdbc scanner and new jdbc scan node
BePPPower commented on code in PR #12848: URL: https://github.com/apache/doris/pull/12848#discussion_r977757888 ## be/src/vec/exec/scan/new_jdbc_scan_node.cpp: ## @@ -0,0 +1,62 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "vec/exec/scan/new_jdbc_scan_node.h" +#ifdef LIBJVM + +#include "vec/exec/scan/new_jdbc_scanner.h" +#include "vec/exec/scan/vscanner.h" +namespace doris::vectorized { +NewJdbcScanNode::NewJdbcScanNode(ObjectPool* pool, const TPlanNode& tnode, + const DescriptorTbl& descs) +: VScanNode(pool, tnode, descs), + _table_name(tnode.jdbc_scan_node.table_name), + _tuple_id(tnode.jdbc_scan_node.tuple_id), + _query_string(tnode.jdbc_scan_node.query_string) { +_output_tuple_id = tnode.jdbc_scan_node.tuple_id; +} + +std::string NewJdbcScanNode::get_name() { +return fmt::format("VNewJdbcScanNode({0})", _table_name); +} + +Status NewJdbcScanNode::prepare(RuntimeState* state) { +VLOG_CRITICAL << "VNewJdbcScanNode::Prepare"; +RETURN_IF_ERROR(VScanNode::prepare(state)); +SCOPED_CONSUME_MEM_TRACKER(mem_tracker()); Review Comment: This does not seem to be needed here, because `VScanNode::prepare` has already done `SCOPED_CONSUME_MEM_TRACKER(mem_tracker())`.
[GitHub] [doris] zhannngchen commented on a diff in pull request #12866: [enhancement](compaction) introduce segment compaction (#12609)
zhannngchen commented on code in PR #12866: URL: https://github.com/apache/doris/pull/12866#discussion_r977705741 ## be/src/olap/rowset/beta_rowset_writer.cpp: ## @@ -309,12 +641,23 @@ Status BetaRowsetWriter::_create_segment_writer( DCHECK(file_writer != nullptr); segment_v2::SegmentWriterOptions writer_options; writer_options.enable_unique_key_merge_on_write = _context.enable_unique_key_merge_on_write; -writer->reset(new segment_v2::SegmentWriter(file_writer.get(), _num_segment, -_context.tablet_schema, _context.data_dir, -_context.max_rows_per_segment, writer_options)); -{ -std::lock_guard l(_lock); -_file_writers.push_back(std::move(file_writer)); + +if (is_segcompaction) { +writer->reset(new segment_v2::SegmentWriter(file_writer.get(), _num_segcompacted + 1, Review Comment: Is the only difference between these two branches the `segment_id` parameter?
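If, as the review comment above asks, the two branches really differ only in the segment id passed to the `SegmentWriter` constructor, the id could be chosen once before a single construction call. This is a minimal sketch under that assumption: `pick_segment_id` is a hypothetical helper, not Doris code, and the real branches may differ in other ways that the truncated diff does not show (for example, in how the file writer is stored).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not part of Doris.
// Mirrors the two values visible in the diff: `_num_segment` for the normal
// path and `_num_segcompacted + 1` for the segcompaction path.
inline uint32_t pick_segment_id(bool is_segcompaction, uint32_t num_segment,
                                uint32_t num_segcompacted) {
    return is_segcompaction ? num_segcompacted + 1 : num_segment;
}
```

The caller would then construct the writer once with the selected id, leaving only genuinely different steps inside the `if`.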
[GitHub] [doris] morningman opened a new pull request, #12893: [improvement](load) support loading data with missing column
morningman opened a new pull request, #12893: URL: https://github.com/apache/doris/pull/12893 # Proposed changes Issue Number: close #xxx ## Problem summary This PR builds on #11742 and adds Arrow reader support. If a table has 5 columns and the file has only 4, the load can still finish; the missing column is loaded as a default null column. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
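The missing-column behaviour described in this PR can be pictured with a small sketch. This is not the Doris loader: `MissingColumnFill` and `alignRow` are hypothetical names, and a row is modelled as a plain map from column name to string value, purely to illustrate "5 columns in the table, 4 in the file, the fifth is loaded as null".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Doris code base.
final class MissingColumnFill {
    /**
     * Builds one output row in table-column order. A column that the file
     * does not provide is filled with null instead of failing the load.
     */
    static List<String> alignRow(List<String> tableColumns, Map<String, String> fileRow) {
        List<String> row = new ArrayList<>(tableColumns.size());
        for (String column : tableColumns) {
            // Map.get returns null when the file has no such column.
            row.add(fileRow.get(column));
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> tableColumns = Arrays.asList("k1", "k2", "k3", "k4", "k5");
        Map<String, String> fileRow = new HashMap<>();
        fileRow.put("k1", "a");
        fileRow.put("k2", "b");
        fileRow.put("k3", "c");
        fileRow.put("k4", "d");
        // The file has no k5, so the last slot of the row is null.
        System.out.println(alignRow(tableColumns, fileRow));
    }
}
```

A real loader would presumably also check that the missing column is nullable or has a default value; the sketch leaves that check out.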
[GitHub] [doris] englefly opened a new pull request, #12894: [feature](nereids) extract single table expression for push down
englefly opened a new pull request, #12894: URL: https://github.com/apache/doris/pull/12894 # Proposed changes In TPC-H q7, we have an expression like ``` (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY') or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')``` This expression implies `(n1.n_name = 'FRANCE' or n1.n_name = 'GERMANY')`. The implied expression is logically redundant, but it can reduce the number of tuples output by scan(n1) if Nereids pushes it down. This PR introduces a rule to extract such expressions. NOTE: 1. We only extract expressions on a single table. 2. If the extracted expression cannot be pushed down, e.g. it is on the right table of a left outer join, we need another rule to remove all the useless expressions. Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
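The extraction idea in this PR can be sketched on a toy representation. The code below is not the Nereids rule: `SingleTableExtractor`, `Atom`, `extract`, and `q7` are hypothetical names, and the predicate is assumed to already be an OR of ANDs of simple equality atoms. For one table, it keeps only that table's atoms from each disjunct; if any disjunct says nothing about the table, no implied predicate exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the Nereids implementation.
final class SingleTableExtractor {
    /** An equality predicate such as n1.n_name = 'FRANCE'. */
    static final class Atom {
        final String table;
        final String column;
        final String value;

        Atom(String table, String column, String value) {
            this.table = table;
            this.column = column;
            this.value = value;
        }

        @Override
        public String toString() {
            return table + "." + column + " = '" + value + "'";
        }
    }

    /**
     * dnf is an OR of ANDs. Returns the predicate implied on the given table,
     * or empty if some disjunct places no constraint on that table.
     */
    static Optional<String> extract(List<List<Atom>> dnf, String table) {
        List<String> perDisjunct = new ArrayList<>();
        for (List<Atom> conjunct : dnf) {
            List<String> onTable = conjunct.stream()
                    .filter(a -> a.table.equals(table))
                    .map(Atom::toString)
                    .collect(Collectors.toList());
            if (onTable.isEmpty()) {
                // This branch of the OR allows any row of the table.
                return Optional.empty();
            }
            perDisjunct.add(String.join(" and ", onTable));
        }
        if (perDisjunct.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of("(" + String.join(" or ", perDisjunct) + ")");
    }

    /** The q7 predicate quoted in the PR description. */
    static List<List<Atom>> q7() {
        return Arrays.asList(
                Arrays.asList(new Atom("n1", "n_name", "FRANCE"), new Atom("n2", "n_name", "GERMANY")),
                Arrays.asList(new Atom("n1", "n_name", "GERMANY"), new Atom("n2", "n_name", "FRANCE")));
    }

    public static void main(String[] args) {
        System.out.println(extract(q7(), "n1").orElse("<nothing to extract>"));
    }
}
```

The extracted predicate is weaker than the original, so the original filter still has to be evaluated after the join; the extracted one only narrows the scan.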
[GitHub] [doris] zhannngchen closed pull request #12862: [debug](test)a test pr for qa pipeline debug, will not merge
zhannngchen closed pull request #12862: [debug](test)a test pr for qa pipeline debug, will not merge URL: https://github.com/apache/doris/pull/12862
[GitHub] [doris] dinggege1024 opened a new issue, #12895: [Enhancement] spark load support ORC format table
dinggege1024 opened a new issue, #12895: URL: https://github.com/apache/doris/issues/12895 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description Until now, Doris Spark Load has not supported ORC format files. I would like to help with this. Is there anything I need to pay attention to? ### Solution _No response_ ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] sahilm-10 commented on issue #11706: Good First Issue
sahilm-10 commented on issue #11706: URL: https://github.com/apache/doris/issues/11706#issuecomment-1255198436 @luzhijing I am interested in DOCS & BLOGS TRANSLATION. I am new to open source. If there are any posts remaining to be assigned, please assign one to me. I want to contribute.
[GitHub] [doris] HappenLee commented on pull request #12892: [config](vec) control num free block by be config
HappenLee commented on PR #12892: URL: https://github.com/apache/doris/pull/12892#issuecomment-1255210036 > please also modify free block in `src/vec/exec/scan/scanner_context.cpp` This is just a test PR in branch `opt_perf`. If it's effective, I will create a new PR to do this in the `master` branch.
[GitHub] [doris] wsjz opened a new pull request, #12896: [feature-wip](parquet-reader) refactor parquet_predicate
wsjz opened a new pull request, #12896: URL: https://github.com/apache/doris/pull/12896 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #12838: [Bug](view) Show create view support comment
github-actions[bot] commented on PR #12838: URL: https://github.com/apache/doris/pull/12838#issuecomment-1255403424 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12838: [Bug](view) Show create view support comment
github-actions[bot] commented on PR #12838: URL: https://github.com/apache/doris/pull/12838#issuecomment-1255403458 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12820: [fix](streamload&sink) release and allocate memory in the same tracker
github-actions[bot] commented on PR #12820: URL: https://github.com/apache/doris/pull/12820#issuecomment-1255513860 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] closed pull request #8603: fix string default value bug
github-actions[bot] closed pull request #8603: fix string default value bug URL: https://github.com/apache/doris/pull/8603
[GitHub] [doris] yiguolei merged pull request #12870: [Bug](date)(1.1-lts) Fix wrong result produced by date function
yiguolei merged PR #12870: URL: https://github.com/apache/doris/pull/12870
[doris] branch branch-1.1-lts updated: [Bug](date) Fix wrong result produced by date function (#12870)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch branch-1.1-lts in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/branch-1.1-lts by this push: new 97e51a11e0 [Bug](date) Fix wrong result produced by date function (#12870) 97e51a11e0 is described below commit 97e51a11e068dc44e7390e9e66279d4858e188a0 Author: Gabriel AuthorDate: Fri Sep 23 08:50:36 2022 +0800 [Bug](date) Fix wrong result produced by date function (#12870) --- .../src/main/java/org/apache/doris/analysis/DateLiteral.java | 6 +- .../src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java | 8 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java index 9de0ae375d..5531936bd0 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/DateLiteral.java @@ -387,7 +387,11 @@ public class DateLiteral extends LiteralExpr { @Override public long getLongValue() { -return (year * 10000 + month * 100 + day) * 1000000L + hour * 10000 + minute * 100 + second; +if (this.getType().isDate()) { +return year * 10000 + month * 100 + day; +} else { +return (year * 10000 + month * 100 + day) * 1000000L + hour * 10000 + minute * 100 + second; +} } @Override diff --git a/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java b/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java index af93c5ceda..66d7920171 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/rewrite/FEFunctionsTest.java @@ -98,22 +98,22 @@ public class FEFunctionsTest { @Test public void dateAddTest() throws AnalysisException { DateLiteral actualResult = FEFunctions.dateAdd(new DateLiteral("2018-08-08", Type.DATE), new IntLiteral(1)); 
-DateLiteral expectedResult = new DateLiteral("2018-08-09 00:00:00", Type.DATETIME); +DateLiteral expectedResult = new DateLiteral("2018-08-09", Type.DATE); Assert.assertEquals(expectedResult, actualResult); actualResult = FEFunctions.dateAdd(new DateLiteral("2018-08-08", Type.DATE), new IntLiteral(-1)); -expectedResult = new DateLiteral("2018-08-07 00:00:00", Type.DATETIME); +expectedResult = new DateLiteral("2018-08-07", Type.DATE); Assert.assertEquals(expectedResult, actualResult); } @Test public void addDateTest() throws AnalysisException { DateLiteral actualResult = FEFunctions.addDate(new DateLiteral("2018-08-08", Type.DATE), new IntLiteral(1)); -DateLiteral expectedResult = new DateLiteral("2018-08-09 00:00:00", Type.DATETIME); +DateLiteral expectedResult = new DateLiteral("2018-08-09", Type.DATE); Assert.assertEquals(expectedResult, actualResult); actualResult = FEFunctions.addDate(new DateLiteral("2018-08-08", Type.DATE), new IntLiteral(-1)); -expectedResult = new DateLiteral("2018-08-07 00:00:00", Type.DATETIME); +expectedResult = new DateLiteral("2018-08-07", Type.DATE); Assert.assertEquals(expectedResult, actualResult); }
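The commit above makes `getLongValue()` return a date-only number for DATE literals instead of always appending a time part. A minimal standalone sketch of that distinction follows. `DateLongValue`, `dateValue`, and `dateTimeValue` are hypothetical names, and the multipliers assume the usual decimal packing (yyyyMMdd for dates, yyyyMMddHHmmss for datetimes); they are not copied from the Doris source.

```java
// Hypothetical illustration only; not the Doris DateLiteral class.
final class DateLongValue {
    /** Packs a date as the decimal number yyyyMMdd (assumed layout). */
    static long dateValue(long year, long month, long day) {
        return year * 10000 + month * 100 + day;
    }

    /** Packs a datetime as the decimal number yyyyMMddHHmmss (assumed layout). */
    static long dateTimeValue(long year, long month, long day,
                              long hour, long minute, long second) {
        return dateValue(year, month, day) * 1000000L
                + hour * 10000 + minute * 100 + second;
    }

    public static void main(String[] args) {
        // A DATE and a midnight DATETIME on the same day pack to different
        // numbers, so comparing the two long values directly would be wrong.
        System.out.println(dateValue(2018, 8, 9));
        System.out.println(dateTimeValue(2018, 8, 9, 0, 0, 0));
    }
}
```

Under this packing, a DATE value is six decimal digits shorter than the DATETIME value for the same day, which may be the kind of mismatch behind the wrong results the commit title refers to.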
[GitHub] [doris] yiguolei merged pull request #12869: [Bug](date)(1.1-lts) Fix wrong type in TimestampArithmeticExpr
yiguolei merged PR #12869: URL: https://github.com/apache/doris/pull/12869
[doris] branch branch-1.1-lts updated: [Bug](date) Fix wrong type in TimestampArithmeticExpr (#12869)
This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch branch-1.1-lts in repository https://gitbox.apache.org/repos/asf/doris.git The following commit(s) were added to refs/heads/branch-1.1-lts by this push: new 7b7e61d8c7 [Bug](date) Fix wrong type in TimestampArithmeticExpr (#12869) 7b7e61d8c7 is described below commit 7b7e61d8c7e24f8f99595dc6f8c4f4b63ef4815b Author: Gabriel AuthorDate: Fri Sep 23 08:51:31 2022 +0800 [Bug](date) Fix wrong type in TimestampArithmeticExpr (#12869) --- .../apache/doris/analysis/TimestampArithmeticExpr.java | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java index c04f16c6b8..26b0a82425 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/TimestampArithmeticExpr.java @@ -213,8 +213,21 @@ public class TimestampArithmeticExpr extends Expr { (op == ArithmeticExpr.Operator.ADD) ? "ADD" : "SUB"); } -fn = getBuiltinFunction(analyzer, funcOpName.toLowerCase(), -collectChildReturnTypes(), Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF); +Type[] childrenTypes = collectChildReturnTypes(); +fn = getBuiltinFunction(funcOpName.toLowerCase(), childrenTypes, +Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF); +Preconditions.checkArgument(fn != null); +Type[] argTypes = fn.getArgs(); +if (argTypes.length > 0) { +// Implicitly cast all the children to match the function if necessary +for (int i = 0; i < childrenTypes.length; ++i) { +// For varargs, we must compare with the last type in callArgs.argTypes. 
+int ix = Math.min(argTypes.length - 1, i); +if (!childrenTypes[i].matchesType(argTypes[ix]) && !( +childrenTypes[i].isDateType() && argTypes[ix].isDateType())) { +uncheckedCastChild(argTypes[ix], i); +} +} LOG.debug("fn is {} name is {}", fn, funcOpName); }
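The `Math.min(argTypes.length - 1, i)` line in this diff maps each child to the declared parameter type it must match, reusing the last declared type for extra varargs children. A small sketch of just that index mapping is below. `VarargsArgIndex`, `declaredIndex`, and `expectedTypes` are hypothetical names, and types are modelled as plain strings rather than Doris `Type` objects.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Doris TimestampArithmeticExpr code.
final class VarargsArgIndex {
    /** Index of the declared parameter type that child number childIndex must match. */
    static int declaredIndex(int declaredCount, int childIndex) {
        // Children beyond the declared list all map to the last declared type.
        return Math.min(declaredCount - 1, childIndex);
    }

    /** Expected type for each of childCount children, given the declared types. */
    static String[] expectedTypes(String[] declared, int childCount) {
        String[] expected = new String[childCount];
        for (int i = 0; i < childCount; i++) {
            expected[i] = declared[declaredIndex(declared.length, i)];
        }
        return expected;
    }

    public static void main(String[] args) {
        String[] declared = {"DATETIME", "INT"};
        // Four children against two declared types: the last type repeats.
        System.out.println(Arrays.toString(expectedTypes(declared, 4)));
    }
}
```

The sketch assumes at least one declared type, which matches the `argTypes.length > 0` guard in the diff.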
[GitHub] [doris] yiguolei merged pull request #12873: [feature](outfile)(1.1-lts) support parquet writer
yiguolei merged PR #12873: URL: https://github.com/apache/doris/pull/12873