[GitHub] [doris] github-actions[bot] commented on pull request #12498: [feature](restore) add restore new property 'reserve_dynamic_partition_enable'
github-actions[bot] commented on PR #12498: URL: https://github.com/apache/doris/pull/12498#issuecomment-1254451079 PR approved by at least one committer and no changes requested. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org
[GitHub] [doris] zy-kkk opened a new pull request, #12844: [test](join)add join case3
zy-kkk opened a new pull request, #12844:
URL: https://github.com/apache/doris/pull/12844

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] stalary merged pull request #12513: [feature](http) refactor version info and add new http api for get version info
stalary merged PR #12513: URL: https://github.com/apache/doris/pull/12513
[GitHub] [doris] liaoxin01 closed pull request #12363: [feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance
liaoxin01 closed pull request #12363: [feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance URL: https://github.com/apache/doris/pull/12363
[GitHub] [doris] liaoxin01 commented on pull request #12363: [feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance
liaoxin01 commented on PR #12363: URL: https://github.com/apache/doris/pull/12363#issuecomment-1254456751 > Have we tested the stream load performance on this patch?
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #12803: [Opt](Vectorized) Support push down no grouping agg
BiteThet commented on code in PR #12803:
URL: https://github.com/apache/doris/pull/12803#discussion_r977146006

## be/src/olap/rowset/segment_v2/column_reader.cpp: ##

@@ -171,6 +171,44 @@ Status ColumnReader::get_row_ranges_by_zone_map(
 return Status::OK();
 }

+Status ColumnReader::next_batch_of_zone_map(size_t* n, vectorized::MutableColumnPtr& dst) const {
+// TODO: this work to get min/max value seems should only do once
+FieldType type = _type_info->type();
+std::unique_ptr min_value(WrapperField::create_by_type(type, _meta.length()));
+std::unique_ptr max_value(WrapperField::create_by_type(type, _meta.length()));
+_parse_zone_map(_zone_map_index_meta->segment_zone_map(), min_value.get(), max_value.get());
+
+dst->reserve(*n);
+bool is_string = is_olap_string_type(type);
+if (max_value->is_null()) {
+assert_cast(*dst).insert_default();
+} else {
+if (is_string) {
+auto sv = (StringValue*)max_value->cell_ptr();
+dst->insert_data(sv->ptr, sv->len);
+} else {
+dst->insert_many_fix_len_data(static_cast(max_value->cell_ptr()), 1);
+}

Review Comment:
Why not just use `insert_data` here?
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #12803: [Opt](Vectorized) Support push down no grouping agg
BiteThet commented on code in PR #12803:
URL: https://github.com/apache/doris/pull/12803#discussion_r977146623

## be/src/olap/rowset/segment_v2/column_reader.cpp: ##

@@ -171,6 +171,44 @@ Status ColumnReader::get_row_ranges_by_zone_map(
 return Status::OK();
 }

+Status ColumnReader::next_batch_of_zone_map(size_t* n, vectorized::MutableColumnPtr& dst) const {
+// TODO: this work to get min/max value seems should only do once
+FieldType type = _type_info->type();
+std::unique_ptr min_value(WrapperField::create_by_type(type, _meta.length()));
+std::unique_ptr max_value(WrapperField::create_by_type(type, _meta.length()));
+_parse_zone_map(_zone_map_index_meta->segment_zone_map(), min_value.get(), max_value.get());
+
+dst->reserve(*n);
+bool is_string = is_olap_string_type(type);
+if (max_value->is_null()) {
+assert_cast(*dst).insert_default();
+} else {
+if (is_string) {
+auto sv = (StringValue*)max_value->cell_ptr();
+dst->insert_data(sv->ptr, sv->len);
+} else {
+dst->insert_many_fix_len_data(static_cast(max_value->cell_ptr()), 1);
+}
+}
+
+auto size = *n - 1;
+if (min_value->is_null()) {
+assert_cast(*dst).insert_null_elements(size);
+} else {
+if (is_string) {
+auto sv = (StringValue*)min_value->cell_ptr();
+dst->insert_many_data(sv->ptr, sv->len, size);
+} else {
+// TODO: the work may cause performance problem, opt latter
+for (int i = 0; i < size; ++i) {
+dst->insert_many_fix_len_data(static_cast(min_value->cell_ptr()), 1);
+}

Review Comment:
Maybe we can use `insert_many_fix_len_data( ,size)`
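The idea behind this suggestion, replacing a loop of single-value inserts with one bulk insert, can be sketched as below. This is only a toy illustration under stated assumptions: `ToyFixedColumn`, `insert_repeated`, `int32_at`, and `fill_from_zone_map` are hypothetical names, not the Doris column API, and the thread does not show whether `insert_many_fix_len_data` repeats one value or copies consecutive values. The sketch also guards against `n == 0`, which is an added assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a fixed-width column; NOT the Doris IColumn API.
struct ToyFixedColumn {
    std::size_t value_size;   // width of one value in bytes
    std::vector<char> bytes;  // densely packed values

    explicit ToyFixedColumn(std::size_t width) : value_size(width) { assert(width > 0); }

    // Append `count` copies of the single value at `value` with one resize,
    // instead of growing the buffer once per inserted value.
    void insert_repeated(const char* value, std::size_t count) {
        const std::size_t old_size = bytes.size();
        bytes.resize(old_size + count * value_size);
        for (std::size_t i = 0; i < count; ++i) {
            std::memcpy(bytes.data() + old_size + i * value_size, value, value_size);
        }
    }

    std::size_t size() const { return bytes.size() / value_size; }

    // Read back one 32-bit value; only valid for columns of int32 width.
    std::int32_t int32_at(std::size_t row) const {
        assert(value_size == sizeof(std::int32_t) && row < size());
        std::int32_t v;
        std::memcpy(&v, bytes.data() + row * value_size, sizeof(v));
        return v;
    }
};

// Mirrors the shape of the reviewed code: one copy of the max value,
// followed by n - 1 copies of the min value, written in a single bulk call.
ToyFixedColumn fill_from_zone_map(std::int32_t min_value, std::int32_t max_value, std::size_t n) {
    ToyFixedColumn col(sizeof(std::int32_t));
    if (n == 0) {
        return col;  // assumption: avoid the unsigned wrap-around of n - 1
    }
    col.bytes.reserve(n * sizeof(std::int32_t));
    col.insert_repeated(reinterpret_cast<const char*>(&max_value), 1);
    col.insert_repeated(reinterpret_cast<const char*>(&min_value), n - 1);
    return col;
}
```

With this sketch, `fill_from_zone_map(3, 9, 4)` produces a column holding 9, 3, 3, 3.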
[GitHub] [doris] adonis0147 opened a new pull request, #12845: [Enhancement](debugging) Add more debug info for clang build
adonis0147 opened a new pull request, #12845:
URL: https://github.com/apache/doris/pull/12845

# Proposed changes

Issue Number: close #12843

Add a compile option for clang: `-fno-limit-debug-info`

## Problem summary

Currently, if we build the project with clang, we can't view the contents of some objects (e.g. `std::string` objects) in gdb. This makes debugging inconvenient because such objects are very common. For more information, see [Cannot view std::string when compiled with clang](https://stackoverflow.com/questions/41745527/cannot-view-stdstring-when-compiled-with-clang).

### Example

**source code: test.cc**

```cpp
#include <iostream>
#include <string>

int main() {
    std::string s = "Hello, world!";
    std::cout << s << std::endl;
    return 0;
}
```

**Debugging with GDB**

```gdb
(gdb) l
1       #include <iostream>
2       #include <string>
3
4       int main() {
5           std::string s = "Hello, world!";
6           std::cout << s << std::endl;
7           return 0;
8       }
(gdb) b 6
Breakpoint 1 at 0x14e0e: file test.cc, line 6.
(gdb) r
Starting program: /ssd2/lingcong/misc/clang/test

Breakpoint 1, main () at test.cc:6
6           std::cout << s << std::endl;
```

### Before

```shell
clang++ -g test.cc -o test
```

```gdb
(gdb) p s
$1 = <incomplete type>
(gdb) p s.c_str()
Couldn't find method std::string::c_str
(gdb)
```

### After

```shell
clang++ -fno-limit-debug-info -g test.cc -o test
```

```gdb
(gdb) p s
$1 = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fffe358 "Hello, world!"}, _M_string_length = 13, {_M_local_buf = "Hello, world!\000\000", _M_allocated_capacity = 8583909746840200520}}
(gdb) p s.c_str()
$2 = 0x7fffe358 "Hello, world!"
(gdb)
```

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [ ] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [ ] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [ ] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [ ] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] morrySnow merged pull request #12765: [feature-wip](statistics) collect statistics by sql task
morrySnow merged PR #12765: URL: https://github.com/apache/doris/pull/12765
[GitHub] [doris] morrySnow merged pull request #12766: [feature-wip](statistics) add statistics module related syntax
morrySnow merged PR #12766: URL: https://github.com/apache/doris/pull/12766
[GitHub] [doris] zhannngchen commented on a diff in pull request #12716: [Enhancement](load) Refine the load channel flush policy on mem limit
zhannngchen commented on code in PR #12716:
URL: https://github.com/apache/doris/pull/12716#discussion_r977151695

## be/src/runtime/tablets_channel.cpp: ##

@@ -196,77 +196,94 @@ void TabletsChannel::_close_wait(DeltaWriter* writer,
 template
 Status TabletsChannel::reduce_mem_usage(int64_t mem_limit, TabletWriterAddResult* response) {
-std::lock_guard l(_lock);
-if (_state == kFinished) {
-// TabletsChannel is closed without LoadChannel's lock,
-// therefore it's possible for reduce_mem_usage() to be called right after close()
-return _close_status;
-}
-
-// Sort the DeltaWriters by mem consumption in descend order.
-std::vector writers;
-for (auto& it : _tablet_writers) {
-it.second->save_memtable_consumption_snapshot();
-writers.push_back(it.second);
-}
-std::sort(writers.begin(), writers.end(), [](const DeltaWriter* lhs, const DeltaWriter* rhs) {
-return lhs->get_memtable_consumption_snapshot() > rhs->get_memtable_consumption_snapshot();
-});
+std::vector writers_to_flush;
+{
+std::lock_guard l(_lock);
+if (_state == kFinished) {
+// TabletsChannel is closed without LoadChannel's lock,
+// therefore it's possible for reduce_mem_usage() to be called right after close()
+return _close_status;
+}
-// Decide which writes should be flushed to reduce mem consumption.
-// The main idea is to flush at least one third of the mem_limit.
-// This is mainly to solve the following scenarios.
-// Suppose there are N tablets in this TabletsChannel, and the mem limit is M.
-// If the data is evenly distributed, when each tablet memory accumulates to M/N,
-// the reduce memory operation will be triggered.
-// At this time, the value of M/N may be much smaller than the value of `write_buffer_size`.
-// If we flush all the tablets at this time, each tablet will generate a lot of small files.
-// So here we only flush part of the tablet, and the next time the reduce memory operation is triggered,
-// the tablet that has not been flushed before will accumulate more data, thereby reducing the number of flushes.
-
-int64_t mem_to_flushed = mem_limit / 3;
-int counter = 0;
-int64_t sum = 0;
-for (auto writer : writers) {
-if (writer->memtable_consumption() <= 0) {
-break;
+// Sort the DeltaWriters by mem consumption in descend order.
+std::vector writers;
+for (auto& it : _tablet_writers) {
+it.second->save_memtable_consumption_snapshot();
+writers.push_back(it.second);
 }
-++counter;
-sum += writer->memtable_consumption();
-if (sum > mem_to_flushed) {
-break;
+std::sort(writers.begin(), writers.end(),
+ [](const DeltaWriter* lhs, const DeltaWriter* rhs) {
+ return lhs->get_memtable_consumption_snapshot() >
+ rhs->get_memtable_consumption_snapshot();
+ });
+
+// Decide which writes should be flushed to reduce mem consumption.
+// The main idea is to flush at least one third of the mem_limit.
+// This is mainly to solve the following scenarios.
+// Suppose there are N tablets in this TabletsChannel, and the mem limit is M.
+// If the data is evenly distributed, when each tablet memory accumulates to M/N,
+// the reduce memory operation will be triggered.
+// At this time, the value of M/N may be much smaller than the value of `write_buffer_size`.
+// If we flush all the tablets at this time, each tablet will generate a lot of small files.
+// So here we only flush part of the tablet, and the next time the reduce memory operation is triggered,
+// the tablet that has not been flushed before will accumulate more data, thereby reducing the number of flushes.
+
+int64_t mem_to_flushed = mem_limit / 3;

Review Comment:
Discussed offline with xinyiZzz; our conclusions:
1. Change `mem_limit / 3` to one third of the TabletsChannel's memory consumption.
2. Change the default value of the soft limit from 80% to 50%, which triggers memtable flushes more frequently while still being friendlier to large loads than the old policy.
[GitHub] [doris] ByteYue closed issue #12642: [Bug] ArrayFileColumnIterator::next_batch DCHECK(num_read == num_items) failed
ByteYue closed issue #12642: [Bug] ArrayFileColumnIterator::next_batch DCHECK(num_read == num_items) failed URL: https://github.com/apache/doris/issues/12642
[GitHub] [doris] morrySnow commented on a diff in pull request #12790: [fix](Nereids): add stats in plan.
morrySnow commented on code in PR #12790:
URL: https://github.com/apache/doris/pull/12790#discussion_r977153526

## fe/fe-core/src/main/java/org/apache/doris/nereids/NereidsPlanner.java: ##

@@ -190,11 +191,14 @@ private PhysicalPlan chooseBestPlan(Group rootGroup, PhysicalProperties physical
 Plan plan = groupExpression.getPlan().withChildren(planChildren);
 if (!(plan instanceof PhysicalPlan)) {
-throw new AnalysisException("generate logical plan");
+throw new AnalysisException("Result plan must be PhysicalPlan");
 }
 // TODO: set (logical and physical)properties/statistics/... for physicalPlan.
-return ((PhysicalPlan) plan).withPhysicalProperties(groupExpression.getOutputProperties(physicalProperties));
+PhysicalPlan physicalPlan = ((PhysicalPlan) plan).withPhysicalProperties(
+groupExpression.getOutputProperties(physicalProperties));
+((AbstractPlan) physicalPlan).setStats(groupExpression.getOwnerGroup().getStatistics());

Review Comment:
Change `withPhysicalProperties` to `withPhysicalPropertiesAndStats` and remove the `setStats` function. We should do our best to keep plans immutable.
[GitHub] [doris] zhannngchen commented on a diff in pull request #12716: [Enhancement](load) Refine the load channel flush policy on mem limit
zhannngchen commented on code in PR #12716:
URL: https://github.com/apache/doris/pull/12716#discussion_r977151695

## be/src/runtime/tablets_channel.cpp: ##

@@ -196,77 +196,94 @@ void TabletsChannel::_close_wait(DeltaWriter* writer,
 template
 Status TabletsChannel::reduce_mem_usage(int64_t mem_limit, TabletWriterAddResult* response) {
-std::lock_guard l(_lock);
-if (_state == kFinished) {
-// TabletsChannel is closed without LoadChannel's lock,
-// therefore it's possible for reduce_mem_usage() to be called right after close()
-return _close_status;
-}
-
-// Sort the DeltaWriters by mem consumption in descend order.
-std::vector writers;
-for (auto& it : _tablet_writers) {
-it.second->save_memtable_consumption_snapshot();
-writers.push_back(it.second);
-}
-std::sort(writers.begin(), writers.end(), [](const DeltaWriter* lhs, const DeltaWriter* rhs) {
-return lhs->get_memtable_consumption_snapshot() > rhs->get_memtable_consumption_snapshot();
-});
+std::vector writers_to_flush;
+{
+std::lock_guard l(_lock);
+if (_state == kFinished) {
+// TabletsChannel is closed without LoadChannel's lock,
+// therefore it's possible for reduce_mem_usage() to be called right after close()
+return _close_status;
+}
-// Decide which writes should be flushed to reduce mem consumption.
-// The main idea is to flush at least one third of the mem_limit.
-// This is mainly to solve the following scenarios.
-// Suppose there are N tablets in this TabletsChannel, and the mem limit is M.
-// If the data is evenly distributed, when each tablet memory accumulates to M/N,
-// the reduce memory operation will be triggered.
-// At this time, the value of M/N may be much smaller than the value of `write_buffer_size`.
-// If we flush all the tablets at this time, each tablet will generate a lot of small files.
-// So here we only flush part of the tablet, and the next time the reduce memory operation is triggered,
-// the tablet that has not been flushed before will accumulate more data, thereby reducing the number of flushes.
-
-int64_t mem_to_flushed = mem_limit / 3;
-int counter = 0;
-int64_t sum = 0;
-for (auto writer : writers) {
-if (writer->memtable_consumption() <= 0) {
-break;
+// Sort the DeltaWriters by mem consumption in descend order.
+std::vector writers;
+for (auto& it : _tablet_writers) {
+it.second->save_memtable_consumption_snapshot();
+writers.push_back(it.second);
 }
-++counter;
-sum += writer->memtable_consumption();
-if (sum > mem_to_flushed) {
-break;
+std::sort(writers.begin(), writers.end(),
+ [](const DeltaWriter* lhs, const DeltaWriter* rhs) {
+ return lhs->get_memtable_consumption_snapshot() >
+ rhs->get_memtable_consumption_snapshot();
+ });
+
+// Decide which writes should be flushed to reduce mem consumption.
+// The main idea is to flush at least one third of the mem_limit.
+// This is mainly to solve the following scenarios.
+// Suppose there are N tablets in this TabletsChannel, and the mem limit is M.
+// If the data is evenly distributed, when each tablet memory accumulates to M/N,
+// the reduce memory operation will be triggered.
+// At this time, the value of M/N may be much smaller than the value of `write_buffer_size`.
+// If we flush all the tablets at this time, each tablet will generate a lot of small files.
+// So here we only flush part of the tablet, and the next time the reduce memory operation is triggered,
+// the tablet that has not been flushed before will accumulate more data, thereby reducing the number of flushes.
+
+int64_t mem_to_flushed = mem_limit / 3;

Review Comment:
Discussed offline with xinyiZzz; our conclusions:
1. Change `mem_limit / 3` to one third of the TabletsChannel's memory consumption, which avoids producing too many small segments.
2. Change the default value of the soft limit from 80% to 50%, which triggers memtable flushes more frequently while still being friendlier to large loads than the old policy.
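The selection step discussed in this review can be sketched as follows. This is a minimal illustration under stated assumptions, not the Doris implementation: `WriterSnapshot` and `pick_writers_to_flush` are hypothetical names, and the threshold is one third of the channel's total memtable consumption, as proposed in the comment, rather than `mem_limit / 3`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical snapshot of one writer's memtable usage (not a Doris type).
struct WriterSnapshot {
    int id;
    std::int64_t mem_bytes;
};

// Pick the largest writers until their combined usage exceeds one third of
// the total usage across all writers in the channel.
std::vector<WriterSnapshot> pick_writers_to_flush(std::vector<WriterSnapshot> writers) {
    std::int64_t total = 0;
    for (const auto& w : writers) {
        assert(w.mem_bytes >= 0);
        total += w.mem_bytes;
    }

    // Largest consumers first, so that few but large memtables get flushed.
    std::sort(writers.begin(), writers.end(),
              [](const WriterSnapshot& lhs, const WriterSnapshot& rhs) {
                  return lhs.mem_bytes > rhs.mem_bytes;
              });

    const std::int64_t mem_to_flush = total / 3;
    std::vector<WriterSnapshot> picked;
    std::int64_t sum = 0;
    for (const auto& w : writers) {
        if (w.mem_bytes <= 0) {
            break;  // remaining writers hold no data
        }
        picked.push_back(w);
        sum += w.mem_bytes;
        if (sum > mem_to_flush) {
            break;  // enough memory will be released
        }
    }
    return picked;
}
```

For writers holding 10, 60, 30, and 0 bytes, the total is 100 and the threshold is 33, so only the 60-byte writer is picked. The smaller writers keep accumulating data and produce larger segments when they are eventually flushed.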
[GitHub] [doris] yangzhg opened a new pull request, #12846: [chore](build) add option to disable -frecord-gcc-switches
yangzhg opened a new pull request, #12846:
URL: https://github.com/apache/doris/pull/12846

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.
[GitHub] [doris] github-actions[bot] commented on pull request #12730: [Refactor](parquet) refactor parquet write to uniform and consistent logic
github-actions[bot] commented on PR #12730: URL: https://github.com/apache/doris/pull/12730#issuecomment-1254471202 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12730: [Refactor](parquet) refactor parquet write to uniform and consistent logic
github-actions[bot] commented on PR #12730: URL: https://github.com/apache/doris/pull/12730#issuecomment-1254471271 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12363: [feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance
github-actions[bot] commented on PR #12363: URL: https://github.com/apache/doris/pull/12363#issuecomment-1254473418 PR approved by at least one committer and no changes requested.
[GitHub] [doris] morningman opened a new pull request, #12847: [draft] try fix q11 regression test
morningman opened a new pull request, #12847:
URL: https://github.com/apache/doris/pull/12847

# Proposed changes

Issue Number: close #xxx

## Problem summary

Describe your changes.
[GitHub] [doris] BePPPower opened a new pull request, #12848: [feature-wip](new-scan)Add new jdbc scanner and new jdbc scan node
BePPPower opened a new pull request, #12848:
URL: https://github.com/apache/doris/pull/12848

# Proposed changes

Related PR: #11582

This PR adds the new JDBC scan node and scanner.

## Problem summary

Describe your changes.

## Checklist(Required)

1. Does it affect the original behavior:
    - [ ] Yes
    - [x] No
    - [ ] I don't know
2. Has unit tests been added:
    - [ ] Yes
    - [x] No
    - [ ] No Need
3. Has document been added or modified:
    - [ ] Yes
    - [ ] No
    - [x] No Need
4. Does it need to update dependencies:
    - [ ] Yes
    - [x] No
5. Are there any changes that cannot be rolled back:
    - [ ] Yes (If Yes, please explain WHY)
    - [x] No

## Further comments

If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #12845: [Enhancement](debugging) Add more debug info for clang build
github-actions[bot] commented on PR #12845: URL: https://github.com/apache/doris/pull/12845#issuecomment-1254491274 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12845: [Enhancement](debugging) Add more debug info for clang build
github-actions[bot] commented on PR #12845: URL: https://github.com/apache/doris/pull/12845#issuecomment-1254491299 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12716: [Enhancement](load) Refine the load channel flush policy on mem limit
github-actions[bot] commented on PR #12716: URL: https://github.com/apache/doris/pull/12716#issuecomment-1254498489 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12716: [Enhancement](load) Refine the load channel flush policy on mem limit
github-actions[bot] commented on PR #12716: URL: https://github.com/apache/doris/pull/12716#issuecomment-1254498538 PR approved by anyone and no changes requested.
[GitHub] [doris] morrySnow opened a new pull request, #12849: [enhancement](Nereids) enable one phase aggregate
morrySnow opened a new pull request, #12849: URL: https://github.com/apache/doris/pull/12849 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] github-actions[bot] commented on pull request #12846: [chore](build) add optiuon to disable -frecord-gcc-switches
github-actions[bot] commented on PR #12846: URL: https://github.com/apache/doris/pull/12846#issuecomment-1254511869 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12846: [chore](build) add optiuon to disable -frecord-gcc-switches
github-actions[bot] commented on PR #12846: URL: https://github.com/apache/doris/pull/12846#issuecomment-1254511902 PR approved by anyone and no changes requested.
[GitHub] [doris] qzsee opened a new pull request, #12850: fix fe oom because replica count too much when schema change
qzsee opened a new pull request, #12850: URL: https://github.com/apache/doris/pull/12850 # Proposed changes Issue Number: close #xxx ## Problem summary Version: 0.14. I have a table with 5000 partitions, 100 buckets, and 3 replicas. When I run a schema change on this table, the FE hits an OOM:
```
2022-08-30 17:44:59,486 ERROR (thrift-server-pool-2646|3660) [EditLog.logEdit():890] Fatal Error : write stream Exception
java.lang.OutOfMemoryError: UTF16 String size is 1207959550, should be less than 1073741823
    at java.lang.StringUTF16.newBytesFor(StringUTF16.java:49) ~[?:?]
    at java.lang.AbstractStringBuilder.inflate(AbstractStringBuilder.java:228) ~[?:?]
    at java.lang.AbstractStringBuilder.appendChars(AbstractStringBuilder.java:1701) ~[?:?]
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:634) ~[?:?]
    at java.lang.StringBuffer.append(StringBuffer.java:392) ~[?:?]
    at java.io.StringWriter.write(StringWriter.java:122) ~[?:?]
    at com.google.gson.stream.JsonWriter.string(JsonWriter.java:590) ~[gson-2.8.6.jar:?]
    at com.google.gson.stream.JsonWriter.value(JsonWriter.java:418) ~[gson-2.8.6.jar:?]
    at com.google.gson.internal.bind.TypeAdapters$29.write(TypeAdapters.java:746) ~[gson-2.8.6.jar:?]
    at com.google.gson.internal.bind.TypeAdapters$29.write(TypeAdapters.java:760) ~[gson-2.8.6.jar:?]
    at com.google.gson.internal.bind.TypeAdapters$29.write(TypeAdapters.java:752) ~[gson-2.8.6.jar:?]
    at com.google.gson.internal.bind.TypeAdapters$29.write(TypeAdapters.java:760) ~[gson-2.8.6.jar:?]
    at com.google.gson.internal.bind.TypeAdapters$29.write(TypeAdapters.java:698) ~[gson-2.8.6.jar:?]
    at com.google.gson.internal.Streams.write(Streams.java:72) ~[gson-2.8.6.jar:?]
    at org.apache.doris.persist.gson.RuntimeTypeAdapterFactory$1.write(RuntimeTypeAdapterFactory.java:320) ~[palo-fe.jar:3.4.0]
    at com.google.gson.TypeAdapter$1.write(TypeAdapter.java:191) ~[gson-2.8.6.jar:?]
    at com.google.gson.Gson.toJson(Gson.java:704) ~[gson-2.8.6.jar:?]
    at com.google.gson.Gson.toJson(Gson.java:683) ~[gson-2.8.6.jar:?]
    at com.google.gson.Gson.toJson(Gson.java:638) ~[gson-2.8.6.jar:?]
    at org.apache.doris.alter.RollupJobV2.write(RollupJobV2.java:724) ~[palo-fe.jar:3.4.0]
    at org.apache.doris.alter.BatchAlterJobPersistInfo.write(BatchAlterJobPersistInfo.java:44) ~[palo-fe.jar:3.4.0]
    at org.apache.doris.journal.JournalEntity.write(JournalEntity.java:131) ~[palo-fe.jar:3.4.0]
    at org.apache.doris.journal.bdbje.BDBJEJournal.write(BDBJEJournal.java:145) ~[palo-fe.jar:3.4.0]
    at org.apache.doris.persist.EditLog.logEdit(EditLog.java:887) [palo-fe.jar:3.4.0]
    at org.apache.doris.persist.EditLog.logBatchAlterJob(EditLog.java:1364) [palo-fe.jar:3.4.0]
    at org.apache.doris.alter.MaterializedViewHandler.processBatchAddRollup(MaterializedViewHandler.java:290) [palo-fe.jar:3.4.0]
    at org.apache.doris.alter.MaterializedViewHandler.process(MaterializedViewHandler.java:1178) [palo-fe.jar:3.4.0]
    at org.apache.doris.alter.Alter.processAlterOlapTable(Alter.java:146) [palo-fe.jar:3.4.0]
    at org.apache.doris.alter.Alter.processAlterTable(Alter.java:307) [palo-fe.jar:3.4.0]
    at org.apache.doris.catalog.Catalog.alterTable(Catalog.java:5172) [palo-fe.jar:3.4.0]
```
AlterJobV2 carries too much information to serialize into a single JSON string, which causes the OOM. After testing, I chose 1.2 million (120W) replicas as a reasonable limit for schema change. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
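The arithmetic behind the report above can be sketched as follows: 5000 partitions × 100 buckets × 3 replicas gives 1,500,000 replicas, which is above the 1,200,000 limit the PR proposes. The function and constant names are hypothetical and are not taken from the PR; the FE itself is written in Java, so this C++ sketch only illustrates the check, not the actual change.

```cpp
#include <cstdint>

// Hypothetical constant: the limit quoted in the PR description (120W = 1,200,000).
constexpr std::int64_t kMaxReplicasForSchemaChange = 1'200'000;

// Total number of replicas a table-wide alter job has to track.
constexpr std::int64_t total_replicas(std::int64_t partitions, std::int64_t buckets,
                                      std::int64_t replication) {
    return partitions * buckets * replication;
}

// True when the table is too large for a single schema change job under the assumed limit.
constexpr bool exceeds_schema_change_limit(std::int64_t partitions, std::int64_t buckets,
                                           std::int64_t replication) {
    return total_replicas(partitions, buckets, replication) > kMaxReplicasForSchemaChange;
}

// The table from the report: 5000 partitions, 100 buckets, 3 replicas.
static_assert(total_replicas(5000, 100, 3) == 1'500'000, "replica count from the report");
static_assert(exceeds_schema_change_limit(5000, 100, 3), "the reported table exceeds the limit");
```

Rejecting the job up front, as sketched here, trades flexibility for safety: the oversized table can no longer be altered in one job, but the FE never attempts to build the oversized JSON string.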
[GitHub] [doris] chenlinzhong opened a new issue, #12851: [Bug] left join result not correct
chenlinzhong opened a new issue, #12851: URL: https://github.com/apache/doris/issues/12851 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Version master ### What's Wrong? With multiple replicas, a left join returns different results with and without vectorization; the vectorized result is the correct one.
```
create table T_CORE_MAIN(
    ID bigint not null,
    CRCL_NO varchar(100) null,
    PARENT_ID BIGINT(20) NULL
)ENGINE=OLAP
UNIQUE KEY(`ID`)
COMMENT "DD"
DISTRIBUTED BY HASH(`ID`) BUCKETS 3
PROPERTIES(
    "replication_allocation"="tag.location.default:3",
    "in_memory"="false",
    "storage_format"="V2"
) ;
create table T_CORE_MAIN_STATUS(
    ID bigint(20) not null,
    CRCL_ID bigint(20) null,
    CRCL_VALID bigint(20) null comment ""
)ENGINE=OLAP
UNIQUE KEY(`ID`)
DISTRIBUTED BY HASH(`ID`) BUCKETS 3
PROPERTIES(
    "replication_allocation"="tag.location.default:3",
    "in_memory"="false",
    "storage_format"="V2"
) ;
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES (329586, 'YX20191217-03',NULL);
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES (329687, 'YX20191217-03-0004','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329688', 'YX20191217-03-0005','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329689','YX20191217-03-0006','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329709', 'YX20191217-03-0007','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329710','YX20191217-03-0008','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329712', 'YX20191217-03-0010','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329684','YX20191217-03-0002','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329686', 'YX20191217-03-0003','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329934', 'YX20191217-03-0014','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329937', 'YX20191217-03-0015','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES('329947','YX20191217-03-0017','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('348093', 'YX20191217-03-0018qs','329586');
INSERT INTO T_CORE_MAIN (ID,CRCL_NO,PARENT_ID) VALUES ('329620', 'YX20191217-03-0001', '329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329711', 'YX20191217-03-0009','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329713','YX20191217-03-0011','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID)VALUES ('329714','YX20191217-03-0012','329586');
INSERT INTO T_CORE_MAIN(ID,CRCL_NO,PARENT_ID) VALUES ('329938','YX20191217-03-0016','329586');
select * from T_CORE_MAIN;
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170120090751','329713','1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170128018559', '329620', '-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170128260223', '329937','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170128869503', '329688','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170128988287', '329712', '-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES('220171170115265663','329938','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES('220171170128459903','329934','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170128868479', '329687','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES('220171170128870527','329689','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170128987263', '329711', '-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170107852927', '329684','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170112455807','348093','1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170115268735','329947','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES('220171170128867455','329686','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170129379455','329709','-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES('220171170129380479', '329710', '-1');
INSERT INTO T_CORE_MAIN_STATUS(ID,CRCL_ID,CRCL_VALID) VALUES ('220171170129381503', '329714', '-1');
SELECT T.C
[GitHub] [doris-website] hf200012 merged pull request #110: Add some description
hf200012 merged PR #110: URL: https://github.com/apache/doris-website/pull/110
[GitHub] [doris-website] hf200012 merged pull request #109: [doc](numbers)Removed documentation for version 1.1 numbers function
hf200012 merged PR #109: URL: https://github.com/apache/doris-website/pull/109
[GitHub] [doris] SbloodyS commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
SbloodyS commented on issue #11024: URL: https://github.com/apache/doris/issues/11024#issuecomment-1254522074 I also encountered this issue. Did you fix it? @ReganHoo
[GitHub] [doris] HappenLee commented on a diff in pull request #12803: [Opt](Vectorized) Support push down no grouping agg
HappenLee commented on code in PR #12803: URL: https://github.com/apache/doris/pull/12803#discussion_r977213204
## be/src/olap/rowset/segment_v2/column_reader.cpp: ##
@@ -171,6 +171,44 @@ Status ColumnReader::get_row_ranges_by_zone_map(
     return Status::OK();
 }

+Status ColumnReader::next_batch_of_zone_map(size_t* n, vectorized::MutableColumnPtr& dst) const {
+    // TODO: this work to get min/max value seems should only do once
+    FieldType type = _type_info->type();
+    std::unique_ptr min_value(WrapperField::create_by_type(type, _meta.length()));
+    std::unique_ptr max_value(WrapperField::create_by_type(type, _meta.length()));
+    _parse_zone_map(_zone_map_index_meta->segment_zone_map(), min_value.get(), max_value.get());
+
+    dst->reserve(*n);
+    bool is_string = is_olap_string_type(type);
+    if (max_value->is_null()) {
+        assert_cast(*dst).insert_default();
+    } else {
+        if (is_string) {
+            auto sv = (StringValue*)max_value->cell_ptr();
+            dst->insert_data(sv->ptr, sv->len);
+        } else {
+            dst->insert_many_fix_len_data(static_cast(max_value->cell_ptr()), 1);
+        }
+    }
+
+    auto size = *n - 1;
+    if (min_value->is_null()) {
+        assert_cast(*dst).insert_null_elements(size);
+    } else {
+        if (is_string) {
+            auto sv = (StringValue*)min_value->cell_ptr();
+            dst->insert_many_data(sv->ptr, sv->len, size);
+        } else {
+            // TODO: the work may cause performance problem, opt latter
+            for (int i = 0; i < size; ++i) {
+                dst->insert_many_fix_len_data(static_cast(min_value->cell_ptr()), 1);
+            }
Review Comment: OK, just add a TODO here and support it in the future.
[GitHub] [doris] HappenLee commented on a diff in pull request #12803: [Opt](Vectorized) Support push down no grouping agg
HappenLee commented on code in PR #12803: URL: https://github.com/apache/doris/pull/12803#discussion_r977213640
## be/src/olap/rowset/segment_v2/column_reader.cpp: ##
@@ -171,6 +171,44 @@ Status ColumnReader::get_row_ranges_by_zone_map(
     return Status::OK();
 }

+Status ColumnReader::next_batch_of_zone_map(size_t* n, vectorized::MutableColumnPtr& dst) const {
+    // TODO: this work to get min/max value seems should only do once
+    FieldType type = _type_info->type();
+    std::unique_ptr min_value(WrapperField::create_by_type(type, _meta.length()));
+    std::unique_ptr max_value(WrapperField::create_by_type(type, _meta.length()));
+    _parse_zone_map(_zone_map_index_meta->segment_zone_map(), min_value.get(), max_value.get());
+
+    dst->reserve(*n);
+    bool is_string = is_olap_string_type(type);
+    if (max_value->is_null()) {
+        assert_cast(*dst).insert_default();
+    } else {
+        if (is_string) {
+            auto sv = (StringValue*)max_value->cell_ptr();
+            dst->insert_data(sv->ptr, sv->len);
+        } else {
+            dst->insert_many_fix_len_data(static_cast(max_value->cell_ptr()), 1);
+        }
Review Comment: `Date` may use uint24_t, so `insert_data` may cause a bug here.
[GitHub] [doris] github-actions[bot] commented on pull request #12803: [Opt](Vectorized) Support push down no grouping agg
github-actions[bot] commented on PR #12803: URL: https://github.com/apache/doris/pull/12803#issuecomment-1254549320 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12803: [Opt](Vectorized) Support push down no grouping agg
github-actions[bot] commented on PR #12803: URL: https://github.com/apache/doris/pull/12803#issuecomment-1254549337 PR approved by anyone and no changes requested.
[GitHub] [doris] Gabriel39 opened a new pull request, #12852: [Improvement](dict) optimize dictionary column
Gabriel39 opened a new pull request, #12852: URL: https://github.com/apache/doris/pull/12852 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] Ling2099 commented on issue #7878: [Enhancement] Easy to use the binary package Docker one-button experience
Ling2099 commented on issue #7878: URL: https://github.com/apache/doris/issues/7878#issuecomment-1254575688 I support this!!!
[GitHub] [doris] zhannngchen opened a new pull request, #12853: a test pr for qa pipeline debug
zhannngchen opened a new pull request, #12853: URL: https://github.com/apache/doris/pull/12853 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] zy-kkk opened a new pull request, #12854: [test](join)add join case5
zy-kkk opened a new pull request, #12854: URL: https://github.com/apache/doris/pull/12854 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] zhannngchen opened a new pull request, #12855: a test pr for qa pipeline debug, will not merge
zhannngchen opened a new pull request, #12855: URL: https://github.com/apache/doris/pull/12855 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] BiteTheDDDDt opened a new pull request, #12856: [Enhancement](runtime filter)optimize for runtime filter
BiteTheDDDDt opened a new pull request, #12856: URL: https://github.com/apache/doris/pull/12856 # Proposed changes 1. Optimize building the runtime filter. 2. Change the hash function of some numeric types in the bloom filter.
``` sql
tpch  after    before
q1    11951.4  12087.2
q2    423.0    591.6    up
q3    3462.6   3544.4
q4    1260.2   1453.6   up
q5    2580.0   2975.8   up
q6    1003.2   1017.4
q7    1358.6   1428.8
q8    1205.6   1378.8   up
q9    21818.4  21591.8
q10   2084.4   2193.4   up
q11   629.8    630.2
q12   829.0    881.8
q13   3478.8   3510.8
q14   689.2    731.8
q15   631.2    653.0
q16   895.2    898.2
q17   3849.2   5085.2   up
q18   3934.6   4382.6   up
q19   473.0    460.6
q20   1683.0   1671.2
q21   4793.0   4938.8
q22   1207.6   1353.6
```
## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] gavinchou commented on pull request #12691: [chore](thirdparty) Support third-party incremental build
gavinchou commented on PR #12691: URL: https://github.com/apache/doris/pull/12691#issuecomment-1254605398 Hi, @adonis0147 Thanks for your feedback, using MD5 for the incremental build is a generic idea, however, there is another problem to resolve -- how to manage the MD5 list? It seems that we still need to update the MD5 list manually, can you point out how it works in detail? And, there is another case that sometimes Doris developers have to build **specific third-parties in a specific order** when some dependencies are updated and they require specific build order (one may rely on another, e.g. brpc relies on protobuf), it seems hard to resolve this problem by updating nothing but the `build-thirdparty.sh`?
[GitHub] [doris] adonis0147 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
adonis0147 commented on code in PR #12852: URL: https://github.com/apache/doris/pull/12852#discussion_r977277813
## be/src/vec/columns/column_dictionary.h: ##
@@ -360,40 +362,58 @@ class ColumnDictionary final : public COWHelper> {
             if (code >= 0) {
                 return code;
             }
-            auto bound = std::upper_bound(_dict_data.begin(), _dict_data.end(), value) -
-                         _dict_data.begin();
+            auto bound = std::upper_bound(_dict_data->begin(), _dict_data->end(), value) -
+                         _dict_data->begin();
             return greater ? bound - greater + eq : bound - eq;
         }

         void find_codes(const phmap::flat_hash_set& values, std::vector& selected) const {
-            size_t dict_word_num = _dict_data.size();
+            size_t dict_word_num = _dict_data->size();
             selected.resize(dict_word_num);
             selected.assign(dict_word_num, false);
-            for (const auto& value : values) {
-                if (auto it = _inverted_index.find(value); it != _inverted_index.end()) {
-                    selected[it->second] = true;
+            for (size_t i = 0; i < _dict_data->size(); i++) {
+                if (values.find((*_dict_data)[i]) != values.end()) {
+                    selected[i] = true;
                 }
             }
         }

         void clear() {
-            _dict_data.clear();
-            _inverted_index.clear();
-            _code_convert_table.clear();
+            _dict_data->clear();
             _hash_values.clear();
         }

         void clear_hash_values() { _hash_values.clear(); }

         void sort() {
-            size_t dict_size = _dict_data.size();
-            _code_convert_table.reserve(dict_size);
-            std::sort(_dict_data.begin(), _dict_data.end(), _comparator);
+            size_t dict_size = _dict_data->size();
+
+            _perm.resize(dict_size);
+            for (size_t i = 0; i < dict_size; ++i) {
+                _perm[i] = i;
+            }
+
+            struct Comparator {
+            public:
+                Comparator(DictContainer& dict_data) : _dict_data(dict_data) {}
+                bool operator()(const size_t a, const size_t b) const {
+                    return _comparator(_dict_data[a], _dict_data[b]);
+                }
+
+            private:
+                StringValue::Comparator _comparator;
+                DictContainer& _dict_data;
+            };
+            Comparator comparator(*_dict_data);
+            std::sort(_perm.begin(), _perm.end(), comparator);
Review Comment:
```suggestion
            std::sort(_perm.begin(), _perm.end(),
                      [&dict_data = *_dict_data, &comparator = _comparator](const size_t a, const size_t b) {
                          return comparator(dict_data[a], dict_data[b]);
                      });
```
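The idea in the suggestion above, sorting a permutation of indices with a lambda instead of a hand-written comparator struct, can be sketched in isolation as follows. This is a minimal standalone sketch, not the Doris code: `sorted_permutation` is a hypothetical helper, `std::vector<std::string>` stands in for the dictionary container, and plain `operator<` stands in for `StringValue::Comparator`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns the permutation that orders `dict` ascending without moving its elements.
// Assumption: std::string and operator< replace the dictionary's own value type
// and comparator from the diff.
std::vector<std::size_t> sorted_permutation(const std::vector<std::string>& dict) {
    std::vector<std::size_t> perm(dict.size());
    // Start from the identity permutation 0, 1, 2, ...
    std::iota(perm.begin(), perm.end(), std::size_t{0});
    // Compare the referenced values, but swap only the indices.
    std::sort(perm.begin(), perm.end(),
              [&dict](std::size_t a, std::size_t b) { return dict[a] < dict[b]; });
    return perm;
}
```

For a dictionary holding "pear", "apple", "fig", the result lists the index of "apple" first, then "fig", then "pear". Because only indices move, the original codes stay valid and the dictionary data is never copied.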
[GitHub] [doris] yiguolei opened a new pull request, #12857: [bugfix](scanner) olap scanner compute is wrong
yiguolei opened a new pull request, #12857: URL: https://github.com/apache/doris/pull/12857 # Proposed changes This issue was introduced by https://github.com/apache/doris/pull/8096: the operator precedence is wrong, so in some cases many scanners are created. ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] adonis0147 commented on pull request #12691: [chore](thirdparty) Support third-party incremental build
adonis0147 commented on PR #12691: URL: https://github.com/apache/doris/pull/12691#issuecomment-1254625331 > Hi, @adonis0147 Thanks for your feedback, using MD5 for the incremental build is a generic idea, however, there is another problem to resolve -- how to manage the MD5 list? It seems that we still need to update the MD5 list manually, can you point out how it works in detail? We already have the MD5 list in [thirdparty/vars.sh](https://github.com/apache/doris/blob/master/thirdparty/vars.sh). We update this file whenever we update the third-party packages. Therefore, we can write the MD5 to a file at the end of each `build_xxx` function. > And, there is another case that sometimes Doris developers have to build **specific third-parties in a specific order** when some dependencies are updated and they require specific build order (one may rely on another, e.g. brpc relies on protobuf), it seems hard to resolve this problem by updating nothing but the `build-thirdparty.sh`? This problem is unavoidable either way (MD5 or version counter) if we want to support incremental installation. We should sort out the dependency tree in our build script first, because it is hard for a developer to work out the dependencies when they want to upgrade only a specific package.
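The ordering problem raised above (for example, brpc must be built after protobuf) amounts to a topological sort of the package dependency graph. The following is a minimal C++ sketch of that idea only; it is not how `build-thirdparty.sh` works, and the names `DepMap` and `build_order` as well as any package map passed to it are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: maps each package to the packages it depends on.
typedef std::map<std::string, std::vector<std::string> > DepMap;

// Returns an order in which every package appears after all of its
// dependencies (Kahn's algorithm). Throws if the graph contains a cycle.
std::vector<std::string> build_order(const DepMap& deps) {
    std::map<std::string, int> pending;  // number of unmet dependencies
    DepMap dependents;                   // reverse edges: dependency -> users
    for (const auto& entry : deps) {
        pending.insert(std::make_pair(entry.first, 0));
        for (const auto& req : entry.second) {
            // A dependency may not have its own entry in deps.
            pending.insert(std::make_pair(req, 0));
            ++pending[entry.first];
            dependents[req].push_back(entry.first);
        }
    }

    std::queue<std::string> ready;  // packages with no unmet dependencies
    for (const auto& entry : pending) {
        if (entry.second == 0) ready.push(entry.first);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string pkg = ready.front();
        ready.pop();
        order.push_back(pkg);
        for (const auto& next : dependents[pkg]) {
            if (--pending[next] == 0) ready.push(next);
        }
    }
    if (order.size() != pending.size()) {
        throw std::runtime_error("dependency cycle detected");
    }
    return order;
}
```

With a map in which brpc depends on protobuf and gflags, the returned order places brpc last, so an incremental build could rebuild a package and then everything that follows it in the order.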
[GitHub] [doris] sohardforaname opened a new pull request, #12858: [Improve](Nereids)Optimize planner
sohardforaname opened a new pull request, #12858: URL: https://github.com/apache/doris/pull/12858 # Proposed changes Issue Number: close #xxx ## Problem summary optimize planner ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [x] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] hf200012 opened a new pull request, #12859: Replace jvm's garbage collector CMS with G1
hf200012 opened a new pull request, #12859: URL: https://github.com/apache/doris/pull/12859 Replace the JVM's CMS garbage collector with G1. In testing, the overall performance is better than with CMS. # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [x] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] BiteTheDDDDt commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
BiteTheDDDDt commented on code in PR #12852: URL: https://github.com/apache/doris/pull/12852#discussion_r977291889 ## be/src/vec/columns/column_dictionary.h: ## @@ -192,11 +192,13 @@ class ColumnDictionary final : public COWHelper> { Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* col_ptr) override { auto* res_col = reinterpret_cast(col_ptr); +res_col->get_offsets().reserve(sel_size); +res_col->get_chars().reserve(_dict.avg_str_len() * sel_size); for (size_t i = 0; i < sel_size; i++) { uint16_t n = sel[i]; auto& code = reinterpret_cast(_codes[n]); auto value = _dict.get_value(code); -res_col->insert_data(value.ptr, value.len); +res_col->insert_data_without_reserve(value.ptr, value.len); Review Comment: I think `_dict.avg_str_len() * sel_size` may be less than the total length of the selected elements.
[GitHub] [doris] ReganHoo commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
ReganHoo commented on issue #11024: URL: https://github.com/apache/doris/issues/11024#issuecomment-1254640547
[GitHub] [doris] ReganHoo closed issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
ReganHoo closed issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend URL: https://github.com/apache/doris/issues/11024
[GitHub] [doris] ReganHoo commented on issue #11024: [Bug] cannot access the hive external table stored with s3 as the backend
ReganHoo commented on issue #11024: URL: https://github.com/apache/doris/issues/11024#issuecomment-1254640922 > I also encountered this issue. Did you fix it? @ReganHoo Updating Doris to version 1.1.2 solves this problem.
[GitHub] [doris] hf200012 closed pull request #12859: Replace jvm's garbage collector CMS with G1
hf200012 closed pull request #12859: Replace jvm's garbage collector CMS with G1 URL: https://github.com/apache/doris/pull/12859
[GitHub] [doris] yiguolei merged pull request #12846: [chore](build) add optiuon to disable -frecord-gcc-switches
yiguolei merged PR #12846: URL: https://github.com/apache/doris/pull/12846
[GitHub] [doris] Gabriel39 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
Gabriel39 commented on code in PR #12852: URL: https://github.com/apache/doris/pull/12852#discussion_r977308937 ## be/src/vec/columns/column_dictionary.h: ## @@ -192,11 +192,13 @@ class ColumnDictionary final : public COWHelper> { Status filter_by_selector(const uint16_t* sel, size_t sel_size, IColumn* col_ptr) override { auto* res_col = reinterpret_cast(col_ptr); +res_col->get_offsets().reserve(sel_size); +res_col->get_chars().reserve(_dict.avg_str_len() * sel_size); for (size_t i = 0; i < sel_size; i++) { uint16_t n = sel[i]; auto& code = reinterpret_cast(_codes[n]); auto value = _dict.get_value(code); -res_col->insert_data(value.ptr, value.len); +res_col->insert_data_without_reserve(value.ptr, value.len); Review Comment: If so, `chars` in ColumnString will still reserve a bigger memory block. `_dict.avg_str_len() * sel_size` is just a conservative estimate here.
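The exchange above turns on the fact that `reserve` is only a capacity hint: if an average-based estimate is smaller than the total length of the selected strings, appending to a growable buffer still succeeds, at the cost of a reallocation. The sketch below illustrates this with `std::vector<char>` standing in for the `chars` buffer; `GatherResult` and `gather_selected` are hypothetical names, and the code makes no claim about how `insert_data_without_reserve` behaves in Doris.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct GatherResult {
    std::vector<char> chars;  // concatenated bytes of the selected strings
    std::size_t estimated;    // bytes reserved up front (avg_len * sel.size())
};

// Copies dict[sel[i]] into one contiguous buffer. The buffer is first
// reserved to avg_len * sel.size(), where avg_len is the average string
// length over the whole dictionary (not over the selection).
GatherResult gather_selected(const std::vector<std::string>& dict,
                             const std::vector<uint16_t>& sel) {
    std::size_t total = 0;
    for (const auto& s : dict) total += s.size();
    const std::size_t avg_len = dict.empty() ? 0 : total / dict.size();

    GatherResult out;
    out.estimated = avg_len * sel.size();
    out.chars.reserve(out.estimated);
    for (uint16_t idx : sel) {
        assert(idx < dict.size());
        const std::string& value = dict[idx];
        // std::vector grows on demand if the reservation was too small.
        out.chars.insert(out.chars.end(), value.begin(), value.end());
    }
    return out;
}
```

For a dictionary with lengths 1, 2 and 20 the average is 7, so selecting the 20-byte entry twice reserves 14 bytes but writes 40. The estimate is cheap, yet it undershoots whenever the selection is skewed toward long entries.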
[GitHub] [doris] xiaokang opened a new pull request, #12860: [bugfix])(function)return error instead of crash be for unsupported CAST
xiaokang opened a new pull request, #12860: URL: https://github.com/apache/doris/pull/12860 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. For an unsupported CAST, create `create_unsupport_wrapper`, which returns Status::InvalidArgument instead of calling LOG(FATAL), to avoid crashing the BE. ## Checklist(Required) 1. Does it affect the original behavior: - [x] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [x] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
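The change described above swaps a process-aborting log call for an error status that propagates to the caller. The sketch below illustrates that general pattern only. The `Status` class, `WrapperFn`, and `make_unsupported_wrapper` are simplified stand-ins invented for this example, not the Doris definitions, and the `int` argument is a placeholder for whatever input a real wrapper would take.

```cpp
#include <functional>
#include <string>
#include <utility>

// Simplified stand-in for a status type (hypothetical, not doris::Status).
class Status {
public:
    static Status OK() { return Status(true, ""); }
    static Status InvalidArgument(const std::string& msg) { return Status(false, msg); }
    bool ok() const { return _ok; }
    const std::string& message() const { return _message; }

private:
    Status(bool ok, std::string message) : _ok(ok), _message(std::move(message)) {}
    bool _ok;
    std::string _message;
};

// Hypothetical wrapper signature; the int is a placeholder input.
typedef std::function<Status(int)> WrapperFn;

// Instead of aborting when a cast is unsupported, hand back a wrapper that
// reports the problem as an error status each time it is invoked.
WrapperFn make_unsupported_wrapper(const std::string& from, const std::string& to) {
    const std::string msg = "CAST from " + from + " to " + to + " is not supported";
    return [msg](int) { return Status::InvalidArgument(msg); };
}
```

A caller checks `ok()` on the returned status and fails only the affected query, whereas a fatal log would take the whole process down.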
[GitHub] [doris] jackwener commented on a diff in pull request #12858: [Improve](Nereids)Optimize planner
jackwener commented on code in PR #12858: URL: https://github.com/apache/doris/pull/12858#discussion_r977328537 ## fe/fe-core/src/main/java/org/apache/doris/nereids/cost/CostEstimate.java: ## @@ -90,11 +90,27 @@ public static CostEstimate ofMemory(double memoryCost) { /** * Sums partial cost estimates of some (single) plan node. */ +@Deprecated Review Comment: Don't rename it, just remove it.
[GitHub] [doris] github-actions[bot] commented on pull request #12857: [bugfix](scanner) olap scanner compute is wrong
github-actions[bot] commented on PR #12857: URL: https://github.com/apache/doris/pull/12857#issuecomment-1254672072 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12857: [bugfix](scanner) olap scanner compute is wrong
github-actions[bot] commented on PR #12857: URL: https://github.com/apache/doris/pull/12857#issuecomment-1254672127 PR approved by anyone and no changes requested.
[GitHub] [doris] luozenglin opened a new issue, #12861: [Bug] data error when using select into outfile format as parquet
luozenglin opened a new issue, #12861: URL: https://github.com/apache/doris/issues/12861 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Version master ### What's Wrong? When I export the data using `select into outfile format as parquet` and then load it into a table with the same schema, the tinyint column becomes NULL. ``` set enable_vectorized_engine = false; CREATE TABLE `test_select_into_property_test_output_format_parquet_tb` ( `k1` tinyint(4) NOT NULL, `k2` smallint(6) NOT NULL, `k3` int(11) NOT NULL, `k4` bigint(20) NOT NULL, `k5` datetime NOT NULL, `v1` date REPLACE NOT NULL, `v2` char(1) REPLACE NOT NULL, `v3` varchar(4096) REPLACE NOT NULL, `v4` float SUM NOT NULL, `v5` double SUM NOT NULL, `v6` decimal(20, 7) SUM NOT NULL ) ENGINE=OLAP AGGREGATE KEY(`k1`, `k2`, `k3`, `k4`, `k5`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`k1`) BUCKETS 15 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); mysql> select * from test_select_into_property_test_output_format_parquet_tb where k1 <= 5; +--+--+--+--+-++--+---+---++-+ | k1 | k2 | k3 | k4 | k5 | v1 | v2 | v3 | v4| v5 | v6 | +--+--+--+--+-++--+---+---++-+ |1 | 10 | 100 | 1000 | 2011-01-01 00:00:00 | 2010-01-01 | t| ynqnzeowymt | 38.638844 | 180.998031 | 7395.231067 | |2 | 20 | 200 | 2000 | 2012-01-01 00:00:00 | 2010-01-02 | f| hfkfwlr | 506.04404 | 539.922834 | 2080.504502 | |3 | 30 | 300 | 3000 | 2013-01-01 00:00:00 | 2010-01-03 | t| uoclasp | 377.79321 | 577.044148 | 4605.253205 | |4 | 40 | 400 | 4000 | 2014-01-01 00:00:00 | 2010-01-04 | n| iswngzeodfhptjzgswsddt| 871.35455 | 919.067864 | 7291.703724 | |5 | 50 | 500 | 5000 | 2015-01-01 00:00:00 | 2010-01-05 | a| sqodagzlyrmcelyxgcgcsfuxadcdt | 462.0679 | 929.660783 | 3903.906901 | +--+--+--+--+-++--+---+---++-+ select k1 k_0, k2 k_1, k3 k_2, k4 k_3, k5 k_4, v1 
k_5, v2 k_6, v3 k_7, v4 k_8, v5 k_9, v6 k_10 from test_select_into_property_test_output_format_parquet_tb INTO OUTFILE "hdfs://:9000/user/palo/test/data/export/test_select_into_property_test_output_format_parquet_db/label_21_04_47_49_475312_1042101013/label_21_04_47_49_475364_844373478" FORMAT AS parquet PROPERTIES ("broker.name"="ahdfs","broker.username"="","broker.password"="", "schema" = "required,int32,k_0;required,int32,k_1;required,int32,k_2;required,int64,k_3;required,int64,k_4;required,int64,k_5;required,byte_array,k_6;required,byte_array,k_7;required,float,k_8;required,double,k_9;required,byte_array,k_10"); CREATE TABLE `select_into_check_table` ( `k_0` tinyint(4) NULL, `k_1` smallint(6) NULL, `k_2` int(11) NULL, `k_3` bigint(20) NULL, `k_4` datetime NULL, `k_5` date NULL, `k_6` char(1) NULL, `k_7` char(29) NULL, `k_8` float NULL, `k_9` double NULL, `k_10` decimal(27, 9) NULL ) ENGINE=OLAP DUPLICATE KEY(`k_0`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`k_0`) BUCKETS 13 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); LOAD LABEL test_select_into_property_test_output_format_parquet_db.label_21_04_47_50_543709_8920444695 ( DATA INFILE(" hdfs:/:9000/user/palo/test/data/export/test_select_into_property_test_output_format_parquet_db/label_21_04_47_49_475312_1042101013/label_21_04_47_49_475364_8443734786915a56b133f4b71-a671fd00077a30b4_0.parquet") INTO TABLE `select_into_check_table` FORMAT AS "parquet") WITH BROKER "ahdfs" ("username"="", "password"=""); mysql> select * from select_into_check_table; +--+--+--+---+-++--+---+---++-+ | k_0 | k_1 | k_2 | k_3 | k_4 | k_5| k_6 | k_7 | k_8 | k_9| k_10| +--+--+--+---+-++--+---+---++-+ | NULL | NULL | 1000 |
[GitHub] [doris] zhannngchen opened a new pull request, #12862: [debug](test)a test pr for qa pipeline debug, will not merge
zhannngchen opened a new pull request, #12862: URL: https://github.com/apache/doris/pull/12862 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] mrhhsg opened a new pull request, #12863: [improvement](scan) merge scan keys based on the number of scanners
mrhhsg opened a new pull request, #12863: URL: https://github.com/apache/doris/pull/12863 # Proposed changes Issue Number: close #xxx ## Problem Summary A scanner that takes too many scan keys will cause performance degradation, so it's better to try to merge the scan keys. Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [x] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] Gabriel39 commented on a diff in pull request #12852: [Improvement](dict) optimize dictionary column
Gabriel39 commented on code in PR #12852: URL: https://github.com/apache/doris/pull/12852#discussion_r977345927 ## be/src/vec/columns/column_dictionary.h: ## @@ -360,40 +362,58 @@ class ColumnDictionary final : public COWHelper> { if (code >= 0) { return code; } -auto bound = std::upper_bound(_dict_data.begin(), _dict_data.end(), value) - - _dict_data.begin(); +auto bound = std::upper_bound(_dict_data->begin(), _dict_data->end(), value) - + _dict_data->begin(); return greater ? bound - greater + eq : bound - eq; } void find_codes(const phmap::flat_hash_set& values, std::vector& selected) const { -size_t dict_word_num = _dict_data.size(); +size_t dict_word_num = _dict_data->size(); selected.resize(dict_word_num); selected.assign(dict_word_num, false); -for (const auto& value : values) { -if (auto it = _inverted_index.find(value); it != _inverted_index.end()) { -selected[it->second] = true; +for (size_t i = 0; i < _dict_data->size(); i++) { +if (values.find((*_dict_data)[i]) != values.end()) { +selected[i] = true; } } } void clear() { -_dict_data.clear(); -_inverted_index.clear(); -_code_convert_table.clear(); +_dict_data->clear(); _hash_values.clear(); } void clear_hash_values() { _hash_values.clear(); } void sort() { -size_t dict_size = _dict_data.size(); -_code_convert_table.reserve(dict_size); -std::sort(_dict_data.begin(), _dict_data.end(), _comparator); +size_t dict_size = _dict_data->size(); + +_perm.resize(dict_size); +for (size_t i = 0; i < dict_size; ++i) { +_perm[i] = i; +} + +struct Comparator { +public: +Comparator(DictContainer& dict_data) : _dict_data(dict_data) {} +bool operator()(const size_t a, const size_t b) const { +return _comparator(_dict_data[a], _dict_data[b]); +} + +private: +StringValue::Comparator _comparator; +DictContainer& _dict_data; +}; +Comparator comparator(*_dict_data); +std::sort(_perm.begin(), _perm.end(), comparator); Review Comment: Done, thanks for your suggestion!
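The `sort()` change in the diff above sorts a permutation of indices instead of the dictionary entries themselves, so the entries keep their original positions. Below is a minimal sketch of that technique using `std::string` entries and a lambda comparator; `sorted_permutation` is a hypothetical name, and the real code compares `StringValue` objects through `StringValue::Comparator`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns perm such that dict[perm[0]] <= dict[perm[1]] <= ...
// The dictionary itself is left untouched.
std::vector<std::size_t> sorted_permutation(const std::vector<std::string>& dict) {
    std::vector<std::size_t> perm(dict.size());
    std::iota(perm.begin(), perm.end(), 0);  // fill with 0, 1, 2, ...
    std::sort(perm.begin(), perm.end(),
              [&dict](std::size_t a, std::size_t b) { return dict[a] < dict[b]; });
    return perm;
}
```

For the entries "pear", "apple", "fig" the permutation is 1, 2, 0. Capturing the container by reference in a lambda plays the same role as the hand-written `Comparator` struct in the diff.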
[GitHub] [doris] luozenglin opened a new pull request, #12864: [fix](parquet) fix write error data as parquet format.
luozenglin opened a new pull request, #12864: URL: https://github.com/apache/doris/pull/12864 Fix incorrect data conversion when writing tinyint and smallint data to parquet files in the non-vectorized engine. # Proposed changes Issue Number: close #12861 ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [x] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need 4. Does it need to update dependencies: - [ ] Yes - [x] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] zhannngchen closed pull request #12853: [debug](test) a test pr for qa pipeline debug, will not merge
zhannngchen closed pull request #12853: [debug](test) a test pr for qa pipeline debug, will not merge URL: https://github.com/apache/doris/pull/12853
[GitHub] [doris] zhannngchen closed pull request #12855: [debug](test)a test pr for qa pipeline debug, will not merge
zhannngchen closed pull request #12855: [debug](test)a test pr for qa pipeline debug, will not merge URL: https://github.com/apache/doris/pull/12855
[GitHub] [doris] github-actions[bot] commented on pull request #12824: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12824: URL: https://github.com/apache/doris/pull/12824#issuecomment-1254695697 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12824: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12824: URL: https://github.com/apache/doris/pull/12824#issuecomment-1254695752 PR approved by anyone and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12822: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12822: URL: https://github.com/apache/doris/pull/12822#issuecomment-1254697309 PR approved by at least one committer and no changes requested.
[GitHub] [doris] github-actions[bot] commented on pull request #12822: [fix](log)Audit log status is incorrect
github-actions[bot] commented on PR #12822: URL: https://github.com/apache/doris/pull/12822#issuecomment-1254697351 PR approved by anyone and no changes requested.
[GitHub] [doris] dataroaring opened a new pull request, #12865: test_p0
dataroaring opened a new pull request, #12865: URL: https://github.com/apache/doris/pull/12865 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] freemandealer opened a new pull request, #12866: [enhancement](compaction) introduce segment compaction (#12609)
freemandealer opened a new pull request, #12866: URL: https://github.com/apache/doris/pull/12866

Implement segmentwise compaction during rowset write to reduce the number of segments produced by load jobs; too many segments may otherwise cause OLAP_ERR_TOO_MANY_SEGMENTS (-238). Signed-off-by: freemandealer

# Proposed changes

Issue Number: close #12609

## Problem summary

## Intro

The default limit is 200 segments per rowset. Too many segments may fail the whole load process (OLAP_ERR_TOO_MANY_SEGMENTS, -238). If we increase the limit, the load will succeed, but the pressure is transferred to the subsequent rowsetwise compaction. Things get worse when the user issues a query (e.g. an insert into select statement) right after the load job but before rowsetwise compaction: the query will suffer severe performance degradation or may even end in OOM. So we are introducing segmentwise compaction, which compacts data DURING the write process instead of waiting for rowsetwise compaction after the txn has been committed.

## Design

### Trigger

Every time a rowset writer produces more than N (e.g. 10) segments, we trigger segment compaction. Note that only one segment compaction job runs for a single rowset at a time, to avoid recursion and queuing problems.

### Target Selection

We collect segments during every trigger. We skip big segments whose row num > M (e.g. 1) because compacting them brings little benefit relative to the effort. Hence, we only pick the "Longest Consecutive Small" segment group for the actual compaction.

### Compaction Process

A new thread pool is introduced to do the job. We submit the above-mentioned "Longest Consecutive Small" segment group to the pool. The worker thread then does the following:

- build a MergeIterator from the target segments
- create a new segment writer
- for each block read from the MergeIterator, append it with the writer

### SegID handling

SegID must remain consecutive after segment compaction.
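The target-selection rule from the Design section above (skip segments with more than M rows, then take the longest consecutive run of small ones) can be sketched as a single pass over the per-segment row counts. This is an illustration under assumptions, not the code in the PR: `longest_small_run` is a hypothetical name, and picking the earliest run on a tie is a choice made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns the half-open index range [begin, end) of the longest run of
// consecutive segments whose row count is <= max_rows. Ties go to the
// earliest run; (0, 0) means no small segment was found.
std::pair<std::size_t, std::size_t> longest_small_run(
        const std::vector<uint64_t>& segment_rows, uint64_t max_rows) {
    std::size_t best_begin = 0, best_len = 0;
    std::size_t run_begin = 0, run_len = 0;
    for (std::size_t i = 0; i < segment_rows.size(); ++i) {
        if (segment_rows[i] <= max_rows) {
            if (run_len == 0) run_begin = i;  // a new run starts here
            ++run_len;
            if (run_len > best_len) {
                best_len = run_len;
                best_begin = run_begin;
            }
        } else {
            run_len = 0;  // a big segment ends the current run
        }
    }
    return std::make_pair(best_begin, best_begin + best_len);
}
```

With row counts 10, 5000, 20, 30, 40, 9000, 50 and a threshold of 100, the selected range is [2, 5), that is, the three small segments between the two big ones. Restricting the choice to consecutive segments is what allows the compacted result to take a single slot in the segment numbering.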
If a rowset has small segments named seg_0, seg_1, seg_2, seg_3 and a big segment seg_4:

- we create a segment named "seg_0-3" to save the compacted data of seg_0, seg_1, seg_2 and seg_3
- delete seg_0, seg_1, seg_2 and seg_3
- rename seg_0-3 to seg_0
- rename seg_4 to seg_1

Note that we must wait for inflight segment compaction tasks to finish before building the rowset meta and committing the txn.

## Test results

### The amount of data Doris can load

First, we test the amount of data that can be successfully loaded into Doris with segment compaction disabled and enabled. Tests are based on TPCH. The table is created with 1 bucket and no parallelism. We trigger segment compaction every 10 segments produced by the rowset writer.

| cases | data amount |
| - | - |
| Disable SegCompaction | 1.12 million rows, 18.67GB |
| Enable SegCompaction | 11 million rows, 183GB |

The result shows that the amount of data we can load into Doris improves about 10 times after enabling segment compaction. The ratio corresponds to the triggering segment number.

### Impact on latency

When segment compaction is disabled, a load job finishes in 1260s during the test, and the subsequent rowsetwise compaction costs 151s. The test results with segment compaction enabled, for different triggering segment numbers, are:

| triggering segment number | Load Latency | RowsetCompaction Latency |
| - | - | - |
| 5 (trigger every 5 segments) | 089s (-13%) | 242s (+60%) |
| 10 | 1053s (-16%) | 166s (+9%) |
| 20 | 960s (-23%) | 172s (+13%) |
| 40 | 1320s (+4%) | 169s (+11%) |

We ran the load without segment compaction several times, and the latency varied within (-25%, +25%) between runs. So we believe that segment compaction has little impact on the latency.

In addition to the above costs, we wait for inflight segment compaction tasks to finish before building the rowset meta and publishing the data.
The length of the wait depends on when the build takes place, but it has a theoretical range, which is related to the time each segment compaction task costs:

| triggering segment number | Single SegCompaction Task Latency |
| - | - |
| 5 | 5s |
| 10 | 9s |
| 20 | 20s |
| 40 | 60s |

### I
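The "Target Selection" and "SegID handling" steps of the design in #12866 can be sketched roughly as follows. This is a minimal Python illustration, not the Doris BE implementation: the `Segment` class, the function names, the `max_rows` threshold and the tuple-based operation plan are all hypothetical and exist only to make the two rules concrete.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """Hypothetical stand-in for a segment produced by a rowset writer."""
    seg_id: int
    num_rows: int


def longest_consecutive_small(segments: List[Segment], max_rows: int) -> List[Segment]:
    """Return the longest run of adjacent segments with num_rows <= max_rows.

    Segments with more rows than max_rows are treated as "big" and break the run.
    On ties, the earliest run wins.
    """
    best: List[Segment] = []
    run: List[Segment] = []
    for seg in segments:
        if seg.num_rows <= max_rows:
            run.append(seg)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []
    return best


def plan_renames(segments: List[Segment], group: List[Segment]) -> List[Tuple[str, ...]]:
    """Plan the file operations that keep SegIDs consecutive after compaction.

    Assumes seg_ids are consecutive and `group` is a contiguous slice of `segments`.
    """
    if len(group) < 2:
        return []  # a single segment gains nothing from being merged
    first, last = group[0].seg_id, group[-1].seg_id
    merged = f"seg_{first}-{last}"
    ops: List[Tuple[str, ...]] = [("create", merged)]
    ops += [("delete", f"seg_{s.seg_id}") for s in group]
    ops.append(("rename", merged, f"seg_{first}"))
    shift = last - first  # every later segment moves down by this many IDs
    for s in segments:
        if s.seg_id > last:
            ops.append(("rename", f"seg_{s.seg_id}", f"seg_{s.seg_id - shift}"))
    return ops


if __name__ == "__main__":
    segs = [Segment(i, n) for i, n in enumerate([10, 20, 15, 5, 1_000_000])]
    group = longest_consecutive_small(segs, max_rows=100)
    for op in plan_renames(segs, group):
        print(op)
```

With four small segments followed by one big segment, the plan reproduces the sequence listed under "SegID handling": create seg_0-3, delete seg_0 through seg_3, rename seg_0-3 to seg_0, then rename seg_4 to seg_1. Renames are emitted in ascending SegID order so that a target name is always free by the time it is used.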
[GitHub] [doris] freemandealer closed pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)
freemandealer closed pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609) URL: https://github.com/apache/doris/pull/12610
[GitHub] [doris] freemandealer commented on pull request #12610: [WIP][Enhancement](compaction) segment compaction (#12609)
freemandealer commented on PR #12610: URL: https://github.com/apache/doris/pull/12610#issuecomment-1254772650 A brand-new PR with updated code, a detailed design and test results is provided here: https://github.com/apache/doris/pull/12866
[GitHub] [doris] Gabriel39 opened a new pull request, #12867: [Improvement](predicate) Replace for-loop by memcpy
Gabriel39 opened a new pull request, #12867: URL: https://github.com/apache/doris/pull/12867

# Proposed changes

This PR replaces a for-loop with memcpy. I ran two experiments.

Experiment 1: Run ckbench q20 and generate a flame graph. Compare the share of this function's time in the total time. Result: for-loop: 1.74%, memcpy: 0.013%

Experiment 2: Run `SELECT JavaEnable FROM hits`. 99 million+ (9900w+) rows are returned, and JavaEnable is a SMALLINT column. Compare the BlockLoadTime. Result: for-loop: 1s225ms, memcpy: 805.603ms
[GitHub] [doris] morningman opened a new pull request, #12868: [draft] for testing p0, not merge
morningman opened a new pull request, #12868: URL: https://github.com/apache/doris/pull/12868
[GitHub] [doris] Gabriel39 opened a new pull request, #12869: [Bug](date)(1.1-lts) Fix wrong type in TimestampArithmeticExpr
Gabriel39 opened a new pull request, #12869: URL: https://github.com/apache/doris/pull/12869 # Proposed changes Cherry pick from #12727
[GitHub] [doris] Gabriel39 opened a new pull request, #12870: [Bug](date)(1.1-lts) Fix wrong result produced by date function
Gabriel39 opened a new pull request, #12870: URL: https://github.com/apache/doris/pull/12870 # Proposed changes Cherry pick from #12720
[GitHub] [doris] Henry2SS opened a new issue, #12871: [Enhancement](rewrite) support Or to In rule
Henry2SS opened a new issue, #12871: URL: https://github.com/apache/doris/issues/12871

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

Support an Or-to-In rewrite rule. For example, the SQL `select * from test_tbl where a = 1 or a = 2 or a in (3, 4)` should be rewritten to `select * from test_tbl where a in (1,2,3,4)`.

### Solution

Support an Or-to-In rewrite rule. For example, the SQL `select * from test_tbl where a = 1 or a = 2 or a in (3, 4)` should be rewritten to `select * from test_tbl where a in (1,2,3,4)`.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
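The rewrite requested in #12871 can be illustrated with a small sketch. It is written in Python purely for illustration and is not the rule implemented in the Doris FE: the tuple encoding of predicates (`("eq", column, value)` and `("in", column, [values])`) and the names `or_to_in` and `to_sql` are assumptions made for this example.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical predicate encoding: ("eq", column, value) or ("in", column, [values]).
Pred = Tuple[str, str, Union[int, List[int]]]


def or_to_in(disjuncts: List[Pred]) -> List[Pred]:
    """Merge OR-ed equality / IN predicates on the same column into one IN predicate.

    The input list is read as pred_1 OR pred_2 OR ...; the output has the same meaning.
    Values keep their first-seen order and duplicates are dropped.
    """
    merged: Dict[str, List[int]] = {}
    for op, col, operand in disjuncts:
        values = operand if op == "in" else [operand]
        bucket = merged.setdefault(col, [])
        for v in values:
            if v not in bucket:
                bucket.append(v)
    result: List[Pred] = []
    for col, values in merged.items():
        if len(values) == 1:
            result.append(("eq", col, values[0]))  # a lone value stays an equality
        else:
            result.append(("in", col, values))
    return result


def to_sql(disjuncts: List[Pred]) -> str:
    """Render the disjunction back into a WHERE-clause fragment."""
    parts = []
    for op, col, operand in disjuncts:
        if op == "eq":
            parts.append(f"{col} = {operand}")
        else:
            parts.append(f"{col} in ({','.join(str(v) for v in operand)})")
    return " or ".join(parts)


if __name__ == "__main__":
    where = [("eq", "a", 1), ("eq", "a", 2), ("in", "a", [3, 4])]
    print(to_sql(or_to_in(where)))
```

For the predicate in the issue, `a = 1 or a = 2 or a in (3, 4)`, the sketch yields `a in (1,2,3,4)`. Predicates on different columns are merged per column and remain OR-ed with each other.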
[GitHub] [doris] Henry2SS opened a new pull request, #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems
Henry2SS opened a new pull request, #12872: URL: https://github.com/apache/doris/pull/12872

# Proposed changes

Issue Number: close #12871

## Problem summary

1. Support the Or-to-In rewrite rule.
2. Fix Expr clone problems: clone should create a new object; otherwise the result is always a shallow copy.

## Checklist(Required)

1. Does it affect the original behavior: - [ ] Yes - [x] No - [ ] I don't know
2. Has unit tests been added: - [x] Yes - [ ] No - [ ] No Need
3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need
4. Does it need to update dependencies: - [ ] Yes - [x] No
5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No
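The clone problem mentioned in #12872 is the general shallow-copy hazard: if a cloned expression shares its child list with the original, a rewrite applied to the clone also changes the original. The sketch below shows that hazard in Python with a hypothetical `Expr` class; it does not reproduce the Doris FE `Expr` code, whose details are not given in the PR text.

```python
import copy
from typing import List, Optional


class Expr:
    """Hypothetical expression node with a name and child expressions."""

    def __init__(self, name: str, children: Optional[List["Expr"]] = None) -> None:
        self.name = name
        self.children = children if children is not None else []

    def shallow_clone(self) -> "Expr":
        # copy.copy creates a new Expr object, but its `children` attribute
        # still points at the very same list as the original.
        return copy.copy(self)

    def deep_clone(self) -> "Expr":
        # Build a new node and recursively clone every child, so the two
        # trees share no mutable state.
        return Expr(self.name, [child.deep_clone() for child in self.children])


if __name__ == "__main__":
    original = Expr("or", [Expr("a = 1"), Expr("a = 2")])

    shallow = original.shallow_clone()
    shallow.children.append(Expr("a = 3"))
    print(len(original.children))  # the original was modified through the clone

    deep = original.deep_clone()
    deep.children[0].name = "rewritten"
    print(original.children[0].name)  # the original is unaffected
```

A rewrite rule that restructures predicates in place is exactly the kind of code that exposes this difference, which may be why the two changes were bundled in one PR.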
[GitHub] [doris] Henry2SS commented on pull request #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems
Henry2SS commented on PR #12872: URL: https://github.com/apache/doris/pull/12872#issuecomment-1254828517 tested locally.
[GitHub] [doris] Gabriel39 opened a new pull request, #12873: [feature](outfile)(1.1-lts) support parquet writer
Gabriel39 opened a new pull request, #12873: URL: https://github.com/apache/doris/pull/12873 # Proposed changes Cherry pick from #12492
[GitHub] [doris] caiconghui opened a new issue, #12874: [Bug] set enable_projection to false will cause select stmt analyze failed
caiconghui opened a new issue, #12874: URL: https://github.com/apache/doris/issues/12874

### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

master and lts

### What's Wrong?

Running `set enable_projection=false;` followed by `select count() from (select a, b from table001 order by b limit 1) a` throws an exception like the following:

ERROR 1105 (HY000): errCode = 2, detailMessage = couldn't resolve slot descriptor 0

### What You Expected?

The query should work.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
[GitHub] [doris] Henry2SS commented on pull request #12872: [enhancement](rewrite) add OrToIn rule && fix expr clone problems
Henry2SS commented on PR #12872: URL: https://github.com/apache/doris/pull/12872#issuecomment-1254835927 1. FE unit tests passed locally. 2. Compiled and manually tested; the function passed. Test results:
[GitHub] [doris] caiconghui commented on issue #12874: [Bug] set enable_projection to false will cause select stmt analyze failed
caiconghui commented on issue #12874: URL: https://github.com/apache/doris/issues/12874#issuecomment-1254836058

mysql> show columns from baseall;
+-------+----------------+------+-------+---------+---------+
| Field | Type           | Null | Key   | Default | Extra   |
+-------+----------------+------+-------+---------+---------+
| k0    | BOOLEAN        | Yes  | true  | NULL    |         |
| k1    | TINYINT        | Yes  | true  | NULL    |         |
| k2    | SMALLINT       | Yes  | true  | NULL    |         |
| k3    | INT            | Yes  | true  | NULL    |         |
| k4    | BIGINT         | Yes  | true  | NULL    |         |
| k5    | DECIMAL(9,3)   | Yes  | true  | NULL    |         |
| k6    | CHAR(5)        | Yes  | true  | NULL    |         |
| k10   | DATE           | Yes  | true  | NULL    |         |
| k11   | DATETIME       | Yes  | true  | NULL    |         |
| k7    | VARCHAR(20)    | Yes  | true  | NULL    |         |
| k8    | DOUBLE         | Yes  | false | NULL    | MAX     |
| k9    | FLOAT          | Yes  | false | NULL    | SUM     |
| k12   | VARCHAR(65533) | Yes  | false | NULL    | REPLACE |
| k13   | LARGEINT       | Yes  | false | NULL    | REPLACE |
+-------+----------------+------+-------+---------+---------+

select count() from (select k0, k1 from baseall order by k1 limit 1) a
[GitHub] [doris] zhannngchen opened a new pull request, #12875: [feature-wip](unique-key-merge-on-write) fix thread safe issue in BetaRowsetWriter
zhannngchen opened a new pull request, #12875: URL: https://github.com/apache/doris/pull/12875
[GitHub] [doris] morrySnow opened a new pull request, #12876: test bucket shuffle
morrySnow opened a new pull request, #12876: URL: https://github.com/apache/doris/pull/12876
[GitHub] [doris] nextdreamblue opened a new pull request, #12877: [fix](type) fix DECIMAL scale when cast function on fe
nextdreamblue opened a new pull request, #12877: URL: https://github.com/apache/doris/pull/12877

# Proposed changes

Issue Number: close #12717

## Problem summary

Handle DECIMAL data according to the precision of the DECIMAL type passed to cast.

before:

MySQL [test]> select cast('135.75999' as DECIMAL(10,3));
+------------------------------------+
| CAST('135.75999' AS DECIMAL(10,3)) |
+------------------------------------+
|                          135.75999 |
+------------------------------------+
1 row in set (0.00 sec)

now:

MySQL [stage]> select cast('135.75999' as DECIMAL(10,3));
+------------------------------------+
| CAST('135.75999' AS DECIMAL(10,3)) |
+------------------------------------+
|                            135.759 |
+------------------------------------+
1 row in set (0.01 sec)

## Checklist(Required)

1. Does it affect the original behavior: - [x] Yes - [ ] No - [ ] I don't know
2. Has unit tests been added: - [ ] Yes - [ ] No - [x] No Need
3. Has document been added or modified: - [ ] Yes - [ ] No - [x] No Need
4. Does it need to update dependencies: - [ ] Yes - [x] No
5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [x] No
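The before/after output in #12877 shows the cast result being cut to the declared scale of 3 (135.75999 becomes 135.759, so the extra digits appear to be truncated rather than rounded). The Python sketch below reproduces that arithmetic with the standard `decimal` module. It only illustrates the expected result; the function name `truncate_to_scale` is hypothetical, and the choice of truncation is inferred from the single example in the PR, not from the Doris source.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_to_scale(text: str, scale: int) -> Decimal:
    """Parse a decimal string and keep at most `scale` fractional digits.

    Extra digits are dropped (ROUND_DOWN), matching the 135.75999 -> 135.759
    example; this is an assumption based on that one example.
    """
    quantum = Decimal(1).scaleb(-scale)  # scale=3 gives Decimal('0.001')
    return Decimal(text).quantize(quantum, rounding=ROUND_DOWN)


if __name__ == "__main__":
    print(truncate_to_scale("135.75999", 3))
```

Rounding half up instead of truncating would give 135.760 for the same input, so the example in the PR is enough to tell the two behaviors apart.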
[GitHub] [doris] zy-kkk opened a new pull request, #12878: [typo](docs)Optimized date function doc order and add partial function doc
zy-kkk opened a new pull request, #12878: URL: https://github.com/apache/doris/pull/12878
[GitHub] [doris] Gabriel39 opened a new pull request, #12879: [Improvement](predicate) Replace for-loop by memcpy
Gabriel39 opened a new pull request, #12879: URL: https://github.com/apache/doris/pull/12879
[GitHub] [doris] Gabriel39 opened a new pull request, #12880: [Improvement](dict) optimize dictionary column
Gabriel39 opened a new pull request, #12880: URL: https://github.com/apache/doris/pull/12880
[GitHub] [doris] HappenLee opened a new pull request, #12881: [Opt](Vectorized) Support push down no grouping agg
HappenLee opened a new pull request, #12881: URL: https://github.com/apache/doris/pull/12881
[GitHub] [doris] BiteTheDDDDt opened a new pull request, #12882: [Chore](clang) support build with clang15
BiteTheDDDDt opened a new pull request, #12882: URL: https://github.com/apache/doris/pull/12882 # Proposed changes 1. Remove some unused variables. 2. Reformat the code with clang-format 15.
[GitHub] [doris] yiguolei merged pull request #12881: [Opt](Vectorized) Support push down no grouping agg
yiguolei merged PR #12881: URL: https://github.com/apache/doris/pull/12881
[GitHub] [doris] zhangstar333 opened a new pull request, #12883: [Bug](jdbc) fix insert into date type to oracle using wrong type
zhangstar333 opened a new pull request, #12883: URL: https://github.com/apache/doris/pull/12883 # Proposed changes When inserting into a date-type column in Oracle through JDBC, the to_date function should be used to convert the string to java.sql.Date.
[GitHub] [doris] yiguolei merged pull request #12880: [Improvement](dict) optimize dictionary column
yiguolei merged PR #12880: URL: https://github.com/apache/doris/pull/12880
[GitHub] [doris] yiguolei merged pull request #12879: [Improvement](predicate) Replace for-loop by memcpy
yiguolei merged PR #12879: URL: https://github.com/apache/doris/pull/12879
[GitHub] [doris] mrhhsg opened a new pull request, #12884: [improvement](scan) merge scan keys based on the number of scanners
mrhhsg opened a new pull request, #12884: URL: https://github.com/apache/doris/pull/12884 # Proposed changes Issue Number: close #xxx ## Problem summary Describe your changes. ## Checklist(Required) 1. Does it affect the original behavior: - [ ] Yes - [ ] No - [ ] I don't know 2. Has unit tests been added: - [ ] Yes - [ ] No - [ ] No Need 3. Has document been added or modified: - [ ] Yes - [ ] No - [ ] No Need 4. Does it need to update dependencies: - [ ] Yes - [ ] No 5. Are there any changes that cannot be rolled back: - [ ] Yes (If Yes, please explain WHY) - [ ] No ## Further comments If this is a relatively large or complex change, kick off the discussion at [d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
[GitHub] [doris] yiguolei merged pull request #12884: [improvement](scan) merge scan keys based on the number of scanners
yiguolei merged PR #12884: URL: https://github.com/apache/doris/pull/12884
[GitHub] [doris] dutyu opened a new issue, #12885: [Enhancement] auditloader plugin always discard audit log when cluster is busy
dutyu opened a new issue, #12885: URL: https://github.com/apache/doris/issues/12885 ### Search before asking - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues. ### Description I've installed the auditloader plugin and found that when the cluster is busy (users submit many SQL statements), the doris_audit_tbl__ table is always missing some audit log entries that I can find in fe.audit.log. I've reviewed the code and found that AuditLoaderPlugin uses a LinkedBlockingDeque whose capacity is 1, so when users submit many SQL statements, the `AuditLoaderPlugin.exec` method keeps failing because the queue is full. Making the queue capacity configurable would be a clean way to handle this problem. ### Solution Use a configuration to control the capacity of the queue. ### Are you willing to submit PR? - [X] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
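The failure mode described in the issue above can be reproduced with a small, self-contained Java sketch. It is not the plugin's code: the class and method names are hypothetical, and it assumes events are enqueued with the non-blocking `offer`, which returns `false` as soon as the bounded deque is full. With no consumer draining the queue, a capacity of 1 accepts only the first event in a burst, while a larger capacity absorbs the whole burst.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;

public class AuditQueueSketch {
    // Offers `events` items to a bounded deque that nobody is draining and
    // returns how many were accepted; the remaining items are dropped.
    static int countAccepted(int capacity, int events) {
        BlockingQueue<String> queue = new LinkedBlockingDeque<>(capacity);
        int accepted = 0;
        for (int i = 0; i < events; i++) {
            // offer() does not block: it returns false when the queue is full.
            if (queue.offer("audit-event-" + i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Capacity 1 mirrors the value reported in the issue.
        System.out.println("capacity 1:    " + countAccepted(1, 5));
        // A larger capacity stands in for a value read from configuration.
        System.out.println("capacity 1000: " + countAccepted(1000, 5));
    }
}
```

In a real deployment a consumer thread drains the queue, so the number of dropped events depends on how the burst rate compares with the drain rate; a larger capacity only buys headroom for bursts.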