xinyiZzz commented on code in PR #47462: URL: https://github.com/apache/doris/pull/47462#discussion_r1965403939
########## be/src/runtime/memory/mem_tracker_limiter.cpp: ########## @@ -81,6 +81,11 @@ std::shared_ptr<MemTrackerLimiter> MemTrackerLimiter::create_shared(MemTrackerLi const std::string& label, int64_t byte_limit) { auto tracker = std::make_shared<MemTrackerLimiter>(type, label, byte_limit); + // Write tracker is only used to tracker the size, so limit == -1 + auto write_tracker = std::make_shared<MemTrackerLimiter>(type, "Memtable" + label, -1); Review Comment: 一个Load任务,会有两个 Type::Load 的 tracker 在全局的 Map 中,目前 Momory GC 中 cancel query/load 的逻辑会有问题(虽然做了容错,只会影响日志,最终行为没问题)。后续将 Momory GC 改成依赖 resource ctx 后就没问题。 另外既然一个Load任务有两个Tracker, MemoryProfile 中应该在 Load Memory 中区分 Framgnet 和 Memtable 内存 ########## be/src/runtime/memory/thread_mem_tracker_mgr.h: ########## @@ -278,40 +279,57 @@ inline void ThreadMemTrackerMgr::flush_untracked_mem() { _stop_consume = false; } -inline doris::Status ThreadMemTrackerMgr::try_reserve(int64_t size) { +inline doris::Status ThreadMemTrackerMgr::try_reserve(int64_t size, + bool only_check_process_memory) { Review Comment: 为什么 FlushToken::_try_reserve_memory 需要 only_check_process_memory=true 呢 只要让 FlushToken::_try_reserve_memory scoped 的 tracker limit = -1,workload group ptr = null,自然 try reserve 就只检查 process memory 了 现在为了支持 only_check_process_memory,在 memory tracker 和 work load group 分别新增了一个 reserve 方法,代码看着更乱了 ########## be/src/runtime/memory/mem_tracker_limiter.h: ########## @@ -350,9 +369,16 @@ inline void MemTrackerLimiter::cache_consume(int64_t bytes) { } inline Status MemTrackerLimiter::check_limit(int64_t bytes) { - if (bytes <= 0 || (is_overcommit_tracker() && config::enable_query_memory_overcommit)) { + // Do not enable check limit, because reserve process will check it. + if (bytes <= 0 || _enable_reserve_memory) { Review Comment: 这个 check_limit 只有 Allocator 中用到,或许应该去掉 “|| _enable_reserve_memory” 即使 reserve memory 成功,如果后续内存申请超过了 limit,也意味着超过了最初 reserve 的内存大小,那 check_limit 失败是合理的。 如果后续内存申请没有超过最初 reserve 的内存大小,那 check_limit 必然成功 ########## be/src/runtime/memory/mem_tracker_limiter.cpp: ########## @@ -81,6 +81,11 @@ std::shared_ptr<MemTrackerLimiter> MemTrackerLimiter::create_shared(MemTrackerLi const std::string& label, int64_t byte_limit) { auto tracker = std::make_shared<MemTrackerLimiter>(type, label, byte_limit); + // Write tracker is only used to tracker the size, so limit == -1 + auto write_tracker = std::make_shared<MemTrackerLimiter>(type, "Memtable" + label, -1); Review Comment: label name 最好改成 "Memtable#" + label,加个#,现在 label 中每一部分信息是用#分割的,未来或许要通过 label name 区分是 Fragment tracker 还是 Memtable tracker 或许可以新增一个 MemoryTracker Type::MemTable ? ########## be/src/runtime/memory/global_memory_arbitrator.h: ########## @@ -149,15 +149,16 @@ class GlobalMemoryArbitrator { static std::string process_limit_exceeded_errmsg_str() { return fmt::format( - "{} exceed limit {} or {} less than low water mark {}", process_memory_used_str(), - MemInfo::mem_limit_str(), sys_mem_available_str(), + "{} exceed limit {} or {} less than low water mark {}", + process_memory_used_details_str(), MemInfo::mem_limit_str(), + sys_mem_available_details_str(), PrettyPrinter::print(MemInfo::sys_mem_available_low_water_mark(), TUnit::BYTES)); } static std::string process_soft_limit_exceeded_errmsg_str() { return fmt::format("{} exceed soft limit {} or {} less than warning water mark {}.", - process_memory_used_str(), MemInfo::soft_mem_limit_str(), - sys_mem_available_str(), + process_memory_used_details_str(), MemInfo::soft_mem_limit_str(), Review Comment: 同上 ########## be/src/runtime/memory/mem_tracker_limiter.h: ########## @@ -146,6 +147,7 @@ class MemTrackerLimiter final { // Log the memory usage when memory limit is exceeded. std::string tracker_limit_exceeded_str(); bool is_overcommit_tracker() const { return type() == Type::QUERY || type() == Type::LOAD; } + void set_limit(int64_t new_mem_limit) { _limit = new_mem_limit; } Review Comment: query memory limit 现在允许运行中动态修改了么 之前特意不允许修改 _limit, 强制在 QueryContext 构造时确定 ########## be/src/runtime/memory/global_memory_arbitrator.h: ########## @@ -149,15 +149,16 @@ class GlobalMemoryArbitrator { static std::string process_limit_exceeded_errmsg_str() { return fmt::format( - "{} exceed limit {} or {} less than low water mark {}", process_memory_used_str(), - MemInfo::mem_limit_str(), sys_mem_available_str(), + "{} exceed limit {} or {} less than low water mark {}", + process_memory_used_details_str(), MemInfo::mem_limit_str(), Review Comment: 这里用 details 的内存信息可能不合适,两个问题:1. process_limit_exceeded_errmsg_str 会输出到 mysql client 展示给用户,打出 details 会增加用户理解成过本(可能问题也不大…)。 2. mysql client 一行的长度是有限制的,太长会被截断。 是否打 details @yiguolei 大佬可以确认下 ########## be/src/runtime/memory/memory_profile.cpp: ########## @@ -334,15 +333,18 @@ int64_t MemoryProfile::other_current_usage() { return memory_other_trackers_sum_bytes.get_value(); } +std::string MemoryProfile::process_memory_detail_str() const { + return fmt::format("Process Memory Summary: {}\n, {}\n, {}\n, {}\n, {}\n, {}\n", Review Comment: 不能这样改,LOG 是有行数限制的,每一个 "print" 都包括很多行,参考 query profile 的格式,之前在一个 LOG(WARNING) 全部打印出来发现会被截断,所以不得不多次 LOG(WARNING) 需要加个注释 ########## be/src/runtime/memory/memory_profile.cpp: ########## @@ -154,8 +154,7 @@ void MemoryProfile::refresh_memory_overview_profile() { ExecEnv::GetInstance()->rowsets_no_cache_mem_tracker()->set_consumption( MetadataAdder<RowsetMeta>::get_all_rowsets_size()); ExecEnv::GetInstance()->segments_no_cache_mem_tracker()->set_consumption( - MetadataAdder<segment_v2::Segment>::get_all_segments_estimate_size() - - SegmentLoader::instance()->cache_mem_usage()); + MetadataAdder<segment_v2::Segment>::get_all_segments_size()); Review Comment: 不能改,因为 segment cache 的内存大小就是估算的,所以 segments_no_cache_mem_tracker 内存等于 all estimate segment size 减去 segment cache memory tracker 大小 改成 get_all_segments_size 后会导致 segments_no_cache_mem_tracker 和 segment cache memory tracker 间内存有重叠 所以目前是把 get_all_segments_size 的值放到 metrics 中,memory tracker 中依然用估算的 segment 内存为准 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org