Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22062 )
Change subject: IMPALA-13487: Add profile counters for memory allocation ...................................................................... Patch Set 18: (4 comments) http://gerrit.cloudera.org:8080/#/c/22062/18//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22062/18//COMMIT_MSG@17 PS18, Line 17: - ScratchBatchMemAllocDuration : - ScratchBatchMemFreeDuration : - ScratchBatchMemAllocBytes Why there is no ScratchBatchMemFreeBytes when there is ParquetDataPagePoolFreeBytes? Consider using nested counters for counters derived from MemPoolCounters, just like "Buffer pool". Something like this maybe: - MemPool of ScratchBatch - AllocBytes: ... - AllocDuration: ... - FreeBytes: ... - FreeDuration: ... http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc File be/src/exec/hdfs-columnar-scanner.cc: http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc@71 PS18, Line 71: PROFILE_DEFINE_SUMMARY_STATS_TIMER(ScratchBatchMemAllocDuration, DEBUG, : "Stats of time spent in malloc() used by MemPools of the scratch batch."); : PROFILE_DEFINE_SUMMARY_STATS_TIMER(ScratchBatchMemFreeDuration, DEBUG, : "Stats of time spent in free() used by MemPools of the scratch batch."); : PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ScratchBatchMemAllocBytes, DEBUG, TUnit::BYTES, : "Stats of bytes allocated by MemPools of the scratch batch."); : PROFILE_DEFINE_TIMER(MaterializeCollectionGetMemTime, UNSTABLE, "Wall clock time spent " : "getting/allocating collection memory. Includes the memcpy duration when doubling " : "the tuple buffer."); nit: In your commit message, you write a very good instruction on how to correlate these new counters to diagnose memory issues. Can you elaborate the help strings here with that reasoning? Maybe tell us what other counters to associate with, or tell us what high value means? Will these counters be all 0 if DataCache is all hit? http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc@338 PS18, Line 338: : void HdfsColumnarScanner::CloseInternal() { Why only aggregate the counters at CloseInternal? Can it be done in every GetNext with SetStats + Merge? http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc@344 PS18, Line 344: counters = scratch_batch_->aux_mem_pool.GetMemPoolCounters(); : scratch_mem_alloc_duration_->Merge(counters.sys_alloc_duration); : scratch_mem_free_duration_->Merge(counters.sys_free_duration); : scratch_mem_alloc_bytes_->Merge(counters.allocated_bytes); nit: Please add whitespace before 'counter' reassignment, or put it in different variable name. I was confused at first why 'Merge(counters.sys_alloc_duration)' is called twice. A short comment explaining the difference would be nice. -- To view, visit http://gerrit.cloudera.org:8080/22062 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I982315d96e6de20a3616f3bd2a2b4866d1ff4710 Gerrit-Change-Number: 22062 Gerrit-PatchSet: 18 Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Xuebin Su <x...@cloudera.com> Gerrit-Comment-Date: Thu, 10 Apr 2025 05:55:07 +0000 Gerrit-HasComments: Yes