Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22062 )

Change subject: IMPALA-13487: Add profile counters for memory allocation
......................................................................


Patch Set 18:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/22062/18//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22062/18//COMMIT_MSG@17
PS18, Line 17:     - ScratchBatchMemAllocDuration
             :     - ScratchBatchMemFreeDuration
             :     - ScratchBatchMemAllocBytes
Why there is no ScratchBatchMemFreeBytes when there is 
ParquetDataPagePoolFreeBytes?

Consider using nested counters for counters derived from MemPoolCounters, just 
like "Buffer pool". Something like this maybe:

  - MemPool of ScratchBatch
    - AllocBytes: ...
    - AllocDuration: ...
    - FreeBytes: ...
    - FreeDuration: ...


http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc@71
PS18, Line 71: PROFILE_DEFINE_SUMMARY_STATS_TIMER(ScratchBatchMemAllocDuration, 
DEBUG,
             :     "Stats of time spent in malloc() used by MemPools of the 
scratch batch.");
             : PROFILE_DEFINE_SUMMARY_STATS_TIMER(ScratchBatchMemFreeDuration, 
DEBUG,
             :     "Stats of time spent in free() used by MemPools of the 
scratch batch.");
             : PROFILE_DEFINE_SUMMARY_STATS_COUNTER(ScratchBatchMemAllocBytes, 
DEBUG, TUnit::BYTES,
             :     "Stats of bytes allocated by MemPools of the scratch 
batch.");
             : PROFILE_DEFINE_TIMER(MaterializeCollectionGetMemTime, UNSTABLE, 
"Wall clock time spent "
             :     "getting/allocating collection memory. Includes the memcpy 
duration when doubling "
             :     "the tuple buffer.");
nit: In your commit message, you write a very good instruction on how to 
correlate these new counters to diagnose memory issues.

Can you elaborate the help strings here with that reasoning? Maybe tell us what 
other counters to associate with, or tell us what high value means? Will these 
counters be all 0 if DataCache is all hit?


http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc@338
PS18, Line 338:
              : void HdfsColumnarScanner::CloseInternal() {
Why only aggregate the counters at CloseInternal?
Can it be done in every GetNext with SetStats + Merge?


http://gerrit.cloudera.org:8080/#/c/22062/18/be/src/exec/hdfs-columnar-scanner.cc@344
PS18, Line 344:   counters = scratch_batch_->aux_mem_pool.GetMemPoolCounters();
              :   
scratch_mem_alloc_duration_->Merge(counters.sys_alloc_duration);
              :   scratch_mem_free_duration_->Merge(counters.sys_free_duration);
              :   scratch_mem_alloc_bytes_->Merge(counters.allocated_bytes);
nit: Please add whitespace before 'counter' reassignment, or put it in 
different variable name.

I was confused at first why 'Merge(counters.sys_alloc_duration)' is called 
twice. A short comment explaining the difference would be nice.



--
To view, visit http://gerrit.cloudera.org:8080/22062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I982315d96e6de20a3616f3bd2a2b4866d1ff4710
Gerrit-Change-Number: 22062
Gerrit-PatchSet: 18
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Xuebin Su <x...@cloudera.com>
Gerrit-Comment-Date: Thu, 10 Apr 2025 05:55:07 +0000
Gerrit-HasComments: Yes

Reply via email to