sam liu created HDFS-4527:
-----------------------------

             Summary: For shortening the time of TaskTracker heartbeat, 
decouple the statistics collection operations
                 Key: HDFS-4527
                 URL: https://issues.apache.org/jira/browse/HDFS-4527
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: performance
    Affects Versions: 1.1.1
            Reporter: sam liu


On every heartbeat, the TaskTracker calculates a set of system statistics, 
such as free disk space, available virtual/physical memory, and CPU usage. 
However, it is not necessary to recalculate all of these statistics on every 
heartbeat; doing so consumes system resources and slows down the TaskTracker 
heartbeat. Furthermore, the system properties (disk, memory, CPU) have 
different characteristics, so it is better to collect their statistics at 
different intervals.

To reduce the latency of the TaskTracker heartbeat, one solution is to 
decouple all system statistics collection from it and start separate threads 
to collect the statistics when the TaskTracker starts. There could be three 
threads: the first collects CPU-related statistics at a short interval; the 
second collects memory-related statistics at a medium interval; the third 
collects disk-related statistics at a long interval. All of the intervals 
could be customized through the parameter "mapred.stats.collection.interval" 
in mapred-site.xml. The heartbeat could then read the latest statistics 
values directly from memory.
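
A minimal sketch of the idea, not taken from the TaskTracker code: every class 
and method name below (CachedStat, StatsCollectors, start, stop) is 
hypothetical, and the samplers are passed in as plain functions so the sketch 
does not assume how Hadoop actually measures CPU, memory, or disk. Each 
statistic is refreshed by a background task at its own interval, and the 
heartbeat only reads the cached value.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical holder for one statistic; not part of the Hadoop code base. */
class CachedStat {
    private final LongSupplier sampler; // the possibly expensive measurement
    private volatile long value;        // last sampled value, visible to all threads

    CachedStat(LongSupplier sampler) {
        this.sampler = sampler;
        this.value = sampler.getAsLong(); // sample once so the first read is valid
    }

    /** Re-runs the measurement; called from a collector thread. */
    void refresh() {
        value = sampler.getAsLong();
    }

    /** Cheap read for the heartbeat path; never triggers a measurement. */
    long get() {
        return value;
    }
}

/** Hypothetical scheduler that refreshes three statistics at separate intervals. */
class StatsCollectors {
    final CachedStat cpu;
    final CachedStat memory;
    final CachedStat disk;

    // Three daemon threads, one per statistic group.
    private final ScheduledExecutorService pool =
        Executors.newScheduledThreadPool(3, r -> {
            Thread t = new Thread(r, "stats-collector");
            t.setDaemon(true);
            return t;
        });

    StatsCollectors(CachedStat cpu, CachedStat memory, CachedStat disk) {
        this.cpu = cpu;
        this.memory = memory;
        this.disk = disk;
    }

    /** Starts periodic collection; intervals are in milliseconds. */
    void start(long cpuMs, long memoryMs, long diskMs) {
        schedule(cpu, cpuMs);
        schedule(memory, memoryMs);
        schedule(disk, diskMs);
    }

    private void schedule(CachedStat stat, long intervalMs) {
        pool.scheduleWithFixedDelay(() -> {
            try {
                stat.refresh();
            } catch (RuntimeException e) {
                // Keep the previous value; an uncaught exception would
                // cancel all further runs of this periodic task.
            }
        }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    void stop() {
        pool.shutdownNow();
    }
}
```

With something like this in place, the heartbeat would call, for example, 
collectors.disk.get() instead of measuring free disk space itself. How the 
three intervals would be derived from "mapred.stats.collection.interval" is 
left open here, since the description does not specify it.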
