logging and gc JVM metrics should be provided as "gauges"
---------------------------------------------------------

                 Key: HADOOP-7866
                 URL: https://issues.apache.org/jira/browse/HADOOP-7866
             Project: Hadoop Common
          Issue Type: Bug
          Components: metrics
    Affects Versions: 0.20.2
            Reporter: Jeff Bean


JVM Metrics:

logWarn
logInfo
logError
logFatal
gcCount
gcTimeMillis

are provided only as "counters", meaning that their values accumulate over time 
rather than reporting real-time values. The code uses incrMetric() instead of 
setMetric() for these metrics.
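
To make the distinction concrete, here is a minimal sketch of how the same 
stream of events looks when reported each way. It does not use Hadoop's 
metrics API: the class and method names are hypothetical, and it assumes a 
"gauge" here means the value observed in the current reporting interval.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Hadoop's metrics API.
class CounterVsGauge {
    // Counter-style: each report is the running total of all events so far
    // (analogous to incrementing a metric on every update).
    static List<Long> asCounter(long[] eventsPerInterval) {
        List<Long> reported = new ArrayList<>();
        long total = 0;
        for (long n : eventsPerInterval) {
            total += n;
            reported.add(total);
        }
        return reported;
    }

    // Gauge-style (assumed meaning): each report is only what happened in
    // that interval (analogous to setting a metric to the current value).
    static List<Long> asGauge(long[] eventsPerInterval) {
        List<Long> reported = new ArrayList<>();
        for (long n : eventsPerInterval) {
            reported.add(n);
        }
        return reported;
    }

    public static void main(String[] args) {
        // Made-up sample: warnings logged in five consecutive intervals.
        long[] warnings = {2, 0, 0, 15, 1};
        System.out.println("counter: " + asCounter(warnings)); // [2, 2, 2, 17, 18]
        System.out.println("gauge:   " + asGauge(warnings));   // [2, 0, 0, 15, 1]
    }
}
```

In the counter series the burst of 15 warnings only shows up as a step in an 
ever-rising line; in the gauge series it stands out as a spike.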

In tools like Ganglia this produces ever-increasing graphs that aren't very 
useful. Looking at a graph of these metrics, you can't tell whether garbage 
collection times are going up, how long individual GC events took, or when 
interesting log errors happened, because those events are overshadowed by the 
cumulative trend. Also, users are accustomed to reading an upward-trending 
graph as a sign of an operational issue, so these metrics draw attention and 
cause confusion among operators when they shouldn't.

I'm attaching a patch to JVM Metrics that adds the following metrics:

logWarnGauge
logInfoGauge
logErrorGauge
logFatalGauge
gcCountGauge
gcTimeMillisGauge

I'm also attaching a sample image of how those metrics look after running with 
this patch on a test cluster for a couple of weeks.
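
For readers without the attachment, the following is a rough sketch of one way 
values such as gcCountGauge and gcTimeMillisGauge could be derived: remember 
the last cumulative total and report only the change since the previous 
reading. This is not taken from the attached patch. DeltaGauge and 
GcGaugeSketch are hypothetical names, and the one-second polling interval is 
arbitrary. The only real API used is the standard java.lang.management 
GarbageCollectorMXBean.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not from the patch: turns a cumulative total into
// a per-interval value by remembering the previous reading.
class DeltaGauge {
    private long lastTotal;

    DeltaGauge(long initialTotal) {
        this.lastTotal = initialTotal;
    }

    // Returns the change since the previous call and stores the new total.
    long update(long currentTotal) {
        long delta = currentTotal - lastTotal;
        lastTotal = currentTotal;
        return delta;
    }
}

class GcGaugeSketch {
    // Sum of collection counts across all collectors in this JVM.
    static long totalGcCount() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                sum += count;
            }
        }
        return sum;
    }

    // Sum of approximate accumulated collection time, in milliseconds.
    static long totalGcTimeMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                sum += millis;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        DeltaGauge gcCountGauge = new DeltaGauge(totalGcCount());
        DeltaGauge gcTimeMillisGauge = new DeltaGauge(totalGcTimeMillis());

        // Poll a few times; each line shows only what happened since the last poll.
        for (int i = 0; i < 3; i++) {
            Thread.sleep(1000);
            System.out.println("gcCountGauge=" + gcCountGauge.update(totalGcCount())
                    + " gcTimeMillisGauge=" + gcTimeMillisGauge.update(totalGcTimeMillis()));
        }
    }
}
```

With this approach an idle interval reports 0, so a long GC pause appears as 
an isolated peak instead of a slightly steeper slope.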

J

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
