[ https://issues.apache.org/jira/browse/HIVE-21740?focusedWorklogId=245433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-245433 ]
ASF GitHub Bot logged work on HIVE-21740:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/May/19 18:34
            Start Date: 20/May/19 18:34
    Worklog Time Spent: 10m
      Work Description: odraese commented on pull request #633: HIVE-21740: Collect LLAP execution latency metrics
URL: https://github.com/apache/hive/pull/633#discussion_r285714898

 ##########
 File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/metrics/LlapTaskSchedulerMetrics.java
 ##########
 @@ -276,6 +302,43 @@ private void getTaskSchedulerStats(MetricsRecordBuilder rb) {
          .addCounter(SchedulerPendingPreemptionTaskCount, pendingPreemptionTasksCount.value())
          .addCounter(SchedulerPreemptedTaskCount, preemptedTasksCount.value())
          .addCounter(SchedulerCompletedDagCount, completedDagcount.value());
 +    daemonTaskLatency.forEach((k, v) -> rb.addGauge(v, v.getMean()));
 +  }
 +
 +  static class DaemonLatencyMetric implements MetricsInfo {
 +    private String name;
 +    private ExponentiallyDecayingReservoir reservoir;

 Review comment:
   What is the benefit of using an exponential decay here vs. a simple sliding window (e.g. guava stats)? Specifically, with the low number of entries (right now 1028), a few recent task execution times can have a huge impact on what we believe is the node's average over a longer period of time.
   The whole idea of using the average here is to smooth out the task execution times across a larger number of tasks. I fear that the exponential decay contradicts this by giving a few very recent values a high impact.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

            Worklog Id:     (was: 245433)
            Time Spent: 1.5h  (was: 1h 20m)

> Collect LLAP execution latency metrics
> --------------------------------------
>
>                 Key: HIVE-21740
>                 URL: https://issues.apache.org/jira/browse/HIVE-21740
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21740.2.patch, HIVE-21740.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Collect metrics for LLAP task execution times

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
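
The concern in the review comment (a few recent task times dominating a recency-weighted average) can be illustrated with a small, self-contained sketch. Everything here is hypothetical and is not code from the patch: `LatencyAverages`, `SlidingWindowMean`, and `ExponentiallyWeightedMean` are made-up names, the window size of 1000 and the weight `alpha = 0.1` are arbitrary, and the exponentially weighted mean is a deliberately simplified stand-in, not the sampling algorithm used by `ExponentiallyDecayingReservoir`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LatencyAverages {

    /** Mean over the last `capacity` samples; every retained sample has equal weight. */
    static final class SlidingWindowMean {
        private final int capacity;
        private final Deque<Long> window = new ArrayDeque<>();
        private long sum;

        SlidingWindowMean(int capacity) {
            if (capacity <= 0) {
                throw new IllegalArgumentException("capacity must be positive");
            }
            this.capacity = capacity;
        }

        void update(long value) {
            window.addLast(value);
            sum += value;
            if (window.size() > capacity) {
                sum -= window.removeFirst(); // evict the oldest sample
            }
        }

        double mean() {
            return window.isEmpty() ? 0.0 : (double) sum / window.size();
        }
    }

    /** Simplified exponentially weighted mean: the newest sample gets weight alpha. */
    static final class ExponentiallyWeightedMean {
        private final double alpha;
        private double mean;
        private boolean seeded;

        ExponentiallyWeightedMean(double alpha) {
            if (alpha <= 0.0 || alpha > 1.0) {
                throw new IllegalArgumentException("alpha must be in (0, 1]");
            }
            this.alpha = alpha;
        }

        void update(long value) {
            if (!seeded) {
                mean = value; // first sample seeds the average
                seeded = true;
            } else {
                mean = alpha * value + (1.0 - alpha) * mean;
            }
        }

        double mean() {
            return mean;
        }
    }

    public static void main(String[] args) {
        SlidingWindowMean window = new SlidingWindowMean(1000);
        ExponentiallyWeightedMean weighted = new ExponentiallyWeightedMean(0.1);

        // 1000 "normal" tasks of 100 ms, followed by 10 slow tasks of 1000 ms.
        for (int i = 0; i < 1000; i++) {
            window.update(100);
            weighted.update(100);
        }
        for (int i = 0; i < 10; i++) {
            window.update(1000);
            weighted.update(1000);
        }

        System.out.printf("sliding window mean:         %.1f ms%n", window.mean());
        System.out.printf("exponentially weighted mean: %.1f ms%n", weighted.mean());
    }
}
```

With these made-up inputs the sliding window mean moves only slightly (10 slow samples out of 1000 retained), while the exponentially weighted mean is pulled most of the way toward the slow tasks. How strongly a real decaying reservoir reacts depends on its size and decay parameters, so this only shows the direction of the effect, not its magnitude in the patch.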