[ https://issues.apache.org/jira/browse/KUDU-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840542#comment-17840542 ]
ASF subversion and git services commented on KUDU-3566:
-------------------------------------------------------

Commit b236d534abeb60520e4568bb4a1452d6674bb597 in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=b236d534a ]

KUDU-3566 fix summary metrics in Prometheus format

This patch corrects the output of various Kudu metrics backed by HDR
histograms. From the Prometheus perspective, those metrics are output
as summaries [1], not histograms [2]. It's necessary to mark them
accordingly to avoid misinterpretation of the collected statistics.

I updated corresponding unit tests and verified that the updated output
was properly parsed and interpreted by a Prometheus 2.50.0 instance
running on my macOS laptop.

[1] https://prometheus.io/docs/concepts/metric_types/#summary
[2] https://prometheus.io/docs/concepts/metric_types/#histogram

Change-Id: I1375ddf1b0ecd730327cd44b4955813b80107f7b
Reviewed-on: http://gerrit.cloudera.org:8080/21338
Tested-by: Alexey Serbin <ale...@apache.org>
Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>


> Incorrect semantics for Prometheus-style histogram metrics
> ----------------------------------------------------------
>
>                 Key: KUDU-3566
>                 URL: https://issues.apache.org/jira/browse/KUDU-3566
>             Project: Kudu
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.17.0
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: metrics, observability
>
> Original KUDU-3375 implementation incorrectly exposes
> [summary-type Prometheus metrics|https://prometheus.io/docs/concepts/metric_types/#summary]
> as [histogram-type ones|https://prometheus.io/docs/concepts/metric_types/#histogram]
> for data collected by corresponding HDR histograms. For example, below
> are snippets from {{/metrics}} and {{/metrics_prometheus}} for statistics
> on ListMasters RPC.
> The data exposed as Prometheus-style histogram metrics should have been
> reported as summary metrics instead.
> JSON-style:
> {noformat}
> {
>     "name": "handler_latency_kudu_master_MasterService_ListMasters",
>     "total_count": 26,
>     "min": 152,
>     "mean": 301.2692307692308,
>     "percentile_75": 324,
>     "percentile_95": 468,
>     "percentile_99": 844,
>     "percentile_99_9": 844,
>     "percentile_99_99": 844,
>     "max": 844,
>     "total_sum": 7833
> }
> {noformat}
> Prometheus-style counterpart:
> {noformat}
> # HELP kudu_master_handler_latency_kudu_master_MasterService_ListMasters Microseconds spent handling kudu.master.MasterService.ListMasters RPC requests
> # TYPE kudu_master_handler_latency_kudu_master_MasterService_ListMasters histogram
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.75"} 324
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.95"} 468
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.99"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.999"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.9999"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="+Inf"} 26
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_sum{unit_type="microseconds"} 7833
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_count{unit_type="microseconds"} 26
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
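
For illustration only: the sketch below shows how the ListMasters data quoted above could be rendered as a Prometheus summary, following the summary conventions in the Prometheus docs (a `quantile` label on the base metric name, plus `_sum` and `_count` series). It is not Kudu's code and may not match the patched server's output exactly; `render_summary` and `QUANTILE_FIELDS` are hypothetical names, and the mapping from JSON percentile fields to quantile values is an assumption based on the snippets in this issue.

```python
# Hypothetical helper, not part of Kudu: maps the JSON field names shown in
# the issue to Prometheus quantile label values.
QUANTILE_FIELDS = {
    "percentile_75": "0.75",
    "percentile_95": "0.95",
    "percentile_99": "0.99",
    "percentile_99_9": "0.999",
    "percentile_99_99": "0.9999",
}


def render_summary(metric, prefix, help_text, unit):
    """Render one JSON-style metric as Prometheus summary text."""
    name = f"{prefix}_{metric['name']}"
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} summary",
    ]
    # A summary reports observed values at given quantiles, so each sample
    # uses the base name with a quantile label (no _bucket suffix, no le).
    for field, quantile in QUANTILE_FIELDS.items():
        lines.append(
            f'{name}{{unit_type="{unit}",quantile="{quantile}"}} {metric[field]}'
        )
    lines.append(f'{name}_sum{{unit_type="{unit}"}} {metric["total_sum"]}')
    lines.append(f'{name}_count{{unit_type="{unit}"}} {metric["total_count"]}')
    return "\n".join(lines) + "\n"


# Values copied from the JSON-style snippet in the issue description.
list_masters = {
    "name": "handler_latency_kudu_master_MasterService_ListMasters",
    "total_count": 26,
    "percentile_75": 324,
    "percentile_95": 468,
    "percentile_99": 844,
    "percentile_99_9": 844,
    "percentile_99_99": 844,
    "total_sum": 7833,
}

text = render_summary(
    list_masters,
    prefix="kudu_master",
    help_text="Microseconds spent handling "
              "kudu.master.MasterService.ListMasters RPC requests",
    unit="microseconds",
)
print(text, end="")
```

The point of the distinction: in a histogram, each `le` bucket value is a cumulative count of observations at or below that bound, whereas the numbers here (324, 468, 844) are latencies observed at the given percentiles, which is what a summary's `quantile` samples express.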