[ 
https://issues.apache.org/jira/browse/KUDU-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-3566:
--------------------------------
    Status: In Review  (was: Open)

> Incorrect semantics for Prometheus-style histogram metrics
> ----------------------------------------------------------
>
>                 Key: KUDU-3566
>                 URL: https://issues.apache.org/jira/browse/KUDU-3566
>             Project: Kudu
>          Issue Type: Bug
>          Components: master, tserver
>    Affects Versions: 1.17.0
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: metrics, observability
>
> The original KUDU-3375 implementation incorrectly exposes [summary-type 
> Prometheus metrics|https://prometheus.io/docs/concepts/metric_types/#summary] 
> as [histogram-type 
> ones|https://prometheus.io/docs/concepts/metric_types/#histogram] for data 
> collected by the corresponding HDR histograms.  The data exposed as 
> Prometheus-style histogram metrics should have been reported as summary 
> metrics instead.
> For example, below are snippets from {{/metrics}} and 
> {{/metrics_prometheus}} with statistics on the ListMasters RPC.
> JSON-style:
> {noformat}
> {
>     "name": "handler_latency_kudu_master_MasterService_ListMasters",
>     "total_count": 26,
>     "min": 152,
>     "mean": 301.2692307692308,
>     "percentile_75": 324,
>     "percentile_95": 468,
>     "percentile_99": 844,
>     "percentile_99_9": 844,
>     "percentile_99_99": 844,
>     "max": 844,
>     "total_sum": 7833
> }
> {noformat}
> Prometheus-style counterpart:
> {noformat}
> # HELP kudu_master_handler_latency_kudu_master_MasterService_ListMasters Microseconds spent handling kudu.master.MasterService.ListMasters RPC requests
> # TYPE kudu_master_handler_latency_kudu_master_MasterService_ListMasters histogram
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.75"} 324
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.95"} 468
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.99"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.999"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="0.9999"} 844
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_bucket{unit_type="microseconds", le="+Inf"} 26
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_sum{unit_type="microseconds"} 7833
> kudu_master_handler_latency_kudu_master_MasterService_ListMasters_count{unit_type="microseconds"} 26
> {noformat}
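
For illustration: in a Prometheus histogram, the {{le}} label is an upper bound on the observed value and each bucket's value is a cumulative count of observations. In the output above, {{le}} carries a quantile and the value carries a latency, which is what a summary's {{quantile}} label is meant for. The Python sketch below renders the same JSON snapshot in summary form. It is not Kudu's code or the actual fix: the names {{render_summary}}, {{fmt_labels}}, {{QUANTILES}} and {{SNAPSHOT}} are hypothetical, and the metric prefix and label order are assumptions taken from the snippet above.

```python
# Sketch only: a hypothetical renderer, not Kudu's implementation or its fix.

# JSON field name -> value of the Prometheus "quantile" label.
QUANTILES = [
    ("percentile_75", "0.75"),
    ("percentile_95", "0.95"),
    ("percentile_99", "0.99"),
    ("percentile_99_9", "0.999"),
    ("percentile_99_99", "0.9999"),
]

# Values copied from the JSON-style snippet in the issue description.
SNAPSHOT = {
    "name": "handler_latency_kudu_master_MasterService_ListMasters",
    "total_count": 26,
    "percentile_75": 324,
    "percentile_95": 468,
    "percentile_99": 844,
    "percentile_99_9": 844,
    "percentile_99_99": 844,
    "total_sum": 7833,
}


def fmt_labels(pairs):
    """Render [(key, value), ...] as a label set, or '' when there are none."""
    if not pairs:
        return ""
    return "{" + ",".join(f'{k}="{v}"' for k, v in pairs) + "}"


def render_summary(prefix, snapshot, help_text, labels):
    """Return summary-type exposition text for one histogram snapshot."""
    name = prefix + snapshot["name"]
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} summary"]
    for field, quantile in QUANTILES:
        # One sample per quantile: the label is the quantile, the value is
        # the latency observed at that quantile.
        label_set = fmt_labels(labels + [("quantile", quantile)])
        lines.append(f"{name}{label_set} {snapshot[field]}")
    lines.append(f"{name}_sum{fmt_labels(labels)} {snapshot['total_sum']}")
    lines.append(f"{name}_count{fmt_labels(labels)} {snapshot['total_count']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_summary(
        "kudu_master_",  # assumed prefix, as seen in the snippet above
        SNAPSHOT,
        "Microseconds spent handling kudu.master.MasterService.ListMasters RPC requests",
        [("unit_type", "microseconds")],
    ), end="")
```

With this shape there are no {{_bucket}} series at all: the quantile samples use the bare metric name, while {{_sum}} and {{_count}} keep the same values as before.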



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
