[jira] [Updated] (HDDS-13014) Improve PrometheusMetricsSink#normalizeName performance

Ivan Andika (Jira) Sun, 11 May 2025 06:54:20 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ivan Andika updated HDDS-13014:
-------------------------------
    Description: 
From the 5-minute flamegraph of S3G, we see that nearly 40% of the CPU time 
is spent in PrometheusMetrics#putMetrics.
Moreover, 30% of the CPU time is attributed to 
PrometheusMetricsSinkUtil#normalizeName alone, and most of that time is spent 
on regex matching.

We need to find a way to improve the performance.
Two possible improvements:
* Optimize the regex matching or replace it entirely: perhaps we can borrow some 
logic from the Prometheus JMX exporter (https://github.com/prometheus/jmx_exporter) 
or replace it with the jmx_exporter implementation.
* Add a name conversion cache that maps Hadoop metrics names to 
Prometheus metrics names.
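
To make the two ideas above concrete, here is a minimal sketch that combines 
them. It is not the existing Ozone code: the class NameNormalizerSketch and 
its methods are hypothetical, and the conversion rule is a simplified 
assumption (insert "_" where a lowercase letter or digit is followed by an 
uppercase letter, collapse other non-alphanumeric runs into "_", lowercase the 
result). The real normalizeName rules may differ and would need to be matched 
exactly before any replacement.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the Ozone implementation.
class NameNormalizerSketch {

  // Cache from Hadoop-style metric name to its normalized form.
  // Assumption: the set of distinct metric names is small and stable,
  // so an unbounded map is acceptable here.
  private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

  // Single-pass, regex-free conversion under the simplified rule
  // described in the lead-in (ASCII letters and digits only).
  static String normalize(String name) {
    StringBuilder sb = new StringBuilder(name.length() + 8);
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      if (c >= 'A' && c <= 'Z') {
        // Word boundary: previous char is a lowercase letter or a digit.
        if (i > 0) {
          char p = name.charAt(i - 1);
          if ((p >= 'a' && p <= 'z') || (p >= '0' && p <= '9')) {
            sb.append('_');
          }
        }
        sb.append(Character.toLowerCase(c));
      } else if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) {
        sb.append(c);
      } else if (sb.length() == 0 || sb.charAt(sb.length() - 1) != '_') {
        // Collapse any run of other characters into a single underscore.
        sb.append('_');
      }
    }
    return sb.toString();
  }

  // Cached variant: the conversion runs once per distinct name.
  static String cachedNormalize(String name) {
    return CACHE.computeIfAbsent(name, NameNormalizerSketch::normalize);
  }

  public static void main(String[] args) {
    System.out.println(cachedNormalize("RpcQueueTimeNumOps"));
  }
}
```

Under these assumptions, "RpcQueueTimeNumOps" becomes 
"rpc_queue_time_num_ops", and repeated calls with the same name are served 
from the map without redoing the conversion. Whether a cache, a regex-free 
rewrite, or both is the right fix would depend on how many distinct names the 
sink sees and should be confirmed with a new flamegraph.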


  was:
From the 5-minute flamegraph of S3G, we see that nearly 40% of the CPU time 
is spent in PrometheusMetrics#putMetrics.
Moreover, 30% of the CPU time is attributed to 
PrometheusMetricsSinkUtil#normalizeName alone, and most of that time is spent 
on regex matching.

We need to find a way to improve the performance, possibly by removing regex 
matching entirely. Perhaps we can borrow some logic from the Prometheus JMX exporter 
(https://github.com/prometheus/jmx_exporter) or replace it with the jmx_exporter 
implementation.



> Improve PrometheusMetricsSink#normalizeName performance
> -------------------------------------------------------
>
>                 Key: HDDS-13014
>                 URL: https://issues.apache.org/jira/browse/HDDS-13014
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>         Attachments: s3g-5m-flamegraph.html
>



