[ 
https://issues.apache.org/jira/browse/FLINK-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919382#comment-16919382
 ] 

Chesnay Schepler commented on FLINK-13787:
------------------------------------------

TTL support has been rejected by the Prometheus team, and at this time we will 
not switch to a fork due to the increased risk of relying on an unmaintained 
library.

I would encourage anyone to create a version of the 
PrometheusPushGatewayReporter that supports TTL and share it with the community 
for those who need it.

Since the cleanup logic cannot be fixed reliably without TTL (both on the JM 
and TM side), I'm inclined to close this issue as "Won't fix".
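
For anyone who needs a stopgap in the meantime, a TTL can be approximated 
outside of Flink by periodically deleting metric groups that have not been 
pushed to recently. The following is only a minimal sketch under stated 
assumptions, not a tested tool: the helper names (stale_groups, 
group_delete_url, delete_group) are hypothetical, and the last push time of 
each group has to be collected separately, for example from the 
push_time_seconds metric that the PushGateway exposes per group. It relies on 
the PushGateway's DELETE endpoint of the form 
/metrics/job/<job>{/<label>/<value>}.

```python
import urllib.parse
import urllib.request


def stale_groups(groups, ttl_seconds, now):
    """Return the label sets of groups whose last push is older than the TTL.

    `groups` is a list of (labels, last_push_epoch_seconds) pairs. How these
    pairs are collected is outside this sketch (assumption: they come from
    the push_time_seconds metric of each group).
    """
    return [labels for labels, last_push in groups
            if now - last_push > ttl_seconds]


def group_delete_url(base_url, labels):
    """Build the DELETE URL for one group: /metrics/job/<job>{/<label>/<value>}."""
    parts = ["metrics", "job", _encode(labels["job"])]
    for name in sorted(labels):
        if name != "job":
            parts += [name, _encode(labels[name])]
    return base_url.rstrip("/") + "/" + "/".join(parts)


def _encode(value):
    # Empty values or values containing "/" need the PushGateway's base64
    # label form, which this sketch deliberately does not handle.
    if not value or "/" in value:
        raise ValueError("unsupported label value: %r" % value)
    return urllib.parse.quote(value, safe="")


def delete_group(base_url, labels):
    """Send the DELETE request for one group and return the HTTP status."""
    request = urllib.request.Request(group_delete_url(base_url, labels),
                                     method="DELETE")
    with urllib.request.urlopen(request) as response:
        return response.status
```

Such a script would run periodically (e.g. as a cron job) next to the 
PushGateway, calling delete_group for every entry returned by stale_groups. 
The TTL should be well above the reporter's push interval, otherwise groups 
of jobs that are still running may be deleted between two pushes.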

> PrometheusPushGatewayReporter does not cleanup TM metrics when run on 
> kubernetes
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-13787
>                 URL: https://issues.apache.org/jira/browse/FLINK-13787
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>    Affects Versions: 1.7.2, 1.8.1, 1.9.0
>            Reporter: Kaibo Zhou
>            Priority: Major
>
> I ran a Flink job on Kubernetes using the PrometheusPushGatewayReporter, and 
> I can see the metrics from the Flink jobmanager and taskmanager in the push 
> gateway's UI.
> When I cancel the job, the jobmanager's metrics disappear, but the 
> taskmanager's metrics still exist, even though I have set 
> _deleteOnShutdown_ to true.
> The configuration is:
> {code:java}
> metrics.reporters: "prom"
> metrics.reporter.prom.class: "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter"
> metrics.reporter.prom.jobName: "WordCount"
> metrics.reporter.prom.host: "localhost"
> metrics.reporter.prom.port: "9091"
> metrics.reporter.prom.randomJobNameSuffix: "true"
> metrics.reporter.prom.filterLabelValueCharacters: "true"
> metrics.reporter.prom.deleteOnShutdown: "true"
> {code}
>  
> Other people have also encountered this problem: 
> [https://stackoverflow.com/questions/54420498/flink-prometheus-push-gateway-reporter-delete-metrics-on-job-shutdown].
> A similar issue is FLINK-11457.
>  
> As Prometheus is a very important metrics system on Kubernetes, solving 
> this problem would help users monitor their Flink jobs.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
