[ https://issues.apache.org/jira/browse/FLINK-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607051#comment-16607051 ]
Helmut Zechmann commented on FLINK-10300:
-----------------------------------------

Yes, we had the problem with flink 1.5.2, but I also reproduced it with flink 1.5.3 using the steps described above.

> Prometheus job level metrics not removed after job finished
> -----------------------------------------------------------
>
>                 Key: FLINK-10300
>                 URL: https://issues.apache.org/jira/browse/FLINK-10300
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics
>    Affects Versions: 1.5.3
>            Reporter: Helmut Zechmann
>            Priority: Major
>
> Flink provides job level metrics for flink jobs. After a job is finished
> those metrics should be removed, else we run into problems when many jobs are
> executed on a cluster.
> How to reproduce this:
> Setup:
> * flink 1.5.3 in standalone mode
> * 1 JobManager
> * 1 TaskManager
> * flink-metrics-prometheus-1.5.3.jar in lib dir
> Metrics config:
> {code:java}
> metrics.reporters: prom
> metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
> metrics.reporter.prom.port: 7000-7001
> {code}
> Run the wordcount job. After running the job, job related metrics are still
> available:
>
> {code:java}
> flink_taskmanager_Status_JVM_GarbageCollector_G1_Old_Generation_Count{tm_id="ee893c28f70d285e701f838706ce8810",host="helmuts_mbp",} 1.0
> # HELP flink_taskmanager_job_task_operator_numRecordsOutPerSecond numRecordsOutPerSecond (scope: taskmanager_job_task_operator)
> # TYPE flink_taskmanager_job_task_operator_numRecordsOutPerSecond gauge
> flink_taskmanager_job_task_operator_numRecordsOutPerSecond{job_id="2a7c77aacf6b18da389189a3bae6ff48",task_id="529e7a1eaba520b18dc7864f821ada08",task_attempt_id="3bc0d07eb56df676b088a8ec13531c98",host="helmuts_mbp",operator_id="529e7a1eaba520b18dc7864f821ada08",operator_name="DataSource__at_getDefaultTextLineDataSet_WordCountData_java:70___org_apache_flin",task_name="CHAIN_DataSource__at_getDefaultTextLineDataSet_WordCountData_java:70___org_apache_flink_api_java_io_CollectionInputFormat______FlatMap__FlatMap_at_main_WordCount_java:77______Combine__SUM_1___at_main_WordCount_java:80_",task_attempt_num="0",job_name="Flink_Java_Job_at_Fri_Sep_07_13:00:12_CEST_2018",tm_id="ee893c28f70d285e701f838706ce8810",subtask_index="0",} 0.0
> ...
> {code}
>
> With each finished job the prometheus output gets bigger and bigger until
> the prometheus output fails to load.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)