[ https://issues.apache.org/jira/browse/FLINK-21510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-21510:
-----------------------------------
      Labels: auto-deprioritized-major auto-unassigned reactive  (was: auto-unassigned reactive stale-major)
    Priority: Minor  (was: Major)

This issue was labeled "stale-major" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> ExecutionGraph metrics collide on restart
> -----------------------------------------
>
>                 Key: FLINK-21510
>                 URL: https://issues.apache.org/jira/browse/FLINK-21510
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Chesnay Schepler
>            Priority: Minor
>              Labels: auto-deprioritized-major, auto-unassigned, reactive
>
> The ExecutionGraphBuilder registers several metrics directly on the
> JobManagerJobMetricGroup, and these are never cleaned up.
> They include upTime/DownTime/restartingTime as well as various checkpointing
> metrics (see the CheckpointStatsTracker for details; examples are the number
> of checkpoints, checkpoint sizes, etc.).
> When the AdaptiveScheduler re-creates the ExecutionGraph (EG), these collide
> with the metrics of prior attempts.
> Essentially, we need to either create a separate metric group that we pass to
> the EG or refactor the metrics to be based on some mutable EG reference.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
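
The collision described in the ticket, and the first of the two proposed fixes (a separate metric group passed to the EG), can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyMetricGroup, buildExecutionGraph, and the metric name "numberOfCheckpoints" are hypothetical stand-ins and are not Flink's metric API. The sketch also assumes that a duplicate registration is counted and ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: every class and method here is hypothetical, not Flink's API. */
public class PerAttemptMetricsSketch {

    /** Simplified metric group: a name-to-metric registry with child groups. */
    static class ToyMetricGroup {
        private final Map<String, Object> metrics = new HashMap<>();
        private final Map<String, ToyMetricGroup> children = new HashMap<>();
        private int collisions = 0;

        /** Registers a metric; a duplicate name is counted as a collision and ignored. */
        void register(String name, Object metric) {
            if (metrics.putIfAbsent(name, metric) != null) {
                collisions++;
            }
        }

        /** Creates a fresh child group under the given name, replacing any old one. */
        ToyMetricGroup addGroup(String name) {
            ToyMetricGroup child = new ToyMetricGroup();
            children.put(name, child);
            return child;
        }

        /** Drops a child group together with every metric registered on it. */
        void removeGroup(String name) {
            children.remove(name);
        }

        int collisions() {
            return collisions;
        }
    }

    /** Stand-in for building an EG: registers a few illustrative metric names. */
    static void buildExecutionGraph(ToyMetricGroup group) {
        group.register("upTime", 0L);
        group.register("restartingTime", 0L);
        group.register("numberOfCheckpoints", 0L); // illustrative name
    }

    /** Every attempt registers directly on the job-level group, as in the ticket. */
    public static int collisionsWithSharedGroup(int attempts) {
        ToyMetricGroup jobGroup = new ToyMetricGroup();
        for (int i = 0; i < attempts; i++) {
            buildExecutionGraph(jobGroup);
        }
        return jobGroup.collisions();
    }

    /** Every attempt gets its own child group, discarded before the next attempt. */
    public static int collisionsWithPerAttemptGroup(int attempts) {
        ToyMetricGroup jobGroup = new ToyMetricGroup();
        int total = 0;
        for (int i = 0; i < attempts; i++) {
            ToyMetricGroup egGroup = jobGroup.addGroup("executionGraph");
            buildExecutionGraph(egGroup);
            total += egGroup.collisions();
            jobGroup.removeGroup("executionGraph");
        }
        return total + jobGroup.collisions();
    }

    public static void main(String[] args) {
        System.out.println("shared group: " + collisionsWithSharedGroup(2));
        System.out.println("per-attempt group: " + collisionsWithPerAttemptGroup(2));
    }
}
```

In this model, re-creating the graph on the shared group repeats every registration. The per-attempt group avoids that because the old registrations are dropped together with the group.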