[ https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384005#comment-15384005 ]
ASF GitHub Bot commented on FLINK-4152:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2257#discussion_r71322624

    --- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
    @@ -78,6 +79,9 @@
     	/** The containers where a TaskManager is starting and we are waiting for it to register */
     	private final Map<ResourceID, YarnContainerInLaunch> containersInLaunch;

    +	/** The container where a TaskManager has been started and is running in */
    +	private final Map<ResourceID, Container> containersLaunched;
    --- End diff --

    It is true that it holds the registered resources, but it does not hold the launched containers. When a `JobManager` loses its leadership, the list of registered workers is cleared. To reconstruct the mapping `ResourceID --> Container` afterwards, you need this new map.


> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>
>                 Key: FLINK-4152
>                 URL: https://issues.apache.org/jira/browse/FLINK-4152
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, TaskManager, YARN Client
>            Reporter: Robert Metzger
>            Assignee: Till Rohrmann
>         Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many
> messages when registering at the JobManager.
> This is the log file:
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that
> this is the expected behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
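
For readers following the review comment on `containersLaunched`, here is a minimal sketch of the idea, not the code from the pull request. It assumes plain `String` resource IDs and a hypothetical `Container` stand-in rather than Flink's `ResourceID` and YARN's `Container`; the class and method names (`ContainerTracking`, `containerLaunched`, `workerRegistered`, `leadershipLost`) are made up for illustration. The point it shows: the registered-worker list is cleared on leadership loss, while the launched-container map survives and lets a re-registering worker be matched to its container again.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical, not Flink's API.
class ContainerTracking {

    /** Hypothetical stand-in for a YARN container. */
    static final class Container {
        final String host;
        Container(String host) { this.host = host; }
    }

    /** Containers in which a TaskManager has been started; survives leadership loss. */
    private final Map<String, Container> containersLaunched = new HashMap<>();

    /** Workers currently registered; cleared when leadership is lost. */
    private final Map<String, Container> registeredWorkers = new HashMap<>();

    void containerLaunched(String resourceId, Container container) {
        containersLaunched.put(resourceId, container);
    }

    /** Registers a worker if its container is known; returns false otherwise. */
    boolean workerRegistered(String resourceId) {
        Container container = containersLaunched.get(resourceId);
        if (container == null) {
            return false; // no launched container known for this resource ID
        }
        registeredWorkers.put(resourceId, container);
        return true;
    }

    /** Models the leadership change: only the registered-worker list is cleared. */
    void leadershipLost() {
        registeredWorkers.clear();
    }

    int registeredCount() {
        return registeredWorkers.size();
    }

    public static void main(String[] args) {
        ContainerTracking tracking = new ContainerTracking();
        tracking.containerLaunched("tm-1", new Container("host-a"));
        tracking.workerRegistered("tm-1");
        tracking.leadershipLost();
        // The worker re-registers; its container is recovered from containersLaunched.
        tracking.workerRegistered("tm-1");
        System.out.println("registered after re-registration: " + tracking.registeredCount());
    }
}
```

Without the second map, the lookup in `workerRegistered` would have nothing to consult after `leadershipLost()`, which is the gap the comment points out.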