[ https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384069#comment-15384069 ]
ASF GitHub Bot commented on FLINK-4152:
---------------------------------------

Github user tillrohrmann commented on the issue:

    https://github.com/apache/flink/pull/2257

    I think by imposing this contract we actually introduced problems we didn't have before. Thus, in order to remedy these problems for the upcoming release and go back a bit in the direction of the pre-RM age, I loosened the contract of the RM so that it can no longer reject TM registrations. This makes sense in my opinion, since we don't have a means to shut down orphaned TMs anyway. So in this version, the RM's task is to ensure that at least a predefined set of resources is allocated and to notify the JM about a TM death (not strictly mandatory).

    What we could actually do is to also register orphaned TMs (or ones registered by a different RM). Then we wouldn't have the problem that we allocate too many resources for a JM.

> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>
>                 Key: FLINK-4152
>                 URL: https://issues.apache.org/jira/browse/FLINK-4152
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, TaskManager, YARN Client
>            Reporter: Robert Metzger
>            Assignee: Till Rohrmann
>         Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many
> messages when registering at the JobManager.
> This is the log file:
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that
> this is the expected behavior.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)