[ 
https://issues.apache.org/jira/browse/FLINK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288227#comment-17288227
 ] 

Till Rohrmann commented on FLINK-20332:
---------------------------------------

Sounds good to me like this as a first step.

> Add workers recovered from previous attempt to pending resources
> ----------------------------------------------------------------
>
>                 Key: FLINK-20332
>                 URL: https://issues.apache.org/jira/browse/FLINK-20332
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>
> For active deployments (Native K8s/Yarn/Mesos), after a JM failover, workers 
> from previous attempt should register to the new JM. Depending on the order 
> that slot requests and TM registrations arrive at the RM, it could happen 
> that RM allocates unnecessary new resources while there are recovered 
> resources that can be reused.
> A potential improvement is to add recovered workers to pending resources, so 
> that RM knows what resources are expected to be available soon and decide 
> whether to allocate new resources accordingly.
> See also the discussion in FLINK-20249.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to