Fan Xinpu created FLINK-11149:
---------------------------------

             Summary: Flink requests more containers than it actually 
needs
                 Key: FLINK-11149
                 URL: https://issues.apache.org/jira/browse/FLINK-11149
             Project: Flink
          Issue Type: Improvement
          Components: YARN
    Affects Versions: 1.7.0
            Reporter: Fan Xinpu


  As is known, Flink requests new containers when it is notified that an 
allocated container has completed. For example, if one container fails, Flink 
tries to request one container from the RM, but it actually requests n+1 
containers, where n is the number of containers requested so far since the 
cluster was created. This is not graceful.

  When requesting a container, Flink sends a ContainerRequest to the RM through 
the AMRMClient. The AMRMClient stores the ContainerRequest internally and 
expects it to be removed later, but Flink never removes it, so the stored 
ContainerRequests accumulate one by one to an unexpectedly large number.
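
  The accumulation described above can be sketched with a small stand-alone 
model. This is only an illustration under stated assumptions: 
PendingRequestModel and its methods are hypothetical names, not the YARN or 
Flink API, and the model assumes the total sent to the RM equals the number of 
requests the client still has stored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the client-side request bookkeeping.
// It is NOT the real YARN AMRMClient; it only mirrors the behavior
// described in this issue: requests are stored and never removed.
final class PendingRequestModel {
    // Requests the client has recorded and still treats as outstanding.
    private final List<String> pending = new ArrayList<>();

    // Store one request and return the total ask that the next
    // message to the RM would carry (assumption: one per stored request).
    int add(String request) {
        pending.add(request);
        return pending.size();
    }

    // Ask size when a single replacement container is requested after
    // `everRequested` earlier requests that were never removed.
    static int askAfterReplacement(int everRequested) {
        PendingRequestModel client = new PendingRequestModel();
        for (int i = 0; i < everRequested; i++) {
            client.add("initial-" + i);
        }
        return client.add("replacement");
    }

    public static void main(String[] args) {
        // 100 earlier requests plus 1 replacement: n + 1 = 101.
        System.out.println(askAfterReplacement(100)); // prints 101
    }
}
```

  In this model, a cluster that has requested 100 containers asks for 101 when 
it only needs one replacement, which is the n+1 pattern described above.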

  In our environment, a cluster initially allocated 100 containers. Later, when 
it requested one container from the RM, the RM returned more than 2000 
containers, because the request actually carried more than 2000 
ContainerRequests. Although Flink returns the excess containers, this behavior 
wastes time and resources on YARN.

  So, maybe Flink could remove the ContainerRequest after the request has been 
sent to the RM; then Flink would get exactly the number of containers it 
explicitly requested.
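
  A minimal sketch of the proposed behavior, as a stand-alone model rather than 
actual Flink code. RemovingRequestModel, add and onRequestHandled are 
hypothetical names, and the point at which the stored request is dropped (here, 
once the request has been handled) is an assumption. In a real application 
master the removal would presumably go through the YARN client's 
removeContainerRequest method.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the suggested fix; not the YARN or Flink API.
// Each stored request is dropped once it has been handled, so old
// requests no longer inflate later asks.
final class RemovingRequestModel {
    // Requests that are still outstanding.
    private final Deque<String> pending = new ArrayDeque<>();

    // Store one request and return the total ask the next message
    // to the RM would carry (assumption: one per stored request).
    int add(String request) {
        pending.addLast(request);
        return pending.size();
    }

    // Drop the oldest stored request once it has been handled.
    // Does nothing if no request is stored.
    void onRequestHandled() {
        pending.pollFirst();
    }

    // Ask size when one replacement container is requested after
    // `everRequested` earlier requests that were each removed in turn.
    static int askAfterReplacement(int everRequested) {
        RemovingRequestModel client = new RemovingRequestModel();
        for (int i = 0; i < everRequested; i++) {
            client.add("initial-" + i);
            client.onRequestHandled();
        }
        return client.add("replacement");
    }

    public static void main(String[] args) {
        // Earlier requests were removed, so only the new one remains.
        System.out.println(askAfterReplacement(100)); // prints 1
    }
}
```

  With removal in place, the ask for a single replacement stays at one 
container regardless of how many containers were requested earlier.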



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
