[GitHub] [flink] xintongsong commented on pull request #15524: [FLINK-21667][runtime] Defer starting ResourceManager to after obtaining leadership.

GitBox Sun, 23 May 2021 20:22:40 -0700


xintongsong commented on pull request #15524:
URL: https://github.com/apache/flink/pull/15524#issuecomment-846704389



   @tillrohrmann 
   
   Not necessarily. AFAIK, the Yarn RM requires each AM to register only once, 
but it does not require using the same `AMRMClient(Async)`. That means, in a 
later leader session, we can instantiate a new `AMRMClient(Async)` to interact 
with the Yarn RM, as long as the AM has been registered in the first leader 
session. This is my understanding based on the Yarn interfaces and docs, and 
would need further verification to be sure. If this proves true, we may 
simply catch and ignore the registration exception if it's caused by a 
duplicate registration, or track in RMService (which is responsible for leader 
election and starting the RM on obtaining leadership) whether this is the first 
leader session, and skip the registration if it is not.
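
   A minimal sketch of the second option (tracking whether registration has 
already happened), just to illustrate the idea. `LeaderSessionRegistration` and 
`AmRegistrar` are hypothetical names, not existing Flink or Yarn classes; the 
registrar stands in for whatever call actually registers the AM with the Yarn RM.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper that outlives individual leader sessions within one AM attempt. */
class LeaderSessionRegistration {

    /** Hypothetical stand-in for the call that registers the AM with the Yarn RM. */
    interface AmRegistrar {
        void registerApplicationMaster() throws Exception;
    }

    // Set once the AM has been registered in some leader session of this attempt.
    private final AtomicBoolean registered = new AtomicBoolean(false);
    private final AmRegistrar registrar;

    LeaderSessionRegistration(AmRegistrar registrar) {
        this.registrar = registrar;
    }

    /**
     * Called each time leadership is obtained.
     *
     * @return true if this call performed the registration, false if it was skipped
     */
    boolean onGrantLeadership() throws Exception {
        if (!registered.compareAndSet(false, true)) {
            // A previous leader session already registered the AM: skip.
            return false;
        }
        try {
            registrar.registerApplicationMaster();
            return true;
        } catch (Exception e) {
            // Registration failed, so allow a later leader session to retry.
            registered.set(false);
            throw e;
        }
    }
}
```

   With this, only the first successful leader session registers; later 
sessions would create a fresh client and go straight to requesting resources.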
   
   Another challenge I can see now is handling resource changes between two 
leader sessions. Currently, the new leader RM relies on 
`RegisterApplicationMasterResponse` to find out what containers have already 
been allocated. This is based on the assumption that existing containers must 
have come from previous attempts. With multiple leader sessions, inheriting 
containers from previous leader sessions becomes non-trivial, because 
`AMRMClient(Async)` does not provide interfaces for getting all currently 
allocated containers. Two potential solutions are:
   - Leveraging `YarnClient#getContainers`. `YarnClient` is meant to be used on 
the client side rather than by an AM. I'm not sure whether there are any 
pitfalls in using it in an AM. Hopefully not.
   - Alternatively, we can maintain a Yarn-specific component outside Flink's RM 
(as you've mentioned). The component can be reused across multiple Flink RMs / 
leader sessions. This component would be responsible for registering the AM 
with the Yarn RM, as well as receiving resource events from the Yarn RM, which 
would be forwarded to the leader RM.
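
   To make the second option more concrete, here is a rough sketch of such a 
component, under the assumption that containers can be identified by plain 
string ids. `YarnSessionBridge` and `ResourceEventListener` are hypothetical 
names for illustration only, not existing Flink or Yarn types.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical Yarn-facing component that lives across leader sessions. */
class YarnSessionBridge {

    /** Hypothetical callback implemented by the RM of the current leader session. */
    interface ResourceEventListener {
        void onContainersAllocated(Set<String> containerIds);
        void onContainersCompleted(Set<String> containerIds);
    }

    // Containers that Yarn has allocated and not yet reported as completed.
    private final Set<String> liveContainers = new LinkedHashSet<>();

    // Null while no RM holds leadership; events are then only recorded.
    private ResourceEventListener currentLeader;

    /** Attaches a new leader RM and returns a snapshot of the containers it inherits. */
    synchronized Set<String> attachLeader(ResourceEventListener leader) {
        currentLeader = leader;
        return new LinkedHashSet<>(liveContainers);
    }

    /** Detaches the leader RM when leadership is revoked. */
    synchronized void detachLeader() {
        currentLeader = null;
    }

    /** Entry point for allocation events coming from the Yarn RM. */
    synchronized void containersAllocated(Set<String> ids) {
        liveContainers.addAll(ids);
        if (currentLeader != null) {
            currentLeader.onContainersAllocated(ids);
        }
    }

    /** Entry point for completion events coming from the Yarn RM. */
    synchronized void containersCompleted(Set<String> ids) {
        liveContainers.removeAll(ids);
        if (currentLeader != null) {
            currentLeader.onContainersCompleted(ids);
        }
    }
}
```

   The snapshot returned by `attachLeader` would play the role that 
`RegisterApplicationMasterResponse` plays today for recovering previously 
allocated containers.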
   
   Personally, I think skipping registration when it is not the first leader 
session and leveraging `YarnClient` sounds promising. If it turns out that we 
have to maintain something Yarn-specific across multiple Flink RMs, I would 
lean towards not introducing such complexity and keeping things as they are, 
with each attempt having only one leader session.
   
   And again, I'd suggest scoping this issue out of the current PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
