GitHub user beyond1920 opened a pull request: https://github.com/apache/flink/pull/2479
    [FLINK-4537] [cluster management] ResourceManager registration with JobManager

    This pull request implements ResourceManager registration with JobManager, which includes:

    1. Check whether the input resourceManagerLeaderId is the same as the current leadershipSessionId of the ResourceManager. If not, two or more ResourceManagers may exist at the same time and the current ResourceManager is not the proper one, so it rejects or ignores the registration.
    2. Check whether a valid JobMaster exists at the given address by connecting to that address. Reject registrations from an invalid address. (This check is hidden in the connect logic.)
    3. Keep the mapping between JobID and JobMasterGateway.
    4. Start a JobMasterLeaderListener for the given JobID to listen to the leadership of the specified JobMaster.
    5. Send a registration-successful ack to the JobMaster.

    The main differences are six points:

    1. Add a getJobMasterLeaderRetriever method, which returns the job master leader retriever, to HighAvailabilityServices, NonHaServices, an inner class in TaskExecutor, and TestingHighAvailabilityServices.
    2. Change the logic of the registerJobMaster method of ResourceManager based on the steps above.
    3. Change the input parameters of the registerJobMaster method in ResourceManager and ResourceManagerGateway to be consistent with registerTaskExecutor, from jobMasterRegistration to resourceManagerLeaderId + jobMasterAddress + jobID.
    4. Change the result type of the registerJobMaster method in ResourceManager and ResourceManagerGateway to be consistent with RetryingRegistration, from org.apache.flink.runtime.resourcemanager.RegistrationResponse to org.apache.flink.runtime.registration.RegistrationResponse.
    5. Add a LeaderRetrievalListener in ResourceManager to listen to the leadership of the JobMaster.
    6. Add a test class for the registerJobMaster method in ResourceManager.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alibaba/flink jira-4537

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2479.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2479

----
commit fa66ac8ae86745dc9daf1fb07c6c96be4f336c90
Author: beyond1920 <beyond1...@126.com>
Date:   2016-09-01T07:27:20Z

    rsourceManager registration with JobManager

commit f5e54a21e4a864b5ac5f2f548b6d3dea3edcb619
Author: beyond1920 <beyond1...@126.com>
Date:   2016-09-07T09:53:44Z

    Add JobMasterLeaderRetriverListener at ResourceManager

----
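
For readers of the archive, the five registration steps in the description can be sketched roughly as follows. This is a simplified illustration and not the code in the pull request: the class ResourceManagerSketch, the Outcome enum, the stand-in JobMasterGateway class, the connect helper, and the use of plain strings for addresses and job IDs are all assumptions made for the example. The sketch is synchronous and models the leader listener as a set of watched job IDs, whereas the actual change returns a RegistrationResponse through Flink's RPC layer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical, simplified stand-in for the ResourceManager; not Flink's actual class. */
final class ResourceManagerSketch {

    /** Hypothetical outcome type; the pull request uses RegistrationResponse instead. */
    enum Outcome { SUCCESS, DECLINED }

    /** Hypothetical stand-in for the JobMasterGateway RPC interface. */
    static final class JobMasterGateway {
        final String address;

        JobMasterGateway(String address) {
            this.address = address;
        }
    }

    private final UUID leadershipSessionId;
    private final Map<String, JobMasterGateway> jobMasterGateways = new HashMap<>();
    private final Set<String> watchedJobs = new HashSet<>();

    ResourceManagerSketch(UUID leadershipSessionId) {
        this.leadershipSessionId = leadershipSessionId;
    }

    Outcome registerJobMaster(UUID resourceManagerLeaderId, String jobMasterAddress, String jobId) {
        // Step 1: the caller must address the current leader session.
        if (!leadershipSessionId.equals(resourceManagerLeaderId)) {
            return Outcome.DECLINED;
        }
        // Step 2: try to connect; an invalid address yields no gateway.
        JobMasterGateway gateway = connect(jobMasterAddress);
        if (gateway == null) {
            return Outcome.DECLINED;
        }
        // Step 3: remember which gateway belongs to which job.
        jobMasterGateways.put(jobId, gateway);
        // Step 4: start watching the JobMaster leadership for this job
        // (modelled here as a set entry; the real code starts a listener).
        watchedJobs.add(jobId);
        // Step 5: acknowledge the registration.
        return Outcome.SUCCESS;
    }

    boolean isRegistered(String jobId) {
        return jobMasterGateways.containsKey(jobId) && watchedJobs.contains(jobId);
    }

    /** Assumption for this sketch: any non-empty address counts as reachable. */
    private static JobMasterGateway connect(String address) {
        if (address == null || address.isEmpty()) {
            return null;
        }
        return new JobMasterGateway(address);
    }
}
```

In this sketch, a call carrying a stale leader ID or an empty address is declined and leaves no state behind, while a successful call records both the gateway mapping and the watched job before acknowledging.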