[ https://issues.apache.org/jira/browse/FLINK-26522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712914#comment-17712914 ]
Matthias Pohl commented on FLINK-26522:
---------------------------------------

Sure, I added it to the title (y)

> FLIP-285: Refactoring code for multiple component leader election
> -----------------------------------------------------------------
>
>                 Key: FLINK-26522
>                 URL: https://issues.apache.org/jira/browse/FLINK-26522
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.16.0
>            Reporter: Niklas Semmler
>            Assignee: Matthias Pohl
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: leaderelection-FLINK-26522.class.svg,
> leaderelection-FLINK-26522.class.v2.svg,
> leaderelection-flink-1.15+.class.svg, leaderelection-flink-1.15-.class.svg
>
>
> The current implementation of the multiple component leader election faces a
> number of issues. These issues mostly stem from an attempt to make the
> multiple leader election process work just the same way as the single
> component leader election.
> An attempt at listing the issues follows:
> * *Naming* MultipleComponentLeaderElectionService appears by name similar to
> the LeaderElectionService, but is in fact closer to the LeaderElectionDriver.
> * *Similarity* The interfaces LeaderElectionService, LeaderElectionDriver and
> MultipleComponentLeaderElectionDriver are very similar to each other.
> * *Cyclic dependency* DefaultMultipleComponentLeaderElectionService holds a
> reference to the ZooKeeperMultipleComponentLeaderElectionDriver
> (MultipleComponentLeaderElectionDriver), which in turn holds a reference to
> the DefaultMultipleComponentLeaderElectionService (LeaderLatchListener).
> * *Unclear contract* With single component leader election drivers such as
> ZooKeeperLeaderElectionDriver a call to the LeaderElectionService#stop from
> JobMasterServiceLeadershipRunner#closeAsync implies giving up the leadership
> of the JobMaster. With the multiple component leader election this is no
> longer the case. The leadership is held until the HighAvailabilityServices
> shutdown. This logic may be difficult to understand from the perspective of
> one of the components (e.g., the Dispatcher).
> * *Long call hierarchy*
> DefaultLeaderElectionService->MultipleComponentLeaderElectionDriverAdapter->MultipleComponentLeaderElectionService->ZooKeeperMultipleComponentLeaderElectionDriver
> * *Long prefix* "MultipleComponentLeaderElection" is quite a long prefix but
> shared by many classes.
> * *Adapter as primary implementation* All non-testing non-multiple-component
> leadership drivers are deprecated. The primary implementation of
> LeaderElectionDriver is the adapter
> MultipleComponentLeaderElectionDriverAdapter.
> * *Possible redundancy* We currently have similar methods for the Dispatcher,
> ResourceManager, JobMaster and WebMonitorEndpoint. (E.g., for granting
> leadership.) As these methods are called at the same time due to the multiple
> component leader election, it may make sense to combine this logic into a
> single object.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)