[ 
https://issues.apache.org/jira/browse/FLINK-26522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712914#comment-17712914
 ] 

Matthias Pohl commented on FLINK-26522:
---------------------------------------

Sure, I added it to the title (y)

> FLIP-285: Refactoring code for multiple component leader election
> -----------------------------------------------------------------
>
>                 Key: FLINK-26522
>                 URL: https://issues.apache.org/jira/browse/FLINK-26522
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.16.0
>            Reporter: Niklas Semmler
>            Assignee: Matthias Pohl
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: leaderelection-FLINK-26522.class.svg, 
> leaderelection-FLINK-26522.class.v2.svg, 
> leaderelection-flink-1.15+.class.svg, leaderelection-flink-1.15-.class.svg
>
>
> The current implementation of the multiple component leader election faces a 
> number of issues. These issues mostly stem from an attempt to make the 
> multiple leader election process work just the same way as the single 
> component leader election.
> An attempt at listing the issues follows:
> * *Naming* By name, MultipleComponentLeaderElectionService resembles the 
> LeaderElectionService, but it is in fact closer to the LeaderElectionDriver.
> * *Similarity* The interfaces LeaderElectionService, LeaderElectionDriver and 
> MultipleComponentLeaderElectionDriver are very similar to each other.
> * *Cyclic dependency* DefaultMultipleComponentLeaderElectionService holds a 
> reference to the ZooKeeperMultipleComponentLeaderElectionDriver 
> (MultipleComponentLeaderElectionDriver), which in turn holds a reference to 
> the DefaultMultipleComponentLeaderElectionService (LeaderLatchListener).
> * *Unclear contract* With single component leader election drivers such as 
> ZooKeeperLeaderElectionDriver, a call to LeaderElectionService#stop from 
> JobMasterServiceLeadershipRunner#closeAsync implies giving up the leadership 
> of the JobMaster. With the multiple component leader election this is no 
> longer the case: the leadership is held until the HighAvailabilityServices 
> shut down. This logic may be difficult to understand from the perspective of 
> one of the components (e.g., the Dispatcher).
> * *Long call hierarchy* 
> DefaultLeaderElectionService->MultipleComponentLeaderElectionDriverAdapter->MultipleComponentLeaderElectionService->ZooKeeperMultipleComponentLeaderElectionDriver
> * *Long prefix* "MultipleComponentLeaderElection" is quite a long prefix, yet 
> it is shared by many classes.
> * *Adapter as primary implementation* All non-testing non-multiple-component 
> leadership drivers are deprecated. The primary implementation of 
> LeaderElectionDriver is the adapter 
> MultipleComponentLeaderElectionDriverAdapter.
> * *Possible redundancy* We currently have similar methods for the Dispatcher, 
> ResourceManager, JobMaster and WebMonitorEndpoint (e.g., for granting 
> leadership). As these methods are called at the same time due to the multiple 
> component leader election, it may make sense to combine this logic into a 
> single object.
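
To make the "Unclear contract" and "Possible redundancy" points above more concrete, here is a minimal Java sketch of a single object that holds one process-wide leadership and fans it out to all registered components. Every name in it (ComponentContender, SharedLeaderElection, RecordingContender, and all method names) is hypothetical and chosen for illustration only; none of it is taken from the Flink code base or from FLIP-285, and the actual election driver (e.g., ZooKeeper) is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-component callback; NOT the real Flink contender interface. */
interface ComponentContender {
    void grantLeadership();
    void revokeLeadership();
}

/** Hypothetical single object that owns the process-wide leadership (assumption). */
final class SharedLeaderElection {
    private final Map<String, ComponentContender> contenders = new LinkedHashMap<>();
    private boolean hasLeadership;

    /** Registers a component; if leadership is already held, it is granted right away. */
    void register(String componentId, ComponentContender contender) {
        contenders.put(componentId, contender);
        if (hasLeadership) {
            contender.grantLeadership();
        }
    }

    /** Stops one component only; the process-wide leadership is kept. */
    void deregister(String componentId) {
        ComponentContender removed = contenders.remove(componentId);
        if (removed != null && hasLeadership) {
            removed.revokeLeadership();
        }
    }

    /** Would be invoked by the (omitted) driver once leadership is acquired. */
    void onLeadershipAcquired() {
        hasLeadership = true;
        // One place grants leadership to every component at the same time.
        contenders.values().forEach(ComponentContender::grantLeadership);
    }

    /** Only closing the shared object gives up the leadership. */
    void close() {
        if (hasLeadership) {
            hasLeadership = false;
            contenders.values().forEach(ComponentContender::revokeLeadership);
        }
        contenders.clear();
    }

    boolean hasLeadership() {
        return hasLeadership;
    }
}

/** Trivial contender that only records whether it currently considers itself leader. */
final class RecordingContender implements ComponentContender {
    boolean leader;

    @Override
    public void grantLeadership() {
        leader = true;
    }

    @Override
    public void revokeLeadership() {
        leader = false;
    }
}
```

In this sketch, deregistering a single component (say, a "dispatcher" entry) leaves hasLeadership() true, while close() is the only call that releases it. That mirrors the behavior described above, where leadership is held until the HighAvailabilityServices shut down, but it makes the contract visible in one class instead of spreading it across adapters.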



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
