Thanks for participating in the discussion, Yang & Chesnay. LeaderElection interface extension gave me a headache as well. I added it initially because I thought it would be of more value. But essentially, it doesn't help but make the code harder to understand (as your questions rightfully point out). I agree that the FLIP is good enough without this extension. I moved it into the Rejected Alternatives section of the FLIP and would propose going ahead without it.
I will answer your questions about the LeaderElection extension, anyway: BTW, if the *LeaderElectionService#register(return LeaderElection)* and > *LeaderElectionService#onGrantLeadership* are guarded by a same lock, then > we could ensure that the leaderElection in *LeaderContender* is always > non-null when it tries to confirm the leadership. And then we do not need > the > *LeaderContender#initializeLeaderElection*. Right? No, we still would need LeaderContender#initializeLeaderElection because the LeaderElectionService needs to be capable of setting the LeaderElection within the LeaderContender before triggering the process for granting the leadership. This all needs to happen within the LeaderElectionService#register(LeaderContender). It's indepent of the lock. With the extension, how does the leader contender get access to the > LeaderElection? I would've assumed that LEService returns a LeaderElection > when register is called, but according to the diagram this method doesn't > return anything. Is that what initiateLeaderElection is doing? Correct. My initial plan was to make LeaderElectionService#register(LeaderContender) return the LeaderElection instance. That method could have been called within the LeaderContender. But this approach has the flaw that LeaderContender would be in charge within this control flow where, actually, we would want LeaderElectionService to be still in charge to trigger the process for granting the leadership. This required the LeaderContender.initializeLeaderElection(LeaderElection) method to be added to enable the LeaderElectionService to do the initialization. I added a comment to the corresponding class diagram to make this clearer. The DefaultLeaderElection will rely on package-private methods of the > DLEService to handle confirm/hasLeadership calls? Correct. I added the missing package-private methods to the class diagram in the FLIP to clear things up. On Wed, Jan 18, 2023 at 11:47 AM Chesnay Schepler <ches...@apache.org> wrote: > There are a lot of good things in this, and until the Extension bit I'm > fully on board. > > With the extension, how does the leader contender get access to the > LeaderElection? I would've assumed that LEService returns a > LeaderElection when register is called, but according to the diagram > this method doesn't return anything. Is that what initiateLeaderElection > is doing? > > The DefaultLeaderElection will rely on package-private methods of the > DLEService to handle confirm/hasLeadership calls? > > I don't fully understand why LContender#initializeLeaderElection is > required. > > On 05/01/2023 14:49, Matthias Pohl wrote: > > Hi everyone, > > I brought up FLINK-26522 [1] in the mailing list discussion about > > consolidating the HighAvailabilityServices interfaces [2], previously. > > There, it was concluded that the community still wants the ability to > have > > per-component leader election and, therefore, keep the > > HighAvailabilityServices interface as is. I went back to work on > > FLINK-26522 [1] to figure out how we can simplify the current codebase > > keeping the decision in mind. > > > > I wanted to handle FLINK-26522 [1] as a follow-up cleanup task of > > FLINK-24038 [3]. But while working on it, I realized that even > FLINK-24038 > > [3] shouldn't have been handled without a FLIP. The per-process leader > > election which was introduced in FLINK-24038 [3] changed the ownership of > > certain components. This is actually a change that should have been > > discussed in the mailing list and deserved a FLIP. To overcome this > > shortcoming of FLINK-24038 [3], I decided to prepare FLIP-285 [4] to > > provide proper documentation of what happened in FLINK-24038 and what > will > > be manifested with resolving its follow-up FLINK-26522 [1]. > > > > Conceptually, this FLIP proposes moving away from Flink's support for > > single-contender LeaderElectionServices and introducing multi-contender > > support by disconnecting the HA-backend leader election lifecycle from > the > > LeaderContender's lifecycle. This allows us to provide LeaderElection per > > component (as it was requested in [2]) but also enables us to utilize a > > single leader election for multiple components/contenders as well without > > the complexity of the code that was introduced by FLINK-24038 [3]. > > > > I'm looking forward to your comments. > > > > Matthias > > > > [1] https://issues.apache.org/jira/browse/FLINK-26522 > > [2] https://lists.apache.org/thread/9oy2ml9s3j1v6r77h31sndhc3gw57cfm > > [3] https://issues.apache.org/jira/browse/FLINK-24038 > > [4] > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box > > > >