Having the *start()* in *LeaderContender* interface and bringing back the
*LeaderElection* with some new methods make sense to me.

I have no more concerns now.


>    - *LeaderContender*: The LeaderContender is integrated as usual except
>    that it accesses the LeaderElection instead of the LeaderElectionService.
>    It's going to call startLeaderElection(LeaderContender) where, previously,
>    LeaderElectionService.start(LeaderContender) was called.
>
> nit: we call the *LeaderElection#startLeaderElection()*, not the
*LeaderElection#startLeaderElection(LeaderContender)*. Because we have
already set the leaderContender in the
*LeaderElection#register(LeaderContender)*.


Best,
Yang


Chesnay Schepler <ches...@apache.org> 于2023年1月23日周一 23:16写道:

> Thanks for updating the design. From my side this looks good.
>
> On 18/01/2023 17:59, Matthias Pohl wrote:
> > After another round of discussion, I came up with a (hopefully) final
> > proposal. The previously discussed approach was still not optimal because
> > the contender ID lived in the LeaderContender even though it is actually
> > LeaderElectionService-internal knowledge. Fixing that helped fix the
> > overall architecture. Additionally, it brought back the LeaderElection
> > interface with slightly different methods.
> >
> > I updated the "Code Cleanup: Merge MultipleComponentLeaderElectionService
> > into LeaderElectionService" section and moved the old proposal into the
> > section for rejected alternatives. Feel free to have another look at the
> > updated version [1].
> >
> > Matthias
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box
> >
> > On Wed, Jan 18, 2023 at 1:40 PM Matthias Pohl <matthias.p...@aiven.io>
> > wrote:
> >
> >> Thanks for participating in the discussion, Yang & Chesnay.
> LeaderElection
> >> interface extension gave me a headache as well. I added it initially
> >> because I thought it would be of more value. But essentially, it doesn't
> >> help but make the code harder to understand (as your questions
> rightfully
> >> point out). I agree that the FLIP is good enough without this
> extension. I
> >> moved it into the Rejected Alternatives section of the FLIP and would
> >> propose going ahead without it.
> >>
> >> I will answer your questions about the LeaderElection extension, anyway:
> >>
> >> BTW, if the *LeaderElectionService#register(return LeaderElection)* and
> >>> *LeaderElectionService#onGrantLeadership* are guarded by a same lock,
> then
> >>> we could ensure that the leaderElection in *LeaderContender* is always
> >>> non-null when it tries to confirm the leadership. And then we do not
> need
> >>> the
> >>> *LeaderContender#initializeLeaderElection*. Right?
> >>
> >> No, we still would need LeaderContender#initializeLeaderElection because
> >> the LeaderElectionService needs to be capable of setting the
> LeaderElection
> >> within the LeaderContender before triggering the process for granting
> the
> >> leadership. This all needs to happen within the
> >> LeaderElectionService#register(LeaderContender). It's indepent of the
> lock.
> >>
> >> With the extension, how does the leader contender get access to the
> >>> LeaderElection? I would've assumed that LEService returns a
> LeaderElection
> >>> when register is called, but according to the diagram this method
> doesn't
> >>> return anything. Is that what initiateLeaderElection is doing?
> >>
> >> Correct. My initial plan was to make
> >> LeaderElectionService#register(LeaderContender) return the
> LeaderElection
> >> instance. That method could have been called within the LeaderContender.
> >> But this approach has the flaw that LeaderContender would be in charge
> >> within this control flow where, actually, we would want
> >> LeaderElectionService to be still in charge to trigger the process for
> >> granting the leadership. This required the
> >> LeaderContender.initializeLeaderElection(LeaderElection) method to be
> added
> >> to enable the LeaderElectionService to do the initialization. I added a
> >> comment to the corresponding class diagram to make this clearer.
> >>
> >> The DefaultLeaderElection will rely on package-private methods of the
> >>> DLEService to handle confirm/hasLeadership calls?
> >>
> >> Correct. I added the missing package-private methods to the class
> diagram
> >> in the FLIP to clear things up.
> >>
> >> On Wed, Jan 18, 2023 at 11:47 AM Chesnay Schepler <ches...@apache.org>
> >> wrote:
> >>
> >>> There are a lot of good things in this, and until the Extension bit I'm
> >>> fully on board.
> >>>
> >>> With the extension, how does the leader contender get access to the
> >>> LeaderElection? I would've assumed that LEService returns a
> >>> LeaderElection when register is called, but according to the diagram
> >>> this method doesn't return anything. Is that what
> initiateLeaderElection
> >>> is doing?
> >>>
> >>> The DefaultLeaderElection will rely on package-private methods of the
> >>> DLEService to handle confirm/hasLeadership calls?
> >>>
> >>> I don't fully understand why LContender#initializeLeaderElection is
> >>> required.
> >>>
> >>> On 05/01/2023 14:49, Matthias Pohl wrote:
> >>>> Hi everyone,
> >>>> I brought up FLINK-26522 [1] in the mailing list discussion about
> >>>> consolidating the HighAvailabilityServices interfaces [2], previously.
> >>>> There, it was concluded that the community still wants the ability to
> >>> have
> >>>> per-component leader election and, therefore, keep the
> >>>> HighAvailabilityServices interface as is. I went back to work on
> >>>> FLINK-26522 [1] to figure out how we can simplify the current codebase
> >>>> keeping the decision in mind.
> >>>>
> >>>> I wanted to handle FLINK-26522 [1] as a follow-up cleanup task of
> >>>> FLINK-24038 [3]. But while working on it, I realized that even
> >>> FLINK-24038
> >>>> [3] shouldn't have been handled without a FLIP. The per-process leader
> >>>> election which was introduced in FLINK-24038 [3] changed the ownership
> >>> of
> >>>> certain components. This is actually a change that should have been
> >>>> discussed in the mailing list and deserved a FLIP. To overcome this
> >>>> shortcoming of FLINK-24038 [3], I decided to prepare FLIP-285 [4] to
> >>>> provide proper documentation of what happened in FLINK-24038 and what
> >>> will
> >>>> be manifested with resolving its follow-up FLINK-26522 [1].
> >>>>
> >>>> Conceptually, this FLIP proposes moving away from Flink's support for
> >>>> single-contender LeaderElectionServices and introducing
> multi-contender
> >>>> support by disconnecting the HA-backend leader election lifecycle from
> >>> the
> >>>> LeaderContender's lifecycle. This allows us to provide LeaderElection
> >>> per
> >>>> component (as it was requested in [2]) but also enables us to utilize
> a
> >>>> single leader election for multiple components/contenders as well
> >>> without
> >>>> the complexity of the code that was introduced by FLINK-24038 [3].
> >>>>
> >>>> I'm looking forward to your comments.
> >>>>
> >>>> Matthias
> >>>>
> >>>> [1] https://issues.apache.org/jira/browse/FLINK-26522
> >>>> [2] https://lists.apache.org/thread/9oy2ml9s3j1v6r77h31sndhc3gw57cfm
> >>>> [3] https://issues.apache.org/jira/browse/FLINK-24038
> >>>> [4]
> >>>>
> >>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box
> >>>
>
>

Reply via email to