On Tue, Jul 25, 2023, at 05:30, Luke Chen wrote:
> Hi Colin,
>
> Some more comments:
> 1. In the KIP, we mentioned "controller heartbeats", but it is not
> explained anywhere.
> I think "controller heartbeats" = controller registration", is that
> correct?
> If no, please explain more about it in the KIP.

Hi Luke,

Sorry, this was an artifact of earlier revisions. I have changed it to 
"ControllerRegistration" in all the cases where I didn't update it before.

>
> 2. Following this question:
> > Which endpoint will the inactive controllers use to send the
> > ControllerRegistrationRequest?
> > A: They will use the endpoint in controller.quorum.voters.
> If the registration request will include controller.quorum.voters, why
> bother sending this information to active controller again?
> The active controller should already have all the controller.quorum.voters
> when start up.
> Any purpose of that design? For validation?

The controllers don't know which endpoint controller.quorum.voters is 
referencing.

>
> 3. If a controller node is not part of `controller.quorum.voters`, when it
> sends ControllerRegistrationRequest, what will we respond to it?
>

Good point. I added a new error, UNKNOWN_CONTROLLER_ID, for this case.

> 4. Nice and clear compatibility matrix!
>

Thanks!
Colin

> Thank you.
> Luke
>
> On Sat, Jul 22, 2023 at 3:38 AM Colin McCabe <cmcc...@apache.org> wrote:
>
>> On Fri, Jul 21, 2023, at 09:43, José Armando García Sancio wrote:
>> > Thanks for the KIP Colin. Apologies if some of these points have
>> > already been made. I have not followed the discussion closely:
>> >
>> > 1. Re: Periodically, each controller will check that the controller
>> > registration for its ID is as expected
>> >
>> > Does this need to be periodic? Can't the controller schedule this RPC,
>> > retry etc, when it finds that the incarnation ID doesn't match its
>> > own?
>> >
>>
>> Hi José,
>>
>> Thanks for the reviews.
>>
>> David had the same question. I agree that it should be event-driven rather
>> than periodic (except for retries, etc.)
>>
>> >
>> > 2. Did you consider including the active controller's epoch in the
>> > ControllerRegistrationRequest?
>> >
>> > This would allow the active controller to reject registration from
>> > controllers that are not part of the active quorum and don't know the
>> > latest controller epoch. This should mitigate some of the concerns you
>> > raised in bullet point 1.
>> >
>>
>> Good idea. I will add the active controller epoch to the registration
>> request.
>>
>> >
>> > 3. Which endpoint will the inactive controllers use to send the
>> > ControllerRegistrationRequest?
>> >
>> > Will it use the first endpoint described in the cluster metadata
>> > controller registration record? Or would it use the endpoint described
>> > in the server configuration at controller.quorum.voters?
>> >
>>
>> They will use the endpoint in controller.quorum.voters. In general, the
>> endpoints from the registration are only used for responding to
>> DESCRIBE_CLUSTER. Since, after all, we may not even have the registration
>> endpoints when we start up.
>>
>> >
>> > 4. Re: Raft integration in the rejected alternatives
>> >
>> > Yes, The KRaft layer needs to solve a similar problem like endpoint
>> > discovery to support dynamic controller membership change. As you
>> > point out the requirements are different and the set of information
>> > that needs to be tracked is different. I think it is okay to use a
>> > different solution for each of these problems.
>>
>> Yeah that was my feeling too. Thanks for taking a look.
>>
>> regards,
>> Colin
>>
>> >
>> > Thanks!
>> > --
>> > -José
>>

Reply via email to