Hi Colin,

Some more comments:

1. The KIP mentions "controller heartbeats", but the term is not explained
anywhere. I think "controller heartbeats" = "controller registration", is that
correct? If not, please explain more about it in the KIP.

2. Following up on this question:

> Which endpoint will the inactive controllers use to send the
> ControllerRegistrationRequest?
> A: They will use the endpoint in controller.quorum.voters.

If the registration request includes controller.quorum.voters, why bother
sending this information to the active controller again? The active controller
should already have all of controller.quorum.voters at startup. What is the
purpose of that design? Is it for validation?

3. If a controller node is not part of `controller.quorum.voters`, what will
we respond when it sends a ControllerRegistrationRequest?

4. Nice and clear compatibility matrix!

Thank you.
Luke

On Sat, Jul 22, 2023 at 3:38 AM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Jul 21, 2023, at 09:43, José Armando García Sancio wrote:
> > Thanks for the KIP Colin. Apologies if some of these points have
> > already been made. I have not followed the discussion closely:
> >
> > 1. Re: Periodically, each controller will check that the controller
> > registration for its ID is as expected
> >
> > Does this need to be periodic? Can't the controller schedule this RPC,
> > retry etc, when it finds that the incarnation ID doesn't match its
> > own?
> >
>
> Hi José,
>
> Thanks for the reviews.
>
> David had the same question. I agree that it should be event-driven rather
> than periodic (except for retries, etc.)
>
> >
> > 2. Did you consider including the active controller's epoch in the
> > ControllerRegistrationRequest?
> >
> > This would allow the active controller to reject registration from
> > controllers that are not part of the active quorum and don't know the
> > latest controller epoch. This should mitigate some of the concerns you
> > raised in bullet point 1.
> >
>
> Good idea. I will add the active controller epoch to the registration
> request.
>
> >
> > 3. Which endpoint will the inactive controllers use to send the
> > ControllerRegistrationRequest?
> >
> > Will it use the first endpoint described in the cluster metadata
> > controller registration record? Or would it use the endpoint described
> > in the server configuration at controller.quorum.voters?
> >
>
> They will use the endpoint in controller.quorum.voters. In general, the
> endpoints from the registration are only used for responding to
> DESCRIBE_CLUSTER. Since, after all, we may not even have the registration
> endpoints when we start up.
>
> >
> > 4. Re: Raft integration in the rejected alternatives
> >
> > Yes, the KRaft layer needs to solve a similar problem like endpoint
> > discovery to support dynamic controller membership change. As you
> > point out the requirements are different and the set of information
> > that needs to be tracked is different. I think it is okay to use a
> > different solution for each of these problems.
>
> Yeah that was my feeling too. Thanks for taking a look.
>
> regards,
> Colin
>
> >
> > Thanks!
> > --
> > -José
>