Thanks, David. I think the KIP looks good. +1 (binding) from me

-David

On Wed, Jun 1, 2022 at 12:54 PM Colin McCabe <cmcc...@apache.org> wrote:

> Thanks, David. With these changes, I am +1 (binding)
>
> Great to see this KIP moving forward!
>
> Colin
>
>
> On Wed, Jun 1, 2022, at 02:04, David Jacot wrote:
> > Hi Colin,
> >
> > Thanks for your feedback! Please find my answers below.
> >
> >> However, I wonder if this will be feasible for ZK-based brokers to do.
> We still support pre-topic-id IBP versions there, right? This might end up
> being more complex than you were hoping.
> >
> > That's right. We support pre-topic-id IBP versions for ZK-based
> > brokers. We will only use version 2 if all the topics have an ID when
> > the request is constructed. Version 1 is used otherwise. I have
> > already implemented this part in the draft PR if you want to see how
> > it looks. It brings a little more complexity in the request
> > building/handling but that works. I think that it is definitely worth
> > it.
> >
> >> We should add a comment in AlterPartitionResponse about the new error
> code INELIGIBLE_REPLICA. This is very important for error codes so we can
> track which ones are returned in which RPC version.
> >
> > Totally. I have those comments in the PR but I forgot to add them in
> > the KIP. I will fix this.
> >
> >> I also wonder if we should add a special error code for the case where
> the AlterPartitions call completes a reassignment AND the completed
> reassignment no longer includes the previous leader. We have been
> overloading FENCED_LEADER_EPOCH for this case, but this is coonfusing to
> operators (especially since this RPC does not support error messages, as
> opposed to codes)
> >
> > Interesting. I was not aware of this one. We don't do this in the ZK
> > controller, right? I do agree that using FENCED_LEADER_EPOCH is
> > confusing here. We could introduce a NEW_LEADER_ELECTED error code for
> > this purpose.
> >
> >> One thing, though, is that we should define how this interacts with
> replica placement. It seems to me that replicas should not be able to be
> placed on these inControlledShutdown nodes (unless done manually via the
> explicit placement API).
> >
> > Yeah, I agree that we need to clarify this. If not mistaken, replicas
> > can be placed on fenced nodes so I don't see why it should be
> > different for inControlledShutdown nodes. Otherwise we will again have
> > that create topic issue when the cluster has only 3 nodes. However I
> > think that we have to guarantee two other invariants:
> > 1) an inControlledShutdown node should not be added to the ISR when a
> > partition is created.
> > 2) an inControlledShutdown node should not be picked as a leader for a
> > partition. (e.g. KAFKA-13944)
> >
> > Let me add this part to the KIP.
> >
> >> I also think we should spell out the fact that once you go into
> controlled shutdown, you don't come out except by creating a new broker
> instance. (new incarnation ID). This also makes me wonder if we need to
> support the shutting down -> not shutting down transition in
> BrokerRegistrationChangeRecord, since we don't plan on using that
> transition.
> >
> > That makes sense. I will add this and remove that transition from
> > BrokerRegistrationChangeRecord.
> >
> >> Finally, RegisterBrokerRecord / BrokerRegistrationChangeRecord should
> be bumped to the next RPC version since we have added a new field. You will
> also need to assign yourself a new IBP / MetadataVersion. (For
> BrokerRegistrationChangeRecord it would be possible to avoid the version
> bump, since we're using tagged fields, but it's better to have it for
> consistency, I think.)
> >
> > Noted.
> >
> > Let me update the KIP to incorporate all your feedback.
> >
> > Cheers,
> > David
> >
> > On Tue, May 31, 2022 at 9:42 PM Colin McCabe <cmcc...@apache.org> wrote:
> >>
> >> > We should add a comment in AlterPartitionResponse about the new error
> >> > code INELIGIBLE_REPLICA. This is very important for error codes so we
> >> > can track which ones are returned in which RPC version. I also wonder
> >>
> >> Here I'm referring to AlterPartitionResponse.json
> >>
> >> cheers,
> >> Colin
> >>
> >> >
> >> >
> >> > On Tue, May 31, 2022, at 08:36, David Jacot wrote:
> >> >> Hi folks,
> >> >>
> >> >> I'd like to start a vote for KIP-841:
> >> >> https://cwiki.apache.org/confluence/x/phmhD.
> >> >>
> >> >> Thanks,
> >> >> David
>


-- 
David Arthur

Reply via email to