Hi Colin,

Thanks for your feedback! Please find my answers below.

> However, I wonder if this will be feasible for ZK-based brokers to do. We 
> still support pre-topic-id IBP versions there, right? This might end up being 
> more complex than you were hoping.

That's right. We support pre-topic-id IBP versions for ZK-based
brokers. We will only use version 2 if all the topics have an ID when
the request is constructed. Version 1 is used otherwise. I have
already implemented this part in the draft PR if you want to see how
it looks. It brings a little more complexity in the request
building/handling but that works. I think that it is definitely worth
it.

> We should add a comment in AlterPartitionResponse about the new error code 
> INELIGIBLE_REPLICA. This is very important for error codes so we can track 
> which ones are returned in which RPC version.

Totally. I have those comments in the PR but I forgot to add them in
the KIP. I will fix this.

> I also wonder if we should add a special error code for the case where the 
> AlterPartitions call completes a reassignment AND the completed reassignment 
> no longer includes the previous leader. We have been overloading 
> FENCED_LEADER_EPOCH for this case, but this is coonfusing to operators 
> (especially since this RPC does not support error messages, as opposed to 
> codes)

Interesting. I was not aware of this one. We don't do this in the ZK
controller, right? I do agree that using FENCED_LEADER_EPOCH is
confusing here. We could introduce a NEW_LEADER_ELECTED error code for
this purpose.

> One thing, though, is that we should define how this interacts with replica 
> placement. It seems to me that replicas should not be able to be placed on 
> these inControlledShutdown nodes (unless done manually via the explicit 
> placement API).

Yeah, I agree that we need to clarify this. If not mistaken, replicas
can be placed on fenced nodes so I don't see why it should be
different for inControlledShutdown nodes. Otherwise we will again have
that create topic issue when the cluster has only 3 nodes. However I
think that we have to guarantee two other invariants:
1) an inControlledShutdown node should not be added to the ISR when a
partition is created.
2) an inControlledShutdown node should not be picked as a leader for a
partition. (e.g. KAFKA-13944)

Let me add this part to the KIP.

> I also think we should spell out the fact that once you go into controlled 
> shutdown, you don't come out except by creating a new broker instance. (new 
> incarnation ID). This also makes me wonder if we need to support the shutting 
> down -> not shutting down transition in BrokerRegistrationChangeRecord, since 
> we don't plan on using that transition.

That makes sense. I will add this and remove that transition from
BrokerRegistrationChangeRecord.

> Finally, RegisterBrokerRecord / BrokerRegistrationChangeRecord should be 
> bumped to the next RPC version since we have added a new field. You will also 
> need to assign yourself a new IBP / MetadataVersion. (For 
> BrokerRegistrationChangeRecord it would be possible to avoid the version 
> bump, since we're using tagged fields, but it's better to have it for 
> consistency, I think.)

Noted.

Let me update the KIP to incorporate all your feedback.

Cheers,
David

On Tue, May 31, 2022 at 9:42 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > We should add a comment in AlterPartitionResponse about the new error
> > code INELIGIBLE_REPLICA. This is very important for error codes so we
> > can track which ones are returned in which RPC version. I also wonder
>
> Here I'm referring to AlterPartitionResponse.json
>
> cheers,
> Colin
>
> >
> >
> > On Tue, May 31, 2022, at 08:36, David Jacot wrote:
> >> Hi folks,
> >>
> >> I'd like to start a vote for KIP-841:
> >> https://cwiki.apache.org/confluence/x/phmhD.
> >>
> >> Thanks,
> >> David

Reply via email to