Hi José,

Thanks for the KIP.

The KIP talks a bit about "recovery," which is a new term (as far as I know). 
If I understand correctly, this is a state that the partition enters into after 
an unclean leader election. I would suggest using a different term for this, 
since we already use the term "log recovery" to express the cleanup we do after 
an unclean broker restart. How about something like "leader cleansing"? 
Whatever term we pick, we should probably use it in all the metadata rather 
than is_unclean, etc.

I think we also need to spell out a few other things about "cleansing" state. 
For example, if the controller needs to change the leader of the partition 
again, before cleansing completes, does that mean the partition is still 
unclean? I would guess yes. We should spell this out in the KIP.

With regard to AlterIsr, it seems like we might end up in a scenario where the 
broker is on an old version that doesn't support the cleansing state, and the 
controller is on a new version that does. In that case, it seems like we could 
get stuck in the cleansing state for an arbitrary amount of time. We will not 
exit it until the broker sends an AlterIsr, but the broker may not send one of 
those ever (for example, if we go down to one uncleanly elected replica, and 
stay there).

The most obvious way to resolve this would be to use the IBP / metadata.version 
to gate using this feature. So it sounds like we should add a few lines to the 
KIP indicating that we're doing that.

I think the compatibility logic needs needs more thought. One example is, if we 
have some brokers in the cluster that support KIP-704 and some that don't, do 
we auto-clear the cleansing state when transitioning leadership uncleanly to an 
old broker? I think you probably have to. So we should discuss this in the KIP.

best,
Colin


On Mon, Jan 17, 2022, at 19:27, José Armando García Sancio wrote:
> Jose wrote:
>> I'll update the KIP with this information. The leader will return
>> "NOT_LEADER_OR_FOLLOWER" for any partition that is still recovering
>> for Fetch, Produce, OffsetsForLeaderEpoch and DeleteRecords requests.
>> This error type is retriable by the clients.
>
> I forgot to include ListOffset. Updated the KIP to include this information.
>
> Thanks
> -Jose

Reply via email to