Jose,

Is there a specific need for adding the "isUnclean" flag to `AlterIsrResponse`?

A potential sequence of events would be:
1. The controller sets the flag in `ZK` and informs the leader via the `LeaderAndIsrRequest`.
2. The leader undertakes the necessary recovery and sends an `AlterIsrRequest` to the controller, with the "isUnclean" flag reset.
3. The controller resets the flag in `ZK` if the `AlterIsrRequest` goes through.

Shouldn't the partition-level error code in `AlterIsrResponse` be enough for the leader to know of a successful update?
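
To make the three steps concrete, here is a rough Java sketch of the exchange as I understand it. It is not Kafka code: the class names, the in-memory maps standing in for `ZK`, the version check, and the `PartitionError` enum are all hypothetical and exist only to illustrate the argument that the leader can infer success from the partition-level error code alone.

```java
import java.util.HashMap;
import java.util.Map;

public class UncleanRecoverySketch {
    // Hypothetical stand-in for the partition-level error code in AlterIsrResponse.
    public enum PartitionError { NONE, INVALID_UPDATE_VERSION }

    // Hypothetical controller; the maps stand in for partition state stored in ZK.
    public static class Controller {
        private final Map<String, Boolean> uncleanFlagInZk = new HashMap<>();
        private final Map<String, Integer> zkVersion = new HashMap<>();

        // Step 1: set the flag in "ZK"; the returned version would travel
        // to the leader in the LeaderAndIsrRequest.
        public int electUnclean(String partition) {
            uncleanFlagInZk.put(partition, true);
            return zkVersion.merge(partition, 1, Integer::sum);
        }

        // Step 3: apply the leader's update only if its version is current.
        public PartitionError handleAlterIsr(String partition, int expectedVersion, boolean isUnclean) {
            if (zkVersion.getOrDefault(partition, 0) != expectedVersion) {
                return PartitionError.INVALID_UPDATE_VERSION;
            }
            uncleanFlagInZk.put(partition, isUnclean);
            zkVersion.merge(partition, 1, Integer::sum);
            return PartitionError.NONE;
        }

        public boolean isUnclean(String partition) {
            return uncleanFlagInZk.getOrDefault(partition, false);
        }
    }

    // Hypothetical partition leader.
    public static class Leader {
        public boolean recovering;

        // Receives the flag carried by the LeaderAndIsrRequest.
        public void onLeaderAndIsr(boolean isUnclean) {
            recovering = isUnclean;
        }

        // Step 2: after recovery, send the update with the flag reset.
        // The leader clears its local state only when the error code is NONE,
        // so the response does not need to echo the flag back.
        public PartitionError recoverAndReport(Controller controller, String partition, int version) {
            PartitionError error = controller.handleAlterIsr(partition, version, false);
            if (error == PartitionError.NONE) {
                recovering = false;
            }
            return error;
        }
    }

    public static void main(String[] args) {
        Controller controller = new Controller();
        Leader leader = new Leader();
        int version = controller.electUnclean("topic-0");
        leader.onLeaderAndIsr(true);
        PartitionError error = leader.recoverAndReport(controller, "topic-0", version);
        System.out.println(error + " recovering=" + leader.recovering);
    }
}
```

In this sketch the leader never reads a flag from the response; a `NONE` error tells it the reset was persisted, and any other error leaves it in the recovering state.
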
On Wed, Jan 19, 2022 at 6:11 PM Raman Verma <rve...@confluent.io> wrote:
>
> Thanks for this KIP, Jose
>
> - Could you please explain the following about backward compatibility.
> If a leader has been elected unclean. And we decide to roll the
> cluster back when the leader is in the middle of recovery, leader will
> simply not be able to recover when we roll back because it will lose
> its local copy of the unclean flag. Even if the controller sends
> another `LISR` to the leader, it will ignore the flag. So, the leader
> will not be able to carry on its recovery workflow. It will treat the
> situation as if it was not elected unclean and carry on with expanding
> ISR.-
> ```
> When thinking about backward compatibility it is important to note
> that if the "is unclean" field is true then the ISR is guarantee to
> have a size of 1. The topic partition leader will not increase the ISR
> until it has recovered from the unclean leader election and has set
> the "is unclean" field to false.
> ```
>
> On Wed, Jan 19, 2022 at 4:52 PM José Armando García Sancio
> <jsan...@confluent.io.invalid> wrote:
> >
> > Hi all,
> >
> > I made the following changes to the KIP:
> > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=173082256&selectedPageVersions=12&selectedPageVersions=11
> >
> > Some of the highlights are:
> > 1. Changed the field from IsUnclean to IsLeaderRecovering
> > 2. Added a few more sentences explaining why this KIP is backward
> > compatible and the interaction between the controller and the
> > partition leaders when they are in different software versions.
> >
> > Thanks
> > -José
> >
>
> --
> Best Regards,
> Raman Verma

--
Best Regards,
Raman Verma