Hi Viktor,

Thanks for writing this up. As for the question of overlap with KIP-236, I agree the two seem mostly orthogonal. I think KIP-236 may have had a larger initial scope, but it now focuses on cancellation, and batching is left for future work.
That said, I think we may not actually need a KIP for the current proposal, since it doesn't change any APIs. To make it more generally useful, however, it would be nice to handle batching at the partition level as well, as Jun suggests. The basic question is at what level the batching should be determined. You could rely on external processes (e.g. Cruise Control), or it could be built into the controller. There are tradeoffs either way, but I think it simplifies such tools if it is handled internally. Then it would be much safer to submit a larger reassignment, even using just the simple tools that come with Kafka.

By the way, since you are looking into some of the reassignment logic, another problem we might want to address is the misleading way we report URPs (under-replicated partitions) during a reassignment. I had a naive proposal for this previously, but it didn't really work:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-352%3A+Distinguish+URPs+caused+by+reassignment
Fixing that could potentially fall under this work as well, if you think it makes sense.

Best,
Jason

On Thu, Apr 4, 2019 at 4:49 PM Jun Rao <j...@confluent.io> wrote:

> Hi, Viktor,
>
> Thanks for the KIP. A couple of comments below.
>
> 1. Another potential way to do reassignment incrementally is to move a
> batch of partitions at a time, instead of all partitions. This may lead
> to less data replication since, by the time the first batch of
> partitions has been completely moved, some data of the next batch may
> have been deleted due to retention and doesn't need to be replicated.
>
> 2. "Update CR in Zookeeper with TR for the given partition". Which ZK
> path is this for?
>
> Jun
>
> On Sat, Feb 23, 2019 at 2:12 AM Viktor Somogyi-Vass
> <viktorsomo...@gmail.com> wrote:
>
> > Hi Harsha,
> >
> > As far as I understand KIP-236, it's about enabling reassignment
> > cancellation and, as a future plan, providing a queue of replica
> > reassignment steps to allow manual reassignment chains.
> > While I agree that the reassignment chain has a specific use case
> > that allows fine-grained control over the reassignment process, my
> > proposal on the other hand doesn't talk about cancellation; it only
> > provides an automatic way to incrementalize an arbitrary
> > reassignment, which I think fits the general use case where users
> > don't want that level of control but would still like a balanced way
> > of reassigning. Therefore I think it's still relevant as an
> > improvement of the current algorithm.
> > Nevertheless, I'm happy to add my ideas to KIP-236, as I think it
> > would be a great improvement to Kafka.
> >
> > Cheers,
> > Viktor
> >
> > On Fri, Feb 22, 2019 at 5:05 PM Harsha <ka...@harsha.io> wrote:
> >
> > > Hi Viktor,
> > > There is already KIP-236 for the same feature, and George made a
> > > PR for this as well.
> > > Let's consolidate these two discussions. If you have any cases
> > > that are not being solved by KIP-236, can you please mention them
> > > in that thread? We can address them as part of KIP-236.
> > >
> > > Thanks,
> > > Harsha
> > >
> > > On Fri, Feb 22, 2019, at 5:44 AM, Viktor Somogyi-Vass wrote:
> > > > Hi Folks,
> > > >
> > > > I've created a KIP about an improvement of the reassignment
> > > > algorithm we have. It aims to enable partition-wise incremental
> > > > reassignment. The motivation for this is to avoid the excess
> > > > load that the current replication algorithm implicitly carries:
> > > > there are points in the algorithm where both the new and old
> > > > replica sets could be online and replicating, which puts double
> > > > (or almost double) pressure on the brokers and could cause
> > > > problems.
> > > >
> > > > Instead, my proposal would slice this up into several steps,
> > > > where each step is calculated based on the final target replicas
> > > > and the current replica assignment, taking into account
> > > > scenarios where brokers could be offline and where there are not
> > > > enough replicas to fulfil the min.insync.replicas requirement.
> > > >
> > > > The link to the KIP:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-435%3A+Incremental+Partition+Reassignment
> > > >
> > > > I'd be happy to receive any feedback.
> > > >
> > > > An important note is that this KIP and another one, KIP-236,
> > > > which is about interruptible reassignment
> > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-236%3A+Interruptible+Partition+Reassignment),
> > > > should be compatible.
> > > >
> > > > Thanks,
> > > > Viktor
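
For readers following the thread, the idea in the original message of slicing a reassignment into steps computed from the current and target replicas might be sketched roughly as below. This is only an illustration and not the algorithm from KIP-435 or any Kafka code: the function name `incremental_steps` and the add-one-then-drop-one policy are assumptions made for the example, and it ignores offline brokers and the min.insync.replicas check that the proposal mentions.

```python
from typing import Iterator, List


def incremental_steps(current: List[int], target: List[int]) -> Iterator[List[int]]:
    """Yield intermediate replica lists that move `current` toward `target`.

    Hypothetical helper, not Kafka code: each round adds at most one broker
    from the target and then drops at most one broker that is not in the
    target, so the partition never holds more than one replica beyond the
    larger of the two assignments.
    """
    replicas = list(current)
    to_add = [b for b in target if b not in replicas]
    to_drop = [b for b in replicas if b not in target]

    while to_add or to_drop:
        if to_add:
            # Expand by a single new replica; a real controller would wait
            # here until that replica has caught up before continuing.
            replicas = replicas + [to_add.pop(0)]
            yield list(replicas)
        if to_drop:
            # Shrink by a single old replica.
            old = to_drop.pop(0)
            replicas = [b for b in replicas if b != old]
            yield list(replicas)

    if replicas != list(target):
        # Same brokers in a different order: apply the target ordering last.
        yield list(target)


# Moving a partition from brokers 1,2,3 to brokers 4,5,6 (made-up broker ids).
steps = list(incremental_steps([1, 2, 3], [4, 5, 6]))
for step in steps:
    print(step)
```

With this policy a partition with three replicas holds at most four at any point, whereas applying the whole reassignment at once would have all six brokers (old and new) replicating together.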