Hi Colin,

I think storing reassignment state at the partition level is the right move
and I also agree that replicas should understand that there is a
reassignment in progress. This makes KIP-352 a trivial follow-up for
example. The only doubt I have is whether the leader and isr znode is the
right place to store the target reassignment. It is a bit odd to keep the
target assignment in a separate place from the current assignment, right? I
assume the thinking is probably that although the current assignment should
probably be in the leader and isr znode as well, it is hard to move the
state in a compatible way. Is that right? But if we have no plan to remove
the assignment znode, do you see a downside to storing the target
assignment there as well?

A few additional questions:

1. Should `alterPartitionReassignments` be `alterPartitionAssignments`?
It's the current assignment we're altering, right?
2. Does this change affect the Metadata API? In other words, are clients
aware of reassignments? If so, then we probably need a change to
UpdateMetadata as well. The only alternative I can think of would be to
represent the replica set in the Metadata request as the union of the
current and target replicas, but I can't think of any benefit to hiding
reassignments. Note that if we did this, we probably wouldn't need a
separate API to list reassignments.
3. As replicas come into sync, they will join the ISR. Will we await all
target replicas joining the ISR before taking the replica out of the target
replicas set? Also, I assume that target replicas can still be elected as
leader?
4. Probably useful to mention permissions for the new APIs.

Thanks,
Jason

On Fri, May 10, 2019 at 9:30 AM Gwen Shapira <g...@confluent.io> wrote:

> +1 (binding)
> Looks great, and will be awesome to have this new capability.
>
> On Wed, May 8, 2019 at 10:23 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > Hi all,
> >
> > I'd like to start the vote for KIP-455: Create an Administrative API for
> > Replica Reassignment.  I think this KIP is important since it will unlock
> > many follow-on improvements to Kafka reassignment (see the "Future work"
> > section, plus a lot of the other discussions we've had recently about
> > reassignment).  It also furthers the important KIP-4 goal of removing
> > direct access to ZK.
> >
> > I made a few changes based on the discussion in the [DISCUSS] thread.  As
> > Robert suggested, I removed the need to explicitly cancel a reassignment
> > for a partition before setting up a different reassignment for that
> > specific partition.  I also simplified the API a bit by adding a
> > PartitionReassignment class which is used by both the alter and list
> APIs.
> >
> > I modified the proposal so that we now deprecate the old znode-based API
> > rather than removing it completely.  That should give external
> rebalancing
> > tools some time to transition to the new API.
> >
> > To clarify a question Viktor asked, I added a note that the
> > kafka-reassign-partitions.sh will now use a --bootstrap-server argument
> to
> > contact the admin APIs.
> >
> > thanks,
> > Colin
> >
>
>
> --
> *Gwen Shapira*
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> <http://www.confluent.io/blog>
>

Reply via email to