On Fri, May 10, 2019, at 16:43, Jason Gustafson wrote:
> Hi Colin,
> 
> I think storing reassignment state at the partition level is the right move
> and I also agree that replicas should understand that there is a
> reassignment in progress. This makes KIP-352 a trivial follow-up for
> example. The only doubt I have is whether the leader and isr znode is the
> right place to store the target reassignment. It is a bit odd to keep the
> target assignment in a separate place from the current assignment, right? I
> assume the thinking is probably that although the current assignment should
> probably be in the leader and isr znode as well, it is hard to move the
> state in a compatible way. Is that right? But if we have no plan to remove
> the assignment znode, do you see a downside to storing the target
> assignment there as well?
>

Hi Jason,

That's a good point -- it's probably better to keep the target assignment in 
the same znode as the current assignment, for consistency.  I'll change the KIP.

> A few additional questions:
> 
> 1. Should `alterPartitionReassignments` be `alterPartitionAssignments`?
> It's the current assignment we're altering, right?

That's fair.  AlterPartitionAssigments reads a little better, and I'll change 
it to that.

> 2. Does this change affect the Metadata API? In other words, are clients
> aware of reassignments? If so, then we probably need a change to
> UpdateMetadata as well. The only alternative I can think of would be to
> represent the replica set in the Metadata request as the union of the
> current and target replicas, but I can't think of any benefit to hiding
> reassignments. Note that if we did this, we probably wouldn't need a
> separate API to list reassignments.

I thought about this a bit... and I think on balance, you're right.  We should 
keep this information together with the replica nodes, isr nodes, and offline 
replicas, and that information is available in the MetadataResponse. 
 However, I do think in order to do this, we'll need a flag in the 
MetadataRequest that specifiies "only show me reassigning partitions".  I'll 
add this.

> 3. As replicas come into sync, they will join the ISR. Will we await all
> target replicas joining the ISR before taking the replica out of the target
> replicas set? Also, I assume that target replicas can still be elected as
> leader?

We'll take a replica out of the target replicas set as soon as that replica is 
in the ISR.  Let me clarify this in the KIP.

> 4. Probably useful to mention permissions for the new APIs.

Good point.  I think alterPartitionAssignments should require ALTER on CLUSTER. 
 MetadataRequest permissions will be unchanged.

best,
Colin

> 
> Thanks,
> Jason
> 
> On Fri, May 10, 2019 at 9:30 AM Gwen Shapira <g...@confluent.io> wrote:
> 
> > +1 (binding)
> > Looks great, and will be awesome to have this new capability.
> >
> > On Wed, May 8, 2019 at 10:23 PM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start the vote for KIP-455: Create an Administrative API for
> > > Replica Reassignment.  I think this KIP is important since it will unlock
> > > many follow-on improvements to Kafka reassignment (see the "Future work"
> > > section, plus a lot of the other discussions we've had recently about
> > > reassignment).  It also furthers the important KIP-4 goal of removing
> > > direct access to ZK.
> > >
> > > I made a few changes based on the discussion in the [DISCUSS] thread.  As
> > > Robert suggested, I removed the need to explicitly cancel a reassignment
> > > for a partition before setting up a different reassignment for that
> > > specific partition.  I also simplified the API a bit by adding a
> > > PartitionReassignment class which is used by both the alter and list
> > APIs.
> > >
> > > I modified the proposal so that we now deprecate the old znode-based API
> > > rather than removing it completely.  That should give external
> > rebalancing
> > > tools some time to transition to the new API.
> > >
> > > To clarify a question Viktor asked, I added a note that the
> > > kafka-reassign-partitions.sh will now use a --bootstrap-server argument
> > to
> > > contact the admin APIs.
> > >
> > > thanks,
> > > Colin
> > >
> >
> >
> > --
> > *Gwen Shapira*
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> > <http://www.confluent.io/blog>
> >
>

Reply via email to