Hi Ted and Colin,

Thanks for the comments.

It seems you're both happier with reassign rather than assign, so I'm happy
to stick with that.


On 5 September 2017 at 18:46, Colin McCabe <cmcc...@apache.org> wrote:

> ...


> Do we expect that reducing the number of partitions will ever be
> supported by this API?  It seems like decreasing would require a
> different API-- one which supported data movement, had a "check status
> of this operation" feature, etc. etc.  If this API is only ever going to
> be used to increase the number of partitions, I think we should just
> call it "increasePartitionCount" to avoid confusion.
>

I thought a little about the decrease possibility (hence the static factory
methods on PartitionCount), but not really in enough detail. I suppose a
decrease process could look like this:

1. Producers start partitioning based on the decreased partition count.
2. Consumers continue to consume from all partitions.
3. At some point all the records in the old partitions have expired and
they can be deleted.

This wouldn't work for compacted topics. Of course a more aggressive
strategy is also possible where we forcibly move data from the partitions
to be deleted.

Anyway, in either case the process would be long running, whereas the
increase case is fast, so the semantics are quite different. So I agree,
lets rename the method to make clear that it's only for increasing the
partition count.

>
> > Outstanding questions:
> >
> > 1. Is the proposed alterInterReplicaThrottle() API really better than
> > changing the throttle via alterConfigs()?
>
> That's a good point.  I would argue that we should just use alterConfigs
> to set the broker configuration, rather than having a special RPC just
> for this.
>

Yes, I'm minded to agree.

The reason I originally thought a special API might be better was if people
felt that the DynamicConfig mechanism (which seems to exist only to support
these throttles) was an implementation detail of the throttles. But I now
realise that they're visible via kafka-topics.sh, so they're effectively
already a public API.


>
> ...
> > Would it be a problem that
> > triggering the reassignment required ClusterAction on the Cluster, but
> > throttling the assignment required Alter on the Topic? What if a user had
> > the former permission, but not the latter?
>
> We've been trying to reserve ClusterAction on Cluster for
> broker-initiated operations.  Alter on Cluster is the ACL for "root
> stuff" and I would argue that it should be what we use here.
>
> For reconfiguring the broker, I think we should follow KIP-133 and use
> AlterConfigs on the Broker resource.  (Of course, if you just use the
> existing alterConfigs call, you get this without any special effort.)
>

Yes, sorry, what I put in the email about authorisation wasn't what I put
in the KIP (I revised the KIP after drafting the email and then forgot to
update the email).

Although KIP-133 proposes a Broker resource, I don't see one in the code
and KIP-133 was supposedly delivered in 0.11. Can anyone fill in the story
here? Is it simply because the functionality to update broker configs
hasn't been implemented yet?

As currently proposed, both reassignPartitions() and alterInterBrokerThrottle()
require Alter on the Cluster. If we used alterConfigs() to set the
throttles then we create a situation where the triggering of the
reassignment required Alter(Cluster), but the throttling required
Alter(Broker), and the user might have the former but not the latter. I
don't think this is likely to be a big deal in practice, but maybe others
disagree?


> >
> > 2. Is reassignPartitions() really the best name? I find the distinction
> > between reassigning and just assigning to be of limited value, and
> > "reassign" is misleading when the APIs now used for changing the
> > replication factor too. Since the API is asynchronous/long running, it
> > might be better to indicate that in the name some how. What about
> > startPartitionAssignment()?
>
> Good idea -- I like the idea of using "start" or "initiate" to indicate
> that this is kicking off a long-running operation.
>
> "reassign" seemed like a natural choice to me since this is changing an
> existing assignment.  It was assigned (when the topic it was created)--
> now it's being re-assigned.  If you change it to just "assign" then it
> feels confusing to me.  A new user might ask if "assigning partitions"
> is a step that they should take after creating a topic, similar to how
> subscribing to topics is a step you take after creating a consumer.
>

Yeah, I find this new user argument persuasive, so I'm happy to stick with
reassign.

Thanks again for the feedback,

Tom

Reply via email to