Hi Jun,

Thank you for the feedback.

1. You are right. In the end, we do care about the percentage of time that
an operation ties up the controller thread. I thought about this but I was
not entirely convinced by it for the following reasons:

1.1. While I do agree that setting up a rate and a burst is a bit harder
than allocating a percentage for the administrator of the cluster, I
believe that a rate and a burst are way easier to understand for the users
of the cluster.

1.2. Measuring the time that a request ties up the controller thread is not
as straightforward as it sounds because the controller reacts to ZK
TopicChange and TopicDeletion events instead of handling requests directly.
These events carry neither the client id nor the user information, so the
best option would be to refactor the controller to accept requests instead
of reacting to the events. This will be possible with KIP-590. It obviously
has other side effects in the controller (e.g. batching).

I leaned towards the current proposal mainly due to 1.1., as 1.2. can be (or
will be) fixed. Does 1.1. sound like a reasonable trade-off to you?

2. It is not in the current proposal. I thought that a global quota would be
enough to start with. We can definitely make it work like the other quotas.

3. The main difference is that the Token Bucket algorithm defines an
explicit burst B while guaranteeing an average rate R, whereas our existing
quota guarantees an average rate R as well but starts to throttle as soon
as the rate goes above the defined quota.

Creating and deleting topics is bursty by nature. Applications create or
delete topics occasionally, usually by sending one request with multiple
topics. The reasoning behind allowing a burst is to let such requests of a
reasonable size pass without being throttled, whereas our current quota
mechanism would reject any topics as soon as the rate is above the quota,
requiring the applications to send subsequent requests to create or delete
all the topics.
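
To make the burst behaviour concrete, here is a minimal sketch of a token
bucket with an average rate R and a burst B. It is only an illustration and
not the implementation proposed in the KIP: the class name, the method
names, and the explicit `now` argument (used instead of a real clock so the
example is deterministic) are all assumptions made for this example.

```python
class TokenBucket:
    """Illustrative token bucket (hypothetical names, not Kafka code).

    Tokens accrue at `rate` per second and are capped at `burst`.
    """

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate      # R: average number of mutations per second
        self.burst = burst    # B: maximum number of accumulated tokens
        self.tokens = burst   # start with a full bucket
        self.last = now       # time of the last refill

    def _refill(self, now):
        # Add the tokens earned since the last call, capped at the burst.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def try_acquire(self, n, now):
        """Consume n tokens and return True if available, else return False."""
        self._refill(now)
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False


bucket = TokenBucket(rate=5, burst=20)
print(bucket.try_acquire(15, now=0.0))  # True: one request with 15 topics fits in the burst
print(bucket.try_acquire(10, now=0.0))  # False: only 5 tokens are left
print(bucket.try_acquire(10, now=1.0))  # True: 5 tokens left + 5 refilled after one second
```

In this sketch a single request with 15 topics passes even though it exceeds
the per-second rate of 5, because it fits within the burst of 20. A check
that only compares the observed rate to the quota would start throttling it.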

Best,
David


On Fri, Apr 24, 2020 at 9:03 PM Jun Rao <j...@confluent.io> wrote:

> Hi, David,
>
> Thanks for the KIP. A few quick comments.
>
> 1. About quota.partition.mutations.rate. I am not sure if it's very easy
> for the user to set the quota as a rate. For example, each partition
> mutation could take a different number of ZK operations (depending on
> things like retry). The time to process each ZK operation may also vary
> from cluster to cluster. An alternative way to model this is to do sth
> similar to the request (CPU) quota, which exposes quota as a percentage of
> the server threads that can be used. The current request quota doesn't
> include the controller thread. We could add something that measures/exposes
> the percentage of time that a request ties up the controller thread, which
> seems to be what we really care about.
>
> 2. Is the new quota per user? Intuitively, we want to only penalize
> applications that overuse the broker resources, but not others. Also, in
> existing types of quotas (request, bandwidth), there is a hierarchy among
> clientId vs user and default vs customized (see
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-55%3A+Secure+Quotas+for+Authenticated+Users
> ). Does the new quota fit into the existing hierarchy?
>
> 3. It seems that you are proposing a new quota mechanism based on Token
> Bucket algorithm. Could you describe its tradeoff with the existing quota
> mechanism? Ideally, it would be better if we have a single quota mechanism
> within Kafka.
>
> Jun
>
>
>
>
> On Fri, Apr 24, 2020 at 9:52 AM David Jacot <dja...@confluent.io> wrote:
>
> > Hi folks,
> >
> > I'd like to start the discussion for KIP-599:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-599%3A+Throttle+Create+Topic%2C+Create+Partition+and+Delete+Topic+Operations
> >
> > It proposes to introduce quotas for the create topics, create partitions
> > and delete topics operations. Let me know what you think, thanks.
> >
> > Best,
> > David
> >
>
