Re: [VOTE] KIP-599: Throttle Create Topic, Create Partition and Delete Topic Operations

David Jacot Thu, 11 Jun 2020 23:52:25 -0700

Colin, Jun,

Do the proposed error code and the updated KIP look good to you guys? I’d
like to wrap up and close the vote.


Thanks,
David

Le mer. 10 juin 2020 à 14:50, David Jacot <dja...@confluent.io> a écrit :

> Hi Colin and Jun,
>
> I have no problem if we have to rewrite part of it when the new controller
> comes
> out. I will be more than happy to help out.
>
> Regarding KIP-590, I think that we can cope with a principal as a string
> when the
> time comes. The user entity name is defined with a string already.
>
> Regarding the name of the error, you have made a good point. I do agree
> that it
> is important to differentiate the two cases. I propose the following two
> errors:
> - THROTTLING_QUOTA_EXCEEDED - Throttling is slightly better than rate as
> we have quotas which are not rate (e.g. request quota). This one is
> retryable
> once the throttle time is passed.
> - LIMIT_QUOTA_EXCEEDED - This one would indicate that the limit has been
> reached and is a final error.
> We only need the former in this KIP. What do you think?
>
> Jun, I have added a few examples in the KIP. The new name works exactly
> like
> the existing one once it is added to the accepted dynamic configs for the
> user
> and the client entities. I have added a "Kafka Config Command" chapter in
> the
> KIP. I will also open a Jira to not forget updating the AK documentation
> once
> the KIP gets merged.
>
> Thanks,
> David
>
> On Wed, Jun 10, 2020 at 3:03 AM Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Colin,
>>
>> Good point. Maybe sth like THROTTLING_QUOTA_VIOLATED will make this clear.
>>
>> Hi, David,
>>
>> We added a new quota name in the KIP. You chose not to bump up the version
>> of DESCRIBE_CLIENT_QUOTAS and ALTER_CLIENT_QUOTAS, which seems ok since
>> the
>> quota name is represented as a string. However, the new quota name can be
>> used in client tools for setting and listing the quota (
>> https://kafka.apache.org/documentation/#quotas). Could you document how
>> the
>> new name will be used in those tools?
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Jun 9, 2020 at 3:37 PM Colin McCabe <cmcc...@apache.org> wrote:
>>
>> > On Tue, Jun 9, 2020, at 05:06, David Jacot wrote:
>> > > Hi Colin,
>> > >
>> > > Thank you for your feedback.
>> > >
>> > > Jun has summarized the situation pretty well. Thanks Jun! I would
>> like to
>> > > complement it with the following points:
>> > >
>> > > 1. Indeed, when the quota is exceeded, the broker will reject the
>> topic
>> > > creations, partition creations and topics deletions that are exceeding
>> > > with the new QUOTA_VIOLATED error. The ThrottleTimeMs field will
>> > > be populated accordingly to let the client know how long it must wait.
>> > >
>> > > 2. I do agree that we actually want a mechanism to apply back pressure
>> > > to the clients. The KIP basically proposes a mechanism to control and
>> to
>> > > limit the rate of operations before entering the controller. I think
>> that
>> > > it is similar to your thinking but is enforced based on a defined
>> > > instead of relying on the number of pending items in the controller.
>> > >
>> > > 3. You mentioned an alternative idea in your comments that, if I
>> > understood
>> > > correctly, would bound the queue to limit the overload and reject
>> work if
>> > > the queue is full. I have been thinking about this as well but I don't
>> > think
>> > > that it  works well in our case.
>> > > - The first reason is the one mentioned by Jun. We actually want to be
>> > able
>> > > to limit specific clients (the misbehaving ones) in a multi-tenant
>> > > environment.
>> > > - The second reason is that, at least in our current implementation,
>> the
>> > > length of the queue is not really a good characteristic to estimate
>> the
>> > load.
>> > > Coming back to your example of the CreateTopicsRequest. They create
>> path
>> > >  in ZK for each newly created topics which trigger a ChangeTopic event
>> > in
>> > > the controller. That single event could be for a single topic in some
>> > cases or
>> > > for a thousand topics in others.
>> > > These two reasons aside, bounding the queue also introduces a knob
>> which
>> > > requires some tuning and thus suffers from all the points you
>> mentioned
>> > as
>> > > well, isn't it? The value may be true for one version but not for
>> > another.
>> > >
>> > > 4. Regarding the integration with KIP-500. The implementation of this
>> KIP
>> > > will span across the ApiLayer and the AdminManager. To be honest, we
>> > > can influence the implementation to work best with what you have in
>> mind
>> > > for the future controller. If you could shed some more light on the
>> > future
>> > > internal architecture, I can take this into account during the
>> > > implementation.
>> > >
>> >
>> > Hi David,
>> >
>> > Good question.  The approach we are thinking of is that in the future,
>> > topic creation will be a controller RPC.  In other words, rather than
>> > changing ZK and waiting for the controller code to notice, we'll go
>> through
>> > the controller code (which may change ZK, or may do something else in
>> the
>> > ZK-free environment).
>> >
>> > I don't think there is a good way to write this in the short term that
>> > avoids having to rewrite in the long term.  That's probably OK though,
>> as
>> > long as we keep in mind that we'll need to.
>> >
>> > >
>> > > 5. Regarding KIP-590. For the create topics request, create partitions
>> > > request, and delete topics request, we are lucky enough to have
>> directed
>> > > all of them to the controller already so we are fine with doing the
>> > admission
>> > > in the broker which hosts the controller. If we were to throttle more
>> > operations
>> > > in the future,
>> > > I believe that we can continue to do it centrally while leveraging the
>> > > principal that is forwarded to account for the right tenant. The
>> reason
>> > > why I would like to keep it central is that we are talking about
>> sparse
>> > (or bursty)
>> > > workloads here so distributing the quota among all the brokers makes
>> > little sense.
>> > >
>> >
>> > Right.  The main requirement here is that the quota must operate based
>> on
>> > principal names rather than KafkaPrincipal objects.  We had a long
>> > discussion about KIP-590 and eventually concluded that we didn't want to
>> > make KafkaPrincipal serializable (at least not yet) so what would be
>> > forwarded is just a string, not the principal object.
>> >
>> > >
>> > > 6. Regarding the naming of the new error code. BUSY sounds too
>> generic to
>> > > me so I would rather prefer a specific error code. The main reason
>> being
>> > > that we may be able to reuse it in the future for other quotas. That
>> > being said,
>> > > we can find another name. QUOTA_EXHAUSTED? I don't feel too strongly
>> > about
>> > > the naming. I wonder what the others think about this.
>> > >
>> >
>> > Think about if you're someone writing software that uses the Kafka
>> > client.  Let's say you try to create some topics and get back
>> > QUOTA_VIOLATED.  What does it mean?  Maybe it means that you tried to
>> > create too many topics in too short a time (violating the controller
>> > throttle).  Or maybe it means that you exceeded your quota specifying
>> the
>> > number of partitions that they are allowed to create (let's assume that
>> > someone did a follow-on KIP for that that reuses this error code for
>> that.)
>> >
>> > But now you have a dilemma.  If the controller was just busy, then
>> trying
>> > again is the right thing to do.  If there is some other quota you
>> violated
>> > (number of partitions, number of topics, etc.) then retrying is wasteful
>> > and will not accomplish anything.  Of course, the Kafka client software
>> > itself can tell the two cases apart since it has access to
>> throttleTimeMs.
>> > But the end user can't (and in certain modes, the exception is exposed
>> > directly to the user here).
>> >
>> > That's why "you violated some quota, not sure which" is not a good error
>> > code.  So I think we should distinguish the two cases by having two
>> > separate error codes.  Maybe something like RATE_QUOTA_VIOLATED and
>> > RESOURCE_QUOTA_VIOLATED.  This KIP would only need the former one.
>> >
>> > Another possibility is that the user could distinguish the two cases by
>> > the error string.  But typically we want error strings to be used to
>> give
>> > extra debugging information, not to make big decisions about what the
>> > client should do.  So I think that the error code should actually be a
>> > little bit more specific, or at least tell the end user what to do with
>> it
>> > (that's why I suggested "busy").
>> >
>> > >
>> > > Voilà. I hope that I have addressed all your questions/points and I am
>> > > sorry for the lengthy email.
>> > >
>> >
>> > Thanks, David.  It looks good to me overall.  And I thought your email
>> was
>> > very clear-- not too long at all.
>> >  Let's just figure out the error code thing.
>> >
>> > best,
>> > Colin
>> >
>> > >
>> > > Regards,
>> > > David
>> > >
>> > >
>> > > On Tue, Jun 9, 2020 at 2:13 AM Colin McCabe <cmcc...@apache.org>
>> wrote:
>> > >
>> > > > On Mon, Jun 8, 2020, at 14:41, Jun Rao wrote:
>> > > > > Hi, Colin,
>> > > > >
>> > > > > Thanks for the comment. You brought up several points.
>> > > > >
>> > > > > 1. Should we set up a per user quota? To me, it does seem we need
>> > some
>> > > > sort
>> > > > > of a quota. When the controller runs out of resources, ideally, we
>> > only
>> > > > > want to penalize the bad behaving applications, instead of every
>> > > > > application. To do that, we will need to know what each
>> application
>> > is
>> > > > > entitled to and the per user quota is intended to capture that.
>> > > > >
>> > > > > 2. How easy is it to configure a quota? The following is how an
>> admin
>> > > > > typically sets up a quota in our existing systems. Pick a generous
>> > > > default
>> > > > > per user quota works for most applications. For the few resource
>> > > > intensive
>> > > > > applications, customize a higher quota for them. Reserve enough
>> > resources
>> > > > > in anticipation that a single (or a few) application will exceed
>> the
>> > > > quota
>> > > > > at a given time.
>> > > > >
>> > > >
>> > > > Hi Jun,
>> > > >
>> > > > Thanks for the response.
>> > > >
>> > > > Maybe I was too pessimistic about the ability of admins to
>> configure a
>> > > > useful quota here.  I do agree that it would be nice to have the
>> > ability to
>> > > > set different quotas for different users, as you mentioned.
>> > > >
>> > > > >
>> > > > > 3. How should the quota be defined? In the discussion thread, we
>> > debated
>> > > > > between a usage based model vs a rate based model. Dave and Anna
>> > argued
>> > > > for
>> > > > > the rate based model mostly because it's simpler to implement.
>> > > > >
>> > > >
>> > > > I'm trying to think more about how this integrates with our plans
>> for
>> > > > KIP-500.  When we get rid of ZK, we will have to handle this in the
>> > > > controller itself, rather than in the AdminManager.  That implies
>> we'll
>> > > > have to rewrite the code.  Maybe this is worth it if we want this
>> > feature
>> > > > now, though.
>> > > >
>> > > > Another wrinkle here is that as we discussed in KIP-590, controller
>> > > > operations will land on a random broker first, and only then be
>> > forwarded
>> > > > to the active controller.  This implies that either admissions
>> control
>> > > > should happen on all brokers (needing some kind of distributed quota
>> > > > scheme), or be done on the controller after we've already done the
>> > work of
>> > > > forwarding the message.  The second approach might not be that bad,
>> > but it
>> > > > would be nice to figure this out.
>> > > >
>> > > > >
>> > > > > 4. If a quota is exceeded, how is that enforced? My understanding
>> of
>> > the
>> > > > > KIP is that, if a quota is exceeded, the broker immediately sends
>> > back
>> > > > > a QUOTA_VIOLATED error and a throttle time back to the client, and
>> > the
>> > > > > client will wait for the throttle time before issuing the next
>> > request.
>> > > > > This seems to be the same as the BUSY error code you mentioned.
>> > > > >
>> > > >
>> > > > Yes, I agree, it sounds like we're thinking along the same lines.
>> > > > However, rather than QUOTA_VIOLATED, how about naming the error code
>> > BUSY?
>> > > > Then the error text could indicate the quota that we violated.  This
>> > would
>> > > > be more generally useful as an error code and also avoid being
>> > confusingly
>> > > > similar to POLICY_VIOLATION.
>> > > >
>> > > > best,
>> > > > Colin
>> > > >
>> > > > >
>> > > > > I will let David chime in more on that.
>> > > > >
>> > > > > Thanks,
>> > > > >
>> > > > > Jun
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Sun, Jun 7, 2020 at 2:30 PM Colin McCabe <cmcc...@apache.org>
>> > wrote:
>> > > > >
>> > > > > > Hi David,
>> > > > > >
>> > > > > > Thanks for the KIP.
>> > > > > >
>> > > > > > I thought about this for a while and I actually think this
>> > approach is
>> > > > not
>> > > > > > quite right.  The problem that I see here is that using an
>> > explicitly
>> > > > set
>> > > > > > quota here requires careful tuning by the cluster operator.
>> Even
>> > > > worse,
>> > > > > > this tuning might be invalidated by changes in overall
>> conditions
>> > or
>> > > > even
>> > > > > > more efficient controller software.
>> > > > > >
>> > > > > > For example, if we empirically find that the controller can do
>> 1000
>> > > > topics
>> > > > > > in a minute (or whatever), this tuning might actually be wrong
>> if
>> > the
>> > > > next
>> > > > > > version of the software can do 2000 topics in a minute because
>> of
>> > > > > > efficiency upgrades.  Or, the broker that the controller is
>> > located on
>> > > > > > might be experiencing heavy load from its non-controller
>> > operations,
>> > > > and so
>> > > > > > it can only do 500 topics in a minute during this period.
>> > > > > >
>> > > > > > So the system administrator gets a very obscure tunable (it's
>> not
>> > > > clear to
>> > > > > > a non-Kafka-developer what "controller mutations" are or why
>> they
>> > > > should
>> > > > > > care).  And even worse, they will have to significantly
>> "sandbag"
>> > the
>> > > > value
>> > > > > > that they set it to, so that even under the heaviest load and
>> > oldest
>> > > > > > deployed version of the software, the controller can still
>> > function.
>> > > > Even
>> > > > > > worse, this new quota adds a lot of complexity to the
>> controller.
>> > > > > >
>> > > > > > What we really want is backpressure when the controller is
>> > > > overloaded.  I
>> > > > > > believe this is the alternative you discuss in "Rejected
>> > Alternatives"
>> > > > > > under "Throttle the Execution instead of the Admission"  Your
>> > reason
>> > > > for
>> > > > > > rejecting it is that the client error handling does not work
>> well
>> > in
>> > > > this
>> > > > > > case.  But actually, this is an artifact of our current
>> > implementation,
>> > > > > > rather than a fundamental issue with backpressure.
>> > > > > >
>> > > > > > Consider the example of a CreateTopicsRequest.  The controller
>> > could
>> > > > > > return a special error code if the load was too high, and take
>> the
>> > > > create
>> > > > > > topics event off the controller queue.  Let's call that error
>> code
>> > > > BUSY.
>> > > > > >  Additionally, the controller could immediately refuse new
>> events
>> > if
>> > > > the
>> > > > > > queue had reached its maximum length, and simply return BUSY for
>> > that
>> > > > case
>> > > > > > as well.
>> > > > > >
>> > > > > > Basically, the way we handle RPC timeouts in the controller
>> right
>> > now
>> > > > is
>> > > > > > not very good.  As you know, we time out the RPC, so the client
>> > gets
>> > > > > > TimeoutException, but then keep the event on the queue, so that
>> it
>> > > > > > eventually gets executed!  There's no reason why we have to do
>> > that.
>> > > > We
>> > > > > > could take the event off the queue if there is a timeout.  This
>> > would
>> > > > > > reduce load and mostly avoid the paradoxical situations you
>> > describe
>> > > > > > (getting TopicExistsException for a CreateTopicsRequest retry,
>> > etc.)
>> > > > > >
>> > > > > > I say "mostly" because there are still cases where retries
>> could go
>> > > > astray
>> > > > > > (for example if we execute the topic creation but a network
>> problem
>> > > > > > prevents the response from being sent to the client).  But this
>> > would
>> > > > still
>> > > > > > be a very big improvement over what we have now.
>> > > > > >
>> > > > > > Sorry for commenting so late on this but I got distracted by
>> some
>> > other
>> > > > > > things.  I hope we can figure this one out-- I feel like there
>> is a
>> > > > chance
>> > > > > > to significantly simplify this.
>> > > > > >
>> > > > > > best,
>> > > > > > Colin
>> > > > > >
>> > > > > >
>> > > > > > On Fri, May 29, 2020, at 07:57, David Jacot wrote:
>> > > > > > > Hi folks,
>> > > > > > >
>> > > > > > > I'd like to start the vote for KIP-599 which proposes a new
>> > quota to
>> > > > > > > throttle create topic, create partition, and delete topics
>> > > > operations to
>> > > > > > > protect the Kafka controller:
>> > > > > > >
>> > > > > >
>> > > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-599%3A+Throttle+Create+Topic%2C+Create+Partition+and+Delete+Topic+Operations
>> > > > > > >
>> > > > > > > Please, let me know what you think.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > David
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: [VOTE] KIP-599: Throttle Create Topic, Create Partition and Delete Topic Operations

Reply via email to