Hi Colin and Jun,

I have no problem if we have to rewrite part of it when the new controller comes out. I will be more than happy to help out.
Regarding KIP-590, I think that we can cope with a principal as a string when the time comes. The user entity name is already defined as a string.

Regarding the name of the error, you have made a good point. I do agree that it is important to differentiate the two cases. I propose the following two errors:
- THROTTLING_QUOTA_EXCEEDED - "Throttling" is slightly better than "rate" as we have quotas which are not rates (e.g. the request quota). This one is retryable once the throttle time has passed.
- LIMIT_QUOTA_EXCEEDED - This one would indicate that the limit has been reached and is a final error.
We only need the former in this KIP. What do you think?

Jun, I have added a few examples in the KIP. The new name works exactly like the existing one once it is added to the accepted dynamic configs for the user and client entities. I have added a "Kafka Config Command" chapter in the KIP. I will also open a Jira so that we don't forget to update the AK documentation once the KIP gets merged.

Thanks,
David

On Wed, Jun 10, 2020 at 3:03 AM Jun Rao <j...@confluent.io> wrote:

> Hi, Colin,
>
> Good point. Maybe something like THROTTLING_QUOTA_VIOLATED will make this clear.
>
> Hi, David,
>
> We added a new quota name in the KIP. You chose not to bump up the version of DESCRIBE_CLIENT_QUOTAS and ALTER_CLIENT_QUOTAS, which seems ok since the quota name is represented as a string. However, the new quota name can be used in client tools for setting and listing the quota (https://kafka.apache.org/documentation/#quotas). Could you document how the new name will be used in those tools?
>
> Thanks,
>
> Jun
>
> On Tue, Jun 9, 2020 at 3:37 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Tue, Jun 9, 2020, at 05:06, David Jacot wrote:
> > > Hi Colin,
> > >
> > > Thank you for your feedback.
> > >
> > > Jun has summarized the situation pretty well. Thanks Jun! I would like to complement it with the following points:
> > >
> > > 1.
Indeed, when the quota is exceeded, the broker will reject the topic creations, partition creations, and topic deletions that exceed it with the new QUOTA_VIOLATED error. The ThrottleTimeMs field will be populated accordingly to let the client know how long it must wait.
> > >
> > > 2. I do agree that we actually want a mechanism to apply back pressure to the clients. The KIP basically proposes a mechanism to control and to limit the rate of operations before entering the controller. I think that it is similar to your thinking but is enforced based on a defined rate instead of relying on the number of pending items in the controller.
> > >
> > > 3. You mentioned an alternative idea in your comments that, if I understood correctly, would bound the queue to limit the overload and reject work if the queue is full. I have been thinking about this as well but I don't think that it works well in our case.
> > > - The first reason is the one mentioned by Jun. We actually want to be able to limit specific clients (the misbehaving ones) in a multi-tenant environment.
> > > - The second reason is that, at least in our current implementation, the length of the queue is not really a good characteristic to estimate the load. Coming back to your example of the CreateTopicsRequest: it creates a path in ZK for each newly created topic, which triggers a ChangeTopic event in the controller. That single event could be for a single topic in some cases or for a thousand topics in others.
> > > These two reasons aside, bounding the queue also introduces a knob which requires some tuning and thus suffers from all the points you mentioned as well, doesn't it? The value may be right for one version but not for another.
> > >
> > > 4. Regarding the integration with KIP-500.
The implementation of this KIP will span the ApiLayer and the AdminManager. To be honest, we can influence the implementation to work best with what you have in mind for the future controller. If you could shed some more light on the future internal architecture, I can take this into account during the implementation.
> >
> > Hi David,
> >
> > Good question. The approach we are thinking of is that in the future, topic creation will be a controller RPC. In other words, rather than changing ZK and waiting for the controller code to notice, we'll go through the controller code (which may change ZK, or may do something else in the ZK-free environment).
> >
> > I don't think there is a good way to write this in the short term that avoids having to rewrite in the long term. That's probably OK though, as long as we keep in mind that we'll need to.
> >
> > > 5. Regarding KIP-590. For the create topics request, create partitions request, and delete topics request, we are lucky enough to have directed all of them to the controller already, so we are fine with doing the admission in the broker which hosts the controller. If we were to throttle more operations in the future, I believe that we can continue to do it centrally while leveraging the principal that is forwarded to account for the right tenant. The reason why I would like to keep it central is that we are talking about sparse (or bursty) workloads here, so distributing the quota among all the brokers makes little sense.
> >
> > Right. The main requirement here is that the quota must operate based on principal names rather than KafkaPrincipal objects.
We had a long discussion about KIP-590 and eventually concluded that we didn't want to make KafkaPrincipal serializable (at least not yet), so what would be forwarded is just a string, not the principal object.
> >
> > > 6. Regarding the naming of the new error code. BUSY sounds too generic to me, so I would rather prefer a specific error code. The main reason being that we may be able to reuse it in the future for other quotas. That being said, we can find another name. QUOTA_EXHAUSTED? I don't feel too strongly about the naming. I wonder what the others think about this.
> >
> > Think about this from the point of view of someone writing software that uses the Kafka client. Let's say you try to create some topics and get back QUOTA_VIOLATED. What does it mean? Maybe it means that you tried to create too many topics in too short a time (violating the controller throttle). Or maybe it means that you exceeded the quota on the number of partitions that you are allowed to create (let's assume that someone did a follow-on KIP that reuses this error code for that).
> >
> > But now you have a dilemma. If the controller was just busy, then trying again is the right thing to do. If there is some other quota you violated (number of partitions, number of topics, etc.), then retrying is wasteful and will not accomplish anything. Of course, the Kafka client software itself can tell the two cases apart since it has access to throttleTimeMs. But the end user can't (and in certain modes, the exception is exposed directly to the user here).
> >
> > That's why "you violated some quota, not sure which" is not a good error code. So I think we should distinguish the two cases by having two separate error codes. Maybe something like RATE_QUOTA_VIOLATED and RESOURCE_QUOTA_VIOLATED. This KIP would only need the former one.
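The two-error-code scheme Colin argues for above implies a simple client-side decision rule: wait and retry on the rate/throttling error, fail fast on the resource/limit error. A minimal sketch, purely illustrative (the error names and the throttle-time field come from the discussion; `create_topics`, `send`, and the response shape are hypothetical):

```python
import time

# Hypothetical string error codes mirroring the two cases discussed above.
THROTTLING_QUOTA_EXCEEDED = "THROTTLING_QUOTA_EXCEEDED"  # retryable after the throttle time
LIMIT_QUOTA_EXCEEDED = "LIMIT_QUOTA_EXCEEDED"            # final, retrying is pointless

def create_topics(send, request, max_retries=3):
    """Send a create-topics request, honoring the broker's throttle time.

    `send` is a stand-in for the network layer: it returns a dict with an
    `error` code and, when throttled, a `throttle_time_ms` hint.
    """
    for _ in range(max_retries):
        response = send(request)
        error = response.get("error")
        if error is None:
            return response
        if error == THROTTLING_QUOTA_EXCEEDED:
            # Retryable: wait for the advertised throttle time, then retry.
            time.sleep(response["throttle_time_ms"] / 1000.0)
            continue
        # LIMIT_QUOTA_EXCEEDED (or any other error) is final.
        raise RuntimeError(f"create topics failed: {error}")
    raise RuntimeError("create topics failed: retries exhausted")
```

Note that without two distinct codes, this loop cannot decide between the `continue` and the `raise` branches, which is exactly the dilemma described above.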
> > Another possibility is that the user could distinguish the two cases by the error string. But typically we want error strings to be used to give extra debugging information, not to make big decisions about what the client should do. So I think that the error code should actually be a little bit more specific, or at least tell the end user what to do with it (that's why I suggested "busy").
> >
> > > VoilĂ . I hope that I have addressed all your questions/points and I am sorry for the lengthy email.
> >
> > Thanks, David. It looks good to me overall. And I thought your email was very clear-- not too long at all. Let's just figure out the error code thing.
> >
> > best,
> > Colin
> >
> > > Regards,
> > > David
> > >
> > > On Tue, Jun 9, 2020 at 2:13 AM Colin McCabe <cmcc...@apache.org> wrote:
> > >
> > > > On Mon, Jun 8, 2020, at 14:41, Jun Rao wrote:
> > > > > Hi, Colin,
> > > > >
> > > > > Thanks for the comment. You brought up several points.
> > > > >
> > > > > 1. Should we set up a per-user quota? To me, it does seem we need some sort of a quota. When the controller runs out of resources, ideally, we only want to penalize the badly behaving applications, instead of every application. To do that, we will need to know what each application is entitled to, and the per-user quota is intended to capture that.
> > > > >
> > > > > 2. How easy is it to configure a quota? The following is how an admin typically sets up a quota in our existing systems. Pick a generous default per-user quota that works for most applications. For the few resource-intensive applications, customize a higher quota for them. Reserve enough resources in anticipation that a single (or a few) application will exceed the quota at a given time.
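Jun's configuration workflow above (a generous default per-user quota plus higher overrides for a few heavy users) amounts to a simple lookup at admission time. A tiny sketch, with illustrative names and values only:

```python
# Sketch of per-user quota resolution as Jun describes it: a generous
# default applies to everyone, with explicit overrides for the few
# resource-intensive users. All names and numbers here are illustrative.

DEFAULT_MUTATION_RATE = 10.0  # hypothetical default: permitted mutations/sec per user

def effective_quota(user, overrides, default=DEFAULT_MUTATION_RATE):
    """Return the quota for `user`: its override if one exists, else the default."""
    return overrides.get(user, default)
```

The point of the default is that the admin only has to think about the handful of users that need more, not about every application on the cluster.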
> > > > Hi Jun,
> > > >
> > > > Thanks for the response.
> > > >
> > > > Maybe I was too pessimistic about the ability of admins to configure a useful quota here. I do agree that it would be nice to have the ability to set different quotas for different users, as you mentioned.
> > > >
> > > > > 3. How should the quota be defined? In the discussion thread, we debated between a usage-based model vs. a rate-based model. Dave and Anna argued for the rate-based model, mostly because it's simpler to implement.
> > > >
> > > > I'm trying to think more about how this integrates with our plans for KIP-500. When we get rid of ZK, we will have to handle this in the controller itself, rather than in the AdminManager. That implies we'll have to rewrite the code. Maybe this is worth it if we want this feature now, though.
> > > >
> > > > Another wrinkle here is that, as we discussed in KIP-590, controller operations will land on a random broker first, and only then be forwarded to the active controller. This implies that either admission control should happen on all brokers (needing some kind of distributed quota scheme), or be done on the controller after we've already done the work of forwarding the message. The second approach might not be that bad, but it would be nice to figure this out.
> > > >
> > > > > 4. If a quota is exceeded, how is that enforced? My understanding of the KIP is that, if a quota is exceeded, the broker immediately sends back a QUOTA_VIOLATED error and a throttle time to the client, and the client will wait for the throttle time before issuing the next request. This seems to be the same as the BUSY error code you mentioned.
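The rate-based enforcement Jun summarizes here (reject immediately and hand the client a throttle time) is commonly implemented with a token bucket. The following is a rough sketch of that general technique, not the KIP's or the broker's actual accounting; all names are illustrative:

```python
import time

class MutationRateQuota:
    """Illustrative token bucket for a per-user mutation rate quota.

    `rate` is the permitted mutations per second, `burst` the bucket
    capacity. When a request overdraws the bucket, the caller gets back
    the time (in ms) the client should wait before trying again.
    """

    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)
        self.now = now
        self.last = now()

    def try_acquire(self, mutations):
        """Return 0 if the request is within quota, else a throttle time in ms."""
        t = self.now()
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (t - self.last) * self.rate)
        self.last = t
        self.tokens -= mutations  # charge the request; may go negative
        if self.tokens >= 0:
            return 0
        # Negative balance: tell the client how long until the debt is repaid.
        return int(-self.tokens / self.rate * 1000)
```

Note how this matches the admission model in the KIP discussion: the decision is made up front from a defined rate, rather than from the length of the controller's queue.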
> > > > Yes, I agree, it sounds like we're thinking along the same lines. However, rather than QUOTA_VIOLATED, how about naming the error code BUSY? Then the error text could indicate the quota that we violated. This would be more generally useful as an error code and also avoid being confusingly similar to POLICY_VIOLATION.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > > I will let David chime in more on that.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Sun, Jun 7, 2020 at 2:30 PM Colin McCabe <cmcc...@apache.org> wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > Thanks for the KIP.
> > > > > >
> > > > > > I thought about this for a while and I actually think this approach is not quite right. The problem that I see here is that using an explicitly set quota here requires careful tuning by the cluster operator. Even worse, this tuning might be invalidated by changes in overall conditions or even more efficient controller software.
> > > > > >
> > > > > > For example, if we empirically find that the controller can do 1000 topics in a minute (or whatever), this tuning might actually be wrong if the next version of the software can do 2000 topics in a minute because of efficiency upgrades. Or, the broker that the controller is located on might be experiencing heavy load from its non-controller operations, and so it can only do 500 topics in a minute during this period.
> > > > > >
> > > > > > So the system administrator gets a very obscure tunable (it's not clear to a non-Kafka-developer what "controller mutations" are or why they should care).
And even worse, they will have to significantly "sandbag" the value that they set it to, so that even under the heaviest load and the oldest deployed version of the software, the controller can still function. Even worse, this new quota adds a lot of complexity to the controller.
> > > > > >
> > > > > > What we really want is backpressure when the controller is overloaded. I believe this is the alternative you discuss in "Rejected Alternatives" under "Throttle the Execution instead of the Admission". Your reason for rejecting it is that the client error handling does not work well in this case. But actually, this is an artifact of our current implementation, rather than a fundamental issue with backpressure.
> > > > > >
> > > > > > Consider the example of a CreateTopicsRequest. The controller could return a special error code if the load was too high, and take the create topics event off the controller queue. Let's call that error code BUSY. Additionally, the controller could immediately refuse new events if the queue had reached its maximum length, and simply return BUSY for that case as well.
> > > > > >
> > > > > > Basically, the way we handle RPC timeouts in the controller right now is not very good. As you know, we time out the RPC, so the client gets a TimeoutException, but then we keep the event on the queue, so that it eventually gets executed! There's no reason why we have to do that. We could take the event off the queue if there is a timeout.
This would reduce load and mostly avoid the paradoxical situations you describe (getting a TopicExistsException for a CreateTopicsRequest retry, etc.).
> > > > > >
> > > > > > I say "mostly" because there are still cases where retries could go astray (for example, if we execute the topic creation but a network problem prevents the response from being sent to the client). But this would still be a very big improvement over what we have now.
> > > > > >
> > > > > > Sorry for commenting so late on this but I got distracted by some other things. I hope we can figure this one out-- I feel like there is a chance to significantly simplify this.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > > On Fri, May 29, 2020, at 07:57, David Jacot wrote:
> > > > > > > Hi folks,
> > > > > > >
> > > > > > > I'd like to start the vote for KIP-599 which proposes a new quota to throttle create topic, create partition, and delete topic operations to protect the Kafka controller:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-599%3A+Throttle+Create+Topic%2C+Create+Partition+and+Delete+Topic+Operations
> > > > > > >
> > > > > > > Please, let me know what you think.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > David
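For comparison, Colin's bounded-queue backpressure alternative (return BUSY when the controller event queue is full, and drop timed-out events instead of executing them later) could look roughly like the sketch below. All names, including the BUSY code and the queue API, are hypothetical:

```python
from collections import deque
import time

BUSY = "BUSY"  # hypothetical error code for an overloaded controller

class ControllerEventQueue:
    """Sketch of the bounded-queue idea: reject new work with BUSY when
    the queue is full, and discard events whose deadline has already
    passed instead of executing them anyway.
    """

    def __init__(self, max_len, now=time.monotonic):
        self.max_len = max_len
        self.now = now
        self.events = deque()

    def offer(self, event, timeout_s):
        """Enqueue `event`; return None on success, or BUSY if the queue is full."""
        if len(self.events) >= self.max_len:
            return BUSY
        self.events.append((event, self.now() + timeout_s))
        return None

    def poll(self):
        """Return the next event that has not timed out, or None.

        Timed-out events are dropped, so a client that already got a
        TimeoutException does not see its request executed later (avoiding
        the TopicExistsException-on-retry paradox discussed above).
        """
        while self.events:
            event, deadline = self.events.popleft()
            if self.now() <= deadline:
                return event
        return None
```

The counterargument from earlier in the thread still applies to this sketch: queue length is a crude load signal, and a full queue penalizes every client rather than just the misbehaving one.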