Thanks for the KIP, Rajini. This is a welcome improvement and the KIP page covers it well. A few comments:

1. Can you expand a bit on the motivation for throttling requests that fail authorization for ClusterAction? Under what scenarios would this help?

2. I think we should rename `throttle_time_ms` in the new version of produce/fetch response to make it clear that it refers to the byte rate throttling. Also, it would be good to include the updated schema for the responses (we typically try to do that whenever we update protocol APIs).

3. I think I am OK with using absolute units, but I am not sure about the argument why it's better than a percentage. We are comparing request threads to CPUs, but they're not the same, as increasing the number of request threads doesn't necessarily mean that the server can cope with more requests. In the example where we double the number of threads, all the existing users would still have the same capacity proportionally speaking so it seems intuitive to me. One thing that would be helpful, I think, is to describe a few scenarios where the setting needs to be adjusted and how users would go about doing it.

4. I think it's worth mentioning that TLS increases the load on the network thread significantly and for cases where there is mixed plaintext and TLS traffic, the existing byte rate throttling may not do a great job. I think it's OK to tackle this in a separate KIP, but worth mentioning the limitation.

5. We mention DoS attacks in the document. It may be worth mentioning that this mostly helps with clients that are not malicious. A malicious client could generate a large number of connections to counteract the delays that this KIP introduces. Kafka has connection limits per IP today, but not per user, so a distributed DoS could bypass those. This is not easy to solve at the Kafka level since the authentication step required to get the user may be costly enough that the brokers will eventually be overwhelmed.

6. It's unfortunate that the existing byte rate quota configs use underscores instead of dots (like every other config) as separators. It's reasonable for `io_thread_units` to use the same convention as the byte rate configs, but it's not great that we are adding to the inconsistency. I don't have any great solutions apart from perhaps accepting the dot notation for all these configs as well.

Ismael

On Fri, Feb 17, 2017 at 5:05 PM, Rajini Sivaram <rajinisiva...@gmail.com> wrote:

> Hi all,
>
> I have just created KIP-124 to introduce request rate quotas to Kafka:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
>
> The proposal is for a simple percentage request handling time quota that
> can be allocated to *<client-id>*, *<user>* or *<user, client-id>*. There
> are a few other suggestions also under "Rejected alternatives". Feedback
> and suggestions are welcome.
>
> Thank you...
>
> Regards,
>
> Rajini
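
P.S. To make the arithmetic behind point 3 concrete, here is a rough sketch of what happens to one user's quota when the number of request threads is doubled. The thread counts, the quota values and both function names are made up for illustration; none of them come from the KIP, and "thread units" here simply means "the time of one request thread".

```python
# Hypothetical model: one "thread unit" is the full time of one request thread.

def thread_units_for_percentage(quota_percent: float, num_threads: int) -> float:
    """Thread units a user may consume when the quota is a percentage of all threads."""
    return quota_percent / 100.0 * num_threads


def percentage_for_thread_units(quota_units: float, num_threads: int) -> float:
    """Share of total thread capacity (in percent) when the quota is in absolute units."""
    return quota_units / num_threads * 100.0


if __name__ == "__main__":
    # Illustrative numbers only: 8 request threads, later doubled to 16.
    for threads in (8, 16):
        pct_units = thread_units_for_percentage(10.0, threads)
        abs_share = percentage_for_thread_units(0.8, threads)
        print(
            f"threads={threads}: "
            f"10% quota = {pct_units:.1f} thread units, "
            f"0.8-unit quota = {abs_share:.1f}% of capacity"
        )
```

Under these assumed numbers, the percentage quota keeps the user at the same proportional share after the thread count doubles, while the absolute quota leaves the user's share of the total halved until someone adjusts the setting. Whether either behaviour is desirable depends on whether the extra threads really add capacity, which is the open question above.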