Hey everyone, I've made changes to the KIP to capture our discussions over the last couple of weeks. https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
I'll start a voting thread after people have had a chance to read/comment. Thanks, Aditya ________________________________________ From: Steven Wu [stevenz...@gmail.com] Sent: Friday, March 20, 2015 9:14 AM To: dev@kafka.apache.org Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas +1 on Jun's suggestion of maintaining one set/style of metrics at broker. In Netflix, we have to convert the yammer metrics to servo metrics at broker. it will be painful to know some metrics are in a different style and get to be handled differently. On Fri, Mar 20, 2015 at 8:17 AM, Jun Rao <j...@confluent.io> wrote: > Not so sure. People who use quota will definitely want to monitor the new > metrics at the client id level. Then they will need to deal with those > metrics differently from the rest of the metrics. It would be better if we > can hide this complexity from the users. > > Thanks, > > Jun > > On Thu, Mar 19, 2015 at 10:45 PM, Joel Koshy <jjkosh...@gmail.com> wrote: > > > Actually thinking again - since these will be a few new metrics at the > > client id level (bytes in and bytes out to start with) maybe it is fine > to > > have the two type of metrics coexist and we can migrate the existing > > metrics in parallel. > > > > On Thursday, March 19, 2015, Joel Koshy <jjkosh...@gmail.com> wrote: > > > > > That is a valid concern but in that case I think it would be better to > > > just migrate completely to the new metrics package first. > > > > > > On Thursday, March 19, 2015, Jun Rao <j...@confluent.io > > > <javascript:_e(%7B%7D,'cvml','j...@confluent.io');>> wrote: > > > > > >> Hmm, I was thinking a bit differently on the metrics stuff. I think it > > >> would be confusing to have some metrics defined in the new metrics > > package > > >> while some others defined in Coda Hale. Those metrics will look > > different > > >> (e.g., rates in Coda Hale will have special attributes such as > > >> 1-min-average). People may need different ways to export the metrics > to > > >> external systems such as Graphite. So, instead of using the new > metrics > > >> package on the broker, I was thinking that we can just implement a > > >> QuotaMetrics that wraps the Coda Hale metrics. The implementation can > be > > >> the same as what's in the new metrics package. > > >> > > >> Thanks, > > >> > > >> Jun > > >> > > >> On Thu, Mar 19, 2015 at 8:09 PM, Jay Kreps <jay.kr...@gmail.com> > wrote: > > >> > > >> > Yeah I was saying was that we are blocked on picking an approach for > > >> > metrics but not necessarily the full conversion. Clearly if we pick > > the > > >> new > > >> > metrics package we would need to implement the two metrics we want > to > > >> quota > > >> > on. But the conversion of the remaining metrics can be done > > >> asynchronously. > > >> > > > >> > -Jay > > >> > > > >> > On Thu, Mar 19, 2015 at 5:56 PM, Joel Koshy <jjkosh...@gmail.com> > > >> wrote: > > >> > > > >> > > > in KAFKA-1930). I agree that this KIP doesn't need to block on > the > > >> > > > migration of the metrics package. > > >> > > > > >> > > Can you clarify the above? i.e., if we are going to quota on > > something > > >> > > then we would want to have migrated that metric over right? Or do > > you > > >> > > mean we don't need to complete the migration of all metrics to the > > >> > > metrics package right? > > >> > > > > >> > > I think most of us now feel that the delay + no error is a good > > >> > > approach, but it would be good to make sure everyone is on the > same > > >> > > page. > > >> > > > > >> > > As Aditya requested a couple of days ago I think we should go over > > >> > > this at the next KIP hangout. > > >> > > > > >> > > Joel > > >> > > > > >> > > On Thu, Mar 19, 2015 at 09:24:09AM -0700, Jun Rao wrote: > > >> > > > 1. Delay + no error seems reasonable to me. However, I do feel > > that > > >> we > > >> > > need > > >> > > > to give the client an indicator that it's being throttled, > instead > > >> of > > >> > > doing > > >> > > > this silently. For that, we probably need to evolve the > > >> produce/fetch > > >> > > > protocol to include an extra status field in the response. We > > >> probably > > >> > > need > > >> > > > to think more about whether we just want to return a simple > status > > >> code > > >> > > > (e.g., 1 = throttled) or a value that indicates how much is > being > > >> > > throttled. > > >> > > > > > >> > > > 2. We probably need to improve the histogram support in the new > > >> metrics > > >> > > > package before we can use it more widely on the server side > (left > > a > > >> > > comment > > >> > > > in KAFKA-1930). I agree that this KIP doesn't need to block on > the > > >> > > > migration of the metrics package. > > >> > > > > > >> > > > Thanks, > > >> > > > > > >> > > > Jun > > >> > > > > > >> > > > On Wed, Mar 18, 2015 at 4:02 PM, Aditya Auradkar < > > >> > > > aaurad...@linkedin.com.invalid> wrote: > > >> > > > > > >> > > > > Hey everyone, > > >> > > > > > > >> > > > > Thanks for the great discussion. There are currently a few > > points > > >> on > > >> > > this > > >> > > > > KIP that need addressing and I want to make sure we are on the > > >> same > > >> > > page > > >> > > > > about those. > > >> > > > > > > >> > > > > 1. Append and delay response vs delay and return error > > >> > > > > - I think we've discussed the pros and cons of each approach > but > > >> > > haven't > > >> > > > > chosen an approach yet. Where does everyone stand on this > issue? > > >> > > > > > > >> > > > > 2. Metrics Migration and usage in quotas > > >> > > > > - The metrics library in clients has a notion of quotas that > we > > >> > should > > >> > > > > reuse. For that to happen, we need to migrate the server to > the > > >> new > > >> > > metrics > > >> > > > > package. > > >> > > > > - Need more clarification on how to compute throttling time > and > > >> > > windowing > > >> > > > > for quotas. > > >> > > > > > > >> > > > > I'm going to start a new KIP to discuss metrics migration > > >> separately. > > >> > > That > > >> > > > > will also contain a section on quotas. > > >> > > > > > > >> > > > > 3. Dynamic Configuration management - Being discussed in > KIP-5. > > >> > > Basically > > >> > > > > we need something that will model default quotas and allow > > >> per-client > > >> > > > > overrides. > > >> > > > > > > >> > > > > Is there something else that I'm missing? > > >> > > > > > > >> > > > > Thanks, > > >> > > > > Aditya > > >> > > > > ________________________________________ > > >> > > > > From: Jay Kreps [jay.kr...@gmail.com] > > >> > > > > Sent: Wednesday, March 18, 2015 2:10 PM > > >> > > > > To: dev@kafka.apache.org > > >> > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas > > >> > > > > > > >> > > > > Hey Steven, > > >> > > > > > > >> > > > > The current proposal is actually to enforce quotas at the > > >> > > > > client/application level, NOT the topic level. So if you have > a > > >> > service > > >> > > > > with a few dozen instances the quota is against all of those > > >> > instances > > >> > > > > added up across all their topics. So actually the effect would > > be > > >> the > > >> > > same > > >> > > > > either way but throttling gives the producer the choice of > > either > > >> > > blocking > > >> > > > > or dropping. > > >> > > > > > > >> > > > > -Jay > > >> > > > > > > >> > > > > On Tue, Mar 17, 2015 at 10:08 AM, Steven Wu < > > stevenz...@gmail.com > > >> > > > >> > > wrote: > > >> > > > > > > >> > > > > > Jay, > > >> > > > > > > > >> > > > > > let's say an app produces to 10 different topics. one of the > > >> topic > > >> > is > > >> > > > > sent > > >> > > > > > from a library. due to whatever condition/bug, this lib > starts > > >> to > > >> > > send > > >> > > > > > messages over the quota. if we go with the delayed response > > >> > > approach, it > > >> > > > > > will cause the whole shared RecordAccumulator buffer to be > > >> filled > > >> > up. > > >> > > > > that > > >> > > > > > will penalize other 9 topics who are within the quota. that > is > > >> the > > >> > > > > > unfairness point that Ewen and I were trying to make. > > >> > > > > > > > >> > > > > > if broker just drop the msg and return an error/status code > > >> > > indicates the > > >> > > > > > drop and why. then producer can just move on and accept the > > >> drop. > > >> > > shared > > >> > > > > > buffer won't be saturated and other 9 topics won't be > > penalized. > > >> > > > > > > > >> > > > > > Thanks, > > >> > > > > > Steven > > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > On Tue, Mar 17, 2015 at 9:44 AM, Jay Kreps < > > jay.kr...@gmail.com > > >> > > > >> > > wrote: > > >> > > > > > > > >> > > > > > > Hey Steven, > > >> > > > > > > > > >> > > > > > > It is true that hitting the quota will cause back-pressure > > on > > >> the > > >> > > > > > producer. > > >> > > > > > > But the solution is simple, a producer that wants to avoid > > >> this > > >> > > should > > >> > > > > > stay > > >> > > > > > > under its quota. In other words this is a contract between > > the > > >> > > cluster > > >> > > > > > and > > >> > > > > > > the client, with each side having something to uphold. > Quite > > >> > > possibly > > >> > > > > the > > >> > > > > > > same thing will happen in the absence of a quota, a client > > >> that > > >> > > > > produces > > >> > > > > > an > > >> > > > > > > unexpected amount of load will hit the limits of the > server > > >> and > > >> > > > > > experience > > >> > > > > > > backpressure. Quotas just allow you to set that same limit > > at > > >> > > something > > >> > > > > > > lower than 100% of all resources on the server, which is > > >> useful > > >> > > for a > > >> > > > > > > shared cluster. > > >> > > > > > > > > >> > > > > > > -Jay > > >> > > > > > > > > >> > > > > > > On Mon, Mar 16, 2015 at 11:34 PM, Steven Wu < > > >> > stevenz...@gmail.com> > > >> > > > > > wrote: > > >> > > > > > > > > >> > > > > > > > wait. we create one kafka producer for each cluster. > each > > >> > > cluster can > > >> > > > > > > have > > >> > > > > > > > many topics. if producer buffer got filled up due to > > delayed > > >> > > response > > >> > > > > > for > > >> > > > > > > > one throttled topic, won't that penalize other topics > > >> unfairly? > > >> > > it > > >> > > > > > seems > > >> > > > > > > to > > >> > > > > > > > me that broker should just return error without delay. > > >> > > > > > > > > > >> > > > > > > > sorry that I am chatting to myself :) > > >> > > > > > > > > > >> > > > > > > > On Mon, Mar 16, 2015 at 11:29 PM, Steven Wu < > > >> > > stevenz...@gmail.com> > > >> > > > > > > wrote: > > >> > > > > > > > > > >> > > > > > > > > I think I can answer my own question. delayed response > > >> will > > >> > > cause > > >> > > > > the > > >> > > > > > > > > producer buffer to be full, which then result in > either > > >> > thread > > >> > > > > > blocking > > >> > > > > > > > or > > >> > > > > > > > > message drop. > > >> > > > > > > > > > > >> > > > > > > > > On Mon, Mar 16, 2015 at 11:24 PM, Steven Wu < > > >> > > stevenz...@gmail.com> > > >> > > > > > > > wrote: > > >> > > > > > > > > > > >> > > > > > > > >> please correct me if I am missing sth here. I am not > > >> > > understanding > > >> > > > > > how > > >> > > > > > > > >> would throttle work without cooperation/back-off from > > >> > > producer. > > >> > > > > new > > >> > > > > > > Java > > >> > > > > > > > >> producer supports non-blocking API. why would delayed > > >> > > response be > > >> > > > > > able > > >> > > > > > > > to > > >> > > > > > > > >> slow down producer? producer will continue to fire > > async > > >> > > sends. > > >> > > > > > > > >> > > >> > > > > > > > >> On Mon, Mar 16, 2015 at 10:58 PM, Guozhang Wang < > > >> > > > > wangg...@gmail.com > > >> > > > > > > > > >> > > > > > > > >> wrote: > > >> > > > > > > > >> > > >> > > > > > > > >>> I think we are really discussing two separate issues > > >> here: > > >> > > > > > > > >>> > > >> > > > > > > > >>> 1. Whether we should a) > > >> > > > > append-then-block-then-returnOKButThrottled > > >> > > > > > > or > > >> > > > > > > > b) > > >> > > > > > > > >>> block-then-returnFailDuetoThrottled for quota > actions > > on > > >> > > produce > > >> > > > > > > > >>> requests. > > >> > > > > > > > >>> > > >> > > > > > > > >>> Both these approaches assume some kind of > > >> well-behaveness > > >> > of > > >> > > the > > >> > > > > > > > clients: > > >> > > > > > > > >>> option a) assumes the client sets an proper timeout > > >> value > > >> > > while > > >> > > > > can > > >> > > > > > > > just > > >> > > > > > > > >>> ignore "OKButThrottled" response, while option b) > > >> assumes > > >> > the > > >> > > > > > client > > >> > > > > > > > >>> handles the "FailDuetoThrottled" appropriately. For > > any > > >> > > malicious > > >> > > > > > > > clients > > >> > > > > > > > >>> that, for example, just keep retrying either > > >> intentionally > > >> > or > > >> > > > > not, > > >> > > > > > > > >>> neither > > >> > > > > > > > >>> of these approaches are actually effective. > > >> > > > > > > > >>> > > >> > > > > > > > >>> 2. For "OKButThrottled" and "FailDuetoThrottled" > > >> responses, > > >> > > shall > > >> > > > > > we > > >> > > > > > > > >>> encode > > >> > > > > > > > >>> them as error codes or augment the protocol to use a > > >> > separate > > >> > > > > field > > >> > > > > > > > >>> indicating "status codes". > > >> > > > > > > > >>> > > >> > > > > > > > >>> Today we have already incorporated some status code > as > > >> > error > > >> > > > > codes > > >> > > > > > in > > >> > > > > > > > the > > >> > > > > > > > >>> responses, e.g. ReplicaNotAvailable in > > MetadataResponse, > > >> > the > > >> > > pros > > >> > > > > > of > > >> > > > > > > > this > > >> > > > > > > > >>> is of course using a single field for response > status > > >> like > > >> > > the > > >> > > > > HTTP > > >> > > > > > > > >>> status > > >> > > > > > > > >>> codes, while the cons is that it requires clients to > > >> handle > > >> > > the > > >> > > > > > error > > >> > > > > > > > >>> codes > > >> > > > > > > > >>> carefully. > > >> > > > > > > > >>> > > >> > > > > > > > >>> I think maybe we can actually extend the single-code > > >> > > approach to > > >> > > > > > > > overcome > > >> > > > > > > > >>> its drawbacks, that is, wrap the error codes > semantics > > >> to > > >> > the > > >> > > > > users > > >> > > > > > > so > > >> > > > > > > > >>> that > > >> > > > > > > > >>> users do not need to handle the codes one-by-one. > More > > >> > > > > concretely, > > >> > > > > > > > >>> following Jay's example the client could write sth. > > like > > >> > > this: > > >> > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > > >>> ----------------- > > >> > > > > > > > >>> > > >> > > > > > > > >>> if(error.isOK()) > > >> > > > > > > > >>> // status code is good or the code can be > simply > > >> > > ignored for > > >> > > > > > > this > > >> > > > > > > > >>> request type, process the request > > >> > > > > > > > >>> else if(error.needsRetry()) > > >> > > > > > > > >>> // throttled, transient error, etc: retry > > >> > > > > > > > >>> else if(error.isFatal()) > > >> > > > > > > > >>> // non-retriable errors, etc: notify / > terminate > > / > > >> > other > > >> > > > > > > handling > > >> > > > > > > > >>> > > >> > > > > > > > >>> ----------------- > > >> > > > > > > > >>> > > >> > > > > > > > >>> Only when the clients really want to handle, for > > example > > >> > > > > > > > >>> FailDuetoThrottled > > >> > > > > > > > >>> status code specifically, it needs to: > > >> > > > > > > > >>> > > >> > > > > > > > >>> if(error.isOK()) > > >> > > > > > > > >>> // status code is good or the code can be > simply > > >> > > ignored for > > >> > > > > > > this > > >> > > > > > > > >>> request type, process the request > > >> > > > > > > > >>> else if(error == FailDuetoThrottled ) > > >> > > > > > > > >>> // throttled: log it > > >> > > > > > > > >>> else if(error.needsRetry()) > > >> > > > > > > > >>> // transient error, etc: retry > > >> > > > > > > > >>> else if(error.isFatal()) > > >> > > > > > > > >>> // non-retriable errors, etc: notify / > terminate > > / > > >> > other > > >> > > > > > > handling > > >> > > > > > > > >>> > > >> > > > > > > > >>> ----------------- > > >> > > > > > > > >>> > > >> > > > > > > > >>> And for implementation we can probably group the > codes > > >> > > > > accordingly > > >> > > > > > > like > > >> > > > > > > > >>> HTTP status code such that we can do: > > >> > > > > > > > >>> > > >> > > > > > > > >>> boolean Error.isOK() { > > >> > > > > > > > >>> return code < 300 && code >= 200; > > >> > > > > > > > >>> } > > >> > > > > > > > >>> > > >> > > > > > > > >>> Guozhang > > >> > > > > > > > >>> > > >> > > > > > > > >>> On Mon, Mar 16, 2015 at 10:24 PM, Ewen > > Cheslack-Postava > > >> < > > >> > > > > > > > >>> e...@confluent.io> > > >> > > > > > > > >>> wrote: > > >> > > > > > > > >>> > > >> > > > > > > > >>> > Agreed that trying to shoehorn non-error codes > into > > >> the > > >> > > error > > >> > > > > > field > > >> > > > > > > > is > > >> > > > > > > > >>> a > > >> > > > > > > > >>> > bad idea. It makes it *way* too easy to write code > > >> that > > >> > > looks > > >> > > > > > (and > > >> > > > > > > > >>> should > > >> > > > > > > > >>> > be) correct but is actually incorrect. If > > necessary, I > > >> > > think > > >> > > > > it's > > >> > > > > > > > much > > >> > > > > > > > >>> > better to to spend a couple of extra bytes to > encode > > >> that > > >> > > > > > > information > > >> > > > > > > > >>> > separately (a "status" or "warning" section of the > > >> > > response). > > >> > > > > An > > >> > > > > > > > >>> indication > > >> > > > > > > > >>> > that throttling is occurring is something I'd > expect > > >> to > > >> > be > > >> > > > > > > indicated > > >> > > > > > > > >>> by a > > >> > > > > > > > >>> > bit flag in the response rather than as an error > > code. > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > Gwen - I think an error code makes sense when the > > >> request > > >> > > > > > actually > > >> > > > > > > > >>> failed. > > >> > > > > > > > >>> > Option B, which Jun was advocating, would have > > >> appended > > >> > the > > >> > > > > > > messages > > >> > > > > > > > >>> > successfully. If the rate-limiting case you're > > talking > > >> > > about > > >> > > > > had > > >> > > > > > > > >>> > successfully committed the messages, I would say > > >> that's > > >> > > also a > > >> > > > > > bad > > >> > > > > > > > use > > >> > > > > > > > >>> of > > >> > > > > > > > >>> > error codes. > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > On Mon, Mar 16, 2015 at 10:16 PM, Gwen Shapira < > > >> > > > > > > > gshap...@cloudera.com> > > >> > > > > > > > >>> > wrote: > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > > We discussed an error code for rate-limiting > > (which > > >> I > > >> > > think > > >> > > > > > made > > >> > > > > > > > >>> > > sense), isn't it a similar case? > > >> > > > > > > > >>> > > > > >> > > > > > > > >>> > > On Mon, Mar 16, 2015 at 10:10 PM, Jay Kreps < > > >> > > > > > jay.kr...@gmail.com > > >> > > > > > > > > > >> > > > > > > > >>> wrote: > > >> > > > > > > > >>> > > > My concern is that as soon as you start > encoding > > >> > > non-error > > >> > > > > > > > response > > >> > > > > > > > >>> > > > information into error codes the next question > > is > > >> > what > > >> > > to > > >> > > > > do > > >> > > > > > if > > >> > > > > > > > two > > >> > > > > > > > >>> > such > > >> > > > > > > > >>> > > > codes apply (i.e. you have a replica down and > > the > > >> > > response > > >> > > > > is > > >> > > > > > > > >>> > quota'd). I > > >> > > > > > > > >>> > > > think I am trying to argue that error should > > mean > > >> > "why > > >> > > we > > >> > > > > > > failed > > >> > > > > > > > >>> your > > >> > > > > > > > >>> > > > request", for which there will really only be > > one > > >> > > reason, > > >> > > > > and > > >> > > > > > > any > > >> > > > > > > > >>> other > > >> > > > > > > > >>> > > > useful information we want to send back is > just > > >> > another > > >> > > > > field > > >> > > > > > > in > > >> > > > > > > > >>> the > > >> > > > > > > > >>> > > > response. > > >> > > > > > > > >>> > > > > > >> > > > > > > > >>> > > > -Jay > > >> > > > > > > > >>> > > > > > >> > > > > > > > >>> > > > On Mon, Mar 16, 2015 at 9:51 PM, Gwen Shapira > < > > >> > > > > > > > >>> gshap...@cloudera.com> > > >> > > > > > > > >>> > > wrote: > > >> > > > > > > > >>> > > > > > >> > > > > > > > >>> > > >> I think its not too late to reserve a set of > > >> error > > >> > > codes > > >> > > > > > > > >>> (200-299?) > > >> > > > > > > > >>> > > >> for "non-error" codes. > > >> > > > > > > > >>> > > >> > > >> > > > > > > > >>> > > >> It won't be backward compatible (i.e. clients > > >> that > > >> > > > > currently > > >> > > > > > > do > > >> > > > > > > > >>> "else > > >> > > > > > > > >>> > > >> throw" will throw on non-errors), but perhaps > > its > > >> > > > > > worthwhile. > > >> > > > > > > > >>> > > >> > > >> > > > > > > > >>> > > >> On Mon, Mar 16, 2015 at 9:42 PM, Jay Kreps < > > >> > > > > > > jay.kr...@gmail.com > > >> > > > > > > > > > > >> > > > > > > > >>> > wrote: > > >> > > > > > > > >>> > > >> > Hey Jun, > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > I'd really really really like to avoid > that. > > >> > Having > > >> > > just > > >> > > > > > > > spent a > > >> > > > > > > > >>> > > bunch of > > >> > > > > > > > >>> > > >> > time on the clients, using the error codes > to > > >> > encode > > >> > > > > other > > >> > > > > > > > >>> > information > > >> > > > > > > > >>> > > >> > about the response is super dangerous. The > > >> error > > >> > > > > handling > > >> > > > > > is > > >> > > > > > > > >>> one of > > >> > > > > > > > >>> > > the > > >> > > > > > > > >>> > > >> > hardest parts of the client (Guozhang chime > > in > > >> > > here). > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > Generally the error handling looks like > > >> > > > > > > > >>> > > >> > if(error == none) > > >> > > > > > > > >>> > > >> > // good, process the request > > >> > > > > > > > >>> > > >> > else if(error == KNOWN_ERROR_1) > > >> > > > > > > > >>> > > >> > // handle known error 1 > > >> > > > > > > > >>> > > >> > else if(error == KNOWN_ERROR_2) > > >> > > > > > > > >>> > > >> > // handle known error 2 > > >> > > > > > > > >>> > > >> > else > > >> > > > > > > > >>> > > >> > throw > Errors.forCode(error).exception(); > > >> // > > >> > or > > >> > > some > > >> > > > > > > other > > >> > > > > > > > >>> > default > > >> > > > > > > > >>> > > >> > behavior > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > This works because we have a convention > that > > >> and > > >> > > error > > >> > > > > is > > >> > > > > > > > >>> something > > >> > > > > > > > >>> > > that > > >> > > > > > > > >>> > > >> > prevented your getting the response so the > > >> default > > >> > > > > > handling > > >> > > > > > > > >>> case is > > >> > > > > > > > >>> > > sane > > >> > > > > > > > >>> > > >> > and forward compatible. It is tempting to > use > > >> the > > >> > > error > > >> > > > > > code > > >> > > > > > > > to > > >> > > > > > > > >>> > convey > > >> > > > > > > > >>> > > >> > information in the success case. For > example > > we > > >> > > could > > >> > > > > use > > >> > > > > > > > error > > >> > > > > > > > >>> > codes > > >> > > > > > > > >>> > > to > > >> > > > > > > > >>> > > >> > encode whether quotas were enforced, > whether > > >> the > > >> > > request > > >> > > > > > was > > >> > > > > > > > >>> served > > >> > > > > > > > >>> > > out > > >> > > > > > > > >>> > > >> of > > >> > > > > > > > >>> > > >> > cache, whether the stock market is up > today, > > or > > >> > > > > whatever. > > >> > > > > > > The > > >> > > > > > > > >>> > problem > > >> > > > > > > > >>> > > is > > >> > > > > > > > >>> > > >> > that since these are not errors as far as > the > > >> > > client is > > >> > > > > > > > >>> concerned it > > >> > > > > > > > >>> > > >> should > > >> > > > > > > > >>> > > >> > not throw an exception but process the > > >> response, > > >> > > but now > > >> > > > > > we > > >> > > > > > > > >>> created > > >> > > > > > > > >>> > an > > >> > > > > > > > >>> > > >> > explicit requirement that that error be > > handled > > >> > > > > explicitly > > >> > > > > > > > >>> since it > > >> > > > > > > > >>> > is > > >> > > > > > > > >>> > > >> > different. I really think that this kind of > > >> > > information > > >> > > > > is > > >> > > > > > > not > > >> > > > > > > > >>> an > > >> > > > > > > > >>> > > error, > > >> > > > > > > > >>> > > >> it > > >> > > > > > > > >>> > > >> > is just information, and if we want it in > the > > >> > > response > > >> > > > > we > > >> > > > > > > > >>> should do > > >> > > > > > > > >>> > > the > > >> > > > > > > > >>> > > >> > right thing and add a new field to the > > >> response. > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > I think you saw the Samza bug that was > > >> literally > > >> > an > > >> > > > > > example > > >> > > > > > > of > > >> > > > > > > > >>> this > > >> > > > > > > > >>> > > >> > happening and leading to an infinite retry > > >> loop. > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > Further more I really want to emphasize > that > > >> > hitting > > >> > > > > your > > >> > > > > > > > quota > > >> > > > > > > > >>> in > > >> > > > > > > > >>> > the > > >> > > > > > > > >>> > > >> > design that Adi has proposed is actually > not > > an > > >> > > error > > >> > > > > > > > condition > > >> > > > > > > > >>> at > > >> > > > > > > > >>> > > all. > > >> > > > > > > > >>> > > >> It > > >> > > > > > > > >>> > > >> > is totally reasonable in any bootstrap > > >> situation > > >> > to > > >> > > > > > > > >>> intentionally > > >> > > > > > > > >>> > > want to > > >> > > > > > > > >>> > > >> > run at the limit the system imposes on you. > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > -Jay > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> > On Mon, Mar 16, 2015 at 4:27 PM, Jun Rao < > > >> > > > > > j...@confluent.io> > > >> > > > > > > > >>> wrote: > > >> > > > > > > > >>> > > >> > > > >> > > > > > > > >>> > > >> >> It's probably useful for a client to know > > >> whether > > >> > > its > > >> > > > > > > > requests > > >> > > > > > > > >>> are > > >> > > > > > > > >>> > > >> >> throttled or not (e.g., for monitoring and > > >> > > alerting). > > >> > > > > > From > > >> > > > > > > > that > > >> > > > > > > > >>> > > >> >> perspective, option B (delay the requests > > and > > >> > > return an > > >> > > > > > > > error) > > >> > > > > > > > >>> > seems > > >> > > > > > > > >>> > > >> >> better. > > >> > > > > > > > >>> > > >> >> > > >> > > > > > > > >>> > > >> >> Thanks, > > >> > > > > > > > >>> > > >> >> > > >> > > > > > > > >>> > > >> >> Jun > > >> > > > > > > > >>> > > >> >> > > >> > > > > > > > >>> > > >> >> On Wed, Mar 4, 2015 at 3:51 PM, Aditya > > >> Auradkar < > > >> > > > > > > > >>> > > >> >> aaurad...@linkedin.com.invalid> wrote: > > >> > > > > > > > >>> > > >> >> > > >> > > > > > > > >>> > > >> >> > Posted a KIP for quotas in kafka. > > >> > > > > > > > >>> > > >> >> > > > >> > > > > > > > >>> > > > >> > > > > > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas > > >> > > > > > > > >>> > > >> >> > > > >> > > > > > > > >>> > > >> >> > Appreciate any feedback. > > >> > > > > > > > >>> > > >> >> > > > >> > > > > > > > >>> > > >> >> > Aditya > > >> > > > > > > > >>> > > >> >> > > > >> > > > > > > > >>> > > >> >> > > >> > > > > > > > >>> > > >> > > >> > > > > > > > >>> > > > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > -- > > >> > > > > > > > >>> > Thanks, > > >> > > > > > > > >>> > Ewen > > >> > > > > > > > >>> > > > >> > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > > >>> -- > > >> > > > > > > > >>> -- Guozhang > > >> > > > > > > > >>> > > >> > > > > > > > >> > > >> > > > > > > > >> > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > >> > > > > >> > > > >> > > > > > > > > > -- > > > Sent from Gmail Mobile > > > > > > > > > -- > > Sent from Gmail Mobile > > >