I think Jay meant a catch-all request/sec limit on all requests per-client. That makes sense.
On Fri, Apr 24, 2015 at 11:02:29PM +0000, Aditya Auradkar wrote:
> I think Joel's suggestion is quite good. It's still possible to throttle
> other types of requests using purgatory but we will need a separate purgatory
> and DelayedOperation variants of different request types or perhaps add a
> ThrottledOperation type. It also addresses a couple of special case
> situations wrt delay time and replication timeouts.
>
> Jay, if we have a general mechanism of delaying requests then it should be
> possible to throttle any type of request as long as we have metrics on a
> per-client basis. For offset commit requests, we would simply need a request
> rate metric per-client and a good default quota.
>
> Thanks,
> Aditya
>
> ________________________________________
> From: Jay Kreps [jay.kr...@gmail.com]
> Sent: Friday, April 24, 2015 3:20 PM
> To: dev@kafka.apache.org
> Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
>
> Hey Jun/Joel,
>
> Yeah we will definitely want to quota non-produce/consume requests.
> Especially offset commit and any other requests the consumer can trigger
> could easily get invoked in a tight loop by accident. We haven't talked
> about this a ton, but presumably the mechanism for all these would just be
> a general requests/sec limit that covers all requests?
>
> -Jay
>
>
> On Fri, Apr 24, 2015 at 2:18 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Joel,
> >
> > What you suggested makes sense. Not sure if there is a strong need to
> > throttle TMR though since it should be infrequent.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > > Given the caveats, it may be worth doing further investigation on the
> > > alternate approach which is to use a dedicated DelayQueue for requests
> > > that violate quota and compare pros/cons.
> > >
> > > So the approach is the following: all request handling occurs normally
> > > (i.e., unchanged from what we do today).
> > > i.e., purgatories will be
> > > unchanged. After handling a request and before sending the response,
> > > check if the request has violated a quota. If so, then enqueue the
> > > response into a DelayQueue. All responses can share the same
> > > DelayQueue. Send those responses out after the delay has been met.
> > >
> > > There are some benefits to doing this:
> > >
> > > - We will eventually want to quota other requests as well. The above
> > >   seems to be a clean staged approach that should work uniformly for
> > >   all requests. i.e., parse request -> handle request normally ->
> > >   check quota -> hold in delay queue if quota violated -> respond.
> > >   All requests can share the same DelayQueue. (In contrast with the
> > >   current proposal we could end up with a bunch of purgatories, or a
> > >   combination of purgatories and delay queues.)
> > > - Since this approach does not need any fundamental modifications to
> > >   the current request handling, it addresses the caveats that Adi
> > >   noted (which is holding producer requests/fetch requests longer than
> > >   strictly necessary if quota is violated since the proposal was to
> > >   not watch on keys in that case). Likewise it addresses the caveat
> > >   that Guozhang noted (we may return no error if the request is held
> > >   long enough due to quota violation and satisfy a producer request
> > >   that may have in fact exceeded the ack timeout) although it is
> > >   probably reasonable to hide this case from the user.
> > > - By avoiding the caveats it also avoids the suggested work-around to
> > >   the caveats which is effectively to add a min-hold-time to the
> > >   purgatory. Although this is not a lot of code, I think it adds a
> > >   quota-driven feature to the purgatory which is already non-trivial
> > >   and should ideally remain unassociated with quota enforcement.
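The shared response queue described above can be sketched with java.util.concurrent.DelayQueue. This is only an illustration under stated assumptions: ThrottledResponse, ResponseThrottler, and their fields and methods are hypothetical names invented for this sketch, not classes from the Kafka code base, and the response is modelled as a plain string.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical holder for a response that has to wait out a quota delay.
final class ThrottledResponse implements Delayed {
    final String clientId;
    final String payload;
    private final long sendAtNanos;

    ThrottledResponse(String clientId, String payload, long delayMs) {
        this.clientId = clientId;
        this.payload = payload;
        this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
    }

    // Remaining delay; zero or negative means the response may be sent.
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    // Orders responses so the one that is due soonest sits at the head.
    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }
}

// Hypothetical single queue shared by all request types.
final class ResponseThrottler {
    private final DelayQueue<ThrottledResponse> delayed = new DelayQueue<>();

    // Called after the request was handled normally. Returns true if the
    // response can go out immediately, false if it was parked in the queue.
    boolean sendOrDelay(String clientId, String payload, long quotaDelayMs) {
        if (quotaDelayMs <= 0) {
            return true;
        }
        delayed.put(new ThrottledResponse(clientId, payload, quotaDelayMs));
        return false;
    }

    // Blocks until the next parked response has served its delay.
    ThrottledResponse takeNextDue() {
        try {
            return delayed.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    int parked() {
        return delayed.size();
    }
}
```

In a sketch like this a dedicated sender thread would loop on takeNextDue() and write each response to the network, so request handling itself stays unchanged.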
> > >
> > > For this to work well we need to be sure that we don't hold a lot of
> > > data in the DelayQueue - and therein lies a quirk to this approach.
> > > Producer responses (and most other responses) are very small so there
> > > is no issue. Fetch responses are fine as well - since we read off a
> > > FileMessageSet in response (zero-copy). This will remain true even
> > > when we support SSL since encryption occurs at the session layer (not
> > > the application layer).
> > >
> > > Topic metadata response can be a problem though. For this we ideally
> > > want to build the topic metadata response only when we are ready to
> > > respond. So for metadata-style responses which could contain large
> > > response objects we may want to put the quota check and delay queue
> > > _before_ handling the request. So the design in this approach would
> > > need an amendment: provide a choice of where to put a request in the
> > > delay queue: either before handling or after handling (before
> > > response). So for:
> > >
> > > small request, large response: delay queue before handling
> > > large request, small response: delay queue after handling, before response
> > > small request, small response: either is fine
> > > large request, large response: we really cannot do anything here but we
> > > don't really have this scenario yet
> > >
> > > So the design would look like this:
> > >
> > > - parse request
> > > - before handling request check if quota violated; if so compute two
> > >   delay numbers:
> > >   - before handling delay
> > >   - before response delay
> > > - if before-handling delay > 0 insert into before-handling delay queue
> > > - handle the request
> > > - if before-response delay > 0 insert into before-response delay queue
> > > - respond
> > >
> > > Just throwing this out there for discussion.
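A rough sketch of how the two delay numbers in the amended design could be chosen from the request/response size table above. Everything here is an assumption made for illustration: QuotaDelayPlan, plan(), and the two boolean size hints are hypothetical, and the treatment of the "large request, large response" case (delaying before the response) is an arbitrary placeholder, since the proposal says nothing can really be done there.

```java
// Hypothetical result of the quota check: where to apply the delay.
final class QuotaDelayPlan {
    final long beforeHandlingMs;
    final long beforeResponseMs;

    private QuotaDelayPlan(long beforeHandlingMs, long beforeResponseMs) {
        this.beforeHandlingMs = beforeHandlingMs;
        this.beforeResponseMs = beforeResponseMs;
    }

    // quotaDelayMs: delay owed by the client, 0 if the quota is not violated.
    // largeRequest / largeResponse: assumed per-request-type hints about
    // which side would be expensive to hold in a delay queue.
    static QuotaDelayPlan plan(long quotaDelayMs, boolean largeRequest, boolean largeResponse) {
        if (quotaDelayMs <= 0) {
            return new QuotaDelayPlan(0, 0);
        }
        if (largeResponse && !largeRequest) {
            // e.g. topic metadata: delay first, build the response afterwards.
            return new QuotaDelayPlan(quotaDelayMs, 0);
        }
        // Large request with small response, or small/small ("either is fine").
        // Large/large also lands here, purely as a placeholder choice.
        return new QuotaDelayPlan(0, quotaDelayMs);
    }
}
```

A handler following the listed steps would call plan() right after parsing, wait beforeHandlingMs in the first queue, handle the request, then wait beforeResponseMs in the second queue before responding.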
> > >
> > > Thanks,
> > >
> > > Joel
> > >
> > > On Thu, Apr 16, 2015 at 02:56:23PM -0700, Jun Rao wrote:
> > > > The quota check for the fetch request is a bit different from the
> > > > produce request. I assume that for the fetch request, we will first
> > > > get an estimated fetch response size to do the quota check. There are
> > > > two things to think about. First, when we actually send the response,
> > > > we probably don't want to record the metric again since it will double
> > > > count. Second, the bytes that the fetch response actually sends could
> > > > be more than the estimate. This means that the metric may not be 100%
> > > > accurate. We may be able to limit the fetch size of each partition to
> > > > what's in the original estimate.
> > > >
> > > > For the produce request, I was thinking that another way to do this is
> > > > to first figure out the quota_timeout. Then wait in Purgatory for
> > > > quota_timeout with no key. If the request is not satisfied in
> > > > quota_timeout and (request_timeout > quota_timeout), wait in Purgatory
> > > > for (request_timeout - quota_timeout) with the original keys.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Apr 14, 2015 at 5:01 PM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > > > This is an implementation proposal for delaying requests in quotas
> > > > > using the current purgatory. I'll discuss the usage for produce and
> > > > > fetch requests separately.
> > > > >
> > > > > 1. Delayed Produce Requests - Here, the proposal is basically to
> > > > > reuse DelayedProduce objects and insert them into the purgatory with
> > > > > no watcher keys if the request is being throttled. The timeout used
> > > > > in the request should be the Max(quota_delay_time,
> > > > > replication_timeout).
> > > > > In most cases, the quota timeout should be greater than the existing
> > > > > timeout but in order to be safe, we can use the maximum of these
> > > > > values.
> > > > > Having no watch keys will allow the operation to be enqueued directly
> > > > > into the timer and will not add any overhead in terms of watching
> > > > > keys (which was a concern). In this case, having watch keys is not
> > > > > beneficial since the operation must be delayed for a fixed amount of
> > > > > time and there is no possibility for the operation to complete before
> > > > > the timeout i.e. tryComplete() can never return true before the
> > > > > timeout. On timeout, since the operation is a TimerTask, the timer
> > > > > will call run() which calls onComplete().
> > > > > In onComplete, the DelayedProduce can repeat the check in
> > > > > tryComplete() (only if acks=-1 whether all replicas fetched up to a
> > > > > certain offset) and return the response immediately.
> > > > >
> > > > > Code will be structured as follows in ReplicaManager:appendMessages()
> > > > >
> > > > > if(isThrottled) {
> > > > >   fetch = new DelayedProduce(timeout)
> > > > >   purgatory.tryCompleteElseWatch(fetch, Seq())
> > > > > }
> > > > > else if(delayedRequestRequired()) {
> > > > >   // Insert into purgatory with watched keys for unthrottled requests
> > > > > }
> > > > >
> > > > > In this proposal, we avoid adding unnecessary watches because there
> > > > > is no possibility of early completion and this avoids any potential
> > > > > performance penalties we were concerned about earlier.
> > > > >
> > > > > 2. Delayed Fetch Requests - Similarly, the proposal here is to reuse
> > > > > the DelayedFetch objects and insert them into the purgatory with no
> > > > > watcher keys if the request is throttled. Timeout used is the
> > > > > Max(quota_delay_time, max_wait_timeout).
> > > > > Having no watch keys provides the same benefits as
> > > > > described above. Upon timeout, the onComplete() is called and the
> > > > > operation proceeds normally i.e. perform a readFromLocalLog and
> > > > > return a response.
> > > > > The caveat here is that if the request is throttled but the throttle
> > > > > time is less than the max_wait timeout on the fetch request, the
> > > > > request will be delayed to a Max(quota_delay_time, max_wait_timeout).
> > > > > This may be more than strictly necessary (since we are not watching
> > > > > for satisfaction on any keys).
> > > > >
> > > > > I added some testcases to DelayedOperationTest to verify that it is
> > > > > possible to schedule operations with no watcher keys. By inserting
> > > > > elements with no watch keys, the purgatory simply becomes a delay
> > > > > queue. It may also make sense to add a new API to the purgatory
> > > > > called delayFor() that basically accepts an operation without any
> > > > > watch keys (Thanks for the suggestion Joel).
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Thanks,
> > > > > Aditya
> > > > >
> > > > > ________________________________________
> > > > > From: Guozhang Wang [wangg...@gmail.com]
> > > > > Sent: Monday, April 13, 2015 7:27 PM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > >
> > > > > I think KAFKA-2063 (bounding fetch response) is still under
> > > > > discussion, and may not make it in time with KAFKA-1927.
> > > > >
> > > > > On Thu, Apr 9, 2015 at 4:49 PM, Aditya Auradkar <
> > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > >
> > > > > > I think it's reasonable to batch the protocol changes together. In
> > > > > > addition to the protocol changes, is someone actively driving the
> > > > > > server side changes/KIP process for KAFKA-2063?
> > > > > >
> > > > > > Thanks,
> > > > > > Aditya
> > > > > >
> > > > > > ________________________________________
> > > > > > From: Jun Rao [j...@confluent.io]
> > > > > > Sent: Thursday, April 09, 2015 8:59 AM
> > > > > > To: dev@kafka.apache.org
> > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > >
> > > > > > Since we are also thinking about evolving the fetch request
> > > > > > protocol in KAFKA-2063 (bound fetch response size), perhaps it's
> > > > > > worth thinking through if we can just evolve the protocol once.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Wed, Apr 8, 2015 at 10:43 AM, Aditya Auradkar <
> > > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > > >
> > > > > > > Thanks for the detailed review. I've addressed your comments.
> > > > > > >
> > > > > > > For rejected alternatives, we've rejected per-partition
> > > > > > > distribution because we choose client based quotas where there is
> > > > > > > no notion of partitions. I've explained in a bit more detail in
> > > > > > > that section.
> > > > > > >
> > > > > > > Aditya
> > > > > > >
> > > > > > > ________________________________________
> > > > > > > From: Joel Koshy [jjkosh...@gmail.com]
> > > > > > > Sent: Wednesday, April 08, 2015 6:30 AM
> > > > > > > To: dev@kafka.apache.org
> > > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > >
> > > > > > > Thanks for updating the wiki. Looks great overall. Just a couple
> > > > > > > more comments:
> > > > > > >
> > > > > > > Client status code:
> > > > > > > - v0 requests -> current version (0) of those requests.
> > > > > > > - Fetch response has a throttled flag instead of throttle time - I
> > > > > > >   think you intended the latter.
> > > > > > > - Can you make it clear that the quota status is a new field
> > > > > > >   called throttleTimeMs (or equivalent).
> > > > > > >   It would help if some of
> > > > > > >   that is moved (or repeated) in compatibility/migration plan.
> > > > > > > - So you would need to upgrade brokers first, then the clients.
> > > > > > >   While upgrading the brokers (via a rolling bounce) the brokers
> > > > > > >   cannot start using the latest fetch-request version immediately
> > > > > > >   (for replica fetches). Since there will be older brokers in the
> > > > > > >   mix those brokers would not be able to read v1 fetch requests. So
> > > > > > >   all the brokers should be upgraded before switching to the latest
> > > > > > >   fetch request version. This is similar to what Gwen proposed in
> > > > > > >   KIP-2/KAFKA-1809 and I think we will need to use the
> > > > > > >   inter-broker protocol version config.
> > > > > > >
> > > > > > > Rejected alternatives-quota-distribution.B: notes that this is the
> > > > > > > most elegant model, but does not explain why it was rejected. I
> > > > > > > think this was because we would then need some sort of gossip
> > > > > > > between brokers since partitions are across the cluster. Can you
> > > > > > > confirm?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Joel
> > > > > > >
> > > > > > > On Wed, Apr 08, 2015 at 05:45:34AM +0000, Aditya Auradkar wrote:
> > > > > > > > Hey everyone,
> > > > > > > >
> > > > > > > > Following up after today's hangout. After discussing the client
> > > > > > > > side metrics piece internally, we've incorporated that section
> > > > > > > > into the KIP.
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > > > > > >
> > > > > > > > Since there appears to be sufficient consensus, I'm going to
> > > > > > > > start a voting thread.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Aditya
> > > > > > > > ________________________________________
> > > > > > > > From: Gwen Shapira [gshap...@cloudera.com]
> > > > > > > > Sent: Tuesday, April 07, 2015 11:31 AM
> > > > > > > > To: Sriharsha Chintalapani
> > > > > > > > Cc: dev@kafka.apache.org
> > > > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > >
> > > > > > > > Yeah, I was not suggesting adding auth to metrics - I think this
> > > > > > > > needlessly complicates everything.
> > > > > > > > But we need to assume that client developers will not have
> > > > > > > > access to the broker metrics (because in secure environment they
> > > > > > > > probably won't).
> > > > > > > >
> > > > > > > > Gwen
> > > > > > > >
> > > > > > > > On Tue, Apr 7, 2015 at 11:20 AM, Sriharsha Chintalapani <
> > > > > > > > ka...@harsha.io> wrote:
> > > > > > > >
> > > > > > > > > Having auth on top of metrics is going to be a lot more
> > > > > > > > > difficult. How are we going to restrict metrics reporters which
> > > > > > > > > run as part of the kafka server? They will have access to all
> > > > > > > > > the metrics and they can publish to ganglia etc. I look at the
> > > > > > > > > metrics as read-only info. As you said metrics for all the
> > > > > > > > > topics can be visible but what actions are we looking at that
> > > > > > > > > can be non-secure based on metrics alone? This probably can be
> > > > > > > > > part of KIP-11 discussion.
> > > > > > > > > Having said that it will be great if the throttling details
> > > > > > > > > can be exposed as part of the response to the client. Instead
> > > > > > > > > of looking at metrics, the client can depend on the response to
> > > > > > > > > slow down if it's being throttled.
> > > > > > > > > This allows the clients to be self-reliant based on the
> > > > > > > > > response.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Harsha
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On April 7, 2015 at 9:55:41 AM, Gwen Shapira (
> > > > > > > > > gshap...@cloudera.com) wrote:
> > > > > > > > >
> > > > > > > > > Re (1):
> > > > > > > > > We have no authorization story on the metrics collected by
> > > > > > > > > brokers, so I assume that access to broker metrics means
> > > > > > > > > knowing exactly which topics exist and their throughputs.
> > > > > > > > > (Prath and Don, correct me if I got it wrong...)
> > > > > > > > > Secure environments will strictly control access to this
> > > > > > > > > information, so I am pretty sure the client developers will
> > > > > > > > > not have access to server metrics at all.
> > > > > > > > >
> > > > > > > > > Gwen
> > > > > > > > >
> > > > > > > > > On Tue, Apr 7, 2015 at 7:41 AM, Jay Kreps <jay.kr...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Totally. But is that the only use? What I wanted to flesh
> > > > > > > > > > out was whether the goal was:
> > > > > > > > > > 1. Expose throttling in the client metrics
> > > > > > > > > > 2. Enable programmatic response (i.e. stop sending stuff or
> > > > > > > > > > something like that)
> > > > > > > > > >
> > > > > > > > > > I think I kind of understand (1) but let's get specific on
> > > > > > > > > > the metric we would be adding and what exactly you would
> > > > > > > > > > expose in a dashboard.
> > > > > > > > > > For example if the goal is just monitoring do I really want
> > > > > > > > > > a boolean flag for is_throttled or do I want to know how
> > > > > > > > > > much I am being throttled (i.e. throttle_pct might indicate
> > > > > > > > > > the percent of your request time that was due to throttling
> > > > > > > > > > or something like that)? If I am 1% throttled that may be
> > > > > > > > > > irrelevant but 99% throttled would be quite relevant? Not
> > > > > > > > > > sure I agree, just throwing that out there...
> > > > > > > > > >
> > > > > > > > > > For (2) the prior discussion seemed to kind of allude to
> > > > > > > > > > this but I can't really come up with a use case. Is there
> > > > > > > > > > one?
> > > > > > > > > >
> > > > > > > > > > If it is just (1) I think the question is whether it really
> > > > > > > > > > helps much to have the metric on the client vs the server. I
> > > > > > > > > > suppose this is a bit environment specific. If you have a
> > > > > > > > > > central metrics system it shouldn't make any difference, but
> > > > > > > > > > if you don't I suppose it does.
> > > > > > > > > >
> > > > > > > > > > -Jay
> > > > > > > > > >
> > > > > > > > > > On Mon, Apr 6, 2015 at 7:57 PM, Gwen Shapira <
> > > > > > > > > > gshap...@cloudera.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Here's a wild guess:
> > > > > > > > > > >
> > > > > > > > > > > An app developer included a Kafka Producer in his app, and
> > > > > > > > > > > is not happy with the throughput. He doesn't have
> > > > > > > > > > > visibility into the brokers since they are owned by a
> > > > > > > > > > > different team.
> > > > > > > > > > > Obviously the first instinct of a developer who knows that
> > > > > > > > > > > throttling exists is to blame throttling for any slowdown
> > > > > > > > > > > in the app.
> > > > > > > > > > > If he doesn't have a way to know from the responses
> > > > > > > > > > > whether or not his app is throttled, he may end up calling
> > > > > > > > > > > Aditya at 4am to ask "Hey, is my app throttled?".
> > > > > > > > > > >
> > > > > > > > > > > I assume Aditya is trying to avoid this scenario.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Apr 6, 2015 at 7:47 PM, Jay Kreps <
> > > > > > > > > > > jay.kr...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey Aditya,
> > > > > > > > > > > >
> > > > > > > > > > > > 2. I kind of buy it, but I really like to understand the
> > > > > > > > > > > > details of the use case before we make protocol changes.
> > > > > > > > > > > > What changes are you proposing in the clients for
> > > > > > > > > > > > monitoring and how would that be used?
> > > > > > > > > > > >
> > > > > > > > > > > > -Jay
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Apr 6, 2015 at 10:36 AM, Aditya Auradkar <
> > > > > > > > > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Jay,
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. At this time, the proposed response format changes
> > > > > > > > > > > > > are only for monitoring/informing clients.
> > > > > > > > > > > > > As Jun mentioned, we get instance level monitoring in
> > > > > > > > > > > > > this case since each instance that got throttled will
> > > > > > > > > > > > > have a metric confirming the same. Without client
> > > > > > > > > > > > > level monitoring for this, it's hard for application
> > > > > > > > > > > > > developers to find if they are being throttled since
> > > > > > > > > > > > > they will also have to be aware of all the brokers in
> > > > > > > > > > > > > the cluster. This is quite problematic for large
> > > > > > > > > > > > > clusters.
> > > > > > > > > > > > >
> > > > > > > > > > > > > It seems nice for app developers to not have to think
> > > > > > > > > > > > > about kafka internal metrics and only focus on the
> > > > > > > > > > > > > metrics exposed on their instances. Analogous to
> > > > > > > > > > > > > having client-side request latency metrics. Basically,
> > > > > > > > > > > > > we want an easy way for clients to be aware if they
> > > > > > > > > > > > > are being throttled.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 4. For purgatory v delay queue, I think we are on the
> > > > > > > > > > > > > same page. I feel it is nicer to use the purgatory but
> > > > > > > > > > > > > I'm happy to use a DelayQueue if there are performance
> > > > > > > > > > > > > implications. I don't know enough about the current
> > > > > > > > > > > > > and Yasuhiro's new implementation to be sure one way
> > > > > > > > > > > > > or the other.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Stepping back, I think these two things are the only
> > > > > > > > > > > > > remaining points of discussion within the current
> > > > > > > > > > > > > proposal. Any concerns if I started a voting thread on
> > > > > > > > > > > > > the proposal after the KIP discussion tomorrow?
> > > > > > > > > > > > > (assuming we reach consensus on these items)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Aditya
> > > > > > > > > > > > > ________________________________________
> > > > > > > > > > > > > From: Jay Kreps [jay.kr...@gmail.com]
> > > > > > > > > > > > > Sent: Saturday, April 04, 2015 1:36 PM
> > > > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hey Aditya,
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. For the return flag I'm not terribly particular. If
> > > > > > > > > > > > > we want to add it let's fully think through how it
> > > > > > > > > > > > > will be used. The only concern I have is adding to the
> > > > > > > > > > > > > protocol without really thinking through the use
> > > > > > > > > > > > > cases. So let's work out the APIs we want to add to
> > > > > > > > > > > > > the Java consumer and producer and the use cases for
> > > > > > > > > > > > > how clients will make use of these.
> > > > > > > > > > > > > For my part I actually don't see much use other than
> > > > > > > > > > > > > monitoring since it isn't an error condition to be at
> > > > > > > > > > > > > your quota. And if it is just monitoring I don't see a
> > > > > > > > > > > > > big enough difference between having the monitoring on
> > > > > > > > > > > > > the server-side versus in the clients to justify
> > > > > > > > > > > > > putting it in the protocol. But I think you guys may
> > > > > > > > > > > > > have other use cases in mind of how a client would
> > > > > > > > > > > > > make some use of this? Let's work that out. I also
> > > > > > > > > > > > > don't feel strongly about it--it wouldn't be *bad* to
> > > > > > > > > > > > > have the monitoring available on the client, just
> > > > > > > > > > > > > doesn't seem that much better.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 4. For the purgatory vs delay queue I think it is
> > > > > > > > > > > > > arguably nicer to reuse the purgatory; we just have to
> > > > > > > > > > > > > be ultra-conscious of efficiency. I think our goal is
> > > > > > > > > > > > > to turn quotas on across the board, so at LinkedIn
> > > > > > > > > > > > > that would mean potentially every request will need a
> > > > > > > > > > > > > small delay. I haven't worked out the efficiency
> > > > > > > > > > > > > implications of this choice, so as long as we do that
> > > > > > > > > > > > > I'm happy.
> > > > > > > > > > > > >
> > > > > > > > > > > > > -Jay
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Apr 3, 2015 at 1:10 PM, Aditya Auradkar <
> > > > > > > > > > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Some responses to Jay's points.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. Using commas - Cool.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. Adding return flag - I'm inclined to agree with
> > > > > > > > > > > > > > Joel that this is good to have in the initial
> > > > > > > > > > > > > > implementation.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 3. Config - +1. I'll remove it from the KIP. We can
> > > > > > > > > > > > > > discuss this in parallel.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 4. Purgatory vs Delay queue - I feel that it is
> > > > > > > > > > > > > > simpler to reuse the existing purgatories for both
> > > > > > > > > > > > > > delayed produce and fetch requests. IIUC, all we
> > > > > > > > > > > > > > need for quotas is a minWait parameter for
> > > > > > > > > > > > > > DelayedOperation (or something equivalent) since
> > > > > > > > > > > > > > there is already a max wait. The completion criteria
> > > > > > > > > > > > > > can check if minWait time has elapsed before
> > > > > > > > > > > > > > declaring the operation complete.
> > > > > > > > > > > > > > For this to impact performance, a significant number
> > > > > > > > > > > > > > of clients may need to exceed their quota at the
> > > > > > > > > > > > > > same time and even then I'm not very clear on the
> > > > > > > > > > > > > > scope of the impact. Two layers of delays might add
> > > > > > > > > > > > > > complexity to the implementation which I'm hoping to
> > > > > > > > > > > > > > avoid.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Aditya
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ________________________________________
> > > > > > > > > > > > > > From: Joel Koshy [jjkosh...@gmail.com]
> > > > > > > > > > > > > > Sent: Friday, April 03, 2015 12:48 PM
> > > > > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Aditya, thanks for the updated KIP and Jay/Jun
> > > > > > > > > > > > > > thanks for the comments. Couple of comments in-line:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. I would advocate for adding the return flag
> > > > > > > > > > > > > > > when we next bump the request format version just
> > > > > > > > > > > > > > > to avoid proliferation. I agree this is a good
> > > > > > > > > > > > > > > thing to know about, but at the moment I don't
> > > > > > > > > > > > > > > think we have a very well fleshed out idea of how
> > > > > > > > > > > > > > > the client would actually make use of this info.
> > > > > > > > > > > > > > > I
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm somewhat inclined to having something
> > > > > > > > > > > > > > appropriate off the bat - mainly because (i) clients
> > > > > > > > > > > > > > really should know that they have been throttled
> > > > > > > > > > > > > > (ii) a smart producer/consumer implementation would
> > > > > > > > > > > > > > want to know how much to back off. So perhaps this
> > > > > > > > > > > > > > and config-management should be moved to a separate
> > > > > > > > > > > > > > discussion, but it would be good to have this
> > > > > > > > > > > > > > discussion going and incorporated into the first
> > > > > > > > > > > > > > quota implementation.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. Config--I think we need to generalize the topic
> > > > > > > > > > > > > > > stuff so we can override at multiple levels. We
> > > > > > > > > > > > > > > have topic and client, but I suspect "user" and
> > > > > > > > > > > > > > > "broker" will also be important. I recommend we
> > > > > > > > > > > > > > > take config stuff out of this KIP since we really
> > > > > > > > > > > > > > > need to fully think through a proposal that will
> > > > > > > > > > > > > > > cover all these types of overrides.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > +1 - it is definitely orthogonal to the core quota
> > > > > > > > > > > > > > implementation (although necessary for its
> > > > > > > > > > > > > > operability).
Having a > > > > > > > > > config-related > > > > > > > > > > > > > > discussion in this KIP would only draw out the > > > discussion > > > > > > and > > > > > > > > > vote > > > > > > > > > > > > > > even if the core quota design looks good to > > everyone. > > > > > > > > > > > > > > > > > > > > > > > > > > > > So basically I think we can remove the portions on > > > > > dynamic > > > > > > > > > config > > > > > > > > > > as > > > > > > > > > > > > > > well as the response format but I really think we > > > should > > > > > > > close > > > > > > > > > on > > > > > > > > > > > > > > those while the implementation is in progress and > > > before > > > > > > > quotas > > > > > > > > > is > > > > > > > > > > > > > > officially released. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. Instead of using purgatories to implement the > > > delay > > > > > > > would > > > > > > > > > it > > > > > > > > > > > make > > > > > > > > > > > > > more > > > > > > > > > > > > > > > sense to just use a delay queue? I think all the > > > > > > additional > > > > > > > > > stuff > > > > > > > > > > > in > > > > > > > > > > > > > the > > > > > > > > > > > > > > > purgatory other than the delay queue doesn't make > > > sense > > > > > > as > > > > > > > the > > > > > > > > > > > quota > > > > > > > > > > > > > is a > > > > > > > > > > > > > > > hard N ms penalty with no chance of early > > > eviction. If > > > > > > > there > > > > > > > > > is > > > > > > > > > > no > > > > > > > > > > > > perf > > > > > > > > > > > > > > > penalty for the full purgatory that may be fine > > > (even > > > > > > > good) to > > > > > > > > > > > reuse, > > > > > > > > > > > > > > but I > > > > > > > > > > > > > > > haven't looked into that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > A simple delay queue sounds good - I think Aditya > > was > > > > > also > > > > > > > > > trying > > > > > > > > > > to > > > > > > > > > > > > > > avoid adding a new quota purgatory. 
i.e., it may be > > > > > > possible > > > > > > > to > > > > > > > > > use > > > > > > > > > > > > > > the existing purgatory instances to enforce quotas. > > > That > > > > > > may > > > > > > > be > > > > > > > > > > > > > > simpler, but would be incur a slight perf penalty > > if > > > too > > > > > > many > > > > > > > > > > clients > > > > > > > > > > > > > > are being throttled. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > Joel > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Jay > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Apr 3, 2015 at 10:45 AM, Aditya Auradkar > > < > > > > > > > > > > > > > > > aaurad...@linkedin.com.invalid> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> Update, I added a proposal on doing dynamic > > client > > > > > based > > > > > > > > > > > > configuration > > > > > > > > > > > > > > >> that can be used for quotas. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Please take a look and let me know if there are > > > any > > > > > > > concerns. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Thanks, > > > > > > > > > > > > > > >> Aditya > > > > > > > > > > > > > > >> ________________________________________ > > > > > > > > > > > > > > >> From: Aditya Auradkar > > > > > > > > > > > > > > >> Sent: Friday, April 03, 2015 10:10 AM > > > > > > > > > > > > > > >> To: dev@kafka.apache.org > > > > > > > > > > > > > > >> Subject: RE: [KIP-DISCUSSION] KIP-13 Quotas > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Thanks Jun. 
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Some thoughts:
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 10) I think it is better we throttle regardless of the produce/fetch version. This is a nice feature where clients can tell if they are being throttled or not. If we only throttle newer clients, then we have inconsistent behavior across clients in a multi-tenant cluster. Having quota metrics on the client side is also a nice incentive to upgrade client versions.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 11) I think we can call metric.record(fetchSize) before adding the delayedFetch request into the purgatory. This will give us the estimated delay of the request up-front. The timeout on the DelayedFetch is the Max(maxWait, quotaDelay). The DelayedFetch completion criteria can change a little to accommodate quotas.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> - I agree the quota code should return the estimated delay time in QuotaViolationException.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > > > >> Aditya
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > > > > >> From: Jun Rao [j...@confluent.io]
> > > > > > > > > > > > > > >> Sent: Friday, April 03, 2015 9:16 AM
> > > > > > > > > > > > > > >> To: dev@kafka.apache.org
> > > > > > > > > > > > > > >> Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Thanks for the update.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 10. About whether to return a new field in the response to indicate throttling. Earlier, the plan was to not change the response format and just have a metric on the broker to indicate whether a clientId is throttled or not. The issue is that we don't know whether a particular clientId instance is throttled or not (since there could be multiple clients with the same clientId). Your proposal of adding an isThrottled field in the response addresses that and seems better. Then, do we just throttle the new version of produce/fetch request or both the old and the new versions? Also, we probably still need a separate metric on the broker side to indicate whether a clientId is throttled or not.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 11. Just to clarify. For fetch requests, when will metric.record(fetchSize) be called? Is it when we are ready to send the fetch response (after minBytes and maxWait are satisfied)?
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> As an implementation detail, it may be useful for the quota code to return an estimated delay time (to bring the measurement within the limit) in QuotaViolationException.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> Jun
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> On Wed, Apr 1, 2015 at 3:27 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> > Hey everyone,
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > I've made changes to the KIP to capture our discussions over the last couple of weeks.
> > > > > > > > > > > > > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > I'll start a voting thread after people have had a chance to read/comment.
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > Thanks,
> > > > > > > > > > > > > > >> > Aditya
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > ________________________________________
> > > > > > > > > > > > > > >> > From: Steven Wu [stevenz...@gmail.com]
> > > > > > > > > > > > > > >> > Sent: Friday, March 20, 2015 9:14 AM
> > > > > > > > > > > > > > >> > To: dev@kafka.apache.org
> > > > > > > > > > > > > > >> > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > +1 on Jun's suggestion of maintaining one set/style of metrics at broker. In Netflix, we have to convert the yammer metrics to servo metrics at broker. It will be painful to learn that some metrics are in a different style and have to be handled differently.
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > On Fri, Mar 20, 2015 at 8:17 AM, Jun Rao <j...@confluent.io> wrote:
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > > Not so sure. People who use quota will definitely want to monitor the new metrics at the client id level. Then they will need to deal with those metrics differently from the rest of the metrics. It would be better if we can hide this complexity from the users.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Jun
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > On Thu, Mar 19, 2015 at 10:45 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > > Actually thinking again - since these will be a few new metrics at the client id level (bytes in and bytes out to start with) maybe it is fine to have the two types of metrics coexist and we can migrate the existing metrics in parallel.
> > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > >> > > > On Thursday, March 19, 2015, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > >> > > > > That is a valid concern but in that case I think it would be better to just migrate completely to the new metrics package first.
> > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > > On Thursday, March 19, 2015, Jun Rao <j...@confluent.io> wrote:
> > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> Hmm, I was thinking a bit differently on the metrics stuff. I think it would be confusing to have some metrics defined in the new metrics package while some others are defined in Coda Hale. Those metrics will look different (e.g., rates in Coda Hale will have special attributes such as 1-min-average). People may need different ways to export the metrics to external systems such as Graphite. So, instead of using the new metrics package on the broker, I was thinking that we can just implement a QuotaMetrics that wraps the Coda Hale metrics. The implementation can be the same as what's in the new metrics package.
> > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > >> > > > >> Thanks,
> > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > >> > > > >> Jun
> > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > >> > > > >> On Thu, Mar 19, 2015 at 8:09 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > >> > > > >> > Yeah, what I was saying was that we are blocked on picking an approach for metrics but not necessarily the full conversion. Clearly if we pick the new metrics package we would need to implement the two metrics we want to quota on. But the conversion of the remaining metrics can be done asynchronously.
> > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > >> > > > >> > -Jay
> > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > >> > > > >> > On Thu, Mar 19, 2015 at 5:56 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > >> > > > >> > > > in KAFKA-1930). I agree that this KIP doesn't need to block on the migration of the metrics package.
> > > > > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > > > > >> > > > >> > > Can you clarify the above? i.e., if we are going to quota on something then we would want to have migrated that metric over right? Or do you mean we don't need to complete the migration of all metrics to the metrics package right?
> > > > > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > > > > >> > > > >> > > I think most of us now feel that the delay + no error is a good approach, but it would be good to make sure everyone is on the same page.
> > > > > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > > > > >> > > > >> > > As Aditya requested a couple of days ago I think we should go over this at the next KIP hangout.
> > > > > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > > > > >> > > > >> > > Joel
> > > > > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > > > > >> > > > >> > > On Thu, Mar 19, 2015 at 09:24:09AM -0700, Jun Rao wrote:
> > > > > > > > > > > > > > >> > > > >> > > > 1. Delay + no error seems reasonable to me. However, I do feel that we need to give the client an indicator that it's being throttled, instead of doing this silently. For that, we probably need to evolve the produce/fetch protocol to include an extra status field in the response. We probably need to think more about whether we just want to return a simple status code (e.g., 1 = throttled) or a value that indicates how much is being throttled.
> > > > > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > > > > >> > > > >> > > > 2. We probably need to improve the histogram support in the new metrics package before we can use it more widely on the server side (left a comment in KAFKA-1930). I agree that this KIP doesn't need to block on the migration of the metrics package.
> > > > > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > > > > >> > > > >> > > > Thanks,
> > > > > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > > > > >> > > > >> > > > Jun
> > > > > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > > > > >> > > > >> > > > On Wed, Mar 18, 2015 at 4:02 PM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > > > > >> > > > >> > > > > Hey everyone,
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > Thanks for the great discussion. There are currently a few points on this KIP that need addressing and I want to make sure we are on the same page about those.
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > 1. Append and delay response vs delay and return error
> > > > > > > > > > > > > > >> > > > >> > > > > - I think we've discussed the pros and cons of each approach but haven't chosen an approach yet. Where does everyone stand on this issue?
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > 2. Metrics Migration and usage in quotas
> > > > > > > > > > > > > > >> > > > >> > > > > - The metrics library in clients has a notion of quotas that we should reuse. For that to happen, we need to migrate the server to the new metrics package.
> > > > > > > > > > > > > > >> > > > >> > > > > - Need more clarification on how to compute throttling time and windowing for quotas.
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > I'm going to start a new KIP to discuss metrics migration separately. That will also contain a section on quotas.
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > 3. Dynamic Configuration management - Being discussed in KIP-5. Basically we need something that will model default quotas and allow per-client overrides.
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > Is there something else that I'm missing?
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > Thanks,
> > > > > > > > > > > > > > >> > > > >> > > > > Aditya
> > > > > > > > > > > > > > >> > > > >> > > > > ________________________________________
> > > > > > > > > > > > > > >> > > > >> > > > > From: Jay Kreps [jay.kr...@gmail.com]
> > > > > > > > > > > > > > >> > > > >> > > > > Sent: Wednesday, March 18, 2015 2:10 PM
> > > > > > > > > > > > > > >> > > > >> > > > > To: dev@kafka.apache.org
> > > > > > > > > > > > > > >> > > > >> > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > Hey Steven,
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > The current proposal is actually to enforce quotas at the client/application level, NOT the topic level. So if you have a service with a few dozen instances the quota is against all of those instances added up across all their topics. So actually the effect would be the same either way but throttling gives the producer the choice of either blocking or dropping.
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > -Jay
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > On Tue, Mar 17, 2015 at 10:08 AM, Steven Wu <stevenz...@gmail.com> wrote:
> > > > > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > > Jay,
> > > > > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > > let's say an app produces to 10 different topics. one of the topics is sent from a library. due to whatever condition/bug, this lib starts to send messages over the quota. if we go with the delayed response approach, it will cause the whole shared RecordAccumulator buffer to be filled up. that will penalize the other 9 topics that are within the quota. that is the unfairness point that Ewen and I were trying to make.
> > > > > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > > > > >> > > > >> > > > > > if the broker just drops the msg and returns an error/status code that indicates the drop and why, then the producer can just move on and accept the drop. the shared buffer won't be saturated and the other 9 topics won't be penalized.

Thanks,
Steven

On Tue, Mar 17, 2015 at 9:44 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

Hey Steven,

It is true that hitting the quota will cause back-pressure on the
producer. But the solution is simple, a producer that wants to avoid
this should stay under its quota. In other words this is a contract
between the cluster and the client, with each side having something to
uphold. Quite possibly the same thing will happen in the absence of a
quota, a client that produces an unexpected amount of load will hit the
limits of the server and experience backpressure. Quotas just allow you
to set that same limit at something lower than 100% of all resources on
the server, which is useful for a shared cluster.

-Jay

On Mon, Mar 16, 2015 at 11:34 PM, Steven Wu <stevenz...@gmail.com> wrote:

wait. we create one kafka producer for each cluster. each cluster can
have many topics.
if the producer buffer got filled up due to a delayed response for one
throttled topic, won't that penalize other topics unfairly? it seems to
me that the broker should just return an error without delay.

sorry that I am chatting to myself :)

On Mon, Mar 16, 2015 at 11:29 PM, Steven Wu <stevenz...@gmail.com> wrote:

I think I can answer my own question. a delayed response will cause the
producer buffer to be full, which then results in either thread blocking
or message drop.

On Mon, Mar 16, 2015 at 11:24 PM, Steven Wu <stevenz...@gmail.com> wrote:

please correct me if I am missing sth here. I am not understanding how
throttling would work without cooperation/back-off from the producer.
the new Java producer supports a non-blocking API.
why would a delayed response be able to slow down the producer? the
producer will continue to fire async sends.

On Mon, Mar 16, 2015 at 10:58 PM, Guozhang Wang <wangg...@gmail.com> wrote:

I think we are really discussing two separate issues here:

1. Whether we should a) append-then-block-then-returnOKButThrottled or
b) block-then-returnFailDuetoThrottled for quota actions on produce
requests.

Both these approaches assume some kind of well-behavedness of the
clients: option a) assumes the client sets a proper timeout value and
can just ignore the "OKButThrottled" response, while option b) assumes
the client handles the "FailDuetoThrottled" appropriately. For any
malicious clients that, for example, just keep retrying either
intentionally or not, neither of these approaches is actually effective.

2. For "OKButThrottled" and "FailDuetoThrottled" responses, shall we
encode them as error codes or augment the protocol to use a separate
field indicating "status codes"?

Today we have already incorporated some status codes as error codes in
the responses, e.g. ReplicaNotAvailable in MetadataResponse. The pro of
this is of course using a single field for response status like the
HTTP status codes, while the con is that it requires clients to handle
the error codes carefully.

I think maybe we can actually extend the single-code approach to
overcome its drawbacks, that is, wrap the error codes semantics to the
users so that users do not need to handle the codes one-by-one. More
concretely, following Jay's example the client could write sth. like
this:

-----------------

if(error.isOK())
   // status code is good or the code can be simply ignored for this
   // request type, process the request
else if(error.needsRetry())
   // throttled, transient error, etc: retry
else if(error.isFatal())
   // non-retriable errors, etc: notify / terminate / other handling

-----------------

Only when the clients really want to handle, for example
FailDuetoThrottled status code specifically, it needs to:

if(error.isOK())
   // status code is good or the code can be simply ignored for this
   // request type, process the request
else if(error == FailDuetoThrottled )
   // throttled: log it
else if(error.needsRetry())
   // transient error, etc: retry
else if(error.isFatal())
   // non-retriable errors, etc: notify / terminate / other handling

-----------------

And for implementation we can probably group the codes accordingly like
HTTP status code such that we can do:

boolean Error.isOK() {
  return code < 300 && code >= 200;
}

Guozhang

On Mon, Mar 16, 2015 at 10:24 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote:

Agreed that trying to shoehorn non-error codes into the error field is
a bad idea.
It makes it *way* too easy to write code that looks (and should be)
correct but is actually incorrect. If necessary, I think it's much
better to spend a couple of extra bytes to encode that information
separately (a "status" or "warning" section of the response). An
indication that throttling is occurring is something I'd expect to be
indicated by a bit flag in the response rather than as an error code.

Gwen - I think an error code makes sense when the request actually
failed. Option B, which Jun was advocating, would have appended the
messages successfully. If the rate-limiting case you're talking about
had successfully committed the messages, I would say that's also a bad
use of error codes.

On Mon, Mar 16, 2015 at 10:16 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

We discussed an error code for rate-limiting (which I think made
sense), isn't it a similar case?
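
The "separate flag" idea in Ewen's reply can be sketched roughly as
follows. Everything here is hypothetical: the class, the field names,
the flag bit, and the code values are invented for the example and are
not part of Kafka's wire protocol. The point is only that the error
code keeps answering "did the request fail, and why", while throttling
is reported independently and can coexist with any error.

```java
// Hypothetical response shape for illustration; these names and values are
// assumptions, not Kafka's actual protocol fields.
final class ProduceResultSketch {
    static final short NONE = 0;          // assumed "no error" code
    static final int FLAG_THROTTLED = 1;  // assumed bit 0 of a separate status field

    final short errorCode;  // why the request failed, or NONE
    final int statusFlags;  // extra, non-error information about the response

    ProduceResultSketch(short errorCode, int statusFlags) {
        this.errorCode = errorCode;
        this.statusFlags = statusFlags;
    }

    boolean failed() {
        return errorCode != NONE;
    }

    boolean throttled() {
        return (statusFlags & FLAG_THROTTLED) != 0;
    }

    /** Error handling and throttle handling are decided independently. */
    static String describe(ProduceResultSketch r) {
        String outcome = r.failed() ? "error " + r.errorCode : "ok";
        return r.throttled() ? outcome + " (throttled)" : outcome;
    }

    public static void main(String[] args) {
        System.out.println(describe(new ProduceResultSketch(NONE, FLAG_THROTTLED)));
        System.out.println(describe(new ProduceResultSketch((short) 9, FLAG_THROTTLED)));
    }
}
```

With this shape a client that never looks at the flag still handles
errors exactly as before, and a response can be both failed and
throttled without needing a combined code.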

On Mon, Mar 16, 2015 at 10:10 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

My concern is that as soon as you start encoding non-error response
information into error codes the next question is what to do if two
such codes apply (i.e. you have a replica down and the response is
quota'd). I think I am trying to argue that error should mean "why we
failed your request", for which there will really only be one reason,
and any other useful information we want to send back is just another
field in the response.

-Jay

On Mon, Mar 16, 2015 at 9:51 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

I think it's not too late to reserve a set of error codes (200-299?)
for "non-error" codes.

It won't be backward compatible (i.e. clients that currently do "else
throw" will throw on non-errors), but perhaps it's worthwhile.
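
The backward-compatibility concern with a reserved range can be made
concrete with a small sketch. The code value 200 for a throttled
success, the helper names, and the assumption that 0 remains the
"no error" code are all hypothetical; the sketch only contrasts a
client written in the "else throw" style with one that knows about the
reserved 200-299 range.

```java
// Illustrative only: the code values and method names are assumptions made
// for this example, not existing Kafka error codes or client APIs.
final class StatusRangeSketch {
    static final short NONE = 0;                // assumed existing "no error" code
    static final short OK_BUT_THROTTLED = 200;  // hypothetical code in the reserved range

    /** True for "no error" and for anything in the reserved non-error range 200-299. */
    static boolean isNonError(short code) {
        return code == NONE || (code >= 200 && code < 300);
    }

    /** A client written before the range existed: any unknown code is fatal. */
    static String legacyHandle(short code) {
        if (code == NONE) {
            return "processed";
        }
        throw new IllegalStateException("unexpected error code " + code);
    }

    /** A client that understands the reserved range. */
    static String rangeAwareHandle(short code) {
        if (isNonError(code)) {
            return "processed";
        }
        throw new IllegalStateException("unexpected error code " + code);
    }

    public static void main(String[] args) {
        System.out.println(rangeAwareHandle(OK_BUT_THROTTLED));
        try {
            legacyHandle(OK_BUT_THROTTLED);
        } catch (IllegalStateException e) {
            System.out.println("legacy client threw: " + e.getMessage());
        }
    }
}
```

The legacy handler throws on a response that actually succeeded, which
is the incompatibility described above.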
>>
>> On Mon, Mar 16, 2015 at 9:42 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>
>>> Hey Jun,
>>>
>>> I'd really really really like to avoid that. Having just spent a bunch
>>> of time on the clients, using the error codes to encode other
>>> information about the response is super dangerous.
>>> The error handling is one of the hardest parts of the client (Guozhang
>>> chime in here).
>>>
>>> Generally the error handling looks like
>>>   if(error == none)
>>>     // good, process the request
>>>   else if(error == KNOWN_ERROR_1)
>>>     // handle known error 1
>>>   else if(error == KNOWN_ERROR_2)
>>>     // handle known error 2
>>>   else
>>>     throw Errors.forCode(error).exception(); // or some other default behavior
>>>
>>> This works because we have a convention that an error is something that
>>> prevented you from getting the response, so the default handling case is
>>> sane and forward compatible. It is tempting to use the error code to
>>> convey information in the success case.
>>> For example we could use error codes to encode whether quotas were
>>> enforced, whether the request was served out of cache, whether the
>>> stock market is up today, or whatever. The problem is that since these
>>> are not errors as far as the client is concerned it should not throw an
>>> exception but process the response, but now we created an explicit
>>> requirement that that error be handled explicitly since it is
>>> different. I really think that this kind of information is not an
>>> error, it is just information, and if we want it in the response we
>>> should do the right thing and add a new field to the response.
>>>
>>> I think you saw the Samza bug that was literally an example of this
>>> happening and leading to an infinite retry loop.
>>>
>>> Furthermore I really want to emphasize that hitting your quota in the
>>> design that Adi has proposed is actually not an error condition at all.
>>> It is totally reasonable in any bootstrap situation to intentionally
>>> want to run at the limit the system imposes on you.
>>>
>>> -Jay
>>>
>>> On Mon, Mar 16, 2015 at 4:27 PM, Jun Rao <j...@confluent.io> wrote:
>>>
>>>> It's probably useful for a client to know whether its requests are
>>>> throttled or not (e.g., for monitoring and alerting). From that
>>>> perspective, option B (delay the requests and return an error) seems
>>>> better.
>>>>
>>>> Thanks,
>>>>
>>>> Jun
>>>>
>>>> On Wed, Mar 4, 2015 at 3:51 PM, Aditya Auradkar
>>>> <aaurad...@linkedin.com.invalid> wrote:
>>>>
>>>>> Posted a KIP for quotas in Kafka.
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
>>>>>
>>>>> Appreciate any feedback.
>>>>>
>>>>> Aditya

--
Thanks,
Ewen

--
-- Guozhang
> > >> > > > > > > > > > > > > > > > > > > > > >> > > > >> > > > > > > > > > > > > > > > > > > > >> > > > >> > > > > > > > > > > > > > > > > > > >> > > > >> > > > > > > > > > > > > > > > > >> > > > >> > > > > > > > > > > > > > > > > >> > > > >> > > > > > > > > > > > > > > > >> > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > -- > > > > > > > > > > > > > > >> > > > > Sent from Gmail Mobile > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > -- > > > > > > > > > > > > > > >> > > > Sent from Gmail Mobile > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > -- Guozhang > > > > > > > > > > > > >