RE: [KIP-DISCUSSION] KIP-13 Quotas

Aditya Auradkar Thu, 16 Apr 2015 14:46:09 -0700

Hey Guozhang,

I don't think we should return an error if the request is satisfied after Y 
(throttling timeout) because it may cause the producer to think that the 
request was not ack'ed at all.


Aditya

________________________________________
From: Guozhang Wang [wangg...@gmail.com]
Sent: Thursday, April 16, 2015 9:06 AM
To: dev@kafka.apache.org
Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas

Hi Adi,

2. I assume you were saying "than strictly needed for replications" here?

Also the concern I have is around error code: today if the replication is
not finished within the replication timeout then the error code will be
set accordingly when it returns. Let's say if the produce request is not
satisfied after X (replication timeout) ms, but is satisfied after Y
(throttling timeout), should we still set the error code or not? I think it
is OK to just set NO_ERROR but we need to document such cases clearly for
quota actions mixed with ack = -1.

Guozhang

On Wed, Apr 15, 2015 at 4:23 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Thanks for the review Guozhang.
>
> 1. Agreed.
>
> 2. This proposal actually waits for the maximum of the 2 timeouts. This
> reduces implementation complexity at the cost of waiting longer than
> strictly needed for quotas. Note that this is only for the case where
> acks=-1.
>
> However we can solve this if it is a significant concern by adding watcher
> keys for all partitions (only if acks=-1). These are the keys we would
> normally add while waiting for acknowledgements. We can change the
> tryComplete() function to return false until 'quota_timeout' time has
> elapsed AND all the acknowledgements have been received.
>
> Thanks,
> Aditya
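
The completion rule described above (tryComplete() returns false until 'quota_timeout' has elapsed AND all acknowledgements have been received) can be sketched roughly as follows. This is only an illustration under stated assumptions: ThrottledProduceSketch, onAck() and the explicit nowMs argument are hypothetical names, not Kafka's DelayedProduce API.

```java
/** Hypothetical stand-in for a throttled DelayedProduce; not Kafka's actual class. */
class ThrottledProduceSketch {
    private final long createdMs;     // time the operation entered the purgatory
    private final long quotaDelayMs;  // 'quota_timeout' from the discussion
    private final int requiredAcks;   // acknowledgements needed when acks=-1
    private int receivedAcks = 0;

    ThrottledProduceSketch(long createdMs, long quotaDelayMs, int requiredAcks) {
        this.createdMs = createdMs;
        this.quotaDelayMs = quotaDelayMs;
        this.requiredAcks = requiredAcks;
    }

    /** Called (hypothetically) each time a replica acknowledges the append. */
    void onAck() {
        receivedAcks++;
    }

    /** Completes only when the quota delay has elapsed AND all acks have arrived. */
    boolean tryComplete(long nowMs) {
        boolean quotaElapsed = nowMs - createdMs >= quotaDelayMs;
        boolean acksReceived = receivedAcks >= requiredAcks;
        return quotaElapsed && acksReceived;
    }
}
```

With watcher keys in place, each acknowledgement would trigger another tryComplete() call, so the request could return as soon as both conditions hold instead of waiting for the maximum of the two timeouts.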
> ________________________________________
> From: Guozhang Wang [wangg...@gmail.com]
> Sent: Wednesday, April 15, 2015 3:42 PM
> To: dev@kafka.apache.org
> Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
>
> Thanks for the summary. A few comments below:
>
> 1. Say a produce request has replication timeout X, and upon finishing the
> local append it is determined to be throttled Y ms where Y > X, then after
> it has timed out in the purgatory after Y ms we should still check whether
> the #.acks has been fulfilled in order to set the correct error codes in the
> response.
>
> 2. I think it is actually common that the calculated throttle time Y is
> less than the replication timeout X, which will be a tricky case since we
> need to make sure 1) the request is held in the purgatory for at least Y
> ms, 2) after Y ms have elapsed, if the #.acks is fulfilled within X ms then
> set no-error-code and return immediately, 3) after X ms have elapsed, set
> timeout-error-code and return.
>
> Guozhang
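
The cases in points 1 and 2 above can be folded into one rule, sketched here in Java. This is an assumption-laden illustration rather than broker code: the class, the Error enum and the acksFulfilledAtMs parameter (Long.MAX_VALUE meaning "never") are hypothetical, and it follows the reading that a request satisfied after X but before Y is returned without an error.

```java
/** Hypothetical sketch of when a throttled acks=-1 produce request responds. */
class ThrottleOutcomeSketch {
    enum Error { NONE, REQUEST_TIMED_OUT }

    static final class Outcome {
        final long respondAtMs;
        final Error error;

        Outcome(long respondAtMs, Error error) {
            this.respondAtMs = respondAtMs;
            this.error = error;
        }
    }

    /**
     * @param replicationTimeoutMs X in the discussion
     * @param throttleMs           Y in the discussion
     * @param acksFulfilledAtMs    when all acks arrived, Long.MAX_VALUE if never
     */
    static Outcome resolve(long replicationTimeoutMs, long throttleMs, long acksFulfilledAtMs) {
        // The request never stays in the purgatory longer than max(X, Y).
        long deadlineMs = Math.max(replicationTimeoutMs, throttleMs);
        if (acksFulfilledAtMs <= deadlineMs) {
            // Hold for at least Y, then return as soon as the acks are in.
            return new Outcome(Math.max(throttleMs, acksFulfilledAtMs), Error.NONE);
        }
        return new Outcome(deadlineMs, Error.REQUEST_TIMED_OUT);
    }
}
```

For example, with X = 1000 and Y = 200, acks arriving at 500 ms give a response at 500 ms with no error, while acks that never arrive give a timeout error at 1000 ms.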
>
> On Tue, Apr 14, 2015 at 5:01 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > This is an implementation proposal for delaying requests in quotas using
> > the current purgatory. I'll discuss the usage for produce and fetch
> > requests separately.
> >
> > 1. Delayed Produce Requests - Here, the proposal is basically to reuse
> > DelayedProduce objects and insert them into the purgatory with no watcher
> > keys if the request is being throttled. The timeout used in the request
> > should be the Max(quota_delay_time, replication_timeout).
> > In most cases, the quota timeout should be greater than the existing
> > timeout but in order to be safe, we can use the maximum of these values.
> > Having no watch keys will allow the operation to be enqueued directly into
> > the timer and will not add any overhead in terms of watching keys (which
> > was a concern). In this case, having watch keys is not beneficial since the
> > operation must be delayed for a fixed amount of time and there is no
> > possibility for the operation to complete before the timeout, i.e.
> > tryComplete() can never return true before the timeout. On timeout, since
> > the operation is a TimerTask, the timer will call run() which calls
> > onComplete().
> > In onComplete, the DelayedProduce can repeat the check in tryComplete()
> > (only if acks=-1, whether all replicas have fetched up to a certain offset)
> > and return the response immediately.
> >
> > Code will be structured as follows in ReplicaManager:appendMessages()
> >
> > if (isThrottled) {
> >   // No watcher keys: the operation completes only when its timeout expires
> >   val delayedProduce = new DelayedProduce(timeout)
> >   purgatory.tryCompleteElseWatch(delayedProduce, Seq())
> > }
> > else if (delayedRequestRequired()) {
> >   // Insert into purgatory with watched keys for unthrottled requests
> > }
> >
> > In this proposal, we avoid adding unnecessary watches because there is no
> > possibility of early completion and this avoids any potential performance
> > penalties we were concerned about earlier.
> >
> > 2. Delayed Fetch Requests - Similarly, the proposal here is to reuse the
> > DelayedFetch objects and insert them into the purgatory with no watcher
> > keys if the request is throttled. Timeout used is the
> > Max(quota_delay_time, max_wait_timeout). Having no watch keys provides the
> > same benefits as described above. Upon timeout, onComplete() is called and
> > the operation proceeds normally, i.e. performs a readFromLocalLog and
> > returns a response.
> > The caveat here is that if the request is throttled but the throttle time
> > is less than the max_wait timeout on the fetch request, the request will be
> > delayed to a Max(quota_delay_time, max_wait_timeout). This may be more than
> > strictly necessary (since we are not watching for
> > satisfaction on any keys).
> >
> > I added some test cases to DelayedOperationTest to verify that it is
> > possible to schedule operations with no watcher keys. By inserting elements
> > with no watch keys, the purgatory simply becomes a delay queue. It may also
> > make sense to add a new API to the purgatory called
> > delayFor() that basically accepts an operation without any watch keys
> > (Thanks for the suggestion Joel).
> >
> > Thoughts?
> >
> > Thanks,
> > Aditya
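
To illustrate the "purgatory simply becomes a delay queue" point, here is a rough Java sketch built on java.util.concurrent.DelayQueue. It is not the purgatory implementation: DelayForSketch, DelayedOp and awaitNext() are hypothetical names, and only the delayFor() name is taken from the suggestion above.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical delay-only queue: operations have a timeout but no watch keys. */
class DelayForSketch {
    static final class DelayedOp implements Delayed {
        final String name;
        final long dueNanos; // absolute time at which the operation expires

        DelayedOp(String name, long delayMs) {
            this.name = name;
            this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final DelayQueue<DelayedOp> queue = new DelayQueue<>();

    /** Accepts an operation with no watch keys; it can only complete by timing out. */
    void delayFor(String name, long delayMs) {
        queue.add(new DelayedOp(name, delayMs));
    }

    /** Blocks until the earliest operation expires, then returns its name. */
    String awaitNext() {
        try {
            return queue.take().name;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

Operations come out in expiry order regardless of insertion order, which is all a throttled request with no chance of early completion needs.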
> >
> > ________________________________________
> > From: Guozhang Wang [wangg...@gmail.com]
> > Sent: Monday, April 13, 2015 7:27 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> >
> > I think KAFKA-2063 (bounding fetch response) is still under discussion, and
> > may not be merged in time with KAFKA-1927.
> >
> > On Thu, Apr 9, 2015 at 4:49 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > I think it's reasonable to batch the protocol changes together. In
> > > addition to the protocol changes, is someone actively driving the server
> > > side changes/KIP process for KAFKA-2063?
> > >
> > > Thanks,
> > > Aditya
> > >
> > > ________________________________________
> > > From: Jun Rao [j...@confluent.io]
> > > Sent: Thursday, April 09, 2015 8:59 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > >
> > > Since we are also thinking about evolving the fetch request protocol in
> > > KAFKA-2063 (bound fetch response size), perhaps it's worth thinking through
> > > if we can just evolve the protocol once.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Apr 8, 2015 at 10:43 AM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Thanks for the detailed review. I've addressed your comments.
> > > >
> > > > For rejected alternatives, we've rejected per-partition distribution
> > > > because we chose client-based quotas where there is no notion of
> > > > partitions. I've explained in a bit more detail in that section.
> > > >
> > > > Aditya
> > > >
> > > > ________________________________________
> > > > From: Joel Koshy [jjkosh...@gmail.com]
> > > > Sent: Wednesday, April 08, 2015 6:30 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > >
> > > > Thanks for updating the wiki. Looks great overall. Just a couple
> > > > more comments:
> > > >
> > > > Client status code:
> > > > - v0 requests -> current version (0) of those requests.
> > > > - Fetch response has a throttled flag instead of throttle time -  I
> > > >   think you intended the latter.
> > > > - Can you make it clear that the quota status is a new field
> > > >   called throttleTimeMs (or equivalent). It would help if some of
> > > >   that is moved (or repeated) in compatibility/migration plan.
> > > > - So you would need to upgrade brokers first, then the clients.
> > > >   While upgrading the brokers (via a rolling bounce) the brokers
> > > >   cannot start using the latest fetch-request version immediately
> > > >   (for replica fetches). Since there will be older brokers in the mix
> > > >   those brokers would not be able to read v1 fetch requests. So all
> > > >   the brokers should be upgraded before switching to the latest
> > > >   fetch request version. This is similar to what Gwen proposed in
> > > >   KIP-2/KAFKA-1809 and I think we will need to use the
> > > >   inter-broker protocol version config.
> > > >
> > > > Rejected alternatives-quota-distribution.B: notes that this is the
> > > > most elegant model, but does not explain why it was rejected. I
> > > > think this was because we would then need some sort of gossip
> > > > between brokers since partitions are across the cluster. Can you
> > > > confirm?
> > > >
> > > > Thanks,
> > > >
> > > > Joel
> > > >
> > > > On Wed, Apr 08, 2015 at 05:45:34AM +0000, Aditya Auradkar wrote:
> > > > > Hey everyone,
> > > > >
> > > > > > Following up after today's hangout. After discussing the client side
> > > > > > metrics piece internally, we've incorporated that section into the KIP.
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > > >
> > > > > > Since there appears to be sufficient consensus, I'm going to start a
> > > > > > voting thread.
> > > > >
> > > > > Thanks,
> > > > > Aditya
> > > > > ________________________________________
> > > > > From: Gwen Shapira [gshap...@cloudera.com]
> > > > > Sent: Tuesday, April 07, 2015 11:31 AM
> > > > > To: Sriharsha Chintalapani
> > > > > Cc: dev@kafka.apache.org
> > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > >
> > > > > > Yeah, I was not suggesting adding auth to metrics - I think this needlessly
> > > > > > complicates everything.
> > > > > > But we need to assume that client developers will not have access to the
> > > > > > broker metrics (because in a secure environment they probably won't).
> > > > >
> > > > > Gwen
> > > > >
> > > > > On Tue, Apr 7, 2015 at 11:20 AM, Sriharsha Chintalapani <
> > > ka...@harsha.io
> > > > >
> > > > > wrote:
> > > > >
> > > > > > > Having auth on top of metrics is going to be a lot more difficult. How are
> > > > > > > we going to restrict metrics reporters, which run as part of the kafka
> > > > > > > server? They will have access to all the metrics and they can publish to
> > > > > > > ganglia etc. I look at the metrics as read-only info. As you said, metrics
> > > > > > > for all the topics can be visible, but what actions are we looking at that
> > > > > > > can be non-secure based on metrics alone? This probably can be part of the
> > > > > > > KIP-11 discussion.
> > > > > > > Having said that, it will be great if the throttling details can be
> > > > > > > exposed as part of the response to the client. Instead of looking at
> > > > > > > metrics, the client can depend on the response to slow down if it's being
> > > > > > > throttled. This allows the clients to be self-reliant based on the
> > > > > > > response.
> > > > > >
> > > > > > --
> > > > > > Harsha
> > > > > >
> > > > > >
> > > > > > On April 7, 2015 at 9:55:41 AM, Gwen Shapira (
> > gshap...@cloudera.com)
> > > > > > wrote:
> > > > > >
> > > > > > Re (1):
> > > > > > > We have no authorization story on the metrics collected by brokers, so I
> > > > > > > assume that access to broker metrics means knowing exactly which topics
> > > > > > > exist and their throughputs. (Prath and Don, correct me if I got it
> > > > > > > wrong...)
> > > > > > > Secure environments will strictly control access to this information, so I
> > > > > > > am pretty sure the client developers will not have access to server metrics
> > > > > > > at all.
> > > > > >
> > > > > > Gwen
> > > > > >
> > > > > > On Tue, Apr 7, 2015 at 7:41 AM, Jay Kreps <jay.kr...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > > Totally. But is that the only use? What I wanted to flesh out was whether
> > > > > > > > the goal was:
> > > > > > > > 1. Expose throttling in the client metrics
> > > > > > > > 2. Enable programmatic response (i.e. stop sending stuff or something like
> > > > > > > > that)
> > > > > > > >
> > > > > > > > I think I kind of understand (1) but let's get specific on the metric we
> > > > > > > > would be adding and what exactly you would expose in a dashboard. For
> > > > > > > > example if the goal is just monitoring do I really want a boolean flag for
> > > > > > > > is_throttled or do I want to know how much I am being throttled (i.e.
> > > > > > > > throttle_pct might indicate the percent of your request time that was due
> > > > > > > > to throttling or something like that)? If I am 1% throttled that may be
> > > > > > > > irrelevant but 99% throttled would be quite relevant? Not sure I agree,
> > > > > > > > just throwing that out there...
> > > > > > > >
> > > > > > > > For (2) the prior discussion seemed to kind of allude to this but I can't
> > > > > > > > really come up with a use case. Is there one?
> > > > > > > >
> > > > > > > > If it is just (1) I think the question is whether it really helps much to
> > > > > > > > have the metric on the client vs the server. I suppose this is a bit
> > > > > > > > environment specific. If you have a central metrics system it shouldn't
> > > > > > > > make any difference, but if you don't I suppose it does.
> > > > > > >
> > > > > > > -Jay
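
The difference between a boolean is_throttled flag and a throttle_pct style metric can be made concrete with a small Java sketch. The class and method names are hypothetical, and it assumes the client learns a per-request throttle time from each response; nothing here is an existing Kafka client metric.

```java
/** Hypothetical client-side accumulator for throttling metrics. */
class ThrottleMetricsSketch {
    private long totalRequestMs = 0;
    private long totalThrottleMs = 0;

    /** Records one request: its total latency and the part spent throttled. */
    void record(long requestMs, long throttleMs) {
        if (throttleMs < 0 || throttleMs > requestMs) {
            throw new IllegalArgumentException("throttle time must be within the request time");
        }
        totalRequestMs += requestMs;
        totalThrottleMs += throttleMs;
    }

    /** The boolean view: was any request throttled at all? */
    boolean isThrottled() {
        return totalThrottleMs > 0;
    }

    /** The percentage view: share of request time that was due to throttling. */
    double throttlePct() {
        return totalRequestMs == 0 ? 0.0 : 100.0 * totalThrottleMs / totalRequestMs;
    }
}
```

A client throttled for 1 ms out of 100 ms and one throttled for 99 ms out of 100 ms both report isThrottled() as true, but throttlePct() separates them as 1% and 99%.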
> > > > > > >
> > > > > > > On Mon, Apr 6, 2015 at 7:57 PM, Gwen Shapira <
> > > gshap...@cloudera.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Here's a wild guess:
> > > > > > > >
> > > > > > > > > An app developer included a Kafka Producer in his app, and is not happy
> > > > > > > > > with the throughput. He doesn't have visibility into the brokers since they
> > > > > > > > > are owned by a different team. Obviously the first instinct of a developer
> > > > > > > > > who knows that throttling exists is to blame throttling for any slowdown in
> > > > > > > > > the app.
> > > > > > > > > If he doesn't have a way to know from the responses whether or not his app
> > > > > > > > > is throttled, he may end up calling Aditya at 4am asking "Hey, is my app
> > > > > > > > > throttled?".
> > > > > > > >
> > > > > > > > I assume Aditya is trying to avoid this scenario.
> > > > > > > >
> > > > > > > > On Mon, Apr 6, 2015 at 7:47 PM, Jay Kreps <
> jay.kr...@gmail.com
> > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey Aditya,
> > > > > > > > >
> > > > > > > > > 2. I kind of buy it, but I really like to understand the
> > > details
> > > > of
> > > > > > the
> > > > > > > > use
> > > > > > > > > case before we make protocol changes. What changes are you
> > > > proposing
> > > > > > in
> > > > > > > > the
> > > > > > > > > clients for monitoring and how would that be used?
> > > > > > > > >
> > > > > > > > > -Jay
> > > > > > > > >
> > > > > > > > > On Mon, Apr 6, 2015 at 10:36 AM, Aditya Auradkar <
> > > > > > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jay,
> > > > > > > > > >
> > > > > > > > > > 2. At this time, the proposed response format changes are
> > > only
> > > > for
> > > > > > > > > > monitoring/informing clients. As Jun mentioned, we get
> > > instance
> > > > > > level
> > > > > > > > > > monitoring in this case since each instance that got
> > > throttled
> > > > > > will
> > > > > > > > have
> > > > > > > > > a
> > > > > > > > > > metric confirming the same. Without client level
> monitoring
> > > for
> > > > > > this,
> > > > > > > > > it's
> > > > > > > > > > hard for application developers to find if they are being
> > > > > > throttled
> > > > > > > > since
> > > > > > > > > > they will also have to be aware of all the brokers in the
> > > > cluster.
> > > > > > > This
> > > > > > > > > is
> > > > > > > > > > quite problematic for large clusters.
> > > > > > > > > >
> > > > > > > > > > It seems nice for app developers to not have to think
> about
> > > > kafka
> > > > > > > > > internal
> > > > > > > > > > metrics and only focus on the metrics exposed on their
> > > > instances.
> > > > > > > > > Analogous
> > > > > > > > > > > to having client-side request latency metrics. Basically,
> we
> > > > want
> > > > > > an
> > > > > > > > easy
> > > > > > > > > > way for clients to be aware if they are being throttled.
> > > > > > > > > >
> > > > > > > > > > 4. For purgatory v delay queue, I think we are on the
> same
> > > > page. I
> > > > > > > feel
> > > > > > > > > it
> > > > > > > > > > is nicer to use the purgatory but I'm happy to use a
> > > > DelayQueue if
> > > > > > > > there
> > > > > > > > > > are performance implications. I don't know enough about
> the
> > > > > > current
> > > > > > > and
> > > > > > > > > > Yasuhiro's new implementation to be sure one way or the
> > > other.
> > > > > > > > > >
> > > > > > > > > > Stepping back, I think these two things are the only
> > > remaining
> > > > > > point
> > > > > > > of
> > > > > > > > > > discussion within the current proposal. Any concerns if I
> > > > started
> > > > > > a
> > > > > > > > > voting
> > > > > > > > > > thread on the proposal after the KIP discussion tomorrow?
> > > > > > (assuming
> > > > > > > we
> > > > > > > > > > reach consensus on these items)
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Aditya
> > > > > > > > > > ________________________________________
> > > > > > > > > > From: Jay Kreps [jay.kr...@gmail.com]
> > > > > > > > > > Sent: Saturday, April 04, 2015 1:36 PM
> > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > >
> > > > > > > > > > Hey Aditya,
> > > > > > > > > >
> > > > > > > > > > 2. For the return flag I'm not terribly particular. If we
> > > want
> > > > to
> > > > > > add
> > > > > > > > it
> > > > > > > > > > let's fully think through how it will be used. The only
> > > > concern I
> > > > > > > have
> > > > > > > > is
> > > > > > > > > > adding to the protocol without really thinking through
> the
> > > use
> > > > > > cases.
> > > > > > > > So
> > > > > > > > > > let's work out the APIs we want to add to the Java
> consumer
> > > and
> > > > > > > > producer
> > > > > > > > > > and the use cases for how clients will make use of these.
> > For
> > > > my
> > > > > > > part I
> > > > > > > > > > actually don't see much use other than monitoring since
> it
> > > > isn't
> > > > > > an
> > > > > > > > error
> > > > > > > > > > condition to be at your quota. And if it is just
> > monitoring I
> > > > > > don't
> > > > > > > > see a
> > > > > > > > > > big enough difference between having the monitoring on
> the
> > > > > > > server-side
> > > > > > > > > > versus in the clients to justify putting it in the
> > protocol.
> > > > But I
> > > > > > > > think
> > > > > > > > > > you guys may have other use cases in mind of how a client
> > > would
> > > > > > make
> > > > > > > > some
> > > > > > > > > > use of this? Let's work that out. I also don't feel
> > strongly
> > > > about
> > > > > > > > it--it
> > > > > > > > > > wouldn't be *bad* to have the monitoring available on the
> > > > client,
> > > > > > > just
> > > > > > > > > > doesn't seem that much better.
> > > > > > > > > >
> > > > > > > > > > > 4. For the purgatory vs delay queue I think it is arguably
> > nicer
> > > > to
> > > > > > > reuse
> > > > > > > > > the
> > > > > > > > > > purgatory we just have to be ultra-conscious of
> > efficiency. I
> > > > > > think
> > > > > > > our
> > > > > > > > > > goal is to turn quotas on across the board, so at
> LinkedIn
> > > that
> > > > > > would
> > > > > > > > > mean
> > > > > > > > > > potentially every request will need a small delay. I
> > haven't
> > > > > > worked
> > > > > > > out
> > > > > > > > > the
> > > > > > > > > > efficiency implications of this choice, so as long as we
> do
> > > > that
> > > > > > I'm
> > > > > > > > > happy.
> > > > > > > > > >
> > > > > > > > > > -Jay
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 3, 2015 at 1:10 PM, Aditya Auradkar <
> > > > > > > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Some responses to Jay's points.
> > > > > > > > > > >
> > > > > > > > > > > 1. Using commas - Cool.
> > > > > > > > > > >
> > > > > > > > > > > 2. Adding return flag - I'm inclined to agree with Joel
> > > that
> > > > > > this
> > > > > > > is
> > > > > > > > > good
> > > > > > > > > > > to have in the initial implementation.
> > > > > > > > > > >
> > > > > > > > > > > 3. Config - +1. I'll remove it from the KIP. We can
> > discuss
> > > > this
> > > > > > in
> > > > > > > > > > > parallel.
> > > > > > > > > > >
> > > > > > > > > > > 4. Purgatory vs Delay queue - I feel that it is simpler
> > to
> > > > reuse
> > > > > > > the
> > > > > > > > > > > existing purgatories for both delayed produce and fetch
> > > > > > requests.
> > > > > > > > IIUC,
> > > > > > > > > > all
> > > > > > > > > > > we need for quotas is a minWait parameter for
> > > > DelayedOperation
> > > > > > (or
> > > > > > > > > > > something equivalent) since there is already a max
> wait.
> > > The
> > > > > > > > completion
> > > > > > > > > > > criteria can check if minWait time has elapsed before
> > > > declaring
> > > > > > the
> > > > > > > > > > > operation complete. For this to impact performance, a
> > > > > > significant
> > > > > > > > > number
> > > > > > > > > > of
> > > > > > > > > > > clients may need to exceed their quota at the same time
> > and
> > > > even
> > > > > > > then
> > > > > > > > > I'm
> > > > > > > > > > > not very clear on the scope of the impact. Two layers
> of
> > > > delays
> > > > > > > might
> > > > > > > > > add
> > > > > > > > > > > complexity to the implementation which I'm hoping to
> > avoid.
> > > > > > > > > > >
> > > > > > > > > > > Aditya
> > > > > > > > > > >
> > > > > > > > > > > ________________________________________
> > > > > > > > > > > From: Joel Koshy [jjkosh...@gmail.com]
> > > > > > > > > > > Sent: Friday, April 03, 2015 12:48 PM
> > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > >
> > > > > > > > > > > Aditya, thanks for the updated KIP and Jay/Jun thanks
> for
> > > the
> > > > > > > > > > > comments. Couple of comments in-line:
> > > > > > > > > > >
> > > > > > > > > > > > 2. I would advocate for adding the return flag when
> we
> > > next
> > > > > > bump
> > > > > > > > the
> > > > > > > > > > > > request format version just to avoid proliferation. I
> > > agree
> > > > > > this
> > > > > > > > is a
> > > > > > > > > > > good
> > > > > > > > > > > > thing to know about, but at the moment I don't think
> we
> > > > have a
> > > > > > > very
> > > > > > > > > > well
> > > > > > > > > > > > > fleshed out idea of how the client would actually
> make
> > > use
> > > > of
> > > > > > > this
> > > > > > > > > > info.
> > > > > > > > > > > I
> > > > > > > > > > >
> > > > > > > > > > > I'm somewhat inclined to having something appropriate
> off
> > > the
> > > > > > bat -
> > > > > > > > > > > mainly because (i) clients really should know that they
> > > have
> > > > > > been
> > > > > > > > > > > throttled (ii) a smart producer/consumer implementation
> > > would
> > > > > > want
> > > > > > > to
> > > > > > > > > > > know how much to back off. So perhaps this and
> > > > config-management
> > > > > > > > > > > should be moved to a separate discussion, but it would
> be
> > > > good
> > > > > > to
> > > > > > > > have
> > > > > > > > > > > this discussion going and incorporated into the first
> > quota
> > > > > > > > > > > implementation.
> > > > > > > > > > >
> > > > > > > > > > > > 3. Config--I think we need to generalize the topic
> > stuff
> > > > so we
> > > > > > > can
> > > > > > > > > > > override
> > > > > > > > > > > > at multiple levels. We have topic and client, but I
> > > suspect
> > > > > > > "user"
> > > > > > > > > and
> > > > > > > > > > > > "broker" will also be important. I recommend we take
> > > config
> > > > > > stuff
> > > > > > > > out
> > > > > > > > > > of
> > > > > > > > > > > > this KIP since we really need to fully think through
> a
> > > > > > proposal
> > > > > > > > that
> > > > > > > > > > will
> > > > > > > > > > > > cover all these types of overrides.
> > > > > > > > > > >
> > > > > > > > > > > +1 - it is definitely orthogonal to the core quota
> > > > > > implementation
> > > > > > > > > > > (although necessary for its operability). Having a
> > > > > > config-related
> > > > > > > > > > > discussion in this KIP would only draw out the
> discussion
> > > and
> > > > > > vote
> > > > > > > > > > > even if the core quota design looks good to everyone.
> > > > > > > > > > >
> > > > > > > > > > > So basically I think we can remove the portions on
> > dynamic
> > > > > > config
> > > > > > > as
> > > > > > > > > > > well as the response format but I really think we
> should
> > > > close
> > > > > > on
> > > > > > > > > > > those while the implementation is in progress and
> before
> > > > quotas
> > > > > > is
> > > > > > > > > > > officially released.
> > > > > > > > > > >
> > > > > > > > > > > > 4. Instead of using purgatories to implement the
> delay
> > > > would
> > > > > > it
> > > > > > > > make
> > > > > > > > > > more
> > > > > > > > > > > > sense to just use a delay queue? I think all the
> > > additional
> > > > > > stuff
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > purgatory other than the delay queue doesn't make
> sense
> > > as
> > > > the
> > > > > > > > quota
> > > > > > > > > > is a
> > > > > > > > > > > > hard N ms penalty with no chance of early eviction.
> If
> > > > there
> > > > > > is
> > > > > > > no
> > > > > > > > > perf
> > > > > > > > > > > > penalty for the full purgatory that may be fine (even
> > > > good) to
> > > > > > > > reuse,
> > > > > > > > > > > but I
> > > > > > > > > > > > haven't looked into that.
> > > > > > > > > > >
> > > > > > > > > > > A simple delay queue sounds good - I think Aditya was
> > also
> > > > > > trying
> > > > > > > to
> > > > > > > > > > > avoid adding a new quota purgatory. i.e., it may be
> > > possible
> > > > to
> > > > > > use
> > > > > > > > > > > the existing purgatory instances to enforce quotas.
> That
> > > may
> > > > be
> > > > > > > > > > > > simpler, but would incur a slight perf penalty if
> too
> > > many
> > > > > > > clients
> > > > > > > > > > > are being throttled.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Joel
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > -Jay
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Apr 3, 2015 at 10:45 AM, Aditya Auradkar <
> > > > > > > > > > > > aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >> Update, I added a proposal on doing dynamic client
> > based
> > > > > > > > > configuration
> > > > > > > > > > > >> that can be used for quotas.
> > > > > > > > > > > >>
> > > > > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > > > > > > > > > >>
> > > > > > > > > > > >> Please take a look and let me know if there are any
> > > > concerns.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks,
> > > > > > > > > > > >> Aditya
> > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > >> From: Aditya Auradkar
> > > > > > > > > > > >> Sent: Friday, April 03, 2015 10:10 AM
> > > > > > > > > > > >> To: dev@kafka.apache.org
> > > > > > > > > > > >> Subject: RE: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks Jun.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Some thoughts:
> > > > > > > > > > > >>
> > > > > > > > > > > >> 10) I think it is better we throttle regardless of
> the
> > > > > > > > produce/fetch
> > > > > > > > > > > >> version. This is a nice feature where clients can
> tell
> > > if
> > > > > > they
> > > > > > > are
> > > > > > > > > > being
> > > > > > > > > > > >> throttled or not. If we only throttle newer clients,
> > > then
> > > > we
> > > > > > > have
> > > > > > > > > > > >> inconsistent behavior across clients in a
> multi-tenant
> > > > > > cluster.
> > > > > > > > > Having
> > > > > > > > > > > >> quota metrics on the client side is also a nice
> > > incentive
> > > > to
> > > > > > > > upgrade
> > > > > > > > > > > client
> > > > > > > > > > > >> versions.
> > > > > > > > > > > >>
> > > > > > > > > > > >> 11) I think we can call metric.record(fetchSize)
> > before
> > > > > > adding
> > > > > > > the
> > > > > > > > > > > >> delayedFetch request into the purgatory. This will
> > give
> > > us
> > > > > > the
> > > > > > > > > > estimated
> > > > > > > > > > > >> delay of the request up-front. The timeout on the
> > > > > > DelayedFetch
> > > > > > > is
> > > > > > > > > the
> > > > > > > > > > > >> Max(maxWait, quotaDelay). The DelayedFetch
> completion
> > > > > > criteria
> > > > > > > can
> > > > > > > > > > > change a
> > > > > > > > > > > > >> little to accommodate quotas.
> > > > > > > > > > > >>
> > > > > > > > > > > >> - I agree the quota code should return the estimated
> > > delay
> > > > > > time
> > > > > > > in
> > > > > > > > > > > >> QuotaViolationException.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks,
> > > > > > > > > > > >> Aditya
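
Point 11 above (record the fetch size up-front, get an estimated delay, and use Max(maxWait, quotaDelay) as the DelayedFetch timeout) might look roughly like this in Java. The delay formula is an assumption made for illustration, namely the extra time needed for the bytes seen in the current window to average out to the quota; it is not taken from the KIP, and all names are hypothetical.

```java
/** Hypothetical sketch of estimating a quota delay before enqueueing a fetch. */
class QuotaDelaySketch {
    /**
     * Assumed formula: wait until bytesInWindow spread over (windowMs + delay)
     * no longer exceeds quotaBytesPerSec. Returns 0 when already within quota.
     */
    static long estimateDelayMs(long bytesInWindow, long windowMs, long quotaBytesPerSec) {
        // Time the recorded bytes would take if sent exactly at the quota rate.
        long msNeededAtQuota = bytesInWindow * 1000 / quotaBytesPerSec;
        return Math.max(0L, msNeededAtQuota - windowMs);
    }

    /** Timeout for the delayed fetch: Max(maxWait, quotaDelay) as described above. */
    static long delayedFetchTimeoutMs(long maxWaitMs, long quotaDelayMs) {
        return Math.max(maxWaitMs, quotaDelayMs);
    }
}
```

For instance, 2,000,000 bytes recorded in a 1000 ms window against a 1,000,000 bytes/sec quota yields an estimated delay of 1000 ms, which would then be compared with the request's maxWait.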
> > > > > > > > > > > >>
> > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > >> From: Jun Rao [j...@confluent.io]
> > > > > > > > > > > >> Sent: Friday, April 03, 2015 9:16 AM
> > > > > > > > > > > >> To: dev@kafka.apache.org
> > > > > > > > > > > >> Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks for the update.
> > > > > > > > > > > >>
> > > > > > > > > > > >> 10. About whether to return a new field in the
> > response
> > > to
> > > > > > > > indicate
> > > > > > > > > > > >> throttling. Earlier, the plan was to not change the
> > > > response
> > > > > > > > format
> > > > > > > > > > and
> > > > > > > > > > > >> just have a metric on the broker to indicate
> whether a
> > > > > > clientId
> > > > > > > is
> > > > > > > > > > > >> throttled or not. The issue is that we don't know
> > > whether
> > > > a
> > > > > > > > > particular
> > > > > > > > > > > >> clientId instance is throttled or not (since there
> > could
> > > > be
> > > > > > > > multiple
> > > > > > > > > > > >> clients with the same clientId). Your proposal of
> > adding
> > > > an
> > > > > > > > > > isThrottled
> > > > > > > > > > > > >> field in the response addresses this and seems better.
> > Then,
> > > > do we
> > > > > > > just
> > > > > > > > > > > throttle
> > > > > > > > > > > >> the new version of produce/fetch request or both the
> > old
> > > > and
> > > > > > the
> > > > > > > > new
> > > > > > > > > > > >> versions? Also, we probably still need a separate
> > metric
> > > > on
> > > > > > the
> > > > > > > > > broker
> > > > > > > > > > > side
> > > > > > > > > > > >> to indicate whether a clientId is throttled or not.
> > > > > > > > > > > >>
> > > > > > > > > > > >> 11. Just to clarify. For fetch requests, when will
> > > > > > > > > > > metric.record(fetchSize)
> > > > > > > > > > > >> be called? Is it when we are ready to send the fetch
> > > > response
> > > > > > > > (after
> > > > > > > > > > > >> minBytes and maxWait are satisfied)?
> > > > > > > > > > > >>
> > > > > > > > > > > >> As an implementation detail, it may be useful for
> the
> > > > quota
> > > > > > code
> > > > > > > > to
> > > > > > > > > > > return
> > > > > > > > > > > >> an estimated delay time (to bring the measurement
> > within
> > > > the
> > > > > > > > limit)
> > > > > > > > > in
> > > > > > > > > > > >> QuotaViolationException.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks,
> > > > > > > > > > > >>
> > > > > > > > > > > >> Jun
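A minimal sketch of the delay estimate suggested above, assuming the quota is a byte rate measured over one fixed window. The class names, the exception type, and the formula are illustrative assumptions and are not taken from Kafka's code. The idea: if the observed rate R exceeds the quota Q over a window of W ms, then waiting roughly W * (R - Q) / Q ms would bring the average rate back down to Q.

```java
// Hypothetical exception type for illustration; not Kafka's QuotaViolationException.
final class SketchQuotaViolationException extends RuntimeException {
    final long delayMs; // estimated time the caller should wait

    SketchQuotaViolationException(String clientId, long delayMs) {
        super("quota exceeded for clientId=" + clientId
                + ", suggested delay " + delayMs + " ms");
        this.delayMs = delayMs;
    }
}

final class ThrottleDelaySketch {
    // Returns 0 when the client is within quota, otherwise the extra time
    // (in ms) needed for the windowed average to fall back to the quota.
    static long delayMs(long bytesInWindow, long windowMs, double quotaBytesPerSec) {
        if (windowMs <= 0 || quotaBytesPerSec <= 0) {
            throw new IllegalArgumentException("window and quota must be positive");
        }
        double observedBytesPerSec = bytesInWindow * 1000.0 / windowMs;
        if (observedBytesPerSec <= quotaBytesPerSec) {
            return 0L;
        }
        double delay = (observedBytesPerSec - quotaBytesPerSec) / quotaBytesPerSec * windowMs;
        return (long) Math.ceil(delay);
    }

    // Throws the sketch exception, carrying the estimated delay, on violation.
    static void check(String clientId, long bytesInWindow, long windowMs,
                      double quotaBytesPerSec) {
        long delay = delayMs(bytesInWindow, windowMs, quotaBytesPerSec);
        if (delay > 0) {
            throw new SketchQuotaViolationException(clientId, delay);
        }
    }

    public static void main(String[] args) {
        // 15,000 bytes in a 10 s window against a 1,000 B/s quota.
        System.out.println(delayMs(15_000, 10_000, 1_000.0));
    }
}
```

Carrying the delay in the exception lets the caller that catches it decide how long to hold the response, without recomputing the rate.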
> > > > > > > > > > > >>
> > > > > > > > > > > >> On Wed, Apr 1, 2015 at 3:27 PM, Aditya Auradkar <
> > > > > > > > > > > >> aaurad...@linkedin.com.invalid> wrote:
> > > > > > > > > > > >>
> > > > > > > > > > > >> > Hey everyone,
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > I've made changes to the KIP to capture our
> > > discussions
> > > > > > over
> > > > > > > the
> > > > > > > > > > last
> > > > > > > > > > > >> > couple of weeks.
> > > > > > > > > > > >> >
> > > > > > > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > I'll start a voting thread after people have had a
> > > > chance
> > > > > > to
> > > > > > > > > > > >> read/comment.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Thanks,
> > > > > > > > > > > >> > Aditya
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > ________________________________________
> > > > > > > > > > > >> > From: Steven Wu [stevenz...@gmail.com]
> > > > > > > > > > > >> > Sent: Friday, March 20, 2015 9:14 AM
> > > > > > > > > > > >> > To: dev@kafka.apache.org
> > > > > > > > > > > >> > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > +1 on Jun's suggestion of maintaining one
> set/style
> > of
> > > > > > metrics
> > > > > > > > at
> > > > > > > > > > > broker.
> > > > > > > > > > > >> > In Netflix, we have to convert the yammer metrics
> to
> > > > servo
> > > > > > > > metrics
> > > > > > > > > > at
> > > > > > > > > > > >> > broker. it will be painful to know some metrics
> are
> > > in a
> > > > > > > > different
> > > > > > > > > > > style
> > > > > > > > > > > >> > and get to be handled differently.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > On Fri, Mar 20, 2015 at 8:17 AM, Jun Rao <
> > > > j...@confluent.io>
> > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > > Not so sure. People who use quota will
> definitely
> > > > want to
> > > > > > > > > monitor
> > > > > > > > > > > the
> > > > > > > > > > > >> new
> > > > > > > > > > > >> > > metrics at the client id level. Then they will
> > need
> > > to
> > > > > > deal
> > > > > > > > with
> > > > > > > > > > > those
> > > > > > > > > > > >> > > metrics differently from the rest of the
> metrics.
> > It
> > > > > > would
> > > > > > > be
> > > > > > > > > > > better if
> > > > > > > > > > > >> > we
> > > > > > > > > > > >> > > can hide this complexity from the users.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > Jun
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > On Thu, Mar 19, 2015 at 10:45 PM, Joel Koshy <
> > > > > > > > > jjkosh...@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > >> > wrote:
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > > Actually thinking again - since these will be
> a
> > > few
> > > > new
> > > > > > > > > metrics
> > > > > > > > > > at
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > > client id level (bytes in and bytes out to
> start
> > > > with)
> > > > > > > maybe
> > > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > >> fine
> > > > > > > > > > > >> > > to
> > > > > > > > > > > > >> > > > have the two types of metrics coexist and we
> can
> > > > migrate
> > > > > > > the
> > > > > > > > > > > existing
> > > > > > > > > > > >> > > > metrics in parallel.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > On Thursday, March 19, 2015, Joel Koshy <
> > > > > > > > jjkosh...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > > That is a valid concern but in that case I
> > think
> > > > it
> > > > > > > would
> > > > > > > > be
> > > > > > > > > > > better
> > > > > > > > > > > >> > to
> > > > > > > > > > > >> > > > > just migrate completely to the new metrics
> > > package
> > > > > > > first.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > On Thursday, March 19, 2015, Jun Rao <j...@confluent.io> wrote:
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >> Hmm, I was thinking a bit differently on
> the
> > > > metrics
> > > > > > > > > stuff. I
> > > > > > > > > > > >> think
> > > > > > > > > > > >> > it
> > > > > > > > > > > >> > > > >> would be confusing to have some metrics
> > defined
> > > > in
> > > > > > the
> > > > > > > > new
> > > > > > > > > > > metrics
> > > > > > > > > > > >> > > > package
> > > > > > > > > > > >> > > > >> while some others defined in Coda Hale.
> Those
> > > > > > metrics
> > > > > > > > will
> > > > > > > > > > look
> > > > > > > > > > > >> > > > different
> > > > > > > > > > > >> > > > >> (e.g., rates in Coda Hale will have special
> > > > > > attributes
> > > > > > > > such
> > > > > > > > > > as
> > > > > > > > > > > >> > > > >> 1-min-average). People may need different
> > ways
> > > to
> > > > > > > export
> > > > > > > > > the
> > > > > > > > > > > >> metrics
> > > > > > > > > > > >> > > to
> > > > > > > > > > > >> > > > >> external systems such as Graphite. So,
> > instead
> > > of
> > > > > > using
> > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > > >> > > metrics
> > > > > > > > > > > >> > > > >> package on the broker, I was thinking that
> we
> > > can
> > > > > > just
> > > > > > > > > > > implement a
> > > > > > > > > > > >> > > > >> QuotaMetrics that wraps the Coda Hale
> > metrics.
> > > > The
> > > > > > > > > > > implementation
> > > > > > > > > > > >> > can
> > > > > > > > > > > >> > > be
> > > > > > > > > > > >> > > > >> the same as what's in the new metrics
> > package.
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> Thanks,
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> Jun
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> On Thu, Mar 19, 2015 at 8:09 PM, Jay Kreps
> <
> > > > > > > > > > > jay.kr...@gmail.com>
> > > > > > > > > > > >> > > wrote:
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > >> > > > >> > Yeah, what I was saying was that we are blocked
> > on
> > > > > > picking
> > > > > > > an
> > > > > > > > > > > approach
> > > > > > > > > > > >> > for
> > > > > > > > > > > >> > > > >> > metrics but not necessarily the full
> > > > conversion.
> > > > > > > > Clearly
> > > > > > > > > if
> > > > > > > > > > > we
> > > > > > > > > > > >> > pick
> > > > > > > > > > > >> > > > the
> > > > > > > > > > > >> > > > >> new
> > > > > > > > > > > >> > > > >> > metrics package we would need to
> implement
> > > the
> > > > two
> > > > > > > > > metrics
> > > > > > > > > > we
> > > > > > > > > > > >> want
> > > > > > > > > > > >> > > to
> > > > > > > > > > > >> > > > >> quota
> > > > > > > > > > > >> > > > >> > on. But the conversion of the remaining
> > > metrics
> > > > > > can
> > > > > > > be
> > > > > > > > > done
> > > > > > > > > > > >> > > > >> asynchronously.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > -Jay
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > On Thu, Mar 19, 2015 at 5:56 PM, Joel
> > Koshy <
> > > > > > > > > > > >> jjkosh...@gmail.com>
> > > > > > > > > > > >> > > > >> wrote:
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > > > in KAFKA-1930). I agree that this KIP
> > > > doesn't
> > > > > > > need
> > > > > > > > to
> > > > > > > > > > > block
> > > > > > > > > > > >> on
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > migration of the metrics package.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Can you clarify the above? i.e., if we
> > are
> > > > going
> > > > > > to
> > > > > > > > > quota
> > > > > > > > > > > on
> > > > > > > > > > > >> > > > something
> > > > > > > > > > > >> > > > >> > > then we would want to have migrated
> that
> > > > metric
> > > > > > > over
> > > > > > > > > > > right? Or
> > > > > > > > > > > >> > do
> > > > > > > > > > > >> > > > you
> > > > > > > > > > > >> > > > >> > > mean we don't need to complete the
> > > migration
> > > > of
> > > > > > all
> > > > > > > > > > > metrics to
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> > > metrics package right?
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > I think most of us now feel that the
> > delay
> > > +
> > > > no
> > > > > > > error
> > > > > > > > > is
> > > > > > > > > > a
> > > > > > > > > > > >> good
> > > > > > > > > > > >> > > > >> > > approach, but it would be good to make
> > sure
> > > > > > > everyone
> > > > > > > > is
> > > > > > > > > > on
> > > > > > > > > > > the
> > > > > > > > > > > >> > > same
> > > > > > > > > > > >> > > > >> > > page.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > As Aditya requested a couple of days
> ago
> > I
> > > > think
> > > > > > we
> > > > > > > > > > should
> > > > > > > > > > > go
> > > > > > > > > > > >> > over
> > > > > > > > > > > >> > > > >> > > this at the next KIP hangout.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Joel
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > On Thu, Mar 19, 2015 at 09:24:09AM
> -0700,
> > > Jun
> > > > > > Rao
> > > > > > > > > wrote:
> > > > > > > > > > > >> > > > >> > > > 1. Delay + no error seems reasonable
> to
> > > me.
> > > > > > > > However,
> > > > > > > > > I
> > > > > > > > > > do
> > > > > > > > > > > >> feel
> > > > > > > > > > > >> > > > that
> > > > > > > > > > > >> > > > >> we
> > > > > > > > > > > >> > > > >> > > need
> > > > > > > > > > > >> > > > >> > > > to give the client an indicator that
> > it's
> > > > > > being
> > > > > > > > > > > throttled,
> > > > > > > > > > > >> > > instead
> > > > > > > > > > > >> > > > >> of
> > > > > > > > > > > >> > > > >> > > doing
> > > > > > > > > > > >> > > > >> > > > this silently. For that, we probably
> > need
> > > > to
> > > > > > > evolve
> > > > > > > > > the
> > > > > > > > > > > >> > > > >> produce/fetch
> > > > > > > > > > > >> > > > >> > > > protocol to include an extra status
> > field
> > > > in
> > > > > > the
> > > > > > > > > > > response.
> > > > > > > > > > > >> We
> > > > > > > > > > > >> > > > >> probably
> > > > > > > > > > > >> > > > >> > > need
> > > > > > > > > > > >> > > > >> > > > to think more about whether we just
> > want
> > > to
> > > > > > > return
> > > > > > > > a
> > > > > > > > > > > simple
> > > > > > > > > > > >> > > status
> > > > > > > > > > > >> > > > >> code
> > > > > > > > > > > >> > > > >> > > > (e.g., 1 = throttled) or a value that
> > > > > > indicates
> > > > > > > how
> > > > > > > > > > much
> > > > > > > > > > > is
> > > > > > > > > > > >> > > being
> > > > > > > > > > > >> > > > >> > > throttled.
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > 2. We probably need to improve the
> > > > histogram
> > > > > > > > support
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > >> > new
> > > > > > > > > > > >> > > > >> metrics
> > > > > > > > > > > >> > > > >> > > > package before we can use it more
> > widely
> > > on
> > > > > > the
> > > > > > > > > server
> > > > > > > > > > > side
> > > > > > > > > > > >> > > (left
> > > > > > > > > > > >> > > > a
> > > > > > > > > > > >> > > > >> > > comment
> > > > > > > > > > > >> > > > >> > > > in KAFKA-1930). I agree that this KIP
> > > > doesn't
> > > > > > > need
> > > > > > > > to
> > > > > > > > > > > block
> > > > > > > > > > > >> on
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > migration of the metrics package.
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > Thanks,
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > Jun
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > On Wed, Mar 18, 2015 at 4:02 PM,
> Aditya
> > > > > > Auradkar
> > > > > > > <
> > > > > > > > > > > >> > > > >> > > > aaurad...@linkedin.com.invalid>
> wrote:
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > > Hey everyone,
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > Thanks for the great discussion.
> > There
> > > > are
> > > > > > > > > currently
> > > > > > > > > > a
> > > > > > > > > > > few
> > > > > > > > > > > >> > > > points
> > > > > > > > > > > >> > > > >> on
> > > > > > > > > > > >> > > > >> > > this
> > > > > > > > > > > >> > > > >> > > > > KIP that need addressing and I want
> > to
> > > > make
> > > > > > > sure
> > > > > > > > we
> > > > > > > > > > > are on
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> same
> > > > > > > > > > > >> > > > >> > > page
> > > > > > > > > > > >> > > > >> > > > > about those.
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > 1. Append and delay response vs
> delay
> > > and
> > > > > > > return
> > > > > > > > > > error
> > > > > > > > > > > >> > > > >> > > > > - I think we've discussed the pros
> > and
> > > > cons
> > > > > > of
> > > > > > > > each
> > > > > > > > > > > >> approach
> > > > > > > > > > > >> > > but
> > > > > > > > > > > >> > > > >> > > haven't
> > > > > > > > > > > >> > > > >> > > > > chosen an approach yet. Where does
> > > > everyone
> > > > > > > stand
> > > > > > > > > on
> > > > > > > > > > > this
> > > > > > > > > > > >> > > issue?
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > 2. Metrics Migration and usage in
> > > quotas
> > > > > > > > > > > >> > > > >> > > > > - The metrics library in clients
> has
> > a
> > > > > > notion
> > > > > > > of
> > > > > > > > > > quotas
> > > > > > > > > > > >> that
> > > > > > > > > > > >> > > we
> > > > > > > > > > > >> > > > >> > should
> > > > > > > > > > > >> > > > >> > > > > reuse. For that to happen, we need
> to
> > > > > > migrate
> > > > > > > the
> > > > > > > > > > > server
> > > > > > > > > > > >> to
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> new
> > > > > > > > > > > >> > > > >> > > metrics
> > > > > > > > > > > >> > > > >> > > > > package.
> > > > > > > > > > > >> > > > >> > > > > - Need more clarification on how to
> > > > compute
> > > > > > > > > > throttling
> > > > > > > > > > > >> time
> > > > > > > > > > > >> > > and
> > > > > > > > > > > >> > > > >> > > windowing
> > > > > > > > > > > >> > > > >> > > > > for quotas.
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > I'm going to start a new KIP to
> > discuss
> > > > > > metrics
> > > > > > > > > > > migration
> > > > > > > > > > > >> > > > >> separately.
> > > > > > > > > > > >> > > > >> > > That
> > > > > > > > > > > >> > > > >> > > > > will also contain a section on
> > quotas.
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > 3. Dynamic Configuration
> management -
> > > > Being
> > > > > > > > > discussed
> > > > > > > > > > > in
> > > > > > > > > > > >> > > KIP-5.
> > > > > > > > > > > >> > > > >> > > Basically
> > > > > > > > > > > >> > > > >> > > > > we need something that will model
> > > default
> > > > > > > quotas
> > > > > > > > > and
> > > > > > > > > > > allow
> > > > > > > > > > > >> > > > >> per-client
> > > > > > > > > > > >> > > > >> > > > > overrides.
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > Is there something else that I'm
> > > missing?
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > Thanks,
> > > > > > > > > > > >> > > > >> > > > > Aditya
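A minimal sketch of point 3 above (default quotas with per-client overrides), assuming quotas are plain byte rates keyed by client id. The class and method names are hypothetical, and how overrides would actually be stored or distributed is left to the KIP-5 discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: one default quota plus optional per-client overrides.
final class QuotaConfigSketch {
    private final long defaultBytesPerSec;
    private final Map<String, Long> overrides = new HashMap<>();

    QuotaConfigSketch(long defaultBytesPerSec) {
        this.defaultBytesPerSec = defaultBytesPerSec;
    }

    // Set or replace the override for one client id.
    void setOverride(String clientId, long bytesPerSec) {
        overrides.put(clientId, bytesPerSec);
    }

    // Drop the override so the client falls back to the default.
    void removeOverride(String clientId) {
        overrides.remove(clientId);
    }

    // The override wins when present; otherwise the default applies.
    long quotaFor(String clientId) {
        return overrides.getOrDefault(clientId, defaultBytesPerSec);
    }

    public static void main(String[] args) {
        QuotaConfigSketch config = new QuotaConfigSketch(1_000);
        config.setOverride("etl-job", 5_000);
        System.out.println(config.quotaFor("etl-job"));
        System.out.println(config.quotaFor("web-frontend"));
    }
}
```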
> > > > > > > > > > > >> > > > >> > > > >
> > > ________________________________________
> > > > > > > > > > > >> > > > >> > > > > From: Jay Kreps [
> jay.kr...@gmail.com
> > ]
> > > > > > > > > > > >> > > > >> > > > > Sent: Wednesday, March 18, 2015
> 2:10
> > PM
> > > > > > > > > > > >> > > > >> > > > > To: dev@kafka.apache.org
> > > > > > > > > > > >> > > > >> > > > > Subject: Re: [KIP-DISCUSSION]
> KIP-13
> > > > Quotas
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > Hey Steven,
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > The current proposal is actually to
> > > > enforce
> > > > > > > > quotas
> > > > > > > > > at
> > > > > > > > > > > the
> > > > > > > > > > > >> > > > >> > > > > client/application level, NOT the
> > topic
> > > > > > level.
> > > > > > > So
> > > > > > > > > if
> > > > > > > > > > > you
> > > > > > > > > > > >> > have
> > > > > > > > > > > >> > > a
> > > > > > > > > > > >> > > > >> > service
> > > > > > > > > > > >> > > > >> > > > > with a few dozen instances the
> quota
> > is
> > > > > > against
> > > > > > > > all
> > > > > > > > > > of
> > > > > > > > > > > >> those
> > > > > > > > > > > >> > > > >> > instances
> > > > > > > > > > > >> > > > >> > > > > added up across all their topics.
> So
> > > > > > actually
> > > > > > > the
> > > > > > > > > > > effect
> > > > > > > > > > > >> > would
> > > > > > > > > > > >> > > > be
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > same
> > > > > > > > > > > >> > > > >> > > > > either way but throttling gives the
> > > > producer
> > > > > > > the
> > > > > > > > > > > choice of
> > > > > > > > > > > >> > > > either
> > > > > > > > > > > >> > > > >> > > blocking
> > > > > > > > > > > >> > > > >> > > > > or dropping.
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > -Jay
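A minimal sketch of the accounting Jay describes, assuming usage is tracked as a simple byte count per client id. The names are hypothetical, and a real implementation would measure a rate over time windows. The point it illustrates is that the topic does not enter the quota key, so traffic from every instance and every topic of one client id adds up against the same limit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-client usage tracker; the topic is accepted but not keyed on.
final class ClientUsageSketch {
    private final Map<String, Long> bytesByClient = new HashMap<>();

    // Record bytes for a client id; the topic is deliberately ignored
    // because the quota is per client, not per topic.
    void record(String clientId, String topic, long bytes) {
        bytesByClient.merge(clientId, bytes, Long::sum);
    }

    long total(String clientId) {
        return bytesByClient.getOrDefault(clientId, 0L);
    }

    boolean overQuota(String clientId, long quotaBytes) {
        return total(clientId) > quotaBytes;
    }

    public static void main(String[] args) {
        ClientUsageSketch usage = new ClientUsageSketch();
        usage.record("svc", "topic-a", 600);
        usage.record("svc", "topic-b", 600);
        // Neither topic alone exceeds 1,000 bytes, but the client does.
        System.out.println(usage.overQuota("svc", 1_000));
    }
}
```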
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > On Tue, Mar 17, 2015 at 10:08 AM,
> > > Steven
> > > > Wu
> > > > > > <
> > > > > > > > > > > >> > > > stevenz...@gmail.com
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > > wrote:
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > > Jay,
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > > let's say an app produces to 10
> > > > different
> > > > > > > > topics.
> > > > > > > > > > > one of
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> topic
> > > > > > > > > > > >> > > > >> > is
> > > > > > > > > > > >> > > > >> > > > > sent
> > > > > > > > > > > >> > > > >> > > > > > from a library. due to whatever
> > > > > > > condition/bug,
> > > > > > > > > this
> > > > > > > > > > > lib
> > > > > > > > > > > >> > > starts
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > > send
> > > > > > > > > > > >> > > > >> > > > > > messages over the quota. if we go
> > > with
> > > > the
> > > > > > > > > delayed
> > > > > > > > > > > >> > response
> > > > > > > > > > > >> > > > >> > > approach, it
> > > > > > > > > > > >> > > > >> > > > > > will cause the whole shared
> > > > > > RecordAccumulator
> > > > > > > > > > buffer
> > > > > > > > > > > to
> > > > > > > > > > > >> be
> > > > > > > > > > > >> > > > >> filled
> > > > > > > > > > > >> > > > >> > up.
> > > > > > > > > > > >> > > > >> > > > > that
> > > > > > > > > > > >> > > > >> > > > > > will penalize other 9 topics who
> > are
> > > > > > within
> > > > > > > the
> > > > > > > > > > > quota.
> > > > > > > > > > > >> > that
> > > > > > > > > > > >> > > is
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > > > > unfairness point that Ewen and I
> > were
> > > > > > trying
> > > > > > > to
> > > > > > > > > > make.
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > > if broker just drop the msg and
> > > return
> > > > an
> > > > > > > > > > > error/status
> > > > > > > > > > > >> > code
> > > > > > > > > > > > >> > > > >> > > that indicates the
> > > > > > > > > > > >> > > > >> > > > > > drop and why. then producer can
> > just
> > > > move
> > > > > > on
> > > > > > > > and
> > > > > > > > > > > accept
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> drop.
> > > > > > > > > > > >> > > > >> > > shared
> > > > > > > > > > > >> > > > >> > > > > > buffer won't be saturated and
> > other 9
> > > > > > topics
> > > > > > > > > won't
> > > > > > > > > > be
> > > > > > > > > > > >> > > > penalized.
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > > Thanks,
> > > > > > > > > > > >> > > > >> > > > > > Steven
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > > On Tue, Mar 17, 2015 at 9:44 AM,
> > Jay
> > > > Kreps
> > > > > > <
> > > > > > > > > > > >> > > > jay.kr...@gmail.com
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > > wrote:
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > Hey Steven,
> > > > > > > > > > > >> > > > >> > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > It is true that hitting the
> quota
> > > > will
> > > > > > > cause
> > > > > > > > > > > >> > back-pressure
> > > > > > > > > > > >> > > > on
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > > > > producer.
> > > > > > > > > > > >> > > > >> > > > > > > But the solution is simple, a
> > > > producer
> > > > > > that
> > > > > > > > > wants
> > > > > > > > > > > to
> > > > > > > > > > > >> > avoid
> > > > > > > > > > > >> > > > >> this
> > > > > > > > > > > >> > > > >> > > should
> > > > > > > > > > > >> > > > >> > > > > > stay
> > > > > > > > > > > >> > > > >> > > > > > > under its quota. In other words
> > > this
> > > > is
> > > > > > a
> > > > > > > > > > contract
> > > > > > > > > > > >> > between
> > > > > > > > > > > >> > > > the
> > > > > > > > > > > >> > > > >> > > cluster
> > > > > > > > > > > >> > > > >> > > > > > and
> > > > > > > > > > > >> > > > >> > > > > > > the client, with each side
> having
> > > > > > something
> > > > > > > > to
> > > > > > > > > > > uphold.
> > > > > > > > > > > >> > > Quite
> > > > > > > > > > > >> > > > >> > > possibly
> > > > > > > > > > > >> > > > >> > > > > the
> > > > > > > > > > > >> > > > >> > > > > > > same thing will happen in the
> > > > absence of
> > > > > > a
> > > > > > > > > > quota, a
> > > > > > > > > > > >> > client
> > > > > > > > > > > >> > > > >> that
> > > > > > > > > > > >> > > > >> > > > > produces
> > > > > > > > > > > >> > > > >> > > > > > an
> > > > > > > > > > > >> > > > >> > > > > > > unexpected amount of load will
> > hit
> > > > the
> > > > > > > limits
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > >> > > server
> > > > > > > > > > > >> > > > >> and
> > > > > > > > > > > >> > > > >> > > > > > experience
> > > > > > > > > > > >> > > > >> > > > > > > backpressure. Quotas just allow
> > you
> > > > to
> > > > > > set
> > > > > > > > that
> > > > > > > > > > > same
> > > > > > > > > > > >> > limit
> > > > > > > > > > > >> > > > at
> > > > > > > > > > > >> > > > >> > > something
> > > > > > > > > > > >> > > > >> > > > > > > lower than 100% of all
> resources
> > on
> > > > the
> > > > > > > > server,
> > > > > > > > > > > which
> > > > > > > > > > > >> is
> > > > > > > > > > > >> > > > >> useful
> > > > > > > > > > > >> > > > >> > > for a
> > > > > > > > > > > >> > > > >> > > > > > > shared cluster.
> > > > > > > > > > > >> > > > >> > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > -Jay
> > > > > > > > > > > >> > > > >> > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > On Mon, Mar 16, 2015 at 11:34
> PM,
> > > > Steven
> > > > > > > Wu <
> > > > > > > > > > > >> > > > >> > stevenz...@gmail.com>
> > > > > > > > > > > >> > > > >> > > > > > wrote:
> > > > > > > > > > > >> > > > >> > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > wait. we create one kafka
> > > producer
> > > > for
> > > > > > > each
> > > > > > > > > > > cluster.
> > > > > > > > > > > >> > > each
> > > > > > > > > > > >> > > > >> > > cluster can
> > > > > > > > > > > >> > > > >> > > > > > > have
> > > > > > > > > > > >> > > > >> > > > > > > > many topics. if producer
> buffer
> > > got
> > > > > > > filled
> > > > > > > > up
> > > > > > > > > > > due to
> > > > > > > > > > > >> > > > delayed
> > > > > > > > > > > >> > > > >> > > response
> > > > > > > > > > > >> > > > >> > > > > > for
> > > > > > > > > > > >> > > > >> > > > > > > > one throttled topic, won't
> that
> > > > > > penalize
> > > > > > > > > other
> > > > > > > > > > > >> topics
> > > > > > > > > > > >> > > > >> unfairly?
> > > > > > > > > > > >> > > > >> > > it
> > > > > > > > > > > >> > > > >> > > > > > seems
> > > > > > > > > > > >> > > > >> > > > > > > to
> > > > > > > > > > > >> > > > >> > > > > > > > me that broker should just
> > return
> > > > > > error
> > > > > > > > > without
> > > > > > > > > > > >> delay.
> > > > > > > > > > > >> > > > >> > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > sorry that I am chatting to
> > > myself
> > > > :)
> > > > > > > > > > > >> > > > >> > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > On Mon, Mar 16, 2015 at 11:29
> > PM,
> > > > > > Steven
> > > > > > > > Wu <
> > > > > > > > > > > >> > > > >> > > stevenz...@gmail.com>
> > > > > > > > > > > >> > > > >> > > > > > > wrote:
> > > > > > > > > > > >> > > > >> > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > > I think I can answer my own
> > > > > > question.
> > > > > > > > > delayed
> > > > > > > > > > > >> > response
> > > > > > > > > > > >> > > > >> will
> > > > > > > > > > > >> > > > >> > > cause
> > > > > > > > > > > >> > > > >> > > > > the
> > > > > > > > > > > >> > > > >> > > > > > > > > producer buffer to be full,
> > > which
> > > > > > then
> > > > > > > > > result
> > > > > > > > > > > in
> > > > > > > > > > > >> > > either
> > > > > > > > > > > >> > > > >> > thread
> > > > > > > > > > > >> > > > >> > > > > > blocking
> > > > > > > > > > > >> > > > >> > > > > > > > or
> > > > > > > > > > > >> > > > >> > > > > > > > > message drop.
> > > > > > > > > > > >> > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > > On Mon, Mar 16, 2015 at
> 11:24
> > > PM,
> > > > > > > Steven
> > > > > > > > > Wu <
> > > > > > > > > > > >> > > > >> > > stevenz...@gmail.com>
> > > > > > > > > > > >> > > > >> > > > > > > > wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >> please correct me if I am
> > > > missing
> > > > > > something
> > > > > > > > > here.
> > > > > > > > > > I
> > > > > > > > > > > am
> > > > > > > > > > > >> > not
> > > > > > > > > > > >> > > > >> > > understanding
> > > > > > > > > > > >> > > > >> > > > > > how
> > > > > > > > > > > >> > > > >> > > > > > > > >> would throttle work
> without
> > > > > > > > > > > cooperation/back-off
> > > > > > > > > > > >> > from
> > > > > > > > > > > >> > > > >> > > producer.
> > > > > > > > > > > >> > > > >> > > > > new
> > > > > > > > > > > >> > > > >> > > > > > > Java
> > > > > > > > > > > >> > > > >> > > > > > > > >> producer supports
> > non-blocking
> > > > API.
> > > > > > > why
> > > > > > > > > > would
> > > > > > > > > > > >> > delayed
> > > > > > > > > > > >> > > > >> > > response be
> > > > > > > > > > > >> > > > >> > > > > > able
> > > > > > > > > > > >> > > > >> > > > > > > > to
> > > > > > > > > > > >> > > > >> > > > > > > > >> slow down producer?
> producer
> > > > will
> > > > > > > > continue
> > > > > > > > > > to
> > > > > > > > > > > >> fire
> > > > > > > > > > > >> > > > async
> > > > > > > > > > > >> > > > >> > > sends.
> > > > > > > > > > > >> > > > >> > > > > > > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >> On Mon, Mar 16, 2015 at
> > 10:58
> > > > PM,
> > > > > > > > Guozhang
> > > > > > > > > > > Wang <
> > > > > > > > > > > >> > > > >> > > > > wangg...@gmail.com
> > > > > > > > > > > >> > > > >> > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >> wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> I think we are really
> > > > discussing
> > > > > > two
> > > > > > > > > > separate
> > > > > > > > > > > >> > issues
> > > > > > > > > > > >> > > > >> here:
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> 1. Whether we should a)
> > > > > > > > > > > >> > > > >> > > > >
> > > > append-then-block-then-returnOKButThrottled
> > > > > > > > > > > >> > > > >> > > > > > > or
> > > > > > > > > > > >> > > > >> > > > > > > > b)
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > block-then-returnFailDuetoThrottled
> > > > > > > for
> > > > > > > > > > quota
> > > > > > > > > > > >> > > actions
> > > > > > > > > > > >> > > > on
> > > > > > > > > > > >> > > > >> > > produce
> > > > > > > > > > > >> > > > >> > > > > > > > >>> requests.
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> Both these approaches
> > assume
> > > > some
> > > > > > > kind
> > > > > > > > of
> > > > > > > > > > > > >> > > > >> well-behavedness
> > > > > > > > > > > >> > > > >> > of
> > > > > > > > > > > >> > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > > > > > clients:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> option a) assumes the
> > client
> > > > sets
> > > > > > > a
> > > > > > > > > proper
> > > > > > > > > > > >> > timeout
> > > > > > > > > > > >> > > > >> value
> > > > > > > > > > > > >> > > while it
> > > > > > > > > > > >> > > > >> > > > > can
> > > > > > > > > > > >> > > > >> > > > > > > > just
> > > > > > > > > > > >> > > > >> > > > > > > > >>> ignore "OKButThrottled"
> > > > response,
> > > > > > > while
> > > > > > > > > > > option
> > > > > > > > > > > >> b)
> > > > > > > > > > > >> > > > >> assumes
> > > > > > > > > > > >> > > > >> > the
> > > > > > > > > > > >> > > > >> > > > > > client
> > > > > > > > > > > >> > > > >> > > > > > > > >>> handles the
> > > > "FailDuetoThrottled"
> > > > > > > > > > > appropriately.
> > > > > > > > > > > >> > For
> > > > > > > > > > > >> > > > any
> > > > > > > > > > > >> > > > >> > > malicious
> > > > > > > > > > > >> > > > >> > > > > > > > clients
> > > > > > > > > > > >> > > > >> > > > > > > > >>> that, for example, just
> > keep
> > > > > > retrying
> > > > > > > > > > either
> > > > > > > > > > > >> > > > >> intentionally
> > > > > > > > > > > >> > > > >> > or
> > > > > > > > > > > >> > > > >> > > > > not,
> > > > > > > > > > > >> > > > >> > > > > > > > >>> neither
> > > > > > > > > > > >> > > > >> > > > > > > > >>> of these approaches are
> > > > actually
> > > > > > > > > effective.
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> 2. For "OKButThrottled"
> and
> > > > > > > > > > > "FailDuetoThrottled"
> > > > > > > > > > > >> > > > >> responses,
> > > > > > > > > > > >> > > > >> > > shall
> > > > > > > > > > > >> > > > >> > > > > > we
> > > > > > > > > > > >> > > > >> > > > > > > > >>> encode
> > > > > > > > > > > >> > > > >> > > > > > > > >>> them as error codes or
> > > augment
> > > > the
> > > > > > > > > protocol
> > > > > > > > > > > to
> > > > > > > > > > > >> > use a
> > > > > > > > > > > >> > > > >> > separate
> > > > > > > > > > > >> > > > >> > > > > field
> > > > > > > > > > > >> > > > >> > > > > > > > >>> indicating "status
> codes".
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> Today we have already
> > > > incorporated
> > > > > > > some
> > > > > > > > > > > status
> > > > > > > > > > > >> > code
> > > > > > > > > > > >> > > as
> > > > > > > > > > > >> > > > >> > error
> > > > > > > > > > > >> > > > >> > > > > codes
> > > > > > > > > > > >> > > > >> > > > > > in
> > > > > > > > > > > >> > > > >> > > > > > > > the
> > > > > > > > > > > >> > > > >> > > > > > > > >>> responses, e.g.
> > > > > > ReplicaNotAvailable
> > > > > > > in
> > > > > > > > > > > >> > > > MetadataResponse,
> > > > > > > > > > > >> > > > >> > the
> > > > > > > > > > > >> > > > >> > > pros
> > > > > > > > > > > >> > > > >> > > > > > of
> > > > > > > > > > > >> > > > >> > > > > > > > this
> > > > > > > > > > > >> > > > >> > > > > > > > >>> is of course using a
> single
> > > > field
> > > > > > for
> > > > > > > > > > > response
> > > > > > > > > > > >> > > status
> > > > > > > > > > > >> > > > >> like
> > > > > > > > > > > >> > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > > HTTP
> > > > > > > > > > > >> > > > >> > > > > > > > >>> status
> > > > > > > > > > > >> > > > >> > > > > > > > >>> codes, while the cons is
> > that
> > > > it
> > > > > > > > requires
> > > > > > > > > > > >> clients
> > > > > > > > > > > >> > to
> > > > > > > > > > > >> > > > >> handle
> > > > > > > > > > > >> > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > > > error
> > > > > > > > > > > >> > > > >> > > > > > > > >>> codes
> > > > > > > > > > > >> > > > >> > > > > > > > >>> carefully.
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> I think maybe we can
> > actually
> > > > > > extend
> > > > > > > > the
> > > > > > > > > > > >> > single-code
> > > > > > > > > > > >> > > > >> > > approach to
> > > > > > > > > > > >> > > > >> > > > > > > > overcome
> > > > > > > > > > > >> > > > >> > > > > > > > >>> its drawbacks, that is,
> > wrap
> > > > the
> > > > > > > error
> > > > > > > > > > codes
> > > > > > > > > > > >> > > semantics
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > the
> > > > > > > > > > > >> > > > >> > > > > users
> > > > > > > > > > > >> > > > >> > > > > > > so
> > > > > > > > > > > >> > > > >> > > > > > > > >>> that
> > > > > > > > > > > >> > > > >> > > > > > > > >>> users do not need to
> handle
> > > the
> > > > > > codes
> > > > > > > > > > > >> one-by-one.
> > > > > > > > > > > >> > > More
> > > > > > > > > > > >> > > > >> > > > > concretely,
> > > > > > > > > > > >> > > > >> > > > > > > > >>> following Jay's example
> the
> > > > client
> > > > > > > > could
> > > > > > > > > > > write
> > > > > > > > > > > >> > sth.
> > > > > > > > > > > >> > > > like
> > > > > > > > > > > >> > > > >> > > this:
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> -----------------
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> if(error.isOK())
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // status code is good or
> > the
> > > > > > > code
> > > > > > > > > can
> > > > > > > > > > > be
> > > > > > > > > > > >> > > simply
> > > > > > > > > > > >> > > > >> > > ignored for
> > > > > > > > > > > >> > > > >> > > > > > > this
> > > > > > > > > > > >> > > > >> > > > > > > > >>> request type, process the
> > > > request
> > > > > > > > > > > >> > > > >> > > > > > > > >>> else
> if(error.needsRetry())
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // throttled, transient
> > > error,
> > > > > > > > etc:
> > > > > > > > > > > retry
> > > > > > > > > > > >> > > > >> > > > > > > > >>> else if(error.isFatal())
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // non-retriable errors,
> > etc:
> > > > > > > > > notify /
> > > > > > > > > > > >> > > terminate
> > > > > > > > > > > >> > > > /
> > > > > > > > > > > >> > > > >> > other
> > > > > > > > > > > >> > > > >> > > > > > > handling
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> -----------------
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> Only when the clients
> > really
> > > > want
> > > > > > to
> > > > > > > > > > handle,
> > > > > > > > > > > for
> > > > > > > > > > > >> > > > example
> > > > > > > > > > > >> > > > >> > > > > > > > >>> FailDuetoThrottled
> > > > > > > > > > > >> > > > >> > > > > > > > >>> status code specifically,
> > it
> > > > needs
> > > > > > > to:
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> if(error.isOK())
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // status code is good or
> > the
> > > > > > > code
> > > > > > > > > can
> > > > > > > > > > > be
> > > > > > > > > > > >> > > simply
> > > > > > > > > > > >> > > > >> > > ignored for
> > > > > > > > > > > >> > > > >> > > > > > > this
> > > > > > > > > > > >> > > > >> > > > > > > > >>> request type, process the
> > > > request
> > > > > > > > > > > >> > > > >> > > > > > > > >>> else if(error ==
> > > > > > > FailDuetoThrottled )
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // throttled: log it
> > > > > > > > > > > >> > > > >> > > > > > > > >>> else
> if(error.needsRetry())
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // transient error, etc:
> > > retry
> > > > > > > > > > > >> > > > >> > > > > > > > >>> else if(error.isFatal())
> > > > > > > > > > > >> > > > >> > > > > > > > >>> // non-retriable errors,
> > etc:
> > > > > > > > > notify /
> > > > > > > > > > > >> > > terminate
> > > > > > > > > > > >> > > > /
> > > > > > > > > > > >> > > > >> > other
> > > > > > > > > > > >> > > > >> > > > > > > handling
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> -----------------
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> And for implementation we
> > can
> > > > > > > probably
> > > > > > > > > > group
> > > > > > > > > > > the
> > > > > > > > > > > >> > > codes
> > > > > > > > > > > >> > > > >> > > > > accordingly
> > > > > > > > > > > >> > > > >> > > > > > > like
> > > > > > > > > > > >> > > > >> > > > > > > > >>> HTTP status code such
> that
> > we
> > > > can
> > > > > > do:
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> boolean Error.isOK() {
> > > > > > > > > > > >> > > > >> > > > > > > > >>> return code < 300 && code
> > >=
> > > > 200;
> > > > > > > > > > > >> > > > >> > > > > > > > >>> }
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> Guozhang
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> On Mon, Mar 16, 2015 at
> > 10:24
> > > > PM,
> > > > > > > Ewen
> > > > > > > > > > > >> > > > Cheslack-Postava
> > > > > > > > > > > >> > > > >> <
> > > > > > > > > > > >> > > > >> > > > > > > > >>> e...@confluent.io>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > Agreed that trying to
> > > > shoehorn
> > > > > > > > > non-error
> > > > > > > > > > > codes
> > > > > > > > > > > >> > > into
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > error
> > > > > > > > > > > >> > > > >> > > > > > field
> > > > > > > > > > > >> > > > >> > > > > > > > is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> a
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > bad idea. It makes it
> > *way*
> > > > too
> > > > > > > easy
> > > > > > > > to
> > > > > > > > > > > write
> > > > > > > > > > > >> > code
> > > > > > > > > > > >> > > > >> that
> > > > > > > > > > > >> > > > >> > > looks
> > > > > > > > > > > >> > > > >> > > > > > (and
> > > > > > > > > > > >> > > > >> > > > > > > > >>> should
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > be) correct but is
> > actually
> > > > > > > > incorrect.
> > > > > > > > > If
> > > > > > > > > > > >> > > > necessary, I
> > > > > > > > > > > >> > > > >> > > think
> > > > > > > > > > > >> > > > >> > > > > it's
> > > > > > > > > > > >> > > > >> > > > > > > > much
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > better to to spend a
> > couple
> > > > of
> > > > > > > extra
> > > > > > > > > > bytes
> > > > > > > > > > > to
> > > > > > > > > > > >> > > encode
> > > > > > > > > > > >> > > > >> that
> > > > > > > > > > > >> > > > >> > > > > > > information
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > separately (a "status"
> or
> > > > > > "warning"
> > > > > > > > > > > section of
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> > > response).
> > > > > > > > > > > >> > > > >> > > > > An
> > > > > > > > > > > >> > > > >> > > > > > > > >>> indication
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > that throttling is
> > > occurring
> > > > is
> > > > > > > > > something
> > > > > > > > > > > I'd
> > > > > > > > > > > >> > > expect
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > be
> > > > > > > > > > > >> > > > >> > > > > > > indicated
> > > > > > > > > > > >> > > > >> > > > > > > > >>> by a
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > bit flag in the
> response
> > > > rather
> > > > > > > than
> > > > > > > > as
> > > > > > > > > > an
> > > > > > > > > > > >> error
> > > > > > > > > > > >> > > > code.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > Gwen - I think an error
> > > code
> > > > > > makes
> > > > > > > > > sense
> > > > > > > > > > > when
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> request
> > > > > > > > > > > >> > > > >> > > > > > actually
> > > > > > > > > > > >> > > > >> > > > > > > > >>> failed.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > Option B, which Jun was
> > > > > > advocating,
> > > > > > > > > would
> > > > > > > > > > > have
> > > > > > > > > > > >> > > > >> appended
> > > > > > > > > > > >> > > > >> > the
> > > > > > > > > > > >> > > > >> > > > > > > messages
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > successfully. If the
> > > > > > rate-limiting
> > > > > > > > case
> > > > > > > > > > > you're
> > > > > > > > > > > >> > > > talking
> > > > > > > > > > > >> > > > >> > > about
> > > > > > > > > > > >> > > > >> > > > > had
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > successfully committed
> > the
> > > > > > > messages,
> > > > > > > > I
> > > > > > > > > > > would
> > > > > > > > > > > >> say
> > > > > > > > > > > >> > > > >> that's
> > > > > > > > > > > >> > > > >> > > also a
> > > > > > > > > > > >> > > > >> > > > > > bad
> > > > > > > > > > > >> > > > >> > > > > > > > use
> > > > > > > > > > > >> > > > >> > > > > > > > >>> of
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > error codes.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > On Mon, Mar 16, 2015 at
> > > 10:16
> > > > > > PM,
> > > > > > > > Gwen
> > > > > > > > > > > >> Shapira <
> > > > > > > > > > > >> > > > >> > > > > > > > gshap...@cloudera.com>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > We discussed an error
> > > code
> > > > for
> > > > > > > > > > > rate-limiting
> > > > > > > > > > > >> > > > (which
> > > > > > > > > > > >> > > > >> I
> > > > > > > > > > > >> > > > >> > > think
> > > > > > > > > > > >> > > > >> > > > > > made
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > sense), isn't it a
> > > similar
> > > > > > case?
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > On Mon, Mar 16, 2015
> at
> > > > 10:10
> > > > > > PM,
> > > > > > > > Jay
> > > > > > > > > > > Kreps
> > > > > > > > > > > >> <
> > > > > > > > > > > >> > > > >> > > > > > jay.kr...@gmail.com
> > > > > > > > > > > >> > > > >> > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > My concern is that
> as
> > > > soon
> > > > > > as
> > > > > > > you
> > > > > > > > > > start
> > > > > > > > > > > >> > > encoding
> > > > > > > > > > > >> > > > >> > > non-error
> > > > > > > > > > > >> > > > >> > > > > > > > response
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > information into
> > error
> > > > codes
> > > > > > > the
> > > > > > > > > next
> > > > > > > > > > > >> > question
> > > > > > > > > > > >> > > > is
> > > > > > > > > > > >> > > > >> > what
> > > > > > > > > > > >> > > > >> > > to
> > > > > > > > > > > >> > > > >> > > > > do
> > > > > > > > > > > >> > > > >> > > > > > if
> > > > > > > > > > > >> > > > >> > > > > > > > two
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > such
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > codes apply (i.e.
> you
> > > > have a
> > > > > > > > > replica
> > > > > > > > > > > down
> > > > > > > > > > > >> > and
> > > > > > > > > > > >> > > > the
> > > > > > > > > > > >> > > > >> > > response
> > > > > > > > > > > >> > > > >> > > > > is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > quota'd). I
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > think I am trying
> to
> > > > argue
> > > > > > that
> > > > > > > > > error
> > > > > > > > > > > >> should
> > > > > > > > > > > >> > > > mean
> > > > > > > > > > > >> > > > >> > "why
> > > > > > > > > > > >> > > > >> > > we
> > > > > > > > > > > >> > > > >> > > > > > > failed
> > > > > > > > > > > >> > > > >> > > > > > > > >>> your
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > request", for which
> > > there
> > > > > > will
> > > > > > > > > really
> > > > > > > > > > > only
> > > > > > > > > > > >> > be
> > > > > > > > > > > >> > > > one
> > > > > > > > > > > >> > > > >> > > reason,
> > > > > > > > > > > >> > > > >> > > > > and
> > > > > > > > > > > >> > > > >> > > > > > > any
> > > > > > > > > > > >> > > > >> > > > > > > > >>> other
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > useful information
> we
> > > > want
> > > > > > to
> > > > > > > > send
> > > > > > > > > > > back is
> > > > > > > > > > > >> > > just
> > > > > > > > > > > >> > > > >> > another
> > > > > > > > > > > >> > > > >> > > > > field
> > > > > > > > > > > >> > > > >> > > > > > > in
> > > > > > > > > > > >> > > > >> > > > > > > > >>> the
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > response.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > -Jay
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > > On Mon, Mar 16,
> 2015
> > at
> > > > 9:51
> > > > > > > PM,
> > > > > > > > > Gwen
> > > > > > > > > > > >> > Shapira
> > > > > > > > > > > >> > > <
> > > > > > > > > > > >> > > > >> > > > > > > > >>> gshap...@cloudera.com>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> I think its not
> too
> > > > late to
> > > > > > > > > reserve
> > > > > > > > > > a
> > > > > > > > > > > set
> > > > > > > > > > > >> > of
> > > > > > > > > > > >> > > > >> error
> > > > > > > > > > > >> > > > >> > > codes
> > > > > > > > > > > >> > > > >> > > > > > > > >>> (200-299?)
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> for "non-error"
> > codes.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> It won't be
> backward
> > > > > > > compatible
> > > > > > > > > > (i.e.
> > > > > > > > > > > >> > clients
> > > > > > > > > > > >> > > > >> that
> > > > > > > > > > > >> > > > >> > > > > currently
> > > > > > > > > > > >> > > > >> > > > > > > do
> > > > > > > > > > > >> > > > >> > > > > > > > >>> "else
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> throw" will throw
> on
> > > > > > > > non-errors),
> > > > > > > > > > but
> > > > > > > > > > > >> > perhaps
> > > > > > > > > > > >> > > > its
> > > > > > > > > > > >> > > > >> > > > > > worthwhile.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> On Mon, Mar 16,
> 2015
> > > at
> > > > > > 9:42
> > > > > > > PM,
> > > > > > > > > Jay
> > > > > > > > > > > >> Kreps
> > > > > > > > > > > >> > <
> > > > > > > > > > > >> > > > >> > > > > > > jay.kr...@gmail.com
> > > > > > > > > > > >> > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > Hey Jun,
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > I'd really
> really
> > > > really
> > > > > > > like
> > > > > > > > to
> > > > > > > > > > > avoid
> > > > > > > > > > > >> > > that.
> > > > > > > > > > > >> > > > >> > Having
> > > > > > > > > > > >> > > > >> > > just
> > > > > > > > > > > >> > > > >> > > > > > > > spent a
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > bunch of
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > time on the
> > clients,
> > > > > > using
> > > > > > > the
> > > > > > > > > > error
> > > > > > > > > > > >> > codes
> > > > > > > > > > > >> > > to
> > > > > > > > > > > >> > > > >> > encode
> > > > > > > > > > > >> > > > >> > > > > other
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > information
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > about the
> response
> > > is
> > > > > > super
> > > > > > > > > > > dangerous.
> > > > > > > > > > > >> > The
> > > > > > > > > > > >> > > > >> error
> > > > > > > > > > > >> > > > >> > > > > handling
> > > > > > > > > > > >> > > > >> > > > > > is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> one of
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > the
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > hardest parts of
> > the
> > > > > > client
> > > > > > > > > > > (Guozhang
> > > > > > > > > > > >> > chime
> > > > > > > > > > > >> > > > in
> > > > > > > > > > > >> > > > >> > > here).
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > Generally the
> > error
> > > > > > handling
> > > > > > > > > looks
> > > > > > > > > > > like
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > if(error ==
> none)
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > // good, process
> > the
> > > > > > > > > request
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > else if(error ==
> > > > > > > > > KNOWN_ERROR_1)
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > // handle known
> > > error
> > > > 1
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > else if(error ==
> > > > > > > > > KNOWN_ERROR_2)
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > // handle known
> > > error
> > > > 2
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > else
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > throw
> > > > > > > > > > > >> > > Errors.forCode(error).exception();
> > > > > > > > > > > >> > > > >> //
> > > > > > > > > > > >> > > > >> > or
> > > > > > > > > > > >> > > > >> > > some
> > > > > > > > > > > >> > > > >> > > > > > > other
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > default
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > behavior
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > This works
> because
> > > we
> > > > > > have a
> > > > > > > > > > > convention
> > > > > > > > > > > >> > > that
> > > > > > > > > > > >> > > > >> and
> > > > > > > > > > > >> > > > >> > > error
> > > > > > > > > > > >> > > > >> > > > > is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> something
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > that
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > prevented your
> > > getting
> > > > > > the
> > > > > > > > > > response
> > > > > > > > > > > so
> > > > > > > > > > > >> > the
> > > > > > > > > > > >> > > > >> default
> > > > > > > > > > > >> > > > >> > > > > > handling
> > > > > > > > > > > >> > > > >> > > > > > > > >>> case is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > sane
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > and forward
> > > > compatible.
> > > > > > It
> > > > > > > is
> > > > > > > > > > > tempting
> > > > > > > > > > > >> to
> > > > > > > > > > > >> > > use
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > error
> > > > > > > > > > > >> > > > >> > > > > > code
> > > > > > > > > > > >> > > > >> > > > > > > > to
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > convey
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > information in
> the
> > > > > > success
> > > > > > > > case.
> > > > > > > > > > For
> > > > > > > > > > > >> > > example
> > > > > > > > > > > >> > > > we
> > > > > > > > > > > >> > > > >> > > could
> > > > > > > > > > > >> > > > >> > > > > use
> > > > > > > > > > > >> > > > >> > > > > > > > error
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > codes
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > to
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > encode whether
> > > quotas
> > > > > > were
> > > > > > > > > > enforced,
> > > > > > > > > > > >> > > whether
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > request
> > > > > > > > > > > >> > > > >> > > > > > was
> > > > > > > > > > > >> > > > >> > > > > > > > >>> served
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > out
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> of
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > cache, whether
> the
> > > > stock
> > > > > > > > market
> > > > > > > > > is
> > > > > > > > > > > up
> > > > > > > > > > > >> > > today,
> > > > > > > > > > > >> > > > or
> > > > > > > > > > > >> > > > >> > > > > whatever.
> > > > > > > > > > > >> > > > >> > > > > > > The
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > problem
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > that since these
> > are
> > > > not
> > > > > > > > errors
> > > > > > > > > as
> > > > > > > > > > > far
> > > > > > > > > > > >> as
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> > > client is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> concerned it
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> should
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > not throw an
> > > exception
> > > > > > but
> > > > > > > > > process
> > > > > > > > > > > the
> > > > > > > > > > > >> > > > >> response,
> > > > > > > > > > > >> > > > >> > > but now
> > > > > > > > > > > >> > > > >> > > > > > we
> > > > > > > > > > > >> > > > >> > > > > > > > >>> created
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > an
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > explicit
> > requirement
> > > > that
> > > > > > > that
> > > > > > > > > > > error be
> > > > > > > > > > > >> > > > handled
> > > > > > > > > > > >> > > > >> > > > > explicitly
> > > > > > > > > > > >> > > > >> > > > > > > > >>> since it
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > is
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > different. I
> > really
> > > > think
> > > > > > > that
> > > > > > > > > > this
> > > > > > > > > > > >> kind
> > > > > > > > > > > >> > of
> > > > > > > > > > > >> > > > >> > > information
> > > > > > > > > > > >> > > > >> > > > > is
> > > > > > > > > > > >> > > > >> > > > > > > not
> > > > > > > > > > > >> > > > >> > > > > > > > >>> an
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > error,
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> it
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > is just
> > information,
> > > > and
> > > > > > if
> > > > > > > we
> > > > > > > > > > want
> > > > > > > > > > > it
> > > > > > > > > > > >> in
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> > > response
> > > > > > > > > > > >> > > > >> > > > > we
> > > > > > > > > > > >> > > > >> > > > > > > > >>> should do
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > the
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > right thing and
> > add
> > > a
> > > > new
> > > > > > > > field
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > >> > > > >> response.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > I think you saw
> > the
> > > > Samza
> > > > > > > bug
> > > > > > > > > that
> > > > > > > > > > > was
> > > > > > > > > > > >> > > > >> literally
> > > > > > > > > > > >> > > > >> > an
> > > > > > > > > > > >> > > > >> > > > > > example
> > > > > > > > > > > >> > > > >> > > > > > > of
> > > > > > > > > > > >> > > > >> > > > > > > > >>> this
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > happening and
> > > leading
> > > > to
> > > > > > an
> > > > > > > > > > infinite
> > > > > > > > > > > >> > retry
> > > > > > > > > > > >> > > > >> loop.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > Further more I
> > > really
> > > > > > want
> > > > > > > to
> > > > > > > > > > > emphasize
> > > > > > > > > > > >> > > that
> > > > > > > > > > > >> > > > >> > hitting
> > > > > > > > > > > >> > > > >> > > > > your
> > > > > > > > > > > >> > > > >> > > > > > > > quota
> > > > > > > > > > > >> > > > >> > > > > > > > >>> in
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > the
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > design that Adi
> > has
> > > > > > proposed
> > > > > > > > is
> > > > > > > > > > > >> actually
> > > > > > > > > > > >> > > not
> > > > > > > > > > > >> > > > an
> > > > > > > > > > > >> > > > >> > > error
> > > > > > > > > > > >> > > > >> > > > > > > > condition
> > > > > > > > > > > >> > > > >> > > > > > > > >>> at
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > all.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> It
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > is totally
> > > reasonable
> > > > in
> > > > > > any
> > > > > > > > > > > bootstrap
> > > > > > > > > > > >> > > > >> situation
> > > > > > > > > > > >> > > > >> > to
> > > > > > > > > > > >> > > > >> > > > > > > > >>> intentionally
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > want to
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > run at the limit
> > the
> > > > > > system
> > > > > > > > > > imposes
> > > > > > > > > > > on
> > > > > > > > > > > >> > you.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > -Jay
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> > On Mon, Mar 16,
> > 2015
> > > > at
> > > > > > 4:27
> > > > > > > > PM,
> > > > > > > > > > Jun
> > > > > > > > > > > >> Rao
> > > > > > > > > > > >> > <
> > > > > > > > > > > >> > > > >> > > > > > j...@confluent.io>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> It's probably
> > > useful
> > > > for
> > > > > > a
> > > > > > > > > client
> > > > > > > > > > > to
> > > > > > > > > > > >> > know
> > > > > > > > > > > >> > > > >> whether
> > > > > > > > > > > >> > > > >> > > its
> > > > > > > > > > > >> > > > >> > > > > > > > requests
> > > > > > > > > > > >> > > > >> > > > > > > > >>> are
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> throttled or
> not
> > > > (e.g.,
> > > > > > for
> > > > > > > > > > > monitoring
> > > > > > > > > > > >> > and
> > > > > > > > > > > >> > > > >> > > alerting).
> > > > > > > > > > > >> > > > >> > > > > > From
> > > > > > > > > > > >> > > > >> > > > > > > > that
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> perspective,
> > > option B
> > > > > > > (delay
> > > > > > > > > the
> > > > > > > > > > > >> > requests
> > > > > > > > > > > >> > > > and
> > > > > > > > > > > >> > > > >> > > return an
> > > > > > > > > > > >> > > > >> > > > > > > > error)
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > seems
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> better.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> Thanks,
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> Jun
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> On Wed, Mar 4,
> > 2015
> > > > at
> > > > > > 3:51
> > > > > > > > PM,
> > > > > > > > > > > Aditya
> > > > > > > > > > > >> > > > >> Auradkar <
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >>
> > > > > > > > aaurad...@linkedin.com.invalid
> > > > > > > > > >
> > > > > > > > > > > >> wrote:
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> > Posted a KIP
> > for
> > > > > > quotas
> > > > > > > in
> > > > > > > > > > kafka.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > >
> > > > > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> > Appreciate
> any
> > > > > > feedback.
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> > Aditya
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >> >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > --
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > Thanks,
> > > > > > > > > > > >> > > > >> > > > > > > > >>> > Ewen
> > > > > > > > > > > >> > > > >> > > > > > > > >>> >
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>> --
> > > > > > > > > > > >> > > > >> > > > > > > > >>> -- Guozhang
> > > > > > > > > > > >> > > > >> > > > > > > > >>>
> > > > > > > > > > > >> > > > >> > > > > > > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >>
> > > > > > > > > > > >> > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > > >
> > > > > > > > > > > >> > > > >> > > > > > >
> > > > > > > > > > > >> > > > >> > > > > >
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > --
> > > > > > > > > > > >> > > > > Sent from Gmail Mobile
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > --
> > > > > > > > > > > >> > > > Sent from Gmail Mobile
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> >
> > > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
>
>
> --
> -- Guozhang
>



--
-- Guozhang
