RE: [KIP-DISCUSSION] KIP-13 Quotas

Aditya Auradkar Wed, 01 Apr 2015 15:30:07 -0700

Hey everyone,

I've made changes to the KIP to capture our discussions over the last couple of 
weeks.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas


I'll start a voting thread after people have had a chance to read/comment.

Thanks,
Aditya

________________________________________
From: Steven Wu [[email protected]]
Sent: Friday, March 20, 2015 9:14 AM
To: [email protected]
Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas

+1 on Jun's suggestion of maintaining one set/style of metrics at broker.
In Netflix, we have to convert the yammer metrics to servo metrics at
broker. it will be painful to know some metrics are in a different style
and get to be handled differently.

On Fri, Mar 20, 2015 at 8:17 AM, Jun Rao <[email protected]> wrote:

> Not so sure. People who use quota will definitely want to monitor the new
> metrics at the client id level. Then they will need to deal with those
> metrics differently from the rest of the metrics. It would be better if we
> can hide this complexity from the users.
>
> Thanks,
>
> Jun
>
> On Thu, Mar 19, 2015 at 10:45 PM, Joel Koshy <[email protected]> wrote:
>
> > Actually thinking again - since these will be a few new metrics at the
> > client id level (bytes in and bytes out to start with) maybe it is fine
> to
> > have the two type of metrics coexist and we can migrate the existing
> > metrics in parallel.
> >
> > On Thursday, March 19, 2015, Joel Koshy <[email protected]> wrote:
> >
> > > That is a valid concern but in that case I think it would be better to
> > > just migrate completely to the new metrics package first.
> > >
> > > On Thursday, March 19, 2015, Jun Rao <[email protected]
> > > <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote:
> > >
> > >> Hmm, I was thinking a bit differently on the metrics stuff. I think it
> > >> would be confusing to have some metrics defined in the new metrics
> > package
> > >> while some others defined in Coda Hale. Those metrics will look
> > different
> > >> (e.g., rates in Coda Hale will have special attributes such as
> > >> 1-min-average). People may need different ways to export the metrics
> to
> > >> external systems such as Graphite. So, instead of using the new
> metrics
> > >> package on the broker, I was thinking that we can just implement a
> > >> QuotaMetrics that wraps the Coda Hale metrics. The implementation can
> be
> > >> the same as what's in the new metrics package.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Thu, Mar 19, 2015 at 8:09 PM, Jay Kreps <[email protected]>
> wrote:
> > >>
> > >> > Yeah I was saying was that we are blocked on picking an approach for
> > >> > metrics but not necessarily the full conversion. Clearly if we pick
> > the
> > >> new
> > >> > metrics package we would need to implement the two metrics we want
> to
> > >> quota
> > >> > on. But the conversion of the remaining metrics can be done
> > >> asynchronously.
> > >> >
> > >> > -Jay
> > >> >
> > >> > On Thu, Mar 19, 2015 at 5:56 PM, Joel Koshy <[email protected]>
> > >> wrote:
> > >> >
> > >> > > > in KAFKA-1930). I agree that this KIP doesn't need to block on
> the
> > >> > > > migration of the metrics package.
> > >> > >
> > >> > > Can you clarify the above? i.e., if we are going to quota on
> > something
> > >> > > then we would want to have migrated that metric over right? Or do
> > you
> > >> > > mean we don't need to complete the migration of all metrics to the
> > >> > > metrics package right?
> > >> > >
> > >> > > I think most of us now feel that the delay + no error is a good
> > >> > > approach, but it would be good to make sure everyone is on the
> same
> > >> > > page.
> > >> > >
> > >> > > As Aditya requested a couple of days ago I think we should go over
> > >> > > this at the next KIP hangout.
> > >> > >
> > >> > > Joel
> > >> > >
> > >> > > On Thu, Mar 19, 2015 at 09:24:09AM -0700, Jun Rao wrote:
> > >> > > > 1. Delay + no error seems reasonable to me. However, I do feel
> > that
> > >> we
> > >> > > need
> > >> > > > to give the client an indicator that it's being throttled,
> instead
> > >> of
> > >> > > doing
> > >> > > > this silently. For that, we probably need to evolve the
> > >> produce/fetch
> > >> > > > protocol to include an extra status field in the response. We
> > >> probably
> > >> > > need
> > >> > > > to think more about whether we just want to return a simple
> status
> > >> code
> > >> > > > (e.g., 1 = throttled) or a value that indicates how much is
> being
> > >> > > throttled.
> > >> > > >
> > >> > > > 2. We probably need to improve the histogram support in the new
> > >> metrics
> > >> > > > package before we can use it more widely on the server side
> (left
> > a
> > >> > > comment
> > >> > > > in KAFKA-1930). I agree that this KIP doesn't need to block on
> the
> > >> > > > migration of the metrics package.
> > >> > > >
> > >> > > > Thanks,
> > >> > > >
> > >> > > > Jun
> > >> > > >
> > >> > > > On Wed, Mar 18, 2015 at 4:02 PM, Aditya Auradkar <
> > >> > > > [email protected]> wrote:
> > >> > > >
> > >> > > > > Hey everyone,
> > >> > > > >
> > >> > > > > Thanks for the great discussion. There are currently a few
> > points
> > >> on
> > >> > > this
> > >> > > > > KIP that need addressing and I want to make sure we are on the
> > >> same
> > >> > > page
> > >> > > > > about those.
> > >> > > > >
> > >> > > > > 1. Append and delay response vs delay and return error
> > >> > > > > - I think we've discussed the pros and cons of each approach
> but
> > >> > > haven't
> > >> > > > > chosen an approach yet. Where does everyone stand on this
> issue?
> > >> > > > >
> > >> > > > > 2. Metrics Migration and usage in quotas
> > >> > > > > - The metrics library in clients has a notion of quotas that
> we
> > >> > should
> > >> > > > > reuse. For that to happen, we need to migrate the server to
> the
> > >> new
> > >> > > metrics
> > >> > > > > package.
> > >> > > > > - Need more clarification on how to compute throttling time
> and
> > >> > > windowing
> > >> > > > > for quotas.
> > >> > > > >
> > >> > > > > I'm going to start a new KIP to discuss metrics migration
> > >> separately.
> > >> > > That
> > >> > > > > will also contain a section on quotas.
> > >> > > > >
> > >> > > > > 3. Dynamic Configuration management - Being discussed in
> KIP-5.
> > >> > > Basically
> > >> > > > > we need something that will model default quotas and allow
> > >> per-client
> > >> > > > > overrides.
> > >> > > > >
> > >> > > > > Is there something else that I'm missing?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Aditya
> > >> > > > > ________________________________________
> > >> > > > > From: Jay Kreps [[email protected]]
> > >> > > > > Sent: Wednesday, March 18, 2015 2:10 PM
> > >> > > > > To: [email protected]
> > >> > > > > Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> > >> > > > >
> > >> > > > > Hey Steven,
> > >> > > > >
> > >> > > > > The current proposal is actually to enforce quotas at the
> > >> > > > > client/application level, NOT the topic level. So if you have
> a
> > >> > service
> > >> > > > > with a few dozen instances the quota is against all of those
> > >> > instances
> > >> > > > > added up across all their topics. So actually the effect would
> > be
> > >> the
> > >> > > same
> > >> > > > > either way but throttling gives the producer the choice of
> > either
> > >> > > blocking
> > >> > > > > or dropping.
> > >> > > > >
> > >> > > > > -Jay
> > >> > > > >
> > >> > > > > On Tue, Mar 17, 2015 at 10:08 AM, Steven Wu <
> > [email protected]
> > >> >
> > >> > > wrote:
> > >> > > > >
> > >> > > > > > Jay,
> > >> > > > > >
> > >> > > > > > let's say an app produces to 10 different topics. one of the
> > >> topic
> > >> > is
> > >> > > > > sent
> > >> > > > > > from a library. due to whatever condition/bug, this lib
> starts
> > >> to
> > >> > > send
> > >> > > > > > messages over the quota. if we go with the delayed response
> > >> > > approach, it
> > >> > > > > > will cause the whole shared RecordAccumulator buffer to be
> > >> filled
> > >> > up.
> > >> > > > > that
> > >> > > > > > will penalize other 9 topics who are within the quota. that
> is
> > >> the
> > >> > > > > > unfairness point that Ewen and I were trying to make.
> > >> > > > > >
> > >> > > > > > if broker just drop the msg and return an error/status code
> > >> > > indicates the
> > >> > > > > > drop and why. then producer can just move on and accept the
> > >> drop.
> > >> > > shared
> > >> > > > > > buffer won't be saturated and other 9 topics won't be
> > penalized.
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > Steven
> > >> > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On Tue, Mar 17, 2015 at 9:44 AM, Jay Kreps <
> > [email protected]
> > >> >
> > >> > > wrote:
> > >> > > > > >
> > >> > > > > > > Hey Steven,
> > >> > > > > > >
> > >> > > > > > > It is true that hitting the quota will cause back-pressure
> > on
> > >> the
> > >> > > > > > producer.
> > >> > > > > > > But the solution is simple, a producer that wants to avoid
> > >> this
> > >> > > should
> > >> > > > > > stay
> > >> > > > > > > under its quota. In other words this is a contract between
> > the
> > >> > > cluster
> > >> > > > > > and
> > >> > > > > > > the client, with each side having something to uphold.
> Quite
> > >> > > possibly
> > >> > > > > the
> > >> > > > > > > same thing will happen in the absence of a quota, a client
> > >> that
> > >> > > > > produces
> > >> > > > > > an
> > >> > > > > > > unexpected amount of load will hit the limits of the
> server
> > >> and
> > >> > > > > > experience
> > >> > > > > > > backpressure. Quotas just allow you to set that same limit
> > at
> > >> > > something
> > >> > > > > > > lower than 100% of all resources on the server, which is
> > >> useful
> > >> > > for a
> > >> > > > > > > shared cluster.
> > >> > > > > > >
> > >> > > > > > > -Jay
> > >> > > > > > >
> > >> > > > > > > On Mon, Mar 16, 2015 at 11:34 PM, Steven Wu <
> > >> > [email protected]>
> > >> > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > wait. we create one kafka producer for each cluster.
> each
> > >> > > cluster can
> > >> > > > > > > have
> > >> > > > > > > > many topics. if producer buffer got filled up due to
> > delayed
> > >> > > response
> > >> > > > > > for
> > >> > > > > > > > one throttled topic, won't that penalize other topics
> > >> unfairly?
> > >> > > it
> > >> > > > > > seems
> > >> > > > > > > to
> > >> > > > > > > > me that broker should just return error without delay.
> > >> > > > > > > >
> > >> > > > > > > > sorry that I am chatting to myself :)
> > >> > > > > > > >
> > >> > > > > > > > On Mon, Mar 16, 2015 at 11:29 PM, Steven Wu <
> > >> > > [email protected]>
> > >> > > > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > > > I think I can answer my own question. delayed response
> > >> will
> > >> > > cause
> > >> > > > > the
> > >> > > > > > > > > producer buffer to be full, which then result in
> either
> > >> > thread
> > >> > > > > > blocking
> > >> > > > > > > > or
> > >> > > > > > > > > message drop.
> > >> > > > > > > > >
> > >> > > > > > > > > On Mon, Mar 16, 2015 at 11:24 PM, Steven Wu <
> > >> > > [email protected]>
> > >> > > > > > > > wrote:
> > >> > > > > > > > >
> > >> > > > > > > > >> please correct me if I am missing sth here. I am not
> > >> > > understanding
> > >> > > > > > how
> > >> > > > > > > > >> would throttle work without cooperation/back-off from
> > >> > > producer.
> > >> > > > > new
> > >> > > > > > > Java
> > >> > > > > > > > >> producer supports non-blocking API. why would delayed
> > >> > > response be
> > >> > > > > > able
> > >> > > > > > > > to
> > >> > > > > > > > >> slow down producer? producer will continue to fire
> > async
> > >> > > sends.
> > >> > > > > > > > >>
> > >> > > > > > > > >> On Mon, Mar 16, 2015 at 10:58 PM, Guozhang Wang <
> > >> > > > > [email protected]
> > >> > > > > > >
> > >> > > > > > > > >> wrote:
> > >> > > > > > > > >>
> > >> > > > > > > > >>> I think we are really discussing two separate issues
> > >> here:
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> 1. Whether we should a)
> > >> > > > > append-then-block-then-returnOKButThrottled
> > >> > > > > > > or
> > >> > > > > > > > b)
> > >> > > > > > > > >>> block-then-returnFailDuetoThrottled for quota
> actions
> > on
> > >> > > produce
> > >> > > > > > > > >>> requests.
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> Both these approaches assume some kind of
> > >> well-behaveness
> > >> > of
> > >> > > the
> > >> > > > > > > > clients:
> > >> > > > > > > > >>> option a) assumes the client sets an proper timeout
> > >> value
> > >> > > while
> > >> > > > > can
> > >> > > > > > > > just
> > >> > > > > > > > >>> ignore "OKButThrottled" response, while option b)
> > >> assumes
> > >> > the
> > >> > > > > > client
> > >> > > > > > > > >>> handles the "FailDuetoThrottled" appropriately. For
> > any
> > >> > > malicious
> > >> > > > > > > > clients
> > >> > > > > > > > >>> that, for example, just keep retrying either
> > >> intentionally
> > >> > or
> > >> > > > > not,
> > >> > > > > > > > >>> neither
> > >> > > > > > > > >>> of these approaches are actually effective.
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> 2. For "OKButThrottled" and "FailDuetoThrottled"
> > >> responses,
> > >> > > shall
> > >> > > > > > we
> > >> > > > > > > > >>> encode
> > >> > > > > > > > >>> them as error codes or augment the protocol to use a
> > >> > separate
> > >> > > > > field
> > >> > > > > > > > >>> indicating "status codes".
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> Today we have already incorporated some status code
> as
> > >> > error
> > >> > > > > codes
> > >> > > > > > in
> > >> > > > > > > > the
> > >> > > > > > > > >>> responses, e.g. ReplicaNotAvailable in
> > MetadataResponse,
> > >> > the
> > >> > > pros
> > >> > > > > > of
> > >> > > > > > > > this
> > >> > > > > > > > >>> is of course using a single field for response
> status
> > >> like
> > >> > > the
> > >> > > > > HTTP
> > >> > > > > > > > >>> status
> > >> > > > > > > > >>> codes, while the cons is that it requires clients to
> > >> handle
> > >> > > the
> > >> > > > > > error
> > >> > > > > > > > >>> codes
> > >> > > > > > > > >>> carefully.
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> I think maybe we can actually extend the single-code
> > >> > > approach to
> > >> > > > > > > > overcome
> > >> > > > > > > > >>> its drawbacks, that is, wrap the error codes
> semantics
> > >> to
> > >> > the
> > >> > > > > users
> > >> > > > > > > so
> > >> > > > > > > > >>> that
> > >> > > > > > > > >>> users do not need to handle the codes one-by-one.
> More
> > >> > > > > concretely,
> > >> > > > > > > > >>> following Jay's example the client could write sth.
> > like
> > >> > > this:
> > >> > > > > > > > >>>
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> -----------------
> > >> > > > > > > > >>>
> > >> > > > > > > > >>>   if(error.isOK())
> > >> > > > > > > > >>>      // status code is good or the code can be
> simply
> > >> > > ignored for
> > >> > > > > > > this
> > >> > > > > > > > >>> request type, process the request
> > >> > > > > > > > >>>   else if(error.needsRetry())
> > >> > > > > > > > >>>      // throttled, transient error, etc: retry
> > >> > > > > > > > >>>   else if(error.isFatal())
> > >> > > > > > > > >>>      // non-retriable errors, etc: notify /
> terminate
> > /
> > >> > other
> > >> > > > > > > handling
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> -----------------
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> Only when the clients really want to handle, for
> > example
> > >> > > > > > > > >>> FailDuetoThrottled
> > >> > > > > > > > >>> status code specifically, it needs to:
> > >> > > > > > > > >>>
> > >> > > > > > > > >>>   if(error.isOK())
> > >> > > > > > > > >>>      // status code is good or the code can be
> simply
> > >> > > ignored for
> > >> > > > > > > this
> > >> > > > > > > > >>> request type, process the request
> > >> > > > > > > > >>>   else if(error == FailDuetoThrottled )
> > >> > > > > > > > >>>      // throttled: log it
> > >> > > > > > > > >>>   else if(error.needsRetry())
> > >> > > > > > > > >>>      // transient error, etc: retry
> > >> > > > > > > > >>>   else if(error.isFatal())
> > >> > > > > > > > >>>      // non-retriable errors, etc: notify /
> terminate
> > /
> > >> > other
> > >> > > > > > > handling
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> -----------------
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> And for implementation we can probably group the
> codes
> > >> > > > > accordingly
> > >> > > > > > > like
> > >> > > > > > > > >>> HTTP status code such that we can do:
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> boolean Error.isOK() {
> > >> > > > > > > > >>>   return code < 300 && code >= 200;
> > >> > > > > > > > >>> }
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> Guozhang
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> On Mon, Mar 16, 2015 at 10:24 PM, Ewen
> > Cheslack-Postava
> > >> <
> > >> > > > > > > > >>> [email protected]>
> > >> > > > > > > > >>> wrote:
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> > Agreed that trying to shoehorn non-error codes
> into
> > >> the
> > >> > > error
> > >> > > > > > field
> > >> > > > > > > > is
> > >> > > > > > > > >>> a
> > >> > > > > > > > >>> > bad idea. It makes it *way* too easy to write code
> > >> that
> > >> > > looks
> > >> > > > > > (and
> > >> > > > > > > > >>> should
> > >> > > > > > > > >>> > be) correct but is actually incorrect. If
> > necessary, I
> > >> > > think
> > >> > > > > it's
> > >> > > > > > > > much
> > >> > > > > > > > >>> > better to to spend a couple of extra bytes to
> encode
> > >> that
> > >> > > > > > > information
> > >> > > > > > > > >>> > separately (a "status" or "warning" section of the
> > >> > > response).
> > >> > > > > An
> > >> > > > > > > > >>> indication
> > >> > > > > > > > >>> > that throttling is occurring is something I'd
> expect
> > >> to
> > >> > be
> > >> > > > > > > indicated
> > >> > > > > > > > >>> by a
> > >> > > > > > > > >>> > bit flag in the response rather than as an error
> > code.
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> > Gwen - I think an error code makes sense when the
> > >> request
> > >> > > > > > actually
> > >> > > > > > > > >>> failed.
> > >> > > > > > > > >>> > Option B, which Jun was advocating, would have
> > >> appended
> > >> > the
> > >> > > > > > > messages
> > >> > > > > > > > >>> > successfully. If the rate-limiting case you're
> > talking
> > >> > > about
> > >> > > > > had
> > >> > > > > > > > >>> > successfully committed the messages, I would say
> > >> that's
> > >> > > also a
> > >> > > > > > bad
> > >> > > > > > > > use
> > >> > > > > > > > >>> of
> > >> > > > > > > > >>> > error codes.
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> > On Mon, Mar 16, 2015 at 10:16 PM, Gwen Shapira <
> > >> > > > > > > > [email protected]>
> > >> > > > > > > > >>> > wrote:
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> > > We discussed an error code for rate-limiting
> > (which
> > >> I
> > >> > > think
> > >> > > > > > made
> > >> > > > > > > > >>> > > sense), isn't it a similar case?
> > >> > > > > > > > >>> > >
> > >> > > > > > > > >>> > > On Mon, Mar 16, 2015 at 10:10 PM, Jay Kreps <
> > >> > > > > > [email protected]
> > >> > > > > > > >
> > >> > > > > > > > >>> wrote:
> > >> > > > > > > > >>> > > > My concern is that as soon as you start
> encoding
> > >> > > non-error
> > >> > > > > > > > response
> > >> > > > > > > > >>> > > > information into error codes the next question
> > is
> > >> > what
> > >> > > to
> > >> > > > > do
> > >> > > > > > if
> > >> > > > > > > > two
> > >> > > > > > > > >>> > such
> > >> > > > > > > > >>> > > > codes apply (i.e. you have a replica down and
> > the
> > >> > > response
> > >> > > > > is
> > >> > > > > > > > >>> > quota'd). I
> > >> > > > > > > > >>> > > > think I am trying to argue that error should
> > mean
> > >> > "why
> > >> > > we
> > >> > > > > > > failed
> > >> > > > > > > > >>> your
> > >> > > > > > > > >>> > > > request", for which there will really only be
> > one
> > >> > > reason,
> > >> > > > > and
> > >> > > > > > > any
> > >> > > > > > > > >>> other
> > >> > > > > > > > >>> > > > useful information we want to send back is
> just
> > >> > another
> > >> > > > > field
> > >> > > > > > > in
> > >> > > > > > > > >>> the
> > >> > > > > > > > >>> > > > response.
> > >> > > > > > > > >>> > > >
> > >> > > > > > > > >>> > > > -Jay
> > >> > > > > > > > >>> > > >
> > >> > > > > > > > >>> > > > On Mon, Mar 16, 2015 at 9:51 PM, Gwen Shapira
> <
> > >> > > > > > > > >>> [email protected]>
> > >> > > > > > > > >>> > > wrote:
> > >> > > > > > > > >>> > > >
> > >> > > > > > > > >>> > > >> I think its not too late to reserve a set of
> > >> error
> > >> > > codes
> > >> > > > > > > > >>> (200-299?)
> > >> > > > > > > > >>> > > >> for "non-error" codes.
> > >> > > > > > > > >>> > > >>
> > >> > > > > > > > >>> > > >> It won't be backward compatible (i.e. clients
> > >> that
> > >> > > > > currently
> > >> > > > > > > do
> > >> > > > > > > > >>> "else
> > >> > > > > > > > >>> > > >> throw" will throw on non-errors), but perhaps
> > its
> > >> > > > > > worthwhile.
> > >> > > > > > > > >>> > > >>
> > >> > > > > > > > >>> > > >> On Mon, Mar 16, 2015 at 9:42 PM, Jay Kreps <
> > >> > > > > > > [email protected]
> > >> > > > > > > > >
> > >> > > > > > > > >>> > wrote:
> > >> > > > > > > > >>> > > >> > Hey Jun,
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > I'd really really really like to avoid
> that.
> > >> > Having
> > >> > > just
> > >> > > > > > > > spent a
> > >> > > > > > > > >>> > > bunch of
> > >> > > > > > > > >>> > > >> > time on the clients, using the error codes
> to
> > >> > encode
> > >> > > > > other
> > >> > > > > > > > >>> > information
> > >> > > > > > > > >>> > > >> > about the response is super dangerous. The
> > >> error
> > >> > > > > handling
> > >> > > > > > is
> > >> > > > > > > > >>> one of
> > >> > > > > > > > >>> > > the
> > >> > > > > > > > >>> > > >> > hardest parts of the client (Guozhang chime
> > in
> > >> > > here).
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > Generally the error handling looks like
> > >> > > > > > > > >>> > > >> >   if(error == none)
> > >> > > > > > > > >>> > > >> >      // good, process the request
> > >> > > > > > > > >>> > > >> >   else if(error == KNOWN_ERROR_1)
> > >> > > > > > > > >>> > > >> >      // handle known error 1
> > >> > > > > > > > >>> > > >> >   else if(error == KNOWN_ERROR_2)
> > >> > > > > > > > >>> > > >> >      // handle known error 2
> > >> > > > > > > > >>> > > >> >   else
> > >> > > > > > > > >>> > > >> >      throw
> Errors.forCode(error).exception();
> > >> //
> > >> > or
> > >> > > some
> > >> > > > > > > other
> > >> > > > > > > > >>> > default
> > >> > > > > > > > >>> > > >> > behavior
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > This works because we have a convention
> that
> > >> and
> > >> > > error
> > >> > > > > is
> > >> > > > > > > > >>> something
> > >> > > > > > > > >>> > > that
> > >> > > > > > > > >>> > > >> > prevented your getting the response so the
> > >> default
> > >> > > > > > handling
> > >> > > > > > > > >>> case is
> > >> > > > > > > > >>> > > sane
> > >> > > > > > > > >>> > > >> > and forward compatible. It is tempting to
> use
> > >> the
> > >> > > error
> > >> > > > > > code
> > >> > > > > > > > to
> > >> > > > > > > > >>> > convey
> > >> > > > > > > > >>> > > >> > information in the success case. For
> example
> > we
> > >> > > could
> > >> > > > > use
> > >> > > > > > > > error
> > >> > > > > > > > >>> > codes
> > >> > > > > > > > >>> > > to
> > >> > > > > > > > >>> > > >> > encode whether quotas were enforced,
> whether
> > >> the
> > >> > > request
> > >> > > > > > was
> > >> > > > > > > > >>> served
> > >> > > > > > > > >>> > > out
> > >> > > > > > > > >>> > > >> of
> > >> > > > > > > > >>> > > >> > cache, whether the stock market is up
> today,
> > or
> > >> > > > > whatever.
> > >> > > > > > > The
> > >> > > > > > > > >>> > problem
> > >> > > > > > > > >>> > > is
> > >> > > > > > > > >>> > > >> > that since these are not errors as far as
> the
> > >> > > client is
> > >> > > > > > > > >>> concerned it
> > >> > > > > > > > >>> > > >> should
> > >> > > > > > > > >>> > > >> > not throw an exception but process the
> > >> response,
> > >> > > but now
> > >> > > > > > we
> > >> > > > > > > > >>> created
> > >> > > > > > > > >>> > an
> > >> > > > > > > > >>> > > >> > explicit requirement that that error be
> > handled
> > >> > > > > explicitly
> > >> > > > > > > > >>> since it
> > >> > > > > > > > >>> > is
> > >> > > > > > > > >>> > > >> > different. I really think that this kind of
> > >> > > information
> > >> > > > > is
> > >> > > > > > > not
> > >> > > > > > > > >>> an
> > >> > > > > > > > >>> > > error,
> > >> > > > > > > > >>> > > >> it
> > >> > > > > > > > >>> > > >> > is just information, and if we want it in
> the
> > >> > > response
> > >> > > > > we
> > >> > > > > > > > >>> should do
> > >> > > > > > > > >>> > > the
> > >> > > > > > > > >>> > > >> > right thing and add a new field to the
> > >> response.
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > I think you saw the Samza bug that was
> > >> literally
> > >> > an
> > >> > > > > > example
> > >> > > > > > > of
> > >> > > > > > > > >>> this
> > >> > > > > > > > >>> > > >> > happening and leading to an infinite retry
> > >> loop.
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > Further more I really want to emphasize
> that
> > >> > hitting
> > >> > > > > your
> > >> > > > > > > > quota
> > >> > > > > > > > >>> in
> > >> > > > > > > > >>> > the
> > >> > > > > > > > >>> > > >> > design that Adi has proposed is actually
> not
> > an
> > >> > > error
> > >> > > > > > > > condition
> > >> > > > > > > > >>> at
> > >> > > > > > > > >>> > > all.
> > >> > > > > > > > >>> > > >> It
> > >> > > > > > > > >>> > > >> > is totally reasonable in any bootstrap
> > >> situation
> > >> > to
> > >> > > > > > > > >>> intentionally
> > >> > > > > > > > >>> > > want to
> > >> > > > > > > > >>> > > >> > run at the limit the system imposes on you.
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > -Jay
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> > On Mon, Mar 16, 2015 at 4:27 PM, Jun Rao <
> > >> > > > > > [email protected]>
> > >> > > > > > > > >>> wrote:
> > >> > > > > > > > >>> > > >> >
> > >> > > > > > > > >>> > > >> >> It's probably useful for a client to know
> > >> whether
> > >> > > its
> > >> > > > > > > > requests
> > >> > > > > > > > >>> are
> > >> > > > > > > > >>> > > >> >> throttled or not (e.g., for monitoring and
> > >> > > alerting).
> > >> > > > > > From
> > >> > > > > > > > that
> > >> > > > > > > > >>> > > >> >> perspective, option B (delay the requests
> > and
> > >> > > return an
> > >> > > > > > > > error)
> > >> > > > > > > > >>> > seems
> > >> > > > > > > > >>> > > >> >> better.
> > >> > > > > > > > >>> > > >> >>
> > >> > > > > > > > >>> > > >> >> Thanks,
> > >> > > > > > > > >>> > > >> >>
> > >> > > > > > > > >>> > > >> >> Jun
> > >> > > > > > > > >>> > > >> >>
> > >> > > > > > > > >>> > > >> >> On Wed, Mar 4, 2015 at 3:51 PM, Aditya
> > >> Auradkar <
> > >> > > > > > > > >>> > > >> >> [email protected]> wrote:
> > >> > > > > > > > >>> > > >> >>
> > >> > > > > > > > >>> > > >> >> > Posted a KIP for quotas in kafka.
> > >> > > > > > > > >>> > > >> >> >
> > >> > > > > > > > >>> >
> > >> > > > > >
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > >> > > > > > > > >>> > > >> >> >
> > >> > > > > > > > >>> > > >> >> > Appreciate any feedback.
> > >> > > > > > > > >>> > > >> >> >
> > >> > > > > > > > >>> > > >> >> > Aditya
> > >> > > > > > > > >>> > > >> >> >
> > >> > > > > > > > >>> > > >> >>
> > >> > > > > > > > >>> > > >>
> > >> > > > > > > > >>> > >
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>> > --
> > >> > > > > > > > >>> > Thanks,
> > >> > > > > > > > >>> > Ewen
> > >> > > > > > > > >>> >
> > >> > > > > > > > >>>
> > >> > > > > > > > >>>
> > >> > > > > > > > >>>
> > >> > > > > > > > >>> --
> > >> > > > > > > > >>> -- Guozhang
> > >> > > > > > > > >>>
> > >> > > > > > > > >>
> > >> > > > > > > > >>
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> > > --
> > > Sent from Gmail Mobile
> > >
> >
> >
> > --
> > Sent from Gmail Mobile
> >
>

RE: [KIP-DISCUSSION] KIP-13 Quotas

Reply via email to