please correct me if I am missing sth here. I am not understanding how
would throttle work without cooperation/back-off from producer. new Java
producer supports non-blocking API. why would delayed response be able to
slow down producer? producer will continue to fire async sends.

On Mon, Mar 16, 2015 at 10:58 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> I think we are really discussing two separate issues here:
>
> 1. Whether we should a) append-then-block-then-returnOKButThrottled or b)
> block-then-returnFailDuetoThrottled for quota actions on produce requests.
>
> Both these approaches assume some kind of well-behaveness of the clients:
> option a) assumes the client sets an proper timeout value while can just
> ignore "OKButThrottled" response, while option b) assumes the client
> handles the "FailDuetoThrottled" appropriately. For any malicious clients
> that, for example, just keep retrying either intentionally or not, neither
> of these approaches are actually effective.
>
> 2. For "OKButThrottled" and "FailDuetoThrottled" responses, shall we encode
> them as error codes or augment the protocol to use a separate field
> indicating "status codes".
>
> Today we have already incorporated some status code as error codes in the
> responses, e.g. ReplicaNotAvailable in MetadataResponse, the pros of this
> is of course using a single field for response status like the HTTP status
> codes, while the cons is that it requires clients to handle the error codes
> carefully.
>
> I think maybe we can actually extend the single-code approach to overcome
> its drawbacks, that is, wrap the error codes semantics to the users so that
> users do not need to handle the codes one-by-one. More concretely,
> following Jay's example the client could write sth. like this:
>
>
> -----------------
>
>   if(error.isOK())
>      // status code is good or the code can be simply ignored for this
> request type, process the request
>   else if(error.needsRetry())
>      // throttled, transient error, etc: retry
>   else if(error.isFatal())
>      // non-retriable errors, etc: notify / terminate / other handling
>
> -----------------
>
> Only when the clients really want to handle, for example FailDuetoThrottled
> status code specifically, it needs to:
>
>   if(error.isOK())
>      // status code is good or the code can be simply ignored for this
> request type, process the request
>   else if(error == FailDuetoThrottled )
>      // throttled: log it
>   else if(error.needsRetry())
>      // transient error, etc: retry
>   else if(error.isFatal())
>      // non-retriable errors, etc: notify / terminate / other handling
>
> -----------------
>
> And for implementation we can probably group the codes accordingly like
> HTTP status code such that we can do:
>
> boolean Error.isOK() {
>   return code < 300 && code >= 200;
> }
>
> Guozhang
>
> On Mon, Mar 16, 2015 at 10:24 PM, Ewen Cheslack-Postava <e...@confluent.io
> >
> wrote:
>
> > Agreed that trying to shoehorn non-error codes into the error field is a
> > bad idea. It makes it *way* too easy to write code that looks (and should
> > be) correct but is actually incorrect. If necessary, I think it's much
> > better to to spend a couple of extra bytes to encode that information
> > separately (a "status" or "warning" section of the response). An
> indication
> > that throttling is occurring is something I'd expect to be indicated by a
> > bit flag in the response rather than as an error code.
> >
> > Gwen - I think an error code makes sense when the request actually
> failed.
> > Option B, which Jun was advocating, would have appended the messages
> > successfully. If the rate-limiting case you're talking about had
> > successfully committed the messages, I would say that's also a bad use of
> > error codes.
> >
> >
> > On Mon, Mar 16, 2015 at 10:16 PM, Gwen Shapira <gshap...@cloudera.com>
> > wrote:
> >
> > > We discussed an error code for rate-limiting (which I think made
> > > sense), isn't it a similar case?
> > >
> > > On Mon, Mar 16, 2015 at 10:10 PM, Jay Kreps <jay.kr...@gmail.com>
> wrote:
> > > > My concern is that as soon as you start encoding non-error response
> > > > information into error codes the next question is what to do if two
> > such
> > > > codes apply (i.e. you have a replica down and the response is
> > quota'd). I
> > > > think I am trying to argue that error should mean "why we failed your
> > > > request", for which there will really only be one reason, and any
> other
> > > > useful information we want to send back is just another field in the
> > > > response.
> > > >
> > > > -Jay
> > > >
> > > > On Mon, Mar 16, 2015 at 9:51 PM, Gwen Shapira <gshap...@cloudera.com
> >
> > > wrote:
> > > >
> > > >> I think its not too late to reserve a set of error codes (200-299?)
> > > >> for "non-error" codes.
> > > >>
> > > >> It won't be backward compatible (i.e. clients that currently do
> "else
> > > >> throw" will throw on non-errors), but perhaps its worthwhile.
> > > >>
> > > >> On Mon, Mar 16, 2015 at 9:42 PM, Jay Kreps <jay.kr...@gmail.com>
> > wrote:
> > > >> > Hey Jun,
> > > >> >
> > > >> > I'd really really really like to avoid that. Having just spent a
> > > bunch of
> > > >> > time on the clients, using the error codes to encode other
> > information
> > > >> > about the response is super dangerous. The error handling is one
> of
> > > the
> > > >> > hardest parts of the client (Guozhang chime in here).
> > > >> >
> > > >> > Generally the error handling looks like
> > > >> >   if(error == none)
> > > >> >      // good, process the request
> > > >> >   else if(error == KNOWN_ERROR_1)
> > > >> >      // handle known error 1
> > > >> >   else if(error == KNOWN_ERROR_2)
> > > >> >      // handle known error 2
> > > >> >   else
> > > >> >      throw Errors.forCode(error).exception(); // or some other
> > default
> > > >> > behavior
> > > >> >
> > > >> > This works because we have a convention that and error is
> something
> > > that
> > > >> > prevented your getting the response so the default handling case
> is
> > > sane
> > > >> > and forward compatible. It is tempting to use the error code to
> > convey
> > > >> > information in the success case. For example we could use error
> > codes
> > > to
> > > >> > encode whether quotas were enforced, whether the request was
> served
> > > out
> > > >> of
> > > >> > cache, whether the stock market is up today, or whatever. The
> > problem
> > > is
> > > >> > that since these are not errors as far as the client is concerned
> it
> > > >> should
> > > >> > not throw an exception but process the response, but now we
> created
> > an
> > > >> > explicit requirement that that error be handled explicitly since
> it
> > is
> > > >> > different. I really think that this kind of information is not an
> > > error,
> > > >> it
> > > >> > is just information, and if we want it in the response we should
> do
> > > the
> > > >> > right thing and add a new field to the response.
> > > >> >
> > > >> > I think you saw the Samza bug that was literally an example of
> this
> > > >> > happening and leading to an infinite retry loop.
> > > >> >
> > > >> > Further more I really want to emphasize that hitting your quota in
> > the
> > > >> > design that Adi has proposed is actually not an error condition at
> > > all.
> > > >> It
> > > >> > is totally reasonable in any bootstrap situation to intentionally
> > > want to
> > > >> > run at the limit the system imposes on you.
> > > >> >
> > > >> > -Jay
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Mon, Mar 16, 2015 at 4:27 PM, Jun Rao <j...@confluent.io>
> wrote:
> > > >> >
> > > >> >> It's probably useful for a client to know whether its requests
> are
> > > >> >> throttled or not (e.g., for monitoring and alerting). From that
> > > >> >> perspective, option B (delay the requests and return an error)
> > seems
> > > >> >> better.
> > > >> >>
> > > >> >> Thanks,
> > > >> >>
> > > >> >> Jun
> > > >> >>
> > > >> >> On Wed, Mar 4, 2015 at 3:51 PM, Aditya Auradkar <
> > > >> >> aaurad...@linkedin.com.invalid> wrote:
> > > >> >>
> > > >> >> > Posted a KIP for quotas in kafka.
> > > >> >> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas
> > > >> >> >
> > > >> >> > Appreciate any feedback.
> > > >> >> >
> > > >> >> > Aditya
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> >
> >
> > --
> > Thanks,
> > Ewen
> >
>
>
>
> --
> -- Guozhang
>

Reply via email to