You are right, shoe-horning status into an error field is a bad idea. While many projects use a single "status" field to indicate different error and non-error states, it doesn't seem like a good fit for the current Kafka implementation.
Do you think that adding a "status" field to our protocol is feasible at this point? On Mon, Mar 16, 2015 at 10:24 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote: > Agreed that trying to shoehorn non-error codes into the error field is a > bad idea. It makes it *way* too easy to write code that looks (and should > be) correct but is actually incorrect. If necessary, I think it's much > better to to spend a couple of extra bytes to encode that information > separately (a "status" or "warning" section of the response). An indication > that throttling is occurring is something I'd expect to be indicated by a > bit flag in the response rather than as an error code. > > Gwen - I think an error code makes sense when the request actually failed. > Option B, which Jun was advocating, would have appended the messages > successfully. If the rate-limiting case you're talking about had > successfully committed the messages, I would say that's also a bad use of > error codes. > > > On Mon, Mar 16, 2015 at 10:16 PM, Gwen Shapira <gshap...@cloudera.com> > wrote: > >> We discussed an error code for rate-limiting (which I think made >> sense), isn't it a similar case? >> >> On Mon, Mar 16, 2015 at 10:10 PM, Jay Kreps <jay.kr...@gmail.com> wrote: >> > My concern is that as soon as you start encoding non-error response >> > information into error codes the next question is what to do if two such >> > codes apply (i.e. you have a replica down and the response is quota'd). I >> > think I am trying to argue that error should mean "why we failed your >> > request", for which there will really only be one reason, and any other >> > useful information we want to send back is just another field in the >> > response. >> > >> > -Jay >> > >> > On Mon, Mar 16, 2015 at 9:51 PM, Gwen Shapira <gshap...@cloudera.com> >> wrote: >> > >> >> I think its not too late to reserve a set of error codes (200-299?) >> >> for "non-error" codes. >> >> >> >> It won't be backward compatible (i.e. clients that currently do "else >> >> throw" will throw on non-errors), but perhaps its worthwhile. >> >> >> >> On Mon, Mar 16, 2015 at 9:42 PM, Jay Kreps <jay.kr...@gmail.com> wrote: >> >> > Hey Jun, >> >> > >> >> > I'd really really really like to avoid that. Having just spent a >> bunch of >> >> > time on the clients, using the error codes to encode other information >> >> > about the response is super dangerous. The error handling is one of >> the >> >> > hardest parts of the client (Guozhang chime in here). >> >> > >> >> > Generally the error handling looks like >> >> > if(error == none) >> >> > // good, process the request >> >> > else if(error == KNOWN_ERROR_1) >> >> > // handle known error 1 >> >> > else if(error == KNOWN_ERROR_2) >> >> > // handle known error 2 >> >> > else >> >> > throw Errors.forCode(error).exception(); // or some other default >> >> > behavior >> >> > >> >> > This works because we have a convention that and error is something >> that >> >> > prevented your getting the response so the default handling case is >> sane >> >> > and forward compatible. It is tempting to use the error code to convey >> >> > information in the success case. For example we could use error codes >> to >> >> > encode whether quotas were enforced, whether the request was served >> out >> >> of >> >> > cache, whether the stock market is up today, or whatever. The problem >> is >> >> > that since these are not errors as far as the client is concerned it >> >> should >> >> > not throw an exception but process the response, but now we created an >> >> > explicit requirement that that error be handled explicitly since it is >> >> > different. I really think that this kind of information is not an >> error, >> >> it >> >> > is just information, and if we want it in the response we should do >> the >> >> > right thing and add a new field to the response. >> >> > >> >> > I think you saw the Samza bug that was literally an example of this >> >> > happening and leading to an infinite retry loop. >> >> > >> >> > Further more I really want to emphasize that hitting your quota in the >> >> > design that Adi has proposed is actually not an error condition at >> all. >> >> It >> >> > is totally reasonable in any bootstrap situation to intentionally >> want to >> >> > run at the limit the system imposes on you. >> >> > >> >> > -Jay >> >> > >> >> > >> >> > >> >> > On Mon, Mar 16, 2015 at 4:27 PM, Jun Rao <j...@confluent.io> wrote: >> >> > >> >> >> It's probably useful for a client to know whether its requests are >> >> >> throttled or not (e.g., for monitoring and alerting). From that >> >> >> perspective, option B (delay the requests and return an error) seems >> >> >> better. >> >> >> >> >> >> Thanks, >> >> >> >> >> >> Jun >> >> >> >> >> >> On Wed, Mar 4, 2015 at 3:51 PM, Aditya Auradkar < >> >> >> aaurad...@linkedin.com.invalid> wrote: >> >> >> >> >> >> > Posted a KIP for quotas in kafka. >> >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas >> >> >> > >> >> >> > Appreciate any feedback. >> >> >> > >> >> >> > Aditya >> >> >> > >> >> >> >> >> >> > > > > -- > Thanks, > Ewen