I created a KIP for this suggestion: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks
Basically documenting what was already discussed here. Comments will be awesome! Gwen On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira <gshap...@cloudera.com> wrote: > The errors are part of the KIP process now, so I think the clients are safe :) > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals > > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin <steve.mo...@gmail.com> wrote: >> Agree errors should be part of the protocol >> >>> On Jan 15, 2015, at 17:59, Gwen Shapira <gshap...@cloudera.com> wrote: >>> >>> Hi, >>> >>> I got convinced by Joe and Dana that errors are indeed part of the >>> protocol and can't be randomly added. >>> >>> So, it looks like we need to bump version of ProduceRequest in the >>> following way: >>> Version 0 -> accept acks >1. I think we should keep the existing >>> behavior too (i.e. not replace it with -1) to avoid surprising >>> clients, but I'm willing to hear other opinions. >>> Version 1 -> do not accept acks >1 and return an error. >>> Are we ok with the error I added in KAFKA-1697? We can use something >>> less specific like InvalidRequestParameter. This error can be reused >>> in the future and reduce the need to add errors, but will also be less >>> clear to the client and its users. Maybe even add the error message >>> string to the protocol in addition to the error code? (since we are >>> bumping versions....) >>> >>> I think maintaining the old version throughout 0.8.X makes sense. IMO >>> dropping it for 0.9 is feasible, but I'll let client owners help make >>> that call. >>> >>> Am I missing anything? Should I start a KIP? It seems like a KIP-type >>> discussion :) >>> >>> Gwen >>> >>> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava >>> <e...@confluent.io> wrote: >>>> Gwen, >>>> >>>> I think the only option that wouldn't require a protocol version change is >>>> the one where acks > 1 is converted to acks = -1 since it's the only one >>>> that doesn't potentially break older clients. The protocol guide says that >>>> the expected upgrade path is servers first, then clients, so old clients, >>>> including non-java clients, that may be using acks > 1 should be able to >>>> work with a new broker version. >>>> >>>> It's more work, but I think dealing with the protocol change is the right >>>> thing to do since it eventually gets us to the behavior I think is better >>>> -- >>>> the broker should reject requests with invalid values. I think Joe and I >>>> were basically in agreement. In my mind the major piece missing from his >>>> description is how long we're going to maintain his "case 0" behavior. It's >>>> impractical to maintain old versions forever, but it sounds like there >>>> hasn't been a decision on how long to maintain them. Maybe that's another >>>> item to add to KIPs -- protocol versions and behavior need to be listed as >>>> deprecated and the earliest version in which they'll be removed should be >>>> specified so users can understand which versions are guaranteed to be >>>> compatible, even if they're using (well-written) non-java clients. >>>> >>>> -Ewen >>>> >>>> >>>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers <dana.pow...@gmail.com> >>>>> wrote: >>>>> >>>>>> clients don't break on unknown errors >>>>> >>>>> maybe true for the official java clients, but I dont think the assumption >>>>> holds true for community-maintained clients and users of those clients. >>>>> kafka-python generally follows the fail-fast philosophy and raises an >>>>> exception on any unrecognized error code in any server response. in this >>>>> case, kafka-python allows users to set their own required-acks policy when >>>>> creating a producer instance. It is possible that users of kafka-python >>>>> have deployed producer code that uses ack>1 -- perhaps in production >>>>> environments -- and for those users the new error code will crash their >>>>> producer code. I would not be surprised if the same were true of other >>>>> community clients. >>>>> >>>>> *one reason for the fail-fast approach is that there isn't great >>>>> documentation on what errors to expect for each request / response -- so >>>>> we >>>>> use failures to alert that some error case is not handled properly. and >>>>> because of that, introducing new error cases without bumping the api >>>>> version is likely to cause those errors to get raised/thrown all the way >>>>> back up to the user. of course we (client maintainers) can fix the issues >>>>> in the client libraries and suggest users upgrade, but it's not the ideal >>>>> situation. >>>>> >>>>> >>>>> long-winded way of saying: I agree w/ Joe. >>>>> >>>>> -Dana >>>>> >>>>> >>>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira <gshap...@cloudera.com> >>>>> wrote: >>>>> >>>>>> Is the protocol bump caused by the behavior change or the new error >>>>>> code? >>>>>> >>>>>> 1) IMO, error_codes are data, and clients can expect to receive errors >>>>>> that they don't understand (i.e. unknown errors). AFAIK, clients don't >>>>>> break on unknown errors, they are simple more challenging to debug. If >>>>>> we document the new behavior, then its definitely debuggable and >>>>>> fixable. >>>>>> >>>>>> 2) The behavior change is basically a deprecation - i.e. acks > 1 were >>>>>> never documented, and are not supported by Kafka clients starting with >>>>>> version 0.8.2. I'm not sure this requires a protocol bump either, >>>>>> although its a better case than new error codes. >>>>>> >>>>>> Thanks, >>>>>> Gwen >>>>>> >>>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein <joe.st...@stealth.ly> >>>>>> wrote: >>>>>>> Looping in the mailing list that the client developers live on because >>>>>> they >>>>>>> are all not on dev (though they should be if they want to be helping >>>>>>> to >>>>>>> build the best client libraries they can). >>>>>>> >>>>>>> I whole hardily believe that we need to not break existing >>>>>>> functionality >>>>>> of >>>>>>> the client protocol, ever. >>>>>>> >>>>>>> There are many reasons for this and we have other threads on the >>>>>>> mailing >>>>>>> list where we are discussing that topic (no pun intended) that I don't >>>>>> want >>>>>>> to re-hash here. >>>>>>> >>>>>>> If we change wire protocol functionality OR the binary format (either) >>>>>>> we >>>>>>> must bump version AND treat version as a feature flag with backward >>>>>>> compatibility support until it is deprecated for some time for folks >>>>>>> to >>>>>> deal >>>>>>> with it. >>>>>>> >>>>>>> match version = { >>>>>>> case 0: keepDoingWhatWeWereDoing() >>>>>>> case 1: doNewStuff() >>>>>>> case 2: doEvenMoreNewStuff() >>>>>>> } >>>>>>> >>>>>>> has to be a practice we adopt imho ... I know feature flags can be >>>>>> construed >>>>>>> as "messy code" but I am eager to hear another (better? different?) >>>>>> solution >>>>>>> to this. >>>>>>> >>>>>>> If we don't do a feature flag like this specifically with this change >>>>>> then >>>>>>> what happens is that someone upgrades their brokers with a rolling >>>>>> restart >>>>>>> in 0.8.3 and every single one of their producer requests start to fail >>>>>> and >>>>>>> they have a major production outage. eeeek!!!! >>>>>>> >>>>>>> I do 100% agree that > 1 makes no sense and we *REALLY* need people to >>>>>> start >>>>>>> using 0,1,-1 but we need to-do that in a way that is going to work for >>>>>>> everyone. >>>>>>> >>>>>>> Old producers and consumers must keep working with new brokers and if >>>>>>> we >>>>>> are >>>>>>> not going to support that then I am unclear what the use of "version" >>>>>>> is >>>>>>> based on our original intentions of having it because of the >>>>>>> 0.7=>-0.8. >>>>>> We >>>>>>> said no more breaking changes when we did that. >>>>>>> >>>>>>> - Joe Stein >>>>>>> >>>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava < >>>>>> e...@confluent.io> >>>>>>> wrote: >>>>>>>> >>>>>>>> Right, so this looks like it could create an issue similar to what's >>>>>>>> currently being discussed in >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 where users now get >>>>>>>> errors >>>>>>>> under conditions when they previously wouldn't. Old clients won't >>>>>>>> even >>>>>>>> know >>>>>>>> about the error code, so besides failing they won't even be able to >>>>>>>> log >>>>>>>> any >>>>>>>> meaningful error messages. >>>>>>>> >>>>>>>> I think there are two options for compatibility: >>>>>>>> >>>>>>>> 1. An alternative change is to remove the ack > 1 code, but silently >>>>>>>> "upgrade" requests with acks > 1 to acks = -1. This isn't the same as >>>>>>>> other >>>>>>>> changes to behavior since the interaction between the client and >>>>>>>> server >>>>>>>> remains the same, no error codes change, etc. The client might just >>>>>>>> see >>>>>>>> some increased latency since the message might need to be replicated >>>>>>>> to >>>>>>>> more brokers than they requested. >>>>>>>> 2. Split this into two patches, one that bumps the protocol version >>>>>>>> on >>>>>>>> that >>>>>>>> message to include the new error code but maintains both old (now >>>>>>>> deprecated) and new behavior, then a second that would be applied in >>>>>>>> a >>>>>>>> later release that removes the old protocol + code for handling acks >>>>>> 1. >>>>>>>> >>>>>>>> 2 is probably the right thing to do. If we specify the release when >>>>>> we'll >>>>>>>> remove the deprecated protocol at the time of deprecation it makes >>>>>> things >>>>>>>> a >>>>>>>> lot easier for people writing non-java clients and could give users >>>>>> better >>>>>>>> predictability (e.g. if clients are at most 1 major release behind >>>>>>>> brokers, >>>>>>>> they'll remain compatible but possibly use deprecated features). >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira <gshap...@cloudera.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Kafka Devs, >>>>>>>>> >>>>>>>>> We are working on KAFKA-1697 - remove code related to ack>1 on the >>>>>>>>> broker. Per Neha's suggestion, I'd like to give everyone a heads up >>>>>>>>> on >>>>>>>>> what these changes mean. >>>>>>>>> >>>>>>>>> Once this patch is included, any produce requests that include >>>>>>>>> request.required.acks > 1 will result in an exception. This will be >>>>>>>>> InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and >>>>>>>>> UnknownException in existing versions (sorry, but I can't add error >>>>>>>>> codes retroactively). >>>>>>>>> >>>>>>>>> This behavior is already enforced by 0.8.2 producers (sync and >>>>>>>>> new), >>>>>>>>> but we expect impact on users with older producers that relied on >>>>>>>>> acks >>>>>>>>>> 1 and external clients (i.e python, go, etc). >>>>>>>>> >>>>>>>>> Users who relied on acks > 1 are expected to switch to using acks = >>>>>>>>> -1 >>>>>>>>> and a min.isr parameter than matches their user case. >>>>>>>>> >>>>>>>>> This change was discussed in the past in the context of KAFKA-1555 >>>>>>>>> (min.isr), but let us know if you have any questions or concerns >>>>>>>>> regarding this change. >>>>>>>>> >>>>>>>>> Gwen >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Thanks, >>>>>>>> Ewen >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups >>>>>>> "kafka-clients" group. >>>>>>> To unsubscribe from this group and stop receiving emails from it, send >>>>>>> an >>>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >>>>>>> To post to this group, send email to kafka-clie...@googlegroups.com. >>>>>>> Visit this group at http://groups.google.com/group/kafka-clients. >>>>>>> To view this discussion on the web visit >>>>>> >>>>>> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com >>>>>> . >>>>>>> >>>>>>> For more options, visit https://groups.google.com/d/optout. >>>>>> >>>>>> -- >>>>>> You received this message because you are subscribed to the Google >>>>>> Groups >>>>>> "kafka-clients" group. >>>>>> To unsubscribe from this group and stop receiving emails from it, send >>>>>> an >>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >>>>>> To post to this group, send email to kafka-clie...@googlegroups.com. >>>>>> Visit this group at http://groups.google.com/group/kafka-clients. >>>>>> To view this discussion on the web visit >>>>>> >>>>>> https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com >>>>>> . >>>>>> For more options, visit https://groups.google.com/d/optout. >>>> >>>> >>>> >>>> >>>> -- >>>> Thanks, >>>> Ewen