> clients don't break on unknown errors maybe true for the official java clients, but I dont think the assumption holds true for community-maintained clients and users of those clients. kafka-python generally follows the fail-fast philosophy and raises an exception on any unrecognized error code in any server response. in this case, kafka-python allows users to set their own required-acks policy when creating a producer instance. It is possible that users of kafka-python have deployed producer code that uses ack>1 -- perhaps in production environments -- and for those users the new error code will crash their producer code. I would not be surprised if the same were true of other community clients.
*one reason for the fail-fast approach is that there isn't great documentation on what errors to expect for each request / response -- so we use failures to alert that some error case is not handled properly. and because of that, introducing new error cases without bumping the api version is likely to cause those errors to get raised/thrown all the way back up to the user. of course we (client maintainers) can fix the issues in the client libraries and suggest users upgrade, but it's not the ideal situation. long-winded way of saying: I agree w/ Joe. -Dana On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira <gshap...@cloudera.com> wrote: > Is the protocol bump caused by the behavior change or the new error code? > > 1) IMO, error_codes are data, and clients can expect to receive errors > that they don't understand (i.e. unknown errors). AFAIK, clients don't > break on unknown errors, they are simple more challenging to debug. If > we document the new behavior, then its definitely debuggable and > fixable. > > 2) The behavior change is basically a deprecation - i.e. acks > 1 were > never documented, and are not supported by Kafka clients starting with > version 0.8.2. I'm not sure this requires a protocol bump either, > although its a better case than new error codes. > > Thanks, > Gwen > > On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein <joe.st...@stealth.ly> wrote: > > Looping in the mailing list that the client developers live on because > they > > are all not on dev (though they should be if they want to be helping to > > build the best client libraries they can). > > > > I whole hardily believe that we need to not break existing functionality > of > > the client protocol, ever. > > > > There are many reasons for this and we have other threads on the mailing > > list where we are discussing that topic (no pun intended) that I don't > want > > to re-hash here. > > > > If we change wire protocol functionality OR the binary format (either) we > > must bump version AND treat version as a feature flag with backward > > compatibility support until it is deprecated for some time for folks to > deal > > with it. > > > > match version = { > > case 0: keepDoingWhatWeWereDoing() > > case 1: doNewStuff() > > case 2: doEvenMoreNewStuff() > > } > > > > has to be a practice we adopt imho ... I know feature flags can be > construed > > as "messy code" but I am eager to hear another (better? different?) > solution > > to this. > > > > If we don't do a feature flag like this specifically with this change > then > > what happens is that someone upgrades their brokers with a rolling > restart > > in 0.8.3 and every single one of their producer requests start to fail > and > > they have a major production outage. eeeek!!!! > > > > I do 100% agree that > 1 makes no sense and we *REALLY* need people to > start > > using 0,1,-1 but we need to-do that in a way that is going to work for > > everyone. > > > > Old producers and consumers must keep working with new brokers and if we > are > > not going to support that then I am unclear what the use of "version" is > > based on our original intentions of having it because of the 0.7=>-0.8. > We > > said no more breaking changes when we did that. > > > > - Joe Stein > > > > On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava < > e...@confluent.io> > > wrote: > >> > >> Right, so this looks like it could create an issue similar to what's > >> currently being discussed in > >> https://issues.apache.org/jira/browse/KAFKA-1649 where users now get > >> errors > >> under conditions when they previously wouldn't. Old clients won't even > >> know > >> about the error code, so besides failing they won't even be able to log > >> any > >> meaningful error messages. > >> > >> I think there are two options for compatibility: > >> > >> 1. An alternative change is to remove the ack > 1 code, but silently > >> "upgrade" requests with acks > 1 to acks = -1. This isn't the same as > >> other > >> changes to behavior since the interaction between the client and server > >> remains the same, no error codes change, etc. The client might just see > >> some increased latency since the message might need to be replicated to > >> more brokers than they requested. > >> 2. Split this into two patches, one that bumps the protocol version on > >> that > >> message to include the new error code but maintains both old (now > >> deprecated) and new behavior, then a second that would be applied in a > >> later release that removes the old protocol + code for handling acks > > 1. > >> > >> 2 is probably the right thing to do. If we specify the release when > we'll > >> remove the deprecated protocol at the time of deprecation it makes > things > >> a > >> lot easier for people writing non-java clients and could give users > better > >> predictability (e.g. if clients are at most 1 major release behind > >> brokers, > >> they'll remain compatible but possibly use deprecated features). > >> > >> > >> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira <gshap...@cloudera.com> > >> wrote: > >> > >> > Hi Kafka Devs, > >> > > >> > We are working on KAFKA-1697 - remove code related to ack>1 on the > >> > broker. Per Neha's suggestion, I'd like to give everyone a heads up on > >> > what these changes mean. > >> > > >> > Once this patch is included, any produce requests that include > >> > request.required.acks > 1 will result in an exception. This will be > >> > InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and > >> > UnknownException in existing versions (sorry, but I can't add error > >> > codes retroactively). > >> > > >> > This behavior is already enforced by 0.8.2 producers (sync and new), > >> > but we expect impact on users with older producers that relied on acks > >> > > 1 and external clients (i.e python, go, etc). > >> > > >> > Users who relied on acks > 1 are expected to switch to using acks = -1 > >> > and a min.isr parameter than matches their user case. > >> > > >> > This change was discussed in the past in the context of KAFKA-1555 > >> > (min.isr), but let us know if you have any questions or concerns > >> > regarding this change. > >> > > >> > Gwen > >> > > >> > >> > >> > >> -- > >> Thanks, > >> Ewen > > > > > > -- > > You received this message because you are subscribed to the Google Groups > > "kafka-clients" group. > > To unsubscribe from this group and stop receiving emails from it, send an > > email to kafka-clients+unsubscr...@googlegroups.com. > > To post to this group, send email to kafka-clie...@googlegroups.com. > > Visit this group at http://groups.google.com/group/kafka-clients. > > To view this discussion on the web visit > > > https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com > . > > > > For more options, visit https://groups.google.com/d/optout. > > -- > You received this message because you are subscribed to the Google Groups > "kafka-clients" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to kafka-clients+unsubscr...@googlegroups.com. > To post to this group, send email to kafka-clie...@googlegroups.com. > Visit this group at http://groups.google.com/group/kafka-clients. > To view this discussion on the web visit > https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com > . > For more options, visit https://groups.google.com/d/optout. >