<< For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 and then drop the ack > 1 support in trunk (w/o bumping up the protocol version)?
+1 On Mon, Jan 19, 2015 at 12:35 PM, Jun Rao <j...@confluent.io> wrote: > For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 > and then drop the ack > 1 support in trunk (w/o bumping up the protocol > version)? Thanks, > > Jun > > On Sun, Jan 18, 2015 at 8:24 PM, Gwen Shapira <gshap...@cloudera.com> > wrote: > >> Overall, agree on point #1, less sure on point #2. >> >> 1. Some protocols never ever add new errors, while others add errors >> without bumping versions. HTTP is a good example of the second type. >> HTTP-451 was added fairly recently, there are some errors specific to >> NGINX, etc. No one cares. I think we should properly document in the >> wire-protocol doc that new errors can be added, and I think we should >> strongly suggest (and implement ourselves) that unknown error codes >> should be shown to users (or at least logged), so they can be googled >> and understood through our documentation. >> In addition, hierarchy of error codes, so clients will know if an >> error is retry-able just by looking at the code could be nice. Same >> for adding an error string to the protocol. These are future >> enhancements that should be discussed separately. >> >> 2. I think we want to allow admins to upgrade their Kafka brokers >> without having to chase down clients in their organization and without >> getting blamed if clients break. I think it makes sense to have one >> version that will support existing behavior, but log warnings, so >> admins will know about misbehaving clients and can track them down >> before an upgrade that breaks them (or before the broken config causes >> them to lose data!). Hopefully this is indeed a very rare behavior and >> we are taking extra precaution for nothing, but I have customers where >> one traumatic upgrade means they will never upgrade a Kafka again, so >> I'm being conservative. >> >> Gwen >> >> >> On Sun, Jan 18, 2015 at 3:50 PM, Jun Rao <j...@confluent.io> wrote: >> > Overall, I agree with Jay on both points. >> > >> > 1. I think it's reasonable to add new error codes w/o bumping up the >> > protocol version. In most cases, by adding new error codes, we are just >> > refining the categorization of those unknown errors. So, a client >> shouldn't >> > behave worse than before as long as unknown errors have been properly >> > handled. >> > >> > 2. I think it's reasonable to just document that 0.8.2 will be the last >> > release that will support ack > 1 and remove the support completely in >> trunk >> > w/o bumping up the protocol. This is because (a) we never included ack >> > 1 >> > explicitly in the documentation and so the usage should be limited; (2) >> ack >> >> 1 doesn't provide the guarantee that people really want and so it >> > shouldn't really be used. >> > >> > Thanks, >> > >> > Jun >> > >> > >> > On Sun, Jan 18, 2015 at 11:03 AM, Jay Kreps <jay.kr...@gmail.com> >> wrote: >> >> >> >> Hey guys, >> >> >> >> I really think we are discussing two things here: >> >> >> >> How should we generally handle changes to the set of errors? Should >> >> introducing new errors be considered a protocol change or should we >> reserve >> >> the right to introduce new error codes? >> >> Given that this particular change is possibly incompatible, how should >> we >> >> handle it? >> >> >> >> I think it would be good for people who are responding here to be >> specific >> >> about which they are addressing. >> >> >> >> Here is what I think: >> >> >> >> 1. Errors should be extensible within a protocol version. >> >> >> >> We should change the protocol documentation to list the errors that >> can be >> >> given back from each api, their meaning, and how to handle them, BUT we >> >> should explicitly state that the set of errors are open ended. That is >> we >> >> should reserve the right to introduce new errors and explicitly state >> that >> >> clients need a blanket "unknown error" handling mechanism. The error >> can >> >> link to the protocol definition (something like "Unknown error 42, see >> >> protocol definition at http://link"). We could make this work really >> well by >> >> instructing all the clients to report the error in a very googlable >> way as >> >> Oracle does with their error format (e.g. "ORA-32") so that if you >> ever get >> >> the raw error google will take you to the definition. >> >> >> >> I agree that a more rigid definition seems like "right thing", but >> having >> >> just implemented two clients and spent a bunch of time on the server >> side, I >> >> think, it will work out poorly in practice. Here is why: >> >> >> >> I think we will make a lot of mistakes in nailing down the set of error >> >> codes up front and we will end up going through 3-4 churns of the >> protocol >> >> definition just realizing the set of errors that can be thrown. I >> think this >> >> churn will actually make life worse for clients that now have to >> figure out >> >> 7 identical versions of the protocol and will be a mess in terms of >> testing >> >> on the server side. I actually know this to be true because while >> >> implementing the clients I tried to guess the errors that could be >> thrown, >> >> then checked my guess by close code inspection. It turned out that I >> always >> >> missed things in my belief about errors, but more importantly even >> after >> >> close code inspection I found tons of other errors in my stress >> testing. >> >> In practice error handling always involves calling out one or two >> >> meaningful failures that have special recovery and then a blanket case >> that >> >> just handles everything else. It's true that some clients may not have >> done >> >> this well, but I think it is for the best if they fix that. >> >> Reserving the right to add errors doesn't mean we will do it without >> care. >> >> We will think through each change and decide whether giving a little >> more >> >> precision in the error is worth the overhead and churn of a protocol >> version >> >> bump. >> >> >> >> 2. In this case in particular we should not introduce a new protocol >> >> version >> >> >> >> In this particular case we are saying that acks > 1 doesn't make sense >> and >> >> we want to give an error to people specifying this so that they change >> their >> >> configuration. This is a configuration that few people use and we want >> to >> >> just make it an error. The bad behavior will just be that the error >> will not >> >> be as good as it could be. I think that is a better tradeoff than >> >> introducing a separate protocol version (this may be true of the java >> >> clients too). >> >> >> >> We will have lots of cases like this in the future and we aren't going >> to >> >> want to churn the protocol for each of them. For example we previously >> had >> >> to get more precise about which characters were legal and which >> illegal in >> >> topic names. >> >> >> >> -Jay >> >> >> >> On Fri, Jan 16, 2015 at 11:55 AM, Gwen Shapira <gshap...@cloudera.com> >> >> wrote: >> >>> >> >>> I updated the KIP: Using acks > 1 in version 0 will log a WARN message >> >>> in the broker about client using deprecated behavior (suggested by Joe >> >>> in the JIRA, and I think it makes sense). >> >>> >> >>> Gwen >> >>> >> >>> On Fri, Jan 16, 2015 at 10:40 AM, Gwen Shapira <gshap...@cloudera.com >> > >> >>> wrote: >> >>> > How about we continue the discussion on this thread, so we won't >> lose >> >>> > the context of this discussion, and put it up for VOTE when this has >> >>> > been finalized? >> >>> > >> >>> > On Fri, Jan 16, 2015 at 10:22 AM, Neha Narkhede <n...@confluent.io> >> >>> > wrote: >> >>> >> Gwen, >> >>> >> >> >>> >> KIP write-up looks good. According to the rest of the KIP process >> >>> >> proposal, >> >>> >> would you like to start a DISCUSS/VOTE thread for it? >> >>> >> >> >>> >> Thanks, >> >>> >> Neha >> >>> >> >> >>> >> On Fri, Jan 16, 2015 at 9:37 AM, Ewen Cheslack-Postava >> >>> >> <e...@confluent.io> >> >>> >> wrote: >> >>> >> >> >>> >>> Gwen -- KIP write up looks good. Deprecation schedule probably >> needs >> >>> >>> to be >> >>> >>> more specific, but I think that discussion probably needs to >> happen >> >>> >>> after a >> >>> >>> solution is agreed upon. >> >>> >>> >> >>> >>> Jay -- I think "older clients will get a bad error message >> instead of >> >>> >>> a >> >>> >>> good one" isn't what would be happening with this change. >> Previously >> >>> >>> they >> >>> >>> wouldn't have received an error and they would have been able to >> >>> >>> produce >> >>> >>> messages. After the change they'll just receive this new error >> >>> >>> message >> >>> >>> which their clients can't possibly handle gracefully since it >> didn't >> >>> >>> exist >> >>> >>> when the client was written. Whether the acks > 1 setting was >> >>> >>> actually >> >>> >>> accomplishing what they thought doesn't matter. Someone could have >> >>> >>> reasonably read the docs on 0.8.1.1, thought acks = 2 is an ok >> >>> >>> setting for >> >>> >>> their applications, set it as a default across a bunch of apps, >> then >> >>> >>> follow >> >>> >>> the recommended upgrade path of updating brokers to 0.8.2 and all >> >>> >>> their >> >>> >>> apps will start failing on produce requests. >> >>> >>> >> >>> >>> >> >>> >>> On Thu, Jan 15, 2015 at 8:27 PM, Jay Kreps <jay.kr...@gmail.com> >> >>> >>> wrote: >> >>> >>> >> >>> >>> > This is a good case to discuss. >> >>> >>> > >> >>> >>> > Let's figure the general case of how we want to handle errors >> and >> >>> >>> > get >> >>> >>> that >> >>> >>> > documented in the protocol. The problem right now is that we >> give >> >>> >>> > no >> >>> >>> > guidance on this. I actually thought Gwen's suggestion made >> sense >> >>> >>> > on the >> >>> >>> > guidance we should have given which is that we will enumerate a >> set >> >>> >>> > of >> >>> >>> > errors and their meaning for each API but it is possible that >> other >> >>> >>> errors >> >>> >>> > will occur and they should be handled (maybe poorly) in the same >> >>> >>> > way >> >>> >>> > UNKNOWN_ERROR is handled which is our normal escape hatch for >> >>> >>> > things like >> >>> >>> > OOMException. >> >>> >>> > >> >>> >>> > I really do think we shouldn't be dogmatic here: In considering >> a >> >>> >>> > change >> >>> >>> > to errors we should consider the potential ill-effect vs the >> >>> >>> > complexity >> >>> >>> of >> >>> >>> > yet another protocol version. >> >>> >>> > >> >>> >>> > In this case I actually am not sure we need to bump the protocol >> >>> >>> > because >> >>> >>> > the whole point of the change was to make a setting we think >> >>> >>> > doesn't make >> >>> >>> > sense break, right? Well this will break it. It seems like the >> only >> >>> >>> > downside is that older clients will get a bad error message >> instead >> >>> >>> > of a >> >>> >>> > good one. But it isn't like we will have rendered a client >> >>> >>> > unusable, it >> >>> >>> is >> >>> >>> > just that they will need to change their config. >> >>> >>> > >> >>> >>> > -Jay >> >>> >>> > >> >>> >>> > On Thu, Jan 15, 2015 at 6:14 PM, Gwen Shapira >> >>> >>> > <gshap...@cloudera.com> >> >>> >>> > wrote: >> >>> >>> > >> >>> >>> >> I created a KIP for this suggestion: >> >>> >>> >> >> >>> >>> >> >> >>> >>> >> >> >>> >>> >> >>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks >> >>> >>> >> >> >>> >>> >> Basically documenting what was already discussed here. >> Comments >> >>> >>> >> will >> >>> >>> >> be awesome! >> >>> >>> >> >> >>> >>> >> Gwen >> >>> >>> >> >> >>> >>> >> On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira >> >>> >>> >> <gshap...@cloudera.com> >> >>> >>> >> wrote: >> >>> >>> >> > The errors are part of the KIP process now, so I think the >> >>> >>> >> > clients are >> >>> >>> >> safe :) >> >>> >>> >> > >> >>> >>> >> > >> >>> >>> >> >> >>> >>> >> >>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals >> >>> >>> >> > >> >>> >>> >> > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin >> >>> >>> >> > <steve.mo...@gmail.com> >> >>> >>> >> wrote: >> >>> >>> >> >> Agree errors should be part of the protocol >> >>> >>> >> >> >> >>> >>> >> >>> On Jan 15, 2015, at 17:59, Gwen Shapira >> >>> >>> >> >>> <gshap...@cloudera.com> >> >>> >>> >> wrote: >> >>> >>> >> >>> >> >>> >>> >> >>> Hi, >> >>> >>> >> >>> >> >>> >>> >> >>> I got convinced by Joe and Dana that errors are indeed >> part of >> >>> >>> >> >>> the >> >>> >>> >> >>> protocol and can't be randomly added. >> >>> >>> >> >>> >> >>> >>> >> >>> So, it looks like we need to bump version of >> ProduceRequest in >> >>> >>> >> >>> the >> >>> >>> >> >>> following way: >> >>> >>> >> >>> Version 0 -> accept acks >1. I think we should keep the >> >>> >>> >> >>> existing >> >>> >>> >> >>> behavior too (i.e. not replace it with -1) to avoid >> surprising >> >>> >>> >> >>> clients, but I'm willing to hear other opinions. >> >>> >>> >> >>> Version 1 -> do not accept acks >1 and return an error. >> >>> >>> >> >>> Are we ok with the error I added in KAFKA-1697? We can use >> >>> >>> >> >>> something >> >>> >>> >> >>> less specific like InvalidRequestParameter. This error can >> be >> >>> >>> >> >>> reused >> >>> >>> >> >>> in the future and reduce the need to add errors, but will >> also >> >>> >>> >> >>> be >> >>> >>> less >> >>> >>> >> >>> clear to the client and its users. Maybe even add the error >> >>> >>> >> >>> message >> >>> >>> >> >>> string to the protocol in addition to the error code? >> (since >> >>> >>> >> >>> we are >> >>> >>> >> >>> bumping versions....) >> >>> >>> >> >>> >> >>> >>> >> >>> I think maintaining the old version throughout 0.8.X makes >> >>> >>> >> >>> sense. >> >>> >>> IMO >> >>> >>> >> >>> dropping it for 0.9 is feasible, but I'll let client owners >> >>> >>> >> >>> help >> >>> >>> make >> >>> >>> >> >>> that call. >> >>> >>> >> >>> >> >>> >>> >> >>> Am I missing anything? Should I start a KIP? It seems like >> a >> >>> >>> KIP-type >> >>> >>> >> >>> discussion :) >> >>> >>> >> >>> >> >>> >>> >> >>> Gwen >> >>> >>> >> >>> >> >>> >>> >> >>> >> >>> >>> >> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava >> >>> >>> >> >>> <e...@confluent.io> wrote: >> >>> >>> >> >>>> Gwen, >> >>> >>> >> >>>> >> >>> >>> >> >>>> I think the only option that wouldn't require a protocol >> >>> >>> >> >>>> version >> >>> >>> >> change is >> >>> >>> >> >>>> the one where acks > 1 is converted to acks = -1 since >> it's >> >>> >>> >> >>>> the >> >>> >>> only >> >>> >>> >> one >> >>> >>> >> >>>> that doesn't potentially break older clients. The protocol >> >>> >>> >> >>>> guide >> >>> >>> >> says that >> >>> >>> >> >>>> the expected upgrade path is servers first, then clients, >> so >> >>> >>> >> >>>> old >> >>> >>> >> clients, >> >>> >>> >> >>>> including non-java clients, that may be using acks > 1 >> should >> >>> >>> >> >>>> be >> >>> >>> >> able to >> >>> >>> >> >>>> work with a new broker version. >> >>> >>> >> >>>> >> >>> >>> >> >>>> It's more work, but I think dealing with the protocol >> change >> >>> >>> >> >>>> is the >> >>> >>> >> right >> >>> >>> >> >>>> thing to do since it eventually gets us to the behavior I >> >>> >>> >> >>>> think is >> >>> >>> >> better -- >> >>> >>> >> >>>> the broker should reject requests with invalid values. I >> >>> >>> >> >>>> think Joe >> >>> >>> >> and I >> >>> >>> >> >>>> were basically in agreement. In my mind the major piece >> >>> >>> >> >>>> missing >> >>> >>> from >> >>> >>> >> his >> >>> >>> >> >>>> description is how long we're going to maintain his "case >> 0" >> >>> >>> >> behavior. It's >> >>> >>> >> >>>> impractical to maintain old versions forever, but it >> sounds >> >>> >>> >> >>>> like >> >>> >>> >> there >> >>> >>> >> >>>> hasn't been a decision on how long to maintain them. Maybe >> >>> >>> >> >>>> that's >> >>> >>> >> another >> >>> >>> >> >>>> item to add to KIPs -- protocol versions and behavior >> need to >> >>> >>> >> >>>> be >> >>> >>> >> listed as >> >>> >>> >> >>>> deprecated and the earliest version in which they'll be >> >>> >>> >> >>>> removed >> >>> >>> >> should be >> >>> >>> >> >>>> specified so users can understand which versions are >> >>> >>> >> >>>> guaranteed to >> >>> >>> be >> >>> >>> >> >>>> compatible, even if they're using (well-written) non-java >> >>> >>> >> >>>> clients. >> >>> >>> >> >>>> >> >>> >>> >> >>>> -Ewen >> >>> >>> >> >>>> >> >>> >>> >> >>>> >> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers < >> >>> >>> >> dana.pow...@gmail.com> wrote: >> >>> >>> >> >>>>> >> >>> >>> >> >>>>>> clients don't break on unknown errors >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> maybe true for the official java clients, but I dont >> think >> >>> >>> >> >>>>> the >> >>> >>> >> assumption >> >>> >>> >> >>>>> holds true for community-maintained clients and users of >> >>> >>> >> >>>>> those >> >>> >>> >> clients. >> >>> >>> >> >>>>> kafka-python generally follows the fail-fast philosophy >> and >> >>> >>> >> >>>>> raises >> >>> >>> >> an >> >>> >>> >> >>>>> exception on any unrecognized error code in any server >> >>> >>> >> >>>>> response. >> >>> >>> >> in this >> >>> >>> >> >>>>> case, kafka-python allows users to set their own >> >>> >>> >> >>>>> required-acks >> >>> >>> >> policy when >> >>> >>> >> >>>>> creating a producer instance. It is possible that users >> of >> >>> >>> >> kafka-python >> >>> >>> >> >>>>> have deployed producer code that uses ack>1 -- perhaps in >> >>> >>> production >> >>> >>> >> >>>>> environments -- and for those users the new error code >> will >> >>> >>> >> >>>>> crash >> >>> >>> >> their >> >>> >>> >> >>>>> producer code. I would not be surprised if the same were >> >>> >>> >> >>>>> true of >> >>> >>> >> other >> >>> >>> >> >>>>> community clients. >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> *one reason for the fail-fast approach is that there >> isn't >> >>> >>> >> >>>>> great >> >>> >>> >> >>>>> documentation on what errors to expect for each request / >> >>> >>> >> >>>>> response >> >>> >>> >> -- so >> >>> >>> >> >>>>> we >> >>> >>> >> >>>>> use failures to alert that some error case is not handled >> >>> >>> >> properly. and >> >>> >>> >> >>>>> because of that, introducing new error cases without >> bumping >> >>> >>> >> >>>>> the >> >>> >>> api >> >>> >>> >> >>>>> version is likely to cause those errors to get >> raised/thrown >> >>> >>> >> >>>>> all >> >>> >>> >> the way >> >>> >>> >> >>>>> back up to the user. of course we (client maintainers) >> can >> >>> >>> >> >>>>> fix >> >>> >>> the >> >>> >>> >> issues >> >>> >>> >> >>>>> in the client libraries and suggest users upgrade, but >> it's >> >>> >>> >> >>>>> not >> >>> >>> the >> >>> >>> >> ideal >> >>> >>> >> >>>>> situation. >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> long-winded way of saying: I agree w/ Joe. >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> -Dana >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> >> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira < >> >>> >>> >> gshap...@cloudera.com> >> >>> >>> >> >>>>> wrote: >> >>> >>> >> >>>>> >> >>> >>> >> >>>>>> Is the protocol bump caused by the behavior change or >> the >> >>> >>> >> >>>>>> new >> >>> >>> error >> >>> >>> >> >>>>>> code? >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> 1) IMO, error_codes are data, and clients can expect to >> >>> >>> >> >>>>>> receive >> >>> >>> >> errors >> >>> >>> >> >>>>>> that they don't understand (i.e. unknown errors). AFAIK, >> >>> >>> >> >>>>>> clients >> >>> >>> >> don't >> >>> >>> >> >>>>>> break on unknown errors, they are simple more >> challenging >> >>> >>> >> >>>>>> to >> >>> >>> >> debug. If >> >>> >>> >> >>>>>> we document the new behavior, then its definitely >> >>> >>> >> >>>>>> debuggable and >> >>> >>> >> >>>>>> fixable. >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> 2) The behavior change is basically a deprecation - i.e. >> >>> >>> >> >>>>>> acks > 1 >> >>> >>> >> were >> >>> >>> >> >>>>>> never documented, and are not supported by Kafka clients >> >>> >>> >> >>>>>> starting >> >>> >>> >> with >> >>> >>> >> >>>>>> version 0.8.2. I'm not sure this requires a protocol >> bump >> >>> >>> >> >>>>>> either, >> >>> >>> >> >>>>>> although its a better case than new error codes. >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> Thanks, >> >>> >>> >> >>>>>> Gwen >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein < >> >>> >>> joe.st...@stealth.ly> >> >>> >>> >> >>>>>> wrote: >> >>> >>> >> >>>>>>> Looping in the mailing list that the client developers >> >>> >>> >> >>>>>>> live on >> >>> >>> >> because >> >>> >>> >> >>>>>> they >> >>> >>> >> >>>>>>> are all not on dev (though they should be if they want >> to >> >>> >>> >> >>>>>>> be >> >>> >>> >> helping >> >>> >>> >> >>>>>>> to >> >>> >>> >> >>>>>>> build the best client libraries they can). >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> I whole hardily believe that we need to not break >> existing >> >>> >>> >> >>>>>>> functionality >> >>> >>> >> >>>>>> of >> >>> >>> >> >>>>>>> the client protocol, ever. >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> There are many reasons for this and we have other >> threads >> >>> >>> >> >>>>>>> on the >> >>> >>> >> >>>>>>> mailing >> >>> >>> >> >>>>>>> list where we are discussing that topic (no pun >> intended) >> >>> >>> >> >>>>>>> that I >> >>> >>> >> don't >> >>> >>> >> >>>>>> want >> >>> >>> >> >>>>>>> to re-hash here. >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> If we change wire protocol functionality OR the binary >> >>> >>> >> >>>>>>> format >> >>> >>> >> (either) >> >>> >>> >> >>>>>>> we >> >>> >>> >> >>>>>>> must bump version AND treat version as a feature flag >> with >> >>> >>> >> backward >> >>> >>> >> >>>>>>> compatibility support until it is deprecated for some >> time >> >>> >>> >> >>>>>>> for >> >>> >>> >> folks >> >>> >>> >> >>>>>>> to >> >>> >>> >> >>>>>> deal >> >>> >>> >> >>>>>>> with it. >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> match version = { >> >>> >>> >> >>>>>>> case 0: keepDoingWhatWeWereDoing() >> >>> >>> >> >>>>>>> case 1: doNewStuff() >> >>> >>> >> >>>>>>> case 2: doEvenMoreNewStuff() >> >>> >>> >> >>>>>>> } >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> has to be a practice we adopt imho ... I know feature >> >>> >>> >> >>>>>>> flags can >> >>> >>> be >> >>> >>> >> >>>>>> construed >> >>> >>> >> >>>>>>> as "messy code" but I am eager to hear another (better? >> >>> >>> >> different?) >> >>> >>> >> >>>>>> solution >> >>> >>> >> >>>>>>> to this. >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> If we don't do a feature flag like this specifically >> with >> >>> >>> >> >>>>>>> this >> >>> >>> >> change >> >>> >>> >> >>>>>> then >> >>> >>> >> >>>>>>> what happens is that someone upgrades their brokers >> with a >> >>> >>> rolling >> >>> >>> >> >>>>>> restart >> >>> >>> >> >>>>>>> in 0.8.3 and every single one of their producer >> requests >> >>> >>> >> >>>>>>> start >> >>> >>> to >> >>> >>> >> fail >> >>> >>> >> >>>>>> and >> >>> >>> >> >>>>>>> they have a major production outage. eeeek!!!! >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> I do 100% agree that > 1 makes no sense and we *REALLY* >> >>> >>> >> >>>>>>> need >> >>> >>> >> people to >> >>> >>> >> >>>>>> start >> >>> >>> >> >>>>>>> using 0,1,-1 but we need to-do that in a way that is >> going >> >>> >>> >> >>>>>>> to >> >>> >>> >> work for >> >>> >>> >> >>>>>>> everyone. >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> Old producers and consumers must keep working with new >> >>> >>> >> >>>>>>> brokers >> >>> >>> >> and if >> >>> >>> >> >>>>>>> we >> >>> >>> >> >>>>>> are >> >>> >>> >> >>>>>>> not going to support that then I am unclear what the >> use >> >>> >>> >> >>>>>>> of >> >>> >>> >> "version" >> >>> >>> >> >>>>>>> is >> >>> >>> >> >>>>>>> based on our original intentions of having it because >> of >> >>> >>> >> >>>>>>> the >> >>> >>> >> >>>>>>> 0.7=>-0.8. >> >>> >>> >> >>>>>> We >> >>> >>> >> >>>>>>> said no more breaking changes when we did that. >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> - Joe Stein >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen >> Cheslack-Postava < >> >>> >>> >> >>>>>> e...@confluent.io> >> >>> >>> >> >>>>>>> wrote: >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> Right, so this looks like it could create an issue >> >>> >>> >> >>>>>>>> similar to >> >>> >>> >> what's >> >>> >>> >> >>>>>>>> currently being discussed in >> >>> >>> >> >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 >> where >> >>> >>> >> >>>>>>>> users >> >>> >>> >> now get >> >>> >>> >> >>>>>>>> errors >> >>> >>> >> >>>>>>>> under conditions when they previously wouldn't. Old >> >>> >>> >> >>>>>>>> clients >> >>> >>> won't >> >>> >>> >> >>>>>>>> even >> >>> >>> >> >>>>>>>> know >> >>> >>> >> >>>>>>>> about the error code, so besides failing they won't >> even >> >>> >>> >> >>>>>>>> be >> >>> >>> able >> >>> >>> >> to >> >>> >>> >> >>>>>>>> log >> >>> >>> >> >>>>>>>> any >> >>> >>> >> >>>>>>>> meaningful error messages. >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> I think there are two options for compatibility: >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> 1. An alternative change is to remove the ack > 1 >> code, >> >>> >>> >> >>>>>>>> but >> >>> >>> >> silently >> >>> >>> >> >>>>>>>> "upgrade" requests with acks > 1 to acks = -1. This >> isn't >> >>> >>> >> >>>>>>>> the >> >>> >>> >> same as >> >>> >>> >> >>>>>>>> other >> >>> >>> >> >>>>>>>> changes to behavior since the interaction between the >> >>> >>> >> >>>>>>>> client >> >>> >>> and >> >>> >>> >> >>>>>>>> server >> >>> >>> >> >>>>>>>> remains the same, no error codes change, etc. The >> client >> >>> >>> >> >>>>>>>> might >> >>> >>> >> just >> >>> >>> >> >>>>>>>> see >> >>> >>> >> >>>>>>>> some increased latency since the message might need >> to be >> >>> >>> >> replicated >> >>> >>> >> >>>>>>>> to >> >>> >>> >> >>>>>>>> more brokers than they requested. >> >>> >>> >> >>>>>>>> 2. Split this into two patches, one that bumps the >> >>> >>> >> >>>>>>>> protocol >> >>> >>> >> version >> >>> >>> >> >>>>>>>> on >> >>> >>> >> >>>>>>>> that >> >>> >>> >> >>>>>>>> message to include the new error code but maintains >> both >> >>> >>> >> >>>>>>>> old >> >>> >>> (now >> >>> >>> >> >>>>>>>> deprecated) and new behavior, then a second that >> would be >> >>> >>> >> applied in >> >>> >>> >> >>>>>>>> a >> >>> >>> >> >>>>>>>> later release that removes the old protocol + code for >> >>> >>> >> >>>>>>>> handling >> >>> >>> >> acks >> >>> >>> >> >>>>>> 1. >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> 2 is probably the right thing to do. If we specify the >> >>> >>> >> >>>>>>>> release >> >>> >>> >> when >> >>> >>> >> >>>>>> we'll >> >>> >>> >> >>>>>>>> remove the deprecated protocol at the time of >> deprecation >> >>> >>> >> >>>>>>>> it >> >>> >>> >> makes >> >>> >>> >> >>>>>> things >> >>> >>> >> >>>>>>>> a >> >>> >>> >> >>>>>>>> lot easier for people writing non-java clients and >> could >> >>> >>> >> >>>>>>>> give >> >>> >>> >> users >> >>> >>> >> >>>>>> better >> >>> >>> >> >>>>>>>> predictability (e.g. if clients are at most 1 major >> >>> >>> >> >>>>>>>> release >> >>> >>> >> behind >> >>> >>> >> >>>>>>>> brokers, >> >>> >>> >> >>>>>>>> they'll remain compatible but possibly use deprecated >> >>> >>> features). >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira < >> >>> >>> >> gshap...@cloudera.com> >> >>> >>> >> >>>>>>>> wrote: >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>>> Hi Kafka Devs, >> >>> >>> >> >>>>>>>>> >> >>> >>> >> >>>>>>>>> We are working on KAFKA-1697 - remove code related to >> >>> >>> >> >>>>>>>>> ack>1 on >> >>> >>> >> the >> >>> >>> >> >>>>>>>>> broker. Per Neha's suggestion, I'd like to give >> everyone >> >>> >>> >> >>>>>>>>> a >> >>> >>> >> heads up >> >>> >>> >> >>>>>>>>> on >> >>> >>> >> >>>>>>>>> what these changes mean. >> >>> >>> >> >>>>>>>>> >> >>> >>> >> >>>>>>>>> Once this patch is included, any produce requests >> that >> >>> >>> >> >>>>>>>>> include >> >>> >>> >> >>>>>>>>> request.required.acks > 1 will result in an >> exception. >> >>> >>> >> >>>>>>>>> This >> >>> >>> >> will be >> >>> >>> >> >>>>>>>>> InvalidRequiredAcks in new versions (0.8.3 and up, I >> >>> >>> >> >>>>>>>>> assume) >> >>> >>> and >> >>> >>> >> >>>>>>>>> UnknownException in existing versions (sorry, but I >> >>> >>> >> >>>>>>>>> can't add >> >>> >>> >> error >> >>> >>> >> >>>>>>>>> codes retroactively). >> >>> >>> >> >>>>>>>>> >> >>> >>> >> >>>>>>>>> This behavior is already enforced by 0.8.2 producers >> >>> >>> >> >>>>>>>>> (sync and >> >>> >>> >> >>>>>>>>> new), >> >>> >>> >> >>>>>>>>> but we expect impact on users with older producers >> that >> >>> >>> >> >>>>>>>>> relied >> >>> >>> >> on >> >>> >>> >> >>>>>>>>> acks >> >>> >>> >> >>>>>>>>>> 1 and external clients (i.e python, go, etc). >> >>> >>> >> >>>>>>>>> >> >>> >>> >> >>>>>>>>> Users who relied on acks > 1 are expected to switch >> to >> >>> >>> >> >>>>>>>>> using >> >>> >>> >> acks = >> >>> >>> >> >>>>>>>>> -1 >> >>> >>> >> >>>>>>>>> and a min.isr parameter than matches their user case. >> >>> >>> >> >>>>>>>>> >> >>> >>> >> >>>>>>>>> This change was discussed in the past in the context >> of >> >>> >>> >> KAFKA-1555 >> >>> >>> >> >>>>>>>>> (min.isr), but let us know if you have any questions >> or >> >>> >>> concerns >> >>> >>> >> >>>>>>>>> regarding this change. >> >>> >>> >> >>>>>>>>> >> >>> >>> >> >>>>>>>>> Gwen >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> >> >>> >>> >> >>>>>>>> -- >> >>> >>> >> >>>>>>>> Thanks, >> >>> >>> >> >>>>>>>> Ewen >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> -- >> >>> >>> >> >>>>>>> You received this message because you are subscribed to >> >>> >>> >> >>>>>>> the >> >>> >>> Google >> >>> >>> >> >>>>>>> Groups >> >>> >>> >> >>>>>>> "kafka-clients" group. >> >>> >>> >> >>>>>>> To unsubscribe from this group and stop receiving >> emails >> >>> >>> >> >>>>>>> from >> >>> >>> it, >> >>> >>> >> send >> >>> >>> >> >>>>>>> an >> >>> >>> >> >>>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >> >>> >>> >> >>>>>>> To post to this group, send email to >> >>> >>> >> kafka-clie...@googlegroups.com. >> >>> >>> >> >>>>>>> Visit this group at >> >>> >>> http://groups.google.com/group/kafka-clients. >> >>> >>> >> >>>>>>> To view this discussion on the web visit >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> >> >>> >>> >> >> >>> >>> >> >>> >>> >> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com >> >>> >>> >> >>>>>> . >> >>> >>> >> >>>>>>> >> >>> >>> >> >>>>>>> For more options, visit >> >>> >>> >> >>>>>>> https://groups.google.com/d/optout. >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> -- >> >>> >>> >> >>>>>> You received this message because you are subscribed to >> the >> >>> >>> Google >> >>> >>> >> >>>>>> Groups >> >>> >>> >> >>>>>> "kafka-clients" group. >> >>> >>> >> >>>>>> To unsubscribe from this group and stop receiving emails >> >>> >>> >> >>>>>> from it, >> >>> >>> >> send >> >>> >>> >> >>>>>> an >> >>> >>> >> >>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >> >>> >>> >> >>>>>> To post to this group, send email to >> >>> >>> >> kafka-clie...@googlegroups.com. >> >>> >>> >> >>>>>> Visit this group at >> >>> >>> >> >>>>>> http://groups.google.com/group/kafka-clients >> >>> >>> . >> >>> >>> >> >>>>>> To view this discussion on the web visit >> >>> >>> >> >>>>>> >> >>> >>> >> >>>>>> >> >>> >>> >> >> >>> >>> >> >>> >>> >> https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com >> >>> >>> >> >>>>>> . >> >>> >>> >> >>>>>> For more options, visit >> https://groups.google.com/d/optout. >> >>> >>> >> >>>> >> >>> >>> >> >>>> >> >>> >>> >> >>>> >> >>> >>> >> >>>> >> >>> >>> >> >>>> -- >> >>> >>> >> >>>> Thanks, >> >>> >>> >> >>>> Ewen >> >>> >>> >> >> >>> >>> > >> >>> >>> > >> >>> >>> >> >>> >>> >> >>> >>> -- >> >>> >>> Thanks, >> >>> >>> Ewen >> >>> >>> >> >>> >> >> >>> >> >> >>> >> >> >>> >> -- >> >>> >> Thanks, >> >>> >> Neha >> >> >> >> >> >> -- >> >> You received this message because you are subscribed to the Google >> Groups >> >> "kafka-clients" group. >> >> To unsubscribe from this group and stop receiving emails from it, send >> an >> >> email to kafka-clients+unsubscr...@googlegroups.com. >> >> To post to this group, send email to kafka-clie...@googlegroups.com. >> >> Visit this group at http://groups.google.com/group/kafka-clients. >> >> To view this discussion on the web visit >> >> >> https://groups.google.com/d/msgid/kafka-clients/CAOeJiJh17CYq%3D-qgPu9rnArsPW%3D7RL9AAW_h%3DrrXx0%2BKhhKgNQ%40mail.gmail.com >> . >> >> >> >> For more options, visit https://groups.google.com/d/optout. >> > >> > >> > >